Date: 9th - 11th May 2006 - Met Office, Exeter
Present:
| Allyn Treshansky, Met Office | Frank Toussaint, MPI |
| Graham Riley, University of Manchester | Marco Christoforou, Met Office |
| Steve Mullerworth, Met Office | Jamie Kettleborough, Met Office |
| Sophie Valcke, CERFACS | Katherine Bouton, University of Reading |
| Rupert Ford, University of Manchester | Balaji, GFDL |
| Rosalyn Hatcher, University of Reading | Bryan Lawrence, CCLRC |
| Lois Steenman-Clark, University of Reading | Mick Carter, Met Office |
| Michael Lautenschlager, MPI | Irina Linova-Pavlova, Met Office |
PRISM Metadata
(Sophie Valcke)
Metadata requirements for OASIS4:
For each {application}, i.e. in PRISM terms, each code which, when compiled, leads to one executable:
AD - Application description
For each {component}:
PMIOD - Potential Model Input/Output description
SMIOC - Specific Model Input/Output configuration
For each {coupled model}:
SCC - Specific Coupling Configuration
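The nesting of these four descriptions can be pictured with a small sketch. This is an illustration only, not the OASIS4 XML schema: the class names, the field names such as "sst", and the rule that an SMIOC may only select fields declared in the PMIOD are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical stand-in for a {component} with its PMIOD and SMIOC."""
    name: str
    pmiod: set = field(default_factory=set)  # fields the component could exchange
    smioc: set = field(default_factory=set)  # fields selected for one configuration

    def smioc_is_valid(self):
        # Assumption: a specific configuration only picks from the potential fields.
        return self.smioc <= self.pmiod


@dataclass
class Application:
    """Hypothetical stand-in for an {application}; description plays the role of the AD."""
    name: str
    description: str
    components: list = field(default_factory=list)


@dataclass
class CoupledModel:
    """Hypothetical stand-in for a {coupled model}; scc plays the role of the SCC."""
    name: str
    applications: list = field(default_factory=list)
    scc: dict = field(default_factory=dict)


# Illustrative values only.
ocean = Component("ocean", pmiod={"sst", "sea_ice_fraction"}, smioc={"sst"})
app = Application("ocean_exe", "hypothetical ocean executable", [ocean])
model = CoupledModel("demo", [app], {"run_length_days": 10})
```

Here `ocean.smioc_is_valid()` checks the assumed rule for one component; a real SMIOC carries far more than a set of field names.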
Numerical grid metadata is needed in the PMIOD and subsequently in the SMIOC. The PRISM grid metadata is divided into:
- physical space (geographical mapping)
- sampled space (model domain)
- compute space (mapping to the computer processors)
Within each {component} there is a call to PRISM_DEF_GRID (which defines the type of grid used)
Outcome of the discussion:
- There is a need for a standard vocabulary for the taxonomy of grids
- The vocabulary to describe the coupled modelling process {component} needs to be clarified
- Currently PRISM defines the numerical grid in 3D, whereas CF treats the horizontal and vertical separately. Is this an issue?
- Need for versioning as a mechanism for managing change as the metadata standards evolve
CF Metadata (Sophie Valcke)
A detailed comparison of CF metadata attributes was presented.
Outcome from the discussion:
- Future extensions including the horizontal numerical grid metadata
- Naming convention for attributes: use of capital letters and/or underscores needs sorting out
- Maintenance of CF (Bryan expanded on this, pointing to the CF White paper)
- Vocabulary of metadata attributes needs coordination between PRISM and CF
CERA Metadata
(Frank Toussaint)
This talk outlined the service offered by the World Data Centre for Climate which uses the CERA-2 metadata standard.
- The catalogue interface has been changed and upgraded to overcome previous security issues.
- Uses XSL mapping to provide users with different metadata formats
- Metadata is divided into different types
- Metadata is collected from different sources, e.g. GRIB and CF, as well as from notes, in-line comments and users
- There is an additional area - bulk metadata - which is used to store all other metadata e.g. numerical grid descriptor or other model metadata (NMM?)
NDG - NERC Data Grid
(Bryan Lawrence)
The talk focused on the metadata issues for NDG. The metadata is separated into different categories:
A - Archive, B - Browse, C - Comment, D - Discovery, E - Extra
- Want to encapsulate semantic (descriptive) metadata and separate it from detailed metadata
- Proposing NumSim as a DIF (Directory Interchange Format) style discovery metadata format which gives a science-level summary
Outcomes from the discussion:
- Need to explore the vocabulary for experiments/simulations
- Detailed criticism of NumSim is invited (Timetable?) - http://proj.badc.rl.ac.uk/ndg/wiki/NumSim
- Need a detailed proposal for grid metadata
NMM - Numerical Model Metadata
(Katherine Bouton)
- NMM is semantic metadata for codebase/model/simulation
- NMM has attributes for 5 properties (information, technical, numerical, science, input/output)
- Numerical properties have semantic metadata attributes for grid types and for spatial and temporal resolution. At present no standard vocabulary has emerged for all numerical models in IPCC or APE
Outcomes from the discussion:
- Standard vocabulary needed for grid type descriptors
- NMM should maintain close contact with Rocky Dunlap (Georgia Tech) who is working on metadata for Curator
- Need to look at NMM with respect to ESMF which has a finer level of granularity
ESMF - Earth System Modelling Framework (Balaji)
ESMF has different container classes for metadata:
- grid - numerical grid metadata
- field - physical variable
- attribute - NMM like descriptor
- state - instance of some set of fields
- component - top level entity
The sandwiched structure of ESMF means it has a finer granularity.
Outcome from the discussion:
- Look at NMM (and NumSim?) with this level of granularity
- Vocabulary mapping for all the different elements in coupled models
BFG - Bespoke Framework Generator
(Rupert Ford)
- BFG2 has a put/get interface and an argument driven communications interface. As yet there is no spatial metadata; BFG will implement standards as they emerge.
- BFG uses a definition, configuration, deployment methodology for constructing the framework.
FLUME - Flexible Unified Model Environment
(Allyn Treshansky)
- Uses a DCCD approach: definition, configuration, composition and deployment.
- For grids, the definition stage has uninstantiated grids and the configuration stage has instantiated grids (full gridspec), produced by a grid instantiator which does the mapping.
- The FLUME metadata model uses XML to implement constraints.
Outcome from the discussion:
- Further discussion needed on whether and how to implement constraints in metadata. PRISM and CF have simple constraints of the form max/min but there is no formal mechanism for more complex constraints.
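The difference between a max/min constraint and a more complex one can be illustrated with a minimal sketch. It does not reflect how FLUME, PRISM or CF express constraints; the record keys and rule names are hypothetical and chosen only for the example.

```python
def check_range(value, minimum=None, maximum=None):
    """Simple constraint of the max/min form."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


def check_all(metadata, constraints):
    """Return the names of the constraints that the metadata record violates."""
    return [name for name, rule in constraints.items() if not rule(metadata)]


# Hypothetical metadata record and rules, for illustration only.
record = {"timestep_s": 1800, "coupling_period_s": 3600, "levels": 38}

constraints = {
    # A max/min rule on a single attribute.
    "levels_in_range": lambda m: check_range(m["levels"], minimum=1, maximum=100),
    # A cross-attribute rule that a plain max/min pair cannot express.
    "coupling_multiple_of_timestep": lambda m: m["coupling_period_s"] % m["timestep_s"] == 0,
}

violations = check_all(record, constraints)
```

The second rule relates two attributes to each other, which is the kind of constraint that would need a more formal mechanism than a per-attribute range.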
GRIDSPEC (Balaji)
A draft paper, "A standard description of grids used in Earth system Models", was circulated by Balaji.
Balaji went through the paper inviting discussion and feedback. There were many issues raised including:
- conservation, not addressed in the paper
- how to cope with "deployment" parallelisation
- standard names for descriptive metadata and attributes
- vertical discretisation which currently is not part of this paper
Outcomes from the discussion:
- glossary needed of the complete list of keywords
- tools for constructing grids from a gridspec (circulated after the meeting in Toulouse) will have to be changed to reflect the current standard. There is also a tool, ncvtk, to plot a gridspec file
- need to have versioning
The discussion of the gridspec standard also covered the relationship between the two types of definitions, grid tile and grid mosaic.
The grid tile contains the definition of the supergrid, which has a higher refinement than the actual numerical grid so that quantities placed at different locations within the grid cell can be accommodated.
The grid mosaic defines the contact regions between the grid tiles. This accommodates numerical grids where the grid tiles are independently discretised, such as yin-yang grids or nested grids.
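The tile and mosaic relationship described above can be sketched as follows. This is not the gridspec file format: the class names are hypothetical, contact regions are reduced to pairs of tile names, and a supergrid refinement factor of 2 is assumed so that cell centres and corners fall on distinct supergrid points.

```python
from dataclasses import dataclass, field

REFINEMENT = 2  # assumed supergrid refinement factor


@dataclass
class GridTile:
    """Hypothetical tile: a logically rectangular block of nx by ny cells."""
    name: str
    nx: int
    ny: int

    def supergrid_shape(self):
        # Supergrid points along each axis: refinement * cells + 1.
        return (REFINEMENT * self.nx + 1, REFINEMENT * self.ny + 1)

    def centre_index(self, i, j):
        # With refinement 2, the centre of cell (i, j) lies at odd supergrid indices.
        return (2 * i + 1, 2 * j + 1)

    def corner_index(self, i, j):
        # The lower-left corner of cell (i, j) lies at even supergrid indices.
        return (2 * i, 2 * j)


@dataclass
class Mosaic:
    """Hypothetical mosaic: a set of tiles plus the contacts between them."""
    name: str
    tiles: dict = field(default_factory=dict)
    contacts: list = field(default_factory=list)

    def add_tile(self, tile):
        self.tiles[tile.name] = tile

    def add_contact(self, a, b):
        # Only tiles that already belong to the mosaic can be put in contact.
        if a not in self.tiles or b not in self.tiles:
            raise KeyError("both tiles must belong to the mosaic")
        self.contacts.append((a, b))


mosaic = Mosaic("yin_yang")
mosaic.add_tile(GridTile("yin", nx=4, ny=3))
mosaic.add_tile(GridTile("yang", nx=4, ny=3))
mosaic.add_contact("yin", "yang")
```

A mosaic with a single tile and no contacts is also representable here, which matches the suggestion that a gridspec should always carry a mosaic even when there is only one grid tile.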
Outcomes from the discussion:
- Should semantic (descriptive) metadata be provided using a controlled vocabulary for grids, and should this also describe the resolution? This metadata can be derived from the gridspec, but that would not provide standard metadata for discovery systems, for couplers like OASIS, or for intercomparisons like IPCC. The general conclusion was that a controlled vocabulary is necessary and needs to be part of the gridspec.
- If semantic metadata is provided, would it be in the mosaic or the tile descriptor of the gridspec? The discussion concluded that the current standard, which allows a gridspec to have a grid tile without a mosaic, should be changed so that a gridspec always has a mosaic and a tile descriptor, even if there is only one grid tile. Semantic metadata would then be in the mosaic descriptors.
- The question of defining a complete gridspec for a coupled numerical model with a hierarchy of mosaics is outstanding; this concept still needs to be tested.
- The discussion moved to whether a gridspec definition is unique, or whether the same numerical grid could be constructed in different ways. It was concluded that currently the gridspec is not unique, which could lead to a loss of information. This needs to be explored further.
- The hierarchy could also lead to very large gridspec files and possibly performance problems when using them to construct numerical grids. This issue needs to be addressed when the gridspec is more stable.
- There also needs to be a discussion on the aggregation of the different elements of the gridspec. The proposal on page 21 of the draft was not thought to be robust or practical.
What is a "thingy"?
Because the discussion of metadata was being hindered by the different
vocabularies used in the process of coupled modelling, the last stage
of the meeting attempted to find the vocabulary currently used in the
different communities (PRISM, FLUME, ESMF). A "thingy" is any one of the
different parts of the process of an Earth system modelling event that
produces numerical model data.
- it was hard to separate process and outcome during the discussion
- the discussion highlighted the different levels of granularity used by different communities in the whole process
- different communities use very different coupling strategies so the path through the whole process is different
- understanding these differences is important for the discussion of the aggregation of metadata (i.e. for a gridspec) as well as for extracting higher level metadata for discovery or intercomparison purposes.
The enclosed table is a summary of this discussion.
- further discussion is needed to move towards a more common set of terms.

