This document is a draft. It is intended to form part of the specification of a proposed CaveXML format. It should be understood in the context of the other documents that make up that proposal.
The proposed CaveXML format is intended to be a common format for data derived from cave surveys. It is intended to be used for storing, archiving, transferring (between people and applications), and distributing cave data.
This draft document is known to be incomplete. There are many features required for a final version of CaveXML which are not included. There also may be inconsistencies, bugs, or simply bad design decisions.
This draft is one of several proposals for all or part of an XML format for cave survey data. Others currently known to the author (in no particular order) are Cave Survey Data in XML, by Devin Kouts (located on the same server as this one), CaveScript, by Michael Lake, KML, by Martin Laverty, CaveXML, by Richard Knapp, and Proposal for Cave and Karst Data Exchange Standards, by Peter Matthews. This proposal is the result of discussions on the mailing list cavexml@cartography.ch (archives). The list is sponsored by the Cave Data Format Working Group of the UIS Informatics Commission.
Comments should be emailed to the author, Ralph Harley, or discussed on cavexml@cartography.ch.
This document describes the semantics of CaveXML.
It is not concerned with what constitutes a valid CaveXML file (that is described by the DTD and possibly other not yet written documents). Instead it intends to describe the meaning of any CaveXML file.
Nothing in this document should be construed as describing any processing
that must, should, or even may, be performed on any file, even if computational
terms are used. It is intended to describe only the interpretation of the
file's content.
Some elements in a survey file have attributes whose values are often the same for all, or large portions, of a file. CaveXML provides overridable defaults that allow attribute values to be set for sections of the file.
The interpretation of files containing Default elements is determined by the DefaultTransform. Any two files for which DefaultTransform produces identical results are defined to be BasicEquivalent. Some sections of this document describe the semantics of files that contain no Default elements; the corresponding semantics for files that do contain Default elements are defined to be the same as the result of applying the DefaultTransform to them.
The DefaultTransform is not information preserving. It results in a file which has a very different structure from the original. In fact, it is not recommended that the DefaultTransform ever be applied to any file; it is intended only to describe the semantics of defaults.
The DefaultTransform starts with the innermost nested Default element:
<Default element="element" attribute="value" ... >
The values of the element attribute and the subtype attribute (if present)
are interpreted as space-delimited lists.
An element E matches the Default element if:
One of the items in the element attribute of the Default is the type of
the element E,
and
if both E and the Default have a subtype attribute, one of the items
in the subtype attribute of the Default is the same as the subtype attribute of E.
For each matching element nested in the innermost Default element that has no attribute named <attribute>, but for which one is legal, DefaultTransform adds attribute="value" to the start tag. For the purposes of this section, an attribute whose value comes from the global default (in the DTD) is NOT considered to be present on the element. This is repeated for each attribute-value pair in the start tag of the Default element. The start and end tags of the Default element are then removed.
The above is repeated until no Default elements remain.
Again, DefaultTransform only describes the meaning of files that contain
Default elements. It does not describe any part of any recommended processing.
| Original | Is BasicEquivalent to (due to DefaultTransform) | Comment |
|---|---|---|
| <Default element="Distance" unit="feet"> <Distance value="12.6"/> </Default> | <Distance value="12.6" unit="feet"/> | |
| <Default element="Distance" unit="feet"> | <Distance value="25.6" unit="feet"/> | The value is always supplied by the innermost nested Default element that matches. |
| <Default element="Distance" unit="meters"> <Distance value="12.6" unit="feet"/> </Default> | <Distance value="12.6" unit="feet"/> | Defaults only supply values to elements that do not have an attribute of their own. |
| <Default element="Inclination" unit="mils"> <Azimuth value="12.6"/> </Default> | <Azimuth value="12.6"/> | Defaults only supply values to elements of matching type. |
| <Default element="Azimuth Inclination" subtype="for" unit="degrees"> <Default element="Azimuth Inclination" subtype="back" unit="grads"> <Azimuth subtype="for" value="32"/> <Inclination subtype="for" value="25"/> <Azimuth subtype="back" value="86"/> <Inclination subtype="back" value="-12"/> </Default> </Default> | <Azimuth subtype="for" value="32" unit="degrees"/> <Inclination subtype="for" value="25" unit="degrees"/> <Azimuth subtype="back" value="86" unit="grads"/> <Inclination subtype="back" value="-12" unit="grads"/> | Azimuth and inclination are in degrees for foresights and in grads for backsights. |
| <Default element="Station" name="defname" prefix="SURVEYB"> <Station/> </Default> | <Station name="defname" prefix="SURVEYB"/> | One Default element can provide multiple attributes. |
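The DefaultTransform rules and the examples in the table above can be sketched in Python. This is only an illustration of the semantics, not recommended processing. It makes several assumptions that are not part of the proposal: the function names are hypothetical, the document root is not itself a Default element, every supplied attribute is treated as legal on every matching element (no DTD is consulted), and the element and subtype attributes of a Default are used only for matching and are never copied.

```python
import xml.etree.ElementTree as ET


def matches(default, elem):
    """True if elem matches the Default element under the two matching rules."""
    if elem.tag not in default.get("element", "").split():
        return False
    d_sub, e_sub = default.get("subtype"), elem.get("subtype")
    if d_sub is not None and e_sub is not None:
        return e_sub in d_sub.split()
    return True


def default_transform(root):
    """Expand the innermost Default element repeatedly until none remain."""
    while True:
        # A Default is innermost when it contains no other Default element.
        innermost = next(
            (d for d in root.iter("Default")
             if not any(x is not d for x in d.iter("Default"))),
            None,
        )
        if innermost is None:
            return root
        if innermost is root:
            raise ValueError("sketch assumes the root is not a Default element")
        # Assumption: element and subtype only select targets, they are not copied.
        supplied = {k: v for k, v in innermost.attrib.items()
                    if k not in ("element", "subtype")}
        for elem in innermost.iter():
            if elem is innermost or not matches(innermost, elem):
                continue
            for name, value in supplied.items():
                if name not in elem.attrib:  # explicit attributes always win
                    elem.set(name, value)
        # Remove the Default's start and end tags, keeping its children in place.
        parents = {child: parent for parent in root.iter() for child in parent}
        parent = parents[innermost]
        index = list(parent).index(innermost)
        parent.remove(innermost)
        for offset, child in enumerate(list(innermost)):
            parent.insert(index + offset, child)
```

Because the innermost Default is always expanded first, an outer Default can only fill attributes that inner ones left unset, which is how the "innermost matching Default wins" behaviour arises.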
| Attribute | Meaning |
|---|---|
| source | The name of the person or organization that supplied the data. |
| signature | A cryptographic signature of the person or organization that supplied the data. (PGP?) Invalidated if the enclosed data is changed in any way. |
| recipient | The name of the person or organization that received the data when the Provenance element was written. This allows gaps in the chain of possession to be detected. |
| description | A brief text description of the data's origin. |
| reliablity | A qualitative measure of the reliability of the data. This does not indicate the expected precision or accuracy of the measurements themselves; instead it is a measure of the integrity and trustworthiness of the source. The reliability of the data is no better than the worst of the nested Provenance elements. |
| date | The date on which the data was obtained. This may be later than when it was actually recorded. Format is ISO 8601. |
| ID | All ProvenanceSource elements with the same ID come from the same source. |
| reference | |
| Attribute | Meaning |
|---|---|
| converter | The name of the conversion program that produced this data from another format. |
| version | The version of the conversion program. |
| format | The name of that format. |
| program | If this data was produced by a program, this attribute may contain its name. This indicates the program that produced the data in the other format. |
| programversion | The version of the program that produced the original data. |
| processdetail | Other information describing the conversion that produced this data. The meaning of this attribute is specific to the converter (arguments, preference settings, etc.). |
| original | A link to the data in the other format that this data was produced from. If the data was copied from another source, this may contain a link to the original. The interpretation of provenance can be complicated if the original is part of the same file and/or has Provenance elements of its own. (Not in DTD yet) |
| date | The date of the conversion. Format is ISO 8601. |
| sourcedata | The actual data from which the data was derived, for example the line in the Compass file. This would be used if the conversion is "fine grained", that is, if it records exactly which parts of the CaveXML file correspond to which parts of the original (which may facilitate merging edits in one format into the other). Only the converter would be expected to understand this attribute. |
| ID | All ProvenanceImport elements with the same ID contain data from the same conversion. |
| reference | |
A station consists of a set of Station elements (XML elements of type "Station" as described by the DTD). It is defined to contain one element with an ID attribute, together with all the Station elements with a matching reference attribute. Station elements are XML elements; stations are slightly more abstract. They are intended to correspond to actual survey stations, which may be described by more than one element.
For the purpose of this discussion, it is assumed that every Station element has either an ID or a reference attribute. If not, the file is to be interpreted as described in the section on IDs.
No significance should be placed on which Station element in a station is the one with an ID. Which Station element has the ID attribute does not change the meaning of the file in any way.
Each Station element provides information about the station of which it is a member. This information may be encoded in the position of the element within other elements, or in its attributes, or in the elements it contains.
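The distinction between a station and its Station elements can be illustrated with a short Python sketch. The function name `group_stations` is hypothetical, and the sketch assumes, as the text above does, that every Station element already carries either an ID or a reference attribute.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def group_stations(root):
    """Map each station ID to the list of Station elements that make up that station."""
    stations = defaultdict(list)
    for elem in root.iter("Station"):
        key = elem.get("ID") or elem.get("reference")
        if key is None:
            # Elements without ID or reference are covered by the section on IDs.
            raise ValueError("Station element has neither ID nor reference")
        stations[key].append(elem)
    return dict(stations)
```

Which member of each list carries the ID attribute is deliberately ignored, since it has no bearing on the meaning of the file.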
A station may have as many names as it has Station elements. However, for some purposes it is important to distinguish between different names for a single station. This is what NameSets are for.
If a Station element has a nameset attribute, its name is associated
with the NameSet referred to. We will also say that the name is in the NameSet.
| Attribute | Meaning |
|---|---|
| program | The name of the program that uses this nameset. One of the purposes of namesets is that programs have incompatible restrictions on what constitutes a valid name. All names in a nameset with this attribute should be acceptable as station names by the specified program. |
| name | A name for the nameset. Distinguishes between multiple namesets for the same program. |
| ID | Used to refer to this nameset. |
Note that in this example, the second and third Station elements are part of the same station, because they have the same ID. An Equivalence element could be used instead, in that case there would be two stations that were declared to be equivalent, but with separate identities.
All stations are assigned unique IDs, either explicitly or implicitly. Every Station element that has an ID or a reference attribute is a member of the station with the matching ID. The station to which a Station element with neither attribute belongs is determined as described below.
Note that it is sometimes important to remember the distinction between a station and a Station element. A station is a set of Station elements.
For some programs or converters, or when entering data by hand, it may
be inconvenient to assign IDs to stations, as this requires
ensuring that the IDs are unique and that exactly one Station
element has an ID attribute for each ID. The mechanism described in this
section is designed to mitigate this problem. Nothing below
should be construed as implying that IDs should be assigned
in any particular way. For Station elements that have an ID or
reference attribute, there may be no connection between the name attribute
and the ID.
Any Station element that lacks both an ID and a reference attribute has <nameset><prefix><name> as its ID: the concatenation of the values of its nameset, prefix, and name attributes, or <prefix><name> if the nameset attribute is absent. The values of the prefix and name attributes may also be supplied by an enclosing Default element. For the purposes of this section, all Station elements behave as if nested within an outermost Default element providing the name "noname" and the prefix "STA".
The meaning of any file in which some Station elements lack both ID and reference attributes is the same as that of a file in which each such element has had an ID or reference attribute added. The value of the added attribute is the concatenation described above. For each such value, if the file does not already contain a Station element with an ID attribute of that value, then exactly one of the elements has an ID attribute added and all the others have a reference attribute added. The transform also adds the attribute generatedname="true" to each Station element that has an ID or reference added. Two files that give the same result when this transformation (IDTransform) is applied are defined to have the same meaning.
| Original | Equivalent | Comment |
|---|---|---|
| <Station name="AB1" ID="foobar"/> | <Station name="AB1" ID="foobar"/> | Stations with explicit IDs don't change. Name is independent of ID. |
| <Default element="Station" prefix="Asurvey"> <Station name="1"/> <Station name="2"/> <Station name="1" prefix="Bsurvey"/> </Default> | <Station name="1" ID="Asurvey1" generatedname="true"/> <Station name="2" ID="Asurvey2" generatedname="true"/> <Station name="1" prefix="Bsurvey" reference="Bsurvey1" generatedname="true"/> | Prefix or name may be provided by a default, which may be overridden. |
| <Station ID="foobar"/> <Station prefix="foo" name="bar"/> | <Station ID="foobar"/> <Station reference="foobar" prefix="foo" name="bar" generatedname="true"/> | Only one ID per value. |
| <Station/> | <Station ID="STAnoname" generatedname="true"/> | Global defaults |
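The IDTransform described above can be sketched in Python as follows. Like the DefaultTransform, it only describes meaning and is not recommended processing. The function name `id_transform` is hypothetical. The sketch assumes that defaults have already been expanded, so that prefix and name appear directly on each Station element, and it arbitrarily gives the ID to the first element in document order, which the text permits because the choice carries no significance.

```python
import xml.etree.ElementTree as ET


def id_transform(root):
    """Add a generated ID or reference to every Station element that has neither."""
    stations = list(root.iter("Station"))
    # IDs that already exist in the file must not be generated a second time.
    taken = {s.get("ID") for s in stations if s.get("ID") is not None}
    for station in stations:
        if station.get("ID") is not None or station.get("reference") is not None:
            continue
        # Global defaults: prefix "STA", name "noname"; nameset is prepended if present.
        value = (station.get("nameset", "")
                 + station.get("prefix", "STA")
                 + station.get("name", "noname"))
        if value in taken:
            station.set("reference", value)
        else:
            station.set("ID", value)  # assumption: the first such element gets the ID
            taken.add(value)
        station.set("generatedname", "true")
    return root
```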
and
<Location station="STA25" northing="23.6" easting="15.2" elevation="-23"/>
Both are legal, but they do not have exactly the same meaning: the first locates a Station element and the second the whole station (a station is a set of Station elements). The distinction may be irrelevant for raw data, but can sometimes be important (e.g. when a station has more than one Location; see the discussion of unclosed processed data).
In future versions of this proposal, Location elements will be permitted
to describe the position of a station in other ways, e.g. lat/lon or GPS
fixes.
Sometimes stations that were initially believed to be distinct turn out to be the same point. They could be identified by assigning them the same ID (directly or using the mechanism described in the section on IDs) but this is dangerous because it does not permit the stations to be easily disentangled if the identification turns out to be incorrect.
The following are all BasicEquivalent:
Nested within one station, the other named by an attribute:
<Station reference="STA1">
<Equivalent reference="STA2"/>
</Station>
Nested within one station, containing the other.
<Station reference="STA1">
<Equivalent>
<Station reference="STA2"/>
</Equivalent>
</Station>
Directly between two sibling Station elements:
<Station reference="STA1"/>
<Equivalent/>
<Station reference="STA2"/>
They all state that the station with ID STA1 is equivalent to the station with ID STA2.
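The three forms above can be read into the same pair of station IDs. The following Python sketch shows one way to do so; the helper names are hypothetical, and it assumes every Station element involved carries an ID or reference attribute and that the sibling form uses the immediately adjacent elements.

```python
import xml.etree.ElementTree as ET


def station_key(station):
    """The ID of the station that a Station element belongs to."""
    return station.get("ID") or station.get("reference")


def equivalences(root):
    """Return (station, station) ID pairs declared by Equivalent elements."""
    pairs = []
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag != "Equivalent":
                continue
            if parent.tag == "Station":
                # Forms 1 and 2: nested in one station, the other given by
                # an attribute or by a contained Station element.
                first = station_key(parent)
                nested = child.find("Station")
                second = child.get("reference")
                if second is None and nested is not None:
                    second = station_key(nested)
            else:
                # Form 3: directly between two sibling Station elements.
                before = children[i - 1] if i > 0 else None
                after = children[i + 1] if i + 1 < len(children) else None
                if (before is None or after is None
                        or before.tag != "Station" or after.tag != "Station"):
                    continue
                first, second = station_key(before), station_key(after)
            pairs.append((first, second))
    return pairs
```

The stations keep their separate identities; only the declared pair is recorded, so an incorrect identification can later be withdrawn by deleting the Equivalent element.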
Shots, like stations, may consist of more than one Shot element, connected
by an ID. Because it is common for a given shot to be mentioned only
once, however, no mechanism for providing default IDs for shots is provided.
| Source for from station |
|---|
| The station referred to by the from attribute. The Station element within the station referred to is the closest member of the station preceding the Shot element (important only for processed data with process=unclosed or treesignificant=true). |
| The first of two Station elements nested inside the shot. |
| A Station element that is a sibling of the Shot, that precedes it, and is not separated from it by any Station or Shot elements. |
| The Station element that most closely precedes the Shot element in the file. |
| The Station element that most closely follows the Shot element in the file. |
| Source for to station |
|---|
| The station referred to by the to attribute. The Station element within the station referred to is the closest member of the station following the Shot element (important only for processed data with process=unclosed or treesignificant=true). |
| The second of two Station elements nested inside the shot. |
| A Station element that is a sibling of the Shot, that follows it, and is not separated from it by any Station or Shot elements. |
| The Station element that most closely follows the Shot element in the file. |
| The Station element that most closely precedes the Shot element in the file. |
| Example | Comment |
|---|---|
| <Station ID="STA1"/> <Station ID="STA2"/> <Shot from="STA1" to="STA2"/> | STA1 to STA2. (Real shots would also include data.) |
| <Shot> <Station reference="STA1"/> <Station reference="STA2"/> </Shot> | STA1 to STA2. |
| <Station ID="STA1"/> <Shot/> <Station ID="STA2"/> <Shot/> <Station ID="STA3"/> | A shot from STA1 to STA2 and a shot from STA2 to STA3. |
| <Station ID="STA2"/> <Station ID="STA1"/> <Shot to="STA2"/> | A combination. Again the shot is from STA1 to STA2. |
| <Station ID="STA1"/> <Shot> <Station reference="STA2"/> </Shot> <Station ID="STA3"/> | This shot is from STA1 to STA3. Nested Station elements only define to and from if there are exactly two. |
The value of the subtype attribute of an Azimuth element indicates what type of sighting it was: "for" means that the reading represents a foresight; "back" means that the reading represents a backsight; "combined" means the data was taken as a foresight and a backsight, but that the two readings were averaged and the original numbers were thrown away (some applications require this); "unspecified" means that it is not known (e.g. it was not recorded) how this reading was taken, whether a backsight was taken at all, and if one was, what happened to it.
If subtype is "for", "combined", or "unspecified", the reading is to be interpreted as having been taken at the "from" station sighting toward the "to" station, while if subtype is "back" it is the reverse. This may be modified by the reversed and inverted attributes.
If the reversed attribute is "true", the direction of the shot is reversed, that is, it is taken from the opposite station than normal. This is intended for shots where the surveyors temporarily trade places, or where two forward readings replace a foresight and backsight. Because defaults can apply to foresights and backsights differently, two foresights may not be the same as a foresight and a reversed backsight.
If the inverted attribute is "true", the reading recorded is 180 degrees different than normal. This is intended to indicate "corrected" backsights. (It needs to be a separate attribute from reversed, because it may be set as a default.)
All of the following indicate the same direction between from and
to stations:
<Azimuth subtype="for" value="23"/>
<Azimuth subtype="back" value="203"/>
<Azimuth subtype="for" reversed="true" value="203"/>
<Azimuth subtype="back" reversed="true" value="23"/>
<Azimuth subtype="for" inverted="true" value="203"/>
<Azimuth subtype="back" inverted="true" value="23"/>
<Azimuth subtype="for" inverted="true" reversed="true" value="23"/>
<Azimuth subtype="back" inverted="true" reversed="true" value="203"/>
If other units are used the reversed value will be whatever the
opposite direction is in those units.
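The eight equivalent readings listed above follow from treating a backsight, the reversed attribute, and the inverted attribute as each contributing one half turn. A minimal Python sketch, assuming the value is in degrees and using a hypothetical function name:

```python
def from_to_azimuth(value, subtype="for", reversed_=False, inverted=False):
    """Direction from the "from" station to the "to" station, in degrees.

    Each of subtype="back", reversed, and inverted adds one half turn;
    "for", "combined", and "unspecified" add none.
    """
    half_turns = int(subtype == "back") + int(reversed_) + int(inverted)
    return (value + 180.0 * half_turns) % 360.0
```

For example, `from_to_azimuth(203, "back")` and `from_to_azimuth(23, "for", reversed_=True, inverted=True)` both give 23.0. For Inclination the analogous operation in default units would be a sign change rather than adding 180.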
The interpretation of Inclination is almost exactly the same as Azimuth
(except that in default units the inverse of +5 is -5).
| Attribute | Meaning |
|---|---|
| orientation | The direction in which the section is facing. Azimuth in degrees or N, S, NE, SSW etc. |
| shot | The ID of the shot that determines the facing direction. The direction attribute takes precedence. |
| station | The Station at which the section was taken. |
| position | If direction comes from a Shot, indicates that the section was taken at the to or from station of that shot. |
| left | Distance to left wall at the station looking in the specified direction. |
| right | Distance to right wall at the station looking in the specified direction. |
| up | Distance from station to ceiling. |
| cieling | Distance from floor to ceiling at the station. |
| floor | Distance from station to floor. |
| Sources for direction |
|---|
| The direction indicated by the value of the orientation attribute. |
| The direction of the Shot element the CrossSection element is nested inside. |
| The direction of the unique Shot element nested inside the CrossSection element. |
| The average of the directions of the two Shot elements nested inside the CrossSection element. |
| The direction of the single shot referred to by the shot attribute. |
| The average of the directions of the two shots referred to by the shot attribute. |
| The average of the directions of the two Shot elements that are siblings of the CrossSection element, and are not separated from the CrossSection element by any Shot or Station elements. |
| The direction of the unique Shot element that is a sibling of the CrossSection element, and is not separated from the CrossSection element by any Shot or Station element. |
| The average of the directions of the two Shot elements that are siblings of the Station element within which the CrossSection element is nested, and are not separated from that Station element by any Shot or Station elements. |
| The direction of the unique Shot element that is a sibling of the Station element within which the CrossSection element is nested, and is not separated from that Station element by any Shot or Station elements. |
| The direction of the Shot element that most closely precedes the CrossSection element in the file. |
| The direction of the Shot element that most closely follows the CrossSection element in the file. |
| Sources for position |
|---|
| The Station element the CrossSection element is nested inside. |
| The unique Station element nested inside the CrossSection element. |
| The station referred to by the station attribute. The Station element within the station referred to is the closest member of the station preceding the Shot element (important for unclosed data only). Discouraged in unclosed processed data. |
| If the position attribute is "from", the from station of the single shot used to determine the orientation of the CrossSection. |
| If the position attribute is "to", the to station of the single shot used to determine the orientation of the CrossSection. |
| The unique Station element that is a sibling of the CrossSection element, and is not separated from the CrossSection element by any Shot or Station element. |
| The Station element that most closely precedes the CrossSection element in the file. |
| The Station element that most closely follows the CrossSection element in the file. |
| Example |
|---|
| <Section station="STA1" direction="NE" up="1" down="0" left="4" right="2"/> |
| <Station ID="STA1"/> <Section up="1" down="0" left="4" right="2"/> <Shot> <Azimuth subtype="for" value="45"/> </Shot> |
| <Shot> <Azimuth subtype="for" value="0"/> </Shot> <Station ID="STA1"> <Section up="1" down="0" left="4" right="2"/> </Station> <Shot> <Azimuth subtype="for" value="90"/> </Shot> |
| <Shot from="STA2" to="STA1"> <Section position="to" up="1" down="0" left="4" right="2"/> <Azimuth subtype="for" value="45"/> </Shot> |
| <Shot> <Section up="1" down="0" left="4" right="2"> <Station reference="STA1"/> </Section> <Azimuth subtype="for" value="45"/> </Shot> |
| <Station ID="STA1"> <Section up="1" down="0" left="4" right="2"/> </Station> <Shot> <Azimuth subtype="for" value="45"/> </Shot> |
| <Station ID="STA1"> <Section up="1" down="0" left="4" right="2"> <Shot> <Azimuth subtype="for" value="45"/> </Shot> </Section> </Station> |
| <Station ID="STA1"/> <Shot> <Azimuth subtype="for" value="0"/> </Shot> <Section up="1" down="0" left="4" right="2"> <Station reference="STA1"/> </Section> <Shot> <Azimuth subtype="for" value="90"/> </Shot> |
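Several of the direction sources above call for "the average of the directions" of two shots, as in the examples that pair azimuths 0 and 90. The text does not say how the average is formed. A plain arithmetic mean fails for directions on either side of north (350 and 10 would give 180), so the sketch below uses a circular mean instead. That choice, the function name, and the use of degrees are all assumptions rather than part of the proposal.

```python
import math


def mean_direction(a_deg, b_deg):
    """Circular mean of two azimuths in degrees, in the range 0 to 360.

    The result is not meaningful when the two directions are exactly
    opposite, because the summed unit vectors cancel.
    """
    x = math.cos(math.radians(a_deg)) + math.cos(math.radians(b_deg))
    y = math.sin(math.radians(a_deg)) + math.sin(math.radians(b_deg))
    return math.degrees(math.atan2(y, x)) % 360.0
```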
| Attribute | Meaning |
|---|---|
| program | The name of the program that processed this data. |
| process | |
| processdetail | Other information describing the processing or conversion that produced this data. The meaning of this attribute is specific to the program. |
| treesignificant | If true, the program that produced the data considers the tree structure of the data to be significant. Otherwise it may not. |
| original | This may contain a link to the data it was produced from. (Not in DTD yet) |
| date | The date of the processing. Format is ISO 8601. |
| ID | All Processed elements with the same ID contain data from the same processing, as if they were all in the same Processed element. |
| reference | |
The optional TreeRoot element indicates that a Station element is the root of the tree on which the processing was based. There should be only one Station element so marked in each connected component of the data.
The Station element indicated is the first one following the TreeRoot element that matches the station attribute, or if the station attribute is missing, the first Station element following the TreeRoot element.
If no TreeRoot element is present, the root of the tree is not defined. Applications
that need a rooted tree may use the first Station in the Processed element
as the root. If process is not "unclosed" and treesignificant is not "true",
then TreeRoot elements are meaningless and may be ignored.
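The rule for locating the Station element marked by a TreeRoot can be sketched in Python. The function name is hypothetical. The sketch assumes that "following" means later in document order, that a Station element "matches" when its ID or reference equals the station attribute, and it handles only the first TreeRoot element it encounters.

```python
import xml.etree.ElementTree as ET


def tree_root_station(processed):
    """Return the Station element marked by the first TreeRoot, or None."""
    wanted = None
    seen_tree_root = False
    for elem in processed.iter():  # iter() walks the tree in document order
        if not seen_tree_root:
            if elem.tag == "TreeRoot":
                seen_tree_root = True
                wanted = elem.get("station")  # None if the attribute is missing
            continue
        if elem.tag == "Station" and (
            wanted is None or wanted in (elem.get("ID"), elem.get("reference"))
        ):
            return elem
    return None
```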