
Transfer Sensor Data from DataTurbine to Metacat

This is the design document (UNDER CONSTRUCTION) for Bug 4757.

(http://bugzilla.ecoinformatics.org/show_bug.cgi?id=4757)

Workflow Design to store sensor data from dataturbine to metacat (bugs/developers)

Bugs And Users

Bug 5213
This bug implements the feature to query a DataTurbine server and retrieve all the associated sensors. The following information shall be retrieved for each sensor: (i) the sensor id; (ii) the id of the logger to which the sensor is attached; and (iii) the id of the site to which the sensor belongs. A naming convention can be followed in which each sensor is named site(x)_logger(y)_sensor(z), where x, y, and z are the site id, logger id, and sensor id.
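
The naming convention above can be sketched as a pair of helper functions. This is only an illustration in Python, not the project's implementation: the function names are hypothetical, and it assumes that the parentheses in site(x)_logger(y)_sensor(z) are placeholders rather than literal characters and that ids contain no underscores.

```python
import re
from typing import NamedTuple


class SensorName(NamedTuple):
    site_id: str
    logger_id: str
    sensor_id: str


# Assumed pattern for names such as "site1_logger2_sensor3" (ids without underscores).
_NAME_RE = re.compile(
    r"^site(?P<site>[^_]+)_logger(?P<logger>[^_]+)_sensor(?P<sensor>[^_]+)$"
)


def format_sensor_name(site_id, logger_id, sensor_id):
    """Build a name following the site(x)_logger(y)_sensor(z) convention."""
    return f"site{site_id}_logger{logger_id}_sensor{sensor_id}"


def parse_sensor_name(name):
    """Split a conventional sensor name into its site, logger, and sensor ids."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a site_logger_sensor name: {name!r}")
    return SensorName(match["site"], match["logger"], match["sensor"])
```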

Bug 5214
This bug implements the feature to retrieve the last timestamp for which a sensor's data was stored in Metacat. The data for each sensor is stored in Metacat, so we need to keep track of the timestamp of the last data stored for that sensor. Currently, we propose to store this information in a MySQL database, in a table sensorTimeInfo(sensorId, lastTime) that is kept up to date with the last timestamp for which each sensor's data has been stored in Metacat.
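
A minimal sketch of the proposed sensorTimeInfo table follows. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that the example is self-contained; the column types, the choice of storing timestamps as text, and the function names are assumptions.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the tracking database and create sensorTimeInfo if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sensorTimeInfo ("
        "sensorId TEXT PRIMARY KEY, lastTime TEXT NOT NULL)"
    )
    return conn


def get_last_time(conn, sensor_id):
    """Return the last stored timestamp for a sensor, or None if it is unknown."""
    row = conn.execute(
        "SELECT lastTime FROM sensorTimeInfo WHERE sensorId = ?", (sensor_id,)
    ).fetchone()
    return row[0] if row else None


def set_last_time(conn, sensor_id, last_time):
    """Insert or overwrite the last stored timestamp for a sensor."""
    conn.execute(
        "INSERT OR REPLACE INTO sensorTimeInfo (sensorId, lastTime) VALUES (?, ?)",
        (sensor_id, last_time),
    )
    conn.commit()
```

Keeping one row per sensor, keyed on sensorId, means a lookup before each transfer and a single overwrite after it.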
 
Bug 5216
This bug implements the feature that takes a sensor id and the timestamp of the last data stored in Metacat for that sensor, and returns all the later timestamps at which the sensor's metadata changed. These timestamps are the points at which the data set needs to be chunked.
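
The change-point detection can be sketched as below. How the metadata history is read from DataTurbine is not specified here, so the sketch assumes it has already been fetched as a list of (timestamp, metadata string) pairs sorted by timestamp; the function name is hypothetical.

```python
def metadata_change_times(history, last_time):
    """Return the timestamps after last_time at which the metadata changed.

    history: list of (timestamp, metadata_string) pairs sorted by timestamp.
    Timestamps may be any mutually comparable values (numbers, datetimes).
    """
    changes = []
    previous = None
    for timestamp, metadata in history:
        if timestamp <= last_time:
            # Remember the metadata that was in effect at last_time.
            previous = metadata
            continue
        if metadata != previous:
            changes.append(timestamp)
        previous = metadata
    return changes
```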
 
Bug 5217
Given a sensor id and a timestamp, retrieve the metadata of the sensor at that specific timestamp. This information is required to build the sensorML information.
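
As a sketch of this lookup, assuming the sensor's metadata history is available as a sorted list of (timestamp, metadata string) pairs (an assumption, as is the function name), a binary search finds the metadata in effect at a given timestamp:

```python
from bisect import bisect_right


def metadata_at(history, timestamp):
    """Return the metadata in effect at timestamp, or None if none applies yet.

    history: list of (timestamp, metadata_string) pairs sorted by timestamp.
    """
    times = [t for t, _ in history]
    # Index of the first entry strictly later than the requested timestamp.
    index = bisect_right(times, timestamp)
    return history[index - 1][1] if index else None
```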
 
Bug 5218
Given the metadata string, parse it to create the corresponding SensorML document.
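
The format of the metadata string is not defined in this document, so the sketch below assumes it has already been parsed into a name-to-value mapping and shows only how the swe:DataRecord part of the SensorML could be built. The mapping from property names to SWE element types is a guess based on the sample SensorML on this page; all other names are hypothetical.

```python
import xml.etree.ElementTree as ET

SWE = "http://www.opengis.net/swe/1.0.1"
ET.register_namespace("swe", SWE)

# Assumed mapping from property name to SWE element; anything else becomes Text.
FIELD_TYPES = {
    "isOn": "Boolean",
    "samplingPeriod": "Quantity",
    "samples-per-measurement": "Count",
}


def build_data_record(metadata):
    """Build a swe:DataRecord element from a name -> value mapping."""
    record = ET.Element(f"{{{SWE}}}DataRecord")
    for name, value in metadata.items():
        field = ET.SubElement(record, f"{{{SWE}}}field", {"name": name})
        typed = ET.SubElement(field, f"{{{SWE}}}{FIELD_TYPES.get(name, 'Text')}")
        ET.SubElement(typed, f"{{{SWE}}}value").text = value
    return record
```

Calling ET.tostring(build_data_record(metadata), encoding="unicode") then yields the fragment as a string.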
 
Bug 5220
Given a sensor id and a series of timestamps (since a previous date) that reflect when the metadata for the sensor changed, the data set needs to be chunked into distinct files. Each output file follows the naming convention "site(x)_logger(y)_sensor(z)_01102010-01302010.txt". The file is created at a local location, and its complete path is returned as a string. In "01102010-01302010", each timestamp is written as month, day, year: "01102010" is 01 (month) 10 (day) 2010 (year). The interval "01102010-01302010" denotes the period during which the metadata for sensor(z) stayed the same, from "01102010" to "01302010", so each chunk holds data recorded under a single, unchanged set of metadata.
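
The chunking and naming steps can be sketched as follows. The sketch assumes the example "01102010-01302010" is read as month, day, year (the only reading under which "01302010" is a valid date), and it leaves open whether a change timestamp belongs to the chunk before or after it. Function names are hypothetical.

```python
from datetime import datetime


def chunk_file_name(sensor_name, start, end):
    """Name a chunk as <sensor_name>_<start>-<end>.txt using MMDDYYYY dates."""
    return f"{sensor_name}_{start:%m%d%Y}-{end:%m%d%Y}.txt"


def chunk_intervals(start, change_times, end):
    """Split [start, end] at each metadata change time into (begin, finish) pairs."""
    inner = [t for t in sorted(change_times) if start < t < end]
    bounds = [start] + inner + [end]
    return list(zip(bounds[:-1], bounds[1:]))
```

For example, chunk_file_name("site1_logger2_sensor3", datetime(2010, 1, 10), datetime(2010, 1, 30)) gives "site1_logger2_sensor3_01102010-01302010.txt".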
 
Bug 5221
Given a data set file path string and a SensorML string, create the corresponding EML string.
 
Bug 5222
Given a data set file path and the EML file, store the data and the EML file in Metacat. It appears that the existing EcoGridWriter can be used to achieve this functionality.
 
Bug 5223
Once the data set for a specific time interval is stored in Metacat, we need to record the last timestamp of that interval, so that the next read connection to the DataTurbine server knows from when to start retrieving data for the same sensor. This is achieved by storing the timestamp in the local MySQL database; the same information is read back the next time the data set needs to be chunked.

Design Document (here).

A sample MoML file describing a site layout with 2 sensors

 

<?xml version="1.0" standalone="no"?>
<!DOCTYPE entity PUBLIC "-//UC Berkeley//DTD MoML 1//EN"
    "http://ptolemy.eecs.berkeley.edu/xml/dtd/MoML_1.dtd">
<entity name="sml_(2)" class="ptolemy.actor.TypedCompositeActor">
    <property name="_createdBy" class="ptolemy.kernel.attributes.VersionAttribute" value="8.1.devel">
    </property>
    <property name="derivedFrom" class="org.kepler.moml.NamedObjIdReferralList">
    </property>
    <property name="entityId" class="org.kepler.moml.NamedObjId" value="urn:lsid:kepler-project.org/ns/:2082:9:31">
    </property>
    <property name="TOP Provenance Recorder" class="org.kepler.provenance.ProvenanceRecorder">
    </property>
    <property name="Reporting Listener" class="org.kepler.reporting.ReportingListener">
    </property>
    <property name="DE Director" class="ptolemy.domains.de.kernel.DEDirector">
        <property name="startTime" class="ptolemy.data.expr.Parameter" value="0.0">
        </property>
        <property name="stopTime" class="ptolemy.data.expr.Parameter" value="Infinity">
        </property>
        <property name="stopWhenQueueIsEmpty" class="ptolemy.data.expr.Parameter" value="false">
        </property>
        <property name="synchronizeToRealTime" class="ptolemy.data.expr.Parameter" value="false">
        </property>
        <property name="isCQAdaptive" class="ptolemy.data.expr.Parameter" value="true">
        </property>
        <property name="minBinCount" class="ptolemy.data.expr.Parameter" value="2">
        </property>
        <property name="binCountFactor" class="ptolemy.data.expr.Parameter" value="2">
        </property>
        <property name="timeResolution" class="ptolemy.actor.parameters.SharedParameter" value="1E-10">
        </property>
        <property name="_location" class="ptolemy.kernel.util.Location" value="{45.0, 40.0}">
        </property>
        <property name="_hide" class="ptolemy.kernel.util.StringAttribute">
        </property>
    </property>
    <property name="_windowProperties" class="ptolemy.actor.gui.WindowPropertiesAttribute" value="{bounds={175, 24, 1136, 878}, maximized=false}">
    </property>
    <property name="_vergilSize" class="ptolemy.actor.gui.SizeAttribute" value="[795, 703]">
    </property>
    <property name="_vergilZoomFactor" class="ptolemy.data.expr.ExpertParameter" value="1.0">
    </property>
    <property name="_vergilCenter" class="ptolemy.data.expr.ExpertParameter" value="{397.5, 351.5}">
    </property>
    <entity name="sensor1" class="org.kepler.sensor.actor.Sensor">
        <property name="isOn" class="ptolemy.actor.parameters.PortParameter" value="true">
        </property>
        <property name="samplingPeriod" class="ptolemy.actor.parameters.PortParameter" value="10.000000">
        </property>
        <property name="dataLogger" class="ptolemy.data.expr.StringParameter" value="CR800">
        </property>
        <property name="sensorServer" class="ptolemy.data.expr.StringParameter" value="localhost">
        </property>
        <property name="_location" class="ptolemy.kernel.util.Location" value="{315.0, 250.0}">
        </property>
        <property name="coefficients" class="ptolemy.data.expr.StringParameter" value="">
        </property>
        <property name="conversion-type" class="ptolemy.data.expr.StringParameter" value="Linear">
        </property>
        <property name="measurement-unit" class="ptolemy.data.expr.StringParameter" value="degrees Celsius">
        </property>
        <property name="sampleMethod" class="ptolemy.data.expr.StringParameter" value="average">
        </property>
        <property name="samples-per-measurement" class="ptolemy.data.expr.Parameter" value="10">
        </property>
        <property name="sensor-make" class="ptolemy.data.expr.StringParameter" value="Vaisala">
        </property>
        <property name="sensor-measurement" class="ptolemy.data.expr.StringParameter" value="Temperature">
        </property>
        <property name="sensor-model" class="ptolemy.data.expr.StringParameter" value="HMP45A">
        </property>
        <property name="serial-number" class="ptolemy.data.expr.StringParameter" value="B3310001">
        </property>
        <port name="data" class="ptolemy.actor.TypedIOPort">
            <property name="output"/>
            <property name="_cardinal" class="ptolemy.kernel.util.StringAttribute" value="NORTH">
            </property>
        </port>
        <port name="isOn" class="ptolemy.actor.parameters.ParameterPort">
            <property name="input"/>
        </port>
        <port name="samplingPeriod" class="ptolemy.actor.parameters.ParameterPort">
            <property name="input"/>
        </port>
    </entity>
    <entity name="sensor2" class="org.kepler.sensor.actor.Sensor">
        <property name="isOn" class="ptolemy.actor.parameters.PortParameter" value="true">
        </property>
        <property name="samplingPeriod" class="ptolemy.actor.parameters.PortParameter" value="10.000000">
        </property>
        <property name="dataLogger" class="ptolemy.data.expr.StringParameter" value="CR800">
        </property>
        <property name="sensorServer" class="ptolemy.data.expr.StringParameter" value="localhost">
        </property>
        <property name="_location" class="ptolemy.kernel.util.Location" value="{430.0, 250.0}">
        </property>
        <property name="coefficients" class="ptolemy.data.expr.StringParameter" value="">
        </property>
        <property name="conversion-type" class="ptolemy.data.expr.StringParameter" value="Linear">
        </property>
        <property name="measurement-unit" class="ptolemy.data.expr.StringParameter" value="degrees Celsius">
        </property>
        <property name="sampleMethod" class="ptolemy.data.expr.StringParameter" value="average">
        </property>
        <property name="samples-per-measurement" class="ptolemy.data.expr.Parameter" value="10">
        </property>
        <property name="sensor-make" class="ptolemy.data.expr.StringParameter" value="Vaisala">
        </property>
        <property name="sensor-measurement" class="ptolemy.data.expr.StringParameter" value="Temperature">
        </property>
        <property name="sensor-model" class="ptolemy.data.expr.StringParameter" value="HMP45A">
        </property>
        <property name="serial-number" class="ptolemy.data.expr.StringParameter" value="B3310002">
        </property>
        <port name="data" class="ptolemy.actor.TypedIOPort">
            <property name="output"/>
            <property name="_cardinal" class="ptolemy.kernel.util.StringAttribute" value="NORTH">
            </property>
        </port>
        <port name="isOn" class="ptolemy.actor.parameters.ParameterPort">
            <property name="input"/>
        </port>
        <port name="samplingPeriod" class="ptolemy.actor.parameters.ParameterPort">
            <property name="input"/>
        </port>
    </entity>
</entity>
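
As an illustration of how the sensor properties in such a MoML file could be read, the sketch below collects the direct properties of every entity whose class is org.kepler.sensor.actor.Sensor. It uses Python's standard xml.etree.ElementTree; the function name is hypothetical and this is not the Kepler implementation.

```python
import xml.etree.ElementTree as ET

SENSOR_CLASS = "org.kepler.sensor.actor.Sensor"


def sensors_from_moml(moml_text):
    """Map each sensor entity name to its {property name: value} pairs."""
    root = ET.fromstring(moml_text)
    sensors = {}
    for entity in root.iter("entity"):
        if entity.get("class") != SENSOR_CLASS:
            continue
        # Only direct child properties; properties nested in ports are skipped.
        sensors[entity.get("name")] = {
            prop.get("name"): prop.get("value", "")
            for prop in entity.findall("property")
        }
    return sensors
```

The result includes layout properties such as _location, which a caller may want to filter out before building the sensor metadata.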

 

The sensorML file for the above site layout information (.moml)

 

<?xml version="1.0" encoding="UTF-8"?>
<sml:SensorML
	xmlns:sml="http://www.opengis.net/sensorML/1.0.1" 
	xmlns:swe="http://www.opengis.net/swe/1.0.1" 
  	xmlns:gml="http://www.opengis.net/gml" 
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  	xmlns:xlink="http://www.w3.org/1999/xlink" 
 	xsi:schemaLocation="http://www.opengis.net/sensorML/1.0.1 http://schemas.opengis.net/sensorML/1.0.1/sensorML.xsd" version="1.0"> 
	<sml:member> 
		<sml:System> 
			<gml:name>moml_site(x)_2</gml:name>
			<sml:identification>
 				<sml:IdentifierList>
					<sml:identifier>
						<sml:Term definition="urn:ogc:def:identifier:OGC:uniqueID">
							<sml:value>moml_site(x)_2</sml:value>
						</sml:Term>
					</sml:identifier>
				</sml:IdentifierList>
			</sml:identification>
			<sml:timeStamp>
				<gml:timePosition>2010.10.11 AD at 10:50:16 PDT</gml:timePosition>
			</sml:timeStamp>
			<sml:components>
				<sml:ComponentList>
					<sml:component name="sensor1">
						<gml:description>sensor</gml:description>
						<sml:identification>
 							<sml:IdentifierList>
								<sml:identifier>
									<sml:Term definition="urn:ogc:def:identifier:OGC:serialNumber">
										<sml:value>B3310001</sml:value>
									</sml:Term>
								</sml:identifier>
							</sml:IdentifierList>
						</sml:identification>
						<sml:characteristics>
							<swe:DataRecord gml:id="sensorCharateristics">
								<swe:field name="isOn">
									<swe:Boolean>
										<swe:value>true</swe:value>
									</swe:Boolean>
								</swe:field>
								<swe:field name="samplingPeriod">
									<swe:Quantity>
										<swe:value>10.000000</swe:value>
									</swe:Quantity>
								</swe:field>
								<swe:field name="dataLogger">
									<swe:Text>
										<swe:value>CR800</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="sensorServer">
									<swe:Text>
										<swe:value>localhost</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="coefficients">
									<swe:Text>
										<swe:value></swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="conversion-type">
									<swe:Text>
										<swe:value>Linear</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="measurement-unit">
									<swe:Text>
										<swe:value>degrees Celsius</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="sampleMethod">
									<swe:Text>
										<swe:value>average</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="samples-per-measurement">
									<swe:Count>
										<swe:value>10</swe:value>
									</swe:Count>
								</swe:field>
								<swe:field name="sensor-make">
									<swe:Text>
										<swe:value>Vaisala</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="sensor-measurement">
									<swe:Text>
										<swe:value>Temperature</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="sensor-model">
									<swe:Text>
										<swe:value>HMP45A</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="serial-number">
									<swe:Text>
										<swe:value>B3310001</swe:value>
									</swe:Text>
								</swe:field>
							</swe:DataRecord>
						</sml:characteristics>
					</sml:component>
					<sml:component name="sensor2">
						<gml:description>sensor</gml:description>
						<sml:identification>
 							<sml:IdentifierList>
								<sml:identifier>
									<sml:Term definition="urn:ogc:def:identifier:OGC:serialNumber">
										<sml:value>B3310002</sml:value>
									</sml:Term>
								</sml:identifier>
							</sml:IdentifierList>
						</sml:identification>
						<sml:characteristics>
							<swe:DataRecord gml:id="sensorCharateristics">
								<swe:field name="isOn">
									<swe:Boolean>
										<swe:value>true</swe:value>
									</swe:Boolean>
								</swe:field>
								<swe:field name="samplingPeriod">
									<swe:Quantity>
										<swe:value>10.000000</swe:value>
									</swe:Quantity>
								</swe:field>
								<swe:field name="dataLogger">
									<swe:Text>
										<swe:value>CR800</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="sensorServer">
									<swe:Text>
										<swe:value>localhost</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="coefficients">
									<swe:Text>
										<swe:value></swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="conversion-type">
									<swe:Text>
										<swe:value>Linear</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="measurement-unit">
									<swe:Text>
										<swe:value>degrees Celsius</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="sampleMethod">
									<swe:Text>
										<swe:value>average</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="samples-per-measurement">
									<swe:Count>
										<swe:value>10</swe:value>
									</swe:Count>
								</swe:field>
								<swe:field name="sensor-make">
									<swe:Text>
										<swe:value>Vaisala</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="sensor-measurement">
									<swe:Text>
										<swe:value>Temperature</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="sensor-model">
									<swe:Text>
										<swe:value>HMP45A</swe:value>
									</swe:Text>
								</swe:field>
								<swe:field name="serial-number">
									<swe:Text>
										<swe:value>B3310002</swe:value>
									</swe:Text>
								</swe:field>
							</swe:DataRecord>
						</sml:characteristics>
					</sml:component>
				</sml:ComponentList>
			</sml:components>
		</sml:System>
	</sml:member>
</sml:SensorML>

 

A sample dataset produced by a sensor between two timestamps (2010-10-20 17:00:00.00Z and 2010-10-20 17:10:00.00Z)

temp_c	date	time
1.691	2010-10-20	17:00:00.00Z
2.367	2010-10-20	17:02:00.00Z
3.456	2010-10-20	17:04:00.00Z
4.898	2010-10-20	17:06:00.00Z
2.123	2010-10-20	17:08:00.00Z
4.246	2010-10-20	17:10:00.00Z
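
A data set in this layout can be read back with a few lines of code, for instance to derive the begin and end of the temporal coverage used in the EML. The sketch below is an assumption-laden illustration: it splits on any whitespace, expects a one-line header, and assumes rows are in time order; the function names are hypothetical.

```python
def read_sensor_rows(text):
    """Parse whitespace-delimited rows with a one-line header into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def coverage(rows):
    """Return ((begin_date, begin_time), (end_date, end_time)) for ordered rows."""
    first, last = rows[0], rows[-1]
    return (first["date"], first["time"]), (last["date"], last["time"])
```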

 

The sample EML for a data set produced by a sensor between timestamps ts1 and ts2

<?xml version="1.0" encoding="UTF-8"?>

<!-- how to get packageId -->

<eml:eml 
	packageId="reap.44.1" scope="system" system="knb" 
	xmlns:eml="eml://ecoinformatics.org/eml-2.0.1"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="eml://ecoinformatics.org/eml-2.0.1 eml.xsd">
  
  <dataset>
      
	<title>Dataset for sensor:"sensorName" at site:"siteName" for time period "x" and "y" </title>
    
	<creator>
      <organizationName>REAP</organizationName>
      <onlineUrl>http://reap.ecoinformatics.org/</onlineUrl>
    </creator>
	
	<contact>
      <positionName>REAP Data Manager</positionName>
      <organizationName>NCEAS</organizationName>
      <electronicMailAddress>dataManager@reap.ecoinformatics.org</electronicMailAddress>
      <onlineUrl>http://reap.ecoinformatics.org/</onlineUrl>
    </contact>	
        
	<abstract>
      <para>This metadata record describes the dataset for sensor "sensorName" located at site "siteName"
	  for the time period between "x" and "y".</para>
	</abstract>
    
	<keywordSet>
      <keyword>NCEAS</keyword>
      <keyword>REAP</keyword>
    </keywordSet>
	
	<coverage>
	  <temporalCoverage>
        <rangeOfDates>
          <beginDate>
            <calendarDate>2010-10-20</calendarDate>
            <time>17:00:00.00Z</time>
          </beginDate>
          <endDate>
            <calendarDate>2010-10-20</calendarDate>
            <time>17:10:00.00Z</time>
          </endDate>
        </rangeOfDates>
      </temporalCoverage>
	</coverage>
    
        
	<dataTable>
      <entityName>siteName_dataLogger_sensorName_20101020_170000-20101020_171000.txt</entityName>
	  <entityDescription>Dataset for sensor:"sensorName" at site:"siteName" for the time period between "x" and "y"</entityDescription>
	  <physical>
        <objectName>siteName_dataLogger_sensorName_20101020_170000-20101020_171000</objectName>
        <size unit="bytes">1404825</size>
        <characterEncoding>ASCII</characterEncoding>
        <dataFormat>
          <textFormat>
            <numHeaderLines>1</numHeaderLines>
            <recordDelimiter>#x0A</recordDelimiter>
            <attributeOrientation>column</attributeOrientation>
            <simpleDelimited>
              <fieldDelimiter>#x20</fieldDelimiter>
            </simpleDelimited>
          </textFormat>
        </dataFormat>
		
        <distribution id="siteName_dataLogger_sensorName">
          <online>
            <url function="download">ecogrid://knb/reap.40.1</url>
          </online>
        </distribution>
      </physical>
      
	  <attributeList>
	  
		<attribute>
          <attributeName>temp_c</attributeName>
          <attributeDefinition>atmospheric temperature</attributeDefinition>
          <storageType typeSystem="http://www.w3.org/2001/XMLSchema-datatypes">float</storageType>
          <measurementScale>
            <interval>
              <unit>
                <standardUnit>celsius</standardUnit>
              </unit>
            </interval>
          </measurementScale>
        </attribute>
	  
        <attribute>
          <attributeName>date</attributeName>
          <attributeDefinition>calendar date of each temperature measurement record</attributeDefinition>
          <storageType typeSystem="http://www.w3.org/2001/XMLSchema-datatypes">date</storageType>
          <measurementScale>
            <datetime>
              <formatString>YYYY-MM-DD</formatString>
              <dateTimePrecision>1 day</dateTimePrecision>
              <dateTimeDomain/>
            </datetime>
          </measurementScale>
        </attribute>
        
		<attribute>
          <attributeName>time</attributeName>
          <attributeDefinition>Greenwich Mean Time of each temperature measurement record</attributeDefinition>
          <storageType typeSystem="http://www.w3.org/2001/XMLSchema-datatypes">time</storageType>
          <measurementScale>
            <datetime>
              <formatString>hh:mm:ss.ssZ</formatString>
              <dateTimePrecision>1 second</dateTimePrecision>
              <dateTimeDomain/>
            </datetime>
          </measurementScale>
        </attribute>
		
		</attributeList>
    
		<numberOfRecords>5</numberOfRecords>
	
	</dataTable>
	
     <additionalMetadata>
        <describes>siteName_dataLogger_sensorName</describes>
		<metadata>
			<sml>
				<description> Include the contents of file "sml_site(x)_logger(y)_sensor(z)_ts1-ts2.xml" as a string </description>
			</sml>
		</metadata>
	</additionalMetadata>	
	
  </dataset>
  
</eml:eml>

A sample SensorML file for a sensor whose metadata was the same between timestamps ts1 and ts2

<?xml version="1.0" encoding="UTF-8"?>
<sml:SensorML 
   xmlns:sml="http://www.opengis.net/sensorML/1.0.1"
   xmlns:swe="http://www.opengis.net/swe/1.0.1"
   xmlns:gml="http://www.opengis.net/gml"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns:xlink="http://www.w3.org/1999/xlink"
   xsi:schemaLocation="http://www.opengis.net/sensorML/1.0.1 http://schemas.opengis.net/sensorML/1.0.1/sensorML.xsd" version="1.0">
   
   <sml:member>	  
	  <sml:System gml:id="siteName">
		 
         <sml:identification>
            <sml:IdentifierList>
				<sml:identifier name="siteID">
                  <sml:Term definition="urn:ogc:def:identifier:OGC:uniqueID">
                     <sml:value>urn:ogc:object:system:REAP:siteName</sml:value>
                  </sml:Term>
               </sml:identifier>
			</sml:IdentifierList>
         </sml:identification>
		 
		<sml:components>
			<sml:ComponentList>					
				<sml:component gml:id="B3310001">
					
					<sml:characteristics>
						<swe:DataRecord gml:id="sensorCharateristics">	

							<swe:field name="isOn">
								<swe:value>true</swe:value>
							</swe:field>
														
							<swe:field name="samplingPeriod">
								<swe:value>10.000000</swe:value>
							</swe:field>

							<swe:field name="dataLogger">
								<swe:value>CR800</swe:value>
							</swe:field>														
																																																																									
							<swe:field name="sensorServer">
								<swe:value>localhost</swe:value>
							</swe:field>
											
							<swe:field name="location">
								<swe:value>[330.0, 150.0]</swe:value>
							</swe:field>

							<swe:field name="coefficients">
								<swe:value></swe:value>
							</swe:field>

							<swe:field name="conversion-type">
								<swe:value>Linear</swe:value>
							</swe:field>

							<swe:field name="measurement-unit">
								<swe:value>degrees Celsius</swe:value>
							</swe:field>

							<swe:field name="sampleMethod">
								<swe:value>Average</swe:value>
							</swe:field>

							<swe:field name="samples-per-measurement">
								<swe:value>10</swe:value>
							</swe:field>

							<swe:field name="sensor-make">
								<swe:value>Vaisala</swe:value>
							</swe:field>

							<swe:field name="sensor-measurement">
								<swe:value>Temperature</swe:value>
							</swe:field>

							<swe:field name="sensor-model">
								<swe:value>HMP45A</swe:value>
							</swe:field>

							<swe:field name="serial-number">
								<swe:value>B3310001</swe:value>
							</swe:field>

						  </swe:DataRecord>
					  </sml:characteristics>
					  	
				  </sml:component>					
			  </sml:ComponentList>
		  </sml:components>
	  </sml:System>
   </sml:member>
</sml:SensorML>