AgGateway Post-Image Collection Specification (PICS)

This page houses the deliverable (an implementation guideline) for AgGateway's PICS project.

This guideline follows the introduction made in the flyer:


1.      Overview

This document contains guidelines for establishing a common approach to sharing image data used in agriculture. By creating these guidelines, AgGateway seeks to optimize the quality of information obtained from those images. Such an approach alleviates several issues that obstruct image usability, by providing:

  • Transfer of complete, intact image information
  • Descriptive metadata to further describe the meaning of each image
  • Time and date stamping of images for accurate reference

1.1       AgGateway

AgGateway is a non-profit consortium of businesses serving the agriculture industry. Together, its members focus on promoting, enabling, and expanding digital agriculture. As a long-term vision, AgGateway endeavors to establish itself as a trusted leader in enabling digital agriculture. To do this, its member businesses enable electronic connectivity as a means to:

  • Improve business processes
  • Help deliver excellent customer service
  • Streamline the supply chain
  • Support greater productivity and sustainable agricultural practices.

Through this collaboration, AgGateway not only designs effective industry-wide solutions, but implements those solutions in the field, ensuring that digital agriculture becomes a working reality.

1.2       PICS

AgGateway designed the Post Image Collection Specification (PICS) to support agricultural remote sensing. PICS uses the GeoTIFF graphics format, including a set of specific metadata tags that convey what an image means. Use of PICS requires access to this free PICS Implementation Guideline.

2.      Problem Definition

Aerial and satellite images have been used for decades to solve cadastral problems and for other applications of photogrammetry. As a result, current technology for representing the position of an image in space is very sophisticated and accurate. However, the industry’s ability to represent the meaning of the image has not developed accordingly: the complex, multispectral nature of today’s electronic images complicates their usability, to the extent that a grower’s farm management information system (FMIS) typically can’t automatically recognize and “comprehend” the details of a particular image. The user must then make assumptions and manually enter metadata into image processing systems. Image use becomes slow and frustrating, rather than useful.

3.      Solution Approach

3.1       Identify Gaps

To start, the team noted the gaps that currently exist in remote sensing image data. Those gaps were identified as the specific interoperability pain points that make images less useful. For the initial solution, these gaps included:

  • Where an image was taken
  • When an image was taken
  • What bands are included in the image
  • The order of the bands in the image
  • Whether the writing software is PICS-aware

Future refinements of this solution may include:

  • Adding to the solution’s ability to represent the meaning of a product by creating a registry of derivative products, such as NDVI
  • Expressing mathematical conversions for characteristics such as digital values, reflectance, or irradiance so users can unambiguously convert numbers into physical quantities where possible.

3.2       Apply Standards

The second step taken by the team was to agree on the steps necessary to resolve each of the identified pain points, in a way that incorporated as many existing standards as possible.

3.3       Refine Focus

From here, the team narrowed the scope of their plan, focusing on a single, popular image format (GeoTIFF) to simplify the solution.

3.4       Document Guidelines

Finally, the team established implementation guidelines that would enable users to easily apply the PICS strategy. This document serves as that guide, resulting in the main deliverable for the PICS project.

4.      Solution Description

4.1       Why GeoTIFF

GeoTIFF, a popular, royalty-free format, provides sophisticated tagging that simplifies communication of image data. The underlying TIFF format supports lossless compression, which is ideal for preserving the radiometric details of multispectral imagery. As shown in Figure 1, the GeoTIFF specification defines metadata tags that can be added to image files.

Figure 1: Encoding of image information within a GeoTIFF file.

4.2       Tags, and the Pain Points they Target

The tags created by GeoTIFF, when implemented according to this guide, resolve a series of specific pain points in agricultural image processing. Table 1 shows these pain points and their solutions through the use of tags.

Table 1: Tags used by PICS, and the pain points they target.

| Pain point | Why it hurts | The PICS solution |
| --- | --- | --- |
| Band order | Identifying individual bands in multi-channel images often requires a sidecar file or arbitrary naming conventions. This can lead to confusion, especially if the sidecar file is lost. | Implement the standard tag Xmp.Camera.BandName |
| Band Definition | An accurate description of a band’s width and central wavelength enables knowing if it can be used to make specific indexes and other products. | Implement two standard tags: Xmp.Camera.CentralWavelength, Xmp.Camera.WavelengthFWHM |
| Acquisition Time & Duration | Knowing when an image was captured helps sort files for proper analysis, and flag / filter unwanted data. This is important with long UAV acquisition times. | Implement three standard tags: Exif.GPSInfo.GPSDateStamp, Exif.GPSInfo.GPSTimeStamp, Xmp.Camera.AcquisitionDuration |
| Projection Information (Geolocation) | It’s critical to locate individual image pixels and support multiple coordinate systems and projections. | PICS chose the GeoTIFF image format because it natively enables the needed functionality. |
| PICS compliance & version | Can the FMIS trust that the tags are used in a PICS-compliant way? | A private AgGateway tag (with tag number 65265) is added to the image, formatted as a PAIL Observations document with a ContextItem. |


5.      Implementing PICS


Publishing PICS-compliant imagery requires specific information to properly tag an image. It also requires the ability to write those tags to the image products. Consuming PICS-compliant imagery requires knowledge of these specific tags.  It also requires the ability to extract the necessary information to effectively and efficiently utilize the image. 

5.1       Publishing Data

5.1.1    Requirements

For proper tagging, data must be written in GeoTIFF format version 1.0 or newer. Specific EXIF, XMP, and GeoTIFF tags must be written. Several open source libraries and tools for doing this are available on the web; links are provided in Appendix A.

Regarding band values, black is represented by a zero.

Successful tagging of an image requires collection of the following critical pieces of information:

  • Central wavelength of each band
  • Order of the bands (if multiple layers)
  • General name of the band
  • Start and end times that express when an image was acquired

All tags have specific formats and locations within the GeoTIFF. PICS uses GeoTIFF, XMP and EXIF tags to ensure full expression of an image.

5.1.2 Tagging the image

Table 2 lists the required tags, along with the tag type, key, and data type of each. The notes provide guidance on what each tag means. For data examples, refer to Appendix B.

Table 2: Specifics about PICS tag definitions

| Item | Tag Type | Key | Type | Notes |
| --- | --- | --- | --- | --- |
| Band Name | XMP | Xmp.Camera.BandName | XmpSeq | Name of each band (sequence). Single page image: one name for each band of the image. Multipage image: one band name per image page; each page should have only one band. |
| Band Center Wavelength | XMP | Xmp.Camera.CentralWavelength | XmpSeq | Central wavelength of each band in nanometers (nm). |
| Band Width | XMP | Xmp.Camera.WavelengthFWHM | XmpSeq | Full width at half maximum of the wavelength distribution in nanometers (nm). |
| Acquisition Date | EXIF | Exif.GPSInfo.GPSDateStamp | ASCII | Mean GPS date as year, month, day (UTC) in the format “YYYY:MM:DD”. |
| Acquisition Time | EXIF | Exif.GPSInfo.GPSTimeStamp | List of Rational | Mean time of capture: GPS time as hour, minute, second (UTC), with sub-second accuracy. |
| Acquisition Duration | XMP | Xmp.Camera.AcquisitionDuration | XmpText | Time taken to acquire all images in the dataset, measured in seconds. |
| Versioning | GeoTIFF | 65265 | ASCII | Records the PICS version using the PAIL schema. |

5.1.3 Date & Time

When tagging imagery for date and time, it is critical to agree on a standard time reference. PICS uses GPSDateStamp and GPSTimeStamp, which are based on Coordinated Universal Time (UTC), with an offset of +0000. Using these tags eliminates the confusion caused by the more ambiguous DateTimeStamp tag, which has no time zone reference. The GPS time stamp provides clarity because it is always expressed in UTC. The time source does not have to be a GPS receiver: a satellite image, for example, might not use official GPS time, but its internal clock (likely in UTC) could be used for this tag.
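As a minimal sketch of this convention, the following Python converts a timezone-aware timestamp to UTC and produces the two values described in Table 2: a "YYYY:MM:DD" date string and three rationals for hour, minute, and second. The function name gps_stamps, the sample capture time, and the "n/d" text rendering are illustrative assumptions, not part of PICS; the form in which a particular tagging tool expects rationals may differ.

```python
from datetime import datetime, timedelta, timezone
from fractions import Fraction


def gps_stamps(moment: datetime):
    """Return (GPSDateStamp, GPSTimeStamp) values for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("a naive datetime has no time zone reference")
    utc = moment.astimezone(timezone.utc)  # normalise any offset to +0000
    date_stamp = utc.strftime("%Y:%m:%d")
    # Keep sub-second precision by expressing the seconds as an exact rational.
    seconds = Fraction(utc.second) + Fraction(utc.microsecond, 1_000_000)
    return date_stamp, (Fraction(utc.hour), Fraction(utc.minute), seconds)


# Hypothetical capture time: 15:31:19.25 local time in a UTC-5 zone.
local = datetime(2015, 6, 23, 15, 31, 19, 250_000,
                 tzinfo=timezone(timedelta(hours=-5)))
date_stamp, time_stamp = gps_stamps(local)
rendered = " ".join(f"{f.numerator}/{f.denominator}" for f in time_stamp)
print(date_stamp, rendered)  # 2015:06:23 20/1 31/1 77/4
```

Rejecting naive datetimes is a deliberate choice in this sketch: a timestamp without a time zone reference is exactly the ambiguity that the GPS tags are meant to avoid.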

5.1.4 When not to tag

Tagging an image with specific information such as band width implies that the user knows with some level of certainty that the image represents what the metadata says. In some cases, it may be difficult to obtain sensor specifications for the center wavelength or a band's width. If so, especially for traditional RGB broadband sensors, it is permissible to leave these tags out of the metadata; the band name can still be used. Omitting information defined in the specification is not encouraged, especially for single-band imagery and any modified multiband sensors. However, it is better not to list these properties if you are unsure of them or cannot verify or validate them.
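The rule can be sketched as a small helper that assembles the band-related tags and simply leaves out any property that is unknown. The function build_pics_tags and the use of Python lists to stand for XmpSeq values are assumptions made for illustration; the keys come from Table 2 and the sample values from Table 5 in Appendix B.

```python
def build_pics_tags(band_names, central_nm=None, fwhm_nm=None):
    """Map Table 2 keys to per-band values, omitting anything unknown."""
    tags = {"Xmp.Camera.BandName": list(band_names)}
    optional = (
        ("Xmp.Camera.CentralWavelength", central_nm),
        ("Xmp.Camera.WavelengthFWHM", fwhm_nm),
    )
    for key, values in optional:
        if values is None:
            continue  # unknown or unverified: do not write the tag at all
        if len(values) != len(band_names):
            raise ValueError(f"{key} needs exactly one value per band")
        tags[key] = [str(v) for v in values]
    return tags


# RGB dataset from Table 5: band widths are not defined, so no FWHM tag.
rgb = build_pics_tags(["Red", "Green", "Blue"], central_nm=[660, 550, 470])
print(sorted(rgb))
```

Omitting the key, rather than writing an empty or placeholder value, keeps the absence itself meaningful to a consumer.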

5.1.5    Limitations

Some limitations affect image tagging. Among these: TIFF (and thus GeoTIFF) is limited to a file size of 4 GB. This occurs because the TIFF file format uses 32-bit byte offsets, so the largest range that can be addressed is 2^32 bytes = 4 GB. This constraint can be remedied, however, by use of either BigTIFF or tiling of large images so that each TIFF file stays below 4 GB.

BigTIFF files have a ".tif" or ".tiff" file extension, just like ordinary TIFF files, and are similar to them in many ways, which eases compatibility. The primary difference is that BigTIFF uses 64-bit byte offsets instead of 32-bit offsets. The primary library used for BigTIFF is libtiff, which is referenced in Appendix A.
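Because the file extension does not distinguish the two variants, software has to look at the file header. The sketch below, with the hypothetical helpers tiff_flavour and needs_bigtiff, relies on the version number stored after the byte-order mark (42 for classic TIFF, 43 for BigTIFF) and on the 2^32-byte limit described above.

```python
import struct

CLASSIC_TIFF_LIMIT = 2 ** 32  # bytes addressable with 32-bit offsets (4 GB)


def tiff_flavour(header: bytes) -> str:
    """Classify a file from its first four bytes."""
    order = {b"II": "<", b"MM": ">"}.get(header[:2])  # little or big endian
    if order is None:
        raise ValueError("not a TIFF byte-order mark")
    (version,) = struct.unpack(order + "H", header[2:4])
    if version == 42:
        return "TIFF"
    if version == 43:
        return "BigTIFF"
    raise ValueError(f"unexpected TIFF version number {version}")


def needs_bigtiff(size_bytes: int) -> bool:
    """True if a product of this size cannot fit in a classic TIFF file."""
    return size_bytes > CLASSIC_TIFF_LIMIT


print(tiff_flavour(b"II*\x00"), needs_bigtiff(5 * 1024 ** 3))
```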

Tiling allows display of images that are too large to be read entirely into memory. For example, satellite images can exceed a gigabyte of data, which can make them impossible to display as a single unit on a typical business computer. Such an image can be displayed by segmenting it into smaller images, or tiles. Each tile can then be kept below the 4 GB limit. More information on this subject can be found in section 15 of the TIFF specification, referenced in Appendix A.

5.2       Consuming Data 

5.2.1 Requirements

Consuming imagery that is PICS v1.0 compliant requires knowledge of its tags and the ability to read them. Several open source libraries and tools are available on the world wide web and links are provided in Appendix A.

5.2.2    What Does It Mean When There are Missing Values?

In some cases, one of the values associated with an image may be unknown, in which case the corresponding tag won’t be added. For example, the dataset presented in Table 5 of Appendix B has no band width (nm) value. In that case, the omission of the tag indicates that no data was available or verifiable for the referenced attribute. The absence of tags is contextualized by the PICS tag.

5.2.3    What Does the PICS Tag Mean?

The PICS 1.0 tag indicates that an image is PICS-compliant for the indicated version number.

The meaning of PICS compliance is related to the previous point about missing values. The Yale Book of Quotations claims that Martin Rees, an English astronomer, said that “Absence of evidence is not the same as evidence of absence.” A common problem with digital imagery arises when a desired attribute (e.g., the acquisition time for an image) is not included in the image metadata: the user can’t discern whether the value is missing because it is unknown or because the writer failed to include it.

The presence of a PICS tag means that the image writer is aware of the tagging requirements shown in Table 1 and will comply with them if the data is available. A missing tag, in this case, indicates that the value is unknown; otherwise it would be listed.
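A consumer might encode this reasoning as a three-way outcome. In the sketch below, the dictionary of already-extracted tags, the Status enumeration, and the interpret function are hypothetical; only the tag keys and the tag number 65265 come from this guideline.

```python
from enum import Enum


class Status(Enum):
    KNOWN = "value supplied by the writer"
    UNKNOWN = "writer is PICS-aware; the value was not available"
    INDETERMINATE = "no PICS tag; the reason for the absence cannot be told"


def interpret(tags: dict, key: str, pics_tag: int = 65265) -> Status:
    """Classify one attribute of an image from the tags that were found."""
    if key in tags:
        return Status.KNOWN
    # Only the PICS tag turns a missing value into a statement of "unknown".
    return Status.UNKNOWN if pics_tag in tags else Status.INDETERMINATE


pics_image = {"Xmp.Camera.BandName": ["Red", "Green", "Blue"], 65265: "<Observations/>"}
plain_image = {"Xmp.Camera.BandName": ["Red", "Green", "Blue"]}
print(interpret(pics_image, "Xmp.Camera.WavelengthFWHM").name)
print(interpret(plain_image, "Xmp.Camera.WavelengthFWHM").name)
```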

5.3 Implementing the PICS Compliance and Version Tag

Figure 2: Byte Structure of a TIFF File. The file begins with a header, followed by one or more image file directories (IFDs), each one of which has one or more directory entries.


The text below shows how to encode a PICS compliance and version tag as a string within a GeoTIFF file.

<?xml version="1.0" encoding="UTF-8"?>

<!--Sample XML instance file showing how to encode a PICS version as a ContextItem using the PAIL schema-->

<Observations xmlns="http://aggateway.org/PAIL/0.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://aggateway.org/PAIL/0.1 PAIL_Obs_020.xsd">

    <ContextItem>

        <Code>https://api.contextitem.org/def/PICS-V</Code>

        <Value>1.0</Value>

    </ContextItem>

</Observations>


There are two ways to insert a tag like this:

  • Add it into any existing IFD (image file directory) or
  • Add an ad-hoc IFD at the end of the file.

The latter is preferable due to its simplicity. It only involves appending content (i.e., the new IFD and the string) to the end of the file and changing the next-IFD pointer of the previously last IFD from 0 to the original file length (i.e., the offset where the appended material begins). Nothing has to be inserted mid-file, so no existing offsets need to be recalculated.
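The append approach can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not a reference implementation: it handles classic (32-bit) TIFF only, the function name append_pics_ifd is hypothetical, and it does not address how readers treat an IFD that carries no image data. The layout it relies on (2-byte entry count, 12-byte entries, 4-byte next-IFD pointer) is the one defined in TIFF 6.0.

```python
import struct

PICS_TAG = 65265  # private AgGateway tag number from this guideline
TIFF_ASCII = 2    # TIFF 6.0 field type: NUL-terminated ASCII


def append_pics_ifd(tiff: bytes, payload: str) -> bytes:
    """Return a copy of a classic TIFF with one extra IFD holding the PICS string."""
    order = {b"II": "<", b"MM": ">"}[tiff[:2]]  # byte-order mark
    version, ifd = struct.unpack(order + "HI", tiff[2:8])
    if version != 42:
        raise ValueError("not a classic TIFF file")
    # Follow the chain of IFDs until the next-IFD pointer is 0.
    while True:
        (count,) = struct.unpack_from(order + "H", tiff, ifd)
        pointer_pos = ifd + 2 + 12 * count
        (next_ifd,) = struct.unpack_from(order + "I", tiff, pointer_pos)
        if next_ifd == 0:
            break
        ifd = next_ifd
    data = payload.encode("ascii") + b"\x00"
    if len(data) <= 4:
        raise ValueError("short values are stored inline; not handled in this sketch")
    out = bytearray(tiff)
    if len(out) % 2:
        out.append(0)  # an IFD must begin on a word boundary
    new_ifd = len(out)  # equals the original length when that length is even
    data_offset = new_ifd + 2 + 12 + 4  # entry count + one entry + next pointer
    out += struct.pack(order + "H", 1)
    out += struct.pack(order + "HHII", PICS_TAG, TIFF_ASCII, len(data), data_offset)
    out += struct.pack(order + "I", 0)  # the new IFD is now the last one
    out += data
    struct.pack_into(order + "I", out, pointer_pos, new_ifd)  # patch the old pointer
    return bytes(out)
```

In practice, the payload would be the PAIL Observations document shown above, and the result would be written to a new file rather than over the original.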

A TIFF field is a logical entity within a TIFF file: a key-value pair formed by the TIFF tag and its value.

An IFD entry is a TIFF field, plus the space occupied by its value if it extends beyond the 4 bytes allocated to the value.
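The distinction between a value held inside the entry and a value stored elsewhere matters when reading a tag back. The following sketch walks every IFD of a classic TIFF and returns the string of an ASCII field; the function name read_ascii_tag is hypothetical, and BigTIFF and malformed files (for example, a cyclic IFD chain) are not handled.

```python
import struct


def read_ascii_tag(tiff: bytes, wanted: int):
    """Return the string value of an ASCII field, or None if the tag is absent."""
    order = {b"II": "<", b"MM": ">"}[tiff[:2]]
    version, ifd = struct.unpack(order + "HI", tiff[2:8])
    if version != 42:
        raise ValueError("not a classic TIFF file")
    while ifd:  # a next-IFD pointer of 0 ends the chain
        (count,) = struct.unpack_from(order + "H", tiff, ifd)
        for i in range(count):
            pos = ifd + 2 + 12 * i
            tag, ftype, n, value = struct.unpack_from(order + "HHII", tiff, pos)
            if tag == wanted and ftype == 2:  # type 2 is ASCII
                # Up to 4 bytes fit in the entry itself; longer values live
                # at the offset stored in the last 4 bytes of the entry.
                start = pos + 8 if n <= 4 else value
                return tiff[start:start + n].rstrip(b"\x00").decode("ascii")
        (ifd,) = struct.unpack_from(order + "I", tiff, ifd + 2 + 12 * count)
    return None
```

For a PICS-aware consumer, the call would be read_ascii_tag(data, 65265), with a result of None meaning that the writer did not declare PICS compliance.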

Notes:

As per the TIFF 6.0 specification, the entries in an IFD must be sorted in ascending order by tag.
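Once the string stored in tag 65265 has been read, the version can be extracted from the PAIL document. The sketch below uses Python's standard xml.etree.ElementTree module; the namespace and the ContextItem code are taken from the sample in this section, while the function name pics_version and the compact SAMPLE string are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

PAIL_NS = {"p": "http://aggateway.org/PAIL/0.1"}
PICS_CODE = "https://api.contextitem.org/def/PICS-V"


def pics_version(xml_text: str):
    """Return the PICS version string, or None if no matching ContextItem exists."""
    # A value read from a TIFF ASCII field may still carry its trailing NUL.
    root = ET.fromstring(xml_text.rstrip("\x00").encode("utf-8"))
    for item in root.findall("p:ContextItem", PAIL_NS):
        if item.findtext("p:Code", namespaces=PAIL_NS) == PICS_CODE:
            return item.findtext("p:Value", namespaces=PAIL_NS)
    return None


SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Observations xmlns="http://aggateway.org/PAIL/0.1">'
    "<ContextItem>"
    "<Code>https://api.contextitem.org/def/PICS-V</Code>"
    "<Value>1.0</Value>"
    "</ContextItem>"
    "</Observations>"
)
print(pics_version(SAMPLE))  # 1.0
```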



6. Conclusion

The main goal of this project was to establish an image format that simplifies the transmission of information about an image’s projection, band order, band width, band name, and acquisition time and duration. The method outlined in this document uses tags that are already available and requires only one additional tag to implement versioning. The specification requires no sidecar files or special tags that break with any existing format. In doing so, PICS accomplishes the goals initially set by the Remote Sensing Working Group and the AgGateway organization. We put forth this specification, developed as a group effort of many companies, as a solution for the transmission of agricultural imagery.

6.1 Acknowledgements:

Many parties contributed to this specification: the initial Remote Sensing Working Group; the AgGateway organization and Precision Ag Council, which supported the work; the technical staff from many companies who helped formulate the tags; and those who helped draft the specification in written form.

The authors would like to thank the following companies for their contribution of talent and resources: Ag Connections, Ag Leader, Agrian, Agritrend, BASF, Bayer, DuPont Pioneer, Entira, Farmers Mutual Hail, John Deere, Land O’ Lakes, Pix4D, Planet Labs, senseFly, Syngenta, SSI, Wilbur-Ellis.

Appendix A: References


This appendix provides different references and popular libraries used by companies today to create, maintain and manipulate agriculture imagery and associated metadata.


Specifications:

GeoTIFF 1.0:
http://web.archive.org/web/20160814115308/http://www.remotesensing.org:80/geotiff/spec/geotiffhome.html
http://www.loc.gov/preservation/digital/formats/fdd/fdd000279.shtml

Adobe’s Tagged Image File Format 6.0:
http://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf

Adobe’s Extensible Metadata Platform (XMP):
https://www.adobe.com/devnet/xmp.html

Exchangeable Image File Format (EXIF) Tags:
http://www.cipa.jp/std/documents/e/DC-008-2012_E.pdf

BigTIFF:
http://bigtiff.org/
http://www.loc.gov/preservation/digital/formats/fdd/fdd000328.shtml

Popular libraries and tools:

Geospatial Data Abstraction Library (GDAL) – GeoTIFF metadata library and tools, written in C, C++
http://www.gdal.org/

Exiv2 – EXIF, XMP, IPTC metadata library and tools, written in C++
http://www.exiv2.org/
http://www.exiv2.org/tags.html

LibTiff – BigTIFF metadata library and tools, written in C
http://libtiff.org/
https://bitmiracle.com/libtiff/ - .NET wrapper for LibTiff

Appendix B: PICS User Story

The following example illustrates how users/companies can implement PICS to address their needs in publishing and consuming image data within agriculture.

Farm Photography: An Overhead View

A grower wants a specific analysis on one of his/her fields using UAV image data, but doesn’t know what sensor to use, when to fly, or how to perform the analysis. The grower contracts an analytics company to help. The company uses historical data, merged with weather, planting, soil fertility, and historical and current satellite data to help the client schedule the flight and select what data to collect and send. Once deployed, the UAV gathers specific bands of information at the proper time. The grower makes the image data available to the analytics company, which checks it for quality, processes it, and uses it in the specific analysis, and produces a recommendation, which the grower can use to make key decisions that help improve crop yield and operation efficiency.

Collecting Raw Data

In this example, data was gathered over two years using a senseFly eBee fixed-wing UAV with different sensors (senseFly ThermoMAP, Canon S110NIR modified camera). Additional data was gathered and provided upon request of the analytics company (Parrot Sequoia, senseFly SODA). Image thumbnails are shown in Figures 3 and 4, and the relevant image metadata is shown in Tables 3 to 6 below, with values for the fields described in Table 2:

Table 3 - Canon S110 - Modified NIR Data

Band names: Red, Green, NIR
Band width (nm): 90, 70, 100
Band center wavelength (nm): 625, 560, 850
Acquisition Date (UTC): 2015:06:26
Acquisition Time (UTC): 18:38:52
Acquisition Duration (s): 1540


Table 4 - senseFly ThermoMAP - Thermal Data

Band names: Thermal IR
Band width (nm): 3000
Band center wavelength (nm): 10000.5
Acquisition Date (UTC): 2015:06:23
Acquisition Time (UTC): 20:31:19
Acquisition Duration (s): 1328


Table 5 - senseFly S.O.D.A. - RGB Data

Band names: Red, Green, Blue
Band center wavelength (nm): 660, 550, 470
Acquisition Date (UTC): 2017:06:05
Acquisition Time (UTC): 18:55:34
Acquisition Duration (s): 3426

* Note there are no band widths defined


Figure 3: Thumbnails of the NIR, thermal, and S.O.D.A. images collected for a field in the example.


Table 6 - Parrot Sequoia - Multispectral Data

Band names: Green, Red, Red Edge, NIR
Band width (nm): 40, 40, 10, 40
Band center wavelength (nm): 550, 660, 735, 790
Acquisition Date (UTC): 2017:07:07
Acquisition Time (UTC): 20:56:54
Acquisition Duration (s): 2049

* Each individual image is tagged separately with name, band width, and center wavelength

Figure 4: Thumbnails for the multispectral bands in the example field.


Processing and Tagging the Data

Using hundreds of raw images collected from the UAV, the PICS-aware software (Pix4Dmapper, in this case) combined the images into a single reflectance map product, a GeoTIFF output. Before creating the final product, the software derived the required PICS data by extracting the necessary metadata from each set of images. Tables 3 to 6 above present the relevant metadata extracted from the images in each project (UAV flight mission). The reflectance map was down-sampled to 1 m/pixel.

The PICS metadata was embedded in the EXIF, XMP, and GeoTIFF tags of the reflectance map product, as specified by PICS. For example, for the thermal data, the band name value “Thermal IR” was added to the tag Xmp.Camera.BandName. The band width was known to be 3000 nm, so “3000” was added to the Xmp.Camera.WavelengthFWHM tag. The band center was known to be 10000.5 nm, so it was tagged in Xmp.Camera.CentralWavelength as “10000.5”. The date “2015:06:23” was derived from the mean capture time of all raw project images (from the first picture of the project to the last) and tagged in Exif.GPSInfo.GPSDateStamp. The time in UTC, “20:31:19Z”, was derived from the same mean and tagged in Exif.GPSInfo.GPSTimeStamp. The time it took to capture the entire project of images was calculated to be “1328” seconds and was tagged in Xmp.Camera.AcquisitionDuration. Finally, with all PICS-compliant tags added, the PICS version “1.0” was added to GeoTIFF tag 65265.

The image processing software automatically completed this process, tagging the resulting reflectance product image files using the same methodology throughout. For multi-band (stacked) images, commas were used to delimit respective band data in the GeoTIFF tags. In the special case when no data was available for band width or central wavelength, no tag was generated, and so no value appears in the table. This happened in the RGB dataset, where band width was wide or not known; the implication is that these images should be used for display purposes only and not for any sort of radiometric analysis.

Data Intake and Quality Checks

The processing software collects the output produced from the grower's imagery and transfers that data to the analytics company. Once the data is received, tags such as Xmp.Camera.AcquisitionDuration, Exif.GPSInfo.GPSDateStamp, and Exif.GPSInfo.GPSTimeStamp can be used to filter out any data outside the desired quality thresholds (e.g., an acquisition taken at a time when the sun is too low in the sky). Compliance with PICS can also be checked and, once verified, that data can enter the processing pipeline. If tags are missing and bands are unknown (as is the case with the RGB dataset), the image can be used as an RGB base layer.
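One possible form of such a time-based check is sketched below. It assumes that the date and time tags have already been rendered as "YYYY:MM:DD" and "HH:MM:SS" strings, and it treats the mean capture time as the midpoint of the acquisition, which is an approximation. The function names and the daily window of 15:00 to 22:00 UTC are hypothetical; the sample values are those of Table 4.

```python
from datetime import datetime, time, timedelta, timezone


def acquisition_window(date_stamp: str, time_stamp: str, duration_s: float):
    """Estimate (start, end) in UTC from the mean capture time and the duration."""
    mean = datetime.strptime(f"{date_stamp} {time_stamp}", "%Y:%m:%d %H:%M:%S")
    mean = mean.replace(tzinfo=timezone.utc)  # the GPS stamps are always UTC
    half = timedelta(seconds=duration_s / 2)
    return mean - half, mean + half


def inside_daily_window(start, end, earliest: time, latest: time) -> bool:
    """True if the whole acquisition falls between two UTC times on one day."""
    return (start.date() == end.date()
            and earliest <= start.time() and end.time() <= latest)


# Thermal dataset from Table 4.
start, end = acquisition_window("2015:06:23", "20:31:19", 1328)
print(start.time(), end.time())  # 20:20:15 20:42:23
print(inside_daily_window(start, end, time(15, 0), time(22, 0)))  # True
```

A real threshold on sun elevation would also need the image location, which the GeoTIFF georeferencing provides; that calculation is outside the scope of this sketch.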

Customer Output/Derivative Product

After the data passes all quality checks, specific band/time information is used and blended with other data for a specific analysis. The completed analysis is then returned to the grower and overlaid on the RGB provided. This provides context and enables the farmer to make sense of the analysis. The farmer can then use this analysis to draw conclusions and make key decisions, with the goal of enhancing farm operations.


A tiny link to this page is: https://aggateway.atlassian.net/wiki/x/XABrDw and a tiny-er link: http://bit.ly/2z90aeD 
