Doug Rand edited this page Jul 30, 2013 · 24 revisions

Description

GIScore provides the capability to perform streaming input and output of data from different file formats with an emphasis on GIS file formats such as ESRI Shapefiles or geo-databases (GDB) and Google Earth KML/KMZ. As time went on it was extended to include other record oriented formats that included GIS information such as WKT, GeoRSS and GeoAtom. Additionally it has proven useful to support some non-GIS formats such as Dbf and CSV.

GIScore provides the mediation between these file formats by converting each format to an internal normalized form.

GIScore was created to overcome the perceived problems in prior projects related to the use of in-memory representations of data. In-memory models have many advantages in terms of ease of use and speed, but lack the ability to deal with large data sets. GIScore tries to straddle both of these worlds by providing good performance with modest data set sizes while addressing the ability to deal with arbitrary data set sizes.

The ideal for GIScore was that the object model would be agnostic of the underlying file formats. This proved unrealistic at best and, in practice, seemed impossible. GIScore instead makes the object model representative of the richest of the underlying formats and ignores features of the object model when a specific implementation cannot represent them. This is one of a number of possible choices, but one that seemed best to the author, and better than supporting only the features common to all formats.

GIScore started as a MITRE developed software library written in Java.

Links

  • A reference to the document types used in GIScore
  • A developers page contains information about integration points and builds
  • dependencies information for FileGDB use

Foreign Language Capabilities

GIScore supports standard GIS data formats (KML, CSV, Shapefile, OGC Well-Known Text (WKT), GDB, GeoRSS, GeoAtom), so any foreign language support depends on what these data formats support. For XML sources (KML, GeoRSS, GeoAtom), the GIScore API also supports encodings other than UTF-8 for non-ASCII characters.

Pluses

Uses StAX (Streaming API for XML) to process XML one element at a time, so sources larger than can fit into memory can easily be read and written.
Likewise, a similar InputStream/OutputStream pipeline is designed for handling large shapefiles, so GIScore needs only a (relatively) small memory footprint.
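
To illustrate the streaming idea, here is a minimal sketch that uses the JDK's own StAX pull parser to count elements without building a document tree. The class and method names are hypothetical and are not part of GIScore; GIScore's readers do considerably more work per element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper, not part of GIScore: counts elements with a pull parser. */
final class StaxCountSketch {
    /** Returns how many elements with the given local name appear in the XML text. */
    static int countElements(String xml, String localName) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            int count = 0;
            try {
                while (reader.hasNext()) {
                    // next() advances one event at a time; no document tree is built
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && localName.equals(reader.getLocalName())) {
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
    }

    public static void main(String[] args) {
        String kml = "<kml xmlns=\"http://www.opengis.net/kml/2.2\"><Document>"
                + "<Placemark><name>a</name></Placemark>"
                + "<Placemark><name>b</name></Placemark>"
                + "</Document></kml>";
        System.out.println(countElements(kml, "Placemark")); // prints 2
    }
}
```

Because only the current event is held at any moment, memory use stays roughly constant however many Placemark elements the source contains.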

Getting Started

GIScore is built around a factory pattern. The user of a particular stream is meant to be ignorant of the implementation of the stream class that one is using. The caller unfortunately does need to know about extra arguments each document type requires, but not extra arguments based on the underlying class per se.
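
The shape of such a factory can be sketched in miniature. Everything below (the `DocType` values, `RecordStream`, and `open`) is hypothetical and chosen only for illustration; it is not the GISFactory API. It shows a caller that depends only on an interface, while still having to supply the extra argument that one document type requires.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;
import java.util.regex.Pattern;

/** Hypothetical miniature of a stream factory; none of these names are GIScore API. */
final class FactorySketch {
    enum DocType { CSV, LINES }

    /** Callers program against this interface only. */
    interface RecordStream {
        String read(); // returns null when the stream is exhausted
    }

    /** Picks an implementation from the document type; extra args depend on the type. */
    static RecordStream open(DocType type, String content, Object... args) {
        switch (type) {
            case CSV: {
                // CSV requires one extra argument: the field delimiter
                if (args.length != 1 || !(args[0] instanceof Character)) {
                    throw new IllegalArgumentException("CSV requires a delimiter argument");
                }
                String delimiter = String.valueOf((Character) args[0]);
                return queueOf(content.split(Pattern.quote(delimiter)));
            }
            case LINES:
                // LINES needs no extra arguments
                return queueOf(content.split("\n"));
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    private static RecordStream queueOf(String[] items) {
        Queue<String> queue = new ArrayDeque<>(Arrays.asList(items));
        return queue::poll; // poll() returns null once the queue is empty
    }
}
```

A call such as `FactorySketch.open(FactorySketch.DocType.CSV, "a,b", ',')` returns a stream whose `read()` yields `a`, then `b`, then `null`, without the caller ever naming the implementing class.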

GIScore can be combined with the open source GPSBabel tool, which supports hundreds of geo-data formats, to extend the native capabilities of both. The common denominators between GIScore and GPSBabel are the KML and CSV data formats.

Reading a Shapefile

Here is the code to instantiate a shapefile input stream taken from some actual code:

IGISInputStream sis = GISFactory.getInputStream(DocumentType.Shapefile, ngr.getShapefile().getInputStream());
Some things to note. The user passes a zip input stream holding a directory of shapefiles (which may contain just a single shapefile). To switch to a FileGDB, the document type changes and the contents of the zip input stream change to a GDB directory. Similarly, a KMZ is a zip holding a KML file. There are some other factory methods available as well.

Here's a simple processing loop, taken from the same code. This particular loop looks for rings and polygons to pull out the rings to determine an area of interest to query.

IGISObject obj = sis.read();
while(obj != null) 
{
	if (obj instanceof Feature) {
		Geometry geo = ((Feature) obj).getGeometry();
		if (geo instanceof LinearRing) {
			processRing(ngr, ((LinearRing) geo).getPoints());
		} else if (geo instanceof Polygon) {
			processRing(ngr, ((Polygon) geo).getOuterRing().getPoints());
		} else if (geo instanceof MultiLinearRings) {
			MultiLinearRings mlr = (MultiLinearRings) geo;
			for(LinearRing ring : mlr.getLinearRings()) {
				processRing(ngr, ring.getPoints());
			}
		} else if (geo instanceof MultiPolygons) {
			MultiPolygons mlp = (MultiPolygons) geo;
			for(Polygon poly : mlp.getPolygons()) {
				processRing(ngr, poly.getOuterRing().getPoints());
			}
		} else if (geo instanceof Line) {
			processRing(ngr, ((Line) geo).getPoints());
		}
	}
	obj = sis.read();
}
The loop terminates when the read method returns a null value. This particular loop ignores the schema, but yours may want to look at the data associated with the features. That's up to you.

Finally, the processing should terminate by closing the input stream, which cleans everything up:

sis.close();

Writing a Shapefile

Writing any GIS output file is going to be more tightly specified than reading one in. Why? Because when you're reading a file you don't really need to be concerned with what's required or how it is structured. When you're writing one out, you do. Ideally we'd hide all the details from you, but we really aren't as clever as we'd all like to be.

That said, you will generally be safe writing a GIS output file if you remember to first write the following elements:

  • Schema
  • DocumentStart(DocumentType)
  • ContainerStart("Folder")
Then write your features

Then don't forget to finish by writing out:

  • ContainerEnd
To start writing a shapefile you use the output factory:
IGISOutputStream shpos = GISFactory.getOutputStream(DocumentType.Shapefile, zos, outDir);
In addition to the zip output stream we have an output directory specified. This output directory is a scratch directory used to create the actual shapefiles before writing them out to the stream. Ideally we would be able to write the files directly as entries to the zip, but that doesn't work in practice, so we need to write them in the file system first in order to create the zip stream.
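
The scratch-directory approach can be sketched with the JDK alone. The code below is an assumption about the general technique (stage finished files on disk, then copy each into the zip stream), not GIScore's actual implementation; the class name, method names, and placeholder file names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch, not GIScore code: stage files on disk, then zip them. */
final class ScratchZipSketch {
    /** Adds every regular file in scratchDir to the zip stream; returns the entry names. */
    static List<String> zipDirectory(Path scratchDir, ZipOutputStream zos) throws IOException {
        List<Path> files;
        try (Stream<Path> listing = Files.list(scratchDir)) {
            files = listing.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<String> names = new ArrayList<>();
        for (Path file : files) {
            String name = file.getFileName().toString();
            zos.putNextEntry(new ZipEntry(name));
            Files.copy(file, zos); // stream the finished file into the archive
            zos.closeEntry();
            names.add(name);
        }
        return names;
    }

    /** Stages two placeholder files in a temp directory, zips them, and cleans up. */
    static List<String> demo() {
        try {
            Path scratch = Files.createTempDirectory("shp-scratch");
            Files.write(scratch.resolve("points.shp"), new byte[] {1});
            Files.write(scratch.resolve("points.dbf"), new byte[] {2});
            List<String> names;
            try (ZipOutputStream zos = new ZipOutputStream(new ByteArrayOutputStream())) {
                names = zipDirectory(scratch, zos);
            }
            for (String name : names) {
                Files.delete(scratch.resolve(name));
            }
            Files.delete(scratch);
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Staging on disk lets each file be completed, including any header fields that are only known once all records have been written, before it is copied into the archive as a single entry.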

Writing to the stream is simple enough: we create the various events and write them to the stream. Here's a test method that writes a simple point geometry:

public void testWriteReferencePointOutput(File shapeOutputDir) throws Exception {
	FileOutputStream zip = new FileOutputStream(new File(shapeOutputDir, "reference.zip"));
	ZipOutputStream zos = new ZipOutputStream(zip);
	File outDir = new File("testOutput/shptest/buf");
	outDir.mkdirs();
	IGISOutputStream shpos = GISFactory.getOutputStream(DocumentType.Shapefile, zos, outDir);
	Schema schema = new Schema(new URI("urn:test"));
	SimpleField id = new SimpleField("testid");
	id.setLength(10);
	schema.put(id);
	DocumentStart ds = new DocumentStart(DocumentType.Shapefile);
	shpos.write(ds);
	ContainerStart cs = new ContainerStart("Folder");
	cs.setName("aaa");
	shpos.write(cs);
	shpos.write(schema);
	for(int i = 0; i < 5; i++) {
		Feature f = new Feature();
		f.putData(id, "id " + i);
		f.setSchema(schema.getId());
		double lat = 40.0 + (5.0 * RandomUtils.nextDouble());
		double lon = 40.0 + (5.0 * RandomUtils.nextDouble());
		Point point = new Point(lat, lon);
		f.setGeometry(point);
		shpos.write(f);
	}
	shpos.write(new ContainerEnd());
	shpos.close();
	zos.flush();
	zos.close();		
}

KML

Google Earth data, also known as KML data, can likewise be created with GISFactory for most basic KML needs. Elements such as Placemark, GroundOverlay, NetworkLink, Point, LineString, Polygon, IconStyle, ListStyle, Schema, etc. are supported, in addition to Google's gx: KML extensions.

IGISInputStream kis = GISFactory.getInputStream(DocumentType.KML, is);
IGISOutputStream kos = GISFactory.getOutputStream(DocumentType.KML, os);
To create KMZ output, construct a KmzOutputStream object explicitly and use addEntry() to add files as entries to the KMZ (ZIP) output stream.
 // 'file' is the target KMZ file, assumed to be declared earlier
 KmzOutputStream kmzos = new KmzOutputStream(new FileOutputStream(file));
 // write out KML content which gets written to doc.kml as first entry of KMZ
 GroundOverlay g = new GroundOverlay();
 TaggedMap icon = new TaggedMap("Icon");
 icon.put("href", "images/etna.jpg");
 g.setIcon(icon);
 kmzos.write(g);
 // add image entry to KMZ file; the entry name matches the href above
 File imageFile = new File("data/kml/GroundOverlay/etna.jpg");
 kmzos.addEntry(new FileInputStream(imageFile), "images/etna.jpg");
 kmzos.close();
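
For readers unfamiliar with the KMZ layout, the following sketch builds an equivalent archive using only java.util.zip: the KML goes in first as doc.kml, followed by the resources it references. This illustrates the resulting file structure under that assumption; it is not how KmzOutputStream is implemented, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch of the KMZ layout; not the KmzOutputStream implementation. */
final class KmzLayoutSketch {
    /** Writes doc.kml first, then each resource under its entry name, in map order. */
    static byte[] buildKmz(String kml, Map<String, byte[]> resources) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("doc.kml"));
            zos.write(kml.getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
            for (Map.Entry<String, byte[]> resource : resources.entrySet()) {
                zos.putNextEntry(new ZipEntry(resource.getKey()));
                zos.write(resource.getValue());
                zos.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Lists the entry names of an archive in the order they were stored. */
    static List<String> entryNames(byte[] kmz) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(kmz))) {
            for (ZipEntry entry; (entry = zis.getNextEntry()) != null; ) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }
}
```

The entry name given for a resource must match the relative href used inside the KML, as with images/etna.jpg in the example above.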

If more than basic KML or KMZ handling is needed, use the KmlReader and KmlWriter classes, which are wrappers for KmlInputStream and KmlOutputStream, respectively, and do a lot of special handling. The KmlReader class transparently handles KML and compressed KMZ files by file or URL, along with fetching all NetworkLinks. Likewise, the KmlWriter class handles creation of KML or KMZ files and optionally allows adding other files as entries in the KMZ (ZIP) file. Most importantly, KmlReader rewrites all relative URLs so that they can be traced back to the correct URL and the appropriate resource fetched, which would normally be tricky for nested KML/KMZ files and resources referenced within a KMZ file. All versions of the KML specification can be imported, so there is no need to convert older 2.0 and 2.1 KML documents to the latest OGC KML 2.2 spec. The API takes care of most of the dirty details and conversions.
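
The simplest case of URL rewriting, resolving a relative href against the URL of the document that contains it, can be shown with java.net.URI. This is a sketch of the general idea only; the class name and example URLs are hypothetical, and KmlReader's handling of resources inside KMZ archives is more involved than this.

```java
import java.net.URI;

/** Hypothetical sketch of plain relative-URL resolution; KmlReader handles more cases. */
final class RelativeUrlSketch {
    /** Resolves href against the URL of the document in which it appears. */
    static URI resolve(String baseDocument, String href) {
        return URI.create(baseDocument).resolve(href);
    }

    public static void main(String[] args) {
        String base = "http://example.com/kml/root.kml";
        // a relative href is interpreted relative to the containing document
        System.out.println(resolve(base, "images/etna.jpg"));
        // prints http://example.com/kml/images/etna.jpg
        // an absolute href is returned unchanged
        System.out.println(resolve(base, "http://other.example.org/a.kml"));
        // prints http://other.example.org/a.kml
    }
}
```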

While most deprecated and deleted features from older KML specs are implemented to support importing of legacy KML data, KML output conforms to KML 2.2 schema so some deprecated/deleted features may not be preserved on output without some manual intervention.

For example, a 'parent' element or attribute appearing in the Schema element is a legacy, non-XML-Schema-compliant mechanism in KML 2.0 for aliasing KML features with user-defined element names and user-defined child elements. This is handled correctly on import, in that aliased elements are converted to Placemarks to produce valid KML, but the metadata is not automatically converted to ExtendedData fields.
While some fine-grained customization may be lost or incomplete in a few less common cases (e.g. NetworkLinkControl, Model, etc.), the core geospatial and temporal data is preserved in a common representation. The feature structure can be exported to standard formats (or user-defined ones) once the common representation is created programmatically or imported from existing sources.

Writing KML with gx: extensions

Here is KML with elements in the Google extension namespace, marked with the gx: prefix. This example contains a Track element representing a single entity with multiple time-tagged locations, each given by a <when> element and a corresponding <gx:coord> element.

<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
  <Placemark>
	<gx:Track>
		  <when>2010-05-28T02:02:09Z</when>
		  <when>2010-05-28T02:02:56Z</when>
		  <gx:coord>-122.207881 37.371915 156.000000</gx:coord>
		  <gx:coord>-122.203207 37.374857 140.199997</gx:coord>
	</gx:Track>
  </Placemark>
</kml>

Here are the few lines of Java code to generate the above KML:

	ByteArrayOutputStream bos = new ByteArrayOutputStream();
	KmlOutputStream kos = new KmlOutputStream(bos);
	DocumentStart ds = new DocumentStart(DocumentType.KML);
	Namespace gxNs = Namespace.getNamespace("gx", IKml.NS_GOOGLE_KML_EXT);
	ds.getNamespaces().add(gxNs);
	kos.write(ds);
	Feature f = new Feature();
	f.setName("track");
	Element gxElt = new Element(gxNs, "Track");
	List<Element> elts = gxElt.getChildren();
	elts.add(new Element("when").withText("2010-05-28T02:02:09Z"));
	elts.add(new Element("when").withText("2010-05-28T02:02:56Z"));
	elts.add(new Element(gxNs, "coord").withText("-122.207881 37.371915 156.000000"));
	elts.add(new Element(gxNs, "coord").withText("-122.203207 37.374857 140.199997"));
	f.addElement(gxElt);
	kos.write(f);
	kos.close();

Reading KML

The following snippet uses the convenience methods to fetch all features from a given KML resource and then recursively load features from any NetworkLinks. This is fine when the number of features and network links is known to be relatively small and to fit into memory. If, however, the number of elements might be very large, or items need to be processed one at a time with the option to abort if the user chooses, then the second approach should be used. KmlReader has a user-settable limit to restrict the number of network links and prevent recursively loading a deeply nested, super-overlay-like KML resource.

 File file = new File("placemarks.kmz");
 KmlReader reader = new KmlReader(file);
 List<IGISObject> features = reader.readAll();
 List<IGISObject> linkedFeatures = reader.importFromNetworkLinks();
 List<URI> networkLinks = reader.getNetworkLinks();

This second example loads a KML resource that includes NetworkLinks. It uses an ImportEventHandler callback to handle each of the imported features. If the callback's handleEvent() method returns false, recursion is aborted and no more NetworkLink features are added.

 KmlReader reader = new KmlReader(new URL("http://kml-samples.googlecode.com/svn/trunk/kml/NetworkLink/visibility.kml"));
 // First pass: read the features in the top-level document
 for (IGISObject gisObj; (gisObj = reader.read()) != null; ) {
 	if (gisObj instanceof Feature) {
 		checkFeature(gisObj);
 	}
 }
 // Second pass: follow the NetworkLinks, handling each imported object in the callback
 reader.importFromNetworkLinks(new KmlReader.ImportEventHandler() {
 	public boolean handleEvent(UrlRef ref, IGISObject gisObj) {
 		checkFeature(gisObj);
 		return true; // return false to stop following NetworkLinks
 	}
 });
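
The abort-by-return-value contract can be seen in isolation in the sketch below. The `Handler` interface and `importAll` method are hypothetical stand-ins, not the KmlReader API; they only demonstrate how a callback that returns false stops further processing.

```java
/** Hypothetical stand-in for a callback-driven import; not the KmlReader API. */
final class ImportCallbackSketch {
    /** Returns true to keep going, false to abort. */
    interface Handler {
        boolean handle(String item);
    }

    /** Offers items to the handler until it returns false; returns how many were offered. */
    static int importAll(Iterable<String> items, Handler handler) {
        int offered = 0;
        for (String item : items) {
            offered++;
            if (!handler.handle(item)) {
                break; // the handler asked to stop; remaining items are never visited
            }
        }
        return offered;
    }
}
```

A handler that counts what it has seen and returns false once a limit is reached gives the caller a simple way to cap the work done on a very large or deeply nested source.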

Other Useful Reference Material
