<?php
error_reporting(E_ALL);
set_time_limit(0);
date_default_timezone_set('Europe/London');
include_once('functions.php');
getHead("queries.php");
?>
<div class="container start">
<div class="row-fluid">
<div class="span4" style="text-align: center">
<img src="img/download.png" />
<div class="navspy">
<ul class="nav nav-tabs nav-stacked affix-top sidenav" data-spy="affix" data-offset-top="314">
<li><a href="#ld">Linked Data 101 <i class="icon-chevron-right pull-right"></i></a></li>
<li><a href="#sparql">SPARQL 101 <i class="icon-chevron-right pull-right"></i></a></li>
<li><a href="#examples">HXL by Example <i class="icon-chevron-right pull-right"></i></a></li>
<li><a href="#geo">Geodata in HXL <i class="icon-chevron-right pull-right"></i></a></li>
<li><a href="#tldr">TL;DR <i class="icon-chevron-right pull-right"></i></a></li>
</ul>
</div>
</div>
<div class="span8">
<h1>Querying HXL</h1>
<p class="punchline">This tutorial explains how to access HXL data, including crash courses on Linked Data and the SPARQL query language.</p>
<!-- LINKED DATA 101 SECTION -->
<h2 id="ld">Linked Data 101</h2>
<p>A <em>very</em> brief introduction to the ideas behind Linked Data, explaining why we followed this approach instead of developing an XML schema or a proprietary API, for example. For more detailed introductions to the topic, take a look at the <a href="#ld-further">reading list</a>. If you are already familiar with the idea of Linked Data, move along, nothing to see here.</p>
<h3>The four rules</h3>
<p><a href="http://en.wikipedia.org/wiki/Tim_Berners-Lee">Tim Berners-Lee</a> is not only credited with developing the foundations of the Web as we know it today; he also keeps thinking about its next steps, one of them being Linked Data. In 2006, he published these four rules, also known as the <em>Linked Data principles</em>, <a href="http://www.w3.org/DesignIssues/LinkedData.html">on his blog</a>:</p>
<ol>
<li>Use URIs as names for things.</li>
<li>Use HTTP URIs so that people can look up those names.</li>
<li>When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).</li>
<li>Include links to other URIs, so that they can discover more things.</li>
</ol>
<p>In a nutshell, these rules transfer the idea of interlinked, human-readable documents (a.k.a. the Web as we know it) to raw data: instead of jumping from one web page to the next by clicking links on that page, we have datasets that refer to each other and thus create a <em>Web of Data</em>. Let's take a look at how that works:</p>
<h3>Statements about resources</h3>
<p>Linked Data builds on the <a href="http://www.w3.org/RDF/">Resource Description Framework</a> (RDF), a W3C standard for the, err, description of resources. These descriptions come as statements, very much like simple sentences in English, consisting of <em>subject</em>, <em>predicate</em> (or <em>property</em>), and <em>object</em>. Since it's three parts, these statements are often referred to as <em>triples</em>. Let's take a simple example, such as the statement <code>Batman knows Robin</code>. RDF allows us to make this statement machine-readable by replacing all three parts with URLs:</p>
<pre class="prettyprint linenums"><<a href="http://dbpedia.org/resource/Batman" target="_blank">http://dbpedia.org/resource/Batman</a>>
<<a href="http://xmlns.com/foaf/0.1/knows" target="_blank">http://xmlns.com/foaf/0.1/knows</a>>
<<a href="http://dbpedia.org/resource/Robin_(comics)" target="_blank">http://dbpedia.org/resource/Robin_(comics)</a>> .</pre>
<p>With this triple, you can already see Linked Data in action: click on the subject (<code>Batman</code>), predicate (<code>knows</code>), or object (<code>Robin</code>), and you will get <em>more data</em> about these things: more data about Batman and Robin, and, in the case of the <code>knows</code> predicate, the documentation of the <a href="http://xmlns.com/foaf/spec/">Friend Of A Friend vocabulary</a> (FOAF). This is what Linked Data is all about: <strong>sharing and reusing data</strong>. This is achieved by using URLs as identifiers, which not only provide unique IDs, but at the same time provide a location on the Web that contains data about the identified resource. Obviously, these resources can be not only comic characters, but also people, places, books, movies, drugs, events, and so on.</p>
<p>Strictly speaking, what you get when you click any of those links is a <em>human-readable representation</em> of the data, since you are accessing it with a browser asking the server for HTML. Going to the same location while asking for RDF in your HTTP header will return the same data as RDF; this mechanism is called <a href="http://linkeddatabook.com/editions/1.0/#htoc11">content negotiation</a>. Since we are talking about tech stuff: RDF is just a model, which can be serialized in <a href="http://en.wikipedia.org/wiki/Resource_Description_Framework#Serialization_formats">different notations</a>. We use the <a href="http://www.w3.org/TR/2011/WD-turtle-20110809/">Turtle syntax</a> here, because it is very easy to read.</p>
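<p>Content negotiation can be sketched in a few lines of Python. The example below only builds the request and does not send it; the <code>text/turtle</code> media type and the helper name <code>rdf_request</code> are assumptions made for illustration, and whether a given server honours that media type is not something this page can guarantee.</p>

```python
from urllib.request import Request


def rdf_request(uri, media_type="text/turtle"):
    """Build (but do not send) an HTTP GET request that asks for RDF instead of HTML."""
    # The Accept header is what tells the server which representation we prefer.
    return Request(uri, headers={"Accept": media_type})


req = rdf_request("http://dbpedia.org/resource/Batman")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
# Passing req to urllib.request.urlopen() would actually perform the lookup.
```

<p>A browser sends an <code>Accept</code> header that prefers HTML for the same URL, which is why clicking the links above returns a web page.</p>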
<p>In case you are wondering where those URLs come from: <a href="http://dbpedia.org/About">DBpedia</a> provides facts extracted from Wikipedia as Linked Data, and <a href="http://xmlns.com/foaf/spec/">FOAF</a> is a widely used vocabulary to express relationships between people. You might already guess now where we are going with HXL: providing a vocabulary and URLs as identifiers for the humanitarian domain.</p>
<h3>Creating a graph by combining statements</h3>
<p>The concept of the statement becomes really powerful once we start combining several statements. Let's extend our Batman example a bit; in this example, we use prefixes (lines 1–6) to declare the different namespaces used in the rest of the document, to make the URIs shorter and more readable:</p>
<pre class="prettyprint linenums">@prefix foaf: <<a href="http://xmlns.com/foaf/0.1/" target="_blank">http://xmlns.com/foaf/0.1/</a>> .
@prefix dbp: <<a href="http://dbpedia.org/resource/" target="_blank">http://dbpedia.org/resource/</a>> .
@prefix dbpprop: <<a href="http://dbpedia.org/property/" target="_blank">http://dbpedia.org/property/</a>> .
@prefix geonames: <<a href="http://sws.geonames.org/" target="_blank">http://sws.geonames.org/</a>> .
@prefix gn: <<a href="http://www.geonames.org/ontology#" target="_blank">http://www.geonames.org/ontology#</a>> .
@prefix wgs84_pos: <<a href="http://www.w3.org/2003/01/geo/wgs84_pos#">http://www.w3.org/2003/01/geo/wgs84_pos#</a>> .
<a href="http://dbpedia.org/resource/Batman" target="_blank">dbp:Batman</a> <a href="http://xmlns.com/foaf/0.1/knows" target="_blank">foaf:knows</a> <a href="http://dbpedia.org/resource/Robin_%28comics%29" target="_blank">dbp:Robin_(comics)</a> ;
<a href="http://dbpedia.org/property/creator" target="_blank">dbpprop:creator</a> <a href="http://dbpedia.org/resource/Bob_Kane" target="_blank">dbp:Bob_Kane</a> .
<a href="http://dbpedia.org/resource/Bob_Kane" target="_blank">dbp:Bob_Kane</a> <a href="http://dbpedia.org/property/birthPlace" target="_blank">dbpprop:birthPlace</a> <a href="http://sws.geonames.org/5128581" target="_blank">geonames:5128581</a> .
<a href="http://sws.geonames.org/5128581" target="_blank">geonames:5128581</a> <a href="http://www.geonames.org/ontology#name">gn:name</a> "New York City" ;
<a href="http://www.w3.org/2003/01/geo/wgs84_pos#lat">wgs84_pos:lat</a> "40.71427" ;
<a href="http://www.w3.org/2003/01/geo/wgs84_pos#long">wgs84_pos:long</a> "-74.00597" . </pre>
<p>In this example, we are combining data from DBpedia (telling us that Batman knows Robin and that he has been created by Bob Kane, as well as Bob Kane's birthplace) with data from the <a href="http://www.geonames.org/">GeoNames gazetteer</a>, which provides us with the name and geocoordinates of Bob Kane's birthplace (New York City). The object of a statement does not always have to be another URL, it can also be a string, number, date, or other kind of <em>literal</em>. If we visualize the example above, we get a small graph:</p>
<p align="center"><img src="img/batman.png" /></p>
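<p>To make the graph idea concrete, the same statements can be held in memory as plain (subject, predicate, object) tuples and traversed by following predicates. This is only an illustrative sketch: the helper names <code>objects</code> and <code>follow</code> are made up for this example, and a real application would use an RDF library rather than tuples.</p>

```python
# Namespace URIs taken from the prefix declarations in the Turtle example.
DBP = "http://dbpedia.org/resource/"
DBPPROP = "http://dbpedia.org/property/"
FOAF = "http://xmlns.com/foaf/0.1/"
GEONAMES = "http://sws.geonames.org/"
GN = "http://www.geonames.org/ontology#"
WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Each statement is one (subject, predicate, object) tuple.
TRIPLES = {
    (DBP + "Batman", FOAF + "knows", DBP + "Robin_(comics)"),
    (DBP + "Batman", DBPPROP + "creator", DBP + "Bob_Kane"),
    (DBP + "Bob_Kane", DBPPROP + "birthPlace", GEONAMES + "5128581"),
    (GEONAMES + "5128581", GN + "name", "New York City"),
    (GEONAMES + "5128581", WGS84 + "lat", "40.71427"),
    (GEONAMES + "5128581", WGS84 + "long", "-74.00597"),
}


def objects(subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def follow(start, *predicates):
    """Walk the graph from start, following one predicate per step."""
    nodes = [start]
    for predicate in predicates:
        nodes = [o for node in nodes for o in objects(node, predicate)]
    return nodes


# Where was Batman's creator born? Three hops across two data sources.
print(follow(DBP + "Batman", DBPPROP + "creator", DBPPROP + "birthPlace", GN + "name"))
```

<p>The last hop crosses from DBpedia data into GeoNames data, which works only because both sides use the same URL to identify the place.</p>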
<p>If we scale this tiny example up by a couple of orders of magnitude, we get the <em>global graph</em> that constitutes the Web of Data. The <a href="http://richard.cyganiak.de/2007/10/lod/">Linked Data Cloud diagram</a> gives an overview of the biggest sources for Linked Data and how they are interlinked.</p>
<h3>Shared Vocabularies</h3>
<p>The previous example already demonstrates the reuse of shared vocabularies, as it uses the DBpedia, GeoNames, and W3C WGS84 vocabularies. Reusing existing vocabularies increases the odds that a linked dataset will be reused, as existing vocabularies (especially the ubiquitous ones such as <a href="http://xmlns.com/foaf/spec/">FOAF</a>, <a href="http://purl.org/dc/terms/">DC</a>, or <a href="http://www.w3.org/2004/02/skos/">SKOS</a>) are often already known to potential data consumers. A good place to look for existing vocabularies is the <a href="http://lov.okfn.org/dataset/lov/">Linked Open Vocabularies</a> website.</p>
<p><strong>HXL provides such a shared vocabulary for the humanitarian domain.</strong> Until now, no such vocabulary has existed, apart from the <a href="http://observedchange.com/moac/ns/">Management of a Crisis</a> (MOAC) vocabulary (see <a href="http://ceur-ws.org/Vol-798/paper2.pdf">this paper</a> for details). MOAC has certainly inspired HXL; however, we chose to redefine the classes and properties shared with HXL, instead of importing them, to make sure that the vocabulary is under the control of UN OCHA. We did, however, reuse classes and properties from broadly used vocabularies where unexpected changes are unlikely, such as <a href="http://xmlns.com/foaf/spec/">FOAF</a>, the Open Geospatial Consortium's <a href="http://www.opengis.net/ont/geosparql">GeoSPARQL ontology</a>, <a href="http://purl.org/dc/terms/">DC</a>, and <a href="http://www.w3.org/2004/02/skos/">SKOS</a>.</p>
<h3>Why Linked Data?</h3>
<p>As the name suggests, the original idea for HXL was to develop an XML schema that allows humanitarian organisations to publish XML adhering to this schema. We then went for a Linked Data approach for a number of reasons:</p>
<ul>
<li><strong>Extensibility.</strong> HXL can only provide the core classes and properties for the domain. Publishing it as an RDF vocabulary allows data publishers to extend it according to their needs.</li>
<li><strong>Reuse of external data sources.</strong> Many sources in the Linked Data cloud contain information that is also useful in a humanitarian context, such as geographic or demographic data. </li>
<li><strong>Semantic annotations.</strong> An XML schema strongly focuses on <em>syntactic</em> interoperability. In Linked Data, RDF provides the syntax, and shared vocabularies define the <em>semantics</em> of the shared data.</li>
<li><strong>Standardized API.</strong> Data access works through standard HTTP requests, and the data can be queried in the <a href="#sparql">SPARQL query language</a>.</li>
<li><strong>Inference capabilities.</strong> The structure of the shared vocabularies supports <a href="http://en.wikipedia.org/wiki/Inference">inference</a> (also referred to as reasoning) on the data.</li>
<li><strong>Success stories.</strong> The Linked Data approach has generated a number of impressive <a href="http://answers.semanticweb.com/questions/1533/what-are-the-success-stories-of-the-semantic-weblinked-data">success stories</a>, including the BBC and BestBuy.</li>
<li><strong>Future-proofing.</strong> This is obviously always a bet, but we had the feeling (given the arguments above) that following the Linked Data approach is future-proof in the medium to long term.</li>
</ul>
<h3 id="ld-further">Further reading</h3>
<p>Obviously, we can only scratch the surface of Linked Data here. For a more detailed introduction, we recommend the book <em>Linked Data: Evolving the Web into a Global Data Space</em> by <a href="http://tomheath.com/">Tom Heath</a> and <a href="http://dws.informatik.uni-mannheim.de/en/people/professors/prof-dr-christian-bizer/">Chris Bizer</a>. You can read the whole book <a href="http://linkeddatabook.com">online for free</a>. Moreover, <a href="http://linkeddata.org/">linkeddata.org</a> offers a list of <a href="http://linkeddata.org/guides-and-tutorials">guides and tutorials</a>.</p>
<!-- SPARQL 101 SECTION -->
<h2 id="sparql">SPARQL 101</h2>
<p>So far, we have a structure where you can jump from one dataset to the next, browsing the Web of Data by <em>following your nose</em>. While that is a nice thing to have, what if we want to find all resources that match a certain criterion? Enter <a href="http://www.w3.org/TR/sparql11-query/">SPARQL</a>. The SPARQL query language defines a standard (currently a W3C recommendation) that supports such queries against <a href="http://en.wikipedia.org/wiki/Triplestore">triple stores</a>, i.e., databases optimized for RDF storage.</p>
<h3>Basic queries</h3>
<p>If you are familiar with SQL, you will find many similarities in SPARQL. The big difference obviously is that we are querying graphs now, not tables in a relational database. In a SPARQL query, we define a graph pattern with a number of variables, and the result is the set of values for these variables for each match. As a starting example, the following query asks for ten arbitrary triples:</p>
<pre class="prettyprint linenums">SELECT * WHERE {
?a ?b ?c .
} LIMIT 10</pre><a href="http://sparql.carsten.io/?query=SELECT%20*%20WHERE%20%7B%0A%20%20%3Fa%20%3Fb%20%3Fc%20.%0A%7D%0ALIMIT%2010&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<p>The <code>*</code> in the <code>SELECT</code> clause means that we want to return the values for all variables in the <code>WHERE</code> clause, which matches arbitrary triples, since subject (<code>?a</code>), predicate (<code>?b</code>), and object (<code>?c</code>) are not restricted in any way. The number of results to return is then limited to 10; without this restriction, the query would return <em>all</em> triples in the store.</p>
<p>A SPARQL query is sent as an HTTP request to a <em>SPARQL endpoint</em>. <strong>The SPARQL endpoint for the HXL triple store is at <code>http://hxl.humanitarianresponse.info/sparql</code>.</strong></p>
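<p>As a rough sketch of what such a request looks like, the query text is URL-encoded and passed in the <code>query</code> parameter of a GET request to the endpoint. The helper name <code>sparql_url</code> is an assumption made for this example; the code only builds the URL and does not contact the endpoint.</p>

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://hxl.humanitarianresponse.info/sparql"


def sparql_url(query, endpoint=ENDPOINT):
    """Return the GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = "SELECT * WHERE {\n  ?a ?b ?c .\n} LIMIT 10"
url = sparql_url(query)
print(url)

# Decoding the URL again gives back the original query text.
print(parse_qs(urlsplit(url).query)["query"][0] == query)
```

<p>Fetching that URL with any HTTP client then returns the result set in whatever format the endpoint chooses or the client asks for.</p>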
<p>Since there is no database schema for a triple store, one of the first things to do when interacting with a new endpoint is often to explore its contents. To do this, we can easily ask for the different kinds of things the triple store hosts:</p>
<pre class="prettyprint linenums">SELECT DISTINCT ?type WHERE {
?thing a ?type .
}</pre><a href="http://sparql.carsten.io/?query=SELECT%20DISTINCT%20%3Ftype%20WHERE%20%7B%0A%20%20%20%3Fthing%20a%20%3Ftype%20.%20%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<p>In this case, we only ask for values for the <code>?type</code> variable (note that the variable names are completely arbitrary; we usually try to pick something meaningful that makes it easier to read the queries, though). The <code>DISTINCT</code> keyword makes sure that we don't get any duplicates. A look at the <code>WHERE</code> clause shows that the <code>?type</code> variable is used for the object of the triple. The predicate is a simple <code>a</code>, which is a shortcut for the <code>rdf:type</code> property that makes RDF resources instances of a class. Hence, this query gives us all classes that the triple store contains instances of.</p>
<p>Likewise, we can ask for all the predicates used in the triple store:</p>
<pre class="prettyprint linenums">SELECT DISTINCT ?predicate WHERE {
?subject ?predicate ?object .
}</pre><a href="http://sparql.carsten.io/?query=SELECT%20DISTINCT%20%3Fpredicate%20WHERE%20%7B%0A%20%20%20%3Fsubject%20%3Fpredicate%20%3Fobject%20.%20%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<p>These very simple examples illustrate the basics of SPARQL; we will show some more complex examples <a href="#examples">below</a>.</p>
<h3>Tools</h3>
<p>SPARQL is based on HTTP, so any tool that can fire an HTTP request can query a SPARQL endpoint (<a href="http://hxl.humanitarianresponse.info/sparql?query=prefix%20foaf%3A%20%3Chttp%3A//xmlns.com/foaf/0.1/%3E%20%0A%0ASELECT%20*%20WHERE%20%7B%0A%20%20%3Fagent%20a%20foaf%3APerson%20.%0A%7D">even your browser</a>; clicking this link should download an XML-encoded SPARQL results file to your computer). However, this is obviously not the most convenient way. Trying out queries is easier in an editor with proper syntax highlighting such as the one at <a href="http://sparql.carsten.io/">sparql.carsten.io</a>. Once you want to process the results programmatically, libraries such as <a href="http://graphite.ecs.soton.ac.uk/sparqllib/">sparqllib</a> (PHP) or <a href="http://www.openrdf.org">Sesame</a> (Java) come in handy. As with all things HTTP, <a href="http://curl.haxx.se">cURL</a> is extremely useful for testing and debugging.</p>
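<p>For a sense of what processing the results programmatically involves, here is a minimal sketch that reads an XML-encoded SPARQL results document with the Python standard library. The sample document and the <code>example.org</code> URIs in it are invented for illustration; only the element names and the namespace come from the W3C results format.</p>

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Invented sample response with one variable and two result rows.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="agent"/></head>
  <results>
    <result>
      <binding name="agent"><uri>http://example.org/people/alice</uri></binding>
    </result>
    <result>
      <binding name="agent"><uri>http://example.org/people/bob</uri></binding>
    </result>
  </results>
</sparql>"""


def parse_results(xml_text):
    """Turn a results document into a list of {variable: value} dictionaries."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            # The single child element holds the bound value (a URI, literal or blank node).
            row[binding.get("name")] = binding[0].text
        rows.append(row)
    return rows


for row in parse_results(SAMPLE):
    print(row["agent"])
```

<p>The libraries mentioned above do the same job with proper handling of datatypes and language tags, which this sketch ignores.</p>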
<h3 id="sparql-further">Further reading</h3>
<p>For further reading, we recommend the <a href="http://www.cambridgesemantics.com/de/semantic-university/sparql-by-example">SPARQL by Example</a> introduction over at Cambridge Semantics, and the W3C's <a href="http://www.w3.org/TR/sparql11-query/">SPARQL 1.1 specification</a>, which has all the details, along with some handy examples.</p>
<!-- HXL BY EXAMPLE SECTION -->
<h2 id="examples">HXL by Example</h2>
<p>This section demonstrates how to query HXL through a number of common examples. We won't be able to cover every possible query that you can make against our store, but the examples should get you started and allow you to extend them to build your own queries. Reading the <a href="http://hxl.humanitarianresponse.info/ns/">vocabulary documentation</a> will help you phrase your queries.</p>
<h3>Query by type</h3>
<div class="example">
<p>One of the most basic queries is asking for all resources of a certain type, e.g., all <a href="http://hxl.humanitarianresponse.info/ns/#APL">affected people locations</a> and their names, ordered by name: </p>
</div>
<pre class="prettyprint linenums">prefix hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT * WHERE {
?apl a hxl:APL ;
hxl:featureRefName ?name .
} ORDER BY ?name</pre><a href="http://sparql.carsten.io/?query=prefix%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20*%20WHERE%20%7B%0A%20%20%3Fapl%20a%20hxl%3AAPL%20%3B%0A%20%20%20%20%20%20%20%20%20hxl%3AfeatureRefName%20%3Fname%20.%0A%7D%20ORDER%20BY%20%3Fname&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<h3>Query by resource</h3>
<div class="example">
<p>This query gets all facts about a specific resource (Burkina Faso in this example):</p>
</div>
<pre class="prettyprint linenums">prefix hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT * WHERE {
<http://hxl.humanitarianresponse.info/data/locations/admin/bfa/BFA> ?pred ?obj .
}</pre><a href="http://sparql.carsten.io/?query=prefix%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%20%0ASELECT%20*%20WHERE%20%7B%0A%0A%20%20%3Chttp%3A//hxl.humanitarianresponse.info/data/locations/admin/bfa/BFA%3E%20%3Fpred%20%3Fobj%20.%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<h3>Query specific property</h3>
<div class="example">
<p>Similar to the queries by type, specific properties of a resource are straightforward to get. In this case, we query for the title of a specific <a href="http://hxl.humanitarianresponse.info/ns/#AgeGroupSet">age group set</a>.</p>
</div>
<pre class="prettyprint linenums">PREFIX hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT DISTINCT ?title WHERE {
<http://hxl.humanitarianresponse.info/data/agegroups/unhcr/ages_0-4> hxl:title ?title .
}</pre><a href="http://sparql.carsten.io/?query=PREFIX%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20DISTINCT%20%3Ftitle%20WHERE%20%7B%0A%20%20%3Chttp%3A//hxl.humanitarianresponse.info/data/agegroups/unhcr/ages_0-4%3E%20hxl%3Atitle%20%3Ftitle%20.%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<h3>Query by datacontainer</h3>
<div class="example">
<p>The HXL data is organized in datacontainers, which correspond to <a href="http://blog.ldodds.com/2009/11/05/managing-rdf-using-named-graphs/">named graphs</a> in our triple store. This query gets all triples in the given datacontainer. Note that if no datacontainer is explicitly given in the query (using the <code>GRAPH</code> syntax), the query will be run across <em>all</em> datacontainers (the so-called <em>union graph</em>).</p>
</div>
<pre class="prettyprint linenums">SELECT * WHERE {
GRAPH <http://hxl.humanitarianresponse.info/data/datacontainers/unocha/1344942253.196156> {
?subject ?predicate ?object .
}
}</pre><a href="http://sparql.carsten.io/?query=SELECT%20DISTINCT%20*%20WHERE%20%7B%0A%20%20GRAPH%20%3Chttp%3A//hxl.humanitarianresponse.info/data/datacontainers/unocha/1344942253.196156%3E%20%7B%0A%20%20%20%20%3Fsubject%20%3Fpredicate%20%3Fobject%20.%0A%20%20%7D%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<div class="example">
<p>This query gets all datacontainers that are about the Mali emergency. </p>
</div>
<pre class="prettyprint linenums">PREFIX hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT DISTINCT * WHERE {
?container hxl:aboutEmergency <http://hxl.humanitarianresponse.info/data/emergencies/mali2012test> .
}</pre><a href="http://sparql.carsten.io/?query=PREFIX%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20DISTINCT%20*%20WHERE%20%7B%0A%20%20%20%3Fcontainer%20hxl%3AaboutEmergency%20%3Chttp%3A//hxl.humanitarianresponse.info/data/emergencies/mali2012test%3E%20.%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<h3>Querying sets</h3>
<div class="example">
<p>Some of the HXL reference instances, such as the age groups, are organized into sets because each organization may define its own age group breakdown. This query gets all groups within a specific age group set (the one provided by UNHCR in this case), along with their respective group boundaries.</p>
</div>
<pre class="prettyprint linenums">prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT * WHERE {
<http://hxl.humanitarianresponse.info/data/agegroupsets/unhcr> rdfs:member ?group .
?group hxl:fromAge ?from ;
hxl:toAge ?to .
}</pre><a href="http://sparql.carsten.io/?query=prefix%20rdfs%3A%20%3Chttp%3A//www.w3.org/2000/01/rdf-schema%23%3E%20%0Aprefix%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20*%20WHERE%20%7B%0A%20%20%3Chttp%3A//hxl.humanitarianresponse.info/data/agegroupsets/unhcr%3E%20rdfs%3Amember%20%3Fgroup%20.%0A%20%20%3Fgroup%20hxl%3AfromAge%20%3Ffrom%20%3B%0A%20%20%20%20%20%20%20%20%20hxl%3AtoAge%20%3Fto%20.%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<h3>Property path queries</h3>
<div class="example">
<p>A new feature in SPARQL 1.1 makes complex queries a snap. <a href="http://www.w3.org/TR/sparql11-property-paths/">Property paths</a> allow us to query along the graph without prior knowledge of how deep we need to go with this query. For example, this query gets all places within Burkina Faso (note that in our data, every place is linked to the containing administrative unit via the <a href="http://hxl.humanitarianresponse.info/ns/#atLocation">at location</a> property; see the <a href="#geo">Geodata section</a> below for details):</p>
</div>
<pre class="prettyprint linenums">PREFIX hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT * WHERE {
?feature hxl:atLocation+ <http://hxl.humanitarianresponse.info/data/locations/admin/bfa/BFA> .
}</pre><a href="http://sparql.carsten.io/?query=PREFIX%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20*%20WHERE%20%7B%0A%20%20%3Ffeature%20hxl%3AatLocation%2B%20%3Chttp%3A//hxl.humanitarianresponse.info/data/locations/admin/bfa/BFA%3E%20.%0A%7D%0A&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
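<p>What the <code>+</code> operator does can be sketched as a repeated lookup: start at the target, collect everything linked to it, then everything linked to those, and so on until nothing new turns up. The place names and the small hierarchy below are hypothetical, and a triple store may evaluate property paths quite differently internally.</p>

```python
# Hypothetical hierarchy: each key is linked to its containing unit,
# mirroring one hxl:atLocation statement per entry.
AT_LOCATION = {
    "village": "district",
    "district": "region",
    "region": "BFA",
    "other-region": "BFA",
    "niamey": "NER",
}


def features_within(target):
    """Return everything reachable from target via one or more atLocation links."""
    found = set()
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for child, parent in AT_LOCATION.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


print(sorted(features_within("BFA")))
# ['district', 'other-region', 'region', 'village']
```

<p>Without the <code>+</code>, only the first round of this loop would run, returning just the units directly below the country.</p>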
<div class="example">
<p>This query uses a property path to fetch all things that are typed as an HXL population or <em>any of the subclasses of population</em>:</p>
</div>
<pre class="prettyprint linenums">prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT * WHERE {
?population rdf:type/rdfs:subClassOf* hxl:Population .
}</pre><a href="http://sparql.carsten.io/?query=prefix%20rdf%3A%20%3Chttp%3A//www.w3.org/1999/02/22-rdf-syntax-ns%23%3E%20%0Aprefix%20rdfs%3A%20%3Chttp%3A//www.w3.org/2000/01/rdf-schema%23%3E%20%0Aprefix%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20*%20WHERE%20%7B%0A%20%20%3Fpopulation%20rdf%3Atype/rdfs%3AsubClassOf*%20hxl%3APopulation%20.%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<h3>Some math</h3>
<div class="example">
<p>This more complex example queries the latest overall population numbers per location in the Mali crisis. The <code>MAX</code> syntax gets us only the latest dates, and the <code>SUM</code> adds up the person counts for the different types of population at the given place.</p>
</div>
<pre class="prettyprint linenums">prefix ogc: <http://www.opengis.net/ont/geosparql#>
prefix hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT (MAX(?valid) as ?latest) ?location ?locationName ?wkt (SUM(?count) AS ?totalRefugees) WHERE {
GRAPH ?g {
?g hxl:aboutEmergency <http://hxl.humanitarianresponse.info/data/emergencies/mali2012test> ;
hxl:validOn ?valid .
?pop a hxl:RefugeesAsylumSeekers ;
hxl:personCount ?count ;
hxl:atLocation ?location .
}
?location hxl:featureName ?locationName;
ogc:hasGeometry ?geom .
?geom ogc:hasSerialization ?wkt .
} GROUP BY ?location ?locationName ?wkt
ORDER BY ?locationName</pre><a href="http://sparql.carsten.io/?query=prefix%20ogc%3A%20%3Chttp%3A//www.opengis.net/ont/geosparql%23%3E%20%0Aprefix%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20%28MAX%28%3Fvalid%29%20as%20%3Flatest%29%20%3Flocation%20%3FlocationName%20%3Fwkt%20%28SUM%28%3Fcount%29%20AS%20%3FtotalRefugees%29%20WHERE%20%7B%0A%20%20%0A%20%20GRAPH%20%3Fg%20%7B%0A%20%20%20%20%3Fg%20hxl%3AaboutEmergency%20%3Chttp%3A//hxl.humanitarianresponse.info/data/emergencies/mali2012test%3E%20%3B%20%0A%20%20%20%20%20%20%20hxl%3AvalidOn%20%3Fvalid%20.%0A%20%20%20%20%3Fpop%20a%20hxl%3ARefugeesAsylumSeekers%20%3B%20%0A%20%20%20%20%20%20%20%20%20%20%20hxl%3ApersonCount%20%3Fcount%20%3B%0A%20%20%20%20%20%20%20%20%20%20%20hxl%3AatLocation%20%20%3Flocation%20.%0A%20%20%7D%0A%20%20%3Flocation%20hxl%3AfeatureName%20%3FlocationName%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20ogc%3AhasGeometry%20%3Fgeom%20.%0A%20%20%3Fgeom%20ogc%3AhasSerialization%20%3Fwkt%20.%0A%20%20%0A%7D%20GROUP%20BY%20%3Flocation%20%3FlocationName%20%3Fwkt%20ORDER%20BY%20%3FlocationName&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
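<p>The effect of <code>GROUP BY</code> together with <code>MAX</code> and <code>SUM</code> can be sketched in plain Python. The rows below are invented, not data from the HXL store. Note that both aggregates run over every matching row of a group, so the sum covers all rows for a location, not only those carrying the latest date.</p>

```python
from collections import defaultdict

# Invented (location, valid_on, person_count) rows, one per query match.
ROWS = [
    ("Camp A", "2012-07-01", 1200),
    ("Camp A", "2012-08-01", 300),
    ("Camp B", "2012-07-15", 500),
]


def aggregate(rows):
    """Group rows by location; return {location: (latest_date, total_count)}."""
    latest = {}
    total = defaultdict(int)
    for location, valid_on, count in rows:
        # ISO dates sort correctly as strings, so max() picks the latest one.
        latest[location] = max(latest.get(location, valid_on), valid_on)
        total[location] += count
    # Sorting the keys plays the role of ORDER BY on the location name.
    return {loc: (latest[loc], total[loc]) for loc in sorted(total)}


for location, (latest_date, people) in aggregate(ROWS).items():
    print(location, latest_date, people)
```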
<!-- GEODATA SECTION -->
<h2 id="geo">Geodata in HXL</h2>
<p>HXL will be the central source for geographic information provided by OCHA, in particular the <a href="http://cod.humanitarianresponse.info">common operational datasets</a> (CODs). We have already loaded the worldwide country boundaries, as well as the administrative units and populated places for the countries we are currently using during the test phase (i.e., the countries affected by the Mali crisis) into our triple store.</p>
<h3>GeoSPARQL</h3>
<p>HXL adopts the <a href="http://www.opengeospatial.org/standards/sfa">Simple Features</a> model defined by the <a href="http://www.opengeospatial.org/">Open Geospatial Consortium</a>, an industry standard that is widely adopted and has recently been ported to RDF in the <a href="http://www.opengeospatial.org/standards/geosparql">GeoSPARQL</a> specification. GeoSPARQL extends SPARQL with spatial query capabilities, such as topological queries ("give me all camps inside this administrative unit") or buffering ("give me all hospitals within 20km of this route"). Note that <strong>the HXL SPARQL endpoint does not support GeoSPARQL queries yet</strong>; however, we already use the GeoSPARQL ontology for the representation of our geographic information, so that we can easily add this functionality later.</p>
<h3>Feature representation</h3>
<p>The Simple Features model distinguishes between a Feature (such as a country, an administrative unit, or an airport), its geometry, and the serialization of the geometry:</p>
<p><img src="img/features.png" /></p>
<p>Note that each feature can have several geometries, as the geometry may change over time. Moreover, it may be convenient to have both a point representation (e.g. for coarse mapping) and a detailed representation, e.g. as a polygon, for detailed mapping and analysis. Each of these geometries can have several serializations, e.g., as <a href="http://en.wikipedia.org/wiki/Well-known_text">well-known text</a> (WKT) and in the <a href="http://en.wikipedia.org/wiki/Geography_Markup_Language">Geography Markup Language</a> (GML). Currently, we only offer geometries in WKT, as they are both compact in storage and easy to convert to any other representation, such as GML, KML, or GeoJSON, using the functionality offered by <a href="http://postgis.refractions.net">PostGIS</a> or libraries such as <a href="https://github.com/phayes/geoPHP">GeoPHP</a>.</p>
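<p>To illustrate how little is needed to convert WKT, here is a sketch that turns a WKT point into a GeoJSON-style dictionary. It handles only the simple <code>POINT (x y)</code> form and the function name is made up for this example; the boundaries in the store are more complex geometries, for which a library such as the ones named above is the better choice.</p>

```python
import re

# Matches e.g. "POINT (2.1 13.5)"; other WKT geometry types are not covered here.
POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def wkt_point_to_geojson(wkt):
    """Convert a WKT point string into a GeoJSON-style Point dictionary."""
    match = POINT_RE.match(wkt)
    if match is None:
        raise ValueError("not a simple WKT point: %r" % wkt)
    x, y = (float(value) for value in match.groups())
    # WKT and GeoJSON both list the x coordinate before the y coordinate.
    return {"type": "Point", "coordinates": [x, y]}


print(wkt_point_to_geojson("POINT (2.1 13.5)"))
```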
<p>The following query demonstrates how to query the WKT representation of a specific feature (the country boundary for Niger in this case):</p>
<pre class="prettyprint linenums">prefix ogc: <http://www.opengis.net/ont/geosparql#>
SELECT ?wkt WHERE {
<http://hxl.humanitarianresponse.info/data/locations/admin/ner/NG> ogc:hasGeometry ?geometry .
?geometry ogc:hasSerialization ?wkt .
}</pre><a href="http://sparql.carsten.io/?query=prefix%20ogc%3A%20%3Chttp%3A//www.opengis.net/ont/geosparql%23%3E%0A%0ASELECT%20%3Fwkt%20WHERE%20%7B%0A%20%20%3Chttp%3A//hxl.humanitarianresponse.info/data/locations/admin/ner/NG%3E%20ogc%3AhasGeometry%20%3Fgeometry%20.%0A%20%20%3Fgeometry%20ogc%3AhasSerialization%20%3Fwkt%20.%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
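<p>Because a feature can have several geometries, the query above may return more than one row. As a minimal sketch, the variant below also returns the geometry node and the datatype of each serialization literal, which helps to tell the rows apart. It uses only the two GeoSPARQL properties shown above; whether the HXL store attaches a specific datatype to its WKT literals is not stated here, so treat the <code>?datatype</code> column as exploratory:</p>
<pre class="prettyprint linenums">prefix ogc: <http://www.opengis.net/ont/geosparql#>

# List every geometry of the feature together with its serialization.
SELECT ?geometry ?wkt (datatype(?wkt) AS ?datatype) WHERE {
  <http://hxl.humanitarianresponse.info/data/locations/admin/ner/NG> ogc:hasGeometry ?geometry .
  ?geometry ogc:hasSerialization ?wkt .
}</pre>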
<h3>HXL administrative hierarchy</h3>
<p>The administrative hierarchy in HXL is organized around administrative unit levels, where the countries are at level 0. The hierarchy below country level differs from country to country, as each country has its own hierarchy system. HXL therefore adopts the commonly used approach of numbering the levels, where 1 is directly below country level, then 2, then 3, etc. The depth of the hierarchy may also differ from country to country. In HXL, each populated place and admin unit is linked to the <em>containing</em> admin unit via the <code>hxl:atLocation</code> property. As an example, a populated place may be linked to its containing admin unit at level 2, which is in turn linked to its containing admin unit at level 1, which is linked to the containing country. This way, we can easily query, for example, all admin units inside a specific country:</p>
<pre class="prettyprint linenums">prefix hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT ?place WHERE {
?place hxl:atLocation <http://hxl.humanitarianresponse.info/data/locations/admin/ner/NG> .
}</pre><a href="http://sparql.carsten.io/?query=prefix%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20%3Fplace%20WHERE%20%7B%0A%20%20%3Fplace%20hxl%3AatLocation%20%3Chttp%3A//hxl.humanitarianresponse.info/data/locations/admin/ner/NG%3E%20.%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
<p>Note that this query will only return places and admin units that are <em>directly</em> linked to Niger via the <code>hxl:atLocation</code> property. In order to follow the <code>hxl:atLocation</code> hierarchy, we need a <a href="http://www.w3.org/TR/sparql11-property-paths/">property path</a> query. This example retrieves the whole admin hierarchy for <a href="http://hxl.humanitarianresponse.info/data/locations/admin/bfa/BFA050003122">Koutiala</a> in Burkina Faso; it also shows that each admin unit has a unique p-code that reflects the place hierarchy:</p>
<pre class="prettyprint linenums">prefix hxl: <http://hxl.humanitarianresponse.info/ns/#>
SELECT * WHERE {
<http://hxl.humanitarianresponse.info/data/locations/admin/bfa/BFA050003122> hxl:atLocation* ?place .
?place hxl:pcode ?pcode .
}</pre><a href="http://sparql.carsten.io/?query=prefix%20hxl%3A%20%3Chttp%3A//hxl.humanitarianresponse.info/ns/%23%3E%0A%0ASELECT%20*%20WHERE%20%7B%0A%20%20%3Chttp%3A//hxl.humanitarianresponse.info/data/locations/admin/bfa/BFA050003122%3E%20hxl%3AatLocation*%20%3Fplace%20.%0A%20%20%3Fplace%20hxl%3Apcode%20%3Fpcode%0A%7D&endpoint=http%3A//hxl.humanitarianresponse.info/sparql" class="btn pull-right execute" target="_blank">Execute <i class="icon-play"></i></a>
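<p>The same kind of property path also works in the opposite direction. As a sketch, the query below asks for every place and admin unit located anywhere below Niger, not just the directly linked ones. The <code>+</code> operator requires at least one <code>hxl:atLocation</code> step, so Niger itself is not part of the result, whereas <code>*</code> would include it. The result may be large, hence the <code>LIMIT</code>, whose value is an arbitrary choice:</p>
<pre class="prettyprint linenums">prefix hxl: <http://hxl.humanitarianresponse.info/ns/#>

# Follow hxl:atLocation over one or more steps up to Niger.
SELECT ?place WHERE {
  ?place hxl:atLocation+ <http://hxl.humanitarianresponse.info/data/locations/admin/ner/NG> .
}
LIMIT 100</pre>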
<p>On top of the RDF representation for the geographic information in HXL, we will also offer standard OGC services (<a href="http://en.wikipedia.org/wiki/Web_Map_Service">WMS</a> and <a href="http://en.wikipedia.org/wiki/Web_Feature_Service">WFS</a>) that are kept in sync with the HXL triple store. We are currently testing the setup, and will enable public access as soon as possible.</p>
<!-- TLDR / SUMMARY-->
<h2 id="tldr">TL;DR<sup><a href="http://en.wiktionary.org/wiki/TL;DR">?</a></sup></h2>
<p>HXL provides an RDF vocabulary that allows humanitarian data to be provided as Linked Data. Geographic information in HXL is encoded as defined by the GeoSPARQL spec. Our triple store can be queried at <code>http://hxl.humanitarianresponse.info/sparql</code>.</p>
</div>
</div>
</div>
<?php getFoot(array('jquery-ui-1.8.21.custom.min.js'), null ); ?>