Packs (add-ons) for SWI-Prolog |
Title: | SPARQLprog bindings for WikiData |
---|---|
Latest version: | 0.0.3 |
SHA1 sum: | 52cd638fea49af2fbc2ed8b078afd7073ea0dea4 |
Author: | Chris Mungall <cmungall@gmail.com> |
Packager: | Chris Mungall <cmungall@gmail.com> |
Home page: | https://github.com/cmungall/sparqlprog_wikidata |
Download URL: | https://github.com/cmungall/sparqlprog_wikidata/releases/*.zip |
Requires: | dcgutils, regex, sparqlprog |
Version | SHA1 | #Downloads | URL |
---|---|---|---|
0.0.1 | b98c0c575900b89162f3cadceead488524995d30 | 7 | https://github.com/cmungall/sparqlprog_wikidata/archive/v0.0.1.zip |
0.0.2 | 446a607e627ef66eff20b815b34bba05b9feba4e | 1 | https://github.com/cmungall/sparqlprog_wikidata/archive/v0.0.2.zip |
0.0.3 | 52cd638fea49af2fbc2ed8b078afd7073ea0dea4 | 49 | https://github.com/cmungall/sparqlprog_wikidata/archive/v0.0.3.zip |
![DOI](https://zenodo.org/badge/latestdoi/13996//sparqlprog_wikidata)
This is a module for sparqlprog that provides convenience predicates for making SPARQL queries over Wikidata using logic programming terms. It lets you define reusable query predicates and integrate programming constructs with queries in a declarative way.
All cities over a certain population size together with their continents:
pq-wikidata -l -L enlabel "city(City),part_of_continent(City,Continent),population(City,Pop),Pop>10000000"
yields:
City | Continent | Population | City Name | Continent Name | Pop Name |
---|---|---|---|---|---|
wd:Q174 | wd:Q18 | 12106920 | São Paulo | South America | $null$ |
wd:Q649 | wd:Q46 | 12500123 | Moscow | Europe | $null$ |
wd:Q1355 | wd:Q48 | 10535000 | Bangalore | Asia | $null$ |
wd:Q1156 | wd:Q48 | 12442373 | Mumbai | Asia | $null$ |
wd:Q406 | wd:Q48 | 14657434 | Istanbul | Asia | $null$ |
wd:Q406 | wd:Q46 | 14657434 | Istanbul | Europe | $null$ |
wd:Q85 | wd:Q15 | 19500000 | Cairo | Africa | $null$ |
wd:Q15174 | wd:Q48 | 11908400 | Shenzhen | Asia | $null$ |
wd:Q956 | wd:Q48 | 21710000 | Beijing | Asia | $null$ |
wd:Q1353 | wd:Q48 | 26495000 | Delhi | Asia | $null$ |
The -l argument automatically adds a label column for every column. (A label is meaningless for the population column, but it is included for consistency.)
Create a file city_ontology.pro with a single line:
big_city(City) :- city(City),population(City,Pop),Pop>10000000.
Now the big_city/1 predicate can be reused in queries:
pq-wikidata --consult city_ontology.pro -l -L enlabel "big_city(City),part_of_continent(City,Continent)"
yields:
city | continent | city label | continent label |
---|---|---|---|
wd:Q956 | wd:Q48 | Beijing | Asia |
wd:Q15174 | wd:Q48 | Shenzhen | Asia |
wd:Q1353 | wd:Q48 | Delhi | Asia |
wd:Q649 | wd:Q46 | Moscow | Europe |
wd:Q406 | wd:Q48 | Istanbul | Asia |
wd:Q406 | wd:Q46 | Istanbul | Europe |
wd:Q1156 | wd:Q48 | Mumbai | Asia |
wd:Q85 | wd:Q15 | Cairo | Africa |
wd:Q1355 | wd:Q48 | Bangalore | Asia |
wd:Q174 | wd:Q18 | São Paulo | South America |
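As a hedged sketch of how such a rules file might grow, the snippet below writes two rules to a file and passes it to --consult. The file name city_rules.pro and the predicate big_city_on/2 are hypothetical names chosen for illustration; city/1, population/2 and part_of_continent/2 are the predicates used in the queries above. It assumes that a user-defined rule may call another user-defined rule, and it runs the query only if pq-wikidata is found on the PATH.

```shell
# Write a small rules file (city_rules.pro is a hypothetical file name).
cat > city_rules.pro <<'EOF'
big_city(City) :- city(City), population(City,Pop), Pop>10000000.
big_city_on(City,Continent) :- big_city(City), part_of_continent(City,Continent).
EOF

# Run the query only when the pq-wikidata command is available.
if command -v pq-wikidata >/dev/null 2>&1; then
  pq-wikidata --consult city_rules.pro -l -L enlabel "big_city_on(City,Continent)"
fi
```

Keeping related rules in one file means a single --consult argument makes all of them available to a query.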
Qualified properties may be represented by n-ary predicates, such as population_at/3:
$ pq-wikidata -f tsv --consult tests/city_ontology.pro -l -L enlabel 'big_city(City),population_at(City,Pop,At),in_time_interval("2010-01-01"^^xsd:dateTime,"2013-01-01"^^xsd:dateTime,At)' | tbl2ghwiki
City | Pop | Time | City Name | Pop Name | Time Name |
---|---|---|---|---|---|
wd:Q1353 | 16787941 | 2011-01-01T00:00:00Z | Delhi | $null$ | $null$ |
wd:Q406 | 13624240 | 2011-01-01T00:00:00Z | Istanbul | $null$ | $null$ |
wd:Q649 | 11856578 | 2012-01-01T00:00:00Z | Moscow | $null$ | $null$ |
wd:Q3630 | 9607787 | 2010-01-01T00:00:00Z | Jakarta | $null$ | $null$ |
wd:Q404763 | 12010000 | 2011-01-01T00:00:00Z | Nanyang | $null$ | $null$ |
wd:Q649 | 11979529 | 2013-01-01T00:00:00Z | Moscow | $null$ | $null$ |
wd:Q649 | 11503501 | 2010-01-01T00:00:00Z | Moscow | $null$ | $null$ |
wd:Q11746 | 10220000 | 2013-01-01T00:00:00Z | Wuhan | $null$ | $null$ |
wd:Q406 | 14160467 | 2013-01-01T00:00:00Z | Istanbul | $null$ | $null$ |
wd:Q649 | 11776764 | 2011-01-01T00:00:00Z | Moscow | $null$ | $null$ |
wd:Q42622 | 10465994 | 2010-01-01T00:00:00Z | Suzhou | $null$ | $null$ |
wd:Q174 | 11316149 | 2011-01-01T00:00:00Z | São Paulo | $null$ | $null$ |
wd:Q406 | 13255685 | 2010-01-01T00:00:00Z | Istanbul | $null$ | $null$ |
wd:Q406 | 13854740 | 2012-01-01T00:00:00Z | Istanbul | $null$ | $null$ |
wd:Q1355 | 8425970 | 2011-01-01T00:00:00Z | Bangalore | $null$ | $null$ |
wd:Q174 | 11253503 | 2010-01-01T00:00:00Z | São Paulo | $null$ | $null$ |
wd:Q1490 | 13159388 | 2010-01-01T00:00:00Z | Tokyo | $null$ | $null$ |
wd:Q15174 | 10628900 | 2013-01-01T00:00:00Z | Shenzhen | $null$ | $null$ |
wd:Q3838 | 9464000 | 2012-01-01T00:00:00Z | Kinshasa | $null$ | $null$ |
wd:Q373346 | 10820000 | 2011-01-01T00:00:00Z | Linyi | $null$ | $null$ |
wd:Q1352 | 4646732 | 2011-01-01T00:00:00Z | Chennai | $null$ | $null$ |
wd:Q11739 | 7129629 | 2010-01-01T00:00:00Z | Lahore | $null$ | $null$ |
wd:Q1156 | 12442373 | 2011-01-01T00:00:00Z | Mumbai | $null$ | $null$ |
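Because -f tsv emits tab-separated rows, standard Unix tools can post-process the results. The sketch below keeps only the most recent population figure per city. To stay self-contained it uses a few rows copied from the output above rather than calling pq-wikidata, and it assumes the column order City, Pop, Time, City Name; the variable names rows and latest are illustrative only.

```shell
# A few sample rows from the output above, tab-separated:
# City, Pop, Time, City Name.
rows=$(printf '%s\t%s\t%s\t%s\n' \
  wd:Q649 11856578 2012-01-01T00:00:00Z Moscow \
  wd:Q649 11979529 2013-01-01T00:00:00Z Moscow \
  wd:Q406 13624240 2011-01-01T00:00:00Z Istanbul \
  wd:Q406 14160467 2013-01-01T00:00:00Z Istanbul \
  wd:Q1353 16787941 2011-01-01T00:00:00Z Delhi)

tab=$(printf '\t')

# Sort by city, then by timestamp (newest first), and keep the
# first row seen for each city. Print the city name and population.
latest=$(printf '%s\n' "$rows" \
  | LC_ALL=C sort -t "$tab" -k1,1 -k3,3r \
  | awk -F '\t' '!seen[$1]++ { print $4 "\t" $2 }')

printf '%s\n' "$latest"
```

With real data, the printf that builds rows would be replaced by the pq-wikidata -f tsv command itself; whether that output carries a header row that needs skipping is not covered here.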
Location queries:
Find all forests within a 100 mile radius of San Francisco:
$ pq-wikidata -l -L enlabel -f tsv "coordinate_location(wd:'Q62',Loc),geolocation_around(Loc,100,X),forest(X)"
The entity_search/2 predicate provides access to the Wikibase EntitySearch function. The following example finds all subclasses of a symptom by name:
$ pq-wikidata -l -L enlabel "entity_search(vomiting,Match),subclass_of_transitive(Symptom,Match)"
Match | Symptom | Match Label | Symptom Label |
---|---|---|---|
wd:Q127076 | wd:Q127076 | vomiting | vomiting |
wd:Q127076 | wd:Q2635499 | vomiting | Projectile vomiting |
wd:Q127076 | wd:Q21993813 | vomiting | chronic vomiting |
wd:Q127076 | wd:Q23012213 | vomiting | glowing vomit |
wd:Q127076 | wd:Q5140942 | vomiting | coffee ground vomiting |
wd:Q127076 | wd:Q54974197 | vomiting | anticipatory vomiting |
... | ... | ... | ... |
Note that affixing _transitive to a predicate always translates to the reflexive transitive version of that predicate. Here we find all known causes of different kinds of vomiting in Wikidata:
$ pq-wikidata -f tsv -l -L enlabel "subclass_of_transitive(S,wd:'Q127076'),has_cause(S,C)" | tbl2ghwiki
S | C | S Label | C Label |
---|---|---|---|
wd:Q1570161 | wd:Q1495657 | hematemesis | gastrointestinal bleeding |
wd:Q1938763 | wd:Q16244733 | fecal vomiting | intestinal obstruction |
wd:Q5140942 | wd:Q1883970 | coffee ground vomiting | upper gastrointestinal bleeding |
wd:Q127076 | wd:Q133823 | vomiting | migraine |
wd:Q127076 | wd:Q121041 | vomiting | appendicitis |
wd:Q127076 | wd:Q164778 | vomiting | rotavirus |
wd:Q127076 | wd:Q943897 | vomiting | gastroparesis |
wd:Q127076 | wd:Q974135 | vomiting | chemotherapy |
To run queries on the command line:
alias pq-wikidata="docker run cmungall/sparqlprog_wikidata pq-wikidata"
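An alias is convenient interactively, but bash does not expand aliases in non-interactive scripts by default, and a container cannot see local files such as city_ontology.pro unless they are mounted. One possible wrapper is sketched below. The function name pq_wikidata and the mount point /work are hypothetical, and it is an assumption that the image works with a changed working directory. The options --rm, -v and -w are standard docker run options.

```shell
# Hypothetical wrapper function around the Docker image.
pq_wikidata() {
  # --rm removes the container on exit.
  # -v mounts the current directory at /work (an assumed mount point).
  # -w makes /work the working directory so relative paths resolve.
  docker run --rm -v "$PWD":/work -w /work \
    cmungall/sparqlprog_wikidata pq-wikidata "$@"
}
```

Under those assumptions, a call such as pq_wikidata --consult city_ontology.pro -l -L enlabel "big_city(City)" would read the rules file from the current directory.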
To run a service:
docker run -p 9083:9083 cmungall/sparqlprog_wikidata
Install SWI-Prolog from http://www.swi-prolog.org, then install the pack from the Prolog prompt:
?- pack_install(sparqlprog_wikidata).
For each class in the defined subset, for example Country, multiple predicates will be defined:
For example, the query
city(City)
will be expanded to:
SELECT ?city WHERE { ?city (<http://www.wikidata.org/prop/direct/P31>/<http://www.wikidata.org/prop/direct/P279>*) <http://www.wikidata.org/entity/Q515> }
For each predicate in the defined subset, for example, regulates (molecular biology), the following predicates will be defined:
Further predicates will be defined that utilize the Wikidata reification model:
- $predicate_eiri/1 - Entity IRI for predicate
- $predicate_e2s - links entity to statement
- $predicate_s2v - links value from statement
- $predicate_s2q - links qualifier statement

To illustrate, consider the definition of the following 3-ary predicate, based on the positive therapeutic predictor predicate in Wikidata. These triples can be qualified by medical condition treated.
positive_therapeutic_predictor_for_condition(V,D,C) :- positive_therapeutic_predictor_e2s(V,S), medical_condition_treated_s2q(S,C), positive_therapeutic_predictor_s2v(S,D).
The following query:
pq-wikidata -C "positive_therapeutic_predictor_for_condition(Var,Drug,Condition)"
will be translated to:
SELECT ?var ?drug ?condition WHERE { ?var <http://www.wikidata.org/prop/P3354> ?v0 . ?v0 <http://www.wikidata.org/prop/qualifier/P2175> ?condition . ?v0 <http://www.wikidata.org/prop/statement/P3354> ?drug }
(Use -C to generate the SPARQL without executing it.)
Location of San Francisco:
$ pq-wikidata -l -L enlabel "geolocation(wd:'Q62',Lat,Long,Precision,Globe)"
37.766667,-122.433333,1.0E-6,wd:Q2,$null$,$null$,$null$,Earth
Currently only a small subset of the overall Wikidata schema is exposed, mostly focused on life science and geoscience/geographic use cases. More can be added on request.
In future we may translate the entire Wikidata model (i.e. all classes and properties) into sparqlprog predicates.
See [bin/wikidata-examples.sh](bin/wikidata-examples.sh)
Document API calls, search, etc
Pack contains 18 files holding a total of 63.8K bytes.