REVEAL Results Vol. 3: Geoparsing and Verification Decision Support
In Volume 3 of our REVEAL Results series, we report on the work of IT Innovation. IT Innovation (“IT Inno” for short) worked primarily on geolocation and developed the “Journalist Decision Support System (JDSS).” Find out more in the article below.
In the area of geoparsing, IT Innovation developed and can provide access to the following outcomes:
The Open geoparsepy Python PyPI library
Geoparsepy is a free Python library for scalable geoparsing hosted on the Python Package Index (PyPI). Geoparsepy uses a local OpenStreetMap planet database to avoid the rate limits seen on services such as the Google Geocoder API. This gives it full access to the location geometry and consequently allows it to outperform all existing gazetteer-based geoparsing approaches. In addition, because the planet’s geography is at its disposal, it can geoparse at region, street and building levels, i.e. in much more detail than most state-of-the-art approaches, which only work at region level.
Geoparsepy is a powerful and scalable way to add location awareness to your software. More at https://pypi.python.org/pypi?:action=display&name=geoparsepy
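To give a feel for what gazetteer-based geoparsing at several levels means, here is a minimal Python sketch. It is not geoparsepy and does not use its API: the tiny in-memory gazetteer, the place names and the function names `tokenize` and `geoparse` are all assumptions made for illustration, whereas a real system would draw its names and geometry from an OpenStreetMap database.

```python
import re

# Hypothetical toy gazetteer: token tuples mapped to (display name, level).
# A real geoparser would load millions of names from OpenStreetMap instead.
GAZETTEER = {
    ("southampton",): ("Southampton", "region"),
    ("high", "street"): ("High Street", "street"),
    ("city", "hall"): ("City Hall", "building"),
}
MAX_NGRAM = max(len(key) for key in GAZETTEER)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def geoparse(text):
    """Return (name, level) pairs for gazetteer entries found in text.

    Scans left to right and prefers the longest matching phrase at each
    position, so a two-word street name wins over a one-word match.
    """
    tokens = tokenize(text)
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_NGRAM, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in GAZETTEER:
                matches.append(GAZETTEER[key])
                i += n
                break
        else:
            # No gazetteer entry starts at this token; move on.
            i += 1
    return matches


if __name__ == "__main__":
    print(geoparse("Flooding near City Hall on High Street, Southampton"))
```

The sketch only does exact phrase lookup. It leaves out the parts that make real geoparsing hard, such as disambiguating places that share a name and using location geometry to check that a street actually lies within the region mentioned.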
The Geoparse Benchmark Open Dataset
The geoparsing benchmark dataset contains thousands of tweets recorded during four different natural disasters. These events are Hurricane Sandy in 2012, the Milan Blackouts of 2013, the Turkish Earthquake of 2012 and the Christchurch Earthquake of 2012. Each tweet in the dataset has been manually labelled with location entries at the building, street and region levels to provide a gold standard for evaluation work. The data consists of the full JSON-serialized tweet metadata (i.e. including the text) with an additional ‘entities’ field of type ‘mentions’ holding the ground truth location annotations.
The geoparse benchmark open dataset is a major scientific resource for evaluating geoparsing approaches of the future. For details, see http://web-001.ecs.soton.ac.uk/ and search for ‘GEOPARSE TWITTER BENCHMARK DATASET’.
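As a rough illustration of how such annotated tweets might be read, the Python sketch below parses one JSON-serialized tweet and pulls out its location annotations. The exact layout of the annotation field is an assumption here: we guess that ‘mentions’ is a list nested under ‘entities’, and the keys `name` and `level`, the sample record and the function name `location_mentions` are hypothetical. Please check the dataset's own documentation for the real structure.

```python
import json

# Hypothetical sample record. The nesting of "mentions" under "entities"
# and the "name" / "level" keys are assumptions, not the documented format.
SAMPLE = """
{
  "text": "Power is out on High Street",
  "entities": {
    "mentions": [
      {"name": "High Street", "level": "street"}
    ]
  }
}
"""


def location_mentions(tweet_json):
    """Return (tweet text, list of location annotations) for one tweet.

    Missing fields are treated as empty, so tweets without annotations
    yield an empty list rather than raising an error.
    """
    tweet = json.loads(tweet_json)
    mentions = tweet.get("entities", {}).get("mentions", [])
    return tweet.get("text", ""), mentions


if __name__ == "__main__":
    text, mentions = location_mentions(SAMPLE)
    print(text)
    for mention in mentions:
        print(mention["level"], mention["name"])
```

A geoparser can then be evaluated by comparing the locations it extracts from each tweet's text against these manually labelled annotations.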
Another area of IT Innovation’s research in REVEAL has been Social Media Verification Decision Support. Check out the following:
The Journalist Decision Support System (JDSS)
The Journalist Decision Support System is a free, scalable Twitter analytics platform allowing journalists to crawl Twitter for posts and find user generated content (UGC) relevant to verification tasks. Up to 19 journalists can use JDSS simultaneously, each interactively browsing tens of thousands of posts in real time. Background analytics are automatically run on all posts and on any linked or embedded images and videos, including sentiment analysis, fake and eyewitness media labelling, and trusted fact and claim extraction. Journalists can interactively explore posts, clustering and sub-clustering the data to quickly find groups of contextual posts highly related to the event or claim being verified.
The Journalist Decision Support System gets journalists to the right content quickly so they can make UGC verification decisions faster. You can find it at https://reveal-jdss.it-innovation.soton.ac.uk/reveal_journalists_dss/
Note: this only works properly with a recent version of the Chrome browser.
Bias in Linked Open Data (BLinD)
Bias in Linked Open Data (BLinD) is a free dashboard to support source bias analytics. Journalists can quickly gauge the possible bias of any resource (e.g. person or organisation) in Wikipedia. It contains interactive visualisations to provide a quick and simple way to visually examine a biased report whilst keeping all the links back to the original evidence on Wikipedia. Data sources include DBpedia and Wikidata.
BLinD automatically compiles Wikipedia evidence about sources so journalists can focus on making decisions about potential bias. You can find more at https://github.com/it-innovation/BLinD
Feedback on the work described above is more than welcome. Feel free to get in touch about this or any related issues. For inquiries, please contact Stuart E. Middleton of IT Innovation.
All services and demos to which links are provided above are operated by IT Innovation. IT Innovation provide the services as well as access to code and data “as is” and for demonstration purposes only, not assuming any responsibility / liability of whatever nature. Neither the REVEAL consortium nor the hosts of the REVEAL website are liable for anything resulting from usage of the above demos / code either.