REVEAL Results Vol 1: Verification tools, knowledge and code
In the series “REVEAL Results” we provide information about REVEAL outcomes and achievements. To kick off, we list a number of verification tools, technologies and services that were developed by project partner CERTH. You can try them out yourself and, where available, access the code.
The CERTH Image Verification Assistant – as part of the work on image forensics
This is a web-based user interface that can support the verification of an arbitrary image found on the web or uploaded by the user. Once the user clicks “Verify”, the system presents the results of seven forensics analysis algorithms (six third-party, state-of-the-art methods and one developed within REVEAL). In addition, it concisely presents all Exif metadata found in the image and automatically generates a link for performing a reverse image search using the respective Google service. Finally, the system can export the results of the analysis as a PDF document.
Try it out on http://reveal-mklab.iti.gr/reveal/
Datasets and Evaluation Toolbox – also carried out as part of CERTH’s work on image forensics. With contributions from DW.
This is a particularly useful resource for the image forensics research community. It comprises two datasets, the “Wild Web Tampered Image dataset” and the “DW Image Forensics Dataset”. Both datasets were used to expose several limitations of state-of-the-art image forensics algorithms and are available for use by other researchers working on the topic. Importantly, to facilitate reproducibility of results, we have made available implementations of a number of reference image forensics methods, together with the one developed within REVEAL, and a solid experimental benchmark for comparing them on different datasets.
Get the data from:
http://mklab.iti.gr/project/wild-web-tampered-image-dataset
http://revealproject.eu/the-deutsche-welle-image-forensics-dataset/
https://github.com/MKLab-ITI/image-forensics
For Tweet Verification (1): the CERTH Tweet Verification Assistant
This is a web-based user interface that can support the verification of an arbitrary tweet. Once the user clicks “Verify”, the system presents a prediction of whether the tweet should be considered fake (misleading) or real (trustworthy), shown in red and green respectively. It also visualizes a number of extracted features that are typically associated with fake and real tweets. The implementation of the algorithm that produces these results has been open-sourced.
Try it out at http://reveal-mklab.iti.gr/reveal/fake/ and
get the code from https://github.com/MKLab-ITI/computational-verification
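To give a feel for what "extracted features" can mean for a tweet, here is a minimal sketch that computes a few surface features from the tweet text. The function name, the feature names and the choice of features are assumptions made for illustration only; they are not taken from the CERTH implementation, which is documented in the repository linked above.

```python
import re


def extract_tweet_features(text: str) -> dict:
    """Return a few simple text features of a tweet (hypothetical feature set)."""
    return {
        "num_chars": len(text),
        "num_words": len(text.split()),
        "num_question_marks": text.count("?"),
        "num_exclamation_marks": text.count("!"),
        "num_uppercase_chars": sum(1 for c in text if c.isupper()),
        "num_hashtags": len(re.findall(r"#\w+", text)),
        "num_mentions": len(re.findall(r"@\w+", text)),
        "num_urls": len(re.findall(r"https?://\S+", text)),
    }


# Made-up example tweet, not taken from any dataset.
example = "BREAKING!!! Shark swimming on the highway?! #flood http://example.com/pic"
features = extract_tweet_features(example)
for name, value in features.items():
    print(f"{name}: {value}")
```

Features like these would then be fed to a trained classifier; the sketch stops at feature extraction and makes no claim about which features actually separate fake from real tweets.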
For Tweet Verification (2): Verification Corpus and Benchmarking. With contributions from ITINNO.
The Tweet Verification Assistant was based on a large labelled collection of historic tweets that shared fake and real images and videos. The dataset formed the basis for setting up the Verifying Multimedia Use task in the context of the MediaEval Benchmarking activity, which ran for two years (2015 and 2016) and has led to important findings with respect to automated verification of tweets.
Data and more on https://github.com/MKLab-ITI/image-verification-corpus
and http://www.multimediaeval.org/mediaeval2016/verifyingmultimediause/
Disturbing Image Detection
This is a web-based user interface that demonstrates the potential of modern computer vision technology to automatically identify images with disturbing and potentially traumatizing content. The implementation of the classification algorithm is based on a state-of-the-art approach that uses Deep Convolutional Neural Network features and was trained on a dataset collected and annotated by CERTH researchers for this purpose.
http://reveal-mklab.iti.gr/reveal/disturbing/
Multimedia Geotagging
This is a highly reliable algorithm for estimating the location (in terms of latitude/longitude coordinates) of a social media post based on its text metadata (e.g. title, tags). The algorithm is based on a statistical Language Model that is “trained” on a massive set of geotagged text items from the recently released YFCC100M dataset. The algorithm also features a number of refinements, including improved feature selection and weighting, and has achieved top performance in the recent MediaEval Placing Task editions (2015, 2016). The implementation of the algorithm has been open-sourced.
Get the code from https://github.com/MKLab-ITI/multimedia-geotagging
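The general idea of text-based location estimation can be sketched in a few lines: divide the globe into grid cells, count how often each tag occurs in each cell in the training data, and assign a new post to the cell its tags point to most strongly. The code below is a simplified illustration under our own assumptions (a 1-degree grid, a plain sum of per-tag cell probabilities, and toy training data); it does not reproduce the feature selection and weighting refinements of the actual algorithm.

```python
from collections import Counter, defaultdict

CELL_SIZE = 1.0  # grid resolution in degrees (assumed value for this sketch)


def cell_of(lat, lon):
    """Map coordinates to the index of the grid cell that contains them."""
    return (int(lat // CELL_SIZE), int(lon // CELL_SIZE))


def train(items):
    """Build tag -> Counter(cell -> count) from (tags, lat, lon) triples."""
    model = defaultdict(Counter)
    for tags, lat, lon in items:
        cell = cell_of(lat, lon)
        for tag in set(tags):
            model[tag][cell] += 1
    return model


def predict(model, tags):
    """Return the centre (lat, lon) of the best-scoring cell, or None."""
    scores = Counter()
    for tag in set(tags):
        cells = model.get(tag)
        if not cells:
            continue  # tag never seen in training
        total = sum(cells.values())
        for cell, count in cells.items():
            scores[cell] += count / total  # estimate of p(cell | tag)
    if not scores:
        return None
    (row, col), _ = scores.most_common(1)[0]
    return ((row + 0.5) * CELL_SIZE, (col + 0.5) * CELL_SIZE)


# Toy training set; the real model is trained on YFCC100M.
training_items = [
    (["acropolis", "athens"], 37.97, 23.73),
    (["athens", "parthenon"], 37.97, 23.72),
    (["london", "bigben"], 51.50, -0.12),
    (["thessaloniki", "whitetower"], 40.63, 22.95),
]
model = train(training_items)
predicted = predict(model, ["athens", "holiday"])
unknown = predict(model, ["unseen-tag"])
print(predicted, unknown)
```

Returning the cell centre keeps the sketch short; a finer grid or a search for similar items inside the winning cell would give a more precise estimate.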
Network-based User Classifier
This is an algorithm that classifies Twitter users into categories based on their associations with other Twitter users. The novel characteristics of the algorithm are that a) it automatically establishes the categories by fetching and processing the Twitter lists to which a small subset of the Twitter users of interest belong, and b) it assigns categories to users by looking both at their interactions with other users and at the similarities between their posts.
More on https://github.com/MKLab-ITI/reveal-graph-embedding
and http://mklab.iti.gr/resources/arcte/
News Popularity Prediction
This is an algorithm that predicts the future popularity of a Reddit post or YouTube video by analyzing the respective comment activities. The algorithm builds a comment tree and a user-response graph and extracts a number of features that capture the structure of this tree and graph, which it then uses to build a predictive model. The implementation of the algorithm has been open-sourced and a public web demo is available.
You can find them on http://reveal-mklab.iti.gr/reveal/popularity/
and https://github.com/MKLab-ITI/news-popularity-prediction
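As an illustration of the kind of structural features described above, the following sketch builds a comment tree and a user-response graph from a flat list of comments and derives a handful of numbers from them. The input format, the function name and the specific features are assumptions made for this example and are not taken from the open-sourced implementation.

```python
from collections import defaultdict


def comment_features(post_author, comments):
    """Compute simple structural features of a discussion.

    comments: list of (comment_id, parent_id, author) tuples, where
    parent_id is None for a direct reply to the post (assumed format).
    """
    children = defaultdict(list)
    author_of = {None: post_author}  # None stands for the post itself
    for comment_id, parent_id, author in comments:
        children[parent_id].append(comment_id)
        author_of[comment_id] = author

    # Walk the comment tree level by level to get its depth and width.
    max_depth, max_width = 0, 0
    level = children[None]
    while level:
        max_depth += 1
        max_width = max(max_width, len(level))
        level = [child for node in level for child in children[node]]

    # User-response graph: one directed edge per (replier, replied-to) pair.
    edges = {
        (author, author_of[parent_id])
        for _, parent_id, author in comments
        if author != author_of[parent_id]
    }
    return {
        "num_comments": len(comments),
        "max_depth": max_depth,
        "max_width": max_width,
        "num_users": len(set(author_of.values())),
        "num_reply_edges": len(edges),
    }


# Made-up discussion used only to exercise the function.
discussion = [
    (1, None, "bob"),
    (2, None, "carol"),
    (3, 1, "alice"),
    (4, 3, "bob"),
    (5, 1, "dave"),
]
features = comment_features("alice", discussion)
print(features)
```

A predictive model would take such features, computed on the comments observed so far, as its input; the sketch covers only the feature extraction step.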
Multimedia Summarization
This is an algorithm that analyzes a large collection of social multimedia items (i.e. posts containing images) around an event or topic of interest. It can select an appropriate subset that summarizes the main highlights of the event/topic of interest. The implementation of the algorithm has been open-sourced and made available along with two large-scale labelled datasets that can be used for evaluating its performance.
Find them here: https://github.com/MKLab-ITI/mgraph-summarization
Contact
If you would like to provide feedback on the prototypes and demos listed here, which is very welcome, or want to raise any other related issue, please do not hesitate to get in touch with CERTH’s Symeon Papadopoulos.
Notice
All services and demos to which links are provided above are operated by CERTH-ITI. CERTH-ITI provides the services as well as access to code and data “as is” and for demonstration purposes only, and assumes no responsibility or liability of any kind. Neither the REVEAL consortium nor the hosts of the REVEAL website are liable for anything resulting from use of the above demos or code.