Send in the Robots: Automated Journalism and its Potential Impact on Media Pluralism
Resources for investigative journalism are diminishing. In the digital age, this was a foreseeable evolution: publishers typically regard these pieces as time-consuming and expensive, and the results of the research are often unpredictable and potentially disappointing. In this post, Pieter-Jan Ombelet of the KU Leuven Centre for IT & IP Law analyses automated journalism (also referred to as robotic reporting) as a potential solution to combat the diminution of investigative journalism, and looks at the potential (positive and negative) impact of automated journalism on media pluralism.
What is automated journalism?
Automated journalism was defined by Matt Carlson as “algorithmic processes that convert data into narrative news texts with limited to no human intervention beyond the initial programming”. Narrative Science and Automated Insights are arguably the biggest companies at the moment specialising in this algorithmic content creation. Once there is core data to work with, the software of these companies can extrapolate complete news stories out of this data. To date, the most common uses of this software have been in the field of sports and financial reporting, often creating niche content that would not exist without the respective software (e.g. reports on ‘Little League’ games).
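To make the idea of "converting data into narrative news texts" concrete, here is a minimal sketch of template-based generation in the spirit of automated sports reporting. It is an illustration only: the field names, templates and phrasing rules are invented, and commercial systems such as those of Narrative Science and Automated Insights are far more sophisticated.

```python
# Toy sketch of data-to-text news generation. All field names and
# templates are invented for illustration.

def generate_recap(game: dict) -> str:
    """Turn structured box-score data into a one-sentence game recap."""
    margin = abs(game["home_score"] - game["away_score"])
    if game["home_score"] > game["away_score"]:
        winner, loser = game["home_team"], game["away_team"]
    else:
        winner, loser = game["away_team"], game["home_team"]
    # Choose phrasing based on the data: a close game reads differently
    # from a blowout.
    verb = "edged" if margin <= 3 else "defeated"
    high = max(game["home_score"], game["away_score"])
    low = min(game["home_score"], game["away_score"])
    return f"{winner} {verb} {loser} {high}-{low} on {game['date']}."

print(generate_recap({
    "home_team": "Riverside", "away_team": "Lakeview",
    "home_score": 5, "away_score": 3, "date": "12 May",
}))
# → Riverside edged Lakeview 5-3 on 12 May.
```

Given a feed of structured box scores, a pipeline like this can produce a recap for every Little League game in a league, which is precisely the kind of niche content that would otherwise never be written.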
Don’t forget the humans!
Once these algorithms are optimised to allow newsrooms to use robotic reporters to write and edit news stories independently, this could have a serious impact on human journalists. Stuart Frankel, CEO of Narrative Science, envisions a media landscape in which “a reporter is off researching his or her story, getting information that’s not captured in some database somewhere, really understanding the story in total, and writing part of that story, but also including a portion of the story that in fact is written by a piece of technology.” The portions written by the algorithm could distil meaningful output from complex data and would arguably be less biased, and in that sense more trustworthy, than what could be expected from a human journalist.
Other voices have, however, expressed more caution. They emphasise the humanity inherently linked to proper journalism. The argument is valid: an article written by an algorithm will never intentionally contain new ideas or viewpoints, and this generic nature is one of the downsides of automated journalism. The media play a crucial role in a representative democracy, which is characterised by its culture of dissent and argument. Generic news stories do not invigorate that culture.
Still, the evolution towards a media landscape in which algorithms write portions of news stories should be embraced. There is, however, an important caveat: these pieces should be edited by human journalists or publishers and supplemented with parts written by the human reporters themselves, to avoid a sole focus on quantitative content diversity, i.e. a merely numerical assessment of diversity that takes no account of quality.
One must also not underestimate the risk of human journalists simply losing their jobs, or seeing their role reduced to that of an editor of algorithmic output.
These are real risks. However, one should not overestimate these negative side effects and lapse into doom scenarios. The reallocation of resources brought about by converging media value chains has had remarkably interesting consequences. Original content creation by streaming services such as Netflix and Amazon has been a resounding success. Furthermore, the proliferation and popularity of user-generated (journalistic) content and citizen investigative journalism websites (e.g. Bellingcat) show that interesting new written content continues to emerge, albeit perhaps in a less traditional sense.
Don’t forget the individual user!
Another future use of automated journalism could be to bring personalised news products to individual users. Paul Bradshaw remains unsure of the added economic value of personalised automated news stories. Yet media personalisation techniques and complex algorithms, such as Google’s PageRank algorithm or Twitter’s Trends list, are already designed to define every user’s profile in order to develop an individualised relationship with them. These filtering techniques are used to customise news services to serve users’ specific needs and interests and to help them sift through the vast stores of online information. Following the same user-centric approach, news content-creating algorithms could create multiple customised versions of a specific news story to better suit the taste, viewpoints or profile of every individual user.
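A minimal sketch of what such personalisation could look like: serving one of several pre-generated variants of the same story according to a user's interest profile. The variant texts, interest labels and selection rule are all invented assumptions for illustration; a real system would generate the variants algorithmically rather than store them.

```python
# Hypothetical sketch: one story, several framings, chosen per user.
# All names and texts are invented.

STORY_VARIANTS = {
    "finance": "Shares in the club rose 4% after last night's 5-3 win.",
    "sports": "A late rally sealed a thrilling 5-3 win last night.",
    "default": "The home side won 5-3 last night.",
}

def personalise(user_interests: list) -> str:
    """Return the variant matching the user's strongest known interest.

    The interests are assumed to be pre-ranked by the profiling system.
    """
    for interest in user_interests:
        if interest in STORY_VARIANTS:
            return STORY_VARIANTS[interest]
    return STORY_VARIANTS["default"]

print(personalise(["finance", "gardening"]))  # finance-framed variant
```

Note that two neighbours with different profiles would receive different texts for the same underlying event, which is exactly the scenario the following paragraphs examine.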
This personalisation of news items could become worrisome once the news stories automatically produced by algorithms are not merely factual but also include some adjustable viewpoints. And even if the articles remain neutral, Evgeny Morozov worries that “some people might get stuck in a vicious news circle, consuming nothing but information junk food and having little clue that there is a different, more intelligent world out there.”
This view resembles the fears expressed by Neil Richards. In his work, Richards coined the term intellectual privacy, defined as “the protection from surveillance or unwanted interference by others when we are engaged in the processes of generating ideas and forming beliefs.” Indeed, once news stories are adjusted for each individual, one’s intellectual privacy is hindered. Trapped in a prism of light, the idle audience will concentrate its attention on a very narrow array of sources, a filter bubble, focused solely on its very specific needs and interests and containing only like-minded speech. Once citizens do not realise that they are reading a different version of the same news story than their neighbour, even critical citizens will partly lose their freedom of choice in composing a pluralistic media diet.
Additional issues surface when these personalised news stories are assessed legally. To properly conduct this far-reaching type of profiling, the personal data of individual users needs to be processed. Ad networks, for example, use tracking techniques, cookie-based technologies and data-mining software to build profiles of individual users. Online advertising systems further often classify data subjects into segments, for example by marketing category (e.g. “gardening” or “cars”). The data subject’s location is further deduced from the IP addresses of terminals and WiFi access points.
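The segment classification described above can be sketched very simply: a user's observed browsing topics are matched against keyword sets for each marketing category. The categories and keywords here are invented placeholders; real ad networks use far richer behavioural signals.

```python
# Illustrative only: bucketing a user into coarse marketing segments
# from observed page topics. Categories and keywords are invented.

SEGMENT_KEYWORDS = {
    "gardening": {"plants", "seeds", "lawn"},
    "cars": {"engine", "sedan", "horsepower"},
}

def classify(visited_topics: set) -> list:
    """Return every segment whose keywords overlap the browsing history."""
    return sorted(seg for seg, words in SEGMENT_KEYWORDS.items()
                  if words & visited_topics)

print(classify({"seeds", "engine", "news"}))  # → ['cars', 'gardening']
```

It is this kind of inference, multiplied across many data sources, that triggers the data protection obligations discussed next.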
In line with this example, the personal data processing involved in personalising news stories should comply with the European Privacy and Data Protection Framework. More specifically, the provisions of the E-Privacy Directive (ePD) – or Cookie Directive – and the Data Protection Directive (DPD) should be respected whenever automated journalism involves personal data processing. In order to use personal data to write the story, the robotic reporter will have to obtain the unambiguous consent of the user (Article 5(3) ePD and Article 7 DPD), signifying his agreement to personal data relating to him being processed. Individuals have a general right not to be subject to solely automated processing of data which evaluates certain personal aspects relating to them (Article 15 DPD). Personal data should further only be collected for specified, explicit and legitimate purposes and not further processed in a way incompatible with those purposes (Article 6(1)(b) DPD). Every new purpose for processing data, such as personalising news items, must have its own particular legal basis. The robotic reporter cannot use personal data that was initially acquired or processed for another purpose, e.g. advertising. Moreover, the Draft General Data Protection Regulation (GDPR) explicitly grants every natural person the right not to be subject to profiling (Article 20 GDPR). The ambiguity of the legal status of profiling – also in the context of personalised news stories – will therefore be removed once the regulation enters into force.
Conclusion: trusting the transparent robot
It is still unclear how sophisticated these news content-creating algorithms will become. Yet, considering the algorithms already employed to compose music and write poetry comparable to that of human composers and poets, it is never too early to be aware of the remarkable, for some even frightening, possibilities of artificial intelligence. Especially if personalised robotic news stories become reality, informing readers about the processing of their personal data involved in producing these stories will be crucial.
Note: This blog was not written by an algorithm! Instead, it is a modified and updated version of two texts published on the LSE Media Policy Project blog. This article provides the views of the named author only and does not represent the position of the REVEAL Consortium, the LSE Media Policy Project blog, or the views of the London School of Economics.