Full Fact AI

Every day, fact checkers around the world find, check and challenge false claims identified by AI-enabled software produced by Full Fact.

We have built scalable, robust software for fact checkers and other organisations focused on good information. It saves time, money and effort in identifying the most important bad information to address.

If you want to use or test our software, you can sign up via the form below.

 

Our Goals

Bad information ruins lives. It harms our communities, by spreading hate through misleading claims. It hurts our democracy, by damaging trust in politicians and political processes. It leads to bad decisions, by disrupting public debate on the issues that most affect us, including climate change and public spending.

In May 2019 we – along with Africa Check, Chequeado and the Open Data Institute – won the Google.org AI Impact Challenge. We were just one of 20 international winners, chosen from more than 2,600 entrants. Over the next three years, with Google’s support, we used machine learning to improve and scale fact checking, working with international experts to define how artificial intelligence could transform this work, to develop new tools and to deploy and evaluate them. 

These tools are now available for other organisations to use via a paid licence. Our goal is to create a global collaborative effort to help media outlets, civil society, platforms and public policy makers better understand the landscape, and to bring the benefits of those tools to everyone.

What we are building

We have made a set of tools designed to alleviate the pain points we experience in the fact checking process. As fact checkers with ten years’ experience, we understand the operational advantages these tools can bring, making us uniquely placed to build them.

We want to use technology to help fact checkers:

  • Know the most important thing to fact check each day
  • Know when someone repeats a claim already known to be false
  • Check things in as close to real time as possible

Across a suite of products, our technology performs the following tasks:

Collecting and monitoring the data

The data we collect can come from speech on live TV, online news sites and social media pages. Our users define the sources themselves through a simple UI.

Once we have all the input information available as text, we split everything into individual sentences, which are the atomic unit of our work. The sentences are then passed through a number of steps to enrich them and make them more useful.
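
As a rough illustration of the splitting step, here is a minimal sketch of a naive sentence splitter. It is not Full Fact's implementation: the function name split_sentences and the boundary rule are assumptions made for this example, and real transcripts (abbreviations, missing punctuation in live captions) need considerably more care.

```python
import re

# Assumed rule: a sentence ends at ., ! or ? when followed by whitespace and
# then an uppercase letter, a digit or an opening quote.
_SENTENCE_BOUNDARY = re.compile(r'(?<=[.!?])\s+(?=[A-Z0-9"“])')


def split_sentences(text: str) -> list[str]:
    """Return the non-empty sentences found in `text`."""
    parts = _SENTENCE_BOUNDARY.split(text.strip())
    return [part.strip() for part in parts if part.strip()]


for sentence in split_sentences(
    "GDP has risen by 2%. Unemployment is falling! Is that true?"
):
    print(sentence)
```

Each returned sentence can then be enriched independently by the later steps.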

Identifying and labelling claims

We define a claim as the checkable part of any sentence.

There are many different types of claims, including claims about quantities (“GDP has risen by x%”), claims about cause and effect (“this policy leads to y”), predictive claims about the future (“the economy will grow by z”) and more.

We have developed a claim-type classifier to guide users towards claims that might be worth investigating. We built this with the BERT model and fine-tuned it using our own annotated data. BERT is a tool released by Google Research that has been pre-trained on hundreds of millions of sentences in over 100 languages. This makes it a broad statistical model of language as it is actually used, and helps us build tools designed to work in an equally wide range of languages.
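
To make the idea of claim types concrete, the sketch below labels sentences with a few hand-written keyword rules. This is deliberately a toy stand-in, not the fine-tuned BERT classifier described above: the label names, the patterns and the function label_claim_types are all assumptions made for illustration.

```python
import re

# Hypothetical label names and rules, chosen to mirror the three example
# claim types in the text. A trained classifier learns far richer patterns.
CLAIM_PATTERNS = {
    "quantity": re.compile(r"\d"),
    "cause_effect": re.compile(r"\b(leads? to|causes?|results? in|because of)\b", re.I),
    "prediction": re.compile(r"\b(will|is going to|is expected to)\b", re.I),
}


def label_claim_types(sentence: str) -> list[str]:
    """Return every claim type whose rule matches; empty if none do."""
    return [
        name for name, pattern in CLAIM_PATTERNS.items() if pattern.search(sentence)
    ]


print(label_claim_types("The economy will grow by 2%"))
```

A sentence can carry more than one label, and a sentence with no labels is treated as containing nothing checkable.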

Labelling claims in this way reduces the volume of data we process from hundreds of thousands of sentences to tens of thousands. It is a vital first step in ensuring that the users of our tools have a chance to make sense of all the information. We then filter these claims further by topic (such as health and the economy), so that a time-poor user can instantly see 100 important claims on a topic.
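
The two filtering stages can be sketched as follows. The LabelledSentence record, its score field (a hypothetical importance score) and the top_claims function are assumptions made for this example; the source does not say how importance is ranked.

```python
from dataclasses import dataclass, field


@dataclass
class LabelledSentence:
    text: str
    claim_types: list[str] = field(default_factory=list)  # empty means no claim
    topic: str = ""
    score: float = 0.0  # hypothetical importance score


def top_claims(
    sentences: list[LabelledSentence], topic: str, limit: int = 100
) -> list[LabelledSentence]:
    """Keep sentences that contain a claim on `topic`, highest score first."""
    candidates = [s for s in sentences if s.claim_types and s.topic == topic]
    candidates.sort(key=lambda s: s.score, reverse=True)
    return candidates[:limit]


sample = [
    LabelledSentence("GDP has risen by 2%", ["quantity"], "economy", 0.9),
    LabelledSentence("Good morning everyone", [], "economy", 0.1),
    LabelledSentence("Waiting lists will fall", ["prediction"], "health", 0.7),
]
print([s.text for s in top_claims(sample, "economy")])
```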

Matching claims

Once we have labelled claims, sentences are checked to see if they match something we have previously fact checked. Some claims are easier to model than others, depending on how specific or ambiguous the language used to describe them is.

Again, we have trained a BERT-style model to predict match/no-match for pairs of sentences, and then added a range of other techniques such as entity analysis (e.g. checking whether both sentences contain the same numbers, people, organisations, etc.). In combination, these stages consistently find repeats of a claim even if different words are used to describe it.
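
The entity analysis idea can be illustrated for numbers alone. This sketch covers only that one signal and is not the matching model itself: the regular expression and the function names extract_numbers and shared_number_count are assumptions, and detecting people or organisations would need a named-entity recogniser.

```python
import re

# Matches figures such as "2019", "3.8%" or "1,000".
_NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")


def extract_numbers(sentence: str) -> set[str]:
    """Return the distinct numbers in a sentence, with thousands commas removed."""
    return {m.group().replace(",", "") for m in _NUMBER.finditer(sentence)}


def shared_number_count(first: str, second: str) -> int:
    """Count the numbers that appear in both sentences."""
    return len(extract_numbers(first) & extract_numbers(second))


checked = "Unemployment fell to 3.8% in 2019"
candidate = "In 2019 the jobless rate dropped to 3.8%"
print(shared_number_count(checked, candidate))
```

A count like this would be one feature alongside the model's match/no-match prediction: two sentences that share wording but disagree on every figure are less likely to be repeats of the same claim.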

Real time checks

Finally, we use external processes to help spot more claims and further identify patterns of language that can be automatically checked.

Given a sentence, our tool attempts to identify the topic, trend, values, dates and location. If that succeeds, it compares the extracted information with the corresponding data from the UK’s Office for National Statistics. It knows about topics such as inflation and unemployment, and can check claims about values, trends and records. This means our technology can automatically compare each claim with reliable data to identify whether it is correct. This requires the publication of high-quality, openly accessible statistics in machine-readable formats, which are sadly only available in a handful of countries.
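
A minimal sketch of the comparison step for a trend claim is shown below. Everything in it is illustrative: the ExtractedClaim record, the check_trend function and the period labels are assumptions, and the figures in SERIES are made-up placeholders, not Office for National Statistics data.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder values for illustration only; NOT real statistics.
SERIES = {
    "inflation": {"t0": 1.0, "t1": 2.0, "t2": 1.5},
}


@dataclass
class ExtractedClaim:
    topic: str
    trend: str  # "rising" or "falling"
    start: str  # period label at the start of the claimed trend
    end: str    # period label at the end of the claimed trend


def check_trend(claim: ExtractedClaim, series: dict = SERIES) -> Optional[bool]:
    """Return True or False if the claim can be checked, None otherwise."""
    data = series.get(claim.topic)
    if data is None or claim.start not in data or claim.end not in data:
        return None  # no matching data, so a human has to check it
    change = data[claim.end] - data[claim.start]
    if claim.trend == "rising":
        return change > 0
    if claim.trend == "falling":
        return change < 0
    return None  # unrecognised trend word


print(check_trend(ExtractedClaim("inflation", "rising", "t0", "t1")))
```

Returning None rather than False when data is missing keeps "could not be checked" distinct from "checked and found incorrect".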

Once the claim has been identified

The fact checking process is often undertaken offline. We then publish the results on our website. We also describe each fact check with some very specific markup, called ClaimReview, which is part of the wider schema.org project. Schema.org describes content on a range of topics in domain-specific terms. This is important for us, because describing our content so specifically helps ensure that our fact checks can travel further than our own platforms. Fact checks can form a vital part of the web. Over 130,000 fact checks exist in the Google Fact Check Explorer, and these were seen over 4 billion times in 2019 in Google Search alone.
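
For readers unfamiliar with ClaimReview, the sketch below builds a small JSON-LD document using property names from the schema.org ClaimReview, Claim and Rating types. The function build_claim_review, the example.org URL and all the values passed to it are placeholders invented for this example, not a real fact check or Full Fact's publishing code.

```python
import json


def build_claim_review(
    claim_text: str,
    review_url: str,
    verdict: str,
    reviewer: str,
    claimant: str,
    date_published: str,
) -> dict:
    """Assemble a minimal schema.org ClaimReview document as a dict."""
    return {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": review_url,
        "datePublished": date_published,
        "claimReviewed": claim_text,
        "author": {"@type": "Organization", "name": reviewer},
        "reviewRating": {"@type": "Rating", "alternateName": verdict},
        "itemReviewed": {
            "@type": "Claim",
            "author": {"@type": "Person", "name": claimant},
        },
    }


# Placeholder values only.
review = build_claim_review(
    claim_text="GDP has risen by x%",
    review_url="https://example.org/fact-checks/gdp-claim",
    verdict="Incorrect",
    reviewer="Example Fact Checkers",
    claimant="Example Speaker",
    date_published="2020-01-01",
)
print(json.dumps(review, indent=2, ensure_ascii=False))
```

The resulting JSON is typically embedded in the published page inside a script element of type application/ld+json, so that search engines and other platforms can read the verdict alongside the article.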

Limitations

We are careful not to overstate our results. Many people say that artificial intelligence and machine learning are a panacea, but we have been on the front lines of fact checking since 2010 and know first hand how difficult it is. Humans aren't going anywhere anytime soon, and nor would we want them to.


Our AI team is made up of:

  • Andy Dudfield, Head of Full Fact AI
  • Kate Wilkinson, Senior Product Manager, Full Fact AI
  • David Corney, Senior Data Scientist 
  • Ed Dearden, Data Scientist
  • Dakota Harris, Software Engineer
  • Cameron Johnston, Data Scientist
  • James McMinn, Senior Software Engineer

We need support and funding to develop this work further. Please get in touch if you can help.

In the news

  • Poynter: Full Fact has developed and is using an inward-facing automated fact checking platform
  • BBC Click: Full Fact talks automated fact checking on BBC Click
  • The Guardian: Journalists to use 'immune system' software against fake news
  • TechCrunch: Full Fact aims to end fake news with automated fact checking tools
  • Wired: Google is helping Full Fact create an automated, real-time fact-checker
  • The Guardian: Fake news clampdown: Google gives €150,000 to fact-checking projects
  • Engadget: Full Fact wants to automate fact checking to fight fake news
  • Independent: Google funds automated fact-checking software in bid to fight fake news
  • Nieman Lab: Fact-checking and data-driven projects among winners of Google’s Digital News Initiative funding

More on automated fact checking