Crowdsourcing is increasingly used by journalists, activists, researchers and citizen investigators to collect information, verify it and build a body of evidence that can help expose issues affecting communities everywhere. What are the main needs, methods and considerations to be aware of when setting up and managing a crowdsourcing effort?
This article provides a snapshot of Tetyana Bohdanova's talk on “Beyond DOs and DON'Ts: Main considerations for crowdsourcing evidence” and Justus von Daniels's session on “CrowdNewsroom: It needs the crowd to get the bigger story” at the Investigation is Collaboration conference organised by the Exposing the Invisible Project on 2-6 August 2021.
by Di Luong
What is crowdsourcing and when to use it?
Crowdsourcing is the practice of asking large numbers of people to contribute feedback or information on a specific topic, incident, problem, etc. News organisations, NGOs, researchers and journalists are increasingly using crowdsourcing as a form of information collection in order to engage with communities and activate people by asking them to think about a subject, report issues or explore solutions.
The term crowdsourcing was coined by Jeff Howe in a 2006 “Wired” magazine article, where he defined it as a new way of sourcing labour enabled by the Internet. Different types of commercial and non-commercial crowdsourcing have emerged since.
- Wikipedia, for instance, is one of the best-known examples of collective knowledge sourcing, while Ushahidi is a popular platform for crowdmapping information.
- Bellingcat is an organisation focused on online citizen investigations and often uses crowdsourcing on social media to gather information, document and verify events as part of its stories and reports. For instance, one of its flagship investigations, into the downing of the Malaysia Airlines Flight 17 (MH17) passenger airplane in Ukraine in 2014, relied heavily on crowdsourcing.
- Another example is a 2017 collaboration between Amnesty International and Airwars: a joint investigation into the bombing of Raqqa, Syria, which involved over 138 thousand volunteer contributors from 124 countries.
What to consider in crowdsourcing-based projects?
Tetyana Bohdanova - who researches the impact of technology on democracy, specialising in elections monitoring and civil society engagement - proposes a framework for developing and implementing crowdsourcing initiatives. This framework can be applied to any crowdsourcing-driven project, from collecting information from the public as part of investigations or monitoring ongoing events (like demonstrations, natural disasters) to mapping specific problems in an area or seeking answers to a question:
- Define the purpose of your initiative
- Consider ethics, legality, and safety
- Define audience, format, and duration
- Identify the best method
- Effectively engage contributors
- Identify the right tools
- Set up a data verification process
- Analyse data and present findings
Image: Pros and Cons of using crowdsourcing in your investigation or research project. Courtesy of a presentation by Tetyana Bohdanova - see full presentation on the topic of “Beyond DOs and DON'Ts: Main considerations for crowdsourcing evidence.”
1. Define the purpose of your initiative
Why do you want/need to crowdsource? It could be that you want to gather data and build evidence for a story, to support other organisations in their research and advocacy, to raise awareness about a problem, or to mobilise people around an important issue, event, etc.
Whatever the purpose, it’s important to remember that when setting up a crowdsourcing effort, you need to weigh the risks because data could be manipulated or corrupted by opposing entities. Crowdsourcing does not need to be the primary source of data for your project. It can be used in parallel with other sources, like field research, satellite imagery research, or verification of other media sources, to ensure data accuracy.
2. Consider ethics, legality, and safety
To ensure that “crowds” can actually submit information to you, you need to make an effort to provide widely accessible tools and platforms. This means using tools that are inclusive. For instance, don’t employ crowdsourcing platforms and tools that only work with strong internet connectivity or with laptops in a context where neither of these may be easily accessible to the majority of people. In addition, when managing crowdsourcing projects, you are also responsible for raising awareness of and mitigating the risks to those involved, which means being transparent about the safety issues contributors may face.
3. Define audience, format, and duration
Consider who the beneficiaries are. They may be different from the contributors.
Have a plan for what happens to the data once it has been collected. When will the data be published, and when will the public see the results of this work? These questions influence the duration of the project and the format of the data you collect.
4. Identify the best method
There are various methods and formats to crowdsource information, such as a structured, an unstructured, or a mixed approach - and the choice depends on the goal, context and beneficiaries, among other factors:
- When an investigation is fishing for tips, don’t limit the format. People contributing should be able to submit this data in whichever unstructured format is easiest for them, e.g., email, text, or photo.
- In a scenario that involves a humanitarian response, for instance, every piece of information should be submitted in a structured format - with fields such as location and time - to be able to quickly analyse data and act upon it.
- With a mixed method, investigators accept that in the heat of events people cannot always provide data in one preferred format, so allowing more flexibility might be necessary.
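To make the difference between the three approaches concrete, here is a minimal sketch in Python of how a project might triage incoming submissions. It is only an illustration, not a tool mentioned in the talk: the `Report` fields, the mode names and the `triage` function are all assumptions, and a real project would choose its own required fields.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Report:
    """One crowdsourced submission (hypothetical schema, for illustration only)."""
    text: str                               # free-form description
    location: Optional[str] = None          # e.g. a place name or address
    observed_at: Optional[datetime] = None  # when the event was witnessed


def missing_fields(report: Report) -> List[str]:
    """Return the names of the structured fields the contributor left empty."""
    missing = []
    if not report.location:
        missing.append("location")
    if report.observed_at is None:
        missing.append("observed_at")
    return missing


def triage(report: Report, mode: str) -> str:
    """Classify a report under one of three assumed intake modes.

    "structured"   -> location and time are required, otherwise rejected
    "unstructured" -> any non-empty text is accepted
    "mixed"        -> accepted, but flagged for follow-up if fields are missing
    """
    if mode not in ("structured", "unstructured", "mixed"):
        raise ValueError(f"unknown mode: {mode}")
    if not report.text.strip():
        return "rejected"
    gaps = missing_fields(report)
    if gaps and mode == "structured":
        return "rejected"
    if gaps and mode == "mixed":
        return "needs_follow_up"
    return "accepted"


tip = Report(text="Saw trucks leaving the site at night")
print(triage(tip, "mixed"))
```

In this sketch the mixed mode never discards a submission for being incomplete. It only marks it, so that someone on the team can go back to the contributor for the missing details.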
5. Effectively engage contributors
You may choose a great topic, collaborate with other organisations, use the right tools, but if the audience does not contribute, there is no crowdsourcing. Get to know your audience and think about the benefits and risks associated with this engagement. Audiences are inspired to participate when the benefits are clear, the process is transparent and the results can be quickly demonstrated.
6. Choose appropriate tech tools
Identify the right tools by considering the technology environment, such as internet connectivity and how widely specific devices are available. Consider the habits of the audience involved, as they may be slow to accept new technology. Also, how much security do you really need from the tech you use in crowdsourcing? It’s worth evaluating the tech tools and software you will use based on your needs and context. Start by reading some helpful introductory materials like the “Safety First!” guide in the Exposing the Invisible Kit and the article “Technology Is Stupid: How to choose tech for remote working” by Marek Tuszynski.
7. Set up a data verification process
Establish when data is considered verified or has been deemed ready for publication. An example of data verification may include comparing crowdsourced data with media reports. For more tips and methods of verification, check valuable resources such as the “Verification Handbook: A Definitive Guide To Verifying Digital Content For Emergency Coverage” and the “Verification Handbook For Disinformation And Media Manipulation.”
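One way to make “considered verified” explicit is to write the rule down before data starts coming in. The Python sketch below assumes a very simple rule, namely that a claim counts as verified once a minimum number of distinct sources corroborate it. The threshold, the `Claim` class and the source labels are hypothetical, and counting sources is a simplification: it does not replace editorial judgement about how independent or reliable each source is.

```python
from dataclasses import dataclass, field
from typing import Set

# Assumed threshold for illustration; each project should set its own rule.
MIN_INDEPENDENT_SOURCES = 2


@dataclass
class Claim:
    """A crowdsourced claim and the sources that corroborate it (hypothetical)."""
    summary: str
    corroborating_sources: Set[str] = field(default_factory=set)

    def add_source(self, source: str) -> None:
        # Normalise the label so the same source is not counted twice.
        self.corroborating_sources.add(source.strip().lower())

    def is_verified(self) -> bool:
        return len(self.corroborating_sources) >= MIN_INDEPENDENT_SOURCES


claim = Claim("Bridge closed to traffic")
claim.add_source("Local newspaper report")
claim.add_source("Contributor photo")
print(claim.is_verified())
```

Whatever rule a project settles on, writing it down like this makes it easier to explain later, when presenting findings, why a given piece of data was published or held back.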
8. Analyse data and present findings
When presenting the collected data, it is advisable to describe the methodology of the crowdsourcing process and the public’s engagement. Showing how results or conclusions have been derived can bolster the credibility of your findings and of the stories or reports you publish. It is also important to express gratitude and give credit to collaborators, team members, tool developers, etc.
Example: CrowdNewsroom - “It needs the crowd to get the bigger story”
The independent investigative non-profit CORRECTIV designed CrowdNewsroom as a platform for investigators to create and conduct collaborative projects with the help of communities. Justus von Daniels, Editor-in-Chief of CORRECTIV and one of the initiators of CrowdNewsroom, notes that while crowdsourcing is a powerful way to engage the community, databases of information collected from people via this process raise various considerations: the validity of the data, safety risks to the people providing it, and the security of data storage and sharing by the project organisers themselves. Databases with crowdsourced information contain sensitive information, including personal data such as the locations of people contributing - many of them possibly at risk in their own contexts. These risks need to be made transparent to everyone participating. Therefore, while running such projects may seem like an easy way of gathering information, it requires a major investment of time, communication, skills and funding to ensure contributors’ safety, data security, data validity and transparency, as well as the desired impact of the findings and resulting stories.
CORRECTIV spearheaded “Who owns Hamburg?”, an ongoing crowdsourcing project asking the public for information about ownership in the residential real-estate market of the German city of Hamburg. The reason for starting this project was that there is no open and transparent database of residential property owners in German cities, and nobody really knows who the big real-estate owners controlling local housing prices are.
With “Who owns Hamburg?”, data security and accountability were essential for establishing and maintaining trust with people contributing information to the project. Most of them were local tenants providing sensitive information about who owns the houses they rented. Security measures were crucial because such a crowdsourced project may be targeted by those being investigated or by anyone else opposing such an investigative initiative.
From CrowdNewsroom’s experience, these are some questions that the public may initially ask the organisers of a crowdsourced project:
- Who is running the server where my information is submitted/collected?
- Where is the server located? (It is essential for investigators carrying out crowdsourcing efforts to invest a lot in data security to increase trust.)
- Who has access to the database?
- Who are your partners?
- How trustworthy are your partners?
In addition to prioritising risk assessment and risk mitigation in such a collaborative effort, there is the need to motivate and engage contributors in order to have an effective and ultimately successful crowdsourcing project. This includes understanding their political and social conditions and what drives them to want to contribute. People are more likely to participate when they are inspired by the investigation and clearly see how it can help address the issues and topics they care about or are affected by. A crowdsourcing-based investigation needs to strike a balance between demonstrating results quickly, so as to inspire more contributors, and considering the long-term safety and well-being of contributors.
*Tetyana Bohdanova is an elections and civil society development specialist and a researcher of technology’s impact on democracy. She has over a decade of field-based experience in citizen engagement and electoral transparency work across Eastern Europe and Eurasia, including the implementation of crowdsourcing projects in restrictive environments.
*Justus von Daniels is Editor-in-Chief of the German non-profit newsroom CORRECTIV. He joined CORRECTIV in 2015 as an investigative reporter. In 2018 he led the project "Who owns the city", a crowd-based investigation with 10,000 participants and seven media partners. The project and individual publications won several awards. Justus is a trained lawyer and obtained a PhD before moving into journalism.
This article is part of a series of resources and publications produced by Exposing the Invisible during a one-year project (September 2020 - August 2021) supported by the European Commission (DG CONNECT).
This text reflects the author’s view and the Commission is not responsible for any use that may be made of the information it contains.