Written by shirin anlen
As generative AI technology evolves, so do the tools designed to detect it. Yet, in the race to develop high-performing detection systems, a critical element risks being overlooked. Effectiveness is often measured through technical benchmarks: accuracy, speed, scalability, and versatility. While these metrics are important, they fail to capture the full complexities of real-world use. At WITNESS, we’ve consistently observed a noticeable gap between the technical capabilities of AI detection tools and their practical value in high-stakes situations globally. This detection equity gap is most pronounced in the Global Majority world.
Since March 2023, WITNESS has been leading the Deepfakes Rapid Response Force (DRRF), a pioneering initiative that connects frontline fact-checkers and journalists with leading media forensics and deepfake detection experts. This collaboration provides timely, detailed analysis of content that threatens democracy and human rights. Additionally, through trainings over the past year with frontline journalists on detecting and responding to AI, we’ve observed firsthand how our partners in the Majority world face compounded challenges: fragile media ecosystems; detection tools rendered ineffective by gaps in training data for local languages, accents, public figures and manipulation trends; and widespread gaps in AI media literacy that limit the ability to interpret and trust detection outputs.
From combating AI-driven disinformation to safeguarding human rights, these challenges demand more than technical robustness—they require a broader, more holistic evaluation. This is where the concept of equitable effectiveness becomes crucial. It calls for evaluation frameworks that go beyond the lab and consider the sociotechnical realities in which these tools operate. For frontline users—journalists, human rights defenders, fact-checkers and civil society more broadly—facing disinformation head-on, detection tools must do more than deliver impressive results in controlled environments. They need to provide actionable, reliable insights amid the unpredictable and resource-constrained conditions of the real world.
The stakes of equitable effectiveness are underscored by real-world missteps. In one instance, DRRF received a suspected radio conversation recording from the ongoing civil war in Sudan. Verification was impossible because the detection models lacked training data specific to “radio conversations.” In another case, also from Sudan, a purported leaked conversation between the Sudanese Army Commander and Chief of Staff went unanalyzed because the available teams were unable to process content in Arabic. In the Philippines, a low-resolution video allegedly showing the president snorting cocaine posed another challenge. The video’s poor quality rendered AI detection tools ineffective. Similarly, journalists from Mexico testing two images—one original and one manipulated—encountered conflicting results from online detectors, with one falsely identifying the authentic image as AI-generated. Only after employing DRRF’s advanced tools were they able to confirm the manipulation. Inconsistencies across multiple tools are not uncommon. In some DRRF cases, different detectors offered varying results for the same content. Providing users with context—such as the types of manipulations a tool is designed to detect and how quality issues might affect outcomes—can make even inconsistent results valuable.
The Technical vs. The Practical
A genuinely effective tool must meet not only rigorous technical standards but also the practical realities faced by those using it. These users often operate under resource constraints, dealing with low-quality content and time-sensitive threats. Equitable effectiveness involves addressing six core considerations:
- Real-World Challenges: The DRRF offers a valuable lens into these challenges. Across the many cases escalated to media forensics and synthetic media detection experts, we’ve observed that AI detection tools perform best on high-resolution, near-original, clear materials. The reality, however, is that much of the content they analyze, often sourced from social media, comes in compressed, noisy, and low-resolution formats. Audio files with background noise or poor quality frequently yield inconclusive results. Tools must be designed to adapt to these imperfect conditions, or frontline users must be equipped with complementary resources to navigate unpredictable cases.
- Transparency and Explainability: Detection tools often provide binary results accompanied by confidence scores, but these alone are insufficient. To make results actionable and reliable, tools must also offer additional information, such as guidance on interpreting results, the types of manipulations the tool was trained to detect, information on the dataset used for training, and the tool’s limitations, including how content quality may influence outcomes. In today’s fast-paced media landscape, such transparency enables journalists and fact-checkers to navigate results and produce evidence-based reporting, building trust in the tool’s outputs.
- Accessibility: Technical excellence is meaningless if tools are inaccessible to diverse communities. Language barriers persist: many tools are trained primarily on English and Spanish content, limiting their utility for other languages and accents, and similar gaps affect performance across skin tones. Additionally, tools that require advanced computational expertise or resources risk excluding users in under-resourced areas. Simplifying interfaces and providing actionable, in-depth insights are vital steps to ensure these tools serve a broader audience and fit into broader journalistic workflows.
- Fairness: The fairness of detection tools hinges on the fairness of their training data. The demographic composition of training datasets significantly affects tool performance. For instance, researchers from the University of Southern California found that popular deepfake detection datasets were predominantly composed of Caucasian faces. This imbalance resulted in poorer performance on content featuring other demographics. Ensuring diverse and representative training data is essential for achieving fair and accurate detection outcomes.
- Durability: As deepfake technology evolves rapidly, detection tools must keep pace. Tools need to be designed with adaptability in mind, capable of responding to the fast-changing landscape of generative techniques. Regular updates, maintenance and public communication about these changes are crucial to ensuring these tools remain effective against increasingly sophisticated manipulations. Without continuous refinement, even the most advanced tools risk obsolescence, leaving users vulnerable to emerging threats.
- Contextualization alongside other skill sets: AI detection tools are frequently applied to complex, unpredictable content, where relying on them as standalone solutions often falls short. Wherever possible, these tools should be treated as one component of a broader verification process, complementing rather than replacing existing expertise.
A Framework for Equitable Effectiveness in AI Detection
In response, WITNESS is developing a framework to assess AI detection tools based on these six principles that prioritize equity and social context. Built from insights gathered through our work with the Deepfakes Rapid Response Force and informed by consultations with fact-checkers, AI forensic specialists, and digital rights defenders, this framework centers on guiding questions designed to ensure these tools serve the communities that need them most. It moves beyond purely technical benchmarks to consider the lived experiences of those using these tools to counter disinformation daily.
Equitable effectiveness requires asking critical questions, including:
- What are the intended outcomes of using this tool?
- What unintended consequences could arise?
- Can frontline users understand and trust the outputs?
- Are the tools accessible and affordable to diverse users?
- How effectively are the tool’s limitations communicated to users?
- What additional information does the tool provide to support responsible use and reporting on AI?
- How effectively does a detection approach complement other relevant, existing expertise?
- Are biases adequately mitigated to ensure reliable results across different contexts?
By addressing these considerations, we can ensure AI detection tools are not only technically sound but also equitable and effective in practice.
An Opportunity for Real-World Standards
Beyond tool development, this framework can inform crucial policy conversations. As lawmakers and regulators consider mandates on AI transparency and detection, an equitable framework could shape the compliance standards of the future. By centering the needs of communities on the frontlines of our information and disinformation ecosystems, we can influence AI development practices and establish fair, effective standards for the entire ecosystem, ensuring that detection tools fulfill their purpose: safeguarding truth and protecting human rights.
We are at a pivotal moment. The future of AI detection must be guided by values that reflect the lived realities of those impacted by its failures. WITNESS’ framework isn’t just a guide for creating fairer detection tools: it’s a call for an AI field that upholds the principles of equity, inclusivity, and accountability in every stage of technology design and deployment.
Published 19 November 2024