How We Assess Evidence Quality: A Look Behind Our Evaluation Process
Image Source: Living Goods


When donors ask us why we recommend certain nonprofits over others, evidence quality plays a crucial role in our answer. We’ve developed a systematic framework that moves beyond gut feelings or impressive-sounding claims to rigorously evaluate the strength of research behind each intervention.

The Challenge of Nonprofit Evidence

Not all evidence is created equal. A nonprofit might point to research showing their education program improves literacy rates, but if that study had major methodological flaws or was conducted in completely different communities, how much should we trust those results?

The interventions we evaluate span diverse areas: from cash transfer programs that improve living standards, to school feeding initiatives that enhance education outcomes, to maternal health programs that save lives. Each operates in complex social contexts where measuring impact requires careful consideration of what constitutes reliable evidence.

We needed a way to cut through this complexity and provide donors with clear, honest assessments of how confident we are in each intervention’s effectiveness across these varied domains.

Our Three-Level Scoring System

We score evidence quality on a simple scale that captures the essential question: How confident can we be that this intervention actually works?

Level 3: High Evidence

Strong confidence based on well-conducted experimental studies, such as randomized controlled trials, or on comprehensive meta-analyses and systematic reviews. These interventions have robust evidence with minimal limitations.

Level 2: Medium Evidence

Moderate confidence, typically from quasi-experimental studies or experimental studies with some limitations. The evidence has gaps or uncertainties, but isn’t fundamentally flawed.

Level 1: Low Evidence

Limited confidence due to significant methodological limitations, high uncertainty, or results that don’t directly apply to the intervention we’re evaluating.

Our Five Assessment Criteria

We adapted the GRADE (Grading of Recommendations Assessment, Development and Evaluation) framework to evaluate five key factors that can undermine our confidence in research findings.

Study Design

Randomized controlled trials, meta-analyses and systematic reviews give us the strongest foundation because they’re specifically designed to isolate the intervention’s true effects from other factors. Quasi-experimental studies with comparison groups are valuable but less definitive. Observational studies without proper controls provide weaker evidence for causal claims.

Risk of Bias

We examine whether studies have serious methodological problems that could distort results. Even a well-designed study can be unreliable if participants weren’t properly selected, if many people dropped out without explanation, or if researchers only reported favorable outcomes while hiding negative results.

Imprecision

Studies with small sample sizes might find impressive-looking effects that are actually just random variation. We want to see studies with enough participants and statistical power to detect real effects reliably.
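To make statistical power concrete, here is a minimal sketch in Python using the statsmodels library (our choice of tool for illustration; the post does not prescribe one) of how many participants a two-group study needs to reliably detect effects of different sizes:

    # Sample sizes needed to detect an effect with 80% power at the
    # conventional 5% significance level (two-group comparison).
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # A small effect (Cohen's d = 0.2) needs roughly 394 people per group.
    n_small = analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05)

    # A large effect (d = 0.8) needs only about 26 people per group.
    n_large = analysis.solve_power(effect_size=0.8, power=0.8, alpha=0.05)

    print(f"Small effect: ~{n_small:.0f} per group")
    print(f"Large effect: ~{n_large:.0f} per group")

A study of 50 people per group that reports a small effect is therefore running well below the power needed to distinguish a real effect from random variation, which is exactly the concern this criterion captures.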

Inconsistency

When multiple studies produce widely different results for the same intervention, we need to understand why before drawing conclusions. If one study shows substantial improvements in school attendance while another shows no effect, that raises questions about how reliable the intervention's impact really is.

Indirectness

We assess how well the research matches the actual intervention. A study of conditional cash transfers in rural Mexico provides less direct evidence for an unconditional cash transfer program in urban Kenya, even though both aim to improve living standards.
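Taken together, these criteria work like a downgrade rubric: strong designs start high and lose a level for each serious concern. The sketch below shows one way such a rubric could be encoded in Python; the baselines and the one-level-per-concern rule are illustrative assumptions of ours, not our published scoring weights:

    # Illustrative GRADE-style rubric: start from a study-design baseline,
    # then downgrade one level per serious concern, with a floor of Level 1.
    DESIGN_BASELINE = {
        "rct_or_systematic_review": 3,  # experimental / synthesized evidence
        "quasi_experimental": 2,        # comparison groups, no randomization
        "observational": 1,             # no proper controls
    }

    def evidence_level(design: str, concerns: list[str]) -> int:
        """Return a 1-3 evidence level given the study design and the
        serious concerns (risk of bias, imprecision, inconsistency,
        indirectness) identified in the evidence base."""
        return max(1, DESIGN_BASELINE[design] - len(concerns))

    # A randomized trial with small samples and weak applicability to the
    # program under review ends up at Level 1 despite its strong design.
    print(evidence_level("rct_or_systematic_review",
                         ["imprecision", "indirectness"]))  # prints 1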

Making Real-World Judgments

Perfect evidence rarely exists in the nonprofit world, particularly when working with vulnerable communities where conducting research involves ethical considerations and practical constraints. We make informed judgments about the best available evidence rather than applying impossibly high standards.

When nonprofits implement multiple interventions – perhaps combining nutrition programs with educational support – we evaluate each one separately and average the scores. 
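As a minimal illustration of that averaging step (the intervention names and scores here are invented):

    # Hypothetical nonprofit running two interventions, each scored 1-3.
    scores = {"nutrition program": 3, "educational support": 2}

    overall = sum(scores.values()) / len(scores)
    print(f"Overall evidence score: {overall:.1f}")  # prints 2.5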

Why This Matters for Effective Giving

Confidence Over Hope

This framework transforms how you can approach charitable giving by providing systematic insight into intervention effectiveness rather than relying on emotional appeals or organizational reputation alone. 

When you see a Level 3 intervention, you can trust that multiple well-designed studies consistently demonstrate meaningful impact. A Level 1 intervention might address urgent needs and be worth supporting, but you understand the evidence base is more limited.

Maximum Impact Where It’s Needed Most

For effective giving, this means your donations can target interventions where we have genuine confidence in their ability to improve lives – whether that’s reducing child mortality, increasing school completion rates, or lifting families out of poverty. Rather than hoping your money makes a difference, you can know with reasonable certainty that it will.

Our framework helps answer the fundamental question every thoughtful donor faces: Among all the worthy causes competing for attention, which interventions have demonstrated they can reliably transform the lives of vulnerable communities?




About the author:

Kudzai Machingawuta

Brand and Content Manager

Kudzai is a content marketing and digital strategy expert with eight years of experience delivering impactful campaigns across diverse industries, audiences, and platforms.

The views expressed in blog posts are those of the author, and not necessarily those of Peter Singer or The Life You Can Save.