How We Assess Evidence Quality: A Look Behind Our Evaluation Process
Image Source: Living Goods


When donors ask us why we recommend certain nonprofits over others, evidence quality plays a crucial role in our answer. We’ve developed a systematic framework that moves beyond gut feelings or impressive-sounding claims to rigorously evaluate the strength of research behind each intervention.

The Challenge of Nonprofit Evidence

Not all evidence is created equal. A nonprofit might point to research showing their education program improves literacy rates, but if that study had major methodological flaws or was conducted in completely different communities, how much should we trust those results?

The interventions we evaluate span diverse areas: from cash transfer programs that improve living standards, to school feeding initiatives that enhance education outcomes, to maternal health programs that save lives. Each operates in complex social contexts where measuring impact requires careful consideration of what constitutes reliable evidence.

We needed a way to cut through this complexity and provide donors with clear, honest assessments of how confident we are in each intervention’s effectiveness across these varied domains.

Our Three-Level Scoring System

We score evidence quality on a simple scale that captures the essential question: How confident can we be that this intervention actually works?

Level 3: High Evidence

Strong confidence based on well-conducted experimental studies, such as randomized controlled trials, or on comprehensive meta-analyses and systematic reviews. These interventions have robust evidence with minimal limitations.

Level 2: Medium Evidence

Moderate confidence, typically from quasi-experimental studies or experimental studies with some limitations. The evidence has gaps or uncertainties, but isn’t fundamentally flawed.

Level 1: Low Evidence

Limited confidence due to significant methodological limitations, high uncertainty, or results that don’t directly apply to the intervention we’re evaluating.

Our Five Assessment Criteria

We adapted the GRADE (Grading of Recommendations Assessment, Development and Evaluation) framework to evaluate five key factors that can undermine our confidence in research findings.

Study Design

Randomized controlled trials, meta-analyses and systematic reviews give us the strongest foundation because they’re specifically designed to isolate the intervention’s true effects from other factors. Quasi-experimental studies with comparison groups are valuable but less definitive. Observational studies without proper controls provide weaker evidence for causal claims.

Risk of Bias

We examine whether studies have serious methodological problems that could distort results. Even a well-designed study can be unreliable if participants weren’t properly selected, if many people dropped out without explanation, or if researchers only reported favorable outcomes while hiding negative results.

Imprecision

Studies with small sample sizes might find impressive-looking effects that are actually just random variation. We want to see studies with enough participants and statistical power to detect real effects reliably.
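To make statistical power concrete, here is a minimal sketch in Python using the statsmodels library (our choice of tool for illustration; the post does not prescribe one) of how many participants a two-group study needs to reliably detect effects of different sizes:

    # Sample sizes needed to detect an effect with 80% power at the
    # conventional 5% significance level (two-group comparison).
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # A small effect (Cohen's d = 0.2) needs roughly 394 people per group.
    n_small = analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05)

    # A large effect (d = 0.8) needs only about 26 people per group.
    n_large = analysis.solve_power(effect_size=0.8, power=0.8, alpha=0.05)

    print(f"Small effect: ~{n_small:.0f} per group")
    print(f"Large effect: ~{n_large:.0f} per group")

A study of 50 people per group that reports a small effect is therefore running well below the power needed to distinguish a real effect from random variation, which is exactly the concern this criterion captures.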

Inconsistency

When multiple studies produce widely different results for the same intervention, we need to understand why before drawing conclusions. If one study shows substantial improvements in school attendance while another shows no effect, that raises questions about how reliable the intervention's impact really is.

Indirectness

We assess how well the research matches the actual intervention. A study of conditional cash transfers in rural Mexico provides less direct evidence for an unconditional cash transfer program in urban Kenya, even though both aim to improve living standards.
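Taken together, these criteria work like a downgrade rubric: strong designs start high and lose a level for each serious concern. The sketch below shows one way such a rubric could be encoded in Python; the baselines and the one-level-per-concern rule are illustrative assumptions of ours, not our published scoring weights:

    # Illustrative GRADE-style rubric: start from a study-design baseline,
    # then downgrade one level per serious concern, with a floor of Level 1.
    DESIGN_BASELINE = {
        "rct_or_systematic_review": 3,  # experimental / synthesized evidence
        "quasi_experimental": 2,        # comparison groups, no randomization
        "observational": 1,             # no proper controls
    }

    def evidence_level(design: str, concerns: list[str]) -> int:
        """Return a 1-3 evidence level given the study design and the
        serious concerns (risk of bias, imprecision, inconsistency,
        indirectness) identified in the evidence base."""
        return max(1, DESIGN_BASELINE[design] - len(concerns))

    # A randomized trial with small samples and weak applicability to the
    # program under review ends up at Level 1 despite its strong design.
    print(evidence_level("rct_or_systematic_review",
                         ["imprecision", "indirectness"]))  # prints 1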

Making Real-World Judgments

Perfect evidence rarely exists in the nonprofit world, particularly when working with vulnerable communities where conducting research involves ethical considerations and practical constraints. We make informed judgments about the best available evidence rather than applying impossibly high standards.

When nonprofits implement multiple interventions – perhaps combining nutrition programs with educational support – we evaluate each one separately and average the scores. 
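As a minimal illustration of that averaging step (the intervention names and scores here are invented):

    # Hypothetical nonprofit running two interventions, each scored 1-3.
    scores = {"nutrition program": 3, "educational support": 2}

    overall = sum(scores.values()) / len(scores)
    print(f"Overall evidence score: {overall:.1f}")  # prints 2.5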

Why This Matters for Effective Giving

Confidence Over Hope

This framework transforms how you can approach charitable giving by providing systematic insight into intervention effectiveness rather than relying on emotional appeals or organizational reputation alone. 

When you see a Level 3 intervention, you can trust that multiple well-designed studies consistently demonstrate meaningful impact. A Level 1 intervention might address urgent needs and be worth supporting, but you understand the evidence base is more limited.

Maximum Impact Where It’s Needed Most

For effective giving, this means your donations can target interventions where we have genuine confidence in their ability to improve lives – whether that’s reducing child mortality, increasing school completion rates, or lifting families out of poverty. Rather than hoping your money makes a difference, you can know with reasonable certainty that it will.

Our framework helps answer the fundamental question every thoughtful donor faces: Among all the worthy causes competing for attention, which interventions have demonstrated they can reliably transform the lives of vulnerable communities?




About the author:

Kudzai Machingawuta

Brand and Content Manager

Kudzai is a content marketing and digital strategy expert with eight years of experience delivering impactful campaigns across diverse industries, audiences, and platforms.

The views expressed in blog posts are those of the author, and not necessarily those of Peter Singer or The Life You Can Save.