Measuring program effectiveness may sound like the bean-counting afterthought to the real work of program implementation. But in the social sector it can feel pretty high-stakes for organizations competing for limited funding dollars. And there’s plenty of controversy around how it should be done.
One particularly persistent debate centers on whether some programs are too “soft” to evaluate. Are the outcomes of, say, advocacy or empowerment initiatives too subjective, complex, or nuanced to evaluate with “hard” objective metrics?
A recent Stanford Social Innovation Review article by representatives of the Walton Family Foundation argues that this does not have to be the case, declaring “advocacy isn’t ‘soft’”. The authors describe how they have been able to apply logic models and performance measures to their advocacy projects in more or less the same way that they might to direct services programs.
As someone who believes that just about anything can be at least approximately measured, I view attempts to measure advocacy grantees’ effectiveness as a generally positive thing.
But accepting the assumption that being “soft” is bad, and that all interventions should be evaluated in the same way as direct service programs, isn’t the only way forward. Logic models and performance measures can provide a good start for laying out how any intervention might be monitored, but many examples of performance assessment still focus primarily on outputs rather than the changes those outputs are intended to produce.
The SSIR article, for example, describes a report produced by an education advocacy organization. Evaluation of the report’s performance included tracking indicators like the number of times the report was downloaded and the number of media mentions it received (including one very powerful mention from Secretary of Education Arne Duncan).
But understanding how advocacy outputs lead to behavior or opinion change and, ultimately, to impact is more difficult than just getting creative about which output metrics feed into performance assessment templates. Sure, I can see that a certain number of people have downloaded a report. But without digging deeper, how can I know whether it was read once downloaded, let alone whether it changed anyone’s mind? What’s more important: the number of media mentions a report garners, or whether one mention in particular led a lawmaker to author new legislation or vote in a different way?
We’d do well to remember and embrace the fact that advocacy is political in nature, yielding unintended results and patterns that are difficult to predict and that resist simple measures. Outcomes depend on interactions between individual actors with different incentives, capabilities, and needs, all responding to any given intervention. As a result, politics evolves in a nonlinear manner, where a single action has numerous effects that may alter its intended trajectory.
Any effort to evaluate such complex interactions demands “trained judgment”, which is grounded in tacit knowledge, not just the scientific method. And tacit knowledge starts with a deep understanding of the incentives that inform people’s everyday behavior, the institutions they build, and the political landscape within which they operate. Measuring the effectiveness of advocacy efforts as if they were direct service programs sells short both advocacy and the complex social dynamics it takes on.
Advocacy requires understanding the people whose beliefs and behaviors the intervention hopes to influence. People are subjective, complex, and nuanced. It should follow that an advocacy program attempting to change how people behave and how they interact with institutions and each other deserves to be evaluated as both “hard” and “soft”.