by Jade Maloney, Jo Farmer and Eunice Sotelo
With so many authors and approaches to evaluation, knowing what to pay attention to can be hard. Evaluation, just like the catwalk, is subject to the whims of the day. How do you know what’s a passing fad and what will remain in fashion?
At the AES Victoria regional seminar in November, Brad Astbury suggested the following 10 books will stand the test of time.
1960s: In the age of bell bottoms, beehives and lava lamps, and protests against the Vietnam war, evaluation was on the rise. The U.S. Congress passed the Elementary and Secondary Education Act (ESEA) in 1965, the first major piece of social legislation to require evaluation of local projects undertaken with federal funds and to mandate project reporting. The hope was that timely and objective information about projects could reform local governance and practice of education for disadvantaged children, and that systemic evaluation could reform federal management of education programs.
While you wouldn’t return to the fashion, or to the days when evaluation was purely experimental, hold onto your copy of Experimental and quasi-experimental designs for research by Donald T. Campbell and Julian C. Stanley. These two coined the concepts of external and internal validity. Even if you’re not doing an experiment, you can use a validity checklist.
1970s: Back in the day of disco balls and platform shoes, the closest thing evaluation has to a rock star – Michael Quinn Patton – penned the first edition of Utilisation-focused evaluation. Now in its fourth edition, it’s the bible for evaluation consultants. It’s also one of the evaluation theories with the most solid evidence base – drawn from Patton’s research. In today’s age of customer centricity, it’s clear focusing on intended use by intended users is a concept that’s here to stay.
Carol Weiss’s message – ignore politics at your peril – could also have been written for our times. Her Evaluation research: methods of assessing program effectiveness provides a solid grounding in the politics of evaluation. It also describes theory-based evaluation – an approach beyond the experimental that is commonly used today.
1980s: While you may no longer work out in fluorescent tights, leotards and sweat bands, your copy of Qualitative evaluation methods won’t go out of fashion any time soon. In his second appearance on the list, Michael Quinn Patton made a strong case that qualitative ways of knowing are not inferior.
You may also know the name of the second recommended author from this decade, but more likely for the statistical test that bears his name (Cronbach’s alpha) than his contribution to evaluation theory, which is under-acknowledged. In Toward reform of program evaluation, Lee Cronbach and associates set out 95 theses to reform evaluation (in the style of Martin Luther’s 95 theses). That many of the 95 theses still ring true could be seen as either depressing or a consolation for the challenges evaluators face. For Astbury – ever the evaluation lecturer – thesis number 93 “the evaluator is an educator; his success is to be judged by what others learn” is the standout, but there’s one in there for everyone. (No. 13. “The evaluator’s professional conclusions cannot substitute for the political process” aligns with Weiss’s message, while No. 9. “Commissioners of evaluation complain that the messages from evaluations are not useful, while evaluators complain that the messages are not used” could have been pulled from Patton’s Utilisation-focused evaluation).
1990s: Alongside Vanilla Ice, the Spice Girls and the Macarena, the 90s brought us CMOCs – context–mechanism–outcome configurations – and a different way of doing evaluation. Ray Pawson and Nick Tilley’s Realistic evaluation taught us not just to ask what works, but what works, for whom, in what circumstances, and why.
To balance the specificity of this perspective, the other recommendation from the 90s is agnostic. Foundations of program evaluation: Theories of practice by William Shadish Jr., Thomas Cook and Laura Leviton describes the three stages in the evolution of evaluation thinking. It articulates the criteria for judging the merits of evaluation theories: the extent to which they offer a coherent account of social programming, knowledge construction, valuing, use, and practice. The message here is that there is no single or ideal theory of evaluation to guide practice.
2000s: While iPods and flash mobs are a faint memory, these two books have had a lasting impact: Evaluation: An integrated framework for understanding, guiding, and improving public and non-profit policies and programs by Melvin Mark, Gary Henry and George Julnes; and Evaluation roots: Tracing theorists’ views and influences edited by Marvin Alkin.
The former covers four key purposes of evaluation: to review the merit of programs and their value to society (as per Scriven’s definition); to improve the organisation and its services; to ensure program compliance with mandates; to build knowledge and expertise for future programs. The take-out is to adopt a contingency perspective.
The latter is the source of the evaluation theory tree – which sparked commentary at this year’s AES and AEA conferences for its individualism, and limited gender and cultural diversity. Still, Brad reminds us that there’s value in learning from the thinkers as well as the practitioners; we can learn in the field but also from the field. According to Sage, Alkin’s Evaluation roots is one of its best-selling books on evaluation.
It’s a reminder that there is much still to learn from those who’ve come before – that we can learn as much from those who’ve thought about evaluation for decades as we can from our practical experience.
2010s: The age of the selfie has not yet faded and nor has Evaluating values, biases and practical wisdom by Ernest R. House. It covers three meta-themes: values (Scriven, House); biases (Campbell and the experimental approach; expanding the concept of validity); and practical wisdom (on Aristotle’s notion of praxis – blending/embedding theory and practice). It gives us the wise advice to pay more attention to cognitive biases and conflicts of interest.
So now to the questions.
Why didn’t Scriven make the list? Because he’s written few books and there wasn’t enough room in the 90s. Nevertheless, Michael Scriven’s Evaluation thesaurus and The Logic of evaluation are among the books Astbury notes are worth reading.
What about local authors? Grab a copy of Building in research and evaluation: Human inquiry for living systems by Yoland Wadsworth and Purposeful program theory: Effective use of theories of change and logic models by Sue Funnell and Patricia Rogers. If you’re new to evaluation, Evaluation methodology basics: The nuts and bolts of sound evaluation – from this year’s AES conference keynote E. Jane Davidson – can help you get a grasp on evaluation in practice.
Brad ended by sounding a word of warning not to get too caught up in the fads of the day. Buzzwords may come and go, but to avoid becoming a fashion victim, these ten books should be a staple of any evaluator’s bookshelf.
Brad Astbury is a Director at ARTD Consulting, based in the Melbourne office. He has over 18 years’ experience in evaluation and applied social research and considerable expertise in combining diverse forms of evidence to improve both the quality and utility of evaluation. He has managed and conducted needs assessments, process and impact studies and theory-driven evaluations across a wide range of policy areas for industry, government, community and not-for-profit clients. Prior to joining ARTD in 2018, Brad worked for over a decade at the University of Melbourne, where he taught and mentored postgraduate evaluation students.
by Florent Gomez
Have you ever tried to grow evaluation capacity across your organisation – and with very limited resources?
At the recent AES International Evaluation Conference in Sydney, I shared some learnings from our successful Evaluation Community of Practice in the NSW Department of Customer Service (previously NSW Department of Finance) and other soft approaches to evaluation capacity building we are using in our department.
When I started as an internal evaluator with the department in February 2017, I quickly realised that contrary to other government departments, such as in community services, health or education, we didn’t have an established central evaluation unit; only pockets of evaluation capacity here and there. However, there was definitely a need and appetite to learn more about evaluation across the department!
This is why we decided to put together an Evaluation Community of Practice at the department level. The Community of Practice is an open and informal forum allowing staff from different roles and with varying levels of evaluation experience to share and learn about good evaluation practices. Quarterly events are the main component, supported by a Yammer group (corporate social media platform) that we use as a blog to keep the community engaged between events, and an Intranet page with key resources, templates and presentations from the events.
After one-and-a-half years of existence, we decided to evaluate ourselves and see how we were travelling in terms of building evaluation capacity across the department. We refined our intended outcomes by developing a detailed program logic. The evidence we gathered for the different outcome levels showed that this low-cost approach effectively contributed to raising evaluation awareness and capability across the department.
As one key stakeholder summed up: the Evaluation Community of Practice collectively played the role of the evaluation centre of excellence the department didn’t have.
We identified several key success factors that could be applied to other organisations facing similar challenges.
This is definitely a low-cost evaluation capacity building approach I’d recommend, in particular in organisations with less established evaluation capacity. What tips can you share from successful approaches to evaluation capacity building in your organisation?
Florent is a Manager for planning, evaluation and reporting at the NSW Department of Customer Service. Before that, he worked as an external evaluator for over 10 years in Europe then Australia.
by Anthea Rutter
What brings a person into the field of evaluation is always an interesting question to ask, particularly as you are never sure of the answer. In this case I did not expect the answer I got.
My crisis of confidence! In the late 1980s I was a program manager in the NT Health and Community Services Department, when I came to have serious doubts whether or not our programs were making a positive difference. I had been taught evidence-based decision making, and when I asked myself whether we were making a difference, I didn’t know, and it really bothered me. A work colleague suggested I might be interested in reading this new book by Patton on utilisation focused evaluation. I was immediately hooked, and I knew then that I wanted to work as an evaluator! I subsequently did courses at Darwin Uni in research methods, evaluation and statistics. I then joined ATSIC in Canberra (1991-1992); I worked as an evaluator in Indigenous affairs and have been working in evaluation ever since. Later on, after I had a bit more practical experience, I did my Master’s degree at Murdoch University in Perth and studied evaluation with Ralph Straton.
Clearly Scott is a person who thinks hard about his practice, so I was interested in what he regarded as his main area of interest.
I am interested in theories of change, which are very important in international development. I’m also interested in impact evaluation methods, particularly critical multiplism, which is not well known in Australia. It was developed by Cook and Shadish and is based on a particular view of reality – the idea being that the world is complex, and we can never know it perfectly. The best that we can do as evaluators is to study it from multiple perspectives using multiple methods. CM also holds that causality is probabilistic, not deterministic. Not every smoker gets cancer, but a significant proportion do, and hence we can say smoking causes cancer. To test causal relationships CM uses three criteria first proposed by John Stuart Mill. In order to conclude that program A causes outcome B you need to establish an association between A and B, and you need to show that A occurs in time before B. Finally, we need to rule out alternative explanations for the relationship between A and B. If and only if we can credibly satisfy all three tests can we conclude that program A causes outcome B. The real value of CM is that it asks us to focus on the evidence we need for making causal inferences, rather than getting bogged down in unproductive debates about experiments vs case studies vs surveys etc.
My other main interest is evaluation capacity building. I was doing that in China, Vietnam, Cambodia and Laos for four years with the Asian Development Bank. The international experience with ECB is now quite clear. We can focus our capacity building efforts on: leadership’s demand for and ability to make use of evaluative feedback; our institutional infrastructure (evaluation policies, resources, staff skills, IT systems etc.); or on the supply of evaluation feedback. The international lesson is that demand is where we need to focus our capacity building efforts; supply-side strategies (producing more evaluation reports) simply don’t work.
Clearly Scott has worked in some complex areas requiring multiple skill levels, and I wanted to know, in particular, what he saw as major challenges to his practice.
Initially developing my own skills was a big challenge. Evaluation is such a big field with so much to learn! Undertaking cross-cultural evaluations is very complex. There are many potential dimensions to performance and some of them are not immediately obvious. Speaking truth to power is an issue all evaluators face at some point in their career. I’ve had some tense discussions in Australia when evaluating the economic impact of the Melbourne Grand Prix, the privatisation of a prison, mental health services for suicidal youth, contracting NGOs for service delivery, and when evaluating the policy advising process in state government agencies. All highly controversial evaluations that ultimately helped stakeholders to engage with the issue and make more informed decisions. I have also noticed that the commitment to evaluation of both state and Commonwealth governments waxes and wanes over time; this is very short sighted, and the public deserves better. We should be aiming to use public monies for best effect.
A career so varied as Scott’s must have had some highlights and I was keen to discover what they were.
I worked on a wide variety of challenging evaluation topics: the delivery of health and community services in rural Australia, cost-benefit study of the Melbourne Grand Prix, assessing cement production in China, the effectiveness of petrol sniffing programs for remote Indigenous youth, financial management reforms in Mongolia, quality assuring Australia’s international aid program, and complaint handling systems in government departments. I’ve had the great fortune to have had a number of highly skilled advisors, people who went out of their way to coach and mentor me. They include Gordon Robertson, Des Pearson, Patrick Batho, Ralph Straton, Darryl Cauley, David Andrich, Ron Penney, John Owen, Robert Ho, Rick Cummings, Burt Perrin and Ray Rist. I’ve been exceptionally lucky in that regard.
AES Fellowship – big highlight.
All of us are influenced either by particular people or theories which help to define our evaluation practice. Scott’s response was brief and to the point.
Those people named above plus my academic background in social research methods and later in public policy analysis [were my main influences].
A question I asked all of the Fellows was how the field of evaluation had changed during the course of their careers. His response made me reflect on how we have matured as a profession and expanded our horizons into multiple areas of practice.
One thing I have noticed is that the AES membership has changed. When I first joined, it was all academics and government staff. Now we have a lot more NGOs and private consultants. A great many more Australians are now working in international development; that was quite rare when I first got into evaluation. Another change is the range of new impact evaluation methods that have emerged in the last 10 years. I’ve also noticed that 25 years ago, there were various programs that were considered almost impossible to evaluate: environment, community development, Indigenous programs and policy advice, to name a few. These topics were considered too complex and hard to evaluate. Now we routinely do such evaluations. I think that the boundaries and work of practising evaluators have evolved significantly over time.
All of us, as evaluators, want to ensure that our practice is developing so that we keep up with emerging trends and remain relevant. Scott’s response to this topic was concise and informative.
- People skills – facilitation, negotiation, conflict management, communication
- Evaluation theory and practice – knowing different models, being familiar with various approaches, plus having an expert understanding of evaluation logic
- Research skills – broad skills including plain English style reporting.
In the future we will see more of a demand for real-time evaluation. I believe evaluation will increasingly adopt action research methods, and appreciative enquiry will become much more common. Value for money is generally underdone in most of the evaluations that I read these days.
I asked Scott about what he saw as the main social issues/problems that evaluators ought to be thinking about and seeking to resolve in the next decade. His response showed a great deal of insight into the issues and the ways we can address them.
I think our communities are experiencing a loss of confidence in government and parliamentary processes. I would like to see government focusing on processes for good policy formulation and evaluation, and AES members should be helping with this so that more informed decisions can be made.
I believe that our theory of change for evaluation itself needs to be better. I don’t think that evaluation has fulfilled what we set out to do in the 60s and 70s. We talk a lot about transparency, and that this should drive better program results, but the world doesn’t work that way. We rely on a supply driven model, focusing on delivering reports but not building demand for performance feedback and the ability of decision makers to make use of this feedback. Evaluators need to be more involved at the front-end of program planning/design.
I see a lot of contracted evaluation work and often it’s not of very good quality. This is partly because of poorly written terms of reference and inadequate budgets, and also partly due to our own skill levels. I worry about the status and credibility of the evaluation field. A few years ago I was against the idea of professional accreditation for evaluators but now I’m starting to change my mind on this. I see so many badly written terms of reference and evaluation reports. Accreditation might help to raise the bar. However, we would have to have many more training opportunities for evaluators and I cannot see that happening in the near future. Still I think it’s a debate worth having in the AES.
The answer to the question on how the AES can position itself to still be relevant in the future is an important one for AES members as well as the Board. Scott’s comments on this displayed a level of maturity and understanding of the situation.
I think it’s important to begin by clarifying the AES’s role and priorities. Is the AES an interest group, an advocacy body or a professional association (or some mixture of all three)? We can begin by focusing on members’ needs and priorities (while recognising the difficulties of working that out!). Individuals, like governments, ebb and flow in their degree of interest in evaluation. Is there an opportunity for the AES to form more alliances and partnerships? I think there is, particularly with external agencies such as IPAA and ANZSOG. It’s hard for the AES to get things done when we rely so heavily on voluntary members; we simply lack the advantages of having a well-developed administrative capacity. I’ve been impressed with the Board’s recent work on engaging with and advising Commonwealth government departments such as DoF.
In the Commonwealth government, evaluation was at its peak from 1986-96. In the last 8 months there seems to be more talk of evaluation with the view that we need to lift the state of current practice so hopefully we will get more evaluation into decision making processes. In my view, the main issue for central agencies (PM&C, Dept of Finance, ANAO, Treasury) is the lack of demand for evaluative feedback and incentives to drive the continuous improvement of programs. On a positive note, we are seeing some discussion recently on issues such as the potential benefits of having an Evaluator General.
Before we completed the discussion, Scott candidly shared one of his biases (new evaluators take note!).
One of my biases is that coming up with answers to evaluation questions is generally not that difficult. The hard part is actually identifying good questions to ask: questions that stakeholders care about; questions that reduce uncertainty; questions that support learning, collaborative relationships and better program results.
Scott is currently a consultant for Oxford Policy Management in the UK but living in Canberra. He was previously Principal Specialist Performance Management and Results in the Department of Foreign Affairs Canberra.
His major roles in evaluation include: Evaluator with ATSIC, Auditor General’s Office Perth and Melbourne, Asian Development Bank Philippines, Vietnam UNDP, Department of Human Services Melbourne, and AusAID Canberra.
by Anthea Rutter
Chris Milne was an early pioneer in the use of program logic. As a founding partner of ARTD Consultants, he has designed and delivered numerous evaluations across diverse sectors and built the evaluation capacity of government and non-government organisations. In recent years, he worked with another AES Fellow, Patricia Rogers, on the NSW Government evaluation toolkit.
I enjoyed speaking with Chris. He struck me as a man with a high degree of humility, as well as someone who considers his answers in a balanced way. He is obviously committed to the environment and the world in which we live, and passionate about making it a good place for the generations that follow.
How did you fall into evaluation and establishing ARTD?
I was working in adult education with Aboriginal people at Tranby College in Sydney. Then, in 1989, two of us set up ARTD as a training consultancy. I became more interested in evaluating training rather than doing it, which led me to the work of Donald Kirkpatrick and Robert Brinkerhoff in the US. Then I saw the program logic approach developed by Sue Funnell and others for the NSW Government. I found program logic a great tool for monitoring and evaluation, and I began to use it with all kinds of programs.
What have been your particular interests in evaluation?
Program theory and program logic. The spread of program logic has been a highlight, especially the approach developed by Sue Funnell and others in the 1990s. I’ve seen it go from a pioneering concept to a fundamental tool of evaluation. In 1995 at the first international evaluation conference in Vancouver, Sue and I ran a workshop on program logic – attended by a lot of experienced American evaluators who gave us very positive feedback. At ARTD we developed a computer training package on program logic in the 1990s and sold hundreds of copies around Australia and internationally – a couple of years ago I heard from a woman in Alaska who had been using it for years.
I’ve also enjoyed working out evaluation strategies, questions, designs and plans and advising organisations on all aspects of evaluation including overall approaches, capacity and use. I am very interested in meta-evaluation, whether assessing the quality of an individual evaluation, or more rarely, reviewing a collection of evaluations.
I have enjoyed supporting people to be sound and informed evaluators, whether clients or our staff at ARTD. I particularly liked coaching people to write executive summaries that are succinct, balanced and evidence-based – getting it all down to a couple of pages is an art!
I would imagine that anyone who has been involved in a profession for over 30 years would have faced a number of challenges along the way. What was interesting was the wide-ranging nature of Chris’s reply, bringing in issues of practice, culture, methods and the political landscape.
Well the world of evaluation itself is one big challenge, that’s why we love it!
At the practice level we need to deal with all the constraints in doing good evaluation work, especially in organisations where people have limited experience. Take costs for example – some people have no idea of the likely cost of the evaluation that they want.
A more recent challenge is the increased complexity of interventions and, therefore, the complexity of the evaluation.
Another is the clash of cultures around evidence and methods across different policy fields, so expectations vary across health, education, environment, human services, economics and so on. Similarly, governments go through different fads around evidence, so that requirements change; for example, managerialist approaches tried to make decisions on a few metrics (KPIs), rather than the full story of contexts, strengths and weaknesses.
How has evaluation changed over the past 30 years?
Organisations involved in public policy are always going through changes in how they use evidence and make decisions. I’ve seen two or three cycles of evaluative approaches come and go. Each earlier approach is retained somewhere, so it seems that evaluation will always be multi-faceted and contested.
Another change is the greater and greater influence of technology. In evaluation, we have more access to big data, sophisticated tools for qualitative and quantitative analysis, social media and the prospect of artificial intelligence. But, as far as I can tell, evaluative arguments and executive summaries will remain human endeavours for some time.
What are the main skills or competencies that evaluators need to keep pace with emerging trends?
I think that evaluators need a broad base in evaluation theory and practice, in addition to their specialist skills. You need to keep up with literature in evaluation and related fields, such as public policy, management and systems. I believe that scepticism is an important attitude. You also should approach evaluation with curiosity and mindfulness and be able to live with ambiguity and uncertainty.
We live in an uncertain world in which goal-posts change at a rapid rate. So what do you see as the main social issues that we should be thinking about and seeking to resolve in the next decade?
In recent times, a major issue is the less certain role of democratic institutions and governments in our society, and their lack of capacity to deal with the important problems. Everything becomes short term and often ideological. There is less focus on evidence in decision-making. Governments are becoming more populist, with less capacity for, or even interest in, rational and balanced decisions about the big problems that we face.
Reconciliation with Aboriginal people is far from complete in Australia and this has many implications for how we do evaluation. The AES has had a good record in recent years, but we need to continue our focus on Aboriginal issues and the involvement of Aboriginal people.
More broadly, I believe that the biggest issue we face is addressing climate change and its impact on all aspects of our lives. For evaluators, this includes how we deal with special interests and the unbalanced use of data. We also need to address the increasing inequity within our society, whereby my generation is far better off than younger people. Another issue is the control and use of data collected by Google and social media companies. Data that are not transparent may breach our ethical standards, and are essentially used for commercial and political purposes. The challenge for evaluation is to be able to access and use the growing amount of data for reasoned inquiry, balanced decisions and, ultimately, the public good.
How can the AES position itself to remain relevant into the future?
It’s so important we stay inclusive of people with very different interests, approaches, backgrounds and experiences with evaluation. We should be a platform to communicate the trends and challenges for evidence and evaluation in public policy.
The Society needs a high profile – it should stand out as the key authority on evaluation in Australia.
Chris Milne is a founding partner of ARTD Consultants, a public policy consulting firm specialising in evaluation established in 1983. While he is mostly retired, Chris continues to chair the ARTD Board and act as a sounding board for ARTD Directors.