This is the AES Blog, where we regularly post articles by the Australasian evaluation community on the subjects that matter to us. If you have an idea, please contact us.

May 2020
by Jade Maloney

Over the last couple of months, evaluators around the world have been grappling with the question of whether and how we evaluate in the COVID-19 context. What can and should be done now, and what should wait? How can we be most useful?

For a recent online session with AES members, which Keren Winterford, Greg Masters and I hosted on behalf of the NSW Committee, I rounded up a range of reflections on these questions to prompt discussion.

We need to consider carefully whether to pause or press on

Evaluation is the oxygen that powers decision making. Too little and we are likely to make poor decisions. And when faced with big challenges, we need more than usual. Too much evaluation without action leads to hyperventilation. Analysis paralysis. As an evaluator, it is your responsibility to keep the breathing steady. [Chris Lysy]

To decide whether to pause or press on with our existing evaluations, we need to ask ourselves a series of questions.

Can it be done without undue stress to an organisation responding to COVID-19? At the best of times, evaluation can be anxiety inducing. Does the organisation or team have the bandwidth to engage?

Can the evaluation still be useful now? Can you adapt your questions? Can you assess how COVID-19 adaptations are working? Can you help to identify what adjustments should continue post COVID-19?

Can you adapt your methods to comply with physical distancing? Will the people you are trying to engage, engage online? Can you draw on existing data sources?

The World Bank’s Adapting Evaluation Designs sets out four questions that you can adapt to work through whether to press on. Better Evaluation has also begun a series on Adapting evaluation in the time of COVID-19. Part 1: MANAGE has a range of useful prompts to help you work through changes to stakeholder context, engagement, decision-making protocols, information needs, Terms of Reference and budgeting.

Think beyond individual “evaluations” to tap into our value

I think one of the key gaps or aspects I don’t see addressed much is around utility of evaluation in this space. A lot of the discussion online is around the ‘how’ – how do we adapt evaluation? But I feel a deeper question is around the ‘why’ of evaluation. Why is it still important to do evaluation in this context? Is it actually important to making a difference? This is quite a tricky question and one that can make an evaluator really uncomfortable as it forces us to reconsider our work. But, on the contrary, I see this as an opportunity to reinforce our conviction, sense of purpose and clarity. Evaluation was already often an after-thought and now urgent customer-facing delivery initiatives are definitely taking priority. The case for evaluation will be harder to make. We need to genuinely think about the value evaluation can bring in these times and more broadly. [Florent Gomez, NSW Department of Customer Service]

As Michael Quinn Patton has said, we need to be prepared to make a case for the value of evaluation now. We can do this by proactively demonstrating the ongoing relevance of evaluative thinking, supporting real-time sensemaking of data, engaging in systems thinking (identifying the interconnections and their implications), enabling decision-making based on “good enough” data, and identifying the potential for negative unintended consequences so they can be prevented. In other words, “All evaluators must now become developmental evaluators, capable of adapting to complex dynamic systems, preparing for the unknown, for uncertainties, turbulence, lack of control, nonlinearities, and for emergence of the unexpected.”

For guidance on sense-making in real time, check out Canadian facilitator Chris Corrigan’s blog. First, observe the situation. Then, look for patterns and inquire into these. What do you notice in general? What are the exceptions to these generalisations? The contradictions? The surprises? What are you curious about? Then, using complexity concepts, look at what is keeping the patterns you have identified in place and the actionable insights that could enable change.

Sense making in real time 

My team at ARTD have also developed the 3 R Framework as a tool for using evaluative thinking under pressure. It is based around questions because, in our experience, being an evaluator is about asking effective questions at the right time, not about having all the answers. You can use the framework to direct responses at an organisational, team, program or individual level. If you’re applying it within your organisation, team or to a program, we suggest getting a diverse group of people together to reflect, drawing on existing data, stories and experiences to ensure you are not missing critical insights as you make decisions.

3 R Framework 

While being useful right now, we can also keep our eye on the long game – what data needs to be collected now to enable evaluation of pandemic responses?

Think through the implications of your choices

Among evaluators I have spoken to around Australia and overseas, there is a strong concern about the equity implications of changes. It is important that we recognise the differential impacts of the crisis, consider accessibility when adapting our methods, and ask whose voice is missed if we draw only on existing data.

We also need to be as purposeful in choosing our online methods as we are in planning methods generally. Not everything has to become a Zoom session. Asynchronous (contributing at different times) online methods bring different benefits and drawbacks to synchronous (contributing at the same time) online methods.

Remember: not everything is changing and some things should

One of the things I have found most useful in this time is my colleague Emily Verstege’s reminder (with reference to Kieran Flanagan and Dan Gregory’s Forever Skills) that, while many things are changing, including how we evaluate, what is at the core of evaluation is not. We can take comfort in this, as well as in the potential to change things that need changing.

One of the benefits of taking our regular AES session online was the ability to engage with members in regional areas and other jurisdictions. It’s something the committee is already thinking about continuing when physical distancing rules are relaxed.

I have been most struck by the validity of that old maxim that necessity is the mother of invention. In many areas of work and life, more fundamental change has occurred in the last few weeks than in previous years, despite the relentless urging for innovation. Witness working from home arrangements, expansion of telehealth services, online delivery of educational programs.

Hopefully, one of the legacies of this awful crisis is that some of these new practices become ingrained and that we become more comfortable challenging the status quo and traditional modes of operation. Returning to normal is neither feasible nor desirable. Evaluators have a large role to play in leading that campaign but we also need to challenge our existing practices. [Greg Masters, Nexus Management Consulting and NSW AES Committee member]



Jade Maloney is a Partner and Managing Director of ARTD Consultants, which specialises in program evaluation. 


February 2020
by Anthea Rutter

Gill was named an AES Fellow in 2018, and I was pleased to introduce her at the AES conference in Launceston that year. We started with what brought her into the field of evaluation, and what it was about realist methodology that not only piqued her interest but now defines her as a practitioner.

I came into evaluation from a background in human services and managing human services. I’d always been concerned about how we could tell whether we were doing any good or not. I was introduced to realist evaluation through some work I was doing in crime prevention, and it provided a way to work out why some things work for some people but not for others. I found out through reading evaluations that there’s quite a common pattern – that programs often don’t work for those who are most disadvantaged, and some actually cause harm to them. I wanted to know why.

The realist approach assumes that outcomes will be different for different people. The more I worked with it, the more I realised that it’s not just how I approach evaluation, it’s actually how I see the world. I am a realist. It has shaped my life and my thinking in general. People who use it often don’t understand it and often get it wrong. It’s a methodological approach rather than a method, [that is] a philosophy for method.

It was clear from our conversation that Gill is committed to realist philosophies and methodologies. I was intrigued by her passion.

I describe myself as a realist methodologist. Within that I think my real area of expertise is developing methods or strategies for the application of realist methods in things which are hard to evaluate, for example, prevention programs. How do you evaluate things which haven’t happened? More recently I have looked at how to use realist methods in very large scale, very complex programs.

The other area of interest is in grappling with the implications of the fundamental philosophy of realism. Others have done a lot of work on realist ontology. My two current interests are realist axiology – how you think about valuing from a realist perspective – and what does that mean for evaluation? The other is realist epistemology. Some people have argued that realists are constructivists, epistemologically. But I think there are points of difference and I’m interested in what that means for practice.

All of us have experienced challenges along the way, and I was keen to explore these with Gill.

It’s not a single thing but a range of things. Some commissioners have asked for realist evaluation, but it turned out they didn’t understand it and what it can do. There are challenges in other projects where people who have been taken on as part of the team look as though they will be ok using a realist lens, but it turns out they’re not.

Challenges in terms of the usual constraints on evaluation, money and time. I do pick difficult things to evaluate and there can be challenges with that. Generally it’s the interaction of a number of factors in particular programs. The skill is being able to think through and negotiate the different factors in an evaluation.

She also pointed out some highlights.

A particular one is Nick inviting me to do the PhD – this was in a sense a starting point and an influential moment which changed my direction. I had decided to move into evaluation in some way, but this changed everything.

Writing the standards for realist evaluation was another one – that was an honour – but also working deeply and closely with those who really understood realist approaches. I enjoyed thinking about what really matters if you want to use this approach coherently and consistently.

A number of people and methodologies had a great influence on Gill’s practice.

Nick Tilley and Ray Pawson, of course. Bhaskar’s work, including his model thinking about levels of reality, the empirical, the actual and the real. Patricia Rogers. I’ve done a lot of training in other methods too, and probably each of them has had some influence.

I’ve also adapted other methods to suit realist evaluation. One example is Most Significant Change stories. To do that, you have to look back at what the developers of a particular theory or method were trying to achieve, and the strengths and weaknesses of that for realist work. So for MSC stories, I looked at what Rick Davies intended, but then recognised that selecting the ‘most significant’ changes hides all the variation that realist analysis depends on. So I worked with a project to develop other strategies to maintain that variation while still identifying what it was that mattered to people, and why.

Gill had some definite ideas on how evaluation had changed over the years.

The pendulum swings back and forth in relation to methodologies and methods. At the moment there are parts of government here, and some overseas, that are swinging towards positivist approaches, i.e. Randomised Controlled Trials. I worry about that and think it could be a danger because RCTs don’t give all the information you need to make some kinds of evidence-informed judgments.

I see a lot of younger people coming into the profession, which I think is great. The courses at University of Melbourne (CPE) and our own in Darwin do help to bring in younger people. I see the influence of technology, for example, the ability to manipulate big data.

I think there are some challenges too. For example, the use of social media in evaluation is fraught with dangers, but the ability to record data via iPad in the international development context is great. There are lots of implications in regard to new technologies.

Gill’s response to the issue of skills and competencies for the evaluator of today reinforced some of the fundamental qualities evaluators need in order to be successful practitioners.

The two biggest competencies for evaluators, I think, are the ability to think hard and well, because our job is to make judgments. Your judgments are supposed to be well informed. The skill of the evaluator lies in the analytic ability to think through the implications of what people are doing, but also the implications of the data you’ve collected, and work out what it all means.

The other competency is that you have to be able to engage with people, even though it can be difficult because people often feel uncomfortable with being evaluated, and with some of the findings. The relationship with the client is important.

She was definite about some of the social issues she thinks evaluators should be thinking about as well as helping to resolve in the next decade.

I choose to work in areas that are grappling with things which are threats to humanity – environment and climate issues, or international development issues, which have big implications for the balance of power.

The other priority for me relates to social justice, for example, women’s issues, youth, domestic violence, sexual assault, employment/unemployment – anything to do with social disadvantage, which is underpinned by injustice.

If you let society get unjust enough, and I think we are right there now, then the situation becomes a state of dangerous unrest. Those are my driving forces, and that’s where I think the field of evaluation can make its best contribution.

Gill has been involved with the society in a number of roles: as a committee member, a Board member (twice) and convening a conference committee, so I felt she would be in a good position to ponder the direction which the AES should take in the future.

The AES has gone through a necessary stage of being inward focused, looking at the constitution, the strategic plan and so on. Now it needs to be more outwardly focused. At this exact moment, it needs to think about the implications of the proposal for an Evaluator General.

The society should have a stronger policy advocacy focus, which should be manifested at both a national and a state level. The members live in states and territories, and for many of us, our working lives are framed by state and territory legislation.

The third way in which it can look outward is dealing with other professions because the things they are doing are informing policy and practice. We need stronger bridges with other fields. It needs to begin a conversation which can inform practice both ways; otherwise we will become irrelevant.

The fourth way is to build some knowledge of the implications of new technologies. There are people within the field with specialist knowledge but many of us don’t know enough, and haven’t thought hard enough, about them as yet. Myself included.


Gill Westhorp is a Professorial Research Fellow at Charles Darwin University, Darwin, where she leads the Realist Research Evaluation and Learning Initiative (RREALI). She is also Director of Community Matters Pty Ltd, a research and evaluation consultancy based in South Australia.


March 2020
by Anthea Rutter

Jerome Winston’s career spans over 45 years. He has fascinating insights into how evaluation was viewed in the 70s, which reminded me that back then, evaluation was not viewed as a separate profession, but as part of other disciplines.

I started teaching at Preston Institute of Technology (which, following two mergers, became Phillip Institute of Technology and then RMIT University).  At first, I was teaching both diploma and degree courses in engineering and applied chemistry. When the School of Social Work opened, they were looking for staff who would teach computing and statistics. As an applied scientist, I proposed building monitoring and evaluation into practice, so recommended that computing and statistics be taught as just one aspect of program planning, monitoring and evaluation. This suggestion, first adopted in social work, was later included in a number of other disciplines such as nursing, chiropractic and leisure studies.

Jerome then talked about the 80s and the advent of program budgeting in the Victorian – and later, federal – government, and what this meant for the next stage of his career.

Although program budgeting was intended to incorporate evaluation, Jerome believed that reporting simple, aggregated, numerical data as ‘performance indicators’ would not provide the depth of information needed about most government programs.  The use – and misuse – of ‘performance indicators’ became a main focus of Jerome’s research. 

In 1978, Jerome designed post graduate programs in data collection and analysis for research, monitoring and evaluation. These programs started at Phillip Institute of Technology (PIT) at about the same time that John Owen’s program in evaluation was starting at The University of Melbourne. Most of Jerome’s career was as a senior lecturer in multi-method research, monitoring and evaluation at PIT (later, RMIT).

The AES Fellows’ reasons for coming into the field of evaluation have been eclectic and Jerome presented yet another pathway.

I wouldn’t have gone into evaluation unless I had started with an interest in both science and government. When I met social work academics at PIT, I found they shared a broad sense of systems theory, research methods, and data collection and analysis. I ended up as an applied scientist teaching multi-method approaches to evaluation in the human services.

My main interest is in applying systems theory to the planning and evaluation of human services. My other interest is integrating multiple methods of data collection and analysis, and their use in building practice knowledge. I don’t expect any method, on its own, to be particularly useful. 

As an evaluation practitioner, he points to the challenges of bringing together multiple disciplines.

Most of the challenges I have encountered have to do with responding to the splitting of disciplines from each other – finding ways to bridge gaps among disciplines – gaps between public administration, applied science, planning, budgeting, evaluation and management. 

The main highlights for his career have been about building networks as well as being able to embrace opportunities.

In the 70s and early 80s, colleagues supported me to set up two different networks: the Australian Evaluation Network and its occasional newsletter were intended to link people across Australia. In Victoria, Colin Sharp and I set up the Evaluation Training Network, so that our colleagues could run low-cost evaluation workshops.  Then, meeting Anona Armstrong and being invited by her to contribute to planning the first evaluation conferences, then becoming a foundation member of the AES, and then a Board member. 

Towards the end of the 80s, I was encouraged by colleagues in Canberra to apply for an executive interchange into the Australian Public Service. I was selected to work for six months in the evaluation section of the Commonwealth Department of Finance at the time they were introducing program budgeting – and performance indicators – across the public service. 

About the same time, I started to speak on evaluation and performance indicators at conferences on public administration and accounting in Australia and New Zealand. This led in 1994 to co-leading conference workshops in Kuala Lumpur with Dr. Arunaselam Rasappan – then an evaluation trainer and consultant at the Malaysian government’s public administration training college and later the head of an evaluation research, training and development centre that a few of us established in Malaysia. 

Of the influences in his career, it was no surprise that they have been practice based.

The first influence was the philosophy of social work to which I was introduced at PIT.  Their approach saw evaluation as an ‘intervention for change’ integral to professional practice. Another influence was having the opportunity to work within the Department of Finance in Canberra on evaluation and what it meant within that department. 

I also asked him what changes he had seen during his career. Jerome’s perception is that formative evaluation has disappeared as a concept in some organisations that promote evaluation. He thinks that the emphasis has been more on summative and impact evaluation, with limited work on theory, without which summative evaluation provides limited information. 

In Australia and New Zealand, evaluation was typically understood as a team activity. We did not expect one person – ‘the evaluator’ – to carry out an evaluation, largely on their own, so we did not use the term ‘evaluator’ as frequently as it is used now, referring instead to ‘evaluation teams’ and ‘evaluation practitioners’.

I was also keen to find out what skills and competencies the evaluators of today need to have to keep up with emerging trends in evaluation practice.

I think most of the members of the AES come from a narrow professional or academic background. In the 80s, the AES conferences included more auditors, public health, public administration and public finance professionals, economists, etc. We need to return to our multi-profession roots, which were evident in evaluation journals in the 1970s and early 1980s.  

When I asked Jerome about what he saw as the major social issues evaluators ought to be thinking about as well as seeking to resolve, his answers were very perceptive.

We need to understand that Indigenous cultures have different approaches to using knowledge in their community from what is common in the dominant Aussie culture. We sometimes have quite naïve approaches to Indigenous cultures. 

Another issue is including the ‘value’ in ‘evaluation’.  Some evaluation practitioners do what they are told is wanted, rather than insist on reporting on how other ‘values’ may influence findings, conclusions and recommendations. 

I asked Jerome how he saw the AES maintaining its relevance. His answer was focused and direct.

Build those bridges between professional disciplines that share an interest in evaluation. Take advantage of individuals’ different sources of knowledge and skills. Increase the relevance of evaluation at the practice level, and it is important that we keep doing research about the practice of monitoring and evaluation.


Jerome Winston continues to work with the research centre in Malaysia – the Centre for Development and Research in Evaluation. He does a range of work for government and aid programs on how well new evaluation models and frameworks work, and why. He also runs a small consultancy in Australia.


December 2019
by Jade Maloney, Jo Farmer and Eunice Sotelo

With so many authors and approaches to evaluation, knowing what to pay attention to can be hard. Evaluation, just like the catwalk, is subject to the whims of the day. How do you know what’s a passing fad and what will remain in fashion?

At the AES Victoria regional seminar in November, Brad Astbury suggested the following 10 books will stand the test of time.

1960s: In the age of bell bottoms, beehives and lava lamps, and protests against the Vietnam war, evaluation was on the rise. The U.S. Congress passed the Elementary and Secondary Education Act (ESEA) in 1965, the first major piece of social legislation to require evaluation of local projects undertaken with federal funds and to mandate project reporting. The hope was that timely and objective information about projects could reform local governance and practice of education for disadvantaged children, and that systemic evaluation could reform federal management of education programs.

While you wouldn’t return to the fashion or the days when evaluation was uniquely experimental, hold onto your copy of Experimental and quasi-experimental designs for research by Donald T. Campbell and Julian C. Stanley. These two coined the concepts of external and internal validity. Even if you’re not doing an experiment, you can use a validity checklist.

1970s: Back in the day of disco balls and platform shoes, the closest thing evaluation has to a rock star – Michael Quinn Patton – penned the first edition of Utilisation-focused evaluation. Now in its fourth edition, it’s the bible for evaluation consultants. It’s also one of the evaluation theories with the most solid evidence base – drawn from Patton’s research. In today’s age of customer centricity, it’s clear focusing on intended use by intended users is a concept that’s here to stay.

Carol Weiss’s message – ignore politics at your peril – could also have been written for our times. Her Evaluation research: methods of assessing program effectiveness provides a solid grounding in the politics of evaluation. It also describes theory-based evaluation – an approach beyond the experimental, that is commonly used today.

1980s: While you may no longer work out in fluorescent tights, leotards and sweat bands, your copy of Qualitative evaluation methods won’t go out of fashion any time soon. In his second appearance on the list, Michael Quinn Patton made a strong case that qualitative ways of knowing are not inferior.

You may also know the name of the second recommended author from this decade, but more likely for the statistical test that bears his name (Cronbach’s alpha) than his contribution to evaluation theory, which is under-acknowledged. In Toward reform of program evaluation, Lee Cronbach and associates set out 95 theses to reform evaluation (in the style of Martin Luther’s 95 theses). That many of the 95 theses still ring true could be seen as either depressing or a consolation for the challenges evaluators face. For Astbury – ever the evaluation lecturer – thesis number 93 “the evaluator is an educator; his success is to be judged by what others learn” is the standout, but there’s one in there for everyone. (No. 13. “The evaluator’s professional conclusions cannot substitute for the political process” aligns with Weiss’s message, while No. 9. “Commissioners of evaluation complain that the messages from evaluations are not useful, while evaluators complain that the messages are not used” could have been pulled from Patton’s Utilisation-focused evaluation).

1990s: Alongside Vanilla Ice, the Spice Girls and the Macarena, the 90s brought us CMOCs – context, mechanism, outcome configurations – and a different way of doing evaluation. Ray Pawson and Nick Tilley’s Realistic evaluation taught us not to just ask what works, but what works, for whom, in what circumstances, and why.

To balance the specificity of this perspective, the other recommendation from the 90s is agnostic. Foundations of program evaluation: Theories of practice by William Shadish Jr., Thomas Cook and Laura Leviton describes the three stages in the evolution of evaluation thinking. It articulates the criteria for judging the merits of evaluation theories: the extent to which they are coherent on social programming, knowledge construction, valuing, use, and practice. The message here is there is no single theory or ideal theory of evaluation to guide practice.

2000s: While iPods and flash mobs are a faint memory, these two books have had a lasting impact: Evaluation: An integrated framework for understanding, guiding, and improving public and non-profit policies and programs by Melvin Mark, Gary Henry and George Julnes; and Evaluation roots: Tracing theorists’ views and influences edited by Marvin Alkin.

The former covers four key purposes of evaluation: to review the merit of programs and their value to society (as per Scriven’s definition); to improve the organisation and its services; to ensure program compliance with mandates; to build knowledge and expertise for future programs. The take-out is to adopt a contingency perspective.

The latter is the source of the evaluation theory tree – which sparked commentary at this year’s AES and AEA conferences for its individualism, and limited gender and cultural diversity. Still, Brad reminds us that there’s value in learning from the thinkers as well as the practitioners; we can learn in the field but also from the field. According to Sage, Alkin’s Evaluation roots is one of the best-selling books on evaluation.

It’s a reminder that there is much still to learn from those who’ve come before – that we can learn as much from those who’ve thought about evaluation for decades as we can from our practical experience.

2010s: The age of the selfie has not yet faded and nor has Evaluating values, biases and practical wisdom by Ernest R. House. It covers three meta-themes: values (Scriven, House); biases (Campbell and the experimental approach; expanding the concept of validity); and practical wisdom (on Aristotle’s notion of praxis – blending/embedding theory and practice). It gives us the wise advice to pay more attention to cognitive biases and conflicts of interest.

So now to the questions.

Why didn’t Scriven make the list? Because he’s written few books and there wasn’t enough room in the 90s. Nevertheless, Michael Scriven’s Evaluation thesaurus and The Logic of evaluation are among the books Astbury notes are worth reading.

What about local authors? Grab a copy of Building in research and evaluation: Human inquiry for living systems by Yoland Wadsworth and Purposeful program theory: Effective use of theories of change and logic models by Sue Funnell and Patricia Rogers. If you’re new to evaluation, Evaluation methodology basics: The nuts and bolts of sound evaluation – from this year’s AES conference keynote E. Jane Davidson – can help you get a grasp on evaluation in practice.

Brad ended by sounding a word of warning not to get too caught up in the fads of the day. Buzzwords may come and go, but to avoid becoming a fashion victim, these ten books should be a staple of any evaluator’s bookshelf. 


Brad Astbury is a Director at ARTD Consultants, based in the Melbourne office. He has over 18 years’ experience in evaluation and applied social research and considerable expertise in combining diverse forms of evidence to improve both the quality and utility of evaluation. He has managed and conducted needs assessments, process and impact studies and theory-driven evaluations across a wide range of policy areas for industry, government, community and not-for-profit clients. Prior to joining ARTD in 2018, Brad worked for over a decade at the University of Melbourne, where he taught and mentored postgraduate evaluation students.