
A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance

  • Kathryn Skivington, research fellow 1,
  • Lynsay Matthews, research fellow 1,
  • Sharon Anne Simpson, professor of behavioural sciences and health 1,
  • Peter Craig, professor of public health evaluation 1,
  • Janis Baird, professor of public health and epidemiology 2,
  • Jane M Blazeby, professor of surgery 3,
  • Kathleen Anne Boyd, reader in health economics 4,
  • Neil Craig, acting head of evaluation within Public Health Scotland 5,
  • David P French, professor of health psychology 6,
  • Emma McIntosh, professor of health economics 4,
  • Mark Petticrew, professor of public health evaluation 7,
  • Jo Rycroft-Malone, faculty dean 8,
  • Martin White, professor of population health research 9,
  • Laurence Moore, unit director 1
  • 1 MRC/CSO Social and Public Health Sciences Unit, Institute of Health and Wellbeing, University of Glasgow, Glasgow, UK
  • 2 Medical Research Council Lifecourse Epidemiology Unit, University of Southampton, Southampton, UK
  • 3 Medical Research Council ConDuCT-II Hub for Trials Methodology Research and Bristol Biomedical Research Centre, Bristol, UK
  • 4 Health Economics and Health Technology Assessment Unit, Institute of Health and Wellbeing, University of Glasgow, Glasgow, UK
  • 5 Public Health Scotland, Glasgow, UK
  • 6 Manchester Centre for Health Psychology, University of Manchester, Manchester, UK
  • 7 London School of Hygiene and Tropical Medicine, London, UK
  • 8 Faculty of Health and Medicine, Lancaster University, Lancaster, UK
  • 9 Medical Research Council Epidemiology Unit, University of Cambridge, Cambridge, UK
  • Correspondence to: K Skivington Kathryn.skivington@glasgow.ac.uk
  • Accepted 9 August 2021

The UK Medical Research Council’s widely used guidance for developing and evaluating complex interventions has been replaced by a new framework, commissioned jointly by the Medical Research Council and the National Institute for Health Research, which takes account of recent developments in theory and methods and the need to maximise the efficiency, use, and impact of research.

Complex interventions are commonly used in health and social care services, public health practice, and other areas of social and economic policy that have consequences for health. Such interventions are delivered and evaluated at levels ranging from the individual to the societal. Examples include a new surgical procedure, the redesign of a healthcare programme, and a change in welfare policy. The UK Medical Research Council (MRC) published a framework for researchers and research funders on developing and evaluating complex interventions in 2000 and revised guidance in 2006. 1 2 3 Although these documents continue to be widely used and are now accompanied by a range of more detailed guidance on specific aspects of the research process, 4 5 6 7 8 several important conceptual, methodological, and theoretical developments have taken place since 2006. These developments have been incorporated into a new framework commissioned by the National Institute for Health Research (NIHR) and the MRC. 9 The framework aims to help researchers work with other stakeholders to identify the key questions about complex interventions, and to design and conduct research with a diversity of perspectives and an appropriate choice of methods.

Summary points

Complex intervention research can take an efficacy, effectiveness, theory based, and/or systems perspective, the choice of which is based on what is known already and what further evidence would add most to knowledge

Complex intervention research goes beyond asking whether an intervention works in the sense of achieving its intended outcome—to asking a broader range of questions (eg, identifying what other impact it has, assessing its value relative to the resources required to deliver it, theorising how it works, taking account of how it interacts with the context in which it is implemented, how it contributes to system change, and how the evidence can be used to support real world decision making)

A trade-off exists between precise unbiased answers to narrow questions and more uncertain answers to broader, more complex questions; researchers should answer the questions that are most useful to decision makers rather than those that can be answered with greater certainty

Complex intervention research can be considered in terms of phases, although these phases are not necessarily sequential: development or identification of an intervention, assessment of feasibility of the intervention and evaluation design, evaluation of the intervention, and impactful implementation

At each phase, six core elements should be considered to answer the following questions:

How does the intervention interact with its context?

What is the underpinning programme theory?

How can diverse stakeholder perspectives be included in the research?

What are the key uncertainties?

How can the intervention be refined?

What are the comparative resource and outcome consequences of the intervention?

The answers to these questions should be used to decide whether the research should proceed to the next phase, return to a previous phase, repeat a phase, or stop

Development of the Framework for Developing and Evaluating Complex Interventions

The updated Framework for Developing and Evaluating Complex Interventions is the culmination of a process that included four stages:

A gap analysis to identify developments in the methods and practice since the previous framework was published

A full day expert workshop in May 2018, with 36 participants, to discuss the topics identified in the gap analysis

An open consultation on a draft of the framework in April 2019, in which we sought written feedback from stakeholders by advertising via social media, email lists, and other networks (52 detailed responses were received from stakeholders internationally)

A redraft using findings from the previous stages, followed by a final expert review.

We also sought stakeholder views at various interactive workshops throughout the development of the framework: at the annual meetings of the Society for Social Medicine and Population Health (2018), the UK Society for Behavioural Medicine (2017, 2018), and internationally at the International Congress of Behavioural Medicine (2018). The entire process was overseen by a scientific advisory group representing the range of relevant NIHR programmes and MRC population health investments. The framework was reviewed by the MRC-NIHR Methodology Research Programme Advisory Group and then approved by the MRC Population Health Sciences Group in March 2020 before undergoing further external peer and editorial review through the NIHR Journals Library peer review process. More detailed information and the methods used to develop this new framework are described elsewhere. 9 This article introduces the framework and summarises the main messages for producers and users of evidence.

What are complex interventions?

An intervention might be considered complex because of properties of the intervention itself, such as the number of components involved; the range of behaviours targeted; the expertise and skills required by those delivering and receiving the intervention; the number of groups, settings, or levels targeted; or the permitted level of flexibility of the intervention or its components. For example, the Links Worker Programme was an intervention in primary care in Glasgow, Scotland, that aimed to link people with community resources to help them “live well” in their communities. It targeted the individual, primary care (general practitioner (GP) surgery), and community levels. The intervention was flexible in that it could differ between GP surgeries. In addition, the link workers did not support just one specific health or wellbeing issue: bereavement, substance use, employment, and learning difficulties were all included. 10 11 The complexity of this intervention had implications for many aspects of its evaluation, such as the choice of appropriate outcomes and processes to assess.

Flexibility in intervention delivery and adherence might be permitted to allow for variation in how, where, and by whom interventions are delivered and received. Standardisation of interventions could relate more to the underlying process and functions of the intervention than to the specific form of the components delivered. 12 For example, in surgical trials, protocols can be designed with flexibility for intervention delivery. 13 Interventions require a theoretical deconstruction into components, followed by agreement about permissible and prohibited variation in the delivery of those components. This approach allows implementation of a complex intervention to vary across different contexts while maintaining the integrity of the core intervention components. Drawing on this approach in the ROMIO pilot trial, core components of minimally invasive oesophagectomy were agreed and subsequently monitored during main trial delivery using photography. 14

Complexity might also arise through interactions between the intervention and its context, by which we mean “any feature of the circumstances in which an intervention is conceived, developed, implemented and evaluated.” 6 15 16 17 Much of the criticism of and extensions to the existing framework and guidance have focused on the need for greater attention on understanding how and under what circumstances interventions bring about change. 7 15 18 The importance of interactions between the intervention and its context emphasises the value of identifying mechanisms of change, where mechanisms are the causal links between intervention components and outcomes; and contextual factors, which determine and shape whether and how outcomes are generated. 19

Thus, attention is given not only to the design of the intervention itself but also to the conditions needed to realise its mechanisms of change and/or the resources required to support intervention reach and impact in real world implementation. For example, in a cluster randomised trial of ASSIST (a peer led, smoking prevention intervention), researchers found that the intervention worked particularly well in cohesive communities that were served by one secondary school where peer supporters were in regular contact with their peers—a key contextual factor consistent with diffusion of innovation theory, which underpinned the intervention design. 20 A process evaluation conducted alongside a trial of robot assisted surgery identified key contextual factors to support effective implementation of this procedure, including engaging staff at different levels and surgeons who would not be using robot assisted surgery, whole team training, and an operating theatre of suitable size. 21

With this framing, complex interventions can helpfully be considered as events in systems. 16 Thinking about systems helps us understand the interaction between an intervention and the context in which it is implemented in a dynamic way. 22 Systems can be thought of as complex and adaptive, 23 characterised by properties such as emergence, feedback, adaptation, and self-organisation ( table 1 ).

Table 1 Properties and examples of complex adaptive systems


For complex intervention research to be most useful to decision makers, it should take into account the complexity that arises both from the intervention’s components and from its interaction with the context in which it is being implemented.

Research perspectives

The previous framework and guidance were based on a paradigm in which the salient question was to identify whether an intervention was effective. Complex intervention research driven primarily by this question could fail to deliver interventions that are implementable, cost effective, transferable, and scalable in real world conditions. To deliver solutions for real world practice, complex intervention research requires strong and early engagement with patients, practitioners, and policy makers, shifting the focus from the “binary question of effectiveness” 26 to whether and how the intervention will be acceptable, implementable, cost effective, scalable, and transferable across contexts. In line with a broader conception of complexity, the scope of complex intervention research needs to include the development, identification, and evaluation of whole system interventions and the assessment of how interventions contribute to system change. 22 27 The new framework therefore takes a pluralistic approach and identifies four perspectives that can be used to guide the design and conduct of complex intervention research: efficacy, effectiveness, theory based, and systems ( table 2 ).

Although each research perspective prompts different types of research question, they should be thought of as overlapping rather than mutually exclusive. For example, theory based and systems perspectives to evaluation can be used in conjunction, 33 while an effectiveness evaluation can draw on a theory based or systems perspective through an embedded process evaluation to explore how and under what circumstances outcomes are achieved. 34 35 36

Most complex health intervention research so far has taken an efficacy or effectiveness perspective and for some research questions these perspectives will continue to be the most appropriate. However, some questions equally relevant to the needs of decision makers cannot be answered by research restricted to an efficacy or effectiveness perspective. A wider range and combination of research perspectives and methods, which answer questions beyond efficacy and effectiveness, need to be used by researchers and supported by funders. Doing so will help to improve the extent to which key questions for decision makers can be answered by complex intervention research. Example questions include:

Will this effective intervention reproduce the effects found in the trial when implemented here?

Is the intervention cost effective?

What are the most important things we need to do that will collectively improve health outcomes?

Where evidence from randomised trials is lacking and conducting such a trial is not feasible, what does the existing evidence suggest is the best option now, and how can it be evaluated?

What wider changes will occur as a result of this intervention?

How are the intervention effects mediated by different settings and contexts?

Phases and core elements of complex intervention research

The framework divides complex intervention research into four phases: development or identification of the intervention, feasibility, evaluation, and implementation ( fig 1 ). A research programme might begin at any phase, depending on the key uncertainties about the intervention in question. Repeating phases is preferable to automatic progression if uncertainties remain unresolved. Each phase has a common set of core elements—considering context, developing and refining programme theory, engaging stakeholders, identifying key uncertainties, refining the intervention, and economic considerations. These elements should be considered early and continually revisited throughout the research process, and especially before moving between phases (for example, between feasibility testing and evaluation).

Fig 1

Framework for developing and evaluating complex interventions. Context=any feature of the circumstances in which an intervention is conceived, developed, evaluated, and implemented; programme theory=describes how an intervention is expected to lead to its effects and under what conditions—the programme theory should be tested and refined at all stages and used to guide the identification of uncertainties and research questions; stakeholders=those who are targeted by the intervention or policy, involved in its development or delivery, or more broadly those whose personal or professional interests are affected (that is, who have a stake in the topic)—this includes patients and members of the public as well as those linked in a professional capacity; uncertainties=identifying the key uncertainties that exist, given what is already known and what the programme theory, research team, and stakeholders identify as being most important to discover—these judgments inform the framing of research questions, which in turn govern the choice of research perspective; refinement=the process of fine tuning or making changes to the intervention once a preliminary version (prototype) has been developed; economic considerations=determining the comparative resource and outcome consequences of the interventions for those people and organisations affected


Core elements

Context

The effects of a complex intervention might often be highly dependent on context, such that an intervention that is effective in some settings could be ineffective or even harmful elsewhere. 6 As the examples in table 1 show, interventions can modify the contexts in which they are implemented, by eliciting responses from other agents or by changing behavioural norms or exposure to risk, so that their effects will also vary over time. Context is both dynamic and multidimensional. Key dimensions include physical, spatial, organisational, social, cultural, political, and economic features of the healthcare, health system, or public health contexts in which interventions are implemented. For example, the evaluation of the Breastfeeding In Groups intervention found that the context of the different localities (eg, staff morale and suitable premises) influenced policy implementation and was an explanatory factor in why breastfeeding rates increased in some intervention localities and declined in others. 37

Programme theory

Programme theory describes how an intervention is expected to lead to its effects and under what conditions. It articulates the key components of the intervention and how they interact, the mechanisms of the intervention, the features of the context that are expected to influence those mechanisms, and how those mechanisms might influence the context. 38 Programme theory can be used to promote shared understanding of the intervention among diverse stakeholders, and to identify key uncertainties and research questions. Where an intervention (such as a policy) is developed by others, researchers still need to theorise the intervention before attempting to evaluate it. 39 Best practice is to develop programme theory at the beginning of the research project with involvement of diverse stakeholders, based on evidence and theory from relevant fields, and to refine it during successive phases. The EPOCH trial tested a large scale quality improvement programme aimed at improving 90 day survival rates for patients undergoing emergency abdominal surgery; it included a well articulated programme theory at the outset, which supported the tailoring of programme delivery to local contexts. 40 The development, implementation, and post-study reflection of the programme theory resulted in suggested improvements for future implementation of the quality improvement programme.

A refined programme theory is an important evaluation outcome and is the principal aim where a theory based perspective is taken. Improved programme theory will help inform the transferability of interventions across settings and help produce evidence and understanding that is useful to decision makers. In addition to a full written articulation of the programme theory, visual representations can be helpful, for example a logic model, 41 42 43 realist matrix, 44 or system map, 45 with the choice depending on which is most appropriate for the research perspective and research questions. Although useful, any single visual representation is unlikely to articulate the programme theory sufficiently; it should always be articulated fully within the text of publications, reports, and funding applications.

Stakeholders

Stakeholders include those individuals who are targeted by the intervention or policy, those involved in its development or delivery, or those whose personal or professional interests are affected (that is, all those who have a stake in the topic). Patients and the public are key stakeholders. Meaningful engagement with appropriate stakeholders at each phase of the research is needed to maximise the potential of developing or identifying an intervention that is likely to have positive impacts on health and to enhance the prospects of achieving changes in policy or practice. For example, patient and public involvement 46 activities in the PARADES programme, which evaluated approaches to reduce harm and improve outcomes for people with bipolar disorder, were wide ranging and central to the project. 47 Involving service users with lived experience of bipolar disorder had many benefits: it enhanced the intervention and also improved the evaluation and dissemination methods. Service users involved in the study also had positive outcomes, including more settled employment and progression to further education. Broad thinking and consultation are needed to identify a diverse range of appropriate stakeholders.

The purpose of stakeholder engagement will differ depending on the context and phase of the research, but engagement is essential for prioritising research questions, co-developing programme theory, choosing the most useful research perspective, and overcoming practical obstacles to evaluation and implementation. Researchers should nevertheless be mindful of conflicts of interest among stakeholders and use transparent methods to record potential conflicts of interest. Research should not only elicit stakeholder priorities but also consider why they are priorities. Careful consideration of the appropriateness and methods of identification and engagement of stakeholders is needed. 46 48

Key uncertainties

Many questions could be answered at each phase of the research process. The design and conduct of research need to engage pragmatically with the multiple uncertainties involved and offer a flexible and emergent approach to exploring them. 15 Therefore, researchers should spend time developing the programme theory, clearly identifying the remaining uncertainties, given what is already known and what the research team and stakeholders identify as being most important to determine. Judgments about the key uncertainties inform the framing of research questions, which in turn govern the choice of research perspective.

Efficacy trials of relatively uncomplicated interventions in tightly controlled conditions, where research questions are answered with great certainty, will always be important, but translation of the evidence into the diverse settings of everyday practice is often highly problematic. 27 For intervention research in healthcare and public health settings to take on more challenging evaluation questions, greater priority should be given to mixed methods, theory based, or systems evaluation that is sensitive to complexity and that emphasises implementation, context, and system fit. This approach could help improve understanding and identify important implications for decision makers, albeit with caveats, assumptions, and limitations. 22 Rather than maintaining the established tendency to prioritise strong research designs that answer some questions with certainty but are unsuited to resolving many important evaluation questions, this more inclusive, deliberative process could place greater value on equivocal findings that nevertheless inform important decisions where evidence is sparse.

Intervention refinement

Within each phase of complex intervention research and on transition from one phase to another, the intervention might need to be refined, on the basis of data collected or development of programme theory. 4 The feasibility and acceptability of interventions can be improved by engaging potential intervention users to inform refinements. For example, an online physical activity planner for people with diabetes mellitus was found to be difficult to use, resulting in the tool providing incorrect personalised advice. To improve usability and the advice given, several iterations of the planner were developed on the basis of interviews and observations. This iterative process led to the refined planner demonstrating greater feasibility and accuracy. 49

Refinements should be guided by the programme theory, with acceptable boundaries agreed and specified at the beginning of each research phase, and with transparent reporting of the rationale for change. Scope for refinement might also be limited by the policy or practice context. Refinement will be rare in the evaluation phase of efficacy and effectiveness research, where interventions will ideally not change or evolve within the course of the study. However, between the phases of research, and within systems and theory based evaluation studies, refinement of the intervention in response to accumulated data, or as an adaptive and variable response to context and system change, is likely to be a desirable feature of the intervention and a key focus of the research.

Economic considerations

Economic evaluation—the comparative analysis of alternative courses of action in terms of both costs (resource use) and consequences (outcomes, effects)—should be a core component of all phases of intervention research. Early engagement of economic expertise will help identify the scope of costs and benefits to assess in order to answer questions that matter most to decision makers. 50 Broad ranging approaches such as cost benefit analysis or cost consequence analysis, which seek to capture the full range of health and non-health costs and benefits across different sectors, 51 will often be more suitable for an economic evaluation of a complex intervention than narrower approaches such as cost effectiveness or cost utility analysis. For example, evaluation of the New Orleans Intervention Model for infants entering foster care in Glasgow included short and long term economic analysis from multiple perspectives (the UK’s health service and personal social services, public sector, and wider societal perspectives); and used a range of frameworks, including cost utility and cost consequence analysis, to capture changes in the intersectoral costs and outcomes associated with child maltreatment. 52 53 The use of multiple economic evaluation frameworks provides decision makers with a comprehensive, multi-perspective guide to the cost effectiveness of the New Orleans Intervention Model.
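As a generic illustration of the comparative logic described above (not an example taken from the framework itself), cost effectiveness and cost utility analyses typically summarise the comparison of an intervention with a comparator using an incremental cost effectiveness ratio:

\[
\text{ICER} = \frac{\bar{C}_{\text{intervention}} - \bar{C}_{\text{comparator}}}{\bar{E}_{\text{intervention}} - \bar{E}_{\text{comparator}}}
\]

where \(\bar{C}\) denotes mean cost and \(\bar{E}\) mean effect (eg, quality adjusted life years in cost utility analysis), and the ratio is judged against a willingness to pay threshold. Broader cost benefit or cost consequence analyses relax this single ratio summary by reporting the full range of health and non-health costs and outcomes across sectors.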

Developing or identifying a complex intervention

Development refers to the whole process of designing and planning an intervention, from initial conception through to a feasibility, pilot, or evaluation study. Guidance on intervention development has recently been produced through the INDEX study. 4 Here, however, we highlight that complex intervention research does not always begin with new or researcher led interventions. For example:

A key source of intervention development might be an intervention that has been developed elsewhere and has the possibility of being adapted to a new context. Adaptation of existing interventions could include adapting to a new population, to a new setting, 54 55 or to target other outcomes (eg, a smoking prevention intervention being adapted to tackle substance misuse and sexual health). 20 56 57 A well developed programme theory can help identify what features of the antecedent intervention(s) need to be adapted for different applications, and the key mechanisms that should be retained even if delivered slightly differently. 54 58

Policy or practice led interventions are an important focus of evaluation research. Again, uncovering the implicit theoretical basis of an intervention and developing a programme theory is essential to identifying key uncertainties and working out how the intervention might be evaluated. This step is important, even if rollout has begun, because it supports the identification of mechanisms of change, important contextual factors, and relevant outcome measures. For example, researchers evaluating the UK soft drinks industry levy developed a bounded conceptual system map to articulate their understanding (drawing on stakeholder views and document review) of how the intervention was expected to work. This system map guided the evaluation design and helped identify data sources to support evaluation. 45 Another example is a recent analysis of the implicit theory of the NHS diabetes prevention programme, involving analysis of documentation by NHS England and four providers, showing that there was no explicit theoretical basis for the programme, and no logic model showing how the intervention was expected to work. This meant that the justification for the inclusion of intervention components was unclear. 59

Intervention identification and intervention development represent two distinct pathways of evidence generation, 60 but in both cases, the key considerations in this phase relate to the core elements described above.

Feasibility

A feasibility study should be designed to assess predefined progression criteria that relate to the evaluation design (eg, reducing uncertainty around recruitment, data collection, retention, outcomes, and analysis) or the intervention itself (eg, around optimal content and delivery, acceptability, adherence, likelihood of cost effectiveness, or capacity of providers to deliver the intervention). If the programme theory suggests that contextual or implementation factors might influence the acceptability, effectiveness, or cost effectiveness of the intervention, these questions should be considered.

Although feasibility testing has often been overlooked or rushed in the past, its value is now widely accepted and key terms and concepts are well defined. 61 62 Before initiating a feasibility study, researchers should consider conducting an evaluability assessment to determine whether and how an intervention can usefully be evaluated. Evaluability assessment involves collaboration with stakeholders to reach agreement on the expected outcomes of the intervention, the data that could be collected to assess processes and outcomes, and the options for designing the evaluation. 63 The end result is a recommendation on whether an evaluation is feasible, whether it can be carried out at a reasonable cost, and by which methods. 64

Economic modelling can be undertaken at the feasibility stage to assess the likelihood that the expected benefits of the intervention justify the costs (including the cost of further research), and to help decision makers decide whether proceeding to a full scale evaluation is worthwhile. 65 Depending on the results of the feasibility study, further work might be required to progressively refine the intervention before embarking on a full scale evaluation.
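One standard quantity from the value of information literature, offered here as a generic illustration rather than a requirement of the framework, is the expected value of perfect information (EVPI), which bounds how much further research could be worth:

\[
\text{EVPI} = \mathbb{E}_{\theta}\left[\max_{d}\, \text{NB}(d,\theta)\right] - \max_{d}\, \mathbb{E}_{\theta}\left[\text{NB}(d,\theta)\right]
\]

where \(d\) indexes the decision options, \(\theta\) the uncertain model parameters, and \(\text{NB}(d,\theta)\) the net benefit of option \(d\) given \(\theta\). If the population level EVPI is smaller than the cost of a full scale evaluation, proceeding to that evaluation is unlikely to be worthwhile on value of information grounds.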

Evaluation

The new framework defines evaluation as going beyond asking whether an intervention works (in the sense of achieving its intended outcome), to a broader range of questions including identifying what other impact it has, theorising how it works, taking account of how it interacts with the context in which it is implemented, how it contributes to system change, and how the evidence can be used to support decision making in the real world. This implies a shift from an exclusive focus on obtaining unbiased estimates of effectiveness 66 towards prioritising the usefulness of information for decision making in selecting the optimal research perspective and in prioritising answerable research questions.

A crucial aspect of evaluation design is the choice of outcome measures or evidence of change. Evaluators should work with stakeholders to assess which outcomes are most important, and how to deal with multiple outcomes in the analysis with due consideration of statistical power and transparent reporting. A sharp distinction between one primary outcome and several secondary outcomes is not necessarily appropriate, particularly where the programme theory identifies impacts across a range of domains. Where needed to support the research questions, prespecified subgroup analyses should be carried out and reported. Even where such analyses are underpowered, they should be included in the protocol because they might be useful for subsequent meta-analyses, or for developing hypotheses for testing in further research. Outcome measures could capture changes to a system rather than changes in individuals. Examples include changes in relationships within an organisation, the introduction of policies, changes in social norms, or normalisation of practice. Such system level outcomes include how changing the dynamics of one part of a system alters behaviours in other parts, such as the potential for displacement of smoking into the home after a public smoking ban.
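To make the statistical considerations above concrete, the sketch below shows one common way of handling them: inflating a sample size calculation for cluster randomisation using the design effect 1 + (m − 1) × ICC, and applying a Holm adjustment across several prespecified outcomes. It is a minimal illustration assuming Python with the statsmodels package; the effect size, intracluster correlation, cluster size, and p values are invented for the example and are not drawn from the framework.

```python
# Minimal illustration (hypothetical values) of two evaluation design calculations:
# a cluster-adjusted sample size and a Holm correction for multiple outcomes.
from statsmodels.stats.power import TTestIndPower
from statsmodels.stats.multitest import multipletests

# Sample size per arm for an individually randomised comparison:
# standardised effect size 0.3, 80% power, two sided alpha of 0.05
n_per_arm = TTestIndPower().solve_power(effect_size=0.3, alpha=0.05, power=0.8)

# Inflate for cluster randomisation using the design effect 1 + (m - 1) * ICC
cluster_size, icc = 30, 0.02                     # invented values
design_effect = 1 + (cluster_size - 1) * icc
print(round(n_per_arm), round(n_per_arm * design_effect))

# Holm adjustment across several prespecified outcomes (invented p values)
p_values = [0.01, 0.04, 0.20]
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")
print(list(zip(p_adjusted.round(3), reject)))
```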

A helpful illustration of the use of system level outcomes is the evaluation of the Delaware Young Health Program—an initiative to improve the health and wellbeing of young people in Delaware, USA. The intervention aimed to change underlying system dynamics, structures, and conditions, so the evaluation identified systems oriented research questions and methods. Three systems science methods were used: group model building and viable systems model assessment to identify underlying patterns and structures; and social network analysis to evaluate change in relationships over time. 67
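A minimal sketch of the social network analysis idea mentioned above, assuming Python with the networkx package: it compares the number and density of ties between organisations at two time points. The organisations and ties are invented for illustration and do not come from the Delaware evaluation.

```python
# Hypothetical example: measuring change in relationships (a system level outcome)
# between two time points using simple network metrics.
import networkx as nx

before = nx.Graph()
before.add_edges_from([("school", "clinic"), ("clinic", "charity")])

after = nx.Graph()
after.add_edges_from([
    ("school", "clinic"), ("clinic", "charity"),
    ("school", "charity"), ("charity", "council"), ("council", "clinic"),
])

for label, g in [("before", before), ("after", after)]:
    print(label,
          "organisations:", g.number_of_nodes(),
          "ties:", g.number_of_edges(),
          "density:", round(nx.density(g), 2))
```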

Researchers have many study designs to choose from, and different designs are optimally suited to consider different research questions and different circumstances. 68 Extensions to standard designs of randomised controlled trials (including adaptive designs, SMART trials (sequential multiple assignment randomised trials), n-of-1 trials, and hybrid effectiveness-implementation designs) are important areas of methods development to improve the efficiency of complex intervention research. 69 70 71 72 Non-randomised designs and modelling approaches might work best if a randomised design is not practical, for example, in natural experiments or systems evaluations. 5 73 74 A purely quantitative approach, using an experimental design with no additional elements such as a process evaluation, is rarely adequate for complex intervention research, where qualitative and mixed methods designs might be necessary to answer questions beyond effectiveness. In many evaluations, the nature of the intervention, the programme theory, or the priorities of stakeholders could lead to a greater focus on improving theories about how to intervene. In this view, effect estimates are inherently context bound, so that average effects are not a useful guide to decision makers working in different contexts. Contextualised understandings of how an intervention induces change might be more useful, as well as details on the most important enablers and constraints on its delivery across a range of settings. 7

Process evaluation can answer questions around fidelity and quality of implementation (eg, what is implemented and how?), mechanisms of change (eg, how does the delivered intervention produce change?), and context (eg, how does context affect implementation and outcomes?). 7 Process evaluation can help determine why an intervention fails unexpectedly or has unanticipated consequences, or why it works and how it can be optimised. Such findings can facilitate further development of the intervention programme theory. 75 In a theory based or systems evaluation, there is not necessarily such a clear distinction between process and outcome evaluation as there is in an effectiveness study. 76 These perspectives could prioritise theory building over evidence production and use case study or simulation methods to understand how outcomes or system behaviour are generated through intervention. 74 77

Implementation

Early consideration of implementation increases the potential of developing an intervention that can be widely adopted and maintained in real world settings. Implementation questions should be anticipated in the intervention programme theory and considered throughout the phases of intervention development, feasibility testing, and process and outcome evaluation. Alongside implementation specific outcomes (such as reach or uptake of services), attention to the components of the implementation strategy and to the contextual factors that support or hinder the achievement of impacts is key. Some flexibility in intervention implementation might support intervention transferability into different contexts (an important aspect of long term implementation 78 ), provided that the key functions of the programme are maintained and that the adaptations made are clearly understood. 8

In the ASSIST study, 20 a school based, peer led intervention for smoking prevention, researchers considered implementation at each phase. The intervention was developed to cause minimal disruption to school resources; the feasibility study led to refinements that improved acceptability and reach among male students; and in the evaluation (a cluster randomised controlled trial) the intervention was delivered as closely as possible to real world implementation. Drawing on the process evaluation, longer term rollout included an intervention manual identifying the critical components and those that could be adapted or dropped, allowing flexible implementation while still delivering the key mechanisms of change; a training manual for the trainers; and ongoing quality assurance built into the rollout.

In a natural experimental study, evaluation takes place during or after the implementation of the intervention in a real world context. Highly pragmatic effectiveness trials or specific hybrid effectiveness-implementation designs also combine effectiveness and implementation outcomes in one study, with the aim of reducing time for translation of research on effectiveness into routine practice. 72 79 80

Implementation questions should be included in economic considerations during the early stages of intervention and study development. How the results of economic analyses are reported and presented to decision makers can affect whether and how they act on the results. 81 A key consideration is how to deal with interventions across different sectors, where those paying for interventions and those receiving the benefits of them could differ, reducing the incentive to implement an intervention, even if shown to be beneficial and cost effective. Early engagement with appropriate stakeholders will help frame appropriate research questions and could anticipate any implementation challenges that might arise. 82

Conclusions

One of the motivations for developing this new framework was to answer calls for a change in research priorities, towards allocating greater effort and funding to research that can have the optimum impact on healthcare or population health outcomes. The framework challenges the view that unbiased estimates of effectiveness are the cardinal goal of evaluation. It asserts that improving theories and understanding how interventions contribute to change, including how they interact with their context and wider dynamic systems, is an equally important goal. For some complex intervention research problems, an efficacy or effectiveness perspective will be the optimal approach, and a randomised controlled trial will provide the best design to achieve an unbiased estimate. For others, alternative perspectives and designs might work better, or might be the only way to generate new knowledge to reduce decision maker uncertainty.

What is important for the future is that the scope of intervention research is not constrained by an unduly limited set of perspectives and approaches that might be less risky to commission and more likely to produce a clear and unbiased answer to a specific question. A bolder approach is needed—to include methods and perspectives where experience is still quite limited, but where we, supported by our workshop participants and respondents to our consultations, believe there is an urgent need to make progress. This endeavour will involve mainstreaming new methods that are not yet widely used, as well as undertaking methodological innovation and development. The deliberative and flexible approach that we encourage is intended to reduce research waste, 83 maximise usefulness for decision makers, and increase the efficiency with which complex intervention research generates knowledge that contributes to health improvement.

Monitoring the use of the framework and evaluating its acceptability and impact is important but has been lacking in the past. We encourage research funders and journal editors to support the diversity of research perspectives and methods that are advocated here and to seek evidence that the core elements are attended to in research design and conduct. We have developed a checklist to support the preparation of funding applications, research protocols, and journal publications. 9 This checklist offers one way to monitor impact of the guidance on researchers, funders, and journal editors.

We recommend that the guidance is continually updated, and future updates continue to adopt a broad, pluralist perspective. Given its wider scope, and the range of detailed guidance that is now available on specific methods and topics, we believe that the framework is best seen as meta-guidance. Further editions should be published in a fluid, web based format, and more frequently updated to incorporate new material, further case studies, and additional links to other new resources.

Acknowledgments

We thank the experts who provided input at the workshop, those who responded to the consultation, and those who provided advice and review throughout the process. The many people involved are acknowledged in the full framework document. 9 Parts of this manuscript have been reproduced (some with edits and formatting changes), with permission, from that longer framework document.

Contributors: All authors made a substantial contribution to all stages of the development of the framework—they contributed to its development, drafting, and final approval. KS and LMa led the writing of the framework, and KS wrote the first draft of this paper. PC, SAS, and LMo provided critical insights to the development of the framework and contributed to writing both the framework and this paper. KS, LMa, SAS, PC, and LMo facilitated the expert workshop, KS and LMa developed the gap analysis and led the analysis of the consultation. KAB, NC, and EM contributed the economic components to the framework. The scientific advisory group (JB, JMB, DPF, MP, JR-M, and MW) provided feedback and edits on drafts of the framework, with particular attention to process evaluation (JB), clinical research (JMB), implementation (JR-M, DPF), systems perspective (MP), theory based perspective (JR-M), and population health (MW). LMo is senior author. KS and LMo are the guarantors of this work and accept the full responsibility for the finished article. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting authorship criteria have been omitted.

Funding: The work was funded by the National Institute for Health Research (Department of Health and Social Care 73514) and Medical Research Council (MRC). Additional time on the study was funded by grants from the MRC for KS (MC_UU_12017/11, MC_UU_00022/3), LMa, SAS, and LMo (MC_UU_12017/14, MC_UU_00022/1); PC (MC_UU_12017/15, MC_UU_00022/2); and MW (MC_UU_12015/6 and MC_UU_00006/7). Additional time on the study was also funded by grants from the Chief Scientist Office of the Scottish Government Health Directorates for KS (SPHSU11 and SPHSU18); LMa, SAS, and LMo (SPHSU14 and SPHSU16); and PC (SPHSU13 and SPHSU15). KS and SAS were also supported by an MRC Strategic Award (MC_PC_13027). JMB received funding from the NIHR Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol and by the MRC ConDuCT-II Hub (Collaboration and innovation for Difficult and Complex randomised controlled Trials In Invasive procedures - MR/K025643/1). DF is funded in part by the NIHR Manchester Biomedical Research Centre (IS-BRC-1215-20007) and NIHR Applied Research Collaboration - Greater Manchester (NIHR200174). MP is funded in part as director of the NIHR’s Public Health Policy Research Unit. This project was overseen by a scientific advisory group that comprised representatives of NIHR research programmes, of the MRC/NIHR Methodology Research Programme Panel, of key MRC population health research investments, and authors of the 2006 guidance. A prospectively agreed protocol, outlining the workplan, was agreed with MRC and NIHR, and signed off by the scientific advisory group. The framework was reviewed and approved by the MRC/NIHR Methodology Research Programme Advisory Group and MRC Population Health Sciences Group and completed NIHR HTA Monograph editorial and peer review processes.

Competing interests: All authors have completed the ICMJE uniform disclosure form at http://www.icmje.org/coi_disclosure.pdf and declare: support from the NIHR, MRC, and the funders listed above for the submitted work; KS has project grant funding from the Scottish Government Chief Scientist Office; SAS is a former member of the NIHR Health Technology Assessment Clinical Evaluation and Trials Programme Panel (November 2016 - November 2020) and member of the Chief Scientist Office Health HIPS Committee (since 2018) and NIHR Policy Research Programme (since November 2019), and has project grant funding from the Economic and Social Research Council, MRC, and NIHR; LMo is a former member of the MRC-NIHR Methodology Research Programme Panel (2015-19) and MRC Population Health Sciences Group (2015-20); JB is a member of the NIHR Public Health Research Funding Committee (since May 2019), and a core member (since 2016) and vice chairperson (since 2018) of a public health advisory committee of the National Institute for Health and Care Excellence; JMB is a former member of the NIHR Clinical Trials Unit Standing Advisory Committee (2015-19); DPF is a former member of the NIHR Public Health Research programme research funding board (2015-2019), the MRC-NIHR Methodology Research Programme panel member (2014-2018), and is a panel member of the Research Excellence Framework 2021, subpanel 2 (public health, health services, and primary care; November 2020 - February 2022), and has grant funding from the European Commission, NIHR, MRC, Natural Environment Research Council, Prevent Breast Cancer, Breast Cancer Now, Greater Sport, Manchester University NHS Foundation Trust, Christie Hospital NHS Trust, and BXS GP; EM is a member of the NIHR Public Health Research funding board; MP has grant funding from the MRC, UK Prevention Research Partnership, and NIHR; JR-M is programme director and chairperson of the NIHR’s Health Services Delivery Research Programme (since 2014) and member of the NIHR Strategy Board (since 2014); MW received a salary as director of the NIHR PHR Programme (2014-20), has grant funding from NIHR, and is a former member of the MRC’s Population Health Sciences Strategic Committee (July 2014 to June 2020). There are no other relationships or activities that could appear to have influenced the submitted work.

Patient and public involvement: This project was methodological; views of patients and the public were included at the open consultation stage of the update. The open consultation, involving access to an initial draft, was promoted to our networks via email and digital channels, such as our unit Twitter account ( @theSPHSU ). We received five responses from people who identified as service users (rather than researchers or professionals in a relevant capacity). Their input included helpful feedback on the main complexity diagram, the different research perspectives, the challenge of moving interventions between different contexts and overall readability and accessibility of the document. Several respondents also highlighted useful signposts to include for readers. Various dissemination events are planned, but as this project is methodological we will not specifically disseminate to patients and the public beyond the planned dissemination activities.

Provenance and peer review: Not commissioned; externally peer reviewed.

This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/ .



Original research

Evaluating evaluation frameworks: a scoping review of frameworks for assessing health apps

Sarah Lagan

Division of Digital Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA

Lev Sandler

John Torous

Associated data

bmjopen-2020-047001supp001.pdf

bmjopen-2020-047001supp002.pdf

Objectives

Despite an estimated 300 000 mobile health apps on the market, there remains no consensus around helping patients and clinicians select safe and effective apps. In 2018, our team drew on existing evaluation frameworks to identify salient categories and create a new framework endorsed by the American Psychiatric Association (APA). We have since created a more expanded and operational framework, the Mhealth Index and Navigation Database (MIND), which aligns with the APA categories but comprises 105 objective and auditable questions. We sought to survey the existing space by reviewing all mobile health app evaluation frameworks published since 2018, and to demonstrate the comprehensiveness of this new model by comparing it with existing and emerging frameworks.

Design

We conducted a scoping review of mobile health app evaluation frameworks.

Data sources

References were identified through searches of PubMed, EMBASE and PsychINFO with publication date between January 2018 and October 2020.

Eligibility criteria

Papers were selected for inclusion if they met the predetermined eligibility criteria: presenting an evaluation framework for mobile health apps with patient, clinician, or end user facing questions.

Data extraction and synthesis

Two reviewers screened the literature separately and applied the inclusion criteria. The data extracted from the papers included authors and date of publication, source affiliation, country of origin, name of framework, study design, description of framework, intended audience or user, and framework scoring system. We then compiled a collection of more than 1701 questions across 79 frameworks. We compared and grouped these questions using the MIND framework as a reference. We sought to identify the most common domains of evaluation while assessing the comprehensiveness and flexibility of MIND, as well as any potential gaps.
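As an illustration of how such an extraction and mapping can be summarised (a hypothetical sketch assuming Python with pandas, not the actual analysis code), each extracted question can be tagged with the MIND category it maps to and then aggregated by framework and category:

```python
# Hypothetical rows standing in for the real extraction of questions from frameworks.
import pandas as pd

questions = pd.DataFrame([
    {"framework": "Framework A", "question": "Is a privacy policy available?",
     "mind_category": "Privacy and Security"},
    {"framework": "Framework A", "question": "Is there peer reviewed evidence?",
     "mind_category": "Clinical Foundation"},
    {"framework": "Framework B", "question": "Can data be exported?",
     "mind_category": "Interoperability and Data Sharing"},
])

# How many frameworks contain at least one question in each MIND category
coverage = questions.groupby("mind_category")["framework"].nunique()
print(coverage)

# Proportion of the six MIND categories that each framework overlaps with
n_categories = 6
overlap = questions.groupby("framework")["mind_category"].nunique() / n_categories
print(overlap)
```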

Results

New app evaluation frameworks continue to emerge and expand. Since our 2019 review of the app evaluation framework space, more frameworks include questions around privacy (43 frameworks) and clinical foundation (57 frameworks), reflecting an increased focus on issues of app security and evidence base. The majority of mapped frameworks overlapped with at least half of the MIND categories. The results of this search have informed a database ( apps.digitalpsych.org ) that users can access today.

Conclusions

As the number of app evaluation frameworks continues to rise, it is becoming difficult for users both to select an appropriate evaluation tool and to find an appropriate health app. This review compares what different app evaluation frameworks offer, identifies where the field is converging, and highlights new priorities for improving clinical guidance.

Strengths and limitations of this study

  • This scoping review is the largest and most up to date review and comparison of mobile health app evaluation frameworks.
  • The analysis highlighted the flexibility and comprehensiveness, across diverse contexts, of the Mhealth Index and Navigation Database (MIND) framework, which was used as the reference framework in this review.
  • MIND was initially tailored to mental health and thus does not encompass thorough disease-specific criteria for other conditions such as asthma, diabetes and sickle cell anaemia—though such questions may be easily integrated.
  • Subjective questions, especially those around ease of use and visual appeal, are difficult to standardise but may be among the most important features driving user engagement with mental health apps.

Introduction

The past 5 years have seen a proliferation of both mobile health apps and proposed tools to rate such apps. While these digital health tools hold great potential, concerns around privacy, efficacy and credibility, coupled with a lack of strict oversight by governing bodies, have highlighted a need for frameworks that can help guide clinicians and consumers to make informed app choices. Although the US Food and Drug Administration has recognised the issue and is piloting a precertification programme that would prioritise app safety at the developer level, 1 this model is still in pilot stages and there has yet to be an international consensus around standards for health apps, resulting in a profusion of proposed frameworks across governments, academic institutions and commercial interests.

In 2018, our team drew on existing evaluation frameworks to identify salient categories from existing rating schemes and create a new framework. 2 The American Psychiatric Association’s (APA) App Evaluation Model was developed by harmonising questions from 45 evaluation frameworks and selecting 38 total questions that mapped to five categories: background information, privacy and security, clinical foundation, ease of use and interoperability. This APA model subsequently has been used by many diverse stakeholders given its flexibility in guiding informed decision-making. 3–7 However, the flexibility of the model also created a demand for a more applied approach that offered users more concrete information instead of placing the onus entirely on a clinician or provider.

Thus, since the framework’s development, the initial 38 questions have been operationalised into 105 new objective questions that invite a binary (yes/no) or numeric response from a rater. 8 These questions align with the categories proposed by the APA model but are more extensive and objective, with, for example, ‘app engagement’ operationalised into 11 different engagement styles to select from. These 105 questions are sorted into six categories (App Origin and Functionality, Inputs and Outputs, Privacy and Security, Clinical Foundation, Features and Engagement, Interoperability and Data Sharing) and are intended to be answerable by any trained rater—clinician, peer, end user—and inform the public-facing Mhealth Index and Navigation Database (MIND), where users can view app attributes and compare ratings (see figure 1 below). MIND, thus, constitutes a new framework based on the APA model, with an accompanying public-facing database.

Figure 1. A screenshot of MIND highlighting several of the app evaluation questions (green boxes) and the ability to access more. MIND, Mhealth Index and Navigation Database.

Recent systematic reviews have illustrated the growing number of evaluation tools for digital health devices, including mobile health apps. 9–11 Given the rapidly evolving health app space and the need to understand what aspects are considered in evaluation frameworks, we have sought to survey the landscape of existing frameworks. Our goal was to compare the categories and questions composing other frameworks to (1) identify common elements between them, (2) identify whether gaps in evaluation frameworks have narrowed since 2018 and (3) assess how reflective our team’s MIND framework is of the current landscape. We, thus, aimed to map every question from the 2018 review, as well as questions from new app evaluation frameworks that have emerged since, using the questions of MIND as a reference. While informing our own efforts around MIND, the results of this review offer broad relevance across all of digital health, as understanding the current state of app evaluation helps inform how any new app may be assessed, categorised, judged and adopted.

Patient and public involvement

Like the APA model, MIND shifts the app evaluation process away from finding one ‘best’ app and instead guides users towards an informed decision based on selecting and placing value on the clinically relevant criteria that account for the needs and preferences of each patient and case. Questions were created with input from clinicians, patients, family members, researchers and policy-makers. The goal is not for a patient or clinician to consider all 105 questions but rather to access a subset of questions that appears most appropriate for the use case at hand. Thus, thanks to its composition of discrete questions that aim to be objective and reproducible, MIND offers a useful tool for comparing evaluation frameworks. It also offers an actionable resource for any user anywhere in the world to engage with app evaluation, providing tangible results in the often more theoretical world of app evaluation.

We followed a three-step process in order to identify and compare frameworks to MIND. This process included (1) assembling all existing frameworks for mobile medical applications, (2) separating each framework into the discrete evaluation questions comprising it and (3) mapping all questions to the 105 MIND framework questions as a reference.

Search strategy and selection criteria

We started with the 45 frameworks identified in the 2018 review by Moshi et al 9 and included 34 frameworks that have emerged since our initial analysis of the space, which was conducted in 2018 and published in 2019. 2 To accomplish this, we conducted an adapted scoping review based on the Moshi criteria to identify recent frameworks. Although MIND focuses on mental health apps, its considerations and categories are transferable to health apps more broadly, and, thus, there was no mental health specification in the search terms.

References were identified through searches of PubMed, EMBASE and PsycINFO with the search terms ((mobile application) OR (smartphone app)) AND ((framework) OR (criteria) OR (rating)) and publication dates between January 2018 and October 2020. We also identified records beyond the database search by seeking frameworks mentioned in subsequent and recent reviews 5 12 13 and surveying the grey literature and government websites. Papers were selected for inclusion if they met the predetermined eligibility criteria: presenting an evaluation framework for mobile health apps with patient-, clinician- or end user-facing questions. Two reviewers (SL and JT) screened the literature separately and applied the inclusion criteria. The data extracted from the papers included: author and date of publication, source affiliation, country of origin, name of framework, study design, description of framework, intended audience/user and framework scoring system. Articles were excluded if they described the evaluation of a single app, did not present a new framework (instead conducting a review of the space or relying on a previous framework), focused on developers instead of clinicians or end users, focused on implementation rather than evaluation, were not frameworks for health apps, or were satisfaction surveys rather than evaluation frameworks. The data selection process is outlined in figure 2.

Figure 2. Framework identification through database searches (PubMed, EMBASE, PsycINFO) and other sources (reviews since 2018, grey literature, government websites).
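For readers who want to reproduce or update the database arm of the search described above, the sketch below shows one way the PubMed query and date window could be run programmatically. It is a minimal illustration only, not the authors' actual procedure; it assumes the Biopython Entrez client, the email address and retmax value are placeholders, and EMBASE and PsycINFO would need their own interfaces.

```python
# Sketch: run the PubMed arm of the review's search via NCBI E-utilities.
# Assumes Biopython is installed; email and retmax are placeholders.
from Bio import Entrez

Entrez.email = "reviewer@example.org"  # placeholder; NCBI asks for a contact address

# Search terms as reported in the review, with the 2018-2020 publication window
query = "((mobile application) OR (smartphone app)) AND ((framework) OR (criteria) OR (rating))"

handle = Entrez.esearch(
    db="pubmed",
    term=query,
    datetype="pdat",        # filter on publication date
    mindate="2018/01/01",
    maxdate="2020/10/31",
    retmax=500,             # placeholder cap on the number of record IDs returned
)
result = Entrez.read(handle)
handle.close()

print(result["Count"], "matching records;", len(result["IdList"]), "IDs retrieved")
```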

The 34 frameworks identified in the search were combined with the 45 frameworks from the 2018 review for a total of 79 frameworks for consideration. To our knowledge, this list comprehensively reflects the state of the field at the time of assembly. However, we do not claim it to be exhaustive, as frameworks are constantly changing, emerging and sunsetting, with no central repository. The final list of frameworks assembled can be found in online supplemental appendix 1 .


Each resulting framework was reviewed and compiled into a complete list of its unique questions. The 79 frameworks yielded 1701 questions in total. Several of the original 45 frameworks focused exclusively on in-depth privacy considerations (evaluating privacy and security practices rather than the app itself), 14 and after eliminating these checklists that did not facilitate app evaluation by a clinician or end user, 70 frameworks were mapped in their entirety to the MIND framework.

In mapping questions, discussion was sometimes necessary as not every question was an exact, word-for-word match. The authors, thus, used discretion when it came to matching questions to MIND and discussed each decision to confirm mapping placement. Two raters (SL, LS) agreed on mapping placement, and disputes were brought to a third reviewer (JT) for final consideration. ‘Is data portable and interoperable?’, 15 for example, would be mapped to the question ‘can you email or export your data?’ ‘Connectivity’ 16 was mapped to ‘Does the app work offline?’ and ‘Is the arrangement and size of buttons/content on the screen zoomable if needed’ 17 was mapped to ‘is there at least one accessibility feature?’ Questions about suitability for the ‘target audience’ were mapped to the ‘patient-facing’ question in MIND.
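As a concrete illustration of this dual-rater mapping and adjudication step, the sketch below records each framework question alongside the MIND items chosen by the two raters and any third-reviewer decision. It is a hypothetical bookkeeping structure, not the authors' tooling; the example rows reuse question wordings quoted above purely for illustration.

```python
# Sketch of recording dual-rater mappings with third-reviewer adjudication.
from dataclasses import dataclass
from typing import Optional

@dataclass
class MappingDecision:
    framework: str
    question: str
    rater_1: Optional[str]             # MIND item chosen by rater 1 (None = no match)
    rater_2: Optional[str]             # MIND item chosen by rater 2
    adjudicated: Optional[str] = None  # third reviewer's call, if the raters disagreed

    def final_mapping(self) -> Optional[str]:
        # Agreement between the two raters stands; disagreements fall back
        # to the adjudicated decision.
        if self.rater_1 == self.rater_2:
            return self.rater_1
        return self.adjudicated

decisions = [
    MappingDecision("Framework A", "Is data portable and interoperable?",
                    "Can you email or export your data?",
                    "Can you email or export your data?"),
    MappingDecision("Framework B", "Connectivity",
                    "Does the app work offline?", None,
                    adjudicated="Does the app work offline?"),
]

for d in decisions:
    print(d.framework, "->", d.final_mapping())
```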

Framework type

The aim of this review was to identify and compare mobile health app rating frameworks, assessing overlap and exploring changes and gaps relative to both previous reviews and the MIND framework. Of the 70 frameworks ultimately assessed and mapped, the majority, 39 (55.7%), offered models for evaluating mobile health apps broadly. Seven (10%) considered mental health apps, while six (8.5%) focused on apps for diabetes management. Other evaluation foci included apps for asthma, autism, concussions, COVID-19, dermatology, eating disorders, heart failure, HIV, pain management, infertility and sickle cell disease ( table 1 ).

Table 1. Number of disease-specific and general app evaluation frameworks, with general mobile health frameworks constituting more than half of identified frameworks

We mapped questions from 70 app evaluation frameworks against the six categories and 105 questions of MIND (see online supplemental appendix 2 ). We examined the number of frameworks that addressed each specific MIND category and identified areas of evaluation that are not addressed by MIND. Through the mapping process, we were able to gauge the most common questions and categories across different app evaluation frameworks.

We sorted the questions into MIND’s six categories—App Origin & Functionality, Inputs & Outputs, Privacy & Security, Evidence & Clinical Foundation, Features & Engagement Style and Interoperability & Data Sharing—in order to assess the most common broad areas of consideration. Across frameworks, the most common considerations were around privacy/security and clinical foundation, with 43 frameworks posing at least one question around the app’s privacy protections and 57 frameworks containing at least one question to evaluate evidence base or clinical foundation, as denoted in table 2. Fifty-nine frameworks covered at least two of the MIND categories, and the majority of frameworks overlapped with at least four of the MIND categories.

Table 2. The questions from all frameworks were mapped to the reference framework (MIND) and sorted into its six categories; this table denotes how many frameworks had questions that could be sorted into each category

MIND, Mhealth Index and Navigation Database.
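The category-coverage tally behind table 2 amounts to counting each framework at most once per MIND category in which it has at least one mapped question. The sketch below shows one way such a tally could be computed; the framework names and category assignments are illustrative placeholders, not the review's actual data.

```python
# Sketch: count how many frameworks address each MIND category.
from collections import Counter

# framework name -> set of MIND categories its mapped questions fall into (placeholders)
mapped = {
    "Framework A": {"Privacy & Security", "Evidence & Clinical Foundation"},
    "Framework B": {"Evidence & Clinical Foundation", "Features & Engagement Style"},
    "Framework C": {"Privacy & Security"},
}

coverage = Counter()
for categories in mapped.values():
    coverage.update(categories)   # each framework counted at most once per category

for category, n_frameworks in coverage.most_common():
    print(f"{category}: {n_frameworks} frameworks")
```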

We then took a more granular look at the questions from each of the 70 frameworks, matching questions one by one to questions of the MIND framework when possible. At the individual question level, specific questions about the presence of a privacy policy, security measures in place, supporting studies and patient-facing (or target population) tools were the most prevalent, represented in 20, 25, 27 and 28 frameworks, respectively. Each of the 70 frameworks had at least one question that mapped to MIND. The most common questions, sorted into their respective categories, are depicted in figure 3 and table 3, while the full list of mapped questions can be found in online supplemental appendix 2.

Figure 3. The most commonly addressed questions, grouped within the categories of MIND. The blue triangle constitutes MIND and its six main categories, while the green trapezoid represents questions pertaining to usability or ease of use, which are not covered by MIND. MIND, Mhealth Index and Navigation Database.

Table 3. Commonly addressed questions among those that could be mapped to the MIND reference framework (blue), and those that could not (green)

HIPAA, Health Insurance Portability and Accountability Act; MIND, Mhealth Index and Navigation Database.

Every question was examined, but not every question in every framework could be matched to a corresponding question in MIND, and some questions fell outside the six categories. For example, 18 frameworks continue to present the subjective question ‘is the app easy to use’, the answer to which will vary depending on the person and use case. MIND also omits objective questions to which answers are not readily available, such as ‘How were target users involved in the initial design and usability evaluations of the app?’ 18 While questions such as this are of high importance, the lack of easily accessible answers limits their present utility for app evaluation. Furthermore, some questions, such as those on economic analysis, were covered by other frameworks but not by MIND and present a similar dilemma, in that the actual data on which to base an evaluation are often lacking. Aside from subjective questions, other pronounced absences from MIND were questions about customisability (addressed by seven other frameworks) and advertising (nine frameworks). Although MIND addresses customisability in part by encouraging raters to consider accessibility features (and some frameworks ask about the ability to customise in conjunction with accessibility features 19 ), MIND neither poses a question about the user’s ability to tailor or customise app content nor asks about the presence of advertisements in an app. Other questions unaddressed by MIND concern the user’s ability to contact the producer or developer to seek guidance about app use; variations of this question include ‘is there a way to feedback user comments to the app developer?’ MIND also does not pose any questions regarding in-app instructions or the existence of a user guide. 20 Finally, it does not ask about the speed of app functionality; one variant of this question asks, ‘is the app fast and easy to use in clinical settings?’ 15 Figure 3 and table 3 present additional details on categories and questions both inside and outside the MIND reference framework.

As mobile health apps have proliferated, choosing the right one has become increasingly challenging for patients and clinicians alike. While app evaluation frameworks can help sort through the myriad of mobile health apps, the growing number of frameworks further complicates the process of evaluation. Our review examined the largest number of evaluation frameworks to date, with the goal of assessing their unique characteristics and gaps as well as their overlap with the 105 questions in MIND. We identified frameworks for evaluating a wide range of mobile health apps—some focused on mobile health generally, others addressing specific disease domains such as asthma, heart failure, mental health or pain management.

Despite the different disease conditions they addressed, there was substantial overlap among the frameworks, especially around clinical foundation and privacy and security. The most common category addressed was clinical foundation, with 57 of the evaluation frameworks posing at least one question regarding evidence base. More than half of the frameworks also addressed privacy and/or security and app functionality or origin.

The widespread focus on clinical foundation and privacy represents a major change in the space since 2018, when our team analysed 45 health app evaluation frameworks in an initial review and found that the most common category of consideration among the different frameworks was usability, with short-term usability heavily overrepresented compared with privacy and evidence base. In that 2018 review, there were 93 unique questions corresponding to short-term usability but only 10 to the presence of a privacy policy. Although many frameworks continue to consider usability, our current review suggests the most common questions across frameworks now concern evidence, clinical foundation and privacy. This shift may reflect an increased recognition of the privacy dangers some apps may pose.

This review illustrates the challenges in conceiving a comprehensive evaluation model. A continued concern in mobile health apps is engagement, 6 and it is unclear whether any framework adequately predicts engagement. Another persistent challenge is striking a balance between transparency/objectivity and subjectivity. Questions that prompt consideration of subjective user experiences may limit the generalisability and standardisation of a framework, as the questions inherently reflect the experience of the rater. An app’s ease of use, for example, will differ significantly depending on an individual’s level of comfort and experience with technology. However, subjective questions around user friendliness, visual appeal and interface design may be of greatest concern to an app user, and most predictive of engagement with an app. 21 Finally, a thorough assessment of an app is only feasible if information about the app is available. For example, some questions with clinical significance, such as the consideration of how peers or target users may be involved in app development, are not easily answerable by a health app consumer. Overall, there is a need for more data and transparency when it comes to health apps. App evaluation frameworks, while thorough, rigorous and tailored to clinical app use, can only go so far without transparency on the part of app developers. 22

The analysis additionally highlighted the flexibility and comprehensiveness, across diverse contexts, of the MIND framework, which was used as the reference framework in this review. The MIND categories are inclusive of a wide range of frameworks and questions. Even without including any subjective questions in the mapping process, each of the 70 frameworks that were ultimately mapped had some overlap with MIND, and many of the 1701 questions ultimately included mapped exactly to a MIND question. Although MIND was initially conceptualised as an evaluation tool specifically for mental health apps, the coherence between MIND and diverse types of app evaluation frameworks, such as those for concussion, 23 heart disease 24 and sickle cell anaemia, 25 demonstrates how the MIND categories can encompass many health domains. Condition-specific questions, for example, are a good fit for the ‘Features & Engagement’ category of MIND.

The results of our analysis suggest that, while numerous new app evaluation frameworks continue to emerge, a natural standard of common questions is appearing across them. While different use cases and medical subspecialties will require unique questions to evaluate apps, there is a set of common questions, around aspects such as privacy and level of evidence, that is more universal. MIND appears to cover a large subset of these questions and, thus, may offer a useful starting point for new efforts as well as a means to consolidate existing efforts. An advantage of the more objective approach offered by MIND is that it can be represented as a research database to facilitate discovery of apps while not conflicting with local needs, personal preferences or cultural priorities. 26

Limitations

Our work is not the first to compare app evaluation frameworks. Recently, several reviews have compared how different mobile health app evaluation models address privacy, 11 12 14 and another database ( https://search.appcensus.io/ ) focuses exclusively on compiling privacy assessments of Android apps. We chose to exclude app evaluation frameworks that focused exclusively on in-depth privacy considerations and were unusable by a clinician or layperson, as our goal was more comprehensive app evaluation. This decision does not reject considerations of privacy and security, which are of critical importance, but rather narrows the focus to frameworks that are usable in the hands of the public today and can be used to inform clinical decisions. In addition, MIND was initially tailored to mental health and thus does not encompass thorough disease-specific criteria for other conditions such as asthma, diabetes and sickle cell anaemia—though such questions may be easily integrated. Finally, subjective questions, especially those around ease of use and visual appeal, are difficult to standardise but may be among the most important features driving user engagement with mental health apps. 21

Our work demonstrates the expansion of app evaluation frameworks. By illustrating how MIND overlaps with many of these existing and emerging frameworks, we suggest a practical need for consolidation. Although disease-specific mobile health apps require specialised evaluation questions, concerns around accessibility, privacy, clinical foundation and interoperability are nonspecific. If the full potential of digital health is to be realised, there is a need for increased collaboration among industry, government and academia to ensure that the highest quality digital health tools reach the public. We emphasise that this effort is just a first step and highlight the need for continued interdisciplinary communication among diverse digital health stakeholders in order to best serve the public.


Contributors: SL and JT designed the procedure. SL and JT screened articles for eligibility. SL and LS compiled and mapped questions from frameworks. SL and JT composed the manuscript.

Funding: This work was supported by a gift from the Argosy Foundation.

Competing interests: None declared.

Patient consent for publication: Not required.

Provenance and peer review: Not commissioned; externally peer reviewed.

Data availability statement: Additional data are presented in Appendices A and B.

Supplemental material: This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

Open access | Published: 30 April 2020

Ten recommendations for using implementation frameworks in research and practice

  • Joanna C. Moullin 1 , 2 ,
  • Kelsey S. Dickson 2 , 3 ,
  • Nicole A. Stadnick 2 , 4 , 5 ,
  • Bianca Albers 6 , 7 ,
  • Per Nilsen 8 ,
  • Sarabeth Broder-Fingert 9 ,
  • Barbara Mukasa 10 &
  • Gregory A. Aarons 2 , 4 , 5  

Implementation Science Communications volume 1, Article number: 42 (2020)


Recent reviews of the use and application of implementation frameworks in implementation efforts highlight the limited use of frameworks, despite the value in doing so. As such, this article aims to provide recommendations to enhance the application of implementation frameworks, for implementation researchers, intermediaries, and practitioners.

Ideally, an implementation framework, or multiple frameworks, should be used prior to and throughout an implementation effort. This applies both to implementation science research studies and to real-world implementation projects. To guide this application, we outline ten recommendations for using implementation frameworks across the implementation process. The recommendations are written in the rough chronological order of an implementation effort; however, we understand this order may vary depending on the project or context: (1) select a suitable framework(s), (2) establish and maintain community stakeholder engagement and partnerships, (3) define the issue and develop research or evaluation questions and hypotheses, (4) develop an implementation mechanistic process model or logic model, (5) select research and evaluation methods, (6) determine implementation factors/determinants, (7) select and tailor, or develop, implementation strategy(s), (8) specify implementation outcomes and evaluate implementation, (9) use a framework(s) at the micro level to conduct and tailor implementation, and (10) write the proposal and report. Ideally, a framework(s) would be applied to each of the recommendations. For this article, we begin by discussing each recommendation within the context of frameworks broadly, followed by specific examples using the Exploration, Preparation, Implementation, Sustainment (EPIS) framework.

The use of conceptual and theoretical frameworks provides a foundation from which generalizable implementation knowledge can be advanced. Conversely, superficial use of frameworks hinders the ability to use them, learn from them, and work sequentially to progress the field. With the ten recommendations provided, we hope to assist researchers, intermediaries, and practitioners in improving their use of implementation science frameworks.


Contributions to the literature

Provision of recommendations and concrete approaches to enhance the use of implementation science frameworks, models, and theories by researchers, intermediaries, and practitioners

Increase the ability of implementation researchers to produce generalizable implementation knowledge through comprehensive application of implementation frameworks, models, and theories

Increase implementation intermediaries’ and practitioners’ ability to use implementation frameworks as a shared language to familiarize stakeholders with implementation and as practical tools for planning, executing, and evaluating real-world implementation efforts

Provision of a worksheet to assist the application of our recommendations for comprehensive framework use

Provision of a checklist to assist in reviewing ways in which the selected framework(s) are used

There is great value in effectively using implementation frameworks, models, and theories [ 1 , 2 ]. When used in research, they can guide the design and conduct of studies, inform the theoretical and empirical thinking of research teams, and aid interpretation of findings. For intermediaries and practitioners, they can provide shared language to familiarize stakeholders with implementation and function as practical tools for planning, executing, and evaluating real-world implementation efforts. Implementation frameworks, models, and theories have proliferated, and there are concerns that they are not used optimally to substantiate or advance implementation science and practice.

Theories are generally specific and predictive, with directional relationships between concepts, making them suitable for hypothesis testing as they may guide what may or may not work [ 3 ]. Models are also specific in scope; however, they are more often prescriptive, for example, delineating a series of steps. Frameworks, on the other hand, tend to organize, explain, or describe information and the range of and relationships between concepts, including some which delineate processes, and therefore are useful for communication. While we acknowledge the need for greater use of implementation frameworks, models, and potentially even more so theories, we use the term frameworks to encompass the broadest organizing structure.

Suboptimal use of frameworks can impact the viability and success of implementation efforts [ 4 ]. This can result in wasted resources, erroneous conclusions, specification errors in implementation methods and data analyses, and attenuated reviews of funding applications [ 5 ]. There can be a lack of theory or poorly articulated assumptions (i.e., program theory/logic model) guiding which constructs or processes are involved, operationalized, measured, and analyzed. While guidance for effective grant applications [ 4 ] and standards for evaluating implementation science proposals exist [ 6 ], poor use of frameworks goes beyond proposals and projects and can slow or misguide the progress of implementation science as a field. Consistent terms and constructs aid communication and synthesis of findings and are therefore key to replication and to building the evidence base. In real-world practice, the suboptimal use of implementation frameworks can lead stakeholders to misjudge their implementation context or develop inappropriate implementation strategies. Just as important, poor use of frameworks can slow the translation of research evidence into practice, and thereby limit public health impact.

Frameworks are graphical or narrative representations of the factors, concepts, or variables of a phenomenon [ 3 ]. In the case of implementation science, the phenomenon of interest is implementation. Implementation frameworks can provide a structure for the following: (1) describing and/or guiding the process of translating effective interventions and research evidence into practice (process frameworks), (2) analyzing what influences implementation outcomes (determinant frameworks), and (3) evaluating implementation efforts (outcome frameworks) [ 2 ]. Concepts within implementation frameworks may therefore include the following: the implementation process, often delineated into a series of phases; factors influencing the implementation process, frequently referred to as determinants or barriers and facilitators/enablers; implementation strategies to guide the implementation process; and implementation outcomes. The breadth and depth to which the concepts are described within frameworks vary [ 7 ].

Recent analyses of implementation science studies show suboptimal use of implementation frameworks [ 1 , 8 ]. Suboptimal use of a framework is where it is applied conceptually, but not operationalized or incorporated throughout the phases of an implementation effort, such as limited use to guide research methods [ 1 , 9 ]. While there is some published guidance on the use of specific frameworks such as the Theoretical Domains Framework (TDF) [ 10 ], RE-AIM [ 11 ], the Consolidated Framework for Implementation Research (CFIR) [ 12 ], the Exploration, Preparation, Implementation, Sustainment (EPIS) framework [ 1 ], and combined frameworks [ 13 ], there is a need for explicit guidance on the use of frameworks generally. As such, this article provides recommendations and concrete approaches to enhance the use of implementation science frameworks by researchers, intermediaries, and practitioners.

Recommendations for using implementation framework(s)

Ideally, implementation frameworks are used prior to and throughout an implementation effort, which includes both implementation research and real-world implementation projects. Below, we present ten recommendations for the use of implementation frameworks, in the rough chronological order of an implementation effort. The sequence is not prescriptive, to accommodate flexibility in project design and objectives; the order of recommendations one to three in particular may vary, or they may occur concurrently. The key is that all recommendations are considered and that, ideally, a framework(s) would be applied to each recommendation. This may mean one framework is used across all recommendations or multiple frameworks are employed. We recognize that this may be unrealistic when working under real-world resource constraints and that strategic selection of frameworks may instead be necessary (e.g., based on the greatest needs or strongest preferences of stakeholders).

Depending on the stage in the implementation process, it may not be necessary to apply all the recommendations. The full list is suitable for implementation efforts that will progress at least to the implementation stage, whereby implementation strategies are being employed. However, for those who are early in the exploration phase of implementation or perhaps at the point of trying to establish implementation determinants, they may not be able to produce process or logic models or articulate mechanisms yet. This does not mean a framework is not very informative, but the order of the recommendations would vary and the full list may only be applicable as the implementation project progresses in future work.

We begin by discussing each recommendation within the context of frameworks broadly, followed by specific examples using the EPIS framework. The EPIS framework acknowledges the dynamic nature of implementation by defining important outer context, inner context, bridging, and innovation factors that influence or are influenced by an implementation effort throughout the phases of implementation. These applied examples are based on the results of a recent systematic review [ 1 ], and the collective experience of the co-authors applying the EPIS framework in national and international implementation efforts. In addition, we provide two tools that summarize each recommendation along with key questions to consider for optimal framework application within research, evaluation, and practice projects (Additional files 1 and 2 ).
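To make the idea of applying a framework(s) to each recommendation concrete, the sketch below tracks, for a single project, which of the ten recommendations a chosen framework has been applied to. It is a minimal illustration only, not the authors' worksheet or checklist in Additional files 1 and 2, and the status notes are hypothetical placeholders.

```python
# Sketch: track framework application against the ten recommendations for one project.
RECOMMENDATIONS = [
    "Select a suitable framework(s)",
    "Establish and maintain community stakeholder engagement and partnerships",
    "Define issue and develop research or evaluation questions and hypotheses",
    "Develop an implementation mechanistic process model or logic model",
    "Select research and evaluation methods",
    "Determine implementation factors/determinants",
    "Select and tailor, or develop, implementation strategy(s)",
    "Specify implementation outcomes and evaluate implementation",
    "Use a framework(s) at micro level to conduct and tailor implementation",
    "Write the proposal and report",
]

# Project-specific notes on how the selected framework(s) address each recommendation
# (placeholder entries for illustration only).
status = {rec: "not yet considered" for rec in RECOMMENDATIONS}
status[RECOMMENDATIONS[0]] = "EPIS selected as guiding framework"
status[RECOMMENDATIONS[1]] = "community advisory board convened"

for number, rec in enumerate(RECOMMENDATIONS, start=1):
    print(f"{number}. {rec}: {status[rec]}")
```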

To ensure that the recommendations are clear, practical, and comprehensive, we invited an international stakeholder panel with different perspectives (e.g., researcher, NGO administrator, intermediary, provider/physician) to review the recommendations and consider their utility when applied to their own implementation efforts. Our four-member panel included at least one stakeholder from each target audience for this article, including implementation researchers, whose work spans diverse contexts, populations, and academic disciplines; evidence-based practice (EBP); intermediaries; and practitioners. Stakeholders reported extensive applied and training experience using multiple frameworks (e.g., CFIR and the Capability, Opportunity, Motivation (COM-B) component of the Behaviour Change Wheel (BCW)). Specifically, the goal of the stakeholder input was to critically review the paper, making any additions, edits, and comments, by concentrating their thinking on (i) Would they be able to apply these recommendations, as written, to their implementation work (proposals, studies, projects, evaluations, reports, etc.)? (ii) Would they, as a researcher, administrator, intermediary, or provider, know what to do to use an implementation framework for each recommendation? In addition, we felt one area that needed extra attention was the two tools, which aim to assist readers in applying the recommendations. Panel members were asked to trial the tools on projects that they or a colleague had, to ensure the tools were functional. The tools were refined according to their suggestions.

Select a suitable framework(s)

The process for selecting implementation framework(s) for a particular implementation effort should consider the following: (i) the purpose of the framework (describing/guiding the implementation process, analyzing what influences outcomes [barriers and facilitators], or evaluating the implementation effort); (ii) the level(s) included within the framework (e.g., provider, organization, system); (iii) the degree of inclusion and depth of analysis or operationalization of implementation concepts (process, determinants [barriers and facilitators], strategies, evaluation); and (iv) the framework’s orientation, which includes the setting and type of intervention (i.e., EBP generally, a specific intervention, a guideline, a public health program being implemented) for which the framework was originally designed [ 7 ]. Reviews and websites of implementation frameworks provide lists of potential options [ 1 , 2 , 14 , 15 ], and the Theory Comparison and Selection Tool (T-CaST) defines specific framework selection criteria [ 16 ]. Frameworks may be evaluated against these four criteria to see if they fit the implementation effort’s purpose (aims and objectives) and context (setting in which implementation is to occur). If, for example, a project aimed to implement an educational program in a school setting, a framework that includes factors associated with the healthcare system or patient characteristics would not be a good fit.
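The four considerations above can be turned into a simple project-specific checklist. The sketch below is a minimal illustration only, not a validated selection tool such as T-CaST; the candidate framework names and yes/no judgements are hypothetical placeholders recorded by a project team.

```python
# Sketch: record whether candidate frameworks fit a project against the four
# considerations above (purpose, levels, depth of operationalization, orientation).
CRITERIA = ("purpose", "levels", "depth", "orientation")

# Placeholder judgements a team might record after reviewing each candidate.
candidates = {
    "Candidate framework A": {"purpose": True, "levels": True, "depth": True, "orientation": False},
    "Candidate framework B": {"purpose": True, "levels": False, "depth": True, "orientation": True},
}

def fit_count(judgements: dict) -> int:
    """Number of the four considerations the framework satisfies for this project."""
    return sum(bool(judgements[criterion]) for criterion in CRITERIA)

for name, judgements in sorted(candidates.items(), key=lambda kv: fit_count(kv[1]), reverse=True):
    print(f"{name}: fits {fit_count(judgements)}/4 considerations")
```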

It may be necessary and desirable to use multiple frameworks. Confusing matters, some frameworks fit neatly within one framework category, while others cross multiple framework “types.” For example, EPIS is both a process as well as a determinant framework with its focus on inner and outer context determinants across the phases of implementation. Furthermore, frameworks include different concepts and operationalize these to varying degrees. Put simply, some frameworks are more general, while others are more context or intervention specific; some frameworks are more comprehensive than others. Selecting a given framework can simultaneously expand and limit consideration of factors and processes likely to be important in an implementation effort. For expansion, frameworks can enumerate issues that might not have been considered for a given effort. On the other hand, limiting consideration of implementation issues to only the theories, constructs, and/or processes identified in a given framework may attenuate or curtail the degree to which factors affecting implementation are considered. Thus, it is sometimes desirable to use multiple frameworks for specific purposes, or alternatively expand on a current framework. For example, researchers may use a framework for understanding and testing determinants (e.g., EPIS [ 17 ], CFIR [ 18 ], TDF [ 10 , 19 , 20 ]) and another for evaluating outcomes (e.g., RE-AIM [ 21 ] or Proctor’s [ 22 ]).

Finally, we recommend that framework users invest in knowledge of the service setting in which they are working. This includes knowing or seeking involvement from stakeholders who understand the external context such as community norms and culture, policy and government processes, as well as the inner context such as organizational culture and climate, employee expectations, and attitudes towards innovations. Framework use in isolation without a deep understanding of context specific issues can result in a mismatch between framework selection and its applicability in research and practice. Furthermore, it is vital to seek permissions from both inner context and external context leadership.

EPIS application

A mixed-methods developmental project aimed to systematically adapt and test an EBP for youth with Autism Spectrum Disorder in publicly funded mental health settings and to develop a corresponding implementation plan [ 23 ]. EPIS was selected by the research team because of the framework’s focus on public service settings and because it specifies multi-level inner and outer contextual factors and the bridging factors between them, addresses the implementation process, and emphasizes innovation fit. EPIS was therefore an apt fit for the project aims and context. In combination with the EPIS framework, and as one example of a bridging factor, a community partnership model [ 24 ] was also applied to inform the community-academic partnership integrated throughout this study.

Establish and maintain community stakeholder engagement and partnerships

Stakeholder engagement is an integral component of implementation [ 25 , 26 ]. Growing calls are being made for embedded research models [ 27 ], and examples exist, such as practice-based research networks, learning health systems, and implementation laboratories [ 28 ], that foster collaborations between researchers, implementers, and policy-makers integrated within a healthcare system to conduct research. Frameworks help inform discussions related to the types and specific roles of stakeholders who should be engaged, and the timing of stakeholder engagement. Stakeholders should include not only those who are proximally involved in EBP service delivery and receipt (consumers, providers, and administrative staff), but also those who are distally involved in oversight and in structuring organizations, legislative actions, policy design, and financing of EBP delivery [ 29 ]. Engaging stakeholders across multiple levels of an implementation ecosystem (e.g., policy/legislative, funders, community, organizational, provider, client/patient) is recommended best practice for implementation researchers [ 30 ] and is reflected in the multi-level nature of the majority of implementation frameworks. Implementation frameworks generally encourage stakeholder engagement prior to funding, and for it to continue during implementation effort justification and as part of future implementation iterations and adaptations. Further, an implementation framework can bring clarity to this engagement. Stakeholders can be engaged in the application of an implementation framework by, for example, involving them in defining the local health system needs and selecting EBP(s) and/or implementation strategies in the EPIS implementation phase, as these activities are important to enhance their collaboration with and ownership of the implementation effort [ 26 ].

Several implementation and improvement science frameworks explicitly include stakeholder engagement as a key construct or process (e.g., EPIS framework, PRECEDE-PROCEED, Plan-Do-Study-Act cycles, Promoting Action on Research Implementation in Health Services [PARIHS]). Additionally, there are pragmatic tools drawn from frameworks that can facilitate stakeholder engagement. For example, key criteria within the aforementioned T-CaST tool include the extent to which stakeholders are able to understand, apply, and operationalize a given implementation framework, and the degree to which the framework is familiar to stakeholders [ 16 ]. Methods, such as concept mapping [ 31 ], nominal group technique [ 32 ], and design thinking [ 33 ], may be used to guide stakeholder engagement meetings and define the issue or gap to be addressed. Other frameworks, such as the BCW [ 34 ], EPIS [ 17 ], or CFIR [ 18 ], may be used to prioritize and define implementation outcomes, determinants, and strategies together with stakeholders.

The EPIS framework explicitly highlights the importance of engaging multiple levels of stakeholders to influence implementation efforts longitudinally and contextually, from the initial identification of a need to sustainment of the EBP delivery that addresses that need. While the duration or depth of stakeholder engagement is not explicitly prescribed in EPIS, when combined with, for example, a designated partnership engagement model [ 24 ], EPIS has been shown to enable the conceptualization and characterization of roles and levels of stakeholder engagement (system leaders, program managers, providers) within system-driven implementation efforts [ 35 ].

Define issue and develop research or evaluation questions and hypotheses

Use of frameworks to inform the articulation of an implementation need (i.e., a research-practice gap) and the development of practice-related or research questions and hypotheses has the potential to optimize implementation efforts and outcomes [ 2 ]. Specifically, frameworks facilitate the framing and formulation of implementation questions, including those related to needs assessment (e.g., what is the clinical or implementation issue needing to be addressed?), process (e.g., what phases will the implementation undergo to translate an intervention into practice, or when is an organization ready to implement a new intervention?), implementation effectiveness (e.g., do the proposed implementation strategies work in the local context?), mechanisms of success (e.g., did an increase in implementation climate improve implementation intentions?), and associated impact on outcomes (e.g., how did the implementation effort perform in terms of adoption or reach?). Ideally, these questions—be they related to research projects or practice issues that providers want to resolve—should be closely linked with the framework selected to maximize impact. For example, the selection of the BCW as a guiding framework necessitates that a question or issue be described in behavioral terms and, in many cases, refined to be more specific. Being specific about the problem to be addressed entails being precise about the behaviors you are trying to change and whose behavior is involved [ 36 ].

Frameworks also provide guidance for the translation of implementation literature into research or evaluation questions. For example, it has been written that education used alone as a single implementation strategy is not sufficient for successful implementation. An implementation framework will assist in identifying implementation determinants that remain to be addressed and therefore in selecting additional implementation strategies. This can be challenging given the presence of multiple factors spanning different levels that vary across contexts and phases of implementation. Further, frameworks contextualize and provide critical links between theory and individual experience gained through practice, such as supporting the perceived value of targeting leadership in promoting the adoption and use of effective interventions or research evidence [ 37 ].

Finally, and perhaps most relevant to many implementation efforts, frameworks provide explicit guidance and justification for proposed hypotheses to be tested, which strengthens proposals, projects, trials, and products, both research and practice based [ 2 , 4 ]. Despite their explanatory power, frameworks are used to explicitly guide hypothesis formation in only a minority of cases, even within implementation efforts that use theory to guide other aspects of the research process [ 38 , 39 , 40 ]. Thus, increased use of frameworks to inform implementation questions and hypotheses is sorely needed.

EPIS application

Work by Becan and colleagues [ 41 ] provides an example of a comprehensive application of the EPIS framework to inform hypothesis development in their US National Institute on Drug Abuse study, Translational Research on Interventions for Adolescents in the Legal System (JJ-TRIALS). JJ-TRIALS utilized EPIS to inform the identification of outer and inner context determinants, the measures to assess those determinants, predictions based on theory, and the tracking of progress through the EPIS phases, including what constitutes the transition from each phase to the next. Specifically, the trial applied EPIS to inform the development of four tiers of questions related to the following: (1) the differential effect of two implementation strategies, (2) the factors that impacted and supported the transition across implementation phases, (3) the impact of this process on key implementation outcomes, and (4) tracking progress through the EPIS phases. For example, relevant determinants at the outer context system level and inner context organizational level were identified. Specific hypotheses were developed to test how determinants (e.g., independent variables) influenced mechanisms (e.g., mediators/moderators) and ultimately “targets” (e.g., dependent variables), which are implementation outcomes and outcomes with clinical relevance.

Develop implementation program theory or logic model

Within research and practice projects, implementation frameworks can inform the program logics that describe the anticipated relationships between inputs, activities, outputs, and implementation and client outcomes, thereby supporting the explicit formulation of key assumptions and outlining of crucial project details.

In addition, implementation frameworks guide the design of a model for testing, for example, mediation and moderation of various influences on the process and outcomes of implementation. Despite an increasing emphasis on understanding key mechanisms of change in implementation [ 4 , 42 , 43 ], few evaluations examine implementation change mechanisms and targets [ 44 ]. Change mechanisms explain how or why underlying processes create change, whereas targets are defined as the identified focus or end aim of implementation efforts [ 45 ]. From a public health perspective, mechanism and target evaluation is critical to facilitate replication and scaling up of implementation protocols to more effectively change healthcare practice and achieve broader public health impact. Mechanism measurement and evaluation is critical to increase the rigor and relevance of implementation science [ 46 ]. Frameworks can go beyond simple evaluation of key determinants and highlight fundamental single-level (e.g., organizational characteristics, individual adopter characteristics) and cross-cutting mechanisms of change spanning context, setting, and levels [ 4 ]. Frameworks also illuminate the complex and evolving nature of determinants, mechanisms, and targets, which vary across implementation phases. As an example, leadership may determine organizational climate during implementation within one specific service setting or context but serve as a change mechanism impacting implementation targets during the exploration phase in a different setting. Frameworks provide the necessary roadmap for understanding these complex associations by offering prescriptive guidance on the evolving nature of these determinants.

The EPIS framework was applied to predict implementation leadership and climate and provider attitudes as key mechanisms of change in two linked Hybrid Type 3 cluster randomized trials testing the effectiveness of multi-level implementation strategies targeting leadership and attitudes (Brookman-Frazee and Stahmer [ 47 ]; see Fig. 1 ). Consistent with the explanatory nature of EPIS, this work highlights the interconnected nature of these mechanisms, with leadership hypothesized as both a mechanism impacting outcomes as well as the predictor (determinant) of further mechanisms such as provider attitudes during implementation [ 47 ].

Fig. 1. TEAMS intervention, mechanisms, and outcomes [ 47 ]

Determine research and evaluation methods (overall design, data collection, data analysis)

The distinct aims and purposes of implementation efforts require distinct evaluation designs such as mixed-methods, hybrid effectiveness-implementation, and quality improvement approaches including formative evaluations or Plan-Do-Study-Act cycles [ 48 ]. Implementation frameworks should be used to inform development of such designs across all phases, from the broader construction down to the measurement and analysis.

In the design of an evaluation, frameworks should be used to inform decisions about which constructs to assess, which data to collect, and which measures to use. In this process, frameworks can help to identify and/or expand the implementation determinants or aspects assumed to impact the implementation process at different levels and across multiple phases for consideration or measurement. They can also help to operationalize constructs of importance to an evaluation and to identify suitable measures. Fortunately, there is expanding work in implementation science to develop and catalog tools tied to existing frameworks to aid in this application (e.g., EPIS, see episframework.com/measures [ 1 ]; CFIR, see cfirguide.org/evaluation-design [ 49 ]; RE-AIM, see re-aim.org/resources-and-tools [ 50 ]).

For the collection and analysis of qualitative data, frameworks such as EPIS or CFIR provide developed and freely available data analytic tools, including pre-populated coding templates and data aggregation matrices [ 1 , 49 ]. Again, the use of framework-informed tools permits better alignment of concepts examined with broader implementation science literature. Analytically, frameworks can inform decisions about sequencing and directionality of implementation processes and strategies. Beyond identifying and analyzing key implementation determinants, theory should be applied along with frameworks in order to describe important implementation determinants (e.g., independent variables), implementation mechanisms (e.g., mediators), and their associated impacts on implementation targets (e.g., dependent variables) across the phases of implementation processes.

The EPIS framework was used to inform the development of key informant interviews and focus groups, and data coding and analytic procedures to capture the key outer and inner context and innovation factor influences across implementation phases of two large-scale community effectiveness trials [ 51 ]. Within the trials themselves, EPIS informed the selection of quantitative measures of inner context organizational and provider measures [ 52 ]. Such integrated and thorough framework use is needed to further build an integrated body of knowledge about effective implementation strategies.

Determine implementation determinants

Implementation frameworks often include several implementation determinants (i.e., barriers and enablers) that have been found to influence implementation outcomes [ 1 , 2 ]. Such lists of potential determinants are useful for exploratory work, for example, identifying key factors for applying an intervention in a particular context. This may occur early in an implementation process to guide implementation strategy selection or EBP adaptation, or further along to aid in the development of an implementation plan or in tailoring implementation strategies to support the EBP implementation or adaptation. The implementation science literature includes numerous examples of using frameworks in this manner across health contexts (see Birken et al. (2017) [ 13 ]; Helfrich et al. (2010) [ 53 ]). Examples of relevant determinant frameworks include the EPIS [ 1 , 17 ], CFIR [ 18 ], integrated checklist to identify determinants of practice (TICD checklist) [ 54 ], TDF [ 19 ], and BCW [ 36 ].

Another important reason for assessing implementation determinants using a theoretical framework is to specify the target of the implementation effort. It is not possible, or necessary, for all determinants to be targeted. Often, due to funding or other constraints, it is important to consider individual beneficiaries and community or government needs when prioritizing which determinants to target. For example, the BCW methodology guides users to conduct a thorough behavioral diagnosis using the COM-B and to then prioritize which behaviors to address. In research, changes to pre-specified determinants included in the protocol require amendments to be documented, justified, and possibly approved by a research ethics committee. Prospective framework application may also reveal different determinants and aid selection of particular influencing factors to target in subsequent implementation studies.

The Leadership and Organizational Change for Implementation (LOCI) intervention employed the EPIS framework to select key implementation determinants to test in a large cluster RCT [ 55 ]. In this study, implementation leadership from first-level team leaders/managers, organizational climate and culture, implementation climate, and psychological safety climate were selected as determinants, to test their influence on the fidelity of the EBP being implemented. In addition to informing the implementation model and implementation strategy, EPIS was used to code qualitative data and to select quantitative survey measures.

Select and tailor, or develop, an implementation strategy(s)

Implementation frameworks are necessary for selecting, tailoring, or developing implementation strategies. Defined as methods or techniques to aid the adoption, implementation, sustainment, and scale-up of evidence-based public health or clinical interventions [ 8 ], implementation strategies are the linchpin of successful implementation efforts. Implementation strategies vary in purpose and complexity, ranging from discrete strategies [ 56 ], such as audit and feedback [ 57 ], to multifaceted, and often branded, strategies that integrate at least two discrete strategies, such as the Leadership and Organizational Change for Implementation (LOCI) intervention [ 37 ], the Availability, Responsiveness and Continuity model (ARC) [ 58 ], Replicating Effective Programs (REP) [ 59 ], Getting to Outcomes (GTO) [ 60 ], and the Quality Implementation Framework (QIF) [ 61 ]. Powell and colleagues have outlined four primary methods for matching implementation strategies to barriers: conjoint analysis, intervention mapping, concept mapping, and group model building [ 62 ]. Each approach is highly participatory, but the approaches differ in their strengths and weaknesses. Additionally, comprehensive framework application can help address identified priorities for enhancing the impact of implementation strategies, such as methods for tailoring strategies and for specifying and testing mechanisms [ 63 ]. Taxonomies of strategies, such as the Expert Recommendations for Implementing Change (ERIC) discrete strategies list [ 64 ], the BCT taxonomy [ 65 ], and the EPOC checklist [ 66 ], are useful for promoting uniform communication and synthesis across implementation science.

Following the identification and prioritization of important barriers and facilitators (see recommendation 5), an implementation framework can support the process of matching determinants to implementation strategies. For example, the PARIHS framework [ 67 ] can be used to identify critical evidentiary (e.g., patient experience, information from the local setting) and contextual (e.g., leadership, receptive context) elements that may affect EBP implementation. This evidentiary and contextual analysis is then used to develop or tailor implementation strategies, with facilitation as the anchoring approach. Frameworks such as PARIHS may be particularly suitable for guiding the selection and tailoring of implementation strategies in settings with a strong need for facilitation to support the engagement and participation of a wide range of stakeholders.
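
As a toy illustration of this matching step, the sketch below pairs hypothetical prioritized determinants with candidate strategies labeled after entries in the ERIC compilation. The determinant labels, the pairings, and the helper function are invented for illustration; they are not a published determinant-strategy crosswalk.

```python
# Hypothetical mapping from prioritized determinants (barriers) to candidate
# implementation strategies. The pairings are illustrative, not prescriptive.
determinant_to_strategies = {
    "weak leadership support (inner context)": [
        "identify and prepare champions",
        "recruit, designate, and train for leadership",
    ],
    "low provider confidence with the EBP": [
        "conduct ongoing training",
        "provide clinical supervision",
    ],
    "poor fit with local workflows": [
        "promote adaptability",
        "conduct cyclical small tests of change",
    ],
}

def candidate_strategies(prioritized_determinants):
    """Pool the candidate strategies for the determinants a team has prioritized, without duplicates."""
    shortlist = []
    for determinant in prioritized_determinants:
        for strategy in determinant_to_strategies.get(determinant, []):
            if strategy not in shortlist:
                shortlist.append(strategy)
    return shortlist

print(candidate_strategies([
    "weak leadership support (inner context)",
    "poor fit with local workflows",
]))
```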

The EPIS framework and the Dynamic Adaptation Process (DAP) were used in a cluster randomized trial to implement school nursing EBPs in US high schools to reduce LGBTQ adolescent suicide [ 68 ]. The DAP [ 69 ] is a multicomponent implementation strategy drawn directly from the EPIS framework; it uses an iterative, data-informed approach to facilitate implementation across each EPIS phase. A core component of the DAP is the creation of an Implementation Resource Team, a multi-stakeholder collaborative designed to support implementation and data interpretation and to explicitly address adaptations during the implementation process. Within this study, the EPIS framework and the DAP were used to (1) inform the constructs measured in the multi-level needs assessment during the exploration phase, (2) support the identification of the stakeholders and activities involved in the Implementation Resource Team developed in the preparation phase, (3) guide the tracking and integration of adaptations to the EBP strategy training and delivery during the implementation phase, and (4) inform the constructs and measurement of the implementation outcomes in the sustainment phase.

Specify implementation outcomes and evaluate implementation

Implementation evaluation may include assessment of progression through implementation stages, formative and summative evaluation of influencing factors and strategies, and evaluation of the degree of implementation success as reflected in implementation outcomes. These may be measured at the micro (individual), meso (team or organization), and macro (system) levels. Regardless of their particular scope and design, implementation evaluations should be informed by implementation frameworks.

As outlined by Nilsen [ 2 ], a few implementation frameworks have the express purpose of evaluating implementation, including RE-AIM [ 21 ], PRECEDE-PROCEED [ 70 ], and frameworks by Stetler et al. [ 71 ], Moullin et al. [ 72 ], and Proctor et al. [ 22 ]. There are also specific implementation process measures, such as the Stages of Implementation Completion (SIC), which may be used as both a formative and a summative tool to measure the rate and depth of implementation [ 73 ]. In addition, there is an increasing number of measures of implementation determinants [ 74 , 75 ] (e.g., implementation leadership [ 76 ], implementation climate [ 77 , 78 ], or implementation intentions [ 79 ]); evaluating changes in these factors over time may indicate implementation success. Beyond the evaluation-specific frameworks mentioned above, other frameworks also include evaluation elements to varying degrees [ 7 ]. For example, the conceptual framework for sustainability of public health programs by Scheirer and Dearing [ 80 ], the framework of dissemination in health services intervention research by Mendel et al. [ 81 ], and the integrated two-phase Texas Christian University (TCU) approach to strategic system change by Lehman [ 82 ] include comprehensive evaluation of the influencing factors depicted in the corresponding frameworks. Frameworks that do not explicitly include measurement components can be used alongside evaluation frameworks to determine which measures to select for the influencing factors chosen for study and for the nominated implementation outcomes.

While the EPIS framework is not primarily an evaluation framework, its website includes a list of measures for quantitative analysis and construct definitions for qualitative work. After selecting implementation determinants and developing specific implementation questions and/or hypotheses, measures should be selected for the chosen determinants as mediators of implementation success. In addition, measures of movement through the EPIS phases and measures of implementation outcomes (e.g., fidelity) may be included. Both JJ-TRIALS (Juvenile Justice—Translational Research on Interventions for Adolescents in the Legal System) [ 83 ] and the LOCI study [ 37 ] provide examples of using EPIS in implementation evaluation. From a practice perspective, teams should measure at baseline and periodically throughout the project to determine how process measures and outcomes have changed over time. These evaluations help determine the rate of progress, which can inform work on other recommendations, such as recommendations 5 and 7.
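
As a minimal sketch of the practice point above, the snippet below computes change from baseline for two hypothetical process and outcome measures collected at periodic time points. The measure names, scales, and scores are invented for illustration and do not come from any EPIS instrument.

```python
# Hypothetical repeated measurements of an implementation process measure and
# an implementation outcome; values and scales are illustrative only.
timepoints = ["baseline", "month_3", "month_6"]

scores = {
    "implementation_climate": {"baseline": 2.10, "month_3": 2.60, "month_6": 3.00},  # 0-4 scale
    "fidelity":               {"baseline": 0.55, "month_3": 0.68, "month_6": 0.74},  # proportion delivered
}

for measure, by_time in scores.items():
    baseline = by_time["baseline"]
    for t in timepoints[1:]:
        change = by_time[t] - baseline  # positive values indicate improvement over baseline
        print(f"{measure:24s} {t:8s} change from baseline: {change:+.2f}")
```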

Use a framework(s) at the micro level to conduct and tailor implementation

Implementation is a dynamic, context-specific process. Each layer of a context (e.g., organization, profession, team, individual) requires ongoing, individualized tailoring of implementation strategies. Implementation frameworks, therefore, should be used to guide both the overarching implementation plan and, at the micro level, processes such as creating site-specific implementation teams, assessing barriers and facilitators, implementation planning, and goal setting. This may be done by formatively evaluating implementation determinants, either qualitatively or quantitatively as described above, and then using the results to select or adapt implementation strategies for the particular context. Stetler et al. [ 71 ] provide four progressive yet integrated stages of formative evaluation. Another method is to conduct barrier and facilitator assessments at different levels within the implementation context and subsequently tailor the implementation strategies accordingly. For example, coaching calls may reveal that different behavior change techniques [ 34 ] are suited to each provider or leader.

In the aforementioned LOCI study, the goal was to improve first-level leaders' leadership and implementation climate to facilitate EBP adoption and use [ 55 ]. Baseline and ongoing 360-degree evaluations (in which individuals, such as mid-level managers, rate themselves and receive ratings from their supervisors and staff) were performed, and implementation plans were subsequently adapted for each agency and team leader based on the data and on issues emerging during the implementation process. This process was broadly informed by the EPIS framework's focus on innovation fit and its emphasis on leadership across levels. The Climate Embedding Mechanisms [ 84 ] were then used in combination with EPIS to formulate the individual, leader-specific implementation plans.

Write the proposal and report

Documenting an implementation effort—be it in the form of a research proposal, a scientific article, or a practice report—is key for any project. As part of this documentation, detailing the use of the implementation framework(s) is vital if the implementation project is to be replicable and analyzable. The use of the selected framework(s) should be documented across the proposal and report, including the description or selection of appropriate methods to assess the selected implementation determinants. Furthermore, as outlined by Proctor et al. [ 8 ], implementation strategies should be named, defined, and specified according to seven components that enable their measurement and replication: actor, action, action target, temporality (when), dose (duration and frequency), outcomes affected, and theory/justification. Similarly, outcomes should be named, specified, measured, and reported. Again, the work of Proctor and colleagues [ 22 ] provides a useful taxonomy for classifying and reporting types of implementation research outcomes; it also includes guidance on level of analysis and measurement and theoretical basis, and it maps the salience of each outcome onto the phases of implementation.
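
To show how such a specification can be recorded in a structured, reportable form, the sketch below encodes one strategy along fields paraphrasing the seven components listed above. The class, field names, and example values are illustrative assumptions, not an official schema from Proctor et al.

```python
from dataclasses import dataclass, asdict

@dataclass
class StrategySpecification:
    """One implementation strategy, specified along fields paraphrasing the
    seven components outlined by Proctor et al.; values below are illustrative."""
    name: str
    actor: str             # who enacts the strategy
    action: str            # what the actor does
    action_target: str     # the determinant or unit the action is aimed at
    temporality: str       # when in the implementation process it occurs
    dose: str              # duration and frequency
    outcome_affected: str  # the implementation outcome expected to change
    justification: str     # theory or evidence for why it should work

example = StrategySpecification(
    name="Audit and feedback",
    actor="Clinic quality improvement lead",
    action="Summarizes fidelity checklists and returns results to each provider",
    action_target="Provider adherence to the EBP protocol",
    temporality="Monthly during the implementation phase",
    dose="One 30-minute feedback session per provider per month",
    outcome_affected="Fidelity",
    justification="Feedback on performance gaps is expected to prompt behavior change",
)

print(asdict(example))  # a replicable, reportable record of the strategy
```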

Consistent with these recommendations are existing standards and guidelines to improve transparent and accurate reporting of implementation studies, such as the Standards for Reporting Implementation Studies (StaRI; Pinnock et al. [ 85 ]). Ideally, incorporating these standards will strengthen the comprehensive use and reporting of frameworks to inform the formulation, planning, and reporting of implementation studies. Our recommendation is to explicitly document the use of implementation frameworks in research proposals, scientific outputs, and evaluation reports. To aid this process, Additional file 1 provides the Implementation Framework Application Worksheet, which offers examples of key questions to assist implementation scientists and practitioners in applying our recommendations for comprehensive framework application. Finally, Additional file 2 provides the Implementation Framework Utilization Checklist to assist in thinking through and reviewing the ways in which the selected framework(s) are used. In combination with the Worksheet, the Checklist may inform revisions to a project (a proposal, an active project, or dissemination materials) and facilitate comprehensive framework application. The Checklist may also serve as documentation of framework utilization (e.g., for inclusion in project proposals, reports, and manuscripts).

An example of EPIS framework reporting is the ATTAIN (Access to Tailored Autism Integrated Care) study protocol [ 86 ]. In this example, the authors display an adapted EPIS framework to highlight the unique outer context (e.g., the American Academy of Pediatrics recommendation for mental health screening) and inner context (e.g., organizational and technological capacity for innovation) determinants relevant to the phases of implementation included in the study (Exploration through Implementation). In addition, the authors describe how the unique contextual determinants and proposed implementation strategies (e.g., inter-organizational relationships among stakeholders) were conceptualized and would be measured across the study's lifespan.

The use of implementation frameworks provides a structure for describing, guiding, analyzing, and evaluating implementation efforts, thereby facilitating the advancement of generalizable implementation science knowledge. Superficial use of frameworks hinders researchers' and practitioners' learning and their efforts to progress the field. By following the ten recommendations provided here, we hope researchers, intermediaries, and practitioners will bolster their use of implementation science frameworks.

Availability of data and materials

Not Applicable

Abbreviations

ARC: Availability, Responsiveness and Continuity model

ATTAIN: Access to Tailored Autism Integrated Care

BCW: Behaviour Change Wheel

CFIR: Consolidated Framework for Implementation Research

COM-B: Capability, Opportunity, Motivation - Behaviour

DAP: Dynamic Adaptation Process

EBP: Evidence-Based Practice

EPIS: Exploration, Preparation, Implementation, Sustainment framework

ERIC: Expert Recommendations for Implementing Change

GTO: Getting to Outcomes

JJ-TRIALS: Juvenile Justice—Translational Research on Interventions for Adolescents in the Legal System

LOCI: Leadership and Organizational Change for Implementation

PARIHS: Promoting Action on Research Implementation in Health Services

QIF: Quality Implementation Framework

RE-AIM: Reach, Effectiveness, Adoption, Implementation, Maintenance

REP: Replicating Effective Programs

StaRI: Standards for Reporting Implementation Studies

TCU: Texas Christian University

TDF: Theoretical Domains Framework

Moullin JC, Dickson KS, Stadnick NA, Rabin B, Aarons GA. Systematic review of the exploration, preparation, implementation, sustainment (EPIS) framework. Implement Sci. 2019;14:1.

Nilsen P. Making sense of implementation theories, models and frameworks. Implement Sci. 2015;10:53.

Rycroft-Malone J, Bucknall T. Theory, frameworks, and models: laying down the groundwork. In: Rycroft-Malone J, Bucknall T, editors. Models and frameworks for implementing evidence-based practice: Linking evidence to action. Oxford: Wiley-Blackwell; 2010. p. 23–50.

Proctor EK, Powell BJ, Baumann AA, Hamilton AM, Santens RL. Writing implementation research grant proposals: ten key ingredients. Implement Sci. 2012;7:96.

Pedhazur EJ. Multiple regression in behavioral research: explanation and prediction. 2nd ed. Fort Worth, TX: Harcourt Brace; 1982.

Crable EL, Biancarelli D, Walkey AJ, Allen CG, Proctor EK, Drainoni M. Standardizing an approach to the evaluation of implementation science proposals. Implement Sci. 2018;13:71.

Moullin JC, Sabater-Hernández D, Fernandez-Llimos F, Benrimoj SI. A systematic review of implementation frameworks of innovations in healthcare and resulting generic implementation framework. Health Res Policy Syst. 2015;13:16.

Proctor EK, Powell BJ, McMillen JC. Implementation strategies: recommendations for specifying and reporting. Implement Sci. 2013;8:139.

Kirk MA, Kelley C, Yankey N, Birken SA, Abadie B, Damschroder L. A systematic review of the use of the consolidated framework for implementation research. Implement Sci. 2016;11:72.

Atkins L, Francis J, Islam R, O'Connor D, Patey A, Ivers N, Foy R, Duncan EM, Colquhoun H, Grimshaw JM. A guide to using the Theoretical Domains Framework of behaviour change to investigate implementation problems. Implement Sci. 2017;12:77.

Glasgow RE, Estabrooks PE. Pragmatic applications of RE-AIM for health care initiatives in community and clinical settings. Prev Chronic Dis. 2018;15.

Keith RE, Crosson JC, O’Malley AS, Cromp D, Taylor EF. Using the consolidated framework for implementation research (CFIR) to produce actionable findings: a rapid-cycle evaluation approach to improving implementation. Implement Sci. 2017;12:15.

Birken SA, Powell BJ, Presseau J, Kirk MA, Lorencatto F, Gould NJ, Shea CM, Weiner BJ, Francis JJ, Yu Y. Combined use of the Consolidated Framework for Implementation Research (CFIR) and the Theoretical Domains Framework (TDF): a systematic review. Implement Sci. 2017;12:2.

Tabak RG, Khoong EC, Chambers DA, Brownson RC. Bridging research and practice: models for dissemination and implementation research. Am J Prev Med. 2012;43:337–50.

Dissemination & Implementation Models in Health Research & Practice [ http://dissemination-implementation.org/content/aboutUs.aspx ].

Birken SA, Rohweder CL, Powell BJ, Shea CM, Scott J, Leeman J, Grewe ME, Kirk MA, Damschroder L, Aldridge WA. T-CaST: an implementation theory comparison and selection tool. Implement Sci. 2018;13:143.

Aarons GA, Hurlburt M, Horwitz SM. Advancing a conceptual model of evidence-based practice implementation in public service sectors. Adm Policy Ment Hlth. 2011;38:4–23.

Damschroder L, Aron D, Keith R, Kirsh S, Alexander J, Lowery J. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009;4:50–64.

Michie S, Johnston M, Abraham C, Lawton R, Parker D, Walker A. Making psychological theory useful for implementing evidence based practice: a consensus approach. BMJ Qual Saf. 2005;14:26–33.

Cane J, O’Connor D, Michie S. Validation of the theoretical domains framework for use in behaviour change and implementation research. Implement Sci. 2012;7:37.

Glasgow RE, Vogt T, Boles S. Evaluating the public health impact of health promotion interventions: the RE-AIM framework. Am J Public Health. 1999;89:1322–7.

Proctor EK, Silmere H, Raghavan R, Hovmand P, Aarons GA, Bunger A, Griffey R, Hensley M. Outcomes for implementation research: conceptual distinctions, measurement challenges, and research agenda. Adm Policy Ment Hlth. 2011;38:65–76.

Dickson KS, Aarons GA, Anthony LG, Kenworthy L, Crandal BR, Williams K, Brookman-Frazee L. Adaption and pilot implementation of an autism executive functioning intervention in children's mental health services: a mixed-methods study protocol. Under review.

Brookman-Frazee L, Stahmer AC, Lewis K, Feder JD, Reed S. Building a research-community collaborative to improve community care for infants and toddlers at-risk for autism spectrum disorders. J Community Psychol. 2012;40:715–34.

Drahota A, Meza R, Brikho G, Naaf M, Estabillo J, Spurgeon E, Vejnoska S, Dufek E, Stahmer AC, Aarons GA. Community-academic partnerships: a systematic review of the state of the literature and recommendations for future research. Milbank Q. 2016;94:163–214.

Miller WL, Rubinstein EB, Howard J, Crabtree BF. Shifting implementation science theory to empower primary care practices. Ann Fam Med. 2019;17:250–6.

World Health Organization. Changing mindsets: strategy on health policy and systems research. Geneva, Switzerland: World Health Organization; 2012.

Ivers NM, Grimshaw JM. Reducing research waste with implementation laboratories. Lancet. 2016;388:547–8.

Green AE, Aarons GA. A comparison of policy and direct practice stakeholder perceptions of factors affecting evidence-based practice implementation using concept mapping. Implement Sci. 2011;6:104.

Brookman-Frazee L, Stahmer A, Stadnick N, Chlebowski C, Herschell A, Garland AF. Characterizing the use of research-community partnerships in studies of evidence-based interventions in children’s community services. Adm Policy Ment Hlth. 2016;43:93–104.

Trochim WM. An introduction to concept mapping for planning and evaluation. Eval Program Plann. 1989;12:1–16.

Rankin NM, McGregor D, Butow PN, White K, Phillips JL, Young JM, Pearson SA, York S, Shaw T. Adapting the nominal group technique for priority setting of evidence-practice gaps in implementation science. BMC Med Res Methodol. 2016;16:110.

Mintrom M, Luetjens J. Design thinking in policymaking processes: opportunities and challenges. Aust J Public Adm. 2016;75:391–402.

Michie S, van Stralen MM, West R. The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implement Sci. 2011;6:42.

Lau AS, Rodriguez A, Bando L, Innes-Gomberg D, Brookman-Frazee L. Research community collaboration in observational implementation research: complementary motivations and concerns in engaging in the study of implementation as usual. Adm Policy Ment Hlth. 2019:1–17.

Michie S, Atkins L, West R. The behaviour change wheel: a guide to designing interventions. Great Britain: Silverback Publishing; 2014.

Aarons GA, Ehrhart MG, Farahnak LR, Hurlburt MS. Leadership and organizational change for implementation (LOCI): a randomized mixed method pilot study of a leadership and organization development intervention for evidence-based practice implementation. Implement Sci. 2015;10:11.

Birken SA, Powell BJ, Shea CM, Haines ER, Alexis Kirk M, Leeman J, Rohweder C, Damschroder L, Presseau J. Criteria for selecting implementation science theories and frameworks: results from an international survey. Implement Sci. 2017;12:124.

Davies P, Walker AE, Grimshaw JM. A systematic review of the use of theory in the design of guideline dissemination and implementation strategies and interpretation of the results of rigorous evaluations. Implement Sci. 2010;5:14.

Johnson AM, Moore JE, Chambers DA, Rup J, Dinyarian C, Straus SE. How do researchers conceptualize and plan for the sustainability of their NIH R01 implementation projects? Implement Sci. 2019;14:50.

Becan JE, Bartkowski JP, Knight DK, Wiley TR, DiClemente R, Ducharme L, Welsh WN, Bowser D, McCollister K, Hiller M. A model for rigorously applying the Exploration, Preparation, Implementation, Sustainment (EPIS) framework in the design and measurement of a large scale collaborative multi-site study. Health & Justice. 2018;6:9.

Lewis CC, Stanick C, Lyon A, Darnell D, Locke J, Puspitasari A, Marriott BR, Dorsey CN, Larson M, Jackson C, et al. Proceedings of the Fourth Biennial Conference of the Society for Implementation Research Collaboration (SIRC) 2017: implementation mechanisms: what makes implementation work and why? Part 1. Implement Sci. 2018;13:30.

National Institute of Mental Health. Strategic Plan for Research. 2015. Retrieved from http://www.nimh.nih.gov/about/strategic-planning-reports/index.shtml .

Lewis CC, Klasnja P, Powell B, Tuzzio L, Jones S, Walsh-Bailey C, Weiner B. From classification to causality: advancing understanding of mechanisms of change in implementation science. Frontiers in Public Health. 2018;6:136.

Lewis C, Boyd M, Beidas R, Lyon A, Chambers D, Aarons G, Mittman B: A research agenda for mechanistic dissemination and implementation research. In Conference on the Science of Dissemination and Implementation; Bethesda, MD. 2015.

Geng E, Peiris D, Kruk ME. Implementation science: relevance in the real world without sacrificing rigor. PLOS Med. 2017;14:e1002288.

Brookman-Frazee L, Stahmer AC. Effectiveness of a multi-level implementation strategy for ASD interventions: study protocol for two linked cluster randomized trials. Implement Sci. 2018;13:66.

Landsverk J, Brown CH, Chamberlain P, Palinkas L, Ogihara M, Czaja S, Goldhaber-Fiebert JD, Rolls Reutz J, McCue Horwitz S. Design and analysis in dissemination and implementation research. In: Brownson RC, Colditz GA, Proctor EK, editors. Dissemination and Implementation Research in Health: Translating Science to Practice. New York, NY: Oxford University Press; 2012.

Consolidated Framework for Implementation Research (CFIR) [ http://www.cfirguide.org/ ].

Reach Effectiveness Adoption Implementation Maintenance (RE-AIM) [ http://www.re-aim.org/ ].

Brookman-Frazee L, Chlebowski C, Suhrheinrich J, Finn N, Dickson KS, Aarons GA, Stahmer A. Characterizing shared and unique implementation influences in two community services systems for autism: applying the EPIS framework to two large-scale autism intervention community effectiveness trials. Adm Policy Ment Hlth. 2020;47(2):176–87.

Suhrheinrich J, et al. Exploring inner-context factors associated with implementation outcomes in a randomized trial of classroom pivotal response teaching. Under Review.

Helfrich CD, Damschroder LJ, Hagedorn HJ, Daggett GS, Sahay A, Ritchie M, Damush T, Guihan M, Ullrich PM, Stetler CB. A critical synthesis of literature on the promoting action on research implementation in health services (PARIHS) framework. Implement Sci. 2010;5:82.

Flottorp SA, Oxman AD, Krause J, Musila NR, Wensing M, Godycki-Cwirko M, Baker R, Eccles MP. A checklist for identifying determinants of practice: a systematic review and synthesis of frameworks and taxonomies of factors that prevent or enable improvements in healthcare professional practice. Implement Sci. 2013;8:35.

Aarons GA, Ehrhart MG, Moullin JC, Torres EM, Green AE. Testing the leadership and organizational change for implementation (LOCI) intervention in substance abuse treatment: a cluster randomized trial study protocol. Implement Sci. 2017;12:29.

Powell BJ, McMillen JC, Proctor EK, Carpenter CR, Griffey RT, Bunger AC, Glass JE, York JL. A compilation of strategies for implementing clinical innovations in health and mental health. Med Care Res Rev. 2012;69:123–57.

Ivers N, Jamtvedt G, Flottorp S, Young JM, Odgaard-Jensen J, French SD, O'Brien MA, Johansen M, Grimshaw J, Oxman AD. Audit and feedback: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2012.

Glisson C, Schoenwald S. The ARC organizational and community intervention strategy for implementing evidence-based children’s mental health treatments. Ment Health Serv Res. 2005;7:243–59.

Kilbourne AM, Neumann MS, Pincus HA, Bauer MS, Stall R. Implementing evidence-based interventions in health care: application of the replicating effective programs framework. Implement Sci. 2007;2:42.

Chinman M, Imm P, Wandersman A. Getting to outcomes™ 2004: promoting accountability through methods and tools for planning, implementation, and evaluation. Santa Monica: Rand Corporation; 2004.

Meyers DC, Durlak JA, Wandersman A. The quality implementation framework: a synthesis of critical steps in the implementation process. Am J Community Psychol. 2012;50:462–80.

Powell BJ, Beidas RS, Lewis CC, Aarons GA, McMillen JC, Proctor EK, Mandell DS. Methods to improve the selection and tailoring of implementation strategies. J Behav Health Serv Res. 2017;44:177–94.

Powell BJ, Fernandez ME, Williams NJ, Aarons GA, Beidas RS, Lewis CC, McHugh SM, Weiner BJ. Enhancing the impact of implementation strategies in healthcare: a research agenda. Front Public Health. 2019;7:3.

Powell BJ, Waltz TJ, Chinman MJ, Damschroder L, Smith JL, Matthieu MM, Proctor E, Kirchner JE. A refined compilation of implementation strategies: results from the expert recommendations for implementing change (ERIC) project. Implement Sci. 2015;10:21.

Abraham C, Michie S. A taxonomy of behavior change techniques used in interventions. Health Psychol. 2008;27:379–87.

Effective Practice and Organisation of Care (EPOC) Taxonomy [ epoc.cochrane.org/epoc-taxonomy ].

Kitson A, Harvey G, McCormack B. Enabling the implementation of evidence based practice: a conceptual framework. BMJ Qual Saf. 1998;7:149–58.

Willging CE, Green AE, Ramos MM. Implementing school nursing strategies to reduce LGBTQ adolescent suicide: a randomized cluster trial study protocol. Implement Sci. 2016;11:145.

Aarons GA, Green AE, Palinkas LA, Self-Brown S, Whitaker DJ, Lutzker JR, Silovsky JF, Hecht DB, Chaffin MJ. Dynamic adaptation process to implement an evidence-based child maltreatment intervention. Implement Sci. 2012;7:32.

Green L, Kreuter M. Health program planning: an educational and ecological approach. Boston: McGraw Hill; 2005.

Stetler CB, Legro MW, Wallace CM, Bowman C, Guihan M, Hagedorn H, Kimmel B, Sharp ND, Smith JL. The role of formative evaluation in implementation research and the QUERI experience. J Gen Intern Med. 2006;21:S1–8.

Moullin JC, Sabater-Hernandez D, Benrimoj SI. Model for the evaluation of implementation programs and professional pharmacy services. Res Social Adm Pharm. 2016;12:515–22.

Chamberlain P, Brown CH, Saldana L. Observational measure of implementation progress in community based settings: the stages of implementation completion (SIC). Implement Sci. 2011;6:116–23.

Lewis CC, Weiner BJ, Stanick C, Fischer SM. Advancing implementation science through measure development and evaluation: a study protocol. Implement Sci. 2015;10:102.

Rabin BA, Purcell P, Naveed S, Moser RP, Henton MD, Proctor EK, Brownson RC, Glasgow RE. Advancing the application, quality and harmonization of implementation science measures. Implement Sci. 2012;7:119.

Aarons GA, Ehrhart MG, Farahnak LR. The implementation leadership scale (ILS): development of a brief measure of unit level implementation leadership. Implement Sci. 2014;9:45.

Ehrhart MG, Aarons GA, Farahnak LR. Assessing the organizational context for EBP implementation: the development and validity testing of the Implementation Climate Scale (ICS). Implement Sci. 2014;9:157.

Weiner BJ, Belden CM, Bergmire DM, Johnston M. The meaning and measurement of implementation climate. Implement Sci. 2011;6:78.

Moullin JC, Ehrhart MG, Aarons GA. Development and testing of the Measure of Innovation-Specific Implementation Intentions (MISII) using Rasch measurement theory. Implement Sci. 2018;13:89.

Scheirer MA, Dearing JW. An agenda for research on the sustainability of public health programs. Am J Public Health. 2011;101:2059–67.

Mendel P, Meredith L, Schoenbaum M, Sherbourne C, Wells K. Interventions in organizational and community context: a framework for building evidence on dissemination and implementation in health services research. Adm Policy Ment Hlth. 2008;35:21–37.

Lehman WE, Simpson DD, Knight DK, Flynn PM. Integration of treatment innovation planning and implementation: strategic process models and organizational challenges. Psychol Addict Behav. 2011;25:252.

Knight DK, Belenko S, Wiley T, Robertson AA, Arrigona N, Dennis M, Wasserman GA. Juvenile Justice—Translational Research on Interventions for Adolescents in the Legal System (JJ-TRIALS): a cluster randomized trial targeting system-wide improvement in substance use services. Implement Sci. 2016;11:57.

Schein EH. Organizational culture. Am Psychol. 1990;45:109–19.

Pinnock H, Barwick M, Carpenter CR, Eldridge S, Grandes G, Griffiths CJ, Rycroft-Malone J, Meissner P, Murray E, Patel A, Sheikh A. Standards for reporting implementation studies (StaRI) statement. BMJ. 2017;356:i6795.

Stadnick NA, Brookman-Frazee L, Mandell DS, Kuelbs CL, Coleman KJ, Sahms T, Aarons GA. A mixed methods study to adapt and implement integrated mental healthcare for children with autism spectrum disorder. Pilot Feasibility Stud. 2019;5:51.

Acknowledgements

Dr. Aarons is core faculty, and Dr. Dickson, Dr. Stadnick, and Dr. Broder-Fingert are fellows, with the Implementation Research Institute (IRI) at the George Warren Brown School of Social Work, Washington University in St. Louis, through an award from the National Institute of Mental Health (5R25MH08091607).

Trial registration

Not applicable

Funding

This project was supported in part by US National Institute of Mental Health grants R03MH117493 (Aarons and Moullin), K23MH115100 (Dickson), K23MH110602 (Stadnick), and K23MH109673 (Broder-Fingert), and by National Institute on Drug Abuse grant R01DA038466 (Aarons). The opinions expressed herein are the views of the authors and do not necessarily reflect the official policy or position of the NIMH, NIDA, or any other part of the US Department of Health and Human Services.

Author information

Authors and Affiliations

Faculty of Health Sciences, School of Pharmacy and Biomedical Sciences, Curtin University, Kent Street, Bentley, Western Australia, 6102, Australia

Joanna C. Moullin

Child and Adolescent Services Research Center, 3665 Kearny Villa Rd., Suite 200N, San Diego, CA, 92123, USA

Joanna C. Moullin, Kelsey S. Dickson, Nicole A. Stadnick & Gregory A. Aarons

San Diego State University, 5500 Campanile Drive, San Diego, CA, 92182, USA

Kelsey S. Dickson

Department of Psychiatry, University of California San Diego, 9500 Gilman Drive (0812), La Jolla, CA, 92093-0812, USA

Nicole A. Stadnick & Gregory A. Aarons

UC San Diego Dissemination and Implementation Science Center, 9452 Medical Center Dr, La Jolla, CA, 92037, USA

European Implementation Collaborative, Odense, Denmark

Bianca Albers

School of Health Sciences, University of Melbourne, 161 Barry St, Carlton, VIC, 3053, Australia

Department of Health, Medicine and Caring Sciences, Linköping University, 58183, Linköping, Sweden

School of Medicine, Department of Pediatrics, Boston Medical Center and Boston University, 801 Albany Street, Boston, MA, 02114, USA

Sarabeth Broder-Fingert

Mildmay Uganda, 24985 Lweza, Entebbe Road, Kampala, Uganda

Barbara Mukasa

Contributions

GAA, KSD, NS, and JCM conceptualized the debate and drafted the manuscript. BA, PN, SBF, and BM provided expert opinion and guidance on the manuscript. All authors edited and approved the final manuscript.

Corresponding author

Correspondence to Joanna C. Moullin .

Ethics declarations

Ethics approval and consent to participate

Ethics approval was not required.

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1:

Table S1. Implementation Framework Application Worksheet.

Additional file 2:

Table S2. Implementation Framework Utilization Tool.

About this article

Cite this article

Moullin, J.C., Dickson, K.S., Stadnick, N.A. et al. Ten recommendations for using implementation frameworks in research and practice. Implement Sci Commun 1 , 42 (2020). https://doi.org/10.1186/s43058-020-00023-7

Received : 06 November 2019

Accepted : 26 February 2020

Published : 30 April 2020

DOI : https://doi.org/10.1186/s43058-020-00023-7

Program Evaluation Frameworks: Why Do They Matter?

November 4, 2020 by Betty Calinger

By: Meltem Alemdar, Ph.D., Associate Director, Principal Research Scientist, Center for Education Integrating Science, Mathematics, and Computing, Georgia Institute of Technology, and Christopher Cappelli, MPH, Senior Research Associate, Center for Education Integrating Science, Mathematics and Computing, Georgia Institute of Technology

The role of evaluation in National Science Foundation (NSF) projects has become critically important. Evaluation produces information that can be utilized to improve the project. Information on how different aspects of the project are working and the extent to which the goals and objectives are being met is essential to a continuous improvement process. Additionally, evaluation documents what has been achieved by the project.

Evaluators should always work closely with principal investigators (PIs) during the proposal stage to ensure that the evaluation aligns well with project goals; however, the degree to which this happens before and after funding is received depends on the PI's approach to collaboration and perspective on the value of evaluation. Some PIs perceive evaluation as a formality required for the proposal to get funded, and perhaps for accountability purposes. Others see it as the most important part of the proposal. Whether or not the PI perceives evaluation as a critical component of a project has profound ramifications for the quality of the evaluation, the clarity of the stated evaluation focus, the selection of methodology and design, and the utilization of evaluation results by the PI. Addressing this array of considerations requires a robust evaluation plan, which makes the choice of evaluation framework critical.

Evaluation frameworks provide guidance for program developers and evaluators to ensure that the evaluation's overall design reflects and incorporates the originating motivations, principles, and context of the program examined. While the term "evaluation framework" is very common in the discipline of program evaluation, it has had various interpretations. The recent article by Arbour (2020), "Frameworks for Program Evaluation: Considerations on Research, Practice, and Institutions," analyzes various approaches taken by different evaluation associations and organizations.

Arbour's article focuses specifically on program evaluation, rather than the more general domain of evaluation, and provides examples of how frameworks are defined within the field and by organizations and associations. For example, the Organization for Economic Co-operation and Development (OECD) (2014) created its Framework for Regulatory Policy Evaluation, an extensive guide to assist "countries in systematically evaluating the design and implementation of regulatory policy" (p. 13), whereas the United Nations Office for Disaster Risk Reduction (2015) refers to its Monitoring and Evaluation Framework as a way to "provide a consistent approach to the monitoring and evaluation" of its programs (p. 2). The paper also describes the well-known Chen (2001) and Cooksy (1999) frameworks, which focus mostly on program theories and logic models, and highlights the context-dependent dimensions of choosing an evaluation framework, such as the practice of program evaluators and the type of intervention and program evaluation functions. Arbour (2020) concludes by emphasizing that "a framework has an impact because someone decides to adopt, adapt, or develop that framework in a given evaluation context" (p. 13). In many cases this leads to locally developed logic models, evaluation plans, evaluation policies, and many other products associated with the term "evaluation frameworks." This important observation is also borne out in our experience as evaluators, where we have found that different fields of study or practice govern the choice and implementation of evaluation frameworks. For example, participatory evaluation (King, 2005) is most commonly used in community-based interventions, while the developmental evaluation framework (Patton, 2010) tends to be used for innovation, radical program re-design, and addressing complex issues and crises.

In Alemdar, Cappelli, Criswell, and Rushton (2018), we attempt to provide a template for evaluating teacher leadership training programs funded through the NSF Robert Noyce Teacher Scholarship (Noyce) program. The Noyce teacher leadership training programs are particularly challenging to evaluate for multiple reasons. First, program-specific characteristics may evolve over the years, which can make it difficult to develop an effective evaluation of the program. Noyce programs are also hard to evaluate because of the small number of individuals admitted into cohorts each year. Most evaluations focus on yearly data for primarily formative purposes, and the summative data usually focus on program-level data rather than teacher-level outcomes. To provide useful evaluation data and analysis to key stakeholders, teacher leadership professional development programs need to be evaluated longitudinally, using proven methodologies and frameworks that can account for the small sample sizes common in these programs. It takes years for a teacher to transform into a leader who moves her colleagues toward positive change; hence, it is important to capture teacher development longitudinally.

Some evaluation frameworks require substantial time commitments from the project PIs, management, and others involved in the project during every step of the evaluation. Considering the limited knowledge of evaluation methodologies that are useful for evaluating teacher leadership programs with small sample sizes, as well as the relationships that we have built with the PIs, we chose to use multiple complementary evaluation frameworks to determine the overall program impact on the development of teacher leadership skills.

One approach was a utilization-focused evaluation, described as "evaluation done for and with specific intended primary users for specific, intended uses" (Patton, 2008, p. 37). An essential component of utilization-focused evaluation is identifying the program stakeholders, or the primary intended users of the evaluation, and understanding their perspectives on the intended use of the evaluation. Patton (2008) describes the importance of the "personal factor" when identifying the intended users of the evaluation, defined as "the presence of an identifiable individual or group of people who personally care about the evaluation and the findings it generates" (p. 44). These people have a personal interest in the success of the program and can enhance their own ability, as consumers or decision-makers, to predict and guide the outcomes of the program. Through this framework, we built close relationships with both program leadership and participants, developing a high level of trust that proved to be a cornerstone for the success of this evaluation. By understanding the "personal factor" and its importance in utilization, we involved key stakeholders so as to better understand their perspectives regarding the intended uses of this evaluation. This approach ensured that, throughout the program period, evaluation data were presented in a way that placed utilization at the forefront.

Furthermore, teacher leadership training programs often incorporate theories for the development of leadership. In our early conversations, the PIs discussed multiple teacher leadership theories that guided their development of the program, such as Dempsey (1992) and Snell and Swanson (2000). Since these theories formed a theoretical foundation for the program, the theory-driven evaluation framework by Chen (1990) was also adopted. This framework is designed to use a validated theory to guide the evaluation.

While utilization-focused evaluation provides timely, useful information to the program leadership for decision making, theory-driven evaluation can be "…analytically and empirically powerful and lead to better evaluation questions, better evaluation answers and better programs" (Rogers, 2000, p. 209). Moreover, with theory-driven evaluation guiding the evaluation process, the evaluation can not only assess whether or not a program is working, but also illuminate why or how the program is having an impact on its participants (Chen, 2012). This is particularly important in the context of teacher leadership programs, so that the programs can build a theory-driven model that can be easily adapted by others. This should be the goal of any NSF-related project evaluation: to effectively assess the merit of the program. Given the complementary data provided through the use of these theoretical frameworks, the results of the evaluation were used extensively by the PIs to improve the program and achieve the program goals. For example, in the early stages of the program, the formative data showed that teachers were struggling to reflect on their teaching practice, which is an important domain for developing teacher leadership. The program addressed this challenge by including more professional development and discussion around this topic.

In our paper, we also showed how the program theory guided the development of interview and focus group protocols to longitudinally track the development of leadership through Snell and Swanson's four dimensions of teacher leadership: Empowerment, Expertise, Reflection, and Collaboration. Documenting the development of teacher leadership over time is particularly difficult with small sample sizes and limited evaluation resources. Using multiple frameworks substantially assisted the program in documenting its impact regarding change over time in the four dimensions of teacher leadership and, therefore, in the development of teacher leaders. Because of the collaborative nature of these evaluation frameworks, a conceptual framework was also constructed in collaboration with the PIs. From the perspective of a utilization-focused evaluation, involving key stakeholders in the development of a conceptual framework ensures a common understanding of the relationship between program components and the desired outcomes, resulting in agreement on the intended use of evaluation results. Similarly, from a theory-driven perspective, the development of a conceptual framework for the program systematically organizes stakeholders' perceptions of both the process that is expected to produce change and the activities needed to create the desired change as a result of participation in the program (Chen, 2012).

Implications

Choosing and implementing an evaluation framework(s) to better determine the merit of programs will vary by a program’s specific context. Based on our experiences, we developed several recommendations that evaluators and Noyce programs should consider when developing an evaluation plan:

Given the continuously evolving nature of the teacher leadership programs, the often small sample size, and the historic lack of literature offering a clear concept of teacher leadership, we, as evaluators, found that the concurrent use of both Utilization-Focused and Theory-Driven evaluation frameworks provided a firm foundation on which the evaluation could develop and evolve in tandem with the program. Further, the use of evaluation frameworks significantly improves documentation of the impact of the programs, which, in turn, facilitates replication of the program in new and different settings.

Alemdar, M., Cappelli, C., Criswell, B., & Rushton, G. (2018). Evaluation of a Noyce program: Development of teacher leaders in STEM education. Evaluation and Program Planning, 71, 1-11.

Arbour, G. (2020). Frameworks for program evaluation: Considerations on research, practice, and institutions. Evaluation, 26 (4).

Chen, H.T. (1990). Theory-driven evaluations. Newbury Park, CA: Sage.

Chen, H.T. (2001). Development of a national evaluation system to evaluate CDC-funded health department HIV prevention programs. American Journal of Evaluation, 22 (1), 55–70.

Chen, H. (2012). Theory-driven evaluation: Conceptual framework, application and advancement. In R. Strobl, O. Lobermeier, & W. Heitmeyer (eds) Evaluation von Programmen und Projekten für eine demokratische Kultur . Springer VS, Wiesbaden.

Coryn, C.L.S., Noakes, L.A., Westine, C.D., & Schröter, D.C. (2011). A systematic review of theory-driven evaluation practice from 1990 to 2009. American Journal of Evaluation, 32 (2), 199-226.

Cooksy, L.J. (1999). The meta-evaluand: The evaluation of project TEAMS. American Journal of Evaluation , 20, 123–36.

Dempsey R. (1992). Teachers as leaders: Towards a conceptual framework. Teaching Education 5 (1), 113–120.

King, J. A. (2005). Participatory evaluation. In S. Mathison (Ed.), Encyclopedia of evaluation (pp. 291-294). Thousand Oaks, CA: Sage.

OECD. (2014). OECD Framework for regulatory policy evaluation . Paris: OECD.

Patton, M.Q. (2008). Utilization-focused evaluation. Thousand Oaks, CA: Sage Publications, Inc.

Rogers, P.J. (2000). Program theory evaluation: Not whether programs work but how they work. In D.L. Stufflebeam, G.F. Madaus, & T. Kellaghan (Eds.), Evaluation models: Viewpoints on educational and human services evaluation (pp. 209–232). Boston, MA: Kluwer.

Snell, J., & Swanson, J. (2000, April). The essential knowledge and skills of teacher leaders: A search for a conceptual framework. Paper presented at the Annual Meeting of the American Educational Research Association, New Orleans, LA.

United Nations Office for Disaster Reduction. (2015). Monitoring and evaluation framework. Geneva: United Nations Office for Disaster Reduction.

Meltem Alemdar, Ph.D., Associate Director, Principal Research Scientist, Center for Education Integrating Science, Mathematics, and Computing, Georgia Institute of Technology, [email protected]

Dr. Meltem Alemdar is Associate Director and Principal Research Scientist at Georgia Institute of Technology’s Center for Education Integrating Science, Mathematics and Computing. Her research focuses on improving K-12 STEM education through research on curriculum development, teacher education, and student learning in integrated STEM environments. Her NSF-funded research projects have focused on project-based learning, STEM integration, engineering education, and social network analysis. Meltem has been an external evaluator for various NSF projects. As part of an NSF-funded project, she directs a longitudinal study that focuses on measuring an engineering curriculum’s impact on student learning and 21st century skills. Her expertise includes program evaluation, social network analysis and quantitative methods such as Hierarchical Linear Modeling and Structural Equation Modeling.

Christopher Cappelli, MPH, Senior Research Associate, Center for Education Integrating Science, Mathematics and Computing, Georgia Institute of Technology, [email protected]

Christopher Cappelli, MPH, a Senior Research Associate at Georgia Institute of Technology's Center for Education Integrating Science, Mathematics and Computing, is currently pursuing his Ph.D. in Research, Measurement, and Statistics at Georgia State University. His work centers on research and evaluation for education and public health programs, specifically on the use of innovative methods to design and conduct useful evaluations that provide program stakeholders with data-informed feedback to improve their programs and with information regarding overall program impact. He contributes to research projects that aim to extend knowledge around teacher professional development programs, teacher retention, and graduate student education. His methodological interests and expertise include survey development, survival analysis, social network analysis, and multilevel modeling.

This material is based upon work supported by the National Science Foundation (NSF) under Grant Numbers DUE- 2041597 and DUE-1548986. Any opinions, findings, interpretations, conclusions or recommendations expressed in this material are those of its authors and do not represent the views of the AAAS Board of Directors, the Council of AAAS, AAAS’ membership or the National Science Foundation.

Section 1. A Framework for Program Evaluation: A Gateway to Tools

This section is adapted from the article "Recommended Framework for Program Evaluation in Public Health Practice," by Bobby Milstein, Scott Wetterhall, and the CDC Evaluation Working Group.

Around the world, there exist many programs and interventions developed to improve conditions in local communities. Communities come together to reduce the level of violence that exists, to work for safe, affordable housing for everyone, or to help more students do well in school, to give just a few examples.

But how do we know whether these programs are working? If they are not effective, and even if they are, how can we improve them to make them better for local communities? And finally, how can an organization make intelligent choices about which promising programs are likely to work best in their community?

In recent years, there has been a growing trend toward better use of evaluation to understand and improve practice. The systematic use of evaluation has solved many problems and helped countless community-based organizations do what they do better.

Despite an increased understanding of the need for - and the use of - evaluation, however, a basic agreed-upon framework for program evaluation has been lacking. In 1997, scientists at the United States Centers for Disease Control and Prevention (CDC) recognized the need to develop such a framework. As a result of this, the CDC assembled an Evaluation Working Group comprised of experts in the fields of public health and evaluation. Members were asked to develop a framework that summarizes and organizes the basic elements of program evaluation. This Community Tool Box section describes the framework resulting from the Working Group's efforts.

Before we begin, however, we'd like to offer some definitions of terms that we will use throughout this section.

By evaluation, we mean the systematic investigation of the merit, worth, or significance of an object or effort. Evaluation practice has changed dramatically during the past three decades - new methods and approaches have been developed, and it is now used for increasingly diverse projects and audiences.

Throughout this section, the term program is used to describe the object or effort that is being evaluated. It may apply to any action with the goal of improving outcomes for whole communities, for more specific sectors (e.g., schools, work places), or for sub-groups (e.g., youth, people experiencing violence or HIV/AIDS). This definition is meant to be very broad.

Examples of different types of programs include:

  • Direct service interventions (e.g., a program that offers free breakfast to improve nutrition for grade school children)
  • Community mobilization efforts (e.g., organizing a boycott of California grapes to improve the economic well-being of farm workers)
  • Research initiatives (e.g., an effort to find out whether inequities in health outcomes based on race can be reduced)
  • Surveillance systems (e.g., whether early detection of school readiness improves educational outcomes)
  • Advocacy work (e.g., a campaign to influence the state legislature to pass legislation regarding tobacco control)
  • Social marketing campaigns (e.g., a campaign in low-income countries encouraging mothers to breast-feed their babies to reduce infant mortality)
  • Infrastructure building projects (e.g., a program to build the capacity of state agencies to support community development initiatives)
  • Training programs (e.g., a job training program to reduce unemployment in urban neighborhoods)
  • Administrative systems (e.g., an incentive program to improve efficiency of health services)

Program evaluation - the type of evaluation discussed in this section - is an essential organizational practice for all types of community health and development work. It is a way to evaluate the specific projects and activities community groups may take part in, rather than to evaluate an entire organization or comprehensive community initiative.

Stakeholders are those who care about the program or effort. These may include those presumed to benefit (e.g., children and their parents or guardians), those with particular influence (e.g., elected or appointed officials), and those who might support the effort (i.e., potential allies) or oppose it (i.e., potential opponents). Key questions in thinking about stakeholders are: Who cares? What do they care about?

This section presents a framework that promotes a common understanding of program evaluation. The overall goal is to make it easier for everyone involved in community health and development work to evaluate their efforts.

Why evaluate community health and development programs?

The type of evaluation we talk about in this section can be closely tied to everyday program operations. Our emphasis is on practical, ongoing evaluation that involves program staff, community members, and other stakeholders, not just evaluation experts. This type of evaluation offers many advantages for community health and development professionals.

For example, it complements program management by:

  • Helping to clarify program plans
  • Improving communication among partners
  • Gathering the feedback needed to improve and be accountable for program effectiveness

It's important to remember, too, that evaluation is not a new activity for those of us working to improve our communities. In fact, we assess the merit of our work all the time when we ask questions, consult partners, make assessments based on feedback, and then use those judgments to improve our work. When the stakes are low, this type of informal evaluation might be enough. However, when the stakes are raised - when a good deal of time or money is involved, or when many people may be affected - then it may make sense for your organization to use evaluation procedures that are more formal, visible, and justifiable.

How do you evaluate a specific program?

Before your organization starts with a program evaluation, your group should be very clear about the answers to the following questions:

  • What will be evaluated?
  • What criteria will be used to judge program performance?
  • What standards of performance on the criteria must be reached for the program to be considered successful?
  • What evidence will indicate performance on the criteria relative to the standards?
  • What conclusions about program performance are justified based on the available evidence?

To clarify the meaning of each, let's look at some of the answers for Drive Smart, a hypothetical program begun to stop drunk driving (a short illustrative sketch follows the list).

What will be evaluated?

  • Drive Smart, a program focused on reducing drunk driving through public education and intervention.

What criteria will be used to judge program performance?

  • The number of community residents who are familiar with the program and its goals
  • The number of people who use "Safe Rides" volunteer taxis to get home
  • The percentage of people who report drinking and driving
  • The reported number of single car night time crashes (a common way to try to determine whether the number of people who drive drunk is changing)

What standards of performance on the criteria must be reached for the program to be considered successful?

  • 80% of community residents will know about the program and its goals after the first year of the program
  • The number of people who use the "Safe Rides" taxis will increase by 20% in the first year
  • The percentage of people who report drinking and driving will decrease by 20% in the first year
  • The reported number of single car night time crashes will decrease by 10% in the program's first two years

What evidence will indicate performance on the criteria relative to the standards?

  • A random telephone survey will demonstrate community residents' knowledge of the program and changes in reported behavior
  • Logs from "Safe Rides" will show how many people use their services
  • Information on single car night time crashes will be gathered from police records

What conclusions about program performance are justified based on the available evidence?

  • Are the changes we have seen in the level of drunk driving due to our efforts, or to something else? Or (if there is no or insufficient change in behavior or outcomes):
  • Should Drive Smart change what it is doing, or have we just not waited long enough to see results?
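
To see how these pieces fit together, here is a minimal illustrative sketch in Python. It compares entirely hypothetical first-year results for the fictional Drive Smart program against the standards listed above; every figure is an invented assumption, not real data.

```python
# Illustrative sketch only: hypothetical first-year results for Drive Smart,
# compared against the standards listed above. All figures are invented.

# Each criterion maps to a standard, an observed value, and whether a higher
# value is better (awareness, Safe Rides use) or a reduction is wanted.
results = {
    "residents aware of program (%)":         {"standard": 80,  "observed": 72,  "higher_is_better": True},
    "change in Safe Rides use (%)":           {"standard": 20,  "observed": 28,  "higher_is_better": True},
    "change in reported drink-driving (%)":   {"standard": -20, "observed": -12, "higher_is_better": False},
    "change in single-car night crashes (%)": {"standard": -10, "observed": -9,  "higher_is_better": False},
}

for criterion, r in results.items():
    if r["higher_is_better"]:
        met = r["observed"] >= r["standard"]
    else:
        # For reductions, the observed change must be at least as negative as the standard.
        met = r["observed"] <= r["standard"]
    status = "standard met" if met else "standard not yet met"
    print(f"{criterion}: observed {r['observed']} vs standard {r['standard']} -> {status}")
```

Reaching the final question - what conclusions are justified? - still requires judgment about whether any shortfall reflects the program itself, outside influences, or simply too little time having passed.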

The following framework provides an organized approach to answer these questions.

A framework for program evaluation

Program evaluation offers a way to understand and improve community health and development practice using methods that are useful, feasible, proper, and accurate. The framework described below is a practical non-prescriptive tool that summarizes in a logical order the important elements of program evaluation.

The framework contains two related dimensions:

  • Steps in evaluation practice, and
  • Standards for "good" evaluation.

The six connected steps of the framework are actions that should be a part of any evaluation. Although in practice the steps may be encountered out of order, it will usually make sense to follow them in the recommended sequence. That's because earlier steps provide the foundation for subsequent progress. Thus, decisions about how to carry out a given step should not be finalized until prior steps have been thoroughly addressed.

However, these steps are meant to be adaptable, not rigid. Sensitivity to each program's unique context (for example, the program's history and organizational climate) is essential for sound evaluation. They are intended to serve as starting points around which community organizations can tailor an evaluation to best meet their needs.

  • Engage stakeholders
  • Describe the program
  • Focus the evaluation design
  • Gather credible evidence
  • Justify conclusions
  • Ensure use and share lessons learned

Understanding and adhering to these basic steps will improve most evaluation efforts.

The second part of the framework is a basic set of standards to assess the quality of evaluation activities. There are 30 specific standards, organized into the following four groups:

  • Utility
  • Feasibility
  • Propriety
  • Accuracy

These standards help answer the question, "Will this evaluation be a 'good' evaluation?" They are recommended as the initial criteria by which to judge the quality of the program evaluation efforts.

Engage Stakeholders

Stakeholders are people or organizations that have something to gain or lose from what will be learned from an evaluation, and also in what will be done with that knowledge. Evaluation cannot be done in isolation. Almost everything done in community health and development work involves partnerships - alliances among different organizations, board members, those affected by the problem, and others. Therefore, any serious effort to evaluate a program must consider the different values held by the partners. Stakeholders must be part of the evaluation to ensure that their unique perspectives are understood. When stakeholders are not appropriately involved, evaluation findings are likely to be ignored, criticized, or resisted.

However, if they are part of the process, people are likely to feel a good deal of ownership for the evaluation process and results. They will probably want to develop it, defend it, and make sure that the evaluation really works.

That's why this evaluation cycle begins by engaging stakeholders. Once involved, these people will help to carry out each of the steps that follows.

Three principal groups of stakeholders are important to involve:

  • People or organizations involved in program operations may include community members, sponsors, collaborators, coalition partners, funding officials, administrators, managers, and staff.
  • People or organizations served or affected by the program may include clients, family members, neighborhood organizations, academic institutions, elected and appointed officials, advocacy groups, and community residents. Individuals who are openly skeptical of or antagonistic toward the program may also be important to involve. Opening an evaluation to opposing perspectives and enlisting the help of potential program opponents can strengthen the evaluation's credibility.

Likewise, individuals or groups who could be adversely or inadvertently affected by changes arising from the evaluation have a right to be engaged. For example, it is important to include those who would be affected if program services were expanded, altered, limited, or ended as a result of the evaluation.

  • Primary intended users of the evaluation are the specific individuals who are in a position to decide and/or do something with the results. They shouldn't be confused with primary intended users of the program, although some of them should be involved in this group. In fact, primary intended users should be a subset of all of the stakeholders who have been identified. A successful evaluation will designate primary intended users, such as program staff and funders, early in its development and maintain frequent interaction with them to be sure that the evaluation specifically addresses their values and needs.

The amount and type of stakeholder involvement will be different for each program evaluation. For instance, stakeholders can be directly involved in designing and conducting the evaluation. They can be kept informed about progress of the evaluation through periodic meetings, reports, and other means of communication.

It may be helpful, when working with a group such as this, to develop an explicit process to share power and resolve conflicts. This may help avoid overemphasis of values held by any specific stakeholder.

Describe the Program

A program description is a summary of the intervention being evaluated. It should explain what the program is trying to accomplish and how it tries to bring about those changes. The description will also illustrate the program's core components and elements, its ability to make changes, its stage of development, and how the program fits into the larger organizational and community environment.

How a program is described sets the frame of reference for all future decisions about its evaluation. For example, if a program is described as, "attempting to strengthen enforcement of existing laws that discourage underage drinking," the evaluation might be very different than if it is described as, "a program to reduce drunk driving by teens." Also, the description allows members of the group to compare the program to other similar efforts, and it makes it easier to figure out what parts of the program brought about what effects.

Moreover, different stakeholders may have different ideas about what the program is supposed to achieve and why. For example, a program to reduce teen pregnancy may have some members who believe this means only increasing access to contraceptives, and other members who believe it means only focusing on abstinence.

Evaluations done without agreement on the program definition aren't likely to be very useful. In many cases, the process of working with stakeholders to develop a clear and logical program description will bring benefits long before data are available to measure program effectiveness.

There are several specific aspects that should be included when describing a program.

Statement of need

A statement of need describes the problem, goal, or opportunity that the program addresses; it also begins to imply what the program will do in response. Important features to note regarding a program's need are: the nature of the problem or goal, who is affected, how big it is, and whether (and how) it is changing.

Expectations

Expectations are the program's intended results. They describe what the program has to accomplish to be considered successful. For most programs, the accomplishments exist on a continuum (first, we want to accomplish X... then, we want to do Y...). Therefore, they should be organized by time, ranging from specific (and immediate) to broad (and longer-term) consequences. For example, a program's vision, mission, goals, and objectives all represent varying levels of specificity about a program's expectations.

Activities

Activities are everything the program does to bring about changes. Describing program components and elements permits specific strategies and actions to be listed in logical sequence. This also shows how different program activities, such as education and enforcement, relate to one another. Describing program activities also provides an opportunity to distinguish activities that are the direct responsibility of the program from those that are conducted by related programs or partner organizations. Things outside of the program that may affect its success, such as harsher laws punishing businesses that sell alcohol to minors, can also be noted.

Resources

Resources include the time, talent, equipment, information, money, and other assets available to conduct program activities. Reviewing the resources a program has tells a lot about the amount and intensity of its services. It may also point out situations where there is a mismatch between what the group wants to do and the resources available to carry out these activities. Understanding program costs is also necessary for weighing costs against benefits as part of the evaluation.

Stage of development

A program's stage of development reflects its maturity. All community health and development programs mature and change over time. People who conduct evaluations, as well as those who use their findings, need to consider the dynamic nature of programs. For example, a new program that just received its first grant may differ in many respects from one that has been running for over a decade.

At least three phases of development are commonly recognized: planning , implementation , and effects or outcomes . In the planning stage, program activities are untested and the goal of evaluation is to refine plans as much as possible. In the implementation phase, program activities are being field tested and modified; the goal of evaluation is to see what happens in the "real world" and to improve operations. In the effects stage, enough time has passed for the program's effects to emerge; the goal of evaluation is to identify and understand the program's results, including those that were unintentional.

Context

A description of the program's context considers the important features of the environment in which the program operates. This includes understanding the area's history, geography, politics, and social and economic conditions, and also what other organizations have done. A realistic and responsive evaluation is sensitive to a broad range of potential influences on the program. An understanding of the context lets users interpret findings accurately and assess their generalizability. For example, a program to improve housing in an inner-city neighborhood might have been a tremendous success, but would likely not work in a small town on the other side of the country without significant adaptation.

Logic model

A logic model synthesizes the main program elements into a picture of how the program is supposed to work. It makes explicit the sequence of events that are presumed to bring about change. Often this logic is displayed in a flow-chart, map, or table to portray the sequence of steps leading to program results.

Creating a logic model allows stakeholders to improve and focus program direction. It reveals assumptions about conditions for program effectiveness and provides a frame of reference for one or more evaluations of the program. A detailed logic model can also be a basis for estimating the program's effect on endpoints that are not directly measured. For example, it may be possible to estimate the rate of reduction in disease from a known number of persons experiencing the intervention if there is prior knowledge about its effectiveness.
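
As a rough illustration, a logic model can also be written down as a simple ordered structure linking resources and activities to expected results. The sketch below, in Python, uses a hypothetical underage-drinking prevention program; every element named is an invented example, not part of the framework itself.

```python
# A minimal sketch of a logic model as an ordered sequence of elements.
# The program and all of its elements are hypothetical examples.

logic_model = [
    ("inputs", ["grant funding", "trained volunteers", "partner agencies"]),
    ("activities", ["public education campaign", "retailer compliance checks"]),
    ("outputs", ["ads aired", "compliance checks completed"]),
    ("short-term outcomes", ["increased awareness", "fewer sales to minors"]),
    ("longer-term outcomes", ["reduced underage drinking", "fewer alcohol-related crashes"]),
]

# Reading the model from top to bottom makes the assumed chain of events
# explicit, and each stage suggests indicators that could be measured.
for stage, elements in logic_model:
    print(f"{stage}: {', '.join(elements)}")
```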

The breadth and depth of a program description will vary for each program evaluation. And so, many different activities may be part of developing that description. For instance, multiple sources of information could be pulled together to construct a well-rounded description. The accuracy of an existing program description could be confirmed through discussion with stakeholders. Descriptions of what's going on could be checked against direct observation of activities in the field. A narrow program description could be fleshed out by addressing contextual factors (such as staff turnover, inadequate resources, political pressures, or strong community participation) that may affect program performance.

Focus the Evaluation Design

By focusing the evaluation design, we mean doing advance planning about where the evaluation is headed, and what steps it will take to get there. It isn't possible or useful for an evaluation to try to answer all questions for all stakeholders; there must be a focus. A well-focused plan is a safeguard against using time and resources inefficiently.

Depending on what you want to learn, some types of evaluation will be better suited than others. However, once data collection begins, it may be difficult or impossible to change what you are doing, even if it becomes obvious that other methods would work better. A thorough plan anticipates intended uses and creates an evaluation strategy with the greatest chance to be useful, feasible, proper, and accurate.

Among the issues to consider when focusing an evaluation are:

Purpose refers to the general intent of the evaluation. A clear purpose serves as the basis for the design, methods, and use of the evaluation. Taking time to articulate an overall purpose will stop your organization from making uninformed decisions about how the evaluation should be conducted and used.

There are at least four general purposes for which a community group might conduct an evaluation:

  • To gain insight. This happens, for example, when deciding whether to use a new approach (e.g., would a neighborhood watch program work for our community?). Knowledge from such an evaluation will provide information about its practicality. For a developing program, information from evaluations of similar programs can provide the insight needed to clarify how its activities should be designed.
  • To improve how things get done. This is appropriate in the implementation stage when an established program tries to describe what it has done. This information can be used to describe program processes, to improve how the program operates, and to fine-tune the overall strategy. Evaluations done for this purpose include efforts to improve the quality, effectiveness, or efficiency of program activities.
  • To determine what the effects of the program are. Evaluations done for this purpose examine the relationship between program activities and observed consequences. For example, are more students finishing high school as a result of the program? Programs most appropriate for this type of evaluation are mature programs that are able to state clearly what happened and who it happened to. Such evaluations should provide evidence about what the program's contribution was to reaching longer-term goals such as a decrease in child abuse or crime in the area. This type of evaluation helps establish the accountability, and thus the credibility, of a program to funders and to the community.
  • To affect participants. Being part of an evaluation can itself have a positive influence on those who take part. For example, it may:
      • Empower program participants (for example, being part of an evaluation can increase community members' sense of control over the program);
      • Supplement the program (for example, using a follow-up questionnaire can reinforce the main messages of the program);
      • Promote staff development (for example, by teaching staff how to collect, analyze, and interpret evidence); or
      • Contribute to organizational growth (for example, the evaluation may clarify how the program relates to the organization's mission).

Users are the specific individuals who will receive evaluation findings. They will directly experience the consequences of inevitable trade-offs in the evaluation process. For example, a trade-off might be having a relatively modest evaluation to fit the budget with the outcome that the evaluation results will be less certain than they would be for a full-scale evaluation. Because they will be affected by these tradeoffs, intended users have a right to participate in choosing a focus for the evaluation. An evaluation designed without adequate user involvement in selecting the focus can become a misguided and irrelevant exercise. By contrast, when users are encouraged to clarify intended uses, priority questions, and preferred methods, the evaluation is more likely to focus on things that will inform (and influence) future actions.

Uses describe what will be done with what is learned from the evaluation. There is a wide range of potential uses for program evaluation. Generally speaking, the uses fall in the same four categories as the purposes listed above: to gain insight, improve how things get done, determine what the effects of the program are, and affect participants. The following list gives examples of uses in each category.

Some specific examples of evaluation uses

To gain insight:

  • Assess needs and wants of community members
  • Identify barriers to use of the program
  • Learn how to best describe and measure program activities

To improve how things get done:

  • Refine plans for introducing a new practice
  • Determine the extent to which plans were implemented
  • Improve educational materials
  • Enhance cultural competence
  • Verify that participants' rights are protected
  • Set priorities for staff training
  • Make mid-course adjustments
  • Clarify communication
  • Determine if client satisfaction can be improved
  • Compare costs to benefits
  • Find out which participants benefit most from the program
  • Mobilize community support for the program

To determine what the effects of the program are:

  • Assess skills development by program participants
  • Compare changes in behavior over time
  • Decide where to allocate new resources
  • Document the level of success in accomplishing objectives
  • Demonstrate that accountability requirements are fulfilled
  • Use information from multiple evaluations to predict the likely effects of similar programs

To affect participants:

  • Reinforce messages of the program
  • Stimulate dialogue and raise awareness about community issues
  • Broaden consensus among partners about program goals
  • Teach evaluation skills to staff and other stakeholders
  • Gather success stories
  • Support organizational change and improvement

The evaluation needs to answer specific questions . Drafting questions encourages stakeholders to reveal what they believe the evaluation should answer. That is, what questions are more important to stakeholders? The process of developing evaluation questions further refines the focus of the evaluation.

The methods available for an evaluation are drawn from behavioral science and social research and development. Three types of methods are commonly recognized: experimental, quasi-experimental, and observational or case study designs. Experimental designs use random assignment to compare the effect of an intervention between otherwise equivalent groups (for example, comparing a randomly assigned group of students who took part in an after-school reading program with those who didn't). Quasi-experimental methods make comparisons between groups that aren't equivalent (e.g., program participants vs. those on a waiting list) or use comparisons within a group over time, such as in an interrupted time series in which the intervention may be introduced sequentially across different individuals, groups, or contexts. Observational or case study methods use comparisons within a group to describe and explain what happens (e.g., comparative case studies with multiple communities).

No design is necessarily better than another. Evaluation methods should be selected because they provide the appropriate information to answer stakeholders' questions, not because they are familiar, easy, or popular. The choice of methods has implications for what will count as evidence, how that evidence will be gathered, and what kind of claims can be made. Because each method option has its own biases and limitations, evaluations that mix methods are generally more robust.
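
To make the choice among designs a little more concrete, the sketch below (Python, standard library only) compares outcomes for a hypothetical intervention group and comparison group - the kind of analysis an experimental or quasi-experimental design might support. The scores and group labels are invented for illustration.

```python
from statistics import mean

# Hypothetical reading scores: students in an after-school reading program
# (intervention) versus students on a waiting list (comparison).
intervention = [78, 85, 69, 90, 74, 88, 81]
comparison = [72, 70, 75, 68, 80, 66, 71]

difference = mean(intervention) - mean(comparison)
print(f"Intervention mean: {mean(intervention):.1f}")
print(f"Comparison mean:   {mean(comparison):.1f}")
print(f"Difference in means: {difference:.1f} points")

# With random assignment (an experimental design), this difference can more
# credibly be attributed to the program. With a waiting-list comparison (a
# quasi-experimental design), other explanations for the gap still need to be
# considered and, where possible, ruled out.
```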

Over the course of an evaluation, methods may need to be revised or modified. Circumstances that make a particular approach useful can change. For example, the intended use of the evaluation could shift from discovering how to improve the program to helping decide about whether the program should continue or not. Thus, methods may need to be adapted or redesigned to keep the evaluation on track.

Agreements summarize the evaluation procedures and clarify everyone's roles and responsibilities. An agreement describes how the evaluation activities will be implemented. Elements of an agreement include statements about the intended purpose, users, uses, and methods, as well as a summary of the deliverables, those responsible, a timeline, and budget.

The formality of the agreement depends upon the relationships that exist between those involved. For example, it may take the form of a legal contract, a detailed protocol, or a simple memorandum of understanding. Regardless of its formality, creating an explicit agreement provides an opportunity to verify the mutual understanding needed for a successful evaluation. It also provides a basis for modifying procedures if that turns out to be necessary.
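
One way to keep an agreement explicit, whatever its formality, is to record its elements in a simple structured form that everyone involved can review. The sketch below (Python) lists the elements named above for a hypothetical evaluation; all names, dates, and amounts are placeholders.

```python
# A minimal sketch of an evaluation agreement recorded as a structured document.
# All names, dates, and figures are placeholders for illustration.

agreement = {
    "purpose": "Improve delivery of the Drive Smart education activities",
    "primary intended users": ["program coordinator", "county health department"],
    "intended uses": ["refine outreach materials", "report to funder"],
    "methods": ["random telephone survey", "review of Safe Rides logs"],
    "deliverables": ["interim memo", "final report with recommendations"],
    "responsibilities": {"data collection": "program staff",
                         "analysis and reporting": "external evaluator"},
    "timeline": {"data collection ends": "2025-06-30", "final report due": "2025-09-30"},
    "budget (USD)": 12000,
}

for element, detail in agreement.items():
    print(f"{element}: {detail}")
```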

As you can see, focusing the evaluation design may involve many activities. For instance, both supporters and skeptics of the program could be consulted to ensure that the proposed evaluation questions are politically viable. A menu of potential evaluation uses appropriate for the program's stage of development could be circulated among stakeholders to determine which is most compelling. Interviews could be held with specific intended users to better understand their information needs and timeline for action. Resource requirements could be reduced when users are willing to employ more timely but less precise evaluation methods.

Gather Credible Evidence

Credible evidence is the raw material of a good evaluation. The information learned should be seen by stakeholders as believable, trustworthy, and relevant to answering their questions. This requires thinking broadly about what counts as "evidence." Such decisions are always situational; they depend on the question being posed and the motives for asking it. For some questions, a stakeholder's standard of credibility may demand the results of a randomized experiment. For another question, a set of well-done, systematic observations, such as interactions between an outreach worker and community residents, will have high credibility. The difference depends on what kind of information the stakeholders want and the situation in which it is gathered.

Context matters! In some situations, it may be necessary to consult evaluation specialists. This may be especially true if concern for data quality is especially high. In other circumstances, local people may offer the deepest insights. Regardless of their expertise, however, those involved in an evaluation should strive to collect information that will convey a credible, well-rounded picture of the program and its efforts.

Having credible evidence strengthens the evaluation results as well as the recommendations that follow from them. Although all types of data have limitations, it is possible to improve an evaluation's overall credibility. One way to do this is by using multiple procedures for gathering, analyzing, and interpreting data. Encouraging participation by stakeholders can also enhance perceived credibility. When stakeholders help define questions and gather data, they will be more likely to accept the evaluation's conclusions and to act on its recommendations.

The following features of evidence gathering typically affect how credible it is seen as being:

Indicators translate general concepts about the program and its expected effects into specific, measurable parts.

Examples of indicators include:

  • The program's capacity to deliver services
  • The participation rate
  • The level of client satisfaction
  • The amount of intervention exposure (how many people were exposed to the program, and for how long they were exposed)
  • Changes in participant behavior
  • Changes in community conditions or norms
  • Changes in the environment (e.g., new programs, policies, or practices)
  • Longer-term changes in population health status (e.g., estimated teen pregnancy rate in the county)

Indicators should address the criteria that will be used to judge the program. That is, they reflect the aspects of the program that are most meaningful to monitor. Several indicators are usually needed to track the implementation and effects of a complex program or intervention.

One way to develop multiple indicators is to create a "balanced scorecard," which contains indicators that are carefully selected to complement one another. According to this strategy, program processes and effects are viewed from multiple perspectives using small groups of related indicators. For instance, a balanced scorecard for a single program might include indicators of how the program is being delivered; what participants think of the program; what effects are observed; what goals were attained; and what changes are occurring in the environment around the program.
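
As an illustration of the balanced scorecard idea, the sketch below (Python) groups a handful of hypothetical indicators by perspective. The perspectives and indicators shown are examples chosen for this sketch, not a prescribed set.

```python
# A minimal sketch of a "balanced scorecard": small groups of related
# indicators that view the program from complementary perspectives.
# All indicators here are hypothetical examples.

scorecard = {
    "how the program is delivered": ["sessions held per month", "staff-to-participant ratio"],
    "what participants think": ["satisfaction rating", "would recommend to a friend (%)"],
    "what effects are observed": ["change in self-reported behavior", "skills demonstrated"],
    "what goals were attained": ["objectives met this quarter"],
    "changes in the environment": ["new local policies adopted", "partner organizations engaged"],
}

for perspective, indicators in scorecard.items():
    print(f"{perspective}:")
    for indicator in indicators:
        print(f"  - {indicator}")
```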

Another approach to using multiple indicators is based on a program logic model, such as we discussed earlier in the section. A logic model can be used as a template to define a full spectrum of indicators along the pathway that leads from program activities to expected effects. For each step in the model, qualitative and/or quantitative indicators could be developed.

Indicators can be broad-based and don't need to focus only on a program's long-term goals. They can also address intermediary factors that influence program effectiveness, including such intangible factors as service quality, community capacity, or inter-organizational relations. Indicators for these and similar concepts can be created by systematically identifying and then tracking markers of what is said or done when the concept is expressed.

In the course of an evaluation, indicators may need to be modified or new ones adopted. Also, measuring program performance by tracking indicators is only one part of evaluation, and shouldn't be confused as a basis for decision making in itself. There are definite perils to using performance indicators as a substitute for completing the evaluation process and reaching fully justified conclusions. For example, an indicator, such as a rising rate of unemployment, may be falsely assumed to reflect a failing program when it may actually be due to changing environmental conditions that are beyond the program's control.

Sources of evidence in an evaluation may be people, documents, or observations. More than one source may be used to gather evidence for each indicator. In fact, selecting multiple sources provides an opportunity to include different perspectives about the program and enhances the evaluation's credibility. For instance, an inside perspective may be reflected by internal documents and comments from staff or program managers; whereas clients and those who do not support the program may provide different, but equally relevant perspectives. Mixing these and other perspectives provides a more comprehensive view of the program or intervention.

The criteria used to select sources should be clearly stated so that users and other stakeholders can interpret the evidence accurately and assess if it may be biased. In addition, some sources provide information in narrative form (for example, a person's experience when taking part in the program) and others are numerical (for example, how many people were involved in the program). The integration of qualitative and quantitative information can yield evidence that is more complete and more useful, thus meeting the needs and expectations of a wider range of stakeholders.

Quality refers to the appropriateness and integrity of information gathered in an evaluation. High quality data are reliable and informative; they are easier to collect when the indicators have been well defined. Other factors that affect quality include instrument design, data collection procedures, training of those involved in data collection, source selection, coding, data management, and routine error checking. Obtaining quality data will entail trade-offs (e.g., breadth vs. depth); stakeholders should decide together what is most important to them. Because all data have limitations, the intent of a practical evaluation is to strive for a level of quality that meets the stakeholders' threshold for credibility.

Quantity refers to the amount of evidence gathered in an evaluation. It is necessary to estimate in advance the amount of information that will be required and to establish criteria to decide when to stop collecting data - to know when enough is enough. Quantity affects the level of confidence or precision users can have - how sure we are that what we've learned is true. It also partly determines whether the evaluation will be able to detect effects. All evidence collected should have a clear, anticipated use.
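
To give a feel for how quantity affects precision, the sketch below (Python) applies the standard margin-of-error formula for a proportion from a simple random sample, roughly 1.96 * sqrt(p*(1-p)/n) at 95% confidence. The survey sizes and the 50% planning proportion are illustrative assumptions.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes for the Drive Smart telephone survey.
for n in (100, 400, 1000):
    print(f"n = {n:4d}: margin of error is about +/- {margin_of_error(n) * 100:.1f} percentage points")

# Quadrupling the sample roughly halves the margin of error, which is why it
# helps to agree in advance how much precision stakeholders actually need
# before deciding how much evidence is "enough".
```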

By logistics , we mean the methods, timing, and physical infrastructure for gathering and handling evidence. People and organizations also have cultural preferences that dictate acceptable ways of asking questions and collecting information, including who would be perceived as an appropriate person to ask the questions. For example, some participants may be unwilling to discuss their behavior with a stranger, whereas others are more at ease with someone they don't know. Therefore, the techniques for gathering evidence in an evaluation must be in keeping with the cultural norms of the community. Data collection procedures should also ensure that confidentiality is protected.

Justify Conclusions

The process of justifying conclusions recognizes that evidence in an evaluation does not necessarily speak for itself. Evidence must be carefully considered from a number of different stakeholders' perspectives to reach conclusions that are well-substantiated and justified. Conclusions become justified when they are linked to the evidence gathered and judged against agreed-upon values set by the stakeholders. Stakeholders must agree that conclusions are justified in order to use the evaluation results with confidence.

The principal elements involved in justifying conclusions based on evidence are:

Standards reflect the values held by stakeholders about the program. They provide the basis to make program judgments. The use of explicit standards for judgment is fundamental to sound evaluation. In practice, when stakeholders articulate and negotiate their values, these become the standards to judge whether a given program's performance will, for instance, be considered "successful," "adequate," or "unsuccessful."

Analysis and synthesis

Analysis and synthesis are methods to discover and summarize an evaluation's findings. They are designed to detect patterns in evidence, either by isolating important findings (analysis) or by combining different sources of information to reach a larger understanding (synthesis). Mixed method evaluations require the separate analysis of each evidence element, as well as a synthesis of all sources to examine patterns that emerge. Deciphering facts from a given body of evidence involves deciding how to organize, classify, compare, and display information. These decisions are guided by the questions being asked, the types of data available, and especially by input from stakeholders and primary intended users.
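
As a small illustration of analysis and synthesis in a mixed-method evaluation, the sketch below (Python) first examines a hypothetical quantitative indicator and a set of coded interview themes separately, then draws them together into one summary. All of the data are invented.

```python
from collections import Counter
from statistics import mean

# Hypothetical quantitative evidence: client satisfaction scores (1-5)
# collected before and after a change in how the program is run.
before = [3, 2, 4, 3, 3, 2]
after = [4, 4, 5, 3, 4, 4]

# Hypothetical qualitative evidence: themes coded from follow-up interviews.
themes = ["shorter waits", "friendlier staff", "shorter waits",
          "confusing paperwork", "shorter waits", "friendlier staff"]

# Analysis: examine each body of evidence on its own terms.
score_change = mean(after) - mean(before)
theme_counts = Counter(themes)

# Synthesis: combine the separate findings into a larger picture.
print(f"Mean satisfaction changed by {score_change:+.1f} points.")
print("Most frequently coded themes:", theme_counts.most_common(2))
print("Read together, the numeric gain and the recurring 'shorter waits' theme "
      "point to the scheduling change as a plausible driver of the improvement.")
```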

Interpretation

Interpretation is the effort to figure out what the findings mean. Uncovering facts about a program's performance isn't enough to draw conclusions. The facts must be interpreted to understand their practical significance. For example, the finding "15% of the people in our area witnessed a violent act last year" may be interpreted differently depending on the situation. If 50% of community members surveyed five years ago reported witnessing a violent act, the group can suggest that, while still a problem, things are getting better in the community. However, if five years ago only 7% of those surveyed said the same thing, community organizations may see this as a sign that they might want to change what they are doing. In short, interpretations draw on information and perspectives that stakeholders bring to the evaluation. They can be strengthened through active participation or interaction with the data and preliminary explanations of what happened.

Judgments are statements about the merit, worth, or significance of the program. They are formed by comparing the findings and their interpretations against one or more selected standards. Because multiple standards can be applied to a given program, stakeholders may reach different or even conflicting judgments. For instance, a program that increases its outreach by 10% from the previous year may be judged positively by program managers, based on standards of improved performance over time. Community members, however, may feel that despite improvements, a minimum threshold of access to services has still not been reached. Their judgment, based on standards of social equity, would therefore be negative. Conflicting claims about a program's quality, value, or importance often indicate that stakeholders are using different standards or values in making judgments. This type of disagreement can be a catalyst to clarify values and to negotiate the appropriate basis (or bases) on which the program should be judged.

Recommendations

Recommendations are actions to consider as a result of the evaluation. Forming recommendations requires information beyond just what is necessary to form judgments. For example, knowing that a program is able to increase the services available to battered women doesn't necessarily translate into a recommendation to continue the effort, particularly when there are competing priorities or other effective alternatives. Thus, recommendations about what to do with a given intervention go beyond judgments about a specific program's effectiveness.

If recommendations aren't supported by enough evidence, or if they aren't in keeping with stakeholders' values, they can really undermine an evaluation's credibility. By contrast, an evaluation can be strengthened by recommendations that anticipate and react to what users will want to know.

Three things might increase the chances that recommendations will be relevant and well-received:

  • Sharing draft recommendations
  • Soliciting reactions from multiple stakeholders
  • Presenting options instead of directive advice

Justifying conclusions in an evaluation is a process that involves different possible steps. For instance, conclusions could be strengthened by searching for alternative explanations from the ones you have chosen, and then showing why they are unsupported by the evidence. When there are different but equally well supported conclusions, each could be presented with a summary of their strengths and weaknesses. Techniques to analyze, synthesize, and interpret findings might be agreed upon before data collection begins.

Ensure Use and Share Lessons Learned

It is naive to assume that lessons learned in an evaluation will necessarily be used in decision making and subsequent action. Deliberate effort on the part of evaluators is needed to ensure that the evaluation findings will be used appropriately. Preparing for their use involves strategic thinking and continued vigilance in looking for opportunities to communicate and influence. Both of these should begin in the earliest stages of the process and continue throughout the evaluation.

The key elements for making sure that the recommendations from an evaluation are used are:

Design refers to how the evaluation's questions, methods, and overall processes are constructed. As discussed in the third step of this framework (focusing the evaluation design), the evaluation should be organized from the start to achieve specific agreed-upon uses. Having a clear purpose that is focused on the use of what is learned helps those who will carry out the evaluation to know who will do what with the findings. Furthermore, the process of creating a clear design will highlight ways that stakeholders, through their many contributions, can improve the evaluation and facilitate the use of the results.

Preparation

Preparation refers to the steps taken to get ready for the future uses of the evaluation findings. The ability to translate new knowledge into appropriate action is a skill that can be strengthened through practice. In fact, building this skill can itself be a useful benefit of the evaluation. It is possible to prepare stakeholders for future use of the results by discussing how potential findings might affect decision making.

For example, primary intended users and other stakeholders could be given a set of hypothetical results and asked what decisions or actions they would make on the basis of this new knowledge. If they indicate that the evidence presented is incomplete or irrelevant and that no action would be taken, then this is an early warning sign that the planned evaluation should be modified. Preparing for use also gives stakeholders more time to explore both positive and negative implications of potential results and to identify different options for program improvement.

Feedback is the communication that occurs among everyone involved in the evaluation. Giving and receiving feedback creates an atmosphere of trust among stakeholders; it keeps an evaluation on track by keeping everyone informed about how the evaluation is proceeding. Primary intended users and other stakeholders have a right to comment on evaluation decisions. From a standpoint of ensuring use, stakeholder feedback is a necessary part of every step in the evaluation. Obtaining valuable feedback can be encouraged by holding discussions during each step of the evaluation and routinely sharing interim findings, provisional interpretations, and draft reports.

Follow-up refers to the support that many users need during the evaluation and after they receive evaluation findings. Because of the amount of effort required, reaching justified conclusions in an evaluation can seem like an end in itself. It is not. Active follow-up may be necessary to remind users of the intended uses of what has been learned. Follow-up may also be required to stop lessons learned from becoming lost or ignored in the process of making complex or political decisions. To guard against such oversight, it may be helpful to have someone involved in the evaluation serve as an advocate for the evaluation's findings during the decision-making phase.

Facilitating the use of evaluation findings also carries with it the responsibility to prevent misuse. Evaluation results are always bounded by the context in which the evaluation was conducted. Some stakeholders, however, may be tempted to take results out of context or to use them for different purposes than what they were developed for. For instance, over-generalizing the results from a single case study to make decisions that affect all sites in a national program is an example of misuse of a case study evaluation.

Similarly, program opponents may misuse results by overemphasizing negative findings without giving proper credit for what has worked. Active follow-up can help to prevent these and other forms of misuse by ensuring that evidence is only applied to the questions that were the central focus of the evaluation.

Dissemination

Dissemination is the process of communicating the procedures or the lessons learned from an evaluation to relevant audiences in a timely, unbiased, and consistent fashion. Like other elements of the evaluation, the reporting strategy should be discussed in advance with intended users and other stakeholders. Planning effective communications also requires considering the timing, style, tone, message source, vehicle, and format of information products. Regardless of how communications are constructed, the goal for dissemination is to achieve full disclosure and impartial reporting.

Along with the uses for evaluation findings, there are also uses that flow from the very process of evaluating. These "process uses" should be encouraged. The people who take part in an evaluation can experience profound changes in beliefs and behavior. For instance, an evaluation challenges staff members to think differently about what they are doing and to question the assumptions that connect program activities with intended effects.

Evaluation also prompts staff to clarify their understanding of the goals of the program. This greater clarity, in turn, helps staff members to better function as a team focused on a common end. In short, immersion in the logic, reasoning, and values of evaluation can have very positive effects, such as basing decisions on systematic judgments instead of on unfounded assumptions.

Additional process uses for evaluation include:

  • By defining indicators, what really matters to stakeholders becomes clear
  • It helps make outcomes matter by changing the reinforcements connected with achieving positive results. For example, a funder might offer "bonus grants" or "outcome dividends" to a program that has shown a significant amount of community change and improvement.

Standards for "good" evaluation

There are standards to assess whether all of the parts of an evaluation are well-designed and working to their greatest potential. The Joint Committee on Standards for Educational Evaluation developed "The Program Evaluation Standards" for this purpose. These standards, designed to assess evaluations of educational programs, are also relevant for programs and interventions related to community health and development.

The program evaluation standards make it practical to conduct sound and fair evaluations. They offer well-supported principles to follow when faced with having to make tradeoffs or compromises. Attending to the standards can guard against an imbalanced evaluation, such as one that is accurate and feasible, but isn't very useful or sensitive to the context. Another example of an imbalanced evaluation is one that would be genuinely useful, but is impossible to carry out.

The following standards can be applied while developing an evaluation design and throughout the course of its implementation. Remember, the standards are written as guiding principles, not as rigid rules to be followed in all situations.

The 30 more specific standards are grouped into four categories: utility, feasibility, propriety, and accuracy.

Utility Standards

The utility standards ensure that the evaluation serves the information needs of its intended users.

The utility standards are:

  • Stakeholder Identification : People who are involved in (or will be affected by) the evaluation should be identified, so that their needs can be addressed.
  • Evaluator Credibility : The people conducting the evaluation should be both trustworthy and competent, so that the evaluation will be generally accepted as credible or believable.
  • Information Scope and Selection : Information collected should address pertinent questions about the program, and it should be responsive to the needs and interests of clients and other specified stakeholders.
  • Values Identification: The perspectives, procedures, and rationale used to interpret the findings should be carefully described, so that the bases for judgments about merit and value are clear.
  • Report Clarity: Evaluation reports should clearly describe the program being evaluated, including its context, and the purposes, procedures, and findings of the evaluation. This will help ensure that essential information is provided and easily understood.
  • Report Timeliness and Dissemination: Significant midcourse findings and evaluation reports should be shared with intended users so that they can be used in a timely fashion.
  • Evaluation Impact: Evaluations should be planned, conducted, and reported in ways that encourage follow-through by stakeholders, so that the evaluation will be used.

Feasibility Standards

The feasibility standards ensure that the evaluation makes sense - that the steps that are planned are both viable and pragmatic.

The feasibility standards are:

  • Practical Procedures: The evaluation procedures should be practical, to keep disruption of everyday activities to a minimum while needed information is obtained.
  • Political Viability : The evaluation should be planned and conducted with anticipation of the different positions or interests of various groups. This should help in obtaining their cooperation so that possible attempts by these groups to curtail evaluation operations or to misuse the results can be avoided or counteracted.
  • Cost Effectiveness: The evaluation should be efficient and produce enough valuable information that the resources used can be justified.

Propriety Standards

The propriety standards ensure that the evaluation is an ethical one, conducted with regard for the rights and interests of those involved. The eight propriety standards follow.

  • Service Orientation : Evaluations should be designed to help organizations effectively serve the needs of all of the targeted participants.
  • Formal Agreements : The responsibilities in an evaluation (what is to be done, how, by whom, when) should be agreed to in writing, so that those involved are obligated to follow all conditions of the agreement, or to formally renegotiate it.
  • Rights of Human Subjects : Evaluation should be designed and conducted to respect and protect the rights and welfare of human subjects, that is, all participants in the study.
  • Human Interactions : Evaluators should respect basic human dignity and worth when working with other people in an evaluation, so that participants don't feel threatened or harmed.
  • Complete and Fair Assessment : The evaluation should be complete and fair in its examination, recording both strengths and weaknesses of the program being evaluated. This allows strengths to be built upon and problem areas addressed.
  • Disclosure of Findings : The people working on the evaluation should ensure that all of the evaluation findings, along with the limitations of the evaluation, are accessible to everyone affected by the evaluation, and any others with expressed legal rights to receive the results.
  • Conflict of Interest: Conflict of interest should be dealt with openly and honestly, so that it does not compromise the evaluation processes and results.
  • Fiscal Responsibility : The evaluator's use of resources should reflect sound accountability procedures and otherwise be prudent and ethically responsible, so that expenditures are accounted for and appropriate.

Accuracy Standards

The accuracy standards ensure that the evaluation findings are considered correct.

There are 12 accuracy standards:

  • Program Documentation: The program should be described and documented clearly and accurately, so that what is being evaluated is clearly identified.
  • Context Analysis: The context in which the program exists should be thoroughly examined so that likely influences on the program can be identified.
  • Described Purposes and Procedures: The purposes and procedures of the evaluation should be monitored and described in enough detail that they can be identified and assessed.
  • Defensible Information Sources: The sources of information used in a program evaluation should be described in enough detail that the adequacy of the information can be assessed.
  • Valid Information: The information gathering procedures should be chosen or developed and then implemented in such a way that they will assure that the interpretation arrived at is valid.
  • Reliable Information : The information gathering procedures should be chosen or developed and then implemented so that they will assure that the information obtained is sufficiently reliable.
  • Systematic Information: The information from an evaluation should be systematically reviewed and any errors found should be corrected.
  • Analysis of Quantitative Information: Quantitative information - data from observations or surveys - in an evaluation should be appropriately and systematically analyzed so that evaluation questions are effectively answered.
  • Analysis of Qualitative Information: Qualitative information - descriptive information from interviews and other sources - in an evaluation should be appropriately and systematically analyzed so that evaluation questions are effectively answered.
  • Justified Conclusions: The conclusions reached in an evaluation should be explicitly justified, so that stakeholders can understand their worth.
  • Impartial Reporting: Reporting procedures should guard against the distortion caused by personal feelings and biases of people involved in the evaluation, so that evaluation reports fairly reflect the evaluation findings.
  • Metaevaluation: The evaluation itself should be evaluated against these and other pertinent standards, so that it is appropriately guided and, on completion, stakeholders can closely examine its strengths and weaknesses.
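
One practical way to apply these standards is as a simple checklist, used while planning the evaluation and again at metaevaluation. The sketch below (Python) shows a pared-down checklist with only a few standards per group filled in; the review flags are hypothetical.

```python
# A pared-down sketch of using the program evaluation standards as a checklist.
# Only a few of the 30 standards are shown, and the review flags are hypothetical.

checklist = {
    "utility": {"Stakeholder Identification": True, "Report Timeliness and Dissemination": False},
    "feasibility": {"Practical Procedures": True, "Cost Effectiveness": True},
    "propriety": {"Formal Agreements": True, "Disclosure of Findings": False},
    "accuracy": {"Defensible Information Sources": True, "Justified Conclusions": True},
}

for group, standards in checklist.items():
    unmet = [name for name, satisfied in standards.items() if not satisfied]
    if unmet:
        print(f"{group}: revisit {', '.join(unmet)}")
    else:
        print(f"{group}: no gaps identified among the standards reviewed")
```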

Applying the framework: Conducting optimal evaluations

There is ever-increasing agreement on the worth of evaluation; in fact, evaluation is often required by funders and other constituents. So community health and development professionals can no longer question whether or not to evaluate their programs. Instead, the appropriate questions are:

  • What is the best way to evaluate?
  • What are we learning from the evaluation?
  • How will we use what we learn to become more effective?

The framework for program evaluation helps answer these questions by guiding users to select evaluation strategies that are useful, feasible, proper, and accurate.

To use this framework requires quite a bit of skill in program evaluation. In most cases there are multiple stakeholders to consider, the political context may be divisive, steps don't always follow a logical order, and limited resources may make it difficult to take a preferred course of action. An evaluator's challenge is to devise an optimal strategy given the conditions they are working under. An optimal strategy is one that accomplishes each step in the framework in a way that takes into account the program context and meets or exceeds the relevant standards.

This framework also makes it possible to respond to common concerns about program evaluation. For instance, many evaluations are not undertaken because they are seen as being too expensive. The cost of an evaluation, however, is relative; it depends upon the question being asked and the level of certainty desired for the answer. A simple, low-cost evaluation can deliver information valuable for understanding and improvement.

Rather than discounting evaluation as a time-consuming sideline, the framework encourages evaluations that are timed strategically to provide necessary feedback. This makes it possible to link evaluation closely with everyday practice.

Another concern centers on the perceived technical demands of designing and conducting an evaluation. However, the practical approach endorsed by this framework focuses on questions that can improve the program.

Finally, the prospect of evaluation troubles many staff members because they perceive evaluation methods as punishing ("They just want to show what we're doing wrong."), exclusionary ("Why aren't we part of it? We're the ones who know what's going on."), and adversarial ("It's us against them.") The framework instead encourages an evaluation approach that is designed to be helpful and engages all interested stakeholders in a process that welcomes their participation.

Evaluation is a powerful strategy for distinguishing programs and interventions that make a difference from those that don't. It is a driving force for developing and adapting sound strategies, improving existing programs, and demonstrating the results of investments in time and other resources. It also helps determine if what is being done is worth the cost.

This recommended framework for program evaluation is both a synthesis of existing best practices and a set of standards for further improvement. It supports a practical approach to evaluation based on steps and standards that can be applied in almost any setting. Because the framework is purposefully general, it provides a stable guide for designing and conducting a wide range of evaluation efforts in a variety of specific program areas. The framework can be used as a template to create useful evaluation plans that contribute to understanding and improvement. For further detail on what good evaluation requires, and for some straightforward steps that make a good evaluation of an intervention more feasible, see The Magenta Book - Guidance for Evaluation.

Online Resources

Are You Ready to Evaluate your Coalition? poses 15 questions to help your group decide whether your coalition is ready to evaluate itself and its work.

The  American Evaluation Association Guiding Principles for Evaluators  helps guide evaluators in their professional practice.

CDC Evaluation Resources  provides a list of resources for evaluation, as well as links to professional associations and journals.

Chapter 11: Community Interventions in the "Introduction to Community Psychology" explains professionally-led versus grassroots interventions, what it means for a community intervention to be effective, why a community needs to be ready for an intervention, and the steps to implementing community interventions.

The Comprehensive Cancer Control Branch Program Evaluation Toolkit is designed to help grantees plan and implement evaluations of their NCCCP-funded programs. It provides general guidance on evaluation principles and techniques, as well as practical templates and tools.

Developing an Effective Evaluation Plan  is a workbook provided by the CDC. In addition to information on designing an evaluation plan, this book also provides worksheets as a step-by-step guide.

EvaluACTION, from the CDC, is designed for people interested in learning about program evaluation and how to apply it to their work. Evaluation is a process, one dependent on what you’re currently doing and on the direction in which you’d like to go. In addition to providing helpful information, the site also features an interactive Evaluation Plan & Logic Model Builder, so you can create customized tools for your organization to use.

Evaluating Your Community-Based Program  is a handbook designed by the American Academy of Pediatrics covering a variety of topics related to evaluation.

GAO Designing Evaluations  is a handbook provided by the U.S. Government Accountability Office with copious information regarding program evaluations.

The CDC's Introduction to Program Evaluation for Public Health Programs: A Self-Study Guide is a "how-to" guide for planning and implementing evaluation activities. The manual, based on CDC’s Framework for Program Evaluation in Public Health, is intended to assist with planning, designing, implementing, and using comprehensive evaluations in a practical way.

McCormick Foundation Evaluation Guide  is a guide to planning an organization’s evaluation, with several chapters dedicated to gathering information and using it to improve the organization.

A Participatory Model for Evaluating Social Programs from the James Irvine Foundation.

Practical Evaluation for Public Managers  is a guide to evaluation written by the U.S. Department of Health and Human Services.

Penn State Program Evaluation  offers information on collecting different forms of data and how to measure different community markers.

Program Evaluation information page from Implementation Matters.

The Program Manager's Guide to Evaluation  is a handbook provided by the Administration for Children and Families with detailed answers to nine big questions regarding program evaluation.

Program Planning and Evaluation  is a website created by the University of Arizona. It provides links to information on several topics including methods, funding, types of evaluation, and reporting impacts.

User-Friendly Handbook for Program Evaluation  is a guide to evaluations provided by the National Science Foundation.  This guide includes practical information on quantitative and qualitative methodologies in evaluations.

W.K. Kellogg Foundation Evaluation Handbook  provides a framework for thinking about evaluation as a relevant and useful program tool. It was originally written for program directors with direct responsibility for the ongoing evaluation of the W.K. Kellogg Foundation.

Print Resources

This Community Tool Box section is an edited version of:

CDC Evaluation Working Group. (1999). (Draft). Recommended framework for program evaluation in public health practice . Atlanta, GA: Author.

The article cites the following references:

Adler. M., &  Ziglio, E. (1996). Gazing into the oracle: the delphi method and its application to social policy and community health and development. London: Jessica Kingsley Publishers.

Barrett, F.   Program Evaluation: A Step-by-Step Guide.  Sunnycrest Press, 2013. This practical manual includes helpful tips to develop evaluations, tables illustrating evaluation approaches, evaluation planning and reporting templates, and resources if you want more information.

Basch, C., Silepcevich, E., Gold, R., Duncan, D., & Kolbe, L. (1985).   Avoiding type III errors in health education program evaluation: a case study . Health Education Quarterly. 12(4):315-31.

Bickman L, & Rog, D. (1998). Handbook of applied social research methods. Thousand Oaks, CA: Sage Publications.

Boruch, R.  (1998).  Randomized controlled experiments for evaluation and planning. In Handbook of applied social research methods, edited by Bickman L., & Rog. D. Thousand Oaks, CA: Sage Publications: 161-92.

Centers for Disease Control and Prevention DoHAP. Evaluating CDC HIV prevention programs: guidance and data system . Atlanta, GA: Centers for Disease Control and Prevention, Division of HIV/AIDS Prevention, 1999.

Centers for Disease Control and Prevention. Guidelines for evaluating surveillance systems. Morbidity and Mortality Weekly Report 1988;37(S-5):1-18.

Centers for Disease Control and Prevention. Handbook for evaluating HIV education . Atlanta, GA: Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, Division of Adolescent and School Health, 1995.

Cook, T., & Campbell, D. (1979). Quasi-experimentation . Chicago, IL: Rand McNally.

Cook, T.,& Reichardt, C. (1979).  Qualitative and quantitative methods in evaluation research . Beverly Hills, CA: Sage Publications.

Cousins, J.,& Whitmore, E. (1998).   Framing participatory evaluation. In Understanding and practicing participatory evaluation , vol. 80, edited by E Whitmore. San Francisco, CA: Jossey-Bass: 5-24.

Chen, H. (1990).  Theory driven evaluations . Newbury Park, CA: Sage Publications.

de Vries, H., Weijts, W., Dijkstra, M., & Kok, G. (1992).  The utilization of qualitative and quantitative data for health education program planning, implementation, and evaluation: a spiral approach . Health Education Quarterly.1992; 19(1):101-15.

Dyal, W. (1995).  Ten organizational practices of community health and development: a historical perspective . American Journal of Preventive Medicine;11(6):6-8.

Eddy, D. (1998). Performance measurement: problems and solutions. Health Affairs;17(4):7-25.

Harvard Family Research Project. (1998). Performance measurement. The Evaluation Exchange, vol. 4, pp. 1-15.

Eoyang, G., & Berkas, T. (1996). Evaluation in a complex adaptive system.

Taylor-Powell, E., Steele, S., & Douglah, M. Planning a program evaluation. Madison, WI: University of Wisconsin Cooperative Extension.

Fawcett, S.B., Paine-Andrews, A., Francisco, V.T., Schultz, J.A., Richter, K.P., Berkley-Patton, J., Fisher, J., Lewis, R.K., Lopez, C.M., Russos, S., Williams, E.L., Harris, K.J., & Evensen, P. (2001). Evaluating community initiatives for health and development. In I. Rootman, D. McQueen, et al. (Eds.), Evaluating health promotion approaches (pp. 241-277). Copenhagen, Denmark: World Health Organization - Europe.

Fawcett, S., Sterling, T., Paine-Andrews, A., Harris, K., Francisco, V., et al. (1996). Evaluating community efforts to prevent cardiovascular diseases. Atlanta, GA: Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion.

Fetterman, D., Kaftarian, S., & Wandersman, A. (1996). Empowerment evaluation: knowledge and tools for self-assessment and accountability. Thousand Oaks, CA: Sage Publications.

Frechtling, J.,& Sharp, L. (1997).  User-friendly handbook for mixed method evaluations . Washington, DC: National Science Foundation.

Goodman, R., Speers, M., McLeroy, K., Fawcett, S., Kegler M., et al. (1998).  Identifying and defining the dimensions of community capacity to provide a basis for measurement . Health Education and Behavior;25(3):258-78.

Greene, J.  (1994). Qualitative program evaluation: practice and promise . In Handbook of Qualitative Research, edited by NK Denzin and YS Lincoln. Thousand Oaks, CA: Sage Publications.

Haddix, A., Teutsch, S., Shaffer, P., & Dunet, D. (1996). Prevention effectiveness: a guide to decision analysis and economic evaluation. New York, NY: Oxford University Press.

Hennessy, M. (1998). Evaluation. In Statistics in community health and development, edited by Stroup, D., & Teutsch, S. New York, NY: Oxford University Press: 193-219.

Henry, G. (1998). Graphing data. In Handbook of applied social research methods , edited by Bickman. L., & Rog.  D.. Thousand Oaks, CA: Sage Publications: 527-56.

Henry, G. (1998).  Practical sampling. In Handbook of applied social research methods , edited by  Bickman. L., & Rog. D.. Thousand Oaks, CA: Sage Publications: 101-26.

Institute of Medicine. Improving health in the community: a role for performance monitoring . Washington, DC: National Academy Press, 1997.

Joint Committee on Standards for Educational Evaluation, James R. Sanders (Chair). The program evaluation standards: how to assess evaluations of educational programs. Thousand Oaks, CA: Sage Publications, 1994.

Kaplan, R., & Norton, D. The balanced scorecard: measures that drive performance. Harvard Business Review 1992;Jan-Feb:71-9.

Kar, S. (1989). Health promotion indicators and actions . New York, NY: Springer Publications.

Knauft, E. (1993). What independent sector learned from an evaluation of its own hard-to-measure programs. In A vision of evaluation, edited by ST Gray. Washington, DC: Independent Sector.

Koplan, J. (1999)  CDC sets millennium priorities . US Medicine 4-7.

Lipsey, M. (1998). Design sensitivity: statistical power for applied experimental research. In Handbook of applied social research methods, edited by Bickman, L., & Rog, D. Thousand Oaks, CA: Sage Publications: 39-68.

Lipsey, M. (1993). Theory as method: small theories of treatments . New Directions for Program Evaluation;(57):5-38.

Lipsey, M. (1997).  What can you build with thousands of bricks? Musings on the cumulation of knowledge in program evaluation . New Directions for Evaluation; (76): 7-23.

Love, A.  (1991).  Internal evaluation: building organizations from within . Newbury Park, CA: Sage Publications.

Miles, M., & Huberman, A. (1994).  Qualitative data analysis: a sourcebook of methods . Thousand Oaks, CA: Sage Publications, Inc.

National Quality Program. (1999). National Quality Program. National Institute of Standards and Technology.

National Quality Program. (1999). Baldrige index outperforms S&P 500 for fifth year.

National Quality Program. (1998). Health care criteria for performance excellence.

Newcomer, K. Using statistics appropriately. In Handbook of practical program evaluation, edited by Wholey, J., Hatry, H., & Newcomer, K. San Francisco, CA: Jossey-Bass, 1994: 389-416.

Patton, M. (1990).  Qualitative evaluation and research methods . Newbury Park, CA: Sage Publications.

Patton, M (1997).  Toward distinguishing empowerment evaluation and placing it in a larger context . Evaluation Practice;18(2):147-63.

Patton, M. (1997).  Utilization-focused evaluation . Thousand Oaks, CA: Sage Publications.

Perrin, B. Effective use and misuse of performance measurement . American Journal of Evaluation 1998;19(3):367-79.

Perrin, E, Koshel J. (1997).  Assessment of performance measures for community health and development, substance abuse, and mental health . Washington, DC: National Academy Press.

Phillips, J. (1997).  Handbook of training evaluation and measurement methods . Houston, TX: Gulf Publishing Company.

Porteous, N., Sheldrick, B., & Stewart, P. (1997). Program evaluation tool kit: a blueprint for community health and development management. Ottawa, Canada: Community Health and Development Research, Education, and Development Program, Ottawa-Carleton Health Department.

Posavac, E., & Carey R. (1980).  Program evaluation: methods and case studies . Prentice-Hall, Englewood Cliffs, NJ.

Preskill, H. & Torres R. (1998).  Evaluative inquiry for learning in organizations . Thousand Oaks, CA: Sage Publications.

Public Health Functions Project. (1996). The public health workforce: an agenda for the 21st century . Washington, DC: U.S. Department of Health and Human Services, Community health and development Service.

Public Health Training Network. (1998).  Practical evaluation of public health programs . CDC, Atlanta, GA.

Reichardt, C., & Mark M. (1998).  Quasi-experimentation . In Handbook of applied social research methods, edited by L Bickman and DJ Rog. Thousand Oaks, CA: Sage Publications, 193-228.

Rossi, P., & Freeman H.  (1993).  Evaluation: a systematic approach . Newbury Park, CA: Sage Publications.

Rush, B., & Ogbourne, A. (1995). Program logic models: expanding their role and structure for program planning and evaluation. Canadian Journal of Program Evaluation;6:95-106.

Sanders, J. (1993).  Uses of evaluation as a means toward organizational effectiveness. In A vision of evaluation , edited by ST Gray. Washington, DC: Independent Sector.

Schorr, L. (1997).   Common purpose: strengthening families and neighborhoods to rebuild America . New York, NY: Anchor Books, Doubleday.

Scriven, M. (1998). A minimalist theory of evaluation: the least theory that practice requires. American Journal of Evaluation;19:57-70.

Shadish, W., Cook, T., Leviton, L. (1991).  Foundations of program evaluation . Newbury Park, CA: Sage Publications.

Shadish, W. (1998). Evaluation theory is who we are. American Journal of Evaluation;19(1):1-19.

Shulha, L., & Cousins, J. (1997).  Evaluation use: theory, research, and practice since 1986 . Evaluation Practice.18(3):195-208

Sieber, J. (1998).   Planning ethically responsible research . In Handbook of applied social research methods, edited by L Bickman and DJ Rog. Thousand Oaks, CA: Sage Publications: 127-56.

Steckler, A., McLeroy, K., Goodman, R., Bird, S., & McCormick, L. (1992). Toward integrating qualitative and quantitative methods: an introduction. Health Education Quarterly;19(1):1-8.

Taylor-Powell, E., Rossing, B., Geran, J. (1998). Evaluating collaboratives: reaching the potential. Madison, Wisconsin: University of Wisconsin Cooperative Extension.

Teutsch, S. A framework for assessing the effectiveness of disease and injury prevention. Morbidity and Mortality Weekly Report: Recommendations and Reports 1992;41(RR-3):1-13.

Torres, R., Preskill, H., Piontek, M., (1996).   Evaluation strategies for communicating and reporting: enhancing learning in organizations . Thousand Oaks, CA: Sage Publications.

Trochim, W. (1999). Research methods knowledge base.

United Way of America. Measuring program outcomes: a practical approach . Alexandria, VA: United Way of America, 1996.

U.S. General Accounting Office. Case study evaluations . GAO/PEMD-91-10.1.9. Washington, DC: U.S. General Accounting Office, 1990.

U.S. General Accounting Office. Designing evaluations . GAO/PEMD-10.1.4. Washington, DC: U.S. General Accounting Office, 1991.

U.S. General Accounting Office. Managing for results: measuring program results that are under limited federal control . GAO/GGD-99-16. Washington, DC: 1998.

U.S. General Accounting Office. Prospective evaluation methods: the prospective evaluation synthesis. GAO/PEMD-10.1.10. Washington, DC: U.S. General Accounting Office, 1990.

U.S. General Accounting Office. The evaluation synthesis . Washington, DC: U.S. General Accounting Office, 1992.

U.S. General Accounting Office. Using statistical sampling . Washington, DC: U.S. General Accounting Office, 1992.

Wandersman, A., Morrissey, E., Davino, K., Seybolt, D., Crusto, C., et al. Comprehensive quality programming and accountability: eight essential strategies for implementing successful prevention programs . Journal of Primary Prevention 1998;19(1):3-30.

Weiss, C. (1995). Nothing as practical as a good theory: exploring theory-based evaluation for comprehensive community initiatives for families and children. In New Approaches to Evaluating Community Initiatives, edited by Connell, J., Kubisch, A., Schorr, L., & Weiss, C. New York, NY: Aspen Institute.

Weiss, C. (1998).  Have we learned anything new about the use of evaluation? American Journal of Evaluation;19(1):21-33.

Weiss, C. (1997).  How can theory-based evaluation make greater headway? Evaluation Review 1997;21(4):501-24.

W.K. Kellogg Foundation. (1998). The W.K. Kellogg Foundation Evaluation Handbook. Battle Creek, MI: W.K. Kellogg Foundation.

Wong-Reiger, D.,& David, L. (1995).  Using program logic models to plan and evaluate education and prevention programs. In Evaluation Methods Sourcebook II, edited by Love. A.J. Ottawa, Ontario: Canadian Evaluation Society.

Wholey, J., Hatry, H., & Newcomer, K. (2010). Handbook of Practical Program Evaluation. Jossey-Bass. This book serves as a comprehensive guide to the evaluation process and its practical applications for sponsors, program managers, and evaluators.

Yarbrough, D.B., Shulha, L.M., Hopson, R.K., & Caruthers, F.A. (2011). The program evaluation standards: a guide for evaluators and evaluation users (3rd ed.). Sage Publications.

Yin, R. (1988).  Case study research: design and methods . Newbury Park, CA: Sage Publications.

Introduction to Program Evaluation for Public Health Programs: A Self-Study Guide

Introduction

  • What Is Program Evaluation?
  • Evaluation Supplements Other Types of Reflection and Data Collection
  • Distinguishing Principles of Research and Evaluation
  • Why Evaluate Public Health Programs?
  • CDC’s Framework for Program Evaluation in Public Health
  • How to Establish an Evaluation Team and Select a Lead Evaluator
  • Organization of This Manual

Most program managers assess the value and impact of their work all the time when they ask questions, consult partners, make assessments, and obtain feedback. They then use the information collected to improve the program. Indeed, such informal assessments fit nicely into a broad definition of evaluation as the “examination of the worth, merit, or significance of an object.” [4] Throughout this manual, the term “program” will be defined as “any set of organized activities supported by a set of resources to achieve a specific and intended result.” This definition is intentionally broad so that almost any organized public health action can be seen as a candidate for program evaluation:

  • Direct service interventions (e.g., a program that offers free breakfasts to improve nutrition for grade school children)
  • Community mobilization efforts (e.g., an effort to organize a boycott of California grapes to improve the economic well-being of farm workers)
  • Research initiatives (e.g., an effort to find out whether disparities in health outcomes based on race can be reduced)
  • Advocacy work (e.g., a campaign to influence the state legislature to pass legislation regarding tobacco control)
  • Training programs (e.g., a job training program to reduce unemployment in urban neighborhoods)

What distinguishes program evaluation from ongoing informal assessment is that program evaluation is conducted according to a set of guidelines. With that in mind, this manual defines program evaluation as “the systematic collection of information about the activities, characteristics, and outcomes of programs to make judgments about the program, improve program effectiveness, and/or inform decisions about future program development.” [5] Program evaluation does not occur in a vacuum; rather, it is influenced by real-world constraints. Evaluation should be practical and feasible and conducted within the confines of resources, time, and political context. Moreover, it should serve a useful purpose, be conducted in an ethical manner, and produce accurate findings. Evaluation findings should be used both to make decisions about program implementation and to improve program effectiveness.

Many different questions can be part of a program evaluation, depending on how long the program has been in existence, who is asking the question, and why the information is needed.

In general, evaluation questions fall into these groups:

  • Implementation: Were your program’s activities put into place as originally intended?
  • Effectiveness: Is your program achieving the goals and objectives it was intended to accomplish?
  • Efficiency: Are your program’s activities being produced with appropriate use of resources such as budget and staff time?
  • Cost-Effectiveness: Does the value or benefit of achieving your program’s goals and objectives exceed the cost of producing them?
  • Attribution: Can progress on goals and objectives be shown to be related to your program, as opposed to other things that are going on at the same time?

All of these are appropriate evaluation questions and might be asked with the intention of documenting program progress, demonstrating accountability to funders and policymakers, or identifying ways to make the program better.
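To make the efficiency and cost-effectiveness questions concrete, the short sketch below works through the arithmetic for a hypothetical screening program. Every figure (budget, number screened, cases managed, and the monetised value per case) is invented for illustration and is not drawn from any real program.

```python
# Hypothetical worked example: turning the "efficiency" and "cost-effectiveness"
# questions into simple arithmetic. All figures are invented for illustration.

program_cost = 250_000.00          # annual budget (hypothetical)
children_screened = 4_000          # output: screenings delivered
cases_managed = 80                 # outcome: cases identified and case-managed
value_per_case_managed = 4_500.00  # assumed monetised benefit per case (hypothetical)

cost_per_screening = program_cost / children_screened
cost_per_case_managed = program_cost / cases_managed
net_benefit = cases_managed * value_per_case_managed - program_cost

print(f"Cost per screening:    ${cost_per_screening:,.2f}")
print(f"Cost per case managed: ${cost_per_case_managed:,.2f}")
print(f"Net benefit (if the value assumption holds): ${net_benefit:,.2f}")
```

The point is not the particular numbers but that the cost-effectiveness question ultimately reduces to comparing a cost per outcome against the value placed on that outcome.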

Planning asks, “What are we doing and what should we do to achieve our goals?” By providing information on progress toward organizational goals and identifying which parts of the program are working well and/or poorly, program evaluation sets up the discussion of what can be changed to help the program better meet its intended goals and objectives.

Increasingly, public health programs are accountable to funders, legislators, and the general public. Many programs do this by creating, monitoring, and reporting results for a small set of markers and milestones of program progress. Such “performance measures” are a type of evaluation—answering the question “How are we doing?” More importantly, when performance measures show significant or sudden changes in program performance, program evaluation efforts can be directed to the troubled areas to determine “Why are we doing poorly or well?”

Linking program performance to program budget is the final step in accountability. Called “activity-based budgeting” or “performance budgeting,” it requires an understanding of program components and the links between activities and intended outcomes. The early steps in the program evaluation approach (such as logic modeling) clarify these relationships, making the link between budget and performance easier and more apparent.
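As a rough illustration of how a logic model makes performance budgeting easier, the sketch below links each activity to its budgeted cost and its intended outcome so that spending can be rolled up by outcome. The program, activities, outcomes, and dollar figures are all hypothetical.

```python
# A minimal sketch of the "activity-based budgeting" idea: a logic model is
# represented as activities linked to intended outcomes, so spending can be
# rolled up by outcome. Activities and amounts are hypothetical.

from collections import defaultdict

logic_model = [
    # (activity, budgeted cost, intended outcome)
    ("Provider newsletter",        30_000, "Increased provider knowledge"),
    ("Regional training sessions", 90_000, "Increased provider knowledge"),
    ("Reminder/recall system",     60_000, "Higher immunization coverage"),
    ("Clinic-level feedback",      40_000, "Higher immunization coverage"),
]

spend_by_outcome = defaultdict(int)
for activity, cost, outcome in logic_model:
    spend_by_outcome[outcome] += cost

for outcome, spend in spend_by_outcome.items():
    print(f"{outcome}: ${spend:,} budgeted")
```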

While the terms surveillance and evaluation are often used interchangeably, each makes a distinctive contribution to a program, and it is important to clarify their different purposes. Surveillance is the continuous monitoring or routine data collection on various factors (e.g., behaviors, attitudes, deaths) over a regular interval of time. Surveillance systems have existing resources and infrastructure. Data gathered by surveillance systems are invaluable for performance measurement and program evaluation, especially of longer term and population-based outcomes. In addition, these data serve an important function in program planning and “formative” evaluation by identifying key burden and risk factors—the descriptive and analytic epidemiology of the public health problem. There are limits, however, to how useful surveillance data can be for evaluators. For example, some surveillance systems such as the Behavioral Risk Factor Surveillance System (BRFSS), Youth Tobacco Survey (YTS), and Youth Risk Behavior Survey (YRBS) can measure changes in large populations, but have insufficient sample sizes to detect changes in outcomes for more targeted programs or interventions. Also, these surveillance systems may have limited flexibility to add questions for a particular program evaluation.
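The sample-size limitation can be illustrated with a back-of-the-envelope calculation. The sketch below approximates the smallest change in a proportion that two equal samples of size n could reliably detect (roughly 5% significance and 80% power); the baseline prevalence and sample sizes are hypothetical, and a real evaluation would use a proper power analysis.

```python
# Back-of-the-envelope illustration (not a full power analysis) of why a
# surveillance sample can be too small to evaluate a targeted program:
# the minimum detectable change in a proportion shrinks only as n grows.
# Baseline prevalence and sample sizes are hypothetical.

from math import sqrt

def min_detectable_change(p, n, z_alpha=1.96, z_beta=0.84):
    """Approximate detectable difference between two independent samples of
    size n each, with baseline proportion p, at ~5% significance, 80% power."""
    return (z_alpha + z_beta) * sqrt(2 * p * (1 - p) / n)

baseline = 0.20  # e.g., 20% prevalence of a risk behavior (hypothetical)
for n in (5_000, 500, 150):  # statewide sample vs. program-area subsamples
    print(f"n={n:>5}: detectable change ≈ {min_detectable_change(baseline, n):.3f}")
```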

In the best of all worlds, surveillance and evaluation are companion processes that can be conducted simultaneously. Evaluation may supplement surveillance data by providing tailored information to answer specific questions about a program. Data from specific questions for an evaluation are more flexible than surveillance and may allow program areas to be assessed in greater depth. For example, a state may supplement surveillance information with detailed surveys to evaluate how well a program was implemented and the impact of a program on participants’ knowledge, attitudes, and behavior. Evaluators can also use qualitative methods (e.g., focus groups, semi-structured or open-ended interviews) to gain insight into the strengths and weaknesses of a particular program activity.

Both research and program evaluation make important contributions to the body of knowledge, but fundamental differences in the purpose of research and the purpose of evaluation mean that good program evaluation need not always follow an academic research model. Even though some of these differences have tended to break down as research tends toward increasingly participatory models [6] and some evaluations aspire to make statements about attribution, “pure” research and evaluation serve somewhat different purposes (see the “Distinguishing Principles of Research and Evaluation” table below), nicely summarized in the adage “Research seeks to prove; evaluation seeks to improve.” Academic research focuses primarily on testing hypotheses; a key purpose of program evaluation is to improve practice. Research is generally thought of as requiring a controlled environment or control groups. In field settings directed at prevention and control of a public health problem, this is seldom realistic. Of the concepts contrasted in the table, the last three are especially worth noting. Unlike pure academic research models, program evaluation acknowledges and incorporates differences in values and perspectives from the start, may address many questions besides attribution, and tends to produce results for varied audiences.

Distinguishing Principles of Research and Evaluation

Planning

  • Research (scientific method): state hypothesis; collect data; analyze data; draw conclusions.
  • Program evaluation (framework for program evaluation): engage stakeholders; describe the program; focus the evaluation design; gather credible evidence; justify conclusions; ensure use and share lessons learned.

Decision making

  • Research: investigator-controlled; authoritative.
  • Program evaluation: stakeholder-controlled; collaborative.

Standards

  • Research (validity): internal (accuracy, precision); external (generalizability); repeatability.
  • Program evaluation: the program evaluation standards (utility, feasibility, propriety, accuracy).

Questions

  • Research: descriptions; associations.
  • Program evaluation: merit (i.e., quality); worth (i.e., value); significance (i.e., importance).

Design

  • Research (isolate changes and control circumstances): narrow experimental influences; ensure stability over time; minimize context dependence; treat contextual factors as confounding (e.g., randomization, adjustment, statistical control); comparison groups are a necessity.
  • Program evaluation (incorporate changes and account for circumstances): expand to see all domains of influence; encourage flexibility and improvement; maximize context sensitivity; treat contextual factors as essential information (e.g., system diagrams, logic models, hierarchical or ecological modeling); comparison groups are optional (and sometimes harmful).

Data collection

  • Research: limited number of sources (accuracy preferred); sampling strategies are critical; concern for protecting human subjects.
  • Program evaluation (indicators/measures): multiple sources (triangulation preferred); mixed methods (qualitative, quantitative, and integrated); concern for protecting human subjects, organizations, and communities.

Analysis and synthesis

  • Research: one-time (at the end); focus on specific variables; attempt to remain value-free.
  • Program evaluation: ongoing (formative and summative); integrate all data; examine agreement on values; state precisely whose values are used.

Conclusions

  • Research (attribution): establish time sequence; demonstrate plausible mechanisms; control for confounding; replicate findings.
  • Program evaluation (attribution and contribution): account for alternative explanations; show similar effects in similar contexts.

Uses

  • Research (disseminate to interested audiences): content and format vary to maximize comprehension.
  • Program evaluation (feedback to stakeholders): focus on intended uses by intended users; build capacity; emphasis on full disclosure; requirement for balanced assessment.

Why Evaluate Public Health Programs?

Among the reasons to evaluate public health programs:

  • To monitor progress toward the program’s goals
  • To determine whether program components are producing the desired progress on outcomes
  • To permit comparisons among groups, particularly among populations with disproportionately high risk factors and adverse health outcomes
  • To justify the need for further funding and support
  • To find opportunities for continuous quality improvement
  • To ensure that effective programs are maintained and resources are not wasted on ineffective programs

Program staff may be pushed to do evaluation by external mandates from funders, authorizers, or others, or they may be pulled to do evaluation by an internal need to determine how the program is performing and what can be improved. While push or pull can motivate a program to conduct good evaluations, program evaluation efforts are more likely to be sustained when staff see the results as useful information that can help them do their jobs better.

Data gathered during evaluation enable managers and staff to create the best possible programs, to learn from mistakes, to make modifications as needed, to monitor progress toward program goals, and to judge the success of the program in achieving its short-term, intermediate, and long-term outcomes. Most public health programs aim to change behavior in one or more target groups and to create an environment that reinforces sustained adoption of these changes, with the intention that changes in environments and behaviors will prevent and control diseases and injuries. Through evaluation, you can track these changes and, with careful evaluation designs, assess the effectiveness and impact of a particular program, intervention, or strategy in producing these changes.

Recognizing the importance of evaluation in public health practice and the need for appropriate methods, the World Health Organization (WHO) established the Working Group on Health Promotion Evaluation. The Working Group prepared a set of conclusions and related recommendations to guide policymakers and practitioners. [7] Recommendations immediately relevant to the evaluation of comprehensive public health programs include:

  • Encourage the adoption of participatory evaluation approaches that provide meaningful opportunities for involvement by all of those with a direct interest in initiatives (programs, policies, and other organized activities).
  • Require that a portion of total financial resources for a health promotion initiative be allocated to evaluation—they recommend 10%.
  • Ensure that a mixture of process and outcome information is used to evaluate all health promotion initiatives.
  • Support the use of multiple methods to evaluate health promotion initiatives.
  • Support further research into the development of appropriate approaches to evaluating health promotion initiatives.
  • Support the establishment of a training and education infrastructure to develop expertise in the evaluation of health promotion initiatives.
  • Create and support opportunities for sharing information on evaluation methods used in health promotion through conferences, workshops, networks, and other means.

The figure presents the steps and standards of the CDC Evaluation Framework. The six steps are: (1) engage stakeholders; (2) describe the program; (3) focus the evaluation design; (4) gather credible evidence; (5) justify conclusions; and (6) ensure use and share lessons learned.

Program evaluation is one of ten essential public health services [8] and a critical organizational practice in public health. [9] Until recently, however, there has been little agreement among public health officials on the principles and procedures for conducting such studies. In 1999, CDC published Framework for Program Evaluation in Public Health and some related recommendations. [10] The Framework, as depicted in Figure 1.1, defined six steps and four sets of standards for conducting good evaluations of public health programs.

The underlying logic of the Evaluation Framework is that good evaluation does not merely gather accurate evidence and draw valid conclusions, but produces results that are used to make a difference. To maximize the chances evaluation results will be used, you need to create a “market” before you create the “product”—the evaluation. You determine the market by focusing evaluations on questions that are most salient, relevant, and important. You ensure the best evaluation focus by understanding where the questions fit into the full landscape of your program description, and especially by ensuring that you have identified and engaged stakeholders who care about these questions and want to take action on the results.

The steps in the CDC Framework are informed by a set of standards for evaluation. [11] These standards do not constitute a way to do evaluation; rather, they serve to guide your choice from among the many options available at each step in the Framework. The 30 standards cluster into four groups:

Utility: Who needs the evaluation results? Will the evaluation provide relevant information in a timely manner for them?

Feasibility: Are the planned evaluation activities realistic given the time, resources, and expertise at hand?

Propriety: Does the evaluation protect the rights of individuals and protect the welfare of those involved? Does it engage those most directly affected by the program and changes in the program, such as participants or the surrounding community?

Accuracy: Will the evaluation produce findings that are valid and reliable, given the needs of those who will use the results?

Sometimes the standards broaden your exploration of choices. Often, they help reduce the options at each step to a manageable number. For example, in the step “Engaging Stakeholders,” the standards can help you think broadly about who constitutes a stakeholder for your program, but simultaneously can reduce the potential list to a manageable number by posing the following questions: ( Utility ) Who will use these results? ( Feasibility ) How much time and effort can be devoted to stakeholder engagement? ( Propriety ) To be ethical, which stakeholders need to be consulted, those served by the program or the community in which it operates? ( Accuracy ) How broadly do you need to engage stakeholders to paint an accurate picture of this program?

Similarly, there are unlimited ways to gather credible evidence (Step 4). Asking these same kinds of questions as you approach evidence gathering will help identify the approaches that will be most useful, feasible, proper, and accurate for this evaluation at this time. Thus, the CDC Framework approach supports the fundamental insight that there is no such thing as the right program evaluation. Rather, over the life of a program, any number of evaluations may be appropriate, depending on the situation.
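One informal way to apply the four groups of standards when narrowing options is to treat them as a screen, as in the sketch below. The candidate evidence-gathering options, the rating scale, and the cutoff are all hypothetical; in practice these judgments are made with stakeholders, not by code.

```python
# A minimal sketch of using the four groups of standards as a screen when
# choosing among candidate evidence-gathering options (Step 4). Options,
# ratings, and the cutoff are hypothetical.

candidates = {
    "Add questions to an existing patient survey": {
        "utility": 3, "feasibility": 3, "propriety": 3, "accuracy": 2},
    "New household survey in the program area": {
        "utility": 3, "feasibility": 1, "propriety": 3, "accuracy": 3},
    "Focus groups with program participants": {
        "utility": 2, "feasibility": 3, "propriety": 2, "accuracy": 2},
}

def screen(options, minimum=2):
    """Keep options that meet a minimum rating on every standard."""
    return {name: scores for name, scores in options.items()
            if all(score >= minimum for score in scores.values())}

for name in screen(candidates):
    print("Retained option:", name)
```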

A good evaluator typically has the following characteristics:

  • Experience in the type of evaluation needed
  • Comfortable with quantitative data sources and analysis
  • Able to work with a wide variety of stakeholders, including representatives of target populations
  • Can develop innovative approaches to evaluation while considering the realities affecting a program (e.g., a small budget)
  • Incorporates evaluation into all program activities
  • Understands both the potential benefits and risks of evaluation
  • Educates program personnel in designing and conducting the evaluation
  • Will give staff the full findings (i.e., will not gloss over or fail to report certain findings)

Good evaluation requires a combination of skills that are rarely found in one person. The preferred approach is to choose an evaluation team that includes internal program staff, external stakeholders, and possibly consultants or contractors with evaluation expertise.

An initial step in the formation of a team is to decide who will be responsible for planning and implementing evaluation activities. One program staff person should be selected as the lead evaluator to coordinate program efforts. This person should be responsible for evaluation activities, including planning and budgeting for evaluation, developing program objectives, addressing data collection needs, reporting findings, and working with consultants. The lead evaluator is ultimately responsible for engaging stakeholders, consultants, and other collaborators who bring the skills and interests needed to plan and conduct the evaluation.

Although this staff person should have the skills necessary to competently coordinate evaluation activities, he or she can choose to look elsewhere for technical expertise to design and implement specific tasks. However, developing in-house evaluation expertise and capacity is a beneficial goal for most public health organizations. Of the characteristics of a good evaluator listed above, the evaluator’s ability to work with a diverse group of stakeholders warrants highlighting. The lead evaluator should be willing and able to draw out and reconcile differences in values and standards among stakeholders and to work with knowledgeable stakeholder representatives in designing and conducting the evaluation.

Seek additional evaluation expertise in programs within the health department, through external partners (e.g., universities, organizations, companies), from peer programs in other states and localities, and through technical assistance offered by CDC. [12]

You can also use outside consultants as volunteers, advisory panel members, or contractors. External consultants can provide high levels of evaluation expertise from an objective point of view. Important factors to consider when selecting consultants are their level of professional training, experience, and ability to meet your needs. Overall, it is important to find a consultant whose approach to evaluation, background, and training best fit your program’s evaluation needs and goals. Be sure to check all references carefully before you enter into a contract with any consultant.

To generate discussion around evaluation planning and implementation, several states have formed evaluation advisory panels. Advisory panels typically generate input from local, regional, or national experts otherwise difficult to access. Such an advisory panel will lend credibility to your efforts and prove useful in cultivating widespread support for evaluation activities.

Evaluation team members should clearly define their respective roles. For some teams, informal consensus is enough; others prefer a written agreement that describes who will conduct the evaluation and assigns specific roles and responsibilities to individual team members. Either way, the team must clarify and reach consensus on the:

  • Purpose of the evaluation
  • Potential users of the evaluation findings and plans for dissemination
  • Evaluation approach
  • Resources available
  • Protection for human subjects.

The agreement should also include a timeline and a budget for the evaluation.
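For teams that prefer a written agreement, the elements above can be captured as a simple structured record, as in the sketch below. The field names mirror the list; the example values are placeholders rather than recommendations.

```python
# A minimal sketch of recording the team's written agreement as structured
# data, mirroring the elements listed above. All field values are placeholders.

from dataclasses import dataclass, field

@dataclass
class EvaluationAgreement:
    purpose: str
    intended_users: list[str]
    dissemination_plan: str
    approach: str
    resources_available: str
    human_subjects_protection: str
    timeline: dict[str, str] = field(default_factory=dict)  # milestone -> due date
    budget: dict[str, float] = field(default_factory=dict)  # line item -> amount

agreement = EvaluationAgreement(
    purpose="Assess implementation and early outcomes of a provider education program",
    intended_users=["program manager", "state immunization advisory group"],
    dissemination_plan="Brief report and stakeholder presentation",
    approach="Mixed methods: provider survey plus key-informant interviews",
    resources_available="0.2 FTE epidemiologist, external consultant (20 days)",
    human_subjects_protection="IRB determination requested before data collection",
    timeline={"data collection complete": "Q3", "report delivered": "Q4"},
    budget={"consultant": 15_000.0, "survey administration": 5_000.0},
)
print(agreement.purpose)
```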

This manual is organized by the six steps of the CDC Framework. Each chapter will introduce the key questions to be answered in that step, approaches to answering those questions, and how the four evaluation standards might influence your approach. The main points are illustrated with one or more public health examples that are composites inspired by actual work being done by CDC and states and localities. [13] Some examples that will be referred to throughout this manual:

The program aims to provide affordable home ownership to low-income families by identifying and linking funders/sponsors, construction volunteers, and eligible families. Together, they build a house over a multi-week period. At the end of the construction period, the home is sold to the family using a no-interest loan.

Lead poisoning is the most widespread environmental hazard facing young children, especially in older inner-city areas. Even at low levels, elevated blood lead levels (EBLL) have been associated with reduced intelligence, medical problems, and developmental problems. The main sources of lead poisoning in children are paint and dust in older homes with lead-based paint. Public health programs address the problem through a combination of primary and secondary prevention efforts. A typical secondary prevention program at the local level does outreach and screening of high-risk children, identifying those with EBLL, assessing their environments for sources of lead, and case managing both their medical treatment and environmental corrections. However, these programs must rely on others to accomplish the actual medical treatment and the reduction of lead in the home environment.

A common initiative of state immunization programs is comprehensive provider education to train and motivate private providers to deliver more immunizations. A typical program includes:

  • A newsletter distributed three times per year to update private providers on new developments and changes in policy, and to provide brief education on various immunization topics
  • Immunization trainings held around the state, conducted by teams of state program staff and physician educators, on general immunization topics and the immunization registry
  • A Provider Tool Kit on how to increase immunization rates in their practice
  • Training of nursing staff in local health departments, who then conduct immunization presentations in individual private provider clinics
  • Presentations on immunization topics by physician peer educators at physician grand rounds and state conferences

Each chapter also provides checklists and worksheets to help you apply the teaching points.

[4] Scriven M. Minimalist theory of evaluation: The least theory that practice requires. American Journal of Evaluation 1998;19:57-70.

[5] Patton MQ. Utilization-focused evaluation: The new century text. 3rd ed. Thousand Oaks, CA: Sage, 1997.

[6] Green LW, George MA, Daniel M, Frankish CJ, Herbert CP, Bowie WR, et al. Study of participatory research in health promotion: Review and recommendations for the development of participatory research in health promotion in Canada . Ottawa, Canada : Royal Society of Canada , 1995.

[7] WHO European Working Group on Health Promotion Evaluation. Health promotion evaluation: Recommendations to policy-makers: Report of the WHO European working group on health promotion evaluation. Copenhagen, Denmark : World Health Organization, Regional Office for Europe, 1998.

[8] Public Health Functions Steering Committee. Public health in America. Fall 1994. Available at <http://www.health.gov/phfunctions/public.htm>. Accessed January 1, 2000.

[9] Dyal WW. Ten organizational practices of public health: A historical perspective. American Journal of Preventive Medicine 1995;11(6)Suppl 2:6-8.

[10] Centers for Disease Control and Prevention. op cit.

[11] Joint Committee on Standards for Educational Evaluation. The program evaluation standards: How to assess evaluations of educational programs. 2nd ed. Thousand Oaks, CA: Sage Publications, 1994.

[12] CDC’s Prevention Research Centers (PRC) program is an additional resource. The PRC program is a national network of 24 academic research centers committed to prevention research and the ability to translate that research into programs and policies. The centers work with state health departments and members of their communities to develop and evaluate state and local interventions that address the leading causes of death and disability in the nation. Additional information on the PRCs is available at www.cdc.gov/prc/index.htm.

[13] These cases are composites of multiple CDC and state and local efforts that have been simplified and modified to better illustrate teaching points. While inspired by real CDC and community programs, they are not intended to reflect the current activities of any specific program.


Conceptual Framework – Types, Methodology and Examples

Conceptual Framework

Definition:

A conceptual framework is a structured approach to organizing and understanding complex ideas, theories, or concepts. It provides a systematic and coherent way of thinking about a problem or topic, and helps to guide research or analysis in a particular field.

A conceptual framework typically includes a set of assumptions, concepts, and propositions that form a theoretical framework for understanding a particular phenomenon. It can be used to develop hypotheses, guide empirical research, or provide a framework for evaluating and interpreting data.

Conceptual Framework in Research

In research, a conceptual framework is a theoretical structure that provides a framework for understanding a particular phenomenon or problem. It is a key component of any research project and helps to guide the research process from start to finish.

A conceptual framework provides a clear understanding of the variables, relationships, and assumptions that underpin a research study. It outlines the key concepts that the study is investigating and how they are related to each other. It also defines the scope of the study and sets out the research questions or hypotheses.

Types of Conceptual Framework

Types of Conceptual Framework are as follows:

Theoretical Framework

A theoretical framework is an overarching set of concepts, ideas, and assumptions that help to explain and interpret a phenomenon. It provides a theoretical perspective on the phenomenon being studied and helps researchers to identify the relationships between different concepts. For example, a theoretical framework for a study on the impact of social media on mental health might draw on theories of communication, social influence, and psychological well-being.

Conceptual Model

A conceptual model is a visual or written representation of a complex system or phenomenon. It helps to identify the main components of the system and the relationships between them. For example, a conceptual model for a study on the factors that influence employee turnover might include factors such as job satisfaction, salary, work-life balance, and job security, and the relationships between them.
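A conceptual model of this kind can also be written down as a small set of directed relationships, as in the sketch below for the hypothetical employee-turnover example. The concepts and the signs of the relationships are illustrative only.

```python
# A minimal sketch of the employee-turnover conceptual model mentioned above,
# represented as directed relationships between concepts. The concepts and
# hypothesised directions are illustrative only.

relationships = [
    # (from_concept, to_concept, hypothesised direction)
    ("job satisfaction",  "turnover intention", "negative"),
    ("salary",            "job satisfaction",   "positive"),
    ("work-life balance", "job satisfaction",   "positive"),
    ("job security",      "turnover intention", "negative"),
]

def influences_on(target):
    """List the concepts hypothesised to influence a target concept."""
    return [(src, sign) for src, dst, sign in relationships if dst == target]

print(influences_on("turnover intention"))
print(influences_on("job satisfaction"))
```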

Empirical Framework

An empirical framework is based on empirical data and helps to explain a particular phenomenon. It involves collecting data, analyzing it, and developing a framework to explain the results. For example, an empirical framework for a study on the impact of a new health intervention might involve collecting data on the intervention’s effectiveness, cost, and acceptability to patients.

Descriptive Framework

A descriptive framework is used to describe a particular phenomenon. It helps to identify the main characteristics of the phenomenon and to develop a vocabulary to describe it. For example, a descriptive framework for a study on different types of musical genres might include descriptions of the instruments used, the rhythms and beats, the vocal styles, and the cultural contexts of each genre.

Analytical Framework

An analytical framework is used to analyze a particular phenomenon. It involves breaking down the phenomenon into its constituent parts and analyzing them separately. This type of framework is often used in social science research. For example, an analytical framework for a study on the impact of race on police brutality might involve analyzing the historical and cultural factors that contribute to racial bias, the organizational factors that influence police behavior, and the psychological factors that influence individual officers’ behavior.

Conceptual Framework for Policy Analysis

A conceptual framework for policy analysis is used to guide the development of policies or programs. It helps policymakers to identify the key issues and to develop strategies to address them. For example, a conceptual framework for a policy analysis on climate change might involve identifying the key stakeholders, assessing their interests and concerns, and developing policy options to mitigate the impacts of climate change.

Logical Frameworks

Logical frameworks are used to plan and evaluate projects and programs. They provide a structured approach to identifying project goals, objectives, and outcomes, and help to ensure that all stakeholders are aligned and working towards the same objectives.

Conceptual Frameworks for Program Evaluation

These frameworks are used to evaluate the effectiveness of programs or interventions. They provide a structure for identifying program goals, objectives, and outcomes, and help to measure the impact of the program on its intended beneficiaries.

Conceptual Frameworks for Organizational Analysis

These frameworks are used to analyze and evaluate organizational structures, processes, and performance. They provide a structured approach to understanding the relationships between different departments, functions, and stakeholders within an organization.

Conceptual Frameworks for Strategic Planning

These frameworks are used to develop and implement strategic plans for organizations or businesses. They help to identify the key factors and stakeholders that will impact the success of the plan, and provide a structure for setting goals, developing strategies, and monitoring progress.

Components of Conceptual Framework

The components of a conceptual framework typically include the following (a brief illustrative sketch follows this list):

  • Research question or problem statement : This component defines the problem or question that the conceptual framework seeks to address. It sets the stage for the development of the framework and guides the selection of the relevant concepts and constructs.
  • Concepts : These are the general ideas, principles, or categories that are used to describe and explain the phenomenon or problem under investigation. Concepts provide the building blocks of the framework and help to establish a common language for discussing the issue.
  • Constructs : Constructs are the specific variables or concepts that are used to operationalize the general concepts. They are measurable or observable and serve as indicators of the underlying concept.
  • Propositions or hypotheses : These are statements that describe the relationships between the concepts or constructs in the framework. They provide a basis for testing the validity of the framework and for generating new insights or theories.
  • Assumptions : These are the underlying beliefs or values that shape the framework. They may be explicit or implicit and may influence the selection and interpretation of the concepts and constructs.
  • Boundaries : These are the limits or scope of the framework. They define the focus of the investigation and help to clarify what is included and excluded from the analysis.
  • Context : This component refers to the broader social, cultural, and historical factors that shape the phenomenon or problem under investigation. It helps to situate the framework within a larger theoretical or empirical context and to identify the relevant variables and factors that may affect the phenomenon.
  • Relationships and connections: These are the connections and interrelationships between the different components of the conceptual framework. They describe how the concepts and constructs are linked and how they contribute to the overall understanding of the phenomenon or problem.
  • Variables : These are the factors that are being measured or observed in the study. They are often operationalized as constructs and are used to test the propositions or hypotheses.
  • Methodology : This component describes the research methods and techniques that will be used to collect and analyze data. It includes the sampling strategy, data collection methods, data analysis techniques, and ethical considerations.
  • Literature review : This component provides an overview of the existing research and theories related to the phenomenon or problem under investigation. It helps to identify the gaps in the literature and to situate the framework within the broader theoretical and empirical context.
  • Outcomes and implications: These are the expected outcomes or implications of the study. They describe the potential contributions of the study to the theoretical and empirical knowledge in the field and the practical implications for policy and practice.
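As a rough illustration, several of the components listed above can be collected into one structure. The sketch below does this for a hypothetical study; the field names simply echo the list rather than prescribing a template, and the example values are invented.

```python
# A minimal, generic sketch (not a prescribed template) capturing some of the
# components above in one structure. The study, concepts, and values are
# hypothetical.

from dataclasses import dataclass, field

@dataclass
class ConceptualFramework:
    research_question: str
    concepts: list[str]
    constructs: dict[str, str]          # construct -> concept it operationalises
    propositions: list[str]
    assumptions: list[str]
    boundaries: str
    context: str
    variables: list[str] = field(default_factory=list)

framework = ConceptualFramework(
    research_question="How does social media use relate to adolescent wellbeing?",
    concepts=["social media use", "psychological wellbeing", "social support"],
    constructs={"daily screen time": "social media use",
                "wellbeing scale score": "psychological wellbeing"},
    propositions=["Higher passive use is associated with lower wellbeing"],
    assumptions=["Self-reported screen time is sufficiently accurate"],
    boundaries="Adolescents aged 13-17 in one school district",
    context="Post-pandemic increase in online schooling",
)
print(framework.research_question)
```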

Conceptual Framework Methodology

Conceptual Framework Methodology is a research method that is commonly used in academic and scientific research to develop a theoretical framework for a study. It is a systematic approach that helps researchers to organize their thoughts and ideas, identify the variables that are relevant to their study, and establish the relationships between these variables.

Here are the steps involved in the conceptual framework methodology:

Identify the Research Problem

The first step is to identify the research problem or question that the study aims to answer. This involves identifying the gaps in the existing literature and determining what specific issue the study aims to address.

Conduct a Literature Review

The second step involves conducting a thorough literature review to identify the existing theories, models, and frameworks that are relevant to the research question. This will help the researcher to identify the key concepts and variables that need to be considered in the study.

Define Key Concepts and Variables

The next step is to define the key concepts and variables that are relevant to the study. This involves clearly defining the terms used in the study, and identifying the factors that will be measured or observed in the study.

Develop a Theoretical Framework

Once the key concepts and variables have been identified, the researcher can develop a theoretical framework. This involves establishing the relationships between the key concepts and variables, and creating a visual representation of these relationships.

Test the Framework

The final step is to test the theoretical framework using empirical data. This involves collecting and analyzing data to determine whether the relationships between the key concepts and variables that were identified in the framework are accurate and valid.
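
Where quantitative data are available, this testing step often comes down to estimating the strength of each proposed relationship. The snippet below is a minimal, hypothetical sketch in Python using SciPy; the construct names and data are invented purely to illustrate the idea, not drawn from any particular study.

```python
import numpy as np
from scipy import stats

# Hypothetical data for two constructs identified in a framework:
# weekly study hours (predictor) and exam score (outcome).
study_hours = np.array([2, 4, 5, 6, 8, 9, 11, 12])
exam_scores = np.array([52, 58, 60, 64, 70, 71, 78, 80])

# Test the proposed relationship with a Pearson correlation...
r, p_value = stats.pearsonr(study_hours, exam_scores)
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")

# ...and estimate its strength with a simple linear regression.
result = stats.linregress(study_hours, exam_scores)
print(f"slope = {result.slope:.2f} points per extra hour, R^2 = {result.rvalue**2:.2f}")
```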

Examples of Conceptual Framework

Some real-world examples of conceptual frameworks are as follows:

  • In economics, the concept of supply and demand is a well-known conceptual framework. It provides a structure for understanding how prices are set in a market, based on the interplay of the quantity of goods supplied by producers and the quantity of goods demanded by consumers; a small worked example follows this list.
  • In psychology , the cognitive-behavioral framework is a widely used conceptual framework for understanding mental health and illness. It emphasizes the role of thoughts and behaviors in shaping emotions and the importance of cognitive restructuring and behavior change in treatment.
  • In sociology , the social determinants of health framework provides a way of understanding how social and economic factors such as income, education, and race influence health outcomes. This framework is widely used in public health research and policy.
  • In environmental science , the ecosystem services framework is a way of understanding the benefits that humans derive from natural ecosystems, such as clean air and water, pollination, and carbon storage. This framework is used to guide conservation and land-use decisions.
  • In education, the constructivist framework is a way of understanding how learners construct knowledge through active engagement with their environment. This framework is used to guide instructional design and teaching strategies.
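
To make the supply-and-demand framework concrete, here is a minimal worked sketch in Python with invented linear demand and supply curves; the coefficients are illustrative assumptions, not empirical estimates.

```python
# Hypothetical linear curves: demand Qd = a - b*P, supply Qs = c + d*P.
a, b = 100.0, 2.0   # demand intercept and slope (assumed)
c, d = 10.0, 1.0    # supply intercept and slope (assumed)

# Equilibrium: Qd = Qs  =>  a - b*P = c + d*P  =>  P* = (a - c) / (b + d)
p_star = (a - c) / (b + d)
q_star = a - b * p_star

print(f"Equilibrium price P* = {p_star:.2f}, quantity Q* = {q_star:.2f}")
# With these assumed coefficients: P* = 30.00 and Q* = 40.00
```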

Applications of Conceptual Framework

Some of the applications of Conceptual Frameworks are as follows:

  • Research : Conceptual frameworks are used in research to guide the design, implementation, and interpretation of studies. Researchers use conceptual frameworks to develop hypotheses, identify research questions, and select appropriate methods for collecting and analyzing data.
  • Policy: Conceptual frameworks are used in policy-making to guide the development of policies and programs. Policymakers use conceptual frameworks to identify key factors that influence a particular problem or issue, and to develop strategies for addressing them.
  • Education : Conceptual frameworks are used in education to guide the design and implementation of instructional strategies and curriculum. Educators use conceptual frameworks to identify learning objectives, select appropriate teaching methods, and assess student learning.
  • Management : Conceptual frameworks are used in management to guide decision-making and strategy development. Managers use conceptual frameworks to understand the internal and external factors that influence their organizations, and to develop strategies for achieving their goals.
  • Evaluation : Conceptual frameworks are used in evaluation to guide the development of evaluation plans and to interpret evaluation results. Evaluators use conceptual frameworks to identify key outcomes, indicators, and measures, and to develop a logic model for their evaluation.

Purpose of Conceptual Framework

The purpose of a conceptual framework is to provide a theoretical foundation for understanding and analyzing complex phenomena. Conceptual frameworks help to:

  • Guide research : Conceptual frameworks provide a framework for researchers to develop hypotheses, identify research questions, and select appropriate methods for collecting and analyzing data. By providing a theoretical foundation for research, conceptual frameworks help to ensure that research is rigorous, systematic, and valid.
  • Provide clarity: Conceptual frameworks help to provide clarity and structure to complex phenomena by identifying key concepts, relationships, and processes. By providing a clear and systematic understanding of a phenomenon, conceptual frameworks help to ensure that researchers, policymakers, and practitioners are all on the same page when it comes to understanding the issue at hand.
  • Inform decision-making : Conceptual frameworks can be used to inform decision-making and strategy development by identifying key factors that influence a particular problem or issue. By understanding the complex interplay of factors that contribute to a particular issue, decision-makers can develop more effective strategies for addressing the problem.
  • Facilitate communication : Conceptual frameworks provide a common language and shared structure for researchers, policymakers, and practitioners to communicate and collaborate on complex issues. By providing a shared understanding of a phenomenon, they help to ensure that everyone is working towards the same goal.

When to use Conceptual Framework

There are several situations when it is appropriate to use a conceptual framework:

  • To guide the research : A conceptual framework can be used to guide the research process by providing a clear roadmap for the research project. It can help researchers identify key variables and relationships, and develop hypotheses or research questions.
  • To clarify concepts : A conceptual framework can be used to clarify and define key concepts and terms used in a research project. It can help ensure that all researchers are using the same language and have a shared understanding of the concepts being studied.
  • To provide a theoretical basis: A conceptual framework can provide a theoretical basis for a research project by linking it to existing theories or conceptual models. This can help researchers build on previous research and contribute to the development of a field.
  • To identify gaps in knowledge : A conceptual framework can help identify gaps in existing knowledge by highlighting areas that require further research or investigation.
  • To communicate findings : A conceptual framework can be used to communicate research findings by providing a clear and concise summary of the key variables, relationships, and assumptions that underpin the research project.

Characteristics of Conceptual Framework

Key characteristics of a conceptual framework are:

  • Clear definition of key concepts : A conceptual framework should clearly define the key concepts and terms being used in a research project. This ensures that all researchers have a shared understanding of the concepts being studied.
  • Identification of key variables: A conceptual framework should identify the key variables that are being studied and how they are related to each other. This helps to organize the research project and provides a clear focus for the study.
  • Logical structure: A conceptual framework should have a logical structure that connects the key concepts and variables being studied. This helps to ensure that the research project is coherent and consistent.
  • Based on existing theory : A conceptual framework should be based on existing theory or conceptual models. This helps to ensure that the research project is grounded in existing knowledge and builds on previous research.
  • Testable hypotheses or research questions: A conceptual framework should include testable hypotheses or research questions that can be answered through empirical research. This helps to ensure that the research project is rigorous and scientifically valid.
  • Flexibility : A conceptual framework should be flexible enough to allow for modifications as new information is gathered during the research process. This helps to ensure that the research project is responsive to new findings and is able to adapt to changing circumstances.

Advantages of Conceptual Framework

Advantages of a conceptual framework are as follows:

  • Clarity : A conceptual framework provides clarity to researchers by outlining the key concepts and variables that are relevant to the research project. This clarity helps researchers to focus on the most important aspects of the research problem and develop a clear plan for investigating it.
  • Direction : A conceptual framework provides direction to researchers by helping them to develop hypotheses or research questions that are grounded in existing theory or conceptual models. This direction ensures that the research project is relevant and contributes to the development of the field.
  • Efficiency : A conceptual framework can increase efficiency in the research process by providing a structure for organizing ideas and data. This structure can help researchers to avoid redundancies and inconsistencies in their work, saving time and effort.
  • Rigor : A conceptual framework can help to ensure the rigor of a research project by providing a theoretical basis for the investigation. This rigor is essential for ensuring that the research project is scientifically valid and produces meaningful results.
  • Communication : A conceptual framework can facilitate communication between researchers by providing a shared language and understanding of the key concepts and variables being studied. This communication is essential for collaboration and the advancement of knowledge in the field.
  • Generalization : A conceptual framework can help to generalize research findings beyond the specific study by providing a theoretical basis for the investigation. This generalization is essential for the development of knowledge in the field and for informing future research.

Limitations of Conceptual Framework

Limitations of a conceptual framework are as follows:

  • Limited applicability: Conceptual frameworks are often based on existing theory or conceptual models, which may not be applicable to all research problems or contexts. This can limit the usefulness of a conceptual framework in certain situations.
  • Lack of empirical support : While a conceptual framework can provide a theoretical basis for a research project, it may not be supported by empirical evidence. This can limit the usefulness of a conceptual framework in guiding empirical research.
  • Narrow focus: A conceptual framework can provide a clear focus for a research project, but it may also limit the scope of the investigation. This can make it difficult to address broader research questions or to consider alternative perspectives.
  • Over-simplification: A conceptual framework can help to organize and structure research ideas, but it may also over-simplify complex phenomena. This can limit the depth of the investigation and the richness of the data collected.
  • Inflexibility : A conceptual framework can provide a structure for organizing research ideas, but it may also be inflexible in the face of new data or unexpected findings. This can limit the ability of researchers to adapt their research project to new information or changing circumstances.
  • Difficulty in development : Developing a conceptual framework can be a challenging and time-consuming process. It requires a thorough understanding of existing theory or conceptual models, and may require collaboration with other researchers.


Research on Information Literacy Evaluation Framework and Its Indicator Interaction Relationship

  • Conference paper
  • First Online: 10 January 2024


  • Chunhong Liu   ORCID: orcid.org/0000-0001-7364-0568 11 ,
  • Zhengling Zhang 12 ,
  • Wenfeng Li 11 ,
  • Congpin Zhang 11 &
  • Dong Liu 11  

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1899)

Included in the following conference series:

  • International Conference on Computer Science and Educational Informatization


In the information society, information literacy is critical to students’ success, but the existing evaluation frameworks are far from satisfactory. To address this issue, this paper introduces disciplinary literacy into the evaluation system and constructs an evaluation framework for normal university students of computer science and technology from the three modules of disciplinary development, general level, and teaching orientation. Typical features of each module are then extracted by the canonical correlation analysis method, and correlations between typical features are assessed with the Pearson correlation coefficient. Based on data from 139 students, we draw the following conclusions. First, the general level acts as a bridge between disciplinary development and teaching orientation. Second, there are two moderate positive correlations: one between computational thinking attitude and information knowledge and the other between teaching belief and information ethics, with coefficients of 0.62 and 0.53, respectively. Finally, there is a strong positive correlation between teaching integration and information ability, with a coefficient of 0.72.
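
As a rough illustration of the analysis pipeline described in the abstract (canonical correlation analysis to extract a typical feature from each module, followed by a Pearson correlation between the extracted features), the sketch below uses synthetic data with scikit-learn and SciPy. The module names, item counts, and random data are assumptions for illustration only; this is not the paper's questionnaire or its results.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_students = 139  # matches the sample size reported in the abstract

# Synthetic questionnaire scores for two modules (item counts are assumed).
general_level = rng.normal(size=(n_students, 5))         # e.g. information knowledge, ability, ethics...
teaching_orientation = rng.normal(size=(n_students, 4))  # e.g. teaching belief, teaching integration...

# Canonical correlation analysis extracts one "typical feature" per module:
# the pair of linear combinations with maximal correlation between modules.
cca = CCA(n_components=1)
cca.fit(general_level, teaching_orientation)
u, v = cca.transform(general_level, teaching_orientation)

# The association between the extracted features is then summarised
# with a Pearson correlation coefficient.
r, p = pearsonr(u.ravel(), v.ravel())
print(f"Canonical feature correlation: r = {r:.2f} (p = {p:.3f})")
```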



Acknowledgements

We would like to thank the Henan Provincial Higher Education Teaching Reform Research and Practice Project Foundation for supporting this study [number: 2021SJGLX355]. We are also supported by the 2021 special research project on Wisdom Teaching in General Undergraduate Colleges and Universities of Henan Province, “Tracking of Group Intelligent Learning Knowledge Integrated with Learning Emotion, Personalized Guidance and Effectiveness Research”. Finally, we are grateful to the Grade 20 computer science and technology normal students of Henan Normal University for completing the questionnaires used to collect the information literacy data.

Author information

Authors and Affiliations

Henan Key Laboratory of Educational Artificial Intelligence and Personalized Learning, Xinxiang, Henan, China

Chunhong Liu, Wenfeng Li, Congpin Zhang & Dong Liu

Henan Normal University, Xinxiang, Henan, China

Zhengling Zhang


Corresponding author

Correspondence to Chunhong Liu .

Editor information

Editors and Affiliations

Yunnan Normal University, Kunming, China

Jianhou Gan

Georgia State University, Atlanta, GA, USA

Juxiang Zhou

Henan Normal University, Xinxiang, China

Harbin University of Science and Technology, Harbin, China

Xianhua Song

National Academy of Guo Ding Institute of Data Science, Beijing, China


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, C., Zhang, Z., Li, W., Zhang, C., Liu, D. (2024). Research on Information Literacy Evaluation Framework and Its Indicator Interaction Relationship. In: Gan, J., Pan, Y., Zhou, J., Liu, D., Song, X., Lu, Z. (eds) Computer Science and Educational Informatization. CSEI 2023. Communications in Computer and Information Science, vol 1899. Springer, Singapore. https://doi.org/10.1007/978-981-99-9499-1_16


DOI : https://doi.org/10.1007/978-981-99-9499-1_16

Published : 10 January 2024

Publisher Name : Springer, Singapore

Print ISBN : 978-981-99-9498-4

Online ISBN : 978-981-99-9499-1




How to write a monitoring and evaluation (M&E) framework


Note: An M&E framework can also be called an evaluation matrix.

One of the most popular downloads on tools4dev is our M&E framework template . We’ve had lots of questions from people on how to use specific parts of the template, so we’ve decided to put together this short how-to guide.

Choose your indicators

The first step in writing an M&E framework is to decide which indicators you will use to measure the success of your program. This is a very important step, so you should try to involve as many people as possible to get different perspectives.

You need to choose indicators for each level of your program – outputs, outcomes and goal (for more information on these levels see our articles on how to design a program and logical frameworks ). There can be more than one indicator for each level, although you should try to keep the total number of indicators manageable.

Each indicator should be:

  • Directly related to the output, outcome or goal listed on the problem tree or logframe.
  • Something that you can measure accurately using either qualitative or quantitative methods, and your available resources.
  • If possible, a standard indicator that is commonly used for this type of program. For example, poverty could be measured using the Progress Out of Poverty Index . Using standard indicators can be better because they are already well defined, there are tools available to measure them, and you will be able to compare your results to other programs or national statistics.

Here is an example of some indicators for the goal, outcome and output of an education program:

[Example table of indicators for the goal, outcome, and output of an education program]

Some organisations have very strict rules about how indicators must be written (for example, that every indicator must start with a number, or must contain an adjective). In my experience these rules usually lead to indicators that are convoluted or don’t make sense. My advice is just to make sure the indicators are written in a way that everyone involved in the project (including the donor) can understand.

Define each indicator

Once you have chosen your indicators you need to write a definition for each one. The definition describes exactly how the indicator is calculated. If you don’t have definitions there is a serious risk that indicators might be calculated differently at different times, which means the results can’t be compared.

Here is an example of how one indicator in the education program is defined:

[Example table showing the definition of one indicator from the education program]

After writing the definition of each indicator you also need to identify where the data will come from (the “data source”). Common sources are baseline and endline surveys, monitoring reports, and existing information systems. You also need to decide how frequently it will be measured (monthly, quarterly, annually, etc.).
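
Before copying everything into the template, it can help to keep each indicator’s definition, data source, and measurement frequency together in one place. The sketch below is a hypothetical Python illustration of such a record; the field names and the example indicator are assumptions loosely based on the education example above, not part of the tools4dev template itself.

```python
from dataclasses import dataclass

@dataclass
class Indicator:
    level: str        # "output", "outcome" or "goal"
    name: str         # the indicator as it appears in the framework
    definition: str   # exactly how it is calculated
    data_source: str  # e.g. baseline/endline survey, monitoring report
    frequency: str    # e.g. monthly, quarterly, annually

# Hypothetical example indicator for the education program
grade7_transition = Indicator(
    level="outcome",
    name="% of Grade 6 students continuing on to Grade 7",
    definition="Number of Grade 6 students enrolled in Grade 7 the following "
               "year, divided by the number completing Grade 6, times 100",
    data_source="School enrolment records",
    frequency="Annually",
)
print(grade7_transition)
```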

Measure the baseline and set the target

Before you start your program you need to measure the starting value of each indicator – this is called the “baseline”. In the education example above that means you would need to measure the current percentage of Grade 6 students continuing on to Grade 7 (before you start your program).

In some cases you will need to do a survey to measure the baseline. In other cases you might have existing data available. If so, you need to make sure the existing data uses the same definition as yours for calculating the indicator.

Once you know the baseline you need to set a target for improvement. Before you set the target it’s important to do some research on what a realistic target actually is. Many people set targets that are unachievable, without realising it. For example, I once worked on a project where the target was a 25% reduction in the child mortality rate within 12 months. However, a brief review of other child health programs showed that even the best programs only managed a 10-20% reduction within 5 years.
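
As a quick numerical illustration of measuring a baseline and sanity-checking a target (all figures here are invented, not taken from any real program):

```python
# Hypothetical baseline: 180 of 300 Grade 6 students continued to Grade 7 last year.
continued, completed = 180, 300
baseline = 100 * continued / completed
print(f"Baseline: {baseline:.0f}% of Grade 6 students continue to Grade 7")  # 60%

# A proposed target of 90% would mean a 30 percentage-point jump in one year;
# if comparable programs typically achieve 5-10 points over several years,
# a more realistic target might be:
target = baseline + 8  # assumed achievable improvement, informed by other programs
print(f"Target: {target:.0f}% by end of program")  # 68%
```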

Identify who is responsible and where the results will be reported

The final step is to decide who will be responsible for measuring each indicator. Output indicators are often measured by field staff or program managers, while outcome and goal indicators may be measured by evaluation consultants or even national agencies.

You also need to decide where the results for each indicator will be reported. This could be in your monthly program reports, annual donor reports, or on your website. Indicator results are used to assess whether the program is working or not, so it’s very important that decision makers and stakeholders (not just the donor) have access to them as soon as possible.

Put it all into the template

Once you have completed all these steps, you’re now ready to put everything into the M&E framework template .
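
If you would rather assemble the rows programmatically before pasting them into the template, a few lines of Python can export them as a simple table. This is a hypothetical sketch; the column names mirror the steps above rather than the exact layout of the tools4dev template.

```python
import csv

rows = [
    {
        "Indicator": "% of Grade 6 students continuing on to Grade 7",
        "Definition": "Grade 7 enrolments / Grade 6 completions x 100",
        "Data source": "School enrolment records",
        "Frequency": "Annually",
        "Baseline": "60%",
        "Target": "68%",
        "Responsible": "Program manager",
        "Reported in": "Annual donor report",
    },
]

# Write the framework rows to a CSV file that mirrors the template columns.
with open("me_framework.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
```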

Download the M&E framework template and example


  • Short report
  • Open access
  • Published: 12 April 2024

A modified action framework to develop and evaluate academic-policy engagement interventions

  • Petra Mäkelä   ORCID: orcid.org/0000-0002-0938-1175 1 ,
  • Annette Boaz   ORCID: orcid.org/0000-0003-0557-1294 2 &
  • Kathryn Oliver   ORCID: orcid.org/0000-0002-4326-5258 1  

Implementation Science, volume 19, article number 31 (2024)


There has been a proliferation of frameworks with a common goal of bridging the gap between evidence, policy, and practice, but few aim to specifically guide evaluations of academic-policy engagement. We present the modification of an action framework for the purpose of selecting, developing and evaluating interventions for academic-policy engagement.

We build on the conceptual work of an existing framework known as SPIRIT (Supporting Policy In Health with Research: an Intervention Trial), developed for the evaluation of strategies intended to increase the use of research in health policy. Our aim was to modify SPIRIT, (i) to be applicable beyond health policy contexts, for example encompassing social, environmental, and economic policy impacts and (ii) to address broader dynamics of academic-policy engagement. We used an iterative approach through literature reviews and consultation with multiple stakeholders from Higher Education Institutions (HEIs) and policy professionals working at different levels of government and across geographical contexts in England, alongside our evaluation activities in the Capabilities in Academic Policy Engagement (CAPE) programme.

Our modifications expand upon Redman et al.’s original framework, for example adding a domain of ‘Impacts and Sustainability’ to capture continued activities required in the achievement of desirable outcomes. The modified framework fulfils the criteria for a useful action framework, having a clear purpose, being informed by existing understandings, being capable of guiding targeted interventions, and providing a structure to build further knowledge.

The modified SPIRIT framework is designed to be meaningful and accessible for people working across varied contexts in the evidence-policy ecosystem. It has potential applications in how academic-policy engagement interventions might be developed, evaluated, facilitated and improved, to ultimately support the use of evidence in decision-making.


Contributions to the literature

There has been a proliferation of theories, models and frameworks relating to translation of research into practice. Few specifically relate to engagement between academia and policy.

Challenges of evidence-informed policy-making are receiving increasing attention globally. There is a growing number of academic-policy engagement interventions but a lack of published evaluations.

This article contributes a modified action framework that can be used to guide how academic-policy engagement interventions might be developed, evaluated, facilitated, and improved, to support the use of evidence in policy decision-making.

Our contribution demonstrates the potential for modification of existing, useful frameworks instead of creating brand-new frameworks. It provides an exemplar for others who are considering when and how to modify existing frameworks to address new or expanded purposes while respecting the conceptual underpinnings of the original work.

Academic-policy engagement refers to ways that Higher Education Institutions (HEIs) and their staff engage with institutions responsible for policy at national, regional, county or local levels. Academic-policy engagement is intended to support the use of evidence in decision-making and in turn, improve its effectiveness, and inform the identification of barriers and facilitators in policy implementation [ 1 , 2 , 3 ]. Challenges of evidence-informed policy-making are receiving increasing attention globally, including the implications of differences in cultural norms and mechanisms across national contexts [ 4 , 5 ]. Although challenges faced by researchers and policy-makers have been well documented [ 6 , 7 ], there has been less focus on actions at the engagement interface. Pragmatic guidance for the development, evaluation or comparison of structured responses to the challenges of academic-policy engagement is currently lacking [ 8 , 9 ].

Academic-policy engagement exists along a continuum of approaches from linear (pushing evidence out from academia or pulling evidence into policy), relational (promoting mutual understandings and partnerships), and systems approaches (addressing identified barriers and facilitators) [ 4 ]. Each approach is underpinned by sets of beliefs, assumptions and expectations, and each raises questions for implementation and evaluation. Little is known about which academic-policy engagement interventions work in which settings, with scarce empirical evidence to inform decisions about which interventions to use, when, with whom, or why, and how organisational contexts can affect motivation and capabilities for such engagement [ 10 ]. A deeper understanding through the evaluation of engagement interventions will help to identify inhibitory and facilitatory factors, which may or may not transfer across contexts [ 11 ].

The intellectual technologies [ 12 ] of implementation science have proliferated in recent decades, including models, frameworks and theories that address research translation and acknowledge difficulties in closing the gap between research, policy and practice [ 13 ]. Frameworks may serve overlapping purposes of describing or guiding processes of translating knowledge into practice (e.g. the Quality Implementation Framework [ 14 ]); or helping to explain influences on implementation outcomes (e.g. the Theoretical Domains Framework [ 15 ]); or guiding evaluation (e.g. the RE-AIM framework [ 16 , 17 ]). Frameworks can offer an efficient way to look across diverse settings and to identify implementation differences [ 18 , 19 ]. However, the abundance of options raises its own challenges when seeking a framework for a particular purpose, and the use of a framework may mean that more weight is placed on certain aspects, leading to a partial understanding [ 13 , 17 ].

‘Action frameworks’ are predictive models that intend to organise existing knowledge and enable a logical approach for the selection, implementation and evaluation of intervention strategies, thereby facilitating the expansion of that knowledge [ 20 ]. They can guide change by informing and clarifying practical steps to follow. As flexible entities, they can be adapted to accommodate new purposes. Framework modification may include the addition of constructs or changes in language to expand applicability to a broader range of settings [ 21 ].

We sought to identify one organising framework for evaluation activities in the Capabilities in Academic-Policy Engagement (CAPE) programme (2021–2023), funded by Research England. The CAPE programme aimed to understand how best to support effective and sustained engagement between academics and policy professionals across the higher education sector in England [ 22 ]. We first searched the literature and identified an action framework that was originally developed between 2011 and 2013, to underpin a trial known as SPIRIT (Supporting Policy In health with Research: an Intervention Trial) [ 20 , 23 ]. This trial evaluated strategies intended to increase the use of research in health policy and to identify modifiable points for intervention.

We selected the SPIRIT framework due to its potential suitability as an initial ‘road map’ for our evaluation of academic-policy interventions in the CAPE programme. The key elements of the original framework are catalysts, organisational capacity, engagement actions, and research use. We wished to build on the framework’s embedded conceptual work, derived from literature reviews and semi-structured interviews, to identify policymakers’ views on factors that assist policy agencies’ use of research [ 20 ]. The SPIRIT framework developers defined its “locus for change” as the policy organisation ( [ 20 ], p. 151). They proposed that it could offer the beginning of a process to identify and test pathways in policy agencies’ use of evidence.

Our goal was to modify SPIRIT to accommodate a different locus for change: the engagement interface between academia and policy. Instead of imagining a linear process in which knowledge comes from researchers and is transmitted to policy professionals, we intended to extend the framework to multidirectional relational and system interfaces. We wished to include processes and influences at individual, organisational and system levels, to be relevant for HEIs and their staff, policy bodies and professionals, funders of engagement activities, and facilitatory bodies. Ultimately, we seek to address a gap in understanding how engagement strategies work, for whom, how they are facilitated, and to improve the evaluation of academic-policy engagement.

We aimed to produce a conceptually guided action framework to enable systematic evaluation of interventions intending to support academic-policy engagement.

We used a pragmatic combination of processes for framework modification during our evaluation activities in the CAPE programme [ 22 ]. The CAPE programme included a range of interventions: seed funding for academic and policy professional collaboration in policy-focused projects, fellowships for academic placements in policy settings, or for policy professionals with HEI staff, training for policy professionals, and a range of knowledge exchange events for HEI staff and policy professionals. We modified the SPIRIT framework through iterative processes shown in Table  1 , including reviews of literature; consultations with HEI staff and policy professionals across a range of policy contexts and geographic settings in England, through the CAPE programme; and piloting, refining and seeking feedback from stakeholders in academic-policy engagement.

A number of characteristics of the original SPIRIT framework could be applied to academic-policy engagement. While keeping the core domains, we modified the framework to capture dynamics of engagement at multiple academic and policy levels (individuals, organisations and system), extending beyond the original unidirectional focus on policy agencies’ use of research. Components of the original framework, the need for modifications, and their corresponding action-oriented implications are shown in Table  2 . We added a new domain, ‘Impacts and Sustainability’, to consider transforming and enduring aspects at the engagement interface. The modified action framework is shown in Fig.  1 .

Figure 1: SPIRIT Action Framework Modified for Academic-Policy Engagement Interventions (SPIRIT-ME), adapted with permission from the Sax Institute. Legend: The framework acknowledges that elements in each domain may influence other elements through mechanisms of action and that these do not necessarily flow through the framework in a ‘pipeline’ sequence. Mechanisms of action are processes through which engagement strategies operate to achieve desired outcomes. They might rely on influencing factors, catalysts, an aspect of an intervention action, or a combination of elements.

Identifying relevant theories or models for missing elements

Catalysts and capacity

Within our evaluation of academic-policy interventions, we identified a need to develop the original domain of catalysts beyond ‘policy/programme need for research’ and ‘new research with potential policy relevance’. Redman et al. characterised a catalyst as “a need for information to answer a particular problem in policy or program design, or to assist in supporting a case for funding” in the original framework (p. 149). We expanded this “need for information” to a perceived need for engagement, by either HEI staff or policy professionals, linking to the potential value they perceived in engaging. Specifically, there was a need to consider catalysts at the level of individual engagement, for example HEI staff wanting research to have real-world impact, or policy professionals’ desires to improve decision-making in policy, where productive interactions between academic and policy stakeholders are “necessary interim steps in the process that lead to societal impact” ( [ 24 ], p. 214). The catalyst domain expands the original emphasis on a need for research, to take account of challenges to be overcome by both the academic and policy communities in knowing how, and with whom, to engage and collaborate with [ 25 ].

We used a model proposing that there are three components for any behaviour: capability, opportunity and motivation, which is known as the COM-B model [ 26 ]. Informed by CAPE evaluation activities and our discussions with stakeholders, we mapped the opportunity and motivation constructs into the ‘catalysts’ domain of the original framework. Opportunity is an attribute of the system that can facilitate engagement. It may be a tangible factor such as the availability of seed funding, or a perceived social opportunity such as institutional support for engagement activities. Opportunity can act at the macro level of systems and organisational structures. Motivation acts at the micro level, deriving from an individual’s mental processes that stimulate and direct their behaviours; in this case, taking part in academic-policy engagement actions. The COM-B model distinguishes between reflective motivation through conscious planning and automatic motivation that may be instinctive or affective [ 26 ].

We presented an early application of the COM-B model to catalysts for engagement at an academic conference, enabling an informal exploration of attendees’ subjective views on the clarity and appropriateness, when developing the framework. This application introduces possibilities for intervention development and support by highlighting ‘opportunities’ and ‘motivations’ as key catalysts in the modified framework.

Within the ‘capacity’ domain, we retained the original levels of individuals, organisations and systems. We introduced individual capability as a construct from the COM-B model, describing knowledge, skills and abilities to generate behaviour change as a precursor of academic-policy engagement. This reframing extends the applicability to HEI staff as well as policy professionals. It brings attention to different starting conditions for individuals, such as capabilities developed through previous experience, which can link with social opportunity (for example, through training or support) as a catalyst.

Engagement actions

We identified a need to modify the original domain ‘engagement actions’ to extend the focus beyond the use of research. We added three categories of engagement actions described by Best and Holmes [ 27 ]: linear, relational, and systems. These categories were further specified through a systematic mapping of international organisations’ academic-policy engagement activities [ 5 ]. This framework modification expands the domain to encompass: (i) linear ‘push’ of evidence from academia or ‘pull’ of evidence into policy agencies; (ii) relational approaches focused on academic-policy-maker collaboration; and (iii) systems’ strategies to facilitate engagement for example through strategic leadership, rewards or incentives [ 5 ].

We retained the elements in the original framework’s ‘outcomes’ domain (instrumental, tactical, conceptual and imposed), which we found could apply to outcomes of engagement as well as research use. For example, discussions between a policy professional and a range of academics could lead to a conceptual outcome by considering an issue through different disciplinary lenses. We expanded these elements by drawing on literature on engagement outcomes [ 28 ] and through sense-checking with stakeholders in CAPE. We added capacity-building (changes to skills and expertise), connectivity (changes to the number and quality of relationships), and changes in organisational culture or attitude change towards engagement.

Impacts and sustainability

The original framework contained endpoints described as: ‘Better health system and health outcomes’ and ‘Research-informed health policy and policy documents’. For modification beyond health contexts and to encompass broader intentions of academic-policy engagement, we replaced these elements with a new domain of ‘Impacts and sustainability’. This domain captures the continued activities required in achievement of desirable outcomes [ 29 ]. The modification allows consideration of sustainability in relation to previous stages of engagement interventions, through the identification of beneficial effects that are sustained (or not), in which ways, and for whom. Following Borst [ 30 ], we propose a shift from the expectation that ‘sustainability’ will be a fixed endpoint. Instead, we emphasise the maintenance work needed over time, to sustain productive engagement.

Influences and facilitators

We modified the overarching ‘Policy influences’ (such as public opinion and media) in the original framework, to align with factors influencing academic-policy engagement beyond policy agencies’ use of research. We included influences at the level of the individual (for example, individual moral discretion [ 31 ]), the organisation (for example, managerial practices [ 31 ]) and the system (for example, career incentives [ 32 ]). Each of these processes takes place in the broader context of social, policy and financial environments (that is, potential sources of funding for engagement actions) [ 29 ].

We modified the domain ‘Reservoir of relevant and reliable research’ underpinning the original framework, replacing it with ‘Reservoir of people skills’, to emphasise intangible facilitatory work at the engagement interface, in place of concrete research outputs. We used the ‘Promoting Action on Research Implementation in Health Services’ (PARiHS) framework [ 33 , 34 ], which gives explicit consideration to facilitation mechanisms for researchers and policy-makers [ 13 ] . Here, facilitation expertise includes mechanisms that focus on particular goals (task-oriented facilitation) or enable changes in ways of working (holistic-oriented facilitation). Task-orientated facilitation skills might include, for example, the provision of contacts, practical help or project management skills, while holistic-oriented facilitation involves building and sustaining partnerships or support skills’ development across a range of capabilities. These conceptualisations aligned with our consultations with facilitators of engagement in CAPE. We further extended these to include aspects identified in our evaluation activities: strategic planning, contextual awareness and entrepreneurial orientation.

Piloting and refining the modified framework through stakeholder engagement

We piloted an early version of the modified framework to develop a survey for all CAPE programme participants. During this pilot stage, we sought feedback from the CAPE delivery team members across HEI and policy contexts in England. CAPE delivery team members are based at five collaborating universities with partners in the Parliamentary Office for Science and Technology (POST) and Government Office for Science (GO-Science), and Nesta (a British foundation that supports innovation). The HEI members include academics and professional services knowledge mobilisation staff, responsible for leading and coordinating CAPE activities. The delivery team comprised approximately 15–20 individuals (with some fluctuations according to individual availabilities).

We assessed appropriateness and utility, refined terminology, added domain elements and explored nuances. For example, stakeholders considered the multi-layered possibilities within the domain ‘capacity’, where some HEI or policy departments may demonstrate a belief that it is important to use research in policy, but this might not be the perception of the organisation as a whole. We also sought stakeholders’ views on the utility of the new domains, for example, the identification of facilitator expertise such as acting as a knowledge broker or intermediary; providing training, advice or guidance; facilitating engagement opportunities; creating engagement programmes; and sustainability of engagement that could be conceptualised at multiple levels: personally, in processes or through systems.
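
To illustrate how the modified framework’s domains might be turned into an evaluation instrument of this kind, the sketch below maps a few SPIRIT-ME domains to example survey items in Python. Both the item wording and the domain-to-item pairing are hypothetical, offered only to show the structure; this is not the CAPE survey.

```python
# Hypothetical mapping of SPIRIT-ME domains to illustrative survey items.
spirit_me_survey = {
    "Catalysts": [
        "What opportunities (e.g. funding, institutional support) prompted you to engage?",
        "What motivated you personally to take part in academic-policy engagement?",
    ],
    "Capacity": [
        "How would you rate your skills and confidence for working across the academic-policy interface?",
    ],
    "Engagement actions": [
        "Which activities did you take part in (e.g. fellowship, seed-funded project, training, knowledge exchange event)?",
    ],
    "Outcomes": [
        "Did the engagement change your skills, your professional connections, or your organisation's attitude to engagement?",
    ],
    "Impacts and sustainability": [
        "Which, if any, of these changes have been sustained, and what has been needed to sustain them?",
    ],
}

for domain, items in spirit_me_survey.items():
    print(domain)
    for item in items:
        print(f"  - {item}")
```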

Testing against criteria for useful action framework

The modified framework fulfils the properties of a useful action framework [ 20 ]:

It has a clearly articulated purpose: development and evaluation of academic-policy engagement interventions through linear, relational and/or system approaches. It has identified loci for change, at the level of the individual, the organisation or system.

It has been informed by existing understandings, including conceptual work of the original SPIRIT framework, conceptual models identified from the literature, published empirical findings, understandings from consultation with stakeholders, and evaluation activities in CAPE.

It can be applied to the development, implementation and evaluation of targeted academic-policy engagement actions, the selection of points for intervention and identification of potential outcomes, including the work of sustaining them and unanticipated consequences.

It provides a structure to build knowledge by guiding the generation of hypotheses about mechanisms of action in academic-policy engagement interventions, or by adapting the framework further through application in practice.

The proliferation of frameworks to articulate processes of research translation reveals a need for their adaptation when applied in specific contexts. The majority of models in implementation science relate to translation of research into practice. By contrast, our focus was on engagement between academia and policy. There are a growing number of academic-policy engagement interventions but a lack of published evaluations [ 10 ].

Our framework modification provides an exemplar for others who are considering how to adapt existing conceptual frameworks to address new or expanded purposes. Field et al. identified the multiple, idiosyncratic ways that the Knowledge to Action Framework has been applied in practice, demonstrating its ‘informal’ adaptability to different healthcare settings and topics [ 35 ]. Others have reported on specific processes for framework refinement or extension. Wiltsey Stirman et al. adopted a framework that characterised forms of intervention modification, using a “pragmatic, multifaceted approach” ( [ 36 ], p.2). The authors later used the modified version as a foundation to build a further framework to encompass implementation strategies in a range of settings [ 21 ]. Ouimet et al. used the approach of borrowing from a different disciplinary field for framework adaptation, by using a model of absorptive capacity from management science to develop a conceptual framework for civil servants’ absorption of research knowledge [ 37 ].

We also took the approach of “adapting the tools we think with” ( [ 38 ], p.305) during our evaluation activities on the CAPE programme. Our conceptual modifications align with the literature on motivation and entrepreneurial orientation in determining policy-makers’ and researchers’ intentions to carry out engagement in addition to ‘usual’ roles [ 39 , 40 ]. Our framework offers an enabler for academic-policy engagement endeavours, by providing a structure for approaches beyond the linear transfer of information, emphasising the role of multidirectional relational activities, and the importance of their facilitation and maintenance. The framework emphasises the relationship between individuals’ and groups’ actions, and the social contexts in which these are embedded. It offers additional value by capturing the organisational and systems level factors that influence evidence-informed policymaking, incorporating the dynamic features of contexts shaping engagement and research use.

Conclusions

Our modifications extend the original SPIRIT framework’s focus on policy agencies’ use of research, to encompass dynamic academic-policy engagement at the levels of individuals, organisations and systems. Informed by the knowledge and experiences of policy professionals, HEI staff and knowledge mobilisers, it is designed to be meaningful and accessible for people working across varied contexts and functions in the evidence-policy ecosystem. It has potential applications in how academic-policy engagement interventions might be developed, evaluated, facilitated and improved, and it fulfils Redman et al.’s criteria as a useful action framework [ 20 ].

We are testing the ‘SPIRIT-Modified for Engagement’ framework (SPIRIT-ME) through our ongoing evaluation of academic-policy engagement activities. Further empirical research is needed to explore how the framework may capture ‘additionality’, that is, to identify what is achieved through engagement actions in addition to what would have happened anyway, including long-term changes in strategic behaviours or capabilities [ 41 , 42 , 43 ]. Application of the modified framework in practice will highlight its strengths and limitations, to inform further iterative development and adaptation.

Availability of data and materials

Not applicable.

Stewart R, Dayal H, Langer L, van Rooyen C. Transforming evidence for policy: do we have the evidence generation house in order? Humanit Soc Sci Commun. 2022;9(1):1–5.


Sanderson I. Complexity, ‘practical rationality’ and evidence-based policy making. Policy Polit. 2006;34(1):115–32.

Lewin S, Glenton C, Munthe-Kaas H, Carlsen B, Colvin CJ, Gülmezoglu M, et al. Using Qualitative Evidence in Decision Making for Health and Social Interventions: An Approach to Assess Confidence in Findings from Qualitative Evidence Syntheses (GRADE-CERQual). PLOS Med. 2015;12(10):e1001895.


Bonell C, Meiksin R, Mays N, Petticrew M, McKee M. Defending evidence informed policy making from ideological attack. BMJ. 2018;362:k3827.

Hopkins A, Oliver K, Boaz A, Guillot-Wright S, Cairney P. Are research-policy engagement activities informed by policy theory and evidence? 7 challenges to the UK impact agenda. Policy Des Pract. 2021;4(3):341–56.


Head BW. Toward More “Evidence-Informed” Policy Making? Public Adm Rev. 2016;76(3):472–84.

Walker LA, Lawrence NS, Chambers CD, Wood M, Barnett J, Durrant H, et al. Supporting evidence-informed policy and scrutiny: A consultation of UK research professionals. PLoS ONE. 2019;14(3):e0214136.


Graham ID, Tetroe J, Group the KT. Planned action theories. In: Knowledge Translation in Health Care. John Wiley and Sons, Ltd; 2013. p. 277–87. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/9781118413555.ch26 Cited 2023 Nov 1

Davies HT, Powell AE, Nutley SM. Mobilising knowledge to improve UK health care: learning from other countries and other sectors – a multimethod mapping study. Southampton (UK): NIHR Journals Library; 2015. (Health Services and Delivery Research). Available from: http://www.ncbi.nlm.nih.gov/books/NBK299400/ Cited 2023 Nov 1

Oliver K, Hopkins A, Boaz A, Guillot-Wright S, Cairney P. What works to promote research-policy engagement? Evid Policy. 2022;18(4):691–713.

Nelson JP, Lindsay S, Bozeman B. The last 20 years of empirical research on government utilization of academic social science research: a state-of-the-art literature review. Adm Soc. 2023;28:00953997231172923.

Bell D. Technology, nature and society: the vicissitudes of three world views and the confusion of realms. Am Sch. 1973;42:385–404.

Milat AJ, Li B. Narrative review of frameworks for translating research evidence into policy and practice. Public Health Res Pract. 2017; Available from: https://apo.org.au/sites/default/files/resource-files/2017-02/apo-nid74420.pdf Cited 2023 Nov 1

Meyers DC, Durlak JA, Wandersman A. The quality implementation framework: a synthesis of critical steps in the implementation process. Am J Community Psychol. 2012;50(3–4):462–80.


Cane J, O’Connor D, Michie S. Validation of the theoretical domains framework for use in behaviour change and implementation research. Implement Sci. 2012;7(1):37.

Glasgow RE, Battaglia C, McCreight M, Ayele RA, Rabin BA. Making implementation science more rapid: use of the RE-AIM framework for mid-course adaptations across five health services research projects in the veterans health administration. Front Public Health. 2020;8. Available from: https://www.frontiersin.org/articles/10.3389/fpubh.2020.00194 Cited 2023 Jun 13

Nilsen P. Making sense of implementation theories, models and frameworks. Implement Sci. 2015;10:53. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4406164/ (accessed 4 May 2020).

Sheth A, Sinfield JV. An analytical framework to compare innovation strategies and identify simple rules. Technovation. 2022;115:102534.

Birken SA, Powell BJ, Shea CM, Haines ER, Alexis Kirk M, Leeman J, et al. Criteria for selecting implementation science theories and frameworks: results from an international survey. Implement Sci. 2017;12(1):124.

Redman S, Turner T, Davies H, Williamson A, Haynes A, Brennan S, et al. The SPIRIT Action Framework: A structured approach to selecting and testing strategies to increase the use of research in policy. Soc Sci Med. 2015;136:147–55.

Miller CJ, Barnett ML, Baumann AA, Gutner CA, Wiltsey-Stirman S. The FRAME-IS: a framework for documenting modifications to implementation strategies in healthcare. Implement Sci. 2021;16(1):36.

CAPE. Capabilities in Academic Policy Engagement. 2021. Available from: https://www.cape.ac.uk/ (accessed 3 Aug 2021).

CIPHER Investigators. Supporting policy in health with research: an intervention trial (SPIRIT)—protocol for a stepped wedge trial. BMJ Open. 2014;4(7):e005293.

Spaapen J, Van Drooge L. Introducing ‘productive interactions’ in social impact assessment. Res Eval. 2011;20(3):211–8.

Williams C, Pettman T, Goodwin-Smith I, Tefera YM, Hanifie S, Baldock K. Experiences of research-policy engagement in policymaking processes. Public Health Res Pract. 2023. Online early publication. https://doi.org/10.17061/phrp33232308 .

Michie S, van Stralen MM, West R. The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implement Sci. 2011;6(1):42.

Best A, Holmes B. Systems thinking, knowledge and action: towards better models and methods. Evid Policy J Res Debate Pract. 2010;6(2):145–59.

Edwards DM, Meagher LR. A framework to evaluate the impacts of research on policy and practice: a forestry pilot study. For Policy Econ. 2020;114:101975.

Scheirer MA, Dearing JW. An agenda for research on the sustainability of public health programs. Am J Public Health. 2011;101(11):2059–67.

Borst RAJ, Wehrens R, Bal R, Kok MO. From sustainability to sustaining work: what do actors do to sustain knowledge translation platforms? Soc Sci Med. 2022;296:114735.

Zacka B. When the state meets the street: public service and moral agency. Harvard University Press; 2017. Available from: https://books.google.co.uk/books?hl=en&lr=&id=3KdFDwAAQBAJ&oi=fnd&pg=PP1&dq=zacka+when+the+street&ots=x93YEHPKhl&sig=9yXKlQiFZ0XblHrbYKzvAMwNWT4 (accessed 28 Nov 2023).

Torrance H. The research excellence framework in the United Kingdom: processes, consequences, and incentives to engage. Qual Inq. 2020;26(7):771–9.

Rycroft-Malone J. The PARIHS framework—a framework for guiding the implementation of evidence-based practice. J Nurs Care Qual. 2004;19(4):297–304.

Stetler CB, Damschroder LJ, Helfrich CD, Hagedorn HJ. A guide for applying a revised version of the PARIHS framework for implementation. Implement Sci. 2011;6(1):99.

Field B, Booth A, Ilott I, Gerrish K. Using the knowledge to action framework in practice: a citation analysis and systematic review. Implement Sci. 2014;9(1):172.

Wiltsey Stirman S, Baumann AA, Miller CJ. The FRAME: an expanded framework for reporting adaptations and modifications to evidence-based interventions. Implement Sci. 2019;14(1):58.

Ouimet M, Landry R, Ziam S, Bédard PO. The absorption of research knowledge by public civil servants. Evid Policy. 2009;5(4):331–50.

Martin D, Spink MJ, Pereira PPG. Multiple bodies, political ontologies and the logic of care: an interview with Annemarie Mol. Interface - Comun Saúde Educ. 2018;22:295–305.

Sajadi HS, Majdzadeh R, Ehsani-Chimeh E, Yazdizadeh B, Nikooee S, Pourabbasi A, et al. Policy options to increase motivation for improving evidence-informed health policy-making in Iran. Health Res Policy Syst. 2021;19(1):91.

Athreye S, Sengupta A, Odetunde OJ. Academic entrepreneurial engagement with weak institutional support: roles of motivation, intention and perceptions. Stud High Educ. 2023;48(5):683–94.

Bamford D, Reid I, Forrester P, Dehe B, Bamford J, Papalexi M. An empirical investigation into UK university–industry collaboration: the development of an impact framework. J Technol Transf. 2023. Available from: https://doi.org/10.1007/s10961-023-10043-9 (accessed 20 Dec 2023).

McPherson AH, McDonald SM. Measuring the outcomes and impacts of innovation interventions assessing the role of additionality. Int J Technol Policy Manag. 2010;10(1–2):137–56.

Hind J. Additionality: a useful way to construct the counterfactual qualitatively? Eval J Australas. 2010;10(1):28–35.

Acknowledgements

We are very grateful to the CAPE Programme Delivery Group members, for many discussions throughout this work. Our thanks also go to the Sax Institute, Australia (where the original SPIRIT framework was developed), for reviewing and providing helpful feedback on the article. We also thank our reviewers who made very constructive suggestions, which have strengthened and clarified our article.

Funding

The evaluation of the CAPE programme, referred to in this report, was funded by Research England. The funding body had no role in the design of the study, the analysis, the interpretation, or the writing of the manuscript.

Author information

Authors and affiliations

Department of Health Services Research and Policy, Faculty of Public Health and Policy, London School of Hygiene and Tropical Medicine, 15-17 Tavistock Place, Kings Cross, London, WC1H 9SH, UK

Petra Mäkelä & Kathryn Oliver

Health and Social Care Workforce Research Unit, The Policy Institute, Virginia Woolf Building, Kings College London, 22 Kingsway, London, WC2B 6LE, UK

Annette Boaz

Contributions

PM conceptualised the modification of the framework reported in this work. All authors made substantial contributions to the design of the work. PM drafted the initial manuscript. AB and KO contributed to revisions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Petra Mäkelä.

Ethics declarations

Ethics approval and consent to participate.

Ethics approval was granted for the overarching CAPE evaluation by the London School of Hygiene and Tropical Medicine Research Ethics Committee (reference 26347).

Consent for publication

Competing interests.

The authors declare that they have no competing interests.

About this article

Cite this article.

Mäkelä, P., Boaz, A. & Oliver, K. A modified action framework to develop and evaluate academic-policy engagement interventions. Implementation Sci 19 , 31 (2024). https://doi.org/10.1186/s13012-024-01359-7

Received : 09 January 2024

Accepted : 20 March 2024

Published : 12 April 2024

DOI : https://doi.org/10.1186/s13012-024-01359-7

Keywords

  • Evidence-informed policy
  • Academic-policy engagement
  • Framework modification

Research Article

Constructing an evaluation model for the comprehensive level of sustainable development of provincial competitive sports in China based on DPSIR and MCDM

Authors and affiliations

  • Ke Xu (Writing – original draft). Affiliations: School of Physical Education, Shanghai University of Sport, Shanghai, China; College of Physical Education, Quanzhou Normal University, Quanzhou, China. Corresponding author e-mail: [email protected]
  • Hung-Lung Lin (Methodology). Affiliation: School of Economics and Management, Sanming University, Sanming, China
  • Third co-author, Qiu J in the citation below (Writing – review & editing). Affiliation: School of Foreign Languages, Quanzhou Normal University, Quanzhou, China

  • Published: April 16, 2024
  • https://doi.org/10.1371/journal.pone.0301411

This study focuses on the objective assessment of sports development in its socio-economic environment, considering the challenges faced by the industry: disparities in regional investment, limited market participation, slow progress towards sports professionalization, and insufficient technological innovation. To tackle these challenges, we propose an integrated evaluation model that follows the DPSIR (Drivers, Pressures, States, Impacts, Responses) framework and incorporates comprehensive socio-economic indicators. We then used the entropy weight method and TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) to comprehensively assess the progress of competitive sports development in 31 provinces and cities in China, recommended further developments in competitive sports, and proposed targeted strategies for promoting its growth. The framework and methodology developed in this paper provide an objective, scientifically grounded set of decision-making guidelines that government agencies and related industries can adopt to create plans that promote the sustainable growth of competitive sport. This is expected to bolster the nation's global influence, enhance social unity, and fuel economic expansion. The findings offer policymakers valuable insights into competitive sports and can advance the development of China's sports sector, making it a crucial driver of regional socio-economic progress.

Citation: Xu K, Lin H, Qiu J (2024) Constructing an evaluation model for the comprehensive level of sustainable development of provincial competitive sports in China based on DPSIR and MCDM. PLoS ONE 19(4): e0301411. https://doi.org/10.1371/journal.pone.0301411

Editor: Mehdi Keshavarz-Ghorabaee, Gonbad Kavous University, Islamic Republic of Iran

Received: November 9, 2023; Accepted: March 16, 2024; Published: April 16, 2024

Copyright: © 2024 Xu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting information files.

Funding: This paper is funded by the Humanities and Social Sciences Youth Foundation of the Ministry of Education, China (No. 20YJC890034).

Competing interests: The authors have declared that no competing interests exist.

Introduction

Background and motivation of the study.

With the progress of human civilization, the continued development of the economy, culture, science, and technology, and the accompanying changes in how people live, competitive sports have increasingly become a symbol of national soft power, playing an irreplaceable role in uplifting the national spirit and enhancing national cohesion. The primary purpose of competitive sports in China was initially to secure medals in international competitions. As society has changed, however, competitive sports have not only enhanced the country's soft power but also assumed an important role in promoting economic development, in particular by stimulating the domestic economy and creating new points of economic growth. The interaction between competitive sports and the social economy has become increasingly evident: sport is an important part of socio-economic development, and competitive sport is a major contributor to regional economic development.

According to data from the General Administration of Sport of China and the National Bureau of Statistics, consistent investment helped China win a total of 251 medals at the London, Rio, and Tokyo Olympic Games, and between 2013 and 2022 Chinese athletes also won 967 world championship titles and set 113 world records in international competitions, including the Olympic Games. Sustained high-level competitive sport has greatly boosted the development of the sports industry: the total size of China's sports industry grew from 1.1 trillion yuan to 3.5 trillion yuan between 2013 and 2022, an increase of 2.4 trillion yuan over ten years, and the industry's share of GDP rose from 0.63% to 2.89%, reaching roughly 4.6 times its earlier level. Competitive sports therefore not only reflect a country's soft power but also play an important role in promoting social and economic development, which makes the sustainability of competitive sports programs an important research issue. The boost to domestic consumption from competitive sporting events is also significant. According to the Guangzhou Sports Bureau and the Guangzhou Bureau of Statistics, at the 2010 Guangzhou Asian Games China not only ranked first in the medal table with 199 gold, 119 silver, and 98 bronze medals, but the "Asian Games effect" also brought direct economic benefits of up to 800 billion yuan to Guangzhou. In 2010, Guangzhou's GDP surpassed one trillion yuan and the city's total tourism revenue exceeded 100 billion yuan, reaching 125.461 billion yuan, a year-on-year increase of 26.21%. The Games also promoted the construction of stadiums and the rapid development of the sports industry: by 2013 the number of stadiums in Guangzhou had doubled compared with 2003, and the added value of the sports industry was 28.22 billion yuan [ 1 ], accounting for 1.84% of the city's GDP and maintaining a high rate of growth for six consecutive years. The importance that the Chinese government attaches to competitive sports is reflected in the documents it has issued.

Since 1984, when China won its first Olympic gold medal, the General Administration of Sport (GAS) and the State Council have introduced many policies to promote the quality and sustainable development of competitive sports, and through these policies, the country’s soft power is enhanced to promote China’s social and economic development. For example, Beijing’s GDP increased by 105.5 billion yuan from 2004 to 2008 as a result of hosting the 2008 Olympic Games, according to the National Bureau of Statistics. The average annual GDP growth was 12.3% from 2005 to 2007, and 1.82 million job opportunities were created in the Beijing area. The “Olympic effect” has also contributed to the long-term development of the Beijing area, with an average annual growth rate of 5% for domestic tourists and 8% for foreign tourists. Additionally, the city received 4 million domestic and foreign tourists during the Beijing Olympics. In the period from 2001 to 2007, Beijing experienced an average annual increase of over RMB 15 billion in retail sales of consumer goods, attributed to the Olympic Games. This resulted in a total increase of RMB 110 billion in retail sales of consumer goods over the course of seven years. From the 2008 Beijing Olympics to the 2022 Beijing Winter Olympics, the Chinese government has achieved some success in international soft power and socio-economic development as a result of its active investment in the sustainable development of sport.

However, many problems remain, which are illustrated as follows:

(1) The problem of unbalanced inter-regional development.

Regarding the uneven development of competitive sports, Ma and Kurscheidt [ 2 ] argued that the allocation of resources to competitive sports in China is measured by the number of medals won by athletes affiliated with provincial sports bureaus (PSBs) at the National Games of China (NGC). The provincial government (PG) provides incentives to its athletes and coaches based on the number of medals won at the NGC, and the salaries of PSB managers at all levels are paid by the PG rather than the General Administration of Sport (GAS). The only criterion for promotion or salary increases for PSB personnel at all levels is likewise the number of NGC medals won by athletes from the PSB's province, so the NGC medal count has become the clear indicator of a province's competitive sports performance. The PSBs' excessive focus on NGC medal counts inevitably leads to a rush for high-level athletes in the elite transfer market, and economic strength becomes the most critical determinant of whether a province can obtain such athletes. Because regional economic strength differs, the distribution of competitive sports resources has become unbalanced: some regions have become gathering places for elite athletes, while economically weaker regions struggle to attract them. This produces a pronounced Matthew effect in provincial sports and, in the long run, puts the sustainable development of competitive sports in China out of reach.

(2) Insufficient market participation.

Yang [ 3 ] observed that, judging by the effects of the Chinese government's long-term policy of promoting competitive sports, investing large amounts of human, material, and financial resources over the short and medium term (5–15 years) can yield significant benefits for the country's economic and social development. After major events, however, large sports venues and hotels may sit idle or be rarely used. In the long run there is no effective synergy between the sports industry and other industries. Because of the industry's heavy dependence on the government, the reluctance of firms to commit resources, dysfunctional market mechanisms across industries, and weak incentives for research and development, the sustainable development of the competitive sports industry is ultimately significantly constrained.

(3) Lagging professionalization reforms.

Peng [ 4 ] claimed that professionalization reform is a top priority for the sustainable development of competitive sports. The government's massive investment of resources in elite sport is unsustainable, and China must reduce the amount of state funding spent per gold medal. Making professional clubs one of the training pathways for China's elite athletes and high-level coaches can lower their dependence on the government, so the professionalization reform of competitive sports is of the utmost importance. Professional leagues such as the NBA, the MFL, the English Premier League, and Serie A train world-class athletes and coaches without state financial investment while also driving the development of their countries' sports industries, boosting employment, and creating very high economic value. A world-class professional league is therefore key to the sustainable, high-quality development of competitive sports in China in the future. In this regard, China's professionalization reform of competitive sports still lags behind and needs to catch up.

(4) Low level of technology empowerment.

Science and technology will undoubtedly play a pivotal role in the future of global competition in sports; the rise of competitive sports in the UK over the last decade or so reflects this trend. The lack of scientific and technological advancement in China's competitive sports is evident in the insufficient scientific level of sports training and the limited capacity to support scientific training. In the new stage of development, science and technology are important for enhancing the efficiency of competitive sports and facilitating their high-quality and sustainable development in China.

(5) Some conflict exists between national and provincial sports organizations.

Zheng et al. [ 5 ] revealed the characteristics of conflict between national and provincial sports organizations through a study of inter-organizational conflict in three sports (artistic gymnastics, swimming, and cycling), examining evidence of conflict, the factors leading to it, initiatives to mitigate it, and its impact on elite sport. Although the degree and character of national-provincial conflict differ between sports, the causes of conflict and the measures used to deal with it are broadly consistent. Both "horizontal coordination at the national level" and "vertical coordination between the national policy level and the regions" help to alleviate conflict to some extent, and the effective adoption of measures to ease organizational conflict is of great significance for improving the overall performance of China's elite sports in the future.

(6) Weak succession of human resources related to competitive sports.

Athletes and coaches are key to success in international competitive sports, but as a result of China's 30-year one-child policy and the dwindling number of elite coaches, the number of athletes and coaches in the country is decreasing. As international competition in competitive sports intensifies, this shortage of human resources must be addressed if China is to build itself into a strong sporting nation. Li et al. [ 6 ] concluded that China's one-child policy has led to a decline in the birth rate and in the number of youth football participants, which will affect the development of professional football in China. Chen and Chen [ 7 ] suggested that the traditional way of training elite coaches is no longer suited to the current competitive landscape of international sport and that the training pathway for elite coaches needs further reform, especially in terms of sustainable development, professional training, quality training, and the establishment of a "dual-track" training model for sports coaches, in which the education system and the sports system share elite coaching resources.

Scholars have conducted a variety of studies to address the problems of regional imbalance, insufficient market participation, low level of scientific and technological empowerment, organizational conflicts, and human resources for competitive sports, but to date, there have been no in-depth studies on the sustainable development of provincial elite sports in the Chinese context.

Objectives of the study

In view of the above challenges, this study aims to achieve the following objectives.

  • (1) Integrating an assessment model: An assessment model based on social and economic indicators was created to distinguish this work from previous studies. The model follows the DPSIR framework and incorporates not only economic indicators but also a range of sport-related and social-environment indicators. By objectively evaluating social and economic factors, this study aims to assess the level of development of competitive sports across provinces, with emphasis on the interdependent relationship between sports and the economy.
  • (2) Addressing Regional Disparities: This study aims to address the regional imbalances in the development of competitive sports. It provides insights and recommendations to reduce the development gap between provinces, thus promoting more equitable growth of the national competitive sports sector.
  • (3) Promoting Market Participation: This study addresses the challenge of insufficient market participation in competitive sport by analyzing the impact of long-term government investment and policy on competitive sport and providing recommendations for improving synergies between the sports industry and other sectors. This entails ensuring the efficient utilization of sports facilities and infrastructure beyond particular events, which can contribute to sustainable development.
  • (4) Facilitating Professionalization Reforms: The study assesses the advancement of professionalization reforms in competitive sports and highlights the need for China to reduce its reliance on extensive state funding for each gold medal. It explores the potential advantages of cultivating professional sports leagues similar to international counterparts such as the NBA, the MFL, the English Premier League, and Serie A, which operate with limited government funding and make significant contributions to the economic and employment growth of their respective nations.
  • (5) Promoting Technology Enablement: The significance of science and technology in the advancement of competitive sports is highlighted. This involves improving the scientific elements of sports training, strengthening logistical support for scientific training, and integrating state-of-the-art technology into training programs, with the aim of enhancing the efficiency, quality, and sustainability of competitive sports through technological advancement.

Research innovations and contributions

This study presents a novel approach by creating an integrated assessment model based on the DPSIR framework that goes beyond typical economic indicators to incorporate sport-related and socio-environmental indicators, offering a thorough understanding of the relationship between competitive sport and economic progress. Moreover, combining the entropy weight method with the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) allows subjective and objective evaluation to be combined; the evaluation principle is clear and operable and has a wide range of applications [ 8 ]. These two methods are described in more detail in the research methods section. Importantly, combining the qualitative DPSIR framework with the quantitative entropy and TOPSIS models provides a valuable tool for government officials, industry leaders, and policymakers. This tool aims to better guide the integration of sport into broader economic development strategies, raise awareness of the economic value of sport, and improve the planning and management of China's sports industry.

The DPSIR model and sustainability in sport

The DPSIR model

"Driving forces-Pressure-State-Impact-Response" is referred to as the DPSIR model, which was first proposed by the Organization for Economic Cooperation and Development (OECD) (Niemeijer and Groot [ 9 ]). The elements of the DPSIR model are explained as follows [ 10 ]. (1) Driving forces are the underlying causes and factors that drive change in a system; they are typically social, economic, and political, and include population growth, technological innovation, policy change, and resource utilization, among others. Driving forces act on the system and initiate a series of transformations. (2) Pressure refers to the forces or impacts exerted on a system from within or from external sources, such as over-exploitation of resources, environmental pollution, and ecosystem destruction; pressure usually has a negative effect on the state of the system. (3) State is a description or characterization of the current condition of the system. In the DPSIR model, the state typically encompasses a range of environmental, social, or economic indicators and parameters, such as water quality, air quality, biodiversity, population health, and economic growth. (4) Impacts are the results or changes produced by the interaction of pressures and states; they can be positive or negative, visible or latent, and are usually used to gauge the extent to which the system has been affected. (5) Responses are measures taken by society or government in reaction to pressures and impacts, including formulating policies, implementing regulations, environmental protection initiatives, and social action. The goal of a response is usually to reduce pressure, improve the state, minimize adverse impacts, and enhance the sustainability of the system. Through this approach, policymakers and researchers can identify key issues in a system and devise appropriate responses to promote sustainable development and improve overall system performance. The model has been widely used in environmental management, sustainability research, policy making, and social science research.
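To make the structure of a DPSIR-based indicator system concrete, the short Python sketch below shows one possible way of organising indicators under the five DPSIR categories; the indicator names are illustrative placeholders only and are not the indicators used in this study.

# A minimal sketch of a DPSIR-style indicator system (illustrative indicator names only).
from typing import Dict, List

dpsir_indicators: Dict[str, List[str]] = {
    "Driving forces": ["regional GDP per capita", "population size", "sports policy intensity"],
    "Pressure": ["competition for elite athletes", "funding constraints"],
    "State": ["number of registered athletes", "stadiums per 10,000 residents"],
    "Impact": ["medals won", "sports industry share of GDP"],
    "Response": ["professionalization reforms", "science and technology investment"],
}

if __name__ == "__main__":
    # Print how many illustrative indicators sit under each DPSIR category.
    for category, indicators in dpsir_indicators.items():
        print(f"{category}: {len(indicators)} indicator(s) -> {', '.join(indicators)}")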

Since it was first proposed, the DPSIR model has been widely used in sustainability planning assessment and research in environmental and economic development. For example,

(1) Studies on the natural resource environment: Malmir et al. [ 11 ] used the DPSIR model to address groundwater resource management problems in the Najafabad region of central Iran, providing a reference for managers and decision-makers. Wang and Yang [ 12 ] employed the DPSIR model to comprehensively evaluate the sustainable development potential of China's shale gas industry, selecting economic, environmental, resource, and technological factors and constructing an evaluation index system within the shale gas DPSIR framework; they concluded that the sustainable development potential of the industry is relatively low and that the potential of the southwest region is better than that of the northwest. Khan et al. [ 13 ] constructed a DPSIR analytical framework to evaluate rural sustainable development efficiency (RSDE) in the Yellow River Basin of China using panel data for nine provinces from 1997 to 2017; they found that the initial RSDE, cropping structure, rate of financial autonomy, and level of mechanization inhibited increases in RSDE, whereas the level of urbanization and rural GDP per capita had a negative and non-significant effect on RSDE across the watershed. Hou et al. [ 14 ] used the DPSIR model to conduct a comprehensive evaluation of 13 prefecture-level cities in Jiangsu Province, China, with the aim of proposing a sustainable development response that integrates ecological integrity, ecosystem services, and human well-being as the province develops.

(2) Research with a focus on socio-economic development: Gupta et al. [ 15 ] applied the DPSIR model to investigate the worldwide developments and challenges brought to light by the COVID-19 pandemic. They found that the "state" and "impact" experienced by the affluent were prioritized over those of the poor, and that the response ignored the underlying "drivers" and "pressures" behind the pandemic's global spread in favor of a swift economic recovery; using government funds simply to restore business as usual risks a vicious cycle of further ecological degradation, socio-economic inequality, and domestic abuse, whereas an inclusive development approach based on the DPSIR model would create a virtuous circle by emphasizing human health, well-being, and ecosystem regeneration. Liu et al. [ 16 ] proposed a Driver-Pressure-State-Impact-Response (DPSIR) framework to study the major socio-economic influences on SO2 emissions in Chinese cities, finding that urbanization, sustained economic growth, optimization of the industrial structure, and improvements in energy efficiency played a significant role in lowering SO2 emissions, whereas government-mandated environmental monitoring and financial investment in technology were ineffective in reducing them. Brunhara et al. [ 17 ] employed the DPSIR model to develop an indicator system of 16 driver (D), 74 pressure (P), 23 state (S), 35 impact (I), and 38 response (R) indicators to help a cooperative of waste pickers in Brazil self-assess its social, environmental, and economic performance and propose highly targeted improvements. Kim et al. [ 18 ] used the DPSIR framework to structure fire-related issues and establish a coherent causal path among drivers, pressures, states, impacts, and responses in pursuit of effective, efficient, and equitable global fire governance; they argued that the overall decline in global burned area masks economic realities that increase the likelihood and cost of fire hazards, and suggested new indicators to assess and communicate the impact of global economic drivers on fire activity. Afrin and Shammi [ 19 ] used the DPSIR framework to examine the impact of the COVID-19 pandemic on women's education, occupations, and health in Bangladesh, focusing on five Sustainable Development Goals (SDGs) that directly affect women's livelihoods and well-being: no poverty; good health and well-being; quality education; gender equality; and decent work and economic growth. The study indicates that the current neoliberal market economy has not protected the world from a pandemic and that establishing a harmonious society centered on caring relationships between nature, humans, and society will require dismantling the present economic system.

In summary, the DPSIR model plays a key role in environmental and economic sustainability assessments and studies across many fields. It helps address natural resource and environmental concerns, such as groundwater resource management, and provides critical reference material for decision makers; it has been used to examine the sustainable development potential of China's shale gas sector and the efficiency of sustainable rural development in the Yellow River Basin, along with the relevant influencing factors. In the field of socio-economic development, the DPSIR model offers comprehensive analysis and response tactics on topics such as managing urban land, reducing urban SO2 emissions, global fire governance, and coping with global development challenges during the COVID-19 pandemic. This broad applicability is why this study adopts the DPSIR model in constructing our evaluation model.

The sustainability of the sports sector

Most evaluations, programs, and studies of sustainability in the sports sector and industry focus on resource allocation and the sustainable development of the industry. For example, Zou et al. [ 20 ] adopted structural equation modeling (SEM) to investigate the factors that influence audience loyalty on Chinese live sports platforms, examining these factors at the government, industry, and platform levels. The findings revealed that the audience's perceived value of the live broadcast platform directly affects their loyalty, as do the platform's functionality, ease of use, and quality of service; these findings offer valuable insight into sales and marketing plans in the sports industry and, in turn, into its sustainability. Qin and Liu [ 21 ] analyzed the allocation of sports resources in colleges and universities in Shaanxi Province, China, through an empirical investigation and proposed countermeasures and methods for sharing sports resources among institutions such as Xi'an Jiaotong University and Xi'an University of Foreign Studies, providing valuable insights into the sustainability of sports resources in higher education. Kadagi et al. [ 22 ] concluded from a survey study that sustainability plans for the marine sports industry can create obstacles for the local socio-cultural and economic environment and for the development of regional fisheries, and that a comprehensive assessment of these obstacles is therefore important to reduce the impact of the industry's development. Zhao et al. [ 23 ] constructed an environmental science model for sports tourism based on socio-economic benefits and ecological environmental protection, which addresses the drawback of the traditional sports tourism development model of focusing only on operation, development, and management while ignoring economic and social benefits and ecological protection, and put forward a proposal for the sustainable development of resource and functional integration in sports tourism. Josa et al. [ 24 ] proposed a new method for the rigid structure of stadium rooftop buildings that integrates value modeling and multi-criteria decision-making to construct an evaluation index system, providing an important reference for the sustainability and durability of stadiums in use. Bellver et al. [ 25 ] conducted an empirical questionnaire survey of 374 college students majoring in Physical Activity and Sports Sciences (PASS) and found that the conditions for effectively creating and managing sustainable businesses, a high level of social and civic values, and the support of the surrounding environment are significant factors in achieving sustainable entrepreneurial intentions among physical education students.

With the above in mind, studies on sustainability involving the sports sector and industry have mostly focused on the sustainable management of the industry, resource development, resource allocation, and environmental construction. In short, they are each focused on their own areas and an integrated evaluation model is lacking. Accordingly, this study constructed an evaluation model for the sustainability of competitive sports using the DPSIR model as the theoretical basis. The aim was to address the following problems in competitive sports: (1) the imbalance of development among different regions, (2) the low level of market participation, (3) the slow progress of professionalization reform, and (4) the low level of scientific and technological empowerment. This model aims to provide a set of systematic and scientific decision-making reference standards for government departments, managers, and decision-makers in related industries to support them in establishing sustainable development plans for competitive sports.

Research methods

This study constructed an evaluation index system for the sustainable development of competitive sports based on the DPSIR model. The evaluation model was built in three stages. First, the evaluation index system was divided into five criteria based on the DPSIR model, after which the assignment of each index to each level was determined using the modified Delphi method. In the second stage, entropy was used to determine the weights of each criterion in the index system. In the third stage, TOPSIS (Technique for Order Preference by Similarity to an Ideal Solution) was used to rank the alternatives by their closeness to the ideal solution and to verify the suitability of the evaluation index system. Finally, in response to the analysis results, specific measures for the sustainable development of competitive sports in each province were proposed. The framework of this paper is shown in Fig 1. The methodology of this study is described in the following sections.

Fig 1. The research framework of this study. https://doi.org/10.1371/journal.pone.0301411.g001

Constructing an evaluation index system based on the Delphi method and the modified Delphi method

The Delphi method was pioneered by Helmer and Dalkey in the 1940s and was later expanded upon by Gordon and the RAND Corporation [ 26 ]. The method has been applied in the military, technology, medical care, and market-demand settings to make qualitative predictions while avoiding blind deference to authority, and it has gained widespread acceptance. Cho and Lin [ 27 ] pointed out that the implementation of the Delphi method can be organized into six steps: (1) drawing up the survey outline, (2) selecting experts, (3) conducting the first questionnaire survey, (4) conducting the second questionnaire survey, (5) conducting the third questionnaire survey, and (6) synthesizing opinions to form a consensus. The process of implementing the Delphi method is detailed below. (1) First, a detailed outline of the questions to be answered by the experts is drawn up, and the experts are provided with relevant background material. (2) Anonymous, representative, and relevant experts in the field of competitive sports are selected to form a panel and are provided with the prepared survey outline in accordance with the purpose of the survey. (3) The first questionnaire is an open-ended or semi-open-ended questionnaire that collects and summarizes the experts' initial judgments and removes indexes on which there is no agreement. (4) A second questionnaire, structured similarly to the first, is sent to the experts, allowing them to compare their opinions with those of others and refine their judgments. (5) A third questionnaire, identical in form to the second round, is then conducted. (6) A consensus among the experts is reached; if there is no unanimous agreement, steps 4 and 5 are repeated until the experts' opinions converge.

The traditional Delphi method has several advantages. First, opinions are anonymous, so experts are more likely to express their true views. Second, opinions carry equal weight, eliminating the influence of authority. Third, the experts do not have to be gathered in one place but communicate only by correspondence, which makes implementation easier. Fourth, the final opinion of the expert group is more broadly representative and accurate. However, the traditional Delphi method has a major drawback: it uses an unstructured questionnaire to extract the priorities and opinions of the experts, which is time-consuming and complicated. Scholars have therefore proposed the modified Delphi method, which uses a structured questionnaire to remedy these shortcomings. The Modified Delphi Method (MDM) is an expert survey method that applies a structured questionnaire instead of an open-ended one, thus enhancing the efficiency of the traditional Delphi method. Linstone and Turoff [ 28 ] argued that the traditional Delphi method has the following limitations: (1) its results are susceptible to interference from subjective expert judgment; (2) the implementation process is likely to be influenced by the questionnaire's distributor; and (3) it is time-consuming and labor-intensive, and progress is not easy to monitor. Accordingly, this study applied the modified Delphi method to determine the comprehensive evaluation index system for the sustainable development of competitive sports, based on the preliminary evaluation index system decided in the previous section.

The prioritization of guidelines and regional development based on Entropy and TOPSIS

Multiple Criteria Decision Making (MCDM) refers to the process of selecting from a finite or infinite set of alternatives that are mutually conflicting and incommensurable. Since the 1960s, MCDM has been used as a normative method in the field of decision-making and has gained widespread use in domains including investment decision-making, management decision-making, and resource allocation; many researchers and policy-makers have used MCDM to address complex social, economic, and scientific management problems involving a multitude of factors. For example, Aksel et al. [ 29 ] employed an MCDM model to select suppliers for the aerospace and defense industry. Ofori et al. [ 30 ] applied an MCDM model to optimize decisions on renewable energy generation resources in Ghana. Kuttusi et al. [ 31 ] assessed the potential for glacial geoheritage development in the Yalnizam Mountains of northeastern Turkey using MCDM. Dayarnab et al. [ 32 ] used an MCDM model to select natural fibers as substrates for flexible sensors. Hamidah et al. [ 33 ] applied MCDM to establish a protocol for developing standardized weights for important plant areas in Malaysia. Shabani et al. [ 34 ] applied MCDM techniques to measure customer satisfaction with Tehran's public transportation during the COVID-19 pandemic. Mosetlhe et al. [ 35 ] presented an MCDM-based framework for selecting microgrid configurations for water-pumping applications in rural areas. MCDM methods include Entropy, TOPSIS, ANP, AHP, GRA, ELECTRE, COPRAS, WASPAS, SECA, CODAS, SWARA II, MEREC, and EDAS, among others, and each has its own advantages and disadvantages in different application scenarios. This research combines entropy with TOPSIS because entropy is an objective weighting method that uses the concept of the entropy value to determine the relative weights among indicators. The entropy value of each indicator is first calculated to show how much the information carried by that indicator influences the overall decision, and the entropy values are then compared to determine the relative weights between indicators. When entropy values are used to calculate indicator weights, an indicator with a lower entropy value, that is, one whose values vary more across alternatives, receives a greater weight, so the differing importance of different indicators can be distinguished. The entropy method calculates its weights from the given raw data, without interference from individual subjective factors, which makes the index weights more objective and ensures the scientific nature of the evaluation results [ 36 ]. Furthermore, the TOPSIS computational process is relatively simple, computationally efficient, easy to implement, and broadly adaptable to different data types and decision environments, including the comprehensive evaluation of qualitative and quantitative data; it is a widely used method for multi-criteria decision analysis. Finally, TOPSIS provides a clear and consistent framework for evaluating and comparing different options, thus increasing the transparency of the decision-making process.
Overall, TOPSIS has been widely used in the field of multi-criteria decision analysis due to its many advantages such as comprehensiveness, intuitiveness, efficiency, practicality and adaptability [ 37 , 38 ]. Given this, the purpose of combining Entropy and TOPSIS in this study was to use Entropy to determine the weights of the indicators to be evaluated and TOPSIS to rank the objects to be evaluated in the final order. The combination of these two methods increased the transparency and rationality of the decision-making process, improved the accuracy of the decision, balanced objectivity and subjectivity, reduced the impact of data uncertainty and bias, and had a wider scope of application [ 39 ].

Previous studies have shown that the entropy and TOPSIS methods are relatively effective and fast for problem solving and are important in addressing management development strategy and investment project evaluation problems (Cho and Lin [ 27 ]). Evaluation models built with the entropy and TOPSIS methods have been widely adopted in several fields. Zhang et al. [ 40 ] used entropy and TOPSIS to evaluate the intelligence level of 15 new first-tier cities in China. Li et al. [ 41 ] evaluated the suitability of regions for shallow geothermal energy implementation using entropy and TOPSIS. Wu et al. [ 42 ] assessed the operational safety of urban railway stations using entropy and TOPSIS. Xu et al. [ 43 ] assessed the level of sustainable development of urban agglomerations in the Yangtze River Delta region using entropy and TOPSIS. Li et al. [ 44 ] evaluated the risk management priorities of historic and cultural reserves in 31 provinces of China using entropy and TOPSIS. These studies suggest that the combination of the entropy and TOPSIS methods has a wide range of promising applications in multi-criteria decision-making, so it is scientifically reasonable to adopt this model to evaluate the sustainable development level of competitive sports in Chinese provinces. Building on previous research, this study constructed an entropy and TOPSIS decision-making model with the aim of constructing the evaluation index system for the sustainable development level of provincial competitive sports in a scientific and reasonable way. The following section explains the construction of the entropy and TOPSIS models in detail.

Entropy is one of the most widely used evaluation models in the MCDM (Multiple Criteria Decision Making) field. It is an objective measure of the amount of information contained in each segment of data and is calculated to determine the index weighting. The concept of entropy was originally developed as a metric for the "messiness" or disorder of a system: the more irregular the arrangement of a system's elements, the higher its entropy. In the entropy weight method, the entropy value of an indicator reflects how evenly its values are distributed across the alternatives; an indicator whose values vary more across alternatives has a lower entropy, carries more information, and therefore receives a larger weight. Entropy is used to determine the relative weights between attributes: first, the entropy of each attribute is calculated across the alternatives to determine how much the information carried by that attribute can influence the overall decision outcome, and then the entropy values are compared to obtain the relative weights between them.

The process of the entropy calculation is as follows (Cho and Lin [ 27 ]):

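As a guide for the reader, the following is a standard formulation of the entropy weight method as it typically appears in the MCDM literature, for a decision matrix with $m$ alternatives and $n$ indicators; the notation is ours and may differ from the original presentation.

Step 1. Normalize the decision matrix: $p_{ij} = x_{ij} / \sum_{i=1}^{m} x_{ij}$, for $i = 1, \dots, m$ and $j = 1, \dots, n$.

Step 2. Compute the entropy of indicator $j$: $e_j = -\frac{1}{\ln m} \sum_{i=1}^{m} p_{ij} \ln p_{ij}$, with the convention $p_{ij} \ln p_{ij} = 0$ when $p_{ij} = 0$.

Step 3. Compute the degree of divergence of indicator $j$: $d_j = 1 - e_j$.

Step 4. Compute the entropy weight of indicator $j$: $w_j = d_j / \sum_{k=1}^{n} d_k$, so that $\sum_{j=1}^{n} w_j = 1$.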

MCDM comprises a number of methods, such as Entropy, TOPSIS, AHP, ANP, ELECTRE, and GRA, each with its own advantages and disadvantages in use. Past research has indicated that the combined use of entropy and TOPSIS is an efficient and fast way to solve problems such as the evaluation of investment projects and management development strategies [ 31 ]. The Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) was first proposed by Hwang and Yoon in 1981 [ 45 ]. In this method, the decision-maker specifies positive and negative ideal solutions and evaluates each alternative by measuring its distance from them, ranking the alternatives by their relative closeness to the positive ideal solution. The main strength of TOPSIS is that this relative distance is used to rank the alternatives, which avoids the difficulty that the alternative closest to the positive ideal solution and the alternative farthest from the negative ideal solution are not necessarily the same and are therefore not directly comparable.

The steps for the solution are as follows [ 8 , 27 , 46 ].

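The step-by-step equations are summarised below in a standard TOPSIS formulation, using the entropy weights $w_j$ from the previous section; the notation is ours and may differ from the original presentation.

  • Step 1: Normalize the decision matrix: $r_{ij} = x_{ij} / \sqrt{\sum_{i=1}^{m} x_{ij}^{2}}$.
  • Step 2: Form the weighted normalized matrix: $v_{ij} = w_j \, r_{ij}$.
  • Step 3: Determine the positive ideal solution $A^{+} = (v_1^{+}, \dots, v_n^{+})$ and the negative ideal solution $A^{-} = (v_1^{-}, \dots, v_n^{-})$, where $v_j^{+} = \max_i v_{ij}$ and $v_j^{-} = \min_i v_{ij}$ for benefit indicators (reversed for cost indicators).
  • Step 4: Compute each alternative's distances from the two ideal solutions: $D_i^{+} = \sqrt{\sum_{j=1}^{n} (v_{ij} - v_j^{+})^{2}}$ and $D_i^{-} = \sqrt{\sum_{j=1}^{n} (v_{ij} - v_j^{-})^{2}}$.
  • Step 5: Compute the relative closeness to the ideal solution: $C_i = D_i^{-} / (D_i^{+} + D_i^{-})$, where $0 \le C_i \le 1$ and larger values indicate better alternatives.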

  • Step 6: Sort the alternatives by the calculated closeness values in descending order and select the best option.
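For readers who wish to experiment with the combined procedure, the following minimal Python sketch applies entropy weighting followed by TOPSIS ranking to a small, made-up decision matrix in which all indicators are benefit-type. It illustrates the general technique only; it is not the computation code used in this study, and the data are hypothetical.

import numpy as np

def entropy_weights(X):
    # X: (m alternatives) x (n indicators), non-negative benefit-type values.
    P = X / X.sum(axis=0)                          # column-wise normalisation
    m = X.shape[0]
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(P > 0, P * np.log(P), 0.0)  # treat 0*log(0) as 0
    e = -plogp.sum(axis=0) / np.log(m)             # entropy of each indicator
    d = 1.0 - e                                    # degree of divergence
    return d / d.sum()                             # entropy weights (sum to 1)

def topsis(X, w):
    R = X / np.sqrt((X ** 2).sum(axis=0))          # vector normalisation
    V = R * w                                      # weighted normalised matrix
    v_pos, v_neg = V.max(axis=0), V.min(axis=0)    # ideal and anti-ideal points (benefit indicators)
    d_pos = np.sqrt(((V - v_pos) ** 2).sum(axis=1))
    d_neg = np.sqrt(((V - v_neg) ** 2).sum(axis=1))
    return d_neg / (d_pos + d_neg)                 # closeness coefficients

if __name__ == "__main__":
    # Hypothetical data: 4 provinces (rows) x 3 indicators (columns), all benefit-type.
    X = np.array([[3.2, 150.0, 0.8],
                  [2.5, 210.0, 0.6],
                  [4.1, 180.0, 0.9],
                  [1.9, 120.0, 0.5]])
    w = entropy_weights(X)
    c = topsis(X, w)
    print("weights:", np.round(w, 3))
    print("closeness:", np.round(c, 3))
    print("ranking (best to worst):", np.argsort(-c))

In this toy example the indicator with the greatest relative variation across the rows receives the largest entropy weight, and the provinces are then ranked by their closeness coefficients, mirroring the two-stage logic described above.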

The construction of the evaluation model

This study aimed to construct an evaluation index system for the sustainable development of competitive sports based on the DPSIR model. The evaluation model was constructed in three phases, the main elements of which are detailed below.

Phase 1: The evaluation index system based on DPSIR and the Delphi decisions

  • Step 1: Collecting the relevant literature and conducting a preliminary organization. First, the five guidelines (criteria) based on the DPSIR theory were identified, and a total of 150 related publications were collected on that basis. After collating and summarizing the literature, duplicated indexes were deleted or merged, and 30 sub-criteria were finally extracted. The data collection period ran from October 2022 to December 2022.
  • Step 2: Defining a preliminary system of evaluation indexes and revising the Delphi questionnaire. Three experts were invited to test the content validity of the scale, using the five aforementioned criteria and 30 sub-criteria to design the Modified Delphi questionnaire elicitation scale. Three items that were not representative were deleted in accordance with the experts' recommendations. Eventually, 24 sub-criteria were identified, and a preliminary system of evaluation indicators and a questionnaire were developed. This step was implemented from December 2022 to January 2023, a period of one month.
  • Step 3: Developing a survey outline to determine a preliminary system of evaluation indexes. Based on the preliminary evaluation indicator system decided in Step 2, the evaluation index system was divided into the five DPSIR evaluation criteria and 24 evaluation sub-criteria: initially 7 sub-criteria for Driving forces, 3 for Pressure, 4 for State, 4 for Impact, and 6 for Response. This study used a Likert scale to assess the significance of each index, with a rating of 7 indicating high importance and a rating of 1 indicating low importance. A structured questionnaire was developed based on these rules, and after the Delphi questionnaire was revised by the experts, the definitions and categorization of the indicators were gradually revised according to the experts' opinions until a consensus was reached.
  • Step 4: Selecting the expert group and implementing the questionnaire Murry and Hammons [ 47 ] considered that the Delphi method requires a group of experts consisting of more than ten people. A total of 20 experts were invited to form the expert panel for this study, taking into account both feasibility and workload. The selection of the expert panel mainly includes the following types: firstly, experts and scholars engaged in the research of competitive sports and elite sports both domestically and internationally; secondly, departmental leaders with extensive experience in the administrative management of competitive sports; and thirdly, coaches and athletes actively involved in the practice of competitive sports. Under these three aspects, 40% of the experts were from academia, 30% from the industry, and 30% from leaders of sports administrative bodies. Their research areas encompass a wide range of competitive sports, sports management, social sports, mass sports, and management science. Among them, there are 8 teachers who are mainly engaged in the research of competitive sports, and all of them have the title of associate professor or above; there are 6 experts from the sports authorities, who are mainly from sports administrative departments, and their main functional business is the management of competitive sports; 6 representatives are from the leading edge of competitive sports practice, who are primarily involved in coaching and refereeing in competitive sports.
  • Step 5: Soliciting questionnaires and determining the consistency of the experts’ opinion. This study conducted three rounds of the Modified Delphi questionnaire, which were delivered in person, by email, and by WeChat. Expert questionnaires are collated and analyzed before proceeding to the next round. The questionnaire survey was performed in March-April 2023, with three rounds of the Modified Delphi questionnaire.
  • Step 6: Deciding on a system of evaluation indexes The study employed the interquartile deviation as a method for examining the distribution of expert opinions, which is the difference between the upper quartile and the lower quartile of the distribution of expert opinions. The smaller the interquartile range, the more centered the views of the expert group are; the larger the interquartile range, the more divergent the views of the expert group are. If the interquartile variance is less than 0.6, then the index has a high degree of agreement among the expert group, and the index is retained. If the value is between 0.6 and 1.0, then there is a medium level of agreement and the index will be retained; if the index is greater than 1.0, then there is no agreement in the expert group on the index and it will be deleted. Finally, after two rounds of revising the Modified Delphi questionnaire, “China’s Competitive Sports Sustainable Development Evaluation Index System” is classified into five dimensions (criteria) and 24 sub-criteria, as shown in Table 1 .
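As an illustration of the consensus rule in Step 6, the sketch below applies the stated interquartile-range thresholds (retain below 0.6 as high agreement, retain between 0.6 and 1.0 as medium agreement, delete above 1.0) to a matrix of expert ratings. The rating data and function name are hypothetical and serve only to show the mechanics of the check.

```python
import numpy as np

def delphi_iqr_screen(ratings, medium=0.6, cutoff=1.0):
    """Classify each candidate index by the interquartile range (IQR)
    of the expert ratings it received.

    ratings : (n_experts, n_indexes) array of Likert ratings
    Returns a list of (iqr, decision) pairs, one per index.
    """
    R = np.asarray(ratings, dtype=float)
    iqr = np.percentile(R, 75, axis=0) - np.percentile(R, 25, axis=0)

    decisions = []
    for value in iqr:
        if value < medium:
            decisions.append((value, "retain (high agreement)"))
        elif value <= cutoff:
            decisions.append((value, "retain (medium agreement)"))
        else:
            decisions.append((value, "delete (no agreement)"))
    return decisions

# Hypothetical ratings from five experts on three candidate indexes (7-point scale)
ratings = [[7, 6, 3],
           [6, 6, 7],
           [7, 5, 2],
           [6, 6, 6],
           [7, 4, 1]]
for iqr, decision in delphi_iqr_screen(ratings):
    print(f"IQR = {iqr:.2f}: {decision}")
```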

Table 1: https://doi.org/10.1371/journal.pone.0301411.t001

Phase 2: Determining evaluation index weights based on Entropy

Based on the evaluation index system constructed in Table 1 , this study designed a questionnaire to determine the weights of the individual sub-criteria using the Entropy method. The expert group consisted of the 20 experts who had taken part in the Modified Delphi rounds; each expert rated the indicators on a ten-point scale, and the measurement period ran from 16 to 29 April 2023. After the questionnaires were returned, the Entropy model was used to determine the weight of each criterion; the analysis process is described below.

  • Step 1: Constructing the decision matrix D ( Eq 1 ) and obtaining the target attribute P_ij. The D matrix was constructed from the importance ratings given by the 20 experts for each indicator on a 10-point scale, where a score of 1 indicates very unimportant and a score of 10 very important. As shown in Table 2 , C11 equals 8, meaning that Expert 1 rated the importance of "SC1: Gross Domestic Product" as 8; C21 equals 7, indicating that Expert 2 rated "SC1" as 7; Expert 3's rating of "SC1" is 7 (C31) and Expert 4's is 8 (C41). In the same way, the 20 experts evaluated each of the 24 indicators to construct the decision matrix D, and the resulting target attributes were entered into Eq (2) to obtain P_ij, as shown in Table 3 .
  • Step 2: Determining the weight of each criterion. The results in Table 3 were then substituted into Eqs ( 7 ), ( 8 ) and ( 9 ) to obtain the entropy measure E_j, the divergence degree d_j, and the weight W_j respectively, as shown in Table 4 (a minimal sketch of this entropy-weight calculation is given after this list).
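The entropy-weight calculation referred to in Steps 1 and 2 can be summarised by the sketch below, which implements the usual entropy-weight formulas: the proportion P_ij of each rating within its column, the entropy measure E_j, the divergence d_j = 1 - E_j, and the normalised weight W_j. The rating matrix and function name are hypothetical; the sketch also assumes the common convention that 0·ln 0 is treated as 0.

```python
import numpy as np

def entropy_weights(ratings):
    """Standard entropy weighting of criteria.

    ratings : (n_raters, n_criteria) matrix of non-negative scores
    Returns (E, d, W): entropy, divergence, and normalised weight per criterion.
    """
    X = np.asarray(ratings, dtype=float)
    n = X.shape[0]

    # Proportion of each entry within its column (P_ij)
    P = X / X.sum(axis=0)

    # Entropy per criterion, using the 0 * ln(0) = 0 convention
    logP = np.log(np.where(P > 0, P, 1.0))   # log(1) = 0 stands in for 0 * ln(0)
    E = -(P * logP).sum(axis=0) / np.log(n)

    # Divergence and normalised weights
    d = 1.0 - E
    W = d / d.sum()
    return E, d, W

# Hypothetical ratings of four criteria by five experts on a 10-point scale
ratings = [[8, 7, 9, 6],
           [7, 7, 8, 5],
           [7, 6, 9, 7],
           [8, 7, 8, 6],
           [6, 7, 9, 5]]
E, d, W = entropy_weights(ratings)
print("entropy E_j:", np.round(E, 4))
print("weight  W_j:", np.round(W, 4))
```

The resulting W_j values are the weights that feed into the weighted decision matrix used by the TOPSIS ranking sketched earlier.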

Table 2: https://doi.org/10.1371/journal.pone.0301411.t002

Table 3: https://doi.org/10.1371/journal.pone.0301411.t003

Table 4: https://doi.org/10.1371/journal.pone.0301411.t004

Phase 3: Prioritization of districts based on TOPSIS decisions

Taking the 31 provinces, autonomous regions, and municipalities directly under the central government as the objects of study, this paper uses TOPSIS to rank them according to the "Evaluation Index System for Sustainable Development of Competitive Sports in China" constructed in this study. The analysis process is explained below.

  • Step 1: Establishing a decision matrix. Data were collected for the 24 sub-criteria in Table 1 , the sources being the China Statistical Yearbook and the Statistical Yearbook of sports undertakings. The example presented here is the comprehensive evaluation for the years 2013-2020, with the data for each region shown in Table 5 ; the evaluation results for each individual year are presented later.
  • Step 2: Establishing the normalised and weighted matrix. First, the raw data (decision matrix) in Table 5 were normalised using Eq (6) ; the matrix was then weighted with Eq (7) , using the index weights determined by Entropy. The resulting weighted matrix is shown in Table 6 .
  • Step 3: Determining the positive and negative ideal solutions. The weighted matrix in Table 6 is used to obtain the positive and negative ideal solutions according to Eqs ( 8 ) and ( 9 ).

Table 5: https://doi.org/10.1371/journal.pone.0301411.t005

Table 6: https://doi.org/10.1371/journal.pone.0301411.t006

Table 7: https://doi.org/10.1371/journal.pone.0301411.t007

Research result and discussion

The results are discussed in two parts. The first part examines the overall development of competitive sports in each region from 2013 to 2020. The second part discusses the development of competitive sports in each region across the five DPSIR subsystems (criteria), as detailed below.

The comprehensive development of competitive sports.

Based on the TOPSIS calculation procedure in Section 4.3 of this study, the similarity rankings of the 31 provinces in China are as follows: Zhejiang, Jiangsu, Guangdong, Shandong, and Henan rank in the top five, and Ningxia, Gansu, Qinghai, Jilin, and Tibet rank in the bottom five. The similarity ranking broadly reflects the comprehensive socio-economic development level of each province and thus indicates the sustainable development level of competitive sports in each province. The data were also disaggregated by year and the yearly results calculated; the results obtained are shown in Table 8 .

Table 8: https://doi.org/10.1371/journal.pone.0301411.t008

The regional sustainable development level of competitive sports was estimated for each of the 31 provinces, municipalities, and autonomous regions for eight consecutive years (2013-2020). As shown in Table 8 , of the top 20 provinces, 13 (65%) experienced an increase in the level of sustainable development, while 7 (35%) experienced a decrease. This indicates that the sustainable development of competitive sports in China as a whole has continued to improve, although individual provinces have shown slight declines or small fluctuations. These include small fluctuations in provinces such as Zhejiang and Jiangsu, which are already at a high level of sustainable development, and low-level oscillations in provinces such as Xinjiang and Sichuan, which are relatively behind in socio-economic development. Zhejiang, Jiangsu, and Guangdong ranked in the top three in overall scores across the eight years; Guangdong in particular maintained a steady upward trend over 2013-2020, while Zhejiang and Jiangsu wavered slightly but kept a fairly high overall level of sustainable development of competitive sports, as shown in Fig 3. Shanghai, Xinjiang, and Tianjin ranked in the bottom three of the 20 provinces and cities listed. Although Shanghai and Tianjin are economically well developed, they have obvious shortcomings in the natural population growth rate, greening coverage rate, and forest coverage rate, which lower their rankings under the comprehensive evaluation of the indicator system, while in Xinjiang, economic indexes such as GNP and per capita disposable income significantly pulled down its ranking.

Data source: owing to space limitations, only the top 20 provinces, municipalities, and autonomous regions are listed; Guizhou, Hainan, Shanxi, Heilongjiang, and Tibet are not on the list.

Using ArcGIS 10.7 to visualise the data from the 31 provinces in space (see Fig 2 ) together with Table 9 , the regional sustainability level of competitive sports in China is classified into three levels. The first level, with a sustainability score of ≥0.31, is located primarily in the eastern and southeastern provinces, including Zhejiang, Jiangsu, Guangdong, Shandong, Henan, and ten other provinces (see Fig 3 ). The second level, with a sustainability score between 0.21 and 0.31, lies mainly in Xinjiang and the central provinces and comprises eleven provinces, including Sichuan, Anhui, Jiangxi, Guangxi, Liaoning, Shaanxi, and Yunnan. The third level, with a sustainability score between 0 and 0.20, is mostly in the northwestern and northeastern provinces and includes ten provinces such as Hainan, Shanxi, and Tibet. Overall, the level of sustainable development of competitive sports in China broadly increases from the northwest to the southeast.

Table 9: https://doi.org/10.1371/journal.pone.0301411.t009

Fig 2: https://doi.org/10.1371/journal.pone.0301411.g002 (Note: The online version is a color chart.)

Fig 3: https://doi.org/10.1371/journal.pone.0301411.g003

Assessment and analysis of the subsystems of the sustainable development level of competitive sports in China.

(1) Drive Subsystem

As shown in Fig 4 , over the eight-year period from 2013 to 2020 the provinces of Guangdong, Jiangsu, Shandong, Zhejiang, and Henan maintained a high average development level in the drive subsystem of China's competitive sports sustainable development, while Tibet, Ningxia, Qinghai, Hainan, and Xinjiang fell below the average. This is closely related to the level of economic development of each province: stronger economic strength brings a stronger internal drive to the sports economy. In addition, with regard to the stability of the drive subsystem, a certain degree of fluctuation occurred in individual regions, with large swings in the drive indexes in Shandong in 2014, Beijing in 2015, Jiangsu in 2017, Guangdong in 2017, Shaanxi in 2016, and Gansu in 2016. Overall, most provinces in China showed a flat or slightly rising trend in driving force, reflecting that China's rapid economic development has contributed to the driving force of competitive sports as a whole and that the internal development capacity of China's competitive sports continues to emerge alongside the high-quality development of the economy.

Fig 4: https://doi.org/10.1371/journal.pone.0301411.g004

(2) Pressure Subsystem

As depicted in Fig 5 , the provinces of Ningxia, Tibet, Fujian, Jiangxi, and Xinjiang maintained a high level in the pressure subsystem from 2013 to 2020, whereas Beijing, Tianjin, Liaoning, Heilongjiang, and Jilin exhibited low pressure levels. This is because the pressure subsystem primarily reflects demographic and environmental factors: the five top-ranked provinces showed significant population growth rates and high green coverage rates, while the five bottom-ranked provinces, particularly Liaoning, Jilin, and Heilongjiang, have experienced persistently low population growth. These provinces, known for heavy industry, face challenges of resource over-exploitation and environmental degradation that hinder the development of their competitive sports. The average pressure subsystem values of the 31 provinces over the eight-year period show a significant and consistent downward trend; as can be seen in Fig 5 , the 2020 indexes of most provinces are lower than the average of the previous seven years. This is primarily attributable to the sharp decline in China's birth rate in recent years, and the falling birth rate is a point that must be taken into account in the future development of competitive sports in China.

Fig 5: https://doi.org/10.1371/journal.pone.0301411.g005

(3) State Subsystem

As shown in Fig 6 , Jiangsu, Guangdong, Shandong, Zhejiang, and Hebei ranked in the top five of the state subsystem over the eight-year period, while Tibet, Ningxia, Hainan, Qinghai, and Guizhou ranked in the bottom five. The indexes of the state subsystem, such as the number of athletes in outstanding sports teams and the sales of sports lottery tickets, are directly related to competitive sports, and the gap between the top five and bottom five provinces is striking in this respect. While the state subsystem showed a relatively flat but gradually increasing level, the eastern and central provinces in particular showed a better development trend; the western provinces, especially Tibet, Ningxia, and Qinghai in the northwest, fell increasingly further behind the central and eastern provinces. Jiangsu, ranked first, had an eight-year composite score of 0.33812, whereas Tibet, ranked last, had an eight-year composite score of 0.05100, illustrating the huge gap between them. Furthermore, while Jiangsu consistently achieved high scores across the eight years, Tibet remained at a low level, showing minimal growth in 2020 compared with 2013, which fully reveals the imbalance in the distribution of resources for competitive sports in China. This is something that needs to be emphasised and changed for the future high-quality development of competitive sports in China.

Fig 6: https://doi.org/10.1371/journal.pone.0301411.g006

(4) Impact subsystem

Over the eight-year period from 2013 to 2020, Guangdong, Beijing, Jiangsu, Zhejiang, and Shandong consistently ranked in the top five of the impact subsystem, while Gansu, Xinjiang, Ningxia, Qinghai, and Tibet consistently ranked in the bottom five, as depicted in Fig 7 . The impact subsystem primarily reflects a province's integrated economic, social, and natural attributes; the southeastern provinces remain overwhelmingly superior, and there is a huge socio-economic disparity between the northwestern and southeastern regions. Even where some western provinces hold a slight advantage in the forest coverage rate index, this is far from sufficient to compensate for the gap. Compared with the eastern and central provinces, the impact values of the bottom five provinces also remained almost unchanged between 2013 and 2020, reflecting the sluggish development of the impact subsystem of competitive sports in the western region. In summary, the impact subsystem in most provinces showed steady growth with a slight increase and minor fluctuations; however, the gap between the eastern and western provinces has not narrowed over time, which is a significant issue that needs to be addressed.

Fig 7: https://doi.org/10.1371/journal.pone.0301411.g007

(5) Response subsystem

As shown in Fig 8 , over the eight years 2013-2020 Jiangsu, Zhejiang, Guangdong, Shandong, and Hubei ranked in the top five of the response subsystem scores, while Ningxia, Qinghai, Guizhou, Hainan, and Tibet ranked in the bottom five. The response subsystem mainly reflects the degree of response of each province under the influence of the drive and pressure subsystems, and its six indexes are outcome indexes that capture the potential for the development of competitive sports in each provincial area. The strong influence of economic strength on this core aspect of a province's competitive sport is again evident in this subsystem score: Jiangsu, Zhejiang, and Guangdong are without exception leading provinces in China's economic development, while Ningxia, Qinghai, and Guizhou are, unsurprisingly, among the more backward provinces, demonstrating the important role of a province's economic base. As a whole, this subsystem developed more volatilely: the subsystem score of Zhejiang dropped from 0.37765 to 0.252229 between 2013 and 2020, a fall of 33.21%, while that of Guangdong rose from 0.19396 to 0.310166, an increase of 59.91%. Moreover, because this subsystem is strongly influenced by the other four subsystems, most provinces' scores on it fluctuate considerably, and it therefore exhibited greater volatility. To sum up, the eastern and central regions obtained higher response subsystem scores and the northwestern region lower ones; most provinces showed some volatility, the gap between the eastern and western regions was large, and half of the provinces showed a decreasing trend. Although each province has taken certain measures, the effects have not been satisfactory, and it is necessary to further improve the effectiveness of the governance of competitive sports in each province and to put policy into practice.

Fig 8: https://doi.org/10.1371/journal.pone.0301411.g008

Conclusion and suggestions

The main contribution of this paper lies in addressing the key problems in the field of competitive sports in China, proposing solutions, and constructing a model for the comprehensive evaluation of the sustainable development level of competitive sports in the provinces. China's competitive sports face challenges such as unbalanced regional development [ 2 ], organizational conflicts [ 5 ], low market participation, delayed professional reform, and low technological capability, which have become obstacles to China's efforts to build a sports powerhouse. Based on the "Evaluation Index System for the Sustainable Development of Competitive Sports in China", this paper used the "DPSIR-TOPSIS" model and the entropy weighting method to conduct an in-depth evaluation of the sustainable development level of competitive sports in the 31 provinces, autonomous regions, and municipalities directly under the Central Government of China. To date there has been no in-depth study of the sustainable development of provincial competitive sports in the Chinese context, and this study fills that gap. Through the constructed evaluation model and methodology, the paper not only provides a comprehensive assessment of the level of sustainable development of competitive sports but also offers a systematic and scientific set of decision-making references for government departments and other relevant decision-making organizations. The main findings are summarized below.

  • (1) Provincially: Jiangsu, Zhejiang, and Guangdong lead both in the overall level of sustainable development and in the rankings of four of the five subsystems (all except the pressure subsystem), while Tibet, Qinghai, and Ningxia are in the most backward position and most central provinces sit in the middle of the pack. The central provinces of Shandong, Henan, and Hubei, however, performed notably well in the sustainable development level of competitive sports.
  • (2) Spatially: Both the ranking of the overall level of sustainable development and the rankings of the subsystems basically present the pattern East > Centre > West. Apart from the pressure subsystem, which shows the opposite pattern (West > Centre > East) because of its particular index composition, the eastern provinces demonstrate greater sustainability in their capacity for competitive sports development. The level of sustainable development in competitive sports is influenced by significant geographical advantages and regional characteristics.
  • (3) Temporally: From 2013 to 2020, the level of sustainable development in competitive sports showed an upward trend in most provinces, with only a few provinces experiencing a slight decline or small fluctuations. Generally speaking, the level of sustainable development is favorable, with provinces that have better development momentum pulling further away from the less favorable provinces; however, there is no obvious tendency for the less favorable provinces, particularly in the central and western parts of the country, to catch up.

(1) Implementation of a balanced development strategy for competitive sports in provincial areas

Based on the evaluation of the overall level of sustainability of competitive sports, it is recommended that the government continue to strengthen support for the development of competitive sports in the leading provinces of Jiangsu, Zhejiang, and Guangdong to ensure their leading position in sustainability. For relatively backward provinces such as Tibet, Qinghai, and Ningxia, the government can introduce special policies and programmes to improve the sustainable development of competitive sports and narrow the gap with the leading provinces. It is also necessary to understand the differences in the level of sustainable development of competitive sports between the eastern, central, and western regions and to formulate regionally differentiated development strategies based on the results of this study. Particular attention should be paid to the indicators of the pressure subsystem in order to understand the specific situation of the western region and to take appropriate measures to improve the sustainability of competitive sport there. The government will also need to continue to monitor time trends in the level of sustainable development of competitive sports, paying special attention to provinces that have experienced declines or small fluctuations, and a long-term development plan should be formulated to ensure that the level of sustainable development continues to improve, especially in the central and western provinces. Exchanges of experience and cooperation between leading and relatively backward provinces should be encouraged to raise the overall level of sustainable development of competitive sports, and central provinces should be supported in their good performance and encouraged to develop competitive sports further, with greater attention paid to policy and resource allocation.

(2) Constructing a new type of national sports organisational system with effective governmental and market-oriented measures.

The omnipotent government's extensive allocation of resources to competitive sports contributed to the remarkable achievements in the history of China's competitive sports. Yet under the new pattern of competition in world competitive sports, the government's unidirectional, hierarchical management system has become increasingly inadequate to the new trends in global competitive sports. The new national system should not rely solely on the strength of the government but should mobilize individuals, families, society, and other diverse entities. The report of the 20th National Congress of the Communist Party of China pointed out that, "while giving full effect to the decisive role of the market in the allocation of resources and giving better play to the role of the government, it is necessary to form a governance pattern in which the government and the market are organically integrated, complementary, co-ordinated and mutually reinforcing." The synergy of individuals, families, society, and other multiple subjects is particularly important in this process: their advantages in coordinating interests, efficiency, flexibility, competition, and transparency should be fully exploited to form a new framework for the governance of competitive sports that unites a competent government with an effective market. The modernization of the competitive sports governance system can be achieved if the comparative advantages of the 31 provinces, municipalities, and autonomous regions, in terms of their economic aggregates, natural resources, and social organization, are brought into play to form a synergistic, efficient, and open competitive sports governance system.

(3) Promoting the integration of competitive sports.

For a long time, the social function of competitive sports in China has not been fully explored. Standing at the threshold of the 14th Five-Year Plan, competitive sports should align with China’s national strategy, political interests, and political demands in the new era. To the international community, sports serve as a bridge to promote the new era of great power diplomacy with Chinese characteristics, optimising the geopolitical environment and showcasing China’s image as a peacefully rising great power. Domestically, competitive sports play a crucial role in promoting social development and fostering social integration. They also contribute to maintaining equality, harmony, stability, and national unity. In the economic sphere, competitive sports fully utilize the potential economic value of sports. They support systematic and comprehensive policies for professional sports, leisure sports, sports tourism, and other related sectors. This helps meet the increasing demands of people for a better quality of life and contributes to supply-side reform. The development of competitive sports should focus on broader political and economic goals so that it can better serve the rapid development of the country’s political and economic sectors. This will enable competitive sports to become a visible indicator of the great rejuvenation of the Chinese nation.

(4) Enhancing the innovation-driven capacity of provincial competitive sports.

In the new era, China's competitive sports have transitioned from the stage of reform and development to the stage of high-quality development. The drive for innovation is an important source of momentum for maintaining the high quality of provincial competitive sports in the future, and it is also important for the 31 provinces in meeting the challenge of staying on track at the new stage of development. For instance, the innovation in talent selection and training represented by cross-border, cross-discipline selection made a significant contribution to winning gold and silver medals at the Beijing Winter Olympic Games. The 31 provinces must therefore embark on an innovation drive to improve the mechanisms for the high-quality development of competitive sports, including implementing the top-level design for sports reform, promoting collaborative governance among diverse stakeholders, adopting a diversified talent cultivation model, implementing scientific and intelligent training methods, developing effective strategies for preparing for large-scale games, integrating sports and education, creating incentive models for competitive athletes, and fostering cultural creativity in competitive sports programmes. Such an innovation drive will contribute greatly to the sustainable development of competitive sports in each province.

(5) Implementing a strategy to develop the comparative advantages of competitive sports in provincial areas.

The 31 provinces should integrate their own comparative advantages, resources, and basic conditions, identify the development approach that suits them, plan the layout of provincial competitive sports scientifically, and explore a high-quality development model for provincial competitive sports. Top-ranking provinces with better sustainable development indexes of competitive sports, such as Zhejiang, Jiangsu, and Shandong, should further refine their programme layout, enrich the industrial structure of competitive sports, and intensify the comprehensive integration of competitive sports with social life. Provinces in central and western China with poorer sustainable development indexes should, while addressing their weaknesses, strengthen the key programmes suited to their resource endowments, so that advantageous programmes lead the whole area forward, continuously narrowing the gap with the eastern provinces and pushing forward the high-quality, sustainable development of competitive sports across the country.

Shortcomings and future perspectives of this study

In this paper, only Entropy and TOPSIS were used for the analysis; other MCDM methods were not considered, such as Complex Proportional Assessment (COPRAS), Weighted Aggregated Sum Product Assessment (WASPAS), Simultaneous Evaluation of Criteria and Alternatives (SECA), Combinative Distance-based Assessment (CODAS), Stepwise Weight Assessment Ratio Analysis II (SWARA II), the MEthod based on the Removal Effects of Criteria (MEREC), and Evaluation based on Distance from Average Solution (EDAS).

The limitations of the methodological choices can be summarized as follows.

  • (1) High data requirements: Entropy and TOPSIS require sufficient data for weighting calculations and evaluating alternatives. In the absence of data, the effectiveness of the method may be compromised.
  • (2) Subjective sensitivity of weights: although the entropy calculation itself is objective, the weights in this study are derived from expert ratings, so subjective judgments can still influence the results.
  • (3) Only considers relative similarity: TOPSIS mainly considers the similarity of alternatives relative to the ideal and anti-ideal solutions, which may ignore other important information.
  • (4) Inability to handle non-linear relationships: Entropy and TOPSIS may not be able to effectively handle situations where there are non-linear relationships between criteria.

Different approaches, moreover, have different theoretical underpinnings. For example, COPRAS uses complex proportional assessment, WASPAS combines weighted sums and products, SECA allows simultaneous evaluation of criteria and alternatives, CODAS is based on combinative distances, SWARA II performs stepwise weighting assessment, MEREC is based on criterion removal effects, and EDAS is based on distances from the average solution. Each method treats criterion weights, the assessment of alternatives, and ranking differently, and each has its own emphasis in responding to multi-criteria decision problems; the choice of method should therefore depend on the characteristics of the problem, the state of the information, and the needs of the decision-maker. At the same time, there may be application scenarios in which these methods complement one another. This paper therefore proposes the following recommendations for future research to address the limitations described above.

  • (1) Broadening the scope of application: For each method, future research could focus on practical applications in different fields and application scenarios. This will help to assess the generalizability and adaptability of the methods.
  • (2) Uncertainty handling: Consider introducing uncertainty factors into these methods to better handle real-world uncertainty. This may include fuzzy logic, random factors, or other forms of uncertainty handling.
  • (3) Performance evaluation: Perform systematic performance evaluation to compare the performance of these methods in different contexts. This may include assessments of accuracy, computational efficiency, scalability, etc.
  • (4) Method integration: Consider integrating the methods to improve the performance of the overall MCDM model. Possible integration approaches include using them in tandem, using them in parallel, or developing new integration frameworks.
  • (5) Handling subjectivity: Find ways to reduce the impact of subjectivity in the methods to improve their objectivity and verifiability.
  • (6) Multi-objective problems: Consider the application of these methods to multi-objective problems and examine their effectiveness in dealing with multi-criteria situations.

In conclusion, this paper suggests that, through in-depth research on these commonalities, MCDM methods can be made to cope better with the complex decision-making situations encountered in practice, and their practical application value can be strengthened.

Supporting information

https://doi.org/10.1371/journal.pone.0301411.s001

  • 1. National Bureau of Statistics. China Statistical Yearbook 2001-2007. Available online: http://www.stats.gov.cn (accessed 1 June 2023).
  • 45. Hwang CL, Yoon K. Multiple attribute decision making: methods and applications. A state-of-the-art survey. Vol 186. Springer Science & Business Media; 2012.
  • 48. General Administration of Sport of China. China Sports Statistical Yearbook 2013-2020. China Sports Yearbook Press; 2013-2020.
  • 49. National Bureau of Statistics of the People's Republic of China. China Statistical Yearbook. Available online: https://www.stats.gov.cn/sj/ndsj/ (accessed 1 February 2023).
  • 50. Website source of the map of China. Available online: https://www.openstreetmap.org/search?query=china#map=4/35.09/104.50

A Safe Framework for Quantitative In Vivo Human Evaluation of Image Guidance

Using an image guidance system constructed over the past several years [1], [2] we have recently collected our first in vivo human pilot study data on the use of the da Vinci for image guided partial nephrectomy [3]. Others have also previously created da Vinci image guidance systems (IGS) for various organs, using a variety of approaches [4]. Our system uses touch-based registration, in which the da Vinci’s tool tips lightly trace over the tissue surface and collect a point cloud. This point cloud is then registered to segmented medical images. We provide the surgeon a picture-in-picture 3D Slicer display, in which animated da Vinci tools move exactly as the real tools do in the endoscope view (see [2] for illustrations of this). The purpose of this paper is to discuss recent in vivo experiences and how they are informing future research on robotic IGS systems, particularly the use of ultrasound.
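The touch-based registration described above aligns a sparse point cloud traced by the instrument tips with a preoperatively segmented model. The fragment gives no implementation details, so the following is only a hypothetical, minimal sketch of one common building block of such pipelines: a least-squares rigid (Kabsch/SVD) alignment of corresponding 3-D points, of the kind used inside ICP-style surface registration. It is not the authors' implementation, and the point sets and names are illustrative.

```python
import numpy as np

def rigid_align(source, target):
    """Least-squares rigid alignment (Kabsch): find R, t minimising
    sum_i || R @ source_i + t - target_i ||^2 over corresponding points.

    source, target : (n, 3) arrays of corresponding 3-D points
    Returns (R, t): a 3x3 rotation matrix and a translation vector.
    """
    P = np.asarray(source, dtype=float)
    Q = np.asarray(target, dtype=float)

    # Centre both point sets
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    P0, Q0 = P - p_mean, Q - q_mean

    # Optimal rotation from the SVD of the cross-covariance matrix
    U, _, Vt = np.linalg.svd(P0.T @ Q0)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = q_mean - R @ p_mean
    return R, t

# Illustrative check: recover a known rotation and translation
theta = np.deg2rad(30.0)
true_R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
model_points = np.random.default_rng(0).uniform(-1.0, 1.0, size=(30, 3))  # "image" surface points
traced_points = model_points @ true_R.T + np.array([5.0, -2.0, 1.0])      # "tool-tip" trace

R, t = rigid_align(traced_points, model_points)
print("max residual:", np.abs(traced_points @ R.T + t - model_points).max())
```

In a full system the correspondences are not known in advance, so a step like this is typically iterated with closest-point matching (ICP) or replaced by a surface-based registration.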



Enhancing Global Surgical Care through Monitoring and Evaluation Strategies


On Tuesday, April 16, 2024, the Texas Children’s Foundations in Global Health (FIGH) Lecture Series hosted an educational session that emphasized the critical role of Monitoring and Evaluation (M&E) practices in global surgery. Titled “Enhancing Surgical Impact: Advancing Monitoring and Evaluation Strategies in Global Health Initiatives,” the session was facilitated by Dr. Rachel Davis, Nadia Rahman, and Dr. Youmna Sherif. Faculty, staff, and learners from Texas Children’s who are interested in global health learned about innovative strategies that are reshaping how surgical care is delivered and evaluated in low and middle-income countries.

Five billion people worldwide currently lack access to essential surgery and anesthesia care. Texas Children’s and Baylor College of Medicine are collaborating to address this inequity. Dr. Davis highlighted the enormity of the challenge, stating that “Nine out of ten people in these regions cannot access basic surgical care.”

The solution involves using multiple strategies, including education, collaboration, advocacy, and research, to address the issue. The Global Surgery Residency at Baylor College of Medicine, the sole longitudinal integrated clinical global surgery residency in the U.S., prepares surgeons to effectively operate in under-resourced areas with challenging conditions.  

The session provided valuable insights from a recent course taught by the speakers at the University of Global Health Equity in Rwanda. This university, founded by Paul Farmer, a pioneering figure in global health from Harvard University, aims to train African leaders in the field of global health. The M&E modules demonstrated how these frameworks improve the delivery of surgical care in real-world settings. Nadia Rahman, a global M&E specialist at Texas Children's, discussed how well-executed M&E frameworks bolster healthcare systems. She outlined the fundamental steps in creating a results framework and identifying key indicators necessary for evaluating program effectiveness and identifying areas for improvement.

The event underscored the significance of integrating strong M&E strategies into global health initiatives. These crucial M&E frameworks serve as valuable dissemination tools, enabling health facilities and governments to make informed, data-driven decisions. Incorporating strong M&E strategies can greatly improve the effectiveness and impact of global health programs. 



