Measuring disaster preparedness has been a challenge as there is no consensus on a standardised approach to evaluation. This lack of clear definitions and performance metrics makes it difficult to determine whether past investments in preparedness have made sense or to see what is missing. This scoping review presents publications addressing the evaluation of disaster preparedness at the governmental level. A literature search was performed to identify relevant journal articles from 5 major scientific databases (Scopus, MEDLINE, PsycInfo, Business Source Premier and SocINDEX). Studies meeting the inclusion criteria were analysed. The review considered the multi-disciplinarity of disaster management and offers a broad overview of the concepts for preparedness evaluation offered in the literature. The results reveal a focus on an all-hazards approach and on the local authority level in preparedness evaluation. Variation in the types of instruments used to measure preparedness and the diversity of questions and topics covered in the publications suggest little consensus on what constitutes preparedness and how it should be measured. Many assessment instruments seem to lack use in the field, which limits feedback on them from experts and practitioners. In addition, tools that are easy to use and ready for use by practitioners seem scarce.


In March 2015, 187 United Nations member states ratified the Sendai Framework for Disaster Risk Reduction 2015-2030 (UNISDR 2015) that formulated future needs and priorities for disaster risk management around the world. Priority 4, ‘Enhancing disaster preparedness for effective response and to “Build Back Better” in recovery, rehabilitation and reconstruction’ states the importance of the ‘[…] further development and dissemination of instruments, such as standards, codes, operational guides and other guidance instruments, to support coordinated action in disaster preparedness […]’ (UNISDR 2015, p.22).

Although preparedness is considered to be of high priority and importance, there is no universal guide or definition on disaster preparedness (e.g. what it comprises or how to achieve preparedness) (McEntire & Myers 2004, McConnell & Drennan 2006, Staupe-Delgado & Kruke 2017). A commonly used definition by the United Nations explains preparedness as:

The knowledge and capacities developed by governments, response and recovery organizations, communities and individuals to effectively anticipate, respond to and recover from the impacts of likely, imminent or current disasters. (UNISDR 2016, p.21).

However, standards for disaster preparedness are scarce and this lack of guidance makes collecting data about and assessing preparedness difficult. The UNISDR definition itself shows that preparedness has several units of analysis: governments, response and recovery organisations, communities and individuals.

Despite attempts to develop preparedness measures, there remains a lack of consensus and, consequently, a research gap about how preparedness evaluation should be done (Savoia et al. 2017, Khan et al. 2019, Belfroid et al. 2020, Haeberer et al. 2020). Asch and co-authors (2005) concluded that existing tools lacked objectivity and scientific evidence, an issue that persists. Objective evaluation would allow for intersubjective comparability of preparedness. Savoia and co-authors (2017) analysed data on research in public health emergency preparedness in the USA between 2009 and 2015. Although research shifted towards empirical studies during that period, some research gaps remained, such as the development of criteria and metrics to measure preparedness. Qari and co-authors (2019) reviewed studies conducted by the Preparedness and Emergency Response Research Centers in the USA between 2008 and 2015 that addressed criteria for measuring preparedness. They concluded that a clear standard was still lacking and guidance for the research community in developing measures would be needed. Haeberer and co-authors (2020) evaluated the characteristics and utility of existing preparedness assessment instruments. They found 12 tools, 7 of them developed by international authorities or organisations and a further 5 developed by countries (1 x England, 1 x New Zealand and 3 x USA). In their study, Haeberer and co-authors (2020) identified a lack of validity and user-friendliness. Thus, the literature shows that it remains critical to establish commonly agreed and validated methods of evaluation that help to define preparedness, identify potential for improvement and set benchmarks for comparing future efforts (Henstra 2010, Nelson, Lurie & Wasserman 2010, Davis et al. 2013, Wong et al. 2017).

Due to the relative rarity of disasters, it is unclear whether emergency plans and procedures are appropriate, whether equipment is functional and whether emergency personnel are adequately trained and able to undertake their duties (Shelton et al. 2013, Abir et al. 2017, Obaid et al. 2017, Qari et al. 2019). At the same time, a false sense of security due to unevaluated disaster preparedness strategies could lead to greater consequences from disasters (Gebbie et al. 2006). Ignorance about the status and quality of disaster preparedness impedes necessary precautionary measures and can cost lives. Moreover, the lack of proper evaluation poses a risk that mistakes of the past are not analysed, adaptations in procedures are not made and mistakes might be repeated (Abir et al. 2017). As Wong and co-authors (2017) stressed, there is ‘a moral imperative’ to improve methods of assessing preparedness and raise levels of preparedness to diminish preventable.

The Sendai Framework for Disaster Risk Reduction 2015-2030 underlines the social responsibility of academia and research entities to develop tools for practical application to help lessen the consequences of disasters (UNISDR 2015; Reifels et al. 2018). In addition, the Sendai Framework highlighted the important role of local governments in disaster risk reduction. Their understanding of local circumstances and the affected communities gives them valuable insights and the best chance of implementing measures (Beccari 2020). This scoping study offers emergency and disaster planners as well as researchers an overview of existing concepts and tools in the literature for preparedness evaluation at the government level. For planners, thorough evaluation of preparedness contributes to improved outcomes for people and reduced deaths, reduces costs for response and recovery and helps with future investment decisions (FEMA 2013). Evaluation can serve as performance records as well as provide an argumentation basis in negotiations for further (financial) resources. For researchers, this scoping review helps to compile evaluation concepts and identify conceptional gaps.


This study used a scoping-review approach as this method allowed an examination of a wide range of literature to identify key concepts and recognise gaps in the current knowledge (Arksey & O’Malley 2005). Scoping reviews generally aim to map existing literature regardless of the study design reported and without any critical appraisal of the quality of the studies (Peters et al. 2015).

The review process followed the 5-stage framework for conducting scoping reviews presented by Arksey and O’Malley (2005). The research was guided by a broad question: ‘What is known in scientific literature about the evaluation of disaster preparedness on the governmental level?’ Other questions were: ‘Which tools or concepts are available for evaluating disaster preparedness?’ and ‘Have these tools been tested in the field or used in disaster management?’ A scoping review method, which does not exclude any particular methods or assess study quality, was chosen because it provides as broad an overview of the existing tools and concepts as possible. The term ‘concept’ here means theoretical work that describes what preparedness evaluation should look like and what it encompasses, whereas ‘tools’ refers to actual ready-to-use instruments.

The search was conducted in 5 academic databases (Scopus, MEDLINE, APA PsycInfo, Business Source Premier and SocINDEX) covering public health, disaster management and social sciences. Searches were conducted in December 2018 with an update conducted in May 2021. The databases were selected as they are multi-disciplinary and encompass a wide range of research fields. In an initial step to gain an understanding of the material and terminology, various quick-scan searches were conducted in the databases as well as academic journals addressing disaster management and public health preparedness. This was followed by searches within the fields of ‘Title’, ‘Abstract’ and ‘Keywords’ as adapted to the specific requirements of each database using the following search terms: ‘disaster preparedness’ OR ‘emergency preparedness’ OR ‘crisis preparedness’ AND ‘assess*’ OR ‘evaluat*’ OR ‘measur*’ OR ‘indicat*’ AND NOT ‘hospital’.
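The logic of the search string above can be illustrated with a short filter over title and abstract text. This is a minimal sketch, not the query syntax of any of the databases: it assumes that the three preparedness phrases form one OR group, that the four truncated terms form a second OR group, and that ‘hospital’ is matched as an exact word. The grouping is not written out in the search string, and the function name `matches_search` is hypothetical.

```python
import re

# First OR group: at least one preparedness phrase must appear.
PREPAREDNESS_PHRASES = (
    "disaster preparedness",
    "emergency preparedness",
    "crisis preparedness",
)
# Second OR group: truncated terms such as 'assess*' match any word
# that starts with the stem (assess, assessing, assessment, ...).
EVALUATION_STEMS = ("assess", "evaluat", "measur", "indicat")
# AND NOT term, assumed here to be matched as an exact word.
EXCLUDED_TERM = "hospital"


def matches_search(text: str) -> bool:
    """Return True if the text satisfies the assumed Boolean grouping."""
    lowered = text.lower()
    has_phrase = any(phrase in lowered for phrase in PREPAREDNESS_PHRASES)
    has_stem = any(
        re.search(rf"\b{stem}\w*", lowered) for stem in EVALUATION_STEMS
    )
    has_excluded = re.search(rf"\b{EXCLUDED_TERM}\b", lowered) is not None
    return has_phrase and has_stem and not has_excluded
```

Under these assumptions, a title such as ‘Measuring disaster preparedness of local governments’ is retrieved, whereas a title that mentions a hospital, or one that names preparedness without any evaluation term, is not.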

The terms ‘disaster’, ‘crisis’ and ‘emergency’ are often used synonymously in the literature (Gillespie & Streeter 1987, Sutton & Tierney 2006, Hemond & Benoit 2012, Staupe-Delgado & Kruke 2018, Monte et al. 2020), thus these terms were used in all the searches. Additionally, a search of Google Scholar including the first 100 records was conducted. The reference lists of the examined full papers were searched manually to identify additional, relevant published works not retrieved via the database search. The literature sample was restricted to articles published between 1999 and 2021 in either English or German. Some relevant articles may have been excluded from this review due to these selection factors. All citations were imported into Endnote and duplicates were removed.

At the first stage of screening, the title and abstract of each published work were reviewed against eligibility criteria. Reasons for exclusion were:

  • The study did not address disaster preparedness on the governmental level but lower levels like hospital or individual and household preparedness.
  • The study only discussed disaster preparedness in general, not the assessment, measurement or evaluation of preparedness.
  • The study addressed only exercise or emergency drill evaluation.
  • The study addressed only communicable disease outbreaks (e.g. influenza pandemics, Ebola or H1N1).
  • The study addressed corporate or economic crises.
  • The study was not a full text but only an editorial or conference abstract.

Only papers that were clearly irrelevant for the study’s purpose were removed at the stage of screening titles and abstracts. The papers determined eligible for full-text review were checked by 2 researchers independently. The research team met throughout the screening process to discuss uncertainties regarding the inclusion and exclusion of works from the sample.


The database search returned 4,924 references. Removal of duplicates led to a total of 3,955. Of these, 29 were added from the Google Scholar search and from a snowballing analysis of the reference lists of the included works. Screening abstracts led to the exclusion of 3,724 records. Of the remaining 271 full-text works that were assessed for eligibility, 29 met all inclusion criteria for analysis, although 8 works were not available in full text. The search methodology is illustrated in Figure 1 and the analysis of the 29 included studies is detailed in Table 1.

Figure 1: The research methodology taken for this study.

Geographic distribution

The geographic distribution of the sample shows that the majority (n=15) of articles focused on preparedness evaluation in the USA. Two studies were from the Philippines and one each dealt with preparedness evaluation in European countries, Mexico, Canada, Indonesia, China, Saudi Arabia and Brazil. Two studies conducted case studies in Italy and France as well as Chile and Ecuador. Three conceptual works that addressed theoretical frameworks instead of instruments (Henstra 2010, Diirr & Borges 2013, Alexander 2015) did not specify a country in their descriptions.

Figure 2 graphs the number of papers in the sample by year of publication.

Hazard type

The majority (n=14) of the studies took an all-hazards approach and 2 studies chose natural hazards as the research scope. One study each addressed radiation, bioterrorism, meteorological disasters and flood. One study addressed flood, typhoon and earthquake. Eight studies did not specify a hazard type.

Level of analysis

Fourteen studies chose the local authority level for evaluation. Seven studies were at a state level and 2 were at the regional level. Three studies covered more than one level. Three studies did not specify a level of analysis.

Study design

In 23 works, the studies followed a quantitative approach and 12 included case studies based on the use of their assessment tools. Five studies are conceptual and offered theoretical frameworks, which could be used as a basis for setting up evaluation of disaster preparedness. One study used a qualitative approach by applying document review in combination with in-depth interviews.

Categories of evaluation

An analysis of the topics in the studies revealed a wide variety of categories and categorisation schemes. For the following description, terms that were most commonly used within the studies as keywords were chosen. For reasons of clarity and comparability, only the main categories of the instruments were analysed and subcategories were not part of the review.

Some categories emerged in the sample only once or twice, either because the instruments were designed to analyse a particular topic or because the analyses were so superficial and generic that the categories were mentioned only once. Categories that occurred less than 3 times in the studies are not listed. Agboola and co-authors (2015) use 160 general tasks in their measurement tool, however, a description of topics covered was not stated.

Figure 2: Publications by year of publication.
Table 1: Main categories of evaluation.
Category n Authors
Communication and information dissemination 14 (Somers 2007; Jones et al. 2008; Shoemaker et al. 2011; Shelton et al. 2013; Davis et al. 2013; Dalnoki-Veress, McKallagat & Klebesadal 2014; Djalali et al. 2014; Alexander 2015; Schoch-Spana, Selck & Goldberg 2015; Connelly, Lambert & Thekdi 2016; Murthy et al. 2017; Juanzon & Oreta 2018; Amin et al. 2019; Khan et al. 2019)
Plans and protocols (some including testing and adaptation of plans**) 14 (Mann, MacKenzie & Anderson 2004; Alexander 2005, 2015; Somers 2007; Simpson 2008; Henstra 2010; Watkins et al. 2011; Davis et al. 2013; Dalnoki-Veress, McKallagat & Klebesadal 2014; Connelly, Lambert & Thekdi 2016; Juanzon & Oreta 2018; Khan et al. 2019; Dariagan, Atando & Asis 2021; Greiving et al. 2021)
(including volunteers*) 11 (Somers 2007; Jones et al. 2008; Porse 2009; Henstra 2010; Watkins et al. 2011; Davis et al. 2013; Dalnoki-Veress, McKallagat & Klebesadal 2014; Schoch-Spana, Selck & Goldberg 2015; Juanzon & Oreta 2018; Amin et al. 2019; Khan et al. 2019)
Training and exercises 9 (Mann, MacKenzie & Anderson 2004; Henstra 2010; Davis et al. 2013; Dalnoki-Veress, McKallagat & Klebesadal 2014; Djalali et al. 2014; Alexander 2015; Juanzon & Oreta 2018; Amin et al. 2019; Khan et al. 2019)
Legal and policy determinants 9 (Alexander 2005; Henstra 2010; Shoemaker et al. 2011; Davis et al. 2013; Potter et al. 2013; Djalali et al. 2014; Khan et al. 2019; Handayani et al. 2020; Dariagan, Atando & Asis 2021)
Cooperation and mutual aid agreements 8 (Henstra 2010; Watkins et al. 2011; Dalnoki-Veress, McKallagat & Klebesadal 2014; Schoch-Spana, Selck & Goldberg 2015; Connelly, Lambert & Thekdi 2016; Amin et al. 2019; Khan et al. 2019; Handayani et al. 2020)
Supplies and equipment 8 (Alexander 2005, 2015; Jones et al. 2008; Dalnoki-Veress et al. 2014; Djalali et al. 2014; Juanzon & Oreta 2018; Khan et al. 2019; Dariagan, Atando & Asis 2021)
Risk assessment 7 (Alexander 2005; Henstra 2010; Dalnoki-Veress et al. 2014; Murthy et al. 2017; Amin et al. 2019; Khan et al. 2019; Handayani et al. 2020) 
Financial resources 6 (Potter et al. 2013; Dalnoki-Veress, McKallagat & Klebesadal 2014; Connelly, Lambert & Thekdi 2016; Juanzon & Oreta 2018; Khan et al. 2019; Handayani et al. 2020) 
Evacuation and shelter 6 (Jones et al. 2008; Simpson 2008; Alexander 2015; Connelly, Lambert & Thekdi 2016; Greiving et al. 2021)
Early warning 6 (Simpson 2008; Djalali et al. 2014; Alexander 2015; Juanzon & Oreta 2018; Khan et al. 2019; Greiving et al. 2021) 
Post-disaster recovery 5 (Alexander 2005, 2015; Somers 2007; Cao, Xiao & Zhao 2011; Handayani et al. 2020) 
Community engagement 4 (Simpson 2008; Murthy et al. 2017; Juanzon & Oreta 2018; Khan et al. 2019) 

* (Davis et al. 2013, Schoch-Spana, Selck & Goldberg 2015)
** (Alexander 2005, Henstra 2010, Connelly, Lambert & Thekdi 2016)


The majority of studies (n=16) developed a questionnaire or checklist to evaluate preparedness. Some instruments also weighted the indicators differently and offered a total preparedness score. The scope and number of items in the questionnaires and checklists varied widely.
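How a checklist with weighted indicators can be condensed into a total preparedness score may be illustrated with a weighted mean. This is a minimal sketch and does not reproduce the scoring rule of any instrument in the sample. The function name `total_preparedness_score`, the 0 to 1 rating scale and all ratings and weights are hypothetical; only the indicator labels echo categories from Table 1.

```python
def total_preparedness_score(ratings, weights):
    """Weighted mean of indicator ratings, each assumed to lie between 0 and 1."""
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same indicators")
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(ratings[name] * weights[name] for name in ratings) / weight_sum


# Hypothetical checklist result: 1.0 = fully met, 0.5 = partly met, 0.0 = not met.
ratings = {"plans": 1.0, "training": 0.5, "early_warning": 0.0}
# Hypothetical weights: plans count twice as much as the other indicators.
weights = {"plans": 2.0, "training": 1.0, "early_warning": 1.0}

weighted_score = total_preparedness_score(ratings, weights)
unweighted_score = total_preparedness_score(ratings, {name: 1.0 for name in ratings})
```

With these illustrative values the weighted score is 0.625 and the unweighted score is 0.5, which shows that the choice of weights alone can shift the overall result.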

Four studies provided theoretical concepts about how to measure preparedness. Alexander (2005) set up 18 criteria to formulate a standard for assessing preparedness. He suggested using them to evaluate existing plans or as guidelines when developing new ones. Diirr & Borges (2013) offered a concept for workshops in which emergency plans are evaluated. Potter and co-authors (2013) presented a framework of the needs and challenges of a preparedness evaluation tool. Khan and co-authors (2019) set up 67 evaluation indicators for local public health agencies, based on an extensive literature review and a 3-round Delphi process in which 33 experts participated.

Six instruments using metrics appeared in works in the sample. Cao, Xiao & Zhao (2011) used entropy-weighting to improve the TOPSIS method. A measurement tool based on the Analytical Hierarchy Process Technique was developed in each of 3 studies by Manca & Brambilla (2011), Dalnoki-Veress, McKallagat & Klebesadal (2014) and Handayani and co-authors (2020). Connelly, Lambert & Thekdi (2016) used multiple criteria and scenario analysis in their study. Porse (2009) used statistical analysis to identify significant correlations among the preparedness indicators in health districts and their demography, geography and critical infrastructure.
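For readers unfamiliar with entropy-weighted TOPSIS, the general idea can be sketched as follows. This is a textbook version under stated assumptions and not the specific procedure of Cao, Xiao & Zhao (2011): all criteria are treated as benefit criteria with positive values, at least one criterion varies between the units, and the decision matrix is invented for illustration.

```python
import math


def entropy_weights(matrix):
    """Derive criterion weights from how strongly values differ across units."""
    m = len(matrix)
    n = len(matrix[0])
    diversities = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [value / total for value in column]
        # Normalised Shannon entropy of the column (1.0 = no differences).
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        diversities.append(1.0 - entropy)
    diversity_sum = sum(diversities)
    return [d / diversity_sum for d in diversities]


def topsis_closeness(matrix, weights):
    """Relative closeness of each unit to the ideal solution (0 = worst, 1 = best)."""
    n = len(matrix[0])
    # Vector normalisation of each column, then weighting.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(row[j] for row in weighted) for j in range(n)]
    worst = [min(row[j] for row in weighted) for j in range(n)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical ratings of three units (rows) on three criteria (columns).
decision_matrix = [
    [8.0, 6.0, 7.0],
    [5.0, 9.0, 4.0],
    [3.0, 4.0, 2.0],
]
criterion_weights = entropy_weights(decision_matrix)
closeness = topsis_closeness(decision_matrix, criterion_weights)
```

Criteria on which the units differ more receive higher weights, and the third unit, which has the lowest value on every criterion, coincides with the negative ideal solution and therefore receives a closeness of 0.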

Three studies used other approaches. Nachtmann & Pohl (2013) developed a scorecard-based evaluation supported by a software-application. Amin and co-authors (2019) developed a fuzzy-expert-system-based framework with a corresponding software tool. Greiving and co-authors (2021) developed a guiding framework for performing preparedness evaluation through a qualitative approach using policy documents and in-depth expert interviews.

Basis of the instrument

For 5 studies, literature reviews were conducted to form a knowledge base to develop the instrument. In 6 other works, expert opinions and the experiential knowledge of the authors were stated as a basis for the instruments developed. Eleven instruments were based on existing models and techniques. A literature review in combination with expert consultation was used for 7 studies.

Field testing

Twenty-two studies included some sort of testing of the developed instruments.

Eleven studies included case studies. Alexander (2015) evaluated a civil protection program in a Mexican town. Nachtmann & Pohl (2013) evaluated 3 country-level emergency operations plans using their method. Cao, Xiao & Zhao (2011) calculated the level of meteorological emergency management capability of 31 provinces in China. Manca and Brambilla (2011) conducted a case study of an international road tunnel accident. Connelly, Lambert and Thekdi (2016) applied their method to the city of Rio de Janeiro in Brazil and possible threats around the FIFA World Cup and Olympic Games held there. Juanzon and Oreta (2018) used their tool to assess the preparedness of the City of Santa Rosa. Simpson (2008) applied his methodology in 2 communities. Porse (2009) performed a statistical analysis of data collected in 35 health districts in Virginia, USA, to identify significant correlations among preparedness factor categories. Amin and co-authors (2019) analysed flood management in Saudi Arabia. Dariagan, Atando and Asis (2021) assessed the preparedness for natural hazards of 92 profiled municipalities in central Philippines. Greiving and co-authors (2021) conducted case studies by analysing policy documents and conducting in-depth interviews with experts to evaluate the preparedness of Chile and Ecuador.

Eleven studies developed questionnaires, which were sent to (public) health agencies and departments. The sample size of respondents in the studies varied widely. The remaining 7 studies did not provide information about whether the instruments were tested.

Expert and stakeholder feedback

Nine studies reported obtaining some kind of feedback from experts or stakeholders. The feedback methods described were expert interviews (Agboola et al. 2015), expert interviews plus a questionnaire (Amin et al. 2019), informal discussions (Jones et al. 2008, Davis et al. 2013, Diirr & Borges 2013, Shelton et al. 2013), consultation at key milestones with professionals who were likely to use the tool (Khan et al. 2019), pilot testing with feedback incorporated into the final version (Watkins et al. 2011) and meetings and validation sessions (Manca & Brambilla 2011). Twenty studies did not state whether feedback from experts or stakeholders was obtained.


This review identified a wide variety of tools for government disaster preparedness evaluation in the literature. However, there is no clear or standardised approach and no consensus about what preparedness encompasses and what elements need to be present in a preparedness evaluation tool. The research is far from the goal of a simple and valid tool that is ready for use by emergency and disaster managers. The lack of dissemination in practice of most of the tools identified in the review suggests that there has been little to no involvement of disaster managers in the development process.

This study revealed an array of concepts and tools to measure and evaluate disaster preparedness at the government level. The wide range of assessment categories and topics covered demonstrates a lack of consistent terminology used in the methods sections, as noted by Wong and co-authors (2017). Many of the works in the sample focused on narrow contexts or special subject areas (e.g. legal aspects, logistics or emergency plans). Concepts for evaluating preparedness and all its components remain scarce, probably due to the great complexity and consequent scope that such tools would require. Whether it is possible to develop a single one-size-fits-all tool is questionable. Cox & Hamlen (2015) argue for several individual indices as this might give more meaningful insights than one aggregated index, while also offering flexibility. A major challenge in developing a comprehensive instrument is balancing generalisability and flexibility. According to Alexander (2015), local circumstances including ‘different legal frameworks, administrative cultures, wealth levels, local hazards, risk contexts and other variations’ have to be considered when establishing evaluation criteria (Alexander 2015, p.266, Das 2018). Therefore, developing a modular system consisting of fixed, must-have criteria as well as optional criteria is recommended. That approach would provide minimum standards and comparability as well as support individualisation by adding variables depending on the circumstances of the system to be evaluated. At the same time, a degree of simplicity is necessary in order to ensure an instrument’s widespread use.
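The modular idea of fixed must-have criteria combined with optional, context-specific criteria could be represented along the following lines. This is only an illustrative sketch under stated assumptions: the class `Criterion`, the function `evaluate`, the criterion names and the rule that all must-have criteria have to be fulfilled to meet the minimum standard are hypothetical and are not taken from any of the reviewed instruments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    mandatory: bool  # True for must-have criteria, False for optional ones


def evaluate(criteria, fulfilled):
    """Return (meets_minimum_standard, share_of_criteria_fulfilled)."""
    core = [c for c in criteria if c.mandatory]
    # Assumed rule: the minimum standard requires every must-have criterion.
    meets_minimum = all(c.name in fulfilled for c in core)
    met = sum(1 for c in criteria if c.name in fulfilled)
    coverage = met / len(criteria) if criteria else 0.0
    return meets_minimum, coverage


# Fixed core shared by all evaluated systems (hypothetical selection).
CORE = [Criterion("emergency plan", True), Criterion("risk assessment", True)]
# Optional module added for a particular local context (hypothetical).
coastal_module = [Criterion("tsunami early warning", False)]

result = evaluate(CORE + coastal_module, {"emergency plan", "risk assessment"})
```

In this example the evaluated system meets the minimum standard because both core criteria are fulfilled, while the coverage of two out of three criteria indicates that the optional module leaves room for improvement. The shared core supports comparison between systems and the modules carry the local adaptation.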

Most of the included studies were conducted in the USA, which raises the issue of generalisability. As disaster preparedness is a topic of relevance to any community or state, an overview of existing concepts and tools, regardless of their geographic background, is valuable. When adapted to the socio-cultural and legal circumstances, a preparedness evaluation concept from one country can help improve the preparedness of another system.

Many concepts offer numerical scores for sub-areas as well as overall scores to support comparability of instruments, reveal potential for improvement and help users to assess disaster preparedness. However, the question arises whether one or a few numbers can represent the whole construct of preparedness. It is important to consider whether all factors should be considered equally or whether a weighting of components in the evaluation is necessary (Davis et al. 2013).

Another potential problem in evaluating preparedness with numeric scores is the risk of simplification. Having only a few scores and values may be helpful to form an overview of the status quo and they can be a useful instrument in discussions with policy makers or for acquiring financial resources. However, can the whole complex construct of preparedness be measured properly with only one or a few numbers? Important details could be neglected (Porse 2009, Davis et al. 2013, Khan et al. 2019). Using a mix of qualitative and quantitative measures addresses aspects of cultural factors, resource constraints, institutional structures or priorities of local stakeholders (Nelson, Lurie & Wasserman 2007; Cox & Hamlen 2015).

A considerable proportion of the studies described only partial or limited involvement of experts from within the field. Some studies used the knowledge and assessments of experts as a starting point for their concepts and some tested the instruments and asked disaster managers for their feedback. However, continuous cooperation and exchange appeared to be an exception, a problem unfortunately quite common in disaster risk reduction (Owen, Krusel & Bethune 2020). This is in line with the results by Davis and co-authors (2013) and Qari and co-authors (2019) who observed a lack of awareness and, as a result, the limited dissemination of instruments for measuring preparedness. Yet all of those efforts by researchers are worth nothing if not put into practice. As Hilliard, Scott-Halsell & Palakurthi (2011) stated, ‘It is not enough to talk about preparedness and keeping people, property and organisations safe. There has to be a bridge between the concepts and the real world’ (p.642).


While effort was undertaken to achieve a comprehensive overview of the scientific knowledge base about disaster preparedness evaluation, this scoping review might not have captured all existing concepts. The search algorithm was tested but other keywords might have returned additional or different results. Due to the lack of keywording, some relevant book chapters might not have been identified. Moreover, the selection of languages (English and German) as well as the chosen timeframe of publication (1999–2021) might have reduced the number of relevant results. Results from grey literature may have been missed as only the first 100 results from the web search were used. The classification of the results of the scoping review was carried out by 2 researchers independently; however, errors may have occurred during the selection process due to the subjective evaluation of eligibility. As the focus of the review was the scientific knowledge base, concepts of practice-oriented, humanitarian institutions and organisations were not included in this review. Studies dealing with infectious disease outbreaks or epidemics were not included as their course, duration and spread are very different from disasters triggered by natural hazards or human-made disasters like terror attacks.


Although disaster preparedness evaluation is important for practice and preparedness improvement, this study’s results indicate a lack of instruments that are ready to use. There is a broad variety of concepts and tools on offer; however, there is no standard or uniform approach. Research on evaluating preparedness has been conducted and the list of these works provides an overview of concepts. However, the goal of developing a valid as well as easy-to-use tool for measuring preparedness at the government level seems far from achieved. Many assessment tools lack dissemination and use in practice, which limits feedback from experts and practitioners. The variation in types of instruments used to measure preparedness and the diversity of questions and topics covered within the studied publications demonstrate a lack of consensus on what constitutes preparedness and how it should be measured. Any tool for evaluating preparedness needs to strike a balance between simplicity and flexibility in order to account for the different circumstances of communities as well as hazard types. Therefore, a modular evaluation system including must-have criteria as well as optional criteria is required.