To assess our current knowledge regarding lava impacts on the built environment, we conducted a literature search to compile all published information on lava flow impacts. From this search we compiled a dataset that comprises two components: 1) records of lava flow impacts in published literature, and 2) lava flow impact events compiled from the first component. Hereafter, we refer to a study as a published journal article, book, report or similar found in our literature search, and a record as a mention of impact on the built environment within a study. Thus, there may be multiple records within a single study. We also refer to an impact event as an eruption with at least one recorded lava flow that impacted the built environment (buildings and/or infrastructure). For an overview of the methodology used in this study, see Fig. 1.
2.1 Records of lava flow impacts
We conducted a systematic literature review in 2022 using Google Scholar search following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Page et al., 2021), of published English language research papers, books, reports, and grey literature, to identify studies mentioning lava flow impacts on the built environment. The search was conducted in September 2021 and repeated in December 2022 to include any additionally published reports. Our inclusion criteria were studies that report past or current lava flows and focus on or mention qualitative or quantitative information about impacts to buildings (e.g., buildings, villages, cities) or infrastructure (e.g., roads, ski-lifts, electricity network). For the initial search using the terms listed in Fig. 1, we filtered through the top 100 results ranked by relevance. We did not filter beyond the 100th result for each search as the studies became repetitive and less relevant. We removed repeated results and studies that did not fit the inclusion criteria.
We used web scraping to programmatically query the GVP Bulletin Reports and Volcano Summaries for mention of relevant strings (shown in Fig. 1A), ignoring punctuation and capitalisation (the code for this is provided at: https://github.com/elinormeredith/GVPscrape). We manually read those reports and selected those that fit the inclusion criteria (Fig. 1). We extended our search to newspaper articles using the online article databases Elephind, Google News Archive and Gale NewsVault to search for mention of relevant strings (shown in Fig. 1). We filtered up to the first 50 articles for each of the search terms for each website, as most search terms returned fewer than 50 results and results beyond the first 50 became repetitive. From these we selected newspaper articles that fit the inclusion criteria, which resulted in 153 newspaper articles from between 1809 and 2018.
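The string matching described above can be illustrated with a minimal sketch. This is not the code from the linked repository: the function names and the two search strings are hypothetical placeholders (the actual strings are those in Fig. 1A), and only ASCII punctuation is stripped.

```python
import string

# Hypothetical search strings; the strings actually used are listed in Fig. 1A.
SEARCH_TERMS = ["lava flow destroyed", "lava flow damaged"]

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalise(text):
    """Lower-case the text, remove ASCII punctuation and collapse whitespace."""
    return " ".join(text.lower().translate(_PUNCT_TABLE).split())


def matching_terms(report_text, terms=SEARCH_TERMS):
    """Return the search terms found in a report, ignoring punctuation and case."""
    clean = normalise(report_text)
    return [term for term in terms if normalise(term) in clean]


if __name__ == "__main__":
    print(matching_terms("The lava flow, DESTROYED two houses."))
```

Reports returning at least one match would then be read manually against the inclusion criteria, as in the text.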
We define records of impact as those that refer to contact between lava and the built environment. Table 1 gives examples of terminology in records used to determine whether the records are included. We retained all studies referring to impacts on human-made structures on a large scale (e.g., villages, settlements, cities) or individual structures (e.g., buildings, roads, ski-lifts, ancient structures such as pyramids). If the study only recorded impact on people, agriculture or farms, and not structures, these were not included unless there was reference to structures such as farmhouses or farmsteads (Table 1). We also did not include impacts on hiking trails (e.g., 2018 CE Piton de la Fournaise: Global Volcanism Program, 2019c). We included records with phrases that refer to direct impact, such as damaged or destroyed (e.g., Jenkins et al., 2017; Mactagone, 2019; Siebe, 2000), and also those that refer to the mechanism of damage, such as engulf, buried or covered (e.g., Brown et al., 2017; Ramírez-Uribe et al., 2021; Thordarson and Höskuldsson, 2008), as these imply impact. However, we did not include records with phrases not implying contact, such as threatened (e.g., 1959 Mount Cameroon, Cameroon: Jennings, 1959; Tsang and Lindsay, 2020). We included records where lava flows have severed, cut across or traversed roads (e.g., Global Volcanism Program, 2019b; Peltier et al., 2022; Wantim et al., 2018), as contact between lava and roads results in impact (Hayes et al., 2021; Mossoux et al., 2019; Wilson et al., 2014). We did not include impacts from block-and-ash flows or lava domes. If the phrase used was ambiguous, we used other studies to verify if lava flow hazard was present during the eruption (Table 1). For example, lava mud or lava flood could mean lahars, a statement that lava reached an asset may not imply contact, and impact to populations may not include structures.
Table 1
Examples of hazard, impact and asset terminology in records used to determine inclusion or rejection of the record in this research. Other studies were used to verify impact from lava flows to the built environment if any of the terms in the third column were found.
Term type | Included | Verified with other studies | Not included |
Hazard | Lava, lava flow | Lava mud, lava flood, cold lava, lava avalanche | Block-and-ash flow, lava dome, nuée ardente |
Impact | Damaged, destroyed, engulfed, buried, covered, severed, cut, traversed, impacted | Reached | Threatened |
Assets | Villages, settlements, cities, towns, buildings, houses, roads, ski-lifts, pyramids, farmsteads, railways | Populations, areas, regions, properties | Farms, farmland, agriculture, fences, hiking trails, spas |
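As an illustration only, the screening logic of Table 1 can be sketched as a lookup. The function name, the one-term-per-type input and the rule that unlisted terms default to verification are assumptions made for this sketch; in the study, records were read and judged manually.

```python
# Term groups transcribed from Table 1.
TABLE_1 = {
    "hazard": {
        "include": {"lava", "lava flow"},
        "verify": {"lava mud", "lava flood", "cold lava", "lava avalanche"},
        "exclude": {"block-and-ash flow", "lava dome", "nuée ardente"},
    },
    "impact": {
        "include": {"damaged", "destroyed", "engulfed", "buried", "covered",
                    "severed", "cut", "traversed", "impacted"},
        "verify": {"reached"},
        "exclude": {"threatened"},
    },
    "asset": {
        "include": {"villages", "settlements", "cities", "towns", "buildings",
                    "houses", "roads", "ski-lifts", "pyramids", "farmsteads",
                    "railways"},
        "verify": {"populations", "areas", "regions", "properties"},
        "exclude": {"farms", "farmland", "agriculture", "fences",
                    "hiking trails", "spas"},
    },
}


def screen_record(hazard, impact, asset):
    """Return 'include', 'verify' or 'exclude' for one hazard/impact/asset triple."""
    decisions = []
    for term_type, term in (("hazard", hazard), ("impact", impact), ("asset", asset)):
        groups = TABLE_1[term_type]
        key = term.lower()
        if key in groups["exclude"]:
            decisions.append("exclude")
        elif key in groups["include"]:
            decisions.append("include")
        else:
            # Listed 'verify' terms and (by assumption) any unlisted term.
            decisions.append("verify")
    if "exclude" in decisions:
        return "exclude"
    if "verify" in decisions:
        return "verify"
    return "include"
```

For instance, `screen_record("lava flood", "buried", "villages")` returns `"verify"`, reflecting that other studies would be needed to confirm the hazard was a lava flow rather than a lahar.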
After the first round of filtering, we used the studies to compile a list of 82 lava flow events that have impacted buildings or infrastructure. To capture as much information as possible, we conducted a second round of literature search with the lava flow event list. Using Google Scholar, we searched terms including the volcano name (shown in Fig. 1A). The first 30 results were selected, except for Etna, Kīlauea and Vestmannaeyjar volcanoes, for which we selected the first 50 results, as results beyond these became repetitive or irrelevant. The search terms for the volcano name are provided in Supplementary Material, including additional names of the eruptive fissures or cones, and/or alternate spellings for the volcano, that were included in the search. We removed repeated studies and filtered the studies using the inclusion criteria. These steps were repeated for any lava flow events found in the studies. We included GVP (2022). This resulted in a total of 384 studies.
Many of these studies reference other sources when mentioning impact information. For any studies that cited another study when referring to lava flow impacts, these cited studies were added to the literature list for filtering by the inclusion criteria, adding 209 records. This process was repeated until the information did not have a citation. We removed from subsequent analysis studies that were unavailable online, even through the EOS online subscription services. As a result, information found in cited studies is duplicated in later studies citing this source. This was done so that all impact information is captured and any information from unavailable studies would still be included.
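The citation-tracing step above amounts to following citations until no further source is cited. A minimal sketch follows, assuming a hypothetical mapping (`cited_by_study`) from each study to the studies it cites for impact information; in practice this mapping was built by reading each study.

```python
from collections import deque


def trace_citations(seed_studies, cited_by_study):
    """Follow impact citations from the seed studies until none remain.

    cited_by_study maps a study identifier to the identifiers it cites when
    reporting lava flow impacts (a hypothetical structure for this sketch).
    Returns every study reached, in order of discovery, without repeats.
    """
    seen = set()
    order = []
    queue = deque()
    for study in seed_studies:
        if study not in seen:
            seen.add(study)
            order.append(study)
            queue.append(study)
    while queue:
        current = queue.popleft()
        for cited in cited_by_study.get(current, []):
            if cited not in seen:  # also guards against circular citations
                seen.add(cited)
                order.append(cited)
                queue.append(cited)
    return order
```

Each study returned would still need to be filtered by the inclusion criteria and checked for availability.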
This research required an extensive methodology to compile and crawl hundreds of impact studies to capture every impact event possible. Studies were classified into different types based on their relevance to the eruption or volcano where the impact occurred (Fig. 1). Studies that did not focus on the eruption (e.g., general volcanology studies or other sources) may be less precise and/or reliable when reporting impacts than those focussed on the eruption (e.g., eruption impact assessment or eruption study), as impacts are not a central focus of the study. When comparing the records for each event, if there were more records from studies that did not focus on the eruption, these records outweighed the data from those studies focussing on the eruption. To narrow down the data to the most reliable and precise, we ranked the records by study type (listed in Fig. 1B) and selected the 10 highest-ranking records for each impact event. We based these rankings on study topic relevance. This means that the potentially less precise and/or less reliable sources (e.g., newspaper articles, other sources) were only included if there were fewer than 10 records for a specific lava flow event. If multiple records of a certain study type resulted in more than 10 records for an event, we subjectively selected studies within this study type with more relevance to lava flow impacts. For example, for the 1928 eruption of Etna, Italy, there were 13 journal article records and 11 newspaper article records. We removed the newspaper article records and three of the journal article records, as these three were focussed on general volcanology whereas the others were focussed on Etna, leaving a total of 10 records.
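The ranking and selection step can be sketched as a stable sort followed by truncation. The study-type order below is a hypothetical placeholder, since the actual ranking is the one given in Fig. 1B, and the subjective choice among records of the same type is not automated here: ties simply keep their input order.

```python
# Assumed ranking, most to least relevant; the actual order is listed in Fig. 1B.
STUDY_TYPE_RANK = [
    "eruption impact assessment",
    "eruption study",
    "volcano study",
    "general volcanology study",
    "newspaper article",
    "other source",
]


def select_top_records(records, limit=10):
    """Return the `limit` highest-ranking records for one impact event.

    Each record is a dict with a 'study_type' key. Unknown study types are
    ranked last. sorted() is stable, so records of equal rank keep their order.
    """
    rank = {study_type: i for i, study_type in enumerate(STUDY_TYPE_RANK)}
    ordered = sorted(records, key=lambda r: rank.get(r["study_type"], len(rank)))
    return ordered[:limit]


if __name__ == "__main__":
    # Counts mirror the 1928 Etna example in the text.
    etna_1928 = (
        [{"study_type": "newspaper article"}] * 11
        + [{"study_type": "general volcanology study"}] * 3
        + [{"study_type": "volcano study"}] * 10
    )
    print(len(select_top_records(etna_1928)))
```

Under this assumed ranking, the 1928 Etna example retains only the 10 journal article records focussed on Etna, consistent with the outcome described above.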
For each record, we collated information about the study it was in, including the date and type of publication (e.g., post-event impact assessment, eruption chronology, non-volcanology study). For each record we noted the information regarding any hazard metrics or characteristics (e.g., lava thickness, velocities, volume, area) and impact (e.g., number of buildings or villages destroyed, qualitative impact descriptions, amount of infrastructure destroyed). We also recorded the scale of both hazard data (e.g., maximum, average, point-data) and impact data (e.g., structure-level, village-level, infrastructure only). In total we collated and filtered to 536 records of lava flow impacts within 333 studies across 127 lava flow events, dating from as early as 3419 BCE to 2022 CE (Fig. 1C; Supplementary Material).
2.2 Lava flow impact events
For the structure of the lava flow impact event dataset (Fig. 1D), we followed the structure of Brown et al. (2017), with the aim of simplifying the information presented in multiple impact records on each lava flow impact event into one standardised dataset, whilst preserving the original data. To provide context for each lava flow impact event, we added information from the Global Volcanism Program (2013) about the volcano type, volcano number and tectonic setting.
For the starting year, we selected the starting year of the lava flow effusion most commonly reported in the records, and noted in the year uncertainty column if this differs from the eruption start year shown in the Global Volcanism Program (2013). If there was an uncertainty range around the starting year, we selected the mean starting year (e.g., 5494 year B.P. to 5387 year B.P. for Etna, Italy, was entered as -3419 in the events dataset). For each impact event, we compiled all the quantitative and qualitative hazard data (e.g., lava volume, lava thickness, lava area) and quantitative and qualitative impact data (e.g., number of buildings destroyed, buildings damaged and roads inundated) from the record dataset.
To select the most precise data for the dataset, we filtered the hazard and impact record information by scale (i.e., prioritising those on a structure-level scale over a village-level scale) (Fig. 1D). For quantitative data, where records gave multiple values, we provided the median value, and provided the range of values in the range column. For example, if various records for an impact event showed impacts of 10, 13, 20 and “many” buildings destroyed, the data were entered as 13 in the Median Number of Buildings Destroyed column and 10–20 in the Range of Destroyed Buildings column. The same approach was used for the lava flow Area and Volume columns. Other impact or hazard information is noted in the dataset. Data from individual records were converted into the same comparable unit in the lava flow impact event dataset, using the metric system. For qualitative data we listed all information; however, this was only entered when no quantitative data were available. Other information in the records, such as infrastructure (e.g., roads, ski-lifts, electricity pylons), agriculture and people (numbers of fatalities or evacuations), was also added to the relevant columns of the dataset. Verbatim quotes are added to the Notes columns, with semi-colons separating records where differing information is reported in multiple records. To establish a comprehensive dataset, we used additional literature sources and Global Volcanism Program (2013) to fill any missing hazard information. The hazard and impact references are added in separate columns. This resulted in a dataset with a total of 127 recorded lava flow impact events (Fig. 1E; Supplementary Material).
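The median and range rule can be checked against the worked example in the text. The sketch below is illustrative only: the function name and returned keys are hypothetical, and qualitative entries are kept only when no numeric value exists, following the rule stated above.

```python
from statistics import median


def summarise_counts(values):
    """Summarise the values reported across records for one event and column.

    Numeric values give a median and a (min, max) range. Text values such as
    "many" are returned only when no numeric value is available.
    """
    numeric = [v for v in values
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    qualitative = [v for v in values if isinstance(v, str)]
    if not numeric:
        return {"median": None, "range": None, "qualitative": qualitative}
    return {
        "median": median(numeric),
        "range": (min(numeric), max(numeric)),
        "qualitative": [],
    }


if __name__ == "__main__":
    # Worked example from the text: 10, 13, 20 and "many" buildings destroyed.
    print(summarise_counts([10, 13, 20, "many"]))
```

For the values in the text this gives a median of 13 and a range of 10 to 20. With an even number of numeric values, `statistics.median` returns the mean of the two middle values.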
This research focussed on impacts to physical structures. The dataset solely presents the recorded impacts of lava flows on the built environment (buildings and infrastructure), and relevant information associated with these events (e.g., agriculture impacts, fatalities, evacuations). The dataset does not include the many other events that have caused evacuations, fatalities or impacts to the natural or agricultural environment, but that have not impacted the built environment. Wider cascading impacts or impacts to aviation or climate are not included. Whilst we recognise that these impacts, as well as other social and economic impacts, are important and should be expanded upon in a future update of the dataset, these are out of our research scope.
2.3 Uncertainty of impact events
There are four main classifications of uncertainty in the dataset. If a record included any of the following sources of uncertainty, it was noted as an uncertain record. In the events dataset, we noted in the uncertainty column if there were any uncertain records used in the compilation of the dataset. If only uncertain records were used to compile the entry in the dataset, the event was classified as an uncertain event and the type of uncertainty was noted in the Uncertainty of Event column. The four sources of uncertainty are as follows:
A) Grouping of impact data. Impacts from additional hazards such as tephra fall or PDCs may be included within the total number of impacted structures. Compounding and secondary hazards can also affect structures in different ways and can result in greater overall damage. In some instances, it is difficult to distinguish the cause of the initial impact when studies report total eruption impacts (e.g., 1914 CE Sakurajima: Omori, 1916). If the type of additional hazard is known, this is added in brackets.
B) Inference of impacts. For eruptions where there are no direct observations of impact, some studies have inferred lava flow impacts based on archaeological evidence, or the presence of nearby settlements (e.g., 1075 CE San Francisco Volcanic Field: Elson et al., 2002; 2670 BCE Harrat Ash Shaam: Trifonov, 2007).
C) Uncertain data collection methods. Studies may report “damage” from a lava flow without detailing specific information, making it difficult to verify or expand upon if there is no source cited or data method given. Events are noted as uncertain if they do not have or cite any records with primary data collection methods (e.g., eyewitness accounts, field surveys, remote sensing). In these cases, alternate studies were used, and any photographs provided were analysed to clarify the role that different hazards, such as lahars or PDCs, had in the recorded impact, and any contradictory evidence was noted in the dataset (e.g., 1631 CE Vesuvius: Arnò et al., 1987).
D) Ambiguous hazard terminology. There are sometimes multiple or no local language translations of English volcanological terms (Harris et al., 2017); this may lead to the term lava being used in records to represent other hazards. Difficulty in determining the cause and result of the impacts may be apparent for the more andesitic eruptions in Indonesia, the Philippines, and South America, where deposits of block-and-ash flows, lahars or PDCs are often referred to as lava (e.g., Orense and Ikeda, 2007). In these potential cases, alternate studies were used, and any photographs provided were analysed to clarify the role that different hazards, such as lahars or PDCs, had in the recorded impact, and any contradictory evidence was noted in the dataset (e.g., 1814 Mayon: Bankoff et al., 2021).
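The rule for flagging uncertain events can be expressed compactly. In the sketch below, the letter codes follow the four classifications A to D above, while the function name, input structure and output keys are hypothetical.

```python
# Codes follow the four classifications listed in this section.
UNCERTAINTY_TYPES = {
    "A": "Grouping of impact data",
    "B": "Inference of impacts",
    "C": "Uncertain data collection methods",
    "D": "Ambiguous hazard terminology",
}


def summarise_event_uncertainty(record_flags):
    """Summarise uncertainty for one event.

    record_flags holds one set of codes ('A' to 'D') per record used for the
    event; an empty set marks a record with no noted uncertainty.
    """
    uncertain = [flags for flags in record_flags if flags]
    types = sorted(set().union(*uncertain)) if uncertain else []
    return {
        # At least one uncertain record contributed to the entry.
        "any_uncertain_records": bool(uncertain),
        # Only uncertain records were available for the entry.
        "uncertain_event": bool(record_flags) and len(uncertain) == len(record_flags),
        "uncertainty_types": types,
    }
```

An event with one certain and one uncertain record is therefore noted as containing uncertain records, but is classified as an uncertain event only when every record used is uncertain.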