A priority index for humanitarian aid after a typhoon

Source(s): Netherlands Red Cross

By Maarten van der Veen

The Priority Index is a data-driven solution to predict damage to houses after a (super) typhoon or hurricane. We use data and machine learning techniques to identify priority areas for humanitarian aid. Organizations like the Red Cross and Red Crescent National Societies, governments or UN OCHA can use these results to better understand the impact of a natural disaster and to mobilize humanitarian response faster.

Humanitarian aid improves with accurate and timely information

When a natural disaster strikes, the local government, NGOs and Red Cross and Red Crescent National Societies quickly need information on the damage (affected population, casualties, road blocks, flood extent, damaged houses) in the areas that were hit by the disaster. The information that is presented to decision makers in the wake of a disaster needs to be accurate, appropriate, timely and valid.

One of the challenges with disaster response is scarcity of resources: not every affected family can be helped. It is therefore essential to identify priority areas by assessing damage and finding the vulnerable people who are affected the most. Currently, damage assessment and identification of the most vulnerable is a time-consuming process that can take weeks to complete because of logistics, safety constraints or workload.

Assessment teams need to go into the affected area to interview affected people and review damage to houses. Because of time constraints or limited information sharing, there is a risk that decisions on priority areas are not based on complete and accurate information. Organizational or political preferences can then come into play, and areas that receive more media attention may be favoured over others.

During a study in the Philippines, 60% of interviewed decision makers (government, NGOs and UN) indicated that a faster, more complete and more objective analysis of priority areas (a Priority Index) could be useful to identify areas with high damage and large numbers of people affected. Such an analysis would support decision makers in prioritizing and distributing aid efforts, and in reaching the most vulnerable people in the worst affected areas more efficiently.

Building the priority index for typhoons

Our aim is to develop a methodology to identify high priority areas for humanitarian response, based on (open) secondary data of affected areas, combined with disaster impact data (such as wind speed and rainfall) and by learning from past disasters. It is important that we invest in data preparedness, so that these pre-crisis secondary datasets are available and up to date (1, 2).

Applied research on this objective is ongoing for typhoons (Philippines), earthquakes (Nepal) and floods (Malawi). Our objective is to develop machine learning methodologies that can be applied to different countries, using local data, and that with minor modifications reach a fast and sufficiently accurate damage prediction. In this blog we describe initial results for the Philippines during Typhoon Haima on October 19th 2016.

Data

The data used for the prediction model includes country-wide baseline data (administrative boundaries, population, poverty, house wall and roof types), geographical features per municipality (ruggedness, slope, coastline length, distance to coast) and impact data (wind speed, rainfall, typhoon path), as well as a number of specific features derived from these data. Official counts of damaged houses by DSWD and NDRRMC (Philippine government) are used to validate the model. For this we used data from four past typhoons: Haiyan, Melor, Hagupit and Rammasun. More detail on the data and its sources is available here.

All data was aggregated to the municipality level. Unfortunately barangay level damage counts are not available in the datasets published by the government. All information per municipality was integrated using the PCODE-system, which assigns a unique identifier to each administrative area in the Philippines. To ease this task, an efficient PCoder was developed.
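The integration step can be illustrated with a minimal sketch: damage reports are summed per municipality and then attached to the baseline records through the shared P-code. The P-codes, field names and numbers below are made up for illustration and are not taken from the actual datasets or from the PCoder tool.

```python
from collections import defaultdict

# Hypothetical baseline records, one per municipality (values are invented).
baseline = [
    {"pcode": "PH0001", "population": 12000, "poverty_pct": 31.0},
    {"pcode": "PH0002", "population": 45000, "poverty_pct": 18.5},
    {"pcode": "PH0003", "population": 8000, "poverty_pct": 40.2},
]

# Hypothetical damage reports; several rows may refer to the same municipality.
damage_reports = [
    {"pcode": "PH0001", "houses_damaged": 150},
    {"pcode": "PH0001", "houses_damaged": 50},
    {"pcode": "PH0002", "houses_damaged": 900},
]


def aggregate_damage(reports):
    """Sum reported damaged houses per P-code."""
    totals = defaultdict(int)
    for row in reports:
        totals[row["pcode"]] += row["houses_damaged"]
    return dict(totals)


def join_on_pcode(baseline_rows, damage_totals):
    """Attach damage totals to baseline rows; None marks a missing official count."""
    merged = []
    for row in baseline_rows:
        record = dict(row)
        record["houses_damaged"] = damage_totals.get(row["pcode"])
        merged.append(record)
    return merged


municipalities = join_on_pcode(baseline, aggregate_damage(damage_reports))
for record in municipalities:
    print(record)
```

Keeping a missing count as `None` rather than 0 matters here, because an unreported municipality is not the same as an undamaged one.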

The prediction model

In the risk management domain, probabilistic models are being developed to determine the likelihood of losses from a disaster (usually economic loss). These models create impact scenarios that decision makers can use to mitigate risk. However, they are not developed to predict the impact on people during a recent disaster. Our approach is not to develop sophisticated hydrologic, seismic or wind speed models, but to use machine learning methods to find the best predictors of typhoon impact in existing baseline data. Different machine learning methods have been tried (including neural networks). Currently we are using a method called Random Forest Regressor.
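As a rough illustration of this kind of model, the sketch below fits scikit-learn's `RandomForestRegressor` on synthetic data and reads off the feature importances. It is not our actual pipeline: the feature names, the synthetic target and the hyperparameters are all assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_municipalities = 200

# Assumed feature set, loosely inspired by the data described above.
feature_names = ["wind_speed", "rainfall", "distance_to_coast", "strong_wall_share"]
X = rng.random((n_municipalities, len(feature_names)))

# Synthetic target (invented): damage fraction rises with wind and rainfall
# and falls with the share of strong walls, plus some noise.
y = (0.6 * X[:, 0] + 0.1 * X[:, 1] - 0.3 * X[:, 3]
     + 0.05 * rng.standard_normal(n_municipalities))

# Hold out the last 50 rows to mimic predicting for a new event.
train, test = slice(0, 150), slice(150, n_municipalities)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[train], y[train])

predicted = model.predict(X[test])
importances = dict(zip(feature_names, model.feature_importances_))

for name, value in sorted(importances.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.2f}")
```

The `feature_importances_` attribute is what makes it possible to compare how much each predictor contributes to the model.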

Testing the model

We had the unique opportunity to test the model during the recent Typhoon Haima. We dropped all our other work and got the team to fast-track the development of the model and to collect and clean the impact data, so that we were able to release a first Priority Index within 24 hours after landfall. The first official damage counts, covering parts of the affected area, were released more than four days later. The results were shared with humanitarian organizations and government, and through social media. We have produced two types of analysis.

Priority areas within 24 hours

Predicted numbers were used to prioritize municipalities on a scale from 1 to 5 (1 for the lowest predicted number of damaged houses, 5 for the highest). The map and data (HDX, 3) were shared in the humanitarian community and reviewed by a few organizations such as UN OCHA and the Shelter Cluster.
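One simple way to turn predicted numbers into a 1 to 5 scale is to rank municipalities and split the ranking into five equally sized groups. The sketch below does that; the equal-size split is an assumption made for illustration and is not necessarily how the published classes were derived, and the municipality labels and numbers are invented.

```python
def priority_classes(predicted_damage, n_classes=5):
    """Map each area to a class 1..n_classes by rank of predicted damage.

    Class 1 holds the lowest predictions, class n_classes the highest.
    """
    ranked = sorted(predicted_damage, key=predicted_damage.get)
    classes = {}
    for position, area in enumerate(ranked):
        classes[area] = position * n_classes // len(ranked) + 1
    return classes


# Invented predictions of damaged houses per municipality.
predicted = {"A": 12, "B": 340, "C": 95, "D": 1500, "E": 40,
             "F": 610, "G": 5, "H": 220, "I": 2900, "J": 75}

classes = priority_classes(predicted)
for area in sorted(classes, key=classes.get, reverse=True):
    print(area, classes[area])
```

Ranking makes the classes robust to a few extreme predictions, at the cost of hiding how large the gap between two classes is in absolute terms.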

Absolute damage to fill gaps in government counts

We used the model to fill gaps in the official counts of DSWD and NDRRMC. For this we included the official counts in the model and ran the model again to predict the values missing from the official data. This map was used by the Shelter Cluster to get a better overview of total house damage in the affected areas.

Performance and error in the prediction

Due to its nature, the regression model has difficulty predicting very low and very high damage. Because we do not know much about how damage counts are carried out in the Philippines, we cannot say whether the model has a high error on these outliers or actually predicts them fairly accurately.

Conclusions

From our work so far we can conclude that when data preparedness is done right, and disaster impact data is collected systematically after an event, it is possible to use machine learning techniques to build reliable damage predictions.

Although damage predictions by using data are not perfect, they are far more transparent than other prioritization methods, because the underlying data, assumptions and methodologies are shared openly.

While running and improving the model, we have made a few ‘discoveries’ that are worth mentioning:

  • The importance of poverty data seems to be overestimated in other Priority / Severity models out there. Our initial model showed 10% importance of poverty for all 4 typhoons. After adding wall and roof type data to the model, the importance of poverty dropped to 3%. This is also one piece of evidence that poverty is related to how people build.
  • It is essential to use features that are proportional to the population. Otherwise population is by far the most important feature in any model.
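The second point can be read as: express count-type features relative to population before training, so that the model does not simply learn "bigger municipality, more damage". A minimal sketch of that normalization follows; the field names and numbers are hypothetical.

```python
def to_proportional(record, count_fields, population_field="population"):
    """Return a copy of record with count fields replaced by per-capita values."""
    population = record[population_field]
    result = {key: value for key, value in record.items() if key not in count_fields}
    for field in count_fields:
        # Guard against a zero or missing population.
        result[field + "_per_capita"] = record[field] / population if population else None
    return result


# Invented municipality record with absolute counts.
raw = {"pcode": "PH0001", "population": 10000,
       "houses_damaged": 200, "light_roof_houses": 1500}

proportional = to_proportional(raw, ["houses_damaged", "light_roof_houses"])
print(proportional)
```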

Future steps

A complete roadmap to improve the prediction is available on our GitHub page. A few highlights are listed below.

To improve the performance, and reduce the error, of the prediction model we will try the following:

  • Add new baseline data, especially on vulnerability and coping capacity, through a community-level INFORM risk model (4).
  • Work more closely with in-country actors to get more complete data on building damage and people affected, and to better understand the data collection methodology.

To reduce the time needed to release a prediction on damage after a new typhoon:

  • Script all data collection, cleaning, aggregation and analysis steps end to end.
  • Reach agreements with data providers to get timely access to high-resolution wind speed, rain, earthquake intensity and flood extent data.

Scaling up

To scale up this work to other countries:

  • Data on impact (people affected, houses damaged and destroyed, casualties) should be available for a number of recent disasters and collected through the same methodology (standardized).
  • Baseline data on population, poverty and building materials should be available.
  • The above should be collected at the same administrative level and using the same administrative divisions (combining data from before and after a significant change in administrative divisions is hardly possible).

Due to differences in how early warning is organized and in how people build, the impact of events can differ widely between countries. It is therefore not advisable to run the Philippines model on another country without any historical data to validate it on.

Citations

  1. ACAPS, & CDAC Network. (2014). Assessing information & communication needs: A quick and easy guide for those working in humanitarian response (p. 10).
  2. Inter-Agency Standing Committee. (2010). IASC Guidelines Common Operational Datasets (CODs) in Disaster Preparedness and Response. Paper presented at the 77th IASC Working Group meeting.
  3. HDX. (n.d.). The Humanitarian Data Exchange. Retrieved June 9, 2016, from https://data.hdx.rwlabs.org/
  4. INFORM. (2016). INFORM Partners Severity Workshop: Towards a global severity index for humanitarian crises and disasters. JRC Ispra, Italy, 21 and 22 April 2016.

 
