Lifting people out of poverty has long been a major goal of both national policy-making and international development. Technological shifts over the last two decades have created an opportunity to expand the toolkit for monitoring poverty with passively collected data covering large parts of the population. In this project, led by United Nations Development Programme Sudan, the Telecommunication and Postal Regulation Authority of Sudan and the Central Bureau of Statistics of Sudan under the Big Data 4 SDGs initiative, we augmented survey data with anonymized mobile phone metadata provided by a major Sudanese mobile network operator to create proxy indicators for multidimensional poverty in the state of Al Gezira, Sudan. The study provides a first informed look at whether mobile phone metadata can be used (more widely) to inform poverty interventions in Sudan.
Spending capacity has a strong impact on how people use their phone
One of the central promises of using mobile phone metadata in official statistics is its ability to extend the analysis to finer geographical levels. The methodology used to derive fine-grained proxy indicators for multidimensional poverty builds on the concept of small area estimation: survey estimates can be refined by taking auxiliary data (in this case call detail records, or CDRs) into account. Intuition suggests that people who have little money to spend weigh their mobile phone decisions more carefully. The goal of this pilot study was to identify patterns in CDRs that relate to multidimensional poverty by creating features that capture such behaviours, and to use those features to build fine-grained proxy indicators.
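To make the idea of behavioural features more concrete, here is a minimal sketch of how call records could be aggregated into per-area features. It is an illustration only: the record layout, the working-hours window and the feature names are assumptions made for this example, and the values are invented. It does not reproduce the features used in the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, simplified CDR rows: (area, hour_of_day, duration_seconds).
# Real CDR schemas differ; all values here are made up for illustration.
cdrs = [
    ("A", 9, 60), ("A", 10, 30), ("A", 20, 90), ("A", 14, 20),
    ("B", 21, 120), ("B", 22, 200), ("B", 11, 40), ("B", 23, 40),
]

WORK_HOURS = range(8, 17)  # assumed definition of "working hours" (08:00-16:59)


def area_features(rows):
    """Aggregate individual call records into one feature dict per area."""
    by_area = defaultdict(list)
    for area, hour, duration in rows:
        by_area[area].append((hour, duration))

    features = {}
    for area, calls in by_area.items():
        features[area] = {
            "n_calls": len(calls),
            # Share of calls placed during the assumed working hours.
            "work_hour_share": mean(1.0 if h in WORK_HOURS else 0.0 for h, _ in calls),
            # Average call length in seconds.
            "mean_duration_s": mean(d for _, d in calls),
        }
    return features


features = area_features(cdrs)
for area, values in sorted(features.items()):
    print(area, values)
```

In a small area estimation setting, features of this kind would serve as the auxiliary variables in a model fitted against the survey estimates.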
Deepening spatial granularity
Proxy indicators for the multidimensional poverty index (MPI) built from the anonymized CDRs for the eight localities of Al Gezira closely match the representative survey estimates (Figure 1).
Figure 1: Proxy indicator for multidimensional poverty on locality level
Intriguingly, the strongest predictors were behavioural indicators that capture work-related patterns. To further exploit the spatial granularity of CDR information, we estimated proxy indicators at three different levels (locality, administrative unit (AU) and approximated antenna coverage area) to paint a more nuanced picture of multidimensional poverty. Evaluated against their locality-level aggregates, the results appear plausible (Figure 2, with estimated antenna coverage areas on the left and administrative units on the right). Correlations with the survey MPI are high, both in value and rank.
Figure 2: Proxy indicator for multidimensional poverty (model built on locality level)
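Correlation "in value and rank" is commonly measured with the Pearson and Spearman coefficients. The sketch below shows how both could be computed for a set of areas. The numbers are invented for illustration and are not the study's results, and the helper names are assumptions of this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: linear agreement between the values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented values for eight areas (not the study's data).
survey_mpi = [0.10, 0.15, 0.22, 0.30, 0.35, 0.41, 0.48, 0.55]
proxy_mpi = [0.12, 0.14, 0.25, 0.28, 0.37, 0.39, 0.50, 0.52]

print("value correlation:", round(pearson(survey_mpi, proxy_mpi), 3))
print("rank correlation: ", round(spearman(survey_mpi, proxy_mpi), 3))
```

The rank correlation matters when the indicator is used to prioritise areas: it checks whether the proxy orders areas from least to most deprived in the same way the survey does, regardless of the exact values.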
However, direct evaluation at lower spatial levels was not possible due to the lack of fine-grained ground-truth data. In conclusion, CDR features, especially those describing work-related behaviour, show a high correlation with the MPI. As a proof of concept, we compared the model with an otherwise identical one built on randomly generated CDRs: overall model performance is significantly higher when actual CDRs are incorporated.
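The comparison against randomly generated data can be thought of as a placebo test. The following sketch illustrates the general idea with a single feature and a one-variable least-squares fit. The data are invented, the seed and number of repetitions are arbitrary, and the study's actual model and evaluation procedure may differ.

```python
import random


def r_squared(x, y):
    """R^2 of a one-feature least-squares fit (equals the squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Invented values for eight areas (not the study's data).
survey_mpi = [0.10, 0.15, 0.22, 0.30, 0.35, 0.41, 0.48, 0.55]
real_feature = [0.80, 0.70, 0.65, 0.50, 0.45, 0.40, 0.30, 0.20]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Fit the same model many times on random "placebo" features.
placebo_scores = [
    r_squared([rng.random() for _ in survey_mpi], survey_mpi)
    for _ in range(200)
]

r2_real = r_squared(real_feature, survey_mpi)
# Share of placebo fits that do at least as well as the real feature.
share_beaten = sum(s >= r2_real for s in placebo_scores) / len(placebo_scores)

print("R^2 with real feature:", round(r2_real, 3))
print("share of placebo fits at least as good:", share_beaten)
```

If random inputs rarely match the fit obtained from the real feature, the relationship is unlikely to be an artefact of the modelling procedure alone.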