Menu ENN Search

Effect of nutrition survey ‘cleaning criteria’ on estimates of malnutrition prevalence and disease burden: secondary data analysis

Summary of research1

Location: Global

What we know: Standardised methods for collection and reporting malnutrition prevalence data in nutrition surveys are used.

What this article adds: Methods for cleaning data are not standardised, vary and are rarely described in reports. A secondary analysis of 21 DHS datasets applying different cleaning criteria found a profound effect on the reported prevalence of both moderate and severe wasting. This varied by country and was most pronounced for severe wasting. Recommendations include mandatory reporting of cleaning criteria, urgent international consensus on optimal cleaning criteria and ‘real time’ validation of extreme variables made possible with electronic data collection.

Nutrition surveys, from which the all-important malnutrition prevalence statistics are derived, generally follow similar and standardised methodologies for the collection and reporting of data. However, ‘data cleaning criteria’ are rarely described in reports, yet are almost always applied to raw data. Their purpose is to exclude very high or low values which are more likely to represent measurement or data error than a truly very large or very small child. They are particularly useful when it is not possible to return to the field to review individual children directly. While there are many reasonable and justifiable ways to decide on the inclusion/exclusion of an individual child or individual measure of growth, there is no one gold standard which is applied in all settings and all situations. Even in the same context, there is every chance that different analysts may select different criteria. The effect of these different choices on the direction and magnitude of malnutrition estimates is currently unknown. A recent paper set out to quantify how different, commonly used data cleaning criteria affect nutrition survey wasting prevalence statistics.


The researchers performed secondary analysis of 21 national demographic and health survey (DHS) datasets, each with anthropometric data collected using standard DHS methods.  The datasets were chosen as they represent countries from the Lancet series with a high burden of disease. The total dataset has a reference population of 36 countries, which account for the majority of the global malnutrition disease burden. The 21 were those which had available nutrition surveys done in the last ten years. Each DHS survey size is large enough for robust national prevalence estimates. In total, the 21 DHS surveys comprised n = 216,841 children (after n = 38,136 records with missing age variables had been removed). DHS survey methods are well standardised, both within-country and between countries, with thorough data checking and processing procedures ensuring errors are a rarity. 

Weight-for-age (WAZ), height-for-age (HAZ) and weight-for-height (WHZ) Z-scores based on WHO growth standards had previously been calculated from weight, height/length, age, and sex variables using Emergency Nutrition Assessment (ENA) software, developed for the Standardised Monitoring and Assessment of Relief and Transitions (SMART) initiative. Any records with missing WAZ, HAZ, or WHZ were removed (n = 13,545). The mean WAZ, HAZ, and WHZ for each country were then calculated for children aged 6–59 months using the appropriate DHS sample weights.

Extreme value cut-offs

Extreme anthropometric values are considered more likely to represent measurement or database errors than an individual who is truly very small or very large. The cut-offs for defining extreme values depend on the data cleaning method adopted, which may be based on the mean Z-score of either the reference population (‘fixed criteria’) or the observed data (‘flexible criteria’).The study compared five methods that are currently in widespread use (see Table 1), three of which are ‘fixed’ criteria  and two ‘flexible. Note that the Z-score is defined by the standard deviation of the reference dataset rather than that of the sample in both the flexible and fixed methods. 

The study authors note that the World Health Organisation (WHO) recommends their flexible criteria are adopted when the observed mean Z-score is below −1.5, otherwise they recommend use of fixed criteria. While none of the 21 DHS surveys studied in this work had a mean WHZ less than −1.5, the study applied both fixed and flexible criteria to allow for comparison. It should also be noted that criteria were applied in a manner that replicates their implementation in commonly used nutritional survey software when calculating the prevalence of wasting. Therefore, when using WHO and SMART criteria, outliers were only excluded using WHZ thresholds. However, when using EpiInfo criteria, WHZ, WAZ, and HAZ thresholds were used to identify outliers. When using EpiInfo criteria, exclusions were also made on the basis of biological implausibility criteria, i.e. incompatible combinations of HAZ and WHZ, HAZ > 3.09 and WHZ < −3.09, or HAZ< −3.09 and WHZ>3.09.

Table 1: Cleaning criteria: five methods currently in use for cleaning survey data prior to calculation of malnutrition prevalence

Cleaning method

Statistical probability criteria 
Exclude if:

Biological plausibility criteria
Exclude if:

Reference mean

WHO (2006)

Growth standards

(WHO, 2006b)

HAZ < −6

HAZ > 6

WAZ < −6

WAZ > 5

WHZ < −5

WHZ > 5




SMART flags*

(SMART, 2013)

HAZ < −3

HAZ > 3

WAZ < −3

WAZ > 3

WHZ < −3

WHZ > 3




WHO 1995

Flexible criteria**

(WHO, 1995)

HAZ < −4

HAZ > 3

WAZ < −4

WAZ > 4

WHZ < −4

WHZ > 4




WHO 1995

Fixed criteria

(WHO, 1995)

HAZ < −5

HAZ > 3

WAZ < −5

WAZ > 5

WHZ < −4

WHZ > 5





(WHO, 2006a)

HAZ < −6

HAZ > 6

WAZ < −6

WAZ > 6

WHZ < −4

WHZ > 6

HAZ > 3.09 and WHZ < −3.09

HAZ < −3.09 and WHZ > 3.09



HAZ, Height-for-age z-score; WAZ, Weight-for-age z-score; WHZ, Weight-for-height z-score.

* The upper and lower values are flexible, i.e., can be increased based on judgment (WHO, 2006b).

** Recommended for use when the observed mean z-score is below 1.5 (WHO, 1995).


Analysis was performed using Stata version 12 (StataCorp., TX), using the appropriate sample weights defined by DHS. Wasting and severe wasting prevalence based on current case definitions were estimated for each country, excluding records according to each cleaning criteria in turn. The standard deviation of weight-for-height measurements, after the data were cleaned, was then calculated for each country. Country-level wasting prevalence was compared to the international ‘integrated food security phase classification’ (IPC), which is used to determine the severity of an emergency and guide the need for interventions (see Table 3).  The IPC serves here to demonstrate the extent of differences between cleaning criteria. To illustrate the implications for treatment programmes, the study estimated the caseload over a one year period for severe wasting (WHZ < −3) for each of the cleaning criteria in turn, using the formula: 

Caseload for severe acute malnutrition (SAM) = N ×P×K ×C where 

N is the size of the population in the programme area

P is the estimated prevalence of SAM

K is a correction factor to account for new cases over the one year period

C is the expected mean programme coverage over the one year period.


In the 21 surveys, there was a total sample size of 163,228 children aged 6 to 59 months. These were representative of a total estimated population of 211 million children. Samples varied relative to country size; India’s DHS survey had the largest sample (45,398 children) and Cote D’Ivoire the smallest (1,710 children). 

Figure 1 shows the percentage of records excluded from prevalence estimates on the basis of five different cleaning criteria, by country (for children aged 6–59 months). SMART criteria consistently exclude the most children and WHO-2006 criteria exclude the least. However, what difference this makes, in terms of the absolute and relative proportion of exclusions, varies markedly by country. In Burkina Faso, 4.1% were excluded by WHO-2006 and 12.5% were excluded by SMART, while in Turkey 0.3% were excluded by WHO-2006 and 1.3% by SMART.

Prevalence estimates for wasting and severe wasting of children 6–59 months are shown under different cleaning criteria, by country, in Figures 2 and 3 respectively.  In many countries, whilst the absolute prevalence figure varies according to cleaning criterion, the IPC category does not change. In other countries, the application of different cleaning criteria results in the crossing of a phase boundary and a different categorisation of ‘severity’. The proportional differences in severe wasting are greater than for total wasting.

Estimated clinical caseloads for SAM for every country under each cleaning criteria can be seen to vary greatly. For example, were no cleaning criteria applied for India, the country would have to plan for some 10.4 million SAM cases; with WHO-2006 criteria, for 8.7 million cases; and with SMART flags, for only 6.5 million cases. Large differences in estimated caseload are observed in many counties when the least inclusive cleaning criteria are compared with the most inclusive cleaning criteria. In nine countries, SMART flags exclude all potential cases of SAM.

Figure 4 shows prevalence estimates under the (least inclusive) SMART criteria plotted against the standard deviation of the WHZ distribution before data exclusion. The observed linear relationship helps to explain the differences seen; the wider the survey distribution (due to either lower survey data quality or the heterogeneity of the population), the more likely the extremes are to be excluded, hence the greater the difference in prevalence estimate. It is also important to note here that there may be children with incorrect anthropometric measurements that are within the plausible range and so are not removed by the cleaning criteria.

Conclusions and discussion

The results show that the application of different cleaning criteria has a profound effect on the reported prevalence of both moderate and severe wasting. The magnitude of effect varies markedly between different countries, and is most pronounced for severe wasting. This in turn has a marked effect on estimated programme caseloads. Since wasting prevalence is a key statistic but choice of cleaning criteria is not currently standardised, differences in practice between individual analysts could unduly influence the results that are made available to decision-makers. This may potentially lead to inconsistent, inefficient and inappropriate implementation of malnutrition treatment programmes. 

This is, to the authors’ knowledge, the first paper to highlight this important issue. However, the authors acknowledge study limitations.  First, the results are based on a relatively limited number of countries and it is important that the analysis is extended elsewhere, particularly to different types of datasets. DHS follow a standardised methodology and are mostly done in relatively stable environments. Surveys conducted in emergency settings, working under greater pressure and sometimes by more inexperienced teams, may have more issues with data quality, so cleaning criteria may have even more effect on final results—but this would need to be confirmed. Second, in the caseload calculations for therapeutic feeding, it has not been possible to take into account the prevalence of oedematous malnutrition, nor that defined by low mid upper arm circumference (MUAC). These independent criteria for SAM also account for programme admissions and are not influenced by choice of cleaning criteria. So where, for example, the prevalence of oedematous malnutrition is high, the weight-for-height caseload estimates only form a small part of the estimate of global acute malnutrition (GAM); even if wasting caseload doubles, it may not make a large overall contribution to caseload planning.

Another limitation is that the study has focused on the impact of anthropometric cleaning criteria on wasting, and has ignored stunting and underweight. Future work, however, should explore the effect of data cleaning on other forms of anthropometrically defined malnutrition. In some situations, the application of cleaning criteria may well impact heavily on estimates of child overweight/obesity. Effects on stunting (low height-for-age) prevalence are also important to explore. 

As well as inadvertent differences due to poor awareness of the effect of cleaning criteria, there is clearly also potential for deliberate “gaming”, by which inclusive or exclusive criteria are deliberately chosen to fit political agendas. 

Fourth, it is important to extend the analysis to other age groups. This study has focused on children aged 6–59 months since they are the main target group for therapeutic and supplementary feeding programmes. However, malnutrition can also be prevalent in older children (and even in adults in extreme situations) and infants aged <6 months. Cleaning criteria may have different effects on prevalence estimates in these other groups. 

The way forward

To address the problems raised in this study, the authors propose several solutions. The first is a call for mandatory reporting of which cleaning criteria were used so that results may be interpreted accordingly.  Any inter-survey or time trend differences can thus be accounted for as potentially due to/not due to (if the same criteria are used consistently) data cleaning practices. Building on this work, it may be possible to establish equations by which prevalence calculated using one cleaning method could be “transformed” to an estimate using another method. This would, however, be at best an approximation and would only be needed if raw datasets were unavailable for full re-analysis (the current momentum to open-source datasets as well as results would be helpful here). A third call is for urgent international consensus and guidance on selecting and adopting a single set of optimal cleaning criteria. This would improve the comparability of nutrition survey data (including trend data in the same setting) and the coherence of associated policy recommendations.

Finally, and for the longer term, the authors note the current trend to electronic data collection. This may be particularly useful for nutrition surveys as extreme data values could potentially be validated in the field, reducing or eliminating the need for data exclusion during analysis. The study concludes by repeating a call for greater awareness of cleaning criteria as an explanation of inter-survey differences in malnutrition prevalence.

Show footnotes

1Crowe S, Seal A, Grijalva-Eternod C, Kerac M. (2014). Effect of nutrition survey ‘cleaning criteria’ on estimates of malnutrition prevalence and disease burden: secondary data analysis. PeerJ2:e380 Full text available at:

More like this

en-net: SMART Vis-à-vis EPI info 6 on definition of Flags

Why SMART is customized very tight set within range mean -3 to +3? EPI Info 6 is within -5 to +5. SMART excludes a lot of children being considered as out of range. this is...

FEX: MUAC vs WHZ in predicting mortality in hospitalised children under five years of age

Summary of research1 This research contributes to the evidence base regarding which anthropometric indicators identify malnourished sick children most at risk of death. Low...

en-net: acute malnutrition in double burden contexts

Is there any ongoing discussion on which indexes could we use to assess acute malnutrition in contexts with high level of double burden of malnutrition? .Any alternative to...

FEX: Improving screening for malnourished children at high risk of death

Research snapshot1 The purpose of this study was to investigate whether children with concurrent wasting and stunting (WaSt) require therapeutic feeding and to better...

FEX: Relationships between wasting and stunting and their concurrent occurrence in Ghanaian pre-school children

Summary of research* Location: Ghana. What we know: Wasting is a short-term health issue, but repeated episodes may lead to stunting (long-term or chronic malnutrition). This...

en-net: SMART Flagging Criteria (Technical)

I tend to use the WHO "biological plausibility" flagging criteria with anthropometric data. These are very similar to the older CDC / NCHS flagging criteria. I have...

FEX: Risk factors associated with severe acute malnutrition in infants under six months in India: a cross sectional analysis

By Susan Thurstans Susan is a registered nurse and midwife with over 12 years' experience in maternal and child health and nutrition programmes in both development and...

en-net: Height-for-age assessment - min and max

Hello, I'm currently working on my Public Health master thesis and I am using the data collected from a community-based cluster randomized control trial, in Angola. The...

FEX: How do children with severe underweight and wasting respond to treatment?

View this article as a pdf This is a summary of the following paper: Odei Obeng-Amoako G, Stobaugh H, Wrottesley S et al (2023) How do children with severe underweight and...

FEX: Taller but thinner: trends in child anthropometry in Senegal 1990–2015

View this article as a pdf Research snapshot1 This article investigates trends in child anthropometry in Senegal between1990 and 2015, associating them with potential causes;...

FEX: A reflection on the 2021 Lancet Maternal & Child Nutrition Series through a WaSt lens

View this article as a pdf This article provides a summary of the Lancet Maternal & Child Nutrition Series to date, reflecting upon the 2021 series from the perspective of the...

en-net: cm cutoffs for outliers of MUAC measurements?

Hi Everyone, I am working on cleaning muac measurments and seeing some improbable high and low cm measurments. Does anyone have a research paper or text which specifies at what...

FEX: Review of nutrition and mortality indicators for Integrated Phase Classification

Summary of technical review1 The Integrated Phase Classification (IPC) Technical Working Group and the Standing Committee on Nutrition (SCN) Task Force on Assessment,...

FEX: Estimating ‘people in need’ from combined GAM in Afghanistan

View this article as a pdf Lisez cet article en français ici By Alexandra Humphreys, Bijoy Sarker, Baidar Bakht Habib, Anteneh Gebremichael Dobamo and Danka...

en-net: Maximum % of FLAG in SMART surveys

Good morning all, I would like to know what are the maximum % of SMART Flags for WHZ, HAZ and WAZ when analysing the quality of an assessment? In plausibility check report...

WaSt TIG - the work so far

We have had three phases of work thus far and are currently in the fourth. A special section in FEX summarises a lot of the work of the WaSt TIG so far as well as experiences...

Resource: Effects on child growth of a reduction in the general food distribution ration and provision of small-quantity lipid-based nutrient supplements in refugee camps in eastern Chad

Abstract Background We used the United Nations High Commissioner for Refugees Standardised Expanded Nutrition Survey data to evaluate the effect of a change in food ration...

FEX: Wasting and Stunting Technical Interest Group (WaSt TIG) meeting

On the 15th of January 2018 the Wasting and Stunting (WaSt) Technical Interest Group (TIG) held their third face-to-face meeting at Trinity College, Oxford. This group of 30...

Resource: Children who are both wasted and stunted are also underweight and have a high risk of death: a descriptive epidemiology of multiple anthropometric deficits using data from 51 countries

Abstract Background: Wasting and stunting are common. They are implicated in the deaths of almost two million children each year and account for over 12% of...

FEX: Piloting LQAS in Somaliland

By Tom Oguta, Grainne Moloney and Louise Masese Tom Oguta has been working with FAO/FSAU in the Nutrition Surveillance Project in Somalia as a Nutrition Project Officer for...


Reference this page

Effect of nutrition survey ‘cleaning criteria’ on estimates of malnutrition prevalence and disease burden: secondary data analysis. Field Exchange 49, March 2015. p15.



Download to a citation manager

The below files can be imported into your preferred reference management tool, most tools will allow you to manually import the RIS file. Endnote may required a specific filter file to be used.