The DHS Program User Forum      

Discussions regarding The DHS Program data and results
Which data file should I use for nutrition outcomes of children under 5 years of age [message #347] Sun, 21 April 2013 00:47
m.hasan1
Messages: 4
Registered: April 2013
Location: Brisbane, Australia
Member

Hello,

I would like to have some advice on the following issues:

(1) If I want to run a regression model to examine the effect of different determinants on nutrition outcomes of children under 5 years of age, can I use only the children's data file (BDKR file), or do I have to use/merge other files such as the births data file?

(2) What is the standard method of dealing with flagged cases? Is it important to include flagged cases in the analysis?

Re: Which data file should I use for nutrition outcomes of children under 5 years of age [message #375 is a reply to message #347] Tue, 30 April 2013 21:12
Reduced-For(u)m
Messages: 90
Registered: March 2013
Senior Member

I think a DHS mod will help here, but I have an ulterior motive...

1) I use the child recode too, but you may have to use the person or household member recode to get the exact numbers the DHS calculates. There is a post here with instructions and code: http://userforum.measuredhs.com/index.php?t=tree&th=137&goto=262&#msg_262

2) Which flags? Age flags? I would consider dropping those, since HAZ is computed including age, but there shouldn't be many. Or do you mean the 9999s or whatever you get when there is an invalid HAZ? Definitely drop those (they aren't Z-scores). Or are there some other flags you are curious about?

Unasked:

3) There are 2 different HAZs coded in the newer DHS rounds - the old CDC standards and the new WHO ones...the new WHO ones are generally better, but maybe not comparable to other studies done before 2007 or so.

4) Which determinants are you trying to estimate? This is totally a selfish question because I'm working on a methodology paper about estimating these, and you can get into some trouble estimating the determinants of HAZ if you use time-varying regressors...
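
On point 2, dropping the invalid HAZ codes before analysis could look something like the sketch below. It is only an illustration: the column name `hw70` (assumed here to hold the WHO height-for-age z-score multiplied by 100), the cutoff of 9990 for special codes, and the sample records are all assumptions, so check the recode manual for your survey before relying on any of them.

```python
# Hypothetical cutoff: values at or above this are treated as flags, not z-scores.
SPECIAL_CODE_CUTOFF = 9990


def valid_haz(records, column="hw70"):
    """Split records into (kept, dropped); kept rows gain a 'haz' z-score."""
    kept, dropped = [], []
    for row in records:
        raw = row.get(column)
        if raw is None or raw >= SPECIAL_CODE_CUTOFF:
            dropped.append(row)                       # missing or flagged value
        else:
            kept.append({**row, "haz": raw / 100.0})  # assumed stored as z * 100
    return kept, dropped


# Made-up records for illustration only.
children = [
    {"id": 1, "hw70": -215},
    {"id": 2, "hw70": 9998},
    {"id": 3, "hw70": 40},
    {"id": 4, "hw70": None},
]
kept, dropped = valid_haz(children)
print([c["haz"] for c in kept])
print(len(dropped), "records dropped")
```

Keeping the dropped rows in a separate list makes it easy to report how many children were excluded and to check whether exclusion is related to the determinants of interest.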
Re: Which data file should I use for nutrition outcomes of children under 5 years of age [message #385 is a reply to message #347] Thu, 02 May 2013 17:51
Liz-DHS
Messages: 257
Registered: February 2013
Senior Member
Dear User,
For guidance on calculating various indicators, please take a look at the Guide to DHS Statistics http://www.measuredhs.com/publications/publication-dhsg1-dhs-questionnaires-and-manuals.cfm and the standard recode manual http://www.measuredhs.com/publications/publication-dhsg4-dhs-questionnaires-and-manuals.cfm
Re: Which data file should I use for nutrition outcomes of children under 5 years of age [message #424 is a reply to message #375] Mon, 13 May 2013 02:50
m.hasan1

Dear Member,

Thank you very much for your advice. The information available in the link is very useful.

I am still doing a literature review to understand which variables should be in the model. But I think I may include maternal factors, household demographic factors and child-related factors.

I am not sure whether it is possible to estimate the effects of a time varying regressor when one is dealing with a single DHS data set given that DHS is cross-sectional.....
Re: Which data file should I use for nutrition outcomes of children under 5 years of age [message #427 is a reply to message #424] Mon, 13 May 2013 17:26
Reduced-For(u)m

Hi,

Glad I could help a little. Estimating cross-sectional determinants of child HAZ or time-invariant ones is certainly easier than trying to estimate cohort-based determinants (what I meant by time-variant), but it still requires, in my opinion, a bit more care than some people give it.

In particular, I worry most that people don't sufficiently worry about the distribution of child age-at-measurement across their explanatory variables of interest. I think we say "this is age adjusted height, so age shouldn't be a big predictor", but if you collapse HAZ by age-in-months, and graph it out, you'll realize how important age-at-measurement actually is in DHS countries...because HAZ is a cumulative measure of health/nutrition up until age-at-measurement, older kids have had a lot more time to "lose" HAZ relative to well-nourished children in the reference group.
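
Collapsing HAZ by age-in-months can be sketched roughly as follows. The (age, HAZ) pairs are made up purely for illustration and the function name is hypothetical; with real data you would feed in the cleaned z-scores and then plot the resulting profile.

```python
from collections import defaultdict
from statistics import mean


def mean_haz_by_age(records):
    """Collapse HAZ to its mean at each age in months, sorted by age."""
    by_age = defaultdict(list)
    for age_months, haz in records:
        by_age[age_months].append(haz)
    return {age: mean(values) for age, values in sorted(by_age.items())}


# Hypothetical (age_in_months, HAZ) pairs, not real survey values.
sample = [(3, -0.2), (3, -0.4), (12, -1.0), (12, -1.4), (30, -1.9), (30, -2.1)]
profile = mean_haz_by_age(sample)
for age, avg in profile.items():
    print(f"{age:>2} months: mean HAZ {avg:+.2f}")
```

Plotting `profile` against age is what shows whether the age pattern is anywhere near linear.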

Just a couple of things to keep in mind:

1) If estimating time-invariant factors (say, rural born or maternal age at birth), make sure that the distributions of child age are similar across X (so, if X is "rural born", overlay a histogram or kernel-density plot of ages for rural and urban born children, and see if they match).

2) If you are using "time semi-variant" things like, say, Asset Quintile, you might have a more pronounced problem in that older parents tend to have both more assets and older children (this could bias your estimates of asset effect downward).

3) If you are using "cohort" variables, such as "drought exposure in-utero", you have to be super-duper careful, because some drought year where lots of kids are exposed will be correlated with some age-at-measurement, and thus induce a spurious HAZ-drought association that is driven by a drought/age-at-measurement association.
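
A numeric version of the age-distribution check in point 1 might look like the sketch below. It compares the share of children in each year-of-age bin across two groups; the ages, the group labels, and the choice of yearly bins are all assumptions for illustration, and a histogram or kernel-density overlay shows the same thing visually.

```python
from collections import Counter


def age_shares(ages_months, max_age=60, bin_width=12):
    """Share of children in each age bin (default: one bin per year of age)."""
    n_bins = max_age // bin_width
    counts = Counter(min(a // bin_width, n_bins - 1) for a in ages_months)
    total = len(ages_months)
    return [counts.get(b, 0) / total for b in range(n_bins)]


# Hypothetical ages in months for two groups of children.
rural = [2, 8, 14, 20, 26, 33, 40, 47, 52, 58]
urban = [1, 5, 9, 11, 15, 22, 28, 35, 44, 50]

rural_shares = age_shares(rural)
urban_shares = age_shares(urban)
largest_gap = max(abs(r - u) for r, u in zip(rural_shares, urban_shares))

print("rural:", rural_shares)
print("urban:", urban_shares)
print("largest gap in bin shares:", round(largest_gap, 3))
```

A large gap in any bin is a hint that the coefficient on X may be picking up age-at-measurement rather than X itself.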

The gist is that most people include a linear control for age-in-months, and then write "age is a strong predictor of HAZ", which is true, but almost misses the point. Age is THE best predictor of HAZ in a lot of countries, but it is decidedly non-linear, and the model misspecification error (because age is specified erroneously as linear) is often correlated with age in such a way that any covariate that just happens to be associated with child age will pick up the misspecification error and attribute it to the covariate.

I find that in things like estimating effects of maternal age this affects coefficient estimates just a little bit. In things like in-utero/birth-year economic/health environment (cohort stuff), this affects estimates a whole lot. In between...I don't know, depends on the situation.

So... If you feel like it, once you get your list of determinants down, estimate it a few ways, by specifying age as linear, quadratic, a spline with knots at each age-in-years, and dummy variables for each age in months, and then post the coefficient estimates on a few of the key determinants of interest for each specification. We can see what kind of difference it makes to your estimates.
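
To make that comparison concrete, here is a minimal sketch on simulated data (not DHS data). Everything in it is an assumption for illustration: the shape of the age profile, the covariate `x` (built to have no true effect on HAZ but to be more common among 12 to 35 month olds), the sample size, and the knot locations. It fits plain OLS with NumPy under the four age specifications and reports the coefficient on `x` each time.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
age = rng.integers(0, 60, size=n)  # age in months, 0..59

# Assumed "truth": HAZ falls until 24 months, then flattens, plus noise.
haz = -2.0 * np.minimum(age, 24) / 24 + rng.normal(0.0, 1.0, size=n)

# Covariate with NO true effect on HAZ, but more common at ages 12-35 months.
x = (rng.random(n) < np.where((age >= 12) & (age < 36), 0.7, 0.3)).astype(float)


def coef_on_x(age_terms):
    """OLS of HAZ on the given age terms plus x; returns the coefficient on x."""
    design = np.column_stack([age_terms, x])
    beta, *_ = np.linalg.lstsq(design, haz, rcond=None)
    return beta[-1]


ones = np.ones(n)
knots = [12, 24, 36, 48]  # one knot per year of age
specs = {
    "linear": np.column_stack([ones, age]),
    "quadratic": np.column_stack([ones, age, age**2]),
    "spline (yearly knots)": np.column_stack(
        [ones, age] + [np.maximum(age - k, 0) for k in knots]
    ),
    "month dummies": np.eye(60)[age],  # one indicator per age in months
}
results = {name: coef_on_x(terms) for name, terms in specs.items()}
for name, b in results.items():
    print(f"{name:>22}: coefficient on x = {b:+.3f}")
```

Because `x` has no effect by construction, any non-zero coefficient under the linear specification comes purely from the misspecified age control, which is the mechanism described above. The more flexible age controls should pull the estimate back toward zero.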

Sorry...I'm almost done with a paper on this, and so I talk a lot about it.

Best,
j