Research Portfolio Post #6: Quantitative Data

I am proposing to research female infanticide because I want to find out what explains the prevalence of son preference during fertility declines, in order to help my reader understand how women’s decisions to commit infanticide can be reinforced.

Q: What explains variations in son preference across certain family structures during declining fertility rates?

India’s 2011 Census, conducted by the Government of India, reports data at both the country and regional level. It features demographic statistics such as fertility rates, birth rates, income, and female and male population, and it also includes statistics on social factors such as religion and households. The key data set I found, and the one most relevant to include in my RD, is Household Composition and Size in India. [1]

Within this data set, family composition is categorized by family structure, such as “Single person household, Nuclear household, Sub-Nuclear household, Supplemented nuclear household, Broken extended household, Joint household, and Others”. [2] In evaluating this data set, I would like to use the social dynamics of households, and more specifically family structures, to explain variance in son preference: not only why the preference exists, but also why it is more prevalent within certain family structures than others. I also wish to evaluate the relationship between family structure and son preference, and possibly to look at how other scholars identify and define family structures and how my own labels might differ from theirs.

I would operationalize this data set using a nominal measurement of 0 or 1, indicating whether son preference is absent or present in a given family structure. Additionally, it may be reasonable to code each family structure type from 1-8 based on its category, and then to record the degree of preference (stronger or weaker) within each family structure type.
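
The coding scheme described above can be sketched roughly as follows. This is only an illustration under my own assumptions: the names `HOUSEHOLD_TYPES`, `TYPE_CODE`, and `code_household` are hypothetical and do not come from the Census data, and because the quoted list contains seven categories, this sketch produces codes 1 through 7 (an eighth category would have to be added by hand).

```python
# Household categories as quoted from the 2011 Census table.
HOUSEHOLD_TYPES = [
    "Single person household",
    "Nuclear household",
    "Sub-Nuclear household",
    "Supplemented nuclear household",
    "Broken extended household",
    "Joint household",
    "Others",
]

# Nominal codes: the integers are labels only and imply no ranking.
TYPE_CODE = {name: i for i, name in enumerate(HOUSEHOLD_TYPES, start=1)}


def code_household(household_type: str, son_preference_present: bool) -> tuple:
    """Return (category code, 0/1 indicator) for one observation.

    How "son preference present" would actually be judged is left open;
    this function only records the result of that judgment.
    """
    return TYPE_CODE[household_type], int(son_preference_present)
```

For example, `code_household("Joint household", True)` would record that household type with the indicator set to 1. Since the category codes are nominal, they should only be used for grouping, not for arithmetic.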

One limitation of this data set may lie in its assumptions about household types, which may in turn reflect regional variation rather than variation in son preference. It will be important to separate the two to avoid drawing unsupported conclusions from the data. In creating my own data set, I would need to distinguish between how family structures are conventionally defined and organized and how I intend to label and measure them.


[1]  “Households by composition and size – 2011”. Office of the Registrar General & Census Commissioner (New Delhi, India: Government of India, 2011).

[2] Ibid. 



“Households by composition and size – 2011”. Office of the Registrar General & Census Commissioner (New Delhi, India: Government of India, 2011). (Accessed October 10, 2019).

2 thoughts to “Research Portfolio Post #6: Quantitative Data”

  1. I totally agree with your operationalizing of the family structure as a nominal variable and then assigning them numbers 1 to 8. I would be interested to see if you could find other ways of operationalizing son preference. One idea I thought of was – for each family structure – determine a ratio of the number of sons per family/the number of daughters per family. For instance, if you looked at a survey of a hundred nuclear households and across all those households there were 120 sons and 100 daughters, you might have a ratio of 1.2. That ratio might be different than the ratio for a hundred sub-nuclear households or joint households. Then you could run one of the statistical tests to see if there was a significant difference in those ratios.

  2. Lizzie — the data source that you mention here is clearly relevant for your topic area and you explore some good thoughts on how to operationalize different variables in this post. Remember that for the DV in a quantitative project we really want interval/ratio indicators if at all possible. How might you operationalize the DV of “son preference” in an interval/ratio indicator? Evan offers you some good thoughts on this question.

    You’ll also want to keep thinking about cases and case selection as you consider this data source and other potential data sources for your project. What is the unit of analysis that you are proposing to analyze (cases = countries? regions within a country? individuals/households?)? You note that the census of India provides data on the country and regional level. At the country level, you’d have n=1 and would, of course, have to find similar data sources that provide similar information for a range of other countries. If you select regions, then n = however many regional units are in the data sources, and that might give you enough cases (at least 15) for meaningful comparison in this methodology, and it might not. Individual or household level data would certainly provide you with enough cases, but then the question is whether there is enough data at this unit of analysis for all of the relevant variables that you might consider. Make sure to keep thinking about these questions as you continue your research!
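
The sons-to-daughters ratio suggested in the first comment could be computed along these lines. This is a rough sketch, not a finished analysis: only the nuclear household counts (120 sons, 100 daughters) come from the comment's example, while the joint household counts are invented placeholders, and the function names are hypothetical. The chi-square statistic is one possible choice for the "statistical test" the comment mentions; it is written out by hand here so the sketch needs no extra libraries.

```python
def sex_ratio(sons: int, daughters: int) -> float:
    """Sons per daughter, an interval/ratio indicator of son preference."""
    if daughters == 0:
        raise ValueError("ratio is undefined when there are no daughters")
    return sons / daughters


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# (sons, daughters) per household type.
counts = {
    "Nuclear household": (120, 100),  # from the comment's example
    "Joint household": (130, 100),    # HYPOTHETICAL placeholder, not real data
}

ratios = {name: sex_ratio(s, d) for name, (s, d) in counts.items()}
stat, df = chi_square([list(pair) for pair in counts.values()])
```

Here `ratios["Nuclear household"]` is 1.2, matching the comment's example. The statistic in `stat` would then be compared against a chi-square critical value for `df` degrees of freedom (or passed through a statistics package) to judge whether the ratios differ significantly across household types.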
