Assessment of Statistical Approaches to Model Low Count Data: An Empirical Application to Youth Delinquency

Authors

  • Taimoor Malik, Clinical Trials Unit, Dow University of Health Sciences, Karachi, Pakistan
  • Syed Arif Ali, Department of Research, Dow University of Health Sciences, Karachi, Pakistan
  • Abdur Rasheed, Department of Research, Dow University of Health Sciences, Karachi, Pakistan
  • Afaq Ahmed Siddiqui, Faculty of Pharmacy, University of Karachi, Karachi, Pakistan

DOI:

https://doi.org/10.6000/1929-6029.2015.04.03.6

Keywords:

Count Data, Poisson regression model, Negative Binomial Regression

Abstract

Objectives: This study aimed to identify risk factors associated with the number of crimes committed by youth aged 10-17 (youth delinquency) using ordinary least squares (OLS), the Poisson regression model (PRM), the negative binomial regression model (NBRM), and the zero-inflated negative binomial (ZINB) model, and to choose the most appropriate model for the observed count data.

Methodology: The data were collected from youth whose mothers were enrolled in the Philadelphia Collaborative Perinatal Project (CPP). School and delinquency records (between ages 10-17) were obtained by the Centre for Studies in Criminology and Criminal Law. The literature suggests that factors associated with child delinquency fall into four main groups: individual, family, school, and peer. Variables were therefore included in the analysis according to these groups.

Results: For OLS, the scatter plot of residuals versus estimated counts showed a definite pattern of heteroscedasticity (non-constant variance). The likelihood-ratio (LR) test of overdispersion yielded a significant p-value, implying that the outcome variable is overdispersed. The plot of the difference between the observed probabilities and the mean predicted probabilities for each model showed that PRM predicts low counts (0-2) poorly.
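The two checks described above (the LR test of overdispersion and the comparison of observed with predicted probabilities) can be sketched in a few lines. This is a minimal illustration, not the authors' analysis: it uses intercept-only models, so the mean predicted probability reduces to the fitted probability mass function; it assumes the NB2 parameterisation (variance = mu + alpha * mu^2); it halves the chi-square(1) p-value because alpha = 0 lies on the boundary of the parameter space; and the `counts` data and all function names are hypothetical.

```python
import math

def poisson_logpmf(y, mu):
    """Log-probability of count y under Poisson(mu)."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def nb2_logpmf(y, mu, alpha):
    """Log-probability of y under NB2 (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def fit_intercept_only(counts):
    """Return (mu, alpha, loglik_poisson, loglik_nb) for models with no covariates."""
    mu = sum(counts) / len(counts)  # MLE of the mean under both models
    ll_pois = sum(poisson_logpmf(y, mu) for y in counts)

    def ll_nb(log_alpha):
        a = math.exp(log_alpha)
        return sum(nb2_logpmf(y, mu, a) for y in counts)

    # Golden-section search over log(alpha) for the maximum NB log-likelihood.
    # Assumption: the profile log-likelihood has a single peak on this interval.
    lo, hi = math.log(1e-6), math.log(1e3)
    g = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        left, right = hi - g * (hi - lo), lo + g * (hi - lo)
        if ll_nb(left) < ll_nb(right):
            lo = left
        else:
            hi = right
    best = (lo + hi) / 2
    return mu, math.exp(best), ll_pois, ll_nb(best)

def lr_overdispersion_test(ll_pois, ll_nb):
    """LR statistic for H0: alpha = 0, with the boundary-corrected p-value."""
    lr = max(0.0, 2.0 * (ll_nb - ll_pois))
    if lr == 0.0:
        return lr, 1.0
    # Upper tail of chi-square(1) is erfc(sqrt(x / 2)); halve it for the boundary.
    return lr, 0.5 * math.erfc(math.sqrt(lr / 2.0))

def prob_differences(counts, mu, alpha, max_count=5):
    """Observed minus predicted probability for counts 0..max_count."""
    n = len(counts)
    rows = []
    for k in range(max_count + 1):
        observed = counts.count(k) / n
        rows.append((k,
                     observed - math.exp(poisson_logpmf(k, mu)),
                     observed - math.exp(nb2_logpmf(k, mu, alpha))))
    return rows

# Hypothetical low-count data with many zeros (NOT the study's data).
counts = [0] * 60 + [1] * 15 + [2] * 8 + [3] * 5 + [5] * 4 + [8] * 3 + [12] * 3 + [20] * 2

mu, alpha, ll_p, ll_n = fit_intercept_only(counts)
lr, p_value = lr_overdispersion_test(ll_p, ll_n)
diffs = prob_differences(counts, mu, alpha)

print(f"mean={mu:.2f} alpha={alpha:.2f} LR={lr:.1f} p={p_value:.3g}")
for k, d_pois, d_nb in diffs:
    print(f"count {k}: observed - Poisson = {d_pois:+.3f}, observed - NB = {d_nb:+.3f}")
```

With covariates, the same comparison would average each observation's predicted probabilities before subtracting them from the observed proportions; a statistical package would normally handle that fitting step.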

Conclusion: NBRM and ZINB both performed well; however, fit statistics indicated that NBRM gave closer predictions than ZINB. NB modeling techniques provide more compelling and accurate results than the basic PRM or simple linear or log-linear modeling techniques.
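The abstract does not name the fit statistics used. As one common possibility, the sketch below computes the Akaike and Bayesian information criteria (AIC and BIC) from a model's log-likelihood, where lower values indicate a better trade-off between fit and complexity. The log-likelihoods, parameter counts, and sample size are invented purely to show the arithmetic and are not results from the study.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood."""
    return n_params * math.log(n_obs) - 2 * loglik

# Hypothetical values for illustration only (NOT taken from the article).
n_obs = 500
models = {
    "PRM": {"loglik": -310.0, "n_params": 5},
    "NBRM": {"loglik": -250.0, "n_params": 6},   # adds the dispersion parameter
    "ZINB": {"loglik": -249.2, "n_params": 12},  # adds a zero-inflation equation
}

scores = {
    name: (aic(m["loglik"], m["n_params"]), bic(m["loglik"], m["n_params"], n_obs))
    for name, m in models.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in scores.items():
    print(f"{name}: AIC={a:.1f} BIC={b:.1f}")
print("lowest AIC:", best_by_aic, "| lowest BIC:", best_by_bic)
```

Because a zero-inflated model estimates a second set of coefficients, it must improve the log-likelihood enough to offset the penalty for its extra parameters before these criteria favour it.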

Published

2015-08-18

How to Cite

Malik, T., Ali, S. A., Rasheed, A., & Siddiqui, A. A. (2015). Assessment of Statistical Approaches to Model Low Count Data: An Empirical Application to Youth Delinquency. International Journal of Statistics in Medical Research, 4(3), 282–286. https://doi.org/10.6000/1929-6029.2015.04.03.6

Section

General Articles