A Simple Approach to Sample Size Calculation for Count Data in Matched Cohort Studies
DOI: https://doi.org/10.6000/1929-6029.2014.03.03.11
Keywords: Clustered Poisson data, Overdispersion, Subject heterogeneity, Statistical power, Sample size.
Abstract
In matched cohort studies, exposed and unexposed individuals are matched on certain characteristics to form clusters, which reduces potential confounding effects. Data in these studies are clustered and thus dependent because of the matching. When the outcome is a Poisson count, specialized methods have been proposed for sample size estimation. However, in practice the variance of the counts often exceeds the mean (i.e., the counts are overdispersed), so Poisson methods do not apply. We propose a simple approach for calculating statistical power and sample size for clustered Poisson data when the proportion of exposed subjects in a cluster is constant across clusters. We then extend the approach to clustered count data with overdispersion. We evaluate these approaches with simulation studies and apply them to a matched cohort study examining the association of parental depression with health care utilization. Simulation results show that the methods for estimating power and sample size performed reasonably well under the scenarios examined and were robust in the presence of mixed exposure proportions up to 30%.
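The overdispersion point in the abstract (variance of the counts exceeding the mean) can be illustrated with a small simulation. The sketch below is not the authors' method and does not reproduce their power formulas. It assumes a gamma-Poisson mixture as one common way to generate overdispersed counts, and the values of `mu`, `k`, and `n` are hypothetical choices made only for illustration. It compares the variance-to-mean ratio of pure Poisson counts (expected near 1) with that of the mixture (expected near 1 + k·mu under this assumed model).

```python
import math
import random
from statistics import mean, variance


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def gamma_poisson_draw(rng, mu, k):
    """Draw an overdispersed count: the rate is Gamma(shape=1/k, scale=mu*k),
    so the count has mean mu and variance mu + k * mu**2."""
    lam = rng.gammavariate(1.0 / k, mu * k)
    return poisson_draw(rng, lam)


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 for Poisson data."""
    return variance(counts) / mean(counts)


# Hypothetical settings, chosen only for illustration.
rng = random.Random(2014)
mu, k, n = 3.0, 0.5, 20000

poisson_counts = [poisson_draw(rng, mu) for _ in range(n)]
mixed_counts = [gamma_poisson_draw(rng, mu, k) for _ in range(n)]

print("Poisson variance/mean:", round(dispersion_ratio(poisson_counts), 2))
print("Gamma-Poisson variance/mean:", round(dispersion_ratio(mixed_counts), 2))
```

A ratio well above 1, as in the second case, is the situation in which a sample size computed under a plain Poisson assumption would tend to be too small.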
Copyright (c) 2014 Dexiang Gao, Gary K. Grunwald, Stanley Xu
This work is licensed under a Creative Commons Attribution 4.0 International License.