2013, Number 3
Rev Cub Gen 2013; 7 (3)
Methodology for the selection of optimal cutoff point to dichotomize continuous covariates
Fuentes SLE
Language: Spanish
References: 25
Page: 36-42
ABSTRACT
Categorizing continuous covariates is common practice in medical and epidemiological research, for both clinical and statistical reasons. This work focuses only on dichotomizing a continuous covariate, since that is one of the most frequently sought objectives from a biological point of view, although more than one cut point may exist over the range of a continuous variable. To define the candidate cut points, the following methodology is proposed: 1) graphical representation, 2) calculation of quantiles, 3) determination of the chi-square statistic and the maximum odds ratio in the 2x2 contingency table, and 4) logistic regression. Once the candidate cut points have been determined, the optimal cut point is selected using statistics such as sensitivity, specificity and Youden's index, together with the ROC curve and the area under the curve (AUC) as an index of accuracy. Sensitivity, specificity and AUC are sample estimators of population parameters; each therefore carries an estimation error, which makes it necessary to report the corresponding confidence intervals. Since dichotomizing a continuous covariate is a widespread and frequent practice in biostatistics and epidemiology, this work proposes a practical methodology for obtaining the optimal cut point.
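The selection step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the data are invented, the candidate cut points are hypothetical stand-ins for values that steps 1 to 4 might produce, the rule "value >= cutoff is positive" is an assumption, and the Wilson score interval is only one possible choice of confidence interval, since the abstract does not name a method.

```python
from math import sqrt

def sens_spec(values, outcomes, cutoff):
    """Sensitivity and specificity, assuming 'value >= cutoff' is classed as positive."""
    pairs = list(zip(values, outcomes))
    tp = sum(1 for v, y in pairs if y == 1 and v >= cutoff)
    fn = sum(1 for v, y in pairs if y == 1 and v < cutoff)
    tn = sum(1 for v, y in pairs if y == 0 and v < cutoff)
    fp = sum(1 for v, y in pairs if y == 0 and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff(values, outcomes, candidates):
    """Return (cutoff, J) for the candidate maximizing Youden's J = sens + spec - 1."""
    scored = []
    for c in candidates:
        se, sp = sens_spec(values, outcomes, c)
        scored.append((c, se + sp - 1))
    return max(scored, key=lambda t: t[1])

def auc(values, outcomes):
    """AUC as the share of case/non-case pairs in which the case has the
    higher value (ties count one half)."""
    pos = [v for v, y in zip(values, outcomes) if y == 1]
    neg = [v for v, y in zip(values, outcomes) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def wilson_ci(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (assumed method)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Invented illustrative data: covariate values and binary outcome (1 = case).
values = [2.1, 3.4, 3.9, 4.2, 4.8, 5.1, 5.5, 6.0, 6.3, 7.2]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
candidates = [3.9, 4.8, 6.0]  # hypothetical candidate cut points

cutoff, j = best_cutoff(values, outcomes, candidates)
se, sp = sens_spec(values, outcomes, cutoff)
n_cases = sum(outcomes)
ci_low, ci_high = wilson_ci(round(se * n_cases), n_cases)

print("selected cutoff:", cutoff, "Youden's J:", round(j, 3))
print("sensitivity:", se, "specificity:", sp)
print("sensitivity CI:", round(ci_low, 3), round(ci_high, 3))
print("AUC:", round(auc(values, outcomes), 3))
```

With so few observations the interval for sensitivity is wide, which illustrates why the abstract asks for interval estimates alongside the point estimates.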