GUMBEL DISTRIBUTION MODEL FOR THE BREAST CANCER SURVIVAL DATA USING MAXIMUM LIKELIHOOD METHOD

KHAN K.H.1*
1Department of Mathematics, College of Science and Humanities, Salman Bin Abdulaziz University, Al-Kharj, Kingdom of Saudi Arabia
* Corresponding Author : kizarkhan@yahoo.com

Received : 11-03-2012     Accepted : 09-04-2012     Published : 12-04-2012
Volume : 3     Issue : 1       Pages : 74 - 77
J Stat Math 3.1 (2012):74-77

Cite - MLA : KHAN K.H. "GUMBEL DISTRIBUTION MODEL FOR THE BREAST CANCER SURVIVAL DATA USING MAXIMUM LIKELIHOOD METHOD." Journal of Statistics and Mathematics 3.1 (2012):74-77.


Copyright : © 2012, KHAN K.H., Published by Bioinfo Publications. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

Abstract

Survival rate estimates are considered for censored data on 254 breast cancer patients. The patients [10] were treated at the chemotherapy department, Bradford Royal Infirmary, over a period of ten years. In this paper the Gumbel probability distribution model (see [3], [4], [5]) is used to obtain the survival rates of the patients. The maximum likelihood method [9] is applied through an unconstrained optimization method [12, 13] (DFP, Davidon-Fletcher-Powell) to find the parameter estimates and the variance-covariance matrix for the Gumbel distribution model. Finally, the survivor rate estimates from the parametric (Gumbel) probability model are compared with those from the non-parametric (Kaplan-Meier) method [7].

Keywords

Gumbel distribution model, Censoring, Breast Cancer Data sets, DFP-unconstrained optimization method, Maximum likelihood function and Kaplan-Meier survivor rate estimates.

Introduction

Breast cancer is a systemic disease [1,10,15] until proved otherwise. When treatment is stopped, the disease progresses with uniform 'velocity' v through a fixed 'distance' d to the point of recurrence. In this paper, we find the parameter estimates, survival rate estimates and variance-covariance matrix for the Gumbel probability distribution model by applying the maximum likelihood method to the breast cancer data [14]. For the survival of patients with breast cancer, a statistical approach is considered, which is based on two parameters, referred to as the scale and shape parameters of the distribution. Further work on the probabilistic approach has been done by Khan, K.H. [14] using the Inverse Gaussian distribution model. The survivor rate estimates for the Gumbel probability distribution are also compared with those of the nonparametric model [7].

The Gumbel Model and Estimation of Parameters

Survival data generally fall into two classes: (i) the failure times of items that actually fail during the experiment, and (ii) the survival times of items that survive the experiment.
These classes are generally distinguished statistically by the use of censoring; for details see Cox and Oakes [11]. In parametric models the pdf of the lifetime T has the form f(t; θ) with survival function S(t; θ), where θ is a vector of parameters. The contribution to the likelihood of an item that fails at time t is f(t; θ), and that of an item that survives beyond time t is S(t; θ). Thus, following Lawless [8] and using the Gumbel distribution model, the likelihood function when time is divided into intervals is given as



where NG, f_i, N and F are the number of recurrence groups, the number of failures (recurrences) in the ith year, the sample size and the total number of recurrences in 10 years, respectively.
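As a sketch of what a grouped-data likelihood of this kind typically looks like, consider yearly interval endpoints t_0 = 0 < t_1 < ... < t_NG. The symbols t_i, f_i and θ, and the exact form shown, are assumptions based on the general treatment of interval-grouped, right-censored data (see Lawless [8]); they are not a transcription of the paper's own equation.

```latex
\begin{align}
L(\theta) &= \prod_{i=1}^{\mathit{NG}}
    \bigl[\, S(t_{i-1};\theta) - S(t_i;\theta) \,\bigr]^{f_i}
    \times \bigl[\, S(t_{\mathit{NG}};\theta) \,\bigr]^{N-F},
    \qquad F = \sum_{i=1}^{\mathit{NG}} f_i , \\
\log L(\theta) &= \sum_{i=1}^{\mathit{NG}} f_i
    \log\bigl[\, S(t_{i-1};\theta) - S(t_i;\theta) \,\bigr]
    + (N-F)\,\log S(t_{\mathit{NG}};\theta).
\end{align}
```

In this form, each of the f_i patients whose recurrence falls in year i contributes the probability of failing within that interval, and each of the N - F patients still recurrence-free at the end of follow-up contributes the survivor probability at the final time point.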
The maximum likelihood estimates can be obtained from the log-likelihood function. Since the probability of no failure until time t is given by the survivor function S(t; θ), the log-likelihood function can be written as



To find the parameter estimates we used the unconstrained optimization method (DFP) developed by Davidon and amended by Fletcher and Powell (see [12], [13]). The DFP method, a quasi-Newton method, is an iterative method that minimizes the objective function and requires only first partial derivatives in addition to the function values. Maximizing the log-likelihood function is equivalent to minimizing the negative of the log-likelihood function, so the negative log-likelihood is taken as the objective function for estimating the parameters. The variance-covariance matrix of estimates



is calculated automatically and numerically as part of these optimization procedures, without any direct evaluation of the second derivatives of the log-likelihood function, which would be very complicated. Since this article is concerned with the Gumbel distribution [6], its pdf and cdf are, respectively, as follows.



where one parameter is a scale parameter and the other is the location parameter.
Reparameterizing in terms of these two parameters, the above pdf becomes



Now the survivor function is



The hazard function or the failure rate is



The hazard rate/failure rate is proportional to the scale parameter and time as it passes.
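To make the pdf, survivor function and hazard function concrete, here is a minimal Python sketch. It assumes the smallest-extreme-value form of the Gumbel distribution with location mu and scale sigma. The function names, the symbols and the choice of this particular form are assumptions for illustration and may differ from the parameterization used in the paper.

```python
import math


def gumbel_pdf(t, mu, sigma):
    """Assumed pdf: f(t) = (1/sigma) * exp(z - exp(z)), z = (t - mu)/sigma."""
    z = (t - mu) / sigma
    return math.exp(z - math.exp(z)) / sigma


def gumbel_survivor(t, mu, sigma):
    """Assumed survivor function: S(t) = exp(-exp(z))."""
    z = (t - mu) / sigma
    return math.exp(-math.exp(z))


def gumbel_hazard(t, mu, sigma):
    """Hazard h(t) = f(t) / S(t), which simplifies to exp(z) / sigma."""
    z = (t - mu) / sigma
    return math.exp(z) / sigma


# Hypothetical parameter values, chosen only to exercise the functions.
mu, sigma = 9.0, 3.0
for year in range(1, 11):
    print(year, gumbel_survivor(year, mu, sigma), gumbel_hazard(year, mu, sigma))
```

Under this assumed form the hazard equals exp((t - mu)/sigma)/sigma, so it scales with 1/sigma and grows exponentially with time.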
Now the likelihood function can be written as



Where and at (Max).
The maximum likelihood estimates of the scale and shape parameters are the values that maximize the likelihood function or, equivalently, that minimize the negative log-likelihood function.
Thus, we have



The partial derivatives of this function with respect to the two parameters are





where and .
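The estimation procedure described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: the yearly counts are hypothetical (not the Bradford data), the Gumbel survivor function is taken in its smallest-extreme-value form, the parameters are (mu, log sigma) so that the search is unconstrained, the gradient is approximated by central differences rather than by analytic derivatives, and a simple backtracking line search is used.

```python
import math

# Hypothetical yearly recurrence counts (NOT the paper's data).
FAILS = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
N = 100  # hypothetical sample size; N - sum(FAILS) patients are censored


def survivor(t, mu, sigma):
    """Assumed Gumbel (smallest-extreme-value) survivor function."""
    z = (t - mu) / sigma
    return math.exp(-math.exp(min(z, 700.0)))  # clamp to avoid overflow


def neg_log_lik(theta):
    """Negative grouped-data log-likelihood; theta = (mu, log sigma)."""
    try:
        mu, sigma = theta[0], math.exp(theta[1])
        total = 0.0
        for i, f_i in enumerate(FAILS, start=1):
            p = survivor(i - 1, mu, sigma) - survivor(i, mu, sigma)
            if p <= 0.0:
                return math.inf
            total += f_i * math.log(p)
        s_end = survivor(len(FAILS), mu, sigma)
        if s_end <= 0.0:
            return math.inf
        return -(total + (N - sum(FAILS)) * math.log(s_end))
    except (OverflowError, ZeroDivisionError):
        return math.inf


def gradient(f, x, h=1e-6):
    """Central-difference gradient (a stand-in for analytic derivatives)."""
    g = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g


def dfp_minimize(f, x0, tol=1e-5, max_iter=200):
    """Minimize f using the DFP inverse-Hessian update with backtracking."""
    n = len(x0)
    x = list(x0)
    H = [[float(i == j) for j in range(n)] for i in range(n)]  # inverse-Hessian estimate
    g = gradient(f, x)
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        d = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(di * gi for di, gi in zip(d, g))
        if slope >= 0.0:  # not a descent direction: use steepest descent
            d = [-gi for gi in g]
            slope = -sum(gi * gi for gi in g)
        step, fx = 1.0, f(x)
        while f([xi + step * di for xi, di in zip(x, d)]) > fx + 1e-4 * step * slope:
            step *= 0.5
            if step < 1e-14:
                return x, H  # no further progress possible
        x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = gradient(f, x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
        yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
        if sy > 1e-12 and yHy > 1e-12:  # skip updates that could break positive definiteness
            H = [[H[i][j] + s[i] * s[j] / sy - Hy[i] * Hy[j] / yHy
                  for j in range(n)] for i in range(n)]
        x, g = x_new, g_new
    return x, H


theta0 = [8.0, math.log(3.0)]  # hypothetical starting values
theta_hat, H_hat = dfp_minimize(neg_log_lik, theta0)
mu_hat, sigma_hat = theta_hat[0], math.exp(theta_hat[1])
print("mu =", mu_hat, "sigma =", sigma_hat, "-log L =", neg_log_lik(theta_hat))
```

The matrix H_hat returned here is the final DFP estimate of the inverse Hessian of the negative log-likelihood, which is the quantity a quasi-Newton routine can report as an approximate variance-covariance matrix. In this sketch it refers to the (mu, log sigma) parameterization and should be treated as indicative only.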

Application

We considered data on 254 patients with breast cancer. These patients were initially treated at the chemotherapy department, Bradford Royal Infirmary, England [15], thirty-five years ago. Each patient was treated for a period of ten years or until death. The patients were between 23 and 82 years old (Hancock et al. [10]). They were classified into four different stages using the TNM (Tumor, Nodes, Metastases) system and clinically staged accordingly.
Out of the 254 patients, 100 were premenopausal and 154 were postmenopausal. A woman was considered to be postmenopausal when 2 years had elapsed since her last menstrual period. The two main categories are premenopausal and postmenopausal. Note that Stages I and II were combined within each of the premenopausal and postmenopausal categories. In the light of [Table-1] and [Table-2], the survival related to the clinical stage (as a percentage) over ten years is given in [Table-3].

Graphical Representation of Survivor Rate Estimates for Different Stages of Breast Cancer Using Gumbel Distribution Model

Graphical comparisons of the Gumbel (parametric) survivor-rate estimates with the Kaplan-Meier (non-parametric) survivor-rate estimates are given in [Fig-1].
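The non-parametric side of this comparison can be sketched in a few lines of Python. The function below is a minimal product-limit (Kaplan-Meier) estimator [7]; the function name and the small data set are hypothetical and are not taken from the Bradford data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate; events[i] is 1 for a failure, 0 if censored.

    Returns a list of (time, survivor estimate) at each distinct failure time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures > 0:
            surv *= 1.0 - failures / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in years (1 = recurrence, 0 = censored).
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

For this toy data the estimate steps down to 0.8, 0.6 and 0.3 at years 1, 2 and 3. A fitted parametric survivor function would be evaluated at the same time points to produce a comparison of the kind plotted in the figure.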

Conclusions

The analysis shows that the Gumbel distribution is a reasonable model for describing the progression of breast cancer and for finding the survivor rates of the 254 patients. Using the maximum likelihood method through an unconstrained optimization method (DFP, Davidon-Fletcher-Powell), the parameter estimates and variance-covariance matrix for the Gumbel distribution model were found.
However, unlike a number of two-parameter distributions that are used in survival studies, it does have some bearing on the physical process being described.

Acknowledgment

The author (Khizar H. Khan) gratefully acknowledges the support of the Department of Mathematics, College of Science and Humanities, Salman Bin Abdulaziz University, Al-Kharj, and the Ministry of Education, Saudi Arabia, in providing the facilities and environment for this research work.

References

[1] Boag J.W. (1949) J. R. Stat. Soc. Series B, 11(1), 15-53.

[2] Bain L.J., Engelhardt M. (1991) Statistical Analysis of Reliability and Life-Testing Models: Theory and Methods, 2nd ed., Marcel Dekker.

[3] Coles S. (2001) An Introduction to Statistical Modelling of Extreme Values. Springer-Verlag.

[4] Gupta R.D., Kundu D. (2007) Journal of Statistical Planning and Inference, 137, 3537-3547.

[5] Kotz S., Nadarajah S. (2000) Extreme Value Distributions: Theory and Applications. Imperial College Press.

[6] Nadarajah S. (2006) Environmetrics, 17, 13-23.

[7] Kaplan E.L., Meier P.P. (1958) J. Amer. Statist. Assoc., 53, 457-481.

[8] Lawless J.F. (1982) Statistical Models and Methods for Lifetime Data. John Wiley and Sons.

[9] Meeker W.Q., Escobar L.A., Stanford J. and Vardeman S. (1994).

[10] Hancock K., Peet B.G., Price J., Watson G.W., Stone J., Turner R.L. (1977) British Journal of Surgery, 64, 134-138.

[11] Cox D.R., Oakes D. (1984) Analysis of Survival Data. London: Chapman and Hall.

[12] Davidon W.C. (1959) Mathematical Programming, 9, 1-30.

[13] Fletcher R., Powell M.J.D. (1963) The Computer Journal, 6, 163-168.

[14] Khan K.H., Zafar Mehmud (2002) An International Journal, 1(2), 201-209.

[15] Watson G.W., Turner R.L. (1959) British Medical Journal, 1, 1315-1320.

Images
Fig. 1-
Table 1- Age Distribution Related to Clinical Stage and Menopausal Status.
Table 2- Survivals and Failures Related to Clinical Stage and Menopausal Status.
Table 3- Survival Related to Clinical Stage (%age) over ten years.
Table 4- Data for Stages I to IV over the ten years.
Table 5- Estimates of Parameters and ML-Function for Gumbel Distribution Model.
Table 6- Estimates of Variance-Covariance Matrix and Gradient vector for the Gumbel Model.
Table 7-Survival Proportion for Pre-menopausal Stages.
Table 8- Survival Proportion for Post-menopausal Stages.