| kmestimate | stsdas.analysis.statistics | kmestimate |
kmestimate -- Compute the Kaplan-Meier estimator of a randomly censored distribution.
kmestimate input
The kmestimate task calculates and prints the Kaplan-Meier product-limit estimator of a randomly censored distribution. It is the unique, self-consistent, generalized maximum-likelihood estimator for the population from which the sample was drawn. When formulated in cumulative form, it has analytic asymptotic error bars (for large N). The median is always well-defined, though the mean is not if the lowest point in the sample is an upper limit.
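As an illustration of the product-limit idea, the following Python sketch computes the estimator for a small sample. It is not the code used by kmestimate. The function names `kaplan_meier` and `km_upper_limits` and the boolean `detected` flag are assumptions made for this example and do not reflect the task's input format. The sketch assumes the textbook right-censored formulation, S(x) = product over detected x_i <= x of (1 - d_i/n_i), where d_i is the number of detections at x_i and n_i the number of points still at risk. Upper limits (left-censored data) are handled by negating the values, a common transformation that is assumed here.

```python
def kaplan_meier(values, detected):
    """Product-limit estimate for right-censored data.

    values   -- measured values or censoring limits
    detected -- True where the value is a detection, False where censored
    Returns a list of (x, S) pairs, one per distinct detected value,
    where S estimates the probability of exceeding x.
    """
    data = sorted(zip(values, detected))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        x = data[i][0]
        # Gather all points tied at x and count the detections among them.
        j = i
        events = 0
        while j < len(data) and data[j][0] == x:
            if data[j][1]:
                events += 1
            j += 1
        if events > 0:
            # Censored points tied at x still count as "at risk" here.
            survival *= 1.0 - events / n_at_risk
            curve.append((x, survival))
        n_at_risk -= j - i
        i = j
    return curve


def km_upper_limits(values, detected):
    """Handle upper limits by flipping the sign of the data (an assumption).

    Returns (x, P) pairs in descending x, where P estimates the
    probability that the true value lies below x.
    """
    flipped = kaplan_meier([-v for v in values], detected)
    return [(-t, s) for t, s in flipped]


if __name__ == "__main__":
    # Hypothetical sample: two of the five points are censored.
    for x, s in kaplan_meier([1, 2, 3, 4, 5], [True, False, True, True, False]):
        print(x, round(s, 4))
```

Note that the curve steps only at detected values. Censored points change the result solely by shrinking the at-risk count for later steps.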
A differential, or binned, Kaplan-Meier estimator is available as an option. This allows users to find the number of points falling into specified bins along the X-axis according to the Kaplan-Meier estimated survival curve. However, users are strongly encouraged to use the cumulative form, for which analytic error analysis is available. There is no known error analysis for the differential estimator.
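One plausible reading of the binned form is that the expected number of points in a bin equals the sample size times the drop in the survival curve across that bin. The Python sketch below illustrates that reading only. It is not the algorithm used by kmestimate, and the names `survival_at`, `binned_counts`, `bin_start`, `bin_size`, and `n_bins` are hypothetical. The input curve is assumed to be a list of (x, S) steps sorted by increasing x, with S the estimated probability of exceeding x.

```python
def survival_at(curve, x):
    """Evaluate a step survival curve at x (1.0 before the first step)."""
    s = 1.0
    for step_x, step_s in curve:
        if step_x <= x:
            s = step_s
        else:
            break
    return s


def binned_counts(curve, n_total, bin_start, bin_size, n_bins):
    """Expected number of points in each bin (lo, hi] under the curve."""
    counts = []
    for k in range(n_bins):
        lo = bin_start + k * bin_size
        hi = lo + bin_size
        # Probability mass in the bin is the drop in survival across it.
        mass = survival_at(curve, lo) - survival_at(curve, hi)
        counts.append(n_total * mass)
    return counts


if __name__ == "__main__":
    # Hypothetical step curve and sample size, for illustration only.
    example_curve = [(1.0, 0.8), (3.0, 0.5), (4.0, 0.25)]
    print(binned_counts(example_curve, 20, 0.0, 2.0, 3))
```

If the curve never reaches zero, for example because the extreme point is censored, the binned counts sum to less than the sample size. The leftover mass has no defined location.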
The Kaplan-Meier estimator works with any underlying distribution (e.g., Gaussian, power law, bimodal), but only if the censoring is "random." That is, the probability that the measurement of an object is censored cannot depend on the value of the censored variable. At first glance, this may seem to be inapplicable to most astronomical problems: we detect the brighter objects in a sample, so the distribution of upper limits always depends on brightness. However, two factors often serve to randomize the censoring distribution. First, the censored variable may not be correlated with the variable by which the sample was initially identified. Thus, infrared observations of a sample of radio-bright objects will be randomly censored if the radio and infrared emission are unrelated. Second, astronomical objects in a sample usually lie at different distances, so that brighter objects are not always the most luminous.
Thus, the censoring mechanisms of each study MUST be understood individually to judge whether the censoring is likely to be random. The appearance of the data, even if the upper limits are clustered at one end of the distribution, is NOT a reliable measure. A frequent (if philosophically distasteful) escape from the difficulty of determining the nature of the censoring in a given experiment is to define the population of interest to be the observed sample. The Kaplan-Meier estimator then always gives a valid redistribution of the upper limits, though the result may not be applicable in wider contexts.
If this value is INDEF (the default) the task will compute it internally. This task parameter is only used if the task parameter diff is set to "yes".
If this value is INDEF (the default) the task will compute it internally. This task parameter is only used if the task parameter diff is set to "yes".
This task parameter is only used if the task parameter diff is set to "yes".
1. Apply the Kaplan-Meier estimator to the data in the text file "kmestimate.dat". There is a copy of this file in the statistics$data directory (i.e., "statistics$data/kmestimate.dat"). The notation [1,2] means that the first column is the censor indicator and the second column contains the values.
cl> kmestimate kmestimate.dat[1,2]
censor, survival
Type "help statistics option=sys" for a higher-level description of the statistics package.