
Abstract

Sampling techniques such as case-cohort and nested case-control studies allow us to analyze survival data under the relative risk model assumption without having to collect data for the full cohort. This can substantially reduce the cost of a large-scale study. Simulations investigating the performance of parameter estimators in such studies have been carried out extensively. Most conclude that the procedures offer great alternatives to a cohort study, using considerably less data while suffering only small losses in efficiency (Samuelsen et al. (2007), Borgan and Samuelsen (2003), Self and Prentice (1988), Langholtz and Borgan (1995)). In nested case-control studies, controls are sampled from the risk set at the event times and matched to the specific case. Weighted pseudo-likelihood estimators use the data from such sampling, but in a more efficient way, which leads to more precision in the estimated parameters. If, in addition, some information in the form of a surrogate measure is available for the entire cohort, it may be exploited in a stratified sampling to achieve a more informative set of controls. This can lead to increased precision for the estimated parameters (Borgan et al. 2000). The estimated parameters are further used when we need to calculate cumulative hazard rates or the estimated survival function. Thus, if one method produces parameter estimates that are substantially better than those of another procedure (lower variance, lower bias), the corresponding baseline estimator should inherit some of these merits. However, there could be other influencing factors as well, leading to differences between the baseline estimators. While the estimated parameters are found by maximizing an overall likelihood, the estimator for the cumulative hazard is a function of time t, calculated at each observed event time. Such estimators may, for certain values of t, be more sensitive to the distribution of the controls over the time period.
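To make the idea of an estimator "calculated at each observed event time" concrete, the following is a minimal sketch of a Breslow-type estimator of the cumulative baseline hazard for nested case-control data. It is an illustration under stated assumptions, not the implementation used in this thesis: a single scalar covariate, simple random sampling of controls (so that each sampled subject receives the weight n(t)/m, the number at risk divided by the size of the sampled risk set), and a hypothetical input format and function name chosen only for this example.

```python
import math

def breslow_ncc(sampled_risk_sets, beta):
    """Breslow-type cumulative baseline hazard (illustrative sketch).

    sampled_risk_sets: list of (event_time, n_at_risk, covariates), where
    covariates holds the covariate values of the case and its sampled
    controls. This tuple layout is a hypothetical choice for the example.
    beta: estimated regression coefficient (scalar covariate assumed).
    """
    cumulative = 0.0
    steps = []
    for t, n_at_risk, xs in sorted(sampled_risk_sets, key=lambda r: r[0]):
        m = len(xs)              # case plus sampled controls
        weight = n_at_risk / m   # assumes simple random control sampling
        denom = sum(math.exp(beta * x) * weight for x in xs)
        cumulative += 1.0 / denom  # jump of the estimator at event time t
        steps.append((t, cumulative))
    return steps

# Toy input: two event times with 4 and 2 subjects at risk.
toy = [(1.0, 4, [0.0, 1.0]), (2.0, 2, [1.0, 0.0])]
estimate = breslow_ncc(toy, beta=0.0)
```

With beta set to zero, every relative risk equals one and each jump reduces to one over the number at risk, so the sketch collapses to the Nelson-Aalen estimator, which is a convenient sanity check.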
Thus an estimator may have good properties in some region of the observation time, but be lacking in other regions when compared to others. A third aspect that could have an impact on the baseline estimator is how the data are used in the estimation at different t-values. More specifically, the traditional estimator in a nested case-control study will, in early regions of the time period, use far less data than the alternative methods, which pool the controls together. The latter will have a baseline estimator similar to that of the case-cohort study, which from the first observed event time uses information from every case and sampled control. This could mean a more stable estimator early on, but may imply that more variance is accumulated at later event times, leading to a drop in precision towards the end of the time period. In this thesis, the properties of estimators of the cumulative baseline hazard, commonly referred to as Breslow-type estimators, are studied under various circumstances. The main goal will be to establish that the estimators work, and that we are able to…