# 40th anniversary of the Section of Biostatistics and two-day meeting of the Danish Society of Theoretical Statistics

### 2nd and 3rd of October 2018

### The Maersk Tower - University of Copenhagen

The Section of Biostatistics at the Faculty of Health and Medical Sciences is celebrating its 40th anniversary, and we are using the occasion to organize a two-day scientific meeting (jointly with the Danish Society of Theoretical Statistics). There will be eight talks: four on Tuesday afternoon and four on Wednesday morning. On Tuesday evening there is a social event with a conference dinner, and on Wednesday afternoon there will be a reception.

#### Practical information

*Registration:* By email to Susanne Kragskov Laupstad (skl@sund.ku.dk), Section of Biostatistics, no later than September 3rd, 2018.

*Conference fee:* 200 Danish kr. for students (excl. Ph.D. students) and 500 Danish kr. for others.

*Conference dinner:* If you do not want to participate in the conference dinner Tuesday evening, please inform us in the registration email.

*Payments (no later than September 3rd, 2018) to:*

CVR/VAT: DK29979812

University of Copenhagen bank information

Danske Bank, Holmens Kanal 2

DK-1090 København K

Reg. nr. 0216, account 4069044336

IBAN DK73 0216 4069 0443 36

SWIFT DABADKKK

REMEMBER payment reference: name of participant(s) – 30333000-5003005004

*Venue:* The Maersk Tower, Blegdamsvej 3B, 2200 Copenhagen N

*Sponsors:* The two-day meeting is kindly sponsored by Lundbeck and Novo Nordisk.

## Young Statisticians Denmark - free lunch

Young Statisticians Denmark is organizing a free lunch before the two-day meeting on October 2 from 12.20 to 13.00 in the Maersk Tower (room TBA). For more information see the Facebook event https://www.facebook.com/events/433266690412632/ and register for the event here.

## Program

**Tuesday 2nd of October (Jerne Auditorium)**

13.15 - 13.30: Welcome by Ulla Væver, Dean of the Faculty of Health and Medical Sciences, and by the Section of Biostatistics

13.30 - 14.15: Odd Aalen, University of Oslo: "Time-dependent mediators in survival analysis: Modelling direct and indirect effects with the additive hazards model"

14.15 - 15.00: Claus Thorn Ekstrøm, University of Copenhagen: "Study the environment - it’s good for your genes"

15.00 - 15.30: Coffee and anniversary cake

15.30 - 16.15: Hélène Jacqmin-Gadda, University of Bordeaux: "Random changepoint mixed models for the study of chronic diseases"

16.15 - 17.00: Esben Budtz-Jørgensen, University of Copenhagen: "A two-stage estimation procedure for non-linear structural equation models"

*Evening:* Conference dinner (included in the conference fee, also for students). If you have special requirements for the dinner, please let us know.

**Wednesday 3rd of October (Jerne Auditorium)**

09.00 - 09.45: Robin Henderson, Newcastle University: "Topological event history analysis"

09.45 - 10.30: Thomas Scheike, University of Copenhagen: "The mean, variance, and correlation for bivariate recurrent events data with a terminal event"

10.30 - 11.00: Water and fruit (and coffee)

11.00 - 11.45: Maja Pohar Perme, University of Ljubljana: "On estimation in relative survival"

11.45 - 12.30: Hein Putter, Leiden University: "On the relation between the cause-specific hazard and the subdistribution rate for competing risks data: Solving the Fine-Gray riddle"

*After the last talk:* Reception for conference participants, colleagues, collaborators and friends, with sandwiches to go if you have to leave right after the conference.

### Abstracts

*Odd Aalen, University of Oslo, Norway.*

**Time-dependent mediators in survival analysis: Modelling direct and indirect effects with the additive hazards model.**

We discuss causal mediation analyses for survival data and propose a new approach based on the additive hazards model. The emphasis is on a dynamic point of view, that is, understanding how the direct and indirect effects develop over time. Hence, importantly, we allow for a time varying mediator. To define direct and indirect effects in such a longitudinal survival setting we take an interventional approach (Didelez, 2018) where treatment is separated into one aspect affecting the mediator and a different aspect affecting survival.

In general, this leads to a version of the non-parametric *g*-formula (Robins, 1986). In the present paper, we demonstrate that combining the *g*-formula with the additive hazards model and a linear structural equation model for the mediator process results in simple and interpretable expressions for direct and indirect effects in terms of relative survival as well as cumulative hazards. Our results generalise and formalise the method of dynamic path analysis (Fosen et al., 2006; Strohmaier et al., 2015) and also work by Lange and Hansen (2011).

*Claus Ekstrøm, Biostatistics, University of Copenhagen, Denmark.*

**Study the environment - it’s good for your genes.**

Environmental factors do not receive a lot of attention in studies involving genomic data, for several reasons: they change over time, they are more expensive to measure, and they contain less structure than genetic data. Finite mixtures of regression models provide a flexible modeling framework for many phenomena including gene-gene interactions, gene-environment interactions and personalized medicine. Using moment-based estimation of the regression parameters, we develop unbiased estimators with a minimum of assumptions on the mixture components. In particular, only the average regression model for one of the components in the mixture model is needed, with no requirements on the distributions.

The consistency and asymptotic distribution of the estimators are derived, and the method is applied to a large-scale genome study to identify single-nucleotide polymorphisms that were undiscovered using traditional approaches.

*Hélène Jacqmin-Gadda, Inserm U1219, Univ. Bordeaux, France.*

**Random changepoint mixed models for the study of chronic diseases.**

In the biomedical literature, random changepoint mixed models are used to describe biphasic biomarker trajectories with a subject-specific time of change. For instance, previous studies suggest that the long pre-diagnosis cognitive decline of dementia follows such a biphasic shape. However, the shape of the decline could differ according to the subject’s characteristics or according to the cognitive function considered. The transition could be smoother for subjects with a low educational level, and the mean time of acceleration of the decline could be different for different cognitive tests.

In this talk, we discuss inference methods for these models. We first describe a new test procedure to assess the existence of a random changepoint. Then we discuss test procedures to compare the mean time of change between groups or between cognitive functions. The methods are applied to the French PAQUID cohort of elderly subjects to study the natural history of dementia and identify the cognitive functions that are first impaired in the pre-diagnosis phase.

*Esben Budtz-Jørgensen, Biostatistics, University of Copenhagen.*

**A two-stage estimation procedure for non-linear structural equation models.**

Maximum likelihood (ML) estimation in non-linear structural equation models with latent variables requires numerical integration and results are sensitive to distributional assumptions. In this talk, we introduce a two-stage technique for estimation of non-linear associations between latent variables. Here both steps are based on fitting linear structural equation models: first a model is fitted to data measuring the latent predictor and terms describing the non-linear effect are predicted. In the second step, these predictions are included in a model for the latent outcome variable. We show that this procedure is consistent and illustrate that it allows the association between latent variables to be modeled using restricted cubic splines. We also discuss robustness and compare the method to relevant alternatives including ML. (Joint work with Klaus Holst).

*Robin Henderson, Newcastle University, UK.*

**Topological event history analysis.**

Topological data analysis is used for the analysis of data *Z(s)* indexed by a location *s* in some space, for example an image or a random field. A key element is the notion of persistent homology: how features change as we filter in some way. Often the filtration is based on level, *t* say, and features are components, holes, loops and so on that are apparent in level sets made up of the locations of all points with values *Z(s) > t*.
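As a rough illustration of the level-based filtration described in this abstract (not the speaker's method), the following Python sketch counts the connected components of the set of locations with *Z(s) > t* on a small grid, for several levels *t*. The function name `count_components`, the toy field `Z` and the use of 4-neighbour connectivity are all assumptions made for this example; it tracks only components, not holes or loops.

```python
from collections import deque


def count_components(Z, t):
    """Count 4-connected components of {s : Z(s) > t} on a 2D grid (illustrative helper)."""
    rows, cols = len(Z), len(Z[0])
    seen = [[False] * cols for _ in range(rows)]
    n_components = 0
    for i in range(rows):
        for j in range(cols):
            if Z[i][j] > t and not seen[i][j]:
                # New component found: flood-fill it with breadth-first search.
                n_components += 1
                seen[i][j] = True
                queue = deque([(i, j)])
                while queue:
                    a, b = queue.popleft()
                    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        x, y = a + da, b + db
                        if (0 <= x < rows and 0 <= y < cols
                                and Z[x][y] > t and not seen[x][y]):
                            seen[x][y] = True
                            queue.append((x, y))
    return n_components


# Hypothetical toy field with two peaks (values 3 and 4).
Z = [
    [0, 1, 0, 0, 0],
    [1, 3, 1, 0, 0],
    [0, 1, 1, 2, 1],
    [0, 0, 2, 4, 2],
    [0, 0, 1, 2, 1],
]

# Number of components at each level of the filtration.
counts = {t: count_components(Z, t) for t in (0, 1, 2, 3, 4)}
print(counts)
```

In this toy field a single component at level 0 splits into two as the level rises past 1, one of them disappears above level 3, and nothing remains above level 4. This birth and death of features across levels is what persistent homology records.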

*Thomas Scheike, Biostatistics, University of Copenhagen, Denmark.*

**The mean, variance, and correlation for bivariate recurrent events data with a terminal event.**

The analysis of recurrent events in the presence of a terminal event is often encountered in the biomedical setting. The marginal mean of the number of recurrent events in a specified time period is a useful nonparametric summary of recurrent events data, also in the presence of terminal events. Another useful nonparametric summary that is simple to compute is the distribution function of the number of recurrent events for each point in time, as well as the variance of the number of recurrent events. For bivariate recurrent events, still in the presence of a terminal event, we here suggest a simple nonparametric estimator of the covariance or correlation of the marginal number of events for both processes. We suggest an adjustment for correlation induced by the terminal event to get a measure that reflects the dependence in the recurrent events processes among survivors only. Our estimators can then be used for deciding if the two recurrent events are correlated and in what way. We provide large sample properties of our estimators and show their performance in small samples by simulations. The estimators are used to show a positive correlation between the number of infections and the number of occlusion defects, based on data on catheter complications for patients receiving home parenteral nutrition through a central venous catheter.

*Maja Pohar Perme, University of Ljubljana, Slovenia.*

**On estimation in relative survival.**

In population-based cancer survival analysis, the information on cause of death is often unavailable or unreliable; nevertheless, cancer-specific questions are of interest. The idea of relative survival is to bring in the missing information through the population mortality tables, which serve as the information on the other-cause mortality. In cancer registry analysis, researchers are interested in a measure that is directly related to the cancer-specific hazard and hence enables more direct comparisons of populations with different background mortality hazards. This is why net survival is often considered. In this talk we review the assumptions of net survival and discuss what measure can be estimated without forcing assumptions that cannot be tested and do not make much sense in the real world.

*Hein Putter, Leiden University, The Netherlands.*

**On the relation between the cause-specific hazard and the subdistribution rate for competing risks data: Solving the Fine-Gray riddle.**

The Fine-Gray proportional subdistribution hazards model has been puzzling many people since its introduction. The main reason for the uneasy feeling is that the approach considers individuals still at risk for competing risk 1 after they have fallen victim to risk 2. The subdistribution hazard and the extended risk sets, where subjects who failed from the competing risk remain in the risk set, are generally perceived as unnatural. One could say it is somewhat of a riddle why the Fine-Gray approach yields valid inference. To take away these uneasy feelings we explore the link between the Fine-Gray and cause-specific approaches in more detail. We introduce the reduction factor as representing the proportion of subjects in the Fine-Gray risk set who have not yet experienced a competing event. In the presence of covariates, the dependence of the reduction factor on a covariate gives information on how the effects of the covariate on the cause-specific hazard and on the subdistribution rate relate. We discuss estimation and modeling of the reduction factor, and show how they can be used in various ways to estimate cumulative incidences, given covariates. Methods are illustrated on data from the European Blood and Marrow Society (EBMT).

(Joint work with Hans van Houwelingen)
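The link between the two hazards mentioned in this abstract can be sketched under the standard competing-risks definitions. The notation below is an assumption made for illustration and is not taken from the talk: $F_k(t)$ is the cumulative incidence of cause $k$, $f_1(t) = \mathrm{d}F_1(t)/\mathrm{d}t$, and $S(t) = 1 - F_1(t) - F_2(t)$ is the overall survival function.

```latex
\begin{align*}
\lambda_1(t) &= \frac{f_1(t)}{S(t)}
  && \text{(cause-specific hazard)} \\
\tilde\lambda_1(t) &= \frac{f_1(t)}{1 - F_1(t)}
  && \text{(subdistribution hazard)} \\
r(t) &= \frac{S(t)}{1 - F_1(t)} = \frac{S(t)}{S(t) + F_2(t)}
  && \text{(reduction factor)} \\
\tilde\lambda_1(t) &= r(t)\,\lambda_1(t)
\end{align*}
```

In this notation the Fine-Gray risk set at time $t$ has population share $1 - F_1(t) = S(t) + F_2(t)$, so $r(t)$ is the fraction of that risk set still free of the competing event, which matches the verbal description of the reduction factor above.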