Hotel Estoril Eden, Monte Estoril,
5-8 October 2005




Small Sample Statistical Modeling and Inference of Genetic Networks

Korbinian Strimmer
Department of Statistics, University of Munich, Germany

The advent of large-scale high-throughput experiments promises to resolve many questions in molecular and genome biology. However, due to their "small n, large p" character, current genomic data also pose substantial challenges for statistical modeling and analysis.

In my talk I will present two examples from our recent research on inferring and modeling genetic networks that illustrate some general strategies for coping with small-sample, high-dimensional data.

First, I discuss the inference of large-scale gene relevance and gene association networks. These require estimating an unstructured covariance or correlation matrix, or its inverse, to describe the gene interactions. However, most previous studies have ignored that the standard estimator of the covariance matrix performs very poorly when there are many genes and only a few observations. Instead, a regularized estimator needs to be employed. We propose using variance-reduced and shrinkage estimators, along with heuristic empirical Bayes model selection, to infer the respective network structures.
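
To make the shrinkage idea above concrete, here is a minimal sketch, not the estimator presented in the talk. It assumes a diagonal shrinkage target and a fixed, hand-picked intensity `lam = 0.5`; the abstract specifies neither. All function names and the simulated data are hypothetical. With 5 observations and 10 genes the sample covariance matrix is rank-deficient, whereas the shrunk matrix is positive definite and can be inverted to obtain partial correlations.

```python
import random

def sample_covariance(X):
    """Unbiased sample covariance; rows of X are observations, columns are genes."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in X) / (n - 1)
             for j in range(p)] for i in range(p)]

def shrink_covariance(S, lam):
    """Shrink S towards its own diagonal: S* = (1 - lam) * S + lam * diag(S).
    The diagonal target and the fixed lam are assumptions made for illustration."""
    p = len(S)
    return [[S[i][j] if i == j else (1.0 - lam) * S[i][j] for j in range(p)]
            for i in range(p)]

def invert(A):
    """Gauss-Jordan inversion with partial pivoting."""
    p = len(A)
    M = [list(A[i]) + [1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        if abs(M[piv][c]) < 1e-12:
            raise ValueError("matrix is numerically singular")
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(p):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[p:] for row in M]

def partial_correlations(S):
    """Partial correlations from the precision matrix P = S^-1:
    pcor_ij = -P_ij / sqrt(P_ii * P_jj)."""
    P = invert(S)
    p = len(P)
    return [[1.0 if i == j else -P[i][j] / (P[i][i] * P[j][j]) ** 0.5
             for j in range(p)] for i in range(p)]

# Simulated "small n, large p" data: 5 observations of 10 genes.
rng = random.Random(0)
n_obs, n_genes = 5, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_obs)]

S = sample_covariance(X)                  # rank at most n_obs - 1, so not invertible
S_shrunk = shrink_covariance(S, lam=0.5)  # positive definite for 0 < lam <= 1
pcor = partial_correlations(S_shrunk)     # candidate edge weights of the network
```

In practice the shrinkage intensity would be estimated from the data rather than fixed, and the resulting partial correlations would still have to pass a model selection step before being read as network edges.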

Second, I consider the problem of inferring the true activities of transcription factors (regulators) and their functional interactions. We suggest using a variant of partial least squares regression that not only can be applied to small-sample data but also extracts information about the underlying grouping of the regulators. This method also integrates gene expression data with data obtained from chromatin immunoprecipitation (ChIP) assays.
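
As a rough illustration of why partial least squares remains usable when predictors outnumber observations, the following sketch implements plain single-response PLS (PLS1) with the NIPALS algorithm. It is not the variant proposed in the talk, and it does not model ChIP data; the function names, the simulated predictors, and the response are all hypothetical. Each component is a weighted combination of the predictors, and the weight vectors indicate which predictors act together.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center_columns(X):
    """Subtract the column mean from every column of X."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def pls1(X, y, n_components):
    """NIPALS PLS1 on centered X (n x p) and centered y (length n).
    Returns the weight vectors, the score vectors, and the fitted values."""
    X = [list(row) for row in X]   # work on copies; X is deflated in place
    resid = list(y)
    n, p = len(X), len(X[0])
    weights, scores = [], []
    fitted = [0.0] * n
    for _ in range(n_components):
        # Weight vector: direction in predictor space with maximal covariance with y.
        w = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:           # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in X]   # latent component (score vector)
        tt = dot(t, t)
        loadings = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = dot(resid, t) / tt           # regression of y on the component
        # Deflate X and y so that the next component is orthogonal to this one.
        X = [[X[i][j] - t[i] * loadings[j] for j in range(p)] for i in range(n)]
        resid = [resid[i] - q * t[i] for i in range(n)]
        fitted = [fitted[i] + q * t[i] for i in range(n)]
        weights.append(w)
        scores.append(t)
    return weights, scores, fitted

# Simulated data: 6 observations, 20 predictors, response driven by the first two.
rng = random.Random(1)
n_obs, n_pred = 6, 20
X_raw = [[rng.gauss(0.0, 1.0) for _ in range(n_pred)] for _ in range(n_obs)]
y_raw = [row[0] + row[1] + 0.1 * rng.gauss(0.0, 1.0) for row in X_raw]

Xc = center_columns(X_raw)
y_mean = sum(y_raw) / n_obs
yc = [v - y_mean for v in y_raw]

_, _, fit1 = pls1(Xc, yc, n_components=1)
weights, scores, fit2 = pls1(Xc, yc, n_components=2)
```

Ordinary least squares has no unique solution here because there are more predictors than observations, while PLS only needs a handful of mutually orthogonal components.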