Hotel Estoril Eden, Monte Estoril, 



Korbinian Strimmer

The advent of large-scale high-throughput experiments promises to resolve many questions in molecular and genome biology. However, due to their "small n, large p" character, current genomic data also pose substantial challenges for statistical modeling and analysis. In my talk I will present two examples from our recent research on inferring and modeling genetic networks that illustrate some general strategies for coping with small-sample, high-dimensional data. First, I discuss the inference of large-scale gene relevance and gene association networks. These require the estimation of an unstructured covariance or correlation matrix, or its inverse, to describe the gene interactions. However, most previous studies have ignored the fact that the standard estimator of the covariance matrix performs very poorly if there are many genes and only a few observations. Instead, a regularized estimator needs to be employed. We propose using variance-reduced and shrinkage estimators, along with heuristic empirical Bayes model selection, to infer the respective network structures. Second, I consider the problem of inferring the true activities of transcription factors (regulators) and their functional interactions. We suggest using a variant of partial least squares regression that not only can be applied to small-sample data but also extracts information about the underlying grouping of the regulators. This method also integrates gene expression data with data obtained from chromatin immunoprecipitation (ChIP) assays.
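To make the "small n, large p" problem concrete, the following is a minimal sketch, not the estimator presented in the talk. It assumes a simple linear shrinkage of the sample covariance toward its own diagonal with a fixed, hand-picked intensity `lam` (a hypothetical parameter; the abstract does not say how the intensity is chosen), and it uses simulated data. The function names `shrink_covariance` and `partial_correlations` are illustrative only.

```python
import numpy as np


def shrink_covariance(X, lam):
    """Shrink the sample covariance of X (rows = observations) toward its diagonal.

    lam is an assumed, fixed shrinkage intensity in (0, 1]; it is not
    estimated from the data in this sketch.
    """
    S = np.cov(X, rowvar=False)
    target = np.diag(np.diag(S))  # keep the variances, drop the covariances
    return lam * target + (1.0 - lam) * S


def partial_correlations(cov):
    """Partial correlations obtained from the inverse of a covariance matrix."""
    omega = np.linalg.inv(cov)
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated data: far fewer observations (n) than genes (p).
rng = np.random.default_rng(0)
n, p = 10, 50
X = rng.standard_normal((n, p))

# The standard estimator has rank at most n - 1, so it cannot be inverted.
S = np.cov(X, rowvar=False)
rank_S = np.linalg.matrix_rank(S)

# The shrunken estimate is positive definite, so its inverse exists.
S_shrunk = shrink_covariance(X, lam=0.2)
min_eig = np.linalg.eigvalsh(S_shrunk).min()
pcor = partial_correlations(S_shrunk)
```

In this sketch the sample covariance is rank deficient because n is smaller than p, whereas the shrunken matrix can be inverted to obtain a partial correlation for every gene pair. Deciding which of those partial correlations correspond to edges in the network is the model selection step mentioned above and is not covered here.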
