Center for Statistical Sciences Seminar
Abstract: A central problem in genetic epidemiology is to identify and rank genes involved in a disease. Genome-wide association studies are increasingly used to detect disease-gene associations. Information about gene regulatory dependence (1) is collected from biomedical experiments in pathway databases and (2) can be inferred via statistical methodology, e.g., gene co-expression networks. We propose a Bayesian methodology for ranking disease-gene associations that can incorporate prior biological information about gene regulatory dependence and integrate various types of data. The approach is based on a prior distribution defined by a Markov Random Field (MRF). In the context of pathway-based association analysis, the prior knowledge induced by an MRF is that whether or not a gene is associated with the disease depends directly on the association status of genes within the same pathway. We conducted simulation experiments to illustrate the performance of our method. An application of the proposed methodology is illustrated using an association study of Crohn's disease. (Candidate for Assistant Professor in the Biostatistics Section of the Program in Public Health)
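To make the MRF prior in the abstract above concrete, here is one common form such a prior can take. It is an illustrative assumption, not necessarily the speaker's specification. The symbols are introduced here for the example only: \gamma_i \in \{0, 1\} indicates whether gene i is associated with the disease, i \sim j means genes i and j share a pathway, N_i is the set of pathway neighbors of gene i, and a and b are hyperparameters.

```latex
% Joint prior over association indicators (auto-logistic / Ising-type MRF)
p(\gamma \mid a, b) \;\propto\;
  \exp\!\Big( a \sum_{i} \gamma_i \;+\; b \sum_{i \sim j} \gamma_i \gamma_j \Big)

% Implied conditional prior for one gene given all the others
\operatorname{logit} \Pr(\gamma_i = 1 \mid \gamma_{-i})
  \;=\; a \;+\; b \sum_{j \in N_i} \gamma_j
```

Under this form, b > 0 raises the prior odds that a gene is associated when more of its pathway neighbors are associated, and b = 0 reduces to independent priors across genes.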
Center for Statistical Sciences Seminar
Abstract: There are a number of reasons why it may be important to search for differential expression of groups of genes and pathways rather than for differential expression of individual genes. If biological samples are taken at a fixed time following an intervention, a transcriptional cascade may occur at different speeds in different individuals. At a fixed time, the differential transcription will then lie in different genes within the pathway for different individuals, resulting in a signal that occurs in one gene in the pathway for some individuals and in other genes for other individuals. Polymorphisms and differences in physiology can result in differential expression of one gene from a class (e.g., MAP kinases) in one individual and of other genes from the same class in other individuals. Up- or down-regulation may be broad across a class of genes but with a signal that is too diffuse and weak to be detected in the results from individual genes. The greater power from aggregating results may increase sensitivity. The earliest methods for handling this general class of problems involved computing whether the gene group at issue is over-represented in a set of, for example, significantly differentially expressed genes. This is sometimes called Gene Set Enrichment Analysis (GSEA). In Rocke et al. (2005), a new method of analyzing gene groups and pathways was introduced, called the Test of Test Statistics (ToTS) method, in which a set of test statistics (in the example, t-statistics for the dose-response slope) is tested for a positive or negative bias using the Wilcoxon one-sample test. This proved to be much more powerful than tests of individual genes. To make the procedure robust to correlations among the tests, a resampling-based method is used to determine significance, rather than the usual asymptotic p-values for the Wilcoxon test.
We have subsequently investigated a number of alternative approaches for evaluating the statistical significance of such results, and found methods that are effective under a wide variety of assumptions. (Candidate for faculty position in the Biostatistics Section of the Program in Public Health)
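The general idea of a test of test statistics, as described in the abstract above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of Rocke et al. (2005). The function names are hypothetical. The resampling scheme used here (permuting dose labels across samples, which keeps gene-to-gene correlation intact) is an assumption, since the abstract does not say which resampling method was used.

```python
import random
from math import sqrt


def slope_t(x, y):
    """t-statistic for the slope in a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    rss = sum((b - my - slope * (a - mx)) ** 2 for a, b in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)  # assumes n >= 3 and a non-perfect fit
    return slope / se


def centered_signed_rank(values):
    """Wilcoxon one-sample signed-rank statistic W+, centered at its null mean.

    Zeros are dropped and tied absolute values receive average ranks.
    Positive results indicate a positive bias, negative a negative bias.
    """
    nonzero = [v for v in values if v != 0]
    n = len(nonzero)
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, nonzero) if v > 0)
    return w_plus - n * (n + 1) / 4


def tots_pvalue(dose, expr, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a gene group.

    dose: one dose value per sample.
    expr: one list per gene in the group, aligned with dose.
    Dose labels are permuted across samples (an assumed scheme), so the
    correlation between genes is the same in every resample.
    """
    rng = random.Random(seed)

    def group_stat(d):
        return centered_signed_rank([slope_t(d, y) for y in expr])

    observed = group_stat(dose)
    permuted = list(dose)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)
        if abs(group_stat(permuted)) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

A call such as `tots_pvalue(dose, expr)` returns the centered signed-rank statistic for the group together with its permutation p-value. The asymptotic Wilcoxon p-value is deliberately not used, matching the abstract's point that the per-gene tests are correlated.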