CCMB Postdoctoral Seminar
Abstract: Large-scale sequencing of cancer genomes is uncovering thousands of DNA alterations, but the functional relevance of the majority of these mutations to tumorigenesis is unknown. Identifying which of these mutations contribute to cancer is critical for understanding tumor biology, and for finding new diagnostic biomarkers and therapeutic targets. We have developed a computational method, called Cancer-specific High-throughput Annotation of Somatic Mutations (CHASM), to identify and prioritize the missense mutations most likely to generate functional changes in proteins that enhance tumor cell proliferation. CHASM uses a supervised machine learning technique called a Random Forest and more than 80 quantitative features describing amino acid changes to predict candidate driver mutations. The method has high sensitivity and specificity when discriminating between known driver missense mutations and randomly generated missense mutations, and performs well relative to other computational methods applied to this problem. CHASM has been applied to over 15 tumor sequencing studies to prioritize missense mutations for further study and initial results are promising; however, further experimental validation is needed to confirm CHASM predictions.
Division of Applied Mathematics & Center for Vision Research Seminar
Abstract: Columbia University, the University of Maryland, and the Smithsonian Institution are working on visual recognition software to help identify species from photographs. I will discuss our work on developing Leafsnap -- the first in a series of electronic field guides. As part of this work, we have completed photographing close to one third of the world's plant species and have begun capturing beautiful high-resolution images of live specimens. Our work has led us in new research directions for the visual recognition of human faces, dog breeds, and bird species, including the adoption of centuries-old techniques from taxonomy for the process of labeling images with visual attributes and object parts. In particular, I will show that it is possible to automatically recognize a wide range of visual attributes and parts in images and use them in numerous applications of computer vision.
Probability/Stochastics seminar
Abstract: Is there a natural way to put a random total ordering on the vertices of a finite graph? Natural here means that all finite graphs get an isomorphism-invariant random ordering and induced subgraphs get the random ordering that is inherited from the larger graph. Thus, the uniformly random ordering is natural; are there any others? What if we restrict to certain kinds of graphs? What about finite hypergraphs or finite metric spaces? We discuss these questions and sketch how their answers give unique ergodicity of corresponding automorphism groups; for example, in the case of graphs, the group is the automorphism group of the infinite random graph. This is joint work with Omer Angel and Alexander.
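The claim that the uniformly random ordering is natural can be checked by brute force on a small example. The sketch below is only an illustration and is not taken from the talk; the helper name `induced_order_distribution` is hypothetical. It enumerates every ordering of a five-element vertex set and computes the exact law of the ordering inherited by a three-element subset. For the uniform ordering, the edges of the graph play no role, so only the vertex sets appear.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def induced_order_distribution(vertices, subset):
    """Exact law of the ordering that `subset` inherits from a uniformly
    random total ordering of `vertices` (hypothetical helper)."""
    subset = set(subset)
    counts = Counter()
    total = 0
    for perm in permutations(vertices):
        # Restrict the ordering of the large vertex set to the subset.
        induced = tuple(v for v in perm if v in subset)
        counts[induced] += 1
        total += 1
    return {order: Fraction(c, total) for order, c in counts.items()}


dist = induced_order_distribution(range(5), {0, 2, 3})
for order, prob in sorted(dist.items()):
    print(order, prob)
```

Each of the six possible orderings of the subset receives the same probability, so the inherited ordering is again uniform, which is the consistency requirement described in the abstract.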
Scientific Computing Seminar
Abstract:
This presentation will focus on new possibilities for the interpolation of scattered observations
and missing data on regular grids by means of Spartan spatial random fields (SSRFs). A brief
overview of mathematical SSRF properties will be given. Model inference, spatial interpolation
(prediction) and simulation in the framework of SSRFs will be reviewed. It will be shown that
SSRFs can be derived from the Gaussian model of statistical field theory [1] or equivalently,
from stochastic (Langevin) partial differential equations driven by white or colored noise. In
contrast with field theory, the focus of SSRFs is on short-range correlations which are important
for the local structure and not on long-range properties of the covariance function near critical
points. SSRF covariance models are characterized by sparse structure of the precision matrix
(the inverse covariance matrix), at least for lattice data. The sparseness derives from the locality
of the operators in the respective energy functional and leads to explicit spectral forms. In
certain cases, the correlations in real space can be derived analytically by direct integration of
the spectral representation, given by the Hankel transform of the density [2]. The availability
of explicit approximate expressions for both the covariance and the precision matrix can help
to overcome the curse of dimensionality in the numerical procedures of parameter inference,
spatial interpolation and conditional simulation [3,4]. An application of SSRFs to the mapping
of radioactivity ground dose rates using data from the European Radiological Data Exchange
Platform will be presented. Extension of the SSRF interaction-based concept for handling spatial
data with non-Gaussian probability distributions, using discretized random field models with
Ising "spin" interactions, will be motivated. Finally, topics for further research and perspectives
for the future development of SSRFs will be discussed.
[1] Kardar, M. (2007) Statistical physics of fields. Cambridge University Press, Cambridge.
[2] Hristopulos and Elogne (2007). Analytic properties and covariance functions for a new
class of generalized Gibbs random fields. IEEE Transactions on Information Theory, 53(12),
4667-4679.
[3] Hristopulos and Elogne (2009). Computationally efficient spatial interpolators based on
Spartan spatial random fields. IEEE Transactions on Signal Processing, 57(9), 3475-3487.
[4] Hristopulos, D. T. (2003). Spartan Gibbs random field models for geostatistical applications.
SIAM Journal on Scientific Computing, 24(6), 2125-2162.
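The link between local operators in the energy functional and a sparse precision matrix, mentioned in the SSRF abstract above, can be illustrated on a one-dimensional lattice. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes an energy with a squared-value term, a squared-gradient term, and a squared-curvature term, discretized with unit-spacing finite differences. The parameter names `eta0`, `eta1`, `xi` and the helper names are chosen here for illustration.

```python
def ssrf_precision_1d(n, eta0=1.0, eta1=2.0, xi=1.5):
    """Precision matrix Q with H(x) = 0.5 * x^T Q x for an assumed
    discretized energy on n lattice sites with unit spacing:
    Q = (I + eta1*xi^2 * D1^T D1 + xi^4 * D2^T D2) / (eta0 * xi)."""
    q = [[0.0] * n for _ in range(n)]

    def add_operator(stencil, weight):
        # Accumulate weight * D^T D, where each row of D applies
        # `stencil` to a window of consecutive lattice sites.
        width = len(stencil)
        for start in range(n - width + 1):
            for a, ca in enumerate(stencil):
                for b, cb in enumerate(stencil):
                    q[start + a][start + b] += weight * ca * cb

    add_operator([1.0], 1.0)                    # squared field values
    add_operator([-1.0, 1.0], eta1 * xi ** 2)   # squared first differences
    add_operator([1.0, -2.0, 1.0], xi ** 4)     # squared second differences
    scale = 1.0 / (eta0 * xi)
    return [[scale * v for v in row] for row in q]


def bandwidth(matrix):
    """Largest |i - j| over the nonzero entries of a square matrix."""
    return max(abs(i - j)
               for i, row in enumerate(matrix)
               for j, v in enumerate(row) if v != 0.0)


Q = ssrf_precision_1d(8)
nonzeros = sum(1 for row in Q for v in row if v != 0.0)
print("bandwidth:", bandwidth(Q), "nonzero entries:", nonzeros, "of", 8 * 8)
```

Because each term couples a site only to its nearest and next-nearest neighbours, the resulting matrix is pentadiagonal whatever the lattice size, so the number of nonzero entries grows linearly with the number of sites.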
Probability/Stochastics seminar
Abstract: The multi-server queue with non-homogeneous Poisson arrivals and customer abandonment is a fundamental dynamic rate queueing model for large service systems such as call centers and hospitals. Scaling the arrival rates and number of servers arises naturally as staffing issues for these systems in response to predictable increasing demand. Mathematically, this type of asymptotic scaling gives us the fluid and diffusion limits. These limits suggest a Gaussian approximation to the stochastic behavior of this queueing process. The mean and variance are computed from a two-dimensional dynamical system for the limiting fluid process and variance of the diffusion process. Recent work has shown that a modified version of these differential equations can be used to obtain better Gaussian estimates of the original queueing system. In this paper, we introduce a new three-dimensional dynamical system that improves on all these approaches. Using Hermite polynomials, we construct a distribution from a quadratic function of a Gaussian random variable to estimate the mean, variance, and third cumulant moment of the dynamic queueing process. This is joint work with Jamol Pender of Princeton University.
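The two-dimensional dynamical system for the fluid mean and the diffusion variance can be sketched numerically. The abstract does not state the equations, so the code below assumes one commonly used form for a queue with service rate `mu`, abandonment rate `beta` and `servers` servers: the mean changes at rate lambda(t) - mu*min(q, c) - beta*max(q - c, 0), and the variance is driven by the sum of those rates and damped at twice the active per-customer rate. All function and parameter names are hypothetical, the integrator is plain forward Euler, and the behaviour exactly at q = c, where the diffusion limit is more delicate, is not treated carefully.

```python
import math


def fluid_and_variance(arrival_rate, servers, mu, beta,
                       horizon, dt=1e-3, q0=0.0, v0=0.0):
    """Forward-Euler integration of an assumed fluid-mean / diffusion-variance
    system for a multi-server queue with abandonment.
    Returns the approximate mean and variance at time `horizon`."""
    q, v, t = q0, v0, 0.0
    for _ in range(int(round(horizon / dt))):
        lam = arrival_rate(t)
        busy = min(q, servers)            # fluid mass in service
        waiting = max(q - servers, 0.0)   # fluid mass waiting, subject to abandonment
        dq = lam - mu * busy - beta * waiting
        # Per-customer departure rate governing how fast fluctuations decay.
        decay = mu if q < servers else beta
        dv = lam + mu * busy + beta * waiting - 2.0 * decay * v
        q += dt * dq
        v += dt * dv
        t += dt
    return q, v


# Sanity check with a constant arrival rate below capacity: the system then
# behaves like an infinite-server queue, whose mean and variance both tend to lam / mu.
mean_const, var_const = fluid_and_variance(lambda t: 5.0, servers=10,
                                           mu=1.0, beta=0.5, horizon=20.0)
print(mean_const, var_const)

# A time-varying (sinusoidal) arrival rate that periodically exceeds capacity.
mean_sin, var_sin = fluid_and_variance(lambda t: 10.0 + 4.0 * math.sin(t),
                                       servers=10, mu=1.0, beta=0.5, horizon=10.0)
print(mean_sin, var_sin)
```

A Gaussian approximation of the queue length at a given time would use the returned pair as its mean and variance. The three-dimensional system described in the abstract adds a third quantity for skewness and is not reproduced here.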