"Design and Analysis of Large Scale Gene Expression
Experiments and the Application to Angiogenesis
and Blood Vessel Maturation"
Abstract: High-throughput technologies such as DNA microarrays
are changing the landscape of scientific research. Instead
of measuring RNA levels for one gene at a time, it is now possible
to perform tens of thousands of simultaneous gene expression
measurements. Along with the obvious benefits afforded
by this dramatic increase in scale, however, come numerous complications
associated with data processing and analysis. In this seminar
I will present my doctoral research related to microarray experimental
design and data analysis, agglomerative hierarchical clustering,
and their application to evaluating differences in gene expression
in an in-vivo model of angiogenesis and blood vessel network
formation.
To address many of the challenges associated with identifying
differentially expressed genes in a microarray dataset, we have
developed an experimental approach and supporting software that
incorporate replication and a statistical linear model to account
for known sources of variation. This software, named CARMA
(Computational Analysis of Replicate Measures for Arrays), performs
an analysis of variance (ANOVA) on two-channel microarray datasets,
in addition to all of the necessary data preprocessing steps
including importing, transforming, and normalizing the raw data
files.
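The preprocessing and testing steps described above can be illustrated with a minimal sketch. It is not CARMA itself: the function names are hypothetical, the normalization shown is a simple median centering, and the test is a plain one-way ANOVA rather than the full linear model the software fits.

```python
import math
from statistics import mean, median


def log_ratios(red, green):
    """Transform two-channel intensities into log2(red / green) ratios."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def median_center(values):
    """Simple global normalization: shift one array's ratios so their median is 0."""
    m = median(values)
    return [v - m for v in values]


def one_way_anova_f(groups):
    """F statistic for one gene, given one list of replicate values per condition."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the condition means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates around their own condition mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical intensities for three spots on one array.
ratios = median_center(log_ratios([400.0, 800.0, 1600.0], [400.0, 400.0, 400.0]))

# Hypothetical normalized log ratios for one gene: two conditions, three replicates each.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

A large F statistic relative to the F distribution with (k - 1, n - k) degrees of freedom suggests that the gene differs between conditions; replication is what makes the within-condition variance estimable.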
Once the differentially expressed genes have been identified,
it is often desirable to group these genes based on their expression
profile. Numerous pattern recognition techniques, statistical
approaches, and learning algorithms have been successfully applied
to experimental microarray data; however, evaluating the performance
of each technique has proven difficult without knowledge of the
true classifications within the data. In order to evaluate
the performance of hierarchical clustering algorithms, we developed
software that generates simulated microarray datasets and implements
10 hierarchical clustering algorithms and 4 distance metrics. Performance
of each algorithm/distance metric combination was assessed based
on its ability to recover the known clusters within the simulated
datasets.
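The evaluation strategy described above can be sketched as follows. This is an illustration under stated assumptions, not the software from the dissertation: it covers only three linkage rules and two distance metrics, the simulated patterns and noise level are invented, and the Rand index is assumed as the recovery score because the abstract does not name the measure used.

```python
import math
import random
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_distance(a, b):
    """1 - Pearson correlation, so similar profile shapes are close."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return 1.0 - cov / (sa * sb)


def agglomerate(profiles, k, dist, linkage):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pairwise = [dist(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j]]
            d = linkage(pairwise)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]  # safe because i < j
    labels = [0] * len(profiles)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


def rand_index(truth, predicted):
    """Fraction of gene pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum((truth[i] == truth[j]) == (predicted[i] == predicted[j])
                for i, j in pairs)
    return agree / len(pairs)


def simulate(patterns, genes_per_cluster, noise, rng):
    """Generate noisy copies of each pattern; the pattern index is the true label."""
    profiles, truth = [], []
    for label, pattern in enumerate(patterns):
        for _ in range(genes_per_cluster):
            profiles.append([p + rng.gauss(0, noise) for p in pattern])
            truth.append(label)
    return profiles, truth


def average(values):
    return sum(values) / len(values)


# Hypothetical expression patterns over six time points.
patterns = [[0, 1, 2, 3, 2, 1], [3, 2, 1, 0, 1, 2], [0, 0, 3, 3, 0, 0]]
profiles, truth = simulate(patterns, 5, 0.1, random.Random(0))

linkages = {"single": min, "complete": max, "average": average}
metrics = {"euclidean": euclidean, "pearson": pearson_distance}
scores = {
    (lname, mname): rand_index(truth, agglomerate(profiles, 3, metric, link))
    for lname, link in linkages.items()
    for mname, metric in metrics.items()
}
```

Because the true labels are known by construction, every linkage and metric combination receives a directly comparable score. Raising the noise level or making the patterns more alike is what separates the combinations in practice.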
In an effort to improve our understanding of the cellular mechanisms
regulating angiogenesis, blood vessel maturation, and vascular
remodeling, we used a mouse microvessel fragment model to
study gene expression during the formation of a vascular network
from small vessel fragments isolated from mouse periovarial and
epididymal fat pads. Over the course of 28 days, these
small isolated fragments developed into a physiological microvascular
network. Analysis of gene expression at days 0, 3, 7, 14,
21, and 28 revealed patterns of gene expression consistent with an
initial angiogenesis phase followed by a maturation and network
remodeling phase.