|
Bioinformatics
Expression profiling of whole genomes and modern high-throughput proteomics have revolutionized the study of disease states and basic biology research. However, the increase in both the amount of data and the diversity of data types (genomic, proteomic, metabolomic) presents challenges to researchers in the areas of organization and analysis. Available bioinformatics solutions include customized databases for the storage and management of large, complex, heterogeneous (genomic, proteomic, metabolomic) datasets, and advanced analysis tools for the integrated interpretation of the data in the proper biological pathway context. We are also currently developing laboratory information management system (LIMS) software to support ongoing natural product extract research aimed at identifying new antimalarial lead compounds.
Development of a Custom Microarray Database to Enable Assay Development
|
Schematic diagram of the cipherDB workflow enabling the efficient construction of detection and diagnostic assays for human exposure to bioagents. |
The primary objective of this effort is to design and develop a database management system (named cipherDB), which will enable the integrated interpretation of diverse biological information within the proper biological context. The purpose of cipherDB is to store, query, analyze and visualize various data types including gene expression arrays, protein expression arrays and immunoassays from a variety of biological studies. More specifically, cipherDB will enable the development of diagnostic assays for human exposure to biological threat agents by coupling the database to a suite of systems biology and bioinformatics analysis tools. The overall design of cipherDB will be established to optimally address current and future needs, and to ensure compliance with the recognized MIAME (Minimum Information About a Microarray Experiment) standard. CFDRC’s successful implementation of cipherDB will result in a DBMS for gene expression information with advanced query and analysis capability.
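To make the storage and query goals concrete, the following is a minimal sketch of how expression measurements from exposed and control samples might be held in a relational schema and compared. It is not the cipherDB design: the table names, column names, the `build_demo_db` and `exposure_differences` helpers, the placeholder "agent X", and the demo values are all hypothetical, and SQLite stands in for whatever DBMS is actually used. A MIAME-compliant schema would carry far more annotation (array design, protocols, normalization details).

```python
import sqlite3

# Hypothetical, simplified schema. Names are illustrative assumptions,
# not the actual cipherDB design.
SCHEMA = """
CREATE TABLE experiment (
    experiment_id INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    agent         TEXT NOT NULL           -- biological agent studied
);
CREATE TABLE sample (
    sample_id     INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiment(experiment_id),
    status        TEXT NOT NULL CHECK (status IN ('exposed', 'control'))
);
CREATE TABLE measurement (
    sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    data_type TEXT NOT NULL
              CHECK (data_type IN ('mRNA', 'protein', 'immunoassay')),
    feature   TEXT NOT NULL,              -- gene or protein identifier
    value     REAL NOT NULL,              -- normalized log2 signal
    PRIMARY KEY (sample_id, data_type, feature)
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up demo values."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO experiment VALUES (1, 'demo exposure study', 'agent X')"
    )
    samples = [(1, 1, "exposed"), (2, 1, "exposed"),
               (3, 1, "control"), (4, 1, "control")]
    conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", samples)
    values = [  # invented numbers, for illustration only
        (1, "mRNA", "geneA", 8.0), (2, "mRNA", "geneA", 9.0),
        (3, "mRNA", "geneA", 5.0), (4, "mRNA", "geneA", 6.0),
        (1, "mRNA", "geneB", 4.0), (2, "mRNA", "geneB", 4.0),
        (3, "mRNA", "geneB", 4.5), (4, "mRNA", "geneB", 3.5),
    ]
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?)", values)
    conn.commit()
    return conn


def exposure_differences(conn, data_type):
    """Return (feature, mean exposed - mean control), largest difference first."""
    query = """
        SELECT m.feature,
               AVG(CASE WHEN s.status = 'exposed' THEN m.value END) -
               AVG(CASE WHEN s.status = 'control' THEN m.value END) AS diff
        FROM measurement AS m
        JOIN sample AS s ON s.sample_id = m.sample_id
        WHERE m.data_type = ?
        GROUP BY m.feature
        ORDER BY diff DESC
    """
    return conn.execute(query, (data_type,)).fetchall()
```

Keeping the data type as a column of a single measurement table, rather than one table per platform, is one possible way to let the same query serve gene expression arrays, protein arrays, and immunoassays; other designs are equally plausible.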
Integrative Analysis of Biological Data in a Pathway Context
The tremendous increase in both the amount and diversity (genomic, proteomic, metabolomic) of cellular data available to researchers challenges the bioinformatics community to develop generalized analysis tools that aid discovery and hypothesis development. Widely used algorithms, such as clustering analysis, are generally limited to analyzing only one data type (protein expression, mRNA expression, metabolite concentrations) at a time. Consequently, there is a growing need for integrative analysis tools that simultaneously consider all available data within the context of a biological interaction network. Considering data in the context of biological networks enables the objective, quantitative identification of critical biological interactions, providing direction for future research and hypothesis generation efforts. The development of advanced clustering approaches able to simultaneously consider gene and protein expression data in the context of biological networks holds the key to filling this void. CFDRC is addressing this challenge by developing and demonstrating a novel computational algorithm for the Integrative Cluster Analysis of Biological Interaction Networks (iCABIN). The iCABIN algorithm simultaneously utilizes microarray and proteomic data in conjunction with a biological interaction network to objectively identify and rank active sub-networks using a newly developed integrative clustering approach.
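The general idea of ranking active sub-networks from two data types can be illustrated with a deliberately simple sketch. This is not the iCABIN algorithm, whose details are not given here. It assumes that each gene or protein has a z-score per data type, combines them by averaging absolute values, keeps nodes above a threshold, and ranks the connected groups of those nodes by summed score. The function name, the scoring rule, and the threshold value are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_active_subnetworks(edges, mrna_z, protein_z, threshold=1.5):
    """Rank connected groups of 'active' nodes by summed combined score.

    Illustrative only: the scoring rule (mean of absolute z-scores) and
    the threshold are assumptions, not the iCABIN method.
    """
    # Combine whichever data types are available for each node.
    combined = {}
    for node in set(mrna_z) | set(protein_z):
        scores = [abs(d[node]) for d in (mrna_z, protein_z) if node in d]
        combined[node] = sum(scores) / len(scores)

    active = {n for n, s in combined.items() if s >= threshold}

    # Keep only interactions whose two endpoints are both active.
    adjacency = defaultdict(set)
    for a, b in edges:
        if a in active and b in active:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Depth-first search to collect connected components of active nodes.
    seen, subnetworks = set(), []
    for start in sorted(active):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        score = sum(combined[n] for n in component)
        subnetworks.append((sorted(component), score))

    # Highest-scoring sub-network first.
    return sorted(subnetworks, key=lambda item: item[1], reverse=True)
```

Called with a list of interaction pairs and two dictionaries mapping node names to z-scores, the function returns a list of `(nodes, score)` tuples. A node measured in only one data type is scored from that type alone, which is one of several defensible choices for handling missing data.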
Development of Laboratory Information Management Systems (LIMS) to Support Natural Product Extract Research
|
Illustration of the proposed effort to consider data in the context of a biological network. |
CFDRC is developing laboratory information management system (LIMS) software to support our natural product extract research. The software is called the Natural Extract Lead compound Identification (NELI) data management and analysis software. The database component of the NELI platform will enable the simultaneous storage of MS, HPLC, UV-vis, and bioactivity assay data. The relational database management system (RDBMS) of choice will be Oracle, as it has a proven track record in industry for warehousing large databases with high transaction volumes. The database will support links to standard chemical libraries (e.g., Chapman & Hall Chemical Database) to enable rapid dereplication. The database design will be user-friendly and extensible, so that it serves as a useful tool for the biological researcher. The bioinformatics component of our natural product drug discovery program will require a metabolomics/bioactivity analysis suite. The suite will provide a Java interface for displaying and manipulating metabolite expression data sets. Using this interface, we will be able to display multiple chromatograms (UV, ESI+/-) overlaid with bioactivity. State-of-the-art metabolite identification methods (specifically, principal component analysis) will be implemented, with the goal of identifying high-quality potential lead compounds that meet specified resolution criteria. |
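As an illustration of the principal component analysis step, the sketch below extracts the first principal component from a small matrix in which each row is a sample (for example, an extract fraction) and each column is a peak intensity. It is not the NELI implementation: the function name is hypothetical, the code is written in Python rather than Java for brevity, and power iteration is used only to keep the example free of external libraries. A production tool would more likely rely on an established numerical library.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows: list of samples, each a list of peak intensities (equal length).
    Power iteration assumes the start vector is not orthogonal to the
    leading component; that holds for typical data but is not guaranteed.
    """
    n, p = len(rows), len(rows[0])

    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]

    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]

    # Power iteration: repeatedly multiply by cov and renormalize.
    vec = [1.0 + j for j in range(p)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # no variance along the current direction
            break
        vec = [x / norm for x in nxt]

    # Fix the sign so the largest-magnitude loading is positive.
    if max(vec, key=abs) < 0:
        vec = [-x for x in vec]

    # Project each centered sample onto the component.
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centered]
    return vec, scores
```

The returned scores give each sample's position along the direction of greatest variance. Comparing those scores with bioactivity measurements is one way the loadings could point to peaks worth following up, although how NELI will combine the two is not specified here.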
|