
We have used a supervised classification approach to systematically mine a large microarray database derived from livers of compound-treated rats. Thirty-four distinct signatures (classifiers) for pharmacological and toxicological end points can be identified. Just 200 genes are sufficient to classify these end points. Signatures are enriched in xenobiotic and immune response genes and contain un-annotated genes, indicating that not all key genes in the liver xenobiotic responses have been characterized. Many signatures with equal classification capabilities but with no gene in common can be derived for the same phenotypic end point. The analysis of the union of all genes present in these signatures can reveal the underlying biology of that end point, as illustrated here using liver fibrosis signatures. Our approach, using the whole genome and a diverse set of compounds, allows a comprehensive view of most pharmacological and toxicological questions and is applicable to other situations such as disease and development.
We have used a supervised classification approach (El Ghaoui et al, 2003; Natsoulis et al, 2005) to systematically mine a large microarray database derived from livers of compound-treated rats (Ganter et al, 2005). More than 5000 rats were treated with 344 compounds in multiple doses, for multiple time points and in biological triplicate. Extensive blood chemistry and histopathology were performed in parallel on the same animals. In total, 34 distinct gene expression-based signatures (classifiers) for pharmacological and toxicological end points were identified. While it is intuitively obvious that ‘more data’ is better, we show that our signatures have a better overall classification performance than many diagnostic tests in widespread use such as prostate-specific antigen, pap smear, Ames test and others. Moreover, deriving our signatures from such a large database ensures that the cross-validated classification performance reported here is more predictive of the forward validation results obtained on future data sets. Analysis of the genes present in the 34 unique signatures reveals that some genes contribute disproportionately to the overall classification potential (i.e. Those tend to be enriched in xenobiotic and acute phase response genes as well as un-annotated genes, indicating that not all key genes in the liver xenobiotic responses have been characterized. Just 200 genes are sufficient to classify all end points and can form the basis of a small toxicogenomic array.
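The gene-level analysis described above (taking the union of genes across signatures and asking which genes recur in many of them) can be illustrated with a minimal sketch. It assumes that each signature is represented simply as a set of gene identifiers. The signature labels, gene names and helper functions below are hypothetical placeholders, not data or code from this study.

```python
from collections import Counter

# Hypothetical toy signatures: label -> set of gene identifiers.
# These names are placeholders, not genes or signatures from the study.
signatures = {
    "sig_A": {"gene1", "gene2", "gene3"},
    "sig_B": {"gene2", "gene4"},
    "sig_C": {"gene2", "gene3", "gene5"},
}

def gene_union(sigs):
    """Return the set of all genes present in at least one signature."""
    return set().union(*sigs.values())

def gene_frequency(sigs):
    """Count, for each gene, the number of signatures that contain it."""
    return Counter(gene for genes in sigs.values() for gene in genes)

union = gene_union(signatures)
freq = gene_frequency(signatures)

# Genes that recur across many signatures are candidates for a small array.
top_gene, top_count = freq.most_common(1)[0]
print(len(union), top_gene, top_count)
```

In this toy setting, a gene that appears in every signature would rank first. In a real analysis the counts would presumably be weighted by each gene's contribution to the classifier, which this sketch does not attempt.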
