March 12, 2006
This is a secondary analysis of Affymetrix microarray data on the transition from normal intestinal epithelia to adenomas and carcinomas in the APC-Min/+ mouse. Results of the original study were published in [1]. This supplementary study uses different statistical methods, which may yield additional information. It does not, however, materially alter any of the conclusions in [1].
The reader should consult [1] for a discussion of the experimental protocol. The analysis here consists of a quality assessment of the raw microarray data (the .cel files), calculation of expression intensities with gcrma, and differential expression analysis. Once sets of genes differentially expressed in selected pairs of tissue types have been identified, the affected pathways can be examined through KEGG. Bioconductor packages were used throughout this study.
The quality of the data in the .cel files was assessed using the following parameters: scale factor, average background, percent present, and 3'/5' ratios. These were obtained with the simpleaffy package. In addition, the affyPLM package was used to produce RLE (relative log expression) and NUSE (normalized unscaled standard error) plots of the arrays. The results are found in the Quality Parameter Table (.xls), RLE plot, and NUSE plot.
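As a rough illustration of what an RLE plot summarizes, the sketch below computes relative log expression values in plain Python: each probe set's log2 expression on an array minus the median for that probe set across all arrays. The function name and the toy values are hypothetical, and affyPLM derives its values from a fitted probe-level model, so this is a simplified stand-in rather than the package's own computation.

```python
from statistics import median

def relative_log_expression(log_expr):
    """log_expr maps array name -> list of log2 expression values,
    one per probe set, in the same probe-set order for every array."""
    arrays = list(log_expr)
    n_probesets = len(log_expr[arrays[0]])
    # Median of each probe set across all arrays.
    probe_medians = [
        median(log_expr[a][i] for a in arrays) for i in range(n_probesets)
    ]
    # Deviation of every array from that per-probe-set median.
    return {
        a: [log_expr[a][i] - probe_medians[i] for i in range(n_probesets)]
        for a in arrays
    }

# Toy values only (not from the study): three arrays, three probe sets.
toy = {
    "array1": [7.0, 9.0, 5.0],
    "array2": [7.2, 9.1, 5.1],
    "array3": [8.5, 10.4, 6.3],  # shifted upward relative to the others
}
rle = relative_log_expression(toy)
# Per-array median of the RLE values; a well-behaved array sits near 0.
array_medians = {a: median(v) for a, v in rle.items()}
```

In this toy input the third array would stand out because its median RLE is well above zero, which is the kind of pattern the RLE plot is meant to expose.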
Conclusions from the quality assessment and summary results of the differential analysis can be found in APC Report2. In a nutshell, it is recommended to exclude one array of the normal tissue from the APC-Min/+ mice. Moreover, the scale factors of several of the arrays from the wild type mice are marginal. Combined with the RLE and NUSE plots, this makes exclusion of at least two of these arrays seem prudent. However, this must be balanced against the need for a sufficient number of replicates to judge differential expression.
The analysis proceeded in two parts: first the adenoma, carcinoma and normal samples were analyzed together; then, independently, the four best wild type arrays were included. These two sets of .cel files were normalized separately with gcrma.
Sets of genes differentially expressed for selected pairs of tissue types were obtained using the algorithm developed in [2] and implemented in the Bioconductor package limma. In this report A = adenoma, C = carcinoma, N = normal and W = wild type. The pair C-A denotes a test for differential expression between carcinoma and adenoma; a positive value of M (in the report .xls file) means higher expression in carcinoma. These tests produce the following numbers of differentially expressed genes.
Pairs | C-N | A-N | C-A | C-W | A-W | N-W |
# of Genes | 144 | 0 | 2 | 533 | 146 | 0 |
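The modified t-test behind these counts can be sketched as follows. In [2] the variance of each gene is shrunk toward a prior value before the t-statistic is formed. The function name and the numbers below are hypothetical. In limma the prior degrees of freedom and prior variance are estimated from all genes on the arrays, which this sketch does not attempt; it simply takes them as given.

```python
import math

def moderated_t(mean_diff, s2, df, n1, n2, prior_s2, prior_df):
    """Moderated t-statistic for a two-group comparison.

    mean_diff: difference in group means on the log2 scale (M)
    s2, df:    gene-wise residual variance and its degrees of freedom
    n1, n2:    number of replicate arrays in each group
    prior_s2, prior_df: prior variance and degrees of freedom (assumed given)
    """
    # Shrink the gene-wise variance toward the prior variance.
    s2_post = (prior_df * prior_s2 + df * s2) / (prior_df + df)
    # Standard error of a difference between two group means.
    se = math.sqrt(s2_post * (1.0 / n1 + 1.0 / n2))
    return mean_diff / se

# Hypothetical gene measured on 6 and 5 replicate arrays.
t_mod = moderated_t(mean_diff=2.0, s2=0.04, df=9, n1=6, n2=5,
                    prior_s2=0.25, prior_df=4)
# With prior_df = 0 the formula reduces to the ordinary t-statistic.
t_ordinary = moderated_t(mean_diff=2.0, s2=0.04, df=9, n1=6, n2=5,
                         prior_s2=0.25, prior_df=0)
```

For a gene whose observed variance is unusually small, as in this made-up case, the shrinkage lowers the statistic relative to the ordinary t-test, which guards against calling genes significant on the basis of an underestimated variance.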
The lists of genes are found in DiffGenesTable (.xls). In these tables M is the difference in means for the relevant pair, A is the average of all the values (not adenoma, as in the pair labels), t is the value of the modified t-test (see [2]) and P.Value is as usual for a t-test. Since expression values are on a log2 scale, M = log2(fold change). Methods for estimating the power of microarray experiments suggest low confidence in the differential expression of a gene when M is < 2. We leave these genes in the table, however, for completeness. Confidence is highest in the C-N comparison, since there are 6 carcinoma replicates and 5 normal replicates and the quality of the .cel files is good.
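Because M is a log2 ratio, converting it back to a fold change is a single exponentiation; an M of 2, for example, corresponds to a four-fold change. The helper names below are hypothetical and are offered only as a small sketch of that conversion.

```python
def fold_change(m):
    """Convert M (a log2 ratio) to a fold change on the original scale."""
    return 2.0 ** m

def signed_fold_change(m):
    """Report lower expression as a negative fold change, a common convention."""
    fc = 2.0 ** abs(m)
    return fc if m >= 0 else -fc

for m in (1.0, 2.0, -1.0):
    print(m, fold_change(m), signed_fold_change(m))
```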
All but 10 of the genes differentially expressed between adenoma and wild type are also differentially expressed in carcinoma versus wild type. Thus, it appears that carcinogenesis passes through adenoma development, as concluded in [1]. The data suggest there are real differences between all of these tissue types. However, quantifying the differences for A-N, C-A and N-W would require more replicates. Preliminary results for C-A and N-W based simply on fold change are given in worksheets in DiffGenesTable (.xls).
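An overlap of this kind between two gene lists can be checked with ordinary set operations. The sketch below assumes the gene identifiers have already been read from the relevant worksheets; the identifiers shown are placeholders, not entries from DiffGenesTable.

```python
def compare_gene_lists(first, second):
    """Return (shared, only_in_first) for two collections of gene IDs."""
    first, second = set(first), set(second)
    return first & second, first - second

# Placeholder probe-set IDs, not taken from the study's tables.
a_w = {"probe_1", "probe_2", "probe_3", "probe_4"}
c_w = {"probe_2", "probe_3", "probe_4", "probe_5", "probe_6"}

shared, only_a_w = compare_gene_lists(a_w, c_w)
print(len(shared), "shared;", sorted(only_a_w), "only in A-W")
```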
KEGG can be used to identify the pathways involving the differentially expressed genes. Here are a few of the more important affected pathways.
C-N Wnt Signaling (.gif) (Orange means increased expression in C versus N, blue means lower expression)
C-N TGF Signaling
C-N ECM Receptor Interaction
A-W Wnt Signaling
A-W TGF Signaling
C-W Wnt Signaling
C-W TGF Signaling
C-W Coagulation Cascade
C-W Cell Cycle
Below are the files the reader needs to perform these queries. For each of the pairs C-N, C-W and A-W, download the corresponding color file to your computer. Connect to the web page Color Objects in KEGG Pathways; select the species in the "Search Against:" pull-down menu; click the "Choose File" button and select the color file you downloaded. All affected pathways can then be examined at once, along with the specific genes involved.
Color Files: C-N Color File, A-W Color File, C-W Color File. (Clicking on the link may open the file in the browser. Option-click (Mac) or right-click (Windows) and select "Download link" to save the file to your computer.)
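For readers who want to build a similar color file from their own gene list, the following sketch assigns orange to genes with positive M and blue to the rest, matching the color convention used for the pathway images above. The layout assumed here (one gene identifier and one color per line, separated by a tab) is an assumption about what the KEGG tool accepts and should be checked against the KEGG documentation and the color files provided; the identifiers and M values are placeholders.

```python
def color_file_lines(m_values, up="orange", down="blue"):
    """m_values maps a gene identifier to its M value (log2 ratio)."""
    lines = []
    for gene_id, m in sorted(m_values.items()):
        # Positive M means higher expression in the first tissue of the pair.
        color = up if m > 0 else down
        lines.append(f"{gene_id}\t{color}")
    return lines

# Placeholder identifiers and M values, for illustration only.
example = {"gene_b": -1.8, "gene_a": 2.4}
lines = color_file_lines(example)

# Text that could be saved to a file and uploaded through "Choose File".
color_file_text = "\n".join(lines) + "\n"
print(color_file_text, end="")
```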
[1] Paoni NF, Feldman MW, Gutierrez LS, Ploplis VA, Castellino FJ. Transcriptional profiling of the transition from normal intestinal epithelia to adenomas and carcinomas in the APCMin/+ mouse. Physiol Genomics. 2003 Nov 11;15(3):228-35.
[2] Smyth GK. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology. 2004;3(1).