Ator, which does not support it as a candidate diagnostic tool [25]. In addition, pathologists now face a growing workload due to the increasing demand for cancer screening biopsies, molecular testing for targeted therapy and the concomitant sub-specialization. Therefore, an alternative but still reliable method for identifying diseased or negative specimens could be of great importance. The automated evaluation of colon biopsy specimens by mRNA expression profiling could be a valid approach, since much of the methodology, preparation and analysis procedure is already available. Furthermore, mRNA expression analysis gives insight into altered cellular functions beyond the microscopic level. This information might be related to the biological behaviour of tumors and/or the expression of therapeutic targets, e.g. growth factor receptors. The expression of metastasis-related genes and of genes involved in tumor invasiveness may also be identified.

Biomarkers for Dysplasia-Carcinoma Transition

Table 4. Discriminant analysis results of the 11 classificatory transcripts. Classification of Normal, Adenoma and CRC samples (original and cross-validated; count and percentage) for the original sample set (n = 53 microarrays), the independent sample set (n = 94 microarrays) and the independent sample set (n = 68 RT-PCR reactions). doi:10.1371/journal.pone.0048547.t

Figure 2. ROC statistic results of the original microarray sample group (53 samples) (A), independent microarray sample group (94 samples) (D). The multiple logistic regression equations were applied to the different datasets. doi:10.1371/journal.pone.0048547.g

The set of 11 classifiers determined in our study showed considerably high discriminatory power on the microarray data files of previous studies in CRC vs. normal and in adenoma vs. normal comparisons. In silico results suggest that the identified transcript panel can be used as a set of general discriminative markers for colorectal cancer and polyps. Only datasets containing either CRC and normal or adenoma and normal biopsy samples, analysed on the Affymetrix HGU133 Plus 2.0 microarray system, can be downloaded from the Gene Expression Omnibus (GEO) database. To our knowledge, this is the first whole genomic oligonucleotide microarray study available in GEO that contains CRC, adenoma and normal biopsy samples together, which makes it suitable for identifying discriminatory transcripts even between early stage CRC and high-grade dysplastic adenoma tissues. The common pre-processing of the data files from different studies resulted in a clear separation not only of diseased and normal samples, but of adenoma and CRC samples as well. However, datasets from different studies are difficult to handle together, as differences in sample preparation can distort the results; this can cause overestimation of the efficacy of adenoma and CRC discrimination. Among the 11 discriminatory trans.