BMC bioinformatics [electronic resource]. 7 (1), 3 (06 Jan 2006)
BACKGROUND: Selection of relevant genes for sample classification is a common task in most gene expression studies, where researchers try to identify the smallest possible set of genes that can still achieve good predictive performance (for instance, for future use with diagnostic purposes in clinical practice). Many gene selection approaches use univariate (gene-by-gene) rankings of gene relevance and arbitrary thresholds to select the number of genes, can only be applied to two-class problems, and use gene selection ranking criteria unrelated to the classification algorithm. In contrast, random forest is a classification algorithm well suited for microarray data: it shows excellent performance even when most predictive variables are noise, can be used when the number of variables is much larger than the number of observations and in problems involving more than two classes, and returns measures of variable importance. Thus, it is important to understand the performance of random forest with microarray data and its possible use for gene selection. RESULTS: We investigate the use of random forest for classification of microarray data (including multi-class problems) and propose a new method of gene selection in classification problems based on random forest. Using simulated and nine microarray data sets we show that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy. CONCLUSION: Because of its performance and features, random forest and gene selection using random forest should probably become part of the "standard tool-box" of methods for class prediction and gene selection with microarray data.
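The abstract above does not spell out the selection procedure, so the following is only a minimal sketch of one way variable importances from a random forest could drive gene selection. It assumes scikit-learn is available; the function name `select_genes`, the 20% drop fraction, the use of out-of-bag (OOB) error as the stopping criterion, and the synthetic data are all illustrative assumptions, not the authors' published method or settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def select_genes(X, y, drop_fraction=0.2, min_genes=2, n_trees=500, seed=0):
    """Backward elimination guided by random forest importances (illustrative).

    Repeatedly fits a forest, records its out-of-bag error, and discards the
    least important fraction of the remaining genes. Returns the smallest
    gene set whose OOB error was no worse than any larger set seen so far.
    """
    kept = np.arange(X.shape[1])  # column indices of genes still in play
    best_err, best_set = np.inf, kept
    while True:
        forest = RandomForestClassifier(
            n_estimators=n_trees, oob_score=True, random_state=seed
        ).fit(X[:, kept], y)
        oob_err = 1.0 - forest.oob_score_
        if oob_err <= best_err:  # ties favour the smaller, later set
            best_err, best_set = oob_err, kept
        if len(kept) <= min_genes:
            break
        n_keep = max(min_genes, int(len(kept) * (1 - drop_fraction)))
        order = np.argsort(forest.feature_importances_)[::-1]  # most important first
        kept = np.sort(kept[order[:n_keep]])
    return best_set, best_err


# Synthetic stand-in for a microarray: few samples, many genes, three classes.
X, y = make_classification(
    n_samples=60, n_features=200, n_informative=5, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
genes, err = select_genes(X, y, n_trees=200)
print(len(genes), "genes kept, OOB error", round(err, 3))
```

Because the OOB error is reused to pick the gene set, it is an optimistic estimate of predictive accuracy; an honest error rate would need an outer resampling loop (for example a bootstrap or cross-validation) around the whole selection.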
The journal of histochemistry and cytochemistry : official journal of the Histochemistry Society 51 (5), 575-84 (May 2003)
The increased use of immunohistochemistry (IHC) in both clinical and basic research settings has led to the development of techniques for acquiring quantitative information from immunostains. Staining correlates with absolute protein levels and has been investigated as a clinical tool for patient diagnosis and prognosis. For these reasons, automated imaging methods have been developed in an attempt to standardize IHC analysis. We propose a novel imaging technique in which brightfield images of diaminobenzidine (DAB)-labeled antigens are converted to normalized blue images, allowing automated identification of positively stained tissue. A statistical analysis compared our method with seven previously published imaging techniques by measuring each one's agreement with manual analysis by two observers. Eighteen DAB-stained images showing a range of protein levels were used. Accuracy was assessed by calculating the percentage of pixels misclassified using each technique compared with a manual standard. Bland-Altman analysis was then used to show the extent to which misclassification affected staining quantification. Many of the techniques were inconsistent in classifying DAB staining due to background interference, but our method was statistically the most accurate and consistent across all staining levels.
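The abstract above names three computations without giving formulas: a normalized blue conversion, a percentage of misclassified pixels, and a Bland-Altman comparison. The sketch below shows one plausible reading of each using only the Python standard library. The definition B / (R + G + B), the threshold of 0.28, the function names, and all pixel and percentage values are assumptions chosen for illustration; none of them come from the paper.

```python
from statistics import mean, stdev


def normalized_blue(pixel):
    """Blue channel as a fraction of total intensity (assumed definition)."""
    r, g, b = pixel
    total = r + g + b
    return b / total if total else 1 / 3  # treat pure black as neutral


def segment(image, threshold=0.28):
    """Mark a pixel as DAB-positive when its normalized blue is low.

    Brown DAB deposits contain little blue, so they fall below the
    threshold. The threshold value itself is hypothetical.
    """
    return [[normalized_blue(p) < threshold for p in row] for row in image]


def percent_misclassified(auto_mask, manual_mask):
    """Percentage of pixels where the automated and manual masks disagree."""
    pairs = [(a, m) for ra, rm in zip(auto_mask, manual_mask) for a, m in zip(ra, rm)]
    return 100.0 * sum(a != m for a, m in pairs) / len(pairs)


def bland_altman(auto_values, manual_values):
    """Return (bias, lower limit, upper limit) of agreement, using 1.96 SD."""
    diffs = [a - m for a, m in zip(auto_values, manual_values)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Toy 2x3 RGB image: brownish, white, and bluish pixels (made-up values).
image = [
    [(120, 80, 50), (240, 240, 240), (80, 90, 160)],
    [(100, 70, 40), (230, 225, 220), (130, 90, 60)],
]
manual = [[True, False, False], [True, False, False]]  # made-up manual standard

mask = segment(image)
print(percent_misclassified(mask, manual))

# Made-up percent-stained-area readings for three images.
bias, lower, upper = bland_altman([10.0, 25.0, 40.0], [12.0, 24.0, 37.0])
print(bias, lower, upper)
```

In the toy image the automated mask disagrees with the manual one on a single pixel out of six. The Bland-Altman limits show how far an automated staining measurement could be expected to deviate from the manual one, which is the sense in which misclassification "affects quantification".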