<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1423-0127-16-25</ui>
   <ji>1423-0127</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>An integrated method for cancer classification and rule extraction from microarray data</p>
         </title>
         <aug>
            <au ca="yes" id="A1">
               <snm>Huang</snm>
               <fnm>Liang-Tsung</fnm>
               <insr iid="I1"/>
               <email>larry@mdu.edu.tw</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Computer Science and Information Engineering, Mingdao University, Changhua 523, Taiwan</p>
            </ins>
         </insg>
         <source>Journal of Biomedical Science</source>
         <issn>1423-0127</issn>
         <pubdate>2009</pubdate>
         <volume>16</volume>
         <issue>1</issue>
         <fpage>25</fpage>
         <url>http://www.jbiomedsci.com/content/16/1/25</url>
         <xrefbib>
            
         <pubidlist><pubid idtype="pmpid">19272192</pubid><pubid idtype="doi">10.1186/1423-0127-16-25</pubid></pubidlist></xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>08</day>
               <month>9</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>24</day>
               <month>2</month>
               <year>2009</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>24</day>
               <month>2</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>Huang; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>Microarray techniques have recently been used successfully to extract information for cancer diagnosis at the gene expression level, owing to their ability to measure thousands of gene expression levels in a massively parallel way. One important issue is improving classification performance on microarray data. Ideally, however, influential genes and even interpretable rules would be identified at the same time to offer biological insight.</p>
            <p>Drawing on concepts of system design from software engineering, this paper presents an integrated and effective method (named X-AI) for accurate cancer classification and knowledge acquisition from DNA microarray data. The method includes a feature selector that systematically extracts the relatively important genes so as to reduce the dimension while retaining as much of the class discriminatory information as possible. Next, diagonal quadratic discriminant analysis (DQDA) is used to classify tumors, and generalized rule induction (GRI) is integrated to establish association rules that give an understanding of the relationships between cancer classes and related genes.</p>
            <p>Two non-redundant datasets of acute leukemia were used to validate the proposed X-AI, which showed high accuracy in discriminating the different classes. In addition, I demonstrate the ability of X-AI to extract relevant genes and to develop interpretable rules. Further, a web server has been established for cancer classification and it is freely available at <url>http://bioinformatics.myweb.hinet.net/xai.htm</url>.</p>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The challenge of cancer treatment is to develop specific therapies based on distinct tumor types, to maximize efficacy and minimize toxicity. Hence, improvements in cancer classification have received increasing attention. Recently, microarray gene expression data have been used successfully to extract information for cancer classification at the gene expression level. One of the earliest methods for cancer classification is the weighted voting machine, which is based on a linear model <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Other methods include hierarchical clustering <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, machine learning <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>, compound covariate <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, shrunken centroids <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, partial least squares <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, principal component analysis disjoint models <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, factor mixture models <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, and consensus analysis of multiple classifiers using non-repetitive variables <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. On the whole, these methods concentrate mostly on improving accuracy rather than on other issues.</p>
         <p>In addition to classification, another challenge is to extract relevant genes, and even credible and interpretable rules, from microarray gene expression data to offer biological insight into the relationships between genes. Several kinds of rules have been successfully developed in different subjects of molecular biology. In our earlier studies, decision rules based on decision tree algorithms have been effectively extracted from the thermodynamic database of proteins and mutants to explore potential knowledge of protein stability prediction <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. Association rule techniques, on the other hand, can reveal relevant associations between different items. Borgelt and Berthold <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> presented an algorithm to find fragments in a set of molecules that help to discriminate between different classes of activity in a drug discovery context. Oyama et al. <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> proposed a data mining method to discover association rules related to protein-protein interactions. Moreover, association rules which demonstrate diverse mutations and chemical treatments have been reported from 300 gene expression profiles of yeast <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Carmona-Saez et al. <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> have offered an approach which integrates gene annotations and expression data to discover intrinsic associations.</p>
         <p>Typically, a classification system may achieve high accuracy with non-linear models, but it is hard to derive rules from such models. In contrast, a rule extraction system must take model interpretability into account, which provides a pathway to explore underlying relationships among the data; however, this restriction often affects the classification performance of the system. Hence, a learning model that provides accurate classification as well as useful rules would be ideal. Even so, relatively few attempts have been made to integrate the two types of systems on microarray gene expression data. In earlier reports, Li et al. <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> proposed a classifier named PCL (prediction by collective likelihoods), which is based on the concept of emerging patterns and can provide rules describing the microarray gene expression data. Tan et al. <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> introduced a classifier named TSP (top scoring pair), which is based on relative expression reversals and can generate accurate decision rules. These studies also revealed a trade-off between credibility and comprehensibility in such hybrid systems. For that reason, I have attempted to design an integrated and effective framework with less interaction between the cancer classification and rule extraction functions.</p>
         <p>In this paper, I have presented an integrated method (named X-AI) which is based on a three-tiered architecture from the viewpoint of system design of software engineering. Different tests have been carried out on two leukemia datasets for evaluating the performance of X-AI. The obtained results indicated that X-AI is able to perform well on both functions of classification and rule extraction in microarray analysis.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Datasets and pre-processing</p>
            </st>
            <p>I used two different leukemia datasets for the following reasons: (i) both datasets have been widely analyzed and discussed in the literature, which makes it possible to compare against published results; (ii) the rules extracted from datasets of a similar cancer type can be compared with each other; (iii) the robustness of the classification system can be observed on datasets obtained from different experiments; and (iv) the two datasets represent a binary and a multi-class classification problem, respectively, which is useful for evaluating the effectiveness of the proposed method on different classification problems.</p>
            <p>The first acute leukemia dataset (named L1) of Golub et al. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> is composed of 72 samples from two different types of acute leukemia, acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML). The training set has 38 bone marrow samples (27 ALL and 11 AML) and the test set consists of 24 bone marrow and 10 peripheral blood samples (20 ALL and 14 AML). Bone marrow mononuclear cells were collected by Ficoll sedimentation in the training set and RNA was hybridized to Affymetrix oligonucleotide microarrays, so that each sample has expression measurements for 7129 probes. The second acute leukemia dataset (named L2) of Armstrong et al. <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> includes 12582 gene expression values for 72 peripheral blood or bone marrow samples. The training set contains 57 leukemia samples (20 ALL, 17 MLL (mixed lineage leukemia) and 20 AML) and the test set contains 15 samples (4 ALL, 3 MLL and 8 AML). For microarray data, pre-processing is of critical importance for downstream analyses. In order to equalize expression values across samples and avoid bias between samples, all values in a sample were re-scaled by a multiplicative factor determined by linear regression of genes with present calls. All multiplicative factors are available on the established web server. Dudoit et al. <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> applied thresholding, filtering and logarithmic transformation steps before analyzing the leukemia dataset. Accordingly, the expression values were limited by both upper and lower bounds.
Since information leakage effects are easily neglected during pre-processing of proteomic profiling data from mass spectrometry as well as of microarray expression data <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, the upper bound was raised to 24000 and the lower bound set to -800, which can increase the chances of finding relevant genes because of the larger search space. Further, I used a feature selection function instead of a simple filter to systematically reduce the number of genes. The mechanism is described in the following section.</p>
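            <p>The re-scaling and bounding steps described above can be sketched as follows. This is only an illustration in plain Python, not the code used in this study: the function name rescale_and_threshold and the example factors are hypothetical, applying the bounds after re-scaling is an assumption, and only the bounds of -800 and 24000 are taken from the text.</p>

```python
LOWER_BOUND = -800.0   # lower bound stated in the text
UPPER_BOUND = 24000.0  # upper bound stated in the text

def rescale_and_threshold(sample, factor):
    """Rescale one sample's expression values, then clip them to the bounds.

    sample : list of raw expression values for a single sample
    factor : multiplicative factor for that sample (hypothetical value here;
             the text derives it by linear regression of genes with present calls)
    """
    return [min(max(value * factor, LOWER_BOUND), UPPER_BOUND) for value in sample]

# Values outside the bounds are clipped; values inside are only rescaled.
print(rescale_and_threshold([-1000.0, 500.0, 30000.0], 1.0))
print(rescale_and_threshold([500.0, 10000.0, 15000.0], 2.0))
```

            <p>In this sketch, a value is first multiplied by the factor of its sample and only then compared with the bounds, so that the bounds apply on the common scale.</p>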
            <p>More details of the datasets can be found on the web server and at the Broad Institute <url>http://www.broad.mit.edu</url>, which evolved from research collaborations in the MIT and Harvard communities and has made the generated data available to the scientific community.</p>
         </sec>
         <sec>
            <st>
               <p>X-AI Method</p>
            </st>
            <p>From the viewpoint of system design in software engineering, Yourdon and Constantine <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> made a major contribution to the development of structured design methods by defining a series of criteria that can be used in separating systems into appropriate modules. The goal of design is modules with tight cohesion and loose coupling. Tight cohesion means that a module should capture one abstraction, while loose coupling means that modules should have little dependency on each other. Following these concepts, I adopted a three-tiered architecture (see Figure <figr fid="F1">1</figr>) for the integrated system, in which each layer includes one or more specific functions: (i) The data management layer comprises the functions required at all stages of data pre-processing in microarray analysis. This is consistent with the report of Tinker et al. <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, which describes data management as necessary for pre-processing, an important part of microarray experimentation. (ii) The data reduction layer corresponds to the feature selection function, which mainly reflects the fact that not all genes measured on a microarray are relevant to a particular cancer; moreover, data reduction also helps to reduce computational complexity. (iii) The data mining layer provides the functions for different kinds of analysis, and is partitioned here into the two functions of classification and rule extraction. The two functions, built on the same lower layer, are loosely coupled and each delivers a coherent group of services, conforming to the design principle mentioned above.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>A three-tiered architecture applied to microarray gene expression data to integrate the tasks of data analysis from the pre-processing to the data mining</p>
               </caption>
               <text>
                  <p><b>A three-tiered architecture applied to microarray gene expression data to integrate the tasks of data analysis from the pre-processing to the data mining</b>.</p>
               </text>
               <graphic file="1423-0127-16-25-1"/>
            </fig>
            <p>The three-tiered architecture integrates the tasks of microarray data analysis from pre-processing to data mining, including classification and rule extraction. Because each functional layer is independent, it can be changed internally without affecting the other layers. This architecture therefore provides consistent data to the different components of the same layer, and reduces the interaction between layers as well as between the components of the same layer.</p>
            <p>The proposed X-AI method primarily implements the data mining and data reduction layers of the architecture, and integrates three functions: (i) feature selection, (ii) cancer classification, and (iii) association rule development (see Figure <figr fid="F2">2</figr>). Although there are many algorithms for these functions, I included three common algorithms so as to observe how well the integrated architecture can perform. Nevertheless, these algorithms can optionally be replaced with others that fulfil the same functions. Here, the Chi2 algorithm serves as the selector that systematically extracts the relatively important genes so as to reduce the dimension while retaining as much of the class discriminatory information as possible. This selector also provides consistent data to the other functions, whose input data flows come from the output data flows of the selector. Subsequently, diagonal quadratic discriminant analysis (DQDA) is used to discriminate tumor classes, and generalized rule induction (GRI) is integrated to establish association rules that give an understanding of the relationship between cancer classes and influential genes. In addition, the outcomes obtained from the three functions of selection, classification and rule development can be cross-referenced. For example, an accurate classification indicates that the selected features are effective, which generally makes the developed rules more reliable.</p>
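            <p>The loose coupling between the selector and the two data mining functions can be sketched in a few lines of Python. This is an illustrative sketch only, not the X-AI implementation: the function names (run_pipeline, select_top_gene, train_nearest_mean, extract_range_rules) and the toy data are hypothetical, and the three placeholder components are simple stand-ins rather than Chi2, DQDA and GRI. The point is that the classifier and the rule extractor both consume the output of the selector, and each can be replaced without touching the other.</p>

```python
def mean(values):
    return sum(values) / len(values)

def run_pipeline(samples, labels, select, train_classifier, extract_rules):
    """Data reduction first, then two independent data mining functions."""
    genes = select(samples, labels)                        # data reduction layer
    reduced = [{g: s[g] for g in genes} for s in samples]  # shared, consistent input
    classifier = train_classifier(reduced, labels)         # data mining: classification
    rules = extract_rules(reduced, labels)                 # data mining: rule extraction
    return genes, classifier, rules

def select_top_gene(samples, labels):
    """Placeholder selector (not Chi2): keep the gene whose class means differ most."""
    classes = sorted(set(labels))
    def spread(gene):
        means = [mean([s[gene] for s, y in zip(samples, labels) if y == c])
                 for c in classes]
        return max(means) - min(means)
    return [max(samples[0], key=spread)]

def train_nearest_mean(samples, labels):
    """Placeholder classifier (not DQDA): pick the class with the closest mean."""
    classes = sorted(set(labels))
    genes = list(samples[0])
    centroids = {c: {g: mean([s[g] for s, y in zip(samples, labels) if y == c])
                     for g in genes}
                 for c in classes}
    def predict(sample):
        return min(classes,
                   key=lambda c: sum((sample[g] - centroids[c][g]) ** 2 for g in genes))
    return predict

def extract_range_rules(samples, labels):
    """Placeholder rule extractor (not GRI): report each gene's range per class."""
    rules = []
    for c in sorted(set(labels)):
        for g in samples[0]:
            values = [s[g] for s, y in zip(samples, labels) if y == c]
            rules.append(f"IF {g} in [{min(values)}, {max(values)}] THEN {c}")
    return rules

# Toy data: two hypothetical genes, four samples.
samples = [{"g1": 1.0, "g2": 5.0}, {"g1": 2.0, "g2": 5.5},
           {"g1": 9.0, "g2": 5.2}, {"g1": 8.0, "g2": 4.8}]
labels = ["ALL", "ALL", "AML", "AML"]

genes, classifier, rules = run_pipeline(
    samples, labels, select_top_gene, train_nearest_mean, extract_range_rules)
print(genes)
print(classifier({"g1": 7.5}))
print(rules)
```

            <p>Swapping any of the three arguments passed to run_pipeline changes one function without affecting the other two, which is the design property the architecture aims for.</p>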
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The X-AI framework with dataflow for cancer classification and knowledge acquisition from DNA microarray data</p>
               </caption>
               <text>
                  <p><b>The X-AI framework with dataflow for cancer classification and knowledge acquisition from DNA microarray data</b>.</p>
               </text>
               <graphic file="1423-0127-16-25-2"/>
            </fig>
            <sec>
               <st>
                  <p>Chi2 algorithm</p>
               </st>
               <p>The Chi2 algorithm <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> discretizes numeric features and selects relevant features according to the chi-squared statistic with respect to the class. The chi-squared value of an attribute is calculated by the following equation,</p>
               <p>
                  <display-formula id="M1">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1423-0127-16-25-i1">
                        <m:semantics>
                           <m:mrow>
                              <m:msup>
                                 <m:mi>&#967;</m:mi>
                                 <m:mn>2</m:mn>
                              </m:msup>
                              <m:mo>=</m:mo>
                              <m:mstyle displaystyle="true">
                                 <m:munderover>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:mi>i</m:mi>
                                       <m:mo>=</m:mo>
                                       <m:mn>1</m:mn>
                                    </m:mrow>
                                    <m:mn>2</m:mn>
                                 </m:munderover>
                                 <m:mrow>
                                    <m:mstyle displaystyle="true">
                                       <m:munderover>
                                          <m:mo>&#8721;</m:mo>
                                          <m:mrow>
                                             <m:mi>j</m:mi>
                                             <m:mo>=</m:mo>
                                             <m:mn>1</m:mn>
                                          </m:mrow>
                                          <m:mtext>k</m:mtext>
                                       </m:munderover>
                                       <m:mrow>
                                          <m:mfrac>
                                             <m:mrow>
                                                <m:msup>
                                                   <m:mrow>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:msub>
                                                         <m:mi>A</m:mi>
                                                         <m:mrow>
                                                            <m:mi>i</m:mi>
                                                            <m:mi>j</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                      <m:mo>&#8722;</m:mo>
                                                      <m:msub>
                                                         <m:mi>E</m:mi>
                                                         <m:mrow>
                                                            <m:mi>i</m:mi>
                                                            <m:mi>j</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                      <m:mo stretchy="false">)</m:mo>
                                                   </m:mrow>
                                                   <m:mn>2</m:mn>
                                                </m:msup>
                                             </m:mrow>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>E</m:mi>
                                                   <m:mrow>
                                                      <m:mi>i</m:mi>
                                                      <m:mi>j</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mfrac>
                                       </m:mrow>
                                    </m:mstyle>
                                 </m:mrow>
                              </m:mstyle>
                              <m:mo>,</m:mo>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aqatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeq4Xdm2aaWbaaSqabeaacqaIYaGmaaGccqGH9aqpdaaeWbqaamaaqahajuaGbaWaaSaaaeaacqGGOaakcqWGbbqqdaWgaaqaaiabdMgaPjabdQgaQbqabaGaeyOeI0Iaemyrau0aaSbaaeaacqWGPbqAcqWGQbGAaeqaaiabcMcaPmaaCaaabeqaaiabikdaYaaaaeaacqWGfbqrdaWgaaqaaiabdMgaPjabdQgaQbqabaaaaaWcbaGaemOAaOMaeyypa0JaeGymaedabaGaee4AaSganiabggHiLdaaleaacqWGPbqAcqGH9aqpcqaIXaqmaeaacqaIYaGma0GaeyyeIuoakiabcYcaSaaa@4E78@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>where k is the number of classes and <it>A</it><sub><it>ij </it></sub>the number of samples of the <it>j</it>-th class in the <it>i</it>-th interval. <it>E</it><sub><it>ij </it></sub>means the expected frequency of <it>A</it><sub><it>ij</it></sub>, which is calculated by</p>
               <p>
                  <display-formula id="M2">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1423-0127-16-25-i2">
                        <m:semantics>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>E</m:mi>
                                 <m:mrow>
                                    <m:mi>i</m:mi>
                                    <m:mi>j</m:mi>
                                 </m:mrow>
                              </m:msub>
                              <m:mo>=</m:mo>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>R</m:mi>
                                       <m:mi>i</m:mi>
                                    </m:msub>
                                    <m:mo>*</m:mo>
                                    <m:msub>
                                       <m:mi>C</m:mi>
                                       <m:mi>j</m:mi>
                                    </m:msub>
                                 </m:mrow>
                                 <m:mi>n</m:mi>
                              </m:mfrac>
                              <m:mo>,</m:mo>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aqatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyrau0aaSbaaSqaaiabdMgaPjabdQgaQbqabaGccqGH9aqpjuaGdaWcaaqaaiabdkfasnaaBaaabaGaemyAaKgabeaacqGGQaGkcqWGdbWqdaWgaaqaaiabdQgaQbqabaaabaGaemOBa4gaaOGaeiilaWcaaa@3A2A@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>where <it>R</it><sub><it>i </it></sub>is the number of samples in the <it>i</it>-th interval, <it>C</it><sub><it>j </it></sub>the number of samples in the <it>j</it>-th class, and <it>n </it>the total number of samples. The algorithm consists mainly of two phases, Phase I and Phase II. Phase I calculates the chi-squared value for adjacent intervals and merges adjacent intervals under a chi-squared threshold, which is decremented until an inconsistency rate of the data is exceeded. Phase II refines the process of Phase I for each feature and evaluates the degree of merging, which reveals how relevant the feature is to the data. For example, a feature is regarded as irrelevant to the data if it is merged to only one value at the end of Phase II.</p>
               <p>In this work, I applied the algorithm to two different datasets to analyze the relative importance of genes for discriminating tumor classes. It was carried out chiefly with a suite of free open-source software <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, which provides numerous machine learning algorithms from various learning paradigms.</p>
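               <p>The chi-squared statistic for a pair of adjacent intervals, as defined by the two equations above, can be computed as in the following sketch. The function names chi_squared and should_merge are hypothetical, and skipping cells whose expected frequency is zero is a simplifying assumption of this sketch rather than something stated in the text; the full two-phase procedure with its inconsistency check is not reproduced here.</p>

```python
def chi_squared(interval_a, interval_b):
    """Chi-squared statistic for two adjacent intervals.

    Each argument lists the per-class sample counts A_ij of one interval,
    so both lists must have length k (the number of classes).
    """
    rows = [interval_a, interval_b]
    k = len(interval_a)
    n = sum(interval_a) + sum(interval_b)                     # total samples
    row_totals = [sum(r) for r in rows]                       # R_i
    col_totals = [rows[0][j] + rows[1][j] for j in range(k)]  # C_j
    chi2 = 0.0
    for i in range(2):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n      # E_ij
            if expected == 0:
                continue  # assumption: ignore cells with zero expected frequency
            chi2 += (rows[i][j] - expected) ** 2 / expected
    return chi2

def should_merge(interval_a, interval_b, threshold):
    """Adjacent intervals are merge candidates when their statistic is under the threshold."""
    return threshold > chi_squared(interval_a, interval_b)

# Two intervals with perfectly separated classes give a large statistic,
# while identical class distributions give zero.
print(chi_squared([4, 0], [0, 4]))
print(chi_squared([2, 2], [2, 2]))
```

               <p>A low value indicates that the class distribution is similar in both intervals, so merging them loses little class discriminatory information.</p>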
            </sec>
            <sec>
               <st>
                  <p>Diagonal quadratic discriminant analysis (DQDA)</p>
               </st>
               <p>Based on Bayes decision theory, the maximum likelihood (ML) discriminant rule classifies a feature vector <it>x </it>by assigning it to the class that yields the maximal likelihood <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. For multivariate Gaussian distributions, the likelihood function of <it>&#969;</it><sub><it>i </it></sub>with respect to <it>x </it>in the <it>l</it>-dimensional feature space is given by</p>
               <p>
                  <display-formula id="M3">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1423-0127-16-25-i3">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>p</m:mi>
                              <m:mo stretchy="false">(</m:mo>
                              <m:mi>x</m:mi>
                              <m:mo>|</m:mo>
                              <m:msub>
                                 <m:mi>&#969;</m:mi>
                                 <m:mi>i</m:mi>
                              </m:msub>
                              <m:mtext>)=</m:mtext>
                              <m:mfrac>
                                 <m:mn>1</m:mn>
                                 <m:mrow>
                                    <m:msup>
                                       <m:mrow>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mn>2</m:mn>
                                          <m:mi>&#960;</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>l</m:mi>
                                          <m:mo>/</m:mo>
                                          <m:mn>2</m:mn>
                                       </m:mrow>
                                    </m:msup>
                                    <m:mo>|</m:mo>
                                    <m:msub>
                                       <m:mi>&#931;</m:mi>
                                       <m:mi>i</m:mi>
                                    </m:msub>
                                    <m:msup>
                                       <m:mo>|</m:mo>
                                       <m:mrow>
                                          <m:mn>1</m:mn>
                                          <m:mo>/</m:mo>
                                          <m:mn>2</m:mn>
                                       </m:mrow>
                                    </m:msup>
                                 </m:mrow>
                              </m:mfrac>
                              <m:mi>exp</m:mi>
                              <m:mo>&#8289;</m:mo>
                              <m:mo stretchy="false">[</m:mo>
                              <m:mo>&#8722;</m:mo>
                              <m:mfrac>
                                 <m:mn>1</m:mn>
                                 <m:mn>2</m:mn>
                              </m:mfrac>
                              <m:msup>
                                 <m:mrow>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>x</m:mi>
                                    <m:mo>&#8722;</m:mo>
                                    <m:msub>
                                       <m:mi>&#956;</m:mi>
                                       <m:mi>i</m:mi>
                                    </m:msub>
                                    <m:mo stretchy="false">)</m:mo>
                                 </m:mrow>
                                 <m:mi>T</m:mi>
                              </m:msup>
                              <m:msubsup>
                                 <m:mi>&#931;</m:mi>
                                 <m:mi>i</m:mi>
                                 <m:mrow>
                                    <m:mo>&#8722;</m:mo>
                                    <m:mn>1</m:mn>
                                 </m:mrow>
                              </m:msubsup>
                              <m:mo stretchy="false">(</m:mo>
                              <m:mi>x</m:mi>
                              <m:mo>&#8722;</m:mo>
                              <m:msub>
                                 <m:mi>&#956;</m:mi>
                                 <m:mi>i</m:mi>
                              </m:msub>
                              <m:mo stretchy="false">)</m:mo>
                              <m:mo stretchy="false">]</m:mo>
                              <m:mo>,</m:mo>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiCaaNaeiikaGIaemiEaGNaeiiFaWNaeqyYdC3aaSbaaSqaaiabdMgaPbqabaGccqqGPaqkcqqG9aqpjuaGdaWcaaqaaiabigdaXaqaaiabcIcaOiabikdaYiabec8aWjabcMcaPmaaCaaabeqaaiabdYgaSjabc+caViabikdaYaaacqGG8baFcqqHJoWudaWgaaqaaiabdMgaPbqabaGaeiiFaW3aaWbaaeqabaGaeGymaeJaei4la8IaeGOmaidaaaaakiGbcwgaLjabcIha4jabcchaWjabcUfaBjabgkHiTKqbaoaalaaabaGaeGymaedabaGaeGOmaidaaOGaeiikaGIaemiEaGNaeyOeI0IaeqiVd02aaSbaaSqaaiabdMgaPbqabaGccqGGPaqkdaahaaWcbeqaaiabdsfaubaakiabfo6atnaaDaaaleaacqWGPbqAaeaacqGHsislcqaIXaqmaaGccqGGOaakcqWG4baEcqGHsislcqaH8oqBdaWgaaWcbaGaemyAaKgabeaakiabcMcaPiabc2faDjabcYcaSaaa@68F7@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>where <it>&#956;</it><sub><it>i </it></sub>is the mean of <it>x </it>for the <it>&#969;</it><sub><it>i </it></sub>class, &#931;<sub><it>i </it></sub>the <it>l </it>by <it>l </it>covariance matrix. When the covariance matrices are diagonal, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1423-0127-16-25-i4"><m:semantics><m:mrow><m:msub><m:mi>&#931;</m:mi><m:mi>i</m:mi></m:msub><m:mo>=</m:mo><m:mtext>diag</m:mtext><m:mo stretchy="false">(</m:mo><m:msubsup><m:mi>&#963;</m:mi><m:mrow><m:mi>i</m:mi><m:mtext>1</m:mtext></m:mrow><m:mtext>2</m:mtext></m:msubsup><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msubsup><m:mi>&#963;</m:mi><m:mrow><m:mi>i</m:mi><m:mi>l</m:mi></m:mrow><m:mtext>2</m:mtext></m:msubsup><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeu4Odm1aaSbaaSqaaiabdMgaPbqabaGccqGH9aqpcqqGKbazcqqGPbqAcqqGHbqycqqGNbWzcqGGOaakcqaHdpWCdaqhaaWcbaGaemyAaKMaeeymaedabaGaeeOmaidaaOGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIaeq4Wdm3aa0baaSqaaiabdMgaPjabdYgaSbqaaiabbkdaYaaakiabcMcaPaaa@461E@</m:annotation></m:semantics></m:math></inline-formula>, the ML discriminant rule can be written as <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1423-0127-16-25-i5"><m:semantics><m:mrow><m:mi>C</m:mi><m:mo stretchy="false">(</m:mo><m:mi>x</m:mi><m:mo stretchy="false">)</m:mo><m:mo>=</m:mo><m:munder><m:mrow><m:mi>arg</m:mi><m:mo>&#8289;</m:mo><m:mi>min</m:mi><m:mo>&#8289;</m:mo></m:mrow><m:mi>i</m:mi></m:munder><m:mstyle displaystyle="true"><m:munderover><m:mo>&#8721;</m:mo><m:mrow><m:mi>j</m:mi><m:mo>=</m:mo><m:mn>1</m:mn></m:mrow><m:mi>l</m:mi></m:munderover><m:mrow><m:mo stretchy="false">[</m:mo><m:msup><m:mrow><m:mo stretchy="false">(</m:mo><m:msub><m:mi>x</m:mi><m:mi>j</m:mi></m:msub><m:mo>&#8722;</m:mo><m:msub><m:mi>&#956;</m:mi><m:mrow><m:mi>i</m:mi><m:mi>j</m:mi></m:mrow></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:mn>2</m:mn></m:msup><m:mo>/</m:mo><m:msubsup><m:mi>&#963;</m:mi><m:mrow><m:mi>i</m:mi><m:mi>j</m:mi></m:mrow><m:mn>2</m:mn></m:msubsup><m:mo>+</m:mo><m:mi>log</m:mi><m:mo>&#8289;</m:mo><m:msubsup><m:mi>&#963;</m:mi><m:mrow><m:mi>i</m:mi><m:mi>j</m:mi></m:mrow><m:mn>2</m:mn></m:msubsup><m:mo stretchy="false">]</m:mo></m:mrow></m:mstyle></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4qamKaeiikaGIaemiEaGNaeiykaKIaeyypa0ZaaCbeaeaacyGGHbqycqGGYbGCcqGGNbWzcyGGTbqBcqGGPbqAcqGGUbGBaSqaaiabdMgaPbqabaGcdaaeWbqaaiabcUfaBjabcIcaOiabdIha4naaBaaaleaacqWGQbGAaeqaaOGaeyOeI0IaeqiVd02aaSbaaSqaaiabdMgaPjabdQgaQbqabaGccqGGPaqkdaahaaWcbeqaaiabikdaYaaakiabc+caViabeo8aZnaaDaaaleaacqWGPbqAcqWGQbGAaeaacqaIYaGmaaGccqGHRaWkcyGGSbaBcqGGVbWBcqGGNbWzcqaHdpWCdaqhaaWcbaGaemyAaKMaemOAaOgabaGaeGOmaidaaOGaeiyxa0faleaacqWGQbGAcqGH9aqpcqaIXaqmaeaacqWGSbaBa0GaeyyeIuoaaaa@60FF@</m:annotation></m:semantics></m:math></inline-formula>, which is a special case known as diagonal quadratic discriminant analysis (DQDA). In practice, <it>&#956;</it><sub><it>i </it></sub>and &#931;<sub><it>i </it></sub>are estimated by the corresponding sample quantities. We have previously used this rule effectively to discriminate two- and three-state proteins <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. In this study, the combination of selected genes was used as the feature vector to discriminate tumor classes.</p>
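               <p>As an illustration only, the diagonal rule above can be sketched in Python. This is a minimal sketch and not the X-AI implementation: the function names and the toy expression values are hypothetical, and the class means and variances are estimated with plain sample statistics, assuming at least two samples per class and no constant feature.</p>

```python
import math


def fit_dqda(samples, labels):
    """Estimate per-class feature means and variances from training data."""
    params = {}
    for cls in sorted(set(labels)):
        rows = [x for x, y in zip(samples, labels) if y == cls]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Unbiased sample variance per feature (assumes n >= 2).
        variances = [
            sum((v - m) ** 2 for v in col) / (n - 1)
            for col, m in zip(columns, means)
        ]
        params[cls] = (means, variances)
    return params


def dqda_score(x, means, variances):
    """Sum over features j of (x_j - mu_ij)^2 / sigma_ij^2 + log sigma_ij^2."""
    return sum(
        (xj - m) ** 2 / v + math.log(v)
        for xj, m, v in zip(x, means, variances)
    )


def predict_dqda(x, params):
    """Return the class i with the smallest score (the arg min in the rule)."""
    return min(params, key=lambda cls: dqda_score(x, *params[cls]))


# Hypothetical toy data: two genes, three samples per class.
train_x = [[1.0, 5.0], [1.2, 5.5], [0.8, 4.5], [3.0, 1.0], [3.3, 1.4], [2.7, 0.6]]
train_y = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
model = fit_dqda(train_x, train_y)
print(predict_dqda([1.1, 5.2], model))
```

               <p>Because the covariance matrices are diagonal, only <it>l </it>means and <it>l </it>variances per class need to be estimated, which suits datasets with few samples.</p>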
            </sec>
            <sec>
               <st>
                  <p>Generalized rule induction (GRI)</p>
               </st>
               <p>Generalized rule induction, proposed by Smyth and Goodman <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, applies an information-theoretic approach to automate rule acquisition. For a rule of the form <it>if antecedent then consequent</it>, GRI uses the <it>J</it>-measure to quantify its information content:</p>
               <p>
                  <display-formula id="M4">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1423-0127-16-25-i6">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>J</m:mi>
                              <m:mo>=</m:mo>
                              <m:mi>p</m:mi>
                              <m:mo stretchy="false">(</m:mo>
                              <m:mi>a</m:mi>
                              <m:mo stretchy="false">)</m:mo>
                              <m:mrow>
                                 <m:mo>[</m:mo>
                                 <m:mrow>
                                    <m:mi>p</m:mi>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>c</m:mi>
                                    <m:mo>|</m:mo>
                                    <m:mi>a</m:mi>
                                    <m:mo stretchy="false">)</m:mo>
                                    <m:mi>ln</m:mi>
                                    <m:mo>&#8289;</m:mo>
                                    <m:mfrac>
                                       <m:mrow>
                                          <m:mi>p</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>c</m:mi>
                                          <m:mo>|</m:mo>
                                          <m:mi>a</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>p</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>c</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                    </m:mfrac>
                                    <m:mo>+</m:mo>
                                    <m:mo stretchy="false">[</m:mo>
                                    <m:mn>1</m:mn>
                                    <m:mo>&#8722;</m:mo>
                                    <m:mi>p</m:mi>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>c</m:mi>
                                    <m:mo>|</m:mo>
                                    <m:mi>a</m:mi>
                                    <m:mo stretchy="false">)</m:mo>
                                    <m:mo stretchy="false">]</m:mo>
                                    <m:mi>ln</m:mi>
                                    <m:mo>&#8289;</m:mo>
                                    <m:mfrac>
                                       <m:mrow>
                                          <m:mn>1</m:mn>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>p</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>c</m:mi>
                                          <m:mo>|</m:mo>
                                          <m:mi>a</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mn>1</m:mn>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>p</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>c</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                    </m:mfrac>
                                 </m:mrow>
                                 <m:mo>]</m:mo>
                              </m:mrow>
                              <m:mo>,</m:mo>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemOsaOKaeyypa0JaemiCaaNaeiikaGIaemyyaeMaeiykaKYaamWaaeaacqWGWbaCcqGGOaakcqWGJbWycqGG8baFcqWGHbqycqGGPaqkcyGGSbaBcqGGUbGBjuaGdaWcaaqaaiabdchaWjabcIcaOiabdogaJjabcYha8jabdggaHjabcMcaPaqaaiabdchaWjabcIcaOiabdogaJjabcMcaPaaakiabgUcaRiabcUfaBjabigdaXiabgkHiTiabdchaWjabcIcaOiabdogaJjabcYha8jabdggaHjabcMcaPiabc2faDjGbcYgaSjabc6gaULqbaoaalaaabaGaeGymaeJaeyOeI0IaemiCaaNaeiikaGIaem4yamMaeiiFaWNaemyyaeMaeiykaKcabaGaeGymaeJaeyOeI0IaemiCaaNaeiikaGIaem4yamMaeiykaKcaaaGccaGLBbGaayzxaaGaeiilaWcaaa@6AFB@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>where <it>p</it>(<it>a</it>) represents the probability of the observed attribute value of <it>a</it>, as a measure of the coverage of the antecedent; <it>p</it>(<it>c</it>) represents the prior probability of the value of <it>c</it>, as a measure of how common the observed attribute value of <it>c </it>is in the consequent; and <it>p</it>(<it>c</it>|<it>a</it>) represents the modified probability of observing this value of <it>c </it>after taking into account the additional information of the value of <it>a</it>. For rules with more than one antecedent, <it>p</it>(<it>a</it>) is regarded as the probability of the conjunction of the variable values in the antecedent. Accordingly, a set of optimal rules was then generated by the ITRULE algorithm, which calculates the <it>J</it>-measures of rules by employing a depth-first search over possible left-hand sides.</p>
               <p>Here, the genes selected by the Chi2 algorithm were taken as the attributes of the antecedent, and the tumor class was the only attribute of the consequent.</p>
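               <p>The <it>J</it>-measure defined above can be evaluated directly from the three probabilities. The following Python sketch is illustrative only: the function name and the example probabilities are hypothetical, the prior <it>p</it>(<it>c</it>) is assumed to lie strictly between 0 and 1, and a term with zero probability is treated as contributing zero, which is a common convention rather than something stated in this article.</p>

```python
import math


def j_measure(p_a, p_c, p_c_given_a):
    """J = p(a) [ p(c|a) ln(p(c|a)/p(c)) + (1 - p(c|a)) ln((1 - p(c|a))/(1 - p(c))) ]."""

    def term(q, prior):
        # Assumed convention: 0 * ln(0) is taken to be 0.
        return 0.0 if q == 0 else q * math.log(q / prior)

    return p_a * (term(p_c_given_a, p_c) + term(1 - p_c_given_a, 1 - p_c))


# Hypothetical rule: the antecedent covers half of the samples and
# always implies a consequent whose prior probability is 0.5.
print(j_measure(p_a=0.5, p_c=0.5, p_c_given_a=1.0))
```

               <p>A rule whose antecedent leaves the probability of the consequent unchanged, <it>p</it>(<it>c</it>|<it>a</it>) = <it>p</it>(<it>c</it>), carries no information and scores zero.</p>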
            </sec>
         </sec>
         <sec>
            <st>
               <p>Performance evaluation and test procedure</p>
            </st>
            <sec>
               <st>
                  <p>Prediction accuracy</p>
               </st>
               <p>I considered the classification of the leukemia datasets L1 and L2 as two-class and three-class problems, respectively. To evaluate classification performance, both the classification accuracy and the number of misclassified samples were calculated, along with the corresponding number of selected genes.</p>
            </sec>
            <sec>
               <st>
                  <p>Support and confidence</p>
               </st>
               <p>The support and confidence measures were defined to reveal the importance of an individual association rule. For a particular association rule, support is the proportion of samples in the dataset that contain the rule antecedent:</p>
               <p>
                  <display-formula id="M5">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1423-0127-16-25-i7">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>s</m:mi>
                              <m:mi>u</m:mi>
                              <m:mi>p</m:mi>
                              <m:mi>p</m:mi>
                              <m:mi>o</m:mi>
                              <m:mi>r</m:mi>
                              <m:mi>t</m:mi>
                              <m:mo>=</m:mo>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:mtext>number&#160;of&#160;samples&#160;containing&#160;antecedent</m:mtext>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:mtext>total&#160;number&#160;of&#160;samples</m:mtext>
                                 </m:mrow>
                              </m:mfrac>
                              <m:mo>.</m:mo>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4CamNaemyDauNaemiCaaNaemiCaaNaem4Ba8MaemOCaiNaemiDaqNaeyypa0tcfa4aaSaaaeaacqqGUbGBcqqG1bqDcqqGTbqBcqqGIbGycqqGLbqzcqqGYbGCcqqGGaaicqqGVbWBcqqGMbGzcqqGGaaicqqGZbWCcqqGHbqycqqGTbqBcqqGWbaCcqqGSbaBcqqGLbqzcqqGZbWCcqqGGaaicqqGJbWycqqGVbWBcqqGUbGBcqqG0baDcqqGHbqycqqGPbqAcqqGUbGBcqqGPbqAcqqGUbGBcqqGNbWzcqqGGaaicqqGHbqycqqGUbGBcqqG0baDcqqGLbqzcqqGJbWycqqGLbqzcqqGKbazcqqGLbqzcqqGUbGBcqqG0baDaeaacqqG0baDcqqGVbWBcqqG0baDcqqGHbqycqqGSbaBcqqGGaaicqqGUbGBcqqG1bqDcqqGTbqBcqqGIbGycqqGLbqzcqqGYbGCcqqGGaaicqqGVbWBcqqGMbGzcqqGGaaicqqGZbWCcqqGHbqycqqGTbqBcqqGWbaCcqqGSbaBcqqGLbqzcqqGZbWCaaGccqGGUaGlaaa@893E@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>This measure reveals how comprehensively the rule covers the dataset.</p>
               <p>Further, the confidence of an association rule is a measure of the accuracy of the rule:</p>
               <p>
                  <display-formula id="M6">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1423-0127-16-25-i8">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>c</m:mi>
                              <m:mi>o</m:mi>
                              <m:mi>n</m:mi>
                              <m:mi>f</m:mi>
                              <m:mi>i</m:mi>
                              <m:mi>d</m:mi>
                              <m:mi>e</m:mi>
                              <m:mi>n</m:mi>
                              <m:mi>c</m:mi>
                              <m:mi>e</m:mi>
                              <m:mo>=</m:mo>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:mtext>number&#160;of&#160;samples&#160;containing&#160;both&#160;antecedent&#160;and&#160;consequent</m:mtext>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:mtext>number&#160;of&#160;samples&#160;containing&#160;antecedent</m:mtext>
                                 </m:mrow>
                              </m:mfrac>
                              <m:mo>.</m:mo>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4yamMaem4Ba8MaemOBa4MaemOzayMaemyAaKMaemizaqMaemyzauMaemOBa4Maem4yamMaemyzauMaeyypa0tcfa4aaSaaaeaacqqGUbGBcqqG1bqDcqqGTbqBcqqGIbGycqqGLbqzcqqGYbGCcqqGGaaicqqGVbWBcqqGMbGzcqqGGaaicqqGZbWCcqqGHbqycqqGTbqBcqqGWbaCcqqGSbaBcqqGLbqzcqqGZbWCcqqGGaaicqqGJbWycqqGVbWBcqqGUbGBcqqG0baDcqqGHbqycqqGPbqAcqqGUbGBcqqGPbqAcqqGUbGBcqqGNbWzcqqGGaaicqqGIbGycqqGVbWBcqqG0baDcqqGObaAcqqGGaaicqqGHbqycqqGUbGBcqqG0baDcqqGLbqzcqqGJbWycqqGLbqzcqqGKbazcqqGLbqzcqqGUbGBcqqG0baDcqqGGaaicqqGHbqycqqGUbGBcqqGKbazcqqGGaaicqqGJbWycqqGVbWBcqqGUbGBcqqGJbWycqqGLbqzcqqGXbqCcqqG1bqDcqqGLbqzcqqGUbGBcqqG0baDaeaacqqGUbGBcqqG1bqDcqqGTbqBcqqGIbGycqqGLbqzcqqGYbGCcqqGGaaicqqGVbWBcqqGMbGzcqqGGaaicqqGZbWCcqqGHbqycqqGTbqBcqqGWbaCcqqGSbaBcqqGLbqzcqqGZbWCcqqGGaaicqqGJbWycqqGVbWBcqqGUbGBcqqG0baDcqqGHbqycqqGPbqAcqqGUbGBcqqGPbqAcqqGUbGBcqqGNbWzcqqGGaaicqqGHbqycqqGUbGBcqqG0baDcqqGLbqzcqqGJbWycqqGLbqzcqqGKbazcqqGLbqzcqqGUbGBcqqG0baDaaGccqGGUaGlaaa@BB23@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
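               <p>Both measures reduce to simple counts over the samples. The Python sketch below is illustrative only: the sample representation (one dictionary of discretized attribute values per sample), the attribute names and the values are hypothetical.</p>

```python
def support_and_confidence(samples, antecedent, consequent):
    """Return (support, confidence) of the rule 'if antecedent then consequent'.

    samples: list of dicts mapping attribute name to value.
    antecedent, consequent: dicts of required attribute values.
    """

    def matches(sample, conditions):
        return all(sample.get(key) == value for key, value in conditions.items())

    with_antecedent = [s for s in samples if matches(s, antecedent)]
    with_both = [s for s in with_antecedent if matches(s, consequent)]
    support = len(with_antecedent) / len(samples)
    # Confidence is undefined when no sample contains the antecedent; 0.0 is used here.
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical discretized samples.
samples = [
    {"gene_A": "high", "class": "AML"},
    {"gene_A": "high", "class": "AML"},
    {"gene_A": "high", "class": "ALL"},
    {"gene_A": "low", "class": "ALL"},
]
support, confidence = support_and_confidence(
    samples, {"gene_A": "high"}, {"class": "AML"}
)
print(support, confidence)
```

               <p>In this toy example three of the four samples contain the antecedent, and two of those three also contain the consequent.</p>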
            </sec>
            <sec>
               <st>
                  <p>Holdout validation and leave-one-out cross-validation tests</p>
               </st>
               <p>The present method was validated by both holdout validation and leave-one-out cross-validation (LOOCV) tests. Holdout validation derives a predictor from the training set and uses a blind, independent test set to evaluate the predictor. LOOCV is simply <it>n</it>-fold cross-validation, where <it>n </it>is the number of samples in the dataset. Each sample is left out in turn, and the predictor is trained on all the remaining samples. The procedure is repeated <it>n </it>times to obtain a mean score.</p>
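               <p>The LOOCV procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the code used in this study: the nearest-centroid classifier is only a stand-in so that the example is self-contained, and all names and data values are hypothetical.</p>

```python
def loocv_accuracy(samples, labels, fit, predict):
    """Leave each sample out in turn, train on the rest, and average the hits."""
    n = len(samples)
    correct = 0
    for i in range(n):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(samples[i], model) == labels[i]:
            correct += 1
    return correct / n


def fit_centroids(xs, ys):
    """Stand-in classifier: one mean vector per class."""
    centroids = {}
    for cls in set(ys):
        rows = [x for x, y in zip(xs, ys) if y == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(x, centroids):
    """Assign x to the class with the nearest centroid (squared Euclidean distance)."""

    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))

    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


# Hypothetical one-gene dataset with six samples.
data_x = [[1.0], [1.2], [0.8], [3.0], [3.3], [2.7]]
data_y = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
accuracy = loocv_accuracy(data_x, data_y, fit_centroids, predict_centroid)
print(accuracy)
```

               <p>Any feature selection step that uses the class labels would normally be repeated inside the loop on each training fold, so that the left-out sample does not influence the selected genes.</p>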
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Analysis of important genes</p>
            </st>
            <p>X-AI provides a feature selection function to systematically extract the relatively important genes for discriminating different classes. In Table <tblr tid="T1">1</tblr>, the top ten genes for the training set of each of the two datasets are listed in order of the chi-squared statistic. The selected genes provide the input to the subsequent classification and rule development functions, and selecting only a small number of genes keeps both the data dimension and the computational complexity low. Nevertheless, the choice of this number is flexible and depends largely on the analysis requirements.</p>
            <p>For L1, the importance of most of the genes has been discussed by Golub et al. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and in earlier literature. Further, Wang et al. <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> presented additional arguments about Zyxin and PTX3, suggesting that the expression levels of both play an important, and possibly neglected, role in distinguishing between ALL and AML. The selection function of X-AI was also compared with other selection criteria, including information gain and symmetrical uncertainty, which yielded almost the same selection of the top ten genes. For L2, the average chi-squared value is higher than that for L1. The results indicate that most of the genes extracted by the selection function of X-AI agree with earlier studies and may be important for class discrimination.</p>
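            <p>Ranking genes by the chi-squared statistic can be illustrated with a short Python sketch. It is not the Chi2 algorithm used by X-AI, which also determines the discretization intervals; it assumes that the expression values have already been discretized and computes only the Pearson chi-squared statistic of each gene against the class labels. The gene names, values and function names are hypothetical.</p>

```python
from collections import Counter


def chi_squared(values, labels):
    """Pearson chi-squared statistic of a discretized feature against class labels."""
    n = len(values)
    observed = Counter(zip(values, labels))
    value_totals = Counter(values)
    label_totals = Counter(labels)
    stat = 0.0
    for value, value_count in value_totals.items():
        for label, label_count in label_totals.items():
            expected = value_count * label_count / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


def rank_genes(discretized, labels, top=10):
    """Return the top genes as (name, score) pairs, highest score first."""
    scores = {gene: chi_squared(vals, labels) for gene, vals in discretized.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top]


# Hypothetical discretized expression levels for four samples.
labels = ["ALL", "ALL", "AML", "AML"]
discretized = {
    "gene_A": ["low", "low", "high", "high"],  # separates the classes perfectly
    "gene_B": ["low", "high", "low", "high"],  # independent of the class
}
ranking = rank_genes(discretized, labels)
print(ranking)
```

            <p>A gene whose discretized levels are independent of the class scores zero, whereas a gene that separates the classes perfectly receives the highest score.</p>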
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Top ten genes selected by feature selection function of X-AI for two datasets</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Dataset</p>
                     </c>
                     <c ca="left">
                        <p>Probe ID</p>
                     </c>
                     <c ca="left">
                        <p>Gene annotation</p>
                     </c>
                     <c ca="left">
                        <p>&#967;<sup>2 </sup>Score</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>L1</p>
                     </c>
                     <c ca="left">
                        <p>X95735</p>
                     </c>
                     <c ca="left">
                        <p>Zyxin</p>
                     </c>
                     <c ca="left">
                        <p>38.00</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>M55150</p>
                     </c>
                     <c ca="left">
                        <p>FAH Fumarylacetoacetate</p>
                     </c>
                     <c ca="left">
                        <p>33.54</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>M27891</p>
                     </c>
                     <c ca="left">
                        <p>CST3 Cystatin C (amyloid angiopathy and cerebral hemorrhage)</p>
                     </c>
                     <c ca="left">
                        <p>33.31</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>M31166</p>
                     </c>
                     <c ca="left">
                        <p>PTX3 Pentaxin-related gene, rapidly induced by IL-1 beta</p>
                     </c>
                     <c ca="left">
                        <p>33.31</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>X70297</p>
                     </c>
                     <c ca="left">
                        <p>CHRNA7 Cholinergic receptor, nicotinic, alpha polypeptide 7</p>
                     </c>
                     <c ca="left">
                        <p>29.77</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>U46499</p>
                     </c>
                     <c ca="left">
                        <p>GLUTATHIONE S-TRANSFERASE, MICROSOMAL</p>
                     </c>
                     <c ca="left">
                        <p>29.77</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>L09209_s</p>
                     </c>
                     <c ca="left">
                        <p>APLP2 Amyloid beta (A4) precursor-like protein 2</p>
                     </c>
                     <c ca="left">
                        <p>29.77</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>M77142</p>
                     </c>
                     <c ca="left">
                        <p>NUCLEOLYSIN TIA-1</p>
                     </c>
                     <c ca="left">
                        <p>29.77</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>J03930</p>
                     </c>
                     <c ca="left">
                        <p>ALKALINE PHOSPHATASE, INTESTINAL PRECURSOR</p>
                     </c>
                     <c ca="left">
                        <p>29.02</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>M23197</p>
                     </c>
                     <c ca="left">
                        <p>CD33 CD33 antigen (differentiation antigen)</p>
                     </c>
                     <c ca="left">
                        <p>28.95</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>L2</p>
                     </c>
                     <c ca="left">
                        <p>36239_at</p>
                     </c>
                     <c ca="left">
                        <p>H. sapiens mRNA for oct-binding factor</p>
                     </c>
                     <c ca="left">
                        <p>91.08</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>37539_at</p>
                     </c>
                     <c ca="left">
                        <p>Homo sapiens mRNA for KIAA0959 protein, partial cds</p>
                     </c>
                     <c ca="left">
                        <p>84.51</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>35260_at</p>
                     </c>
                     <c ca="left">
                        <p>Homo sapiens mRNA for KIAA0867 protein, complete cds</p>
                     </c>
                     <c ca="left">
                        <p>83.72</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>32847_at</p>
                     </c>
                     <c ca="left">
                        <p>Homo sapiens myosin light chain kinase (MLCK) mRNA, complete cds</p>
                     </c>
                     <c ca="left">
                        <p>79.82</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>35164_at</p>
                     </c>
                     <c ca="left">
                        <p>Homo sapiens transmembrane protein (WFS1) mRNA, complete cds</p>
                     </c>
                     <c ca="left">
                        <p>79.46</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at</p>
                     </c>
                     <c ca="left">
                        <p>Homo sapiens TWIK-related acid-sensitive K+ channel (TASK) mRNA, complete cds</p>
                     </c>
                     <c ca="left">
                        <p>78.57</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>40191_s_at</p>
                     </c>
                     <c ca="left">
                        <p>wg66h09.x1 Homo sapiens cDNA, 3' end</p>
                     </c>
                     <c ca="left">
                        <p>77.22</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>39318_at</p>
                     </c>
                     <c ca="left">
                        <p>H. sapiens mRNA for Tcell leukemia</p>
                     </c>
                     <c ca="left">
                        <p>76.22</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>32579_at</p>
                     </c>
                     <c ca="left">
                        <p>Human transcriptional activator (BRG1) mRNA, complete cds</p>
                     </c>
                     <c ca="left">
                        <p>74.97</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>41715_at</p>
                     </c>
                     <c ca="left">
                        <p>H. sapiens mRNA for phosphoinositide 3-kinase</p>
                     </c>
                     <c ca="left">
                        <p>73.53</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>L1: the dataset of Golub et al. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp></p>
                  <p>L2: the dataset of Armstrong et al. <abbrgrp><abbr bid="B20">20</abbr></abbrgrp></p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Prediction performance of system</p>
            </st>
            <p>Several tests were applied to verify the accuracy of the classification function of X-AI. In the holdout validation test, it achieved accuracies of 96% and 99% on the test sets of L1 and L2, respectively, using the top ten genes as input. I also analyzed the classification accuracy as a function of the number of genes in the holdout validation test. Figure <figr fid="F3">3</figr> illustrates the classification accuracy as a function of the number of selected genes. The genes were included one by one as input, in order of their chi-squared statistic. On the test set of dataset L1, X-AI achieved an accuracy of 98.6% using two genes, and increasing the number of genes to 10 did not improve it further. On the test set of dataset L2, the accuracy reached 100% using eight genes. In addition, the training and test sets of each dataset were combined to form a complete dataset for the LOOCV test, which yielded accuracies of 96% and 94% for datasets L1 and L2, respectively.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Prediction performance of X-AI with different numbers of genes on the test sets of the two datasets</p>
               </caption>
               <text>
                  <p><b>Prediction performance of X-AI with different numbers of genes on the test sets of the two datasets</b>. The y-axis represents the classification accuracy and the x-axis the corresponding number of genes used as input to the classification. L1: the dataset of Golub et al. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>; L2: the dataset of Armstrong et al. <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>.</p>
               </text>
               <graphic file="1423-0127-16-25-3"/>
            </fig>
            <p>The results show that the classification function performs well in discriminating the different classes when its input is provided by the feature selection function of X-AI. That is, integrating the two functions is feasible and effective for both the binary and the three-class classification problems.</p>
         </sec>
         <sec>
            <st>
               <p>Comparison with other methods</p>
            </st>
            <p>The performance of X-AI was also compared with that of other methods on both datasets, which gives an overall view of how the different methods perform. In Figure <figr fid="F4">4</figr>, the prediction performance is tested on dataset L1 by holdout validation. The compared methods include the weighted voting machine, which is based on a linear model <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>; support vector machines (SVM) <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>; the emerging patterns algorithm <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>; maximal margin linear programming (MAMA) <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>; four methods that combine the feature selector with machine learning algorithms <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>; and six methods discussed in earlier literature <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. The numbers of misclassified samples and of genes used range from 0 to 5 and from 1 to 132, respectively. The analysis shows that no other method dominates X-AI on both the number of misclassified samples and the number of genes used simultaneously; that is, X-AI has a relatively small number of misclassified samples or of genes used.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Comparison of prediction performance between different methods</p>
               </caption>
               <text>
                  <p><b>Comparison of prediction performance between different methods</b>. The y-axis denotes the number of samples misclassified by each method on the test set of L1, and the x-axis the number of genes used. Voting machine <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>; SVM <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>; Emerging patterns <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>; MAMA <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>; J48, NB, SMO-CFS, SMO-Wrapper <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>; RIRLS, RPLS, RPCR, FPLS, MAVE, <it>k</it>-NN <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>.</p>
               </text>
               <graphic file="1423-0127-16-25-4"/>
            </fig>
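            <p>The notion of one method dominating another on both criteria can be made concrete with a small sketch. It assumes each method is summarized by a pair (misclassified samples, genes used), lower being better on both; the method names and numbers are hypothetical and are not the values plotted in Figure 4.</p>

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on both criteria and strictly better on one.

    Each argument is a pair (misclassified samples, genes used); lower is better.
    """
    no_worse = b[0] >= a[0] and b[1] >= a[1]
    strictly_better = b[0] > a[0] or b[1] > a[1]
    return no_worse and strictly_better


def non_dominated(methods):
    """Return the sorted names of methods that no other method dominates."""
    return sorted(
        name for name, score in methods.items()
        if not any(dominates(other, score)
                   for other_name, other in methods.items()
                   if other_name != name)
    )


# Hypothetical (misclassified, genes) pairs; NOT the values plotted in Figure 4.
toy_methods = {
    "method_a": (0, 50),
    "method_b": (2, 3),
    "method_c": (3, 40),
    "method_d": (1, 10),
}
print(non_dominated(toy_methods))
```

            <p>In this toy example only method_c is dominated, because method_b misclassifies fewer samples while also using fewer genes. A method that is not dominated trades one criterion against the other.</p>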
            <p>Figure <figr fid="F5">5</figr> shows the comparison of prediction performance on dataset L2. The classification based on a correlation/ordering network <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> showed an accuracy of 100% using information from 40 genes. The other seven compared methods include two TSP-family classifiers (HC-TSP and HC-<it>k</it>-TSP) and five machine learning methods: C4.5 decision trees (DT), Na&#239;ve Bayes (NB), <it>k</it>-nearest neighbor (<it>k</it>-NN), SVM and prediction analysis of microarrays (PAM) <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. The accuracy and the number of genes used range from 80% to 100% and from 2 to 12582, respectively. The analysis reveals that X-AI achieves a relatively high accuracy using a small number of informative genes compared with these methods.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Comparison of prediction performance between different methods</p>
               </caption>
               <text>
                  <p><b>Comparison of prediction performance between different methods</b>. The y-axis denotes the number of samples misclassified by each method on the test set of L2, and the x-axis the number of genes used. Classification based on correlation/ordering network <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>; HC-TSP, HC-<it>k</it>-TSP, DT, NB, <it>k</it>-NN, SVM, PAM <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
               </text>
               <graphic file="1423-0127-16-25-5"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Association rule development</p>
            </st>
            <p>The feature selection function not only reduced the number of input genes but also improved the efficiency of rule development. It also yielded a reasonable and manageable number of rules. Based on the genes in Table <tblr tid="T1">1</tblr>, X-AI used all the samples of each dataset to establish association rules.</p>
            <p>Tables <tblr tid="T2">2</tblr> and <tblr tid="T3">3</tblr> list all association rules that were developed for each dataset and class. The average confidence is 99% and 97% for datasets L1 and L2, respectively, indicating the high accuracy of these rules. In Table <tblr tid="T2">2</tblr>, the second rule means that <it>if the expression of M23197 (CD33) is larger than 401.5</it>, <it>then the sample is classified as ALL</it>. For dataset L1, 29.17% of the samples satisfy the antecedent of this rule, and all of these samples are correctly classified. This rule highlights the importance of the gene in discriminating between AML and ALL, a finding in accord with the results of earlier studies <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B19">19</abbr></abbrgrp>. Further, I examined how often each gene occurs among the rules, which may be related to its importance. Interestingly, the gene X95735 (Zyxin) has the highest percentage of occurrence (30%), and Wang et al. <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> also gave a detailed discussion of its role in leukemia. In Table <tblr tid="T3">3</tblr>, the gene 1325_at (TASK) also has a high percentage of occurrence (24%). However, more comparative studies may be needed to validate this observation.</p>
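            <p>As a rough illustration of how the two rule measures can be computed, the sketch below treats support as the percentage of samples that satisfy the antecedent and confidence as the percentage of those samples that carry the consequent class, which is how the text above describes them. The function name, gene name, labels and expression values are hypothetical.</p>

```python
def rule_support_confidence(samples, labels, gene, threshold, consequent):
    """Support and confidence (both in %) of the single-gene rule
    'expression of `gene` is larger than `threshold` implies `consequent`'.

    Support: share of all samples that satisfy the antecedent.
    Confidence: share of those samples whose label equals `consequent`.
    """
    matched = [label for sample, label in zip(samples, labels)
               if sample[gene] > threshold]
    support = 100.0 * len(matched) / len(samples)
    if not matched:
        return support, 0.0
    hits = sum(1 for label in matched if label == consequent)
    confidence = 100.0 * hits / len(matched)
    return support, confidence


# Hypothetical expression values and class labels, for illustration only.
toy_samples = [{"gene_x": 500.0}, {"gene_x": 450.0}, {"gene_x": 120.0},
               {"gene_x": 80.0}, {"gene_x": 410.0}, {"gene_x": 30.0},
               {"gene_x": 60.0}, {"gene_x": 95.0}]
toy_labels = ["P", "P", "Q", "Q", "Q", "Q", "Q", "Q"]
print(rule_support_confidence(toy_samples, toy_labels, "gene_x", 401.5, "P"))
```

            <p>Here three of the eight toy samples exceed the threshold and two of those three belong to class P, so the rule covers 37.5% of the samples with a confidence of about 67%.</p>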
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Two different classes of rules generated from dataset L1</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Consequent</p>
                     </c>
                     <c ca="left">
                        <p>Antecedent</p>
                     </c>
                     <c ca="center">
                        <p>Support (%)</p>
                     </c>
                     <c ca="center">
                        <p>Confidence (%)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ALL</p>
                     </c>
                     <c ca="left">
                        <p>L09209_s &gt; 1056.5 &amp; M23197 &gt; 326.0</p>
                     </c>
                     <c ca="center">
                        <p>30.56</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>M23197 &gt; 401.5</p>
                     </c>
                     <c ca="center">
                        <p>29.17</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>M27891 &gt; 2096.5</p>
                     </c>
                     <c ca="center">
                        <p>27.78</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>X95735 &gt; 994.0 &amp; M55150 &gt; 1250.5</p>
                     </c>
                     <c ca="center">
                        <p>27.78</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>X95735 &gt; 994.0</p>
                     </c>
                     <c ca="center">
                        <p>36.11</p>
                     </c>
                     <c ca="center">
                        <p>92</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AML</p>
                     </c>
                     <c ca="left">
                        <p>U46499 &lt; 154.5</p>
                     </c>
                     <c ca="center">
                        <p>59.72</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>L09209_s &lt; 992.5</p>
                     </c>
                     <c ca="center">
                        <p>58.33</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>X95735 &lt; 994.0</p>
                     </c>
                     <c ca="center">
                        <p>63.89</p>
                     </c>
                     <c ca="center">
                        <p>98</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>41.67</p>
                     </c>
                     <c ca="center">
                        <p>99</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Three different classes of rules generated from dataset L2</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>Consequent</p>
                     </c>
                     <c ca="left">
                        <p>Antecedent</p>
                     </c>
                     <c ca="center">
                        <p>Support (%)</p>
                     </c>
                     <c ca="center">
                        <p>Confidence (%)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ALL</p>
                     </c>
                     <c ca="left">
                        <p>32847_at &gt; 147.0</p>
                     </c>
                     <c ca="center">
                        <p>30.56</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>36239_at &gt; 2201.0</p>
                     </c>
                     <c ca="center">
                        <p>27.78</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>AML</p>
                     </c>
                     <c ca="left">
                        <p>39318_at &lt; 1063.0 &amp; 32579_at &lt; 2285.0</p>
                     </c>
                     <c ca="center">
                        <p>34.72</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 1501.5, 39318_at &lt; 1063.0 &amp; 32579_at &lt; 2285.0</p>
                     </c>
                     <c ca="center">
                        <p>34.72</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 1501.5, 36239_at &lt; 214.0 &amp; 40191_s_at &lt; 508.5</p>
                     </c>
                     <c ca="center">
                        <p>33.33</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>36239_at &lt; 214.0 &amp; 40191_s_at &lt; 508.5</p>
                     </c>
                     <c ca="center">
                        <p>33.33</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>39318_at &lt; 1063.0 &amp; 35164_at &lt; -794.5</p>
                     </c>
                     <c ca="center">
                        <p>31.94</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>40191_s_at &lt; 519.0 &amp; 36239_at &lt; 167.0</p>
                     </c>
                     <c ca="center">
                        <p>31.94</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 1501.5, 39318_at &lt; 1063.0 &amp; 35164_at &lt; -794.5</p>
                     </c>
                     <c ca="center">
                        <p>31.94</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 1501.5, 40191_s_at &lt; 519.0 &amp; 36239_at &lt; 167.0</p>
                     </c>
                     <c ca="center">
                        <p>31.94</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 1501.5, 36239_at &lt; 214.0 &amp; 37539_at &lt; -362.0</p>
                     </c>
                     <c ca="center">
                        <p>31.94</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>36239_at &lt; 214.0 &amp; 37539_at &lt; -362.0</p>
                     </c>
                     <c ca="center">
                        <p>31.94</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>37539_at &lt; -725.5</p>
                     </c>
                     <c ca="center">
                        <p>29.17</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>32579_at &lt; 2285.0</p>
                     </c>
                     <c ca="center">
                        <p>36.11</p>
                     </c>
                     <c ca="center">
                        <p>96</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 1501.5 &amp; 32579_at &lt; 2285.0</p>
                     </c>
                     <c ca="center">
                        <p>36.11</p>
                     </c>
                     <c ca="center">
                        <p>96</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>36239_at &lt; 214.0</p>
                     </c>
                     <c ca="center">
                        <p>40.28</p>
                     </c>
                     <c ca="center">
                        <p>93</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>MLL</p>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 201.0, 35260_at &gt; 794.5 &amp; 40191_s_at &gt; 1107.5</p>
                     </c>
                     <c ca="center">
                        <p>19.44</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 201.0 &amp; 36239_at &gt; 214.0</p>
                     </c>
                     <c ca="center">
                        <p>23.61</p>
                     </c>
                     <c ca="center">
                        <p>94</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1325_at &lt; 201.0</p>
                     </c>
                     <c ca="center">
                        <p>37.50</p>
                     </c>
                     <c ca="center">
                        <p>67</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>32.02</p>
                     </c>
                     <c ca="center">
                        <p>97</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
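            <p>The summary figures quoted for Table 2 can be checked with a short script. The gene lists, supports and confidences below are transcribed from Table 2. The definition of percentage of occurrence (a gene's share of all gene mentions across the antecedents) is an assumption; it is one reading that reproduces the reported 30% for X95735.</p>

```python
from collections import Counter

# Antecedent genes, support (%) and confidence (%) transcribed from Table 2.
table2_rules = [
    (["L09209_s", "M23197"], 30.56, 100),
    (["M23197"], 29.17, 100),
    (["M27891"], 27.78, 100),
    (["X95735", "M55150"], 27.78, 100),
    (["X95735"], 36.11, 92),
    (["U46499"], 59.72, 100),
    (["L09209_s"], 58.33, 100),
    (["X95735"], 63.89, 98),
]

# Assumed definition: a gene's share of all gene mentions in the antecedents.
gene_counts = Counter(g for genes, _, _ in table2_rules for g in genes)
total_mentions = sum(gene_counts.values())
occurrence = {g: 100.0 * n / total_mentions for g, n in gene_counts.items()}

# Unweighted means over the rules, as in the "Mean" row of the table.
mean_support = sum(s for _, s, _ in table2_rules) / len(table2_rules)
mean_confidence = sum(c for _, _, c in table2_rules) / len(table2_rules)

print(occurrence["X95735"])
print(round(mean_support, 2), round(mean_confidence))
```

            <p>Under the same assumed definition, 1325_at accounts for 9 of the 38 gene mentions in Table 3, which is consistent with the reported 24%.</p>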
         </sec>
         <sec>
            <st>
               <p>Web server for cancer classification</p>
            </st>
            <p>I have also developed a web server for classifying tumors of acute leukemia and it is freely available at <url>http://bioinformatics.myweb.hinet.net/xai.htm</url>. The prediction can be made by taking four simple steps (see Figure <figr fid="F6">6</figr>): (i) select "Prediction" from the main page to open an input subpage, (ii) select a set of input genes, (iii) input the expression values for each gene, and (iv) press the "Submit" button to start the service.</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Snapshot of the prediction page of web service for cancer classification</p>
               </caption>
               <text>
                  <p><b>Snapshot of the prediction page of web service for cancer classification</b>.</p>
               </text>
               <graphic file="1423-0127-16-25-6"/>
            </fig>
            <p>Because X-AI selected a different set of input genes from each of the two datasets to train its classifiers, the server offers two classifiers with different sets of input genes. Users can choose either one to predict cancer classes. In addition to the cancer classification page, the web server provides help and reference pages for interested researchers.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>In this study, I have proposed an integrated method for accurate cancer classification, relevant gene selection, and association rule development from DNA microarray data. Following the concepts of system design, the modules in the present architecture are tightly cohesive and loosely coupled.</p>
         <p>Through different tests, the method shows high classification accuracy on two leukemia datasets. In addition, the selected genes and the generated rules are in accord with recent studies. The results suggest that the method can effectively integrate these related functions for the analysis of microarray data.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The author declares that they have no competing interests.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported by grant no. NSC97-2221-E-451-013 from the National Science Council, Taiwan, ROC. I would like to thank Dr. Chang-Sheng Wang for critical reading, and reviewers for providing valuable comments to improve the manuscript.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Molecular classification of cancer: class discovery and class prediction by gene expression monitoring</p>
            </title>
            <aug>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Tamayo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Huard</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gaasenbeek</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mesirov</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Coller</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Loh</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Downing</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Caligiuri</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Bloomfield</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <issue>5439</issue>
            <fpage>531</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.286.5439.531</pubid>
                  <pubid idtype="pmpid" link="fulltext">10521349</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays</p>
            </title>
            <aug>
               <au>
                  <snm>Alon</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Barkai</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Notterman</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ybarra</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mack</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <issue>12</issue>
            <fpage>6745</fpage>
            <lpage>6750</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">21986</pubid>
                  <pubid idtype="pmpid" link="fulltext">10359783</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.12.6745</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Recursive partitioning for tumor classification with gene expression microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>CY</fnm>
               </au>
               <au>
                  <snm>Singer</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Xiong</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <issue>12</issue>
            <fpage>6730</fpage>
            <lpage>6735</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">34421</pubid>
                  <pubid idtype="pmpid" link="fulltext">11381113</pubid>
                  <pubid idtype="doi">10.1073/pnas.111153698</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Deriving quantitative conclusions from microarray expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Olshen</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Jain</snm>
                  <fnm>AN</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>7</issue>
            <fpage>961</fpage>
            <lpage>970</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.7.961</pubid>
                  <pubid idtype="pmpid" link="fulltext">12117794</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Gene-expression profiles in hereditary breast cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Hedenfalk</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Duggan</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Radmacher</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bittner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Meltzer</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Gusterson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Esteller</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kallioniemi</snm>
                  <fnm>OP</fnm>
               </au>
               <au>
                  <snm>Wilfond</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Borg</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Trent</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Raffeld</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yakhini</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Ben-Dor</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Dougherty</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kononen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bubendorf</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Fehrle</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Pittaluga</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gruvberger</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Loman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Johannsson</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Olsson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sauter</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>N Engl J Med</source>
            <pubdate>2001</pubdate>
            <volume>344</volume>
            <issue>8</issue>
            <fpage>539</fpage>
            <lpage>548</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1056/NEJM200102223440801</pubid>
                  <pubid idtype="pmpid" link="fulltext">11207349</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Diagnosis of multiple cancer types by shrunken centroids of gene expression</p>
            </title>
            <aug>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hastie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Narasimhan</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Chu</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <issue>10</issue>
            <fpage>6567</fpage>
            <lpage>6572</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">124443</pubid>
                  <pubid idtype="pmpid" link="fulltext">12011421</pubid>
                  <pubid idtype="doi">10.1073/pnas.082099299</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Linear regression and two-class classification with gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Pan</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>16</issue>
            <fpage>2072</fpage>
            <lpage>2078</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg283</pubid>
                  <pubid idtype="pmpid" link="fulltext">14594712</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>PCA disjoint models for multiclass cancer analysis using gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Bicciato</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Luchini</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Di Bello</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>5</issue>
            <fpage>571</fpage>
            <lpage>578</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg051</pubid>
                  <pubid idtype="pmpid" link="fulltext">12651714</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Classification of microarray data with factor mixture models</p>
            </title>
            <aug>
               <au>
                  <snm>Martella</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <issue>2</issue>
            <fpage>202</fpage>
            <lpage>208</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti779</pubid>
                  <pubid idtype="pmpid" link="fulltext">16287938</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Consensus analysis of multiple classifiers using non-repetitive variables: diagnostic application to microarray gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Su</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Hong</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Perkins</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Shao</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Cai</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Tong</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Comput Biol Chem</source>
            <pubdate>2007</pubdate>
            <volume>31</volume>
            <issue>1</issue>
            <fpage>48</fpage>
            <lpage>56</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.compbiolchem.2007.01.001</pubid>
                  <pubid idtype="pmpid" link="fulltext">17303535</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Knowledge acquisition and development of accurate rules for predicting protein stability changes</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>LT</fnm>
               </au>
               <au>
                  <snm>Gromiha</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Hwang</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Ho</snm>
                  <fnm>SY</fnm>
               </au>
            </aug>
            <source>Comput Biol Chem</source>
            <pubdate>2006</pubdate>
            <volume>30</volume>
            <issue>6</issue>
            <fpage>408</fpage>
            <lpage>415</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.compbiolchem.2006.06.004</pubid>
                  <pubid idtype="pmpid" link="fulltext">17000135</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Sequence analysis and rule development of predicting protein stability change upon mutation using decision tree model</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>LT</fnm>
               </au>
               <au>
                  <snm>Gromiha</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Ho</snm>
                  <fnm>SY</fnm>
               </au>
            </aug>
            <source>J Mol Model</source>
            <pubdate>2007</pubdate>
            <volume>13</volume>
            <issue>8</issue>
            <fpage>879</fpage>
            <lpage>890</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00894-007-0197-4</pubid>
                  <pubid idtype="pmpid">17394029</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>iPTREE-STAB: interpretable decision tree based method for predicting protein stability changes upon mutations</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>LT</fnm>
               </au>
               <au>
                  <snm>Gromiha</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Ho</snm>
                  <fnm>SY</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <issue>10</issue>
            <fpage>1292</fpage>
            <lpage>1293</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btm100</pubid>
                  <pubid idtype="pmpid" link="fulltext">17379687</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Mining molecular fragments: finding relevant substructures of molecules</p>
            </title>
            <aug>
               <au>
                  <snm>Borgelt</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Berthold</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <publisher>The 2002 IEEE International Conference on Data Mining, Washington, DC</publisher>
            <pubdate>2002</pubdate>
            <fpage>51</fpage>
            <lpage>58</lpage>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Extraction of knowledge on protein-protein interaction by association rule discovery</p>
            </title>
            <aug>
               <au>
                  <snm>Oyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kitano</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Satou</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ito</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>5</issue>
            <fpage>705</fpage>
            <lpage>714</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.5.705</pubid>
                  <pubid idtype="pmpid" link="fulltext">12050067</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Mining gene expression databases for association rules</p>
            </title>
            <aug>
               <au>
                  <snm>Creighton</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hanash</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>1</issue>
            <fpage>79</fpage>
            <lpage>86</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/19.1.79</pubid>
                  <pubid idtype="pmpid" link="fulltext">12499296</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Integrated analysis of gene expression by Association Rules Discovery</p>
            </title>
            <aug>
               <au>
                  <snm>Carmona-Saez</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chagoyen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rodriguez</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Trelles</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Carazo</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Pascual-Montano</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>54</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1386712</pubid>
                  <pubid idtype="pmpid" link="fulltext">16464256</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-7-54</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Simple rules underlying gene expression profiles of more than six subtypes of acute lymphoblastic leukemia (ALL) patients</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Downing</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Yeoh</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>1</issue>
            <fpage>71</fpage>
            <lpage>78</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/19.1.71</pubid>
                  <pubid idtype="pmpid" link="fulltext">12499295</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Simple decision rules for classifying human cancers from gene expression profiles</p>
            </title>
            <aug>
               <au>
                  <snm>Tan</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Naiman</snm>
                  <fnm>DQ</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Winslow</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Geman</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>20</issue>
            <fpage>3896</fpage>
            <lpage>3904</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1987374</pubid>
                  <pubid idtype="pmpid" link="fulltext">16105897</pubid>
                  <pubid idtype="doi">10.1093/bioinformatics/bti631</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia</p>
            </title>
            <aug>
               <au>
                  <snm>Armstrong</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Staunton</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Silverman</snm>
                  <fnm>LB</fnm>
               </au>
               <au>
                  <snm>Pieters</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>den Boer</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Minden</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Sallan</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Lander</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Golub</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Korsmeyer</snm>
                  <fnm>SJ</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>1</issue>
            <fpage>41</fpage>
            <lpage>47</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng765</pubid>
                  <pubid idtype="pmpid" link="fulltext">11731795</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Comparison of discrimination methods for the classification of tumors using gene expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Dudoit</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fridlyand</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Speed</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Technical Report 576, Statistics Dept, UC Berkeley</source>
            <pubdate>2000</pubdate>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Machine learning methods for predictive proteomics</p>
            </title>
            <aug>
               <au>
                  <snm>Barla</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jurman</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Riccadonna</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Merler</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chierici</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Furlanello</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Brief Bioinform</source>
            <pubdate>2008</pubdate>
            <volume>9</volume>
            <issue>2</issue>
            <fpage>119</fpage>
            <lpage>128</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bib/bbn008</pubid>
                  <pubid idtype="pmpid" link="fulltext">18310105</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Structured design: fundamentals of a discipline of computer program and systems design</p>
            </title>
            <aug>
               <au>
                  <snm>Yourdon</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Constantine</snm>
                  <fnm>LL</fnm>
               </au>
            </aug>
            <publisher>Englewood Cliffs, N.J., Prentice Hall</publisher>
            <pubdate>1979</pubdate>
         </bibl>
         <bibl id="B24">
            <title>
               <p>A practical approach to microarray data analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Berrar</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Dubitzky</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Granzow</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <publisher>Boston, MA, Kluwer Academic Publishers</publisher>
            <pubdate>2003</pubdate>
            <xrefbib>
               <pubid idtype="pmpid">12603013</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Chi2: Feature Selection and Discretization of Numeric Attributes</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Setiono</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Seventh International Conference on Tools with Artificial Intelligence (ICTAI)</source>
            <pubdate>1995</pubdate>
            <fpage>388</fpage>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Data Mining: Practical machine learning tools and techniques</p>
            </title>
            <aug>
               <au>
                  <snm>Witten</snm>
                  <fnm>IH</fnm>
               </au>
               <au>
                  <snm>Frank</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <publisher>San Francisco, Morgan Kaufmann</publisher>
            <edition>2</edition>
            <pubdate>2005</pubdate>
         </bibl>
         <bibl id="B27">
            <aug>
               <au>
                  <snm>Theodoridis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Koutroumbas</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Pattern recognition</source>
            <publisher>Amsterdam; Boston, Elsevier/Academic Press</publisher>
            <edition>3</edition>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Analysis and prediction of protein folding rates using quadratic response surface models</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>LT</fnm>
               </au>
               <au>
                  <snm>Gromiha</snm>
                  <fnm>MM</fnm>
               </au>
            </aug>
            <source>J Comput Chem</source>
            <pubdate>2008</pubdate>
            <volume>29</volume>
            <issue>10</issue>
            <fpage>1675</fpage>
            <lpage>1683</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/jcc.20925</pubid>
                  <pubid idtype="pmpid" link="fulltext">18351617</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>An information theoretic approach to rule induction from databases</p>
            </title>
            <aug>
               <au>
                  <snm>Smyth</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Goodman</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>IEEE Trans Knowl Data Eng</source>
            <pubdate>1992</pubdate>
            <volume>4</volume>
            <issue>4</issue>
            <fpage>301</fpage>
            <lpage>316</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1109/69.149926</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Gene selection from microarray data for cancer classification &#8211; a machine learning approach</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Tetko</snm>
                  <fnm>IV</fnm>
               </au>
               <au>
                  <snm>Hall</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Frank</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Facius</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mayer</snm>
                  <fnm>KF</fnm>
               </au>
               <au>
                  <snm>Mewes</snm>
                  <fnm>HW</fnm>
               </au>
            </aug>
            <source>Comput Biol Chem</source>
            <pubdate>2005</pubdate>
            <volume>29</volume>
            <issue>1</issue>
            <fpage>37</fpage>
            <lpage>46</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.compbiolchem.2004.11.001</pubid>
                  <pubid idtype="pmpid" link="fulltext">15680584</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Support vector machine classification and validation of cancer tissue samples using microarray expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Cristianini</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Duffy</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bednarski</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Schummer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>10</issue>
            <fpage>906</fpage>
            <lpage>914</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.10.906</pubid>
                  <pubid idtype="pmpid" link="fulltext">11120680</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>5</issue>
            <fpage>725</fpage>
            <lpage>734</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.5.725</pubid>
                  <pubid idtype="pmpid" link="fulltext">12050069</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Optimization models for cancer classification: extracting gene interaction information from microarray expression data</p>
            </title>
            <aug>
               <au>
                  <snm>Antonov</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Tetko</snm>
                  <fnm>IV</fnm>
               </au>
               <au>
                  <snm>Mader</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Budczies</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mewes</snm>
                  <fnm>HW</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>5</issue>
            <fpage>644</fpage>
            <lpage>652</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg462</pubid>
                  <pubid idtype="pmpid" link="fulltext">15033871</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Classification using partial least squares with penalized logistic regression</p>
            </title>
            <aug>
               <au>
                  <snm>Fort</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Lambert-Lacroix</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>7</issue>
            <fpage>1104</fpage>
            <lpage>1111</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti114</pubid>
                  <pubid idtype="pmpid" link="fulltext">15531609</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Topology-based cancer classification and related pathway mining using microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>WS</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>HY</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>PC</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>PC</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>JJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>14</issue>
            <fpage>4069</fpage>
            <lpage>4080</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1557825</pubid>
                  <pubid idtype="pmpid" link="fulltext">16914437</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl583</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
