<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD 2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
	<front>
		<journal-meta>
			<journal-id journal-id-type="nlm-ta">J Proteomics Bioinform</journal-id>
			<journal-id journal-id-type="publisher-id">opg</journal-id>						
			<journal-title>Journal of Proteomics &amp; Bioinformatics</journal-title>			 
			<issn pub-type="epub">0974-276X</issn>
			<publisher>
				<publisher-name>OMICS Publishing Group</publisher-name>
				<publisher-loc>India, USA</publisher-loc>
			</publisher>
		</journal-meta>
		<article-meta>			
			<article-id pub-id-type="publisher-id">000063</article-id>
			<article-categories>
				<subj-group subj-group-type="heading">
					<subject>Research Article</subject>
				</subj-group>
				<subj-group subj-group-type="Discipline">
					<subject>Biochemistry</subject>
				</subj-group>
				<subj-group subj-group-type="System Taxonomy">
					<subject>Proteomics</subject>
					<subject>Bioinformatics</subject>
					<subject>Genomics</subject>
					<subject>Transcriptomics</subject>
					<subject>Biomarkers</subject>
				</subj-group>
			</article-categories>
			<title-group>
				<article-title>Exploring the Interplay of Sequence and Structural Features in Determining the Flexibility of AGC Kinase Protein Family: A Bioinformatics Approach</article-title>
			</title-group>
			<contrib-group>
				<contrib contrib-type="author">
					<name>
						<surname>Banerjee</surname>
						<given-names>Amit Kumar</given-names>
					</name>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Arora</surname>
						<given-names>Neelima</given-names>
					</name>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Pranitha</surname>
						<given-names>Varakantham</given-names>
					</name>
				</contrib>
				<contrib contrib-type="author">
					<name>
						<surname>Murty</surname>
						<given-names>U. S. N.</given-names>
					</name>					
				</contrib>
			</contrib-group>
			<aff id="af1">Bioinformatics Group, Biology Division, Indian Institute of Chemical Technology, Hyderabad-500607, A.P., India.</aff>
			<author-notes>
				<corresp id="cor1">To whom correspondence should be addressed: Dr. U.S.N. Murty, Deputy Director/Scientist “F”, Head, Biology Division, Indian Institute of Chemical Technology, Hyderabad-500607, India, Phone: +91 40 27193134; Fax: +91 40 27193227; E-mail: <email>murty_usn@yahoo.com</email></corresp>
			</author-notes>
			<pub-date pub-type="collection">
			     <month>05</month>
				 <year>2008</year>
			</pub-date>
			<pub-date pub-type="epub">
				<day>20</day>
				<month>05</month>
				<year>2008</year>
			</pub-date>			
			<volume>1</volume>
			<issue>2</issue>
			<fpage>077</fpage>
			<lpage>089</lpage>
			<history>
			<date date-type="received">
			     <day>02</day>
				 <month>05</month>
				 <year>2008</year>
			</date>
			<date date-type="accepted">
			      <day>15</day>
				  <month>05</month>
				  <year>2008</year>
			</date>
			</history>
			<permissions>				 
			<copyright-statement><bold>Copyright:</bold> &copy; 2008 Amit KB, et al.</copyright-statement>
			<copyright-year>2008</copyright-year>
			<license license-type="open access">
			<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</p>
			</license>
			</permissions>			
			<abstract>
				<p>In this study, a data mining approach was used to generate association rules for predicting average flexibility from derived sequence and structural features. Twenty-one parameters were calculated for 115 mouse and human sequences of the AGC kinase family, and their variable importance was estimated using Classification and Regression Tree (CART). Beta turns were found to have the maximum influence on average flexibility, while total beta strands exerted the minimum impact. Understanding variable importance should prove useful as a simple predictor of flexibility from an amino acid sequence. This will aid in better understanding of the phenomena underlying average flexibility and may thus pave the way for rational design of therapeutics.</p>
			</abstract>
			<kwd-group>
				<kwd>AGC kinase</kwd>
				<kwd>Protein flexibility</kwd>
				<kwd>Data mining</kwd>
				<kwd>Classification and Regression Tree (CART)</kwd>
				<kwd>Bioinformatics</kwd>
			</kwd-group>
			 <custom-meta-wrap>
				<custom-meta>
					<meta-name>citation</meta-name>
					<meta-value>Banerjee AK, Arora N, Pranitha V, Murty USN (2008) Exploring the Interplay of Sequence and Structural Features in Determining the Flexibility of AGC Kinase Protein Family: A Bioinformatics Approach.</meta-value>
				</custom-meta>
			</custom-meta-wrap>
		</article-meta>
	</front>
  <body>
	<sec id="s1">
			<title>Introduction</title>
				<p>Every biological molecule is characterized and set apart from other biomolecules by a definite set of intrinsic properties. As a determinant of vital functions such as transport of metabolites (<xref ref-type="bibr" rid="r1">Anderson et al., 1990</xref>; <xref ref-type="bibr" rid="r32">Spurlino et al., 1991</xref>), catalysis (<xref ref-type="bibr" rid="r3">Bennett and Steitz, 1978</xref>; <xref ref-type="bibr" rid="r30">Remington et al., 1982</xref>) and regulation of protein activity (<xref ref-type="bibr" rid="r28">Perutz, 1970</xref>; <xref ref-type="bibr" rid="r27">Perutz, 1989</xref>), average flexibility holds prime importance. Eukaryotic proteins demonstrate higher flexibility, which influences the conformational lability required in important biological processes like molecular recognition, interaction, assembly and modification. Moreover, protein flexibility is also known to influence stability and folding. There has been a surge of interest in studies of protein flexibility owing to the discovery that some highly flexible proteins are implicated in life-threatening diseases like AIDS (HIV gp41) and scrapie (<xref ref-type="bibr" rid="r9">Chan et al., 1997</xref>). A comprehensive knowledge of the fundamental nature of average flexibility will facilitate the unraveling of structure-function relationships and will also aid in the development of novel therapeutics (<xref ref-type="bibr" rid="r33">Teague, 2003</xref>).</p>
				<p>The AGC protein kinase family, one of the eight ePK families defined in KinBase, includes many important enzymes such as cyclic nucleotide-dependent and calcium-phospholipid-dependent kinases, ribosomal S6-phosphorylating kinases, G protein-coupled kinases, and a few others. The AGC serine/threonine kinases, known for phosphorylating sites surrounded by basic amino acids, are involved in many intracellular signaling pathways and critical cellular processes, and control cell growth, differentiation and cell survival. Their crucial role in transmembrane signaling hints at the importance of features of AGC kinases that may be responsible for membrane localization (<xref ref-type="bibr" rid="r29">Peterson and Schreiber, 1999</xref>). This group of protein kinases shares similarity within the catalytic domain and is characterized by a similar mechanism of activation. Deregulation of AGC kinases is known to have implications in several diseases such as cancer, diabetes and neurodegeneration, and thus AGC kinases represent attractive targets for small-molecule inhibitors of therapeutic significance (<xref ref-type="bibr" rid="r7">Breitenlechner et al., 2003</xref>).</p>
				<p>Their stringent spatio-temporal regulation is attained through loop phosphorylation and repositioning of the key catalytic and substrate-binding regions, which indicates the importance of flexibility in these proteins (<xref ref-type="bibr" rid="r20">Kannan et al., 2007</xref>). The literature on protein flexibility is extensive, but elucidating the effect of the parameters influencing it remains cumbersome. This study aims at exploring the importance of different parameters influencing the average flexibility of the AGC kinase family using a data mining approach.</p>
				</sec>
		<sec sec-type="methods">
			<title>Materials and Methods</title>
			    <sec>
				<title>Sequence Collection and Pre-Processing</title>
				<p>Protein sequences of the enzymes belonging to the AGC family of the protein kinase superfamily were collected in FASTA format from the non-redundant (NR) protein database of NCBI (<ext-link ext-link-type="uri" xlink:href="http://www.ncbi.nlm.nih.gov">http://www.ncbi.nlm.nih.gov</ext-link>). Partial sequences were excluded from the study, and the remaining sequences were filtered manually to minimize redundancy. This approach yielded 600 sequences from the total of 1259 AGC family sequences available in the database. Of these, the sequences belonging to Homo sapiens (59) and Mus musculus (56) were considered for this study.</p>
          <p><xref ref-type="table" rid="t1">Table 1</xref></p>	                         			
					    <fig id="g1">
					        <label>Figure 1</label>
					        <caption>
						<title>Frequency distribution chart for different parameters generated in CART</title>
					</caption>
					<graphic xlink:href="JPB-01-077-g001.tif"/>
				</fig>				
				<p>Fourteen trees with different complexities and error values, obtained using CART based on the splitting criteria, are listed in <xref ref-type="table" rid="t2">Table 2</xref>. Of these, the tree with 21 terminal nodes, minimum complexity, a re-substitution relative error of 0.08501 and a cross-validated error of 0.72543 ± 0.12560, generated by the Least Square splitting criterion, was selected for generating decision rules. The topology of the tree and its error rate are shown in <xref ref-type="fig" rid="g2">Figure 2</xref>. Splitters for the regression tree are shown in <xref ref-type="fig" rid="g3">Figure 3</xref>. Decision rules obtained using CART are summarized in <xref ref-type="table" rid="t3">Table 3</xref> (Supplement).</p>
				<fig id="g2">
					<label>Figure 2</label>
					<caption>
						<title>The tree sequence of lowest complexity, which yielded 21 terminal nodes (A), with the cross-validation error rate (B) and terminal node box plot (C).</title>
					</caption>
					<graphic xlink:href="JPB-01-077-g002.tif"/>
				</fig>				
				<fig id="g3">
					<label>Figure 3</label>
					<caption>
						<title>Details of the splitters for the decision tree</title>
					</caption>
					<graphic xlink:href="JPB-01-077-g003.tif"/>
				</fig>
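				<p>As a minimal sketch of the least-squares splitting criterion named above, the following Python fragment scans a single predictor for the threshold that minimizes the summed squared deviation of the response within the two child nodes. It is not the CART software used in this study; the function names and the toy values are hypothetical and serve only as an illustration.</p>
				<preformat><![CDATA[
```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty node)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_ls_split(x, y):
    """Return (threshold, error) for the best split of the form x <= threshold.

    Candidate thresholds are midpoints between consecutive distinct x values;
    the error is the SSE of the left child plus the SSE of the right child.
    If no split improves on the parent node, the threshold is None.
    """
    pairs = sorted(zip(x, y))
    best = (None, sse(list(y)))  # no split: error of the parent node
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical predictor values cannot be separated
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        err = sse(left) + sse(right)
        if err < best[1]:
            best = ((lo + hi) / 2, err)
    return best


# Hypothetical toy data: one predictor and a flexibility-like response.
toy_x = [13.9, 14.0, 14.1, 14.4, 14.5, 14.6]
toy_y = [0.457, 0.456, 0.458, 0.436, 0.437, 0.435]
threshold, error = best_ls_split(toy_x, toy_y)
```
]]></preformat>
				<p>On the toy data the selected threshold falls midway between 14.1 and 14.4, where the two groups of responses separate. A full regression tree would repeat this search over every predictor and then recursively within each child node.</p>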
				<p>Rules derived from CART can be read as simple “If” and “Then” statements and are thus self-explanatory.</p>
				<p>For example, Rules 1 and 14 can be interpreted as follows:</p>
				<p><bold>Rule 1:</bold> IF &ldquo;BULKINESS &lt;= 14.2207&rdquo; &amp; &ldquo;ALPHA-HELIX &lt;= 1.01975&rdquo; &amp; &ldquo;A.A. COMPOSITION &lt;= 5.55&rdquo;, THEN &ldquo;AVERAGE FLEXIBILITY = 0.457&rdquo;.</p>
				<p><bold>Rule 14:</bold> IF &ldquo;RECOGNITION FACTORS &lt;= 89.4723&rdquo; &amp; &ldquo;TRANSMEMBRANE TENDENCY &lt;= -54225&rdquo; &amp; &ldquo;ALPHA-HELIX &gt; 1.01975&rdquo; &amp; &ldquo;TOTAL BETA-STRAND &gt; 0.95975 &amp; &lt;= 1.018&rdquo; &amp; &ldquo;A.A. COMPOSITION &lt;= 6.0055&rdquo; &amp; &ldquo;RELATIVE MUTABILITY &lt;= 80.0835&rdquo;, THEN &ldquo;AVERAGE FLEXIBILITY = 0.436563&rdquo;.</p>
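				<p>Read this way, each rule is a conjunction of threshold tests followed by a predicted value. The following minimal Python sketch encodes Rule 1 with the thresholds quoted above. The dictionary keys and the function name are hypothetical, and the function returns None when the rule does not apply, since the remaining terminal nodes are not encoded here.</p>
				<preformat><![CDATA[
```python
def rule_1(features):
    """Return the Rule 1 prediction, or None if the rule does not apply."""
    applies = (
        features["bulkiness"] <= 14.2207
        and features["alpha_helix"] <= 1.01975
        and features["aa_composition"] <= 5.55
    )
    return 0.457 if applies else None


# Hypothetical feature values chosen only to exercise both branches.
inside = {"bulkiness": 14.0, "alpha_helix": 1.0, "aa_composition": 5.0}
outside = {"bulkiness": 14.5, "alpha_helix": 1.0, "aa_composition": 5.0}
```
]]></preformat>
				<p>Calling rule_1(inside) yields the node value 0.457, whereas rule_1(outside) yields None because the bulkiness test fails; a complete predictor would chain all 21 terminal-node rules in the same manner.</p>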
				</sec>
				</sec>
            <sec>
			<title>Variable Importance</title>
				<p>The importance of the different variables was calculated from the predefined scores in CART and is summarized in <xref ref-type="table" rid="t4">Table 4</xref>.</p>
		</sec>	
		<sec id="s3">
			<title>Discussion</title>
				<p>The dynamic nature of proteins, conferred by their structural flexibility, is associated with function. Average flexibility, an innate property of proteins, has recently been recognized as having implications in many important physiological processes (<xref ref-type="bibr" rid="r34">Wright and Dyson 1999</xref>; <xref ref-type="bibr" rid="r8">Bright et al. 2001</xref>; <xref ref-type="bibr" rid="r14">Dunker et al. 2001</xref>; <xref ref-type="bibr" rid="r24">Namba 2001</xref>). The recognition of several highly flexible proteins in some pathological conditions has lent momentum to studies of protein flexibility. The large gap between the number of known sequences and the number of structures in the PDB limits the use of 3-dimensional structures for deriving features related to flexibility, such as B-factors. When such data are unavailable, sequence composition and secondary structure provide a rough estimate of structural properties. This warrants an alternative, simple approach for determining the effect of various parameters on average flexibility as an easy-to-understand quantitative relationship. Data mining approaches based on decision trees have been successfully exploited in elucidating the importance of features affecting important biological processes (<xref ref-type="bibr" rid="r2">Banerjee et al., 2007</xref>). CART has been exploited in microarray studies (<xref ref-type="bibr" rid="r5">Boulesteix et al., 2003</xref>), ecological studies (<xref ref-type="bibr" rid="r12">De'ath &amp; Fabricius, 2000</xref>), risk prediction (<xref ref-type="bibr" rid="r16">Gottschalk et al., 1998</xref>), disease diagnosis (<xref ref-type="bibr" rid="r17">Hermanek &amp; Holzmann, 1994</xref>) and social studies (<xref ref-type="bibr" rid="r25">Özge et al., 2006</xref>).</p>
				<p>The dataset comprising various derived features was used to elucidate decision rules by CART that can serve as a rule of thumb for finding the effect of different parameters on average flexibility, an effect that is virtually impossible to measure simultaneously in the laboratory using conventional approaches. Among the secondary structure features, beta turn, alpha helix, coil, parallel beta strand, beta sheet and total beta strands were found to influence average flexibility in descending order. Among sequence features, % accessible residues, transmembrane tendency, amino acid composition, bulkiness, recognition factors, molecular weight, polarity, hydrophobicity, average area buried, refractivity, no. of codons, % buried residues, and relative mutability were observed to affect average flexibility in decreasing order (<xref ref-type="table" rid="t4">Table 4</xref>). Beta turns were found to have the maximum impact, while total beta strands had the minimum effect on the average flexibility of the proteins considered in the study. As more and more studies advocate the inclusion of protein flexibility in docking algorithms, it will be interesting to gain insight into the features influencing the flexibility of proteins. It is speculated that extensive knowledge of protein flexibility and of the various parameters contributing to it is important for rational drug design. Such an approach will lead to a better understanding of the underlying biological phenomena and aid in enzyme engineering.</p>
		</sec>
		<sec id="s4">
		<title>Accession numbers of the considered AGC kinase protein sequences are as follows</title>
			<p>O70291.1, P0C605.1, P16054.1, P18654.2, P23298.1, P31750.1, P54265.1, P68181.2,</p>
            <p>P70268.3, P70336.1, Q3UU96.2, O70293.1, P05132.3, P18653.1, P20444.3, P28867.3,</p>
            <p>P49025.3, P63318.1, P68404.3, P70335.1, Q3U214.2, Q3UYH7.1, Q7TPS0.2,</p>
            <p>Q7TSE6.1, Q7TSJ6.1, Q7TT50.1, Q8BSK8.1, Q8BWW9.2, Q8BYR2.2, Q8C0P0.1,</p>
            <p>Q8C050.2, Q8K045.1, Q8VEB1.2, Q9ERE3.1, Q9QZS5.1, Q9R1L5.3,</p>
            <p>Q9WUA6.1, Q9WUT3.1, Q9WVC6.1, Q9WVL4.1, Q9Z0Z0.1, Q9Z1M4.1, Q9Z2A0.2,</p>
            <p>Q9Z2B9.1, Q8OUW5.2, Q91VJ4.1, Q99MK8.2, Q811L6.2, Q922R0.1, Q02111.1,</p>
            <p>Q02956.1, Q60592.1, Q60823.1, Q61410.1, Q62074.2, P41743.1, P43250.2, P51812.1,</p>
            <p>P51817.1, Q02156.1, Q16513.1, Q16512.1, Q15835.1, Q15418.2, Q15349.2, Q15208.1,</p>
            <p>Q13976.3, Q13464.1, Q13237.1, CAE55958.1, NP_443073.1, O00141.2, O14578.2,</p>
            <p>O15021.2, O15530.1, O60307.2, O75116.3, O75582.1, O75676.1, O95835.1, P05129.3,</p>
            <p>P05771.4, P14619.1, P17252.3, P17612.2, P22612.3, P22694.2, P23443.2, P24256.1,</p>
            <p>P24723.2, P25098.2, P31749.2, P31751.2, P32298.3, P34947.1, P35626.2, Q09013.1,</p>
            <p>Q05655.1, Q05513.4, Q04759.3, Q96GX5.1, Q96BR1.1, Q9Y243.1, Q9Y5S2.2,</p>
            <p>Q9Y2H9.2, Q9Y2H1.3, Q9UK32.1, Q9UBS0.1, Q9NRM7.1, Q9HBY8.1, Q8WTQ7.1,</p>
            <p>Q6P5Z2.1, Q6P0Q8.2, Q6DT37.1, Q5VT25.1.</p>
        </sec>
</body>
	<back>
		<ack>
			<p>The authors thank Dr. J.S. Yadav, Director, IICT, for his continuous support and encouragement. We thank the anonymous reviewers for their critical suggestions for the improvement of the manuscript.</p>
		</ack>		
		<ref-list>
			<title>References</title>
					<ref id="r1">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Anderson</surname>
								<given-names>BF</given-names>
							</name>
							<name>
								<surname>Baker</surname>
								<given-names>HM</given-names>
							</name>
							<name>
								<surname>Morris</surname>
								<given-names>GE</given-names>
							</name>
							<name>
								<surname>Rumball</surname>
								<given-names>SV</given-names>
							</name>
							<name>
								<surname>Baker</surname>
								<given-names>EN</given-names>
							</name>
							</person-group>
							<year>1990</year>
							<article-title>Apolactoferrin structure demonstrates ligand-induced conformational change in transferrins</article-title>
							<source>Nature</source>
							<volume>344</volume>
							<fpage>784</fpage>
							<lpage>787</lpage>
					</citation>
					</ref>
					<ref id="r2">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Banerjee</surname>
								<given-names>AK</given-names>
							</name>
							<name>
								<surname>Arora</surname>
								<given-names>N</given-names>
							</name>
							<name>
								<surname>Murty</surname>
								<given-names>USN</given-names>
							</name>
							</person-group>
							<year>2007</year>
							<article-title>Stability of ITS2 Secondary Structure in Anopheles: What Lies Beneath?</article-title>
							<source>International Journal of Integrative Biology</source>
							<volume>3</volume>
							<fpage>232</fpage>
							<lpage>238</lpage>
					</citation>
					</ref>
					<ref id="r3">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Bennett</surname>
								<given-names>WS</given-names>
								<suffix>Jr</suffix>
							</name>
							<name>
								<surname>Steitz</surname>
								<given-names>TA</given-names>
							</name>
							</person-group>
							<year>1978</year>
							<article-title>Glucose-induced conformational change in yeast hexokinase</article-title>
							<source>Proc Natl Acad Sci USA</source>
							<volume>75</volume>
							<fpage>4848</fpage>
							<lpage>4852</lpage>
					</citation>
					</ref>
					<ref id="r4">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Bhaskaran</surname>
								<given-names>R</given-names>
							</name>
							<name>
								<surname>Ponnuswamy</surname>
								<given-names>PK</given-names>
							</name>
							</person-group>
							<year>1988</year>
							<article-title>Positional flexibilities of amino acid residues in globular proteins</article-title>
							<source>Int J Pept Prot Res</source>
							<volume>32</volume>
							<fpage>242</fpage>
							<lpage>255</lpage>
					</citation>
					</ref>
					<ref id="r5">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Boulesteix</surname>
								<given-names>AL</given-names>
							</name>
							<name>
								<surname>Tutz</surname>
								<given-names>G</given-names>
							</name>
							<name>
								<surname>Strimmer</surname>
								<given-names>K</given-names>
							</name>
							</person-group>
							<year>2003</year>
							<article-title>A CART-based approach to discover emerging patterns in microarray data</article-title>
							<source>Bioinformatics</source>
							<volume>19</volume>
							<fpage>2465</fpage>
							<lpage>2472</lpage>
					</citation>
					</ref>
					<ref id="r6">
					<citation citation-type="book">
							<person-group>
							<name>
								<surname>Breiman</surname>
								<given-names>L</given-names>
							</name>
							<name>
								<surname>Friedman</surname>
								<given-names>JH</given-names>
							</name>
							<name>
								<surname>Olshen</surname>
								<given-names>RA</given-names>
							</name>
							<name>
								<surname>Stone</surname>
								<given-names>CJ</given-names>
							</name>
							</person-group>
							<year>1984</year>
							<source>Classification and regression trees</source>
							<publisher-name>Chapman &amp; Hall</publisher-name>
							<publisher-loc>New York, NY</publisher-loc>
					</citation>
					</ref>
					<ref id="r7">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Breitenlechner</surname>
								<given-names>C</given-names>
							</name>
							<name>
								<surname>Gabel</surname>
								<given-names>M</given-names>
							</name>
							<name>
								<surname>Engh</surname>
								<given-names>R</given-names>
							</name>
							<name>
								<surname>Bossemeyer</surname>
								<given-names>D</given-names>
							</name>
							</person-group>
							<year>2003</year>
							<article-title>Structural Insights Into AGC Kinase Inhibition</article-title>
							<source>Oncology Research Featuring Preclinical and Clinical Cancer Therapeutics</source>
							<volume>14</volume>
							<fpage>267</fpage>
							<lpage>278</lpage>
					</citation>
					</ref>
					<ref id="r8">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Bright</surname>
								<given-names>JN</given-names>
							</name>
							<name>
								<surname>Woolf</surname>
								<given-names>TB</given-names>
							</name>
							<name>
								<surname>Hoh</surname>
								<given-names>JH</given-names>
							</name>
							</person-group>
							<year>2001</year>
							<article-title>Predicting properties of intrinsically unstructured proteins</article-title>
							<source>Prog Biophys Mol Biol</source>
							<volume>76</volume>
							<fpage>131</fpage>
							<lpage>173</lpage>
					</citation>
					</ref>
					<ref id="r9">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Chan</surname>
								<given-names>DC</given-names>
							</name>
							<name>
								<surname>Fass</surname>
								<given-names>D</given-names>
							</name>
							<name>
								<surname>Berger</surname>
								<given-names>JM</given-names>
							</name>
							<name>
								<surname>Kim</surname>
								<given-names>PS</given-names>
							</name>
							</person-group>
							<year>1997</year>
							<article-title>Core structure of gp41 from the HIV envelope glycoprotein</article-title>
							<source>Cell</source>
							<volume>89</volume>
							<fpage>263</fpage>
							<lpage>273</lpage>
					</citation>
					</ref>
					<ref id="r10">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Chou</surname>
								<given-names>PY</given-names>
							</name>
							<name>
								<surname>Fasman</surname>
								<given-names>GD</given-names>
							</name>
							</person-group>
							<year>1978</year>
							<article-title>Prediction of the secondary structure of proteins from their amino acid sequence</article-title>
							<source>Adv Enzymol Relat Areas Mol Biol</source>
							<volume>47</volume>
							<fpage>45</fpage>
							<lpage>148</lpage>
					</citation>
					</ref>
					<ref id="r11">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Dayhoff</surname>
								<given-names>MO</given-names>
							</name>
							<name>
								<surname>Schwartz</surname>
								<given-names>RM</given-names>
							</name>
							<name>
								<surname>Orcutt</surname>
								<given-names>BC</given-names>
							</name>
							</person-group>
							<year>1978</year>
							<article-title>A model of evolutionary change in proteins</article-title>
							<source>Atlas of Protein Sequence and Structure (Dayhoff MO, ed.), National Biomedical Research Foundation, Washington, DC</source>
							<volume>5 (Suppl 3)</volume>
							<fpage>345</fpage>
							<lpage>352</lpage>
					</citation>
					</ref>
					<ref id="r12">
					<citation citation-type="journal">
							<person-group>

							<name>
								<surname>De'ath</surname>
								<given-names>G</given-names>
							</name>
							<name>
								<surname>Fabricius</surname>
								<given-names>KE</given-names>
							</name>
							</person-group>
							<year>2000</year>
							<article-title>Classification and regression trees: a powerful yet simple technique for ecological data analysis</article-title>
							<source>Ecology</source>
							<volume>81</volume>
							<fpage>3178</fpage>
							<lpage>3192</lpage>
					</citation>
					</ref>
					<ref id="r13">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Deleage</surname>
								<given-names>G</given-names>
							</name>
							<name>
								<surname>Roux</surname>
								<given-names>B</given-names>
							</name>
							</person-group>
							<year>1987</year>
							<article-title>An algorithm for protein secondary structure prediction based on class prediction</article-title>
							<source>Protein Engineering Design and Selection</source>
							<volume>1</volume>
							<fpage>289</fpage>
							<lpage>294</lpage>
					</citation>
					</ref>
					<ref id="r14">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Dunker</surname>
								<given-names>AK</given-names>
							</name>
							<name>
								<surname>Lawson</surname>
								<given-names>DJ</given-names>
							</name>
							<name>
								<surname>Brown</surname>
								<given-names>CJ</given-names>
							</name>
							<name>
								<surname>Williams</surname>
								<given-names>RM</given-names>
							</name>
							<name>
								<surname>Romero</surname>
								<given-names>P</given-names>
							</name><etal/>
							</person-group>
							<year>2001</year>
							<article-title>Intrinsically disordered protein</article-title>
							<source>J Mol Graph Model</source>
							<volume>19</volume>
							<fpage>26</fpage>
							<lpage>59</lpage>
					</citation>
					</ref>
					<ref id="r15">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Fraga</surname>
								<given-names>S</given-names>
							</name>
							</person-group>
							<year>1982</year>
							<article-title>Theoretical prediction of protein antigenic determinants from amino acid sequences</article-title>
							<source>Can J Chem</source>
							<volume>60</volume>
							<fpage>2606</fpage>
							<lpage>2610</lpage>
					</citation>
					</ref>
					<ref id="r16">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Gottschalk</surname>
								<given-names>KW</given-names>
							</name>
							<name>
								<surname>Colbert</surname>
								<given-names>JJ</given-names>
							</name>
							<name>
								<surname>Feicht</surname>
								<given-names>DL</given-names>
							</name>
							</person-group>
							<year>1998</year>
							<article-title>Tree mortality risk of oak due to gypsy moth</article-title>
							<source>European Journal of Forest Pathology</source>
							<volume>28</volume>
							<fpage>121</fpage>
							<lpage>132</lpage>
					</citation>
					</ref>
					<ref id="r17">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Hermanek</surname>
								<given-names>P</given-names>
							</name>
							<name>
								<surname>Guggenmoos-Holzmann</surname>
								<given-names>I</given-names>
							</name>
							</person-group>
							<year>1994</year>
							<article-title>Classification and regression trees (CART) for estimation of prognosis in patients with gastric carcinoma</article-title>
							<source>J Cancer Res Clin Oncol</source>
							<volume>120</volume>
							<fpage>309</fpage>
							<lpage>313</lpage>
					</citation>
					</ref>
					<ref id="r18">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Janin</surname>
								<given-names>J</given-names>
							</name>
							</person-group>
							<year>1979</year>
							<article-title>Surface and inside volumes in globular proteins</article-title>
							<source>Nature</source>
							<volume>277</volume>
							<fpage>491</fpage>
							<lpage>492</lpage>
					</citation>
					</ref>
					<ref id="r19">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Jones</surname>
								<given-names>DD</given-names>
							</name>
							</person-group>
							<year>1975</year>
							<article-title>Amino acid properties and side-chain orientation in proteins: a cross correlation approach</article-title>
							<source>J Theor Biol</source>
							<volume>50</volume>
							<fpage>167</fpage>
							<lpage>183</lpage>
					</citation>
					</ref>
					<ref id="r20">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Kannan</surname>
								<given-names>N</given-names>
							</name>
							<name>
								<surname>Haste</surname>
								<given-names>N</given-names>
							</name>
							<name>
								<surname>Taylor</surname>
								<given-names>SS</given-names>
							</name>
							<name>
								<surname>Neuwald</surname>
								<given-names>AF</given-names>
							</name>
							</person-group>
							<year>2007</year>
							<article-title>The hallmark of AGC kinase functional divergence is its C-terminal tail, a cis-acting regulatory module</article-title>
							<source>Proc Natl Acad Sci USA</source>
							<volume>104</volume>
							<fpage>1272</fpage>
							<lpage>1277</lpage>
					</citation>
					</ref>
					<ref id="r21">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Kyte</surname>
								<given-names>J</given-names>
							</name>
							<name>
								<surname>Doolittle</surname>
								<given-names>RF</given-names>
							</name>
							</person-group>
							<year>1982</year>
							<article-title>A simple method for displaying the hydrophobic character of a protein</article-title>
							<source>J Mol Biol</source>
							<volume>157</volume>
							<fpage>105</fpage>
							<lpage>132</lpage>
					</citation>
					</ref>
					<ref id="r22">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Lifson</surname>
								<given-names>S</given-names>
							</name>
							<name>
								<surname>Sander</surname>
								<given-names>C</given-names>
							</name>
							</person-group>
							<year>1979</year>
							<article-title>Antiparallel and parallel &#x3B2;-strands differ in amino acid residue preferences</article-title>
							<source>Nature</source>
							<volume>282</volume>
							<fpage>109</fpage>
							<lpage>111</lpage>
					</citation>
					</ref>
					<ref id="r23">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>McCaldon</surname>
								<given-names>P</given-names>
							</name>
							<name>
								<surname>Argos</surname>
								<given-names>P</given-names>
							</name>
							</person-group>
							<year>1988</year>
							<article-title>Oligopeptide biases in protein sequences and their use in predicting protein coding regions in nucleotide sequences</article-title>
							<source>Proteins Structure Function and Genetics</source>
							<volume>4</volume>
							<fpage>92</fpage>
							<lpage>122</lpage>
					</citation>
					</ref>
					<ref id="r24">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Namba</surname>
								<given-names>K</given-names>
							</name>
							</person-group>
							<year>2001</year>
							<article-title>Roles of partially unfolded conformations in macromolecular self-assembly</article-title>
							<source>Genes Cells</source>
							<volume>6</volume>
							<fpage>1</fpage>
							<lpage>12</lpage>
					</citation>
					</ref>
					<ref id="r25">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Ozge</surname>
								<given-names>C</given-names>
							</name>
							<name>
								<surname>Toros</surname>
								<given-names>F</given-names>
							</name>
							<name>
								<surname>Bayramkaya</surname>
								<given-names>E</given-names>
							</name>
							<name>
								<surname>Camdeviren</surname>
								<given-names>H</given-names>
							</name>
							<name>
								<surname>Sasmaz</surname>
								<given-names>T</given-names>
							</name>
							</person-group>
							<year>2006</year>
							<article-title>Which sociodemographic factors are important on smoking behaviour of high school students? The contribution of classification and regression tree methodology in a broad epidemiological survey</article-title>
							<source>Postgraduate Medical Journal</source>
							<volume>82</volume>
							<fpage>532</fpage>
							<lpage>541</lpage>
					</citation>
					</ref>
					<ref id="r26">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Parker</surname>
								<given-names>PJ</given-names>
							</name>
							<name>
								<surname>Parkinson</surname>
								<given-names>SJ</given-names>
							</name>
							</person-group>
							<year>2001</year>
							<article-title>AGC protein kinase phosphorylation and protein kinase C</article-title>
							<source>Biochemical Society Transactions</source>
							<volume>29</volume>
							<fpage>860</fpage>
							<lpage>863</lpage>
					</citation>
					</ref>
					<ref id="r27">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Perutz</surname>
								<given-names>MF</given-names>
							</name>
							</person-group>
							<year>1989</year>
							<article-title>Mechanisms of cooperativity and allosteric regulation in proteins</article-title>
							<source>Q Rev Biophys</source>
							<volume>22</volume>
							<fpage>139</fpage>
							<lpage>237</lpage>
					</citation>
					</ref>
					<ref id="r28">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Perutz</surname>
								<given-names>MF</given-names>
							</name>
							</person-group>
							<year>1970</year>
							<article-title>Stereochemistry of cooperative effects in haemoglobin</article-title>
							<source>Nature</source>
							<volume>228</volume>
							<fpage>726</fpage>
							<lpage>739</lpage>
					</citation>
					</ref>
					<ref id="r29">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Peterson</surname>
								<given-names>RT</given-names>
							</name>
							<name>
								<surname>Schreiber</surname>
								<given-names>SL</given-names>
							</name>
							</person-group>
							<year>1999</year>
							<article-title>Kinase phosphorylation: Keeping it all in the family</article-title>
							<source>Curr Biol</source>
							<volume>9</volume>
							<fpage>R521</fpage>
							<lpage>R524</lpage>
					</citation>
					</ref>
					<ref id="r30">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Remington</surname>
								<given-names>S</given-names>
							</name>
							<name>
								<surname>Wiegand</surname>
								<given-names>G</given-names>
							</name>
							<name>
								<surname>Huber</surname>
								<given-names>R</given-names>
							</name>
							</person-group>
							<year>1982</year>
							<article-title>Crystallographic refinement and atomic models of two different forms of citrate synthase at 2.7 and 1.7 &#xC5; resolution</article-title>
							<source>J Mol Biol</source>
							<volume>158</volume>
							<fpage>111</fpage>
							<lpage>152</lpage>
					</citation>
					</ref>
					<ref id="r31">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Rose</surname>
								<given-names>GD</given-names>
							</name>
							<name>
								<surname>Geselowitz</surname>
								<given-names>AR</given-names>
							</name>
							<name>
								<surname>Lesser</surname>
								<given-names>GJ</given-names>
							</name>
							<name>
								<surname>Lee</surname>
								<given-names>RH</given-names>
							</name>
							<name>
								<surname>Zehfus</surname>
								<given-names>MH</given-names>
							</name>
							</person-group>
							<year>1985</year>
							<article-title>Hydrophobicity of amino acid residues in globular proteins</article-title>
							<source>Science</source>
							<volume>229</volume>
							<fpage>834</fpage>
							<lpage>838</lpage>
					</citation>
					</ref>
					<ref id="r32">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Spurlino</surname>
								<given-names>JC</given-names>
							</name>
							<name>
								<surname>Lu</surname>
								<given-names>GY</given-names>
							</name>
							<name>
								<surname>Quiocho</surname>
								<given-names>FA</given-names>
							</name>
							</person-group>
							<year>1991</year>
							<article-title>The 2.3-&#xC5; resolution structure of the maltose- or maltodextrin-binding protein, a primary receptor of bacterial active transport and chemotaxis</article-title>
							<source>J Biol Chem</source>
							<volume>266</volume>
							<fpage>5202</fpage>
							<lpage>5219</lpage>
					</citation>
					</ref>
					<ref id="r33">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Teague</surname>
								<given-names>SJ</given-names>
							</name>
							</person-group>
							<year>2003</year>
							<article-title>Implications of protein flexibility for drug discovery</article-title>
							<source>Nat Rev Drug Discov</source>
							<volume>2</volume>
							<fpage>527</fpage>
							<lpage>541</lpage>
					</citation>
					</ref>
					<ref id="r34">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Wright</surname>
								<given-names>PE</given-names>
							</name>
							<name>
								<surname>Dyson</surname>
								<given-names>HJ</given-names>
							</name>
							</person-group>
							<year>1999</year>
							<article-title>Intrinsically Unstructured Proteins: Re-assessing the Protein Structure-Function Paradigm</article-title>
							<source>J Mol Biol</source>
							<volume>293</volume>
							<fpage>321</fpage>
							<lpage>331</lpage>
					</citation>
					</ref>
					<ref id="r35">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Zhao</surname>
								<given-names>G</given-names>
							</name>
							<name>
								<surname>London</surname>
								<given-names>E</given-names>
							</name>
							</person-group>
							<year>2006</year>
							<article-title>An amino acid &quot;transmembrane tendency&quot; scale that approaches the theoretical limit to accuracy for prediction of transmembrane helices: Relationship to biological hydrophobicity</article-title>
							<source>Protein Sci</source>
							<volume>15</volume>
							<fpage>1987</fpage>
							<lpage>2001</lpage>
					</citation>
					</ref>
					<ref id="r36">
					<citation citation-type="journal">
							<person-group>
							<name>
								<surname>Zimmerman</surname>
								<given-names>JM</given-names>
							</name>
							<name>
								<surname>Eliezer</surname>
								<given-names>N</given-names>
							</name>
							<name>
								<surname>Simha</surname>
								<given-names>R</given-names>
							</name>
							</person-group>
							<year>1968</year>
							<article-title>The characterization of amino acid sequences in proteins by statistical methods</article-title>
							<source>Journal of Theoretical Biology</source>
							<volume>21</volume>
							<fpage>170</fpage>
							<lpage>201</lpage>
					</citation>
					</ref>
       </ref-list> 
	</back>
<floats-wrap>
	<table-wrap position="float" id="t1">
	<label>Table 1.</label>
  			<caption>
  				<title>Basic statistical features of parameters considered in the study.</title>
  			</caption>
   <table frame="hsides" rules="groups">
      <thead>
         <tr>
            <th align="left">Parameter</th>
            <th align="left">Mean</th>
            <th align="left">Standard Deviation</th>
			<th align="left">Skewness</th>
			<th align="left">Coefficient of variation</th>
			<th align="left">Variance</th>	
			<th align="left">Kurtosis</th>
			<th align="left">Standard Error of the Mean</th>
         </tr>
      </thead>
      <tbody>
         <tr>
            <td>Accessible residues</td>
            <td>5.8171</td>
            <td>0.42102</td>
			<td>5.0288</td>
            <td>0.072376</td>
            <td>0.17725</td>
			<td>40.439</td>
            <td>0.03926</td>			
         </tr>
         <tr>
            <td>Buried residues</td>
            <td>5.7892</td>
            <td>0.72877</td>
			<td>-4.2973</td>
            <td>0.12588</td>
            <td>0.5311</td>	
			<td>25.436</td>
            <td>0.067958</td>		
         </tr>
         <tr>
            <td>Amino acid composition</td>
            <td>5.786</td>
            <td>0.19749</td>
			<td>-0.034656</td>
            <td>0.034133</td>
            <td>0.039003</td>	
			<td>-0.15092</td>
            <td>0.018416</td>		
         </tr>
         <tr>
            <td>Alpha helix</td>
            <td>1.0192</td>
            <td>0.031284</td>
			<td>1.4608</td>
            <td>0.030695</td>
            <td>0.0009787</td>	
			<td>9.3437</td>
            <td>0.0029173</td>		
         </tr>
		 <tr>
            <td>Beta sheet</td>
            <td>0.97093</td>
            <td>0.025983</td>
			<td>-0.20939</td>
            <td>0.026761</td>
            <td>0.00067513</td>	
			<td>1.077</td>
            <td>0.0024229</td>		
         </tr>
		 <tr>
            <td>Beta turn</td>
            <td>1.02</td>
            <td>0.027913</td>
			<td>-0.11458</td>
            <td>0.027365</td>
            <td>0.00077915</td>	
			<td>-0.24003</td>
            <td>0.0026029</td>		
         </tr>
		 <tr>
            <td>Coils</td>
            <td>1.0387</td>
            <td>0.0309</td>
			<td>0.39441</td>
            <td>0.029749</td>
            <td>0.00095484</td>	
			<td>-0.40818</td>
            <td>0.0028815</td>		
         </tr>
		 <tr>
            <td>Parallel beta strands</td>
            <td>1.0625</td>
            <td>0.050085</td>
			<td>0.045298</td>
            <td>0.047139</td>
            <td>0.0025085</td>	
			<td>-0.18584</td>
            <td>0.0046704</td>		
         </tr>
		 <tr>
            <td>Antiparallel beta strands</td>
            <td>0.9799</td>
            <td>0.033513</td>
			<td>-0.38504</td>
            <td>0.034201</td>
            <td>0.0011231</td>	
			<td>-0.11993</td>
            <td>0.0031251</td>		
         </tr>
		 <tr>
            <td>Transmembrane tendency</td>
            <td>-0.5891</td>
            <td>0.27052</td>
			<td>5.5183</td>
            <td>-0.45921</td>
            <td>0.07318</td>	
			<td>45.421</td>
            <td>0.025226</td>		
         </tr>
		 <tr>
            <td>Total beta strands</td>
            <td>0.98868</td>
            <td>0.030955</td>
			<td>-0.56077</td>
            <td>0.031309</td>
            <td>0.00095818</td>	
			<td>0.31456</td>
            <td>0.0028865</td>		
         </tr>
		 <tr>
            <td>Relative mutability</td>
            <td>76.674</td>
            <td>2.9206</td>
			<td>-0.085732</td>
            <td>0.038091</td>
            <td>8.53</td>	
			<td>-0.19263</td>
            <td>0.27235</td>		
         </tr>
		 <tr>
            <td>Refractivity</td>
            <td>16.212</td>
            <td>1.3109</td>
			<td>0.12774</td>
            <td>0.080856</td>
            <td>1.7184</td>	
			<td>0.27699</td>
            <td>0.12224</td>		
         </tr>
		 <tr>
            <td>Recognition factors</td>
            <td>88.918</td>
            <td>1.4693</td>
			<td>0.43693</td>
            <td>0.016525</td>
            <td>2.159</td>	
			<td>-0.42356</td>
            <td>0.13702</td>		
         </tr>
		 <tr>
            <td>Polarity</td>
            <td>19.936</td>
            <td>1.9885</td>
			<td>0.2598</td>
            <td>0.099744</td>
            <td>3.954</td>	
			<td>-0.022502</td>
            <td>0.18543</td>		
         </tr>
		 <tr>
            <td>Number of codons</td>
            <td>3.572</td>
            <td>0.24312</td>
			<td>-2.0097</td>
            <td>0.068063</td>
            <td>0.059107</td>	
			<td>11.473</td>
            <td>0.022671</td>		
         </tr>
		 <tr>
            <td>Molecular weight</td>
            <td>130.19</td>
            <td>3.7174</td>
			<td>-0.33221</td>
            <td>0.028553</td>
            <td>13.819</td>	
			<td>1.203</td>
            <td>0.34665</td>		
         </tr>
		 <tr>
            <td>Hydrophobicity</td>
            <td>-0.41118</td>
            <td>0.35214</td>
			<td>2.7344</td>
            <td>-0.8564</td>
            <td>0.124</td>	
			<td>15.724</td>
            <td>0.032837</td>		
         </tr>
		  <tr>
            <td>Bulkiness</td>
            <td>14.261</td>
            <td>1.1952</td>
			<td>-6.3417</td>
            <td>0.083806</td>
            <td>1.4284</td>	
			<td>54.206</td>
            <td>0.11145</td>		
         </tr>
		 <tr>
            <td>Average area buried</td>
            <td>124.92</td>
            <td>7.8686</td>
			<td>-6.3319</td>
            <td>0.062987</td>
            <td>61.915</td>	
			<td>55.828</td>
            <td>0.73375</td>		
         </tr>
		 <tr>
            <td>Average flexibility</td>
            <td>0.44019</td>
            <td>0.0060555</td>
			<td>-0.11045</td>
            <td>0.013757</td>
            <td>0.000036669</td>
			<td>0.57539</td>
            <td>0.00056468</td>		
         </tr>
     </tbody>
 	  </table>
 	</table-wrap>
	<table-wrap position="float" id="t2">
	<label>Table 2.</label>
  			<caption>
  				<title>Details of trees generated in CART along with relative error and complexities.</title>
  			</caption>
   <table frame="hsides" rules="groups">
      <thead>
         <tr>
            <th align="left">Tree Number</th>
            <th align="left">Terminal Nodes</th>
            <th align="left">Cross-Validated Relative Error</th>
			<th align="left">Resubstitution Relative Error</th>
			<th align="left">Complexity</th>			
         </tr>
      </thead>
      <tbody>
         <tr>
            <td>1</td>
            <td>21</td>
            <td>0.72543 ± 0.12560</td>
			<td>0.08501</td>
            <td>0.00000</td>           
         </tr>
         <tr>
            <td>2</td>
            <td>20</td>
            <td>0.71808 ± 0.12370</td>
			<td>0.08653</td>
            <td>0.00001</td>
         </tr>
         <tr>
            <td>3</td>
            <td>19</td>
            <td>0.71000 ± 0.11971</td>
			<td>0.08899</td>
            <td>0.00002</td>           
         </tr>
         <tr>
            <td>4</td>
            <td>15</td>
            <td>0.67935 ± 0.11594</td>
			<td>0.11571</td>
            <td>0.00003</td>            	
         </tr>
		 <tr>
            <td>5</td>
            <td>13</td>
            <td>0.66759 ± 0.11029</td>
			<td>0.14635</td>
            <td>0.00007</td>            		
         </tr>
		 <tr>
            <td>6</td>
            <td>11</td>
            <td>0.66746 ± 0.11162</td>
			<td>0.18358</td>
            <td>0.00008</td>            		
         </tr>
		 <tr>
            <td>7</td>
            <td>9</td>
            <td>0.65670 ± 0.11209</td>
			<td>0.22481</td>
            <td>0.00009</td>           
         </tr>
		 <tr>
            <td>8</td>
            <td>8</td>
            <td>0.57881 ± 0.09948</td>
			<td>0.25020</td>
            <td>0.00012</td>           
         </tr>
		 <tr>
            <td>9</td>
            <td>6</td>
            <td>0.60897 ± 0.08204</td>
			<td>0.35804</td>
            <td>0.00023</td>           
         </tr>
		 <tr>
            <td>10</td>
            <td>5</td>
            <td>0.66411 ± 0.09268</td>
			<td>0.41964</td>
            <td>0.00027</td>           
         </tr>
		 <tr>
            <td>11</td>
            <td>4</td>
            <td>0.89325 ± 0.08412</td>
			<td>0.52601</td>
            <td>0.00045</td>           
         </tr>
		<tr>
            <td>12</td>
            <td>3</td>
            <td>0.92470 ± 0.08126</td>
			<td>0.65254</td>
            <td>0.00054</td>           
         </tr> 
		 <tr>
            <td>13</td>
            <td>2</td>
            <td>0.91504 ± 0.07452</td>
			<td>0.78894</td>
            <td>0.00058</td>           
         </tr> 
		 <tr>
            <td>14</td>
            <td>1</td>
            <td>1.00139 ± 0.00159</td>
			<td>1.00000</td>
            <td>0.00089</td>           
         </tr> 
     </tbody>
 	  </table>
 	</table-wrap>
	<table-wrap position="float" id="t3">
	<label>Table 3.</label>
  			<caption>
  				<title>Association rules obtained in CART.</title>
  			</caption>
   <table frame="hsides" rules="groups">
      <thead>
         <tr>
            <th align="left">Node</th>
            <th align="left">Bulkiness</th>
            <th align="left">Polarity</th>
			<th align="left">Recognition factors</th>
			<th align="left">Transmembrane tendency</th>
			<th align="left">% Accessible residues</th>
			<th align="left">Alpha-helix</th>
			<th align="left">Beta-sheet</th>
			<th align="left">Coil</th>
			<th align="left">Total beta-strand</th>
			<th align="left">Antiparallel beta-strand</th>
			<th align="left">Parallel beta-strand</th>
			<th align="left">Amino acid composition</th>
			<th align="left">Relative mutability</th>
			<th align="left">Average flexibility</th>			
         </tr>
      </thead>
      <tbody>
         <tr>
            <td>1</td>
            <td>&lt;= 14.2207</td>
            <td></td>
			<td></td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&lt;= 5.55</td>
			<td></td>
			<td>0.457</td>		
         </tr>
         <tr>
            <td>2</td>
            <td>&lt;= 14.2207</td>
            <td></td>
			<td></td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&lt;= 0.977</td>
			<td>&gt; 5.55</td>
			<td></td>
			<td>0.4494</td>			
         </tr>
         <tr>
            <td>3</td>
            <td>&lt;= 14.2207</td>
            <td></td>
			<td>&lt;= 90.611</td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&gt; 0.977</td>
			<td>&gt; 5.55 &amp; &lt;= 5.63625</td>
			<td></td>
			<td>0.447667</td>				
         </tr>
         <tr>
            <td>4</td>
            <td>&lt;= 14.2207</td>
            <td></td>
			<td>&lt;= 90.611</td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td>&lt;= 0.98925</td>
			<td>&gt; 0.977</td>
			<td>&gt; 5.63625</td>
			<td></td>
			<td>0.443143</td>			
         </tr>
		 <tr>
            <td>5</td>
            <td>&lt;= 14.2207</td>
            <td></td>
			<td>&lt;= 90.611</td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td>&gt; 0.98925</td>
			<td>&gt; 0.977</td>
			<td>&gt; 5.63625</td>
			<td></td>
			<td>0.441429</td>				
         </tr>
		 <tr>
            <td>6</td>
            <td>&lt;= 14.2207</td>
            <td></td>
			<td>&gt; 90.611</td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&gt; 0.977</td>
			<td>&gt; 5.55</td>
			<td></td>
			<td>0.4479</td>				
         </tr>
		 <tr>
            <td>7</td>
            <td>&gt; 14.2207</td>
            <td>&lt;= 19.9293</td>
			<td></td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td></td>
			<td>&lt;= 1.0425</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>0.4336</td>				
         </tr>
		 <tr>
            <td>8</td>
            <td>&gt; 14.2207</td>
            <td>&lt;= 19.9293</td>
			<td></td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td></td>
			<td>&gt; 1.0425</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>0.438722</td>				
         </tr>
		 <tr>
            <td>9</td>
            <td>&gt; 14.2207</td>
            <td>&gt; 19.9293</td>
			<td></td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td>&lt;= 0.97275</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&lt;= 5.68875</td>
			<td></td>
			<td>0.4419</td>				
         </tr>
		 <tr>
            <td>10</td>
            <td>&gt; 14.2207</td>
            <td>&gt; 19.9293</td>
			<td></td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td>&lt;= 0.97275</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&gt; 5.68875</td>
			<td></td>
			<td>0.444667</td>				
         </tr>
		 <tr>
            <td>11</td>
            <td>&gt; 14.2207</td>
            <td>&gt; 19.9293</td>
			<td></td>
            <td></td>
            <td></td>	
			<td>&lt;= 1.01975</td>
			<td>&gt; 0.97275</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>0.4402</td>				
         </tr>
		 <tr>
            <td>12</td>
            <td></td>
            <td></td>
			<td>&lt;= 89.4723</td>
            <td>&lt;= -0.54225</td>
            <td></td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&lt;= 6.0055</td>
			<td></td>
			<td>0.44075</td>				
         </tr>
		 <tr>
            <td>13</td>
            <td></td>
            <td></td>
			<td>&lt;= 89.4723</td>
            <td>&lt;= -0.54225</td>
            <td></td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td></td>
			<td>&gt; 0.95975 &amp; &lt;= 1.018</td>
			<td></td>
			<td></td>
			<td>&lt;= 6.0055</td>
			<td>&lt;= 80.0835</td>
			<td>0.436563</td>				
         </tr>
		 <tr>
            <td>14</td>
            <td></td>
            <td></td>
			<td>&lt;= 89.4723</td>
            <td>&lt;= -0.54225</td>
            <td></td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td></td>
			<td>&gt; 0.95975 &amp; &lt;= 1.018</td>
			<td></td>
			<td></td>
			<td>&lt;= 6.0055</td>
			<td>&gt; 80.0835</td>
			<td>0.438</td>				
         </tr>
		 <tr>
            <td>15</td>
            <td></td>
            <td></td>
			<td>&lt;= 89.4723</td>
            <td>&lt;= -0.54225</td>
            <td></td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td></td>
			<td>&gt; 1.018</td>
			<td></td>
			<td></td>
			<td>&lt;= 6.0055</td>
			<td></td>
			<td>0.440125</td>				
         </tr>
		 <tr>
            <td>16</td>
            <td></td>
            <td></td>
			<td>&gt; 89.4723 &amp; &lt;= 89.9445</td>
            <td>&lt;= -0.54225</td>
            <td></td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&lt;= 6.0055</td>
			<td></td>
			<td>0.4425</td>				
         </tr>
		 <tr>
            <td>17</td>
            <td></td>
            <td></td>
			<td>&gt; 89.9445</td>
            <td>&lt;= -0.54225</td>
            <td></td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&lt;= 6.0055</td>
			<td></td>
			<td>0.4345</td>				
         </tr>
		 <tr>
            <td>18</td>
            <td></td>
            <td></td>
			<td></td>
            <td>&lt;= -0.54225</td>
            <td></td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>&gt; 6.0055</td>
			<td></td>
			<td>0.444714</td>				
         </tr>
		 <tr>
            <td>19</td>
            <td></td>
            <td></td>
			<td></td>
            <td>&gt; -0.54225</td>
            <td>&lt;= 5.7975</td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td>&lt;= 1.00675</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>0.432083</td>				
         </tr>
		 <tr>
            <td>20</td>
            <td></td>
            <td></td>
			<td></td>
            <td>&gt; -0.54225</td>
            <td>&lt;= 5.7975</td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td>&gt; 1.00675</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>0.426667</td>				
         </tr>
		 <tr>
            <td>21</td>
            <td></td>
            <td></td>
			<td></td>
            <td>&gt; -0.54225</td>
            <td>&gt; 5.7975</td>	
			<td>&gt; 1.01975</td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td></td>
			<td>0.439</td>				
         </tr>
     </tbody>
 	  </table>
 	</table-wrap>
	<table-wrap position="float" id="t4">
	<label>Table 4.</label>
  			<caption>
  				<title>Variable importance of parameters influencing average flexibility.</title>
  			</caption>
   <table frame="hsides" rules="groups">
      <thead>
         <tr>
            <th align="left">S. No.</th>
            <th align="left">VARIABLE</th>
            <th align="left">IMPORTANCE</th>					
         </tr>
      </thead>
      <tbody>
         <tr>
            <td>1</td>
            <td>BETA-TURN (CHOU &amp; FASMAN)</td>
            <td>100.00</td>				
         </tr>
         <tr>
            <td>2</td>
            <td>% ACCESSIBLE RESIDUES</td>
            <td>93.57</td>					
         </tr>
         <tr>
            <td>3</td>
            <td>ALPHA HELIX (CHOU &amp; FASMAN)</td>
            <td>86.18</td>				
         </tr>
         <tr>
            <td>4</td>
            <td>TRANSMEMBRANE TENDENCY</td>
            <td>78.43</td>			
         </tr>
		 <tr>
            <td>5</td>
            <td>AMINO ACID COMPOSITION</td>
            <td>71.15</td>					
         </tr>
		 <tr>
            <td>6</td>
            <td>BULKINESS</td>
            <td>55.69</td>				
         </tr>
		 <tr>
            <td>7</td>
            <td>COIL (DELEAGE &amp; ROUX)</td>
            <td>50.69</td>				
         </tr>
		 <tr>
            <td>8</td>
            <td>PARALLEL BETA-STRAND</td>
            <td>50.03</td>				
         </tr>
		 <tr>
            <td>9</td>
            <td>RECOGNITION FACTORS</td>
            <td>49.06</td>				
         </tr>
		 <tr>
            <td>10</td>
            <td>MOLECULAR WEIGHT</td>
            <td>34.84</td>				
         </tr>
		 <tr>
            <td>11</td>
            <td>POLARITY (ZIMMERMAN)</td>
            <td>33.05</td>				
         </tr>
		  <tr>
            <td>12</td>
            <td>HYDROPHOBICITY (KYTE &amp; DOOLITTLE)</td>
            <td>32.08</td>				
         </tr>
		 <tr>
            <td>13</td>
            <td>AVERAGE AREA BURIED</td>
            <td>29.71</td>				
         </tr>
		 <tr>
            <td>14</td>
            <td>REFRACTIVITY</td>
            <td>29.16</td>				
         </tr>
		 <tr>
            <td>15</td>
            <td>BETA SHEET (CHOU &amp; FASMAN)</td>
            <td>27.81</td>				
         </tr>
		 <tr>
            <td>16</td>
            <td>NUMBER OF CODONS</td>
            <td>21.31</td>				
         </tr>
		 <tr>
            <td>17</td>
            <td>% BURIED RESIDUES</td>
            <td>17.72</td>				
         </tr>
		 <tr>
            <td>18</td>
            <td>RELATIVE MUTABILITY</td>
            <td>2.37</td>				
         </tr>
		 <tr>
            <td>19</td>
            <td>TOTAL BETA STRAND</td>
            <td>1.14</td>				
         </tr>
		 <tr>
            <td>20</td>
            <td>ANTI-PARALLEL BETA STRAND</td>
            <td>0.00</td>
         </tr>
     </tbody>
 	 </table>
 	</table-wrap>
  </floats-wrap>
</article>
