Quality Weighted Mean and T-test in Microarray Analysis
Lead to Improved Accuracy in Gene Expression Measurements
and Reduced Type I and II Errors in Differential Expression Detection
Shouguo Gao1+, Shuang Jia2+, Martin Hessner2, Xujing Wang1*
1Department of Physics & the Comprehensive Diabetes Center, University
of Alabama at Birmingham, 1300 University Blvd, Birmingham, AL 35294, USA
2The Max McGee National Research Center for Juvenile Diabetes &
the Human and Molecular Genetics Center, The Medical College of Wisconsin and Children’s Hospital of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
*Corresponding author:
Dr. Xujing Wang,
Phone: 001-205-934-8186,
Fax: 001-205-934-8042,
E-mail: xujingw@uab.edu
+ The authors wish it to be known that, in their opinion, the first two
authors should be regarded as joint First Authors.
Received December 09, 2008; Accepted December 22, 2008; Published December 26, 2008
Citation: Shouguo G, Shuang J, Martin H, Xujing W (2008) Quality Weighted Mean and T-test in Microarray Analysis Lead to Improved Accuracy in Gene Expression Measurements and Reduced Type I and II Errors in Differential Expression Detection. J Comput Sci Syst Biol 1: 041-049. doi:10.4172/jcsb.1000003
Copyright: ©2008 Shouguo G, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
We previously reported Matarray, a microarray image processing and data analysis package
in which a quality score is defined for every spot to reflect the reliability and variability of the data acquired from
that spot. In this article we present a new development in Matarray, in which the quality scores are incorporated
as weights in the statistical evaluation and data mining of microarray data. With this approach, poor-quality
data are filtered automatically through the reduction in their weights, which eliminates the need to
manually flag or remove bad data points and avoids the problem of missing values. More significantly, using a
set of control clones spiked in at known input ratios ranging from 1:30 to 30:1, we find that the quality-weighted
statistics lead to more accurate gene expression measurements and more sensitive detection of their changes,
with significantly lower type II error rates. Further, we applied quality-weighted clustering to a time-course
microarray data set and found that the new algorithm improves grouping accuracy. In summary, incorporating
a quantitative quality measure of microarray data as a weight in complex data analysis leads to improved reliability
and convenience. In addition, it provides a practical way to deal with the missing-value issue in establishing
automatic statistical tests.
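
To make the idea of quality weighting concrete, the following is a minimal sketch of a quality-weighted mean and a one-sample weighted t statistic. It is not the Matarray implementation. The function names, the use of the Kish effective sample size, and the particular weighted-variance estimator are illustrative assumptions. The inputs are taken to be replicate log ratios for one gene together with per-spot quality scores in [0, 1].

```python
import math


def weighted_mean(values, weights):
    """Quality-weighted mean: spots with low quality scores contribute less."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * x for x, w in zip(values, weights)) / total


def weighted_t(values, weights, mu0=0.0):
    """One-sample weighted t statistic (assumed illustrative form).

    Returns (t, degrees_of_freedom). With equal weights this reduces to the
    ordinary one-sample t statistic with n - 1 degrees of freedom.
    """
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    mean = weighted_mean(values, weights)

    # Kish effective sample size: equals n when all weights are equal.
    n_eff = total ** 2 / sum_sq
    if n_eff <= 1:
        raise ValueError("need an effective sample size greater than 1")

    # Weighted variance with the usual bias correction for reliability weights.
    ss = sum(w * (x - mean) ** 2 for x, w in zip(values, weights))
    var = ss / (total - sum_sq / total)

    se = math.sqrt(var / n_eff)
    if se == 0:
        raise ValueError("zero variance: t statistic is undefined")
    return (mean - mu0) / se, n_eff - 1


# Hypothetical replicate log ratios; the last spot is an outlier with quality 0.
log_ratios = [1.0, 1.2, 0.8, 5.0]
quality = [1.0, 1.0, 1.0, 0.0]
print(weighted_mean(log_ratios, quality))
print(weighted_t(log_ratios, quality))
```

In this sketch a spot with a quality score of zero drops out of both the mean and the test without being flagged or removed by hand, and no missing value has to be imputed in its place, which is the behavior the abstract describes.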