In multicellular organisms, including humans, understanding the tissue specificity of gene expression is essential for interpreting protein function, for example in tissue differentiation. We developed an approach that predicts tissue-specific (TS) expression in humans from sequence features derived from patterns overrepresented in housekeeping (HK) and TS genes. The features are built from TS domains and transcription factor binding sites (TFBSs), following the biological intuition that TFBSs regulate gene expression and that domains reflect the functional specificity of a TS gene. Patterns exclusive to a gene class were scored, and the resulting sequence characteristics were used as indices of the expressed tissues in a Random Forest algorithm. The proposed approach performed better than previous attempts and was validated using computational and experimental methods.
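To make the idea of scoring overrepresented patterns concrete, the following is a minimal sketch and not the authors' scoring scheme, which the abstract does not specify. It assumes each gene is described by a list of pattern identifiers (TFBSs or domains) and scores a pattern by the difference between the fraction of TS genes and the fraction of HK genes that contain it. The function names `exclusive_pattern_scores` and `gene_features`, the difference-of-fractions score, and all gene and pattern identifiers are hypothetical.

```python
from collections import Counter


def exclusive_pattern_scores(ts_genes, hk_genes):
    """Score patterns by how much more often they occur in TS than in HK genes.

    Both arguments map a gene name to a list of pattern identifiers.
    The difference-of-fractions score is an illustrative assumption.
    """
    # Count each pattern at most once per gene.
    ts_counts = Counter(p for pats in ts_genes.values() for p in set(pats))
    hk_counts = Counter(p for pats in hk_genes.values() for p in set(pats))

    scores = {}
    for pattern, n_ts in ts_counts.items():
        ts_frac = n_ts / len(ts_genes)
        hk_frac = hk_counts.get(pattern, 0) / len(hk_genes)
        # Near 1.0: almost exclusive to TS genes; negative: more common in HK genes.
        scores[pattern] = ts_frac - hk_frac
    return scores


def gene_features(patterns, scores, vocabulary):
    """Build a fixed-length feature vector: the pattern's score if present, else 0.0."""
    present = set(patterns)
    return [scores.get(p, 0.0) if p in present else 0.0 for p in vocabulary]


# Toy input with made-up identifiers, for illustration only.
ts_genes = {
    "ts_gene_1": ["TFBS_A", "DOM_X"],
    "ts_gene_2": ["TFBS_A", "TFBS_B"],
}
hk_genes = {
    "hk_gene_1": ["TFBS_B", "DOM_Y"],
    "hk_gene_2": ["TFBS_B"],
}

scores = exclusive_pattern_scores(ts_genes, hk_genes)
vocabulary = sorted(scores)
features = gene_features(ts_genes["ts_gene_1"], scores, vocabulary)
print(vocabulary, features)
```

Feature vectors of this kind, one per gene and labelled with the expressed tissue, could then be passed to a random forest classifier such as scikit-learn's `RandomForestClassifier`; that step is omitted here to keep the sketch dependency-free.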
Paik, H.-J., Ryu, T.-W., Heo, H.-S., Seo, S.-W., Lee, D.-H., & Hur, C.-G. (2011). Predicting tissue-specific expressions based on sequence characteristics. BMB Reports, 44(4), 250–255. doi:10.5483/bmbrep.2011.44.4.250