Ye Tian's February 2011 Talk
* Time and Date: 4:10 pm Thursday, February 10
* Location: 5W Neill Hall, WSU Mathematics Department
* Title: A Linear Optimization Based Predictor for Solubility Mutagenesis
* Abstract: Mutagenesis is the process of changing one or more amino acids in a protein to others. It is used as a standard tool to engineer proteins with desirable properties: increased stability, decreased solubility, and so on. Computational tools are invaluable for narrowing down the number of potential mutations to make in the lab, so as to identify one that works. We have developed a classification model to predict whether a protein's solubility will be increased or decreased due to mutations. We build models using concepts from computational geometry to capture relationships between the protein structure and its sequence. Subsequently, we define a weighted log-likelihood scoring function for making predictions. Weights for this predictor are obtained through linear optimization (LO). Model robustness and prediction accuracy are demonstrated using various cross-validation techniques. We also compare our LO model to predictors built with other standard machine learning methods such as Support Vector Machines (SVM) and the Least Absolute Shrinkage and Selection Operator (LASSO). On the dataset of mutations we have assembled, the LO model performs the best overall.
* Speaker Bio: Ye Tian is a fifth-year PhD student in Mathematics at Washington State University, working with Bala Krishnamoorthy. She holds an MS degree in Statistics from WSU, supervised by Jave Pascual, and a BS degree in Applied Mathematics from Sichuan University, China. Her research areas are bioinformatics, optimization, and statistical inference. She also has industrial internship experience with bioinformatics and biostatistics research firms.
* Background Information for Talk: Preprint, datasets, and code are available from this link.
