Bacteriophage lysins are compelling antimicrobial proteins whose biotechnological utility and evolvability would be aided by elevated stability. Lysin catalytic domains, which evolved as modular entities distinct from cell wall binding domains, can be classified into one of several families with highly conserved structure and function, many of which contain thousands of annotated homologous sequences. Motivated by the quality of these evolutionary data, generative protein models incorporating coevolutionary information were evaluated for their ability to predict the stability of variants in a collection of 9,749 multimutants across 10 libraries diversified at different regions of a putative lysin from a prophage region of a Clostridium perfringens genome. Protein stability was assessed via a yeast surface display assay with accompanying high-throughput sequencing. Statistical fitness of mutant sequences, derived from second-order Potts models inferred with different levels of sequence homolog information, was predictive of experimental stability with areas under the curve (AUCs) ranging from 0.78 to 0.85. To extract an experimentally derived model of stability, a logistic model with site-wise score contributions was regressed on the collection of multimutants. This achieved a cross-validated classification performance of 0.95. Using this experimentally derived model, 5 designs incorporating 5 or 6 mutations from multiple libraries were constructed. All designs retained enzymatic activity, with 4 of 5 increasing the melting temperature and with the highest-performing design achieving an improvement of +4°C.
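As a rough illustration of how a second-order Potts model can assign a statistical score to a multimutant, the sketch below computes the commonly used statistical energy E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j) and its difference from the wild type. Everything here is an assumption made for illustration: the function names, the integer encoding of residues, the sign convention (lower energy meaning a more favorable sequence), and the randomly generated parameters, which stand in for fields and couplings that would in practice be inferred from a homolog alignment. It is not the inference or scoring code used in this work.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Statistical energy of an integer-encoded sequence (hypothetical helper).

    h has shape (L, q): site-wise fields.
    J has shape (L, L, q, q): pairwise couplings; only pairs with i < j are used.
    """
    L = len(seq)
    energy = -sum(h[i, seq[i]] for i in range(L))
    for i in range(L):
        for j in range(i + 1, L):
            energy -= J[i, j, seq[i], seq[j]]
    return float(energy)

def delta_energy(mutant, wild_type, h, J):
    """Energy of the mutant relative to the wild type (negative = more favorable)."""
    return potts_energy(mutant, h, J) - potts_energy(wild_type, h, J)

# Toy parameters: random values, NOT fitted to any alignment.
rng = np.random.default_rng(0)
L, q = 6, 21  # 6 positions, 20 amino acids plus a gap state (assumed encoding)
h = rng.normal(size=(L, q))
J = rng.normal(scale=0.1, size=(L, L, q, q))

wild_type = rng.integers(0, q, size=L)
mutant = wild_type.copy()
mutant[2] = (mutant[2] + 1) % q  # substitute position 2
mutant[4] = (mutant[4] + 1) % q  # substitute position 4 (a double mutant)

print(delta_energy(mutant, wild_type, h, J))
```

Under these assumptions, ranking variants by such an energy difference and comparing the ranking with a binary stable/unstable label is one way an AUC like those reported above could be computed.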
Bibliographical note
Funding Information:
This work was supported by a grant from the National Institutes of Health (R01 GM121777). We have submitted a patent application pertaining to some of the engineered lysin molecules in this work.
- Antimicrobial protein
- Clostridium perfringens
- Coevolutionary model