Statistics and Design
Atomic level. The basic biological properties of proteins – folding, function, and the capacity to adapt – arise from a global pattern of epistatic interactions between amino acid residues. Defining this pattern is complicated by the subtlety and vast combinatorial complexity of amino acid interactions, problems which are difficult to address by either theory or experiment. We have taken a distinct strategy, using the growing databases of extant protein sequences to statistically learn the pattern of interactions between amino acids. The idea is that the biologically relevant cooperative actions of amino acids will be reflected in the correlated evolution of the corresponding sequence positions in the long-term evolutionary record of a protein family. This approach has led to the concept of protein sectors – groups of collectively evolving amino acids – that form physically contiguous networks within protein structures and that underlie basic aspects of foldability, biochemical activities, and the capacity to evolve.
A series of papers that make this argument: (1) Lockless SW, Ranganathan R. Science. 1999 286:295-299, (2) Süel GM, et al. Nature Structural and Molecular Biology. 2003 10:59, (3) Socolich M, et al. Nature. 2005 437:512, (4) Russ WP, et al. Nature. 2005 437:579, (5) Lee J, et al. Science. 2008 322:438-42, (6) Halabi N, et al. Cell. 2009 138:774-86, (7) Reynolds KA, et al. Cell. 2011 147:1564-75, (8) McLaughlin Jr RN, et al. Nature. 2012 491:138, (9) Stiffler MA, et al. Cell. 2015 160:882-92, (10) Raman AS, et al. Cell. 2016 166:468-80.
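As a toy illustration of how cooperative actions between amino acids might show up as correlated evolution of sequence positions, the sketch below scores every pair of columns in a small alignment by mutual information. This is an assumption-laden stand-in, not the method used in the papers above: the alignment `toy_msa` and the function names are hypothetical, and a real analysis would also need to account for phylogenetic relatedness, finite sampling, and positional conservation.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Frequencies are plain counts over sequences; no sequence weighting or
    pseudocounts are applied (a simplifying assumption for this sketch).
    """
    n = len(msa)
    counts_i = Counter(column(msa, i))
    counts_j = Counter(column(msa, j))
    counts_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), n_ab in counts_ij.items():
        p_ab = n_ab / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


def coupling_table(msa):
    """Score every pair of positions in the alignment."""
    length = len(msa[0])
    return {
        (i, j): mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }


# Hypothetical four-sequence alignment: positions 0 and 1 always change
# together, position 2 is invariant, position 3 varies independently.
toy_msa = ["AKLD", "AKLE", "GRLD", "GRLE"]

if __name__ == "__main__":
    for pair, score in sorted(coupling_table(toy_msa).items()):
        print(pair, round(score, 3))
```

In this toy alignment only the pair of positions that vary together receives a nonzero score; an invariant position carries no coupling signal at all, however important it may be, which is one reason conservation has to be considered alongside correlation.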
The concept of protein sectors demands further clarification and study. For example, can we use evolution-based statistical models to accurately predict protein phenotype given genotype? As a test of sufficiency, can we use the models to build synthetic proteins that fold, function, and evolve in a manner that recapitulates natural proteins in vivo? Given strong evidence that coevolution can also report direct contacts in tertiary structures, what is the relationship of collectively evolving networks of amino acids (protein sectors) to direct contacts? Given clear evidence for high-order epistasis between amino acids, how can protein function smoothly evolve through a process of stepwise variation and selection? Current projects in the lab are addressing each of these questions through a combination of experiments and theory.
Cellular/organismal level. Genetic variation operates at the molecular level, but selection operates at the level of organisms existing and competing within fluctuating environments. Thus, any attempt at understanding evolution as a design process must take into account this entire hierarchy of constraints. For example, how does stepwise variation of protein phenotypes lead to the emergence of systems-level organismal behaviors? We are addressing this problem in several powerful model systems – Drosophila vision, yeast osmo-sensing, and circadian rhythms in cyanobacteria. A dramatic example of connecting protein phenotype to organismal fitness is found in studies of the Drosophila InaD complex, a macromolecular unit that is essential for vision in flies. Within InaD, the light-dependent dynamics of a single disulfide bond allosterically switch the conformation of a protein domain, regulating the speed of vision and enabling an evolutionarily conserved escape behavior that is essential for fitness in flying insects. This work provides a beautiful model system for mechanistically connecting the full hierarchy – understanding how stepwise variation at the molecular level can lead to the emergence of organismal behaviors. The key papers in this area: (1) Kiselev A, et al. Neuron. 2000 28:139-52, (2) Natarajan M, et al. Nature Cell Biology. 2006 8:571, (3) Mishra P, et al. Cell. 2007 131:80-92, (4) Pumir A, et al. PNAS. 2008 105:10354-9.
Center for Physics of Evolution
Biochemistry & Molecular Biology
The Institute for Molecular Engineering
The University of Chicago
929 E. 57th Street Chicago, IL 60637