Cellular function requires biomolecules to undergo dynamic transitions that include folding, conformational rearrangements, and large-scale assembly. The result is a highly interdependent network of processes that is maintained by a balance of thermodynamic and kinetic factors. In molecular machines, each constituent biopolymer (i.e., a chain of residues) first folds to a low-energy configuration or ensemble. These ordered polymers can then assemble into sophisticated architectures, which undergo conformational transitions during function. In contrast to the dynamics of macroscopic machines, molecular-level processes are stochastic, and the molecular interactions that ensure structural integrity are weak (i.e., on the scale of energetic fluctuations from solvent). In this dynamic environment, biomolecules constantly fluctuate (1), and the extent of disorder is heterogeneous between residues. Inspired by this, in 2003, Miyashita et al. postulated that biomolecules may exploit disorder to accelerate functional kinetics (2). In their theoretical investigation of protein function, the authors found that large levels of strain energy accumulate in isolated residues. The predicted level of strain exceeded the stability of most proteins under cellular conditions, suggesting that these highly strained regions may locally unfold, or “crack.” By cracking, the molecule may gain configurational entropy and thereby reduce the strain-induced barrier (Fig. 1).

Subsequently, many theoretical and computational investigations have found evidence of cracking during function. These studies have primarily used simplified models (3), with which millisecond-scale dynamics are computationally accessible. In contrast, simulations with explicit-solvent models are typically limited to nanoseconds, or occasionally microseconds (4, 5). Because cracking and large-scale rearrangements occur on relatively long timescales (microseconds to milliseconds), evidence of cracking with explicit-solvent models has been sparse. In PNAS, Shan et al. (6) report the most definitive evidence to date of cracking from explicit-solvent simulations. Using a specialized computer, they performed multiple simulations of EGFR kinase in solvent for tens of microseconds and found cracking to spontaneously occur. Although open questions remain about the precise details of cracking properties, Shan et al.’s study highlights how convergent theoretical descriptions of biological dynamics are emerging as explicit-solvent simulations are pushed to longer timescales.
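
As a rough illustration of the cracking argument (strain that exceeds local stability can be relieved by local unfolding, with a gain in configurational entropy), the Python sketch below treats the strained region at the top of the barrier as a two-state system: it either stays folded and pays the full strain energy, or it cracks and pays a local unfolding free energy. This is a toy model, not the calculation of Miyashita et al. (2); the function names and every numerical value are assumptions chosen only for illustration.

```python
import math

KT = 0.592  # k_B * T in kcal/mol near 298 K

def crack_free_energy(unfolding_energy, t_delta_s):
    """Free energy cost of locally unfolding: energy lost minus the entropy gained (T * dS)."""
    return unfolding_energy - t_delta_s

def effective_barrier(strain_energy, crack_cost, kt=KT):
    """Free energy of a barrier-top state that may be intact (strained) or cracked.

    Both options contribute a Boltzmann weight, so the result is never
    higher than the cheaper of the two.
    """
    z = math.exp(-strain_energy / kt) + math.exp(-crack_cost / kt)
    return -kt * math.log(z)

if __name__ == "__main__":
    # Hypothetical values (kcal/mol): strain at the barrier, and a local
    # unfolding energy partly offset by configurational entropy.
    strain = 10.0
    cost = crack_free_energy(unfolding_energy=9.0, t_delta_s=5.0)
    print("barrier without cracking:", strain)
    print("barrier with cracking allowed:", round(effective_barrier(strain, cost), 3))
```

In this picture cracking matters only when the accumulated strain exceeds the local unfolding free energy; otherwise the intact state dominates and the barrier is essentially unchanged.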

Grounded in the statistical physics of glasses, energy landscape theory (7, 8) provides a framework for understanding the relationship between protein disorder and energetics, at global (folding) and local (cracking) scales. A key finding has been that proteins do not fold along precisely defined pathways; rather, there is a multitude of routes by which proteins navigate between extended (unfolded) and compact (folded/native) ensembles. The theory further predicts that folding energetics are dominated by the interactions formed in the folded configuration, which has allowed for extensive application of simplified “structure-based” models for folding (3). Although folding of individual domains can often be described as a pseudo-first-order phase transition (9), the process is not perfectly cooperative. Many residues cooperatively organize, although some atoms remain free to undergo separate order-disorder events. Simplified models demonstrated this point (10), which was later corroborated by long-timescale simulations from Shaw et al. (11). This intuitive finding is one example of how longer-timescale explicit-solvent simulations are reinforcing predictions from simple models, in this case suggesting a propensity for localized disorder that is separable from full folding transitions.

Simplified models built on energy landscape principles have repeatedly implicated cracking during function. Structure-based models approximate the landscape by a few dominant basins of attraction, each corresponding to an experimentally determined configuration. In doing so, the models use knowledge of these low-energy configurations to provide a first-pass description of the potential energy surface. These models carry the added bonus of being computationally inexpensive, enabling long-timescale simulations to be obtained, even for large assemblies (3, 12, 13). One may then identify statistically significant correlations between cracking and free-energy barriers, as demonstrated for the proteins calmodulin (14), kinesin (15), and adenylate kinase (16), among others. Despite mounting evidence for cracking, it has remained unknown whether cracking would also be predicted by long-timescale explicit-solvent simulations.
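
To make the idea of a structure-based model concrete, here is a minimal Python sketch, assuming one bead per residue and an attractive Gaussian well for every contact found in a reference structure. The two-basin combination shown (taking the lower of two single-basin energies) is only one simple choice made for illustration; published multi-basin models combine the basins more carefully, and none of the function names, cutoffs, or parameters below come from the cited studies.

```python
import math
from itertools import combinations

def native_contacts(coords, cutoff=8.0, min_seq_sep=3):
    """Map residue pairs (i, j) to their distance in a reference structure.

    Only pairs closer than `cutoff` and at least `min_seq_sep` apart in
    sequence are kept (both thresholds are illustrative assumptions).
    """
    contacts = {}
    for i, j in combinations(range(len(coords)), 2):
        if j - i >= min_seq_sep:
            d = math.dist(coords[i], coords[j])
            if d < cutoff:
                contacts[(i, j)] = d
    return contacts

def contact_energy(coords, contacts, epsilon=1.0, sigma=0.5):
    """Sum of Gaussian wells, each centered on a reference contact distance."""
    total = 0.0
    for (i, j), r0 in contacts.items():
        r = math.dist(coords[i], coords[j])
        total -= epsilon * math.exp(-((r - r0) ** 2) / (2.0 * sigma ** 2))
    return total

def dual_basin_energy(coords, contacts_a, contacts_b, epsilon=1.0, sigma=0.5):
    """Two-basin landscape: the lower of the two single-basin energies."""
    return min(contact_energy(coords, contacts_a, epsilon, sigma),
               contact_energy(coords, contacts_b, epsilon, sigma))

if __name__ == "__main__":
    # Hypothetical six-residue "compact" and "extended" reference structures.
    compact = [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0),
               (0, 3.8, 0), (0, 3.8, 3.8), (0, 0, 3.8)]
    extended = [(3.8 * i, 0.0, 0.0) for i in range(6)]
    ca, cb = native_contacts(compact), native_contacts(extended)
    print(len(ca), "contacts in compact,", len(cb), "in extended")
    print("energy of compact structure:", dual_basin_energy(compact, ca, cb))
```

Because the energy function is built entirely from the reference contact maps, evaluating it is cheap, which is the property that makes long-timescale sampling feasible with this class of model.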

Short simulations (nanoseconds) with explicit-solvent models are frequently used to argue that protein functional rearrangements are governed solely by loose “hinge” regions, and not cracking (17). Explicit-solvent simulations may be viewed as the philosophical opposite of energy landscape theory-inspired models. That is, conventional explicit-solvent simulations use a transferable set of parameters, where only the sequence composition of the protein and the initial configuration are provided as input. The global features of the landscape are not assumed a priori, and occasionally the native configuration is not the global energetic minimum (4). The assumption when using these models is that the parameters are accurately calibrated, such that a simulation may be considered a “computational microscope” (18). In principle, it should be possible to construct a general model that includes all relevant energetic interactions. However, including more detail comes with a price, limiting many simulations to tens of nanoseconds (17), or a few microseconds (4). Although computational capacity continues to increase (19), reversible order-disorder transitions and large-scale conformational rearrangements occur on multimicrosecond (or greater) timescales. Sampling limitations are exacerbated by the fact that functional rearrangements and cracking are stochastic, making their relationship statistical. Thus, the computational demand to quantitatively study cracking is orders of magnitude beyond most available resources.

Unsatisfied with the limited timescales of explicit-solvent simulations, the Shaw group developed a specialized massively parallel machine called Anton. They can now produce over ten microseconds of simulated time per day (20). This ∼100-fold increase in computing speed was largely enabled by designing a processor tailored to molecular dynamics calculations. Rather than use general-purpose compute cores, the team designed unique hardware that optimizes per-core performance, data management, and load balancing of molecular dynamics simulations (20). Standard CPUs are versatile, but they perform only several operations per cycle. The Anton chip forfeits flexibility by hardwiring the arithmetic pipelines, which enables over 1,000 operations per cycle. Shaw et al. demonstrated the incredible power of this approach by performing the first millisecond explicit-solvent simulation (11), and by folding many small proteins in solvent (21). One remarkable aspect of their simulations has been that the dynamics of small protein folding are qualitatively and quantitatively similar between explicit-solvent models and structure-based approaches. Specifically, the same coordinates capture the underlying barriers, and both classes of models yield consistent descriptions of folding thermodynamics.
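
The practical meaning of these throughput figures can be checked with a few lines of arithmetic. The Python sketch below takes the quoted rate of ten microseconds per day as a lower bound and, as an assumption, reads the ∼100-fold speedup as being relative to a conventional machine running at one hundredth of that rate; the function name is hypothetical.

```python
def days_to_simulate(target_us, rate_us_per_day):
    """Wall-clock days needed to reach `target_us` microseconds of simulated time."""
    if rate_us_per_day <= 0:
        raise ValueError("rate must be positive")
    return target_us / rate_us_per_day

if __name__ == "__main__":
    anton_rate = 10.0                       # microseconds per day ("over ten" in the text)
    conventional_rate = anton_rate / 100.0  # assumed from the ~100-fold figure
    millisecond = 1000.0                    # one millisecond, in microseconds

    print("Anton, days for 1 ms:", days_to_simulate(millisecond, anton_rate))
    years = days_to_simulate(millisecond, conventional_rate) / 365.0
    print("Conventional, years for 1 ms:", round(years, 1))
```

Under these assumptions a millisecond trajectory moves from a multi-decade effort to a matter of months, which is why events on the microsecond-to-millisecond scale only recently became accessible to explicit-solvent models.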

The Shaw team has now taken aim at protein function, and in the PNAS paper by Shan et al. they report explicit-solvent simulations of EGFR kinase, in which spontaneous large-scale conformational rearrangements occurred (6). With simulations that extend to tens of microseconds, they found that the conformational process is not fully accounted for by a hinge-like description. Rather, the molecule adopts intermediate configurations that appear to be stabilized by disorder in isolated regions (Fig. 1), fully consistent with the cracking paradigm. In the context of nearly a decade of debate, this study stands out as the clearest identification of cracking in explicit-solvent simulations.

The Shan et al. (6) study signifies a turning point in the discussion of cracking and the relationship between explicit-solvent and structure-based models. It is now clear that cracking is predicted by both classes of models, although we must elucidate its extent in different proteins and its precise impact on free-energy barriers. Additionally, the structural character of cracking needs to be further clarified. For example, are there different modes of cracking (e.g., backbone vs. side-chain reorganization)? If so, is there a correlation between a protein’s biological function and the type of cracking used? As we forge forward with these questions, complementary perspectives provided by an array of models will help solidify our understanding of the mechanistic and energetic factors that govern biological dynamics.

Proceedings of the National Academy of Sciences of the United States of America
April 22, 2013, doi: 10.1073/pnas.1305236110

 

 
