Mature insulin consists of two chains (A and B) that are linked through disulfide bonds. In humans it is obtained through post-translational modification of a single peptide chain. Such machinery is not available in an E. coli host. Noel is studying how to design the construct for insulin such that the two parts can reliably find each other.
His design is based on the strong affinity between barnase and barstar. Barnase and barstar are each connected to one of the insulin chains using a linker. When the barnase-barstar complex is formed, insulin should form by proximity. This also circumvents the risk of two chains of the same kind bonding (A-A or B-B).
Such insulin-barnase-barstar complexes should aggregate into a large complex of many such molecules, from which pure, functional insulin can easily be cleaved. Admittedly, I did not follow this 100%.
The challenge here is to design the linker such that the complex can be formed. To this end, Noel writes his own software to do mechanistic molecular modelling. He computes all the molecular (polar, van der Waals, …) interactions between amino acids to finally obtain the stability of a complex. For a given linker, this is represented by a set of (sparse) matrices quantifying the strength of the interactions between the amino acids.
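To make the representation concrete, here is a minimal sketch of how one such set of sparse matrices could be stored and collapsed into a single number. Everything in it is an assumption for illustration: the names `SparseMatrix` and `total_interaction` are hypothetical, the energies are made up, and summing all terms is only a toy proxy for stability, not how Noel's software computes it.

```python
from typing import Dict, Tuple

# Hypothetical sparse interaction matrix: only residue pairs that actually
# interact are stored, keyed by (residue index i, residue index j) with i < j.
SparseMatrix = Dict[Tuple[int, int], float]


def total_interaction(matrices: Dict[str, SparseMatrix]) -> float:
    """Sum all pairwise terms over all interaction types (toy stability proxy)."""
    return sum(value for m in matrices.values() for value in m.values())


# Made-up numbers in arbitrary units (negative = favourable interaction).
example = {
    "polar": {(0, 5): -1.5, (2, 7): -0.5},
    "van_der_waals": {(0, 1): -0.25, (3, 4): -0.75},
}
score = total_interaction(example)
```

Storing only the non-zero pairs keeps the memory footprint proportional to the number of contacts rather than to the square of the chain length.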
This is the point where machine learning comes into play:
- Using a set of example matrices, predict the matrices for new AA chains
- Perhaps use this information to directly build a model that predicts the stability of a given linker? The mechanistic information should somehow be used to improve the model (e.g. Vapnik’s learning using privileged information)
- Given a model, optimize the linker (simulated annealing, genetic algorithms, …)
- Bayesian optimization to search for promising linkers under model uncertainty?
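As an illustration of the "given a model, optimize the linker" idea, below is a minimal simulated annealing sketch over linker sequences. It rests on loud assumptions: `toy_stability` is a hypothetical stand-in for whatever learned stability model would be available (here it simply rewards glycine and serine), and the linker length, step count and temperature schedule are arbitrary choices, not values from Noel's work.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def toy_stability(linker: str) -> float:
    """Hypothetical stand-in for a learned model: fraction of G/S residues."""
    return sum(1.0 for aa in linker if aa in "GS") / len(linker)


def anneal(score, length=10, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximize `score` over sequences of fixed length by simulated annealing."""
    rng = random.Random(seed)
    current = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
    current_score = score(current)
    best, best_score = current, current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Propose a single-residue substitution at a random position.
        pos = rng.randrange(length)
        candidate = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


best, best_score = anneal(toy_stability)
```

Any function mapping a sequence to a number can be passed as `score`, so the toy scorer could later be swapped for a model trained on the interaction matrices.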