Open Insulin: Getting Started

I started this simple wiki for beginners and experts alike to get oriented in contributing to the Open Insulin project. For beginners it should be a starting point to learn more about the technology. It should help everyone better navigate the information we have archived on the Google Drive.

This wiki contains:

  1. How do I get involved?
  2. Research Status
  3. Technology Basics
  4. Where do I find information?
  5. FAQ

Anyone who wants to help build this can join in. Simple things you can do:

  • Find good quality online educational material that offers an introduction to synthetic biology, molecular biology, protein engineering, genetics and other relevant fields. Add it to the 'Technology Basics' section with a short description of what it is about and why people should use it.
  • Help organise information. Are you a pro at organising scientific articles and data? Great, we can use help with that! Get in touch with @WinniePoncelet.
  • Add Frequently Asked Questions. Have you been asked about the project? Put the question and your answer in the FAQ!

1. How do I get involved?

Whether you are a beginner or a (bio)scientist, everyone is welcome to join! We can use your brains, energy and creativity regardless of your prior knowledge.

As a beginner or non-bioscientist, you may be interested in learning more about the technology. It is super interesting stuff and it will also help you to contribute to the project at a technological level. Take a look at the technology basics below.

As a bioscientist trained in the field, you can jump right in. Take a look below at the research status and introduce yourself on the forum or at one of the live meetings (see introduction post for the latest info).

You can also help us by donating materials or money, and we’d be forever grateful! Get in touch to discuss how it could work.

2. Research Status

Current step: awaiting the arrival of the plasmid samples from the team in Oakland.

Next step: use the plasmids to transform E. coli, in order to replicate the work of Oakland and set a reference point for further optimization.

Updates from the Oakland team can be found on their website.

3. Technology Basics

MOOC on the production of medicines, with insulin as a case study. Not currently online, but @arnepauwels has notes.

This MOOC on synthetic biology will require some additional study (e.g. looking up terms on Wikipedia) but guides you through the basics.

MIT offers a very extensive range of biology courses (some of which are written in Comic Sans). One of them is this solid introduction to biology on edX. Choose recent courses: the field changes rapidly, so knowledge goes out of date fast.

Can anyone recommend other specific courses?

4. Where do I find information?

On the Google Drive, and we’re working on organizing it! (We’re limiting access to team members; get in touch if you’re interested.)

5. FAQ

  • You're working open source, where's the data?

It is not currently openly available (see the discussion below).

  • Are you going to inject hacked drugs into people?

No. The goal (for now) is a production protocol for the insulin molecule.

open data

There is certainly still a debate about what really constitutes ‘open’ research, @WinniePoncelet. Answering the FAQ question ‘where is the data?’ with ‘It will be released when the time is there. We are still in the research phase.’ could be looked at askance by many… Most Hackuarium projects do not intensively document all experiments in progress on wiki pages, but there are definitely some that try to get everything down as they go. (Not worrying about being polished is key for that: the assumption is that a ‘work in progress’ notice means someone may eventually help with the process…) I am still a bit agnostic, even though I always encourage documenting as soon as possible, and I wonder how you personally feel about this.

Starting a European Open Insulin project is awesome, btw!

Re: open data

That’s a valid point, @Rachel. Although I am a big fan of documentation, I have personally not given it too much thought for OI. Partly because the group in Oakland took the approach of not having an open and rolling documentation repository. Partly because it seemed like too big an effort for now, when we are already struggling with managing all the information we do and don’t have.

I am also reminded of an interesting aspect of openness. A researcher I know has a huge dataset on barefoot walking by indigenous communities. The Nikes of this world would pay big cash to have it. She believes in open source; however, opening up the dataset would mean only the Nikes could really exploit the data, thanks to their size. Smaller companies can’t do much with the data (e.g. they don’t have the $10,000 3D printer for it) and the indigenous communities can’t either. The situation is skewed: a huge relative difference in resources, a huge financial incentive and no community of peers that is in a position to contribute to the commons. In effect, opening up the data would be closer to a transaction (a gift, even).

The same factors are at play for Open Insulin. If at some point an experiment proves promising, a biotech company could take it from there and easily be faster. Peers (like the Belgium group) who want to contribute still can, by getting more closely involved. At the end everything can be released, and hopefully this will shift some of the factors at play. Paving the way for a new way of protein engineering is also a goal of Open Insulin.

Not sure if there are good answers; I’d love to discuss this more.

Messy topic, but some conventional wisdom is emerging

As @WinniePoncelet knows, I am a bit of an open data fanboy.  Four thoughts here:

  1. "Documentation is expensive and nobody is paying us for it" is a completely acceptable argument. No one can fault an open source project for bad or missing documentation, only praise it when it does it right.
  2. There is no openness without letting go of control. Keep your eyes on the ball: if your goal is cheap insulin, or shoes that reap the health benefits of barefoot walking, you win even if it is Novartis or Nike delivering it. In open source, most people consider that open source is winning exactly because business uses it (and, now, contributes to maintaining it). Linux is everywhere except your laptop; if you are hellbent on open source, or cost-sensitive, you can also get a Linux laptop, with Linux being free as in both speech and beer.
  3. Open cannot be closed again. If you don't like what Novartis did with your data and you think you can do better/cheaper, your data are still there and still open. You can still go ahead and undercut them.
  4. That said, you can take a "free" as opposed to "open" approach, and use nc or nc-sa licenses. This means people can reuse your data, but are prohibited from doing anything commercial with it; additionally, with sa (share alike) any new entity that incorporates your data inherits its license, so all "children" datasets stay noncommercial forever. The Creative Commons website has a handy wizard for choosing your preferred license. These days, noncommercial licenses are not considered open licenses according to the Open Definition. With Creative Commons licenses, additional usage rights can always be negotiated with the rights holder. So, you can put out data with a nc license; if Novartis thinks the data are so valuable that they want to use them to develop a drug, they have to come to you and ask for a different license. At that point you decide what to do. Of course, it is difficult to monitor that they do not just siphon the data up and do whatever they want anyway, but hopefully the regulation is tight enough that they will consider it not worth the risk of being caught out.

I’d love to help with your data strategy. OpenCare’s works like this.

Best strategies

On your point 2: the goal is an open source production protocol, mainly for economic reasons, though those reasons concern how one will use the protocol. We can all agree on the protocol as a goal; what happens with it afterwards is up to the user. At this stage, it is desirable that the way we get there does not jeopardize the use some of the team members have in mind (or at least does so minimally).

Cheap insulin is one of the reasons, but a precedent for more of this kind of research is also a reason, as this could decrease the cost of many more medicines. We can only guess, but I doubt the lesson Novartis would draw is “we need to do more open research” after they pick up half-finished open knowledge and successfully turn it into a product or profit. From releasing a somewhat finished protocol into the world, they might still not take that message away, but more citizen researchers might be inclined to, and keep the effort going. It becomes an optimization problem to balance short term and long term economic benefits, taking the strategy to get there into account.

The comparison of the dynamics in software is fair, but that is not to say that there is not a more favorable outcome. The stakes are also higher when considering medicine. The difference between having or not having a piece of software is not death.

On 3: agreed to an extent. There is a lot of wickedness in the biotech industry. Research is very slow and expensive, almost as extreme as it gets. The dynamics do change at these extremes, I think, and that asks for a different strategy. To break through, accumulating small wins in an iterative way might not be ideal. Reaching a particular benchmark that actually matters big time (e.g. the first open source production protocol for insulin) perhaps would be.

Volunteer time can drastically undercut the cost, and make this type of research more efficient than what a Novartis could hope for. We are already seeing signs of that in the DIYbio movement. But we are still learning how to leverage the advantage. Hardware, software and wetware are definitely enabling us at this point, but are also still developing. Hopefully these things are bringing us to a tipping point more quickly than they are the large companies.

I think this is still a learning phase, where there are vulnerabilities on several fronts. Taking the least risky route wherever we can seems reasonable.

risk vs honor

I think the DIYbio code of ethics has transparency as its first principle for a very good reason: it gives a moral high ground that I think can override such questions of risk, @WinniePoncelet, especially when you consider those options for CC, so open stays forever open and non-profit (thanks for pointing this out, @Alberto!).

To me, the worst is when an idea is squashed because others decide to protect it for their own profit - but if it is all in the open to start, we should all benefit.

Idealistic?  perhaps…  :)

The best default

Well, idealism is the best default I think :)

Thanks for sharing your views on this. Not opening up data right away is a compromise for sure, one that I do not particularly like, but I can see the reasons for it. Maybe these are wrong; I have no answers.

We have not really produced data ourselves yet. I am looking forward to when we do; at that point, the team should make the call together. Then it is a conscious and collective decision.

Notes on the MIT course by Prof. Christopher Love (Koch Institute at MIT)

Focus on protein therapeutics manufacturing using recombinant DNA technology.

Week 1

  1. Define what a biologic drug is
  2. Describe why biologic drugs are important in the treatment of disease
  3. Summarize how cells were first used in manufacturing
  4. Explain why many modern biologic drugs are manufactured using cell culture

What is a biologic drug?

  • biological drug (= protein therapeutic)
  • vaccines
  • blood components
  • hormones 

Multi billion dollar industry

First test with penicillin:

From low-yield to high-yield penicillin (over 1000-fold increase)

=> penicillin is a protective measure of the mold => would stressing the mold lead to higher production of this defence?

Diphtheria: first antiserum

He found that by heating and inactivating the bacterial toxin that caused diphtheria, and injecting it into guinea pigs, the animals became immune to lethal doses of the toxin.

Bio-manufacturing to Deliver High-quality Biologics

Week 2

  • Describe how a small molecule drug and a protein therapeutic differ.
  • Name the 20 amino acids used to build proteins.
  • Identify the four types of protein structure.
  • Identify and categorize post-translational modifications.
  • Summarize how small changes in the structure of insulin lead to large changes in function.
  • Describe what an antibody is and how it can be used to treat disease.

Conventional drugs vs recombinant biologics

size difference analogy marble and football

In red the recombinant biologics in top 10 list

Small molecules are easier to produce by chemical synthesis

They mimic a compound or block a pathway; for example, ibuprofen blocks inflammatory reactions.

Because the molecule is small, it can bind at a lot of other sites, creating side effects and unintended reactions. This is the reason for large-scale failure in first-round testing.

Biologics are practically impossible to produce by chemical synthesis; they are grown. Fewer side effects thanks to their hyper-specific shape.

in order to work well => drug needs to attach to:

they need to be exact and precise to activate a function

The larger the molecule, the more easily something can go wrong (such as oxidation or substitution)

possible problem: aggregation

=> clustering of molecules, can lead to allergic reactions

Better filtration techniques are a solution for this problem

Small molecules are often given in pill form; the acidity changes the pill into the active ingredient. If biologics were taken the same way, they would be destroyed, so they are given by intravenous, intramuscular or subcutaneous injection, or by inhalation.

Enzymatic: example insulin

special targeting: inhibit normal biological function

Protein vaccines: hepatitis B, influenza

2.2 introduction to amino acids

3D form of a protein = what it does and how

Primary structure: sequence of amino acids

Secondary: α-helix and β-sheets

The alpha helix supports itself by hydrogen bonding between the carboxyl and the amino groups.

Tertiary structure: interactions among all atoms in 3D space (covalent S–S bonding between cysteines in insulin)

Tertiary structure also covers how everything influences the relative positions within the molecule, including ionic bonds (charges), hydrogen bonds and hydrophobic interactions
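
To make the link between primary structure and the cysteine bonds a bit more concrete, here is a minimal Python sketch. It is not from the course notes: the two sequences are the standard published human insulin A and B chains in one-letter code, the bond list is the well-known set of three disulfide bridges, and the names (`A_CHAIN`, `cysteine_positions`, `bonds_point_at_cysteines`) are just made up for this illustration.

```python
# Primary structure: human insulin chains in one-letter amino acid code.
# (Added for illustration; not part of the original course notes.)
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"            # 21 residues
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"   # 30 residues
CHAINS = {"A": A_CHAIN, "B": B_CHAIN}

# Disulfide bridges as pairs of (chain, 1-based residue position).
DISULFIDE_BONDS = [
    (("A", 6), ("A", 11)),   # within the A chain
    (("A", 7), ("B", 7)),    # links A and B chains
    (("A", 20), ("B", 19)),  # links A and B chains
]


def cysteine_positions(sequence):
    """Return the 1-based positions of every cysteine (C) in a sequence."""
    return [i for i, residue in enumerate(sequence, start=1) if residue == "C"]


def bonds_point_at_cysteines(chains, bonds):
    """Check that both ends of every listed bond sit on a cysteine."""
    return all(
        chains[chain][position - 1] == "C"
        for bond in bonds
        for chain, position in bond
    )


if __name__ == "__main__":
    for name, sequence in CHAINS.items():
        print(name, len(sequence), cysteine_positions(sequence))
    print("bonds consistent:", bonds_point_at_cysteines(CHAINS, DISULFIDE_BONDS))
```

The sequence alone (primary structure) already tells you where the six cysteines are; which cysteine pairs up with which is a property of the folded, 3D molecule.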

Quaternary structure

Insulin is stored as a hexamer but active as a monomer

Insulin has a tendency to clump together when it is released from the hexamer; insulin is engineered to prevent this

2.4 Post Translational Modifications (PTM)

PTMs can be a cause of concern: toxic variants

Thanks, added

Fantastic, thanks! I added the notes in pdf format to the wiki. You can also add things/make edits to the wiki yourself any time.