Applications of AI for Directed Evolution in Biotechnology

Part 01 of three

AI in Directed Evolution Case Studies

This part covers two applications of AI/ML in directed evolution: machine learning-guided directed evolution for protein engineering, and machine learning-assisted directed protein evolution with combinatorial libraries.

1. Machine Learning-Guided Directed Evolution for Protein Engineering

Overview

  • Machine learning-guided directed evolution is a new paradigm for biological design that enables optimization of complex functions. Machine learning methods use data to predict how sequence maps to function without requiring a detailed model of the underlying physics or biological pathways.

Company/Brand involved

  • N/A

Success Metrics

  • Machine learning methods accelerate directed evolution by learning from the information contained in all measured variants and using that information to select sequences that are likely to be improved.

Additional Insights

  • In 2019, Denovium and Maxygen entered into a partnership to apply AI in directed evolution for protein engineering. The partnership combines Maxygen’s expertise in molecular breeding and directed evolution with Denovium’s state-of-the-art AI engine.

2. Machine Learning-Assisted Directed Protein Evolution with Combinatorial Libraries

Overview

  • Machine learning is incorporated into the directed evolution workflow to reduce the experimental effort involved and to explore the sequence space created by mutating multiple positions simultaneously.
  • Combinatorial sequence space can be expensive to sample experimentally, but machine learning models trained on tested variants provide a fast way to evaluate that space computationally.
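The workflow described in the two points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any study cited here: the four-letter alphabet, the three mutated positions, the `toy_assay` function standing in for a laboratory measurement, and the choice of ridge regression on one-hot features are all hypothetical.

```python
import itertools
import numpy as np

ALPHABET = "AVLG"   # hypothetical reduced amino-acid alphabet
N_POSITIONS = 3     # hypothetical number of simultaneously mutated positions

def one_hot(variant):
    """Encode a variant string as a flat one-hot vector."""
    vec = np.zeros(len(variant) * len(ALPHABET))
    for pos, aa in enumerate(variant):
        vec[pos * len(ALPHABET) + ALPHABET.index(aa)] = 1.0
    return vec

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept column is penalized too."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(weights, X):
    """Predict fitness for encoded variants with the fitted weights."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ weights

def propose_variants(library, measured, n_propose=5):
    """Train on measured variants, then rank all untested ones in silico."""
    X_train = np.array([one_hot(v) for v in measured])
    y_train = np.array([measured[v] for v in measured])
    weights = fit_ridge(X_train, y_train)
    untested = [v for v in library if v not in measured]
    scores = predict(weights, np.array([one_hot(v) for v in untested]))
    order = np.argsort(scores)[::-1][:n_propose]
    return [(untested[i], float(scores[i])) for i in order]

# Synthetic stand-in for a lab assay (assumption: purely illustrative data).
rng = np.random.default_rng(0)
library = ["".join(p) for p in itertools.product(ALPHABET, repeat=N_POSITIONS)]
site_effects = rng.normal(size=(N_POSITIONS, len(ALPHABET)))

def toy_assay(variant):
    """Hypothetical fitness measurement for one variant."""
    return float(sum(site_effects[i, ALPHABET.index(a)]
                     for i, a in enumerate(variant)))

# "Screen" 20 of the 64 variants, then let the model pick the next batch.
tested = rng.choice(library, size=20, replace=False)
measured = {str(v): toy_assay(str(v)) for v in tested}
proposals = propose_variants(library, measured)
```

Here `proposals` holds the five untested variants with the highest predicted fitness. In a real campaign these would be synthesized and screened, and the new measurements added to `measured` for the next round.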

Company/Brand involved

  • N/A

Success Metrics

  • Incorporating machine learning into the directed evolution workflow reduces the expense of running experiments across combinatorial sequence space.
  • By greatly increasing throughput with in silico modeling, machine learning enhances the quality and diversity of sequence solutions for a protein engineering problem.

RESEARCH STRATEGY

Despite a comprehensive search, we were unable to find case studies illustrating the application of AI in directed evolution. The following strategies were used to look for the required data:

First, we looked for brands/companies that are combining AI/ML with directed evolution. We consulted several business intelligence websites, press releases, and scientific and medical publications such as Nature.com, NCBI, Forbes, Synbiobeta, and PRNewswire, among others. We found only the partnership between Denovium and Maxygen, which focuses on applying artificial intelligence to directed evolution for protein engineering. We looked for a case study on the progress of this partnership but found nothing relevant. We believe the partnership is fairly new and that no case study is available yet.

Next, since clinical trials and research studies are the only available data illustrating the application of AI in directed evolution, we performed an exhaustive search of them to identify any companies/brands that might be sponsoring the studies. However, we found data only on the research institutes and authors behind the studies/clinical trials.

Lastly, we looked for additional data about each of the identified authors of the clinical trials/research studies, including Zachary Wu, S. B. Jennifer Kan, Russell D. Lewis, Bruce J. Wittmann, and Frances H. Arnold. We followed this approach to find the companies/brands funding their clinical trials/research studies. We scanned interviews with these authors in trusted media such as the New York Times, Fair Observer, and others. Nevertheless, we found only a few details about the idea of incorporating AI and machine learning into directed evolution.

Part 02 of three

AI Application in Directed Evolution SWOT Analysis

Machine learning is a groundbreaking and increasingly important tool in directed evolution. Nevertheless, directed evolution development carries a significant risk.

Strengths

  • Machine learning-guided directed evolution is groundbreaking in the biological design field because it enables optimization of complex functions. Machine learning methods predict how sequence maps to function, without the need for detailed models of biological pathways or the underlying physics.
  • Directed evolution alone is time-, energy-, and material-intensive, regardless of the diversification technique used. Machine learning methods intelligently select new variants to screen, reaching higher fitnesses and speeding up the directed evolution process compared with directed evolution performed without AI.
  • The Database of Structural Propensities of Proteins (dSPP), the world's first interactive repository of dynamic and structural features of proteins, has been opened up so that researchers and protein engineers can use it with machine learning frameworks such as TensorFlow and Keras.

Weaknesses

  • Machine learning is not necessarily useful in all applications. It is particularly useful when the lack of a high-throughput screen limits or precludes directed evolution. However, if a sufficient number of sequences can be screened, or if the fitness landscape is smooth and additive, machine learning may not find better variants or significantly decrease the screening burden.
  • Machine learning adds costs, specifically in DNA sequencing and computation. Although these costs are decreasing, protein engineers are advised to keep them in mind, especially where machine learning is not clearly required.
  • Machine learning techniques also need large and diverse training data sets to be effective; otherwise, overfitting problems may emerge.
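The first weakness above, that machine learning may add little on a smooth, additive landscape, can be illustrated with a toy example. The sketch below assumes a purely additive fitness function built from random per-site effects; the alphabet, the number of positions, and the function names are hypothetical. On such a landscape, screening each position independently reaches the same fitness as exhaustively screening the whole library, so no model is needed to find the best variant.

```python
import itertools
import numpy as np

ALPHABET = "AVLG"   # hypothetical reduced amino-acid alphabet
N_POSITIONS = 3     # hypothetical number of mutated positions

rng = np.random.default_rng(1)
site_effects = rng.normal(size=(N_POSITIONS, len(ALPHABET)))

def additive_fitness(variant):
    """Toy additive landscape: fitness is a sum of independent per-site effects."""
    return float(sum(site_effects[i, ALPHABET.index(a)]
                     for i, a in enumerate(variant)))

def site_by_site_search(start):
    """Optimize one position at a time, screening every residue at that position."""
    best = list(start)
    n_screened = 0
    for pos in range(N_POSITIONS):
        candidates = []
        for aa in ALPHABET:
            trial = best.copy()
            trial[pos] = aa
            name = "".join(trial)
            candidates.append((name, additive_fitness(name)))
            n_screened += 1
        # Keep the best residue at this position before moving to the next one.
        best = list(max(candidates, key=lambda c: c[1])[0])
    return "".join(best), n_screened

# Exhaustive screen of all 4**3 = 64 variants, for comparison.
library = ["".join(p) for p in itertools.product(ALPHABET, repeat=N_POSITIONS)]
global_best = max(library, key=additive_fitness)

# Site-by-site screen: 3 positions x 4 residues = 12 measurements.
found, n_screened = site_by_site_search("AAA")
```

With 12 measurements instead of 64, `found` has the same fitness as `global_best`. Once the landscape contains interactions between positions (epistasis), this guarantee no longer holds, which is where a learned model has room to help.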

Opportunities

  • Machine learning can still be extended to new kinds of data. It can analyze unlabeled protein sequences, including those not of specific interest to the protein engineer, to discover functional or structural information that might lead to novel protein functions. Frances Arnold, a Nobel Prize-winning chemical engineer, points to this opportunity in one of her research papers.
  • According to an article by Dr. James Canton, we are entering a new era in which directed evolution will be shaped by AI, starting from 2020.

Threats

  • A terrorism risk is associated with AI application in directed evolution. In 2013, a committee headed by the Defense Intelligence Agency (DIA) was formed to warn US policymakers of the potential bioterrorism threat of developing viral and biosynthetic pathogens.
  • A recent study noted that directed evolution can be involved in the development of bioterrorism agents, citing genetically modified variants of the anthrax pathogen (Bacillus anthracis) as an example. The same study noted that analyzing the DNA sequences of all available variants with the help of a database library can also be used to counter such agents.

RESEARCH STRATEGY

To analyze the strengths, weaknesses, opportunities, and threats of AI application in directed evolution, the research team drew on a compilation of scientific research papers and biotechnology journals. To determine the threats that AI application in directed evolution faces, we included information that applies to directed evolution development in general, since AI directly contributes to directed evolution development.

Part 03 of three

AI Application in Biotechnology Trends

Trends in AI applications in biotechnology include disease identification, drug discovery and manufacturing, and imaging. Each trend is detailed below.

Disease Identification

  • Overview of The Trend: AI combined with biotechnology has been shown to have a greater chance of correctly identifying diseases in patients.
  • About The Trend: Year over year, AI and biotechnology for disease identification continue to gain prominence in AI and medical publications.
  • Drivers of The Trend: Correctly identifying and diagnosing a disease is one of the biggest challenges in medicine, which is why it has been a top priority for machine learning development within the biotechnology industry.
  • Impact of The Trend: AI-based disease identification has already changed how Sanfilippo syndrome is diagnosed, with the AI correctly identifying the disease 90% of the time and outperforming clinical experts in three experiments.

Drug Discovery and Manufacturing

  • Overview of The Trend: Artificial intelligence, machine learning, and computing technologies are making drug discovery cheaper and quicker.
  • About The Trend: Pharmaceutical companies are no longer focused only on attracting doctors' offices but are going directly to consumers, who in turn are paying more attention to pharmaceutical companies and to what they put in their bodies. This has led to an era of personalized medications, which have had a positive impact on health, with more and more companies seeking easier and quicker ways to produce them.
  • Drivers of The Trend: Companies striving to cure the most devastating diseases face debilitating obstacles such as soaring drug-discovery costs and long testing times. AI and biotechnology are being used to create a faster and more efficient way to find and develop new drugs.
  • Impact of The Trend: 70% of pharmaceutical companies agree that AI will play a big part in their work, and over 60% of the companies already invest in AI.
  • Companies in The Trend: The MIT Clinical Machine Learning Group focuses on algorithm development for AI-driven drug discovery and manufacturing.
  • Companies in The Trend: Microsoft's Project Hanover uses AI to develop a personalized drug protocol in the hope of managing acute myeloid leukemia.

Imaging

  • Overview of The Trend: AI, when used with biotechnology, has been shown to have a positive impact on machine imaging in healthcare, with such systems matching or exceeding the accuracy of human experts when analyzing images.
  • About The Trend: Interest in AI-based imaging has increased over the past 12–18 months, especially in radiology, where the industry is expected to grow to $2 billion by 2023.
  • Drivers of The Trend: In medical imaging, such as radiology, images are viewed by trained physicians in order to detect and monitor diseases. This is a time-consuming process that leaves room for error. There has been a need for a faster, more quantitative option, and AI, along with biotechnology, offers this.
  • Impact of The Trend: Productivity in the imaging field has improved, as has overall diagnostic accuracy. The use of AI together with biotechnology has enabled more personalized treatment planning in healthcare imaging and, ultimately, better clinical outcomes for patients.
  • Companies in The Trend: GE Healthcare, along with Macquarie University and Macquarie Medical Imaging, collaborated in order to diagnose and monitor brain aneurysms on medical images faster and more efficiently using AI.
  • Companies in The Trend: Biocellvia, a company in France, offers automated digital image analysis and claims that its test results bring clearer insights for patients and stakeholders and are quicker than other tests.
Sources

From Part 02
Quotes
  • "Database of Structural Propensities of Proteins (dSPP) is the world’s first interactive repository of structural and dynamic features of proteins with seamless integration for leading Machine Learning frameworks, Keras and Tensorflow."
Quotes
  • "Machine learning (ML)-guided directed evolution is a new paradigm for biological design that enables optimization of complex functions. ML methods use data to predict how sequence maps to function without requiring a detailed model of the underlying physics or biological pathways."
  • "No matter the diversification technique, directed evolution is energy-, time-, and material-intensive, and multiple generations may be required to achieve meaningful performance improvements."
  • "While directed evolution discards information from unimproved sequences, machine-learning methods can use this information to expedite evolution and expand the properties that can be optimized by intelligently selecting new variants to screen, reaching higher fitnesses than through directed evolution alone. Figure 1b illustrates this data-augmented cycle. Machine-learning methods learn functional relationships from data – the only added costs are in computation and DNA sequencing, the costs of which are decreasing rapidly"
  • "Machine learning is not necessarily useful in all applications. Because one major benefit of machine learning is in reducing the quantity of sequences to test, machine learning will be particularly useful in cases where lack of a high-throughput screen limits or precludes directed evolution. However, when a sufficient number of sequences can be screened or if the fitness landscape is smooth and additive, machine learning may not significantly decrease screening burden or find better variants."
  • "As researchers continue to collect sequence-function data in engineering experiments and to catalog the natural diversity of proteins, machine learning will be an invaluable tool to extract knowledge from protein data and engineer proteins for novel functions"
Quotes
  • "Clearly, in spite of the large numbers of possibilities opened up, the use of ML techniques still presents limitations that should be overcome. First of all, the data set used for training has to be large and diverse, otherwise overfitting problems may start to emerge."
Quotes
  • "We are just at the edge of realizing that we are headed into an entirely new era where the directed evolution of humanity will be shaped by AI. So, we best get this right."
Quotes
  • Mikhail Shapiro: "Directed evolution of viral pathogens and biosynthetic pathogens produced by gene clusters in heterologous systems are fairly recent and are interesting."
Quotes
  • "But, at the same time, Directed Evolution, in a manner similar to the way it can be involved in the development of bioterrorism agents, can be used to efficaciously counter their impact. For example, in case of genetically modified Bacillus anthracis, the analysis of the DNA sequences of all available variants of the anthrax pathogen may be carried out with the help of a database library, and particular regions in the sequence that are preserved in most of the variants may be found."