Unfolding the Protein Folding Solution

Amidst the tumult of 2020, in which the world often seemed to stand still (and worse, at times to regress), the march of scientific progress pushed ever onward. First and foremost, we saw a new standard set for the development, testing, and validation of vaccines, culminating in the approval of multiple COVID-19 vaccines. At the Large Hadron Collider (LHC), we saw the first evidence of a new type of particle. And SpaceX’s Crew Dragon became the first private vehicle to carry astronauts to the International Space Station (ISS).

In November, we witnessed another remarkable scientific feat, as Google DeepMind’s AlphaFold2 “solved” the protein folding problem – an achievement with far-reaching implications for biology and medicine. This feat absolutely deserves to be celebrated. But it is also important to recognize the potential limitations of their approach, and to retain a healthy dose of skepticism. 

In this blog post, I plan to describe the protein folding “problem”, and then to explain why I believe it is best to exercise caution, rather than to immediately regard AlphaFold’s performance as a “solution” to this problem.

First, however, I want to acknowledge that I am a physicist, not a biologist. Make of that what you will. I also want to disclose that last year I interned at X, formerly known as Google X and nicknamed the Moonshot Factory. The opinions I espouse here are entirely my own.

What is a Protein?

To most people, proteins mainly connote biology. Many – like myself – remember learning about proteins as biological molecules, or biomolecules, which have distinct biological functions. In reality, proteins sit at a unique intersection of biology, chemistry, and physics. This makes them fascinating objects of study, but also makes them particularly unyielding to established scientific methods. 

At a basic level, proteins are chains made from amino acids. The amino acids serve as the building blocks for proteins, in much the same way as letters in an alphabet can be strung together to form words. Just as the order of the letters in a word affects the meaning of the sequence, so too does the order of amino acids in the chain affect the biology of the resulting protein.

Unlike words however, the protein chains exist natively in the physical world. When we write a word on the page, the space between letters is fixed. The previous letters in the word don’t dictate how much space we should leave before the next letter. 

For proteins, space matters. Chemically, the amino acids are strung together via covalent bonds, where electron pairs are shared between both parties. Going a level deeper, the amino acids themselves are organic compounds made up of atoms, and are as a result substantially influenced by chemical and physical forces. These forces constantly push and pull the constituents in different directions, driving a series of twists and turns in three-dimensional space as the protein moves toward a stable configuration, or conformation. This intricate dance is the process of protein folding. Because the protein gradually moves from less stable, higher-energy configurations to more stable, lower-energy states, the folding is called spontaneous.

And here’s the thing: the protein’s function depends directly on this conformation. In other words, identifying a protein’s stable shape is crucial to understanding the roles it plays in biology.

The Protein Folding Problem

One of the most remarkable things about protein folding is that for a given chain, many distinct paths – each with its own twists and turns – can lead to the same final shape. The intermediate configurations can at times seem completely random, and yet the result is somehow predestined. Observations of this kind led Nobel laureate Christian Anfinsen to postulate that a protein’s structure is entirely determined by its sequence of amino acids. This hypothesis, known as Anfinsen’s dogma, essentially defines the protein folding problem: to predict a protein’s shape (and consequently its function) given only the protein’s sequence of amino acids.

Solving this problem has been an outstanding challenge for half a century, evading the tools of biology, of chemistry, and of physics. 

Physically, the problem is typically framed in terms of minimizing the energy of the collection of atoms and molecules in the protein chain. Despite their success in areas such as biophysics and drug design, techniques like molecular dynamics, which are based in classical mechanics, fall spectacularly short. And the proteins, often consisting of hundreds or even thousands of amino acids, are far too large to be treated quantum mechanically. Some physical models for the problem, which treat the protein chain as randomly choosing junctures at which to fold (so long as the chain doesn’t fold in on itself), lead to the conclusion that the problem is NP-hard: a fancy way of saying that solving the general case is VERY HARD.

Typically, when a problem gets too large in scale to be succinctly stated in the language of one theory, another theory emerges, and with it comes a more suitable language. We need not analyze the quantum mechanical wave function of every proton, neutron and electron to understand that noble gases are stable because their valence shells of electrons are full. And we need not look at every chemical bond in a cell to understand that the mitochondrion is the powerhouse of the cell. To quote Nobel laureate Phil Anderson in his essay More is Different, “The constructionist hypothesis breaks down when confronted with the twin difficulties of scale and complexity…at each level of complexity entirely new properties appear”.

In the case of protein folding, progress has indeed been made toward finding a more suitable language. In fact, there is a general structural hierarchy within folded proteins. The primary structure is the amino acid sequence itself. In the secondary structure, the amino acids form stable patterns of helices and sheets. In the tertiary structure, these helices and sheets fold into further formations. Finally, the quaternary structure captures the interplay between multiple chains. Biologists have even identified structural motifs: three-dimensional structures which frequently appear as segments within folded proteins.

The frustrating thing about the protein folding problem is that we are fairly positive such a language should exist. Why? Because nature solves the problem all the time. Typical proteins fold in seconds or minutes; some fold on the scale of microseconds. Yet our theoretical models for protein folding – models based in physics – tell us that it should take proteins astronomically long times to fold. Even under the most lenient assumptions, the predicted timescales are longer than the Universe is old! This apparent discrepancy between the complexity of modeling protein folding on one hand, and the ease with which proteins actually fold on the other, is known as Levinthal’s paradox.
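To get a feel for the numbers behind Levinthal’s paradox, here is a back-of-the-envelope sketch in Python. Every constant in it is an illustrative assumption of mine (a 100-residue chain, three conformations per residue, ten trillion conformations sampled per second), not a measured value, and the function name is hypothetical.

```python
# Back-of-the-envelope Levinthal estimate. All constants are illustrative assumptions.
RESIDUES = 100                      # assumed chain length
STATES_PER_RESIDUE = 3              # assumed conformations available to each residue
SAMPLES_PER_SECOND = 1e13           # assumed rate at which the chain tries conformations
AGE_OF_UNIVERSE_SECONDS = 4.35e17   # roughly 13.8 billion years


def exhaustive_search_seconds(residues, states, rate):
    """Time needed to visit every conformation once at the given sampling rate."""
    return states ** residues / rate


search_time = exhaustive_search_seconds(RESIDUES, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
ratio = search_time / AGE_OF_UNIVERSE_SECONDS
print(f"Exhaustive search: {search_time:.1e} s, or {ratio:.1e} times the age of the Universe")
```

With these assumed numbers, a blind search takes on the order of 10^34 seconds, about 10^17 times the age of the Universe. Real proteins clearly do not fold by blind search.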

The Test

Every two years, the worldwide protein folding community comes together to assess the state of progress in the field. More than one hundred research groups from around the globe come armed with their newest and most sophisticated algorithms for predicting the structure of proteins. These algorithms are then evaluated on a set of roughly 100 never-before-measured proteins.

Adding to the challenge, the competitors (the different research groups) are not told anything about the proteins prior to the assessment. In this way, the biennial test, known as the Critical Assessment of protein Structure Prediction (CASP), is designed so as to test protein structure prediction solely on the basis of amino acid sequence. In other words, CASP is designed so that ‘solution’ implies solving the protein folding problem.

Given the vast space of possible ‘predictions’ for each protein, CASP evaluates the quality of a prediction, or how closely it approximates the actual measured protein, on a variety of metrics. The primary evaluation metric, the global distance test (GDT), involves comparing the actual and predicted positions of atoms known as alpha-carbons, which tag the approximate locations of the amino acids. In essence, this is a way of quantifying how well the measured and predicted proteins overlap in three-dimensional space, with GDT scores of 0 implying no overlap, and 100 signifying perfect overlap.
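To make the metric concrete, here is a simplified sketch of a GDT-style score in Python. It follows the commonly used GDT_TS variant, which averages the percentage of alpha-carbons lying within 1, 2, 4, and 8 angstroms of their measured positions. The big simplification is that I assume the two structures have already been superimposed; the real test searches over many superpositions to maximize each count. The function name and the toy coordinates are made up for illustration.

```python
import math

CUTOFFS_ANGSTROM = (1.0, 2.0, 4.0, 8.0)  # distance thresholds of the GDT_TS variant


def gdt_ts(predicted, measured, cutoffs=CUTOFFS_ANGSTROM):
    """Simplified GDT_TS for two alpha-carbon traces that are ALREADY superimposed."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty traces of equal length")
    # Distance between each predicted alpha-carbon and its measured counterpart.
    distances = [math.dist(p, m) for p, m in zip(predicted, measured)]
    # Fraction of residues within each cutoff, then averaged over the cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy four-residue example (hypothetical coordinates, in angstroms).
measured = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 9.0, 0.0)]
score = gdt_ts(predicted, measured)
print(score)  # 56.25
```

In the toy example the four residues are off by 0.5, 1.5, 3.0, and 9.0 angstroms, so 1, 2, 3, and 3 of them fall within the respective cutoffs, giving a score of 56.25. A perfect prediction scores 100.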

However, the experimental techniques used to measure the actual proteins are not perfect. This means that after a certain point, it isn’t clear whether the predicted or measured protein is more accurate; which one is closer to ground truth. As a result, a score above 90 on the GDT is generally regarded as a ‘solution’. 

From the inception of CASP in 1994 up through 2016 (CASP12, the 12th competition), there had not been substantial improvement in performance on the GDT. In the intervening years, understanding of proteins and protein folding had absolutely matured. But the results had not materialized in three-dimensional structure prediction. From 2006 to 2016 for instance, the median GDT on the subset of test proteins in the free-modeling category for the best performing algorithm remained above 30 and below 42 every year. Up through this point, no machine learning based approach had even come close to threatening the state of the art.

The ‘Solution’

Enter DeepMind, on a mission to radically challenge our deepest held beliefs about the power of AI. In 2017, DeepMind shocked the world when its artificial intelligence AlphaGo demonstrated mastery over the game of Go, convincingly beating the reigning world champion. Fresh off its game-playing triumph, DeepMind unabashedly set its sights on protein folding.

In 2018, participating in CASP for the first time, DeepMind’s AI system AlphaFold handily beat the competition, scoring a median GDT of close to 60 on the free-modeling category, regarded as the most challenging category. This was properly recognized as a tremendous leap forward on the protein folding problem, albeit far from a solution. Already, AlphaFold had convinced many that machine learning could potentially be useful not just in games, but in pure scientific research. Indeed, AlphaFold was so convincing that about half of the entrants for the 2020 CASP competition used deep learning in their approaches.

Determined to build on this initial success, DeepMind went back to the drawing board and returned to CASP in 2020 with a new and improved AI: AlphaFold2. Once again, DeepMind shocked the world by shattering its own records and achieving a median GDT of 87 on the free-modeling category – and 92.4 GDT overall. On average, AlphaFold2’s predictions were within a single atom’s width of the actual measurements.

Almost immediately, AlphaFold2 was hailed as a ‘solution’ to the protein folding problem. Because AlphaFold2 is an improved version of AlphaFold, we’ll refer to the AI system without the number ‘2’.

DeepMind’s own blog claimed “AlphaFold: a solution to a 50-year old grand challenge in biology”. News outlets followed suit, with sources including Science Magazine, Vox, CNBC, and MIT Tech Review using some variant of the word “solved” in their coverage, and sentiment to match.


With AlphaFold, like AlphaGo before it, DeepMind is forcing us to reimagine what artificial intelligence is capable of. This in and of itself is remarkable. That AI will likely play an integral role in the future of medical research and drug discovery is worth further celebrating. DeepMind deserves ample credit for these achievements. 

That being said, it is far too early to claim they have ‘solved’ the protein folding problem. I believe we should remain skeptical because of the relationship between machine learning on the one hand, and generalizability and interpretability on the other. These problems are not unique to AlphaFold. Rather, they are philosophical qualms with using machine learning to ‘solve’ scientific problems in the way that AlphaFold attempts to do.

The application of machine learning to scientific research is not new. At the Large Hadron Collider (LHC) at CERN for instance, machine learning was used to find the Higgs boson back in 2012. The difference lies in how machine learning is employed. 

At CERN, machine learning was used to facilitate the comparison of our scientific theories – in this case the standard model of particle physics – and experimental data. Physicists already had a theory for the ways in which elementary particles interact with each other; they set out to test that theory by colliding fast-moving particles together, and comparing post-collision measurements with the outputs of their theoretical models. The problem was that even on powerful computers, their model took a long time to generate predictions. Machine learning helped them to more quickly generate synthetic data to compare with experiments. Machine learning did not replace the physics-based model; it helped test the model.

With AlphaFold, DeepMind is effectively attempting to replace physical and biological models of protein folding with a machine learning model. Yes, AlphaFold performed far better than any previous models. But to what extent can we actually trust AlphaFold’s predictions on new proteins? In other words, how well does AlphaFold generalize?

Well, we don’t really know. Of the millions of proteins we have already found, AlphaFold was trained on the tiny fraction whose structures have been measured. The test set was smaller still. Even if AlphaFold had perfectly predicted every test protein – which it didn’t – I’d still bet on nature’s ingenuity.

Of course, any theory, when faced with new observations, must wrestle with the same questions. If theory and observations disagree, the theory must be modified or replaced entirely. But with machine learning models, where the assumptions are hidden, it often isn’t clear where the model is breaking down.


By playing against AlphaGo time after time, researchers have begun to gain insights into how the AI “thinks”. And human Go players have taken inspiration from AlphaGo in their own strategies. In just a few years, the AI has already given us tremendous insights into the game of Go, strategy more generally, and what it means to be creative.

“Solving” a scientific problem is a far higher bar than besting the best human at a game.

It’s quite possible that artificial intelligence helps us to achieve this goal; to find the right language. But we need to work on extracting insights from our machine learning models, and interpreting the models we build.

Even if AlphaFold never improves beyond its current state, it will still prove useful in medical research; at the bare minimum, it will allow biologists to take coarser measurements in the lab (reducing time and money spent), and use AlphaFold to iron out the fine structure. More optimistically, we can envision a future in which humans work with AlphaFold to discover the rules of protein folding.

AlphaFold is not a solution to the protein folding problem, but it is absolutely a breakthrough. Any machine learning based approach to science will need to address practical and philosophical challenges. For now, we should appreciate DeepMind’s colossal step forward, and we should prepare for unprecedented progress in the near future. This is only the beginning.

Diagrams and Deep Neural Nets: Abstraction in Science

Famed Abstract Expressionist Arshile Gorky once wrote, “Abstraction allows man to see with his mind what he cannot physically see with his eyes… Abstract art enables the artist to perceive beyond the tangible, to extract the infinite out of the finite. It is the emancipation of the mind. It is an explosion into unknown areas.”

In science as in art, abstraction has always been vital to progress. It is responsible for our mathematics, for many of our scientific discoveries, and for unearthing overlooked connections in old theories, thereby changing our understanding of the world. Historically, the development of new tools for abstraction has led to novel insights, and counterintuitively, to more concrete and quantitatively accurate predictions. Now, deep learning methods are taking this to an entirely new level, uncovering patterns invisible to not only the naked eye but even the machinery of mathematics.

In society, the abstract is all around us in our signs, symbols, and gestures. We use abstractions to organize our knowledge and to express ourselves. As civilization has developed, our communal trove of knowledge has grown exponentially. We develop abstractions for our abstractions, and place an ever higher premium on the ability to think abstractly.


Mathematics provides the perfect showcase for this idea: It’s basically the science of abstraction. The very process of learning mathematics highlights how abstract representations get layered one upon the other until they form a universe of connections.

The first layer – counting – is so simple we might not even think of it as abstraction. But to say “there are five apples” means that we can abstract away the different shapes and sizes that make them distinct objects, and categorize them as the same. We learn that four apples is different from five apples. We eat one and are forced to develop the concepts of addition and subtraction.

We develop numerals, written symbols for the numbers they represent. We create notation for addition (+) and subtraction (-). We integrate the concept of a variable, something that can change. The layers are already stacking up: The variable is an abstraction for a changing numeral, which is an abstraction for a number, which we originally manufactured to count our physical objects. 

As our mathematics becomes more sophisticated, we develop abstractions for owing (having negative of something), zero (the idea of nothingness), and infinity to represent something larger than any counting number – larger than any quantity of apples or anything else we could ever possibly have. The unbounded vastness of infinity perhaps epitomizes the limitlessness of abstraction itself. We define sets of numbers, like the irrationals, which are impossible to make contact with without thinking in abstractions. This process goes on and on.

Abstraction enters science

Through mathematics, abstraction made its way into science itself. At its core, the scientific method is a cycle of hypothesis, testing, and revision. Thus, scientists have always sought patterns and laws to describe natural phenomena. At the height of the Scientific Revolution, Sir Isaac Newton published Principia (1687), laying the ideological framework for a science rooted in abstraction. 

While Newton’s eponymous laws of motion and law of universal gravitation were quite accurate at the time (and to this day describe macroscopic non-relativistic matter remarkably well), the laws were even more powerful in their statement that the state of a physical object can be represented by mathematical variables. For Newton’s laws, the state of an object was fully captured by its position and velocity, both of which are easily measured quantities. However, in different theories the state has since taken on various properties. Furthermore, the abstraction to a state allowed for properties that are not directly measurable – like the phase of a quantum state (only the relative phases between quantum states are measurable) – but which nonetheless have observable consequences. This represented a paradigmatic philosophical shift in the practice of science.

Diagrams are emblematic of abstraction in science. Ubiquitous in science – and especially prominent in physics – they go far beyond pictures or illustrative figures. In nearly every branch of physics, diagrams facilitate the solution of problems by making computations tractable. But even more importantly, they do so by abstracting away physically unimportant details of the system under study and emphasizing one particular feature.

In Classical Mechanics, which describes how macroscopic objects like blocks and balls and trains behave, Newton’s Laws formulate the dynamics of such objects in terms of forces, which act on objects and set them in motion. Free Body Diagrams (FBDs) arise as a visual tool for keeping track of the forces acting on an object. In an FBD, forces are represented as lines emanating from (the center of mass of) an object.

As a simple example, consider the setup in figure 1 below: two electrically charged balls, A and B, are hanging (at rest) from strings attached to a rafter. Suppose we want to find the tension in the string attached to ball A. From this picture alone, it is not clear what details are relevant or even if we have all of the necessary information.

Fig. 1: Physical setup: two electrically charged balls of uniform density, A and B, are hanging statically from ropes attached to a rafter. The ropes have the same length, and each makes an angle theta with the vertical.

The Free Body Diagram for ball A gives a much cleaner picture, indicating the relevant aspects of the physics. We can see immediately that there are only three forces acting on A – gravity, a Coulomb repulsion from B, and the tension from the rope. These forces are very different in nature, but they are all treated on equal footing. We can conclude immediately that we do not need to know the length of the rope, or the length or width of the rafter. We don’t even need the angle theta.

Fig. 2: Free body diagram for the setup in Fig. 1. Only the forces acting on ball A are shown: tension, the electrostatic (Coulomb) force, and gravity. Because the ball is in equilibrium, both the horizontal (x) and vertical (y) components of the net force must vanish.

Furthermore, we need the mass of A but not the mass of B (because this is an FBD for A, not B). We do, however, need the electrical charge of both balls and the distance between them. Ball A is in static equilibrium, so by Newton’s Laws, the net force acting on it must be zero.
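To see how little information the diagram actually demands, here is a minimal Python sketch of the resulting calculation. It assumes the two balls hang at the same height, so that the Coulomb force on A is purely horizontal, and treats them as point charges. The tension then has to balance a vertical weight and a horizontal repulsion, so its magnitude is the square root of the sum of their squares, and the angle theta never enters. The function name and the sample values are hypothetical.

```python
import math

COULOMB_CONSTANT = 8.9875517923e9  # N m^2 / C^2
G = 9.81                           # m / s^2, gravitational acceleration near Earth's surface


def string_tension(mass_a, charge_a, charge_b, separation):
    """Tension (N) in the string holding ball A, assuming B is level with A."""
    weight = mass_a * G                                                      # vertical force
    coulomb = COULOMB_CONSTANT * abs(charge_a * charge_b) / separation ** 2  # horizontal force
    # Static equilibrium: the tension cancels both perpendicular components.
    return math.hypot(weight, coulomb)


# Hypothetical values: a 10 g ball, 0.1 microcoulomb on each ball, 10 cm apart.
tension = string_tension(mass_a=0.010, charge_a=1e-7, charge_b=1e-7, separation=0.10)
print(f"{tension:.4f} N")
```

Note that the inputs are exactly the quantities picked out by the diagram: the mass of A, both charges, and the separation. If the charges are set to zero, the tension reduces to the weight of the ball, as it should.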

By abstracting away the nature of the forces, the details of the physical setup, and the other objects present, Free Body Diagrams isolate the ingredients responsible for determining motion, making a seemingly complicated problem feasible.

The history of physics is littered with similar examples of the power of diagrammatic abstractions, such as Minkowski Diagrams in Special Relativity, and Penrose Diagrams in General Relativity, which illuminate the causal structure of spacetime. Perhaps the most prevalent diagram in all of physics is the Feynman Diagram. Feynman Diagrams are such powerful tools that Julian Schwinger, who shared the 1965 Nobel Prize in Physics with Richard Feynman, said they “brought quantum field theory to the masses.” Feynman Diagrams are so popular they have even pervaded pop culture, finding their way into movies and onto shirts and mugs.

The central object of study in quantum electrodynamics (QED) – the study of the interactions between light and matter – is the scattering matrix. The fundamental processes in quantum field theory are called scattering events: one particle breaks up into several (decay), two particles collide and annihilate each other (pair annihilation), etc. The scattering matrix provides the relationships between the initial and final states of such a system when particles scatter. It is given by an integral that is often quite difficult or even impossible to calculate directly.

Feynman Diagrams are useful tools for “book-keeping” when calculating the scattering matrix. Richard Feynman recognized that even though the scattering matrix might be hard to calculate directly, the integral could be written as a (possibly infinite) series, where each term in the series could be viewed as a set of particles interacting, representing a different pathway or “channel” for the scattering to occur. Furthermore, each term can be represented by a diagram.

These diagrams are read temporally from left to right, with initial particles entering at the far left (some initial time) and final particles exiting at the far right (after scattering). The diagrams do not contain spatial information. Every line represents a particle, and every vertex an interaction. Implicitly, momentum and charge are conserved at every vertex. Terms that contribute more strongly to the path integral correspond to simpler – and thus more probable – particle interactions. Feynman rules provide a prescription for manipulating these diagrams, and for calculating their contributions to the scattering matrix, thus expediting the computation of the previously intractable quantity.

Fig. 3: Feynman diagram for electron-positron annihilation. p1 and p2 are the momenta of the electron and positron respectively. The product of the scattering event is a photon (the wavy line). Copied from Schwartz QFT.

Moreover, these diagrams paved the way for new theoretical developments. First, they shed light on the fundamental nature of symmetry. Taking the diagrams at face value, Feynman concluded in 1941 that a particle moving forward in time was indistinguishable from its anti-particle moving backward in time. This became known as the Feynman-Stuckelberg interpretation. 

Second, they provided insight into the role of locality. Just looking at the terms in the scattering matrix as a series, it is not clear which terms will contribute and which will get cancelled out by other terms. Viewing the series diagrammatically, it becomes obvious that there are two types of terms: connected diagrams, in which you can trace a path from any initial particle to any final particle, and disconnected diagrams, in which you cannot. The disconnected diagrams can be decomposed into connected components, and simple manipulations show that these cannot contribute to the final scattering amplitude. This leads to cluster decomposition – a statement of locality that says that experiments well-separated in space cannot influence each other.

Fig. 4: Example of disconnected and connected Feynman diagrams. The disconnected diagrams cannot interfere with the connected diagrams. Copied from Schwartz QFT.

Diagrams will always have a place in science. And the prevalence of these tools speaks to the human capacity for creativity and ingenuity. Each diagram reflects a revelation in which one particular set of features was discovered to be vital and others immaterial. As our understanding of the world develops, however, our theories grow ever more intricate. What if the essential elements of these theories become too subtle to isolate by stroke of genius alone?

Computational Abstraction

To put it bluntly, humans aren’t essential for abstraction. Humans are bound to their physical nature, but the act of abstracting means leaving the physical realm behind. Indeed, many of the technological advances of the past few years have been spurred on by computational abstraction, a process in which computers learn abstract representations of data. At the core of this renaissance is the deep neural network – an algorithm originally conceived to mimic the process of learning in the human brain.

A simplified model of the human brain consists of many connected neurons (a network) that talk (pass information) to each other. Each neuron takes some information in, transforms it, and then transmits an electrical signal via a synapse to another neuron. The synapse either fires or doesn’t fire, depending on the magnitude of the transformed value.

A neural network functions on the same principles: A set of neurons takes input data, transforms it, and then passes the new values to another set of neurons, which in turn transform and communicate the modified values. Each set of neurons is called a layer, and the number of layers is the depth of the network. One slight modification from the model of the human brain is the prescription for transmitting the signal, known as the activation function. Rather than the binary fire or not fire of genuine synapses, more complicated functions are used.
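As a rough illustration of these ideas, here is a tiny forward pass written in plain Python. The weights are arbitrary numbers I picked by hand, and the function names are my own. The sketch compares a binary “fire or don’t fire” step function with the sigmoid, one common smooth activation function.

```python
import math


def step(x):
    """Binary activation: the neuron either fires (1.0) or does not (0.0)."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    """Smooth activation that squashes any input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases, activation):
    """One layer: each neuron takes a weighted sum of its inputs, then applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers, activation=sigmoid):
    """Pass the values through every layer in turn."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases, activation)
    return values


# A network of depth 2 with arbitrary, hand-picked weights.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # layer 1: two neurons, two inputs each
    ([[1.0, -1.0]], [0.0]),                    # layer 2: one neuron reading layer 1
]
depth = len(layers)
output = network_forward([1.0, 1.0], layers)
print(depth, output)
```

Swapping in `activation=step` makes the same network output exactly 0 or 1. The smooth version gives values in between, which is what makes gradual adjustment of the weights possible.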

Such an algorithm learns through a training process, in which it is given input data which it is asked to transform, and then the estimated output is compared to the true output (the final representation you would like it to learn). Every time the estimated output differs from the desired output, the network updates itself by changing the way it transforms inputs.
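Here is a minimal sketch of such a training loop, for a single sigmoid neuron learning the logical OR of two inputs. The update rule is plain gradient descent on the cross-entropy loss, which for a sigmoid neuron boils down to nudging each weight in proportion to the error (estimate minus target). The learning rate, the number of passes, and the toy dataset are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_neuron(examples, learning_rate=1.0, epochs=2000):
    """Train one sigmoid neuron with two inputs by gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            # Transform the input with the neuron's current weights.
            estimate = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            # Compare the estimated output with the true output.
            error = estimate - target
            # Update: shift each weight against the error it contributed to.
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


# Toy dataset: the logical OR of two binary inputs.
OR_EXAMPLES = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
weights, bias = train_neuron(OR_EXAMPLES)


def predict(inputs):
    """Output of the trained neuron for a new input."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


for inputs, target in OR_EXAMPLES:
    print(inputs, target, round(predict(inputs), 3))
```

Real networks have many neurons per layer and use backpropagation to push the error back through every layer, but the loop has the same shape: transform, compare, update.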

Just as the human brain performs abstraction when learning new mathematical concepts or drawing FBDs or Feynman Diagrams, a neural network abstracts away irrelevant details from the training examples when it modifies the transformation it applies to the data. However, whereas in these diagrams, the relevant features were hand-picked, neural networks learn which features are relevant.

On its face, there is no clear advantage to having multiple layers of neurons. In practice, however, increased depth often leads to improved performance. One distinct advantage of deep neural networks is that abstraction occurs at each layer. Throughout the training process, the transformations at each layer are tuned so that the network learns intermediate representations (one for every layer), in addition to a final representation. The deeper the layer, the more abstract the features.

Take one type of neural network used to process images, called a convolutional neural net (CNN). At the highest layers (those furthest from the input), the filters look like distorted images. In the middle layers, patterns start to emerge. In the lowest layers (those closest to the input), the CNN picks out specific textures and then edges. The CNN itself isn’t thinking, but through the process of abstraction it uncovers low-level visual features.

For instance, let’s say you want to teach a CNN what a human face is. To train the network, you assemble a large, diverse collection of images of human faces, and feed those images through the network one by one. After each step, the CNN adjusts its understanding of faces through a process called backpropagation. If the input face differs from the network’s current understanding of a face, the network changes the way the neurons communicate with each other to try to account for these differences. As more images are passed through the network, its definition of a face becomes increasingly robust. 

Fig. 5: Example of feature representations at different layers in a convolutional neural network (CNN). The input layer takes in images of faces, and deeper layers decompose the faces into more and more abstract elements. Copied from Nathan Lintz’s Indico blog post.

By the end of the training process, the deep neural network has “learned” what a human face is by deconstructing it layer by layer, with deeper layers discovering more fundamental patterns in the data. Then the network can recombine these features in new ways, painting pictures of what it thinks a human face actually is.

Deep learning facilitates scientific progress

Deep learning has already found applications in many areas of science. It is being used to model dark matter and galaxy shapes, to identify new physics in collision events at the Large Hadron Collider (LHC), and to advance drug discovery. And these data-oriented approaches have already met with tremendous success in identifying features that people could not find through intuition or genius alone. 

Higgs Detection

One of the first applications of deep learning in physics was in the discovery of the Higgs boson at CERN. The Standard Model of Particle Physics provides a unified description of three of the four fundamental forces: the electromagnetic, weak nuclear, and strong nuclear interactions. It stipulates the existence of the Higgs boson – a particle that gives mass to the other particles. The Higgs was theorized to have such high energy that, when it is a possible product of a scattering event, its diagrams contribute very minimally to the scattering matrix – it is produced with very low probability.

In order to verify the existence of the Higgs boson, physicists conducted trillions of scattering events in the LHC and set out to demonstrate that the measured and theorized Higgs contributions matched. This required distinguishing events in which Higgs bosons were produced from background events, some of which gave quite similar signatures. 

The primary challenge lay in the quantity of data required to determine the Higgs’ contribution to within an acceptable margin of error. At the LHC, particles collide at nearly the speed of light, resulting in billions of scattering events each second. The detectors take millions of measurements for each collision, resulting in the creation of roughly a petabyte of data per second.

Under hardware constraints, it was infeasible to store the massive amount of data resulting from all the collisions necessary for the theorized number of Higgs bosons to be produced. Thus, decisions about which collisions to store (the ones that are likely to have produced Higgs particles) had to be made on the spot. The traditional machinery of quantum field theory was too bulky for this problem. Instead, deep neural nets were trained to take the measurements from the detector as input and classify events as potentially interesting or not. In other words, the networks took in physical attributes from the collision, and abstracted away what makes a collision likely to produce Higgs bosons. This allowed for essentially instantaneous classification.
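The idea of deciding on the spot which collisions to keep can be sketched in a few lines of Python. This is only a toy illustration, not the actual system used at CERN: the two-number "events", the weights, and the threshold below are all hypothetical, and a single linear scoring function stands in for a trained deep network. What the sketch does show is the pattern: each event is scored as it arrives, and anything below the threshold is never stored.

```python
import math


def score_event(features, weights, bias):
    """Return a score in (0, 1); higher means 'more likely to be interesting'."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def select_events(event_stream, weights, bias, threshold=0.9):
    """Yield only the events scoring at or above the threshold.
    Everything else is dropped immediately instead of being stored."""
    for event in event_stream:
        if score_event(event, weights, bias) >= threshold:
            yield event


# Hypothetical values standing in for a trained classifier and detector readings.
HYPOTHETICAL_WEIGHTS = [2.0, -1.0]
HYPOTHETICAL_BIAS = -1.0
toy_stream = [[3.0, 0.5], [0.2, 1.0], [2.5, 0.0], [0.1, 0.1]]

kept = list(select_events(toy_stream, HYPOTHETICAL_WEIGHTS, HYPOTHETICAL_BIAS))
```

Here two of the four toy events clear the threshold and are kept. Because the decision needs only one pass over each event, it can keep pace with the incoming stream, which is the property the text above relies on.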

Drug Discovery

More recently, deep learning has shown great promise in the quest for novel classes of molecules and materials. Throughout history, entire eras have been defined by the discovery and exploitation of new types of materials, from the Bronze Age to the Iron Age to our current silicon age and the blossoming of the semiconductor industry. Since at least 1942, when penicillin was derived from the penicillium fungus and used as an antibiotic, pharmaceuticals have had a similarly society-altering effect on public health. This has resulted in the quest for compounds that exhibit particular properties of interest, be they medicinal, electronic, or otherwise.

The difficulty here is two-fold: first, the space of possible materials (or of possible drugs) is vast, and far too expansive to be searched systematically. Second, the synthesis of a compound from scratch is expensive and time-consuming.

In order to find a drug that satisfies a particular property, it is necessary to greatly reduce the number of compounds that need to be synthesized. This process of reducing the search space is known as high-throughput screening. Machine learning has been a part of this process for decades, but the quality of the computational sieve required to pick out good candidates lay out of reach – until the increased abstraction and representational power of deep neural networks made many problems in drug discovery tractable.

The road ahead

While abstraction itself does not require a human element, science does. As a tool for abstraction, deep learning relies heavily on practitioners and scientists. Humans must tune the hyperparameters of the network, such as the learning rate, which controls how much the transformations at each neuron are updated at each step of the training process. Humans also specify the depth of the network, and the number of neurons in each layer. These choices can be far from obvious.
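To see why a choice like the learning rate is not obvious, consider a deliberately simple stand-in for training: minimizing f(x) = x² by gradient descent, starting from x = 1. The function `descend` and the three rates below are assumptions picked for illustration; real networks have far more complicated loss surfaces, but the same trade-off appears.

```python
def descend(learning_rate, steps=50, start=1.0):
    """Minimize f(x) = x**2 by gradient descent and return the final x."""
    x = start
    for _ in range(steps):
        gradient = 2.0 * x             # derivative of x**2
        x -= learning_rate * gradient  # step against the gradient
    return x


too_small = descend(0.001)  # each step shrinks x by only 0.2%
reasonable = descend(0.1)   # each step shrinks x by 20%
too_large = descend(1.1)    # each step overshoots and grows |x| by 20%
```

Each update multiplies x by (1 − 2 × learning rate). With a rate of 0.001 the result has barely left the starting point after 50 steps, with 0.1 it is very close to the minimum at zero, and with 1.1 it has grown into the thousands. Nothing in the problem statement announces which regime a given rate falls into, which is why this tuning falls to the practitioner.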

Perhaps even more importantly, deep neural networks do not replace previous scientific methods and results, but instead build upon them. At CERN, the neural networks were trained using the results from simulated collision events based upon the physics of the Standard Model, viewed by many as the crowning achievement of theoretical physics thus far. In drug discovery, one of the essential factors impacting performance is the input representation. A priori it is not clear how best to present a molecule as data to a computer, be it a list of constituents and relative positions of atoms, a graph with atoms as vertices and bonds as edges, or something else entirely. It turns out that if scientists use domain knowledge (pertaining to the desired properties), they can generate chemically inspired input encodings that far outperform naïve encodings. 
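The two candidate representations mentioned above can be written down for a small molecule. The sketch below uses water, with approximate coordinates in ångströms that are meant to be illustrative rather than reference values. The variable and function names are hypothetical, and neither encoding is claimed to be the one used in any particular drug discovery system.

```python
# Representation 1: a list of constituents and their positions in space.
water_atoms = ["O", "H", "H"]
water_positions = [(0.000, 0.000, 0.0),   # approximate, illustrative values
                   (0.757, 0.586, 0.0),
                   (-0.757, 0.586, 0.0)]

# Representation 2: a graph with atoms as vertices and bonds as edges.
water_bonds = [(0, 1), (0, 2)]  # the oxygen is bonded to each hydrogen


def adjacency_matrix(n_atoms, bonds):
    """Build a symmetric 0/1 matrix with a 1 wherever two atoms share a bond."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


def translate(positions, shift):
    """Move every atom by the same displacement vector."""
    return [tuple(c + s for c, s in zip(p, shift)) for p in positions]


water_graph = adjacency_matrix(len(water_atoms), water_bonds)
shifted_positions = translate(water_positions, (1.0, 0.0, 0.0))
```

Sliding the molecule one ångström along the x-axis changes every entry in the coordinate list, even though it is chemically the same molecule, while the bond graph is untouched. This is one small example of how the choice of encoding determines what a network has to learn for itself and what it is simply given.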

Deep learning is not a panacea for the problems of science. It will not reveal to us the true nature of our universe, nor will it replace the role of humans in science. Time and again revolutionary thinkers have shifted the paradigm and changed the way we view the world, and the human spirit has strength to prevail against all odds. But by utilizing deep learning as a tool, we can shift the odds in our favor, and in so doing expedite scientific progress.