Last term I taught a small freshman seminar called “Modeling Life,” on ways of looking at biology through the lens of physical and computational models. It was part of the university’s “Freshman Interest Group” (FIG) program, in which one creates small seminars that connect two regular courses that each student in the FIG takes. This was the first time I’ve taught a FIG, and I proposed linking my Physics of Life course with the introduction to programming course in the computer science department. Both are “100-level” general education courses, intended to reach a wide range of students. The FIG was an interesting experiment, and since I haven’t documented it anywhere else, I’ll write a little bit about it here. Though I don’t think my approach was great, there is definitely a need for more venues that expose students, at an early stage, to the intersection of biology, physics, and computation, and that convey what concepts like “modeling” mean, so perhaps some of the material I put here might be of use to some future course somewhere.
Motivations. The study of life is being revolutionized by the study of information. Thanks to DNA sequencing and other high-throughput ways of identifying biomolecules, we have troves of data on genomes, gene activity, protein interactions, and more. Thanks to advances in imaging, sensing, and tracking, we can acquire huge amounts of information about form and structure. In itself, all this data doesn’t provide any insights into how living things work. For this, we need models: simplified representations of reality that highlight key features, as well as tools that let us navigate through data. The goal of this course was to reveal the existence of these broad themes in contemporary science. It was a non-technical course, focused more on this landscape than on particular paths through it.
I tried to use lots of examples related to bacteria, because of their importance, because of their utility in illustrating biophysical and “systems biological” concepts, and because of my own interests in bacterial communities.
“Dry” biology. The term began with some readings about the gut microbiota, as well as a short piece from Science on “Biology’s Dry Future” [1], describing how, via new computational methods, “researchers are making fundamental discoveries without ever filling a pipette, staining a cell or dissecting an animal.” The existence of this mode of research was completely alien to all the students, disjoint from any conception of biology that they had gotten in high school, and they were surprised and excited by it.
The Physics of Penguins. To introduce the concept of models, we discussed a paper, “The origin of traveling waves in an emperor penguin huddle” [2], in which a group of physicists explain how waves of jostling motions propagate through a group of penguins. The paper illustrates the idea of creating simple, tractable models that capture the essence of a phenomenon, and helps set up the idea of agent-based simulations. I’m not actually very fond of the penguin paper: in the twenty-first century no one should be at all surprised that collective phenomena like waves can emerge from simple objects with nearest-neighbor interactions, and we shouldn’t need to run a computer simulation to realize this. Originally, I planned to discuss this and other criticisms, but in the end I abandoned this in the interests of time and simplicity.
Analytic and Numerical Approaches (and growth and disease). Thinking about agent-based models led us to models for diseases and epidemics and, more broadly, to the distinction between analytic models and numerical simulations. We discussed the advantages and disadvantages of writing a simulation, and the “art” of figuring out which problems are amenable to analytic solution and which require brute force. I started with a simple example: figuring out the average of random numbers uniformly distributed between -10 and 10, which one could determine by simulation but which, we all realize, is trivial to figure out just by thinking.
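To make that contrast concrete, here is a minimal Python sketch that puts the brute-force route next to the just-think-about-it route. The function names, sample sizes, and random seed are illustrative assumptions, nothing more.

```python
import random


def simulated_mean(n_samples, low=-10.0, high=10.0, seed=0):
    """Estimate the average of uniform random numbers on [low, high] by drawing them."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same numbers
    total = sum(rng.uniform(low, high) for _ in range(n_samples))
    return total / n_samples


def exact_mean(low=-10.0, high=10.0):
    """The answer from thinking: the midpoint of the interval."""
    return (low + high) / 2.0


if __name__ == "__main__":
    # The simulated estimate wanders toward the exact value as the sample grows.
    for n in (10, 1_000, 100_000):
        print(f"n = {n:>7}: simulated = {simulated_mean(n):+.4f}, exact = {exact_mean():+.4f}")
```

The simulation needs many draws to get close to an answer that the midpoint argument gives immediately, which is the point of the example.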
We then moved on to examples in which it’s less obvious that there are “clever” exact solutions, for example logistic growth: a population with some growth rate dependent on the present value of the population (giving exponential growth) and on some sort of constraint on the overall carrying capacity of the environment (giving a stable ceiling to the population). It’s easy to see how to simulate this; it’s not readily apparent that there’s a nice analytic solution for the population as a function of time. This also let me discuss my lab’s research on microbial growth, which we returned to several times.
Noisy gene expression. The theme of simulating versus exactly knowing what form a model takes led to a discussion of a few pieces of a beautiful paper from Ido Golding & colleagues, “Real-Time Kinetics of Gene Activity in Individual Bacteria” [3], which illustrates both approaches. We talked about genes, and asked what gene expression “looks like” at the level of the actual, physical molecules involved. How can the same genes lead to different outcomes, and how can we think about randomness and predictability in systems of genes? These questions could easily fill a whole term; we were very superficial. Still, the discussion achieved its aim of conveying that there’s a far greater depth to how genes act than is even hinted at in cartoons from introductory biology books, and that quantitative ideas about physical processes are helping us explore these areas.
Inferring genes from sequence data. It is fascinating that, in an age when genes and DNA sequencing are referred to everywhere, most people have no idea of the computational challenges involved in figuring anything out from sequence-based data. To illustrate this, I sketched the basic idea behind a neat approach by Curtis Huttenhower and colleagues [4] to infer the genes contained in microbial communities given sparse information about what species are present.
First, we discussed an analogy: suppose you knew how to say “seven” in several Indo-European languages, but not in Italian. How would you try to predict what “seven” in Italian would be? This was fun to discuss, and again, the existence of algorithmic challenges like this in biology was a complete surprise to everyone in the class.
Image analysis. Another key role for computation is in getting data in the first place (often very large amounts of data). This is something my lab deals with a lot, in the context of microscopy and imaging. We took everyone to my lab, looked through our microscopes at bacteria swimming, and ogled arrays of mirrors and lasers. We also discussed some basic themes of image analysis, starting by asking “what is a digital image?” (A few people could give a decent answer; most could not.) Given that an image is an array of numbers, how, for example, can we identify objects like cells?
Throughout several of the topics, we visited and revisited questions like: What is a model? What is modeling good for? I’ll leave it to the reader to supply answers.
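One concrete way into those questions is the logistic growth example from earlier, which can be attacked both by simulation and by exact solution. The sketch below is a minimal Python illustration, assuming the standard logistic form dN/dt = rN(1 - N/K), whose closed-form solution is N(t) = K / (1 + ((K - N0)/N0) exp(-rt)). The parameter values (growth rate, carrying capacity, starting population, time step) are invented for illustration and are not measurements of anything.

```python
import math


def logistic_exact(t, n0, r, k):
    """Closed-form population at time t for dN/dt = r*N*(1 - N/K), starting from n0."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


def logistic_simulated(t_end, n0, r, k, dt=0.001):
    """Step the same equation forward in small time increments (forward Euler)."""
    n = n0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        growth = r * n * (1.0 - n / k)  # growth slows as n approaches k
        n += growth * dt
    return n


if __name__ == "__main__":
    # Hypothetical parameters: growth rate, carrying capacity, initial population.
    r, k, n0 = 1.0, 1000.0, 10.0
    for t in (1.0, 5.0, 10.0):
        sim = logistic_simulated(t, n0, r, k)
        exact = logistic_exact(t, n0, r, k)
        print(f"t = {t:4.1f}: simulated = {sim:8.2f}, exact = {exact:8.2f}")
```

With a small enough time step the two routes agree closely; increasing `dt` shows how the simulation drifts away from the exact curve, which is one way to see what each approach is good for.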
In this post, I haven’t said anything about who the students were, what projects and assignments we had, how the course went, and whether I’ll teach it again. I’ve already spent far too much time writing, though, so all this will have to wait until Part 2. Stay tuned!
Update: I’ve written Part 2.
[1] R. F. Service, Biology’s Dry Future. Science. 342, 186–189 (2013). [http://www.sciencemag.org/content/342/6155/186.summary]
[2] R. C. Gerum et al., The origin of traveling waves in an emperor penguin huddle. New J. Phys. 15, 125022 (2013). [http://iopscience.iop.org/1367-2630/15/12/125022/article]; see also http://www.nature.com/nature/journal/v505/n7483/full/505265e.html
[3] I. Golding, J. Paulsson, S. M. Zawilski, E. C. Cox, Real-Time Kinetics of Gene Activity in Individual Bacteria. Cell. 123, 1025–1036 (2005). [http://www.cell.com/cell/abstract/S0092-8674%2805%2901037-8]
[4] M. G. I. Langille et al., Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat. Biotechnol. 31, 814–821 (2013). [http://www.nature.com/nbt/journal/v31/n9/full/nbt.2676.html]