T-minus 9 days for my graduate biophysics course

[Image: sea urchin illustration]

Next term, I’ll be teaching a brand-new graduate biophysics course. (It’s the first time teaching a graduate course in my eight years as a professor!) I’ve spent quite a while thinking of what should be in it and how the course should be structured. Here, I’ll just note my list of topics (below, with a few comments), and provide a link to the syllabus (here). Hopefully in weeks to come I’ll comment on how the course is going.


Introduction; Physics, statistics, and sight

What are the fundamental limits on vision, and how close does biology come to reaching them? (A brief look.)

Components of biological systems

What are the components of biological systems? What are the length, time, and energy scales that we’ll care about? How can we organize a large list of “parts?”

Probability and heredity (a quick look)

We’ll review concepts in probability and statistical mechanics. We’ll discuss a classic example of how a quantitative understanding of probability revealed how inheritance and mutation are related.

Random Walks

We can make sense of a remarkable array of biophysical processes, from the diffusion of molecules to the swimming strategies of bacteria to the conformations of biomolecules, by understanding the properties of random walks.

Life at Low Reynolds Number

We’ll figure out why bacteria swim, and why they don’t swim like whales.

Entropy, Energy, and Electrostatics

We’ll see how entropy governs electrostatics in water, the “melting” of DNA, phase transitions in membranes, and more.

Mechanics in the Cell

We’ll look more at the mechanical properties of DNA, membranes, and other cellular components, and also learn how we can measure them.

Circuits in the Cell

Cells sense their environment and perform computations using the data they collect. How can cells build switches, memory elements, and oscillators? What physical principles govern these circuits?

Multicellular organization and pattern formation

How does a collection of cells, in a developing embryo, for example, organize itself into a robust three-dimensional structure? We’re beginning to understand how multicellular organisms harness small-scale physical processes, such as diffusion, and large-scale processes, such as folding and buckling, to generate form. We’ll take a brief look at this.

Cool things everyone should be aware of

We live in an age in which we can shine a laser at particular neurons in a live animal to stimulate them, paste genes into a wide array of organisms, and sequence a genome given only a single cell. It would be tragic to be ignorant of these sorts of almost magical things, and they contain some nice physics as well!


As you’ve probably concluded, this is too much for a ten-week course! I will cull things as we go along, based on student input. I definitely want to spend some time on biological circuits, though, which I’m increasingly interested in. I also want to dip into the final topic of “cool things” — I find it remarkable and sad that so many physicists are unaware of fantastic developments like optogenetics, CRISPR, and high-throughput sequencing. Students: prepare to be amazed.

My sea urchin illustration above has nothing to do with the course, but if you’d like a puzzle: figure out what’s severely wrong with this picture.


I’m at a conference at Biosphere 2, the large ecological research facility in the Arizona desert that was originally launched as an attempt at creating a sealed, self-contained ecosystem.

It’s a surreal place (a collection of glass pyramids and domes housing miniature rain forests, deserts, an “ocean,” and a few other biomes) that’s now used for more “normal” research and education. I’m here not to join some futuristic commune (at least not yet), but rather as a participant in a fascinating conference organized by Research Corporation called “Molecules Come to Life.” Basically, it’s getting a lot of people who are interested in complex living systems together to discuss big questions, think of new research directions, and launch new projects. It’s a very impressive group that’s here. Interestingly, a huge fraction are physicists, either physicists in physics departments (like me) or people trained as physicists who are now in systems biology, bioengineering, microbiology, etc., departments.

Do the conference topic and the venue have anything to do with one another? Explicitly, no. But in an indirect sense, both touch on issues of scale. A key issue in the study of all sorts of complex systems is how to relate phenomena across different extents of space and time. How can we connect the properties of molecules to the operation of a biological circuit? A circuit to a cell? A cell to an organism? Are there general principles — like those that tie the individually chaotic behaviors of atoms in a gas into robust many-particle properties like pressure and density — that lead to a deeper understanding? Would a piece of a complex system have the same behavior as the whole, or are collective properties scale-dependent?

The initial goal with Biosphere 2 was that these small-scale ecosystems under glass could function sustainably. This failed quite badly (at least at first — see Wikipedia for more details). As we learned on an excellent tour this afternoon, nearly all animals in the enclosure died, the food grown was so minimal that everyone was hungry all the time, and oxygen levels dropped from about 20% to 14% (at which point oxygen had to be pumped in). Walking around, the issue that kept coming to mind was: what is the scale of an ecosystem? Biosphere 2 is really not very big — it’s a few football fields in total area. Are the webs of interaction that can exist in an area this size sufficient to mimic a “real” rainforest, savannah, or other environment? Are they large enough to be stable, and not fluctuate wildly?

Perhaps these questions couldn’t have been answered without building the structure and trying the experiment. (Or perhaps they could.) It would be great to talk to the people behind the project — they were commune dwellers, not scientists — and see what thoughts, assessments, dreams, and predictions went into the planning of this impressive, but odd, place.



What have I got in my pocket?

What makes a good exam question? Not surprisingly, I try to write exams that most students who are keeping up with the course should do well on — almost by definition, the exam should be evaluating what I’m teaching. But I also want the exam to reveal and assess different levels of understanding; it would be useless to have an exam that everyone aced, or that everyone failed. Also not surprisingly, I’m not perfect at coming up with questions that achieve these aims. For years, however, I’ve been using the data from the exam scores themselves to tell me about the exam. Here’s an illustration:

I recently gave a midterm exam in my Physics of Energy and the Environment course. It consisted of 26 multiple choice questions and 8 short answer questions. For the multiple choice questions, I can calculate (i) the fraction of students who got a question correct, and (ii) the correlation between student scores on that question and scores on the exam as a whole. The first number tells us how easy or hard the question is, and the second tells us how well the question discriminates among different levels of understanding. (It also tells us whether the question is assessing the same things that the exam as a whole is aiming for, roughly speaking.) These are both standard things to look at, and I’ll note for completeness there’s lots of literature I tend not to read and can’t adequately cite about the mechanics of testing.
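As a minimal sketch of the two per-question numbers described above (this is not the script I actually use, and the function names and the tiny data set are made up for illustration), one can compute the fraction correct and the Pearson correlation between each question's 0/1 score and the total exam score. Here the total includes the question itself; one could also exclude it, which is a design choice I'm not taking a position on.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Undefined if every student has the same score on either variable.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def item_statistics(responses):
    """responses[s][q] is 1 if student s got question q right, else 0.

    Returns one (fraction_correct, correlation_with_total) pair per question.
    """
    totals = [sum(row) for row in responses]
    n_questions = len(responses[0])
    stats = []
    for q in range(n_questions):
        item = [row[q] for row in responses]
        fraction_correct = sum(item) / len(item)
        stats.append((fraction_correct, pearson(item, totals)))
    return stats


# Hypothetical data: 4 students (rows), 3 questions (columns).
example = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
]
for q, (frac, corr) in enumerate(item_statistics(example), start=1):
    print(f"Question {q}: fraction correct = {frac:.2f}, correlation = {corr:.2f}")
```

A question that everyone answers correctly has no variance, so its correlation is undefined; the sketch reports that case as NaN rather than guessing a value.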

Here’s the graph of correlation coefficient vs. fraction correct for each of the multiple choice questions from my exam:

[Figure: midterm correlations, correlation coefficient vs. fraction correct for each multiple choice question]

We notice first of all a nice spread: there are questions in the lower right that lots of people get right. These don’t really help distinguish between students, but they probably make everyone feel better! The upper left shows questions that are more difficult, and that correlate strongly with overall performance. In the lower left are my mistakes (questions 6 and 15): questions that are difficult and that don’t correlate with overall performance. These might be unclear or irrelevant questions. Of course I didn’t intend them to be like this, and now after the fact I can discard them from my overall scoring. (Which, in fact, I do.)

I can also include the short answer questions, now plotting mean score rather than fraction correct (since the scoring isn’t binary for these). We see similar things — in general the correlation coefficients are higher, as we’d expect, since these short answer questions give more insights into how students are thinking.

[Figure: correlations for all questions, including short answer]

It’s fascinating, I think, to plot and ponder these data, and it has an important goal of assessing whether my exam is really doing what I want. I’m rather happy to note that only a few of my questions fall into the lower-left corner of mediocrity. I was spurred to post this because we’re doing a somewhat similar exercise with my department’s Ph.D. qualifier exam. One might think, given the enormous effect of such an exam on students’ lives, and the fact that a building full of quantitative scientists creates it, that (i) we routinely analyze the exam’s properties, and (ii) it passes any metrics of quality one could think of. Sadly, neither is the case. Only recently, thanks to a diligent colleague, do we have a similar analysis of response accuracy and question discrimination. Frighteningly, we have given exams in which a remarkable fraction of questions are poor discriminators, correlating weakly or even negatively with overall performance! I am cautiously optimistic that we will do something about this. Of course, it is very difficult to write good questions. However: rather than telling ourselves we can do it flawlessly, we should let the results inform the process.

Modeling Life (a freshman seminar) — Part 2

[Image: abstract watercolor of a fig]

In Part 1, I described the motivations behind a “Freshman Interest Group” (FIG) seminar I taught last term, called “Modeling Life,” that explored how contemporary science can make sense of biology by way of physical and computational models. I also wrote about several of the topics explored in the class. Here, I’ll describe some of the assignments and projects, along with thoughts on whether the course succeeded in its aims, and whether I’ll teach it again.


Since the course was only a one-credit, one-hour-per-week seminar, and was focused on awareness of what can and can’t be done with models rather than on actually conveying skills in modeling, I kept the assignments minimal. Many weeks involved just writing a paragraph or two. For example, following the first class’s discussion of a paper modeling waves of jostling penguins (see Part 1), students had to “Think of at least one other system besides penguins (biological or not) that would be amenable to this sort of modeling of interactions, and describe what ingredients or rules you’d put into a model of it.” Students proposed various systems of interacting agents, nearly all involving animals, people, or cars. This led to a nice discussion of, for example, the field of traffic modeling, and to Itai Cohen’s group’s simulations of “Collective dynamics in mosh pits.”

All FIGs are supposed to do something with the library, and so I came up with an assignment I’m quite fond of that explored the “demographics” of article authorships. The students picked one of two papers that we had mentioned in class:

and then looked “forward” and “backwards” at some subset of its citations (e.g. via Web of Knowledge) and its references. The students picked at least two characteristics like:

  • What departments the authors are from;
  • What countries the authors are from;
  • Whether the papers are about experiments, computation, or both (just determined from the abstract)

and described what they found about the collection of studies linked to the chosen article. (An extended version of this assignment was an option for the final project for the class.) Even more than I expected, students were surprised and interested to find things like the wide array of departments represented by the authors (biology, physics, computer science, various forms of engineering); the number of countries represented (with the very large US fraction being even larger among references than citations); and more. We spent a while discussing authorship (most students have a nineteenth-century notion of lone scientists writing single-author papers) and how the number of people in research groups varies between fields. I of course showed an example from high-energy physics; this one has over three hundred authors, which is fairly typical:

[Image: screenshot of the paper]

The full first page:

[Image: screenshot of the full first page]

Final project

For a final project, students had a choice of either an expanded version of the ‘follow the literature’ assignment described above, or they could write simple computer programs that illustrated biased random walks (as in bacterial chemotaxis) or logistic growth (chaotic population dynamics). They could work in groups. About 2/3 chose the programming exercises. All of these went well — better than I expected in terms of both students’ interest in the project and their success in implementing them. (The students made use of the simple programming methods they were learning in the computer science class — I cringed to watch graphs being made by having a “turtle graphics” cursor trace out paths, and had flashbacks to seventh grade.)
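To give a flavor of the biased random walk exercise, here is a minimal sketch; it is not the students' code, and the function name and parameter values are my own illustrative choices. Each step goes right with probability `p_right` and left otherwise. Real bacterial chemotaxis biases the walk by changing how long the cell runs between tumbles rather than by biasing each step's direction, so treat this as a cartoon of the idea.

```python
import random


def biased_walk(n_steps, p_right, rng):
    """One-dimensional walk: each step is +1 with probability p_right, else -1.

    Returns the list of positions, starting at 0 (length n_steps + 1).
    """
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += 1 if rng.random() < p_right else -1
        path.append(position)
    return path


# Fixed seed so that repeated runs give the same numbers.
rng = random.Random(0)
n_walkers, n_steps = 500, 1000

# Final positions of many walkers, without and with a small bias.
unbiased_ends = [biased_walk(n_steps, 0.50, rng)[-1] for _ in range(n_walkers)]
biased_ends = [biased_walk(n_steps, 0.55, rng)[-1] for _ in range(n_walkers)]

mean_unbiased = sum(unbiased_ends) / n_walkers
mean_biased = sum(biased_ends) / n_walkers

# The expected drift is n_steps * (2 * p_right - 1): 0 for the unbiased walk
# and 100 steps for p_right = 0.55.
print("mean final position, unbiased:", mean_unbiased)
print("mean final position, biased:  ", mean_biased)
```

Even a small per-step bias produces a steady drift that grows linearly with the number of steps, while the spread of the walkers grows only as the square root of the number of steps.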

Overall assessments

Did the course succeed? In some ways: yes. Students seemed very interested in the topics we explored, and most weeks we had quite good discussions. And it certainly was the case that the things we learned about were, to the students, completely new and far outside the scope of standard things they had previously encountered. If this were a “normal” course, I’d call it a success based on the level of engagement and interest we achieved. However, it was not a normal course, and there were three issues with it that dampen my enthusiasm for repeating it.

First, since I taught this concurrently with my Physics of Life course, a typical large, four-credit class, it added to my workload. Of course, I knew this going in. But, because I have far, far more things to do every week than there are hours in which to do them, I should really be subtracting from, rather than adding to, my list.

Second, a goal of the FIGs in general is that they’re social as well as academic experiences, and it’s apparent that I have neither the time nor the inclination to be very social. The high point of this aspect of the course was during the first few weeks, when I made sure to have coffee or lunch with all the students, in groups of 1-5. This was fun, and it was interesting to get some insights into their very different backgrounds, levels of comfort with the university, and experiences. Especially with respect to programming, the students ranged from ones who had never programmed anything prior to their concurrent computer science course to one who had held a job as a programmer. Aside from these chats, I did one social activity outside of class, a very short hike up Skinner Butte. (I had hoped for Spencer Butte, about an hour to a rocky summit with beautiful views, but the logistics of transportation foiled us.) A few students came, along with my kids; it was a nice walk on a sunny Sunday afternoon.

Third, the demographics of the FIG weren’t really what I was aiming for. The FIG connects my Physics of Life course with the introductory computer science class; students in the FIG are enrolled in both these courses. The intended audience of the Physics of Life class is non-science-major undergraduates. Introductory computer classes, at UO and elsewhere, are attracting sharply increasing numbers of students (see here) with a very wide range of interests. Therefore I was hoping for the same diverse assortment of students in the FIG — people interested in majoring in history, or political science, or art, etc. Instead, eighteen out of twenty in the course were intended computer science majors! They were a great bunch, but they were not my target in terms of general education. One could argue that these students are precisely those who we should be introducing to quantitative biology, since the field very much needs them. I would agree with this, and if I were part of a quantitative biology program I might agree that this is part of my job. But I’m not.

Overall, I don’t intend to teach the seminar again in the near future, though I could imagine happily revisiting it someday. In case anyone plans similar courses, hopefully the thoughts noted here are of some use; feel free to email me for more details. The topic of mathematical and physical modeling of biological systems is fascinating, and it is certainly one that more students, especially early in their undergraduate careers, should be exposed to!

Notice how I’m transferring knowledge, for free

I’ll finish a “real” post soon: Part 2 of my recap of a freshman seminar course on models. (I made the painting for it already!) But, since I’ve written about funding issues in science before (e.g. here), I can’t resist a small post on a proposed new NIH funding program.

There is, of course, a lot of concern about low levels of grant funding, overpopulation of scientists, etc. The last thing one would expect to read is a serious proposal, by the NIH, that it should fund more “emeritus” investigators (i.e. very senior people). But, here it is. The idea is that the program would help these researchers “transition out of a position that relies on funding from NIH research grants” and “facilitate the transfer of their work, knowledge and resources to junior colleagues.” I had to check my calendar to see if April 1 had come up without my noticing. I could point out that “transitioning” is easy to accomplish by not applying for grants, or by collaborating with other researchers. I could also point out that transferring knowledge is what one should be doing already, as a part of a university. But all these points and more are well made in scores of scathing comments at the NIH site. Even better is a brilliant takedown at the “Complex Roots” blog — I highly recommend it.

Modeling Life (a freshman seminar) — Part 1

[Image: watercolor of a fig]

Last term I taught a small freshman seminar called “Modeling Life,” on ways of looking at biology through the lens of physical and computational models. It was part of the university’s “Freshman Interest Group” (FIG) program, in which one creates small seminars that connect two regular courses that each student in the FIG takes. This was the first time I’ve taught a FIG, and I proposed linking my Physics of Life course with the introduction to programming course in the computer science department. Both are “100-level” general education courses, intended to reach a wide range of students. The FIG was an interesting experiment, and since I haven’t documented it anywhere else, I’ll write a little bit about it here. Though I don’t think my approach was great, there is definitely a need for more venues that expose students, at an early stage, to the intersection of biology, physics, and computation, and that convey what concepts like “modeling” mean, so perhaps some of the material I put here might be of use to some future course somewhere.

Motivations. The study of life is being revolutionized by the study of information. Thanks to DNA sequencing and other high-throughput ways of identifying biomolecules, we have troves of data on genomes, gene activity, protein interactions, and more. Thanks to advances in imaging, sensing, and tracking, we can acquire huge amounts of information about form and structure. In itself, all this data doesn’t provide any insights into how living things work. For this, we need models (simplified representations of reality that highlight key features) as well as tools that let us navigate through data. The goal of this course was to reveal the existence of these broad themes in contemporary science. It was a non-technical course, focused more on this landscape than on particular paths through it.

I tried to use lots of examples related to bacteria, because of their importance, because of their utility in illustrating biophysical and “systems biological” concepts, and because of my own interests in bacterial communities (see also here).


“Dry” biology. The term began with some readings about the gut microbiota, as well as a short piece from Science on “Biology’s Dry Future” [1], describing how, via new computational methods, “researchers are making fundamental discoveries without ever filling a pipette, staining a cell or dissecting an animal.” The existence of this mode of research was completely alien to all the students, disjoint from any conception of biology that they had gotten in high school, and they were surprised and excited by it.

The Physics of Penguins. To introduce the concept of models, we discussed a paper, “The origin of traveling waves in an emperor penguin huddle” [2], in which a group of physicists explain how waves of jostling motions propagate through a group of penguins. The paper illustrates the idea of creating simple, tractable models that capture the essence of a phenomenon, and helps set up the idea of agent-based simulations. I’m not actually very fond of the penguin paper: in the twenty-first century no one should be at all surprised that collective phenomena like waves can emerge from simple objects with nearest-neighbor interactions, and we shouldn’t need to run a computer simulation to realize this. Originally, I planned to discuss this and other criticisms, but in the end abandoned this in the interests of time and simplicity.

Analytic and Numerical Approaches (and growth and disease). Thinking about agent-based models led us to models for diseases and epidemics and, more broadly, the distinction between analytic models and numerical simulations. We discussed the advantages and disadvantages of writing a simulation, and the “art” of figuring out which problems are amenable to analytic solution and which require brute force. I started with a simple example: figuring out the average of random numbers uniformly distributed between -10 and 10, which one could determine by simulation but which, we all realize, is trivial to figure out just by thinking. We then moved on to examples in which it’s less obvious that there are “clever” exact solutions, for example logistic growth: a population with some growth rate dependent on the present value of the population (giving exponential growth) and on some sort of constraint on the overall carrying capacity of the environment (giving a stable ceiling to the population). It’s easy to see how to simulate this; it’s not readily apparent that there’s a nice analytic solution to the population as a function of time. This also let me discuss my lab’s research on microbial growth, which we returned to several times.

Noisy gene expression. The theme of simulating versus exactly knowing what form a model takes led to a discussion of a few pieces of a beautiful paper from Ido Golding & colleagues, “Real-Time Kinetics of Gene Activity in Individual Bacteria” [3], which illustrates both approaches. We talked about genes, and asked what gene expression ‘looks like’ at the level of the actual, physical molecules involved. How can the same genes lead to different outcomes, and how can we think about randomness and predictability in systems of genes? These questions could easily fill a whole term; we were very superficial. Still, the discussion achieved its aim of conveying that there’s a far greater depth to how genes act than is even hinted at in cartoons from introductory biology books, and that quantitative ideas about physical processes are helping us explore these areas.

Sequence data and inference. It is fascinating that, in an age when genes and DNA sequencing are referred to everywhere, most people have no idea of the computational challenges involved in figuring anything out from sequence-based data. To illustrate this, I sketched the basic idea behind a neat approach by Curtis Huttenhower and colleagues [4] to infer the genes contained in microbial communities given sparse information about what species are present. First, we discussed an analogy: suppose you knew how to say “seven” in several Indo-European languages, but not in Italian. How would you try to predict what ‘seven’ in Italian would be? This was fun to discuss, and again, the existence of algorithmic challenges like this in biology was a complete surprise to everyone in the class.

Image analysis. Another key role for computation is in getting data in the first place (often very large amounts of data). This is something my lab deals with a lot, in the context of microscopy and imaging. We took everyone to my lab, and looked with our microscopes at bacteria swimming, and ogled arrays of mirrors and lasers. We also discussed some basic themes of image analysis, starting by asking “what is a digital image?” (A few people could give a decent answer; most could not.) Given that an image is an array of numbers, how, for example, can we identify objects like cells?

Throughout several of the topics, we visited and revisited questions like: What is a model? and What is modeling good for? I’ll leave it to the reader to supply answers.
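For readers who want to see the simulation-versus-analytic contrast concretely, here is a minimal sketch of the logistic growth example mentioned above. It assumes the standard textbook form dN/dt = rN(1 − N/K), with growth rate r and carrying capacity K; the parameter values and function names are arbitrary choices for illustration, not anything taken from the class or from my lab's data.

```python
import math


def logistic_analytic(t, n0, r, k):
    """Exact solution of dN/dt = r*N*(1 - N/K) with N(0) = n0."""
    return k / (1 + (k - n0) / n0 * math.exp(-r * t))


def logistic_euler(n0, r, k, dt, t_end):
    """Brute-force simulation: step the same equation forward in small time steps."""
    n = n0
    for _ in range(round(t_end / dt)):
        n += r * n * (1 - n / k) * dt
    return n


# Arbitrary illustrative parameters (not measured values).
n0, r, k = 10.0, 1.0, 1000.0
t_end, dt = 5.0, 0.001

simulated = logistic_euler(n0, r, k, dt, t_end)
exact = logistic_analytic(t_end, n0, r, k)
print(f"simulated N({t_end}) = {simulated:.2f}")
print(f"analytic  N({t_end}) = {exact:.2f}")
```

The simulation needs thousands of small steps and a choice of step size, while the analytic form gives the population at any time in one line; on the other hand, the simulation is easy to modify when the model changes in a way that has no neat solution.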

Next time…

In this post, I haven’t said anything about who the students were, what projects and assignments we had, how the course went, and whether I’ll teach it again. I’ve already spent far too much time writing, though, so all this will have to wait until Part 2. Stay tuned!

Update: I’ve written Part 2.

[1] R. F. Service, Biology’s Dry Future. Science 342, 186–189 (2013). http://www.sciencemag.org/content/342/6155/186.summary
[2] R. C. Gerum et al., The origin of traveling waves in an emperor penguin huddle. New J. Phys. 15, 125022 (2013). http://iopscience.iop.org/1367-2630/15/12/125022/article; see also http://www.nature.com/nature/journal/v505/n7483/full/505265e.html
[3] I. Golding, J. Paulsson, S. M. Zawilski, E. C. Cox, Real-Time Kinetics of Gene Activity in Individual Bacteria. Cell 123, 1025–1036 (2005). http://www.cell.com/cell/abstract/S0092-8674%2805%2901037-8
[4] M. G. I. Langille et al., Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat. Biotechnol. 31, 814–821 (2013). http://www.nature.com/nbt/journal/v31/n9/full/nbt.2676.html

Jetpacks? No. Flying cars? No. Thousands and thousands of emails? Yes. That’s 2014.

Correction Jan. 2, 2014: My ‘received email’ count was actually the sum of the number of sent and received emails. (My ‘sent’ folders are inside my local received folders, hence the double-counting.) I’ve fixed the numbers and graphs.

I’m far from the first person to note that emails are an incessant and almost overwhelming burden. There are so many of them, and it feels like the number is ever-increasing. Is it really? I noted the number of emails I’ve received and sent in each of the past eight years, and the answer is clearly yes:

[Figure: emails received and sent per year (corrected)]

The ‘received’ number doesn’t even count emails I immediately delete (spam from predatory journals, nonsensical conference announcements, shotgun book recommendations from Amazon, etc.), but only those that I save into folders. As you can see from the graph, this number crossed 9000 for the first time in 2014. Nine thousand! I sent close to 6000 emails, another personal record. Of course, I’m sure that there are lots of people who deal with many more emails than this. I’m not making any general claims, just thinking here about the trend illustrated above and what I can do about it. (You may wonder: Why was there a slight dip in 2013? I was on sabbatical Winter and Spring terms of that year; apparently I wasn’t very inaccessible.)

Why does this matter? Obviously, all these emails take time to deal with. How much? Most take about 2 minutes each (yes, I measured this), though many take much longer. Even as a severe underestimate, therefore, 9000 emails consumed about 300 hours of my time in 2014.

There are of course two ways to reduce this number. One is to get fewer emails, and the other is to spend less time reading and responding to them. The emails are just markers of some request or announcement that, in the days before email, would have been conveyed by some other medium or not conveyed at all. It’s hard to see how this number can be lessened, though I suppose if I respond to fewer emails, people might take the hint and stop writing to me.

The second issue is more interesting. It’s probably hard to spend much less time per email than I presently do, but, thinking about this a lot and reading various articles, I’ve realized that the problem with email is not just these two minutes (or whatever), but rather the mental effort involved in getting to the two minutes in the first place. This isn’t an original thought. As written at the “Inbox zero” site, for example:

Just remember that every email you read, re-read, and re-re-re-re-re-read as it sits in that big dumb pile is actually incurring mental debt on your behalf. The interest you pay on email you’re reluctant to deal with is compounded every day and, in all likelihood, it’s what’s led you to feeling like such a useless slacker today. [Source: here and here.]

In other words, any email I don’t promptly deal with is one that I find myself thinking about again and again, each time it’s there in front of me. Even if it’s trivial to deal with — for example, some minor scheduling issue that would be solved if I had one more piece of input that I’m waiting for — this repetition is draining and distracting.

The solution is obvious and unoriginal (e.g. Inbox zero, noted above, and maybe The Tyranny of E-mail, which I read a very good excerpt from long ago): one should immediately process emails, reading them once and only once, and one should never “check” email (i.e. read without processing). I’ve tried this with mixed success in the past — it takes a lot of willpower to ignore the seeming urgency of email, and to resist the trivial sense of accomplishment that comes from reading or sending a message or two. And it does seem silly to consider it useful to “process” an email by moving the task it demands from my inbox to some other list of things-to-do. But it is useful, I think, since it separates the confusing dual roles of email as a source of new items and a list of existing tasks.

It’s clear that I need to muster the fortitude to deal better with email, saving myself both time and mental energy that could be better put to more productive, and more enjoyable, ends. That’s the task for 2015. My first tactic: to not look at my Inbox at all between about 7.30am (i.e. when I leave for work) and 11.30. Will angry mobs come track me down? Will I miss the news of important events? Probably no, and no. I feel more relaxed already. Happy 2015!