How do I hate p-values? Let me count the ways…

[Note: a long post of interest only to people who care about data analysis and bad statistics, and maybe about the distant stars influencing your life.]

By now, we should all be able to list the many reasons that p-values (or null-hypothesis-significance-testing, NHST) are awful:

  • that “statistical significance” has nothing to do with effect size or actual significance
  • that “p < 0.05” tells one nothing about the likelihood of a scientific hypothesis being true
  • that “statistical significance” is simply a measure of measurability, or sample size
  • that the p-value is itself a stochastic variable and is therefore especially unreliable for small sample sizes

… and so on. From a recent news piece in Nature:

P values have always had critics. In their almost nine decades of existence, they have been likened to mosquitoes (annoying and impossible to swat away), the emperor’s new clothes (fraught with obvious problems that everyone ignores) and the tool of a “sterile intellectual rake” who ravishes science but leaves it with no progeny[3]. One researcher suggested rechristening the methodology “statistical hypothesis inference testing”[3], presumably for the acronym it would yield. — Regina Nuzzo, 12 February 2014.

Yet another problem with p-values / NHST, one that struck me during a really terrible talk I endured at a recent conference, is that p-values promote nonsensical binary thinking. What do I mean by this?

A sort-of-made-up case study

Consider, for example, some tumorous cells that we can treat with drugs 1 and 2, either alone or in combination. We can make measurements of growth under our various drug treatment conditions. Suppose our measurements give us the following graph:



… from which we tell the following story: When administered on their own, drugs 1 and 2 are ineffective — tumor growth isn’t statistically different from that of the control cells (p > 0.05, two-sample t-test). However, when the drugs are administered together, they clearly affect the cancer (p < 0.05); in fact, the p-value is very small (0.002!). This indicates a clear synergy between the two drugs: together they have a much stronger effect than each alone does. (And that, of course, is what the speaker claimed.)

I’ll pause while you ponder why this is nonsense.

Another interpretation of this graph is that the “treatments 1 and 2” data are exactly what we’d expect for drugs that don’t interact at all. Treatment 1 and Treatment 2 alone each increase growth by some factor relative to the control, and there’s noise in the measurements. The two drugs together give a larger, simply multiplicative effect, and the signal relative to the noise is higher (and the p-value is lower) simply because the product of 1’s and 2’s effects is larger than each of their effects alone.

I made up the graph above, but it looks just like the “important” graphs in the talk. How did I make it up? The control dataset is random numbers drawn from a normal distribution with mean 1.0 and standard deviation 0.75, with N=10 measurements. Drug 1 and drug 2’s “data” are also from normal distributions with the same N and the same standard deviation, but with a mean of 2.0. (In other words, each drug enhances the growth by a factor of 2.0.) The combined treatment is drawn from a distribution of mean 4.0 (= 2 × 2), again with the same number of measurements and the same noise. In other words, the simplest model of a simple effect. One can simulate this ad nauseam to get a sense of how the measurements might be expected to look.

Did I pick a particular outcome of this simulation to make a dramatic graph? Of course, but it’s not un-representative. In fact, of the cases in which Treatment 1 and Treatment 2 each have p > 0.05, over 70% have p < 0.05 for Treatment 1 × Treatment 2! Put differently, conditional on each drug alone having an “insignificant” effect, there’s a more than 70% chance of the two together having a “significant” effect not because they’re acting together, but just because multiplying two numbers greater than one gives a larger number, and a larger number is more easily distinguished from 1!

(Of course, I could have made different assumptions about noise, made up different effect sizes, etc. The reader can feel free to play with this…)
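For anyone who does want to play with this, here is a minimal sketch of the simulation described above, using only the Python standard library. The means, standard deviation, and N are the ones given in the text. Everything else is my assumption for illustration: the function names, the number of trials, the choice of a pooled-variance two-sample t-test, and the use of the tabulated two-sided 5% critical value for 18 degrees of freedom (about 2.101) in place of computing p-values explicitly.

```python
import math
import random

# Two-sided 5% critical value of Student's t with 18 degrees of freedom
# (two groups of N=10). |t| > T_CRIT corresponds to p < 0.05.
T_CRIT = 2.101


def sample(mean, sd, n, rng):
    """Draw n normally distributed 'measurements'."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for two equal-sized groups."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n - 1)
    # For equal group sizes the standard error simplifies to sqrt((va + vb) / n).
    return (mean_a - mean_b) / math.sqrt((var_a + var_b) / n)


def simulate(trials=20000, n=10, sd=0.75, seed=0):
    """Count trials where each drug alone is 'insignificant', and how many
    of those show a 'significant' effect for the combination."""
    rng = random.Random(seed)
    both_insignificant = 0
    combo_significant = 0
    for _ in range(trials):
        control = sample(1.0, sd, n, rng)
        drug1 = sample(2.0, sd, n, rng)
        drug2 = sample(2.0, sd, n, rng)
        combo = sample(4.0, sd, n, rng)  # purely multiplicative: 2 x 2
        if (abs(t_statistic(drug1, control)) < T_CRIT
                and abs(t_statistic(drug2, control)) < T_CRIT):
            both_insignificant += 1
            if abs(t_statistic(combo, control)) > T_CRIT:
                combo_significant += 1
    return both_insignificant, combo_significant


both, combo = simulate()
print(f"trials with both single drugs 'insignificant': {both}")
print(f"fraction of those with a 'significant' combination: {combo / both:.2f}")
```

Changing `sd`, `n`, or the group means in `simulate` is the quickest way to see how sensitive the conditional fraction is to the made-up assumptions. A different test (Welch's, say) would need a different critical value.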

The real point here, though, is not that one must be aware of the subtleties of p-values, or that one should simulate one’s data (both true), but that one shouldn’t make binary (true/false) statements about data. Sadly, I see this all the time in biology — a perverse desire to say “this treatment has an effect,” or “… doesn’t have an effect”, rather than “the magnitude of the effect is [x +/- y]”. Why is this so terrible?


On binary statements

First, because “binary” statements can’t be sensibly combined. If this were computer science, or mathematical logic, they could — we all know how to apply ANDs and ORs and NOTs, etc., to boolean variables — but for measurements with noise, especially poorly characterized noise, this is hopeless.

Second, it is almost never necessary to combine boolean statements. Only at the “end” of some scientific inquiry, deciding, for example, whether we use what we know to prescribe or not prescribe some drug, could we need a binary, true/false value. Before that, to build from one study to another, we necessarily lose information in going from a quantitative measure (like an effect size) to a binary one (p < 0.05, “there is an effect”).

Third, everything always has an effect. Over a hundred years ago, Émile Borel showed that the motion of a gram mass a few light years away perturbs the trajectories of molecules in a gas enough to prevent prediction of their paths for more than a few seconds [1]. It’s absurd to think that anything exists in isolation, or that any treatment really has “zero” effect, certainly not in the messy world of living things. Our task, always, is to quantify the size of an effect, or the value of a parameter, whether this is the resistivity of a metal or the toxicity of a drug.

I can understand the temptation for binary statements — they make things simpler, and it would be easier to wade through the billions of papers published every week if we could distill them into blunt bullet points — but they’ll lead, necessarily, to nonsense.

It would have been interesting to discuss this with the speaker, but he was one of those who dash into a week-long meeting just before his talk and dash off immediately after it. Why chat with us when there’s cancer to be cured!

Today’s illustration: A pepper (watercolor and, mostly, colored pencil.) Based on an illustration from a book whose title and author I can’t remember. Peppers start with “p” — this is a true statement.

Update: Andrew Gelman nicely notes that he has written about similar issues of binary thinking. (I would bet that I read these long ago, and internalized their messages!) See:

The second post I think is particularly interesting; I disagree with a very interesting sentence in it, that “We live in an additive world that our minds try to model Booleanly,” but I won’t bother commenting for now.

[1] E. Borel, Le Hasard, 1914. The discussion is quite unclear. The topic has been dealt with many times; see e.g. L. Brillouin, Poincaré’s theorem and uncertainty in classical mechanics. Inf. Control 5, 223–245 (1962).

Graphs and Grading: Winter 2016


Assessing exams

I’m fond of analyzing the outcomes of exams I give. My favorite assessment is to look at how the score on each exam question correlates with the overall total score, and to plot this correlation coefficient versus the fraction of students who got that question correct. Roughly, I’d like the plot to turn out like the orange arc here:


Questions falling in the lower right show that most of the class understands the topic being asked, and of course I want a lot of the “core” material of a course to be followed by nearly everyone. The questions in the upper left are not so simple, but the fact that they correlate well with overall performance indicates that they’re probably not difficult because they’re opaque, but rather that they correctly, and usefully, discriminate based on students’ grasp of the topic. (This is similar to the “item discrimination index” commonly used by people who design tests, see e.g. this and this.) I hope to avoid questions that fall into the lower left corner (the red “X”) — questions that are answered correctly by a low fraction of students, and for which answering correctly isn’t correlated with answering other questions correctly. It’s likely that these questions, despite my best efforts, are misleading or otherwise flawed. Usually, a few things lie in this corner and, since this indicates bad questions, I generally toss these out of the overall exam scoring. (I tell this to the students.)

My correlation plot for the second exam of my “Physics of Energy and the Environment” course this past term, however, turned out wonderfully:


There’s nothing in the lower left! Technically, this doesn’t represent the entire exam — there were 24 multiple choice questions (indicated by the numbers on the plot) and 5 short answer questions, whose correlations I didn’t bother calculating. Nonetheless, I’ll conclude that I’m pleased by the above plot! I’ll also note that it’s very easy to do this assessment — I highly recommend it.
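To show how little is involved, here is a minimal sketch of the calculation in plain Python. The score matrix is invented purely for illustration (rows are students, columns are questions, 1 means correct), and the function names are mine. I'm also assuming that the "total score" each question is correlated against includes that question itself.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either input has no variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # e.g. a question everyone got right
    return sxy / math.sqrt(sxx * syy)


def item_analysis(scores):
    """For each question, return (fraction correct, correlation with total)."""
    totals = [sum(row) for row in scores]
    results = []
    for q in range(len(scores[0])):
        item = [row[q] for row in scores]
        results.append((sum(item) / len(item), pearson(item, totals)))
    return results


# Hypothetical scores: 6 students, 4 questions.
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]

for number, (fraction, corr) in enumerate(item_analysis(scores), start=1):
    print(f"Q{number}: fraction correct = {fraction:.2f}, correlation = {corr:.2f}")
```

Each (fraction correct, correlation) pair is one point on the plot. A question with a low fraction correct and a correlation near zero or below is a candidate for the lower-left corner.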

Assessing my course

Overall, this past term’s teaching went pretty well. “Physics of Energy and the Environment” is a general education course for non-science majors that I’ve taught before, and as always I try to incorporate a lot of “active learning” — lecturing as little as possible, and having students figure things out through guided questions and worksheets. (It takes energy, both from me and from the students, but it works well.) The syllabus is here.

My course evaluation scores for “course quality” and “instructional quality” were 4.3 out of 5.0 and 4.5 out of 5.0, respectively. Comparing to *all* general education Physics courses taught at UO since 2008, for which I’ve tabulated evaluation scores, three things pop out of the histogram. (I’m just plotting the “instructional quality” histogram, but all the categories are strongly correlated with each other.)


First, I’m doing pretty well (about 1.5 standard deviations over the mean in “instructional quality”). I put a lot of work into the class, and often ask myself whether I’m wasting my time and energy. Of course, the answer may well be “yes,” but at least by some metric, there does seem to be some positive outcome to it. Second, the mean score for the department is quite high. I haven’t compared to other departments, but I do think ours has a lot of people who care about teaching. (Given that tuition provides over 80% of the income of the university, it is stunning that there are people who do not!) Third, the dynamic range of the plot is very narrow. (The standard deviation divided by the mean is 0.1!) This holds, I think, in general — even courses that (I’m told) are awful tend not to score much lower than 3.0. Why is this? I really don’t know.

A final graph: Here’s the distribution of final grades in the course:


The bins indicate how many students got each final letter grade. The course isn’t curved — in principle, everyone could get an “A,” or everyone could fail. About 5 of the F’s are “real” — the rest are students who never showed up, or soon stopped showing up. (There are always some like this.) What “should” the distribution of grades look like? In a course like this, I really do think that everyone, or nearly everyone, in the class is capable of getting an A, but the realities of student motivation and diligence are such that many do not. (I think most students would agree — in the written evaluations, lots of people commented on how fair the grading and the structure of the course were.) I’d be curious to see the grade distributions for all UO courses, especially broken down by levels. There’s a lot more, in principle, that one could do with this, especially if one had access to data on students’ performance in other courses — are low grades in a gen-ed course like this one indicative of doing poorly in general in college, or are the low-performers saving their time and energy for “important” courses? I’d bet money on the former, and I’d also bet that addressing the existence of the poorly motivated and poorly organized would do wonders for the university, and for these students. But for now, this will have to remain a mystery.

Today’s illustration

Seed capsules from a “princess tree” (Paulownia tomentosa), again painted from a photograph in Trees up close (text by Nancy Ross Hugo, photos by Robert Llewellyn).

Scientific publishing for non-scientists


I’ll guess that most people reading this don’t believe in homeopathy, astrology, or the existence of lizardlike extraterrestrials that walk among us. This is probably not because any of us ourselves have researched these topics, but rather because we are unconvinced by their proponents, and also perhaps because these ideas have not managed to break into the body of what we consider the scientific literature.

The state of this literature is the topic of lots of hand-wringing at the moment, but largely missing from this, I’ll note, is a discussion of how the scientific literature informs what the general public thinks of as science and not-science. Before getting to that, let’s look at the question of why we publish scientific papers in the first place:

Why do we publish?

Here’s a non-exhaustive list, in no particular order.

  1. To communicate our findings to other scientists and also to the broader world
  2. To critique and discuss others’ work
  3. To establish priority of discoveries and techniques
  4. To document findings for the historical record
  5. To document our activities for administrative entities like funding agencies, promotion committees, thesis committees, etc.
  6. To generate proxies for research quality — based on things like what journal a paper is published in, for example. (I’m not claiming that this is a good thing…)
  7. To establish the validity of scientific findings. (Again, I’m not commenting for now on the validity of this goal!)

Problems with publishing

As mentioned, hardly a week goes by without some intense discussion of or essay about items 1 to 6. It’s very hard, for example, to publish corrections to or critiques of published work. Publishing takes a very long time, which gets in the way of sharing new results. Most published papers are wrong (at least in biomedical research). Everyone hates using journal impact factors or reputation as a proxy for research quality, but everyone does it anyway.

I could go on about all these, but what struck me recently is that I’ve encountered many discussions of items 1-6, but almost none about #7. When teaching science courses for non-science-majors, or interacting with the general public, how do we convey a notion of what “is” and “isn’t” science? It’s an important question with real-world consequences — we’re all aware of, and perhaps know, people who don’t vaccinate their children based on vague notions of danger, or who believe in homeopathic cures, or other such things. Why does the topic of communicating with the public so rarely come up in discussions of scientific publishing? In general, such conversations occur between scientists, and are moreover disjoint from conversations about teaching or interactions with non-scientists.

What is & isn’t science

Answering the italicized question above, of how we can convey to non-scientists a notion of what “is” and “isn’t” science, is difficult. There is, I think, a good but hard to implement answer, and a not-very-good but easier to implement answer.

Peer review as a marker

The less good / easy answer is that the framework of peer-reviewed publishing provides a way for non-scientists to know what’s reliable and what’s not. In other words, the scientific community’s assessment, “X is science,” is reflected in “X has been reviewed by other qualified scientists, and passed this test in order to appear in the scientific literature.” In general, this isn’t a bad correspondence, for all its flaws. Some random quack’s blog post on homeopathy is less likely to show signs of rigorous, logical testing than a peer-reviewed article.

One can certainly find statements in articles about communicating with beginning students and other non-experts that reflect this idea that peer-reviewed scientific publishing confers legitimacy, and that the existence of peer-review allows non-scientists to assess the credibility of sources of information. For example:

We developed a two-tier set of criteria for evaluating scientific literature that can be used in traditional and nontraditional learning environments, and can be presented to science majors and nonscience majors alike. … either it is published in an authoritative source or it is not. Authority is a measure of the reputation of the publication and the authors it publishes. We have found this to be too vague and have settled on peer review as the indication of authority. These are not statements of value, but they are designed to get students thinking about the nature and types of literature … From: Karenann Jurecki and Matthew C. F. Wander (2012) Science Literacy, Critical Thinking, and Scientific Literature: Guidelines for Evaluating Scientific Literature in the Classroom. Journal of Geoscience Education: 2012, 60, pp. 100-105.

It’s hard, however, to read the excerpt above without one’s skin crawling a bit. Every scientist knows that a lot of peer reviewed papers are awful, and their claims poorly justified. (Conversely, a lot of blog postings or other unorthodox writings are very good.) Methodological flaws, especially having to do with poor statistics and a lack of understanding of noise and randomness, are endemic in the peer-reviewed literature. High-profile journals routinely peddle splashy findings that oversell their data. Finally, there is fundamentally no good reason to think that “truth” is established by two or three random reviewers approving of a paper. (This randomness works the other way too, as all of us who have dealt with fine papers being rejected can relate.)

Trust no one!

The better but difficult answer to the italicized question is that there’s really no way around the necessity of evaluating claims with a critical eye, no matter where they appear. Peer review is no panacea. Fine, one says, what’s so hard about that? People who don’t routinely interact with non-scientists in an academic setting vastly overestimate, in my experience, the sophistication that the general consumer of media (articles, videos, etc.) brings to the information they’re presented with. Issues of noise, uncertainty, p-hacking, model fitting, etc., are high level concepts compared to basic features of quantitative thinking and logical inference that many people struggle with. Having spent time helping college students understand how many kilograms 100 grams is — admittedly, an atypical example, but a real one — it doesn’t surprise me that conveying deeper concepts takes a lot of time and effort. It’s doable, and it’s worthwhile, but it’s not simple. How, then, does it scale to asking that the general public critically evaluate everything they’re exposed to, essentially on their own? This, I think, is the challenge we need to address.

Scientific publishing: a lost opportunity?

Of course, one can respond that the public isn’t on their own — sources of reliability will emerge via popular consensus, elite “status,” or other magic. I am skeptical. And even if so, it seems tragic that by allowing the state of scientific publishing to decline to the extent that there are compelling reasons to abandon peer review altogether, the scientific community may be giving up the chance to provide a useful service to the general public. Put differently: if we really did do peer review well, it would benefit more than just scientists.

Today’s illustration

I painted this from a photograph, “Fruit of the beech tree,” in the beautiful book Trees up close (text by Nancy Ross Hugo, photos by Robert Llewellyn). Echoing the theme of a previous post, I found this randomly in our Art and Architecture library, where it was lying on the floor of an aisle.

When is an ethics course not an ethics course?

There seems to be a lot more discussion of ethics in scientific news and articles these days compared to the distant past (e.g. when I was a graduate student). This may be due to an increased complexity in the practice of science — issues like data sharing, for example, are more difficult than they used to be — or an increase in incidents of irreproducible results or actual fraud, or perhaps simple fashions about what’s worth discussing. Various funding agencies, notably the NIH and NSF, now require training in the “responsible conduct of research” (RCR) for graduate students funded by their grants. Though my research group and some of my colleagues’ have implemented ethics discussions in our group meetings, my department as a whole doesn’t have anything of this sort that all graduate students experience. (Other science departments here at Oregon do.) Thinking that this isn’t good, I (perhaps foolishly) volunteered to teach a graduate ethics workshop, which I’ll do next term together with another faculty member, in addition to our usual teaching tasks.

It’s interesting to think about what should go into such a workshop. One key thing I’ve realized is that it’s a mistake to think of the course as an “ethics workshop,” rather than a “workshop on topics in the responsible conduct of research.” Sadly, the latter is unwieldy. The former, though, causes problems, especially in communicating with colleagues. What’s the distinction, and what’s wrong with an “ethics workshop?”

First, I would argue that training in ethics per se is rather pointless. Nearly all of us know that lying, cheating, and stealing are bad, and the tiny fraction of people who don’t grasp this aren’t going to be convinced of the error of their ways by sitting in a classroom. I am reminded, in writing this, of the surreal form the university asks faculty to fill out each year about reporting grant activity and related things that essentially asks, “are you lying?” I showed this to my then-four-year old a few years ago; he recognized that the only possible answer, whether one is honest or dishonest, is “no.” (The kids and I used to discuss Knights and Knaves puzzles a lot…)

Second, the more generally applicable and interesting issues are those that aren’t as straightforward to map onto right and wrong. These are also issues that relate to the social, economic, and structural framework in which science is done. How do we handle data? How does publishing work? I’ll flesh out some examples below. In addition to being relevant to the practice of science, some knowledge about these issues at the start of one’s graduate training can help prevent conflict, frustration, or even the temptations of unethical behavior later on. Also, I’d argue, learning about the “landscape” of science is an important part of being a graduate student.

Referring to a course on RCR as an ethics course is a convenient shorthand, but I’ve learned that it causes confusion. It also, quite rightly, makes some faculty reluctant to support it, for the reasons noted two paragraphs above.


I’ve sketched several topics that would be worth discussing in this proposed RCR workshop. Here they are, with a little bit of commentary:

  • Data handling and management — What are our responsibilities with respect to preserving data, and also making it available to others? What do funding agencies and others say about this? What do we do, in practice, in an age of giant datasets? What distinguishes “raw data” from reduced data? This last question, by the way, is one that has provoked spirited discussion at microscopy conferences I’ve been to.
  • Data integrity — Can one justify throwing out “bad” data points? If so, how, or why? This is a difficult, and very common, question. It connects also to contemporary thoughts on fitting and data analysis; see e.g. this. This topic also spans the handling of images, and image manipulation.
  • Publishing and Authorship — How does the publication process work, and how is it changing? What are authorship criteria and roles, and what do various professional societies say about them?
  • Research Misconduct and Scientific Fraud — I.e. actual ethics! We should definitely look at case studies, of which there are lots of interesting ones! Arguably the most famous in physics is the story of Jan Henrik Schön.
  • Statistics and ethics — A lack of understanding (or misunderstanding) of statistics, coupled with poor experimental design, underlies the present proliferation of mediocre and irreproducible studies — see e.g. this, this, or this for some snippets of the relevant discussions. This phenomenon is fascinating. But what, one might ask, does it have to do with physics, which is relatively free of the dispiriting methodology that seems to plague, for example, sociology or epidemiology? So far, not much, thankfully. But (i) similar issues come up in physics, for example in the dodgy or delusional ways physicists tend to fit power-laws to everything; and (ii) I would expect issues of statistics and perilous data-mining to become more common in physics, as datasets grow in size and complexity. OK, one replies, but what does this have to do with ethics or RCR? It occurs to me, reading a lot of examples of bad science, that the practices employed are ethical (in the sense of being carried out with a sincere belief in their validity) only if one is ignorant of how to handle noise, uncertainty, and other quantitative aspects of data. But ignorance shouldn’t, of course, be a justification for bad science. Do we then have an ethical obligation to understand how to treat data? I haven’t seen this generally discussed, and it would be interesting to explore further. I’ll note that these are half-formed thoughts that may not make it into the course!
  • Ethical issues relating to environment, science policy, and law — (This one is from my co-teaching colleague.) What is the relationship between politically neutral science and areas of public policy that are closely connected to science (e.g. climate change)?
  • More things about how science is done — It’s useful to understand the landscape of science — the flows of money, people, etc. This affects graduate students quite directly, in topics like jobs, funding, etc., and it wouldn’t hurt to have some exposure to it. As I often do, I’ll note Paula Stephan’s excellent “How Economics Shapes Science” as a resource on this.


The structure of this workshop is still to be determined. The challenges are (i) to satisfy the dictates of the funding agencies, which are very vague, (ii) to make it worthwhile for students, (iii) to avoid taking up too much of research-active students’ time, and (iv) to avoid taking up too much of my time. My own preference is to have weekly 1 hour meetings, not occurring in the middle of the day, for some number of weeks between 5 and 10. Various faculty have spoken in favor of more or less time. I view the Spring launch of this workshop as an experiment — we’ll see what happens!

The workshop itself should be mostly discussion based. There are good readings on most of these topics, e.g. this available free from the National Academies.

Today’s illustration…

…is a kestrel I painted a few weeks ago, shortly after spotting both a kestrel and a bald eagle (not together) on my bike ride to work one morning. The eagle was surveying the Willamette River. The kestrel was standing in the middle of a road, devouring some smaller creature.

A random walk through bookshelves — books and movies 2015

A few years ago, after too many instances of starting a book and then realizing that I’d read it before, I began to keep a list of the books I’ve read, making a brief note in it each time I finish something. The list makes it easy to look back on what I’ve read in the past year. Today, on New Year’s Eve, I’ll write a quick post on my favorites of 2015. It doesn’t really fit in with the general themes of the blog, though there is a bit of science in it, and some thoughts on randomness.



Out of 21 books, it’s surprisingly easy to pick my favorite for this year: Your Republic Is Calling You by Young-Ha Kim (2010). It’s a novel about a North Korean spy, living a normal life for many years in South Korea, who is suddenly called back to the North. It gets a surprisingly low average rating on Goodreads (3.5/5.0), perhaps because most people want their spy novels to be action-packed and thrilling. This one is not. Rather, what’s striking about it is its depiction of a possibly sudden end to an ordinary life. Plus, its scenes of North Korea are fascinating and chilling, like seemingly everything about North Korea.

Runners up:

City of Tiny Lights by Patrick Neate (2006). A modern noir with a Ugandan-Indian-British private eye, investigating a political murder. It’s funny, clever, and fast, though it becomes annoyingly implausible in its last quarter.

Serious Men by Manu Joseph (2010). I don’t often read fiction about science because (i) there isn’t much of it, (ii) it’s often bad, and (iii) I spend enough time thinking about science. I picked this one, though, because it’s Indian and because the cover is neat (see below). It’s a cynical and funny novel about scientists, social dynamics, and more. Its characters are too caricature-ish to take the top spot, but it was nonetheless enjoyable. Its depiction of the culture of science, especially “big” science, is remarkably good, and free of the stilted and artificial characterizations of how science works that one usually finds. I noted these lines, which I particularly like: “… he stared at the ancient black sofa. Its leather was tired and creased. There was a gentle depression in the seat as though a small invisible man had been waiting there forever to meet Acharya and show him the physics of invisibility.”


These three books have something in common: I picked them all by randomly browsing the bookshelves at the University library! (There’s an excellent “popular reading” section that I like to look at.) I hadn’t heard of any of them before, or searched for them, or had an algorithm from Amazon recommend them to me. There’s a lot to be said, I think, for random discovery, especially if one wants to find things one didn’t know existed, rather than refinements of things one already knows.


My favorite out of 13 non-fiction books is a very new one: The Planet Remade: How Geoengineering Could Change the World, by Oliver Morton (2015). I read this in the past few weeks, mainly because I’m teaching Physics of Energy and the Environment this term — a course for non-science majors that I’ve taught before — and felt that its topic is one I should explore further. It’s a brilliant book about geoengineering: scenarios, methods, concerns, and more. It’s thoughtful, thorough, and beautifully written. I could write more, but I might turn this into its own blog post.

A very close second is Sahara Unveiled: A Journey Across the Desert by William Langewiesche (1997), about the author’s travels starting from Algeria, south through the Sahara, and west to Mali. It has a wonderful and thoughtful mix of descriptions of the natural landscape and of the remarkable, sometimes inspiring, and sometimes dispiriting people and societies he encounters along the way. Science comes up in a few spots, both directly — there’s a charming section on Ralph Bagnold, a giant in the study of sand dunes — and indirectly, when the author is stranded amid ancient rock art that depicts the rich wildlife the Sahara used to contain, before it became a desert, a topic discussed by Morton as well.

Kids’ books

If I were to travel back and visit my 2005 self, I would suggest that he note down books read with his kids, of which there are a lot of great ones, and which he has trouble remembering. (They aren’t on the present list.) Certainly one highlight of the past year was finishing the 42-book comic book version of the Indian epic, The Mahabharata, with my six-year-old. It’s not surprising that it’s such an enduring story: it’s fascinating, and full of ethical quandaries. There’s apparently a new prose retelling that gets good reviews.

We’ve also read a lot of Asterix comics, which I never knew when I was a kid. They’re great. Perhaps as a result, my six-year-old has become very fond of ancient civilizations, Rome in particular. There are a lot of very good kids’ books on the topic, such as Rome: In Spectacular Cross-Section, which have been fun to read.


Almost all of my wife’s and my movie watching is via Netflix, whose selection (on physical DVDs) is thankfully vast. The best movie seen this year, out of 16, is the appropriately titled “We Are the Best!” (2013), about a trio of 13-year-old girls in Sweden who form a punk band. It’s charming, funny, clever, and uplifting without being at all sappy.

Runners up: All is Lost (2013), An Education (2008), Nobody Else But You (2011). The last of these is perhaps the strangest of the three, a French mystery about a dead small-town starlet whose life mirrored that of Marilyn Monroe.

I can’t think of any deep insights to convey about these movies, or anything that touches on biophysics or science or anything else I usually write about. I should, I suppose, note that none of these movies were found by random browsing; rather, I found them all through Netflix’s recommendation algorithm. Make of that what you will…

Overall, it was a great year for both books and movies, revealing many new worlds that I wouldn’t have otherwise imagined. We’ll see what 2016 brings.

Happy New Year!

Recap of a graduate biophysics course — Part II

I’ll continue describing a graduate biophysics course I taught in Spring 2015. In Part I, I wrote about the topics we covered. Here, I’ll focus on the structure of the course (books, assignments, in-class activities, and the students’ final project) and note what worked and didn’t work. (What didn’t work: popsicle sticks.) Click for the syllabus.

My overall learning goals for the course were that students would be able to

  • …understand the physical principles that underlie important biological phenomena such as DNA packing, bacterial motion, membrane deformations, and signaling circuits.
  • …apply statistical and statistical-mechanical ideas to a wide variety of complex systems.
  • …read contemporary papers in biophysics and follow the aims and general approach.

How does one get there?


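As I’ve written before, we are fortunate to live in an age in which there are good biophysics textbooks. Most notably,

  • Biological Physics, by Philip Nelson
  • Physical Biology of the Cell, by Rob Phillips, Jane Kondev, Julie Theriot, and Hernan Garcia
  • Biophysics: Searching for Principles, by William Bialek
  • Physical Models of Living Systems, by Philip Nelson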

The first two are “standard” biophysics texts in that they explore the statistical mechanics, electrostatics, and mechanics of DNA, proteins, membranes, and other cellular components, as well as the interplay of forces that control micro-scale biological interactions. Both are excellent books! I felt it would be useful to have an assigned textbook for the course, both for students to refer to, and to make it easier to have reading assignments that freed class time for more specialized discussions and activities. I chose Nelson’s Biological Physics book, mainly because it is more concise and “linear” in its progression of topics. (I did, however, distribute some excerpts from Physical Biology of the Cell, especially on DNA mechanics.) I was a bit worried that Nelson’s book would be too simple, since it’s geared towards undergraduates as well as graduate students, but this wasn’t a problem. The exercises aimed at graduate students are very good, and the straightforward nature of the book helped us move quickly, and gave us material to build on in class.

Bialek’s book is quite different, focusing on noise and signal processing, and the principles underlying things like gene expression, vision, chemotaxis, etc. I took bits from it, on photon detection in vision and on chemotaxis, which are excellent. Overall, it would be interesting to structure a course around this book, but one would miss out on the “mechanical” aspects of biophysics (DNA rigidity, membrane dynamics, etc.), and also on much of the variety that exists in the cellular world; I think these things are crucial for an introductory biophysics course. I must also point out that Bialek’s book is quite difficult — it takes a lot of thought to follow it, which is certainly fine, but that would place severe constraints on a ten week introductory course.

Physical Models of Living Systems is a fascinating book, and I extracted several pieces of it for the section of the course on “Cellular Circuits” (See Part I.) It would be great to make a course focused solely on the topics of this book, embellishing it with more discussion of experimental methods, but this also would take us away from “central” themes in biophysics. Still, it’s an excellent book. (I was fortunate to read and evaluate it before it came out! It’s nice to see my name in print in the acknowledgements!)

I also made use of various parts of Howard Berg’s classic Random Walks in Biology, on diffusion as well as bacterial strategies for motion (runs and tumbles, etc.).


I assigned weekly problem sets, which were usually a mixture of exercises from Nelson’s book and questions I either wrote myself or took from other sources. For an example, see Homework #4. Several of the problems required writing computer simulations, which is an extremely useful skill to practice. In general, the homework assignments went very well. Students noted, however, that the difficulty and time required were very inconsistent between problem sets. I am not surprised by this.

In-class activities

I wanted to integrate active learning into the class: lecturing as little as possible, since it’s a poor way to convey understanding, and having lots of occasions for students to think and do things in class. I do this a lot in my general-education undergraduate classes, but it’s less common in higher level classes.

Quite often, I asked students, either on their own or in a group of two or three, to figure out something that would either lead into a more detailed analysis, or that would in itself illustrate the implications of some physical concept. As an example of the former, we examined how fluorescence correlation spectroscopy (FCS), which measures the intensity fluctuations that result as fluorescent molecules diffuse in and out of a microscope’s focal volume, can yield the diffusion coefficient of the molecule. Lazily pasting a sketch:



Ignoring uninteresting numerical factors, we can figure out how the characteristic features of the autocorrelation curve (g(𝜏)), namely the value of g(0) and the location of the inflection point, depend on things like the concentration of molecules, the focal volume, and the molecules’ diffusion coefficient. So, I asked students to do this, which went quite well. Then, I went through the derivation of the exact expression for g(𝜏) — a long slog, which strengthened my resolve not to fill our class time with tedious and unenlightening things like this.
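To make these scalings concrete, here is a minimal sketch assuming the simplest textbook case: free diffusion in two dimensions through a Gaussian focal spot of waist w, for which g(𝜏) = (1/N) / (1 + 𝜏/𝜏_D) with 𝜏_D = w²/(4D), where N is the mean number of molecules in focus. Neither this particular form nor the numbers below come from the class notes; the function name and parameter values are made up for illustration.

```python
def fcs_autocorrelation(tau, mean_number, beam_waist_um, diffusion_um2_per_s):
    """2D free-diffusion FCS autocorrelation (assumed form, Gaussian focus)."""
    # Characteristic time for a molecule to diffuse across the focal spot.
    tau_d = beam_waist_um**2 / (4.0 * diffusion_um2_per_s)
    # Amplitude 1/N: fewer molecules in focus means larger relative fluctuations.
    return (1.0 / mean_number) / (1.0 + tau / tau_d)


# Hypothetical numbers: 5 molecules in focus, 0.3 um waist, D = 50 um^2/s.
N_MEAN, WAIST, D = 5.0, 0.3, 50.0
TAU_D = WAIST**2 / (4.0 * D)

g0 = fcs_autocorrelation(0.0, N_MEAN, WAIST, D)
g_half = fcs_autocorrelation(TAU_D, N_MEAN, WAIST, D)
print(f"g(0) = {g0:.3f}, g(tau_D) = {g_half:.3f}, tau_D = {TAU_D:.2e} s")
```

In this model g(0) depends only on the mean number of molecules in the focal volume (concentration times volume), while the time at which g falls to half its initial value is set by w²/D, which is how the curve yields the diffusion coefficient.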

More interesting, and somewhat related, examples came from our investigations of the physical constraints on bacterial chemotaxis. If we consider a bacterium capturing nutrients that surround it at some mean concentration c_0, and “measuring” the number it has acquired over time t_m, how accurately can it measure c_0? This is an important question, since the bacterium will “want” to migrate to regions of higher nutrient concentration, and so will need to know whether a perceived increase in food abundance is real, or just a statistical fluctuation due to the randomness of diffusion. I asked students to figure this out, which not only helped really cement ideas of noise and fluctuations (more so than just hearing me state them) but also led to more interesting questions like: since the measurement accuracy increases with increasing t_m, why shouldn’t the bacterium just increase t_m arbitrarily? What physical limits constrain the measurement time?
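The counting argument can be checked with a short simulation. This is a sketch under stated assumptions, not the exercise as given in class: the cell is treated as a perfectly absorbing sphere of radius a, so molecules arrive at the diffusion-limited rate 4πDac_0, arrivals are Poisson distributed, and all parameter values are hypothetical.

```python
import numpy as np


def capture_rate(diffusion_um2_per_s, radius_um, conc_per_um3):
    """Diffusion-limited capture rate of a perfectly absorbing sphere: 4*pi*D*a*c0."""
    return 4.0 * np.pi * diffusion_um2_per_s * radius_um * conc_per_um3


def relative_error(t_m, rate, n_trials, rng):
    """Spread of concentration estimates built from Poisson capture counts."""
    counts = rng.poisson(rate * t_m, size=n_trials)
    estimates = counts / (rate * t_m)  # each trial's estimate of c / c0
    return estimates.std()


rng = np.random.default_rng(1)
# Hypothetical values: D = 1000 um^2/s, a = 1 um, c0 = 0.6 molecules/um^3 (about 1 nM).
rate = capture_rate(1000.0, 1.0, 0.6)
err_short = relative_error(0.1, rate, 20000, rng)
err_long = relative_error(0.4, rate, 20000, rng)
print(f"t_m = 0.1 s: {err_short:.4f}  (1/sqrt(N) = {1 / np.sqrt(rate * 0.1):.4f})")
print(f"t_m = 0.4 s: {err_long:.4f}  (1/sqrt(N) = {1 / np.sqrt(rate * 0.4):.4f})")
```

The fractional uncertainty tracks 1/sqrt(N), with N the mean number of molecules captured, so quadrupling t_m only halves the error.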

Contemporary papers

We discussed a lot of contemporary papers in class. Just to give a few examples: after covering FCS above, we looked at a remarkable paper from Carlos Bustamante’s group in which the authors measured the enhanced diffusion of an enzyme due to the heat released from its chemical activity. We spent quite a while discussing experiments from Kazuhiko Kinosita’s group on the workings of ATP synthase, a fascinating rotary molecular motor (e.g. this and this), which involved questions like “How can we relate the observed motion of objects attached to the protein complex to the work done by the complex?” and “How close is ATP synthase’s performance to fundamental thermodynamic limits on machines?” We looked at several examples of cellular circuits, including one that relates to my lab’s interests in gut microbial activities. In general, using class time to discuss contemporary research went very well, both in terms of being interesting in itself, and for illustrating the connection between topics of the course and actual research activity.

Discussing readings

I also had students read and briefly comment on sections of the textbook, or other readings, in class. This not only freed class time — i.e. not having to examine in detail things that are well-explained elsewhere — but also gave me a good sense of how well students understood things. In retrospect, I should have done more of this. I didn’t in part because it takes a good amount of advance planning to map out exactly what students should present on, and in part because I was perhaps too used to lower-level undergraduate courses in which students don’t do very well with exercises that require independent reading and thinking. It was very liberating to teach a graduate course — all the students actually read and think!

What didn’t work: popsicle sticks

Overall, designing the course to be very active was great — I think students learned a lot, and it made the course lively and fun to teach. I would definitely run the class similarly the next time I teach it.

The one major failure of my active learning approach was my tactic for getting a variety of voices in class: writing each student’s name on a popsicle stick, and picking a stick at random to call on someone to answer a question. I took this idea from an activity I did last year with my older son’s fourth grade class, where it worked wonderfully. Here, however, it was a disaster. Despite everyone in principle accepting the idea that it’s fine to say wrong things, or respond to questions with further questions about what’s unclear, students really didn’t like being put on the spot by random forces. I distributed a mid-term evaluation to get feedback on how the course was going; the feedback was very positive with the near-universal exception of the popsicle sticks! I acquiesced, therefore, and got rid of them. I’m not completely happy with this: without the random selection, some of the quieter students very rarely spoke up, and it would have probably helped them, unpleasant though it may be, to practice being more outspoken.

Final Project

The students each did a final project, for which my goals were that they

  • Learn more about biophysical topics.
  • Practice constructing a research question.
  • Think about experimental design, and how it relates to the questions we ask.

In other words: I wanted students to build on their understanding of the topic of biophysics, but also enhance their understanding of how biophysics progresses. Each student gave a 15 minute (+ questions) presentation in class that covered the background of their chosen topic, a statement of something unknown plus reasons to care that it should be understood, and the experimental design of a study to investigate it.

This was asking a lot. It went fairly well, but there is considerable room for improvement. Not surprisingly, students’ coverage of the background science was generally good. The topics chosen included embolisms in the xylem of plants, bacteriophages and human diseases, ways of modeling actin networks, and more. In retrospect, we should have spent much more time iteratively working on plans for hypothetical future experiments, critiquing methods and their potential outcomes. Planning experiments is very difficult! It’s hard to really do this well in a ten week course, however; we only started dealing with the final projects with a few weeks to go in the term. Still, overall, the projects were enjoyable to listen to, and I think people learned quite a bit from them.

Concluding thoughts

As mentioned in Part I, I consider the course overall a great success. It took a lot of work to put together, but it was very enjoyable and stimulating to teach, and students liked it a lot. I’m not teaching it in 2015-16 — it’s rare here to offer graduate electives in consecutive years — but I expect that I’ll teach it again in the near future. For anyone else thinking of teaching a similar course: I’m happy to share any of my materials, including about 90 pages of notes on the day-to-day content of the class.

When designing the course, one thing I had worried about was whether, given the breadth of biophysics and the variety of topics we’d be exploring, the subject would be seen as having any overall coherence. It’s not obvious, even to biophysicists, that it does. It was satisfying to see that the course did indeed hold together: that the themes of biological materials interacting via physical forces, quantitative analyses of dynamical systems, and the overarching roles of statistical mechanics and random processes really did tie the class together.

Today’s illustration is again a painting based on a photo from Owls by Marianne Taylor.

Recap of a graduate biophysics course — Part I


In Spring 2015 I taught a graduate biophysics course for the first time. It was a first in several ways: the course didn’t exist before, so I developed it from scratch, and it was also the first graduate course I’ve taught in my nine years as a professor! I’ve been thinking for months that I should write a summary of how it went, especially because such classes are uncommon enough that describing what worked and what didn’t work might be useful for others.

Overall, the class was a great success. It was fun and rewarding to teach — though it took a lot of work — and the students seemed to get a lot out of it. There were ten graduate students and one undergraduate enrolled, which is large for a graduate elective at Oregon. The student evaluation scores were the highest I’ve ever received, averaging 4.7 out of 5.0 in seven categories.

Here, I’ll describe some of the topics we explored. In the next post, I’ll describe the structure of the course: in-class activities, books and other sources, student projects, and more.


There were several themes I wanted to cover:

  • the major roles that statistical mechanics and, relatedly, randomness and probabilistic processes, play in biophysics
  • the mechanics of cellular structures
  • cellular circuits: how cells construct switches, logic gates, and memory elements
  • special bonus theme: amazing things everyone should be aware of

A random walk through biophysics

Much of the course explored the roles of statistical mechanics and, more generally, randomness and probabilistic processes, in biophysics. This included the physics of random walks and Brownian motion, and experimental methods for measuring diffusive properties of proteins and other molecules. We spent quite a while exploring how Brownian motion and other physical constraints impact the strategies that microorganisms use to perform various tasks. For example:

  • Why aren’t there microscopic “baleen whales,” that scoop up nutrients as they swim through water?
  • Why is it a good idea for a bacterium to cover just a tiny fraction of its surface with receptor molecules?
  • Why are bacteria small? How can some bacteria be huge?
  • How can bacteria migrate towards regions of higher nutrient density? What are the physical limits on the sensitivity of chemotaxis, and how close do bacteria come to these limits?

I’ve commented on some of these topics in past blog posts, for example this one on the non-intuitive nature of diffusion-to-capture.
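One of these questions, why a sparse sprinkling of receptors suffices, has a compact answer due to Berg and Purcell: a sphere of radius a carrying N absorbing patches of radius s captures a fraction N s / (N s + π a) of what a fully absorbing sphere would. A quick sketch of what that implies, with cell and receptor sizes chosen purely for illustration (they are not from the course):

```python
import math


def capture_fraction(n_receptors, receptor_radius_um, cell_radius_um):
    """Berg-Purcell capture rate relative to a fully absorbing sphere."""
    ns = n_receptors * receptor_radius_um
    return ns / (ns + math.pi * cell_radius_um)


def surface_coverage(n_receptors, receptor_radius_um, cell_radius_um):
    """Fraction of the sphere's area occupied by the receptor disks."""
    return n_receptors * receptor_radius_um**2 / (4.0 * cell_radius_um**2)


CELL_RADIUS = 1.0        # um, hypothetical bacterium
RECEPTOR_RADIUS = 0.001  # um (1 nm), hypothetical receptor

# Number of receptors that gives half of the maximum possible capture rate.
n_half = math.pi * CELL_RADIUS / RECEPTOR_RADIUS
print(f"N for half-maximal capture: {n_half:.0f}")
print(f"capture fraction: {capture_fraction(n_half, RECEPTOR_RADIUS, CELL_RADIUS):.2f}")
print(f"surface covered:  {surface_coverage(n_half, RECEPTOR_RADIUS, CELL_RADIUS):.1e}")
```

With these numbers, half of the maximum uptake is reached with under 0.1% of the surface covered, because a diffusing molecule that comes near the cell once tends to hit the surface many times before wandering off.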

More generally, we studied several examples of how understanding probabilistic processes enables insights into all sorts of systems. These ranged from recent examples like using brightness fluctuations to quantify the number of RNA molecules in single cells, and also discover “bursts” of transcription (Golding et al. 2005), to classic examples like the famous Luria-Delbrück experiment. In all these cases, a deep message is that probability distributions encapsulate a great deal of information. The variance of some quantity, for example, may be as informative as its mean.
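The Luria-Delbrück logic makes a nice small simulation. The sketch below uses a deliberately simplified model of my own choosing: each culture grows from one cell by synchronous doublings, new mutants appear as Poisson events with a fixed probability per cell per generation, mutants grow as fast as everyone else, and the rare case of a mutant mutating again is ignored. The parameter values are arbitrary.

```python
import numpy as np


def simulate_mutant_counts(n_cultures, n_generations, mutation_rate, rng):
    """Final number of mutants in each culture when mutations arise during growth."""
    counts = np.zeros(n_cultures, dtype=np.int64)
    for g in range(1, n_generations + 1):
        population = 2**g  # cells per culture after g doublings
        new_mutants = rng.poisson(mutation_rate * population, size=n_cultures)
        # A mutant born at generation g leaves 2**(G - g) descendants at the end.
        counts += new_mutants * 2 ** (n_generations - g)
    return counts


def fano_factor(x):
    """Variance divided by mean; equal to 1 for a Poisson distribution."""
    return x.var(ddof=1) / x.mean()


rng = np.random.default_rng(0)
growth = simulate_mutant_counts(1000, 20, 1e-6, rng)
# Comparison case: mutants appear only at the very end, with the same mean number.
at_the_end = rng.poisson(growth.mean(), size=1000)

fano_growth = fano_factor(growth)
fano_end = fano_factor(at_the_end)
print(f"mutations during growth: variance/mean = {fano_growth:.1f}")
print(f"mutations at the end:    variance/mean = {fano_end:.2f}")
```

The two scenarios have the same mean by construction, so only the spread distinguishes them: occasional early mutations produce "jackpot" cultures and a variance far above the mean.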

The mechanics of cellular structures

Understanding the physical properties of biological materials and how they matter for the functioning of living things is central to biophysics, and so we of course discussed the rigidity of DNA, the electrostatics of viral assembly, phase separation in lipid membranes, and other such topics. The connections to randomness and statistical mechanics are clear, since entropic forces and thermal fluctuations are huge contributors to the mechanical properties of these microscopic objects.

As one of many examples of the interplay between energy and entropy, I’ll note here DNA melting: the separation of the two strands of a DNA double helix at a particular, well-defined temperature. Before examining it, we learned about PCR (polymerase chain reaction), the method by which fragments of DNA are duplicated over and over, enabling bits of crime scene debris or tainted food to be analyzed for their genetic fingerprints. Repeated cycles of melting and copying are the essence of PCR, so understanding DNA melting is of practical concern, as well as being very interesting in itself.

Why does DNA have a melting temperature? This is a question whose answer seems obvious, then less obvious, and then interesting as the amount of thought one puts into it increases. At first, one might find it unsurprising that DNA separates at some well-defined temperature. After all, water melts at some particular temperature, and countless other pure materials have well-defined phase transitions. Looking further, however, one can think of DNA as a “zipper” whose links form a 1-dimensional chain, each with a lower energy when closed (base-paired) than open. With a bit of statistical mechanics, it’s easy to show that this chain won’t have a sharp melting transition, but rather will gradually open with temperature, a common property of one-dimensional systems [1]. The puzzle is resolved, however, by properly considering entropy: the double-stranded DNA might open at points in the middle, forming “bubbles” of open links (see below). These open links cost energy but, crucially, increase the entropy of the molecule, since the bubble halves can wobble and fluctuate. Above a critical temperature, the entropic free energy wins over the energetic benefit to staying linked: bubbles grow, and DNA melts!

From M. Peyrard, “Biophysics: Melting the double helix.” Nature Physics 2: 13-14 (2006).
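The zipper picture in footnote [1] is easy to explore numerically. Here is a minimal sketch, assuming a zipper of N links that opens only from one end, with each open link costing energy ε and having g equivalent open states, so that a state with p open links has statistical weight (g e^(−ε/kT))^p. Letting all N links open is a simplification of mine, and the values of N and g are arbitrary.

```python
import numpy as np


def fraction_open(kt_over_eps, g, n_links):
    """Mean fraction of open links in an end-unzipping molecular zipper."""
    p = np.arange(n_links + 1)                # possible numbers of open links
    log_x = np.log(g) - 1.0 / kt_over_eps     # log of the weight per open link
    log_w = p * log_x
    log_w -= log_w.max()                      # shift to avoid overflow in exp
    w = np.exp(log_w)
    return float((p * w).sum() / w.sum() / n_links)


N_LINKS = 1000
for g in (1, 4):
    for kt in (0.6, 0.9):
        frac = fraction_open(kt, g, N_LINKS)
        print(f"g = {g}, kT/eps = {kt}: fraction open = {frac:.4f}")
```

For g = 1 the open fraction stays tiny at both temperatures, so there is no sharp transition, whereas for g = 4 the chain flips from nearly closed to nearly open around kT/ε = 1/ln g ≈ 0.72. This is the end-unzipping model only; it does not include the bubbles that the paragraph above argues are the more realistic mechanism.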

One of the things I especially like about the course is that we can consider “universal” materials like DNA and membranes, but also very specific materials, manifestations of the variety of life. For example, we looked at studies of Vorticella, a one-celled organism that can propel itself several body lengths (hundreds of microns) in milliseconds by harnessing the power of electrostatic forces to collapse bundles of protein fibers.

Cellular circuits: how cells construct switches, logic gates, and memory elements.

Cells do more than build with their components; they also compute, making decisions, constructing memories, telling time, etc. Our understanding of this has blossomed in recent years, driven especially by tools that allow us to create and manipulate cellular circuits. My own thinking about this, especially with respect to teaching it, was influenced heavily by Philip Nelson’s excellent recent textbook Physical Models of Living Systems, which I’ll comment on more in Part II.

We began by learning the basics of gene expression and genetic networks and then moved on to feedback in these networks and schemes for analyzing bistable switches. The physical modeling of these circuits leads to two interesting observations: (i) that particular circuit behaviors are possible in particular regions of the parameter space, which correspond to particular values of biophysical or biochemical attributes, and (ii) that the analysis of these sorts of networks is exactly the same as that of other dynamical systems that physics students are used to seeing. Neither of these is surprising, but they’re worth discussing, and they tie back to my question to myself before the course of whether to include this topic of cellular circuits. In retrospect, I’m very glad I did, not only because it’s important, but because it highlights the power of quantitative analysis in biological systems separate from concepts of mechanics or motion. Since this sort of analysis is deeply ingrained in physics education, it provides yet another route for physicists to impact the study of living systems. Of course, it doesn’t have to be so. One could imagine a world in which mathematical analysis was as ingrained into biological education as it is in physics, but despite occasional pleas to make this happen, such a world is far removed from ours.
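Both observations show up in even the simplest model. As a minimal sketch (a standard illustrative system, not necessarily the one we used in class), consider two genes that repress each other, written in dimensionless form as du/dt = α/(1 + vⁿ) − u and dv/dt = α/(1 + uⁿ) − v. The symbols α (maximal expression) and n (cooperativity), and the values below, are assumptions for illustration.

```python
def simulate_toggle(u0, v0, alpha, hill_n=2, dt=0.01, t_end=50.0):
    """Forward-Euler integration of a symmetric mutual-repression switch."""
    u, v = u0, v0
    for _ in range(int(t_end / dt)):
        du = alpha / (1.0 + v**hill_n) - u  # v represses production of u
        dv = alpha / (1.0 + u**hill_n) - v  # u represses production of v
        u, v = u + dt * du, v + dt * dv
    return u, v


for alpha in (1.0, 10.0):
    end_a = simulate_toggle(2.0, 1.0, alpha)
    end_b = simulate_toggle(1.0, 2.0, alpha)
    print(f"alpha = {alpha}: start (2, 1) -> ({end_a[0]:.2f}, {end_a[1]:.2f}), "
          f"start (1, 2) -> ({end_b[0]:.2f}, {end_b[1]:.2f})")
```

For weak expression (α = 1) both starting points relax to the same state. For strong expression (α = 10) the final state remembers which gene started ahead, so the circuit acts as a switch. Which behavior occurs depends only on where the system sits in parameter space, and the analysis is the same fixed-point hunting one does for any other dynamical system.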

Amazing things

I decided to end the course with some very amazing, very recent developments in how we examine or understand the living world, regardless of whether or not one would classify them as biophysical. I picked three. I’ll pause for a moment while you guess what they are… (While waiting, you can look at two more owl illustrations. The one at the top is mine; these are from the kids. All are based on photos from the excellent Owls by Marianne Taylor.)



One was CRISPR / Cas9, the new and revolutionary approach to genome editing. As readers likely know, CRISPR has generated a frenzy of excitement, more than any scientific advance I can think of in the past decade. While tools for manipulating genomes have existed for a while, CRISPR / Cas9 provides a method to target essentially any sequence simply by providing a corresponding sequence of RNA that guides a DNA-cleaving enzyme. This would be worth covering just for its scientific impact, but it more broadly brings up issues of ethics and social impact. How could one go about, for example, engineering human embryos, or destroying pathogenic species? Would one want to? The story behind CRISPR provides a great illustration of the power of basic science. Its discovery in bacteria, from studies of their battles with viruses, was quite a surprise. It’s likely that surprises of similar magnitude still await us in unexplored corners of the living world. Connecting CRISPR to biophysics isn’t hard, by the way, since its mechanisms of operation are closely tied to the mechanics of bending and cutting DNA.

The second “amazing” topic was DNA sequencing. The cost of sequencing has fallen by orders of magnitude in recent decades. We’re close, for example, to being able to sequence an entire 3 billion base pair human genome for $1000! All this is driven by physically fascinating technologies: for example, detecting the ions released from a single nucleotide being added to a growing DNA strand, or the electrical current fluctuations as a single DNA molecule snakes through a nanopore.

The final amazing topic was optogenetics, the optical control of genetically encoded control elements. Using light-activated ion channels, for example, researchers can selectively turn neurons on and off in live organisms, a real-life version of science fiction-y mind control. Here again, the connections between technology and basic research are clear. Channelrhodopsin, one of the first and most useful proteins to be used and modified for optogenetic ends, was discovered in studies of unicellular algae.

Overall, this excursion was great. It tied into the main substance of the course better than I expected, and the students clearly shared my excitement about these topics. It was also noted that this sort of connection to cutting edge developments is sadly lacking in most physics courses.

Next time…

In Part II, I’ll describe some of the “active learning” approaches I implemented, which went well with one exception, and I’ll also discuss books, readings, and assignments. (For a glimpse of all this, you can see the syllabus.) I’ll note both then and now that all of my materials for the course are available to anyone thinking of teaching something similar — feel free to email me.



[1] For a simple treatment of the “zipper” problem, see C. Kittel, “Phase Transition of a Molecular Zipper.” Am. J. Phys. 37, 917–920 (1969). The paper generally considers the case of a zipper in which each link has one closed state and “g” open states. The g=1 case is quick to consider, and is a nice end-of-chapter exercise in Kittel and Kroemer’s Thermal Physics (an undergraduate statistical mechanics textbook), which is where I first encountered it. For g=1, there is no sharp phase transition. The g>1 case gives a sharper transition, but one shouldn’t spend much time thinking about it, since it’s much more realistic to think about bubble formation rather than DNA unzipping from its ends.