How much is that physics major in the window?

A friend pointed me to a recent study looking at the earnings (salaries) of people with degrees in different majors — for example, what’s the average salary of a physics major? Or a history major? The authors of the study are economists, and in my opinion their exposition takes an overly pecuniary view of the factors that go into choosing a major. Regardless, the data are interesting. The web site is here, and includes links to a summary and a 214-page full report. In case you’re curious, the median salary of 25-59 year-olds with undergraduate degrees in physics is $81,000, ranking #15 out of the 137 majors listed. I’m surprised this is so high. Nine of the top ten majors are various flavors of engineering, with the other (#2) being pharmacy. Chemistry is #50 ($64k) and Biology is #74 ($56k). Depressingly, at the bottom of the list is Early Childhood Education ($39k) — something of immense importance, but that doesn’t pay well.

In addition to salary, the report looks at the popularity of various majors. I was curious whether these two are correlated — whether, for example, there’s a “supply curve” (thinking like an economist) such that the majors with abundant students are those with lower pay. (This would assume that a lot of other factors are equal across majors.) I can’t extract the number of people with each major from the report — at least not without putting a lot of work into it, which won’t happen. However, I can easily extract the rankings for each (salary and popularity) and can plot these:

[Plot: salary rank vs. popularity rank for each major. The colors indicate categories, detailed in the legend of the next graph.]

As you can see, it’s a cloud! There are popular, lucrative majors; unpopular, low-paying majors; and every other combination.
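If one wanted a number to go with the eyeballed cloud, a Spearman rank correlation works directly on the two rank lists. A minimal sketch (the function is the standard formula; the rank lists in the comment are toy examples, not the report’s data):

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation for two untied rank lists (1..n).

    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)),
    where d_i is the difference between the two ranks of item i.
    """
    n = len(rank_x)
    d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Identical rankings give rho = 1, reversed rankings give rho = -1,
# and a "cloud" like the one above would give rho near 0.
```

A value near zero would confirm that salary rank and popularity rank are essentially unrelated.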

The 137 majors in the study are grouped into categories. If we plot the median salary (the actual amount, not the ranking) versus popularity (percentage of majors), we get the following:

[Plot: median salary vs. popularity for each category of major.]
Again, there’s no trend evident. Stretching the horizontal axis with a semilogarithmic scale gives us a nice triangle of points (with one outlier):

[The same plot, with a logarithmic horizontal axis.]
Is there a lesson here, other than that there are a lot of business majors? I don’t know. I’ll leave that to you, dear reader.

Slammed! (Why do bacteria care about physics?)

[Illustration: E. coli.]
I was a participant (contestant?) in our local “Physics Slam” two weeks ago, in which half a dozen physics faculty gave 10 minute talks to the general public, with a “winner” chosen by about five judges selected from the audience. About 500 people came, filling an auditorium:

[Photo: the Physics Slam audience.]
My talk, Why do bacteria care about physics?, was mostly about the surreal physics of small-scale fluid flows, which microbes have to deal with and which mean that bacteria can’t swim by, for example, waving appendages back and forth — a strategy that works well for fish, whales, and other large things. I did a live demonstration of the classic illustration of reversible flows using a big vat of corn syrup — this is one of my favorite demonstrations to do, and the crowd loved it, spontaneously cheering at the end. The whole talk went remarkably well — you can watch for yourself at http://media.uoregon.edu/channel/2015/04/09/2015-physics-slam/. My part is at 1:05 or so. I waved my arms a lot. Eric Corwin’s talk, on sand grains and sphere packing, is particularly good (0:38). (If you’ve never seen reversible flows, also check out this YouTube video by fluid dynamics giant G. I. Taylor.)

My graduate biophysics class has also been exploring the microscopic-scale physics of diffusion and flow. This week, we’ll get to the “low Reynolds number” issues reflected in the Physics Slam talk. Last week, we explored among other things the very non-intuitive ways in which diffusion-to-capture works. For example: Imagine a bacterium covered with “sticky” patches (receptors for nutrients, for example). The nutrients dissolved in the bacterium’s surroundings diffuse and, by chance, hit the patches, where they’re absorbed and “eaten.” Each patch is small — say, 1 nm in radius, like a typical protein — compared to the roughly 1 micron radius of the bacterium. The bacterium could cover itself entirely with absorbing patches and maximize its food uptake, but it wouldn’t have any surface left for any other tasks — motility, secretion, sensing, etc. — so this would not be a good strategy to adopt. We, or the microbe, can ask: What fraction of the surface needs to be covered in absorbing patches for the total food uptake to be half as large as it would be if the entire surface were “sticky”? Naively, one might expect the answer to be 0.5 — half the surface should give half the food uptake. This is, however, totally wrong. Remarkably, only about 0.3% of the surface needs to be “sticky” to provide a 50% efficiency for diffusive food capture.

A graph of the capture rate versus the area fraction covered by “sticky” patches looks like this (green curve):

[Plot: capture rate vs. area fraction covered by absorbing patches (green curve).]
or more clearly, on a semilog scale:

[The same plot, on a semilog scale.]
This remarkable result follows from the properties of diffusion. A simple derivation can be found in Howard Berg’s classic “Random Walks in Biology.” We can get a rough intuitive sense of how it arises by realizing that diffusive flows are driven by gradients of concentration. If we halve the radius of an absorbing patch, its area drops by a factor of four. However, the average flow lines of the diffusing particles, which must end at the patches, are squeezed into a tighter space, increasing the concentration gradient and thereby giving a greater flow rate that partially counteracts the large drop in area.
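For the curious, the capture curve can be generated from the classic Berg-Purcell expression: N absorbing disks of radius s on a sphere of radius a capture molecules at a fraction Ns/(Ns + πa) of the fully absorbing sphere’s rate. A minimal sketch, with illustrative parameter values (the precise half-saturation coverage depends on the patch and cell sizes assumed):

```python
import math

def relative_capture_rate(f, a=1e-6, s=1e-9):
    """Diffusive capture rate of a sphere of radius a whose surface is
    covered, with area fraction f, by absorbing disks of radius s,
    relative to a fully absorbing sphere (Berg-Purcell):
        J / J_max = N*s / (N*s + pi*a).
    """
    N = f * 4 * a**2 / s**2   # disk count, from f = N*pi*s^2 / (4*pi*a^2)
    return N * s / (N * s + math.pi * a)

# Half-maximal uptake occurs when N*s = pi*a, i.e. at a covered
# fraction f_half = pi*s / (4*a) -- far below 1% for nanometer-scale
# patches on a micron-scale cell.
f_half = math.pi * 1e-9 / (4 * 1e-6)
```

Evaluating this on a logarithmic grid of f reproduces the sharp rise at tiny coverage fractions seen in the semilog plot.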

[Illustration: a sphere dotted with absorbing patches.]
This is both amazing and, for bacteria, extremely useful. A handful of receptors is sufficient to efficiently capture molecules from the surroundings, so there’s plenty of room for many types of receptors for many types of molecules (various nutrients, attractants, repellents, etc.).

The physics of the microscopic world are endlessly fascinating. Returning to the Physics Slam: As five of us predicted with high certainty beforehand, our astronomer Scott Fisher won — he started off with an astronomically-modified version of the intro to Prince’s Let’s Go Crazy, complete with music. Plus there were photos of colorful space things. The rest of us didn’t stand a chance.

You should appreciate the infrequency of my blog posts

[Illustration: a fish.]
Today’s illustration doesn’t have anything to do with the topic below. I made it for a ten minute talk I’ll give tomorrow, at the local “Physics Slam.” You can see the program here. Short version: six physics faculty will have ten minutes each to explain something, and the audience votes on their favorite presentation. When this was last done, a few years ago, several hundred people came. We’ll see what happens this time! My title:

Why do bacteria care about physics?

At some point, I should practice…

Now on to today’s topic:

Everyone agrees that it’s impossible to keep up with the ever-expanding scientific literature. An interesting recent paper* takes a look at this phenomenon, verifying that the number of papers published every year is, indeed, growing exponentially:

* “Attention decay in science,” Pietro Della Briotta Parolo et al., http://arxiv.org/pdf/1503.01881v1.pdf

[Figure 5 from Parolo et al.]
The authors look at what this means for scientific “memory.” In general, the rate at which a paper is cited by later papers decays over time (after an initial peak), as it is forgotten or as it gives rise to other works that are cited instead. One might guess that growth in the publication rate correlates with a faster decay of citations — we spend less time with the past as we’re swamped by new stuff. This is indeed what Parolo et al. find: a decay rate that has steadily grown over decades. This is unfortunate: by not considering papers of the more distant past, we risk needlessly re-discovering insights, and we disconnect ourselves from our fields’ pioneering perspectives.

Returning to the overall number of papers: I wonder if this terrifying growth is driven primarily by an increase in the number of scientists or by an increase in papers written per person. I suspect the former. Even within the US, there are a lot more scientists than there used to be [e.g. this graph]. In the developing world this increase is far more dramatic (see e.g. here), as (presumably) it should be.

Unfortunately, I can’t find any data on the total number of scientists worldwide — at least not with just a few minutes of searching — or even the total number of Ph.D.’s awarded each year.

Looking around for any data that might help illuminate trends of population and paper production, I stumbled upon historical data for the American Physical Society (APS), namely the number of members in each year since 1905 (http://www.aps.org/membership/statistics/upload/historical-counts-14.pdf). It’s not hard to tabulate the total number of papers published each year in the Physical Review journals — the publications of the APS. Looking at how each of these changes with time might give a rough sense of whether one tracks the other. Of course, there are a lot of problems with interpreting any correlation between these two things: APS members (like me) publish in all sorts of journals, not just APS ones; non-APS members publish in APS journals; etc. Still, let’s see what these two look like:

[Plot: APS membership and papers published per year in APS journals, vs. year.]
Just considering APS journals alone, the number of papers published each year is 10 times what it was a few decades ago! Within the microcosm of the APS, the number of papers being published has been growing at a far faster rate than the membership.
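As an aside, a growth factor over a time window converts directly into a doubling time, assuming steady exponential growth. A one-liner (the 10× factor is from the plot above; the roughly-40-year window is my own assumption for illustration):

```python
import math

def doubling_time(growth_factor, years):
    """Doubling time for steady exponential growth by growth_factor over years.

    If x(t) = x0 * growth_factor**(t / years), then x doubles every
    years * ln(2) / ln(growth_factor).
    """
    return years * math.log(2) / math.log(growth_factor)

# A 10x increase in annual papers over roughly 40 years corresponds to a
# doubling time of about 12 years.
```

By comparison, a membership curve that merely, say, tripled over the same window would have a doubling time of about 25 years — a rough way to quantify “growing at a far faster rate.”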

What does all this mean? I don’t really know. It’s impossible to do something about the general complaint that there are too many papers to read unless we have some deeper understanding of why we’re in this state. Lacking that, I suppose we’re just stuck reading papers as best we can, or feeling guilty for not reading…

T-minus 9 days for my graduate biophysics course

[Illustration: a sea urchin.]
Next term, I’ll be teaching a brand-new graduate biophysics course. (It’s the first time I’ve taught a graduate course in my eight years as a professor!) I’ve spent quite a while thinking about what should be in it and how the course should be structured. Here, I’ll just note my list of topics (below, with a few comments) and provide a link to the syllabus (here). Hopefully in weeks to come I’ll comment on how the course is going.

Topics

Introduction; Physics, statistics, and sight

What are the fundamental limits on vision, and how close does biology come to reaching them? (A brief look.)

Components of biological systems

What are the components of biological systems? What are the length, time, and energy scales that we’ll care about? How can we organize a large list of “parts”?

Probability and heredity (a quick look)

We’ll review concepts in probability and statistical mechanics. We’ll discuss a classic example of how a quantitative understanding of probability revealed how inheritance and mutation are related.

Random Walks

We can make sense of a remarkable array of biophysical processes, from the diffusion of molecules to the swimming strategies of bacteria to the conformations of biomolecules, by understanding the properties of random walks.

Life at Low Reynolds Number

We’ll figure out why bacteria swim, and why they don’t swim like whales.

Entropy, Energy, and Electrostatics

We’ll see how entropy governs electrostatics in water, the “melting” of DNA, phase transitions in membranes, and more.

Mechanics in the Cell

We’ll look more at the mechanical properties of DNA, membranes, and other cellular components, and also learn how we can measure them.

Circuits in the Cell

Cells sense their environment and perform computations using the data they collect. How can cells build switches, memory elements, and oscillators? What physical principles govern these circuits?

Multicellular organization and pattern formation

How does a collection of cells, in a developing embryo for example, organize itself into a robust three-dimensional structure? We’re beginning to understand how multicellular organisms harness small-scale physical processes, such as diffusion, and large-scale processes, such as folding and buckling, to generate form. We’ll take a brief look at this.

Cool things everyone should be aware of

We live in an age in which we can shine a laser at particular neurons in a live animal to stimulate them, paste genes into a wide array of organisms, and sequence a genome given only a single cell. It would be tragic to be ignorant of these sorts of almost magical things, and they contain some nice physics as well!

Comments

As you’ve probably concluded, this is too much for a ten-week course! I will cull things as we go along, based on student input. I definitely want to spend some time on biological circuits, though, which I’m increasingly interested in. I also want to dip into the final topic of “cool things” — I find it remarkable and sad that so many physicists are unaware of fantastic developments like optogenetics, CRISPR, and high-throughput sequencing. Students: prepare to be amazed.

My sea urchin illustration above has nothing to do with the course, but if you’d like a puzzle: figure out what’s severely wrong with this picture.

Mini-Geo-Engineering

I’m at a conference at Biosphere 2, the large ecological research facility in the Arizona desert that was originally launched as an attempt at creating a sealed, self-contained ecosystem.

It’s a surreal place — a collection of glass pyramids and domes housing miniature rain forests, deserts, an “ocean,” and a few other biomes — that’s now used for more “normal” research and education. I’m here not to join a futuristic commune (at least not yet), but rather as a participant in a fascinating conference organized by Research Corporation called “Molecules Come to Life” — basically, it gets a lot of people who are interested in complex living systems together to discuss big questions, think of new research directions, and launch new projects. It’s a fascinating and very impressive group. Interestingly, a huge fraction are physicists, either physicists in physics departments (like me) or people trained as physicists who are now in systems biology, bioengineering, microbiology, and other departments.

Do the conference topic and the venue have anything to do with one another? Explicitly, no. But in an indirect sense, both touch on issues of scale. A key issue in the study of all sorts of complex systems is how to relate phenomena across different extents of space and time. How can we connect the properties of molecules to the operation of a biological circuit? A circuit to a cell? A cell to an organism? Are there general principles — like those that tie the individually chaotic behaviors of atoms in a gas into robust many-particle properties like pressure and density — that lead to a deeper understanding? Would a piece of a complex system have the same behavior as the whole, or are collective properties scale-dependent?

The initial goal with Biosphere 2 was that these small-scale ecosystems under glass could function sustainably. This failed quite badly (at least at first — see Wikipedia for more details). As we learned on an excellent tour this afternoon, nearly all animals in the enclosure died, the food grown was so minimal that everyone was hungry all the time, and oxygen levels dropped from about 20% to 14% (at which point oxygen had to be pumped in). Walking around, the issue that kept coming to mind was: what is the scale of an ecosystem? Biosphere 2 is really not very big — it’s a few football fields in total area. Are the webs of interaction that can exist in an area this size sufficient to mimic a “real” rainforest, savannah, or other environment? Are they large enough to be stable, and not fluctuate wildly?

Perhaps these questions couldn’t have been answered without building the structure and trying the experiment. (Or perhaps they could.) It would be great to talk to the people behind the project — they were commune dwellers, not scientists — and see what thoughts, assessments, dreams, and predictions went into the planning of this impressive, but odd, place.

Some more photos:

[Five more photos of Biosphere 2.]

What have I got in my pocket?

What makes a good exam question? Not surprisingly, I try to write exams that most students who are keeping up with the course should do well on — almost by definition, the exam should be evaluating what I’m teaching. But I also want the exam to reveal and assess different levels of understanding; it would be useless to have an exam that everyone aced, or that everyone failed. Also not surprisingly, I’m not perfect at coming up with questions that achieve these aims. For years, however, I’ve been using the data from the exam scores themselves to tell me about the exam. Here’s an illustration:

I recently gave a midterm exam in my Physics of Energy and the Environment course. It consisted of 26 multiple choice questions and 8 short answer questions. For the multiple choice questions, I can calculate (i) the fraction of students who got a question correct, and (ii) the correlation between student scores on that question and scores on the exam as a whole. The first number tells us how easy or hard the question is, and the second tells us how well the question discriminates among different levels of understanding. (It also tells us, roughly speaking, whether the question is assessing the same things that the exam as a whole is aiming for.) These are both standard things to look at, and I’ll note for completeness that there’s a lot of literature on the mechanics of testing that I tend not to read and can’t adequately cite.

Here’s the graph of correlation coefficient vs. fraction correct for each of the multiple choice questions from my exam:

[Plot: item-total correlation vs. fraction correct for each multiple choice question.]

We notice first of all a nice spread: there are questions in the lower right that lots of people get right. These don’t really help distinguish between students, but they probably make everyone feel better! The upper left shows questions that are more difficult, and that correlate strongly with overall performance. In the lower left are my mistakes (questions 6 and 15): questions that are difficult and that don’t correlate with overall performance. These might be unclear or irrelevant questions. Of course I didn’t intend them to be like this, and now after the fact I can discard them from my overall scoring. (Which, in fact, I do.)

I can also include the short answer questions, now plotting mean score rather than fraction correct (since the scoring isn’t binary for these). We see similar things — in general the correlation coefficients are higher, as we’d expect, since short answer questions give more insight into how students are thinking.

[Plot: item-total correlation vs. mean score, for all questions.]

It’s fascinating, I think, to plot and ponder these data, and doing so serves an important goal: assessing whether my exam is really doing what I want. I’m rather happy to note that only a few of my questions fall into the lower-left corner of mediocrity. I was spurred to post this because we’re doing a somewhat similar exercise with my department’s Ph.D. qualifier exam. One might think, given the enormous effect of such an exam on students’ lives, and the fact that a building full of quantitative scientists creates it, that (i) we routinely analyze the exam’s properties, and (ii) it passes any metric of quality one could think of. Sadly, neither is the case. Only recently, thanks to a diligent colleague, do we have a similar analysis of response accuracy and question discrimination. Frighteningly, we have given exams in which a remarkable fraction of questions are poor discriminators, correlating weakly or even negatively with overall performance! I am cautiously optimistic that we will do something about this. Of course, it is very difficult to write good questions. However: rather than telling ourselves we can do it flawlessly, we should let the results inform the process.

Modeling Life (a freshman seminar) — Part 2

[Illustration: a fig, in abstract watercolor.]
In Part 1, I described the motivations behind a “Freshman Interest Group” (FIG) seminar I taught last term, called “Modeling Life,” that explored how contemporary science can make sense of biology by way of physical and computational models. I also wrote about several of the topics explored in the class. Here, I’ll describe some of the assignments and projects, along with thoughts on whether the course succeeded in its aims, and whether I’ll teach it again.

Assignments

Since the course was only a one-credit, one hour per week seminar, and was focused on awareness of what can and can’t be done with models rather than actually conveying skills in modeling, I kept the assignments minimal. Many weeks involved just writing a paragraph or two. For example, following the first class’ discussion of a paper modeling waves of jostling penguins (see Part 1), students had to “Think of at least one other system besides penguins (biological or not) that would be amenable to this sort of modeling of interactions, and describe what ingredients or rules you’d put into a model of it.” Students proposed various systems of interacting agents, nearly all involving animals, people, or cars. This led to a nice discussion of, for example, the field of traffic modeling, and to Itai Cohen’s group’s simulations of “Collective dynamics in mosh pits.”

All FIGs are supposed to do something with the library, and so I came up with an assignment I’m quite fond of that explored the “demographics” of article authorships. The students picked one of two papers that we had mentioned in class, then looked “forward” and “backward” at some subset of its citations (e.g. via Web of Knowledge) and its references. They examined at least two characteristics, like:

  • What departments the authors are from;
  • What countries the authors are from;
  • Whether the papers are about experiments, computation, or both (just determined from the abstract)

and described what they found about the collection of studies linked to the chosen article. (An extended version of this assignment was an option for the final project for the class.) Even more than I expected, students were surprised and interested to find things like the wide array of departments represented by the authors (biology, physics, computer science, various forms of engineering); the number of countries represented (with the very large US fraction being even larger among references than citations); and more. We spent a while discussing authorship — most students have a nineteenth-century notion of lone scientists writing single-author papers — and how the number of people in research groups varies between fields. I of course showed an example from high-energy physics; this one has over three hundred authors, which is fairly typical:

[Screenshot: the paper’s author list.]
The full first page:

[Screenshot: the paper’s full first page.]

Final project

For a final project, students had a choice of either an expanded version of the ‘follow the literature’ assignment described above, or they could write simple computer programs that illustrated biased random walks (as in bacterial chemotaxis) or logistic growth (chaotic population dynamics). They could work in groups. About 2/3 chose the programming exercises. All of these went well — better than I expected in terms of both students’ interest in the project and their success in implementing them. (The students made use of the simple programming methods they were learning in the computer science class — I cringed to watch graphs being made by having a “turtle graphics” cursor trace out paths, and had flashbacks to seventh grade.)
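For concreteness, the two programming options each fit in a few lines. A sketch of the underlying iterations (my own illustrative code, not the students’ turtle-graphics versions):

```python
import random

def biased_walk(steps, p_right=0.6):
    """1-D biased random walk: step +1 with probability p_right, else -1.

    A crude cartoon of chemotaxis: steps toward an attractant are
    slightly more likely, so the walker drifts while still wandering.
    """
    pos = 0
    for _ in range(steps):
        pos += 1 if random.random() < p_right else -1
    return pos

def logistic_orbit(r, x0=0.2, n=100):
    """Iterate the logistic map x -> r*x*(1-x) for population dynamics.

    For small r the population settles down; near r = 4 the orbit is
    chaotic, bouncing around the interval (0, 1).
    """
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

# With r = 2, the orbit converges to the fixed point x = 0.5.
```

The mean displacement of the biased walk grows like steps × (2·p_right − 1), which is easy for students to check against their simulations.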

Overall assessments

Did the course succeed? In some ways: yes. Students seemed very interested in the topics we explored, and most weeks we had quite good discussions. And it certainly was the case that the things we learned about were, to the students, completely new and far outside the scope of standard things they had previously encountered. If this were a “normal” course, I’d call it a success based on the level of engagement and interest we achieved. However, it was not a normal course, and there were three issues with it that dampen my enthusiasm for repeating it.

First, since I taught this concurrently with my Physics of Life course, a typical large, four-credit class, it added to my workload. Of course, I knew this going in. But, because I have far, far more things to do every week than there are hours in which to do them, I should really be subtracting from, rather than adding to, my list.

Second, a goal of the FIGs in general is that they’re social as well as academic experiences, and it’s apparent that I have neither the time nor the inclination to be very social. The high point of this aspect of the course was during the first few weeks, when I made sure to have coffee or lunch with all the students, in groups of 1-5. This was fun, and it was interesting to get some insights into their very different backgrounds, levels of comfort with the university, and experiences. Especially with respect to programming, the students ranged from ones who had never programmed anything prior to their concurrent computer science course to one who had held a job as a programmer. Aside from these chats, I did one social activity outside of class, a very short hike up Skinner Butte. (I had hoped for Spencer Butte, about an hour to a rocky summit with beautiful views, but the logistics of transportation foiled us.) A few students came, along with my kids; it was a nice walk on a sunny Sunday afternoon.

Third, the demographics of the FIG weren’t really what I was aiming for. The FIG connects my Physics of Life course with the introductory computer science class; students in the FIG are enrolled in both these courses. The intended audience of the Physics of Life class is non-science-major undergraduates. Introductory computer classes, at UO and elsewhere, are attracting sharply increasing numbers of students (see here) with a very wide range of interests. Therefore I was hoping for the same diverse assortment of students in the FIG — people interested in majoring in history, or political science, or art, etc. Instead, eighteen out of twenty in the course were intended computer science majors! They were a great bunch, but they were not my target in terms of general education. One could argue that these students are precisely those who we should be introducing to quantitative biology, since the field very much needs them. I would agree with this, and if I were part of a quantitative biology program I might agree that this is part of my job. But I’m not.

Overall, I don’t intend to teach the seminar again in the near future, though I could imagine happily revisiting it someday. In case anyone plans a similar course, hopefully the thoughts noted here are of some use — feel free to email me for more details. The topic of mathematical and physical modeling of biological systems is fascinating, and it is certainly one that more students, especially early in their undergraduate careers, should be exposed to!