[I’ve created this page to provide a brief summary of the image analysis issues related to our studies of gut microbial communities, mainly for students interested in computational projects. – Raghuveer Parthasarathy, Jan. 12, 2015]
Image Analysis and Machine Learning, in the context of visualizing gut microbial communities
Each of our bodies is a home for trillions of microbes, mostly resident in our digestive tract, whose roles in health and physiology are only beginning to be understood. The spatial structure and temporal dynamics of the microbial communities associated with humans and all animals are still largely mysterious, spurring my lab to develop new microscopy-based approaches to explore microbial colonization in zebrafish, a model organism. (See here for a blurb on our recent work on physical models of bacterial growth, with a pointer to a paper.)
Here’s one three-dimensional image of bacteria (red) and immune cells (green) in the gut of a live, larval zebrafish:
Note that the bacteria exist as free individuals as well as dense clusters.
Here’s a video taken over several hours. Each frame is a projection of a 3D image (a stack of 2D images), in which we’re just looking at bacteria engineered to express fluorescent proteins:
Each series like this contains hundreds of gigabytes of image data. Converting the images into useful data (the number of microbes and their spatial locations) is a computational challenge! In addition to the size of the datasets, here are three key issues, partially illustrated by one “zoomed-in” figure.
1. There is a large fluorescent background, in addition to the signal from the bacteria. We generally deal with this by various types of adaptive thresholding, which work well.
2. We need to separate (“segment”) the gut interior from the exterior, which I’ve just roughly done here in green. This is currently done largely by hand, and it would be great to automate!
3. We’re very interested in classifying objects as individual bacteria or as clusters, in addition to classifying bacterial vs. zebrafish (“host”) cells. We’ve adopted machine learning methods for this, using support vector machines. Are our methods optimal? There is likely room for improvement!
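To give a flavor of the first issue, here is a minimal sketch of one simple kind of adaptive thresholding: each pixel is compared to the mean of its own neighborhood rather than to a single global cutoff. This is only an illustration, not the lab's actual pipeline. The function name, the window radius, the offset, and the toy 2D image are all invented for the example, and it uses plain Python lists where real code would work on 3D stacks with an array library.

```python
def local_mean_threshold(image, radius=1, offset=0.0):
    """Return a mask that is True where a pixel exceeds its local mean plus offset.

    image is a list of rows (2D); radius sets the half-width of the square window.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Clip the window at the image border.
            r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
            c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
            window = [image[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            local_mean = sum(window) / len(window)
            mask[r][c] = image[r][c] > local_mean + offset
    return mask


# Toy 5x5 "image": a background that gets brighter to the right,
# plus two bright spots standing in for bacteria.
image = [[10 * c for c in range(5)] for _ in range(5)]
image[2][1] += 25  # dim spot on a dark background (value 35)
image[2][3] += 25  # spot on a brighter background (value 55)

mask = local_mean_threshold(image, radius=1, offset=10)
spots = [(r, c) for r in range(5) for c in range(5) if mask[r][c]]

# For comparison: a single global cutoff low enough to catch the dim spot.
global_hits = sum(v > 30 for row in image for v in row)

print(spots)        # prints [(2, 1), (2, 3)]
print(global_hits)  # prints 7
```

In this toy case the local rule picks out exactly the two spots, while any global cutoff low enough to catch the dimmer spot also picks up the bright background on the right edge.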
Each of these issues could form the nucleus of an interesting project for a computer science student, and moreover one that is useful and that could lead to a research position! I’ll also note that we have a considerable amount of data, and a lot of manually curated “ground truth” sets, allowing assessment of various computational approaches.
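As a rough picture of what the classification issue and the use of ground truth look like in code, here is a small sketch: a linear support vector machine trained by subgradient descent on the hinge loss, then scored against held-out, manually labeled objects. Everything specific here is an assumption made for illustration: the two features (size and brightness), the toy numbers, the linear kernel, the training procedure, and all function names. It is not the lab's classifier, and a real project would more likely use an established machine learning library.

```python
def train_linear_svm(features, labels, reg=0.01, step=0.1, epochs=200):
    """Fit weights w and bias b by full-batch subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    n, dim = len(features), len(features[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [reg * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(features, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - step * gj for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the separating line."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def precision_recall(predicted, truth, positive=1):
    """Precision and recall for the class labeled `positive`."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == positive and t == positive)
    fp = sum(1 for p, t in pairs if p == positive and t != positive)
    fn = sum(1 for p, t in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical, already-standardized features per object: (size, brightness).
# Label +1 = cluster, -1 = individual bacterium.
train_x = [(-1.0, -1.0), (-1.2, -0.8), (-0.8, -1.1),
           (1.0, 1.0), (1.2, 0.9), (0.9, 1.2)]
train_y = [-1, -1, -1, 1, 1, 1]

# Held-out objects with manually curated ("ground truth") labels.
test_x = [(-0.9, -1.0), (1.1, 1.0), (-1.1, -0.9), (0.8, 1.1)]
test_y = [-1, 1, -1, 1]

w, b = train_linear_svm(train_x, train_y)
predicted = [predict(w, b, x) for x in test_x]
precision, recall = precision_recall(predicted, test_y)
print(predicted, precision, recall)
```

The toy data are cleanly separable, so the scores come out perfect. With real objects, the same held-out comparison is what lets you tell whether a different feature set or a different classifier is actually an improvement.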
If you’d like to learn more, contact me: Prof. Raghuveer Parthasarathy, firstname.lastname@example.org