The Science Behind It: Metagenomics 101
“The Science Behind It” is an educational video series by Dr. Tuesday Simmons, science writer at Trace Genomics. It was created to illuminate the scientific foundations of soil microbiology that form the basis of Trace technology. At Trace Genomics, we help growers leverage the soil microbiome for better soil management decisions. If you’re interested in learning more about the basics of the soil microbiome, check out our video: Soil Microbiome 101. In this video, we’re going to take a closer look at how we figure out what microorganisms live in the soil.
Video Script:
If microorganisms are invisible to the naked eye, how do we know anything about them? Generally, there are 3 common ways to identify microbes living in a given environment. The first is microscopy: looking at microbes under a microscope. Microbes come in only a few different shapes, so many unrelated microbes look nearly identical under a microscope. Even if we use different types of stains to help distinguish them, it would still be a long, laborious process to try to look at every microbe in even a small soil sample, and we still could not reliably tell most of them apart.
The second way involves growing the microbes in a lab. If we grow them on petri dishes, they show many more visible differences than the cell size and shape that we see under a microscope. We can also grow them on petri dishes that contain different food sources, or that change color when microbes use a certain type of metabolism, and that can help us tell them apart. However, the vast majority of microbes have not been successfully grown in a lab, in spite of a lot of effort by scientists. We sometimes call these organisms “microbial dark matter”.
The third way (which has really illuminated the diversity of microbes in the soil) is by looking at different biomolecules as microbial fingerprints. There are 4 major types of biomolecules: carbohydrates, proteins, lipids (aka fats), and nucleic acids (which include DNA). Some soil labs offer PLFA (phospholipid fatty acid analysis), which uses one type of biomolecule, lipids, to get a picture of the microbial community. This works, but it’s like watching The Brady Bunch on a TV from the 60s with an antenna that needs adjusting: the resolution isn’t so great.
A much better biomolecule to use is DNA. DNA is sometimes called the “blueprint of life” because it contains all the information necessary to “build” a living organism. It’s made up of 4 different chemical building blocks called bases (aka nucleotides), and these can be thought of like a language with 4 letters in the alphabet. The order of bases in a DNA molecule can be read like an instruction manual. In the manual, each chapter has instructions for a different piece of machinery (that is, another biomolecule like a protein), and the DNA that codes for it is called a gene. All the DNA in a particular organism is called its genome, and the process that scientists use to decipher it is called genome sequencing.
There are 2 different methods for sequencing DNA in a complex environment to find out what microbes are there. The first method is called amplicon sequencing, and it uses fingerprint genes to identify what microbes are present. For each of the different microbial groups (bacteria, archaea, fungi, and protists) there are a few genes that are found in all members of the group (these are called conserved genes). For example, the 16S rRNA gene (or just “16S”) is conserved among bacteria and archaea, so scientists can sequence all the 16S genes from an environment. There is enough difference between 16S genes to get a good picture of what bacteria and archaea are there, though sometimes not down to the species level.
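To make the fingerprint idea concrete, here is a minimal Python sketch of how a set of sequenced fingerprint genes could be matched against known references and tallied. Everything in it is an assumption made for illustration: the 10-letter sequences, the names "Microbe A" through "Microbe C", and the simple count-the-mismatches comparison are all made up, and real 16S genes are roughly 1,500 bases long and are compared with far more sophisticated tools.

```python
from collections import Counter

# Hypothetical, made-up fingerprint sequences standing in for real 16S genes.
REFERENCE_MARKERS = {
    "Microbe A": "ACGTTGCAAG",
    "Microbe B": "ACGTAGCTAG",
    "Microbe C": "TCGTTGGAAC",
}


def mismatches(a, b):
    """Count the positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def identify(read, max_mismatches=1):
    """Return the closest reference name, or 'unidentified' if nothing is close."""
    best_name, best_dist = min(
        ((name, mismatches(read, ref)) for name, ref in REFERENCE_MARKERS.items()),
        key=lambda pair: pair[1],
    )
    return best_name if best_dist <= max_mismatches else "unidentified"


def community_profile(reads):
    """Tally how many sequenced fingerprints match each reference."""
    return Counter(identify(read) for read in reads)


# Toy "sequencing results": fingerprint genes read from one pretend sample.
reads = [
    "ACGTTGCAAG",  # exact match to Microbe A
    "ACGTTGCAAG",  # exact match to Microbe A
    "ACGTAGCTAG",  # exact match to Microbe B
    "TCGTTGGAAC",  # exact match to Microbe C
    "ACGTTGCAAT",  # one letter off from Microbe A
    "GGGGGGGGGG",  # not close to any reference
]
profile = community_profile(reads)
print(profile)
```

The read that is one letter off from Microbe A is still counted as Microbe A, which hints at why closely related species can be hard to separate using a single fingerprint gene.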
The second method is called metagenomics and involves sequencing all the DNA in an environment. Over the past 20 years, this has been an incredible scientific discovery tool for understanding the microbial world. Since it looks at all the DNA, it has a much higher resolution than amplicon sequencing. But, since it looks at all the DNA, it is much more difficult to do.
If we think of a single organism’s genome as one puzzle, a soil metagenome is like having 10,000 (or more) different puzzles with their pieces all mixed into the same box. Amplicon sequencing is like looking for a specific piece that all the puzzles have in common, like the top left corner, and using that to identify which puzzles (or genomes) are there. Metagenomics looks at all the pieces to try to find other useful information, like how many genomes contain interesting functional genes (such as genes involved in nitrogen or phosphorus cycling).
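The puzzle analogy can also be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real pipeline works: the three genomes and their gene lists are invented, and each "piece" is conveniently labeled with the genome it came from, which real sequencing data is not. The gene names nifH (nitrogen fixation) and phoA (a phosphatase) are real genes, used here only as examples of functional genes.

```python
import random

# Hypothetical toy genomes: each is just a short list of gene names ("puzzle pieces").
GENOMES = {
    "genome_1": ["16S", "nifH", "geneX"],
    "genome_2": ["16S", "phoA", "geneY"],
    "genome_3": ["16S", "nifH", "phoA"],
}

# Mix every piece from every puzzle into one box.
box = [(genome, gene) for genome, genes in GENOMES.items() for gene in genes]
random.Random(0).shuffle(box)  # fixed seed so the shuffle is repeatable


def amplicon_view(box, marker="16S"):
    """Look only at the one piece every puzzle shares, to list which puzzles are present."""
    return sorted(genome for genome, gene in box if gene == marker)


def metagenomic_view(box, genes_of_interest):
    """Look at every piece and count how many genomes carry each gene of interest."""
    counts = {gene: 0 for gene in genes_of_interest}
    for _genome, gene in box:
        if gene in counts:
            counts[gene] += 1
    return counts


present = amplicon_view(box)
functions = metagenomic_view(box, ["nifH", "phoA"])
print(present)
print(functions)
```

The amplicon view answers "who is there?", while the metagenomic view can also answer "what could they do?", at the cost of sorting through every piece in the box.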
While there are benefits to using amplicon sequencing (namely cost and simplicity), at Trace Genomics we decided to build a pipeline and database using metagenomics because it provides the most information with the highest degree of confidence.
About the author: Dr. Tuesday Simmons is the Science Writer at Trace Genomics. She earned her Ph.D. in Microbiology from the University of California, Berkeley, studying the root microbiome of cereal crops.