The advance may lead to new methods of understanding and overcoming genetic diseases.
Last year, the multi-institutional team demonstrated that when the 2-meter-long human genome folds up inside the nucleus of a cell, it forms roughly 10,000 loops. These loops turn genes on and off, and control how long stretches of the genome are packed. Anomalies in this folding process can lead to disease. The team also discovered a DNA codeword, or “motif,” that lies at both ends of nearly all loops: a string of fewer than 20 genetic letters that causes the DNA to bind a protein called CTCF. More often than not, these motifs lie in what had previously been thought of as “junk” DNA.
Now, in a result with profound consequences for genetic research, a team directed by Broad visiting scientist Erez Lieberman Aiden has demonstrated that by manipulating these motifs, it is possible to destroy and move existing loops in the genome and to create new ones. The work appears this week in the journal Proceedings of the National Academy of Sciences.
“We were able to use our insights into how loops form in nature in order to engineer genome loops artificially. This means that it is possible, at least in principle, to fix errors in genome folding by modifying a handful of genetic letters, without disturbing the surrounding DNA,” said Aiden, senior author on the new study, as well as a McNair Scholar at Baylor and a Senior Investigator at Rice University’s Center for Theoretical Biological Physics. Aiden was recruited to Baylor with the help of a Cancer Prevention Research Institute of Texas (CPRIT) grant.
Like the strings on a marionette, loops often connect genes with the DNA elements that control them – even though, when the genome is viewed as a one-dimensional string of letters, those elements lie far away. To modify the code words that create loops without disrupting the surrounding sequence, the team used CRISPR/Cas9-based genome editing, which makes it possible to modify a genome sequence in an extremely targeted fashion. If the team was right about the role of the code words in creating loops, the results would do more than affect the sequence of the genome: they would change the fold.
“Using CRISPR allowed us to go in with a ‘molecular scalpel’ to add or remove a small number of genetic letters. By knowing exactly which letters we needed to target, we found that it was possible to change how the genome folded in a highly predictable fashion,” said Suhas Rao, co-first author of the study and a student in the Aiden lab.
The new study also demonstrates how loops form inside the genome – and the results were a huge surprise. For decades, scientists thought that loops form when bits of the genome, wiggling around in the nucleus, bump into one another. But the authors found that the cell forms loops by a different process, called extrusion.
Adrian Sanborn, the study's other co-first author, uses the analogy of a backpack to explain how extrusion works.
“Extrusion is the process you use when you’re manipulating the plastic adjustors on your backpack: you feed the strap through on each side, and the slack forms a loop. But the genome is much longer than the strap on a backpack, so the process keeps going and going: more and more slack, a bigger and bigger loop! The key is that the extrusion process stops when it hits a CTCF site. That’s why modifying the motifs is the key to controlling the whole process,” he said.
In each experiment, the researchers modified certain CTCF sites, changing the pattern of CTCF binding. Once they knew that extrusion drives the genome folding process, the team found that knowing where CTCF bound DNA was enough to predict how whole regions of the genome would fold. Modifying the CTCF sites made it possible to destroy loops, to move loops, and to create new loops. It was also possible to predictably modify other folding features, called domains, which are stretches of the genome that all segregate into a single spatial position. In each of over a dozen cases, the team combined mathematics and high-performance computation to predict in advance how the genome would fold. In each case, the genome’s fold closely matched the team’s predictions. In one case, adding a single base pair was enough to change the folding of millions of letters of the genome.
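The link between CTCF sites and loops can be illustrated with a toy one-dimensional model. The Python sketch below is not the team's simulation: the function name, the site positions, and the loading position are all hypothetical, and the model assumes only what is described above, namely that the loop grows outward on both sides and each side stops at the first CTCF site it meets. It shows how removing or adding a single site changes the predicted loop.

```python
def extrude_loop(ctcf_sites, load_pos, genome_length):
    """Toy extrusion: grow a loop outward from load_pos, one position at a
    time on each side, until each side reaches a CTCF site (or a genome end).

    Positions are arbitrary integer bins, not real genomic coordinates.
    Returns the (left, right) anchors of the resulting loop.
    """
    sites = set(ctcf_sites)
    left = right = load_pos
    left_stopped = right_stopped = False

    while not (left_stopped and right_stopped):
        # Left side: stop on a CTCF site or at the start of the genome.
        if not left_stopped:
            if left in sites or left == 0:
                left_stopped = True
            else:
                left -= 1
        # Right side: stop on a CTCF site or at the end of the genome.
        if not right_stopped:
            if right in sites or right == genome_length - 1:
                right_stopped = True
            else:
                right += 1

    return left, right


if __name__ == "__main__":
    # Hypothetical layout: three CTCF sites on a 100-bin genome.
    print(extrude_loop([10, 40, 70], 25, 100))      # (10, 40)
    # Deleting the site at 40 lets extrusion continue to the next site.
    print(extrude_loop([10, 70], 25, 100))          # (10, 70)
    # Inserting a site at 30 creates a shorter loop.
    print(extrude_loop([10, 30, 40, 70], 25, 100))  # (10, 30)
```

In this simplified picture, editing one site is enough to enlarge or shrink a loop, which mirrors the destroy, move, and create outcomes described above. The model ignores many details of real chromatin.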
These results are an important step in the process of understanding how the genome folds – and they come just as the National Institutes of Health is launching a 10-year initiative, called the 3D Nucleome project, to explore exactly these questions.
“CTCF sites function like a code for genome folding. Now that we’ve begun to crack the code, we can understand and control the folding process,” said Eric Lander, co-corresponding author on the paper and director of the Broad Institute.
Others who took part in this work include Su-Chen Huang, Neva C. Durand, Miriam H. Huntley, Andrew Jewett, Ivan D. Bochkov, Dharmaraj Chinnappan, Ashok Cutkosky, Jian Li, Kristopher P. Geeting, Doug McKenna and Elena K. Stamenova, all with the Center for Genome Architecture at Baylor; and Andreas Gnirke and Alexandre Melnikov, both with the Broad Institute of MIT and Harvard. Aiden, Sanborn, and Li are also with the Center for Theoretical Biological Physics at Rice University. McKenna is also with Mathemaesthetics, Inc., in Boulder, Colorado.