CRISPR spacers - asymmetry and orientation

CRISPR arrays act as the memory of CRISPR adaptive immunity systems, storing “spacers,” fragments of DNA collected from past invasions. But these spacers need to be loaded into the array in the correct orientation for the system to identify foreign DNA sequences and trigger cleavage. A recent paper in the Journal of Biological Chemistry, led by Bailey Lab PhD student Anita Ramachandran, showed that in Escherichia coli, the way that certain exonucleases trim prespacers bound by the Cas1-Cas2 complex results in asymmetrical prespacer overhangs, helping the system insert the spacers in the correct orientation. The paper was selected for an Editor’s Pick Highlight: A moonlighting nuclease puts CRISPR in its place. The work was funded by the NIH’s National Institute of General Medical Sciences.

Spacers are added to CRISPR arrays during the adaptation stage. In most CRISPR systems, including E. coli’s, a complex of two proteins, Cas1-Cas2, captures DNA fragments, called “prespacers,” that include a protospacer adjacent motif, or PAM. These prespacers are thought to come from various degradation pathways, and in vitro experiments show that the prespacers most efficiently integrated into CRISPR arrays have 23 base pairs of duplex DNA plus 5 nt long 3’ overhangs on either side.

Cas1-Cas2 can integrate a bound prespacer into the CRISPR array in two steps, first inserting the 3’ end of one of the prespacer strands into the CRISPR array at the junction between the leader and the first repeat in the array (leader side), and then inserting the 3’ end of the other strand between the spacer and the repeat (spacer side). During this process, some or all of the PAM sequence is removed, which allows the system to distinguish between a foreign DNA target, which has a PAM, and the cell’s CRISPR array, which does not. Inserting the prespacer in the correct orientation is critical, as shown in the figure below. Only one strand of the spacer will be transcribed into the crRNA used to recognize foreign DNA fragments. Cascade, the complex of proteins that uses the crRNA to identify foreign DNA and target it for destruction, only “checks” DNA strands with the PAM sequence (here shown in purple), so the crRNA must be complementary to that strand for the CRISPR machinery to recognize the target.

Two schematics showing how spacer orientation determines functionality – the correct orientation produces a crRNA that pairs to the DNA strand 5’ to PAM – the “top” strand in the diagram; the other pairs to the “bottom” strand.
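
To make the orientation requirement concrete, here is a minimal Python sketch. It is only an illustration: the 8-base sequence and the helper names are made up for this example, and real spacers are much longer. It asks whether the crRNA can base-pair with the strand that Cascade checks, depending on which way round the spacer duplex was inserted.

```python
# Toy illustration of spacer orientation; sequences and names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def crrna_from_spacer(sense_strand: str) -> str:
    """RNA copy of the spacer strand that matches the transcript sequence."""
    return sense_strand.replace("T", "U")


def can_pair(crrna: str, target_strand: str) -> bool:
    """True if the crRNA is fully complementary to the target DNA strand."""
    return crrna.replace("U", "T") == reverse_complement(target_strand)


# Hypothetical stretch of the DNA strand that Cascade checks.
target_strand = "ATGGCTTC"

# Correct orientation: the transcript is complementary to the checked strand.
correct_spacer = reverse_complement(target_strand)
# Same duplex inserted the other way round: the other strand is transcribed.
flipped_spacer = reverse_complement(correct_spacer)

correct_ok = can_pair(crrna_from_spacer(correct_spacer), target_strand)
flipped_ok = can_pair(crrna_from_spacer(flipped_spacer), target_strand)
print(correct_ok, flipped_ok)
```

In this toy case only the correctly oriented spacer yields a crRNA that pairs with the checked strand; the flipped spacer yields a crRNA identical in sequence to that strand, so it cannot pair with it.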

Some CRISPR systems have Cas4 proteins that play a role in processing prespacers and ensuring that they are integrated in the correct orientation, but not all systems have Cas4. For example, Streptococcus thermophilus does not have Cas4, but its Cas2 is fused to a DnaQ-like domain that can trim the 3’ overhangs of prespacers and promote integration of unprocessed prespacers in vitro.

The E. coli Type I-E system, however, has neither a Cas4 nor a DnaQ domain fused to its Cas2, yet it still processes and inserts spacers in the correct orientation in vivo. In contrast, experiments looking at the integration of prespacers in vitro have shown a mix of orientations. Anita and her colleagues wanted to explore how the E. coli Type I-E system processes prespacers and properly inserts them into the array.

Based on the S. thermophilus Cas2 fusion’s role in adaptation, Anita wondered if a similar activity occurs in the E. coli Type I-E system. She identified three 3’->5’ exonucleases from E. coli to test: ExoT, ExoI, and the catalytic domain of DnaQ.

Though most of the previous in vitro biochemistry has been done with preprocessed prespacers, with a 23 base-pair duplex region and 5 nucleotide overhangs, the DNA fragments that Cas1-Cas2 encounters in cells have varied overhang lengths. So Anita and colleagues took a step back to look at the processing and integration of unprocessed prespacers, with 15 nucleotide overhangs on either side of the 23 bp duplex.
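
The difference between the two substrates is simple arithmetic, sketched below in Python. The class and attribute names are invented for this example; the only numbers taken from the text are the 23 bp duplex and the 5 nt and 15 nt overhangs, and the sketch assumes both strands carry a 3’ overhang of the same length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prespacer:
    """Hypothetical description of a prespacer with equal 3' overhangs."""

    duplex_bp: int    # length of the paired region
    overhang_nt: int  # length of the 3' overhang on each strand

    @property
    def strand_length(self) -> int:
        # Each strand is the duplex region plus its own 3' overhang.
        return self.duplex_bp + self.overhang_nt

    @property
    def span(self) -> int:
        # End-to-end footprint: duplex plus one overhang on each side.
        return self.duplex_bp + 2 * self.overhang_nt


processed = Prespacer(duplex_bp=23, overhang_nt=5)
unprocessed = Prespacer(duplex_bp=23, overhang_nt=15)

# Nucleotides that would have to be removed from each 3' end to get from
# the unprocessed substrate to the preprocessed one.
to_trim_per_strand = unprocessed.overhang_nt - processed.overhang_nt

print(processed.strand_length, unprocessed.strand_length, to_trim_per_strand)
```

Under these assumptions, each strand of the unprocessed substrate is 10 nt longer at its 3’ end than the preprocessed version.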

To monitor for integration, they used a plasmid topology assay, where different combinations of prespacer sequences and proteins are added to a reaction with a supercoiled plasmid containing a CRISPR array. The integration of a prespacer, even a half-site integration where just one end is inserted, can be detected when the plasmids are run on a gel. PCR assays can test the orientation of the integrated spacers.

While the Cas1-Cas2 complex alone did not integrate unprocessed prespacers into the CRISPR array, adding either DnaQ or ExoT (but not ExoI) to the reaction resulted in integration, and those integration events were strongly biased toward the correct orientation. That bias only occurred with unprocessed prespacers containing the PAM: if the prespacer had already been trimmed down to the 5 nt 3’ overhangs, or did not contain a PAM sequence, it was integrated in both orientations at similar frequencies, even in the presence of DnaQ or ExoT.

Given these results, Anita wondered if the exonucleases might trim the PAM and non-PAM strand overhangs differently. Using labeled unprocessed prespacers, she showed that when the prespacer was bound to Cas1-Cas2, there were differences in how the strands were cleaved. The non-PAM strand’s overhang was cleaved to 4 nt (DnaQ) or 6 nt (ExoT), while both enzymes left a 9-10 nt overhang on the PAM side. This resulted in an asymmetrical prespacer, with only the non-PAM strand close to the optimal overhang. The longer overhang on the PAM side is likely due to protection by the C-terminal tail of Cas1, which specifically recognizes and makes contacts with the PAM sequence.

Anita and colleagues then looked more closely at the integration of prespacers with symmetrical overhangs of different lengths. They found that integration was tolerant of longer overhangs (6 or 7 nt) for the insertion of the non-PAM strand into the leader side of the repeat, but for the spacer side of the repeat or the PAM strand, the overhang had to be 5 nt to allow efficient integration. Since the leader side is where prespacers are integrated first, the only strand that could be inserted first would be the non-PAM strand, as the PAM strand’s overhang would be too long, which blocks insertion in the incorrect orientation. In fact, when Anita looked at the integration of the non-PAM strand over a time series, she found that while, over time, a prespacer with 5 nt overhangs on each strand could be integrated at either the leader or the spacer side, if the PAM overhang was increased to either 7 or 9 nt, the non-PAM strand was only inserted at the leader side.
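
The length rules described above can be written down as a toy predicate. This Python sketch is a simplification for illustration, not the analysis used in the paper: the function name and constants are invented here, the permitted lengths are just the values quoted in the text (5 to 7 nt for the non-PAM strand at the leader side, 5 nt otherwise), and overhang lengths the text does not discuss are treated as not integrating.

```python
# Overhang lengths (nt) quoted in the text as compatible with efficient
# integration. Other lengths are simply treated as "not allowed" here.
LEADER_SIDE_NON_PAM_OK = {5, 6, 7}  # leader side tolerates a longer non-PAM overhang
STRICT_OK = {5}                     # PAM strand (and spacer side) need 5 nt


def leader_side_first_step(non_pam_overhang: int, pam_overhang: int) -> list[str]:
    """List which strand(s) could make the first, leader-side insertion."""
    options = []
    if non_pam_overhang in LEADER_SIDE_NON_PAM_OK:
        options.append("non-PAM strand")  # leads to the correct orientation
    if pam_overhang in STRICT_OK:
        options.append("PAM strand")      # leads to the incorrect orientation
    return options


# Symmetrical, fully trimmed prespacer: either strand can go first.
symmetric = leader_side_first_step(non_pam_overhang=5, pam_overhang=5)
# Asymmetrical prespacer, e.g. 6 nt non-PAM and 9 nt PAM overhangs.
asymmetric = leader_side_first_step(non_pam_overhang=6, pam_overhang=9)
# Unprocessed prespacer with 15 nt overhangs on both strands.
untrimmed = leader_side_first_step(non_pam_overhang=15, pam_overhang=15)

print(symmetric, asymmetric, untrimmed)
```

In this simplified picture, a symmetrical 5 nt prespacer has two possible first steps and therefore two possible orientations, while a longer PAM-side overhang leaves only the non-PAM strand able to start at the leader side.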

Putting all of these details together, Anita developed a model. The protection of the PAM sequence by Cas1-Cas2 causes DnaQ, ExoT or similar exonucleases to trim prespacer overhangs to different lengths on each strand: around 5 nt on the non-PAM strand, but longer on the PAM strand. This leaves one option for the first step of the integration, the non-PAM strand inserting into the leader side of the repeat, as shown below.

Cas1-Cas2 bound to unprocessed prespacer with overhangs; trimmed by DnaQ, leaving grey overhang on PAM side & optimal integration length of green spacer overhang on other; spacer side integration of non-PAM strand blocked; integrates at leader side

The model continues (see below) with subsequent processing that trims down the PAM strand’s excess overhang and PAM sequence, and then it is integrated into the spacer side of the repeat, in the correct orientation. Sequencing of the integrations showed that in the DnaQ reactions, in most cases both overhangs were ultimately trimmed to 5 nt, properly removing the first two bases of the PAM, while in the ExoT reactions the majority of spacers were trimmed to 6 nt, resulting in a spacer with extra bases, including two bases of the PAM sequence.

DnaQ trims off grey overhang next to PAM; that end integrates at repeat spacer side, lower strand; integration “flips” open grey repeat into single stranded regions flanking prespacer; gaps are filled with the new spacer in the correct orientation.

This model answers some of Anita’s questions – and opens up new areas to explore. One is to identify the enzyme, or enzymes, that trim the overhangs in the cell – though ExoT and the DnaQ domain could process the overhangs in vitro, it’s possible others are involved in the cell. Another area is how the PAM strand is processed after the non-PAM strand has been integrated. 

Anita led the work, with guidance from Scott and lab work assistance from former ScM student Lesley Summerville, Research Associate Brian Learn, and Ingenuity Project at Baltimore Polytechnic Institute high school student Lily DeBell.

For more details about this work, check out the paper and the highlight at JBC.

Want to learn more about the work this builds on? Check out these papers:

Cas1-Cas2 spacer insertion process

Nuñez JK, Lee ASY, Engelman A, Doudna JA. 2015. Integrase-mediated spacer acquisition during CRISPR-Cas adaptive immunity. Nature 519(7542):193–198.

Rollie C, Schneider S, Brinkmann AS, Bolt EL, White MF, Nilsen TW. 2015. Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition. eLife Sciences 4:e08716.

Processing Prespacers

Nuñez JK, Harrington LB, Kranzusch PJ, Engelman AN, Doudna JA. 2015. Foreign DNA capture during CRISPR-Cas adaptive immunity. Nature 527:535–538.

Wang J, Li J, Zhao H, Sheng G, Wang M, Yin M, Wang Y. 2015. Structural and Mechanistic Basis of PAM-Dependent Spacer Acquisition in CRISPR-Cas Systems. Cell 163:840–853.

Bias for correct orientation in vivo

Shipman SL, Nivala J, Macklis JD, Church GM. 2016. Molecular recordings by directed CRISPR spacer acquisition. Science 353:aaf1175.

Cas4 protein’s role in processing and insertion

Kieper SN, Almendros C, Behler J, McKenzie RE, Nobrega FL, Haagsma AC, Vink JNA, Hess WR, Brouns SJJ. 2018. Cas4 Facilitates PAM-Compatible Spacer Selection during CRISPR Adaptation. Cell Reports 22:3377–3384.

Lee H, Zhou Y, Taylor DW, Sashital DG. 2018. Cas4-Dependent Prespacer Processing Ensures High-Fidelity Programming of CRISPR Arrays. Mol. Cell 70:48–59.e5.

Lee H, Dhingra Y, Sashital DG. 2019. The Cas4-Cas1-Cas2 complex mediates precise prespacer processing during CRISPR adaptation. eLife Sciences. doi:10.7554/eLife.44248.

Rollie C, Graham S, Rouillon C, White MF. 2018. Prespacer processing and specific integration in a Type I-A CRISPR system. Nucl. Acids Res. 46:1007–1020.

Shiimori M, Garrett SC, Graveley BR, Terns M. 2018. Cas4 Nucleases Define the PAM, Length, and Orientation of DNA Fragments Integrated at CRISPR Loci. Mol. Cell 70:814–824.e6.

Cas2 fusion in Streptococcus thermophilus

Drabavicius G, Sinkunas T, Silanskas A, Gasiunas G, Venclovas Č, Siksnys V. 2018. DnaQ exonuclease-like domain of Cas2 promotes spacer integration in a type I-E CRISPR-Cas system. EMBO reports 19:e45543.

Leader-side insertion of prespacers

Xiao Y, Ng S, Nam KH, Ke A. 2017. How type II CRISPR-Cas establish immunity through Cas1-Cas2-mediated spacer integration. Nature 550:137–141.

Nuñez JK, Bai L, Harrington LB, Hinder TL, Doudna JA. 2016. CRISPR Immunological Memory Requires a Host Factor for Specificity. Mol. Cell 62:824–833.

Wright AV, Liu J-J, Knott GJ, Doxzen KW, Nogales E, Doudna JA. 2017. Structures of the CRISPR genome integration complex. Science 357:1113–1118.

PAM sequence removal

Datsenko KA, Pougach K, Tikhonov A, Wanner BL, Severinov K, Semenova E. 2012. Molecular memory of prior infections activates the CRISPR/Cas adaptive bacterial immunity system. Nature Communications 3:945.

Goren MG, Yosef I, Auster O, Qimron U. 2012. Experimental Definition of a Clustered Regularly Interspaced Short Palindromic Duplicon in Escherichia coli. J. Mol. Biol. 423:14–16.

Swarts DC, Mosterd C, van Passel MWJ, Brouns SJJ. 2012. CRISPR Interference Directs Strand Specific Spacer Acquisition. PLoS ONE 7:e35888.

Wang J, Li J, Zhao H, Sheng G, Wang M, Yin M, Wang Y. 2015. Structural and Mechanistic Basis of PAM-Dependent Spacer Acquisition in CRISPR-Cas Systems. Cell 163:840–853.