How CRISPR Actually Works

In 2012, a paper appeared in the journal Science that many biologists quickly recognized as one of the most consequential in the history of the field. It described a system, originally discovered as a bacterial immune defense, that could be repurposed to cut almost any chosen DNA sequence, in any organism, with unprecedented precision. Its senior authors were Jennifer Doudna and Emmanuelle Charpentier. Eight years later, they shared the Nobel Prize in Chemistry. In the decade since, CRISPR has moved from a laboratory curiosity to clinical reality, with the first therapy approved in 2023.

But most explanations of CRISPR are unsatisfying. The popular version, "molecular scissors that cut DNA," tells you approximately nothing about why it works, how it finds its target, what happens after the cut, or why it is so much more powerful than previous gene-editing tools. The real story requires understanding the molecular machinery in some detail, and that detail is worth having, because it reveals why CRISPR represents not just a better tool but a fundamentally different category of tool.

This is that story. Not the simplified version. The actual mechanism, from bacterial immune system to Nobel Prize to FDA-approved therapy, and what it genuinely can and cannot do.


Part I: The bacterial immune system nobody knew existed

The story begins not with Jennifer Doudna but with a Japanese graduate student named Yoshizumi Ishino, who in 1987 was sequencing a gene in E. coli for an entirely unrelated reason. In the data, he noticed something strange: a series of short repeated DNA sequences, separated by unique spacer sequences of similar length. He published them as a curiosity and moved on. He had no idea what they were.

Over the following years, other researchers noticed similar repeating sequences in various bacteria and archaea. In 2002, a Dutch scientist named Ruud Jansen formally named them: Clustered Regularly Interspaced Short Palindromic Repeats, or CRISPR. The name is a mouthful, but the pattern it describes is distinctive enough to recognize across species: short palindromic sequences (in DNA, sequences that read the same 5' to 3' on both strands), appearing in clusters, with unique sequences between them.

What were they for? Nobody knew. The sequences sat in the literature as an unexplained observation: interesting enough to name, not interesting enough to pursue. Then, in the early 2000s, Francisco Mojica, a microbiologist at the University of Alicante in Spain, did something simple that others hadn't thought to do: he compared the spacer sequences (the unique bits between the repeats) against publicly available DNA databases.

📜 The Discovery Nobody Published

Mojica's database searches returned striking matches: the spacer sequences corresponded to sequences found in bacteriophages, the viruses that infect bacteria. He submitted his findings to Nature in 2003. The paper was rejected. He submitted to PNAS. Rejected. Then Molecular Microbiology and Nucleic Acids Research. Rejected. It was finally published in the Journal of Molecular Evolution in 2005, two years after the first submission. His hypothesis: CRISPR sequences were fragments of viral DNA that bacteria had incorporated into their own genomes as a kind of molecular immune memory. He was right. But by the time the scientific community caught up, others had independently reached similar conclusions, and the credit for the therapeutic application would go to others entirely.

The logic of Mojica's hypothesis is elegant. When a bacterium survives a phage attack, it can store a short piece of the phage's DNA between the repeats in its CRISPR array, building a molecular archive of past infections. If that phage attacks again, the bacterium can recognize it and destroy it. This is adaptive immunity in its simplest form: a memory system that records encounters with pathogens and uses them to defend against future attacks.
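
The store-and-recognize logic can be sketched as a toy model in Python. This is an illustration under loud assumptions, not a simulation of real biology: the class name `CrisprArray`, the fixed 20-letter spacer length, the made-up phage sequences, and the plain substring match are all hypothetical simplifications (real systems also involve both DNA strands and the Cas proteins).

```python
class CrisprArray:
    """Toy model of a CRISPR array: a list of spacers copied from past invaders."""

    def __init__(self, spacer_length: int = 20):
        # Spacer length is an illustrative choice, not a measured value.
        self.spacer_length = spacer_length
        self.spacers: list[str] = []

    def acquire(self, phage_dna: str, start: int = 0) -> str:
        """Store a short fragment of the invader's DNA as a new spacer."""
        spacer = phage_dna[start:start + self.spacer_length]
        if len(spacer) < self.spacer_length:
            raise ValueError("phage DNA too short to supply a full spacer")
        if spacer not in self.spacers:
            self.spacers.append(spacer)
        return spacer

    def recognizes(self, phage_dna: str) -> bool:
        """True if any stored spacer matches a stretch of the incoming DNA."""
        return any(spacer in phage_dna for spacer in self.spacers)


# Made-up sequences standing in for two unrelated phages.
phage_a = "ATGCGTACGTTAGCCTAGGATCCGATTACAGG"
phage_b = "TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG"

array = CrisprArray()
print(array.recognizes(phage_a))  # no memory yet
array.acquire(phage_a, start=5)   # the cell survives an attack and stores a spacer
print(array.recognizes(phage_a))  # the same phage is now recognized
print(array.recognizes(phage_b))  # an unrelated phage is not
```

The memory here is nothing more than stored text plus a lookup, which is the point of the hypothesis: the array is a record, and recognition is a comparison against that record.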

By 2007, this hypothesis had been experimentally confirmed. Bacteria exposed to phages incorporated new spacers from those phages. Bacteria with spacers matching a phage were resistant to it. Delete the spacers, and resistance was lost. CRISPR was a genuine immune memory system, the first example of adaptive immunity ever found in prokaryotes. That result earned the scientists who confirmed it a place in history. But the tool that would change biology hadn't been built yet. That required understanding not just what CRISPR stored, but how the cell used that information.

Part II: The molecular mechanism in detail

Understanding CRISPR as a tool requires understanding CRISPR as an immune system, specifically how bacteria use the stored viral sequences to detect and destroy invaders. The process has three stages: acquisition (storing new viral sequences), expression (producing the molecular machinery), and interference (using that machinery to destroy matching DNA). It's the interference stage that Doudna and Charpentier figured out how to harness.

The cast of molecules

CRISPR doesn't work alone. It works with a set of associated proteins called Cas proteins (CRISPR-associated proteins). Different bacterial species use different Cas proteins with different properties; there are multiple classes and types of CRISPR systems. The one Doudna and Charpentier studied, and the one used in most early applications, is Type II, which uses a single large protein called Cas9.

Cas9 from Streptococcus pyogenes (SpCas9) became the workhorse of the field. It's a 1,368-amino-acid protein with two nuclease domains, HNH and RuvC, each responsible for cutting one strand of the DNA double helix. When both domains cut, you get a double-strand break: both strands severed at the same position. This is the "cut" in "molecular scissors." But scissors without guidance are useless. What guides Cas9 is RNA.

The guide RNA: the address label

In the natural bacterial system, the CRISPR array is transcribed into a long RNA molecule that is then processed into short pieces called crRNAs (CRISPR RNAs). Each crRNA contains the sequence from one spacer โ€” one piece of viral memory. The crRNA pairs with a second RNA called tracrRNA (trans-activating CRISPR RNA), and this two-part complex binds Cas9 and loads it with targeting information.
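
As a rough picture of that processing step, the sketch below treats a CRISPR array as text and pulls out one spacer per crRNA. It is a toy under stated assumptions: the repeat and spacer sequences are made up, `split_array` and `to_rna` are hypothetical helpers, and real processing is carried out by cellular enzymes acting on RNA, not by string splitting.

```python
def split_array(array_dna: str, repeat: str) -> list[str]:
    """Return the spacers found between copies of the repeat sequence."""
    return [piece for piece in array_dna.split(repeat) if piece]


def to_rna(dna: str) -> str:
    """Write a DNA sequence in RNA letters (T becomes U)."""
    return dna.replace("T", "U")


REPEAT = "GATTCCAAGG"                       # made-up repeat, for illustration only
SPACERS = ["ACGTACGTAAGG", "TTGACCATGCAA"]  # made-up spacers

# Build repeat-spacer-repeat-spacer-repeat, the layout of a CRISPR array.
array_dna = REPEAT + REPEAT.join(SPACERS) + REPEAT

crrna_spacers = [to_rna(s) for s in split_array(array_dna, REPEAT)]
print(crrna_spacers)
```

Each recovered piece corresponds to one unit of viral memory, written in RNA letters because the array is transcribed before it is used.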

Doudna and Charpentier's key insight was that these two RNAs could be fused into a single molecule, the single guide RNA (sgRNA), that retained full functionality. This simplification was crucial. Instead of needing two separate RNA components, you need one. And because the targeting sequence is just RNA, a molecule that can be designed on a computer and synthesized cheaply, you can direct Cas9 to almost any DNA sequence you want by simply writing the appropriate guide RNA sequence. This is why CRISPR is programmable. Change 20 letters of RNA, and you change the target.
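
To make the "change 20 letters" point concrete, here is a minimal sketch. It assumes the usual convention that the guide's 20-letter targeting region is written like the target DNA's protospacer with U in place of T. `guide_spacer_for` is a hypothetical helper, the target sequences are made up, and the constant scaffold portion of the sgRNA that binds Cas9 is left out.

```python
def guide_spacer_for(protospacer_dna: str) -> str:
    """Return the 20-nt RNA targeting sequence for a 20-bp DNA protospacer."""
    protospacer = protospacer_dna.upper()
    if len(protospacer) != 20 or set(protospacer) - set("ACGT"):
        raise ValueError("expected exactly 20 letters from A, C, G, T")
    return protospacer.replace("T", "U")


# Two different targets need two different guides; the Cas9 protein is unchanged.
print(guide_spacer_for("GACCTGAAGCTTGGCATCAG"))
print(guide_spacer_for("TTGACCATGCAAGGCTTACA"))
```

Nothing about the protein appears in this function, which is the practical meaning of "programmable": retargeting is a text edit to the RNA.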

🎯 Cas9 as a Search Engine

Think of Cas9 as a search engine for DNA. The guide RNA is the search query: a 20-letter sequence that defines the target. Cas9 loads the query, then samples stretches of DNA, comparing the sequence it finds against the query. When it finds a match next to a short adjacent motif called the PAM sequence, it stops and makes the cut. Change the query, and you change what the search engine looks for. The same Cas9 protein, redirected by a different guide RNA, will find and cut a completely different location in the genome. The programmability is entirely in the RNA.

The PAM sequence: the safety lock

There's one more requirement for Cas9 to cut: a short DNA sequence immediately adjacent to the target site called the PAM (Protospacer Adjacent Motif). For SpCas9, the PAM sequence is NGG (any nucleotide followed by two guanines). Cas9 cuts only if a PAM is present on the non-target strand, immediately 3' of the 20-base-pair target sequence.

Why does this exist? The PAM is a self/non-self discrimination mechanism. In the bacterium's own CRISPR array, the spacer sequences are flanked by repeat sequences, not PAM sequences, so Cas9 cannot cut its own immune memory. It cuts only DNA that has a PAM next to the target sequence, which is a feature of foreign DNA (phage DNA) but not of the CRISPR array itself. In the context of gene editing, the PAM is a constraint: you can only target sites next to a PAM, and some genomic sequences you might want to edit lack one nearby. Researchers have since engineered modified Cas9 variants with different PAM requirements to expand the targeting range.
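
The PAM constraint is easy to see in code. The sketch below lists every 20-letter stretch that is immediately followed by NGG, on both strands of a short made-up sequence. `find_target_sites` is a hypothetical helper written for this article, not part of any guide-design tool, and positions on the "-" strand are reported in the coordinates of the reverse-complemented string.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(dna: str) -> list[tuple[str, int, str, str]]:
    """List (strand, start, protospacer, pam) for every 20-mer followed by NGG."""
    dna = dna.upper()
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        # A lookahead keeps matches zero-width, so overlapping sites are all found.
        for match in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


dna = "TTGACCTGAAGCTTGGCATCAGTGGTT"  # made-up sequence
for site in find_target_sites(dna):
    print(site)
```

Any stretch that does not appear in this list cannot be targeted by SpCas9 directly, however well a guide matches it, which is why variants with other PAM requirements are useful.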

🤔 What actually happens at the molecular level when Cas9 finds its target?

Cas9 first scans DNA by binding transiently and checking for PAM sequences. When it encounters a PAM, it unwinds the adjacent DNA and allows the guide RNA to attempt base pairing with the target strand. If the 20-base-pair match is sufficient (a few mismatches can be tolerated), Cas9 undergoes a conformational change: its two lobes close around the DNA like a clamp. This triggers both nuclease domains: HNH cuts the strand complementary to the guide RNA, and RuvC cuts the other strand. The result is a blunt-ended double-strand break, typically 3 base pairs upstream of the PAM sequence. The entire process, from PAM recognition to cleavage, takes only a few seconds once the target is found.
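
The position of the cut follows directly from that description. As a minimal sketch, assuming top-strand coordinates and a made-up sequence, the hypothetical helper `cut_at_site` below checks for an NGG PAM and splits the DNA 3 base pairs upstream of it.

```python
def cut_at_site(dna: str, protospacer_start: int, protospacer_len: int = 20):
    """Split dna at the blunt cut 3 bp upstream of the PAM.

    Returns (left_fragment, right_fragment) in top-strand coordinates.
    """
    pam_start = protospacer_start + protospacer_len
    pam = dna[pam_start:pam_start + 3]
    if len(pam) != 3 or pam[1:] != "GG":
        raise ValueError("no NGG PAM next to this site, so Cas9 would not cut")
    cut_index = pam_start - 3
    return dna[:cut_index], dna[cut_index:]


dna = "TTGACCTGAAGCTTGGCATCAGTGGTT"  # made-up: 2 bases, 20-bp target, TGG, 2 bases
left, right = cut_at_site(dna, protospacer_start=2)
print(left)
print(right)
```

The right-hand fragment begins with the last three bases of the target followed by the PAM, so the break sits inside the target sequence, not at its edge.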


Part III: After the cut, how editing actually happens

Making a double-strand break is only half the story. The cut itself doesn't edit anything; it just creates a break that the cell must repair. How the cell repairs that break determines what kind of edit results. There are two main repair pathways, and choosing between them (or biasing the cell toward one) determines which type of edit you get.

NHEJ: the imprecise repair

Non-homologous end joining (NHEJ) is the cell's quick-and-dirty repair mechanism. It simply glues the two broken ends back together, but the ligation is imprecise: small insertions or deletions of one or a few bases (called indels) are often introduced at the cut site. In a protein-coding gene, an indel whose length is not a multiple of three causes a frameshift, scrambling every codon downstream and usually producing a non-functional, truncated protein.
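
A small sketch shows why the length of the indel matters. The coding sequence is made up, and `codons`, `delete_bases`, and `is_frameshift` are hypothetical helpers; the deletion is a stand-in for what NHEJ might leave behind, not a model of the repair itself.

```python
def codons(cds: str) -> list[str]:
    """Split a coding sequence into complete three-letter codons."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def delete_bases(cds: str, cut_index: int, n: int) -> str:
    """Remove n bases starting at cut_index, mimicking a small deletion."""
    return cds[:cut_index] + cds[cut_index + n:]


def is_frameshift(net_length_change: int) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return net_length_change % 3 != 0


cds = "ATGGCTGAAGTTCTGAAACCGTAA"  # made-up gene: start codon, six codons, stop

print(codons(cds))
print(codons(delete_bases(cds, 7, 1)))  # 1-base deletion: every later codon changes
print(codons(delete_bases(cds, 6, 3)))  # 3-base deletion: one codon lost, frame kept
```

In this example the one-base deletion happens to create a TGA stop codon a few codons downstream, which is the kind of premature stop that truncates a protein.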

This sounds like a flaw, but it's actually useful. If you want to knock out a gene (disable it entirely), NHEJ is your friend. Direct Cas9 to cut in the coding sequence of your target gene, let NHEJ repair it messily, and the result is a disrupted gene that no longer produces a functional protein. This is the simplest form of CRISPR editing and the most reliable: high efficiency, no additional template needed, works in almost any cell type.

HDR: the precise repair

Homology-directed repair (HDR) is the cell's precise repair mechanism. It uses a DNA template with sequences matching both sides of the break to repair the cut accurately, essentially copying the template sequence into the cut site. If you provide your own synthetic DNA template alongside Cas9 and guide RNA, the cell can incorporate your template sequence into the genome. This allows precise editing: changing specific base pairs, correcting disease-causing mutations, or inserting new sequences at exact locations.
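
The template idea can be sketched as follows. This is a toy under stated assumptions: the sequence is made up, `build_donor` and `apply_donor` are hypothetical helpers, the 10-letter homology arms are far shorter than those used in practice, and the "repair" is a string operation standing in for the cell's machinery.

```python
def build_donor(genome: str, start: int, end: int, new_seq: str, arm_len: int = 10):
    """Return (left_arm, new_seq, right_arm) for replacing genome[start:end]."""
    if start < arm_len or end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    return genome[start - arm_len:start], new_seq, genome[end:end + arm_len]


def apply_donor(genome: str, donor) -> str:
    """Copy the donor's middle section between the two matching arms."""
    left_arm, new_seq, right_arm = donor
    left_at = genome.find(left_arm)
    if left_at == -1:
        raise ValueError("left homology arm not found in this genome")
    right_at = genome.find(right_arm, left_at + len(left_arm))
    if right_at == -1:
        raise ValueError("right homology arm not found in this genome")
    return genome[:left_at + len(left_arm)] + new_seq + genome[right_at:]


genome = "ATGGTGCATCTGACTCCTGTGGAGAAGTCTGCCGTTACTGCC"  # made-up toy sequence

# Replace the single base at index 19 (a T) with an A.
donor = build_donor(genome, start=19, end=20, new_seq="A")
edited = apply_donor(genome, donor)
print(genome[15:24], "->", edited[15:24])
```

The arms are what make the edit land in the right place: they match the sequence on either side of the break, and only the stretch between them differs from the original.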

HDR is more powerful but much less efficient than NHEJ. In most dividing human cells, HDR occurs in only 1 to 10% of edited cells, and it's essentially absent in non-dividing cells (such as neurons and other post-mitotic cells). This efficiency limitation is one of the major technical challenges in CRISPR therapeutics: many applications require precise correction of a mutation, not just disruption of a gene, and getting HDR to work reliably at therapeutic scale is an ongoing research challenge.

⚡ Base Editors and Prime Editors

The HDR efficiency problem spawned an entirely new category of tools. Base editors, developed in David Liu's lab at the Broad Institute in 2016, fuse a catalytically impaired Cas9 (which binds its target but cannot make a double-strand break) to a deaminase enzyme that directly converts one DNA base to another. Cytosine base editors convert C·G to T·A; adenine base editors convert A·T to G·C. They're more efficient than HDR and don't create the risky double-strand break. Prime editors, also from Liu's lab (2019), fuse a reverse transcriptase to a Cas9 nickase and use an extended guide RNA that also encodes the desired edit, writing new sequence directly into the genome like a "search and replace" function. They can make all 12 possible point mutations, as well as small insertions and deletions, without double-strand breaks or a separate donor DNA template. Together, these tools have expanded what precise genome editing can accomplish beyond what the original CRISPR-Cas9 system allowed.
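
A cytosine base editor's effect on a target can be sketched in a few lines. The editing window used here (protospacer positions 4 to 8, counted from the end farthest from the PAM) is an assumption for illustration and varies between editors; `cytosine_base_edit` is a hypothetical helper, and the sequence is made up.

```python
def cytosine_base_edit(protospacer: str, window: tuple[int, int] = (4, 8)) -> str:
    """Convert every C inside the editing window to T (positions are 1-based)."""
    first, last = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if first <= position <= last and base == "C":
            edited.append("T")  # C-to-T conversion, no double-strand break involved
        else:
            edited.append(base)
    return "".join(edited)


target = "GACCTGAAGCTTGGCATCAG"  # made-up protospacer
print(target)
print(cytosine_base_edit(target))
```

Only the C at position 4 changes in this example; the Cs at positions 3 and 10 fall outside the assumed window and are left alone, which is why guide placement matters for base editing.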

🤔 Can CRISPR make off-target cuts, and how often does this happen?

Yes. Cas9 can cut at sites with sequence similarity to the guide RNA, not just perfect matches. This is the "off-target" problem. The frequency depends heavily on how similar the off-target site is to the intended target, the concentration of Cas9/guide RNA, and the specific guide sequence. Early CRISPR studies found off-target rates of 1 to 5% in some cases, which was concerning for therapeutic applications. However, the field has responded with several solutions: high-fidelity Cas9 variants (eSpCas9, HiFi Cas9) that require more precise base pairing and have dramatically lower off-target rates; truncated guide RNAs (17 to 18 nucleotides instead of 20) that are more discriminating; and paired Cas9 "nickase" approaches that require two nicks in close proximity to generate a double-strand break. Whole-genome sequencing of edited cells suggests that current optimized approaches can achieve off-target rates below 1 in a million, comparable to the spontaneous mutation rate of cell division itself.
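
Guide-design tools screen for this risk by looking for near-matches elsewhere in the genome. The sketch below is a heavily simplified version of that idea, not any particular tool's method: `count_mismatches` and `candidate_sites` are hypothetical helpers, the three-mismatch threshold is an arbitrary illustration, the sequences are made up, and only one strand is scanned.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of positions where two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(guide, site))


def candidate_sites(guide: str, genome: str, max_mismatches: int = 3):
    """List (start, site, pam, mismatches) for NGG-flanked sites similar to the guide.

    Only the given strand is scanned, to keep the sketch short.
    """
    n = len(guide)
    hits = []
    for start in range(len(genome) - n - 2):
        site = genome[start:start + n]
        pam = genome[start + n:start + n + 3]
        if pam[1:] != "GG":
            continue  # no PAM, so Cas9 would not cut here
        mismatches = count_mismatches(guide, site)
        if mismatches <= max_mismatches:
            hits.append((start, site, pam, mismatches))
    return hits


guide = "GACCTGAAGCTTGGCATCAG"       # written as DNA for easy comparison
near_match = "GACCTGTAGCTTGGCTTCAG"  # differs from the guide at two positions
genome = "TT" + guide + "TGG" + "AAAA" + near_match + "AGG" + "TT"

for hit in candidate_sites(guide, genome):
    print(hit)
```

The intended site appears with zero mismatches and the near-match with two. A real screen would also weigh where in the guide the mismatches fall, which this sketch ignores.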


Part IV: From laboratory tool to clinical therapy

The gap between a molecular tool that works in a laboratory dish and a treatment that works safely in a human patient is enormous. Delivering CRISPR components to the right cells in a living body, achieving sufficient editing efficiency to produce a therapeutic effect, avoiding immune responses to the bacterial Cas9 protein, and ensuring the edits are both effective and safe: all of these presented challenges that took a decade to partially solve.

The delivery problem

CRISPR's three components (Cas9 protein, guide RNA, and optionally a repair template) must reach the nucleus of the target cells. In a dish, this is straightforward: electroporation (brief electrical pulses that temporarily permeabilize membranes) or lipid nanoparticles (fatty bubbles that fuse with cell membranes) deliver the components efficiently. In a living body, you need to get them to specific tissues while avoiding off-target delivery elsewhere.

The dominant delivery strategy for current clinical applications is ex vivo editing: take cells out of the patient, edit them in the lab, then reinfuse them. This sidesteps the in vivo delivery problem entirely. Blood cells are easy to extract, edit, and reinfuse, which is why the first approved CRISPR therapies target blood diseases. In vivo delivery (editing cells inside the body) is more challenging. Lipid nanoparticles have been successfully used to deliver CRISPR to the liver (which naturally takes up circulating nanoparticles). Adeno-associated viruses (AAVs) can deliver to muscle, eye, and nervous tissue. But many tissues remain difficult to reach efficiently.

Casgevy: the first approved CRISPR therapy

In December 2023, the FDA approved Casgevy (exagamglogene autotemcel) for sickle cell disease. The UK's medicines regulator had approved it weeks earlier, in November, making it the first CRISPR-based therapy to receive regulatory approval anywhere in the world. The approach is elegant: rather than trying to correct the sickle cell mutation directly, Casgevy uses CRISPR to reactivate fetal hemoglobin, a form of hemoglobin that's normally silenced after birth but doesn't have the sickle cell defect.

The target is an enhancer of the BCL11A gene: a regulatory DNA sequence that, in red blood cell precursors, drives production of BCL11A, a protein that silences the fetal hemoglobin genes. CRISPR cuts and disrupts this enhancer, relieving the silencing. The patient's own stem cells are extracted, edited ex vivo, and reinfused after conditioning (chemotherapy that clears the existing bone marrow). The intended result: fetal hemoglobin is produced for the rest of the patient's life, compensating for the defective adult hemoglobin. In clinical trials, the majority of patients had no severe vaso-occlusive crises (the agonizing episodes of pain caused by sickled cells blocking blood vessels) after treatment. Several patients described it as essentially curative. The treatment is a one-time procedure costing approximately $2.2 million per patient, a cost that raises its own profound equity questions.

⚠ The He Jiankui Affair

In November 2018, Chinese scientist He Jiankui announced that he had used CRISPR to edit human embryos, producing twin girls whose CCR5 gene had been disrupted in an attempt to confer HIV resistance. The scientific community condemned the experiment as premature, unethical, and scientifically unjustified. The editing was mosaic (incomplete), the medical justification was weak (the girls' father was HIV-positive, but established methods already prevent transmission from father to child), and the long-term effects of CCR5 disruption are unknown (CCR5 also plays a role in immunity to other pathogens, including West Nile virus). He was sentenced to three years in prison by Chinese authorities. The incident crystallized a global conversation about germline editing governance, one that remains unresolved. Most scientific bodies now call for a moratorium on heritable human genome editing until safety, efficacy, and societal implications are adequately assessed.

🤔 What diseases are CRISPR therapies being developed for right now?

The clinical pipeline is substantial. Blood diseases are the most advanced: sickle cell disease and beta-thalassemia (both using the fetal hemoglobin reactivation approach, now approved). Transthyretin amyloidosis, a degenerative disease caused by misfolded protein deposits, is being treated with in vivo CRISPR delivered to the liver via lipid nanoparticles; early results showed >90% reduction in the disease-causing protein. Leber congenital amaurosis (a form of inherited blindness) is being treated with in vivo CRISPR delivered directly to the retina. Cancer is a major focus: CRISPR-engineered CAR-T cells are in trials for multiple hematological malignancies. HIV therapies that attempt to excise integrated HIV DNA from infected cells are in early trials. Duchenne muscular dystrophy, familial hypercholesterolemia, chronic hepatitis B, and several others are in various stages of development. The field is moving very fast.


Part V: What CRISPR can't do (yet)

The excitement around CRISPR is warranted. But precision requires acknowledging its limitations, both current technical ones and more fundamental constraints.

CRISPR works best on single-gene disorders with clear loss-of-function or gain-of-function mutations. Sickle cell disease is caused by a single nucleotide change in a single gene, which makes it a perfect target. Most common diseases are not like this. Type 2 diabetes, heart disease, schizophrenia, most cancers: these are polygenic, involving hundreds or thousands of genetic variants each with small effects, plus environmental factors. You cannot CRISPR your way to lower heart disease risk by editing 500 loci in every cell.
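
A back-of-the-envelope calculation shows how quickly multi-locus editing breaks down. The numbers are assumptions chosen for illustration, not measurements: each edit is taken to succeed independently with 90% probability in a given cell, and `chance_all_edits_succeed` is a hypothetical helper.

```python
def chance_all_edits_succeed(per_edit_success: float, n_loci: int) -> float:
    """Probability that every one of n independent edits succeeds in one cell."""
    if not 0.0 <= per_edit_success <= 1.0:
        raise ValueError("per_edit_success must be between 0 and 1")
    if n_loci < 0:
        raise ValueError("n_loci must not be negative")
    return per_edit_success ** n_loci


# Assumed 90% success per edit; the fraction of fully edited cells
# shrinks exponentially as the number of loci grows.
for n_loci in (1, 10, 100, 500):
    print(n_loci, chance_all_edits_succeed(0.9, n_loci))
```

Even under this generous assumption, the fraction of cells carrying all 500 edits is vanishingly small, and that is before counting the off-target risk added by each extra guide.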

The delivery problem remains unsolved for many tissues. Brain, bone, fat, skin, lung: getting sufficient CRISPR editing efficiency in these tissues in vivo is still a research challenge. For diseases of the central nervous system, which would require editing neurons throughout the brain, no practical delivery approach currently exists at therapeutic scale.

The cost problem is profound. At $2.2 million for Casgevy, CRISPR therapies will initially be accessible only in the wealthiest health systems, for the most straightforward diseases. Scaling to global access for sickle cell disease, which disproportionately affects populations in sub-Saharan Africa and South Asia, requires dramatic cost reduction, which in turn requires manufacturing innovation, competition, and policy intervention that don't yet exist.

"CRISPR is not magic. It's a molecular tool that can cut specific sequences of DNA in specific cells. The gap between 'can cut DNA' and 'can cure disease' is filled with biology, delivery, medicine, and time."

And standard CRISPR-Cas9 edits the genome but not the epigenome, the layer of chemical modifications to DNA and histones that controls gene expression without changing the sequence. Many diseases involve epigenetic dysregulation rather than sequence mutations. New tools are being developed: CRISPR-based epigenome editors that can add or remove methylation marks at specific loci without cutting DNA. They don't yet have the track record of standard CRISPR, but they point toward a future where gene regulation, not just gene sequence, can be precisely controlled.

CRISPR Mechanisms

Select all statements that accurately describe how CRISPR-Cas9 gene editing works.

Guide RNA directs Cas9 to a specific DNA sequence
Cas9 edits DNA by directly rewriting base sequences
A PAM sequence is required adjacent to the target
CRISPR can edit any tissue equally well in vivo
NHEJ repair causes insertions/deletions at the cut site
HDR allows precise edits but is less efficient than NHEJ
The guide RNA must be 40 nucleotides long
Off-target cuts can occur at similar DNA sequences

Key Terms: CRISPR

CRISPR
Clustered Regularly Interspaced Short Palindromic Repeats: sequences in bacterial genomes that store viral DNA fragments as immune memory.
Cas9
The most widely used CRISPR-associated protein. A nuclease that cuts both strands of DNA when guided to a target sequence by sgRNA.
Guide RNA (sgRNA)
A synthetic single-guide RNA molecule encoding the 20-nucleotide targeting sequence that directs Cas9 to a specific genomic locus.
PAM Sequence
Protospacer Adjacent Motif: a short DNA sequence (NGG for SpCas9) required adjacent to the target for Cas9 to bind and cut.
NHEJ
Non-Homologous End Joining: imprecise DNA repair that introduces indels at the cut site. Used for gene disruption/knockout.
HDR
Homology-Directed Repair: precise repair using a template. Used for exact sequence changes. Active mainly in dividing cells, and less efficient than NHEJ.
Base Editor
A CRISPR variant that chemically converts one DNA base to another without making a double-strand break. More precise than standard CRISPR.
Prime Editor
A CRISPR variant using reverse transcriptase to directly write new sequence into the genome, a "search and replace" for DNA.
Ex Vivo Editing
Removing cells from a patient, editing them outside the body, and reinfusing them. The approach used in Casgevy.
Casgevy
The first FDA-approved CRISPR therapy (2023), for sickle cell disease. Reactivates fetal hemoglobin by disrupting the BCL11A enhancer.