Wednesday, July 29, 2026

The Giant RNA Polymerase of CCHFV: A New Structural Window into Viral RNA Synthesis and Antiviral Design

https://thernablog.blogspot.com/

Viral RNA polymerases are among the most important molecular machines in virology. They copy viral RNA, make viral transcripts, and control whether an RNA virus can successfully replicate inside a host cell. Because human cells do not use the same kind of processive RNA-dependent RNA polymerase for genome replication, these enzymes have long been attractive targets for antiviral drug discovery.

A recent study by Jia and colleagues, titled “RNA synthesis and substrate analog inhibition in the CCHFV polymerase,” provides a major structural and biochemical advance in this area. The work focuses on the L protein of Crimean-Congo hemorrhagic fever virus, or CCHFV, a tick-borne virus belonging to the Nairoviridae family.

What makes this enzyme remarkable is its size. Nairoviridae L proteins are about 4,000 residues long, making them among the largest known viral polymerases. Until now, this enormous size came with a major mystery: why does this virus need such a large polymerase to perform a job that other RNA viruses accomplish with much smaller polymerase systems? Jia and colleagues address this by reporting structures of the full-length CCHFV L protein, including a 3.0 Å polymerase elongation complex.

The Nature study is important for two reasons. First, it gives us a clearer picture of how this giant viral enzyme organizes RNA synthesis. Second, it identifies nucleotide analogs with sofosbuvir-like ribose modifications that can specifically inhibit the CCHFV polymerase by immediate chain termination.

Why CCHFV polymerase matters

CCHFV is not an ordinary virus from a public-health perspective. It is a tick-borne biosafety level-4 pathogen, meaning it requires the highest level of laboratory containment. The virus causes Crimean-Congo hemorrhagic fever, a severe disease of major concern in endemic regions. Understanding how its polymerase works is therefore not only a structural biology question; it is also a foundation for antiviral discovery.

Like other segmented negative-sense RNA viruses, CCHFV depends on an RNA-dependent RNA polymerase, or RdRP, to copy and transcribe its RNA genome. The L protein contains multiple functional regions, including an endonuclease, the central RdRP module, and a cap-binding domain. These regions cooperate during viral transcription and replication. In segmented negative-sense RNA viruses, transcription often depends on “cap-snatching,” where the viral polymerase captures capped fragments from host RNAs and uses them as primers for viral mRNA synthesis.

The puzzle is that Nairoviridae L proteins are much larger than many related viral polymerase systems. Previous structures of segmented negative-sense RNA virus polymerases generally involved systems of about 2,000–2,500 residues, while Nairoviridae L proteins can reach 3,800–4,900 residues. The authors note that apart from an N-terminal OTU domain, the reason for this unusually large size had remained unclear.

A full-length view of a giant enzyme

To solve this problem, the researchers purified full-length CCHFV L protein and used cryo-electron microscopy to capture different structural states. They obtained apo and promoter-bound states, but the major breakthrough was the 3.0 Å elongation complex, which covered a much larger portion of the enzyme. This structure allowed the authors to define a more complete architecture of CCHFV L and to see how different regions cooperate during RNA synthesis.

One of the most interesting findings is that CCHFV L is not simply a larger version of other viral polymerases. It contains large additions and insertions in all three major functional regions. These additions reshape how the enzyme interacts with RNA. Two Nairoviridae-specific elements are especially important:

FID, or the fingers insertion domain, extends the downstream template RNA-binding path.

UPD, or the upstream product-binding domain, extends the upstream product RNA-binding path.

Together, these domains help explain why the CCHFV polymerase is so large. The extra mass is not random decoration. It appears to form additional RNA-binding paths and interaction networks that may help the enzyme handle long RNA products with sufficient processivity.

FID and UPD: two additions that change the RNA path

The study shows that FID lies near the downstream side of the RdRP active site and may help coordinate template RNA binding together with other polymerase regions. Structural analysis revealed positively charged residues in the relevant groove, consistent with a role in nucleic acid interaction. The authors propose that FID contributes not only to promoter binding but also to general downstream template RNA binding.

On the other side of the active site, UPD helps form an extended path for the upstream RNA product. The study identifies a tunnel-like route involving UPD, CBD, and mid-link regions, with positively charged residues positioned along the putative product RNA exit path. This suggests that the polymerase has evolved extra structural features to guide RNA as it emerges from the active site.

This is where the structural work becomes biologically meaningful. The researchers tested mutations in these interaction networks using a CCHFV minigenome assay. All 14 tested mutations reduced minigenome replication to varying degrees, and mutations affecting FID:RNA and UPD:RNA interactions had particularly strong effects, dropping replication below 20% of the wild-type level.

In simple terms, the extra domains are not just visible in the structure; they matter for viral RNA replication.

How the enzyme moves from initiation to elongation

The authors also propose a model for CCHFV RNA replication. In this model, the polymerase first recognizes the viral promoter and positions the 3′ end of the template RNA at the active site. As RNA synthesis progresses, the enzyme transitions into elongation. When the RNA duplex reaches roughly 10 base pairs, structural elements such as the lid and priming element move to accommodate the growing RNA duplex, and the lid helps separate template and product strands.

This model is useful because viral polymerases are not static machines. They must grip the promoter, initiate RNA synthesis, elongate the RNA chain, separate RNA strands, and eventually complete an entire replication cycle. The CCHFV L structure suggests that FID and UPD may help support processive elongation, possibly allowing the enzyme to synthesize the large L transcript of Bunyaviricetes.

The antiviral angle: sofosbuvir-like nucleotide analogs

The second major part of the study concerns nucleotide analog inhibitors. Nucleotide analogs work by mimicking natural nucleotide substrates. If a viral polymerase incorporates the analog into a growing RNA chain, the analog may disrupt further RNA synthesis.

The best-known example in this category is sofosbuvir, a nucleotide analog used to treat hepatitis C virus infection. Sofosbuvir’s active triphosphate form contains characteristic 2′-α-fluoro-2′-β-C-methyl ribose modifications that cause immediate chain termination in the hepatitis C virus polymerase.

Jia and colleagues asked whether similar chemistry could work against CCHFV RdRP. They found that nucleotide analogs carrying ribose-2′ modifications identical to sofosbuvir could be incorporated by CCHFV RdRP and then stop RNA synthesis immediately. Importantly, the same analogs were not incorporated by Lassa virus and Rift Valley fever virus polymerases in their assays, suggesting specificity for CCHFV among the tested systems.

The authors tested several analogs and found that all four base types with this ribose modification showed incorporation activity and chain-terminating behavior in the CCHFV system. Competition assays further supported the potential of these compounds, although different analogs varied in how strongly they competed with the corresponding natural nucleotides.
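
To make the logic of immediate chain termination concrete, here is a minimal sketch in Python. It is a toy model, not the authors' analysis: the function name and the per-site probability `p_analog` are hypothetical, and no value from the study is used. The only idea it encodes is that a single analog incorporation ends the chain, so a full-length product requires the natural nucleotide to win at every site where the analog could compete.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def full_length_fraction(template: str, analog_base: str, p_analog: float) -> float:
    """Fraction of products expected to reach the end of the template.

    template    -- template RNA sequence (A/C/G/U)
    analog_base -- base carried by the chain-terminating analog
    p_analog    -- HYPOTHETICAL chance that the analog, rather than the
                   natural nucleotide, is incorporated at a matching site
    """
    if not 0.0 <= p_analog <= 1.0:
        raise ValueError("p_analog must be between 0 and 1")
    # The analog competes only where the product needs its base,
    # i.e. opposite the complementary template base.
    sites = sum(
        1 for base in template.upper() if COMPLEMENT[base] == analog_base.upper()
    )
    # Immediate chain termination: one analog incorporation ends the chain,
    # so a full-length product must pick the natural nucleotide every time.
    return (1.0 - p_analog) ** sites
```

Even a modest per-site probability compounds quickly on a long template, which is one way to see why weaker competitors can still matter for a polymerase that must copy thousands of nucleotides.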

This does not mean that sofosbuvir itself is now a proven treatment for CCHFV infection. The study works at the enzyme and structural-biochemistry level. Drug development would still require prodrug optimization, cell culture testing, animal studies, pharmacokinetic evaluation, safety testing, and eventually clinical trials. But the work identifies a promising chemical logic: ribose 2′-α-fluoro-2′-β-C-methyl modification may be a useful starting point for anti-CCHFV nucleotide analog development.

Why this study matters for RNA biology

For RNA biologists, this work is exciting because it connects structure, mechanism, and inhibition in one system. The study does not merely show a beautiful cryo-EM structure. It links structural features to RNA-binding paths, tests their functional relevance through minigenome assays, and then uses active-site insight to explore antiviral inhibition.

It also reminds us that viral RNA polymerases are diverse. The familiar “right-hand” RdRP core is conserved, but viruses build many different accessory domains around that core. These additions can determine how the polymerase recognizes RNA, how it transitions between replication stages, how it separates strands, and how vulnerable it is to nucleotide analogs.

The CCHFV L protein is therefore more than a giant enzyme. It is a molecular example of how RNA viruses expand a conserved catalytic machine into a specialized replication platform.

Conclusions

The new CCHFV polymerase structures help answer a long-standing question: why are Nairoviridae L proteins so large? The answer appears to lie in expanded RNA-binding architecture. Domains such as FID and UPD extend the paths of template and product RNA, helping organize the enzyme during replication. At the same time, the discovery that sofosbuvir-like nucleotide analogs can terminate CCHFV RNA synthesis provides a valuable starting point for antiviral research.

For The RNA Blog, this study is a reminder of why RNA biology remains one of the most dynamic areas of modern science. A single viral enzyme can teach us about evolution, molecular architecture, disease biology, and drug discovery. In the case of CCHFV, seeing the polymerase in action may be the first step toward learning how to stop it.

Monday, July 06, 2026

How RISC Works: Argonaute, Small RNAs, and the Logic of Gene Silencing

A mechanistic guide to RISC assembly, guide-strand selection, target recognition, slicing, repression, deadenylation, and turnover.

Argonaute and RISC: The Molecular Engine Behind RNA Interference

RISC components: The core RISC is a small-RNA-loaded Argonaute (AGO) protein, often associated with GW182/TNRC6 in animals. In metazoans, Dicer and its dsRNA-binding cofactors (TRBP/PACT in mammals; R2D2/Loqs in flies) form a RISC-loading complex (RLC) that hands off small-RNA duplexes to Argonaute. Argonaute contains four domains (N, PAZ, MID, PIWI): PAZ binds the guide's 3' end, MID anchors its 5' end, and PIWI contacts the body of the guide and carries the active site.

Small-RNA biogenesis/loading: miRNAs derive from Pol II hairpins (pri-miRNAs) processed by Drosha/DGCR8 and then Dicer into ∼22-nt duplexes. siRNAs come from long dsRNA (viral or endogenous) cleaved by Dicer. piRNAs are Dicer-independent ~24-30-nt RNAs from single-stranded precursors (e.g. in germline) loaded into Piwi-clade AGOs. After biogenesis, small-RNA duplexes are loaded into AGO (with Hsc70/Hsp90 chaperones). Guide strand selection depends on 5'-end nucleotide preference and thermodynamic asymmetry, and the passenger strand is removed by Argonaute slicing (if fully complementary) or a "slicer-independent" unwinding mechanism.

Argonaute conformational states: Crystal/cryo-EM structures show AGO as a bilobed protein (MID-PIWI lobe and N-PAZ lobe) that clamps the guide (5' end in the MID pocket, 3' end in PAZ). Loading and target-binding trigger conformational shifts: apo-AGO "open" state, guide-bound "clamped" state, and target-bound state in which the central channel widens to accommodate guide-target pairing. The N-domain helps splay duplex strands and limits 3'-target pairing (enforcing seed-based recognition).

Guide selection & passenger removal: After loading, AGO uses multiple "sensors" to choose the guide strand. Factors include 5'-terminal nucleotide identity (MID pocket preference), thermodynamic stability of ends, and Ago's slicing of the passenger if perfectly paired. In slicer-competent AGOs (e.g. human AGO2, Drosophila AGO2), the passenger strand can be cleaved (at the guide's 10-11 position) to free the guide. Non-slicing AGOs or imperfect duplexes rely on thermal destabilization plus chaperones (e.g. C3PO, La/SSB) to unwind and eject the passenger. Open questions include the exact roles of unwinding factors (C3PO, etc.) and how different AGO isoforms manage strand separation.

Target recognition: The core determinant of target binding is seed pairing: perfect complementarity to guide positions 2-7 (or 8) drives binding. Additional base-pairing 3' of the seed (supplementary pairing) strengthens binding, while central mismatches/bulges generally prevent slicing. Bulged or wobbled sites can still mediate repression if seed pairing is intact. The tolerance of mismatches and the extent of supplementary pairing vary with AGO clade and species. Outstanding questions include the full rules for non-seed interactions and how AGO conformational changes propagate mismatch signals (cf. Joseph & Osman 2012).

Catalytic cleavage: Only AGOs with an active RNase H-like "PIWI" domain can slice targets. Human AGO2 (and, to a lesser extent, AGO3) carry the catalytic DEDH tetrad required for Mg²⁺-dependent phosphodiester hydrolysis. Cleavage chemistry resembles RNase H: the guide-bound AGO positions the scissile phosphate near two Mg²⁺ ions, which activate a water molecule for in-line (SN2-type) attack on the phosphorus, leaving 5'-phosphate and 3'-OH ends. Structures (e.g. human Ago2-miRNA-target complexes) show a kink at the cleavage site induced by the so-called "glutamate finger", orienting the water nucleophile. Open issues include the detailed energetics of catalysis and how slicer-inactive AGOs function in organisms like plants (some plant AGOs have lost slicing yet still mediate silencing).

Translational repression & deadenylation: In animals, AGO-guide complexes recruit GW182/TNRC6 proteins, which in turn bind poly(A)-binding protein (PABP) and the CCR4-NOT and PAN2-PAN3 deadenylase complexes. This leads to shortening of the poly(A) tail, decapping (via DCP1/2), and mRNA decay. miRNA-bound AGO may also inhibit translation initiation (via eIF4G/eIF4A interference). Key experiments tethering GW182 to reporters demonstrate that GW182 alone can induce deadenylation and repression. In flies, loss of GW182 abolishes deadenylation but has complex effects on translational repression. Open questions include how GW182 distinguishes targets for decay vs mere repression, and how initial translation inhibition is triggered prior to mRNA decay.

RISC recycling/turnover: After target repression or cleavage, RISCs must be recycled for further rounds. Target cleavage yields 5' and 3' fragments; recent work suggests phosphorylation events promote release of cleaved products (e.g. AGO2 C-terminal serine phosphorylation accelerates target release). The loaded AGO may remain bound to the guide for multiple cycles. Small RNAs themselves can turn over (plant miRNAs and animal piRNAs are stabilized by 3'-terminal 2'-O-methylation). Factors like XRN1 exonuclease clear cleaved targets. A notable factor, C3PO, degrades AGO-nicked passenger fragments to fully activate RISC. Precisely how AGOs dissociate from targets for new rounds (and how AGO itself is turned over or modified) remains an active research area.

Regulatory PTMs and cofactors: AGO function is modulated by post-translational modifications. Human AGO2 is phosphorylated at several sites: for example, Y393 by EGFR (in hypoxia) reduces AGO2-Dicer binding and miRNA loading; S387 by Akt3 promotes recruitment of LIMD1/TNRC6A and DDX6 into repression complexes; a C-terminal S824-S834 cluster is hyperphosphorylated after target binding to accelerate target release. Other PTMs include AGO2 sumoylation, acetylation, ubiquitination, prolyl-4-hydroxylation, and PARylation, many of which affect stability or localization. RISC cofactors include heat-shock chaperones (Hsc70/Hsp90) required for loading duplexes, the C3PO nuclease (for passenger removal), and RNA helicases (e.g. MOV10, to disrupt RNPs). Open questions include the full map of AGO modifications in various cell states and how co-chaperones influence loading kinetics.

Experimental evidence: The RISC mechanism is supported by multiple assay types. X-ray crystallography and cryo-EM have resolved Argonaute structures in apo, guide-bound, and guide-target states (e.g. archaeal and bacterial Argonautes; eukaryotic Ago2-miRNA complexes; TNRC6-AGO complexes). In vitro cleavage assays (radioactive RNA substrates) defined the catalytic "slicer" requirements and rates. Crosslinking immunoprecipitation (CLIP) sequencing methods (HITS-CLIP, PAR-CLIP, CLASH) have mapped AGO binding sites transcriptome-wide, confirming seed-pairing rules and identifying non-canonical sites. Luciferase reporter assays with inserted miRNA sites have quantified repression efficiency and defined seed/supplement categories. Cryo-EM of the human RISC-loading complex (Ago2-Dicer-TRBP) has recently illuminated loading intermediates.

Open questions/controversies: Despite progress, some issues remain unresolved. The relative contributions of translational repression vs mRNA decay in different contexts are debated. The existence and mechanism of miRNA "target slicing" (beyond perfect siRNA-like sites) are still being explored. The roles of many AGO co-factors (beyond Dicer/TRBP, GW182) are still being delineated. Structural snapshots capture many states, but the dynamic transitions of AGO during target search are less understood. Finally, the diversity of AGO family members (with different activities) raises questions about their specialized functions in various species and pathways.

Table 1. Selected Argonaute proteins.
Argonaute	Domain architecture	Active-site motif	Slicer?	Major pathway/notes
HsAGO1	N–PAZ–MID–PIWI (857 aa)	DEDR (R replaces H)	No	miRNA repression
HsAGO2	N–PAZ–MID–PIWI (859 aa)	DEDH (canonical)	Yes	miRNA/siRNA (viral)
HsAGO3	N–PAZ–MID–PIWI (860 aa)	DEDH (intact, weakly active)	Marginal	miRNA (some slicing)
HsAGO4	N–PAZ–MID–PIWI (861 aa)	DEGR (G and R replace D and H)	No	miRNA
DmAGO1	N–PAZ–MID–PIWI (843 aa)	DEDH (E remains)	No	miRNA (development)
DmAGO2	N–PAZ–MID–PIWI (940 aa)	DEDD (active)	Yes	siRNA antiviral
CeRDE-1	N–PAZ–MID–PIWI (925 aa)	DEDH (active)	Yes	siRNA (RNAi)
Piwi proteins (e.g. HsHIWI2)	N–PAZ–MID–PIWI (1000+ aa, plus Gly-rich N-term)	DEDH / DEDH	Yes (piRNA)	germline piRNA silencing

RISC Composition and Assembly

The core of RISC is an Argonaute protein bound to a single-stranded "guide" RNA. In animals, GW182/TNRC6 proteins (with tandem GW/WG motifs) bind AGO and mediate repression/decay. In the RISC-loading complex, AGO is physically associated with Dicer and its dsRNA-binding partners: in mammals, Dicer binds TRBP and/or PACT; in Drosophila, Dcr-2 binds R2D2 (and Loquacious). These scaffolds bring the small-RNA duplex to AGO. Biochemically, purified human Dicer-TRBP and Dicer-PACT complexes each form stable RLCs that bind siRNA or pre-miRNA; swapping TRBP and PACT domains can alter processing specificity. Notably, RLC assembly increases Dicer's affinity for RNA and presents the duplex in a conformation competent for loading.

During loading, ATP-dependent chaperones (Hsc70/Hsp90, Hop, p23, etc.) are required to "open" AGO for duplex entry. In vitro reconstitution (purified components) shows that Hsp90beta, Hsc70 and cochaperones form a loading machine that presents AGO in a high-affinity state for the duplex. Inhibition of Hsp90 blocks RISC loading in cells, indicating this is a conserved requirement (Tomari & Zamore 2005; Tahbaz et al. 2005). After AGO binds the duplex (one strand destined as guide, the other passenger), a series of strand-separation steps ensues (see below). Only after passenger ejection is the RISC considered mature and able to bind targets.

Table 1 (above) compares selected Argonaute proteins: all share N, PAZ, MID, and PIWI domains (N and PAZ grip the duplex; MID anchors the guide's 5'-phosphate; PIWI harbors the RNase H fold). Whether the active-site tetrad (the "slicer tetrad") is intact largely determines whether a given AGO can catalyze target cleavage. For example, human AGO2 has the canonical DEDH and is an active slicer, whereas AGO1 and AGO4 carry substitutions in the tetrad; AGO3 retains DEDH but slices only marginally (and can be activated by domain swaps). Organismal distribution varies: many animals encode multiple AGO paralogs (e.g. 4 in humans, each broadly expressed), while plants have >10 AGOs with specialized roles.

Small-RNA Biogenesis and Loading

miRNA Pathway

Animal miRNAs begin as long primary transcripts (pri-miRNAs) made by Pol II. The Microprocessor complex (Drosha + DGCR8) cleaves the pri-miRNA into a ~60-70-nt precursor hairpin (pre-miRNA). This pre-miRNA is exported to the cytoplasm and further diced by Dicer into a ~22-nt RNA duplex with 2-nt 3' overhangs. TRBP and PACT (humans) or Loquacious (flies) bind Dicer's helicase domain and influence cleavage accuracy and strand selection. Guide strand selection is influenced by 5'-terminal nucleotide preference (the AGO MID domain often favors U or A) and by the relative thermodynamic stability of the duplex ends. The duplex (with 5'-monophosphates on both strands) is presented to AGO: in humans all four AGOs (AGO1-4) accept miRNA duplexes, but only AGO2 is an efficient slicer. Chaperone proteins (Hsc70/Hsp90) use ATP to transiently "open" AGO for duplex entry.
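
The two guide-selection cues described above (a less stable 5' end and a preferred 5' nucleotide) can be sketched as a simple scoring rule. This is a rough illustration only: the pair weights, the four-base-pair window, and the function names are assumptions made for this example, not measured free energies or a published algorithm. Both strands are written 5' to 3', are assumed to be the same length, and carry 2-nt 3' overhangs.

```python
# Crude pair weights standing in for real thermodynamic parameters (assumption).
PAIR_STRENGTH = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}
COMPLEMENT = str.maketrans("ACGU", "UGCA")
OVERHANG = 2    # 2-nt 3' overhangs left by Dicer
END_WINDOW = 4  # terminal base pairs inspected at each end (assumption)


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def five_prime_end_score(strand: str, partner: str) -> int:
    """Sum pair weights over the base pairs at the 5' end of `strand`."""
    paired = len(strand) - OVERHANG
    score = 0
    for i in range(END_WINDOW):
        # Antiparallel duplex: strand[i] faces partner[paired - 1 - i].
        pair = (strand[i], partner[paired - 1 - i])
        score += PAIR_STRENGTH.get(pair, 0)  # mismatches count as 0
    return score


def select_guide(strand_a: str, strand_b: str) -> str:
    """Return the strand predicted to become the guide."""
    strand_a, strand_b = strand_a.upper(), strand_b.upper()
    score_a = five_prime_end_score(strand_a, strand_b)
    score_b = five_prime_end_score(strand_b, strand_a)
    if score_a != score_b:
        # The strand whose 5' end sits at the weaker end of the duplex wins.
        return strand_a if score_a < score_b else strand_b
    # Tie-break on the 5' nucleotide preference of the MID pocket (U or A).
    if strand_b[0] in "UA" and strand_a[0] not in "UA":
        return strand_b
    return strand_a
```

For a duplex whose one end is AU-rich and the other GC-rich, the function returns the strand with its 5' end at the AU-rich side regardless of argument order. Real predictions would need nearest-neighbor energies and are only a tendency, not a rule.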

siRNA Pathway

siRNAs arise from long double-stranded RNAs (exogenous viruses, transposons, or endogenous transcripts). In Drosophila, Dicer-2 (with partner R2D2) processes long dsRNA into 21-nt siRNA duplexes. In mammals, a single Dicer can generate both miRNAs and siRNAs (e.g. from shRNA expression) with the help of TRBP/PACT. Once produced, siRNA duplexes are loaded into AGO. In flies, AGO2 is specialized for siRNAs; in mammals, AGO2 is the main slicer and can load siRNAs for RNAi. A key feature of siRNA loading is that one strand (the guide) will pair fully with targets, so AGO can cleave complementary mRNA targets.

piRNA Pathway (brief)

In metazoan germlines, piRNAs are 24-30 nt RNAs that associate with PIWI-clade Argonautes (Piwi, Aubergine, AGO3 in flies; PIWIL1-4 in mammals). piRNAs derive from single-stranded cluster transcripts (no Drosha/Dicer required). Mitochondrial endonuclease Zucchini (and Tudor-domain factors) generate primary piRNAs. Ping-pong amplification creates secondary piRNAs via slicer activity of PIWI proteins. The final piRNA-Piwi complexes mediate transposon silencing by target cleavage and transcriptional repression (H3K9 methylation). piRNA 3' ends are 2'-O-methylated by Hen1, further stabilizing them. (See Iwasaki et al. 2015 for review.)

Loading and Strand Separation

In all pathways, after duplex production the RLC loads the duplex into AGO. AGO's MID domain "senses" the 5'-phosphate of one strand to position it as the guide. The other strand (passenger) must be removed. Two mechanisms operate:

Slicer-assisted: If the passenger strand is fully complementary, AGO2 (or other slicers) will cleave it opposite guide positions 10-11. This "nick" promotes rapid dissociation of the passenger fragments. This is the case for siRNAs in canonical RNAi. Matranga et al. (2005) showed that human and fly AGO2 cleave the passenger of loaded siRNA, while miRNA duplexes (imperfect) are not cleaved.

Slicer-independent: Non-slicing AGOs (e.g. AGO1, 3, 4) or imperfect duplexes rely on the intrinsic thermodynamic bias (less stable 5' end or mismatches) to eject the passenger. At 37°C human AGO1/3/4 can eject an siRNA passenger without cleavage, suggesting more dynamic, temperature-dependent conformational behavior (the PAZ domain transiently releases the 3' end). Accessory factors like C3PO (a Mg²⁺-dependent endonuclease) can degrade nicked passenger fragments, further promoting activation. La/SSB has also been reported to bind AGO2 and assist release of cleavage products. Thus, even without slicing, AGO can effect strand separation through conformational changes and ancillary helpers.
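
The slicer-assisted route can be illustrated with a short sketch that locates the nick in the passenger strand. The function names are hypothetical, and the model assumes an idealized duplex: equal-length strands written 5' to 3', 2-nt 3' overhangs, and perfect complementarity in the paired region. If the duplex is imperfect, the function returns None to stand in for the slicer-independent route.

```python
from typing import Optional, Tuple

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def cleave_passenger(
    guide: str, passenger: str, overhang: int = 2
) -> Optional[Tuple[str, str]]:
    """Split the passenger at the bond opposite guide positions 10-11.

    Returns (5' fragment, 3' fragment), or None when the duplex is not
    perfectly paired and slicing is therefore not expected.
    """
    guide, passenger = guide.upper(), passenger.upper()
    paired = len(guide) - overhang
    if len(passenger) != len(guide) or paired < 11:
        return None
    if passenger[:paired] != reverse_complement(guide[:paired]):
        return None  # imperfect duplex: no passenger cleavage in this model
    # Guide position 10 (index 9) faces passenger index paired - 10;
    # guide position 11 (index 10) faces passenger index paired - 11.
    # The scissile bond lies between those two passenger nucleotides.
    cut = paired - 10
    return passenger[:cut], passenger[cut:]
```

For a 21-nt duplex this yields a 9-nt 5' fragment and a 12-nt 3' fragment, short pieces that are held by far fewer base pairs than the intact passenger.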

Open questions remain about the precise kinetics of passenger removal (e.g. how general is C3PO's role?) and how AGOs discriminate guide vs passenger beyond thermodynamics.

Argonaute Structure and Guide/Target Interactions

Argonaute proteins are bilobed. The MID-PIWI lobe forms one side of the nucleic-acid channel, the PAZ-N lobe forms the other (Figure 1 in). The MID domain (Rossmann-like fold) binds the 5'-phosphate and first base of the guide in a conserved pocket. The PAZ domain (OB-fold) binds the guide's 3' end (the 2-nt 3' overhang in the loaded duplex). Thus the guide is anchored at both ends. The N-terminal "N domain" lies between the lobes and helps split duplexes and prevent overextension of base-pairing at the guide's 3' end. The PIWI domain is an RNase H-like fold containing the active site. The catalytic tetrad (Asp-Glu-Asp-His) coordinates two Mg²⁺ ions for phosphodiester hydrolysis. (Non-slicer AGOs have one or more mutations in this motif.)

As shown by crystal structures, AGO-guide interactions define a characteristic "seed channel". Positions 2-8 of the guide (the seed) are pre-organized by contacts to the protein (e.g. MID/PAZ interactions clamp the ends). The N-PAZ lobe covers the 3' half of the guide, preventing pairing until the seed has bound. Upon target binding, structures show the seed region bound to target, inducing a kink at position 6-7. With extensive pairing beyond position 8, the guide-target duplex can extend into the supplementary chamber of the PIWI lobe. The transition from guide-only to guide-target causes conformational shifts: in some cases the PIWI domain repositions its active site loop (the "glutamate finger") to engage the scissile phosphate.

Conformational studies (FRET, cryo-EM) indicate at least three AGO states: apo-open (RNA-free), guide-loaded (central cleft clamped), and target-bound (cleft open to accommodate duplex). The MID and PIWI domains move closer upon guide binding, completing the GW182-binding surface. These structural rearrangements enforce the target recognition rules: only targets pairing to the seed (g2-7/8) can productively bind deep in the channel. Mismatches/bulges in the seed severely weaken binding (seed is base-paired in helix). In contrast, central mismatches (guide 9-11) prevent slicing by misaligning the active site. Supplementary pairing (guide 13-17) can strengthen binding if present. Figures 2-3 in Uchiumi et al. (2016) and structures in illustrate the guide and target path.

Target Recognition Rules

AGO-guide complexes scan mRNAs for complementary sequences. The seed region (guide positions 2-7/8) is paramount: a contiguous Watson-Crick match here is usually required for stable binding. Typical miRNA target sites in 3'UTRs are classified as 6mer (pairing to guide nts 2-7), 7mer (2-8), or 8mer (2-8 plus an A in the target opposite guide position 1). Additional "3'-supplementary" pairing (guide positions 13-17) can compensate for a shorter seed. Many bona fide sites tolerate a single bulge or GU wobble in the seed if flanked by perfect pairs. However, a mismatch at guide positions 10-11, which flank the scissile phosphate, abolishes cleavage.
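
The site classes above translate directly into a small search routine. The sketch below is an illustration rather than a published target-prediction tool; the function names are made up for this example, and it adds one standard category the text does not spell out, the 7mer-A1 site (seed positions 2-7 plus the A opposite guide position 1). It ignores bulges, wobbles, supplementary pairing, and sequence context.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna: str, utr: str) -> List[Tuple[int, str]]:
    """Return (0-based start of the 6mer core, site type) for each seed match."""
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])  # target bases facing guide nts 2-7
    m8 = reverse_complement(mirna[7])      # target base facing guide nt 8
    sites = []
    start = utr.find(core)
    while start != -1:
        end = start + 6
        # In the target, the base facing guide nt 8 lies just 5' of the core,
        # and the base opposite guide nt 1 lies just 3' of it.
        has_m8 = start > 0 and utr[start - 1] == m8
        has_a1 = end < len(utr) and utr[end] == "A"
        if has_m8 and has_a1:
            kind = "8mer"
        elif has_m8:
            kind = "7mer-m8"
        elif has_a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = utr.find(core, start + 1)
    return sites
```

With let-7a (UGAGGUAGUAGGUUGUAUAGUU) as the guide, the core match is UACCUC and the full 8mer site reads CUACCUCA in the target.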

Genome-wide CLIP-seq experiments (e.g. AGO HITS-CLIP by Chi et al. 2009; CLASH by Helwak et al. 2013) confirm that 3'UTR sites with canonical seed pairing (often with flanking AU-rich context) are enriched under AGO peaks. Non-canonical sites (seedless or centered sites) exist but are generally weaker. AGO's N-domain can sometimes tolerate small 3'-bulges of the guide, but extended bulges usually require an extra stabilizing anchor (e.g. 3' supplementary pairing) to engage the PIWI lobe. Mutational studies show that introducing bulges in the seed disrupts silencing regardless of downstream pairing.

A remaining mystery is how AGOs detect and "communicate" guide-target mismatches. Molecular dynamics studies suggest an allosteric network within AGO relays information from the seed to the catalytic site and to surface sites. For example, Joseph & Osman (2012) found that seed mismatches induce small shifts in an extensive residue network, ultimately affecting surface loops. In practice, mismatches reduce slicing efficiency and accelerate turnover, but non-slicing repression can still occur (with reduced potency).

Catalytic (Slicer) Cleavage Mechanism

When a target pairs fully to the guide (especially positions 2-12), slicing occurs (in slicing-competent AGOs). The PIWI domain's RNase H fold positions two divalent cations (Mg²⁺) near the guide-target junction. One metal activates a water nucleophile for in-line attack on the scissile phosphate, while the other stabilizes the leaving group. The conserved glutamate finger (a loop in PIWI) contacts the phosphate backbone to position the scissile bond at the catalytic center. Structural studies of archaeal Ago and human Ago2 (bound to guide and target) reveal the cleavage geometry: the target's phosphodiester backbone is bent at the cleavage site and the metal-activated water attacks the phosphorus, yielding 5'-phosphate and 3'-OH ends.

Biochemical kinetics show that single-turnover slicing is fast (on the order of minutes) when complementarity is perfect and is essentially abolished by mismatches or bulges at the cleavage site. Mutagenesis of the DEDH residues (e.g. D597A, H807A in hAGO2) completely blocks cleavage but not binding. For non-slicer AGOs (lacking the full tetrad), target binding still occurs but no phosphodiester bond breakage ensues; these RISCs rely entirely on repression/deadenylation pathways.

Recent cryo-EM data (e.g. Cell 2025 by Zhang et al.) have begun to capture the intermediate states of human AGO2 during cleavage, revealing how the active site reorganizes. The precise catalytic mechanism (e.g. transition state intermediates) likely parallels RNase H enzymes. Open questions include the pH dependence and any required proton transfers, and how AGO3 (with variant PIWI) may occasionally cleave unusual substrates.

Translational Repression and Deadenylation

In metazoans, most miRNA binding triggers repression rather than cleavage. The bridge between AGO and the repression machinery is provided by GW182/TNRC6 proteins. GW182 proteins have an N-terminal AGO-binding region (with multiple tryptophan-containing "GW" motifs) and a C-terminal effector region that interacts with mRNA decay factors. Tethering experiments (GW182 artificially tethered to a reporter mRNA) show that GW182 alone can induce poly(A) shortening and translational silencing.

Mechanistically, AGO-GW182 complexes recruit PABP and the CCR4-CAF1-NOT deadenylase and PAN2-PAN3 complexes to the target mRNA tail. The deadenylases shorten the poly(A) tail, which leads to decapping by DCP1/2 and 5'->3' exonucleolytic decay (XRN1). GW182 also interacts with DDX6 (RCK/p54) and other decapping enhancers. In Drosophila, knocking down CCR4 or NOT1 abolishes miRNA-dependent deadenylation and decay, but residual translational repression can persist. Thus, translational repression can be mechanistically separated from deadenylation, though in cells they often occur sequentially.

Proposed models for repression include interference with cap recognition or ribosome initiation. For example, GW182-bound CCR4-NOT can inhibit eIF4A/eIF4G, blocking 43S pre-initiation complex assembly. Some data suggest miRNA-mediated repression acts primarily at initiation, while others find elongation stalls or ribosome drop-off. The field agrees that deadenylation is a major downstream effect, but the timing (repression first, decay later) is still debated.

In summary, after target binding the mature RISC can silence expression by (1) slicing (if perfect match, via PIWI), or (2) recruiting GW182 to repress translation and deadenylate/decap the mRNA. The balance of these pathways depends on AGO isoform, target context, and cell type.
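
The two-way decision summarized above can be written as a toy classifier. This is a deliberately simplified sketch: the function names and the returned labels are invented for illustration, the guide and target site are assumed to be the same length and aligned without gaps, and the thresholds (seed pairing at guide positions 2-7 for binding, contiguous pairing across positions 2-12 for slicing) are taken from the rules of thumb in this post rather than from any quantitative model.

```python
from typing import Set

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def paired_positions(guide: str, target_site: str) -> Set[int]:
    """1-based guide positions that are Watson-Crick paired to the site.

    Both sequences are written 5' to 3', so guide nt i faces the site
    nucleotide counted from the site's 3' end.
    """
    guide, site = guide.upper(), target_site.upper()
    if len(guide) != len(site):
        raise ValueError("guide and target site must have equal length")
    n = len(guide)
    return {
        i + 1 for i in range(n) if COMPLEMENT.get(guide[i]) == site[n - 1 - i]
    }


def predict_outcome(guide: str, target_site: str, ago_is_slicer: bool) -> str:
    """Classify a guide-target pair as slicing, repression, or no binding."""
    paired = paired_positions(guide, target_site)
    if not set(range(2, 8)) <= paired:    # seed (2-7) must be fully paired
        return "no stable binding"
    if ago_is_slicer and set(range(2, 13)) <= paired:  # 2-12 fully paired
        return "slicing"
    return "GW182-mediated repression"
```

A perfectly complementary site is sliced only when the AGO is slicer-competent; the same site bound by a non-slicing AGO, or a seed-only site bound by any AGO, falls through to the repression branch. AGO isoform, target context, and cell type all shift this balance in ways the sketch does not capture.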

RISC Recycling and Turnover

Once an mRNA is cleaved or repressed, the question arises: how is the AGO-guide complex recycled? Cleavage case: Argonaute slices the target, leaving two fragments. These fragments dissociate from AGO (AGO then remains bound to the guide). Recent evidence suggests AGO2 is actively phosphorylated after target binding to promote release: phosphorylation of its C-terminal serine cluster (S824-S834) lowers the affinity for bound mRNA, allowing AGO to turn over more rapidly. Conversely, preventing this phosphorylation leads to "sticky" RISC that holds onto targets. Thus, an AGO phosphorylation cycle accelerates RISC recycling after slicing.

Repression case: If no slicing occurred, AGO stays bound to target 3' UTR. It likely releases by thermal dissociation (since pairing is partial) or with help from RNA helicases (e.g. MOV10) and ATPases (e.g. Me31b/DDX6). Notably, TNRC6 can bind multiple RISCs to one mRNA, possibly stabilizing some interactions. Eventually, after multiple rounds of repression/decay, the RISC may dissociate or be sequestered into P-bodies.

RISC turnover: Argonaute itself is relatively stable (half-lives of many hours) but is turned over by ubiquitination (especially AGO2) under some conditions. Stress or viral infection can trigger Argonaute degradation. Small RNAs also turn over: 3' end 2'-O-methylation in plants and piRNAs protects them from exonucleases. In animals, the lack of 2'-O-methylation in AGO-loaded miRNAs may make them susceptible to tailing and trimming (via TUTases and exonucleases), particularly for aged RISC.

Finally, after mRNA decay, the guide strand itself may be released from AGO and degraded, freeing AGO to load a new duplex. The details of guide recycling are less well studied, but in vitro slicing assays show that AGO2-guide can survive multiple cleavage events.

Regulatory PTMs and Cofactors

AGO activity is finely tuned by post-translational modifications and binding partners:

Phosphorylation: As noted, AGO2 undergoes key phosphorylations. EGFR (under hypoxia) phosphorylates AGO2 at Y393, which disrupts AGO2-Dicer interaction and downregulates miRNA maturation. Serine phosphorylation of AGO2 is rich: S387 (by Akt3 kinase) triggers AGO2 binding to LIMD1 and recruitment of TNRC6A and DDX6, coupling miRNA repression to the CCR4-NOT complex. A C-terminal cluster S824-S834 is phosphorylated upon target binding, lowering mRNA affinity (as above). Other kinases (CK1alpha, GRK4) also modify AGO2 at distinct sites, altering localization (nuclear vs cytoplasmic) or miRNA loading. Overall, phosphorylation regulates when and where RISC binds targets and recruits repressors.

Other PTMs: AGO2 is SUMOylated (on Lys402) to enhance stability and localization to P-bodies. Kinetic studies show SUMOylation can switch AGO2 between translational repression vs slicing modes. Lys48-linked polyubiquitination leads to proteasomal turnover of AGO. Prolyl-4-hydroxylation (on a conserved Pro700 in human AGO2) is important for miRNA activity in tumor cells. PARP enzymes can ADP-ribosylate AGO2, antagonizing its function during stress. These modifications often respond to signaling pathways, linking RISC activity to cellular state.

Cofactors: We have already mentioned Dicer/TRBP/PACT and Hsc70/Hsp90 as loading cofactors. Other notable partners include: (1) C3PO (the Translin-TRAX complex), which degrades AGO2-nicked passenger RNAs to finalize RISC activation (Ye et al., 2011). (2) La/SSB, which binds AGO2 and promotes release of cleaved fragments. (3) MOV10 helicase, which associates with AGO2-miRNA complexes and is thought to remodel target mRNPs for degradation. (4) GW182-binding factors: LIMD1 (in complex with AKT3) and FMRP/FXR1 may scaffold repression complexes.

Open areas include: How the interplay of these modifications is orchestrated (e.g. does Akt3 phosphorylation always precede AGO2-GW182 binding?), and whether there are uncharacterized AGO partners in specialized RNP granules (e.g. germ granules, stress granules).

Key Experimental Evidence

Structural studies: High-resolution crystal structures of Argonautes (bacterial, archaeal, eukaryotic) have defined the domain architecture and guide/target path. Song et al. (2004) solved the first full-length Argonaute structure, from the archaeon Pyrococcus furiosus, and Schirle & MacRae (2012) and Elkayam et al. (2012) solved human AGO2 with guide RNA. More recent cryo-EM (Sheu-Gruttadauria et al. 2019, Ma 2021) captured human AGO-guide complexes at different steps. Structures of AGO with GW182 peptides (Sheu-Gruttadauria et al. 2019) reveal the tryptophan-binding pockets on PIWI. These static images, combined with single-molecule FRET, illuminate how AGO opens/closes during loading and target scanning.

Biochemical assays: Slicing activity has been assayed with radiolabeled target RNAs, establishing that only AGO2 (and AGO3 with modifications) can cut. Mutational scanning of the seed region (by Oglesbee, La Rocca, etc.) mapped the exact base-pairing requirements for repression vs cleavage. Reconstitution of RISC in vitro (Doudna lab) with purified human proteins showed that an RLC of Dicer-TRBP-Ago2 suffices for efficient loading and cleavage of complementary targets (Noland & Doudna 2013). Tethering GW182 to a reporter demonstrated that GW182, through the decay factors it recruits, can silence the reporter without AGO (Eulalio et al. 2009).

High-throughput target identification: HITS-CLIP and PAR-CLIP of AGO proteins (Hafner et al. 2010; Chi et al. 2009) identified thousands of binding sites, refining seed-match rules. CLASH (crosslinking, ligation and sequencing of hybrids) captured chimeric reads of miRNA-target hybrids (Helwak et al. 2013), revealing non-canonical sites and miRNA sponges. Ribosome profiling experiments (e.g. Guo et al. 2010, Eichhorn et al. 2014) showed that mRNA decay is the dominant outcome of miRNA action at steady state, which is compatible with models in which translational repression precedes deadenylation and decay.

Functional reporters: Hundreds of studies using luciferase or GFP reporters with synthetic miRNA sites (seed matches, bulged sites, etc.) have empirically measured repression efficiency. Such assays confirmed the hierarchy of site types (8mer > 7mer-m8 > 7mer-A1 > 6mer) and showed that supplemental 3' pairing boosts repression of 6mers (Brennecke et al. 2005).

Each of these assays underpins the mechanistic model: structural data define the molecular contacts; in vitro assays reveal the chemistry; and genomic experiments validate the rules in cells.

Pathway	Small RNA	Size (nt)	Precursor/Processing	Key AGO effector	Mode of action
miRNA	miR, miR* duplex	~22	pri-miR –(Drosha)→ pre-miR –(Dicer)→ duplex	hAGO1–4 (AGO2 slicer)	Seed pairing → translational repression & decay
siRNA	siRNA duplex	21–23	long dsRNA –(Dicer)→ duplex (TRBP/PACT or R2D2–Dicer)	hAGO2, DmAGO2	Full pairing → target cleavage (slicer)
piRNA	piRNA	26–31	single-strand transcript –(Zucchini + ping-pong)→ piRNA	Piwi-clade (Aub, Piwi)	Transposon silencing by cleavage and heterochromatin

References:

Perspective: machines for RNAi. Genes and Development, 2005. https://doi.org/10.1101/gad.1284105

Origins and Mechanisms of miRNAs and siRNAs. Cell, 2009. https://pmc.ncbi.nlm.nih.gov/articles/PMC2675692/

Towards a molecular understanding of microRNA-mediated gene silencing. Nature Reviews Genetics, 2015. https://www.nature.com/articles/nrg3965

The Structure of Human Argonaute-2 in Complex with miR-20a. Cell, 2012. https://doi.org/10.1016/j.cell.2012.05.017

Structural basis for microRNA targeting. Science, 2014. https://pubmed.ncbi.nlm.nih.gov/25359968/

From guide to target: molecular insights into eukaryotic RNA-interference machinery. Nature Structural and Molecular Biology, 2015. https://www.nature.com/articles/nsmb.2931

Biological principles of microRNA-mediated regulation: shared themes amid diversity. Nature Reviews Genetics, 2008. https://www.nature.com/articles/nrg2455

Monday, June 29, 2026

Evolution of Plant microRNA Gene Families: Birth, Expansion, and Functional Diversification of Small RNA Regulators

How plant MIRNA genes arise, duplicate, diversify, co-evolve with targets, and sometimes disappear.

Introduction

Plant development, adaptation, and genome regulation depend not only on protein-coding genes but also on small regulatory RNAs. Among the most important of these are microRNAs, or miRNAs: short, non-coding RNAs, usually around 20-24 nucleotides long, that guide Argonaute-containing silencing complexes to complementary target transcripts. In plants, this targeting is often highly sequence-specific and frequently results in transcript cleavage, although translational repression and other regulatory outcomes also occur.

The genes that produce miRNAs are known as MIRNA genes. They are usually transcribed into primary transcripts that fold into stem-loop structures. These precursors are processed mainly by DICER-LIKE1 and associated proteins to release mature miRNA duplexes. One strand is loaded into an Argonaute protein and directs repression of target mRNAs. Because plant miRNAs often regulate transcription factors, hormone-response genes, nutrient-homeostasis genes, and disease-resistance genes, small changes in MIRNA gene copy number, sequence, expression, or targeting can have large developmental and evolutionary consequences.

The evolution of plant microRNA gene families is therefore a story of regulatory innovation. Some miRNA families are ancient and deeply conserved across land plants. Others are recently born, restricted to a species, genus, or family, and may disappear before becoming functionally embedded. Plant MIRNA evolution is not a linear march from simple to complex. It is a dynamic cycle of birth, duplication, divergence, selection, and loss.

What Is a Plant microRNA Gene Family?

A plant microRNA gene family is usually defined by similarity among mature miRNA sequences and, often, similarity in target recognition. Multiple MIRNA loci may produce identical or nearly identical mature miRNAs. These loci can be dispersed across the genome, arranged in tandem clusters, retained after whole-genome duplication, or generated independently through local rearrangements.
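As a rough illustration of the sequence-similarity definition above, the sketch below groups MIRNA loci into families when their mature sequences have the same length and differ at no more than a chosen number of positions. The locus names, the toy sequences, the two-mismatch threshold, and the function name are all assumptions made for the example; curated databases apply additional criteria, such as precursor structure and target identity.

```python
# Toy grouping of MIRNA loci into families by mature-sequence similarity.
# Threshold, names, and sequences are illustrative assumptions only.
from itertools import combinations


def group_into_families(loci: dict[str, str], max_mismatches: int = 2) -> list[list[str]]:
    """Single-linkage grouping of loci whose mature sequences nearly match.

    `loci` maps a locus name to its mature miRNA sequence. Two loci are
    linked when their sequences have equal length and differ at no more
    than `max_mismatches` positions.
    """
    parent = {name: name for name in loci}

    def find(name: str) -> str:
        # Follow parent links to the family representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(loci, 2):
        seq_a, seq_b = loci[a], loci[b]
        if len(seq_a) != len(seq_b):
            continue
        differences = sum(x != y for x, y in zip(seq_a, seq_b))
        if differences <= max_mismatches:
            parent[find(a)] = find(b)

    families: dict[str, list[str]] = {}
    for name in loci:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())


if __name__ == "__main__":
    toy_loci = {
        "locus_a": "UGACAGAAGAGAGUGAGCAC",
        "locus_b": "UGACAGAAGAGAGUGAGCAC",   # identical mature sequence
        "locus_c": "UGACAGAAGAUAGUGAGCAC",   # one substitution
        "locus_d": "UCGGACCAGGCUUCAUUCCCC",  # unrelated sequence
    }
    print(group_into_families(toy_loci))
```

Single linkage is a deliberate simplification: it lets a chain of near-identical loci fall into one family even if the two ends of the chain differ by more than the threshold.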

For example, conserved families such as miR156/157, miR160, miR164, miR165/166, miR167, miR169, miR172, miR319, and miR396 regulate major developmental transcription-factor families, including SPL, ARF, NAC, HD-ZIP III, NF-YA, AP2, TCP, and GRF genes. These modules are central to phase transition, organ polarity, leaf development, root architecture, flowering, and stress responses. Their conservation suggests that once a MIRNA-target module becomes integrated into a core regulatory network, it can be maintained for hundreds of millions of years.

At the same time, plant genomes contain many lineage-specific MIRNA genes. These younger loci may have weak expression, less precise processing, unstable precursor structures, or narrow tissue-specific activity. Some are evolutionary experiments. A few become useful regulators. Many are lost.

Birth of New MIRNA Genes

One of the best-supported mechanisms for the origin of new plant MIRNA genes is inverted duplication of target-gene sequences. In this model, a fragment of a protein-coding gene is duplicated and inserted in an inverted orientation near a related sequence. The resulting genomic region can form a hairpin RNA. Initially, such a hairpin may behave more like a source of small interfering RNAs, producing heterogeneous small RNAs. Over time, mutations may refine the foldback structure, improve processing precision, and favor production of a dominant mature miRNA.

This mechanism is especially elegant because it explains how a new miRNA can immediately possess complementarity to a biologically relevant target. If the MIRNA precursor arose from a duplicated fragment of its future target gene, the mature miRNA may already recognize that target or related paralogs. The newly born MIRNA gene therefore begins with a plausible regulatory connection rather than needing to find one entirely by chance.

However, birth is not enough. A young MIRNA locus must pass several evolutionary filters. It must be transcribed in the right place. Its precursor must be processed accurately. The mature miRNA must be loaded into the correct Argonaute complex. Its target interaction must provide a selective benefit or at least avoid harmful misregulation. Only then can a young MIRNA gene move from genomic accident to functional regulator.

Duplication and Expansion of MIRNA Families

Once a MIRNA gene becomes useful, duplication can expand its regulatory influence. Plant genomes are shaped by tandem duplication, segmental duplication, transposable-element activity, and repeated rounds of whole-genome duplication. MIRNA genes are not exempt from these forces.

Tandem duplication can create multiple MIRNA copies in close genomic proximity. Segmental duplication can move related MIRNA loci into different chromosomal contexts. Whole-genome duplication can duplicate both MIRNA genes and their target genes at the same time, creating new opportunities for dosage balance, subfunctionalization, and regulatory divergence.

Expansion of a MIRNA family can increase dosage. More MIRNA copies may produce more mature miRNA, strengthening repression of target transcripts. But copy-number expansion can also enable expression divergence. One MIRNA copy may remain active in leaves, another in roots, another during reproductive development, and another during stress. Even if mature miRNA sequences remain identical, promoter divergence can create new spatial and temporal regulation.

This is one reason plant MIRNA gene families are often functionally more complex than their small size suggests. A mature miRNA sequence may be conserved, but the genomic loci that produce it can differ in expression pattern, precursor structure, processing efficiency, and evolutionary age.

Conservation and Ancient Regulatory Modules

The most deeply conserved plant miRNA families tend to regulate transcription factors or regulatory proteins. This is not accidental. Transcription factors sit near the top of developmental control systems. A single miRNA targeting a transcription-factor family can coordinate entire developmental programs.

The miR156-SPL module is a classic example. miR156 is associated with juvenile-to-adult phase transition, flowering, architecture, and stress-related traits. Its target SPL transcription factors control broad developmental outputs. The miR172-AP2 module also contributes to phase transition and flowering. The miR165/166-HD-ZIP III module regulates adaxial-abaxial polarity, vascular patterning, and meristem function. The miR160 and miR167 families regulate auxin-response factors, connecting miRNA evolution to hormone signaling.

These ancient families are usually under strong purifying selection. Their mature sequences are highly conserved because even small sequence changes could alter target recognition. Their target sites are also conserved because disruption may disturb essential developmental programs. In such cases, the MIRNA gene and its target become an evolutionary unit: each constrains the other.

Rapid Turnover of Young MIRNA Genes

In contrast to ancient conserved families, young plant MIRNA genes show rapid birth and death. Many species-specific MIRNA candidates are found in small-RNA datasets, but not all represent stable evolutionary innovations. Some may be weakly expressed hairpins, degradation products, siRNA-like loci, or recently formed precursors that have not yet acquired canonical features.

Young MIRNA genes often show several features: limited phylogenetic conservation, lower expression, less precise processing, weaker evidence of Argonaute loading, and uncertain target repression. Over evolutionary time, most are lost. A small fraction gradually acquire stronger precursor structure, more accurate processing, more consistent mature miRNA accumulation, and biologically meaningful targets.

This creates a layered MIRNA repertoire. At the bottom are newly formed, unstable, or weakly functional hairpin loci. In the middle are lineage-specific MIRNA genes with emerging regulatory roles. At the top are deeply conserved MIRNA families embedded in core plant biology.

Whole-Genome Duplication and MIRNA Family Evolution

Whole-genome duplication is a major force in plant evolution. Many angiosperm lineages have experienced one or more genome duplication events. After such events, most duplicated genes are eventually lost, but some are retained because they provide dosage balance, developmental flexibility, or raw material for innovation.

MIRNA genes can be retained after whole-genome duplication, but their fate depends on both the MIRNA locus and its target network. If a MIRNA and its target genes are duplicated together, the regulatory relationship may be preserved. Alternatively, one MIRNA copy may be lost while target duplicates diverge. In other cases, retained MIRNA duplicates may acquire different expression domains or subtly different mature sequences.

The consequences can be significant. A duplicated MIRNA family member may regulate one subset of target paralogs, while another copy regulates a different subset. This can help duplicated protein-coding genes escape identical regulation and develop new functions. Thus, MIRNA evolution after whole-genome duplication contributes not only to small-RNA diversity but also to the rewiring of gene regulatory networks.

Target Co-evolution

Plant miRNA evolution cannot be understood by looking only at MIRNA genes. The target genes evolve too. A miRNA target site may be conserved, lost, duplicated, or modified. Target-site changes can weaken regulation, create new regulation, or shift a transcript from one miRNA family to another.

This co-evolution is especially visible in duplicated gene families. If a transcription-factor family expands, some paralogs may retain the ancestral miRNA target site while others lose it. The result is regulatory partitioning. One group remains under miRNA control; another escapes repression and may evolve a new expression pattern.

Defense-related NBS-LRR genes provide a striking example of miRNA-target co-evolution. These genes often occur in large, rapidly evolving clusters. Because overexpression of immune receptors can be costly, miRNAs that target conserved motifs in duplicated NBS-LRR transcripts may help control immune-gene dosage. In some cases, the same type of target gene expansion that creates regulatory problems may also generate the inverted-repeat structures from which new miRNAs arise. The target family therefore helps produce its own regulator.

Functional Diversification Within MIRNA Families

After duplication, MIRNA family members can diversify in several ways.

First, they can diverge in expression. Two MIRNA loci producing the same mature sequence may be active in different tissues, developmental stages, or environmental conditions.

Second, they can diverge in precursor structure. Changes in the stem-loop can affect DCL1 processing accuracy, mature miRNA abundance, or production of alternative small RNAs from the same precursor.

Third, mature miRNA sequences can diverge. Even one or two nucleotide substitutions, especially in target-recognition regions, can shift target specificity.

Fourth, regulatory context can change. A MIRNA locus may acquire new promoter elements, become responsive to stress, or be integrated into hormone signaling.

Through these processes, MIRNA family members can undergo subfunctionalization, where ancestral functions are partitioned among duplicates, or neofunctionalization, where one copy acquires a new role.

Loss of MIRNA Genes

Loss is as important as gain. MIRNA genes may be deleted, silenced, structurally degraded, or rendered nonfunctional by mutations that disrupt processing or expression. Target sites can also be lost, making the MIRNA irrelevant even if the MIRNA gene remains.

Loss may occur because a young MIRNA never provided a selective advantage. It may also occur because regulation becomes harmful under new ecological or developmental conditions. In some cases, loss of a MIRNA or target site may release a gene from repression and contribute to phenotypic diversification.

This turnover explains why the MIRNA complement differs strongly among plant species. Conserved families provide a stable regulatory backbone, while lineage-specific families reflect recent evolutionary experimentation.

Evolutionary Significance

The evolution of plant microRNA gene families reveals how genomes build regulatory complexity without needing entirely new proteins. A short RNA sequence, if produced accurately and expressed in the right context, can regulate many transcripts. This makes miRNAs powerful tools for coordinating gene families, buffering expression noise, and fine-tuning developmental transitions.

Plant MIRNA evolution also shows that regulatory networks are modular. A miRNA and its target family can form a portable regulatory unit. Once established, such a module can be duplicated, modified, lost, or redeployed. This modularity helps plants adapt to new body plans, reproductive strategies, stress environments, and pathogen pressures.

Future Directions

Several questions remain central to the field. How many lineage-specific MIRNA annotations are truly functional? What structural features determine whether a young hairpin becomes a canonical MIRNA gene? How often do new MIRNA genes arise from target-gene fragments, transposable elements, or random hairpins? How do MIRNA duplicates partition expression after whole-genome duplication? And how does target-site evolution contribute to crop domestication and adaptation?

Long-read transcriptomics, improved small-RNA sequencing, degradome analysis, Argonaute immunoprecipitation, comparative genomics, and genome editing are now making these questions more tractable. In crops, understanding MIRNA family evolution may help researchers manipulate architecture, flowering time, stress tolerance, nutrient use, and immunity with greater precision.

Conclusion

Plant microRNA gene families evolve through a balance of conservation and experimentation. Ancient families such as miR156, miR160, miR165/166, miR167, miR172, and miR396 form deeply conserved regulatory circuits that control development and physiology. Younger MIRNA genes arise continuously through inverted duplication, local rearrangement, tandem or segmental duplication, and genome-scale events. Most are lost, but a few become integrated into functional networks.

The evolution of plant MIRNA families is therefore not merely the history of small RNA genes. It is the history of how plants refine gene regulation, absorb genome duplication, control expanding gene families, and generate developmental and adaptive diversity from short RNA sequences.

Selected References

Evolution of plant microRNA gene families. Cell Research, 2007. https://www.nature.com/articles/7310113

Evolution of plant microRNAs and their targets. Trends in Plant Science, 2008. https://doi.org/10.1016/j.tplants.2008.03.009

Origins and Evolution of MicroRNA Genes in Plant Species. Genome Biology and Evolution, 2012. https://pmc.ncbi.nlm.nih.gov/articles/PMC3318440/

The evolution of microRNAs in plants. Current Opinion in Plant Biology, 2017. https://pmc.ncbi.nlm.nih.gov/articles/PMC5342909/

MicroRNA Gene Evolution in Arabidopsis lyrata and Arabidopsis thaliana. The Plant Cell, 2010. https://pmc.ncbi.nlm.nih.gov/articles/PMC2879733/

Conservation and evolution of miRNA regulatory programs in plant development. Current Opinion in Plant Biology, 2007. https://pmc.ncbi.nlm.nih.gov/articles/PMC2080797/

De novo origination of MIRNAs through generation of short inverted repeats in target genes. RNA Biology, 2019. https://pmc.ncbi.nlm.nih.gov/articles/PMC6546375/

Monday, June 22, 2026

TRACKer Diagnostics: Amplification-Free Ribozyme Sensing for Viral RNA

How a ribozyme-controlled riboregulator platform detects viral RNA without target preamplification. Can Ribozyme Circuits Replace Amplification in Viral RNA Testing?

TRACKer: Amplification-Free Ribozyme-Based Diagnostics for Viral RNA

Abstract: Rapid, accurate detection of viral RNA at the point-of-care is a critical need, especially highlighted by recent pandemics. Traditional methods like RT-qPCR achieve high sensitivity (tens of copies) but require complex instrumentation and pre-amplification. RNA biosensors - including engineered ribozymes, toehold switches, and CRISPR-based systems - offer programmable detection in cell-free formats. However, many still rely on target amplification or suffer from background activation. Recent work by Tang et al. introduces the TRACKer platform, a modular cell-free diagnostic that uses engineered ribozymes to detect viral RNA directly with attomolar sensitivity and no pre-amplification. TRACKer combines a novel inhibition-recognition strand (IRS) design with translational signal cascades to achieve switch-like activation and high specificity. This article reviews RNA biosensing approaches, details the TRACKer system, compares its performance to other methods, and discusses specificity and deployment considerations.

Introduction

Rapid nucleic acid diagnostics are essential for infectious disease control. The gold-standard RT-qPCR provides high sensitivity but is slow, costly, and ill-suited for field use. PCR-free methods are therefore highly desirable: they can be lower-cost, simpler, and portable. Recent reviews emphasize RNA-based sensors for point-of-care (POC) viral detection. For example, toehold switch sensors - de-novo RNA devices that change conformation upon binding target RNA - have shown excellent selectivity and have been applied to viruses like Zika and SARS-CoV-2. Likewise, CRISPR-Cas systems (especially Cas13/Cas12) have enabled sensitive RNA detection via collateral cleavage signals. Engineered Cas13 variants can even achieve attomolar detection in ~30 minutes without separate amplification. Ribozymes - catalytic RNAs that cleave specific sequences - are another programmable approach, and have been used in lateral-flow tests and gene circuits.

However, most amplification-free sensors face challenges. Direct detection often sacrifices sensitivity, requiring clever signal amplification. For toehold or ribozyme sensors, achieving both high sensitivity and specificity is nontrivial. In particular, allosteric ribozymes often suffer "leakage" (background activity) and unintended off-target activation due to incomplete decoupling of their sensor and catalytic domains. This can reduce specificity and scalability for multiplexed detection. Tang et al. specifically note that "allosteric ribozymes are constrained by a lack of orthogonality… leading to signal leakage and unintended off-target activation," a hurdle for cell-free diagnostics.

Ribozyme-Based RNA Sensors and Limitations

Ribozymes act as RNA enzymes: they fold into specific tertiary structures and cleave substrates (often RNA). In diagnostics, engineered hammerhead or other ribozymes can be programmed to recognize a viral RNA and then cleave a reporter. Because ribozymes can be designed in silico, they are attractive sensor modules for cell-free assays. Early work showed ribozymes could function as switchable sensors, but often required careful tuning or auxiliary components. Toehold riboregulators (which rely on toehold-mediated strand displacement rather than catalysis) have also been created, but they typically still need upstream amplification of the target or have limited dynamic range. Ribozyme switches, for their part, may remain active (triggering false positives) unless an inhibitor strand blocks their activity; without such decoupling, even non-target RNAs can induce some cleavage.

Moreover, most implementations of cell-free ribozyme sensors rely on pre-amplification (RT-PCR, RPA, etc.) to reach clinically relevant sensitivities. Studies using toehold switches often amplify target RNA via NASBA or RT-LAMP before sensing. Direct, amplification-free sensing usually reaches limits of detection only in the femtomolar range or higher. A 2024 review notes "there is a certain lack of sensitivity in the direct detection of RNA" and that, although PCR-free methods address cost and complexity, they often still struggle to match PCR sensitivity.

The TRACKer Platform

Tang et al. address these limitations with TRACKer (Target-Responsive non-preAmplification Cell-free Kit). TRACKer is a three-module system that transduces target RNA into a measurable output without nucleic acid amplification. The modules are:

Ribozyme Allostery Module: An engineered hammerhead ribozyme is held in an inactive conformation by an inhibition-recognition strand (IRS). The IRS hybridizes partly with the ribozyme and also contains a sequence complementary to the viral target. Upon encountering the target RNA, the IRS binds the target (via strand displacement) and is released from the ribozyme, switching the ribozyme to its active state. This design decouples sensing from catalysis: the ribozyme remains OFF until the correct RNA displaces the IRS. In effect, the IRS "locks" the ribozyme's catalytic core, preventing background cleavage.

Riboregulator Activation Module: Once released, the now-active ribozyme triggers a cascade of gene expression. Specifically, the ribozyme cleaves a designed RNA that otherwise blocks translation; cleavage frees a riboregulator that then initiates transcription and translation of reporter proteins. This protein expression cascade amplifies the signal of a single binding event: one target RNA can lead to the production of many reporter enzymes. Because the amplification happens at the level of gene expression (each step multiplies the output of the one before it) rather than by copying the target, TRACKer achieves high sensitivity without DNA/RNA amplification.

Output Module: The translated reporters can be detected via interchangeable outputs. Tang et al. use reporters such as the HiBiT tag (a small peptide that reconstitutes a NanoLuc-type luciferase) or dual-epitope tags (Flag-His), which produce luminescence or a readout on lateral-flow strips. Thus TRACKer can yield a quantitative light signal or a visual line on a test strip, making it adaptable to different POC formats.

Together, these modules function in a cell-free lysate under isothermal conditions. The overall workflow is: add sample RNA -> target RNA displaces IRS -> ribozyme cleaves to activate translation -> reporter produced in under 70 minutes.

Key design elements include in silico selection of IRS and target regions. Tang et al. computationally screened viral genome sequences (using tools like NUPACK) to identify conserved, structured target sites and to design IRS sequences that minimize cross-reactivity. The GitHub code for IRS design is provided by the authors, enabling customization to new targets.
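To make the idea of picking conserved target sites concrete, here is a deliberately simplified sketch. It is not the authors' pipeline and does not use NUPACK: it only slides a fixed-width window over pre-aligned sequences, keeps windows that are identical in every sequence, and reports the reverse complement that a recognition strand would need in order to base-pair with each window. The function names, the toy alignment, and the window width are assumptions; a real design would also score secondary structure, binding energy, and cross-reactivity.

```python
# Toy search for perfectly conserved target windows in aligned RNA sequences.
# Illustrative only: no thermodynamics, no structure, no off-target screen.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def conserved_windows(aligned: list[str], width: int) -> list[tuple[int, str]]:
    """Return (start, window) for gap-free windows identical in all sequences."""
    if not aligned:
        return []
    length = min(len(seq) for seq in aligned)
    hits = []
    for start in range(length - width + 1):
        windows = {seq[start:start + width] for seq in aligned}
        if len(windows) == 1:
            window = next(iter(windows))
            if "-" not in window:  # skip windows that contain alignment gaps
                hits.append((start, window))
    return hits


def candidate_recognition_strands(aligned: list[str], width: int) -> list[tuple[int, str, str]]:
    """Pair each conserved window with the strand that would hybridize to it."""
    return [
        (start, window, reverse_complement(window))
        for start, window in conserved_windows(aligned, width)
    ]


if __name__ == "__main__":
    toy_alignment = [
        "AUGGCUACGUAGC",
        "AUGGCUACGAAGC",  # differs from the others at one position
        "AUGGCUACGUAGC",
    ]
    for start, window, strand in candidate_recognition_strands(toy_alignment, 8):
        print(start, window, strand)
```

Requiring perfect identity is the strictest possible filter; a practical screen would more likely tolerate a small fraction of variant positions.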

Performance and Sensitivity

TRACKer achieves remarkable analytical sensitivity. The cell-free reactions can detect as little as 1-10 attomolar (aM) of input RNA. Since 1 aM is about 0.6 molecules per µL, this is on the order of 6-60 molecules in a 10 µL reaction. In practice, Tang et al. demonstrated attomolar limits of detection (1-10 aM) for each of six respiratory viruses (Influenza A, Influenza B, RSV, human rhinovirus, SARS-CoV-2, and human parainfluenza virus). Detection is rapid: a positive signal appears within 60-70 minutes.

In side-by-side comparisons, TRACKer agreed closely with PCR-based methods. Clinical pharyngeal swab samples (n=97) were tested for Influenza A, RSV, and rhinovirus, and TRACKer results showed 88.9-100% concordance with RT-qPCR. Notably, this was achieved without any RNA extraction or amplification step. Furthermore, the authors showed that pseudovirus particles and unpurified samples (diluted swab eluates) could be used directly, and the system still reliably reported positives while avoiding false positives in negative samples.


Sensitivity: attomolar (1-10 aM) for all six targets.

Speed: detection in <70 minutes.

Sample: direct viral RNA from clinical swabs, pseudovirus particles, or synthetic RNA.

Flexibility: compatible with luminescent readout or lateral-flow readout.

Portability: operates isothermally, suitable for low-resource settings.

Comparison to Other Detection Methods

TRACKer's performance rivals that of many amplification-based assays. By detecting a few to a few tens of RNA copies without PCR, it approaches (and in some settings matches) RT-qPCR. For context, a typical RT-qPCR assay has LODs on the order of 10^1-10^2 copies per reaction. Similarly, isothermal amplification methods such as LAMP or RPA typically report LODs of tens to a few hundred copies (e.g. 50-200 copies) within 30-60 minutes. TRACKer's 1-10 aM sensitivity corresponds to roughly 6-60 molecules per 10 µL reaction, which places it in the same range and ahead of many amplification-free biosensors.
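Because the comparison above mixes concentrations (aM) with copy numbers, it helps to see the conversion written out. The following is a small sketch, with a function name of our choosing, that multiplies concentration by volume and Avogadro's number. The 10 µL volume comes from the text above; everything else is plain unit conversion.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def copies_in_reaction(conc_attomolar, volume_microliters):
    """Expected number of molecules at a given attomolar concentration and volume."""
    moles_per_liter = conc_attomolar * 1e-18
    liters = volume_microliters * 1e-6
    return moles_per_liter * liters * AVOGADRO

if __name__ == "__main__":
    for conc in (1, 10):
        copies = copies_in_reaction(conc, volume_microliters=10)
        print(f"{conc} aM in 10 uL is about {copies:.1f} molecules")
```

At 1 aM a 10 µL reaction holds about 6 molecules, and at 10 aM about 60, so sampling noise alone becomes significant at the low end of this range.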

CRISPR-Cas detection platforms can also reach attomolar sensitivity. For example, Gao et al. engineered Cas13 variants that detected SARS-CoV-2 RNA at attomolar levels in ~30 minutes. Like TRACKer, they did so without PCR by enhancing the enzyme's activity and reading out via electrochemical sensors. In another Cas13a-based "SHERLOCK"-type assay, digital microfluidics achieved sensitivity to 1 copy/µL with collateral signal amplification. The bottom line is that attomolar LOD is now within reach for several cutting-edge methods, and TRACKer joins these as an amplification-free contender.

What sets TRACKer apart is its purely RNA-based control and protein output. Unlike DNA amplification, no primers or polymerases are needed. Compared to CRISPR collateral detection (which requires CRISPR enzyme production), TRACKer uses only ribosomes and cell-free extract (which can be lyophilized). Unlike electrochemical or optical nanostructured biosensors, TRACKer can be run on simple incubators or water baths. The modular reporter design allows swapping in reporter genes to suit the readout (fluorescent, luminescent, or LFA lines).

Specificity and Off-Target Control

A chief concern for any amplification-free biosensor is specificity. Off-target interactions can produce false positives. Tang et al. addressed this in several ways. First, the IRS design inherently reduces ribozyme leakage: by physically occluding the active site, it minimizes accidental cleavage. Second, target binding and ribozyme activation are essentially irreversible strand-displacement events, providing a sharp switch-like response only if the exact target sequence is present.

Third, the authors verified orthogonality experimentally. In multiplex or singleplex tests across all six viral targets, TRACKer reactions responded only to their cognate target. For example, a lateral-flow version (TRACKer-LFA) produced clear test lines for each virus and no cross-reactivity with others. Thus the system is sequence-specific: single-nucleotide mismatches or unrelated RNAs do not trigger the ribozyme (both in silico design and empirical testing confirmed this).

Finally, the clinical sample results (concordance with RT-qPCR) and low background in negative controls demonstrate high specificity in practice. Any residual false positives would have to come from sample contaminants or non-specific strand displacement, but none were observed in the reported data.
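A crude in silico version of the orthogonality check described in this section can be sketched as follows. It assumes, for illustration only, that each sensor's recognition strand is the exact reverse complement of its target site, and it scores cross-reactivity as the longest perfectly complementary stretch between a sensor and each target. The sequences and names are invented, and the sketch ignores wobble pairs, mismatch tolerance, and secondary structure, all of which a thermodynamic tool would account for.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]

def longest_shared_run(a, b):
    """Length of the longest contiguous substring common to a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

def cross_reactivity(sensors, targets):
    """For every sensor, the longest perfectly complementary stretch on every target."""
    table = {}
    for sensor_name, sensor_seq in sensors.items():
        footprint = reverse_complement(sensor_seq)  # sequence the sensor pairs with
        table[sensor_name] = {
            target_name: longest_shared_run(footprint, target_seq)
            for target_name, target_seq in targets.items()
        }
    return table

if __name__ == "__main__":
    # Hypothetical 15-nt target sites, not real viral sequences.
    targets = {"virus_A": "AUGGCUAACGUUAGC", "virus_B": "CCGAUUCGGAUACCA"}
    sensors = {name: reverse_complement(site) for name, site in targets.items()}
    for sensor, row in cross_reactivity(sensors, targets).items():
        print(sensor, row)
```

In this toy setup each sensor matches its own target over the full site, while the off-target entries stay short. A design would be flagged for redesign if an off-target stretch approached the length needed for strand displacement.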

Deployment and Practical Considerations

TRACKer is designed for field deployment. The reagents are RNA, cell extract, and buffer; these can be lyophilized for shelf stability (an approach well-established for cell-free diagnostics). Detection is isothermal (e.g. 37-42°C) and read out by eye or simple devices. Because no DNA or enzymes beyond the extract are required, the assay cost is potentially low per test.

Some practical points:

Equipment: A heat block or incubator suffices; no thermocycler needed. Luminescence readers or a smartphone camera can quantitate signals. Lateral-flow strips give a binary yes/no output.

Speed: Roughly an hour (<70 minutes) from sample to answer is competitive. While slower than some CRISPR fluorescence assays (~30 min), it is faster than typical PCR runs.

Sensitivity trade-offs: Achieving attomolar sensitivity required optimal reaction design. Reaction volumes and timings were tuned; real-world conditions may require calibration.

Reagent supply: Cell-free extracts and ribozymes must be produced, but can be standardized. The open-source IRS design code aids adaptation to new targets.

Sample prep: Tang et al. used minimal processing. However, real clinical samples vary; debris or inhibitors might affect sensitivity. Additional filtration or RNA stabilization steps could be needed in some settings.

Scalability: Each viral target needs a bespoke IRS and reporter construct. While the design is algorithm-assisted, producing kits for dozens of pathogens would take effort. On the other hand, the modular nature means a single central kit could be adapted on-site by swapping the "key" IRS-RNA sets.

Overall, TRACKer combines many desirable features: it is a single-tube, programmable, amplification-free, and sensitive detection system. Its reliance on ribozyme engineering marks a novel approach compared to protein-centric sensors. As with any new platform, field trials and further optimization (e.g. ambient-temperature operation, freeze-drying, mixed-target panels) will determine its ultimate impact.

Conclusion

RNA biosensors are at the forefront of next-generation diagnostics. The TRACKer system, by ingeniously using an inhibition-recognition strand to gate a ribozyme switch, represents a significant advance in cell-free sensing. It achieves ultrahigh sensitivity (attomolar) in about an hour without PCR, and its specificity is bolstered by careful sequence design. Compared to other emerging methods (toehold switches, CRISPR sensors), TRACKer offers a complementary tool: RNA-gated, free of added purified enzymes beyond the cell-free extract, and adaptable to both luminescent and lateral-flow outputs.

Future work could extend TRACKer to more pathogens or biomarkers, streamline its workflow, and integrate it into user-friendly devices. Its low cost and rapid readout make it especially appealing for low-resource or outbreak settings. As Tang et al. conclude, TRACKer "presents a promising approach to nucleic acid detection… with potential applications in point-of-care diagnostics and beyond".

Key Points:

Ribozymes can serve as programmable RNA sensors but traditionally require amplification and have specificity issues.

TRACKer uses a novel IRS-mediated lock-and-key mechanism to prevent ribozyme background activity.

Three-module design enables amplification-free detection: target binding -> ribozyme activation -> reporter expression.

Sensitivity reaches 1-10 aM (a few to tens of copies per reaction) for six respiratory viruses, with results in <70 min.

Clinical testing showed high concordance with RT-qPCR (88.9-100%).

System is versatile: output via luminescence or lateral flow, and isothermal operation fits point-of-care settings.

Specificity is ensured by sequence design (conserved viral targets, orthogonal IRS) and low background activation.

Compared to PCR, LAMP/RPA, and CRISPR assays, TRACKer matches their sensitivity without needing DNA amplification or complex enzymes.

Selected References

Literature on RNA biosensors and diagnostics was reviewed comprehensively, including Tang et al.'s TRACKer report, a Nature Chem. Biol. highlight, and recent reviews of amplification-free RNA detection.

TRACKer and ribozyme diagnostics

De novo-designed ribozyme-controlled riboregulator for cell-free diagnostics. Nature Communications, 2026. https://www.nature.com/articles/s41467-026-71684-6

TRACKing viruses. Nature Chemical Biology, 2026. https://www.nature.com/articles/s41589-026-02245-7

Toehold switches: de-novo-designed regulators of gene expression. Cell, 2014. https://doi.org/10.1016/j.cell.2014.10.002

Zeptomole detection of a viral nucleic acid using a target-activated ribozyme. RNA, 2003. https://rnajournal.cshlp.org/content/9/9/1058.full

Rapid, Multiplexed, and Enzyme-Free Nucleic Acid Detection Using Programmable Aptamer-Based RNA Switches. ACS Synthetic Biology, 2024. https://pmc.ncbi.nlm.nih.gov/articles/PMC11259118/

Label-free and amplification-free viral RNA quantification from primate biofluids using a trapping-assisted optofluidic nanopore platform. Proceedings of the National Academy of Sciences, 2024. https://pmc.ncbi.nlm.nih.gov/articles/PMC11032468/

Friday, June 19, 2026

Proximity Chemistries for Subcellular RNA Biology

Mapping RNA in Place: Proximity Chemistries for Subcellular RNA Biology

A practical guide to APEX-seq, Halo-seq, CAP-seq, HyPro/POCA, and other methods for mapping local transcriptomes.

Proximity chemistries to study RNA biology at the subcellular scale

Figure: Cartoon of an animal cell highlighting major compartments (nucleus, nucleolus, ER, mitochondria, etc.) that host localized RNAs. Proximity labeling methods tag RNAs (or proteins bound to RNAs) within such compartments for sequencing.

RNAs are nonuniformly distributed throughout the cell, from membrane-bound organelles to phase-separated condensates. For example, pre-ribosomal RNAs (45S) concentrate in nucleoli for ribosome biogenesis, and NEAT1 lncRNA scaffolds paraspeckles in the nucleus. Mislocalization of RNAs and their RNP complexes (e.g. NPM1 in leukemia or TDP-43 in neurodegeneration) can drive disease, underscoring the need to map RNA locales. Traditional methods (fractionation, imaging, laser-capture, RIP/CLIP) are limited by resolution, throughput or perturbation. Proximity labeling (PL) circumvents these issues by using localized catalysts to covalently tag nearby RNAs or their binding proteins in intact cells. Labeled RNAs are then enriched (e.g. via biotin-streptavidin) and deep-sequenced, yielding compartment-specific transcriptomes. PL thus systematically "records" spatial RNA-protein/RNA-RNA neighborhoods in vivo, enabling discovery of locale-specific regulatory networks.

Reactive proximity-labeling chemistries

Many PL methods generate short-lived reactive intermediates that diffuse only nanometers from a targeted catalyst, labeling any RNA or protein in range. The labeling radius depends on the intermediate's lifetime and diffusion. For example, the peroxidase APEX2 (upon H2O2 addition) oxidizes biotin-phenol to phenoxyl radicals, which diffuse only ~20 nm and covalently attach to nearby biomolecules. In contrast, photosensitizers produce longer-lived species: the flavoprotein miniSOG under blue light generates singlet oxygen (~0.6 µs lifetime, ~70 nm radius), and the dye dibromofluorescein (DBF) under green light produces reactive oxygen species that diffuse ~100 nm. Thus, APEX-driven PL gives very fine spatial resolution, while photosensitized approaches label a somewhat larger nanoscale neighborhood. The following are major reactive PL platforms for RNAs:

APEX2 (peroxidase)-based PL (APEX-seq): APEX2 is genetically fused to a compartment marker. In cells fed biotin-phenol, a brief H2O2 pulse activates APEX2 to create phenoxyl radicals that biotinylate nearby RNAs. Fazal et al. introduced APEX-seq, using APEX2 in nine subcellular locales to generate a "nanometer-resolution spatial map" of the human transcriptome. This revealed, for instance, a radial organization of nuclear RNAs and transcripts gating at the nuclear pore, as well as two distinct mRNA targeting pathways to mitochondria. APEX-seq enables rapid (∼1 min) labeling, but requires H2O2 and tends to label proteins more efficiently than RNA; nonetheless it has been successful in mapping compartment-enriched transcripts.

Chromophore-assisted PL (CAP-seq): Photoactivatable flavoproteins (e.g. miniSOG) are fused to a bait protein. Upon blue-light illumination, miniSOG photo-oxidizes nearby nucleic acids via singlet oxygen (^1O2). An exogenous nucleophile (typically propargylamine, PA) is present to react with oxidized bases, installing an alkyne handle. Subsequent copper-catalyzed click chemistry with azide-biotin captures the labeled RNAs. CAP-seq using G3BP1-miniSOG, for example, captured hundreds of mRNAs in stress granules (457 RNAs under arsenite stress, 822 under sorbitol). The labeling radius is limited by singlet oxygen diffusion (~70 nm), yielding spatial specificity suitable for micron-scale assemblies.

HaloTag + dibromofluorescein (Halo-seq): A HaloTag fusion localizes a covalently bound DBF ligand in a compartment. Green-light activation causes DBF to release oxygen radicals that oxidize nearby RNA bases. These oxidized bases then react with PA to add an alkyne, which is biotinylated by click chemistry. Engel et al. showed that Halo-seq robustly labels RNAs near the targeted compartment and outperforms previous methods in efficiency. For example, Halo-p65 (cytosolic) enriched cytoplasmic mRNAs like GAPDH, whereas Halo-H2B (nuclear) enriched nuclear RNAs (RNase P, TERC, 7SK). Halo-seq thus provides ~100 nm resolution (due to DBF radical diffusion) with good sensitivity across all major compartments.

Engineered photocatalysts (Lantern and variants): Directed evolution has improved flavoprotein photosensitizers. Ren et al. engineered Lantern, a LOV-domain flavoprotein with enhanced ROS output and rapid kinetics. Lantern fusions generate singlet oxygen with sub-minute illumination, greatly accelerating PL. It has been targeted to the ER, mitochondria and stress granules to perform rapid CAP-seq (transcriptome) and CAP-MS (proteome) mapping. Lantern was also adapted for CAP-CELL, a cell-surface RNA-labeling mode enabling spatial cell typing. Lantern-based PL demonstrated unprecedented temporal resolution: for instance, m^6A-modified RNAs were seen entering stress granules within 5 minutes of stress induction.

Each reactive strategy has trade-offs in radius, speed, and labeling chemistry (Table 1). In general, APEX2 (H2O2-driven) is very fast (∼1 min) and labels within a small radius, miniSOG/DBF photo-oxidation covers tens of nanometers with slower illumination (minutes), and evolved catalysts like Lantern achieve both small radius and sub-minute timing. Importantly, Halo-seq showed higher RNA labeling efficiency than CAP-seq or APEX-seq. Reactive methods directly tag RNAs (by oxidation or addition of biotin/alkyne handles), yielding high sensitivity: for example, Halo-seq identified thousands of compartment-specific RNAs and distinguished known organelle transcripts (see below). In contrast, indirect protein-based labeling (below) tends to enrich fewer RNAs.
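Calling a transcript "compartment-specific" in experiments like these usually comes down to comparing its abundance in the labeled, streptavidin-enriched fraction with its abundance in the input. The sketch below shows one generic way to compute that, as a library-size-normalized log2 ratio with a pseudocount. It is not the analysis used in any of the cited papers (published pipelines typically use replicate-aware statistical tools), and the transcript names and read counts are invented.

```python
import math

def log2_enrichment(labeled_counts, input_counts, pseudocount=1.0):
    """Log2 ratio of counts-per-million in the labeled fraction versus the input.

    Only transcripts present in both dictionaries are scored.
    """
    labeled_total = sum(labeled_counts.values())
    input_total = sum(input_counts.values())
    result = {}
    for transcript in labeled_counts.keys() & input_counts.keys():
        labeled_cpm = 1e6 * labeled_counts[transcript] / labeled_total
        input_cpm = 1e6 * input_counts[transcript] / input_total
        result[transcript] = math.log2(
            (labeled_cpm + pseudocount) / (input_cpm + pseudocount)
        )
    return result

if __name__ == "__main__":
    # Made-up read counts for two placeholder transcripts.
    labeled = {"nuclear_rna": 900, "cytosolic_rna": 100}
    total_input = {"nuclear_rna": 500, "cytosolic_rna": 500}
    for name, value in sorted(log2_enrichment(labeled, total_input).items()):
        print(f"{name}: log2 enrichment {value:+.2f}")
```

Positive values mark transcripts over-represented near the bait, negative values mark depleted ones. The pseudocount keeps low-count transcripts from producing extreme ratios.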

Contact-dependent labeling techniques

Contact-dependent or templated methods achieve labeling only when a probe is physically bound to its target, affording very tight spatial control. These typically use an affinity reagent (antibody, oligonucleotide, protein fusion) to bring a catalyst or modifying enzyme into direct proximity with an RNA or protein of interest. For instance:

Hybridization-proximity (HyPro) and POCA: In HyPro/POCA methods, fixed cells are hybridized or immunostained with a fluorescent photosensitizer probe that binds a specific RNA or protein. The attached photosensitizer then generates radicals that label nearby molecules. This approach requires no genetic engineering and minimal input. Biletch et al. developed POCA, targeting organic fluorophores via standard immunofluorescence or FISH to place the catalyst at an epitope. They demonstrated POCA in fixed cells by imaging a fluorescent tag on-target before activation, then performing PL. In one study, Yap et al. used HyPro to target two noncoding RNAs, the 45S pre-rRNA and NEAT1: they identified both known and novel proteins (HyPro-MS) and RNAs (HyPro-seq) associated with these transcripts in intact nuclei. Notably, NEAT1-directed HyPro-seq revealed a rich set of A-to-I-edited RNAs at paraspeckle boundaries, showing how anchored PL can map RNA-chromatin interactions. POCA/HyPro has been applied to diverse structures (nuclear pore, nucleolus, nuclear speckles, telomeres, etc.), and can even be anchored to both a protein and an RNA in the same compartment to compare the proximal proteomes from each perspective.

Polyuridylation tagging (RNA Tagging): In this enzymatic approach, a chosen RNA-binding protein (RBP) is fused to a poly(U) polymerase (C. elegans PUP-2). When the fusion binds its native RNAs, PUP-2 adds a short U-tail to the target, marking it covalently. Sequencing of 3' ends then reveals the tagged transcripts. This "RNA Tagging" method (Lapointe et al.) has been used in yeast to map RBP-RNA networks without crosslinking.

RNA base-editing (TRIBE/STAMP): An RBP can be fused to the catalytic domain of an RNA-editing enzyme. For example, fusing ADAR's catalytic domain to an RBP leads to A->I (read as A->G) edits near binding sites, indirectly flagging the targets (TRIBE). Similarly, APOBEC or other cytidine deaminases can be used for C->U tagging (STAMP). These fusions record interactions transcriptome-wide in vivo, though typically over longer timescales (hours).

CRISPR-guided proximity: Catalytically dead RNA-targeting CRISPR (dCas13) can be programmed with a guide RNA to bind a specific transcript. Fusing a PL enzyme (APEX2, BirA, etc.) to dCas13 then directs labeling to that RNA's neighborhood. Han et al. demonstrated this concept by using MS2-MCP or Cas13 to deliver APEX2 to the human telomerase RNA (hTR), selectively labeling its interactors (PNAS 2020). Such CRISPR-PL methods allow sequence-specific profiling of an individual RNA's local proteome or co-transcripts.

These contact methods excel at pinpointing interactions of a chosen RNA or RBP, even in genetically unperturbed cells. For example, HyPro-seq targeting NEAT1 uncovered hundreds of paraspeckle-associated transcripts. POCA (targeting antibodies) and RNA Tagging (targeting RBPs) similarly yield high specificity. However, they require delivery of probes or fusion proteins and often work in fixed or engineered cells.
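For the base-editing readouts among the contact-dependent methods above, the signal is a reference A that sequencing reads report as G. The following toy sketch tallies such sites from ungapped reads given as (start position, sequence) pairs. It is only an illustration of the counting step: the function name, the coverage threshold, and the sequences are assumptions, and a real analysis would work from aligned reads and subtract genomic SNPs and editing seen in control samples.

```python
from collections import Counter

def editing_fractions(reference, reads, min_coverage=2):
    """Fraction of reads showing G at each reference A position.

    reads: iterable of (start, sequence) pairs, ungapped, in reference orientation.
    Returns {position: fraction} for A positions with at least one edited read
    and coverage of at least min_coverage.
    """
    edited = Counter()
    coverage = Counter()
    for start, seq in reads:
        for i, base in enumerate(seq):
            pos = start + i
            if pos < len(reference) and reference[pos] == "A":
                coverage[pos] += 1
                if base == "G":
                    edited[pos] += 1
    return {
        pos: edited[pos] / coverage[pos]
        for pos in sorted(coverage)
        if coverage[pos] >= min_coverage and edited[pos] > 0
    }

if __name__ == "__main__":
    # Hypothetical reference and reads.
    reference = "CAGATTACA"
    reads = [(0, "CAGGTTACA"), (0, "CAGGTTACA"), (2, "GATTACA")]
    print(editing_fractions(reference, reads))
```

Here position 3 is read as G in two of the three reads that cover it, so it is reported with an editing fraction of about 0.67, while unedited A positions are omitted.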

Indirect (protein-centric) approaches

Some PL strategies label proteins rather than RNA directly, then identify associated RNAs via crosslinking. For example:

TurboID/BioID tagging - Proximity biotin ligases (TurboID or BioID) fused to compartmental markers robustly biotinylate nearby proteins (within ~10 nm). After streptavidin pulldown, the bound proteome is analyzed by mass spec, and any co-purifying RNAs (crosslinked or stably associated) are sequenced. Ramelow et al. introduced SPARO (Simultaneous Protein And RNA-omics) using a Rosa26-TurboID mouse line. By labeling astrocytes or neurons in vivo, they enriched cell-type proteomes and protein-bound transcriptomes concurrently. SPARO validated that the captured RNA pool represents the expected transcriptome and revealed cases of mRNA-protein discordance in neuroinflammation.

APEX-RIP/CLIP - APEX2 (or HRP) can also label the local proteome, which is then subjected to RNA immunoprecipitation (e.g. CLIP or APEX-RIP). For instance, one can APEX-label ER proteins and then UV-crosslink and immunoprecipitate any RNAs associated with biotinylated ribonucleoproteins. These indirect methods can in principle detect RNAs near a structure without labeling them chemically. However, they depend on crosslinking efficiency and are biased toward protein-bound transcripts. In practice, indirect approaches often show lower sensitivity. For example, simple nuclear APEX-RIP enriched fewer than 200 transcripts, whereas direct Halo-seq of the nucleus enriched >1000.

Comparison of methods

The various PL chemistries differ in range, speed, and throughput (Table 1).

Labeling radius: APEX2 phenoxyl radicals tag within ~10-20 nm; miniSOG singlet oxygen ~70 nm; Halo-DBF radicals ~100 nm.

Time resolution: APEX/H2O2 pulses label in ≲1 min; photo-oxidation (CAP/DBF) typically requires several minutes of illumination; Lantern allows labeling in seconds.

Specificity: Direct RNA tagging yields high compartment specificity. For example, Halo-seq targeted to nucleoli (Fibrillarin-Halo) enriched known nucleolar RNAs (e.g. SNORA68, 7SL) that were depleted in a general nuclear pulldown. CAP-seq labeling of G3BP1 captured well-known SG mRNAs (long, AU-rich, translationally repressed).

Sensitivity: Direct PL methods can recover hundreds to thousands of RNAs per locale. Halo-seq reported robust enrichment of nuclear vs cytosolic transcriptomes, and CAP-seq found hundreds of SG RNAs. In contrast, indirect labeling of proteins yields far fewer RNAs (e.g. <200 nuclear RNAs by APEX-RIP).

In summary, direct PL provides higher sensitivity and spatial precision, whereas indirect (protein-centric) PL is more limited by crosslinking and diffusion of label.
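The labeling radii quoted in this comparison follow from how far a reactive intermediate can diffuse before it decays. A rough back-of-the-envelope estimate uses the root-mean-square displacement for free three-dimensional diffusion, sqrt(6 * D * t). The sketch below applies that formula to the ~0.6 µs singlet-oxygen lifetime mentioned earlier. The diffusion coefficient of 2e-9 m^2/s is our assumption (roughly that of dissolved oxygen in water) and does not come from the cited papers, so the output should be read as an order-of-magnitude check only.

```python
import math

def rms_displacement_nm(diffusion_m2_per_s, lifetime_s):
    """Root-mean-square 3D displacement, sqrt(6 * D * t), in nanometres."""
    return math.sqrt(6.0 * diffusion_m2_per_s * lifetime_s) * 1e9

def implied_diffusion(radius_nm, lifetime_s):
    """Diffusion coefficient (m^2/s) that would give the stated radius and lifetime."""
    return (radius_nm * 1e-9) ** 2 / (6.0 * lifetime_s)

if __name__ == "__main__":
    assumed_d = 2e-9      # m^2/s, ASSUMED value for a small molecule in water
    lifetime = 0.6e-6     # s, singlet-oxygen lifetime quoted in the text
    print(f"estimated radius: {rms_displacement_nm(assumed_d, lifetime):.0f} nm")
    print(f"D implied by a 70 nm radius: {implied_diffusion(70, lifetime):.2e} m^2/s")
```

With these inputs the estimate is about 85 nm, the same order as the ~70 nm quoted for miniSOG. In a crowded cell, quenching and reaction with nearby biomolecules shorten the effective lifetime, which would shrink the radius further.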

Applications: subcellular transcriptome mapping

Proximity labeling has been applied to chart RNAs in virtually every compartment:

Nucleus and nucleolus: APEX2-NLS or lamin fusions have delineated nuclear-layered transcriptomes. For instance, Fazal et al. found that processed mRNAs exit the nucleus through inner pore regions, and that HuR-dependent RNAs accumulate in the nucleus (APEX-seq). Halo-seq of H2B (chromatin) vs p65 (cytosol) highlighted that nuclear-enriched transcripts frequently contain AU-rich elements, suggesting HuR involvement. Halo-seq directed to fibrillarin (nucleolus) specifically pulled down nucleolar RNAs (snoRNAs, 7SL) that were absent from the general nuclear pool.

Paraspeckles and other nuclear bodies: Using HyPro/POCA, researchers have begun mapping RNAs at specific nuclear condensates. In targeting the NEAT1 lncRNA (paraspeckles), HyPro-seq uncovered a large set of incompletely processed, A->I-edited transcripts that localize at active chromosomal loci immediately adjacent to the NEAT1-paraspeckle core. Similarly, PL of NEAT1-associated proteins (PSPC1, etc.) or the Perinucleolar Compartment (PNC) could reveal their tethered transcripts.

Mitochondria and ER: APEX2 targeted to the mitochondrial outer membrane (OMM) or ER membrane profiles the RNAs on those surfaces. Fazal et al. showed two distinct pathways of mRNA targeting to mitochondria: one localizes nuclear-encoded respiratory-chain mRNAs to the OMM, and another serves mRNAs encoding other mitochondrial proteins. (Organelle PL can also capture imported transcripts or ER-localized mRNAs, though specific examples remain under study.)

Stress granules (SGs) and P-bodies: Membraneless cytoplasmic granules have been difficult to purify intact, but CAP-seq and Lantern have made it possible. Zou et al. used G3BP1-miniSOG to perform CAP-seq on SGs in live cells. They found that SG-enriched mRNAs tend to be long, AU-rich, poorly translated, and many carry m^6A modifications. They also tracked SG transcriptomes during assembly/disassembly. Lantern-accelerated CAP-seq captured early SG recruitment of methylated mRNAs (within minutes). By contrast, proteins heavily targeted to SGs (TIA1, FMRP) could be profiled by PL-MS and their bound RNAs inferred.

Cell surface and in vivo labeling: A recent advance (CAP-CELL) uses Lantern delivered to the plasma membrane to label cell-surface RNAs, enabling 'spatial cell typing'. Meanwhile, in vivo PL strategies like SPARO (astrocytes/neurons in mouse) combine TurboID and sequencing to yield native-cell-type transcriptomes and proteomes. These methods hint at future whole-organism spatial omics.

In all these cases, PL has uncovered locale-specific RNA populations that were inaccessible by fractionation or imaging alone. For example, paraspeckle-targeted HyPro-seq and SG-targeted CAP-seq complement conventional CLIP by revealing transcripts that cluster without necessarily binding a known RBP. By systematically mapping the "local transcriptomes" of organelles and granules, PL is redefining our view of the intracellular RNA landscape.

Performance overview

Labeling radius: Determined by chemistry - e.g. APEX2 phenoxyl radicals label within ∼20 nm; miniSOG singlet O2 labels ∼70 nm; Halo-DBF radicals ∼100 nm. Contact-based methods effectively have zero diffusion beyond the target.

Temporal resolution: APEX/H2O2 can tag in ≲1 minute pulses (fastest). Photo-PL (miniSOG/DBF) typically requires 5-15 minutes of illumination. Lantern enables labeling in a few seconds (sub-minute). Enzymatic PL (TurboID, ADAR) operates on timescales of minutes to hours (limited by enzyme kinetics).

Specificity: High - PL precisely enriches known compartment markers. For instance, Halo-seq of nucleolar fibrillarin enriched SNORA and RNase MRP RNAs (nucleolus-specific) that were depleted in a parallel nuclear control. CAP-seq of SGs enriched long, U-rich mRNAs (e.g. DYNC1H1, NORAD) over mitochondrial or highly translated RNAs.

Sensitivity: Direct RNA labeling yields hundreds to thousands of transcripts per experiment. In practice, APEX-seq and Halo-seq report ~10^3-10^4 enriched RNAs transcriptome-wide per compartment. CAP-seq often recovers a smaller set (10^2-10^3) focused on the granule core. By contrast, indirect approaches recover far fewer RNA hits (e.g. <200 RNAs enriched by nuclear APEX-RIP). Thus, direct PL is generally more sensitive and comprehensive.

Coverage: PL is unbiased (sequencing-based) and transcriptome-wide, unlike FISH or targeted assays. It can profile coding and noncoding RNAs (mRNAs, lncRNAs, sn/snoRNAs, etc.) at once.
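The figures in this overview can be collected into a small lookup structure, which makes it easier to shortlist a chemistry for a given experiment. The sketch below encodes only the approximate radii and the upper-end labeling times stated in this post. The class and function names are ours, and Lantern is left out because no numeric radius is given for it here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Method:
    name: str
    radius_nm: float      # approximate labeling radius quoted in this post
    label_minutes: float  # upper end of the labeling time quoted in this post

METHODS = [
    Method("APEX-seq", radius_nm=20, label_minutes=1),
    Method("CAP-seq (miniSOG)", radius_nm=70, label_minutes=15),
    Method("Halo-seq (DBF)", radius_nm=100, label_minutes=15),
]

def candidates(max_radius_nm, max_minutes):
    """Names of methods whose radius and labeling time both fit the given limits."""
    return [
        m.name
        for m in METHODS
        if m.radius_nm <= max_radius_nm and m.label_minutes <= max_minutes
    ]

if __name__ == "__main__":
    print(candidates(max_radius_nm=80, max_minutes=20))
    print(candidates(max_radius_nm=50, max_minutes=2))
```

Radius and speed are only two of the criteria. Labeling efficiency, tolerance of H2O2 or light exposure, and whether genetic tagging is possible will often decide the choice in practice.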

Future outlook

Proximity chemistries for RNA are rapidly evolving. The trend is toward faster, more efficient catalysts and broader applicability. Engineered photosensitizers like Lantern exemplify this, enabling previously impossible kinetics. Orthogonal targeting (new HaloTags, split-enzymes) and small-molecule activation may further refine spatial control. Contact-based platforms like POCA/HyPro show that one can exploit standard imaging workflows to profile RNAs and proteins without genetic tags. Integrating PL with single-cell or super-resolution methods could produce even richer maps (e.g. combining APEX-seq with MERFISH for precise sub-organellar localization). Remaining challenges include minimizing oxidative damage, distinguishing direct contacts from mere colocalization, and extending PL to in vivo and clinical samples. Nevertheless, current PL tools have already revealed intricate links between RNA localization, RNA-binding proteins, and cell function. As catalysts and methods improve, spatially resolved transcriptomics will become a routine window into RNA regulation.

Sources: Recent advances in RNA proximity labeling have been documented in primary research: for example, Fazal et al. (2019) on APEX-seq, Engel et al. (2022) on Halo-seq, Zou et al. (2023) on CAP-seq in stress granules, Ren et al. (2025) on Lantern, and Yap et al. (2022) on hybridization-based methods. These studies (among others) provide the mechanistic and performance data summarized above.

Selected References

Atlas of Subcellular RNA Localization Revealed by APEX-Seq. Cell, 2019.
Analysis of subcellular transcriptomes by RNA proximity labeling with Halo-seq. Nucleic Acids Research, 2022.
Halo-seq: an RNA proximity labeling method for the isolation and analysis of subcellular RNA populations. Current Protocols, 2022.
HyPro-seq reveals spatially regulated RNA processing at paraspeckles. Nature, 2022.
Directed evolution of a genetically encoded photocatalyst for temporally resolved proximity labeling of subcellular RNAs and proteins. Preprint, 2025.
