Mapping RNA in Place: Proximity Chemistries for Subcellular RNA Biology

A practical guide to APEX-seq, Halo-seq, CAP-seq, HyPro/POCA, and other methods for mapping local transcriptomes.

Proximity chemistries to study RNA biology at the subcellular scale

Figure: Cartoon of an animal cell highlighting major compartments (nucleus, nucleolus, ER, mitochondria, etc.) that host localized RNAs. Proximity labeling methods tag RNAs (or proteins bound to RNAs) within such compartments for sequencing.

RNAs are nonuniformly distributed throughout the cell, from membrane-bound organelles to phase-separated condensates. For example, pre-ribosomal RNAs (45S) concentrate in nucleoli for ribosome biogenesis, and NEAT1 lncRNA scaffolds paraspeckles in the nucleus. Mislocalization of RNAs and their RNP complexes (e.g. NPM1 in leukemia or TDP-43 in neurodegeneration) can drive disease, underscoring the need to map RNA locales. Traditional methods (fractionation, imaging, laser-capture, RIP/CLIP) are limited by resolution, throughput or perturbation. Proximity labeling (PL) circumvents these issues by using localized catalysts to covalently tag nearby RNAs or their binding proteins in intact cells. Labeled RNAs are then enriched (e.g. via biotin-streptavidin) and deep-sequenced, yielding compartment-specific transcriptomes. PL thus systematically "records" spatial RNA-protein/RNA-RNA neighborhoods in vivo, enabling discovery of locale-specific regulatory networks.

Reactive proximity-labeling chemistries

Many PL methods generate short-lived reactive intermediates that diffuse nanometers from a target catalyst, labeling any RNA or protein in range. The labeling radius depends on the intermediate's lifetime and diffusion. For example, the peroxidase APEX2 (upon H2O2 addition) oxidizes biotin-phenol to phenoxyl radicals, which diffuse only ~20 nm and covalently attach to nearby biomolecules. In contrast, photosensitizers produce longer-lived species: the flavoprotein miniSOG under blue light generates singlet O2 (~0.6 µs lifetime, ~70 nm radius), and the xanthene dye dibromofluorescein (DBF) under green light produces reactive oxygen species that diffuse ~100 nm. Thus, APEX-driven PL gives very fine spatial resolution, while photosensitized approaches label a somewhat larger nanoscopic neighborhood. The following are major reactive PL platforms for RNAs:

APEX2 (peroxidase)-based PL (APEX-seq): APEX2 is genetically fused to a compartment marker. In cells fed biotin-phenol, a brief H2O2 pulse activates APEX2 to create phenoxyl radicals that biotinylate nearby RNAs. Fazal et al. introduced APEX-seq, using APEX2 in nine subcellular locales to generate a "nanometer-resolution spatial map" of the human transcriptome. This revealed, for instance, a radial organization of nuclear RNAs and transcripts gating at the nuclear pore, as well as two distinct mRNA targeting pathways to mitochondria. APEX-seq enables rapid (∼1 min) labeling, but requires H2O2 and tends to label proteins more efficiently than RNA; nonetheless it has been successful in mapping compartment-enriched transcripts.

Chromophore-assisted PL (CAP-seq): A genetically encoded flavin-binding photosensitizer (e.g. miniSOG) is fused to a bait protein. Upon blue-light illumination, miniSOG generates singlet oxygen (^1O2) that oxidizes nearby nucleic acids. An exogenous nucleophile (typically propargylamine, PA) is present to react with oxidized bases, installing an alkyne handle. A subsequent copper-catalyzed click reaction with azide-biotin allows capture of the labeled RNAs. CAP-seq using G3BP1-miniSOG, for example, captured hundreds of mRNAs in stress granules (457 RNAs under arsenite stress, 822 under sorbitol). The labeling radius is limited by singlet oxygen diffusion (~70 nm), yielding high spatial specificity suitable for micron-scale assemblies.

HaloTag + dibromofluorescein (Halo-seq): A HaloTag fusion localizes a covalently bound DBF ligand in a compartment. Green-light activation causes DBF to release oxygen radicals that oxidize nearby RNA bases. These oxidized bases then react with PA to add an alkyne, which is biotinylated by click chemistry. Engel et al. showed that Halo-seq robustly labels RNAs near the targeted compartment and outperforms previous methods in efficiency. For example, Halo-p65 (cytosolic) enriched cytoplasmic mRNAs like GAPDH, whereas Halo-H2B (nuclear) enriched nuclear RNAs (RNase P, TERC, 7SK). Halo-seq thus provides ~100 nm resolution (due to DBF radical diffusion) with good sensitivity across all major compartments.

Engineered photocatalysts (Lantern and variants): Directed evolution has improved flavoprotein photosensitizers. Ren et al. engineered Lantern, a LOV-domain flavoprotein with enhanced ROS output and rapid kinetics. Lantern fusions generate singlet oxygen with sub-minute illumination, greatly accelerating PL. It has been targeted to the ER, mitochondria and stress granules to perform rapid CAP-seq (transcriptome) and CAP-MS (proteome) mapping. Lantern was also adapted for CAP-CELL, a cell-surface RNA-labeling mode enabling spatial cell typing. Lantern-based PL demonstrated unprecedented temporal resolution: for instance, m^6A-modified RNAs were seen entering stress granules within 5 minutes of stress induction.

Each reactive strategy has trade-offs in radius, speed, and labeling chemistry (Table 1). In general, APEX2 (H2O2-driven) is very fast (∼1 min) and labels within a small radius, miniSOG/DBF photo-oxidation covers a somewhat larger radius (~70-100 nm) with slower illumination (minutes), and evolved catalysts like Lantern achieve both small radius and sub-minute timing. Importantly, Halo-seq showed higher RNA labeling efficiency than CAP-seq or APEX-seq. Reactive methods directly tag RNAs (by oxidation or addition of biotin/alkyne handles), yielding high sensitivity: for example, Halo-seq identified thousands of compartment-specific RNAs and distinguished known organelle transcripts (see below). In contrast, indirect protein-based labeling (below) tends to enrich fewer RNAs.
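The radii quoted in this section can be sanity-checked with a back-of-the-envelope diffusion estimate. The sketch below assumes free three-dimensional diffusion, for which the root-mean-square displacement after a time t is sqrt(6 * D * t). The diffusion coefficient used (2e-9 m^2/s, roughly that of molecular oxygen in water) is an assumption that does not come from the studies discussed here, the function name is illustrative, and the crowded, quencher-rich interior of a cell will shorten real labeling distances.

```python
import math


def rms_displacement_nm(diffusion_m2_per_s: float, lifetime_s: float) -> float:
    """RMS displacement in nanometers for free 3D diffusion: sqrt(6 * D * t)."""
    if diffusion_m2_per_s < 0 or lifetime_s < 0:
        raise ValueError("diffusion coefficient and lifetime must be non-negative")
    meters = math.sqrt(6.0 * diffusion_m2_per_s * lifetime_s)
    return meters * 1e9  # convert m to nm


# ASSUMPTION: diffusion coefficient close to that of O2 in water (not from the text).
D_ASSUMED = 2e-9  # m^2/s

# Singlet-oxygen lifetime quoted above for miniSOG: ~0.6 microseconds.
SINGLET_O2_LIFETIME_S = 0.6e-6

if __name__ == "__main__":
    radius = rms_displacement_nm(D_ASSUMED, SINGLET_O2_LIFETIME_S)
    print(f"estimated singlet-oxygen radius: {radius:.0f} nm")
```

Under these assumptions the estimate comes out in the high tens of nanometers, the same order as the ~70 nm figure quoted for miniSOG. Because the radius scales with the square root of the lifetime, a tenfold shorter-lived intermediate would travel only about a third as far.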

Contact-dependent labeling techniques

Contact-dependent or templated methods achieve labeling only when a probe is physically bound to its target, affording very tight spatial control. These typically use an affinity reagent (antibody, oligonucleotide, protein fusion) to bring a catalyst or modifying enzyme into direct proximity with an RNA or protein of interest. For instance:

Hybridization-proximity (HyPro) and POCA: In HyPro/POCA methods, fixed cells are hybridized or immunostained with a fluorescent photosensitizer probe that binds a specific RNA or protein. The attached photosensitizer then generates radicals that label nearby molecules. This approach requires no genetic engineering and minimal input. Biletch et al. developed POCA, targeting organic fluorophores via standard immunofluorescence or FISH to place the catalyst at an epitope. They demonstrated POCA in fixed cells by imaging a fluorescent tag on-target before activation, then performing PL. In one study, Yap et al. used HyPro to target the 45S pre-rRNA and the lncRNA NEAT1: they identified both known and novel proteins (HyPro-MS) and RNAs (HyPro-seq) associated with these transcripts in intact nuclei. Notably, NEAT1-directed HyPro-seq revealed a rich set of A-to-I-edited RNAs at paraspeckle boundaries, showing how anchored PL can map RNA-chromatin interactions. POCA/HyPro has been applied to diverse structures (nuclear pore, nucleolus, nuclear speckles, telomeres, etc.), and can even be anchored to both a protein and an RNA in the same compartment to compare the proximal proteomes from each perspective.

Polyuridylation tagging (RNA Tagging): In this enzymatic approach, a chosen RNA-binding protein (RBP) is fused to a poly(U) polymerase (C. elegans PUP-2). When the fusion binds its native RNAs, PUP-2 adds a short U-tail to the target, marking it covalently. Sequencing of 3' ends then reveals the tagged transcripts. This "RNA Tagging" method (Lapointe et al.) has been used in yeast to map RBP-RNA networks without crosslinking.

RNA base-editing (TRIBE/STAMP): An RBP can be fused to the catalytic domain of an RNA-editing enzyme so that editing marks the RNAs it binds. In TRIBE, fusing ADAR's catalytic domain to an RBP leads to A->I edits (read as A->G in sequencing) near binding sites, indirectly flagging the targets. Similarly, in STAMP an APOBEC fusion introduces C->U edits. These fusions record interactions transcriptome-wide in vivo, though typically over longer timescales (hours).

CRISPR-guided proximity: Catalytically dead RNA-targeting CRISPR (dCas13) can be programmed with a guide RNA to bind a specific transcript. Fusing a PL enzyme (APEX2, BirA, etc.) to dCas13 then directs labeling to that RNA's neighborhood. Han et al. demonstrated this concept by using MS2-MCP or Cas13 to deliver APEX2 to the human telomerase RNA (hTR), selectively labeling its interactors (PNAS 2020). Such CRISPR-PL methods allow sequence-specific profiling of an individual RNA's local proteome or co-transcripts.

These contact methods excel at pinpointing interactions of a chosen RNA or RBP, even in genetically unperturbed cells. For example, HyPro-seq targeting NEAT1 uncovered hundreds of paraspeckle-associated transcripts. POCA (targeted by antibodies) and RNA Tagging (targeted by RBP fusions) similarly yield high specificity. However, they require delivery of probes or fusion proteins and often work in fixed or engineered cells.
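The two enzymatic tagging methods above, U-tailing and base editing, are both read out from sequencing data, and the core of each readout can be sketched in a few lines. The code below is a minimal illustration, not the published pipelines. The function names, the minimum tail length, the coverage cutoff, and the editing-difference cutoff are all hypothetical choices made for the example. It also assumes that reads are already oriented 5' to 3' and that per-site counts of reference and edited bases have been computed upstream.

```python
from typing import Dict, List, Tuple


def trailing_u_length(read: str) -> int:
    """Length of the run of U (or T, as sequenced from cDNA) at the read's 3' end."""
    seq = read.upper()
    return len(seq) - len(seq.rstrip("UT"))


def is_u_tagged(read: str, min_tail: int = 3) -> bool:
    """Call a read U-tagged if its 3' U-run reaches min_tail (hypothetical cutoff)."""
    return trailing_u_length(read) >= min_tail


def editing_fraction(ref_reads: int, edited_reads: int) -> float:
    """Fraction of reads at a site carrying the edited base (e.g. G at an A site)."""
    total = ref_reads + edited_reads
    return edited_reads / total if total > 0 else 0.0


# Per-site read counts: site name -> (reference-base reads, edited-base reads).
SiteCounts = Dict[str, Tuple[int, int]]


def candidate_edit_sites(
    fusion: SiteCounts,
    control: SiteCounts,
    min_cov: int = 20,        # hypothetical coverage cutoff
    min_delta: float = 0.10,  # hypothetical minimum gain in editing fraction
) -> List[str]:
    """Sites edited more in the RBP-editor fusion sample than in the control."""
    hits = []
    for site, (ref, edited) in fusion.items():
        if site not in control:
            continue  # no control information, so the site cannot be judged
        c_ref, c_edited = control[site]
        if ref + edited < min_cov or c_ref + c_edited < min_cov:
            continue  # too few reads in one of the samples
        delta = editing_fraction(ref, edited) - editing_fraction(c_ref, c_edited)
        if delta >= min_delta:
            hits.append(site)
    return sorted(hits)
```

Comparing against a control sample (for example, the editor expressed without the RBP) matters because endogenous editing and sequencing errors also produce A->G mismatches. Likewise, a short 3' U-run can occur in a transcript's own sequence, so real analyses would apply stricter criteria than this sketch.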

Indirect (protein-centric) approaches

Some PL strategies label proteins rather than RNA directly, then identify associated RNAs via crosslinking. For example:

TurboID/BioID tagging - Proximity biotin ligases (TurboID or BioID) fused to compartmental markers robustly biotinylate nearby proteins (within ~10 nm). After streptavidin pulldown, the bound proteome is analyzed by mass spec, and any co-purifying RNAs (crosslinked or stably associated) are sequenced. Ramelow et al. introduced SPARO (Simultaneous Protein And RNA-omics) using a Rosa26-TurboID mouse line. By labeling astrocytes or neurons in vivo, they enriched cell-type proteomes and protein-bound transcriptomes concurrently. SPARO validated that the captured RNA pool represents the expected transcriptome and revealed cases of mRNA-protein discordance in neuroinflammation.

APEX-RIP/CLIP - APEX2 (or HRP) can also label the local proteome, which is then subjected to RNA immunoprecipitation (e.g. CLIP or APEX-RIP). For instance, one can APEX-label ER proteins and then UV-crosslink and immunoprecipitate any RNAs associated with biotinylated ribonucleoproteins. These indirect methods can in principle detect RNAs near a structure without labeling them chemically. However, they depend on crosslinking efficiency and are biased toward protein-bound transcripts. In practice, indirect approaches often show lower sensitivity. For example, simple nuclear APEX-RIP enriched fewer than 200 transcripts, whereas direct Halo-seq of the nucleus enriched >1000.

Comparison of methods

The various PL chemistries differ in range, speed, and throughput (Table 1).

Labeling radius: APEX2 phenoxyl radicals tag within ~10-20 nm; miniSOG singlet oxygen ~70 nm; Halo-DBF radicals ~100 nm.

Time resolution: APEX/H2O2 pulses label in ≲1 min; photo-oxidation (CAP/DBF) typically requires several minutes of illumination; Lantern allows labeling in seconds.

Specificity: Direct RNA tagging yields high compartment specificity. For example, Halo-seq targeted to nucleoli (Fibrillarin-Halo) enriched known nucleolar RNAs (e.g. SNORA68, 7SL) that were depleted in a general nuclear pulldown. CAP-seq labeling of G3BP1 captured well-known SG mRNAs (long, AU-rich, translationally repressed).

Sensitivity: Direct PL methods can recover hundreds to thousands of RNAs per locale. Halo-seq reported robust enrichment of nuclear vs cytosolic transcriptomes, and CAP-seq found hundreds of SG RNAs. In contrast, indirect labeling of proteins yields far fewer RNAs (e.g. <200 nuclear RNAs by APEX-RIP).

In summary, direct PL provides higher sensitivity and spatial precision, whereas indirect (protein-centric) PL is more limited by crosslinking and diffusion of label.
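"Enriched" in the comparisons above means that a transcript is over-represented in the streptavidin pulldown relative to the input RNA. A minimal sketch of that calculation follows, assuming per-gene read counts for one pulldown and one input library. It normalizes each library to counts per million (CPM) and reports a log2 ratio with a pseudocount. The function names, the pseudocount, and the toy counts are all invented for illustration. Published analyses typically rely on replicates and dedicated differential-expression tools rather than a single ratio.

```python
import math
from typing import Dict


def cpm(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts to counts per million for one library."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {gene: n * 1e6 / total for gene, n in counts.items()}


def log2_enrichment(
    pulldown: Dict[str, int],
    input_rna: Dict[str, int],
    pseudocount: float = 1.0,  # hypothetical value; avoids division by zero
) -> Dict[str, float]:
    """log2(pulldown CPM / input CPM) per gene; positive means enriched."""
    p, i = cpm(pulldown), cpm(input_rna)
    genes = set(p) | set(i)
    return {
        g: math.log2((p.get(g, 0.0) + pseudocount) / (i.get(g, 0.0) + pseudocount))
        for g in genes
    }


if __name__ == "__main__":
    # Invented toy counts with placeholder gene names, for illustration only.
    pulldown = {"rna_a": 300, "rna_b": 100, "rna_c": 600}
    input_rna = {"rna_a": 100, "rna_b": 300, "rna_c": 600}
    scores = log2_enrichment(pulldown, input_rna)
    for gene in sorted(scores, key=scores.get, reverse=True):
        print(f"{gene}\t{scores[gene]:+.2f}")
```

With these toy numbers, rna_a scores positive (enriched in the pulldown), rna_b scores negative (depleted), and rna_c sits at zero. The same ratio applied to a compartment-targeted pulldown versus a general nuclear pulldown is what distinguishes compartment-specific RNAs from the broader pool.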

Applications: subcellular transcriptome mapping

Proximity labeling has been applied to chart RNAs in virtually every compartment:

Nucleus and nucleolus: APEX2-NLS or lamin fusions have delineated nuclear-layered transcriptomes. For instance, Fazal et al. found that processed mRNAs exit the nucleus through inner pore regions, and that HuR-dependent RNAs accumulate in the nucleus (APEX-seq). Halo-seq of H2B (chromatin) vs p65 (cytosol) highlighted that nuclear-enriched transcripts frequently contain AU-rich elements, suggesting HuR involvement. Halo-seq directed to fibrillarin (nucleolus) specifically pulled down nucleolar RNAs (snoRNAs, 7SL) that were absent from the general nuclear pool.

Paraspeckles and other nuclear bodies: Using HyPro/POCA, researchers have begun mapping RNAs at specific nuclear condensates. In targeting the NEAT1 lncRNA (paraspeckles), HyPro-seq uncovered a large set of incompletely processed, A->I-edited transcripts that localize at active chromosomal loci immediately adjacent to the NEAT1-paraspeckle core. Similarly, PL of NEAT1-associated proteins (PSPC1, etc.) or the Perinucleolar Compartment (PNC) could reveal their tethered transcripts.

Mitochondria and ER: APEX2 targeted to the mitochondrial outer membrane (OMM) or ER membrane profiles the RNAs on those surfaces. Fazal et al. showed two distinct pathways of mRNA targeting to mitochondria: one localizes nuclear-encoded respiratory-chain mRNAs to the OMM, and another serves mRNAs encoding other mitochondrial proteins. (Organelle PL can also capture imported transcripts or ER-localized mRNAs, though specific examples remain under study.)

Stress granules (SGs) and P-bodies: Membraneless cytoplasmic granules have been difficult to purify intact, but CAP-seq and Lantern have made it possible. Zou et al. used G3BP1-miniSOG to perform CAP-seq on SGs in live cells. They found that SG-enriched mRNAs tend to be long, AU-rich, poorly translated, and many carry m^6A modifications. They also tracked SG transcriptomes during assembly/disassembly. Lantern-accelerated CAP-seq captured early SG recruitment of methylated mRNAs (within minutes). By contrast, proteins heavily targeted to SGs (TIA1, FMRP) could be profiled by PL-MS and their bound RNAs inferred.

Cell surface and in vivo labeling: A recent advance (CAP-CELL) uses Lantern delivered to the plasma membrane to label cell-surface RNAs, enabling 'spatial cell typing'. Meanwhile, in vivo PL strategies like SPARO (astrocytes/neurons in mouse) combine TurboID and sequencing to yield native-cell-type transcriptomes and proteomes. These methods hint at future whole-organism spatial omics.

In all these cases, PL has uncovered locale-specific RNA populations that were inaccessible by fractionation or imaging alone. For example, paraspeckle-targeted HyPro-seq and SG CAP-seq complement conventional CLIP by revealing transcripts that cluster together without necessarily binding a known RBP. By systematically mapping the "local transcriptomes" of organelles and granules, PL is redefining our view of the intracellular RNA landscape.

Performance overview

Labeling radius: Determined by chemistry - e.g. APEX2 phenoxyl radicals label within ∼20 nm; miniSOG singlet O2 labels ∼70 nm; Halo-DBF radicals ∼100 nm. Contact-based methods effectively have zero diffusion beyond the target.

Temporal resolution: APEX/H2O2 can tag in ≲1 minute pulses (fastest). Photo-PL (miniSOG/DBF) typically requires 5-15 minutes of illumination. Lantern enables labeling in a few seconds (sub-minute). Enzymatic PL (TurboID, ADAR) operates on timescales of minutes to hours (limited by enzyme kinetics).

Specificity: High - PL precisely enriches known compartment markers. For instance, Halo-seq of nucleolar fibrillarin enriched SNORA and RNase MRP RNAs (nucleolus-specific) that were depleted in a parallel nuclear control. CAP-seq of SGs enriched long, AU-rich RNAs (e.g. DYNC1H1, NORAD) over mitochondrial or highly translated RNAs.

Sensitivity: Direct RNA labeling yields hundreds to thousands of transcripts per experiment. In practice, APEX-seq and Halo-seq report ~10^3-10^4 enriched RNAs genome-wide per compartment. CAP-seq often recovers a smaller set (10^2-10^3) focused on the granule core. By contrast, indirect approaches recover far fewer RNA hits (e.g. <200 RNAs enriched by nuclear APEX-RIP). Thus, direct PL is generally more sensitive and comprehensive.

Coverage: PL is sequencing-based and transcriptome-wide, so it does not require choosing targets in advance, unlike FISH or other targeted assays. It can profile coding and noncoding RNAs (mRNAs, lncRNAs, sn/snoRNAs, etc.) at once.

Future outlook

Proximity chemistries for RNA are rapidly evolving. The trend is toward faster, more efficient catalysts and broader applicability. Engineered photosensitizers like Lantern exemplify this, enabling previously impossible kinetics. Orthogonal targeting (new HaloTags, split-enzymes) and small-molecule activation may further refine spatial control. Contact-based platforms like POCA/HyPro show that one can exploit standard imaging workflows to profile RNAs and proteins without genetic tags. Integrating PL with single-cell or super-resolution methods could produce even richer maps (e.g. combining APEX-seq with MERFISH for precise sub-organellar localization). Remaining challenges include minimizing oxidative damage, distinguishing direct contacts from mere colocalization, and extending PL to in vivo and clinical samples. Nevertheless, current PL tools have already revealed intricate links between RNA localization, RNA-binding proteins, and cell function. As catalysts and methods improve, spatially resolved transcriptomics will become a routine window into RNA regulation.

Sources: Recent advances in RNA proximity labeling have been documented in primary research: for example, Fazal et al. (2019) on APEX-seq, Engel et al. (2022) on Halo-seq, Zou et al. (2023) on CAP-seq in stress granules, Ren et al. (2025) on Lantern, and Yap et al. (2022) on hybridization-based methods. These studies (among others) provide the mechanistic and performance data summarized above.

Selected References

Atlas of Subcellular RNA Localization Revealed by APEX-Seq. Cell, 2019.
Analysis of subcellular transcriptomes by RNA proximity labeling with Halo-seq. Nucleic Acids Research, 2022.
Halo-seq: an RNA proximity labeling method for the isolation and analysis of subcellular RNA populations. Current Protocols, 2022.
HyPro-seq reveals spatially regulated RNA processing at paraspeckles. Nature, 2022.
Directed evolution of a genetically encoded photocatalyst for temporally resolved proximity labeling of subcellular RNAs and proteins. Preprint, 2025.

The RNA Blog

Friday, June 19, 2026
