RNA expression provides a proxy for cellular activity and is considered a key cellular phenotype. Alterations in a regulatory pathway can very often be linked to a specific gene expression signature. Being able to identify the RNA content of a cell can therefore provide crucial information about cell state. The precise RNA content of a cell depends on this state, as well as on the cell's type, function and age. Estimates of the total RNA content of a typical mammalian cell range between 10 and 30 pg, most of which is transfer RNA (tRNA) and ribosomal RNA (rRNA), both of which are used for protein translation. Messenger RNA (mRNA, which reflects gene expression activity) represents 1 to 5% of total cellular RNA, which corresponds to about 360,000 molecules. A wide range of expression levels is observed: some mRNAs are present at more than 10,000 copies per cell, while other, low-abundance mRNAs are expressed at only 5-15 copies per cell.

An important feature of eukaryotic transcription is that a single gene can give rise to different RNA molecules through the mechanism of alternative splicing. RNA splicing is an essential process that governs many aspects of cellular proliferation, survival, and differentiation. Alternative splicing relies on the inclusion or exclusion of exon cassettes in the RNA molecule to produce mRNA molecules with different sequences and activities. The five major categories of alternative splicing are: cassette exons, alternative 3' splice site selection, alternative 5' splice site selection, mutually exclusive exons, and intron retention.
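The inclusion or exclusion of a cassette exon can be illustrated with a minimal sketch. The exon names and sequences below are invented for illustration and do not correspond to any real gene; the `splice` helper is likewise hypothetical. The point is only that the same set of exons yields mRNAs of different sequence and length depending on which exons are retained.

```python
# Toy exon sequences (invented for illustration, not real gene sequence).
exons = {
    "E1": "AUGGCU",
    "E2": "GAAUUC",  # treated here as the cassette exon
    "E3": "CCGUAA",
}


def splice(exon_order, exon_sequences):
    """Concatenate the chosen exons, in order, into one mRNA sequence."""
    return "".join(exon_sequences[name] for name in exon_order)


# Isoform 1: the cassette exon E2 is included.
isoform_included = splice(["E1", "E2", "E3"], exons)

# Isoform 2: the cassette exon E2 is skipped.
isoform_skipped = splice(["E1", "E3"], exons)

print(isoform_included, len(isoform_included))
print(isoform_skipped, len(isoform_skipped))
```

The other categories could be modelled in the same way, for example by trimming the start or end of an exon string for alternative splice site selection, or by keeping an intron sequence between two exons for intron retention.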

It is commonly accepted that more than 90% of human genes undergo alternative splicing. This is in line with the 229,580 distinct isoforms described in the latest version of Gencode 3 (v35), compared with only 60,656 genes. On average, a human gene contains 8.8 exons, with a mean size of 145 nt. The mean intron length is 3,365 nt, and the 5' and 3' UTRs are 300 and 770 nt, respectively. A gene spans on average about 27 kbp, and encodes an mRNA that contains 1,340 nt of coding sequence, 1,070 nt of untranslated regions and a poly(A) tail.
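As a rough consistency check, the figures quoted above can be combined with simple arithmetic. The sketch below assumes that a gene with n exons has n - 1 introns and naively multiplies averages, so it gives orders of magnitude only; the variable names are illustrative. It yields about 3.8 annotated isoforms per gene and a gene span of roughly 27.5 kbp, in line with the 27 kbp quoted.

```python
# Figures quoted in the text.
n_isoforms = 229_580
n_genes = 60_656
mean_exons_per_gene = 8.8
mean_exon_nt = 145
mean_intron_nt = 3_365
coding_nt = 1_340
utr_nt = 1_070  # 5' and 3' untranslated regions combined

# Average number of annotated isoforms per gene.
isoforms_per_gene = n_isoforms / n_genes

# Naive gene span: exons plus the introns separating them
# (assumption: n exons are separated by n - 1 introns).
exonic_nt = mean_exons_per_gene * mean_exon_nt
intronic_nt = (mean_exons_per_gene - 1) * mean_intron_nt
gene_span_nt = exonic_nt + intronic_nt

# Mature mRNA length, excluding the poly(A) tail.
mrna_nt = coding_nt + utr_nt

print(f"isoforms per gene: {isoforms_per_gene:.2f}")
print(f"estimated gene span: {gene_span_nt / 1000:.1f} kbp")
print(f"intronic share of the gene span: {intronic_nt / gene_span_nt:.0%}")
print(f"mature mRNA length without poly(A): {mrna_nt} nt")
```

The estimate also makes explicit that introns account for the vast majority of the genomic span of an average gene.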

Current single-cell and spatial transcriptomics protocols produce [CELLs x GENEs] matrices that have already enabled the dissection of multiple new biological questions. Some information is still lacking, though. This is the case for alternative splicing, which affects more than 90% of human genes and leads to the production of distinct proteins with different functions from the same gene locus. This is also the case for sequence polymorphisms, which can correspond to single nucleotide polymorphisms (SNPs), gene editing, or fusion events. Integrating this kind of information with gene expression at the single-cell level appears fundamental to better describe and characterize different cellular systems.
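The information lost in a [CELLs x GENEs] matrix can be illustrated with a toy example. The sketch below uses invented counts and hypothetical cell, gene and isoform names; it is not tied to any particular protocol or software. Two cells have identical gene-level counts, yet opposite isoform usage for one gene, a difference that only a [CELLs x ISOFORMs] representation retains.

```python
from collections import defaultdict

# Hypothetical isoform -> gene mapping (illustrative names only).
isoform_to_gene = {
    "GENE_A.iso1": "GENE_A",
    "GENE_A.iso2": "GENE_A",
    "GENE_B.iso1": "GENE_B",
}

# Toy [CELLs x ISOFORMs] counts (invented values).
isoform_counts = {
    "cell_1": {"GENE_A.iso1": 9, "GENE_A.iso2": 1, "GENE_B.iso1": 4},
    "cell_2": {"GENE_A.iso1": 1, "GENE_A.iso2": 9, "GENE_B.iso1": 4},
}


def collapse_to_genes(counts, mapping):
    """Sum isoform counts per gene to obtain a [CELLs x GENEs] table."""
    gene_counts = {}
    for cell, per_isoform in counts.items():
        per_gene = defaultdict(int)
        for isoform, n in per_isoform.items():
            per_gene[mapping[isoform]] += n
        gene_counts[cell] = dict(per_gene)
    return gene_counts


def isoform_fraction(counts, mapping, cell, isoform):
    """Fraction of a gene's counts in one cell that come from one isoform."""
    gene = mapping[isoform]
    total = sum(n for iso, n in counts[cell].items() if mapping[iso] == gene)
    return counts[cell][isoform] / total


gene_counts = collapse_to_genes(isoform_counts, isoform_to_gene)

# Gene-level view: the two cells are indistinguishable.
print(gene_counts["cell_1"] == gene_counts["cell_2"])

# Isoform-level view: usage of GENE_A.iso1 differs between the cells.
for cell in isoform_counts:
    frac = isoform_fraction(isoform_counts, isoform_to_gene, cell, "GENE_A.iso1")
    print(cell, frac)
```

Sequence polymorphisms could in principle be represented in a similar way, as an additional table indexed by cell and variant, alongside the expression table.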