The silent biomarkers of stem cells: Small RNA

I was watching a movie in which the protagonists were a race of human clones made from “stem cells”, trying to defend their lives against the native human race. You might have already heard of Dolly the sheep, the first mammal cloned from an adult cell. Or of the infamous CRISPR babies, whose genes were edited while they were still embryos. Curiosity got the better of me: what are these stem cells, and why are they so popular?

Yes, stem cells have gained quite a bit of popularity, and they deserve it. Stem cells are the primary cells that make us who we are. Each of us begins the journey as a single cell, which divides and gives rise to specialized cell types such as heart or nerve (brain) cells, called differentiated cells. Stem cells also persist in adults, where they help our bodies repair themselves.

Figure-1: The little stem cell (Source:

Wouldn’t it be awesome if we knew exactly what turns a stem cell into a heart cell, a process called cell differentiation? These studies are ongoing (though we’re still as confused as that baby stem cell). So far, researchers have looked to DNA, RNA, and proteins, the molecules of the central dogma, for answers.

Besides these molecules, small RNAs also play a role. These are short RNA molecules that regulate many of our cellular mechanisms. They come in different classes: miRNA, siRNA, etc. Recent discoveries suggest that they also play a role in the development of cells. Small RNA content differs from cell to cell, so identifying and tracking these molecules can help us understand how stem cells differentiate (yes, we can help that little stem cell find its path in life).

Because they are so short, small RNAs were very difficult to sequence, and the methods required a large amount of input material. Understanding their role in individual cells, however, requires sequencing them from single cells. Finally, in 2016, Faridani et al. came up with a technique called Small-seq for sequencing small RNAs in a single cell (1). They validated the technique by sequencing human embryonic stem cells (hESCs) in the naive stage, where they can turn into any cell (the little one); hESCs in the primed stage, where the cell has determined its cell type but hasn’t differentiated (the little one decides what he wants to be); and human embryonic kidney (HEK) cells, whose fate is already determined (the little one grew up).
This is the story of how we can find out whether small RNAs differ between these cell types.

Analyzing Single Cell Small RNA

The small RNA reads of the hESCs and the other cells used in the Small-seq study were downloaded from NCBI GEO (GEO: GSE81287) (1) in FASTQ format. They were loaded into Strand NGS and analyzed.
The first step of the analysis is to check data quality. Pre-alignment QC plots can be viewed from the workflow.

Figure-2: Quality control plot-Read Length Distribution plot of a single sample(SRR3495421)

Since there are several classes of small RNA, their read lengths can differ; this graph shows read lengths ranging from 10 to 34 nt. There are no reads outside this range, which suggests the sample is largely free of contamination.
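For readers who want to see what the read length plot is counting, here is a minimal Python sketch. It is not how Strand NGS computes the plot; it only tallies sequence lengths from a FASTQ stream. The function name and the toy reads are made up for illustration and are not taken from SRR3495421.

```python
from collections import Counter
import io


def read_length_distribution(fastq_handle):
    """Count reads per length in a FASTQ stream (4 lines per record)."""
    lengths = Counter()
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the second line of each record holds the sequence
            lengths[len(line.strip())] += 1
    return lengths


# Toy reads, invented for this example.
toy_sequences = ["ACGTACGTAC", "ACGTACGTACGTACGTACGTAC", "TTGCATGCAA"]
toy_fastq = io.StringIO(
    "".join(
        f"@read{i}\n{seq}\n+\n{'I' * len(seq)}\n"
        for i, seq in enumerate(toy_sequences, 1)
    )
)

dist = read_length_distribution(toy_fastq)

# Reads falling outside the 10-34 range reported for the sample above.
out_of_range = sum(n for length, n in dist.items() if not 10 <= length <= 34)

for length in sorted(dist):
    print(length, dist[length])
print("reads outside 10-34:", out_of_range)
```

On a real file, the same function can be given an open file handle instead of the in-memory toy stream.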

Figure-3: Quality control plot-UMI distribution (%) plot of a single sample(SRR3495421)

To tell original molecules apart from their PCR copies, a Unique Molecular Identifier (UMI), or molecular barcode, is added to each small RNA molecule before PCR. Every PCR-amplified copy carries the UMI of its template sRNA, so reads that share a UMI can be collapsed, which eliminates duplicate reads from the analysis. The above graph shows the UMI distribution.
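The idea behind UMI-based deduplication can be sketched in a few lines of Python. This is a simplified illustration, not the Strand NGS or Small-seq implementation: it assumes the UMI sits at the start of each read, uses a made-up UMI length of 4, and compares raw sequences rather than alignment positions. The function name and the toy reads are hypothetical.

```python
def collapse_umi_duplicates(reads, umi_length):
    """Keep one entry per (UMI, insert sequence) pair, in first-seen order."""
    seen = set()
    unique = []
    for read in reads:
        # Assumption: the UMI is the first `umi_length` bases of the read.
        umi, insert = read[:umi_length], read[umi_length:]
        if (umi, insert) not in seen:
            seen.add((umi, insert))
            unique.append((umi, insert))
    return unique


toy_reads = [
    "AAACGTTGCATG",  # molecule 1
    "AAACGTTGCATG",  # PCR copy of molecule 1 (same UMI, same insert)
    "CCGTGTTGCATG",  # same insert but a different UMI: a separate molecule
    "AAACTTAGGCTA",  # same UMI but a different insert: a separate molecule
]

unique = collapse_umi_duplicates(toy_reads, umi_length=4)
print(len(toy_reads), "reads ->", len(unique), "unique molecules")
```

The second read is dropped as a PCR duplicate, while the third is kept: two molecules with the same sequence but different UMIs were two separate molecules in the cell, which is exactly the information that counting needs.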

Then the reads are deduplicated using their UMIs, aligned to the human reference sequence hg19, quantified, and annotated using Ensembl (e75) annotations. The types of small RNAs found in each sample can be seen in the genic region plots. Small RNAs that are not already annotated are classified as novel.

Figure-4: Genic Region QC plot(Sample SRR3495421)

Small RNAs can perform their silent tasks because of their shape. Every small RNA folds into a distinct shape determined by its sequence, and this can be viewed in the small RNA gene view, along with the distribution of the small RNA in each type of cell.

Figure-5: A-Gene view of small RNA-snRNA (RNU4ATAC), B-Gene view of small RNA- snoRNA (SNORD82), C-Gene view of small RNA-miRNA (hsa-mir-302c), D-Gene view of small RNA-tRNA (tRNA131)

The small RNAs were analyzed for differential expression between HEK, primed, and naive cells using ANOVA. A total of 2554 small RNAs were found to be differentially expressed, and 2125 of these were identified as novel.
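To make the ANOVA step less of a black box, here is a minimal sketch of the textbook one-way ANOVA F statistic for a single small RNA across the three cell types. It is not the Strand NGS implementation, and the expression values are invented for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical expression values of one small RNA in three cells per type.
hek = [1.0, 2.0, 3.0]
primed = [4.0, 5.0, 6.0]
naive = [7.0, 8.0, 9.0]

f_stat = one_way_anova_f([hek, primed, naive])
print("F =", f_stat)
```

A large F means the differences between cell types are large compared with the spread within each cell type. In a real analysis the F statistic is converted to a p-value, and this is repeated for every small RNA.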

Figure-6: Venn Diagram of Differentially expressed genes-2125 novel and 429 known genes

Wouldn’t it be nice if we knew what type of sRNA each one was? The novel small RNAs were classified into sRNA types using an algorithm built into Strand NGS (2). This classification was checked with PCA, a dimensionality reduction technique.

Figure-7: Novel small RNA types

Figure-8: PCA plot of samples used for class prediction

Now, to the main question: can we classify cell type based on small RNA? The small RNAs were clustered by cell type, and it was clear that the cell types could be told apart by their small RNA profiles.

Figure-9: Clustering of entities based on cell types Naive, Primed and HEK cells

So in conclusion, many small RNAs in stem cells are yet to be studied, and they can act as potential biomarkers for identifying stem cell types.
To know more about the methodology, refer to the white paper on this topic on the Strand NGS website:


  1. Faridani, O., Abdullayev, I., Hagemann-Jensen, M., Schell, J., Lanner, F., & Sandberg, R. (2016). Single-cell sequencing of the small-RNA transcriptome. Nature Biotechnology, 34(12), 1264-1266. doi: 10.1038/nbt.3701
  2. Langenberger, D., Bermudez-Santana, C., Stadler, P., & Hoffmann, S. (2010). Identification and Classification of Small RNAs in transcriptome sequence data. Biocomputing 2010, 80-87. doi: 10.1142/9789814295291_0010