The desire to find out what lies beneath the deepest layers has always pushed humans to improve on everything they touch. This curiosity has led to technological advances that were once unimaginable. One of the most important discoveries of all time is DNA, and ever since we have been busy decoding it and learning what it means. To understand DNA we have continually developed new tools, the most important of which is sequencing: reading the As, Ts, Gs and Cs present in DNA and trying to derive their meaning. Since the first whole genome, that of bacteriophage phiX174, was sequenced in 1977, sequencing has evolved rapidly.
Today, sequencing has evolved into next generation sequencing (NGS), which can sequence massive amounts of DNA quickly, at very low cost and with much less effort. Remember sequencing a decade or two ago: the radiolabeling, the big gels, and then reading those long lanes of black bands. Today, NGS parallelizes sequencing and produces huge amounts of data in the form of millions of sequences. This gives researchers the confidence to take up powerful experiments and dig deep into biological processes to answer many of the whys and hows of biology. Making sense of the unprecedented amounts of read data that NGS methods produce has required new tools for sequence analysis. One such tool is Strand NGS, developed by Strand Life Sciences. Strand NGS supports comprehensive analysis of the read data that comes out of DNA, RNA, small RNA, ChIP and Methyl or MeDIP Seq experiments.
Strand NGS has a user-friendly interface for visualizing and analyzing reads and sequencing depth/coverage. One can easily view aligned reads and genetic variations juxtaposed with the annotation of the reference genome, at magnifications ranging from the genome or chromosome level (overall coverage) down to the nucleotide level. Mousing over a base shows information about the read and the base. One can also color read tracks to one's liking. The tool provides various configurable views for presenting the data and investigating it thoroughly, such as the Variant support view, which brings together the test region in tabular format with quality views for easy verification of bases. Events such as SNPs, MNPs, InDels, structural variants, large deletions, insertions, translocations and copy number variants can be visualized in great detail in the genome browser.
Sometimes Strand NGS predicts large structural variants, such as gene fusions or large deletions, that span a long stretch of one chromosome or involve two chromosomes. Such events cannot be verified visually in detail with a standard genome browser, which is suited to visualizing smaller events. The elastic genome browser (eGB) in Strand NGS brings together and expands genomic regions that are far apart, so that one can see what is happening at each of them with detailed read-level information. One can add pins to the elastic band track and then compress or expand portions of a large genomic region by dragging these pins.
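One way to picture the compress-and-expand behaviour is as a piecewise-linear mapping from genomic coordinates to screen positions, with each pin fixing one genomic position to one screen position. The sketch below illustrates that idea only; it is not Strand NGS code, and the function name `genome_to_screen` and the `(genomic_position, screen_x)` pin representation are assumptions made for the example.

```python
from bisect import bisect_right


def genome_to_screen(pos, pins):
    """Map a genomic position to a horizontal screen coordinate.

    `pins` is a hypothetical list of (genomic_position, screen_x) pairs,
    sorted by genomic position, with all genomic positions distinct.
    Between two adjacent pins the mapping is linear, so giving a short
    genomic interval many pixels expands it and giving a long interval
    few pixels compresses it.
    """
    genomic = [g for g, _ in pins]
    if not genomic[0] <= pos <= genomic[-1]:
        raise ValueError("position outside the displayed range")
    # Find the segment whose left pin is at or before pos; the clamp keeps
    # the right-most position inside the last segment.
    i = min(bisect_right(genomic, pos) - 1, len(pins) - 2)
    (g0, x0), (g1, x1) = pins[i], pins[i + 1]
    return x0 + (pos - g0) * (x1 - x0) / (g1 - g0)


# Two 1 kb regions get 400 px each; the 999 kb gap between them gets 200 px.
pins = [(0, 0), (1_000, 400), (1_000_000, 600), (1_001_000, 1_000)]
print(genome_to_screen(500, pins))        # middle of the first expanded region
print(genome_to_screen(500_500, pins))    # middle of the compressed gap
print(genome_to_screen(1_000_500, pins))  # middle of the second expanded region
```

Dragging a pin corresponds to changing its screen coordinate while its genomic position stays fixed, which changes the resolution of the two segments on either side of it.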
The eGB view is launched automatically when the Navigate in Genome Browser functionality is used on the region lists resulting from Gene Fusion Detection and Find Novel Spliced Junctions. The user can exit the eGB by deleting all the pins, by merging all the regions from the Regions of Interest tab, or by entering just one genomic region in the search bar (e.g., chr10:65241620-65242359;).
The eGB displays coverage, the reads from the file and an hg19 transcript track. The following is also possible in the eGB:
- Pins can be added, moved and removed. A pin can be added at the desired genomic region on the track at the top of the browser, which shows the displayed genomic size. This creates a pin pair, and the line connecting the pins shows the size of the genomic region it represents, from the left-hand side.
- Any number of pins can be added to a single view, and all of them are listed in the Regions of Interest tab on the right side of the eGB.
- A pin can be moved by left-clicking it and dragging the mouse, which expands the desired region. A pin can be removed by clicking on it and deleting it, or by deleting it from the Regions of Interest tab. Multiple pins can also be merged from the Regions of Interest tab.
- The first and last pin-pairs are immovable and cannot be removed.
- The eGB can also be launched by entering the genomic coordinates of the desired regions in the GB toolbar, for example chr1:45241617-45242351:0.5; chr10:65241620-65242359:0.5;
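The region strings shown above follow a simple pattern: chromosome, start and end, an optional trailing number, and a semicolon after each region. The sketch below shows how such a string could be split into its parts, for instance when preparing region lists in a script. It is an illustration only: the `Region` type and `parse_regions` function are hypothetical, and the meaning of the trailing number (0.5 in the example) is not stated in this article, so treating it as the share of the view width given to the region is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chrom: str
    start: int
    end: int
    # Assumed meaning: share of the view width given to this region.
    fraction: Optional[float]


# chrom:start-end with an optional ":number" suffix.
_REGION = re.compile(r"^(\w+):(\d+)-(\d+)(?::(\d+(?:\.\d+)?))?$")


def parse_regions(text):
    """Parse a semicolon-separated list of regions into Region tuples."""
    regions = []
    for chunk in text.split(";"):
        chunk = "".join(chunk.split())  # drop any whitespace inside the region
        if not chunk:
            continue  # skip the empty piece after a trailing semicolon
        match = _REGION.match(chunk)
        if match is None:
            raise ValueError(f"unrecognised region: {chunk!r}")
        chrom, start, end, fraction = match.groups()
        regions.append(
            Region(chrom, int(start), int(end),
                   float(fraction) if fraction is not None else None)
        )
    return regions


for region in parse_regions(
    "chr1:45241617-45242351:0.5; chr10:65241620-65242359:0.5;"
):
    print(region)
```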
Examples of using the eGB to visualize and verify events such as novel splice junctions, gene fusions and DNA structural variants are discussed below.
1. Novel splice junction: Novel splice junctions can be discovered in Strand NGS and verified in the eGB. The figure below shows a novel splice junction discovered by Strand NGS in the ISYNA1 gene, viewed in the genome browser (Fig. 1). The event was verified using the eGB, where one can easily view the spliced reads that span the novel splice junction because the bookmark compresses the intronic regions and expands the exonic regions of interest (Fig. 2).
The exact same genomic region is viewed in the eGB, where the two distant regions containing exons 4 and 5 are shown in great detail. They are stretched to occupy a larger share of the screen width, and the uninteresting upstream and downstream regions are compressed. The elastic band is located at the top of the browser and contains the pin controls that govern the resolution at which different genomic regions are displayed in the view. In this example, the eGB verifies that the novel splice junction prediction is accurate, because a part of exon 5 is retained, as shown in the novel detection report track.
Fig. 2. An elastic view of the ISYNA1 gene in which only exons 4 and 5 are shown in great detail.
2. Gene fusion: To verify a gene fusion event, the bookmark compresses the genomic region between the two genes and expands the exons where the reads supporting the fusion align.
Fig. 3. Gene fusion discovered by Strand NGS between the ALK and EML4 genes.
3. DNA structural variants: Paired reads that map in the wrong orientation, or too near to or too far from each other, indicate the presence of structural variants such as insertions, deletions and inversions in the data. Bookmarks that display the neighborhood of the two breakpoints in more detail can help verify the SV predictions.
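The read-pair signals described above can be sketched as a small classifier. This is a conventional interpretation of discordant pairs for a standard forward-reverse paired-end library, not the algorithm Strand NGS uses. The function name `classify_pair`, the labels and the `min_span`/`max_span` thresholds are all hypothetical; in practice the expected span range is estimated from the data, and a call is supported by many pairs rather than one.

```python
def classify_pair(left_strand, right_strand, span,
                  min_span=200, max_span=600, same_chrom=True):
    """Label one read pair by the structural variant it would hint at.

    left_strand / right_strand: '+' or '-' for the left-most and
    right-most read of the pair on the reference.
    span: distance covered by the pair on the reference, in bases.
    min_span / max_span: hypothetical bounds of the expected span.
    """
    if not same_chrom:
        return "inter-chromosomal (translocation candidate)"
    if left_strand == right_strand:
        # Both reads on the same strand: wrong orientation.
        return "inversion candidate"
    if left_strand == "-" and right_strand == "+":
        # Reads point away from each other: also a wrong orientation.
        return "everted pair (duplication candidate)"
    if span > max_span:
        # Mates map too far apart: sequence is missing in the sample.
        return "deletion candidate"
    if span < min_span:
        # Mates map too near: extra sequence is present in the sample.
        return "insertion candidate"
    return "concordant"


print(classify_pair("+", "-", 350))
print(classify_pair("+", "-", 5_000))
print(classify_pair("+", "+", 350))
```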