Splicing Impact SpliceVarDB Track Settings
 
SpliceVarDB: Experimentally validated splicing variants

Track collection: Splicing Impact Prediction Scores and Databases

Source data version: Nov 2024
Assembly: Human Dec. 2013 (GRCh38/hg38)
Data last updated at UCSC: 2024-11-12 09:40:37


Note: Released Mar. 5, 2025

Description

The "Splicing Impact" container track contains tracks showing the predicted or validated effect of variants close to splice sites.

AbSplice

AbSplice is a method that predicts aberrant splicing across human tissues, as described in Wagner, Çelik et al., 2023. This track displays precomputed AbSplice scores for all possible single-nucleotide variants genome-wide. The scores represent the probability that a given variant causes aberrant splicing in a given tissue. AbSplice scores can be computed from VCF files and are based on quantitative tissue-specific splice site annotations (SpliceMaps). While SpliceMaps can be generated for any tissue of interest from a cohort of RNA-seq samples, this track includes 49 tissues available from the Genotype-Tissue Expression (GTEx) dataset.

SpliceAI

SpliceAI is an open-source deep learning splicing prediction algorithm that can predict splicing alterations caused by DNA variations. Such variants may activate nearby cryptic splice sites, leading to abnormal transcript isoforms. SpliceAI was developed at Illumina; a lookup tool is provided by the Broad Institute.

Why are some variants not scored by SpliceAI?

SpliceAI only annotates variants within genes defined by the gene annotation file. Additionally, SpliceAI does not annotate variants that are close to chromosome ends (5 kb on either side), that are deletions longer than twice the input parameter -D, or that are inconsistent with the reference fasta file.

What are the differences between masked and unmasked tracks?

The unmasked tracks include splicing changes corresponding to strengthening annotated splice sites and weakening unannotated splice sites, which are typically much less pathogenic than weakening annotated splice sites and strengthening unannotated splice sites. The delta scores of such splicing changes are set to 0 in the masked files. We recommend using the unmasked tracks for alternative splicing analysis and masked tracks for variant interpretation.
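
The masking rule described above can be sketched in a few lines of Python. This is an illustration only, not SpliceAI's own code: the function name mask_delta_score and the site_is_annotated flag are hypothetical, and the sketch assumes you already know whether the affected position is an annotated splice site.

```python
def mask_delta_score(change: str, site_is_annotated: bool, score: float) -> float:
    """Return a delta score as it would appear in a masked file (sketch).

    change: "gain" (the site is strengthened) or "loss" (the site is weakened).
    site_is_annotated: hypothetical flag, True if the position is an
        annotated splice site in the gene annotation.
    """
    if change not in ("gain", "loss"):
        raise ValueError(f"unknown change type: {change!r}")
    strengthens_annotated = change == "gain" and site_is_annotated
    weakens_unannotated = change == "loss" and not site_is_annotated
    # These two kinds of change are the ones set to 0 in the masked files.
    if strengthens_annotated or weakens_unannotated:
        return 0.0
    return score
```

Under these assumptions, a loss at an annotated site or a gain at an unannotated site keeps its score, which matches the cases the text describes as more relevant for variant interpretation.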

SpliceVarDB

SpliceVarDB is an online database consolidating over 50,000 variants assayed for their effects on splicing in over 8,000 human genes. The authors evaluated over 500 published data sources and established a spliceogenicity scale to standardize, harmonize, and consolidate variant validation data generated by a range of experimental protocols. Genes and variant locations were obtained using GENCODE v44. Splice regions were calculated as specific distances from the closest canonical exon, including 5' and 3' untranslated regions (UTRs). The database is available at splicevardb.org.

Display Conventions and Configuration

AbSplice

The AbSplice score estimates the probability that some form of aberrant splicing takes place in a given tissue. The authors suggest three cutoffs, which are represented by color in the track.

  • High (red) - An AbSplice score over 0.2 indicates a high likelihood of aberrant splicing in at least one tissue.
  • Medium (orange) - A score between 0.05 and 0.2 indicates a medium likelihood.
  • Low (blue) - A score between 0.01 and 0.05 indicates a low likelihood.
  • Scores below 0.01 are not displayed.
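
For users who post-process the scores themselves, the cutoffs above can be expressed as a small helper. This is a minimal sketch: the function name absplice_category is hypothetical, and the handling of scores that fall exactly on a boundary (0.05 or 0.2) is an assumption, since the list does not say which side they belong to.

```python
from typing import Optional


def absplice_category(score: float) -> Optional[str]:
    """Map an AbSplice score to the track's color category (sketch).

    Returns None for scores below 0.01, which the track does not display.
    Boundary handling at exactly 0.05 and 0.2 is an assumption.
    """
    if score > 0.2:
        return "high"      # red
    if score >= 0.05:
        return "medium"    # orange
    if score >= 0.01:
        return "low"       # blue
    return None            # not displayed
```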

Mouseover on items shows the gene name, maximum score, and tissues that had this score. Clicking on any item brings up a table with scores for all 49 GTEx tissues.

SpliceAI

Variants are colored according to the splicing impact categories of Walker et al., 2023:

  • Predicted impact on splicing: Score >= 0.2
  • Not informative: Score < 0.2 and > 0.1
  • No impact on splicing: Score <= 0.1
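
The three categories above translate directly into a threshold check. The sketch below is illustrative only; the function name spliceai_impact and the returned labels are hypothetical, while the thresholds are the ones listed.

```python
def spliceai_impact(score: float) -> str:
    """Map a SpliceAI score to the impact category used for coloring (sketch)."""
    if score >= 0.2:
        return "predicted impact on splicing"
    if score > 0.1:
        return "not informative"
    return "no impact on splicing"
```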

Mouseover on items shows the variant, gene name, type of change (donor gain/loss, acceptor gain/loss), location of the affected cryptic splice site, and SpliceAI score. Clicking on any item brings up a table with this information.

The scores range from 0 to 1 and can be interpreted as the probability of the variant being splice-altering. In the paper, a detailed characterization is provided for 0.2 (high recall), 0.5 (recommended), and 0.8 (high precision) cutoffs.

SpliceVarDB

According to the strength of their supporting evidence, variants were classified as "splice-altering" (~25%), "not splice-altering" (~25%), or "low-frequency splice-altering" (~50%); the last category corresponds to weak or indeterminate evidence of spliceogenicity. 55% of the splice-altering variants in SpliceVarDB are outside the canonical splice sites (5.6% are deep intronic). The data is shown as lollipop plots. Clicking a lollipop opens a details page with a link to the full record at SpliceVarDB.

The classification thresholds primarily follow those established by the original study. However, most studies only defined criteria for splice-altering variants and did not define criteria for variants that resulted in normal splicing. The authors implemented stringent thresholds to define the normal category and ensure a high-quality set of control variants. Variants that did not meet these criteria, and so fell between the normal and splice-altering classifications, were placed into the low-frequency splice-altering category, which covers a wide range of sub-optimal scores. When a variant was validated multiple times and at least one validation returned splice-altering while another returned normal, the "conflicting" category was applied.
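
The rule for the "conflicting" category can be sketched as a simple check over the calls from repeated validations of one variant. This is not the SpliceVarDB authors' code: the function name is_conflicting and the string labels are hypothetical, and the sketch deliberately says nothing about how other combinations of calls are consolidated, since the text does not describe that.

```python
from typing import Iterable


def is_conflicting(calls: Iterable[str]) -> bool:
    """True if one variant has both a splice-altering and a normal call.

    calls: per-validation labels for a single variant (hypothetical strings
    "splice-altering", "normal", "low-frequency").
    """
    seen = set(calls)
    return "splice-altering" in seen and "normal" in seen
```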

The lollipop plots are color-coded based on the score value, which corresponds to the following classifications:

  • 3 - Splice-altering
  • 2 - Low-frequency
  • 1 - Normal
  • 0 - Conflicting
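
When working with the downloaded data, the score column can be decoded with a lookup table such as the one below. The mapping is taken from the list above; the names SPLICEVARDB_CLASSES and splicevardb_class are hypothetical.

```python
# Score value -> classification, as listed for the lollipop colors.
SPLICEVARDB_CLASSES = {
    3: "Splice-altering",
    2: "Low-frequency",
    1: "Normal",
    0: "Conflicting",
}


def splicevardb_class(score: int) -> str:
    """Return the classification for a SpliceVarDB lollipop score (sketch)."""
    try:
        return SPLICEVARDB_CLASSES[score]
    except KeyError:
        raise ValueError(f"unexpected SpliceVarDB score: {score!r}") from None
```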

Methods

AbSplice

Data was converted from the files (AbSplice_DNA_hg38_snvs_high_scores.zip) provided by the authors at zenodo.org. Files in the score_cutoff=0.01 directory were concatenated. To convert the data to bigBed format, scores and their tissues were selected from the AbSplice_DNA fields, and maximum scores were then calculated using a custom Python script, which can be found in the makeDoc in our GitHub repository.
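
The maximum-score step can be illustrated as follows. This is not the script from the makeDoc, only a minimal sketch of the idea: given per-tissue scores for one variant, it returns the maximum score and the tissues that reach it, which is the information shown on mouseover. The function name and the input layout (a dict from tissue name to score) are assumptions.

```python
from typing import Dict, List, Tuple


def max_score_and_tissues(tissue_scores: Dict[str, float]) -> Tuple[float, List[str]]:
    """Return the maximum AbSplice score and the tissues that have it (sketch)."""
    if not tissue_scores:
        raise ValueError("no tissue scores given")
    top = max(tissue_scores.values())
    # Several tissues can share the maximum; report all of them, sorted by name.
    tissues = sorted(t for t, s in tissue_scores.items() if s == top)
    return top, tissues
```

For example, max_score_and_tissues({"Liver": 0.3, "Lung": 0.1, "Testis": 0.3}) returns the score 0.3 together with the two tissues Liver and Testis.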

SpliceAI

The data were downloaded from Illumina. The SpliceAI scores are represented in the VCF INFO field, for example:

  SpliceAI=G|OR4F5|0.01|0.00|0.00|0.00|-32|49|-40|-31

Here, the pipe-separated fields contain:

  • ALT allele
  • Gene name
  • Acceptor gain score
  • Acceptor loss score
  • Donor gain score
  • Donor loss score
  • Relative location of affected cryptic acceptor
  • Relative location of affected acceptor
  • Relative location of affected cryptic donor
  • Relative location of affected donor

Since most of the values are 0 or almost 0, we selected only those variants with a score equal to or greater than 0.02.
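
The field layout above can be parsed with a few lines of Python. This is a minimal sketch rather than the makeDoc code: the names SpliceAIRecord, parse_spliceai, and passes_cutoff are hypothetical, and applying the 0.02 cutoff to the maximum of the four gain/loss scores is an assumption about how the filter was defined. It handles a single annotation only, not comma-separated multi-gene entries or missing values.

```python
from typing import NamedTuple


class SpliceAIRecord(NamedTuple):
    alt: str
    gene: str
    acceptor_gain: float
    acceptor_loss: float
    donor_gain: float
    donor_loss: float
    pos_acceptor_gain: int
    pos_acceptor_loss: int
    pos_donor_gain: int
    pos_donor_loss: int


def parse_spliceai(value: str) -> SpliceAIRecord:
    """Parse one SpliceAI INFO annotation into named fields (sketch)."""
    if value.startswith("SpliceAI="):
        value = value[len("SpliceAI="):]
    parts = value.split("|")
    if len(parts) != 10:
        raise ValueError(f"expected 10 pipe-separated fields, got {len(parts)}")
    scores = [float(x) for x in parts[2:6]]      # four gain/loss scores
    positions = [int(x) for x in parts[6:10]]    # four relative locations
    return SpliceAIRecord(parts[0], parts[1], *scores, *positions)


def passes_cutoff(rec: SpliceAIRecord, cutoff: float = 0.02) -> bool:
    """Assumed filter: keep a variant if its largest score reaches the cutoff."""
    top = max(rec.acceptor_gain, rec.acceptor_loss, rec.donor_gain, rec.donor_loss)
    return top >= cutoff
```

Under that assumption, the example annotation shown earlier (largest score 0.01) would not pass the 0.02 cutoff.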

The complete processing of this track can be found in the makeDoc.

SpliceVarDB

The data was converted by Patricia Sullivan from SpliceVarDB to bigLolly format, and the UCSC Browser staff downloaded it for display.

Data Access

Precomputed AbSplice-DNA scores in all 49 GTEx tissues are available at Zenodo.

License

The SpliceAI data is not available for download from the Genome Browser. The raw data can be found directly on Illumina. FOR ACADEMIC AND NOT-FOR-PROFIT RESEARCH USE ONLY. The SpliceAI scores are made available by Illumina only for academic or not-for-profit research only. By accessing the SpliceAI data, you acknowledge and agree that you may only use this data for your own personal academic or not-for-profit research only, and not for any other purposes. You may not use this data for any for-profit, clinical, or other commercial purpose without obtaining a commercial license from Illumina, Inc.

The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the data may be queried from our REST API.

For automated download and analysis, the genome annotation is stored in a bigBed file that can be downloaded from our download server. Individual regions or the whole genome annotation can be obtained using our tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system. Instructions for downloading source code and binaries can be found here. The tool can also be used to obtain only features within a given range, e.g.

  bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg19/splicevardb/SVADB.bb -chrom=chr21 -start=0 -end=100000000 stdout
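
For scripted use, the bigBedToBed call can be wrapped in Python. The sketch below assumes that the bigBedToBed binary is installed and on your PATH; the helper names bigbedtobed_cmd and fetch_bed_lines are hypothetical. It only uses the options that appear in the example command (-chrom, -start, -end).

```python
import subprocess
from typing import List, Optional


def bigbedtobed_cmd(url: str, chrom: Optional[str] = None,
                    start: Optional[int] = None, end: Optional[int] = None,
                    out: str = "stdout") -> List[str]:
    """Build the argument list for a bigBedToBed call (sketch)."""
    cmd = ["bigBedToBed", url]
    if chrom is not None:
        cmd.append(f"-chrom={chrom}")
    if start is not None:
        cmd.append(f"-start={start}")
    if end is not None:
        cmd.append(f"-end={end}")
    cmd.append(out)
    return cmd


def fetch_bed_lines(url: str, chrom: str, start: int, end: int) -> List[str]:
    """Run bigBedToBed and return its BED output as a list of lines.

    Requires the bigBedToBed binary; raises CalledProcessError on failure.
    """
    result = subprocess.run(bigbedtobed_cmd(url, chrom, start, end),
                            check=True, capture_output=True, text=True)
    return result.stdout.splitlines()
```

Passing the arguments as a list avoids shell quoting issues, and each returned line is a tab-separated BED record that can be split further as needed.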

Credits

Thanks to Nils Wagner for helpful comments and suggestions for the AbSplice track.

Thanks to the SpliceVarDB team for converting the data into our data formats.

References

Jaganathan K, Kyriazopoulou Panagiotopoulou S, McRae JF, Darbandi SF, Knowles D, Li YI, Kosmicki JA, Arbelaez J, Cui W, Schwartz GB et al. Predicting Splicing from Primary Sequence with Deep Learning. Cell. 2019 Jan 24;176(3):535-548.e24. PMID: 30661751

Sullivan PJ, Quinn JMW, Wu W, Pinese M, Cowley MJ. SpliceVarDB: A comprehensive database of experimentally validated human splicing variants. Am J Hum Genet. 2024 Oct 3;111(10):2164-2175. PMID: 39226898; PMC: PMC11480807

Wagner N, Çelik MH, Hölzlwimmer FR, Mertes C, Prokisch H, Yépez VA, Gagneur J. Aberrant splicing prediction across human tissues. Nat Genet. 2023 May;55(5):861-870. PMID: 37142848

Walker LC, Hoya M, Wiggins GAR, Lindy A, Vincent LM, Parsons MT, Canson DM, Bis-Brewer D, Cass A, Tchourbanov A et al. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI Splicing Subgroup. Am J Hum Genet. 2023 Jul 6;110(7):1046-1067. PMID: 37352859; PMC: PMC10357475