Description
The
NIH Genotype-Tissue Expression (GTEx) project
determined genetic variation and gene expression in 52 tissues and 2 cell lines
using RNA-seq data (V8, August 2019), on 17,382 samples from 948 adults.
This track focuses on the gene expression part. It shows read coverage from a
single sample per tissue, selected for high quality and high read depth.
The data are summarized as one number per base pair: the number of sequencing
reads that cover that position. The plot shows whether a given exon is
transcribed primarily in certain tissues and whether transcription is
uniform over the length of a single exon.
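To make the "one number per base pair" summary concrete, the following is a minimal sketch of how per-base coverage can be counted from read intervals. It is not the GTEx pipeline: the function name per_base_coverage and the toy reads are hypothetical, and each read is treated as one contiguous half-open interval, which ignores the gaps that spliced RNA-seq reads have over introns.

```python
from typing import List, Tuple


def per_base_coverage(
    reads: List[Tuple[int, int]], region_start: int, region_end: int
) -> List[int]:
    """Count how many reads cover each base in [region_start, region_end).

    Each read is a half-open (start, end) interval on the same chromosome.
    Illustrative simplification: reads are assumed to be contiguous.
    """
    length = region_end - region_start
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        # Clip the read to the requested region.
        s = max(start, region_start) - region_start
        e = min(end, region_end) - region_start
        if s < e:
            diff[s] += 1
            diff[e] -= 1

    # A running sum of the difference array gives the depth at each base.
    coverage = []
    depth = 0
    for delta in diff[:length]:
        depth += delta
        coverage.append(depth)
    return coverage


# Three toy reads over a 10-bp region (hypothetical coordinates).
reads = [(100, 105), (102, 108), (104, 110)]
print(per_base_coverage(reads, 100, 110))
# → [1, 1, 2, 2, 3, 2, 2, 2, 1, 1]
```

A profile that stays roughly flat across an exon suggests uniform coverage, while a ramp or a sharp drop within the exon suggests otherwise.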
Display Conventions
This track follows the display conventions for composite
"wiggle" tracks. The subtracks, one per tissue, of this track
may be configured in a variety of ways to highlight different aspects of the
displayed data. The graphical configuration options are shown at the top of
the track description page, followed by a list of subtracks. To display only
selected subtracks, uncheck the boxes next to the tracks you wish to hide.
For more information about the graphical configuration options, click the
Graph
configuration help link.
Tissue colors were assigned to conform to the GTEx Consortium publication conventions.
In Dense mode, the darkness of the grayscale rectangle displayed for the gene reflects the absolute
read count.
Methods
For background information about GTEx sample selection, see our
GTEx gene expression
track. In short, samples were sequenced with the Illumina TruSeq protocol
on unstranded polyA+ libraries to obtain 76-bp paired-end reads with
HiSeq 2000 and 2500 machines.
Sequence reads were aligned to the hg38/GRCh38 human genome using STAR v2.5.3a
and the GENCODE 26 transcriptome.
The alignment pipeline is available
here.
For further method details, see the
GTEx Portal Documentation page.
To obtain read coverage, the GTEx Laboratory, Data Analysis and Coordinating
Center (LDACC) at the Broad Institute selected a single, high-quality
representative sample for each tissue type, since aggregated tracks may
obscure certain features or even introduce artifacts (e.g. intronic
coverage). For each tissue, the selected sample is the one with the highest
RNA integrity number (RIN) among samples with high coverage (>80M reads) and
a high exonic rate (>85%).
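The selection rule described above can be sketched as a filter followed by a maximum. This is only an illustration under stated assumptions: the Sample record, its field names, and the example values are hypothetical and do not come from the LDACC code, and the text does not say how ties in RIN are broken (here the first eligible sample with the top RIN wins).

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the description above.
MIN_READS = 80_000_000
MIN_EXONIC_RATE = 0.85


@dataclass
class Sample:
    """Hypothetical per-sample QC record; field names are assumptions."""
    sample_id: str
    rin: float
    total_reads: int
    exonic_rate: float


def select_representative(samples: List[Sample]) -> Optional[Sample]:
    """Return the eligible sample with the highest RIN, or None if none qualify."""
    eligible = [
        s for s in samples
        if s.total_reads > MIN_READS and s.exonic_rate > MIN_EXONIC_RATE
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.rin)


# Made-up QC values for one tissue.
tissue_samples = [
    Sample("sample-A", rin=9.1, total_reads=60_000_000, exonic_rate=0.90),
    Sample("sample-B", rin=8.4, total_reads=95_000_000, exonic_rate=0.88),
    Sample("sample-C", rin=8.9, total_reads=120_000_000, exonic_rate=0.80),
    Sample("sample-D", rin=7.9, total_reads=110_000_000, exonic_rate=0.91),
]
chosen = select_representative(tissue_samples)
print(chosen.sample_id if chosen else "no eligible sample")
# → sample-B
```

In this toy data, sample-A has the best RIN but too few reads and sample-C fails the exonic rate threshold, so sample-B is chosen.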
The alignment-to-coverage pipeline is available from Github:
Python script,
Docker file and
Pipeline WDL description.
To see the exact GTEx sample that was used for each tissue,
click the "Schema" link on the track configuration page (above); the filename
under "bigDataUrl" includes the sample identifier.
Subject and Sample Characteristics
The scientific goal of the GTEx project required that the donors and their biospecimens
present with no evidence of disease.
The tissue types collected were chosen based on their clinical significance, logistical
feasibility and their relevance to the scientific goal of the project and the
research community.
Summary plots of GTEx sample characteristics are available at the
GTEx Portal Tissue Summary page.
Data Access
The raw data for the GTEx Read Coverage track can be accessed interactively through the
Table Browser.
For automated analysis and downloads, the track data files can be downloaded from
our downloads server
or the JSON API.
Individual regions or the whole genome annotation can be accessed as text using our utility
bigBedToBed. Instructions for downloading the utility can be found
here.
That utility can also be used to obtain features within a given range, e.g.
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/gtex/gtexGeneV8.bb -chrom=chr21 -start=0 -end=100000000 stdout
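For scripted use, the range query shown above can be assembled and run from Python. This is a minimal sketch, assuming the utility has been downloaded and is on the PATH. The helper names build_range_query and run_range_query are hypothetical, and only the options that appear in the example above (-chrom, -start, -end, and stdout as the output target) are used.

```python
import subprocess
from typing import List


def build_range_query(
    url: str, chrom: str, start: int, end: int, tool: str = "bigBedToBed"
) -> List[str]:
    """Build the argument list for the range query shown in the text."""
    return [
        tool,
        url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",  # write the extracted features to standard output
    ]


def run_range_query(argv: List[str]) -> List[str]:
    """Run the command and return its output lines.

    Assumes the utility is installed; raises CalledProcessError on failure.
    """
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout.splitlines()


# Same file and range as the example command above.
argv = build_range_query(
    "http://hgdownload.soe.ucsc.edu/gbdb/hg38/gtex/gtexGeneV8.bb",
    "chr21",
    0,
    100000000,
)
print(" ".join(argv))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the URL or chromosome name comes from user input.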
Data can also be obtained directly from GTEx at the following link:
https://gtexportal.org/home/datasets
Credits
Statistical analysis and data interpretation were performed by The GTEx Consortium Analysis
Working Group.
Data was provided by the GTEx LDACC at The Broad Institute of MIT and Harvard.
References
GTEx Consortium.
The GTEx Consortium atlas of genetic regulatory effects across human tissues.
Science. 2020 Sep 11;369(6509):1318-1330.
PMID: 32913098; PMC: PMC7737656
GTEx Consortium.
The Genotype-Tissue Expression (GTEx) project.
Nat Genet. 2013 Jun;45(6):580-5.
PMID: 23715323; PMC: PMC4010069
Carithers LJ, Ardlie K, Barcus M, Branton PA, Britton A, Buia SA, Compton CC, DeLuca DS,
Peter-Demchok J, Gelfand ET et al.
A Novel Approach to High-Quality Postmortem Tissue Procurement: The GTEx Project.
Biopreserv Biobank. 2015 Oct;13(5):311-9.
PMID: 26484571; PMC: PMC4675181
Melé M, Ferreira PG, Reverter F, DeLuca DS, Monlong J, Sammeth M, Young TR, Goldmann JM,
Pervouchine DD, Sullivan TJ et al.
Human genomics. The human transcriptome across tissues and individuals.
Science. 2015 May 8;348(6235):660-5.
PMID: 25954002; PMC: PMC4547472
DeLuca DS, Levin JZ, Sivachenko A, Fennell T, Nazaire MD, Williams C, Reich M, Winckler W, Getz G.
RNA-SeQC: RNA-seq metrics for quality control and process optimization.
Bioinformatics. 2012 Jun 1;28(11):1530-2.
PMID: 22539670; PMC: PMC3356847