Description
The FANTOM5 track shows transcription start sites (TSSs) mapped by single-molecule sequencing,
together with their usage in primary cells, cell lines, and tissues, giving a comprehensive
overview of gene expression across the human body.
Display Conventions and Configuration
Items in this track are colored according to their strand orientation. Blue
indicates alignment to the negative strand, and red indicates
alignment to the positive strand.
Methods
Protocol
Individual biological states are profiled by HeliScopeCAGE, a variation of the CAGE
(Cap Analysis Gene Expression) protocol based on a single-molecule sequencer. The standard protocol,
which requires 5 µg of total RNA as starting material, is referred to as hCAGE, and a version
optimized for lower quantities (~100 ng) is referred to as LQhCAGE
(Kanamori-Katayama et al. 2011).
Samples
Transcription start sites (TSSs) and their usage were mapped in human and mouse primary cells,
cell lines, and tissues to produce a comprehensive overview of mammalian gene expression across the
human body. The 5′ ends of the mapped CAGE reads are counted at single-base-pair resolution
(CTSS, CAGE tag starting sites) on the genomic coordinates, and these counts represent TSS activity
in the sample. Individual samples shown in the "TSS activity" tracks are grouped as follows.
- Primary cell
- Tissue
- Cell Line
- Time course
- Fractionation
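The CTSS counting described above can be illustrated with a minimal Python sketch. It assumes reads are given as BED-style tuples (0-based start, exclusive end); that input layout and the function name `ctss_counts` are illustrative assumptions, not part of the FANTOM5 pipeline.

```python
from collections import Counter


def ctss_counts(reads):
    """Count read 5' ends per (chrom, position, strand).

    `reads` is an iterable of (chrom, start, end, strand) tuples in
    BED-style coordinates (0-based start, exclusive end). This layout is
    an assumption made for the example.
    """
    counts = Counter()
    for chrom, start, end, strand in reads:
        # On the + strand the 5' end is the first base of the interval;
        # on the - strand it is the last base.
        five_prime = start if strand == "+" else end - 1
        counts[(chrom, five_prime, strand)] += 1
    return counts


reads = [
    ("chr1", 100, 127, "+"),
    ("chr1", 100, 125, "+"),
    ("chr1", 80, 101, "-"),
]
ctss = ctss_counts(reads)
```

Here two plus-strand reads share a 5′ end at position 100, so that CTSS receives a count of 2, while the minus-strand read contributes a separate CTSS at the same position on the opposite strand.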
TSS peaks and enhancers
TSS (CAGE) peaks across the panel of biological states (samples) are identified by DPI
(decomposition-based peak identification, Forrest et al. 2014), where each peak consists of
neighboring and related TSSs. The peaks are used as anchors to define promoters and as units of
promoter-level expression analysis. Two subsets of the peaks are defined based on read-count
evidence, depending on the scope of subsequent analyses. The first subset (referred to as the
robust set of peaks, thresholded for expression analysis) is shown as TSS peaks. They
are named "p#@GENE_SYMBOL" if associated with the 5′ end of a known gene, or
"p@CHROM:START..END,STRAND" otherwise. The CAGE data is also used to produce an atlas of active, in
vivo-transcribed enhancers (Andersson et al. 2014). The summary tracks comprise the TSS (CAGE)
peaks, the enhancers, and summary profiles of TSS activities (total and maximum values), organized
as follows.
- TSS (CAGE) peaks
- Enhancers
- TSS summary profiles
- - Total counts and TPM (tags per million) in all the samples
- - Maximum counts and TPM among the samples
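As an illustration of the two peak naming patterns, the following Python sketch splits a peak name into its parts. It assumes the "#" in "p#@GENE_SYMBOL" is an integer and that each name holds a single association; the function name `parse_peak_name` and the returned dictionary keys are hypothetical.

```python
import re

# "p#@GENE_SYMBOL": assumes "#" stands for an integer.
GENE_PEAK = re.compile(r"^p(?P<number>\d+)@(?P<gene>.+)$")
# "p@CHROM:START..END,STRAND"
LOCUS_PEAK = re.compile(
    r"^p@(?P<chrom>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+),(?P<strand>[+-])$"
)


def parse_peak_name(name):
    """Return a dict describing the peak name, or None if it matches neither pattern."""
    m = GENE_PEAK.match(name)
    if m:
        return {"kind": "gene", "number": int(m["number"]), "gene": m["gene"]}
    m = LOCUS_PEAK.match(name)
    if m:
        return {
            "kind": "locus",
            "chrom": m["chrom"],
            "start": int(m["start"]),
            "end": int(m["end"]),
            "strand": m["strand"],
        }
    return None


gene_peak = parse_peak_name("p1@GAPDH")
locus_peak = parse_peak_name("p@chr1:1000..1020,+")
```

The example names are made up for demonstration; names that deviate from these two patterns return `None` and would need separate handling.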
TSS activity
The 5′ ends of the mapped CAGE reads are counted at single-base-pair resolution (CTSS, CAGE tag
starting sites) on the genomic coordinates, and these counts represent TSS activity in the sample.
The read count tracks show raw counts of CAGE reads, and the TPM tracks show counts normalized as
TPM (tags per million).
- Categories of individual samples
- - Cell Line hCAGE
- - Cell Line LQhCAGE
- - Fractionation hCAGE
- - Primary cell hCAGE
- - Primary cell LQhCAGE
- - Time course hCAGE
- - Tissue hCAGE
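The relationship between raw read counts and TPM can be sketched in Python as follows. This is plain library-size scaling within one sample; it assumes the denominator is the sum of the counts supplied, and the actual tracks may have been produced with a different denominator or additional normalization. The function name `counts_to_tpm` is hypothetical.

```python
def counts_to_tpm(counts):
    """Scale raw CTSS counts for one sample to tags per million (TPM).

    `counts` maps a CTSS key to its raw read count. The denominator is the
    sum of all counts passed in, which is an assumption of this sketch.
    """
    total = sum(counts.values())
    if total == 0:
        # No reads at all: every position has zero activity.
        return {key: 0.0 for key in counts}
    return {key: value * 1_000_000 / total for key, value in counts.items()}


raw = {
    ("chr1", 100, "+"): 2,
    ("chr1", 100, "-"): 1,
    ("chr1", 250, "+"): 1,
}
tpm = counts_to_tpm(raw)
```

With only four tags in this toy sample, the position holding two of them receives half of the million. Real samples contain millions of tags, so per-position TPM values are correspondingly smaller.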
FANTOM-NET enhancers
This set of enhancers consists of those identified by Andersson et al. 2014 and those identified
by Hirabayashi et al. 2019.
FANTOM CAT
The FANTOM CAGE-associated transcriptome (FANTOM CAT) is a meta-assembly in which FANTOM5 CAGE
datasets were integrated with transcript models from diverse sources. The Transcription Initiation
Evidence Score (TIEScore) is a custom metric that evaluates the properties of a paired CAGE cluster
and transcript model to quantify the likelihood that the corresponding CAGE transcription start
site (TSS) is genuine. TIEScore was first applied to each of the five transcript model collections
separately, and the collections were then merged into a non-redundant transcript set. Specifically,
the transcript models from GENCODEv19 were used as the initial reference, and the transcripts from
the other four collections were overlaid onto them sequentially, in the order Human BodyMap 2.0,
miTranscriptome, ENCODE, and FANTOM5 RNA-seq assembly. For more details, please refer to Hon et al. 2017.
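The sequential overlay can be pictured with a short Python sketch of priority-ordered merging. It is only an illustration. The redundancy test used here (an identical chromosome, strand, and exon structure) is a placeholder assumption, TIEScore filtering is omitted, and the actual criteria are those described in Hon et al. 2017. The function name `overlay_collections` and the toy transcripts are hypothetical.

```python
def overlay_collections(collections, key):
    """Merge transcript collections in priority order.

    `collections` is a list of (source_name, transcripts) pairs, highest
    priority first. A transcript is kept only if no earlier collection
    already contributed one with the same `key(transcript)`. The key
    function is a stand-in for the real redundancy criteria.
    """
    merged = {}
    for source, transcripts in collections:
        for transcript in transcripts:
            k = key(transcript)
            if k not in merged:
                merged[k] = (source, transcript)
    return list(merged.values())


# Toy transcripts: (chrom, strand, exons), purely illustrative.
tx_a = ("chr1", "+", ((100, 200), (300, 400)))
tx_b = ("chr1", "+", ((100, 200), (500, 600)))
tx_c = ("chr2", "-", ((50, 150),))

collections = [
    ("GENCODEv19", [tx_a]),
    ("Human BodyMap 2.0", [tx_a, tx_b]),
    ("miTranscriptome", []),
    ("ENCODE", [tx_b]),
    ("FANTOM5 RNA-seq assembly", [tx_b, tx_c]),
]
merged = overlay_collections(collections, key=lambda t: t)
sources = [source for source, _ in merged]
```

Each transcript is attributed to the first collection in the sequence that contains it, which mirrors the idea of using GENCODEv19 as the reference and adding only new models from later collections.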
Data Access
FANTOM5 data can be explored interactively with the
Table Browser and cross-referenced with the
Data Integrator. For programmatic access,
the track can be accessed using the Genome Browser's
REST API.
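As a rough sketch of programmatic access, the Python snippet below builds a request URL for the REST API's `getData/track` endpoint. The track identifier `TRACK_NAME` is a placeholder, not a confirmed name for any FANTOM5 subtrack, and the genome and coordinates are arbitrary examples. The helper name `track_data_url` is hypothetical.

```python
from urllib.parse import quote

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome, track, chrom, start, end):
    """Build a getData/track URL with semicolon-separated name=value pairs."""
    params = {
        "genome": genome,
        "track": track,  # placeholder: substitute the real track name
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    query = ";".join(f"{name}={quote(str(value))}" for name, value in params.items())
    return f"{API_BASE}/getData/track?{query}"


url = track_data_url("hg38", "TRACK_NAME", "chr1", 1_000_000, 1_010_000)
```

The resulting URL can be fetched with `urllib.request.urlopen` and the JSON response decoded with `json.loads`. The sketch stops at building the URL so that it runs without network access.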
FANTOM5 annotations can be downloaded from the
Genome Browser's download server
as bigBed files. This compressed binary format can be queried remotely through
command line utilities. Please note that some of the download files can be quite large.
The reprocessed FANTOM5 data can be downloaded from the FANTOM website.
Credits
Thanks to the FANTOM5 consortium,
the Large Scale Data Managing Unit and Preventive Medicine and
Applied Genomics Unit, Center for Integrative Medical Sciences (IMS), and
RIKEN for providing this data
and its analysis.
References
Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M, Chen Y, Zhao X, Schmidl C,
Suzuki T et al.
An atlas of active enhancers across human cell types and tissues.
Nature. 2014 Mar 27;507(7493):455-461.
PMID: 24670763; PMC: PMC5215096
FANTOM Consortium and the RIKEN PMI and CLST (DGT), Forrest AR, Kawaji H, Rehli M, Baillie JK, de
Hoon MJ, Haberle V, Lassmann T, Kulakovskiy IV, Lizio M et al.
A promoter-level mammalian expression atlas.
Nature. 2014 Mar 27;507(7493):462-470.
PMID: 24670764; PMC: PMC4529748
Hirabayashi S, Bhagat S, Matsuki Y, Takegami Y, Uehata T, Kanemaru A, Itoh M, Shirakawa K, Takaori-
Kondo A, Takeuchi O et al.
NET-CAGE characterizes the dynamics and topology of human transcribed cis-regulatory elements.
Nat Genet. 2019 Sep;51(9):1369-1379.
PMID: 31477927
Hon CC, Ramilowski JA, Harshbarger J, Bertin N, Rackham OJ, Gough J, Denisenko E, Schmeier S,
Poulsen TM, Severin J et al.
An atlas of human long non-coding RNAs with accurate 5' ends.
Nature. 2017 Mar 9;543(7644):199-204.
PMID: 28241135; PMC: PMC6857182
Kanamori-Katayama M, Itoh M, Kawaji H, Lassmann T, Katayama S, Kojima M, Bertin N, Kaiho A, Ninomiya
N, Daub CO et al.
Unamplified cap analysis of gene expression on a single-molecule sequencer.
Genome Res. 2011 Jul;21(7):1150-1159.
PMID: 21596820; PMC: PMC3129257
Lizio M, Harshbarger J, Shimoji H, Severin J, Kasukawa T, Sahin S, Abugessaisa I, Fukuda S, Hori F,
Ishikawa-Kato S et al.
Gateways to the FANTOM5 promoter level mammalian expression atlas.
Genome Biol. 2015 Jan 5;16(1):22.
PMID: 25723102; PMC: PMC4310165