GENCODE Archive GENCODE VM30 Track Settings

GENCODE VM30

Track collection: GENCODE Archive

Label: Gene Symbol    Ensembl ID    UniProt display ID    UCSC Genes ID
Show: non-coding genes    splice variants    pseudogenes
Tagged Sets: MANE only    BASIC only    All

Assembly: Mouse Jun. 2020 (GRCm39/mm39)
Data last updated at UCSC: 2022-06-24 11:41:01

Description

The GENCODE Genes track (version M30, Dec 2020) shows high-quality manual annotations merged with evidence-based automated annotations across the entire mouse genome generated by the GENCODE project. By default, only the basic gene set is displayed, which is a subset of the comprehensive gene set. The basic set represents transcripts that GENCODE believes will be useful to the majority of users.

The track includes protein-coding genes, non-coding RNA genes, and pseudogenes, though pseudogenes are not displayed by default. It contains annotations on the reference chromosomes as well as assembly patches and alternative loci (haplotypes).

The following table provides statistics for the VM30 release, derived from the GTF file that contains annotations only on the main chromosomes. More information on how they were generated can be found on the GENCODE site.

GENCODE VM30 Release Stats

Genes                                          Observed   Transcripts                            Observed
Protein-coding genes                           21,668     Protein-coding transcripts             59,116
Long non-coding RNA genes                      14,525     - full length protein-coding           45,378
Small non-coding RNA genes                     6,105      - partial length protein-coding        13,738
Pseudogenes                                    13,647     Nonsense mediated decay transcripts    7,209
Immunoglobulin/T-cell receptor gene segments   701        Long non-coding RNA loci transcripts   25,419
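
The gene counts above are derived from the GTF file. As a minimal sketch of how a per-type tally of genes could be produced, the following assumes GENCODE-style GTF records in which column 3 holds the feature type and column 9 carries a gene_type attribute. The sample records and the function name count_gene_types are invented for illustration, and the grouping of individual biotypes into the table's categories is not shown.

```python
import re
from collections import Counter

# Hypothetical GTF records for illustration only; real input would be the
# lines of the GENCODE VM30 GTF file.
SAMPLE_GTF = """\
##description: example
chr1\tHAVANA\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_type "protein_coding"; gene_name "A";
chr1\tHAVANA\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_type "protein_coding";
chr1\tHAVANA\tgene\t2000\t2500\t.\t-\t.\tgene_id "G2"; gene_type "lncRNA"; gene_name "B";
chr2\tENSEMBL\tgene\t50\t400\t.\t+\t.\tgene_id "G3"; gene_type "protein_coding"; gene_name "C";
"""

GENE_TYPE_RE = re.compile(r'gene_type "([^"]+)"')


def count_gene_types(gtf_text):
    """Count 'gene' features per gene_type in GTF-formatted text."""
    counts = Counter()
    for line in gtf_text.splitlines():
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Only 'gene' rows are counted, so transcripts are not double-counted.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        match = GENE_TYPE_RE.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return counts


gene_counts = count_gene_types(SAMPLE_GTF)
print(gene_counts)
```

Counting only rows whose feature type is gene keeps each gene from being counted once per transcript or exon.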

For more information on the different gene tracks, see our Genes FAQ.

Display Conventions and Configuration

By default, this track displays only the basic GENCODE set, splice variants, and non-coding genes. It includes options to display the entire GENCODE set and pseudogenes. To customize these options, the respective boxes can be checked or unchecked at the top of this description page.

This track also includes a variety of labels which identify the transcripts when visibility is set to "full" or "pack". Gene symbols (e.g. NIPA1) are displayed by default, but additional options include GENCODE Transcript ID (ENST00000561183.5), UCSC Known Gene ID (uc001yve.4), and UniProt Display ID (Q7RTP0). Additional information about gene and transcript names can be found in our FAQ.

This track, in general, follows the display conventions for gene prediction tracks. The exons for putative non-coding genes and untranslated regions are represented by relatively thin blocks, while those for coding open reading frames are thicker.
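
The thin/thick convention can be illustrated with a short sketch. It assumes genePred-style coordinates (0-based, half-open exon intervals plus a cdsStart/cdsEnd pair, with cdsStart equal to cdsEnd for non-coding transcripts). The function name split_thick_thin and the example coordinates are hypothetical; this is not the browser's own drawing code.

```python
def split_thick_thin(exons, cds_start, cds_end):
    """Split exons into thick (coding) and thin (UTR or non-coding) blocks.

    exons: list of (start, end) half-open intervals, sorted by start.
    cds_start, cds_end: half-open coding range; equal values mean the
    transcript has no coding region.
    """
    thick, thin = [], []
    for start, end in exons:
        # Clip the exon to the coding range.
        c_start = max(start, cds_start)
        c_end = min(end, cds_end)
        if c_start >= c_end:
            # No overlap with the coding range: the whole exon is thin.
            thin.append((start, end))
            continue
        if start < c_start:
            thin.append((start, c_start))   # untranslated part before the CDS
        thick.append((c_start, c_end))      # coding part
        if c_end < end:
            thin.append((c_end, end))       # untranslated part after the CDS
    return thick, thin


# Hypothetical three-exon transcript with a coding range of 150..550.
exons = [(100, 200), (300, 400), (500, 650)]
thick, thin = split_thick_thin(exons, 150, 550)
print("thick:", thick)
print("thin:", thin)
```

With a coding range of zero length, every exon falls through to the thin list, which matches how putative non-coding genes are drawn.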

Coloring for the gene annotations is based on the annotation type:

coding
non-coding
pseudogene
problem
all 2-way pseudogenes
all polyA annotations

This track contains an optional codon coloring feature that allows users to quickly validate and compare gene predictions. There is also an option to display the data as a density graph, which can be helpful for visualizing the distribution of items over a region.

Methods

The GENCODE VM30 track was built from the GENCODE downloads file gencode.vM30.chr_patch_hapl_scaff.annotation.gff3.gz. Data from other sources were correlated with the GENCODE data to build association tables.
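
For readers working with the GFF3 file directly, the ninth column holds semicolon-separated key=value attributes, with commas separating multiple values and URL-style escaping of reserved characters, as defined by the GFF3 format. The following is a minimal parsing sketch; the function name parse_gff3_attributes and the example attribute string are hypothetical and are not taken from the UCSC build pipeline.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Parse a GFF3 attribute column into a dict of key -> list of values."""
    attrs = {}
    for item in column9.strip().split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        # Commas separate multiple values; values may be percent-encoded.
        attrs[key] = [unquote(v) for v in value.split(",")]
    return attrs


# Hypothetical attribute string for illustration only.
example = "ID=T1;Parent=G1;gene_type=protein_coding;tag=basic,CCDS"
attrs = parse_gff3_attributes(example)
print(attrs["tag"])
```

Every value is returned as a list, so single-valued and multi-valued attributes can be handled the same way.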

Related Data

The GENCODE Genes transcripts are annotated in numerous tables, each of which is also available as a downloadable file.

One can see a full list of the associated tables in the Table Browser by selecting GENCODE Genes from the track menu; this list is then available on the table menu.

Data access

GENCODE Genes and its associated tables can be explored interactively using the REST API, the Table Browser or the Data Integrator. The genePred format files for mm39 are available from our downloads directory or in our GTF download directory. All the tables can also be queried directly from our public MySQL servers, with more information available on our help page as well as on our blog.
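
As a rough sketch of REST API access, the snippet below builds a request URL for the getData/track endpoint on api.genome.ucsc.edu. The track name wgEncodeGencodeBasicVM30 is an assumption based on UCSC's usual naming of GENCODE tables and should be checked in the Table Browser; the region is arbitrary, and the helper name track_data_url is hypothetical. The code only constructs the URL and does not contact the server.

```python
from urllib.parse import urlencode

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome, track, chrom, start, end):
    """Build a getData/track URL for one genomic region."""
    query = urlencode({
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    })
    return f"{API_BASE}/getData/track?{query}"


# The track name is an assumption; confirm the table name before use.
url = track_data_url("mm39", "wgEncodeGencodeBasicVM30", "chr1", 3000000, 3100000)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client, and the response is JSON.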

Credits

The GENCODE Genes track was produced at UCSC from the GENCODE comprehensive gene set using a computational pipeline developed by Jim Kent and Brian Raney.

References

Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, Aken BL, Barrell D, Zadissa A, Searle S et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 2012 Sep;22(9):1760-74. PMID: 22955987; PMC: PMC3431492

Harrow J, Denoeud F, Frankish A, Reymond A, Chen CK, Chrast J, Lagarde J, Gilbert JG, Storey R, Swarbreck D et al. GENCODE: producing a reference annotation for ENCODE. Genome Biol. 2006;7 Suppl 1:S4.1-9. PMID: 16925838; PMC: PMC1810553

A full list of GENCODE publications is available at The GENCODE Project web site.

Data Release Policy

GENCODE data are available for use without restrictions.
