Schema for ccdsGene

field         example      SQL type                               info    description
bin           585          smallint(5) unsigned                   range   Indexing field to speed chromosome range queries.
name          CCDS30547.1  varchar(255)                           values  Name of gene (usually transcript_id from GTF)
chrom         chr1         varchar(255)                           values  Reference sequence chromosome or scaffold
strand                     char(1)                                values  + or - for strand
txStart       69090        int(10) unsigned                       range   Transcription start position (or end position for minus strand item)
txEnd         70008        int(10) unsigned                       range   Transcription end position (or start position for minus strand item)
cdsStart      69090        int(10) unsigned                       range   Coding region start (or end position for minus strand item)
cdsEnd        70008        int(10) unsigned                       range   Coding region end (or start position for minus strand item)
exonCount                  int(10) unsigned                       range   Number of exons
exonStarts    69090,       longblob                                       Exon start positions (or end positions for minus strand item)
exonEnds      70008,       longblob                                       Exon end positions (or start positions for minus strand item)
score                      int(11)                                range   score
name2                      varchar(255)                           values  Alternate name (e.g. gene_id from GTF)
cdsStartStat  cmpl         enum('none', 'unk', 'incmpl', 'cmpl')  values  Status of CDS start annotation (none, unknown, incomplete, or complete)
cdsEndStat    cmpl         enum('none', 'unk', 'incmpl', 'cmpl')  values  Status of CDS end annotation (none, unknown, incomplete, or complete)
exonFrames                 longblob                                       Reading frame of the start of the CDS region of the exon, in the direction of transcription (0,1,2), or -1 if there is no CDS region.
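The bin value can be derived from an item's start and end coordinates. The following is a minimal sketch of the standard (non-extended) UCSC binning scheme, assuming 0-based, half-open coordinates below 512 Mb; the function and constant names are illustrative and not part of the table definition.

```python
# Offset of the first bin at each level, from the finest (128 kb) to the coarsest (512 Mb).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # 2**17 bases = 128 kb, the size of the finest bins
BIN_NEXT_SHIFT = 3    # each coarser level covers 8 times more sequence


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # The range straddles two bins at this level, so try the next coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is out of bounds for this scheme")
```

With the example values from the table, bin_from_range(69090, 70008) evaluates to 585. A range query would then restrict the search to the bins that can overlap the region of interest before comparing txStart and txEnd.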

Connected Tables and Joining Fields

hg38.ccdsInfo.ccds (via ccdsGene.name)
hg38.ccdsKgMap.ccdsId (via ccdsGene.name)
hg38.ccdsNotes.ccds (via ccdsGene.name)
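A join through one of these fields might look like the sketch below. It uses an in-memory SQLite database so that it runs without a connection to a genome database server. Only the join columns (ccdsGene.name and ccdsKgMap.ccdsId) come from the listing above; the reduced column set, the geneId payload column, and its placeholder value are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, illustrative versions of the two tables.
    CREATE TABLE ccdsGene (name TEXT, chrom TEXT, txStart INTEGER, txEnd INTEGER);
    CREATE TABLE ccdsKgMap (ccdsId TEXT, geneId TEXT);  -- geneId is an assumed column
    INSERT INTO ccdsGene VALUES ('CCDS30547.1', 'chr1', 69090, 70008);
    INSERT INTO ccdsKgMap VALUES ('CCDS30547.1', 'placeholder-gene-id');
""")

# Join on the field pair named in the listing: ccdsKgMap.ccdsId via ccdsGene.name.
rows = conn.execute("""
    SELECT g.name, g.chrom, g.txStart, g.txEnd, m.geneId
    FROM ccdsGene AS g
    JOIN ccdsKgMap AS m ON m.ccdsId = g.name
""").fetchall()
```

The same ON clause pattern would apply to ccdsInfo.ccds and ccdsNotes.ccds.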

Sample Rows

bin: 585
name: CCDS30547.1
chrom: chr1
txStart: 69090
txEnd: 70008
cdsStart: 69090
cdsEnd: 70008
exonStarts: 69090,
exonEnds: 70008,
cdsStartStat/cdsEndStat: cmpl

bin: 588
name: CCDS72675.1
chrom: chr1
txStart: 450739
txEnd: 451678
cdsStart: 450739
cdsEnd: 451678
exonStarts: 450739,
exonEnds: 451678,
cdsStartStat/cdsEndStat: cmpl

bin: 590
name: CCDS41221.1
chrom: chr1
txStart: 685715
txEnd: 686654
cdsStart: 685715
cdsEnd: 686654
exonStarts: 685715,
exonEnds: 686654,
cdsStartStat/cdsEndStat: cmpl

bin: 592
name: CCDS2.2
chrom: chr1
txStart: 925941
txEnd: 944153
cdsStart: 925941
cdsEnd: 944153
exonStarts: 925941,930154,931038,935771,939039,939274,941143,942135,942409,942558,943252,943697,943907,
exonEnds: 926013,930336,931089,935896,939129,939460,941306,942251,942488,943058,943377,943808,944153,
cdsStartStat/cdsEndStat: cmpl
exonFrames: 0,0,2,2,1,1,1,2,1,2,1,0,0,

bin: 592
name: CCDS3.1
chrom: chr1
txStart: 944693
txEnd: 959240
cdsStart: 944693
cdsEnd: 959240
exonStarts: 944693,945056,945517,946172,946401,948130,948489,951126,951999,952411,953174,953781,954003,955922,956094,956893,957098,958928,95 ...
exonEnds: 944800,945146,945653,946286,946545,948232,948603,951238,952139,952600,953288,953892,954082,956013,956215,957025,957273,959081,95 ...
cdsStartStat/cdsEndStat: cmpl
exonFrames: 1,1,0,0,0,0,0,2,0,0,0,0,2,1,0,0,2,2,0,

bin: 592
name: CCDS30550.1
chrom: chr1
txStart: 960693
txEnd: 965191
cdsStart: 960693
cdsEnd: 965191
exonStarts: 960693,961292,961628,961825,962354,962703,963108,963336,963919,964106,964348,964962,
exonEnds: 960800,961552,961750,962047,962471,962917,963253,963504,964008,964180,964530,965191,
cdsStartStat/cdsEndStat: cmpl
exonFrames: 0,2,1,0,0,0,1,2,2,1,0,2,

bin: 592
name: CCDS53256.1
chrom: chr1
txStart: 966531
txEnd: 974575
cdsStart: 966531
cdsEnd: 974575
exonStarts: 966531,966703,970276,970520,970685,970878,971076,971323,972074,972287,972860,973499,973832,974315,974441,
exonEnds: 966614,966803,970423,970601,970758,971006,971208,971404,972150,972424,973010,973640,974051,974364,974575,
cdsStartStat/cdsEndStat: cmpl
exonFrames: 0,2,0,0,0,1,0,0,0,1,0,0,0,0,1,

bin: 592
name: CCDS4.1
chrom: chr1
txStart: 966531
txEnd: 974575
cdsStart: 966531
cdsEnd: 974575
exonStarts: 966531,966703,970276,970520,970685,970878,971112,971323,972074,972287,972860,973185,973499,973832,974315,974441,
exonEnds: 966614,966803,970423,970601,970758,971006,971208,971404,972150,972424,973010,973326,973640,974051,974364,974575,
cdsStartStat/cdsEndStat: cmpl
exonFrames: 0,2,0,0,0,1,0,0,0,1,0,0,0,0,0,1,

bin: 592
name: CCDS76083.1
chrom: chr1
txStart: 976171
txEnd: 981029
cdsStart: 976171
cdsEnd: 981029
exonStarts: 976171,976498,978880,
exonEnds: 976269,976624,981029,
cdsStartStat/cdsEndStat: cmpl
exonFrames: 1,1,0,

bin: 592
name: CCDS44034.1
chrom: chr1
txStart: 999058
txEnd: 999973
cdsStart: 999058
cdsEnd: 999973
exonStarts: 999058,999525,999691,
exonEnds: 999432,999613,999973,
cdsStartStat/cdsEndStat: cmpl
exonFrames: 1,0,0,
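The relationship between the exon lists and exonFrames in the sample rows above can be checked with a short sketch. It assumes 0-based, half-open coordinates (so an exon's length is end minus start) and takes each frame to be the number of CDS bases preceding the exon in the direction of transcription, modulo 3. The function names are illustrative. The strand values are not shown in the sample rows, so the strands used here ("+" for CCDS2.2, "-" for CCDS76083.1) are assumptions chosen because they reproduce the listed frames.

```python
def parse_list(blob: str) -> list[int]:
    """Parse a comma-terminated list such as '69090,' into integers."""
    return [int(x) for x in blob.split(",") if x]


def exon_frames(exon_starts, exon_ends, cds_start, cds_end, strand):
    """Frame of each exon's first CDS base, or -1 if the exon has no CDS.

    Frames are accumulated in the direction of transcription but returned
    in the same (genomic) order as the input lists.
    """
    exons = list(zip(exon_starts, exon_ends))
    if strand == "+":
        order = range(len(exons))
    else:
        order = range(len(exons) - 1, -1, -1)
    frames = [-1] * len(exons)
    cds_bases = 0  # CDS bases seen so far, in transcription order
    for i in order:
        start = max(exons[i][0], cds_start)
        end = min(exons[i][1], cds_end)
        if start < end:  # this exon overlaps the coding region
            frames[i] = cds_bases % 3
            cds_bases += end - start
    return frames


# CCDS76083.1 from the sample rows; the minus strand is an assumption.
frames = exon_frames(
    parse_list("976171,976498,978880,"),
    parse_list("976269,976624,981029,"),
    976171, 981029, "-",
)
```

Under these assumptions the computed frames for CCDS76083.1 are 1, 1, 0, which agrees with the exonFrames value listed for that row.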

Description

This track shows human genome high-confidence gene annotations from the Consensus Coding Sequence (CCDS) project. This project is a collaborative effort to identify a core set of human protein-coding regions that are consistently annotated and of high quality. The long-term goal is to support convergence towards a standard set of gene annotations on the human genome.

Collaborators include:

European Bioinformatics Institute (EBI)
National Center for Biotechnology Information (NCBI)
University of California, Santa Cruz (UCSC)
Wellcome Trust Sanger Institute (WTSI)

For more information on the different gene tracks, see our Genes FAQ.

Methods

CDS annotations of the human genome were obtained from two sources: NCBI RefSeq and a union of the gene annotations from Ensembl and Vega, collectively known as Hinxton.

Genes with identical CDS genomic coordinates in both sets become CCDS candidates. The genes undergo a quality evaluation, which must be approved by all collaborators. The following criteria are currently used to assess each gene:

an initiating ATG (Exception: a non-ATG translation start codon is annotated if it has sufficient experimental support), a valid stop codon, and no in-frame stop codons (Exception: selenoproteins, which contain a TGA codon that is known to be translated to a selenocysteine instead of functioning as a stop codon)
ability to be translated from the genome reference sequence without frameshifts
recognizable splicing sites
no intersection with putative pseudogene predictions
supporting transcripts and protein homology
conservation evidence with other species
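The sequence-level criteria in the list above (initiating ATG, valid stop codon, no in-frame stop codons, translation without frameshifts) can be illustrated with a minimal sketch. This is not the CCDS project's evaluation pipeline. It assumes the input is the spliced CDS in 5' to 3' orientation, including the stop codon, and it does not model the documented exceptions: a non-ATG start or a selenoprotein TGA would be reported and would need manual review. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(seq: str) -> list[str]:
    """Return a list of failed checks for a spliced CDS; empty if all pass."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("no initiating ATG")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no valid stop codon")
    # Any stop codon before the final codon is an in-frame stop.
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        problems.append("in-frame stop codon")
    return problems
```

For example, check_cds("ATGAAATAA") returns an empty list, while check_cds("ATGTGAAAATAA") reports an in-frame stop codon. The remaining criteria (splice sites, pseudogene overlap, transcript and homology support, conservation) depend on external evidence and are not covered here.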

A unique CCDS ID is assigned to the CCDS, which links together all gene annotations with the same CDS. CCDS gene annotations are under continuous review, with periodic updates to this track.
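The matching step described above, in which annotations with identical CDS coordinates in both sets become candidates and are linked under one identifier, can be sketched as a grouping by CDS structure. The record layout (ID, chromosome, strand, list of CDS intervals), the function names, and the toy annotations below are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def cds_key(chrom, strand, cds_exons):
    """Hashable key for a CDS: chromosome, strand, and exact CDS intervals."""
    return (chrom, strand, tuple(cds_exons))


def find_candidates(set_a, set_b):
    """Map each CDS key present in both sets to the annotation IDs sharing it."""
    by_key = defaultdict(lambda: ([], []))
    for ann_id, chrom, strand, exons in set_a:
        by_key[cds_key(chrom, strand, exons)][0].append(ann_id)
    for ann_id, chrom, strand, exons in set_b:
        by_key[cds_key(chrom, strand, exons)][1].append(ann_id)
    # Keep only CDS structures annotated identically in both sets.
    return {key: a + b for key, (a, b) in by_key.items() if a and b}


# Toy data: only the first pair has identical CDS coordinates.
refseq = [
    ("refseqA", "chr1", "+", [(100, 200), (300, 400)]),
    ("refseqB", "chr1", "+", [(500, 600)]),
]
hinxton = [
    ("hinxtonA", "chr1", "+", [(100, 200), (300, 400)]),
    ("hinxtonB", "chr1", "+", [(500, 650)]),
]
candidates = find_candidates(refseq, hinxton)
```

Each key in the result corresponds to one candidate CDS, and the list of IDs under it is the set of annotations that a single CCDS ID would link together once the candidate passes review.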

Credits

This track was produced at UCSC from data downloaded from the CCDS project web site.

References

Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T et al. The Ensembl genome database project. Nucleic Acids Res. 2002 Jan 1;30(1):38-41. PMID: 11752248; PMC: PMC99161

Pruitt KD, Harrow J, Harte RA, Wallin C, Diekhans M, Maglott DR, Searle S, Farrell CM, Loveland JE, Ruef BJ et al. The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes. Genome Res. 2009 Jul;19(7):1316-23. PMID: 19498102; PMC: PMC2704439

Pruitt KD, Tatusova T, Maglott DR. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D501-4. PMID: 15608248; PMC: PMC539979