Database: hg38 Primary Table: ccdsGene Row Count: 32,506   Data last updated: 2019-10-03
Format description: A gene prediction with some additional info. On download server: MariaDB table dump directory
field | example | SQL type | info | description |
---|
bin | 585 | smallint(5) unsigned | range | Indexing field to speed chromosome range queries. |
name | CCDS30547.1 | varchar(255) | values | Name of gene (usually transcript_id from GTF) |
chrom | chr1 | varchar(255) | values | Reference sequence chromosome or scaffold |
strand | + | char(1) | values | + or - for strand |
txStart | 69090 | int(10) unsigned | range | Transcription start position (or end position for minus strand item) |
txEnd | 70008 | int(10) unsigned | range | Transcription end position (or start position for minus strand item) |
cdsStart | 69090 | int(10) unsigned | range | Coding region start (or end position for minus strand item) |
cdsEnd | 70008 | int(10) unsigned | range | Coding region end (or start position for minus strand item) |
exonCount | 1 | int(10) unsigned | range | Number of exons |
exonStarts | 69090, | longblob | | Exon start positions (or end positions for minus strand item) |
exonEnds | 70008, | longblob | | Exon end positions (or start positions for minus strand item) |
score | 0 | int(11) | range | score |
name2 | | varchar(255) | values | Alternate name (e.g. gene_id from GTF) |
cdsStartStat | cmpl | enum('none', 'unk', 'incmpl', 'cmpl') | values | Status of CDS start annotation (none, unknown, incomplete, or complete) |
cdsEndStat | cmpl | enum('none', 'unk', 'incmpl', 'cmpl') | values | Status of CDS end annotation (none, unknown, incomplete, or complete) |
exonFrames | 0, | longblob | | Reading frame of the start of the CDS region of the exon, in the direction of transcription (0,1,2), or -1 if there is no CDS region. |
To download this table in different text formats or to intersect or correlate it with other tables, use the Table Browser.
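
The bin field is easiest to understand through the computation that produces it. The sketch below assumes the standard UCSC binning scheme (five levels, smallest bins of 128 kb, each level eight times larger), which covers positions below 512 Mb. The function name `bin_from_range` is ours for illustration and is not part of any UCSC API.

```python
# Offset of the first bin at each level, from the finest (128 kb bins)
# to the coarsest (one 512 Mb bin). Assumes the standard UCSC scheme.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # 2**17 bases = 128 kb, the size of the smallest bins
BIN_NEXT_SHIFT = 3    # each level up holds bins 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains a 0-based, half-open range."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is outside the standard scheme (512 Mb)")


# Example values from the field table: txStart 69090, txEnd 70008.
print(bin_from_range(69090, 70008))  # prints 585
```

A range query can then restrict itself to the handful of bins that may overlap the region of interest before comparing txStart and txEnd, which is what makes the indexed bin column useful.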
Sample Rows
bin | name | chrom | strand | txStart | txEnd | cdsStart | cdsEnd | exonCount | exonStarts | exonEnds | score | name2 | cdsStartStat | cdsEndStat | exonFrames |
---|
585 | CCDS30547.1 | chr1 | + | 69090 | 70008 | 69090 | 70008 | 1 | 69090, | 70008, | 0 | | cmpl | cmpl | 0, |
588 | CCDS72675.1 | chr1 | - | 450739 | 451678 | 450739 | 451678 | 1 | 450739, | 451678, | 0 | | cmpl | cmpl | 0, |
590 | CCDS41221.1 | chr1 | - | 685715 | 686654 | 685715 | 686654 | 1 | 685715, | 686654, | 0 | | cmpl | cmpl | 0, |
592 | CCDS2.2 | chr1 | + | 925941 | 944153 | 925941 | 944153 | 13 | 925941,930154,931038,935771,939039,939274,941143,942135,942409,942558,943252,943697,943907, | 926013,930336,931089,935896,939129,939460,941306,942251,942488,943058,943377,943808,944153, | 0 | | cmpl | cmpl | 0,0,2,2,1,1,1,2,1,2,1,0,0, |
592 | CCDS3.1 | chr1 | - | 944693 | 959240 | 944693 | 959240 | 19 | 944693,945056,945517,946172,946401,948130,948489,951126,951999,952411,953174,953781,954003,955922,956094,956893,957098,958928,95 ... | 944800,945146,945653,946286,946545,948232,948603,951238,952139,952600,953288,953892,954082,956013,956215,957025,957273,959081,95 ... | 0 | | cmpl | cmpl | 1,1,0,0,0,0,0,2,0,0,0,0,2,1,0,0,2,2,0, |
592 | CCDS30550.1 | chr1 | + | 960693 | 965191 | 960693 | 965191 | 12 | 960693,961292,961628,961825,962354,962703,963108,963336,963919,964106,964348,964962, | 960800,961552,961750,962047,962471,962917,963253,963504,964008,964180,964530,965191, | 0 | | cmpl | cmpl | 0,2,1,0,0,0,1,2,2,1,0,2, |
592 | CCDS53256.1 | chr1 | + | 966531 | 974575 | 966531 | 974575 | 15 | 966531,966703,970276,970520,970685,970878,971076,971323,972074,972287,972860,973499,973832,974315,974441, | 966614,966803,970423,970601,970758,971006,971208,971404,972150,972424,973010,973640,974051,974364,974575, | 0 | | cmpl | cmpl | 0,2,0,0,0,1,0,0,0,1,0,0,0,0,1, |
592 | CCDS4.1 | chr1 | + | 966531 | 974575 | 966531 | 974575 | 16 | 966531,966703,970276,970520,970685,970878,971112,971323,972074,972287,972860,973185,973499,973832,974315,974441, | 966614,966803,970423,970601,970758,971006,971208,971404,972150,972424,973010,973326,973640,974051,974364,974575, | 0 | | cmpl | cmpl | 0,2,0,0,0,1,0,0,0,1,0,0,0,0,0,1, |
592 | CCDS76083.1 | chr1 | - | 976171 | 981029 | 976171 | 981029 | 3 | 976171,976498,978880, | 976269,976624,981029, | 0 | | cmpl | cmpl | 1,1,0, |
592 | CCDS44034.1 | chr1 | - | 999058 | 999973 | 999058 | 999973 | 3 | 999058,999525,999691, | 999432,999613,999973, | 0 | | cmpl | cmpl | 1,0,0, |
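
The relationship between exonStarts, exonEnds, strand, and exonFrames can be checked directly on the sample rows. The sketch below parses one pipe-delimited row in the column order shown above and recomputes the frames, assuming that an exon's frame is the number of coding bases transcribed before it, modulo 3. That rule is our reading of the exonFrames description, and the names `GenePred`, `parse_row`, and `expected_frames` are hypothetical helpers, not UCSC code.

```python
from dataclasses import dataclass


@dataclass
class GenePred:
    name: str
    chrom: str
    strand: str
    cds_start: int
    cds_end: int
    exon_starts: list[int]
    exon_ends: list[int]
    exon_frames: list[int]


def parse_int_list(text: str) -> list[int]:
    """Parse a comma-terminated list such as '69090,' into [69090]."""
    return [int(item) for item in text.split(",") if item.strip()]


def parse_row(line: str) -> GenePred:
    """Parse one pipe-delimited sample row (16 columns, bin first)."""
    f = [cell.strip() for cell in line.split("|")]
    return GenePred(
        name=f[1], chrom=f[2], strand=f[3],
        cds_start=int(f[6]), cds_end=int(f[7]),
        exon_starts=parse_int_list(f[9]),
        exon_ends=parse_int_list(f[10]),
        exon_frames=parse_int_list(f[15]),
    )


def expected_frames(g: GenePred) -> list[int]:
    """Recompute exonFrames by walking exons in the direction of transcription."""
    exons = list(zip(g.exon_starts, g.exon_ends))
    indices = range(len(exons))
    order = indices if g.strand == "+" else reversed(indices)
    frames = [-1] * len(exons)  # -1 marks exons with no CDS region
    coding_so_far = 0
    for i in order:
        start, end = exons[i]
        coding = min(end, g.cds_end) - max(start, g.cds_start)
        if coding > 0:
            frames[i] = coding_so_far % 3
            coding_so_far += coding
    return frames


row = ("592 | CCDS76083.1 | chr1 | - | 976171 | 981029 | 976171 | 981029 | 3 | "
       "976171,976498,978880, | 976269,976624,981029, | 0 | | cmpl | cmpl | 1,1,0, |")
gene = parse_row(row)
print(expected_frames(gene) == gene.exon_frames)  # prints True
```

Rows whose exon lists are truncated with "..." in the display above cannot be checked this way, because the full lists are needed.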
Note: all start coordinates in our database are 0-based, not 1-based.
See explanation here.
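
As a small worked example of what the 0-based convention means in practice, the sketch below converts a table interval to the 1-based form used in position strings and computes its length. It assumes that end coordinates are exclusive (the half-open convention used in UCSC tables); the note above only states that starts are 0-based. Both function names are ours.

```python
def to_one_based(chrom: str, start: int, end: int) -> str:
    """Convert a 0-based, half-open table interval to a 1-based, closed position."""
    return f"{chrom}:{start + 1}-{end}"


def interval_length(start: int, end: int) -> int:
    """With half-open coordinates the length is simply end minus start."""
    return end - start


# First sample row: CCDS30547.1, txStart 69090, txEnd 70008.
print(to_one_based("chr1", 69090, 70008))  # prints chr1:69091-70008
print(interval_length(69090, 70008))       # prints 918
```

Only the start changes on conversion; the end value is the same number in both systems.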
CCDS (ccdsGene) Track Description
Description
This track shows high-confidence human genome gene annotations from the
Consensus Coding Sequence (CCDS) project. This project is a collaborative
effort to identify a core set of human protein-coding regions that are
consistently annotated and of high quality. The long-term goal is to support
convergence towards a standard set of gene annotations on the human genome.
Collaborators include:
For more information on the different gene tracks, see our Genes FAQ.
Methods
CDS annotations of the human genome were obtained from two sources: NCBI
RefSeq and a union of the gene annotations from Ensembl and Vega,
collectively known as Hinxton.
Genes with identical CDS genomic coordinates in both sets become CCDS
candidates. The genes undergo a quality evaluation, which must be approved by
all collaborators. The following criteria are currently used to assess each
gene:
- an initiating ATG (Exception: a non-ATG translation start codon is
annotated if it has sufficient experimental support), a valid stop codon, and
no in-frame stop codons (Exception: selenoproteins, which contain a TGA codon
that is known to be translated to a selenocysteine instead of functioning as
a stop codon)
- ability to be translated from the genome reference sequence without frameshifts
- recognizable splicing sites
- no intersection with putative pseudogene predictions
- supporting transcripts and protein homology
- conservation evidence with other species
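
The sequence-level criteria in the list above (start codon, terminal stop codon, no in-frame stops, no frameshift) can be illustrated with a short check. This is only a sketch under stated assumptions and not the CCDS project's evaluation pipeline: the function `check_cds` and its two flags are hypothetical, the flags stand in for the non-ATG and selenoprotein exceptions, and the input is assumed to be the spliced CDS in the direction of transcription. The other criteria (splice sites, pseudogenes, supporting evidence, conservation) need data beyond the sequence and are not covered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(seq: str,
              allow_non_atg_start: bool = False,
              allow_internal_tga: bool = False) -> list[str]:
    """Return a list of problems found in a spliced CDS; empty if none."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        problems.append("no complete codon")
        return problems
    if codons[0] != "ATG" and not allow_non_atg_start:
        problems.append("does not start with ATG")
    if codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    for index, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS and not (codon == "TGA" and allow_internal_tga):
            problems.append(f"in-frame stop codon {codon} at codon {index}")
    return problems


print(check_cds("ATGAAATAA"))     # prints []
print(check_cds("ATGTAAAAATAA"))  # prints ['in-frame stop codon TAA at codon 2']
```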
A unique CCDS ID is assigned to the CCDS, which links together all gene
annotations with the same CDS. CCDS gene annotations are under continuous
review, with periodic updates to this track.
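
The candidate-selection step described above, where annotations with identical CDS genomic coordinates in both sets become CCDS candidates, amounts to an exact-match join on coordinates. The sketch below shows that idea on toy data; the chromosome name, coordinates, and annotation IDs are invented for illustration, and the function names are ours.

```python
from collections import defaultdict

# (chrom, strand, ((block_start, block_end), ...)) identifies one CDS.
CdsKey = tuple[str, str, tuple[tuple[int, int], ...]]


def index_by_cds(annotations: dict[str, CdsKey]) -> dict[CdsKey, list[str]]:
    """Group annotation IDs by their exact CDS coordinates."""
    index: dict[CdsKey, list[str]] = defaultdict(list)
    for annotation_id, key in annotations.items():
        index[key].append(annotation_id)
    return index


def shared_cds(set_a: dict[str, CdsKey],
               set_b: dict[str, CdsKey]) -> dict[CdsKey, tuple[list[str], list[str]]]:
    """Return each CDS found in both sets, with the matching IDs from each side."""
    index_a, index_b = index_by_cds(set_a), index_by_cds(set_b)
    return {key: (index_a[key], index_b[key]) for key in index_a if key in index_b}


# Toy annotation sets (hypothetical IDs and coordinates).
set_a = {"a1": ("chrT", "+", ((100, 160), (300, 390))),
         "a2": ("chrT", "-", ((500, 590),))}
set_b = {"b1": ("chrT", "+", ((100, 160), (300, 390))),
         "b2": ("chrT", "-", ((500, 593),))}

candidates = shared_cds(set_a, set_b)
print(len(candidates))  # prints 1
```

Here a2 and b2 differ by three bases at one end, so they do not match; only exact agreement on every block boundary produces a candidate.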
Credits
This track was produced at UCSC from data downloaded from the CCDS project
web site.