SwitchGear TSS Track Settings
 
SwitchGear Genomics Transcription Start Sites   (All Regulation tracks)

Assembly: Human Mar. 2006 (NCBI36/hg18)
Data coordinates converted via liftOver from: May 2004 (NCBI35/hg17)
Data last updated at UCSC: 2007-05-09

Description

This track describes the location of transcription start sites (TSSs) throughout the human genome, along with a confidence measure for each TSS based on experimental evidence. A gene's TSSs are important landmarks that help define its promoter regions. These TSSs were determined by SwitchGear Genomics by integrating experimental data using an empirically derived scoring function. Each TSS has a unique identifier that associates it with a gene model (see details below), and each TSS is color-coded to reflect its confidence score.

These TSSs are also available in a searchable format at SwitchDB, an open-access online database of human TSSs. Experimental tools are available through SwitchGear to study the function of the promoter regions associated with these TSSs.

Methods

The predicted TSSs are associated with a genome-wide set of gene models. SwitchGear gene models are defined as clusters of cDNA alignments that have overlapping exons on the same strand. Over 250,000 human cDNA alignments were clustered to construct a genome-wide set of ~37,000 gene models. Each gene model is identified by its chromosome number, strand, and a unique cluster identifier. For example, ID CHR7_P0362 indicates a cDNA cluster (0362) aligning to the plus strand (P) of chromosome 7 (CHR7). Existing gene annotation is mapped to the gene models through the NCBI annotation associated with RefSeq accession numbers.
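As an illustration of the identifier layout described above, the following minimal Python sketch splits a gene model ID into its parts. The pattern is inferred solely from the single example CHR7_P0362; the function name, the regular expression, and the handling of strand letters other than P are assumptions and are not part of any SwitchGear or UCSC tool.

```python
import re
from typing import NamedTuple

# Pattern inferred from the one documented example, "CHR7_P0362".
# The full ID grammar (strand letters other than "P", digit width)
# is not given in the track description, so this is an assumption.
_GENE_MODEL_ID = re.compile(
    r"^CHR(?P<chrom>[0-9A-Za-z]+)_(?P<strand>[A-Z])(?P<cluster>\d+)$"
)


class GeneModelId(NamedTuple):
    chrom: str    # UCSC-style chromosome name, e.g. "chr7"
    strand: str   # strand letter as written in the ID, e.g. "P" for plus
    cluster: str  # cDNA cluster number, kept as text to preserve leading zeros


def parse_gene_model_id(identifier: str) -> GeneModelId:
    """Split a gene model ID such as 'CHR7_P0362' into its components."""
    match = _GENE_MODEL_ID.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized gene model ID: {identifier!r}")
    return GeneModelId("chr" + match["chrom"], match["strand"], match["cluster"])


if __name__ == "__main__":
    print(parse_gene_model_id("CHR7_P0362"))
```

The cluster number is kept as a string so that the leading zero in 0362 survives a round trip back to the original identifier.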

The SwitchGear TSS prediction algorithm identifies the most likely sites of transcription initiation for each gene model. The algorithm employs a scoring metric to assign a confidence level to each TSS prediction based on existing experimental evidence. In addition to the ~250,000 human cDNAs listed in GenBank, more than 5 million additional 5' human cDNA sequence tags have been generated using a combination of approaches. While these short sequence reads do not reveal gene structure, they provide a significant amount of experimental evidence for identifying transcript start sites. For each gene model, the algorithm counts the number of TSSs (defined as the 5' end of a cDNA) within 200 bp of one another. The TSS score is based on the total number of TSSs identified within this window, with each TSS weighted according to several discriminating features: cDNA library source, relative location within the gene model, and exon structure of the transcript. Furthermore, the TSSs for each gene model are ranked to identify the TSS representing the most likely transcription initiation site for that gene model. Rankings are indicated in the TSS unique identifier by the addition of a suffix (e.g., CHR7_P0362_R1 or CHR7_P0362_R2).
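The windowed counting and ranking described above can be sketched roughly as follows. This is not the SwitchGear algorithm: the actual weighting scheme and cluster definition are not published in this description. The sketch assumes that a cluster is every 5' end lying within 200 bp of the cluster's first position, that each observation carries a single precomputed weight (a hypothetical stand-in for library source, location, and exon structure), and that rank follows total weight. All names (CdnaEnd, cluster_and_rank) are illustrative.

```python
from dataclasses import dataclass
from typing import List, Tuple

WINDOW_BP = 200  # window size stated in the track description


@dataclass
class CdnaEnd:
    position: int        # genomic coordinate of a cDNA 5' end
    weight: float = 1.0  # hypothetical combined evidence weight


def cluster_and_rank(
    gene_model_id: str, ends: List[CdnaEnd], window: int = WINDOW_BP
) -> List[Tuple[str, int, int, float]]:
    """Return (ranked ID, start, end, score) tuples, best score first."""
    clusters: List[list] = []  # each entry is [start, end, score]
    for end in sorted(ends, key=lambda e: e.position):
        # Extend the current cluster while the new end stays within
        # `window` bp of the cluster's first position (an assumption).
        if clusters and end.position - clusters[-1][0] <= window:
            clusters[-1][1] = end.position
            clusters[-1][2] += end.weight
        else:
            clusters.append([end.position, end.position, end.weight])
    # Stable sort: clusters with equal scores keep their genomic order.
    ranked = sorted(clusters, key=lambda c: -c[2])
    return [
        (f"{gene_model_id}_R{rank}", start, stop, score)
        for rank, (start, stop, score) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    observed = [CdnaEnd(1000), CdnaEnd(1050), CdnaEnd(1190),
                CdnaEnd(5000, 2.5), CdnaEnd(5010)]
    for row in cluster_and_rank("CHR7_P0362", observed):
        print(row)
```

The sketch ignores strand, so coordinates are simply sorted in ascending order; a strand-aware version would need to decide how 5' ends on the minus strand are ordered.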

Using the Filter

This track has a score-based filter that controls which TSS elements the browser displays. The filter is located at the top of the track description page, which is accessed via the small button to the left of the track's graphical display or through the link on the track's control menu. By default, the track displays only those TSSs with a score of 10 or above.

By default, the TSSs for predicted pseudogenes are not displayed. If you would like to display them, check the box next to the Include TSSs for predicted pseudogenes label.

When you have finished configuring the filter, click the Submit button.
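For users working with a downloaded copy of the data rather than the browser display, the same two filter settings can be approximated in a few lines of Python. This is a sketch only: the record layout (Tss with a score and an is_pseudogene flag) is hypothetical and does not reflect the actual table schema, and the example scores and IDs are made up for illustration. The default threshold of 10 and the default exclusion of pseudogene TSSs follow the description above.

```python
from typing import Iterable, List, NamedTuple

DEFAULT_MIN_SCORE = 10  # default display threshold stated above


class Tss(NamedTuple):
    tss_id: str          # e.g. "CHR7_P0362_R1"
    score: float         # confidence score of the TSS
    is_pseudogene: bool  # hypothetical flag; not the real schema


def filter_tss(
    records: Iterable[Tss],
    min_score: float = DEFAULT_MIN_SCORE,
    include_pseudogenes: bool = False,
) -> List[Tss]:
    """Keep TSSs scoring at least `min_score`; drop pseudogene TSSs
    unless `include_pseudogenes` is set."""
    return [
        r for r in records
        if r.score >= min_score and (include_pseudogenes or not r.is_pseudogene)
    ]


if __name__ == "__main__":
    # Illustrative records with invented scores.
    example = [
        Tss("CHR7_P0362_R1", 42, False),
        Tss("CHR7_P0362_R2", 7, False),
        Tss("CHR7_P0400_R1", 15, True),
    ]
    print(filter_tss(example))
    print(filter_tss(example, include_pseudogenes=True))
```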

Credits

This track was created by Nathan Trinklein and Shelley Force Aldred of SwitchGear Genomics.