| 1. ASAP II Main Query Page |
| Figure 1. ASAP II Main Query Page. Select "Organism & Database" and then submit your query. Supported query types are Gene Symbol (e.g. BRCA1), UniGene Sequence Identifier (e.g. Hs#S20337840), UniGene Cluster Identifier (e.g. Hs.194143), Full Gene Name (e.g. breast cancer 1) and GenBank Sequence Identifier (e.g. BC046142). |
| Figure 2. ASAP II supports 15 species, including 9 vertebrate species, 4 insects, and nematodes. It provides extensive alternative splicing analysis and the resulting splice variants. Furthermore, thanks to the MULTIZ alignments for 17 species available at the UCSC Genome Browser, ASAP II provides comprehensive orthologous exons and introns identified from multiple alignments. |
| 2. Summary for User Query, UniGene Annotation, Orthologous Genes and Genome Browsers |
| Figure 3. If a query returns multiple hits, ASAP II shows a list of all of them. Click a UniGene ID to see that hit. ASAP II does not have alternative splicing results for every UniGene cluster. In the figure above, Hs.511754, Hs.512656 and Hs.536913 have no linkouts and no alternative splicing analysis in ASAP II. Each Gene Symbol has a linkout to NCBI Entrez Gene. |
| Figure 4. If a query returns only one hit, ASAP II shows detailed information. The UniGene Summary shows the gene symbol and its aliases, the gene name, the number of sequences, and gene expression information from the UniGene *.data.gz file. Orthologous Genes are identified by shared splice junctions in the MULTIZ multiple alignments: if exons or introns share at least one of their splice junctions in the alignments, we annotate them as orthologous exons or introns. A more detailed explanation can be found on the introduction page. |
| Figure 5. The Genome Alignments section shows genomic alignments of RefSeq, mRNA, ESTs, Exons, Introns and Isoforms, similar to the UCSC Genome Browser. In the figure above, RefSeq alignments are drawn in "navy" and mRNA alignments in "green". Clicking a GenBank Accession on the left opens the GenBank Nucleotide record for that sequence in a new window. |
| Figure 6. (continued from figure 5) In the UniGene Summary section, ASAP II displays only spliced isoforms. If an isoform has only weak evidence, in most cases a single EST alignment, its alignment is drawn in "gray". The CDS is defined as the longest ORF. See the paper by Yi Xing et al. on the multiple assembly problem in Genome Research for details. |
| Figure 7. Clicking an isoform ID (the numbers on the left side) shows the transcript sequence and the protein sequence, the latter defined by the longest ORF. |
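The "CDS is defined as longest ORF" rule can be illustrated with a minimal sketch. It assumes an ORF runs from an ATG to the first in-frame stop codon and that only the three forward frames of the transcript are scanned. The page does not spell out these details, so the function `longest_orf` is a hypothetical illustration rather than the ASAP II implementation.

```python
# Minimal sketch, assuming an ORF is ATG ... first in-frame stop codon
# on the forward strand of the transcript. Not the ASAP II code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(transcript: str) -> str:
    """Return the longest ATG-to-stop ORF (stop codon included), or ""."""
    seq = transcript.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the currently open ATG in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # close the ORF and keep scanning
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGAAATAGGGATGCCCGGGTTTTAACC"))
```

ORFs that run off the end of the transcript without a stop codon are ignored in this sketch; a real pipeline may treat them differently.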
| Figure 8. (continued from figure 6) Genomic alignments of canonical exons and introns are displayed at the bottom of the browser. Constitutive exons and introns (.CON suffix in the figure above) are drawn in "blueviolet" and alternative exons and introns (.ALT suffix) in "orangered". Non-canonical introns are drawn in "gray". The numbers in parentheses, e.g. (6, 19) in 117676.CON, are the number of RefSeq/mRNA sequences and the number of ESTs supporting the feature, respectively. Repeat alignments from the UCSC Genome Browser RepeatMasker track are drawn in dense view at the bottom of the browser. |
| Figure 9. In figure 8, clicking an exon ID on the left side displays the exon sequence and its supporting evidence in a new window. |
| Figure 10. In figure 8, clicking an intron ID on the left side displays the intron sequence and its supporting evidence in a new window. The "gt" at the start and the "ag" at the end of the intron sequence are clearly visible. |
| Figure 11. In figure 4, clicking "Show Library Information" displays all EST libraries in the given UniGene cluster in a new window. Each Library ID has a linkout to the NCBI UniGene Library Browser. The Title, Tissue and Vector information is derived from the UniGene *.lib.info.gz file. ASAP II also provides its own classification of human EST libraries. Detailed information about this classification is displayed at the bottom of the output page, together with the tissue and cancer specificity of each splice site. |
| Figure 12. In figure 5, clicking "Show Seq Information" displays all sequence members of the given UniGene cluster in a new window. GenBank Accession and GI / PubMed have linkouts to NCBI GenBank and NCBI PubMed, respectively. |
| 3. Summary for Genome Alignment |
| Figure 13. The genome alignment summary shows the genomic location of the given UniGene cluster and a linkout to the UCSC Genome Browser. This location is important because all coordinates in the Summary for Alternative Splicing are relative to it. The first stage of the ASAP II calculation finds the genomic location, extracts that segment of genomic sequence (reverse-complemented for genes on the reverse strand), and maps all sequences against the extracted segment. Thus, all coordinates in the ASAP II database (most tables on the downloads page) are relative to that genomic segment. To obtain exact chromosomal coordinates, add the chromosomal start of the segment for forward genes, or subtract your coordinate from the chromosomal end position for reverse genes. See the paper by Barmak Modrek et al. on alternative splicing detection in Nucleic Acids Research for details. All coordinates on the ASAP II web site are 0-based, but the original ASAP II distribution files on the downloads page use 1-based coordinates, so take care when working with both coordinate systems. |
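The conversion described above can be sketched as follows. The function and parameter names are hypothetical. The sketch assumes that `segment_start` and `segment_end` are the 0-based chromosomal positions of the first and last base of the extracted segment; the page does not state whether the end position is inclusive, so check this against the downloaded tables before relying on it.

```python
# Minimal sketch of the relative-to-chromosomal conversion.
# Assumption: segment_start / segment_end are 0-based chromosomal
# positions of the first and last base of the extracted segment.


def one_based_to_zero_based(pos: int) -> int:
    """Convert a 1-based position (distribution files) to 0-based (web site)."""
    return pos - 1


def to_chromosomal(rel_pos: int, segment_start: int,
                   segment_end: int, strand: str) -> int:
    """Map a 0-based segment-relative position to a chromosomal position."""
    if strand == "+":
        # Forward gene: add the chromosomal start of the segment.
        return segment_start + rel_pos
    if strand == "-":
        # Reverse gene: subtract from the chromosomal end of the segment.
        return segment_end - rel_pos
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    print(to_chromosomal(10, 1000, 2000, "+"))
    print(to_chromosomal(10, 1000, 2000, "-"))
```

A position read from a 1-based distribution file would first go through `one_based_to_zero_based` before being mapped.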
| 4. Summary for Exons & Orthologous Exons |
| Figure 14. The exon information section lists all internal exons identified by canonical intron junctions (GT...AG U1/U2 and AT...AC U11/U12 splice sites), together with their sequence evidence (#EST OBS & #mRNA OBS). Modular exons (exon size divisible by 3, i.e. remainder 0) are shown in "red" in the "Exon Size" column. Clicking an exon ID displays the exon sequence and evidence information in a new window (see figure 9 for details). |
| Figure 15. Clicking "Show Orthologous Exons" in figure 14 displays a genome browser for exons and introns at the top. This genome browser is the same as in figure 8, and exon and intron sequences and evidence can be viewed by clicking the IDs on the left side. See figure 8 for more details. |
| Figure 16. Orthologous Exons are defined as exons sharing at least one splice junction in the MULTIZ multiple alignments. If both splice junctions are exactly aligned, the exon is marked "EXACT" in the comment column. If only one splice junction matches, the exon is marked "3' Match" or "5' Match". The numbers in parentheses give the modularity (exon size modulo 3). If an evolutionary change introduces a frame shift, the modularity changes. If alternative splicing has arisen during evolution, there will be two orthologous exons in the target (right side) species, one marked "EXACT" and the other "3'/5' Match". Click the UniGene ID column to see orthologous gene information. The table above is generated from MULTIZ multiple alignments referenced to the source species (for human, the hg17-referenced MULTIZ 17-way multiple alignments). Matching Position is the chromosomal location of the orthologous gene identified by the MULTIZ alignments. Current Position is the chromosomal location of that exon. If the two locations are the same, this is strong evidence that the two exons are orthologous. If not, as suggested above, alternative splicing may have arisen through evolutionary change. |
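The labels in the comment column can be reproduced with a small sketch. It assumes that "5' Match" means only the 5' junction aligned and "3' Match" means only the 3' junction aligned. That reading follows the label names but is not stated explicitly on this page, and the function names are hypothetical.

```python
# Minimal sketch of the comment-column labels and the modularity value.
# Assumption: "5' Match" = only the 5' junction aligns, "3' Match" = only
# the 3' junction aligns. Names are illustrative, not ASAP II identifiers.
from typing import Optional


def classify_match(five_prime_aligned: bool,
                   three_prime_aligned: bool) -> Optional[str]:
    """Return the orthology label, or None if no junction is shared."""
    if five_prime_aligned and three_prime_aligned:
        return "EXACT"
    if five_prime_aligned:
        return "5' Match"
    if three_prime_aligned:
        return "3' Match"
    return None  # no shared junction: not annotated as orthologous


def modularity(exon_size: int) -> int:
    """Exon size modulo 3; 0 corresponds to a modular exon."""
    return exon_size % 3


if __name__ == "__main__":
    print(classify_match(True, True), modularity(117))
    print(classify_match(False, True), modularity(100))
```

Comparing `modularity` for the source and target exon is one way to spot the frame shifts mentioned above: a different remainder means the reading frame downstream differs.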
| Figure 17. Besides the human (hg17) referenced MULTIZ multiple alignments, mouse (mm7), chicken (galGal2), Drosophila melanogaster (dm2), zebrafish (dr3) and frog (xenTro1) referenced MULTIZ multiple alignments are freely available at the UCSC Genome Browser. The orthologous genes above are identified from MULTIZ multiple alignments referenced to the target species. As you can see, the table format differs slightly from figure 16: the leftmost column gives the origin of the MULTIZ alignments and the right column the orthologous genes. If no MULTIZ alignments are available, ASAP II shows only the table shown in this figure, not the table shown in figure 16. |
| 5. Summary for Introns & Orthologous Introns |
| Figure 18. The intron information section lists all introns with their intron size, 5' and 3' splice sites, and comments. A GT...AG splice site is labeled "U1/U2" and an AT...AC splice site "U11/U12". Clicking an intron ID displays the intron sequence and evidence information in a new window (see figure 10 for details). |
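The labeling rule above depends only on the first two and last two nucleotides of the intron. A minimal sketch, with a hypothetical function name and following only the two pairs named on this page (anything else is reported as non-canonical):

```python
# Minimal sketch of the splice-site labels used on this page:
# GT...AG -> "U1/U2", AT...AC -> "U11/U12", anything else non-canonical.
# The function name is illustrative, not an ASAP II identifier.


def classify_intron(intron_seq: str) -> str:
    """Label an intron by its terminal dinucleotides."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron sequence too short to have both splice sites")
    donor, acceptor = seq[:2], seq[-2:]
    if (donor, acceptor) == ("GT", "AG"):
        return "U1/U2"
    if (donor, acceptor) == ("AT", "AC"):
        return "U11/U12"
    return "non-canonical"


if __name__ == "__main__":
    print(classify_intron("gtaagtcccag"))
    print(classify_intron("ATATCCTTAC"))
```

The input is uppercased first because intron sequences are displayed in lowercase on the site (the "gt" and "ag" noted for figure 10).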
| Figure 19. Orthologous introns are detected by the same method as orthologous exons. See the text for figure 16 for more details. The numbers in parentheses in the comment column are intron sizes, left for the source species and right for the target species. As you can see, the ortholog of Hs.194143 intron ID 365585 is Rn.48840 intron ID 28953, and the intron size increases by 1225 bp, from 1157 bp to 2382 bp. |
| Figure 20. As described for figure 17, these orthologous introns are identified from MULTIZ multiple alignments referenced to the target species. There are two rows for canFam2 Cfa.140 intron 30631: human intron 365603 (EXACT) and human intron 365600 (3' Match). This suggests that the dog exon is alternatively spliced in human. However, it may also simply reflect a lack of dog mRNA/EST sequences, because there is no dog intron for human intron 365600. |
| Figure 21. Splice junctions for each species are extracted from the MULTIZ multiple alignments. That is, these tables are generated from the first two and last two nucleotides of the intron. Conservation of the splice junction consensus (mainly for canonical splice sites) suggests that the given gene exists in those species and that the intron may not have changed during evolution. In the figure above, there is no splice site information for the fishes (tetNig, fr1, and danRer3), implying either that both flanking exons are too divergent to produce good MULTIZ multiple alignments or that fishes do not have such a gene. By looking at the conservation of splice junctions across species, we can pick out likely lineage-specific genes. If at least one of the flanking exons is highly conserved and splice junctions are present for most species (especially for at least one of the more distant species), good candidates for lineage-specific genes are easy to choose. Keep in mind that the genome assemblies of some species (e.g., danNov1 and monDom2 in the figure above) are not good enough to give reliable results, because tiny scaffold fragments may not cover whole gene regions. This approach, comparing splice junctions across many species, can provide a shortcut for comparative studies. |
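The kind of per-species comparison described above can be sketched as a small summary function. The input format is an assumption: a mapping from assembly name to the (first two, last two) intron nucleotides taken from the alignment, with `None` where the alignment has no sequence for that species. The values in the usage example are made up for illustration and are not taken from the figure.

```python
# Minimal sketch: group species by whether the source splice junction
# is conserved, changed, or absent from the multiple alignment.
# Input format and names are assumptions; the example data are invented.
from typing import Dict, List, Optional, Tuple

Junction = Optional[Tuple[str, str]]  # (first two nt, last two nt) or None


def summarize_conservation(source: Tuple[str, str],
                           by_species: Dict[str, Junction]) -> Dict[str, List[str]]:
    """Classify each species as conserved, changed, or missing."""
    wanted = (source[0].upper(), source[1].upper())
    summary: Dict[str, List[str]] = {"conserved": [], "changed": [], "missing": []}
    for species, junction in by_species.items():
        if junction is None:
            summary["missing"].append(species)
        elif (junction[0].upper(), junction[1].upper()) == wanted:
            summary["conserved"].append(species)
        else:
            summary["changed"].append(species)
    return summary


if __name__ == "__main__":
    example = {"mm7": ("gt", "ag"), "canFam2": ("gc", "ag"), "danRer3": None}
    print(summarize_conservation(("GT", "AG"), example))
```

Species in the "missing" group need the caution given above: the absence may reflect a fragmented assembly or a divergent region rather than a missing gene.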
| 6. Summary for Alternative Splicing |
| Figure 22. The alternative splicing summary lists all alternative splicing information. Clicking an exon or intron ID displays its sequence and evidence information in a new window. See figures 9 and 10 for more details. |
| Figure 23. Clicking a skip ID in figure 22 displays the evidence for the skipped exons in a new window. |
| 7. Summary for Isoform and Protein Sequences |
| Figure 24. The isoform genome browser is the same as the genome browser in the user query summary section. See the comments for figures 5 and 6 for more details. The difference is that unspliced isoform alignments are displayed below as well. |
| Figure 25. Clicking "Show Fasta Sequences" in figure 24 displays all transcript sequences in FASTA format in a new window. |
| Figure 26. Clicking "Show Protein Sequences" in figure 24 displays all protein sequences (CDS defined as the longest ORF) in FASTA format in a new window. |
| 8. Summary for Tissue & Cancer Specificity |
| Figure 27. The Tissue/Cancer specificity section is available only for human. All human EST libraries (8828 EST libraries) are classified into 47 distinct tissues and into normal/cancer tissues. The LOD calculation method is the same as in the paper by Qiang Xu and Christopher Lee on cancer-specific alternative splicing in Nucleic Acids Research. There are 1709 high-confidence (LOD >= 3 and #EST S1 >= 3) tissue-specific alternative splicing relationships from 960 alternatively spliced genes (you can download the tissue classification and LOD calculation files from the downloads page and load them into MySQL to query for your candidate genes), and 273 high-confidence (LOD >= 3 and #EST S1 >= 3) cancer-specific alternative splicing relationships from 198 alternatively spliced genes. This summary includes normal-specific alternative splicing relationships as well as cancer-specific ones. |
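The high-confidence cutoff (LOD >= 3 and #EST S1 >= 3) can be applied to downloaded records with a simple filter. The record layout below is hypothetical: the field names do not come from the ASAP II download files, and the LOD values are assumed to be precomputed, since the calculation itself is described in the cited paper rather than here.

```python
# Minimal sketch of the high-confidence filter (LOD >= 3 and #EST S1 >= 3).
# SpecificityRecord and its field names are hypothetical; map them to the
# actual columns of the downloaded LOD calculation file.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SpecificityRecord:
    cluster_id: str   # UniGene cluster, e.g. "Hs.194143"
    category: str     # tissue name, or "cancer" / "normal"
    lod: float        # precomputed LOD score
    est_s1: int       # the "#EST S1" count shown in the summary


def high_confidence(records: Iterable[SpecificityRecord],
                    min_lod: float = 3.0,
                    min_est_s1: int = 3) -> List[SpecificityRecord]:
    """Keep records that pass both thresholds."""
    return [r for r in records
            if r.lod >= min_lod and r.est_s1 >= min_est_s1]


if __name__ == "__main__":
    demo = [
        SpecificityRecord("Hs.example1", "brain", 4.2, 5),
        SpecificityRecord("Hs.example2", "cancer", 3.5, 2),
    ]
    for record in high_confidence(demo):
        print(record.cluster_id, record.category)
```

Once the files are loaded into MySQL, the same two conditions would go in the `WHERE` clause of the query.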
| Figure 28. Clicking "Show EST Library Classification" in figure 27 displays the classification of all EST libraries (second and third columns, Tissue & Normal/Cancer) in a new window. This page is similar to figure 11 (EST Library Information from UniGene), but adds the ASAP II EST Library Classification. |