WGAViewer

Top hits: SNP view (Figure 3.3.2.c-1)

Click the “SNP” tab to display the SNP view panel (Figure 3.3.2.c-1). This panel shows the annotation of the selected SNP (in this case the selected top hit), including LD extension, recent selection score, gene context, and association with gene expression levels. It consists of eight parts:

Figure 3.3.2.c-1. Results of comprehensive annotation for top hits: rs9264942, SNP view. A: Chromosome ideogram; B: LD extension; C: Recent selection score; D: Transcripts; E: Association with (the closest) gene expression levels; F: Description of annotated SNP; G: Dynamic data sheet; H: Popup menu. Part E shows a strong correlation between the annotated top hit, rs9264942, and HLA-C expression levels in immortalized B-lymphocytes. However, it should also be noted that for a highly variable gene like HLA-C, apparent expression effects could result either from an association with expression or from an association with variant sites that affect probe binding (Urban et al. unpublished data). The circled red arrow denotes the physical location of the annotated SNP (rs9264942). The red highlight line running through parts B to D represents rs2246742, which is in LD with the annotated SNP rs9264942 (r2=0.74) and is also located in the promoter region of HLA-C.

A: Chromosome ideogram : shows the annotated region on a chromosome with a transparent red rectangle;

B: LD extension : shows the pair-wise LD between the annotated top hit and all the available HapMap SNPs in the annotated region defined in Figure 3.3.2-1 (200 Kbp up- and downstream in this example). The color scheme for r2 is the same as in the gene view LD matrix (3.3.2.b, Figure 3.3.2.b-1, part E): blue 0-0.2; yellow 0.2-0.6; red 0.6-1.0. Missing values are coded as -9 and plotted in gray. Each line responds to mouse movement and shows the detailed information for that HapMap SNP, including the pair-wise LD scores, in the dynamic data sheet (part G). The LD cutoff is 0.8 by default, but it can be changed before the annotation is run (see: section 3.3.2, Figure 3.3.2-1).
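The r2 color scheme above can be written as a small lookup. This is only an illustrative sketch, not WGAViewer's own code: the function name `ld_color` is hypothetical, and because the text does not say which bin the boundary values 0.2 and 0.6 belong to, the sketch assumes they fall into the upper bin.

```python
MISSING = -9  # missing-value code given in the text


def ld_color(r2):
    """Return the plotting color for a pair-wise r2 value.

    Bins follow the text: blue 0-0.2, yellow 0.2-0.6, red 0.6-1.0.
    Assumption: a value exactly on a boundary goes to the upper bin.
    """
    if r2 == MISSING:
        return "gray"
    if not 0.0 <= r2 <= 1.0:
        raise ValueError(f"r2 out of range: {r2}")
    if r2 < 0.2:
        return "blue"
    if r2 < 0.6:
        return "yellow"
    return "red"
```

With this mapping, the highlighted SNP rs2246742 from Figure 3.3.2.c-1 (r2=0.74) would be drawn in red.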

Figure 3.3.2.c-2. Resource menu for SNPs

Click on any SNP LD score line to bring up a resource menu (Figure 3.3.2.c-2). Where the lines are too dense to pick out with the mouse, press “<” or “,” to move the highlight backward (left, towards smaller chromosome coordinates), or “>” or “.” to move it forward (right, towards larger chromosome coordinates), and then press “Enter” to bring up this menu.

C: Available recent selection score (Voight et al. 2006) for HapMap SNPs shown in part B;

D: Transcripts : shows the transcripts located in the annotated region. Exons are plotted as blue rectangles, with a red arrow representing the strand. Each transcript responds to mouse movement and shows its detailed information, together with hyperlinks to external databases, in the dynamic data sheet (part G). Alternative transcripts are plotted as well, with detailed exon information also shown in the dynamic data sheet (part G).

Click on any transcript to bring up a resource menu (Figure 3.3.2.c-3). Where the transcripts are too dense to pick out with the mouse, press “<” or “,” to move backward (left, towards smaller chromosome coordinates), or “>” or “.” to move forward (right, towards larger chromosome coordinates), and then press “Enter” to bring up this menu.

Figure 3.3.2.c-3. Resource menu for transcripts

E: Association test between genotype and gene expression : This test is based on the genome-wide gene expression database from the Sanger Institute GENEVAR project (Stranger et al. 2005; Stranger et al. 2007) and genotype data from the HapMap database (The International HapMap Consortium 2007). Gene expression was quantified in immortalized B-lymphocytes. As the example dataset in Figure 3.3.2.c-1 shows, there is a strong correlation between one of the top hits, rs9264942, and HLA-C expression levels (Fellay et al. 2007). This suggests an immediate working hypothesis: the variant might act by controlling HLA-C expression levels. The hypothesis led directly to functional follow-up and has been verified by data from other independent cohorts.

However, it should also be noted that for a highly variable gene like HLA-C, apparent expression effects could result either from an association with expression or from an association with variant sites that affect probe binding (Urban et al. unpublished data).
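The manual does not state which statistic part E computes. As a rough illustration of what a genotype-expression association can look like, the sketch below regresses expression level on allele dosage (0, 1, or 2 copies of one allele) by ordinary least squares and reports the Pearson correlation. The function name `genotype_expression_fit` and the choice of an additive linear model are assumptions made for this example only.

```python
from math import sqrt


def genotype_expression_fit(dosages, expression):
    """Least-squares fit of expression on allele dosage (0, 1, or 2).

    Returns (slope, intercept, pearson_r). This is an assumed additive
    model, not necessarily the test WGAViewer applies.
    """
    if len(dosages) != len(expression) or len(dosages) < 3:
        raise ValueError("need two equal-length samples of at least 3 points")
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    # Sums of squares and cross-products around the means
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype or expression")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)
```

A correlation of this kind cannot by itself separate the two explanations given above: a variant that disrupts probe binding would produce the same apparent dosage effect as a genuine change in expression.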

F: Description of annotated SNP : shows the detailed information for the annotated SNP. If the annotated SNP (top hit) tags any other HapMap SNPs in the specified HapMap population at the specified pair-wise r2 cutoff, the tagged SNPs and their r2 values are also listed in this data sheet. The data sheet also summarizes the region spanned by the annotation, including the annotation span, the start and end landmarks, and the number of SNPs and transcripts shown. If the SNP annotation is based on exactly the same region as the gene view, the annotation span is shown as “0”. Unlike the dynamic data sheet (part G), the contents of this data sheet are fixed.

G: Dynamic data sheet : similar to the dynamic data sheet in the gene view panel (Figure 3.3.2.a-1). This sheet shows the detailed information for the item highlighted in parts B-D, so its contents change according to the type of item highlighted in the main graphical region. Like the dynamic data sheet in gene view, it also has a fixed toolbar that includes a drop-down menu of all the HapMap SNPs shown in part B, sorted by rs#. Selecting a SNP from this menu shows its detailed information in the dynamic data sheet and draws a red highlight line through parts B to D to mark the selected SNP. This is sometimes more convenient than pointing the mouse at a specific SNP in part B, because at high SNP density it is difficult to highlight a single SNP among overlapping lines. For SNPs and transcripts, this data sheet always offers hyperlinks to external databases, including Ensembl and NCBI, as a convenient reference for data not shown.

H: Popup menu : similar to the popup menu in the gene view panel (Figure 3.3.2.b-1). Clicking on a blank region (anywhere other than a hotspot such as a SNP or transcript) activates this popup menu. It can be dismissed by clicking the menu item “Exit”. As in gene view, this popup menu offers four functions:

H.1 Show all SNPs : clicking this menu item opens a popup window that shows the detailed information for all the HapMap SNPs (Figure 3.3.2.c-1) plotted in part B at once, instead of one by one in the dynamic data sheet (part G). This data sheet window offers five ways to sort the SNP collection: by location, by r2, by D', by rs#, or by type. Select the sorting method and then click the “Sort by” button. Sorting by r2 is probably the most useful, because it quickly reveals the SNPs in this region that are tagged by the annotated top hit.
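The idea behind sorting by r2 to find tagged SNPs can be sketched in a few lines. The record layout (dictionaries with `rs` and `r2` keys), the function name `tagged_snps`, the descending sort order, and the use of "greater than or equal to" at the cutoff are all assumptions made for illustration; the default of 0.8 and the -9 missing-value code come from the description of part B.

```python
def tagged_snps(snps, r2_cutoff=0.8):
    """Return SNPs tagged at the given r2 cutoff, strongest LD first.

    `snps` is a list of dicts with hypothetical keys "rs" and "r2".
    Records with the missing-value code -9 are skipped.
    """
    kept = [s for s in snps if s["r2"] != -9 and s["r2"] >= r2_cutoff]
    return sorted(kept, key=lambda s: s["r2"], reverse=True)
```

Under these assumptions, rs2246742 (r2=0.74 with the annotated SNP) would not be reported as tagged at the default cutoff of 0.8, but would be at a cutoff of 0.7.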

Figure 3.3.2.c-4. Results of comprehensive annotation for top hits: rs9264942, SNP view, data sheet for showing all HapMap SNPs.

Much like the data sheet window in gene view (Figure 3.3.2.b-4), this data sheet window offers clickable hyperlinks to dbSNP, Ensembl, and PubMed. Clicking the rs# of a SNP draws a red highlight line through parts B to D to indicate the location of the selected HapMap SNP. To dismiss this window, click the “OK” button. The contents of the table can also be saved to a comma-separated text file (.csv) by clicking the “Save” button, or copied to the system clipboard as tab-separated text by clicking the “Copy” button and then pasted into external software such as a text editor or Microsoft Excel.
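The two export formats differ only in their delimiter, which the following sketch makes explicit. It uses Python's standard `csv` module rather than anything from WGAViewer; the function name `to_delimited` and the column names are hypothetical.

```python
import csv
import io


def to_delimited(rows, fieldnames, delimiter):
    """Serialize a list of dict rows as delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Comma-separated, as for a saved .csv file (column names are made up)
rows = [{"rs": "rs2246742", "r2": 0.74}]
csv_text = to_delimited(rows, ["rs", "r2"], ",")
# Tab-separated, as for text placed on the clipboard
tsv_text = to_delimited(rows, ["rs", "r2"], "\t")
```

Tab-separated text is a common choice for clipboard transfer because spreadsheet programs generally split pasted text into columns at tab characters.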

H.2 Show all transcripts : clicking this menu item opens a popup window that shows the detailed information for all the transcripts (Figure 3.3.2.c-5) plotted in part D at once, instead of one by one in the dynamic data sheet (part G). This function is equivalent to the one in gene view (see 3.3.2.b, H2, Figure 3.3.2.b-5).

Figure 3.3.2.c-5. Results of comprehensive annotation for top hits: rs9264942, SNP view, data sheet for showing all transcripts.

H.3 LD measures to display : click this menu item to choose whether D' or r2 is displayed in the LD extension (part B).
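For readers unfamiliar with the two measures, the sketch below computes |D'| and r2 from the four haplotype frequencies of a pair of biallelic SNPs, using the standard textbook definitions. It is not taken from WGAViewer, which displays precomputed values; the function name `ld_measures` and the argument names are hypothetical.

```python
def ld_measures(f11, f12, f21, f22):
    """Return (|D'|, r2) for two biallelic loci.

    f11, f12, f21, f22 are the frequencies of haplotypes AB, Ab, aB, ab
    and must sum to 1.
    """
    if abs(f11 + f12 + f21 + f22 - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_a = f11 + f12  # frequency of allele A at the first locus
    p_b = f11 + f21  # frequency of allele B at the second locus
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = f11 - p_a * p_b  # raw disequilibrium coefficient D
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d == 0:
        return 0.0, r2
    # D is scaled by its largest possible magnitude given the allele frequencies
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return min(abs(d) / d_max, 1.0), r2
```

The two measures can disagree: with haplotype frequencies 0.6, 0, 0.2, 0.2, |D'| is 1 (one haplotype is absent) while r2 is only 0.375, which is why r2 is the usual choice when asking whether one SNP tags another.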

H.4 Save image: This function is similar to the one described in gene view (see: section 3.3.2.b, H4, Figure 3.3.2.b-6), except that it saves the SNP view as an image file.