Structure of this guide; questions to be answered
We hope that this guide will not only offer practical, easy-to-follow advice on how to use this software, but also provide a platform on which we can share our own thoughts and experiences of trying to interpret the statistical results produced by a WGA project. Therefore, instead of listing the functions of each menu item, button, and interactive graphical component one by one, we have constructed this guide as a tour through a real WGA annotation process recently completed in our own group (Fellay et al. 2007).
The following are examples of questions that came up during this study and that led to specific features now implemented in WGAViewer:
- What are the top hits and their P values?
- Are these top hits located in or near any gene?
- If they are located in a gene, what type of SNPs are they? Are they of known function?
- If they are neither non-synonymous coding SNPs nor located in a known splice site, how far are they from the closest exon?
- If they are not in a known gene, how far are they from the closest known gene?
- What exactly is the genic context for each hit? What are the surrounding genes?
- Is there any evidence for evolutionary conservation/selection of the surrounding region?
- What are the P values of the surrounding SNPs?
- What is the LD context among these SNPs?
- How far does the LD extend for each hit? Does this LD extension cover other genes?
- Are there (perhaps ungenotyped) proxies for the associated SNP that are in a more interesting genomic context?
- Do these hits or their proxies show any association with available functional data, for example, gene expression levels?
- After all of this, is there a convenient way to annotate these hits automatically and in batch? Is there a way to automatically filter their proxies by function?
- There are many candidate gene studies published on the same or related phenotypes. What are the P values for SNPs in and around these associated genes in our WGA project? Can we replicate previous findings?
- Can we replicate previous associations of particular SNPs?
- If the previously associated SNPs have not been included in our WGA project, are there any correlated proxies or tags for these candidate SNPs? What are their P values?
- Is there evidence of population stratification effects?
- We have association data for replication cohorts. There are also cohorts with related but not identical phenotypes. Is there a way to compare between multiple datasets easily?
- We have genome-wide HWE test results. We have effect size, effect direction, etc. Is there a way to list these supporting/QC data alongside our association findings?
- I don't have a WGA data set, but I still want to annotate a SNP in this way. I also want to test LD among a list of SNPs, and to test SNP-gene expression associations. Does WGAViewer offer any convenient bioinformatic tools for this?
We hope this structure will make this user guide more useful. In the rest of the guide, rather than going through buttons and menus, we will mainly work through these questions using real data.