We hope everything you need to use this tool is self-explanatory or covered in the manual. However, in case we missed something, we trust this FAQ section can make your SNPchiMp experience a little bit easier.
- How do I cite this tool?
- Why that name: SNPchiMp? It seems like a tool for monkeys!
- Why are there no custom SNP chips included in this tool?
- Why do I get repeated SNPs in the SNP chip data I downloaded?
- What is the meaning of all the fields I get when I download a SNPchiMp file?
- Why do I get some SNPs located in the "99th" chromosome?
- Why is all the ss information "NULL" in the bovine Affymetrix SNP chip?
- Is it possible to access the database information directly using my own pipeline (i.e. without having to go through the step-by-step process)?
If you find this tool useful for your research, please cite:
Nicolazzi E.L., Caprera A., Nazzicari N., Cozzi P., Strozzi F., Lawley C., Pirani A., Soans C., Brew F., Jorjani H., Evans G., Simpson B., Tosser-Klopp G., Brauning R., Williams J.L., Stella A. (2015). SNPchiMp v.3: integrating and standardizing single nucleotide polymorphism data for livestock species. BMC Genomics, 16:283.
Nicolazzi E.L., Picciolini M., Strozzi F., Schnabel R.D., Lawley C., Pirani A., Brew F. and Stella A. (2014). SNPchiMp: A database to disentangle the SNPchip jungle in bovine livestock. BMC Genomics, 15:123.
Many people seem to be curious about the name we chose. To tell you the truth, the name comes from the inner child of the lead author, and can be summarized like this: the tool allows you to "jump" among different SNP chip versions, like a chimp in the SNP chip jungle... hence SNPchiMp. Something like "SNPchi(m)p" would probably have been more appropriate, but SNPchiMp was the original name, and we stuck with it.
This is because only non-proprietary datasets are available. We are open to including any other SNP chip (custom or not), as long as the information contained in the tool is freely available to the animal genetics community.
We only include a SNP chip if the original information comes from the (custom or commercial) producer. In some cases (mainly in custom chips), some SNPs are genotyped twice or more. Although storing exactly the same information twice is not a smart use of computer/database memory, we decided to keep the original information we received as a "gold standard", avoiding as far as possible subjective decisions on the inclusion or exclusion of data. Repeated information can also occur when the SNP is in our "cross-reference" file, which stores most of the different names each SNP might have received along its SNP chip history.
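If the repeated rows get in the way of a downstream analysis, they can be collapsed after download. The following is only a minimal sketch: the tab delimiter, the sample rows and the helper name `drop_exact_repeats` are assumptions made for illustration, not part of SNPchiMp. It keeps the first occurrence of each fully identical row and leaves everything else untouched.

```python
import csv
import io

# Made-up sample rows; the tab delimiter is an assumption about the download format.
SAMPLE = (
    "chip_name\tSNP_name\tchromosome\tposition\n"
    "chipA\tsnp_one\t1\t1000\n"
    "chipA\tsnp_two\t2\t2000\n"
    "chipA\tsnp_one\t1\t1000\n"
)


def drop_exact_repeats(rows):
    """Keep the first occurrence of each fully identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        # A row counts as repeated only if every field matches.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
unique_rows = drop_exact_repeats(rows)
print(len(rows), "rows read,", len(unique_rows), "kept")
```

Rows that differ in any field (for example, the same SNP listed under a cross-referenced name) are deliberately kept, since deciding which name to prefer is a choice for the user.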
Here's an explanation for all the header names you might get using this tool (depending on the download options you chose):
- chip_name: Short name for the SNP chip (displayed by default)
- rs: RefSNP or reference SNP id (from dbSNP; displayed by default). It’s defined as: "A reference SNP ID number, or “rs” ID, is an identification tag assigned by NCBI to a group (or cluster) of SNPs that map to an identical location."
- ss: Submitted SNP (from dbSNP). It’s defined as: The NCBI assay ID number or 'ss' ID number is simply a unique identifier in a standardized format that is assigned by NCBI to submitted SNPs.
- alleles: Allelic variation of the SNP. This information is linked to the ss information (from dbSNP)
- orient: Orientation (from dbSNP; Forward/Reverse). This is linked to ss (and assembly) information and is subject to change over time due to assembly updates.
- strand: Illumina strand (from dbSNP; Top/Bottom). This is dependent on the sequence of the SNP, linked to ss information and is NOT subject to change over time!
- sender: Organization sending the data to dbSNP (from dbSNP)
- SNPname_sender: SNP name in dbSNP (this is the original name sent by the sender organization).
- ITB_index: Interbull official exchange index (from Interbull).
- Alleles_A_B_FORWARD: Illumina allele coding for FORWARD strand (for alleles A and B, respectively), extracted from iManifests
- Alleles_A_B_TOP: Illumina allele coding for TOP strand (for alleles A and B, respectively), extracted from iManifests
- Alleles_A_B_Affymetrix: Affymetrix allele coding. Affymetrix alleles are FORWARD by default.
- chromosome: Chromosome number in the desired assembly. When Native Assembly is chosen this information is extracted from original files from Illumina, Geneseek or Affymetrix. When other assemblies are chosen, this is the map information linked to the rs ID of the SNP (obtained from dbSNP)
- position: Position (in bp) in the chosen assembly. When Native Assembly is chosen this information is extracted from original files from Illumina, Geneseek or Affymetrix. When other assemblies are chosen, this is the map information linked to the rs ID of the SNP (obtained from dbSNP).
- SNP_name: Commercial SNP name. This field contains the official Illumina, Geneseek and Affymetrix SNP names. Cross-references included.
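For those using the downloaded file in their own scripts, the header names above can serve directly as record keys. The sketch below is an illustration under stated assumptions: it assumes a tab-delimited file with a header row and that missing values appear as the literal text NULL; the sample rows and the function name `read_download` are invented for the example. Which columns are present depends on the download options chosen.

```python
import csv
import io

# Made-up sample using a few of the header names listed above.
SAMPLE = (
    "chip_name\trs\tss\tchromosome\tposition\tSNP_name\n"
    "chipA\trs0000001\tss0000001\t1\t1000\tsnp_one\n"
    "chipB\trs0000002\tNULL\t2\t2000\tsnp_two\n"
)


def read_download(handle, delimiter="\t"):
    """Yield one dict per SNP row, mapping header name to value.

    The literal text 'NULL' is converted to None (an assumption about
    how missing values are written in the file).
    """
    for row in csv.DictReader(handle, delimiter=delimiter):
        record = {key: (None if value == "NULL" else value)
                  for key, value in row.items()}
        # position is given in bp, so store it as an integer when present.
        if record.get("position") is not None:
            record["position"] = int(record["position"])
        yield record


records = list(read_download(io.StringIO(SAMPLE)))
for rec in records:
    print(rec["SNP_name"], rec["rs"], rec["ss"], rec["chromosome"], rec["position"])
```

The chromosome value is kept as text because it is used as a label rather than a quantity.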
A “dummy” set of coordinates (chromosome 99 and position 0) was assigned to those loci for which there was no association between the commercial SNP ID and an rs ID, and to those SNPs whose rs IDs were not mapped in the assembly considered. You can consider these as additional "chromosome 0" SNPs. We used this coding to differentiate them from the "original" SNPs in chromosome 0 (i.e. those in the original files from Illumina, Geneseek and Affymetrix).
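When filtering a download, the two codes can be handled separately or together. This is a small sketch of one way to do it; the function name `location_status`, the status labels and the sample rows are hypothetical, and only the meaning of chromosomes 99 and 0 comes from the explanation above.

```python
DUMMY_CHROMOSOME = "99"    # dummy coordinates assigned by the tool
UNPLACED_CHROMOSOME = "0"  # chromosome 0 as found in the original manufacturer files


def location_status(chromosome):
    """Classify a chromosome value as 'dummy', 'unplaced' or 'mapped'."""
    chrom = str(chromosome).strip()
    if chrom == DUMMY_CHROMOSOME:
        return "dummy"
    if chrom == UNPLACED_CHROMOSOME:
        return "unplaced"
    return "mapped"


# Made-up rows for illustration only.
rows = [
    {"SNP_name": "snp_one", "chromosome": "1", "position": "1000"},
    {"SNP_name": "snp_two", "chromosome": "0", "position": "0"},
    {"SNP_name": "snp_three", "chromosome": "99", "position": "0"},
]

mapped = [r for r in rows if location_status(r["chromosome"]) == "mapped"]
print([r["SNP_name"] for r in mapped])
```

Keeping the two labels distinct makes it possible to tell whether a SNP lacks a position in the manufacturer's own file or only in the assembly selected in the tool.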
The ss information you download from this tool was originally used to find the link between the commercial SNP ID and the rs ID, by using the ss name. For the Affymetrix SNP chip, we used the rs IDs already provided by the company, so no rs-ss matching was done. For this reason, none of the Affymetrix SNPs are linked to any ss ID.
Yes, it is now possible. There is a specific syntax to do that. However, providing all the options is not feasible in this web page, so please contact ezequiel [dot] nicolazzi [at] ptp [dot] it for further information. We will be happy to help you!