Genetic genealogy is an area of biological research where individuals working collectively can make a significant contribution to our understanding of historical and present-day human populations.
Recently a collaborative network largely composed of independent researchers used freely available whole genome sequence data to rewrite the phylogenetic tree of Y chromosome Haplogroup R1b1a2, the most common Y chromosome variant in Western Europe [Rocca et al. (2012). Discovery of Western European R1b1a2 Y Chromosome Variants in 1000 Genomes Project Data: An Online Community Approach. PLoS ONE, 7(7), e41634].
Now, some of the same researchers have been granted access to a new collection of data. The aim of this project is to follow up with a much larger and more in-depth analysis of R1b1a2 using a new dataset of 1,000 male whole genomes from the British Isles, containing an estimated 700 R1b1a2 samples. Our original phylogeny was based on 135 samples, so this approximately 5-fold increase will allow the construction of a far more fine-grained phylogeny of R1b1a2 variants present in the British Isles population. This is participatory research on many levels, from the individuals whose genomes were sequenced, to the citizen scientists analysing their data, and the many individuals whose genetic genealogy research will benefit from the newly discovered variants.
Why this matters
Our research is important because alternative approaches, such as small-scale sequencing of targeted regions of the Y chromosome, are not powerful enough, and traditional academic studies are unable to keep up with the pace of new variant discovery in the genetic genealogy community.
The principal goals of our project are new variant discovery and validation. Discovery takes place through computational analysis of whole genome data. The Wellcome Trust (UK) have granted us access to new whole genome data, and we are currently extracting and analysing the Y chromosome variants from the first release of data.
This initial release of 310 male genomes allowed us to conduct pilot investigations of R1b1a2 that confirmed the richness of the resource. As an example, analysis of the DF49 branch of R1b1a2 identified 16 DF49+ men, and comparison of their Y chromosomes revealed over 50 new candidate variants that define 8 new branches and therefore refine the current DF49 phylogeny dramatically. Importantly, our analysis identifies new non-private genetic variants defining branches below M222.
Using DF49 as a yardstick, we expect to be able to generate large numbers of candidate variants across the spectrum of R1b1a2 found in the British Isles (principally the branches under U106 and P312), and then to select phylogenetically useful variants for validation by Sanger sequencing. Haplogroup R1b1a2 is carried by about 70% of men in the British Isles, making it by far the most common haplogroup there.
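The idea of separating non-private, branch-defining candidates from private variants can be illustrated with a minimal Python sketch. It is not our actual pipeline: the sample labels, the variant names and the `candidate_branches` helper are hypothetical, and it assumes every sample has a complete and reliable call at every site, which real whole genome data does not guarantee.

```python
from collections import defaultdict


def candidate_branches(calls):
    """Group derived variants by the exact set of samples that carry them.

    `calls` maps a sample label to the set of derived variants seen in it
    (hypothetical input format). Variants carried by a single sample are
    private, and variants carried by every sample sit on the parent branch,
    so both are skipped.
    """
    carriers = defaultdict(set)
    for sample, variants in calls.items():
        for variant in variants:
            carriers[variant].add(sample)

    branches = defaultdict(list)
    for variant, samples in carriers.items():
        if 2 <= len(samples) < len(calls):
            branches[frozenset(samples)].append(variant)

    return {group: sorted(names) for group, names in branches.items()}


# Toy data: four hypothetical samples and five hypothetical variants.
calls = {
    "s1": {"v1", "v2", "v5"},
    "s2": {"v1", "v2"},
    "s3": {"v1", "v3", "v4"},
    "s4": {"v1", "v3"},
}
branches = candidate_branches(calls)
for group, names in branches.items():
    print(sorted(group), names)
```

In this toy example v1 is shared by all four samples and so belongs to the parent branch, v4 and v5 are private, and v2 and v3 each suggest a candidate sub-branch. Any candidate found this way would still need laboratory validation.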
What your money can do
This is our first call for funding. We need your support to purchase the reagents and consumables required to develop new primer pairs and to validate novel genetic variants by DNA sequencing. Our costs are relatively low: we do not recover costs for analysing the genome data, but we do need to purchase some specific reagents and perform DNA sequencing to validate the new genetic variants.
In the currency of genetic genealogy, $39 buys one test for one individual. If all regular R1b1a2 SNP testers donated $39, it would financially underwrite a large-scale rewriting of the phylogeny. In principle, $5,000 would allow us to progress at least 20 new genetic markers (i.e. 20 new branches of the tree), and this will form the basis of our next publication. Any additional funds beyond this will allow us to develop even more genetic markers.
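The arithmetic behind these figures can be checked with a short Python sketch. The function names are illustrative only, and the calculation assumes the whole $5,000 goes towards marker validation.

```python
import math


def cost_per_marker(budget_usd, markers):
    """Average cost in dollars of one validated marker."""
    return budget_usd / markers


def donations_needed(budget_usd, donation_usd):
    """Smallest number of equal donations that covers the budget."""
    return math.ceil(budget_usd / donation_usd)


print(cost_per_marker(5000, 20))   # cost if exactly 20 markers are validated
print(donations_needed(5000, 39))  # number of $39 donations to reach $5,000
```

With exactly 20 markers the average is $250 each, so validating more than 20 markers on the same budget brings the cost per marker below that figure.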
The primary objective is to validate new Y chromosome markers and help rewrite the Y chromosome phylogenetic tree for R1b1a2. We will initially publish the new phylogeny in online peer-reviewed journals. Importantly for genetic genealogy enthusiasts, we will make the results available prior to publication at http://arxiv.org, avoiding the long delay between submission of a manuscript and its eventual publication (some 4-6 months on average). As soon as the manuscript is posted at arxiv.org, details of the markers, including PCR primers, will be available, thereby opening the door to commercial DNA testing for interested individuals.
Your donation will allow us to confirm more than 20 new branch-defining SNPs in Haplogroup R1b1a2. Funding the project in this way offers the opportunity to discover at least as many new markers as our previous publication of the DF and Z series of markers in 2012, at a total cost of under $250 per validated marker.