Your data security and privacy are important: the reputation of genetic genealogy (and the future discoveries we can make) relies on us ethically treating your data in a way you are happy with. The data agreement you will be entering into by participating in this project can be found on the data upload page. The information below is intended as a guide to help you make an informed decision about whether to submit your data, and what options you have to safeguard your privacy. It also describes our standard procedures, and the logic behind the ethical decisions we have made regarding your data privacy. It is important that you understand these before submitting your raw data, or submitting raw data on someone else's behalf.
At the end of the day, the project administrators are known, named individuals. We have chosen to make public our full Y-DNA genetic data. We have put it in the public domain because we consider the risks associated with this to be very small, and the return on the understanding of our origins to be much greater. We hope, having understood the risks and benefits, that you will feel the same.
Disclaimer: The information here does not represent legal or medical advice. It has had no approval by any ethical, medical or legal entity. It is prepared in good faith for informative purposes, and is complete only within the bounds of conciseness and of our personal knowledge.
Data processing for this project will take place in the United Kingdom and United States. However, the centralised data storage means the data will be available to a select few other administrators (currently [11 Mar 2018] US-based James Kane and Alex Williamson). Additional administrators, who will have access to raw data, may join in future from other countries. This list may be updated without notification.
Submitters give consent via the data submission form for anonymised, processed data to be sent to third parties. These are selected by the project administrators, with the expectation (and, depending on circumstances, legal requirement) that these are limited to named people with a genuine research interest.
UK and EU law: The UK Data Protection Bill (DPB) and EU General Data Protection Regulation (GDPR) are among the laws covering the treatment of personal and genetic material within the United Kingdom and European Union, respectively, and the latter will continue to be part of UK law following the UK's exit from the EU.
UK and EU users should be aware that, because the Haplogroup R Data Warehouse is run by James Kane, they are sending data for storage in the United States. This archive is open to other researchers (currently [11 Apr 2018] only Alex Williamson and Jef Treece) who are based in the US: EU/UK laws may not apply (or may not apply in full) to data that the user personally exports outside the EU. While we aim to abide by the full principles and guidance of the GDPR and DPB, as "hobby" projects (i.e. not necessarily under the "large scale" definition of the GDPR), our liability under the GDPR, the DPB, and the associated US implementations via the Privacy Shield Programme, may be limited.
US law: In the US, laws may be applicable on either the national or state level (see a lay summary on privacy law and genetic data). The US has a less well-developed body of legislation guaranteeing personal privacy than the UK or EU: fewer than half of states even require written consent to disclose genetic information.
Note that some US law-enforcement agencies are using genetic data from genealogical databases (e.g. GedMatch) to trace victims or perpetrators of crime. It is unlikely that our database would be used for such a purpose, but we may theoretically, under certain circumstances, be requested to co-operate with law enforcement agencies in any country on such grounds. Such requests will be considered individually, with due consideration to the balance of users' privacy, national and international law, and the rights of (and due respect for) victims of crime.
In the event of a problem: Regardless of the legal situation, if you feel your data security has been (or may be) compromised by this project, we would welcome correspondence. Security issues regarding data storage in our Warehouse may be addressed to James Kane; issues regarding data processing and release may be addressed to Iain McDonald. Data can be edited or fully removed from our storage and analysis facilities and, on receipt of a request to do so, we will aim to do this promptly.
If you are uploading genetic data that is not yours, as a kit holder you must take responsibility for obtaining informed consent from the person whose DNA is being tested, unless that person is deceased. Informed consent can only be given by persons above the age of legal consent and with mental capacity to manage their own affairs. (See EU General Data Protection Regulation (GDPR) Article 7, Article 9 Section 2(b); UK Data Protection Bill (DPB) Article 84(2), among others.)
Furthermore, testers must take some personal responsibility for the secrets in their own family. This may include situations where revealing their own identity could impact on the lives of others in their family and beyond, e.g. through adoptions, infidelity, etc. Please be considerate of others when publicising the results of your own DNA.
Aside from this, you should be aware that any copy of data placed on an internet site may be subject to malicious attacks, including hacking. Safeguards are meant to prevent this from happening, but they can never make a site completely secure. There is therefore a small but unavoidable risk that data may be stolen.
Testers with other companies (e.g. YSeq WGS and FGC YElite/WGS) will have similar sets of results. Whole Genome Sequencing (WGS) tests will contain autosomal, mitochondrial and (depending on the test) exome DNA results. These are not relevant to us, so we remove them from the analysis and do not make them public.
Specific infertility-related issues can be identified easily in tests, such as a deletion of DYS464. Others are more subtle. Factors like genetic stability or mosaic loss of Y may have affected the acquisition of results, but will not normally be apparent in the results. Hence, while we cannot provide absolute certainty, current research implies most people's submitted Y-DNA will have negligible relevant medical use.
This balances the legal and ethical need to anonymise data (see Can this data be used to identify me?), with the need for users to identify themselves and genealogically relevant matches in the data. A contact e-mail address is collected by the Warehouse administrator for the purposes of data administration (see privacy policy), and will only be made available to administrators, except with the uploader's explicit consent.
Our reason for using your kit number is that it allows your uploaded data to be connected back to your ancestral information and STR profile at Family Tree DNA (we also ask you to upload your STR data if you are happy to). This provides an independent check for all administrators and users that the data being uploaded are assigned correctly to an individual. This minimises errors and ensures that administrators of genetic projects know which of their members have uploaded. The MDKA surname provides an additional check (e.g. to make sure you haven't made a typo in the kit number anywhere). It is also a useful indicator of close relationships: the origins of surnames can be traced using matches within a surname group. Geographical information about your MDKA is used to map historical migrations in a statistical sense. All of this data can be anonymised (see Can my information be included more anonymously?), but we strongly encourage you to include as much of it as possible.
We ask that your kit is represented by an ancestral surname and your assigned kit number (see also Can my information be included more anonymously?). This information alone cannot identify you as an individual, but could be used to identify you if you create a paper trail leading back to it. If you are worried about being identified personally, you should take normal online data precautions to avoid creating a paper trail that leads back to personal information. An example of a paper trail might be posting your kit number or a family tree online, under a username that you use on social media, which can then connect you back to an organisation or institution. If you remain concerned by this, you can anonymise some of your information when it is input.
The likelihood that your genetic information could identify you as an individual is remote. Depending on the commonness of your Y-DNA haplogroup, the genetic information you supply could be linked to a branch of your family (e.g. as a descendant of a specific person): indeed, that is a specific goal of this project. However, you can only be securely identified as a specific person from this data if every other male line within the last handful of generations can be ruled out by death or direct DNA testing.
Note that your kit number is linked to your account at your testing company. Anyone with access to that account can readily identify personal information associated with you, which may include your name, e-mail address and physical address. For Family Tree DNA, normally that is restricted to Family Tree DNA (Gene by Gene) staff and volunteer administrators, who are bound by Family Tree DNA's privacy policy.
Currently, there is normally negligible risk in associating your Y-DNA genetic data with you as a person, to the extent that the administrators of this project are happy to have that information made public, and we encourage you to take that as a guide. While different circumstances may apply to your own situation, we encourage users to learn about what information they are sharing and the relatively small risk they are exposing themselves to. The information you reveal here and elsewhere is your choice, and you retain the right for your submitted personal and genetic data to be removed entirely from the Warehouse and its subsidiaries.