
Torsten Günther et al. 2018 - Population genomics of Mesolithic Scandinavia



artemv
09-30-2019, 02:58 PM
Population genomics of Mesolithic Scandinavia: Investigating early postglacial migration routes and high-latitude adaptation (PLOS Biology)

https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2003703
Abstract
Scandinavia was one of the last geographic areas in Europe to become habitable for humans after the Last Glacial Maximum (LGM). However, the routes and genetic composition of these postglacial migrants remain unclear. We sequenced the genomes, up to 57× coverage, of seven hunter-gatherers excavated across Scandinavia and dated from 9,500–6,000 years before present (BP). Surprisingly, among the Scandinavian Mesolithic individuals, the genetic data display an east–west genetic gradient that opposes the pattern seen in other parts of Mesolithic Europe. Our results suggest two different early postglacial migrations into Scandinavia: initially from the south, and later, from the northeast. The latter followed the ice-free Norwegian north Atlantic coast, along which novel and advanced pressure-blade stone-tool techniques may have spread. These two groups met and mixed in Scandinavia, creating a genetically diverse population, which shows patterns of genetic adaptation to high latitude environments. These potential adaptations include high frequencies of low pigmentation variants and a gene region associated with physical performance, which shows strong continuity into modern-day northern Europeans.

Main discussion topic:
https://anthrogenica.com/showthread.php?11361-Genomics-of-Mesolithic-Scandinavia-reveal-colonization-routes-and-high-latitude-adapt&highlight=Mesolithic+Scandinavia

A separate thread is dedicated to the Steigen individual:
https://anthrogenica.com/showthread.php?16500-North-Norwegian-Hunter-Gatherer-on-GEDmatch



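ID      | Age (cal BP) | Sex | mt    | Y (original) | Contamination (based on X) | Contamination (based on autosomes) | Y (detailed)
--------|--------------|-----|-------|--------------|----------------------------|------------------------------------|-------------
Hum1    | 9452-9275    | XX  | U5a1  |              | NA                         | 0,00%                              |
HUM2    | 9452-9275    | XY  | U5a1d | I2-M438      | 0,63%                      | 0,73%                              | pre I2-S6635
Steigen | 5950-5764    | XY  | U5a1d | I2-M423      | 0,4%                       | 0,0%                               | I2-M423*
SF9     | 9300-8988    | XX  | U4a2  |              | NA                         | 0,0%                               |
SF11    | 9023-8760    | XY  | U5a1  | NA           | NA                         | 10,16%                             | pre-I1?
SF12    | 9033-8757    | XX  | U4a1  |              | NA                         | 0,93%                              |
SBj     | 8963-8579    | XY  | U4a1  | I2-L68       | 1,4%                       | 0,06%                              | I2-L68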



Brief results: 2 out of 4 Y chromosomes likely belong to ancient, long-extinct branches.
I have checked the *.bam files myself, using the 2019 version of the ISOGG tree (several months old, though).

Most samples have high coverage, so I will not post the full lists of upstream positives and downstream negatives here (even under a spoiler), as that would make the post too long.

HUM2
Of the S6635 synonyms, he is positive for 25 and negative for 2. Those 2 negatives could simply be problems with the data, but I decided to call him pre-S6635, as if both negatives were correct.
More important than these two negative SNPs is that he is negative for both known downstream branches: I-PF3885 and I-Y5334.

He lived 9452-9275 ybp, while his terminal Y-chromosome branch has a TMRCA of 10100 ybp.
The closest surviving branch is at least 600 years from him, and likely a bit more, since he is probably pre-S6635 rather than fully S6635.

S6635 synonyms
Positive:
PF6915
PF3881
PF3892
PF3904
PF3914
PF3924
PF3930
PF3933
S6635
S6648
S6661
S6685
S6701
S6705
S6711
S10911
S15422
S17911.2
S23479
S23817
FGC18596
FGC18630
PF3926
PF3928
S6724

Negative:
S6716 (4 reads)
FGC18615 (6 reads, five of them at the beginning/end of a read)

I-PF3885 synonyms:
All negatives:
BY431
A10439
PF3885
PF3905
PF3913
PF3916
PF3920
PF3921
S18870
S23632
Z45426
Z45427
Z45429
PF6909
Z45428
S10292
S12457
S18874

I-Y5334 synonyms:
All negatives:
FGC18124
FGC18541
FGC18587
FGC18594
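
The reasoning above can be sketched as a simple tally per branch. This is only an illustration: the counts are taken from the lists above, while the function name and the status labels are my own and not part of any tool. A "mixed" result does not decide between a real pre-branch position and data errors; it only flags the branch for a closer look.

```python
def branch_status(derived, ancestral):
    """Summarise the calls for one branch (hypothetical labels)."""
    if derived + ancestral == 0:
        return "no calls"
    if ancestral == 0:
        return "derived"
    if derived == 0:
        return "ancestral"
    return "mixed"  # either a pre-branch position or noisy data


# (derived, ancestral) counts for HUM2, taken from the lists above
hum2_calls = {
    "S6635": (25, 2),
    "PF3885": (0, 18),
    "Y5334": (0, 4),
}

status = {branch: branch_status(*counts) for branch, counts in hum2_calls.items()}
for branch, result in status.items():
    print(branch, result)
```

With S6635 mixed and both downstream branches fully ancestral, calling the sample pre-S6635 is the cautious reading.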


Steigen

If we use the YFull tree, we see that he is positive for all the M423 synonyms but negative for both the Y3104 and Y24694 branches.
The Y24694 branch is not in ISOGG 2019 yet, so I had to copy the SNPs from YFull manually to make this check. He is ancestral for all 10 Y24694 SNPs where he has calls.

So this hunter-gatherer is I2a1a2* (ISOGG 2019). He lived 6,000–5,800 cal BP, while his terminal Y-chromosome branch has a TMRCA of 13900 ybp.
This means his Y-chromosome branch is about 8000 years away from the closest surviving branch.


M423 synonyms (YFull)
Positive:
M423
AM01256
AM01270
AM01282
AM01284
AM01290
AM01291
FGC7074
FGC7084
S11650
Y3114
Z2601
Z2575
AM01259
AM01264
AM01268
AM01271
AM01272
AM01286
AM01296

Y3104 synonyms:
Negative:
AM01253
AM01258
AM01266
AM01289
AM01292
FGC7066
FGC7067
FGC8114
S7721
S12378
AM01273
AM01283
AM01299
Z2587

Y24694 synonyms:
All negatives:
Y130032
Y128076
Y128343
Y129078
Y128442
Y151128
Y136309
Y128271
Y32698
Y80035


SF11
This sample seems to be a good candidate for an ancestor of modern Scandinavian I1.
But it has a high estimated contamination rate, so he might be positive for so many I1 SNPs because of contaminating modern DNA.

Hopefully we will get more pre-I1 samples from other parts of Europe. Then we will be able to compare the positive and negative SNPs across different pre-I1 samples.

Positive:
CTS1748
CTS6221
CTS6629
FGC2433
YSC0000301
Z2726
Z2731
CTS5887
FGC7747

Negative:
CTS3268
CTS6547
CTS9258
FGC2425
L121
L509

SBj
He is positive for haplogroup I2a (ISOGG 2019); on YFull those SNPs are still listed as synonyms at the root of I2.
He is also negative for the important known downstream branches: P37, M436 and L596 (no calls for L460).

L460 has a TMRCA of 20800 ybp, and this hunter-gatherer lived 8963-8579 ybp. So his Y-chromosome branch is at least 11,837 years away from the closest known surviving branch, and likely even more, as he could be negative for the L460 SNPs.
Given a few negatives, I could even suggest this is some pre-I2, but he is positive for all SNPs marked as I2a, so I will assume these 3 cases are just errors.
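
The age gaps quoted in this post for HUM2, Steigen and SBj all come from the same subtraction, so here is a minimal sketch of it. The TMRCA values and age ranges are the ones given above (for Steigen I use the calibrated range from the table); the function name is my own. The smaller number uses the oldest bound of the age range and is the "at least" figure.

```python
def gap_to_surviving_branch(tmrca_ybp, age_range_cal_bp):
    """Return (min_gap, max_gap) in years between the sample and the branch TMRCA."""
    oldest, youngest = age_range_cal_bp
    return tmrca_ybp - oldest, tmrca_ybp - youngest


# sample: (TMRCA of the relevant branch in ybp, (oldest, youngest) age in cal BP)
samples = {
    "HUM2": (10100, (9452, 9275)),
    "Steigen": (13900, (5950, 5764)),
    "SBj": (20800, (8963, 8579)),
}

gaps = {name: gap_to_surviving_branch(tmrca, age) for name, (tmrca, age) in samples.items()}
for name, (min_gap, max_gap) in gaps.items():
    print(f"{name}: {min_gap}-{max_gap} years")
```

These are lower bounds on the separation: the sample's line may have split off well before the TMRCA of the surviving branches.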


I2/I2a synonyms:
Positives:
CTS4568
CTS6096
CTS10247
L68
PF3644
PF3664.1
PF3669
PF3812
CTS11806
PF3810
PF3850
CTS1799
CTS2257
CTS3529
CTS4047

Undecided:
CTS6479 (1 read for ancestral, 1 for derived)

Negatives:
CTS7965 (8 reads)
PF3648 (2 reads)
PF3650 (1 read for derived, 5 reads for ancestral)

P37 synonyms:
Negative:
CTS6825 (24 reads)

Positive:
CTS1279 (1 read; this is a C->T mutation, so we should not worry that he could be pre-P37)
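
The CTS1279 call is discounted because C->T (and its reverse-strand counterpart G->A) is the typical signature of deamination damage in ancient DNA. A rough sketch of that rule of thumb follows; the function names and the `min_reads` threshold are hypothetical choices of mine, not a standard filter.

```python
# Substitutions that post-mortem deamination typically produces
DAMAGE_PRONE = {("C", "T"), ("G", "A")}


def is_damage_prone(ancestral, derived):
    """True if the ancestral->derived change looks like deamination damage."""
    return (ancestral.upper(), derived.upper()) in DAMAGE_PRONE


def weak_derived_call(ancestral, derived, derived_reads, min_reads=2):
    """Flag a derived call that damage alone could explain (hypothetical threshold)."""
    return is_damage_prone(ancestral, derived) and derived_reads < min_reads


# CTS1279 in SBj: C->T supported by a single read
cts1279_is_weak = weak_derived_call("C", "T", derived_reads=1)
print(cts1279_is_weak)
```

A flagged call is not necessarily wrong; it just should not outweigh a well-covered negative such as CTS6825 with 24 reads.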

M436 synonyms:
All negative:
CTS4314
CTS5017
CTS8302
CTS11311
FGC3524
FGC3526
FGC3530
FGC3537
FGC3538
P217
P218
PF6895

L596 synonyms:
All negative
L597
FGC18092
FGC18599
M3935
PF3906
PF3911
PF3931
FGC18542
PF3907

deadly77
09-30-2019, 04:35 PM
Regarding SF11, I wouldn't say there's enough strong evidence to say this is a good candidate for an ancestor of modern Scandinavian I1. I think it's more likely that SF11 is representative of an extinct lineage that didn't result in the modern I1, but they would share a common ancestor, although the available data really isn't conclusive enough so could be either. I'd add that SF11 has several derived SNPs that are associated with other haplogroups and not found in modern I1 individuals, but as you say some of these might be due to contamination and some might be due to DNA damage such as deamination. All of the derived calls are one-read SNPs so could be false positives.

I'm using the YFull tree as reference for the 312 SNPs that they use to define haplogroup I1 - I had a few differences in the I1 SNP calls for SF11 - most were the same but I also had a derived read for Z2683 (1T) and ancestral reads for Y1831/FGC2468 (1T), CTS9845 (1A), FGC7871/Y1906 (1G), Z2690 (1G). There are definitely some sections of the genome where the BAM file shows a lot of derived reads close together, close to some of these SNPs, so I'm overall less confident about some of these calls.

We might have a better idea of correlating the ancestral and derived calls with the sample age if there were more reads for the I1 SNPs in SF11's BAM file, which would help lock down whether SF11 was a transition between I and I1 or a brother lineage that didn't make it, but unfortunately SF11 is a no call for more than 290 of the I1 SNPs (>93%), so it's difficult to say with any certainty.

Of the other pre-I1 samples that I'm aware of that have a mixture of derived and ancestral I1 SNPs, there are two in a couple of the recent ancient Iberian papers:

BAL051 from "Survival of Late Pleistocene Hunter-Gatherer Ancestry in the Iberian Peninsula" Villalba-Mouco et al., 2019, Current Biology 29, 1–9; Derived SNPs: Z2699 2C, Z2751 2A, Z2869 1G, Z2885 1T, Z2887 1T, CTS7751 1T, Z2812 1T, Z2860 1A, L124 1C, CTS4532 1A, Z2724 2G, FGC2441 2G, CTS10140 1A. Ancestral SNPs:Z2886 1T, Z2679 1T, Z2727 1C, Z2850 1A, Y1962 4T, P40 1C, Z2747 1A, FGC2422 1C, CTS3506/Z2765 1A, FGC33327 1T, CTS11534 1A, Y1863 1T, L848 2C, FGC2433 1G, Y1950 1C, FGC2427 1C, Y1932 1A, Z2870 1T, S22865 1G, Z2806/CTS6765 1T.

Car1 from "The genomic history of the Iberian Peninsula over the past 8000 years" Olalde et al 2019, Science 363, 1230–1234; Derived I1 SNPs: Z2699 1C, Z2880 3A, Z2729 1T, Z2779 1C, CTS10338 1C, Z2885 1T, V5190/Z2742 1A, CTS11036/Z2863 1T, Z2730 1G, Z2720 1T, Z2887 1T, Z2847 1T, CTS1739 1G, Y1950 CGGA, Z2852 1T, CTS6022 1A, CTS4532 1A, Z2859 2A, Z2695 1T; Ancestral I1 SNPs: L840 1C, FGC7747 1T, Z2718 2C, Z2679 2T, Z2727 3C, CTS6140 1T, Y1872 1G, Z2747 1A, CTS11042 2T, Z2826 1T, Z2858 1G, V5203 2G, CTS11534 1A, Y1863 3T, CTS8394 1G, Z2825 1G, L848 1C, CTS9486 1G, Z2705 1C, Z2790 1G, Y1860 1A, Y1871 1C, FGC2427 1C, L764 1C, L1439 2C 1G (ambiguous, but likely ancestral), Z2870 1T, CTS9487 1G, Z2775 1G, CTS9368 2T, Y1876 1T, Z6116 1A, Z2832 2G, CTS7386 1T, CTS8708 1T, FGC2462 1C - also one SNP that has a mutation at the described position for FGC2445, but to a different nucleotide (1A rather than T).

There are a few more for which there is not enough information to tell. Three samples (RISE175, RISE179, RISE210) from the Allentoft 2015 paper have some derived I1 SNPs, but they are low coverage and >97% of the I1 SNPs are not read, so they could have some that are ancestral, or could be all derived - there's no way to tell. BAB5 from Szécsényi-Nagy 2014 only had a single SNP (M253) relevant to I1 tested, but predates the TMRCA of modern I1, so was probably a mix of derived and ancestral I1 SNPs. But there's no way to tell with the information we have.

Also, check out my Ancient I1 samples map - I include SNP calls from all the ancient I1 samples that I'm aware of. The confirmed pre-I1 (mixture of derived and ancestral I1 SNPs) are on the map in black with skull and crossbones while the uncertain ones are brown with a "?" - link to the map in my signature.

artemv
09-30-2019, 06:25 PM
> Regarding SF11, I wouldn't say there's enough strong evidence to say this is a good candidate for an ancestor of modern Scandinavian I1. I think it's more likely that SF11 is representative of an extinct lineage that didn't result in the modern I1, but they would share a common ancestor, although the available data really isn't conclusive enough so could be either. I'd add that SF11 has several derived SNPs that are associated with other haplogroups and not found in modern I1 individuals, but as you say some of these might be due to contamination and some might be due to DNA damage such as deamination.

Of course, we cannot know for sure whether any given individual is exactly on the branch leading to modern haplogroups or somewhere near it. But given its time and location, we can suppose that this sample is on or somewhere near the branch leading to modern I1.
I usually check files for a large number of SNPs. I've checked SF11 for all the SNPs in the trunk tree and in the I1 and I2 trees, and the results always contain some noise: approximately one in 30-40 SNP calls is usually false.
For example, nucleotides located very close to known or unknown deletions or insertions can end up with a wrong position number.


> All of the derived calls are one-read SNPs so could be false positives.

This is the data we have for this sample - we usually have only 1 read for each position, rarely 2 reads. There are very few cases of more than 2 reads for SF11.
As I've said, I've checked SF11 for all the I1/I2 SNPs in the ISOGG 2019 tree.
And the results do not look heavily contaminated. For example, there are 195 calls for downstream I1 SNPs and 100% of them are ancestral.
For I2 SNPs I have the following stats: 314 calls in total, of which 6 are derived and 4 are "new" (meaning replaced by a different nucleotide).
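
As a quick check, these I2 numbers can be turned into a noise rate. This is only back-of-envelope arithmetic, and it assumes that for a sample on the I1 side every I2 call should be ancestral, so that both the derived and the "new" calls count as unexpected.

```python
# Counts quoted above for SF11
i2_calls_total = 314
i2_derived = 6
i2_new_allele = 4
i1_downstream_calls = 195
i1_downstream_derived = 0

# Assumption: every non-ancestral I2 call is noise (damage, contamination, misalignment)
unexpected = i2_derived + i2_new_allele
noise_rate = unexpected / i2_calls_total
one_in = i2_calls_total / unexpected

print(f"unexpected I2 calls: {unexpected} of {i2_calls_total} ({noise_rate:.1%})")
print(f"roughly 1 in {one_in:.1f} calls")
print(f"derived downstream I1 calls: {i1_downstream_derived} of {i1_downstream_calls}")
```

That works out to about 1 in 31, which sits at the noisy end of the 1 in 30-40 range I usually see.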



> I'm using the YFull tree as reference for the 312 SNPs that they use to define haplogroup I1 - I had a few differences in the I1 SNP calls for SF11 - most were the same but I also had a derived read for Z2683 (1T) and ancestral reads for Y1831/FGC2468 (1T), CTS9845 (1A), FGC7871/Y1906 (1G), Z2690 (1G). There are definitely some sections of the genome where the BAM file shows a lot of derived reads close together, close to some of these SNPs, so I'm overall less confident about some of these calls.

None of these SNPs are in the current ISOGG SNP list.
By the way, how did you download the SNP list from YFull? I could only find definitions there for SNPs whose names start with Y.



> We might have a better idea of correlating the ancestral and derived calls with the sample age if there were more reads for the I1 SNPs in SF11's BAM file, which would help lock down whether SF11 was a transition between I and I1 or a brother lineage that didn't make it, but unfortunately SF11 is a no call for more than 290 of the I1 SNPs (>93%), so it's difficult to say with any certainty.

I meant that we should check which I1 SNPs appeared "earlier" and which appeared "later"; if we get contradicting results, this means we have errors due to damaged aDNA, contamination, etc.


> Of the other pre-I1 samples that I'm aware of that have a mixture of derived and ancestral I1 SNPs, there are two in a couple of the recent ancient Iberian papers:
>
> BAL051 from "Survival of Late Pleistocene Hunter-Gatherer Ancestry in the Iberian Peninsula" Villalba-Mouco et al., 2019, Current Biology 29, 1–9; Derived SNPs: Z2699 2C, Z2751 2A, Z2869 1G, Z2885 1T, Z2887 1T, CTS7751 1T, Z2812 1T, Z2860 1A, L124 1C, CTS4532 1A, Z2724 2G, FGC2441 2G, CTS10140 1A. Ancestral SNPs: Z2886 1T, Z2679 1T, Z2727 1C, Z2850 1A, Y1962 4T, P40 1C, Z2747 1A, FGC2422 1C, CTS3506/Z2765 1A, FGC33327 1T, CTS11534 1A, Y1863 1T, L848 2C, FGC2433 1G, Y1950 1C, FGC2427 1C, Y1932 1A, Z2870 1T, S22865 1G, Z2806/CTS6765 1T.
>
> Car1 from "The genomic history of the Iberian Peninsula over the past 8000 years" Olalde et al 2019, Science 363, 1230–1234; Derived I1 SNPs: Z2699 1C, Z2880 3A, Z2729 1T, Z2779 1C, CTS10338 1C, Z2885 1T, V5190/Z2742 1A, CTS11036/Z2863 1T, Z2730 1G, Z2720 1T, Z2887 1T, Z2847 1T, CTS1739 1G, Y1950 CGGA, Z2852 1T, CTS6022 1A, CTS4532 1A, Z2859 2A, Z2695 1T; Ancestral I1 SNPs: L840 1C, FGC7747 1T, Z2718 2C, Z2679 2T, Z2727 3C, CTS6140 1T, Y1872 1G, Z2747 1A, CTS11042 2T, Z2826 1T, Z2858 1G, V5203 2G, CTS11534 1A, Y1863 3T, CTS8394 1G, Z2825 1G, L848 1C, CTS9486 1G, Z2705 1C, Z2790 1G, Y1860 1A, Y1871 1C, FGC2427 1C, L764 1C, L1439 2C 1G (ambiguous, but likely ancestral), Z2870 1T, CTS9487 1G, Z2775 1G, CTS9368 2T, Y1876 1T, Z6116 1A, Z2832 2G, CTS7386 1T, CTS8708 1T, FGC2462 1C - also one SNP that has a mutation at the described position for FGC2445, but to a different nucleotide (1A rather than T).

I had already checked the BAL051 results beforehand; unfortunately, no SNPs show up in both samples (SF11 and BAL051).
But at least we have FGC7747, which is derived in SF11 and ancestral in Car1.

deadly77
09-30-2019, 10:47 PM
> I had already checked the BAL051 results beforehand; unfortunately, no SNPs show up in both samples (SF11 and BAL051).
> But at least we have FGC7747, which is derived in SF11 and ancestral in Car1.

Both SF11 and BAL051 have a read for FGC2433 - SF11 has derived (1A), BAL051 has ancestral (1G). Although this doesn't contradict earlier/later since BAL051 is older than SF11.
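
This kind of cross-sample comparison is easy to sketch. The dictionaries below hold only a small subset of the calls listed in this thread, and the function names are my own. The ordering check assumes a single lineage with no back mutations, so a SNP that is derived in an older sample should not be ancestral in a younger one.

```python
def shared_calls(a, b):
    """Return {snp: (state_in_a, state_in_b)} for SNPs called in both samples."""
    return {snp: (a[snp], b[snp]) for snp in sorted(a.keys() & b.keys())}


def contradicts_order(older_state, younger_state):
    """On one lineage, derived in the older sample and ancestral in the younger is a conflict."""
    return older_state == "derived" and younger_state == "ancestral"


# Subsets of the calls quoted in this thread (not the full lists)
sf11 = {"FGC2433": "derived", "FGC7747": "derived", "Z2726": "derived", "L121": "ancestral"}
bal051 = {"FGC2433": "ancestral", "Z2699": "derived", "Z2679": "ancestral"}
car1 = {"FGC7747": "ancestral", "Z2699": "derived", "Z2679": "ancestral"}

bal051_vs_sf11 = shared_calls(bal051, sf11)  # BAL051 is the older sample
sf11_vs_car1 = shared_calls(sf11, car1)

print(bal051_vs_sf11)
print(sf11_vs_car1)
print(any(contradicts_order(old, young) for old, young in bal051_vs_sf11.values()))
```

With so few overlapping calls between these low-coverage samples, this can only rule out gross inconsistencies; it cannot confirm that any of them sits on the line leading to modern I1.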