
cov-lineages / pango-designation


Repository for suggesting new lineages that should be added to the current scheme

License: Other

Python 39.44% Shell 3.44% Jupyter Notebook 57.12%

pango-designation's Introduction

Pango designation

The latest maintained Pango-lineage designations.

Suggesting a new lineage

Novel lineages or lineage refinements can be suggested by filing an issue with the respective sequence names, as found on GISAID, and any supporting information such as a phylogenetic tree for the putative lineage.

Full details on how to suggest a new lineage can be found in the Pango lineage designation guide.

Nomenclature rules can be found in the Pango statement of nomenclature rules.

Resources available on this repository

As detailed on the former pango.network website, we host the lineage description list (LDL) and sequence designation list (SDL) in this repository.

Lineage description list: lineage_notes.txt

Sequence designation list: lineages.csv

Alias record: alias_key.json
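
A minimal sketch of how an alias record can be used to expand a short lineage name into its full form. It assumes the record maps a letter prefix to the lineage it abbreviates; the `ALIASES` dictionary and the `expand_lineage` helper are hypothetical, and the three entries are only those implied by issues in this repository (N for B.1.1.33, P for B.1.1.28, R for B.1.1.316). The real alias_key.json may be structured differently.

```python
# Hypothetical excerpt of an alias map, not the contents of alias_key.json.
ALIASES = {
    "N": "B.1.1.33",
    "P": "B.1.1.28",
    "R": "B.1.1.316",
}


def expand_lineage(name: str, aliases: dict = ALIASES) -> str:
    """Replace an alias prefix with the full lineage it stands for."""
    prefix, _, rest = name.partition(".")
    full_prefix = aliases.get(prefix)
    if not full_prefix:
        # Names without a known alias prefix (A, B.1, ...) are returned unchanged.
        return name
    return f"{full_prefix}.{rest}" if rest else full_prefix


print(expand_lineage("R.2"))        # B.1.1.316.2
print(expand_lineage("B.1.243.1"))  # B.1.243.1
```

Keeping the map as plain data means a new alias only requires a new entry, not a code change.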

Other

The lineage_constellations directory contains the mutations associated with each of the AY lineages. These are the mutations that were acquired along the phylogenetic path leading to the common ancestor of the lineage and remain conserved in the lineage (defined here as being present in >70% of sequences designated to the lineage). This path is identified using the UShER phylogenetic tree. These constellations differ from those in constellations in that they attempt to capture all of the associated mutations, not only the defining mutations for a lineage. They are provided for reference and are not currently used by scorpio. However, they should be compatible with scorpio if researchers are interested in exploring them. Sites in the intermediate category drop in and out in the Delta clade, so they may be associated with the lineage but may also be prone to artefacts.
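
The conservation threshold described above can be sketched as follows. This covers only the ">70% of designated sequences" filter, not the identification of the phylogenetic path with UShER. The function name, the input layout (one set of mutation strings per sequence) and the toy data are assumptions made for illustration.

```python
from collections import Counter


def conserved_mutations(sequences, threshold=0.7):
    """Return the mutations present in more than `threshold` of the sequences."""
    if not sequences:
        return set()
    counts = Counter(m for muts in sequences for m in muts)
    return {m for m, n in counts.items() if n / len(sequences) > threshold}


# Toy input: each set holds the mutations called in one designated sequence.
toy = [
    {"S:L452R", "S:P681R", "N:R203M"},
    {"S:L452R", "S:P681R"},
    {"S:L452R", "S:P681R", "ORF7a:V82A"},
    {"S:L452R"},
]
print(sorted(conserved_mutations(toy)))  # ['S:L452R', 'S:P681R']
```

The comparison is strict, so a mutation seen in exactly 70% of sequences is not kept.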

Developer docs

It is recommended to clone this repo using --filter=blob:none for a blobless clone unless you want to download and put 15GB on your hard drive. The lineages.csv file is quite large and since git stores the state of each file at each commit, that creates a large repo. Git is good at downloading past blobs when necessary, so you shouldn't be impacted in your day to day work by doing this partial clone. Learn more about partial clone here: https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/

Recommended commands to clone:

git clone --filter=blob:none https://github.com/cov-lineages/pango-designation.git
gh repo clone cov-lineages/pango-designation -- --filter=blob:none

pango-designation's People

Contributors

actions-user, aineniamh, angiehinrichs, aukehaver, aviczhl2, candidateoxon, chrisruis, ciscorucinski, corneliusroemer, erinpnewcomer, fedegueli, hynnspylor, infrpopgen, jchapman, joshuailevy, jshoyer, lenaschimmel, memorablea, mydtlwn, over-there-is, rambaut, rmcolq, rquiroga7, theosanderson, thezetner


pango-designation's Issues

New possible sub-lineage under B.1 in West Bengal in India

Description

Sub-lineage of: B.1
Earliest sequence: 2020-10-25 (EPI_ISL_1419390, hCoV-19/India/WB-1931300236261/2021)
Most recent sequence: 2021-03-17 (EPI_ISL_1312386, hCoV-19/Singapore/275/2021)
Countries circulating: India and one sequence each from Singapore, United Kingdom

Lineage was found to be circulating in India in the state of West Bengal in 2021. The genomes have the key mutation E484K in spike without the presence of N501Y. 15% of genomes from West Bengal collected between Jan - March 2021 belong to this cluster. One sequence from the state of Maharashtra is also available (EPI_ISL_1419052).
The genomes also have the mutations N_G18S, N_A119S, Spike_H146del, Spike_Y145del. Other mutations common to all genomes in the cluster include NSP13_Y277C, NSP12_D269N, NS8_Q29L, NS7b_E33stop, NS3_T151I.

Genomes

gisaid_accessions.txt

Evidence

image
image

Live nextstrain instance is available at : https://nextstrain.org/community/banijolly/Phylovis/COVID-India?c=gt-S_484,501

Proposal for sublineage B.1.243.1

Emergent lineage in Arizona with a number of defining amino acid mutations, including S:E484K and S:V213G. Described in Skidmore et al 2021. https://www.medrxiv.org/content/10.1101/2021.03.26.21254367v1.full.pdf.

Lineage includes the following 17 sequences

USA/AZ-ASU2621/2021,B.1.243.1
USA/AZ-ASU2625/2021,B.1.243.1
USA/AZ-ASU2857/2021,B.1.243.1
USA/AZ-CDC-21801839/2021,B.1.243.1
USA/AZ-CDC-21802041/2021,B.1.243.1
USA/AZ-CDC-22062741/2021,B.1.243.1
USA/AZ-ASU2754/2021,B.1.243.1
USA/AZ-ASU3132/2021,B.1.243.1
USA/AZ-ASU2925/2021,B.1.243.1
USA/TX-HMH-MCoV-29140/2021,B.1.243.1
USA/AZ-ASU2540/2021,B.1.243.1
USA/AZ-TG758899/2021,B.1.243.1
USA/AZ-TG758666/2021,B.1.243.1
USA/AZ-CDC-22555310/2021,B.1.243.1
USA/AZ-CDC-22554229/2021,B.1.243.1
USA/AZ-TG761699/2021,B.1.243.1
USA/NMDOH-2021075279/2021,B.1.243.1
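
A short sketch of reading a list like the one above, where each line is a sequence name followed by a lineage. The header row `taxon,lineage` and the `count_by_lineage` helper are assumptions for illustration; the two data rows are copied from the list.

```python
import csv
import io
from collections import Counter

# Two rows copied from the list above; the header row is an assumption.
SAMPLE = """taxon,lineage
USA/AZ-ASU2621/2021,B.1.243.1
USA/TX-HMH-MCoV-29140/2021,B.1.243.1
"""


def count_by_lineage(handle):
    """Count designated sequences per lineage in a two-column CSV."""
    reader = csv.DictReader(handle)
    return Counter(row["lineage"] for row in reader)


counts = count_by_lineage(io.StringIO(SAMPLE))
print(counts["B.1.243.1"])  # 2
```

Passing an open file handle instead of `io.StringIO` would apply the same count to a file on disk.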

Proposal for sub-lineage R.1.1.

By:

Kentucky State Public Health Lab. Stephanie Lunn, Vaneet Arora, Karim George, Josh Tobias, William Grooms, Matthew Johnson, Rachel Zinner, Rhonda Lucas.

Bioinformatics contact: Stephanie Lunn, [email protected]
Scientific/general questions: Arora Vaneet [email protected]

Description

Sub-lineage of: R.1.
Earliest sequence: 2021-01-08 (as of 2021-03-22)
Most recent sequence: 2021-03-08 (as of 2021-03-22)
Countries circulating: United States
A distinct cluster within the R.1 lineage (figure 01, purple lineage) with amino acid mutation ORF1a:A2584T was found circulating in New York, Maryland, Kentucky, and Ohio (figure 02, putative sub-lineage in yellow). It appears to have first emerged in Maryland and New York before being introduced into Kentucky. The Kentucky cluster is associated with an outbreak and has a now-proven ability to cause vaccine breakthrough infections and reinfections in unvaccinated and vaccinated individuals.

Genomes

R1_ORF1aA2584T.txt

Evidence

Figure 01. R.1 lineage
image

Figure 02. Putative sub-lineage with amino acid mutation ORF1a:A2584T
image

Proposed Sub-Lineage Name

R.1.1

Proposal for lineage within B.1.1.33 (Suggestion lineage N.10)

New lineage proposal N.10

Description

Sub-lineage of: B.1.1.33

Earliest sequence: hCoV-19/Brazil/MA-FIOCRUZ-6871/2021 / EPI ISL 1181371 / 2021-01-04

Most recent sequence: hCoV-19/Brazil/MA-FIOCRUZ-11486/2021 / 2021-02-19

Countries circulating: Brazil

This new lineage (suggested name N.10) is circulating in the Northeast (Maranhao) and North (Amapa) of Brazil. It evolved from B.1.1.33, a major lineage circulating in Brazil since the beginning of the pandemic.
This lineage is characterized by 21 lineage-defining genetic changes, including 17 non-synonymous mutations, three deletions, and one nonsense mutation.
Eight lineage-defining genetic changes are located in the S protein: two mutations in the RBD (E484K and V445A), and two mutations (I210V and L212I) and three deletions (∆141-144, ∆211 and ∆256-258) in the amino (N)-terminal domain.

Genomes
location and sequences.txt

Evidence
Picture1

Proposed lineage name: N.10

Sublineage of B.1.214 in Congo/ France

Proposed lineage name

B.1.214.1

Description

Sub-lineage of: B.1.214
Countries circulating: Congo, France

Taxa

17 sequences on GISAID

B.1.214_2021-03-02.lineages.csv.zip

Evidence

Screenshot 2021-03-02 at 12 50 42

Red = B.1.214 Green = B.1.214.1 (Congo/ France lineage) Blue = B.1.214.2 (European/ Belgian lineage with key mutations and insertion mutation)

Suggested by

Áine O'Toole, Andrew Rambaut

Proposed new B.1 sublineage circulating in India

Description

Sub-lineage of: B.1
Earliest sequence: 2020-12-07
Most recent sequence: 2021-03-17
Countries circulating: India; also seen in data from UK, Australia, New Zealand, Singapore, USA, Germany, Canada

This lineage was reported to be circulating in India and comprised around 20% of sequences from some regions (https://www.telegraphindia.com/india/covid-double-mutation-variant-fuels-fears/cid/1809715). Contains L452R and E484Q mutations.

Gene Amino Acid Nucleotide Notes
orf1ab - 3457C>T
orf1ab T1567I 4965C>T nsp3:T749I
orf1ab T3646A 11201A>G nsp6:T77A
orf1ab M5753I 17523G>T nsp13:M429I
orf1ab K6711R 20396A>G  nsp15:K259R
orf1ab - 21895T>C  
S Gene G142D 21987G>A  
S Gene E154K 22022G>A  
S Gene L452R 22917T>G  
S Gene E484Q 23012G>C  
S Gene P681R 23604C>G  
S Gene Q1071H 24775A>T  
orf3a S26L 25469C>T  
orf7a V82A 27638T>C  
N Gene R203M 28881G>T  
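
The nucleotide column above uses a `position`, `reference base`, `>`, `alternate base` notation (for example 22917T>G). A minimal sketch of parsing that notation follows; the `NucChange` type and `parse_nuc_change` function are hypothetical names introduced only for this example.

```python
import re
from typing import NamedTuple


class NucChange(NamedTuple):
    position: int
    ref: str
    alt: str


# Matches strings such as "22917T>G": digits, one base, ">", one base.
_PATTERN = re.compile(r"^(\d+)([ACGT])>([ACGT])$")


def parse_nuc_change(text: str) -> NucChange:
    """Parse a substitution written as <position><ref>><alt>."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a nucleotide substitution: {text!r}")
    return NucChange(int(match.group(1)), match.group(2), match.group(3))


print(parse_nuc_change("22917T>G"))  # NucChange(position=22917, ref='T', alt='G')
```

Rejecting anything that does not match keeps typos in a proposal from being silently accepted.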

Genomes
accessions.txt

Evidence

india_20210325 filtered aligned fasta treefile

Potential new lineage causing a cluster in Mayotte

Potential new lineage

by Etienne Simon-Loriere
Description:

Sub-lineage of: A
Earliest sequence: 2020-12-14 (Denmark)
Most recent sequence: 2021-01-20 (Turkey)
Countries circulating: Mayotte (French territory/department, part of the Comoros archipelago) and France at least

During a recent survey of the diversity of SARS-CoV-2 in Mayotte, a cluster of divergent sequences within lineage A (19B) was noted. More sequencing is underway to follow up. This variant has been also detected in Europe and France but sporadically.
We note 18 changes in the long branch leading to this cluster, including 7 NS changes in the spike (L18F, L452R, N501Y, A653V, H655Y, D796Y, G1219V), on a 614D background. We are exploring theories for emergence. A Pango lineage designation would help exchanges.

Genomes
Mayotte_cluster_19B.txt

Evidence

Screenshot_2021-02-13 auspice us

Proposal for new lineage within B.1

Description

Sub-lineage of: B.1
Earliest sequence: 2020-11-23
Most recent sequence: 2021-01-29
Countries circulating: A lineage predominantly circulating in New York but with a few exports to other countries. Characterised by spike mutations T95I and D253G, plus others.
The most frequent spike mutation pattern is L5F T95I D253G E484K D614G A701V, with a smaller fraction having S477N instead of E484K.

Other common coding changes in this lineage are:
N P199L M234I
NS3 P42L Q57H
NS8 T11I
NSP2 T85I
NSP4 L438P
NSP6 S106del G107del F108del
NSP12 P323L
NSP13 Q88H

Genomes
B.1.XX_genomes.txt

List of genomes, collated 2021-02-10, attached.

Evidence

Phylogenetic tree (PDF and image). Spike mutations and bootstrap values are indicated. Proposed new lineage includes all sequences except for those in the grey shaded box at the top.
B.1.XX_tree_spike_mutations.pdf

B 1 XX_tree_image

Phylogenetic tree file:
B.1.XX_tree.txt

Proposed lineage name

To be determined as the next available designation within B.1

Sublineages of B.1.1.28 (proposed P.4) circulating in South Brazil with N:P13L, ORF3a:T151I and ORF9b:P10S mutations

Description
Sub-lineage of: B.1.1.28
Earliest sequence: Brazil/SP-796/2020 (EPI_ISL_735433) - 2020-07-01
Most recent sequence: Brazil/SP-1711/2021 (EPI_ISL_1121317) - 2021-02-09
Countries circulating: Brazil, Netherlands, Japan, England
A novel lineage was found circulating in South/Southern Brazil with the following mutations: N:P13L, ORF3a:T151I and ORF9b:P10S (figure below, red and blue clades). Also, a putative sublineage (figure below, blue clade) was found mostly in Rio Grande do Sul with the following mutations in ORF1a: P2287S, V2588F, L3027F and Q3777H.

Genomes
P_4.txt

Evidence
Captura de tela de 2021-03-09 16-41-04
Captura de tela de 2021-03-09 16-41-48

Proposed lineage name
P.4 (blue and red clades) and P.4.1 (blue clade)

Potential New Lineages in Missouri, USA

New lineage proposals
by: Cynthia Y Tang, Xiu-Feng Wan

Description 1
Sub-lineage of: B.1.1
Earliest sequence: 7/2/20
Most recent sequence: 7/9/20
Countries circulating: USA

Description 2
Sub-lineage of: B.1.1
Earliest sequence: 5/14/20
Most recent sequence: 7/9/20
Countries circulating: USA

Description 3
Sub-lineage of: B.1.2
Earliest sequence: 7/5/20
Most recent sequence: 7/7/20
Countries circulating: USA

Description 4
Sub-lineage of: B.1
Earliest sequence: 7/2/20
Most recent sequence: 7/6/20
Countries circulating: USA

During 2 sampling periods (March and July 2020), 4 locally adapted SARS-CoV-2 lineages, each with unreported mutations, emerged in Missouri. Each new lineage had >95% posterior probability when analyzed using Bayesian phylogenetics, >99% sequence identity, and occupied a subclade separate from published sequences in the phylogeny. The associated unpublished mutations are as follows:
Proposed lineage: B.1.1.318 (NSP12-A2V)
Proposed lineage: B.1.1.319 (NSP4-M366I, NSP12-C22F)
Proposed lineage: B.1.2.1 (NSP15-P252L)
Proposed lineage: B.1.527 (NSP3-N1178T, NSP3-A1179T)

Genomes
meta.xlsx

Evidence
eFigure2.FullTreeAnnotated.pdf
(Highlighted and denoted starting with 'MO-')
Proposed lineage names:
B.1.1.318
B.1.1.319
B.1.2.1
B.1.527

Potential new lineage A.2.4.X in Panama and Costa Rica with Spike L452R substitutions and 141-143 deletion.

Potential VOI circulating in Central America.

Sub-lineage of A.2.4
Earliest Sequence 2020-11-21
Most Recent Sequence 2021-02-04
Countries circulating: Costa Rica and Panama.
Genomes:

listofaccession.txt

Tree evidence:
treeimage

Genome changes:
gene nucleotide aminoacid
Nsp4 C10029T T492I
Nsp6 C11005A H11Q
Sgene A23403G D614G
T22917G L452R
deletion: L141-,S:G142-,S:V143-
insertion: 22205:CGGCAGGCT
ORF3a C25613T S74F
N C28863T S197L
C28975T M234I
C29366T P365S
C29241T P383L

New Lineage proposal: N.9 (sub-lineage of B.1.1.33)

Description
Sub-lineage of: B.1.1.33
Earliest sequence: EPI_ISL_861899 - 11/11/2020
Most recent sequence: hCoV-19/Brazil/MA-FIOCRUZ-6876/2021 - 01/03/2021 - under submission in GISAID
Countries circulating: Brazil

I would like to suggest a new pangolin lineage to support the B.1.1.33(E484K) clade circulating in Brazil.
We would like to call this N.5 following the other sublineages inside the ancestor B.1.1.33 lineage. This branch is highly supported as we can see highlighted in Blue in the tree attached.
The defining mutations are NSP3:A1711V, NSP6:F36L, S:E484K and NS7b:E33A.

Genomes
Data_N.5_lineage.txt

Evidence
N 5

Proposed lineage name N.5

New lineage proposal for Nextstrain clade 20A/Spike: F888L, E484K

by Erik Alm

Spike A67V, Spike D614G, Spike E484K, Spike F888L, Spike 69-70del, Spike Q52R, Spike Q677H, Spike 144del
Sub-lineage of: B.1
Earliest sequence: 2020-12-15
Most recent sequence: 2021-02-01
Countries circulating: Denmark, UK, USA, Spain, South Africa, Nigeria, France, Belgium, Canada, Australia, Jordan, Italy, Japan

A new variant of interest reported to ECDC by Denmark. Wide geographic circulation and carrying E484K. Rapid increase in Denmark recently.

Genomes

Evidence
B1 lineage
new lineage within B.1.xlsx

Proposed lineage name

Proposal for new lineage within B.1

Proposed by Son Nguyen, PEH-FSS, Queensland Department of Health.
Another clade of S:N439K other than the European clade (B.1.258), circulating in Malay Archipelago countries and infected Queensland miners coming back from PNG.

nextstrain-s439k

Description

Sub-lineage of: B.1
Earliest sequence: 2020-11-12 (Indonesia/JB-EIJK70/2020)
Most recent sequence: 2021-03-06 (Australia/QLD1659/2021)
Countries circulating: Singapore (21), Indonesia (40), Malaysia (6), Australia (14) (exposed in PNG), Japan (10) (exposed in Indonesia)

Characterised by the spike mutation N439K; a subset that has additionally acquired P681R (from A.23.1) would form a sub-level lineage.

Genomes

b1x_genome.txt
b1xy_genome.txt

Evidence

proposed-lineage-B1XY

Tree file:
lineages-proposal.nex.txt

Proposed lineage name

To be determined as the next available designation within B.1

Defining mutations

B.1.X: C22879A (S:N439K), C3768T, C5184T

B.1.X.Y: + C23604G (S:P681R), T22219C, C17012T, C29718T, C29743T, C683T.
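
The two defining-mutation sets above nest: B.1.X.Y carries everything B.1.X does plus six more changes. A sketch of checking a genome's mutation list against them, most specific first, follows. The `classify` function and the set names are hypothetical, and this is plain set matching for illustration, not how pangolin assigns lineages.

```python
from typing import Optional

# Defining nucleotide changes as listed in this proposal.
B1X = {"C22879A", "C3768T", "C5184T"}
B1XY = B1X | {"C23604G", "T22219C", "C17012T", "C29718T", "C29743T", "C683T"}


def classify(mutations: set) -> Optional[str]:
    """Return the most specific proposed lineage whose defining set is present."""
    if B1XY <= mutations:
        return "B.1.X.Y"
    if B1X <= mutations:
        return "B.1.X"
    return None


print(classify({"C22879A", "C3768T", "C5184T", "A23403G"}))  # B.1.X
```

Testing the larger set first matters, because every B.1.X.Y genome also satisfies the B.1.X check.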

Proposal for lineage R.2 as an alias for B.1.1.316.2

Proposal for lineage R.2 as an alias for B.1.1.316.2

Current genomes on 2021-02-20:

USA/NY-NYCPHL-002765/2021|2021-01-25
USA/RI-Broad_RIDOH-00153/2021|2021-01-13
USA/RI-Broad_RIDOH-00166/2021|2021-01-19
USA/MA-Broad_CRSP-00367/2021|2021-01-08
USA/RI-Broad_RIDOH-00171/2021|2021-01-18
USA/RI-Broad_RIDOH-00154/2021|2021-01-13
USA/RI-Broad_RIDOH-00142/2021|2021-01-13
USA/MA-CDC-STM-000002438/2021|2021-01-13

'Defining' SNPs:

ORF1ab:	A1049V
		K1202N
C4999T
		Q1592R
		D4085G	
		A4703S
C15240T

spike:	
C21759T
		E484K
		Q677H
		T732S
A25048G
		E1202Q

Green lineage in tree:
B 1 1 316 ml tree

New lineage proposal for B.1.111 with Spike substitutions E484K and L249S

by Katherine Laiton-Donato, Jose A. Usme-Ciro, Carlos Franco-Muñoz, Diego A. Álvarez-Díaz, Hector Alejandro Ruiz-Moreno, Jhonnatan Reales-González, Diego Andrés Prada, Sheryll Corchuelo, Maria T. Herrera-Sepúlveda, Julian Naizaque, Gerardo Santamaría, Magdalena Wiesner, Diana Marcela Walteros, Martha Lucia Ospina Martínez, Marcela Mercado-Reyes.

Description
Sub-lineage of: B.1.111
Earliest sequence: 2020-12-26 (hCoV-19/Colombia/CES-INS-VG-248/2020)
Most recent sequence: 2021-02-17 (hCoV-19/USA/NY-NYCPHL-003387/2021)
Countries circulating: Colombia (4), USA (10), Belgium (1), Aruba (2).
Four Colombian sequences collected between December 26, 2020 and January 14, 2021, were assigned to the B.1.111 lineage by Pangolin COVID-19 Lineage Assigner (https://pangolin.cog-uk.io/); however, these sequences share a characteristic mutation pattern, including two amino acid changes in the Spike protein (E484K and L249S). B.1.111+484K/249S sequences have been obtained from SARS-CoV-2 recently circulating in Colombia, USA, Belgium and Aruba. The preliminary results are consistent with the emergence of a novel lineage that is phylogenetically distant from the parental B.1.111 lineage (Figure 1).
List of characteristic amino acid changes of the new lineage:
List of characteristic amino acid changes of new lineage.txt

Genomes
List of genomes, collated 2021-03-02, attached.
Genome sequences list.txt

Evidence
Phylogeny for proposal
The full phylogenetic tree can be accessed through this link: https://microreact.org/project/fTa6f3kY9JraG9NPmQYGog/42c3e045

Proposed lineage name
To be designated.

Lithuanian B.1.1.7 sub-lineage

New lineage proposal
by Rimvydas Norvilas, Ingrida Olendraitė, Dovilė Ežerskytė, Daniel Naumovas, and Gytis Dudas

Description
Sub-lineage of: B.1.1.7
Earliest sequence: 2021-02-15 (hCoV-19/Lithuania/S21B109/2021, current date on GISAID is incorrect)
Most recent sequence: 2021-03-22 (hCoV-19/Lithuania/S21C992/2021)
Countries circulating: Lithuania

A distinct cluster of B.1.1.7 infections comprising over half (204/384 genomes) of currently sequenced B.1.1.7 genomes in Lithuania. This sub-lineage was first detected in Vilnius county and is still concentrated there, though onward spread to other counties is apparent too. Given the prevalence of this sub-lineage (the next biggest distinct B.1.1.7 cluster in Lithuania has <40 genomes) we believe it is likely to dominate the rest of the Lithuanian epidemic.

Genomes
B.1.1.7_VN.txt

Evidence
B 1 1 7_ltu

Proposed lineage name
VN

already proposed - lineage in B.1 India L452R E484Q P681R

Sorry saw it was posted there #38

Hi, well-supported lineage, good diversity, 25 SNP, 15% of the 2021 Indian sequences

`
​ EPI_ISL_1164354 Singapore 2021-02-26 B.1
​ EPI_ISL_1246284 United Kingdom 2021-02-22 B.1
​ EPI_ISL_1246268 United Kingdom 2021-02-22 B.1
​ EPI_ISL_1257024 United Kingdom 2021-03-04 B.1
​ EPI_ISL_1256972 United Kingdom 2021-03-02 B.1
​ EPI_ISL_1264032 United Kingdom 2021-03-04 B.1
​ EPI_ISL_1284652 Germany 2021-03-01 B.1
​ EPI_ISL_1294622 United Kingdom 2021-03-06 B.1
​ EPI_ISL_1293047 Australia 2021-03-16 B.1
​ EPI_ISL_1315322 New Zealand 2021-03-09 B.1
​ EPI_ISL_1315998 United Kingdom 2021-03-07 B.1
​ EPI_ISL_1316100 United Kingdom 2021-03-07 B.1
​ EPI_ISL_1316002 United Kingdom 2021-03-07 B.1
​ EPI_ISL_1329279 United Kingdom 2021-03-10 B.1
​ EPI_ISL_1329653 United Kingdom 2021-03-11 B.1
​ EPI_ISL_1330057 United Kingdom 2021-03-04 B.1
​ EPI_ISL_1332722 United Kingdom 2021-03-09 B.1
​ EPI_ISL_1326666 United Kingdom 2021-03-13 B.1
​ EPI_ISL_1327290 United Kingdom 2021-03-09 B.1
​ EPI_ISL_1327318 United Kingdom 2021-03-11 B.1
​ EPI_ISL_1327458 United Kingdom 2021-03-10 B.1
​ EPI_ISL_1327609 United Kingdom 2021-03-12 B.1
​ EPI_ISL_1327362 United Kingdom 2021-03-10 B.1
​ EPI_ISL_1327494 United Kingdom 2021-03-12 B.1
​ EPI_ISL_1335059 Canada 2021-03-01 B.1
​ EPI_ISL_1341986 United Kingdom 2021-03-17 B.1
​ EPI_ISL_1344560 United Kingdom 2021-03-15 B.1
​ EPI_ISL_1357691 India 2021-02-20 B.1
​ EPI_ISL_1357692 India 2021-02-20 B.1
​ EPI_ISL_1357695 India 2021-03-11 B.1
​ EPI_ISL_1357696 India 2021-03-11 B.1
​ EPI_ISL_1357697 India 2021-03-11 B.1
​ EPI_ISL_1357698 India 2021-03-13 B.1
​ EPI_ISL_1357699 India 2021-03-12 B.1
​ EPI_ISL_1357700 India 2021-03-12 B.1
​ EPI_ISL_1357701 India 2021-03-13 B.1
​ EPI_ISL_1357702 India 2021-03-13 B.1
​ EPI_ISL_1357703 India 2021-03-07 B.1
​ EPI_ISL_1357704 India 2021-02-08 B.1.1
​ EPI_ISL_1357705 India 2021-03-05 B.1
​ EPI_ISL_1357706 India 2021-03-09 B.1
​ EPI_ISL_1360304 India 2020-12-05 B.1
​ EPI_ISL_1360306 India 2020-12-07 B.1
​ EPI_ISL_1360316 India 2020-12-17 B.1
​ EPI_ISL_1360317 India 2020-12-18 B.1.530
​ EPI_ISL_1360318 India 2020-12-19 B.1
​ EPI_ISL_1360328 India 2021-01-05 B.1
​ EPI_ISL_1360329 India 2021-01-06 B.1
​ EPI_ISL_1360330 India 2021-01-07 B.1
​ EPI_ISL_1360338 India 2021-01-15 B.1
​ EPI_ISL_1360341 India 2021-01-18 B.1
​ EPI_ISL_1360342 India 2021-01-19 B.1
​ EPI_ISL_1360352 India 2021-01-30 B.1
​ EPI_ISL_1360359 India 2021-02-08 B.1
​ EPI_ISL_1360361 India 2021-02-10 B.1
​ EPI_ISL_1360363 India 2021-02-12 B.1
​ EPI_ISL_1360364 India 2021-02-13 B.1
​ EPI_ISL_1360370 India 2021-02-19 B.1
​ EPI_ISL_1360371 India 2021-02-20 B.1
​ EPI_ISL_1360375 India 2021-02-24 B.1
​ EPI_ISL_1360376 India 2021-02-25 B.1
​ EPI_ISL_1360382 India 2021-03-03 B.1
​ EPI_ISL_1360387 India 2021-03-08 B.1
​ EPI_ISL_1367560 Singapore 2021-03-21 B.1
​ EPI_ISL_1372093 India 2020-12-01 B.1
​ EPI_ISL_1374314 United Kingdom 2021-03-17 B.1
​ EPI_ISL_1376188 United Kingdom 2021-03-17 B.1
​ EPI_ISL_1377698 United Kingdom 2021-03-15 B.1
​ EPI_ISL_1376830 United Kingdom 2021-03-16 B.1
​ EPI_ISL_1376971 United Kingdom 2021-03-13 B.1
​ EPI_ISL_1376850 United Kingdom 2021-03-16 B.1
​ EPI_ISL_1379889 USA 2021-03-01 B.1
​ EPI_ISL_1384844 India 2021-02-14 B.1
​ EPI_ISL_1384851 India 2021-02-26 B.1
​ EPI_ISL_1384866 India 2021-02-03 B.1
​ EPI_ISL_1385821 India 2021-02-13 B.1
​ EPI_ISL_1385823 India 2021-02-12 B.1
​ EPI_ISL_1390313 United Kingdom 2021-03-16 B.1
​ EPI_ISL_1415093 India 2021-02-25 B.1
​ EPI_ISL_1415162 India 2021-01-08 B.1
​ EPI_ISL_1415164 India 2021-02-13 B.1
​ EPI_ISL_1415165 India 2021-02-16 B.1
​ EPI_ISL_1415188 India 2021-02-12 B.1
​ EPI_ISL_1415203 India 2020-10-02 B.1
​ EPI_ISL_1415208 India 2021-02-23 B.1
​ EPI_ISL_1415214 India 2021-01-09 B.1
​ EPI_ISL_1415216 India 2021-02-20 B.1
​ EPI_ISL_1415218 India 2021-02-23 B.1
​ EPI_ISL_1415225 India 2021-02-02 B.1
​ EPI_ISL_1415233 India 2021-02-16 B.1
​ EPI_ISL_1415252 India 2021-02-03 B.1
​ EPI_ISL_1415261 India 2021-02-12 B.1
​ EPI_ISL_1415262 India 2021-02-16 B.1
​ EPI_ISL_1415263 India 2021-02-13 B.1
​ EPI_ISL_1415264 India 2021-02-17 B.1
EPI_ISL_1415265 India 2021-02-20 B.1
EPI_ISL_1415266 India 2021-02-02 B.1
EPI_ISL_1415267 India 2021-02-12 B.1
EPI_ISL_1415268 India 2021-02-14 B.1
EPI_ISL_1415269 India 2021-02-23 B.1
EPI_ISL_1415270 India 2021-02-21 B.1
EPI_ISL_1415271 India 2021-02-16 B.1
EPI_ISL_1415272 India 2021-02-15 B.1
EPI_ISL_1415273 India 2021-02-16 B.1
EPI_ISL_1415274 India 2021-02-25 B.1
EPI_ISL_1415275 India 2021-02-13 B.1
EPI_ISL_1415276 India 2021-02-16 B.1
EPI_ISL_1415277 India 2021-02-16 B.1
EPI_ISL_1415278 India 2021-02-16 B.1
EPI_ISL_1415287 India 2021-01-17 B.1
EPI_ISL_1415288 India 2021-02-02 B.1
EPI_ISL_1415289 India 2021-02-23 B.1
EPI_ISL_1415290 India 2021-01-14 B.1
EPI_ISL_1415291 India 2021-02-02 B.1
EPI_ISL_1415292 India 2021-02-12 B.1
EPI_ISL_1415294 India 2021-02-03 B.1
EPI_ISL_1415295 India 2021-02-03 B.1
EPI_ISL_1415296 India 2021-02-03 B.1
EPI_ISL_1415297 India 2021-02-23 B.1
EPI_ISL_1415298 India 2020-12-25 B.1
EPI_ISL_1415299 India 2021-02-03 B.1
EPI_ISL_1415300 India 2021-02-03 B.1
EPI_ISL_1415301 India 2021-02-23 B.1
EPI_ISL_1415302 India 2021-02-02 B.1
EPI_ISL_1415303 India 2021-02-02 B.1
EPI_ISL_1415304 India 2021-02-02 B.1
EPI_ISL_1415305 India 2021-02-02 B.1
EPI_ISL_1415306 India 2021-02-02 B.1
EPI_ISL_1415307 India 2021-02-06 B.1
EPI_ISL_1415308 India 2021-02-06 B.1
EPI_ISL_1415309 India 2021-02-06 B.1
EPI_ISL_1415310 India 2021-02-03 B.1
EPI_ISL_1415311 India 2021-02-23 B.1
EPI_ISL_1415312 India 2021-02-23 B.1
EPI_ISL_1415313 India 2021-02-23 B.1
EPI_ISL_1415314 India 2021-02-16 B.1
EPI_ISL_1415315 India 2021-02-16 B.1
EPI_ISL_1415316 India 2020-10-05 B.1
EPI_ISL_1415317 India 2020-10-03 B.1
EPI_ISL_1415318 India 2021-02-16 B.1
EPI_ISL_1415319 India 2021-02-16 B.1
EPI_ISL_1415320 India 2021-02-17 B.1
EPI_ISL_1415321 India 2021-01-04 B.1
EPI_ISL_1415322 India 2021-02-02 B.1
EPI_ISL_1415323 India 2020-10-05 B.1
EPI_ISL_1415324 India 2021-02-15 B.1
EPI_ISL_1415353 India 2021-02-15 B.1
EPI_ISL_1415356 India 2020-10-03 B.1
EPI_ISL_1415358 India 2021-02-16 B.1
EPI_ISL_1415359 India 2021-02-25 B.1
EPI_ISL_1415362 India 2021-02-02 B.1
EPI_ISL_1415370 India 2021-02-27 B.1
EPI_ISL_1415386 India 2021-02-14 B.1
EPI_ISL_1415387 India 2021-02-13 B.1
EPI_ISL_1408942 United Kingdom 2021-03-21 B.1
EPI_ISL_1413668 United Kingdom 2021-03-17 B.1
EPI_ISL_1410266 United Kingdom 2021-03-18 B.1
EPI_ISL_1412240 United Kingdom 2021-03-16 B.1
EPI_ISL_1416967 Guadeloupe 2021-03-10 B.1
EPI_ISL_1416968 Guadeloupe 2021-03-10 B.1
EPI_ISL_1442952 Singapore 2021-03-22 B.1

`

Subclade of B.1.1.1 (resulting from a discrete evolutionary sprint in December/January?) widespread in Chile

New lineage proposal

by Michael K. Edwards [email protected]

Description

Sub-lineage of: B.1.1.1
Earliest sequence: collected 2021-01-17
Most recent sequence: widespread in Chile (as of early April, latest nextstrain.org update)
Countries circulating: widespread in Chile; formerly common in Peru, current status uncertain; spotted in Spain, Australia, Germany, USA/NY, Brazil

It actually appears to have originated in Peru, and split into two subclades. One has deletions in the S NTD (Δ246–252 plus N253D) and in nsp6 (ORF1a Δ3675–3677, shared with B.1.1.7/B.1.351/P.1 — which doesn't mean shared ancestry); this one traveled to Chile and took off. The one without the deletions doesn't appear to have spread beyond Peru, except maybe to Brazil; Nextstrain's South American subsample is a bit of a moving target, and the Brazilian sequence in the screenshot isn't there right now.

The closest thing to the founding strain in Nextstrain's South American subsample appears to be GISAID EPI ISL 1111128. Most of the cases outside Peru lack the N: T366I mutation, so let's subtract that. Here are the visible changes specific to the variant, then:

N: P13L, G214C
ORF1a: P2287S, F2387V, L3201P, T3255I
ORF1b: P314L
ORF9b: P10S
S: G75V, T76I, L452Q, F490S, T859N

(I haven't yet dug down to the base-sequence level to look at what may be beneath the waterline, in terms of non-coding regions and template-switching effects associated with RNA secondary structure.)

Genomes

I don't know how to get nextstrain.org to dump a CSV. S: 452Q,F490S is currently a pretty good filter https://nextstrain.org/ncov/south-america?branchLabel=aa&c=country&gt=S.452Q,490S — but RBD point-substitution tweaks are pretty short putts, so some of the probably-fitness-neutral changes like S:T859 and the ORF1a hits outside nsp6 are probably better markers. If I had to guess I'd say that N:P13L (which is also ORF9b:P10S) is functionally interesting, but it has been invented repeatedly in other clades so it's not a good lineage marker.

Evidence

Screen Shot 2021-04-13 at 12 47 17 PM

Proposed lineage name

I have no idea what the conventions are around this. Presumably B.1.1.something?

B.1.1.29 note

B.1.1.29 note

This note addresses a question raised surrounding the designation of lineage B.1.1.29. B.1.1.29 is a Welsh lineage that has 116 sequences designated. There was an error in designation of B.1.1.29 during the pangothon in March 2021, which led to it being designated as B.1.1.439. Lineage B.1.1.439 has now been withdrawn and new pangoLEARN models will reflect this in due course.

A question related to this was raised asking about a subset of B.1.1.29 becoming lineage B.1.1.420. This subset of sequences had only ever been 'assigned' a lineage by pangolin and had not been officially designated. Occasionally, a shared SNP may lead to new sequences being assigned an otherwise phylogenetically distinct lineage in error. Manual curation of these sequences identified them as a distinct lineage B.1.1.420, which has diversity in Europe and the USA.

Reassigned in https://github.com/cov-lineages/pango-designation/releases/tag/v1.1.8

New lineage proposal for A.23.1 with E484K

by Erik Alm

A new lineage in the UK that is part of the old A main lineage, with E484K. Spike E484K, Spike F157L, Spike P681R, Spike Q613H, Spike R102I, Spike V367F
Sub-lineage of: A.23.1
Earliest sequence: 2020-12-26
Most recent sequence: 2021-01-28
Countries circulating: UK

Old lineage resurrected, now with E484K

A23-1 lineage
new lineage A-23-1.xlsx

Potential new lineage in France

Potential new lineage in France

by Etienne Simon-Loriere
Description:

Sub-lineage of: A
Earliest sequence: 2020-12-22 (Oman)
Most recent sequence: 2021-02-02 (France)
Countries circulating: France at least, Spain?

This hospital based cluster was noted because of a spike of cases. We note 9 changes on this branch, including the 69-70del, N501T (not Y) and H655Y. (This variant is picked up by the qPCR S drop). More sequencing is underway to follow up. A Pango lineage designation would help exchanges.

Genomes
cluster_19B_IDF.txt

Evidence

Screenshot_2021-02-13 auspice us

COVID-19 cluster with unusual clinical presentation in France

Description
request by the French National Reference Center for viruses of respiratory infections headed by Pr. S. van der Werf

Sub-lineage of: B.1
Earliest sequence: 2021-01-29
Most recent sequence: 2021-02-15 (more sequencing ongoing)
Countries circulating: France
Genomes
Table_Lannion.tsv.txt

Evidence
A cluster of COVID-19 cases with inconsistent detection by RT-PCR in nasopharyngeal swabs at the time of onset of symptoms compatible with COVID-19 (including severe/lethal cases) is under investigation in Bretagne, France.
More sequences are under way, but most samples are very difficult to sequence (low viral load with the usual sampling methods)
The genomes are characterized by a large number of mutations:

Spike_D215G,Spike_H655Y
Spike D215G D614G G142del G669S H66D H655Y N1187D Q949R V483A Y144V
E F20L T30I
M H125Y
N T325I
NS3 Q57H
NSP2 T85I
NSP3 N506S S1443Y T820I
NSP4 Y397H
NSP5 A260V
NSP6 L37F
NSP12 P323L Q822R
NSP14 L157F
NSP16 K277R T140I
Screenshot_2021-03-13-auspice

tree_Lannion_20C_B.1.pdf
tree_Lannion_20C_B.1.nex.txt

A Pango designation would greatly help exchanges on this cluster!

Potential new B.1.1.370 sublineage Russia and Finland with spike E484K and others

Sub-lineage of: B.1.1.370
Earliest sequence: 2021-01-18
Most recent sequence: 2021-03-14
Countries circulating: Russia, Finland

hCoV-19/Russia/Pskov-16/2021|EPI_ISL_1259283|2021-01-18
hCoV-19/Russia/SPE-RII-MH14654S/2021|EPI_ISL_1491610|2021-02-11
hCoV-19/Russia/SPE-RII-MH14691S/2021|EPI_ISL_1491611|2021-02-11
hCoV-19/Finland/HEL-84-V8252/2021|EPI_ISL_1548040|2021-03-14
hCoV-19/Finland/HEL-84-V8262/2021|EPI_ISL_1548041|2021-03-14
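
The records above follow a `strain|accession|date` pattern, so the "Earliest sequence", "Most recent sequence" and "Countries circulating" fields can be checked mechanically. The following is a minimal sketch under that assumption; `parse_record` and `summarise` are hypothetical helper names and are not part of pangolin or any GISAID tooling.

```python
from datetime import date

# Records copied from the list above: strain name | accession | collection date.
RECORDS = """\
hCoV-19/Russia/Pskov-16/2021|EPI_ISL_1259283|2021-01-18
hCoV-19/Russia/SPE-RII-MH14654S/2021|EPI_ISL_1491610|2021-02-11
hCoV-19/Russia/SPE-RII-MH14691S/2021|EPI_ISL_1491611|2021-02-11
hCoV-19/Finland/HEL-84-V8252/2021|EPI_ISL_1548040|2021-03-14
hCoV-19/Finland/HEL-84-V8262/2021|EPI_ISL_1548041|2021-03-14
"""


def parse_record(line):
    """Split one 'strain|accession|date' line into a dict (hypothetical helper)."""
    strain, accession, collected = line.strip().split("|")
    # Assumption: the country is the second slash-separated field of the strain name.
    country = strain.split("/")[1]
    return {
        "strain": strain,
        "accession": accession,
        "country": country,
        "date": date.fromisoformat(collected),
    }


def summarise(text):
    """Return (earliest date, latest date, sorted countries) for a block of records."""
    rows = [parse_record(line) for line in text.splitlines() if line.strip()]
    dates = [row["date"] for row in rows]
    countries = sorted({row["country"] for row in rows})
    return min(dates), max(dates), countries


earliest, latest, countries = summarise(RECORDS)
print(earliest, latest, countries)
```

Run on the five records listed here, the summary should agree with the description fields given at the top of this proposal.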

Pangolin currently assigns B.1.1.370 to:
hCoV-19/Russia/SPE-RII-MH14654S/2021|EPI_ISL_1491610|2021-02-11
hCoV-19/Russia/Pskov-16/2021|EPI_ISL_1259283|2021-01-18
and B.1.1 to:
hCoV-19/Russia/SPE-RII-MH14691S/2021|EPI_ISL_1491611|2021-02-11
hCoV-19/Finland/HEL-84-V8252/2021|EPI_ISL_1548040|2021-03-14
hCoV-19/Finland/HEL-84-V8262/2021|EPI_ISL_1548041|2021-03-14

However, these carry the spike deletion S:N137-,S:D138-,S:P139-,S:F140-,S:L141-,S:G142-,S:V143-,S:Y144-,S:Y145-
and multiple mutations:
ORF1a:S376L
ORF1a:V1006F
ORF1a:T1093A
ORF1a:T2247N
ORF1a:T3255I
ORF1a:Q3729R
ORF1a:S4119T
ORF1b:T1173N
ORF1b:V1905L
ORF1b:A2431V
S:P9L
S:C136Y
S:H245P
S:E484K
S:D614G
S:E780K
ORF3a:L95M
M:L16I

Substitutions shared with hCoV-19/Russia/SPE-RII-12101V/2020|EPI_ISL_524001|2020-05-08 (lineage B.1.1.370) include
ORF1a:V1006F
T1093A
C25000T
C25603T
G26211T
N:R203K
N:G204R

LPR fa contree

New lineage proposal for B.1.1.316 with 3 additional spike mutations

New lineage proposal
by Erik Alm

Lineage present in many countries, reported in the media by Japan. These sequences are currently assigned as B.1.316 but have many additional mutations, including Spike E484K, G769V and W152L. B.1.316 seems to be an old Canadian lineage (notice the time gap in the attached epicurve from cov-lineages.org); these new sequences are totally distinct from the old B.1.316.
New lineage in B-1-316 time

Sub-lineage of: B.1.316
Earliest sequence: 2020-11
Most recent sequence: 2021-02-11
Countries circulating: Ghana, Nigeria, Japan, UAE, Belgium, France, Netherlands, Switzerland, UK, Canada, USA, Australia

Evidence
83 viruses
New lineage in B-1-316
new.lineage in B-1-316.txt

Proposal for new lineage C.2.1

New lineage proposal

Suggested by Andrew Page and Leonardo de Oliveira Martins

Description

Sub-lineage of: C.2
Earliest sequence: 2020-11-01 (Wales/ALDP-B19FFD/2020)
Most recent sequence: 2021-02-19 (18 samples from the Caribbean)
Countries circulating: Aruba (68), Zimbabwe (18), Sint Maarten (14), Curacao (10), Netherlands (4), Denmark (4), United Kingdom (3), Australia (3), and USA (2).

Observed first in Wales and England (end Nov, beginning Dec), several samples were detected in Zimbabwe and a few in
Curacao by the end of December. Currently there is a considerably-sized cluster in the Caribbean.
Using empirical Bayesian ancestral reconstruction with IQTREE v2, three substitutions on the branch separating this
lineage from its parent, C.2, were estimated: G922T (synonymous, ORF1ab) , C28453T (synonymous, N), and C29642T (Q29*, ORF10).

Genomes

Deposited on GISAID: lineage_C_2_1.txt

Evidence

Maximum likelihood tree of all C.2 sequences with the proposed lineage in red and ultrafast bootstrap support values on branches.

lineage_C_2_1 boot

Related reference: https://doi.org/10.1016/S2666-5247(21)00061-6

Proposed lineage name: C.2.1

Proposal for new lineage within B.1

Description

Sub-lineage of: B.1
Earliest sequence: 2020-09-28
Most recent sequence: 2020-12-28
Countries circulating: A lineage predominantly circulating in California but with exports to other countries. Characterised by the spike L452R mutation but also has spike:W152C orf1ab:D5584Y and N:T205I

orf1ab:D5584Y
S:W152C
S:L452R
synSNP:C26681T
N:T205I
synSNP:C29362T

Genomes

B.1.X_genomes.txt

List of genomes, collated 2021-01-13, attached.

Evidence

Phylogenetic tree PDF image. Proposed new lineage labels in red:
B 1 X_tree

Phylogenetic tree file:
B.1.X.tree.txt

Proposed lineage name

To be determined as the next available designation within B.1

Sublineage of B.1.429 with S:Q677H (as well as ORF1a:F2827L, ORF3a:A23V, and N:P142S)

Description

Sub-lineage of: B.1.429
Earliest sequence: 2021-01-27
Most recent sequence: 2021-03-12 (database upload pending for the most recent few sequences; table of published sequences below)
Countries circulating: USA (Colorado)

We have identified a cluster of cases in Colorado, USA, descended from the lineage B.1.429 with an additional spike protein substitution at amino acid position 677, S:Q677H (as well as ORF1a:F2827L, ORF3a:A23V, and N:P142S). There is evidence of onward transmission.

Here is a Virological post with summary of this cluster:
https://virological.org/t/detection-of-the-recurrent-substitution-q677h-in-the-spike-protein-of-sars-cov-2-in-cases-descended-from-the-lineage-b-1-429/660

Of note is that this is not the first set of B.1.429-derived sequences with S:Q677H; some cases with the combination of B.1.429+S:Q677H have been sequenced previously from samples collected in California (tree) and share much of the same haplotype. The cluster we have identified in Colorado is further characterized by the amino acid substitutions ORF1a:F2827L, ORF3a:A23V, and N:P142S.

Genomes

Colorado_B.1.429_with_SQ677H_seqs_as_of_2021-03-28.tsv.txt

Evidence

tree_cluster_callout_img_clean

Proposed lineage name

We propose naming the haplotype B.1.429+S:Q677H, ORF1a:F2827L, ORF3a:A23V, and N:P142S as a sublineage, B.1.429.2. (Perhaps preceded by B.1.429+S:Q677H being named B.1.429.1.)

Lineages B.1.526, B.1.526.1 and B.1.526.2

By Anderson Brito & Nathan Grubaugh Lab.

Description

Sub-lineage of: B.1.526

Earliest sequence:
B.1.526 (2020-11-23); B.1.526.1 (2020-09-07); B.1.526.2 (2021-02-10)

Most recent sequence: 2020-09-07

Countries circulating: USA

These B.1.526 lineages and sub-lineages were recently reassigned, and their classifications are currently mixed up, as can be seen in the tree below (see image and link)

Genomes
B.1.526: metadata of 230 genomes (download here)
B.1.526.1: metadata of 49 genomes (download here)
B.1.526.2: metadata of 89 genomes (download here)

Evidence
Image: available here
Build: available here

Proposed lineage name
Same as already proposed by Pango team. We only want to report the need for an update of the B.1.526 lineage group assignments.

Alias of C.6 and C.10

Dear cov-lineages team,

I am trying to use the 'full_alias_key.txt' file to track the relationship between lineage names. I noticed that two "C" aliases are not listed in that table: C.6 and C.10. Likewise, in the 'lineage_notes' file they are not identified as aliases.

Am I correct in understanding that "C" is an alias for "B.1.1.1", and therefore C.6 and C.10 are aliases?

Also, is this file (full_alias_key.txt) meant to be a comprehensive dictionary of aliases, that would be suitable as input for software that groups lineages based on ancestry?

Thank you for maintaining this incredibly useful resource.

Sincerely,
Adam

p.s. L4 is also missing from full_alias_key.txt (edit)
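
The kind of ancestry grouping asked about above can be sketched as follows. This is an illustration only: the one-entry `ALIAS_KEY` table reflects just the C to B.1.1.1 relationship mentioned in this thread, the real alias file may be structured differently, and `expand_alias`, `ancestors` and `is_descendant` are hypothetical helper names.

```python
# Illustrative alias table; only the C -> B.1.1.1 relationship comes from this thread.
ALIAS_KEY = {"C": "B.1.1.1"}


def expand_alias(lineage, alias_key):
    """Replace a leading alias letter with the full lineage name it stands for."""
    prefix, dot, rest = lineage.partition(".")
    parent = alias_key.get(prefix)
    if not parent:
        return lineage  # no alias known for this prefix, keep the name unchanged
    return parent + dot + rest


def ancestors(lineage, alias_key):
    """List the full names of all ancestors, from the root downwards."""
    parts = expand_alias(lineage, alias_key).split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def is_descendant(child, parent, alias_key):
    """True if `child` sits strictly below `parent` once aliases are expanded."""
    full_child = expand_alias(child, alias_key)
    full_parent = expand_alias(parent, alias_key)
    return full_child.startswith(full_parent + ".")


print(expand_alias("C.6", ALIAS_KEY))
print(ancestors("C.10", ALIAS_KEY))
print(is_descendant("C.10", "B.1.1", ALIAS_KEY))
```

An alias that is absent from the table (as reported here for C.6 and C.10 in 'full_alias_key.txt') cannot be expanded by a lookup keyed on the whole name, which is why this sketch keys on the leading letter only.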

New Lineage proposal: B.1.177.637

Suggested by: Paula Ruiz-Rodriguez and Mireia Coscolla

Description:

Sublineage of: B.1.177
Proposed name: B.1.177.637
Earliest sequence: 2020-06-29
Latest sequence: 2021-02-06
Countries circulating: United Kingdom, Spain, France, Italy, Gibraltar, Denmark, Ireland, Norway, Switzerland, Hong Kong, Belgium, Portugal, Poland, Canada, China, Singapore, Japan, South Korea, Czech Republic, Austria, Luxembourg, Germany, Netherlands.

Genomes:

B.1.177.637.txt

Mutations of B.1.177.637:

Nucleotide position  Reference  Mutation  Type of mutation    Protein  Amino acid replacement
445                  T          C         synonymous_variant  ORF1ab   V60V
3037                 C          T         synonymous_variant  ORF1ab   F924F
6286                 C          T         synonymous_variant  ORF1ab   T2007T
14408                C          T         missense_variant    ORF1ab   P4715L
21255                G          C         synonymous_variant  ORF1ab   A6997A
22227                C          T         missense_variant    S        A222V
23403                A          G         missense_variant    S        D614G
25049                G          T         missense_variant    S        D1163Y
25062                G          T         missense_variant    S        G1167V
26801                C          G         synonymous_variant  M        L93L
28657                C          T         synonymous_variant  N        D128D
28932                C          T         missense_variant    N        A220V
29366                C          T         missense_variant    N        P365S
29645                G          T         missense_variant    ORF10    V30L
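
The amino acid positions in the table above can be cross-checked against the nucleotide positions. The sketch below assumes 1-based gene start coordinates taken from the reference genome annotation (NC_045512.2); these start values are not given in the proposal, and `codon_number` is a hypothetical helper. ORF1ab is left out because its ribosomal frameshift makes the arithmetic less direct.

```python
# Assumed 1-based start coordinates of coding regions on the reference genome
# (NC_045512.2 annotation); these values are not stated in the proposal itself.
GENE_START = {"S": 21563, "M": 26523, "N": 28274, "ORF10": 29558}


def codon_number(position, gene):
    """Return the 1-based codon (amino acid) number containing a genome position."""
    return (position - GENE_START[gene]) // 3 + 1


# (nucleotide position, protein, amino acid number reported in the table above)
ROWS = [
    (22227, "S", 222),
    (23403, "S", 614),
    (25049, "S", 1163),
    (25062, "S", 1167),
    (26801, "M", 93),
    (28657, "N", 128),
    (28932, "N", 220),
    (29366, "N", 365),
    (29645, "ORF10", 30),
]

for position, gene, reported in ROWS:
    computed = codon_number(position, gene)
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{gene}\t{position}\tcomputed {computed}\treported {reported}\t{status}")
```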

Evidence:

B 1 177_and_B 1 177 637_phylo

PDF with B.1.177.637 phylogeny: B.177.637.pdf

New lineage proposal for B.1.1.74 with multiple spike mutations

by Erik Alm

Description
Sub-lineage of: B.1.1.74
Earliest sequence: 2021-01-07
Most recent sequence: 2021-02-07
Countries circulating: Nigeria, UK, USA

Multiple spike mutations and NSP6 deletion commonly seen in VOCs compared to ancestral lineage, already spread to multiple countries.
Spike D614G, Spike D796H, Spike E484K, Spike P681H, Spike T95I, Spike Y144del
NSP6 106-108del

Evidence, see
new lineage.txt
attachments

Proposed lineage name: Q.1
new lineage

Sublineage of P.1 focused in Europe

Distinct sublineage of P.1 (originally focused in Manaus, Brazil). Predominantly in Italy, but also Germany.
image

One synonymous SNP separates these from P.1 but one Italian genome contains spike P681H (Italy/EMR-064126-004/2021, EPI_ISL_1132668).

Genome list:
P.1.1.txt

Lineage designation: P.1.1

Potential new B.1.X-B.1.2 sublineage in Texas

Submitted by: Shelby Hendrickson and Caitlin Maloney

Description
Sub-lineage of: B.1 or B.1.2 with Spike_L18F and Spike_E780Q, also associated with ORF8_H17Y
Earliest sequence: 09/25/2020
Most recent sequence: 2/19/2021
Countries circulating: US, Canada

Appears to be two clusters of these spike mutations appearing concurrently.
First reported in GISAID 09/25/2020 in British Columbia, Canada = 595 cases, with 582 occurring in British Columbia. 17% prevalence of cases submitted to GISAID for Texas.
Also reported in GISAID 101/20/2020 in Texas, USA. USA = 375 cases, with 305 occurring in Texas. 4% prevalence of cases submitted to GISAID for Texas.
https://outbreak.info/situation-reports?country=United%20States&country=Canada&division=Texas&division=British%20Columbia&pango&muts=S%3AE780Q&muts=S%3AL18F&selected=United%20States&selectedType=country

Genomes
EPI_ISL_979400
EPI_ISL_979398
EPI_ISL_979396
EPI_ISL_979386
EPI_ISL_979380
EPI_ISL_978312
EPI_ISL_967510
EPI_ISL_967163
EPI_ISL_966655
EPI_ISL_966313
EPI_ISL_942926
EPI_ISL_942923
EPI_ISL_942914
EPI_ISL_942910
EPI_ISL_940807
EPI_ISL_935734
EPI_ISL_935721
EPI_ISL_935702
EPI_ISL_911669
EPI_ISL_911668
EPI_ISL_911493
EPI_ISL_911479
EPI_ISL_911475
EPI_ISL_886571
EPI_ISL_872379
EPI_ISL_786618
EPI_ISL_786170
EPI_ISL_786002
EPI_ISL_785824
EPI_ISL_785296
EPI_ISL_785255
EPI_ISL_785213
EPI_ISL_785132
EPI_ISL_785118
EPI_ISL_785114
EPI_ISL_785107
EPI_ISL_785054
EPI_ISL_785021
EPI_ISL_784992
EPI_ISL_784839
EPI_ISL_784789
EPI_ISL_784736
EPI_ISL_784699
EPI_ISL_784644
EPI_ISL_784607
EPI_ISL_784581
EPI_ISL_784546
EPI_ISL_784160
EPI_ISL_784143
EPI_ISL_783981
EPI_ISL_783925
EPI_ISL_783867
EPI_ISL_783792
EPI_ISL_783756
EPI_ISL_783718
EPI_ISL_783613
EPI_ISL_1114143
EPI_ISL_1114027
EPI_ISL_1113917
EPI_ISL_1113899
EPI_ISL_1113895
EPI_ISL_1113875
EPI_ISL_1110294
EPI_ISL_1110290
EPI_ISL_1109990
EPI_ISL_1090836
EPI_ISL_1090696
EPI_ISL_1090671
EPI_ISL_1090626
EPI_ISL_1090527
EPI_ISL_1090462
EPI_ISL_1089106
EPI_ISL_1088589
EPI_ISL_1088293
EPI_ISL_1087894
EPI_ISL_1087851
EPI_ISL_1087010
EPI_ISL_1086590
EPI_ISL_1086565
EPI_ISL_1086477
EPI_ISL_1086459
EPI_ISL_1086394
EPI_ISL_1081176
EPI_ISL_1081034
EPI_ISL_1081017
EPI_ISL_1081011
EPI_ISL_1080946
EPI_ISL_1080925
EPI_ISL_1080920
EPI_ISL_1080865
EPI_ISL_1080367
EPI_ISL_1080306
EPI_ISL_1080102
EPI_ISL_1080095
EPI_ISL_1080047
EPI_ISL_1079989
EPI_ISL_1079988
EPI_ISL_1079885
EPI_ISL_1079830
EPI_ISL_1079817
EPI_ISL_1079788
EPI_ISL_1079785
EPI_ISL_1079757
EPI_ISL_1079727
EPI_ISL_1079679
EPI_ISL_1079668
EPI_ISL_1079637
EPI_ISL_1079614
EPI_ISL_1079573
EPI_ISL_1079505
EPI_ISL_1079501
EPI_ISL_1079497
EPI_ISL_1079491
EPI_ISL_1079413
EPI_ISL_1079329
EPI_ISL_1079327
EPI_ISL_1079306
EPI_ISL_1079286
EPI_ISL_1079260
EPI_ISL_1079255
EPI_ISL_1079245
EPI_ISL_1079242
EPI_ISL_1079228
EPI_ISL_1079225
EPI_ISL_1079170
EPI_ISL_1079167
EPI_ISL_1079155
EPI_ISL_1079138
EPI_ISL_1079112
EPI_ISL_1079075
EPI_ISL_1079066
EPI_ISL_1079059
EPI_ISL_1078892
EPI_ISL_1078862
EPI_ISL_1078861
EPI_ISL_1078809
EPI_ISL_1078691
EPI_ISL_1078672
EPI_ISL_1078662
EPI_ISL_1078466
EPI_ISL_1078435
EPI_ISL_1078425
EPI_ISL_1078421
EPI_ISL_1078420
EPI_ISL_1078387
EPI_ISL_1078380
EPI_ISL_1078332
EPI_ISL_1078318
EPI_ISL_1078269
EPI_ISL_1078236
EPI_ISL_1078221
EPI_ISL_1078176
EPI_ISL_1078122
EPI_ISL_1078060
EPI_ISL_1078038
EPI_ISL_1078009
EPI_ISL_1078001
EPI_ISL_1077987
EPI_ISL_1077974
EPI_ISL_1077931
EPI_ISL_1077908
EPI_ISL_1077901
EPI_ISL_1077884
EPI_ISL_1077864
EPI_ISL_1077863
EPI_ISL_1077854
EPI_ISL_1077838
EPI_ISL_1077829
EPI_ISL_1077785
EPI_ISL_1077748
EPI_ISL_1077719
EPI_ISL_1077713
EPI_ISL_1077682
EPI_ISL_1077675
EPI_ISL_1077631
EPI_ISL_1077600
EPI_ISL_1077590
EPI_ISL_1077566
EPI_ISL_1077543
EPI_ISL_1077507
EPI_ISL_1077494
EPI_ISL_1077489
EPI_ISL_1077473
EPI_ISL_1077469
EPI_ISL_1077442
EPI_ISL_1077428
EPI_ISL_1077344
EPI_ISL_1077323
EPI_ISL_1077217
EPI_ISL_1077195
EPI_ISL_1077143
EPI_ISL_1077137
EPI_ISL_1077110
EPI_ISL_1077061
EPI_ISL_1077025
EPI_ISL_1077024
EPI_ISL_1077010
EPI_ISL_1076969
EPI_ISL_1076938
EPI_ISL_1076906
EPI_ISL_1076903
EPI_ISL_1076858
EPI_ISL_1076855
EPI_ISL_1076843
EPI_ISL_1076816
EPI_ISL_1076786
EPI_ISL_1076757
EPI_ISL_1076756
EPI_ISL_1076746
EPI_ISL_1076719
EPI_ISL_1076718
EPI_ISL_1076717
EPI_ISL_1076663
EPI_ISL_1076657
EPI_ISL_1076648
EPI_ISL_1076587
EPI_ISL_1076507
EPI_ISL_1076323
EPI_ISL_1076309
EPI_ISL_1076302
EPI_ISL_1076272
EPI_ISL_1076256
EPI_ISL_1076244
EPI_ISL_1076226
EPI_ISL_1076217
EPI_ISL_1076203
EPI_ISL_1076184
EPI_ISL_1076137
EPI_ISL_1076120
EPI_ISL_1076102
EPI_ISL_1076079
EPI_ISL_1076038
EPI_ISL_1075945
EPI_ISL_1075913
EPI_ISL_1075884
EPI_ISL_1075873
EPI_ISL_1075848
EPI_ISL_1075822
EPI_ISL_1075793
EPI_ISL_1075678
EPI_ISL_1075668
EPI_ISL_1075547
EPI_ISL_1075504
EPI_ISL_1075423
EPI_ISL_1075354
EPI_ISL_1075304
EPI_ISL_1075299
EPI_ISL_1075287
EPI_ISL_1075241
EPI_ISL_1075186
EPI_ISL_1075177
EPI_ISL_1075145
EPI_ISL_1075137
EPI_ISL_1075112
EPI_ISL_1075097
EPI_ISL_1075084
EPI_ISL_1075071
EPI_ISL_1075068
EPI_ISL_1075057
EPI_ISL_1075055
EPI_ISL_1075052
EPI_ISL_1074978
EPI_ISL_1074970
EPI_ISL_1074962
EPI_ISL_1074949
EPI_ISL_1074936
EPI_ISL_1074912
EPI_ISL_1074890
EPI_ISL_1074850
EPI_ISL_1074847
EPI_ISL_1074839
EPI_ISL_1074817
EPI_ISL_1074750
EPI_ISL_1074682
EPI_ISL_1074670
EPI_ISL_1074662
EPI_ISL_1074652
EPI_ISL_1074641
EPI_ISL_1074626
EPI_ISL_1074619
EPI_ISL_1074615
EPI_ISL_1074612
EPI_ISL_1074597
EPI_ISL_1074570
EPI_ISL_1074562
EPI_ISL_1074519
EPI_ISL_1074515
EPI_ISL_1074514
EPI_ISL_1074511
EPI_ISL_1074449
EPI_ISL_1074420
EPI_ISL_1074395
EPI_ISL_1074394
EPI_ISL_1074370
EPI_ISL_1074358
EPI_ISL_1074327
EPI_ISL_1074318
EPI_ISL_1074312
EPI_ISL_1059264
EPI_ISL_1059263
EPI_ISL_1059248
EPI_ISL_1059245
EPI_ISL_1059244
EPI_ISL_1036945
EPI_ISL_1029838
EPI_ISL_1027715
B.1.X-B.1.2 Sublineage Texas.pdf

Evidence
B 1 2 X

Possible new Italian lineage, B.1.177.88, mostly localized in Campania region.

New lineage proposal
by Antonio Grimaldi

Description:
New lineage harboring S E484K mutation.
The proposed new lineage is specifically located near Salerno, Campania region, Italy. It harbors, among others, the S E484K substitution.

List of aa substitutions:
N: Ala220Val
ORF10: Val30Leu
orf1ab: Arg6997Pro
orf1ab: Thr6374Ala
S: Ala222Val
S: Ala262Ser
S: Asp614Gly
S: Glu484Lys
S: Leu1063Phe
S: Pro272Leu
S: Thr572Ile
Sub-lineage of: B.1.177
Earliest sequence: 2020-12-26
Most recent sequence: 2021-02-11
New-Lineage.xlsx

Tree:
image

Countries circulating: Italy

Proposed lineage name: B.1.177.28

Potential sequences that should be included in B.1.617

Potential need for inclusion in designation of B.1.617

Flagging this for follow up

From Bijaya Dhakal via Gunter Bach:

Viruses with a similar mutation profile were classified as B.1.617 (E484Q/L452R). However, the sequences below have the same double mutation but are still classified as B.1.596.

EPI_ISL_1415164
EPI_ISL_1415165
EPI_ISL_1415172
EPI_ISL_1415181
EPI_ISL_1415203
EPI_ISL_1415233
EPI_ISL_1415276
EPI_ISL_1415277
EPI_ISL_1415278
EPI_ISL_1415286
EPI_ISL_1415317
EPI_ISL_1415318
EPI_ISL_1415319
EPI_ISL_1415356
EPI_ISL_1415357
EPI_ISL_1415358
EPI_ISL_1415386
EPI_ISL_1415387
EPI_ISL_1454202

Example Lineage Proposal

New lineage proposal

by

Description

Sub-lineage of:
Earliest sequence:
Most recent sequence:
Countries circulating:

<written description of the proposed lineage, where it is circulating, when, why it is distinct>

Genomes

<list of genome names as found on GISAID or accession ids - provide as a CSV/TSV table attachment - include locations and dates of sampling>

Evidence

<image from phylogenetic tree?>

Proposed lineage name

The pangolin web application doesn't read my .fasta files

When I upload my files, the software returns this message:

FAILED: Sequence unable to be processed with Pangolin (unknown error)

I have already verified the files, and they are OK. It happens with 5 specific files, probably associated with the P.1 lineage according to the mutations found. With the other 30 files, the process worked.

Correct B.1.1.7 definition to remove deletions in S, ORF1a

There's a cluster in Michigan that resembles B.1.1.7 very closely except that it lacks S: Δ143–144 and ORF1a: Δ3675–3677 (instead, 3676G/3677F are replaced by a single L). Presumably what actually happened here is that the founding strain incubated in a single human host for a while, invented N:D3L and a collection of other substitutions, and the deletions happened after that — probably still in the same host. Somebody took the version without deletions home to Michigan and it didn't get picked up in the US's irregular sequencing until months later.

ORF1a: 3677L is a pretty good marker of the subclade without the deletions. But functionally I expect it's very similar to B.1.1.7. It only matters because RT-PCR proxies for actual sequencing that rely on Δ143–144 S-gene target failure will miss this subclade completely.

Undeletions like this simply do not happen, and if anybody knows somebody at nextstrain.org, I'd suggest that they adjust their common-ancestry cost function accordingly. (Insertions of any kind seem to be vanishingly rare in this virus, which is what you'd expect from its replication architecture. Very very very rarely, one sees gene duplication events and similar artifacts of degraded template-switching late in the life of an individual replication organelle. But that's my amateur perspective; ask a coronavirus expert. Maybe start from https://mbio.asm.org/content/12/1/e03014-20?)

Not sure how to get nextstrain.org to dump a CSV, but you can get to the earliest sequence in their Michigan subsample at https://nextstrain.org/groups/spheres/ncov/michigan?branchLabel=clade&c=gt-ORF1a_3676,3677,3966,265,3352&f_division=Michigan&label=clade:20I/501Y.V1&s=USA/MI-MDHHS-SC23042/2021

Screen Shot 2021-04-12 at 1 52 57 AM

Proposal for new lineage within B.1

Description

Sub-lineage of: B.1
Earliest sequence: 2020-12-15 (England/CAMC-C769B3/2020)
Most recent sequence: 2021-01-28 (England/MILK-119FD0B/2021)
Countries circulating: England (28), Nigeria (7), USA (7), France (5), Canada (4), Ghana (4), Japan (4), Jordan (2), Belgium (1), Italy (1), Spain (1)

Characterised by spike mutations: E484K, Q677H, F888L, 69-70 deletion, 144 deletion and 9 nucleotide mutation in nsp6 (as seen in B.1.1.7, B.1.351, P.1).

Genomes

B.1.525_genomes.txt

List of genomes, collated 2021-02-11, attached.

Evidence

Phylogenetic tree PDF image. Proposed new lineage labels in red:
B 1 525 ml tree

Phylogenetic tree file:
B.1.525.ml.tree.txt

Proposed lineage name

To be determined as the next available designation within B.1

defining mutations

gene                amino acid      coordinates
ORF1ab              -               C1498T
ORF1ab              -               A1807G
ORF1ab              -               T8593C
ORF1ab              -               C9565T
ORF1ab              -               11288-11296del
ORF1ab              L4715F (L323F)  C14407T
ORF1ab              -               C18171T
ORF1ab              -               A20724G
spike               Q52R            A21717G
spike               69-70del        -
spike               144del          -
spike               E484K           G23012A
spike               Q677H           G23593C
spike               F888L           T24224C
spike               -               C24748T
E                   L21F            C26305T
M                   I82T            T26767C
N                   SD2Y deletion   28278-28280del
N                   -               A28699G
N/ORF10 intergenic  -               G29543T

Novel lineage with key amino acid SNPs and a 9bp spike insertion

Proposed lineage name

B.1.214.2

Description

Sub-lineage of: B.1.214
Countries circulating: Belgium, Denmark, Switzerland, France, Mayotte

Mutations

S
9bp insertion 22204:ACAGATCGA
S:Q414K
S:N450K
S:T716I

ORF3a
30bp deletion 25448-25478

ORF1a deletion 11288-11297

The 9bp insertion in spike at position 22204:ACAGATCGA inserts the amino acids TDR downstream of R214. It also has a 30bp deletion in ORF3a (25448-25478) and approximately 60% of the samples have a 9bp deletion in ORF1a (11288-11297), the same deletion observed in the lineages B.1.1.7, B.1.351, B.1.525 and P.1.
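
The statement that the 9bp insertion encodes TDR after R214 can be checked with a short translation sketch. Two assumptions are made here and are not stated in the proposal: that 22204 is the last reference base before the inserted sequence, and that the S coding region starts at position 21563 of the reference genome. `translate` and `codon_and_phase` are hypothetical helper names.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

S_START = 21563  # assumed 1-based start of the S coding region (NC_045512.2)


def translate(seq):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def codon_and_phase(position):
    """Return (codon number, base within codon), both 1-based, for a spike position."""
    offset = position - S_START
    return offset // 3 + 1, offset % 3 + 1


print(translate("ACAGATCGA"))
print(codon_and_phase(22204))
```

Under these assumptions, position 22204 is the third base of codon 214, so a 9bp insertion placed directly after it stays in frame and adds three whole codons.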

Taxa

183 sequences on GISAID
image

B.1.214_2021-03-02.lineages.csv.zip

Evidence

Screenshot 2021-03-02 at 12 50 42

Red = B.1.214; Green = B.1.214.1 (Congo/France lineage); Blue = B.1.214.2 (European/Belgian lineage with key mutations and insertion mutation)

Suggested by

Keith Durkin, Piet Maes, Simon Dellicour, Guy Baele, Áine O'Toole, Andrew Rambaut

Proposal for new lineage C.1

by Andrew Rambaut

Description

Sub-lineage of: B.1.1.1
Earliest sequence: England/LIVE-A4D52/2020 | 2020-03-02
Most recent sequence: England/QEUH-888FDE/2020 | 2020-07-27
Countries circulating: Predominantly UK lineage spanning March to end of July (probably still circulating). Other countries are mainly singletons or small clusters

Genomes

C.1_taxa.txt

List of genomes, collated 2020-08-14, attached.

Evidence

Phylogenetic tree PDF image. Proposed lineage C.1 labels in red:
B.1.1.1.tree.pdf

Phylogenetic tree file:
B.1.1.1.tree.txt

Proposed lineage name

C.1 (this is the first sub-lineage of B.1.1.1 and instead of B.1.1.1.1 it becomes C.1)
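
The renaming described above (B.1.1.1.1 becoming C.1) can be sketched as a prefix substitution. This is an illustration, not the Pango tooling: the `ALIAS_KEY` table holds only the single C entry from this proposal, and `compress` is a hypothetical helper name.

```python
# Illustrative alias table containing only the entry discussed in this proposal.
ALIAS_KEY = {"C": "B.1.1.1"}


def compress(full_name, alias_key):
    """Rewrite a full lineage name using the longest matching alias, if any."""
    best = None
    for alias, parent in alias_key.items():
        # Match on 'parent.' so that B.1.1.10 is not mistaken for a child of B.1.1.1.
        if full_name.startswith(parent + "."):
            if best is None or len(parent) > len(alias_key[best]):
                best = alias
    if best is None:
        return full_name  # nothing to shorten
    return best + full_name[len(alias_key[best]):]


print(compress("B.1.1.1.1", ALIAS_KEY))
print(compress("B.1.1.7", ALIAS_KEY))
```

Matching on the parent name followed by a dot is the main design choice here, since a plain prefix test would wrongly shorten names such as B.1.1.10.2.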

New sub lineages of B.1.214 (B.1.214.3 and B.1.214.4)

Description

Related to pango release v1.1.6

Updated designations for sublineages of B.1.214. There are now 4 sister lineages: B.1.214.1 from Congo, B.1.214.2 from Belgium with a defining 9bp insertion in the NTD of the spike gene, B.1.214.3, a French lineage, and B.1.214.4, a Danish lineage.

Lineage names & Numbers designated

B.1.214   43
B.1.214.1 20
B.1.214.2 344
B.1.214.3 58
B.1.214.4 89

New proposed sub-lineage in India under B.1

Description

Sub-lineage of: B.1
Earliest sequence: 2020-10-02 (EPI_ISL_1415203)
Most recent sequence: 2021-03-31 (EPI_ISL_1471955)
Countries circulating: India and Australia, Canada, Germany, Guadeloupe, New Zealand, Singapore, Sweden, USA, United Kingdom

Lineage was found to be circulating in India in 2021, predominantly in the state of Maharashtra, along with the states of Gujarat, West Bengal, Karnataka, Kerala and Tamil Nadu (https://www.bbc.com/news/world-asia-india-56517495). The genomes have a combination of 2 mutations in the spike protein: L452R and E484Q.

Genomes
gisaid_accessions.txt

Evidence
image
Live nextstrain instance is available at https://nextstrain.org/community/banijolly/Phylovis/COVID-India?c=gt-S_452,484

Proposal for a new sub-lineage of B.1 with spike substitutions N440K and E484K

Description
Sub-lineage of: B.1
Earliest sequence: 2021-01-16
Most recent sequence: 2021-03-07
Countries circulating: Cameroon, Switzerland, France, Germany, England, Belgium, USA

This proposed sub-lineage was first identified in Switzerland from patients returning from Cameroon and their direct contacts. It was subsequently also detected in Cameroon, France, Germany, England, Belgium, USA.
This proposed sub-lineage is characterised by the spike protein amino acid mutations N440K and E484K in the RBD (as well as I210T, A879S, D936N, S939F and T1027I).

Genomes
accessions.txt

Evidence
evidence
