The number of species, or richness, and the equitability of their AD reflect ecological processes such as niche partitioning, resource distribution, and disturbance [
9]. Richness is known to be very high within SAR11 assemblages, high enough, for instance, to have hampered traditional metagenomic assembly despite the abundance of SAR11 reads in most metagenomes [
11]. However, studies focused specifically on richness are few compared to those that have dealt with other aspects of the biology of the group. A pioneering metagenomic read recruitment (MRR) study showed that sequences retrieved from the Sargasso Sea (SS) were between 30% and 80% identical to SAR11 reference strain HTCC1062 at the amino acid level, indicating that the SAR11 sequences from the SS are very different from each other [
12]. Later MRR studies conducted elsewhere, and with other reference sequences, produced similar results [
13,
14]. Likewise, comparisons among 140 internal transcribed spacer (ITS) sequences retrieved from 4 metagenomes from the Arctic Ocean showed that only 29 ITS sequences were identical to at least one of the other sequences [
15]. In addition, a metabarcoding study that surveyed ITS molecular clones from the Red Sea revealed the presence of 69 to 130 SAR11 sequence variants in 7 samples [
16]. Our results extend the above-referenced studies, showing that SAR11 assemblages can harbor hundreds to thousands of variants of the 16S gene (Tab.1). SAR11 species possess a single copy of the 16S gene [
17,
18], which eases the interpretation of metabarcoding data. Interestingly, small differences in this gene can be associated with important differences at the genome level. For example,
Ca. Pelagibacter ubique str. HTCC1062 and
Ca. Pelagibacter sp. str. HTCC7211, two representatives of the
Ia subclade, are about 99% identical to each other in the 16S gene, but differ in their gene content and are only about 75% identical at the amino-acid level [
19,
20]. We are not aware of studies that have used high-throughput metabarcoding to specifically study richness. To the best of our knowledge, therefore, this work provides the first data in that regard. Our results show that high-throughput metabarcoding can provide a far more detailed picture of variation at a single locus than alternative approaches. For example, the 4 metagenomes from the Arctic Ocean mentioned above provided 140 sequences of the targeted gene, whereas we generated between 7,176 and 14,797 such sequences from our 4 coastal samples (Tab.1). MRR has the desirable property of targeting the entire genome, but there is no way to translate the results generated by this method into measures of richness and AD. Furthermore, it requires a reference sequence, whereas metabarcoding is a reference-free technique.
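
To illustrate how single-locus metabarcoding reads can be translated into measures of richness and equitability, the following minimal sketch counts distinct sequence variants and computes the Shannon index and Pielou's evenness. It is not the procedure used in this work: the function name `richness_and_evenness`, the choice of these two indices, and the toy reads are assumptions made only for illustration, and real data would first require quality filtering and denoising.

```python
import math
from collections import Counter


def richness_and_evenness(sequences):
    """Return (richness, Shannon H', Pielou J') for a list of sequence strings.

    Hypothetical helper for illustration; each distinct string is treated
    as one variant, with no error correction or clustering.
    """
    counts = Counter(sequences)           # abundance of each distinct variant
    total = sum(counts.values())
    richness = len(counts)                # number of distinct variants, S
    # Shannon index: H' = -sum(p_i * ln p_i)
    shannon = -sum((n / total) * math.log(n / total) for n in counts.values())
    # Pielou's evenness: J' = H' / ln(S); undefined for S < 2, returned as 0.0
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness


# Toy reads (hypothetical), not data from this study
reads = ["ACGT"] * 6 + ["ACGA"] * 3 + ["ACGC"]
s, h, j = richness_and_evenness(reads)
print(f"richness={s} shannon={h:.3f} evenness={j:.3f}")
```

An evenness close to 1 indicates that variants are similarly abundant, whereas values near 0 indicate dominance by a few variants. Because richness estimates depend on sequencing depth, samples would normally be compared at equal depth.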