INTRODUCTION
Marine cold seeps are island-like ecosystems on the seafloor that are sustained by the emission of methane and other hydrocarbons from underlying sediments (
1,
2). In 2012, a series of ~570 previously undiscovered cold seeps were detected at depths of 50 to 1,700 m below sea level (mbsl) along the tectonically passive, northern U.S. Atlantic Margin (USAM). Roughly 75% of these seeps exist at water depths surrounding the updip limit of the gas hydrate stability zone (
3,
4), and many are fed in part by hydrate dissociation as local bottom water temperatures rise (
3,
5). Due to the relationship between water temperature and methane release on the USAM, anthropogenic climate change may continue to intensify methane seepage (
6,
7). Furthermore, an estimated 29,500 upper-slope seeps may be discoverable on other continental margins with similar lithology and hydrate reservoirs (
3), suggesting that the USAM seeps could be a model for cold seeps on other passive margins.
Despite their sensitivity to climate change and potential importance as model systems, the methane-consuming microbial communities at the USAM seeps have not been fully characterized. In fact, outside major sedimentary basins like the Gulf of Mexico, few studies have investigated the microbiology of passive margin seeps in general. Those that have were conducted recently within the South China Sea (
8–10) or within the southern USAM near Blake Ridge (
11,
12). However, the previous USAM studies used Sanger sequencing technologies and therefore lacked the sequencing depth ideal for these diverse environments. Several factors may lead to community differences between active and passive margin seeps, including dispersal effects due to decreased subsurface connectivity on passive margins, as well as differences in the origin and composition of seep fluids (
13–15), highlighting the need to study their potentially distinct microbiology.
Previous studies conducted primarily on active margins have documented a distinctive microbial community at seeps, termed the seep microbiome (summarized in reference
2). This community consists most notably of symbiotic anaerobic methanotrophic (ANME) archaea and sulfate-reducing bacteria (SRB) of the class
Deltaproteobacteria, which together couple the anaerobic oxidation of methane (AOM) to sulfate reduction (
16–19). The strains responsible for AOM have not been isolated, but phylogenetic and microscopic analyses have led to the classification of four major ANME lineages—ANME-1a/b (
20,
21), ANME-2a,b,c (
18,
19,
21), ANME-2d (
22,
23), and ANME-3 (
24–26)—as well as several groups of SRB—Seep-SRB1 (
18,
19,
27,
28) in the
Desulfobacteraceae, Seep-SRB2 (
29) and Seep-DBB (
30,
31) in the
Desulfobulbaceae, and a deeply branching thermophilic clade of
Deltaproteobacteria called HotSeep-1 (
29,
32). While ANME archaea, and particularly ANME-1, can be detected as single cells or as monospecific aggregates (
21,
26), ANME organisms and SRB are often physically associated in tight, multispecies consortia (
18,
19). Together, the organisms are responsible for oxidizing an estimated 80% of total sediment methane emissions (
33,
34) and thereby constitute a significant filter for greenhouse gas emissions. Other microbial groups are also enriched and widespread at seeps, including the sulfide-oxidizing and aerobic methane-oxidizing
Gammaproteobacteria of the orders
Thiotrichales and
Methylococcales, respectively, the putatively methanotrophic JS1 lineage of
Atribacterota (
35), and the acetogenic candidate division Hyd24-12 (“
Candidatus Fermentibacteria”) (
2). Although there is a consistent presence of these class- and order-level taxa at cold seeps worldwide, seep microbial communities diversify at finer taxonomic scales (
2,
15,
36). In an analysis of 23 globally distributed cold seeps by Ruff and colleagues, no operational taxonomic unit clustered at 97% similarity was shared between all seeps analyzed, and even the most cosmopolitan was shared among only 18 seeps (
2).
The diversity at seeps is partly attributed to local environmental and geochemical features, such as temperature (
2), latitude (
36), water depth (
2), and methane (
37) and sulfate (
2,
8,
38) concentrations. However, an overarching understanding of assembly processes at seeps (specifically, the relative importance of deterministic versus stochastic processes) is lacking, and the influence of distance-decay relationships is unknown. Deterministic processes include selection by environmental parameters and interactions between microorganisms, while stochastic processes are random events related to birth/death and dispersion. Distance-decay relationships are characterized by decreasing community similarity with physical distance, based on the concept that dispersal from one site to another is limited by physical transport and decreases with increasing distance (
39). However, how this concept applies to the island-like ecosystems of seeps is not well known, particularly seeps on passive margins where a lack of localized faulting limits subsurface microbial dispersal. Many of the microorganisms inhabiting seeps require anoxic conditions and have no obvious avenues of transport through the largely oxic surface sediments and bottom water. Although seep communities have been studied for nearly two decades now, surprisingly few have been investigated with next-generation amplicon sequencing, precluding data analyses that require deep sequencing and/or data from many samples per site. Applying these approaches to methane seeps, we can now begin to answer questions about community assembly and dispersion.
Here, we investigated the sediment microbial communities and pore water chemistry at four passive margin cold seep sites on the northern USAM (
Fig. 1): Shallop Canyon East (335 to 366 mbsl), Shallop Canyon West (390 to 395 mbsl), New England Seep (1,130 to 1,252 mbsl), and Veatch Canyon (1,407 to 1,545 mbsl). At each site, sediment cores were collected via submersible within and near (1 m to tens of meters from) areas of active methane seepage (referred to as Seep and Near-Seep cores, respectively) (see Table S1 in the supplemental material). Sediment cores were also collected via multicore outside (~500 to 1,500 m from) each site (referred to as Background cores) (Table S1). Using amplicon sequencing, we investigated the composition (DNA) and potential activity (RNA) of the total microbial community (
Bacteria and
Archaea) with 16S rRNA primers and of the methane-cycling community with methyl coenzyme M reductase (
mcrA) primers. Our goals were to (i) characterize the microorganisms within this previously undescribed and potentially climate-sensitive series of cold seeps; (ii) identify potential ecological relationships between microbial taxa at seeps; (iii) investigate the factors involved in seep community assembly; and (iv) compare USAM microbial communities to those at previously sequenced, globally distributed seeps to explore distance-decay relationships. This comparative analysis provides the first deep-sequencing perspective of the USAM seeps and, more broadly, will help us understand the influence of dispersal and environmental selection on cold-seep microorganisms.
MATERIALS AND METHODS
Site description.
The four USAM cold seep sites investigated in this study were Shallop Canyon East (335 to 366 mbsl), Shallop Canyon West (390 to 395 mbsl), New England Seep (1,130 to 1,252 mbsl), and Veatch Canyon (1,407 to 1,545 mbsl). The shallowest sites, Shallop Canyon East and West, were located on the upper rim of Shallop Canyon, in gently sloping seafloor. Backscatter data had previously indicated the presence of roughly 15 gas plumes in this area (
40). New England Seep was located 13 km southwest of Shallop Canyon on a steeply sloping, northwest-to-southeast-trending ridge, where 5 gas plumes had been previously detected (
40). Veatch Canyon, the deepest site sampled, was located 27 km southwest of New England Seep on a steeply sloping, northeast-to-southwest-trending ridge at the base of Veatch Canyon, where 13 gas seeps had been inferred from backscatter data (
40).
Sample collection and processing.
Sediment was collected from all four sites in July and August of 2016 aboard the R/V
Atlantis (cruise AT36). At each site, sediment push cores up to 20 cm long were collected via the HOV
Alvin submersible, either within a cold seep or from sediment several meters away. These were categorized as Seep cores and Near-Seep cores, respectively (Table S1). Seeps were delineated by the presence of white, filamentous microbial mats, live mussels, and/or visible methane gas bubbles. The filamentous mats were later confirmed to be rich in bacterial taxa known to oxidize sulfide (
40). In addition, multicores were collected using an MC-400 multicoring device equipped with a Nikon D3300 24.2 MP DSLR to collect background sediment from the area surrounding each site; these were categorized as Background cores (Table S1).
Onboard, cores were kept at 4°C until extruded from push core liners and sectioned into 3-cm horizons within 24 h of collection. Several 1-mL subsamples of each depth horizon were immediately flash frozen in liquid nitrogen and preserved at −80°C for later DNA and RNA analyses. For methane measurements, 3-mL subsamples were transferred into 25-mL butyl-rubber-sealed vials filled with 5 mL of 5 M sodium hydroxide solution. Paired, unextruded cores were processed for pore water geochemistry at the same 3-cm horizons, using Rhizon samplers inserted through predrilled holes. Pore water (0.5 mL) was fixed with 0.5 mL of 0.5 M zinc acetate and stored at 4°C; the remaining pore water was stored at −20°C.
Pore water geochemistry.
Headspace methane concentrations were measured from butyl-rubber-sealed vials (see above) using gas chromatography coupled with flame ionization detection (Shimadzu GC-2014; Boston University, Boston, MA, USA) and back-calculated to pore water concentrations using porosity values determined via weight loss after dehydration (and assuming a sediment density of 2.65 g cm
−3). Sulfide concentrations were measured in triplicate and determined colorimetrically from the zinc acetate-preserved pore water samples using the methylene blue method (
75), with a detection limit of 0.02 mM. All other assays were performed without replication due to limitations on pore water volume. Ammonium concentrations were determined colorimetrically using the indophenol blue method (
76), with a detection limit of 0.5 μM. Nitrite and nitrate (NO
x − nitrite) concentrations were determined colorimetrically using the vanadium(III) chloride method (
77), with detection limits of 0.1 μM and 1 μM, respectively. Concentrations of sulfate, phosphate, chloride, bromide, and acetate were determined by ion chromatography on frozen pore water samples (Dionex DX-500 ion chromatograph; Stanford University, CA, USA). Detection limits were 100, 0.05, 200,000, 100, and 10 μM, respectively.
Nucleic acid extraction, amplification, and sequencing of 16S rRNA and mcrA genes and transcripts.
RNA and DNA were extracted from flash-frozen sediments using the RNeasy PowerSoil total RNA isolation kit and the RNA PowerSoil DNA elution accessory kit (MoBio Laboratories, Carlsbad, CA, USA) from sediments flash frozen on board and stored at −80°C. RNA extracts were cleaned with the Ambion Turbo DNA-free kit (Thermo Fisher Scientific, Waltham, MA, USA), and reverse transcription of RNA to cDNA was completed using Superscript III first-strand synthesis Supermix (Thermo Fisher Scientific, Waltham, MA, USA).
DNA and cDNA were concentration normalized and amplified using a two-step PCR plan for Illumina amplicon sequencing. In the first step, universal primers 515F-Y and 926R (
78) were used to target the V4-V5 region of the 16S rRNA gene sequence. In addition, two separate primer pairs, mcrA_F/mcrA_R (
47,
79) and mcraMF/mcraMR, were used to target
mcrA genes and transcripts. The latter primer set was designed in house and modified from reference
80. All three sets of primers included an extension complementary to the primers used in the second PCR. The gene-targeting region of the primer sequences are listed in
Table 1. The mcrA_F/mcrA_R primer pair amplified the DNA and cDNA from more samples than mcraMF/mcraMR, likely due to increased efficiency with reduced degeneracies (Table S2). PCRs (25 μL) were performed with mixtures containing 0.5 μL of forward and 0.5 μL of reverse primers (10 μM concentration), 10 μL 5PRIME HotMasterMix (2.5×; Quanta-Bio, Beverly, MA, USA), 13 μL DNase-free water, and 1 μL DNA or cDNA template. The thermal cycling conditions were as follows: initial denaturing at 95°C for 180 s; 25 cycles of 95°C for 45 s, 50°C for 45 s, and 68°C for 90 s; a final elongation step at 68°C for 300 s; and refrigeration at 4°C until removal and storage.
In the second step, Illumina adaptors, barcodes, and indices were added to the amplicons. The same PCR mix was used with custom primers targeting the primer extension in the first PCR. The thermal cycling conditions were as follows: initial denaturing at 95°C for 180 s; 8 cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 30 s; a final elongation step at 72°C for 300 s; and refrigeration at 4°C until removal and storage. Amplicons were cleaned with 0.7× AMPure XP magnetic beads (Beckman-Coulter, Brea, CA, USA), pooled, and quantified before being sent to the UC Davis DNA Technologies Core Facility (Davis, CA, USA) for Illumina MiSeq 2 × 250 bp (16S rRNA) or 2 × 300 bp (mcrA) sequencing. Nine 16S rRNA and five mcrA samples were randomly chosen for duplicate amplification. The average weighted UniFrac distance between duplicate 16S rRNA samples was 0.080. Negative (molecular-grade water) and positive (mock communities of known composition) controls were processed and sequenced in parallel with the samples. Lack of DNA contamination in the RNA extracts was confirmed by processing RNA extracts (without reverse transcription) in parallel and seeing no visible amplification on a gel after the second PCR.
Phylogenetic analysis of 16S rRNA and mcrA genes and transcripts.
Returned, demultiplexed sequences were trimmed with cutadapt (v. 2.10) (
81) and then filtered and processed using the R (v. 4.0.2) package DADA2 (v. 1.16.0) (
82). Reads were trimmed to 216 (16S rRNA) or 260 and 230 (
mcrA; forward and reverse reads, respectively) base pairs, with those containing more than 2 expected sequencing errors removed. Amplicon sequence variants (ASVs) were then inferred from filtered reads. No reads were recovered from blank samples (9 16S rRNA blanks; 5
mcrA blanks). Phylogenetic classification of 16S rRNA ASVs was based on the SILVA SSU database (v. 132) (
83); on average, 1.96 × 10
4 ± 1.47 × 10
4 16S rRNA reads were recovered per sample. Classification of
mcrA ASVs was determined manually based on a phylogenetic tree, created as described below.
Sequences generated with the mcrA_F/mcrA_R primers rather than the mcraMF/mcraMR primers were used for further analysis, since they amplified more samples (Table S2). Additionally, across all samples amplified with both primer sets, 272 more unique ASVs were inferred when the mcrA_F/mcrA_R primers were used. ASVs generated with the mcrA_F/mcrA_R primers were clustered in Geneious Prime (v. 2021.1.1) with a minimum overlap identity of 97% and assigned to 122 clusters. The highest-scoring NCBI BLAST hit was identified for the most abundant ASV in each cluster. These sequences (98 unique), as well as several sequences from cultured methanogens, were aligned with MAFFT (v. 7.450) (
84) and incorporated into a maximum-likelihood PhyML (v. 3.3) (
85) reference tree with 100 bootstraps. The percent identity between individual ASVs inferred from mcrA_F/mcrA_R samples and the NCBI BLAST hit it was assigned to is shown in Fig. S12.
Comparison to previously published 16S rRNA gene sequences.
16S rRNA gene sequences for comparison to the USAM data set were downloaded from the Sequence Read Archive (SRA), project numbers
PRJNA275905 (
55),
PRJNA265122 (
56),
PRJNA381353 (
8),
PRJNA386387 (
36),
PRJNA506542 (
58), and
PRJDB8874 (
57). All projects in database searches for “methane seep,” “MiSeq,” and “16S rRNA” and those that used primers targeting the V4 and/or V5 region were included; samples that came from enrichment cultures or nonsediments were manually eliminated. A full list of sample names, as well as primer and geographic information, is provided in Data Set S3.
Sequences from all studies were trimmed using cutadapt with 515F-Y (
78) and 806R (
86) primers if applicable. (Samples from
PRJNA275905,
PRJNA265122,
PRJNA381353, and
PRJNA386387 were pretrimmed before submission to the SRA and thus not trimmed in house.) Sequences were then processed as described above using the R package DADA2, with the exception that reverse reads were trimmed to 100 bp, rather than 216 bp. Additionally, sequences assigned to bacteria were removed from samples in
PRJNA381353 and
PRJNA506542, as they were originally sequenced with archaeal primers rather than universal primers.
Sequence analysis and statistical methods.
The correlation analysis between ANME archaea and SRB was completed using the Java package MINE (v. 2.0.1) (
52). A table of read abundances was generated for the 4,782 16S rRNA gene and 16S rRNA ASVs within the classes
Methanomicrobia and
Deltaproteobacteria across all 52 USAM Seep samples. ASVs which did not appear in at least 10% of samples were removed. The maximal information coefficient was calculated for each pair of remaining ASVs, and
P values were corrected for multiple comparisons in R by the Benjamini-Hochberg method (
87). Of the 849 significant (
P < 0.01) correlations, 212 were found to involve an ANME archaeon and a member of the
Deltaproteobacteria and were included in our network analysis.
Maximum-likelihood trees of 16S rRNA gene and 16S rRNA ASVs from the USAM and the global seep data set were inferred with FastTree (v. 2.1.11) (
88) under a general time reversible (GTR) model of evolution. Nonmetric multidimensional scaling (NMDS) was carried out based on the weighted UniFrac distance metric (
89) using the R package vegan (v. 2.5-7) (
90). NMDS stress values below 0.2 reliably represent the underlying data. Distance-based redundancy analyses (dbRDA) were also carried out on based on the weighted UniFrac distance metric. The environmental data matrix was z-score transformed, and variance inflation factors of >10, representing covariability, were discarded before stepwise model selection using the ordistep function in the R package vegan (v. 2.5-7) (
90). Principal-component analysis (PCA) was carried out using singular value decomposition via the prcomp function in the R package stats (v 4.0.2). Geochemical data were centered and scaled.
Analysis of similarity (ANOSIM) was used to determine the significance of microbial community differences between groups of samples, also based on the weighted UniFrac distance metric. Analysis of variance (ANOVA) was used to test the significance of dbRDA models. One-sided Student’s t tests were used to compare the average weighted UniFrac distance between USAM seep samples with that of 10 equivalently sized collections of samples randomly chosen from the global seep data set. (This was conducted for the total microbial community as well as for the Archaea only).
The phylogenetic normalized stochasticity ratio (pNST) was calculated for Seep sediments and for non-Seep sediments using the R package NST (v. 3.0.6) (
54), developed by Ning et al. (
54). The index is based on the weighted UniFrac similarity between communities of the same treatment and the randomly expected similarity after randomization of the entire metacommunity for that treatment. For these analyses, ASV-level community data for samples within the same sediment core were pooled, and cores were categorized as “Seep” or “non-Seep” cores (giving an
n value of 7 in the Seep treatment and 8 in the non-Seep treatment).
Data availability.
Nucleotide sequences from this study were deposited in the European Nucleotide Archive, project number
PRJEB45337. Sample metadata are contained in Data Set S1.