Dataset Overview | National Centers for Environmental Information (NCEI)

Global surface-ocean partial pressure of carbon dioxide (pCO2) estimates from a machine learning ensemble: CSIR-ML6 v2019a (NCEI Accession 0206205)

This dataset contains surface-ocean partial pressure of carbon dioxide (pCO2) estimated as the ensemble mean of six two-step clustering-regression machine learning methods. The ensemble combines two clustering approaches with three regression methods. The clustering approaches are K-means clustering (21 clusters) and the open-ocean CO2 biomes defined by Fay and McKinley (2014). Three machine learning regression methods are applied to each of the two clusterings: a feed-forward neural network (FFN), support vector regression (SVR) and a gradient boosted machine using decision trees (GBM). The final estimate of surface-ocean pCO2 is the average of the six machine learning estimates, giving a monthly, 1° ⨉ 1° resolution product that extends from the start of 1982 to the end of 2016. Sea-air CO2 fluxes (FCO2) calculated from pCO2 are also included. The discrete boundaries of the clustering approach result in semi-discrete discontinuities in the pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.
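The two averaging steps described above (the six-member ensemble mean and the 3 ⨉ 3 ⨉ 3 moving average) can be illustrated with a minimal NumPy sketch. This is not the originators' code. The function names `ensemble_mean` and `smooth_3x3x3` are hypothetical, and the edge handling (ignoring missing cells, wrapping in longitude, keeping land cells missing) is an assumption that this record does not specify.

```python
import numpy as np


def ensemble_mean(members):
    """Average a sequence of equally shaped (time, lat, lon) estimates."""
    return np.mean(np.stack(members), axis=0)


def smooth_3x3x3(field):
    """NaN-aware 3 x 3 x 3 moving average over (time, lat, lon).

    Assumptions (not stated in the record): missing cells are ignored in
    the average, longitude wraps around, and cells that are missing in
    the input stay missing in the output.
    """
    field = np.asarray(field, dtype=float)
    # Pad one cell per side: NaN in time and latitude, wrap in longitude.
    padded = np.pad(field, ((1, 1), (1, 1), (0, 0)), constant_values=np.nan)
    padded = np.pad(padded, ((0, 0), (0, 0), (1, 1)), mode="wrap")

    nt, ny, nx = field.shape
    total = np.zeros_like(field)
    count = np.zeros_like(field)
    # Accumulate the 27 shifted copies that make up each 3 x 3 x 3 window.
    for dt in range(3):
        for dy in range(3):
            for dx in range(3):
                window = padded[dt:dt + nt, dy:dy + ny, dx:dx + nx]
                valid = ~np.isnan(window)
                total += np.where(valid, window, 0.0)
                count += valid

    with np.errstate(invalid="ignore", divide="ignore"):
        smoothed = total / count
    return np.where(np.isnan(field), np.nan, smoothed)
```

With these assumptions a spatially and temporally constant field is left unchanged, while an isolated spike is spread evenly over its 27 neighbouring cells.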
  • Cite as: Gregor, Luke; Lebehot, Alice D.; Kok, Schalk; Monteiro, Pedro M. S. (2019). Global surface-ocean partial pressure of carbon dioxide (pCO2) estimates from a machine learning ensemble: CSIR-ML6 v2019a (NCEI Accession 0206205). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.25921/z682-mn47. Accessed [date].
gov.noaa.nodc:0206205
Download Data
  • HTTPS (download)
    Navigate directly to the URL for data access and direct download.
  • FTP (download)
    These data are available through the File Transfer Protocol (FTP). FTP is no longer supported by most internet browsers. You may copy and paste the FTP link to the data into an FTP client (e.g., FileZilla or WinSCP).
Distribution Formats
  • Originator data format
Ordering Instructions Contact NCEI for other distribution options and instructions.
Distributor NOAA National Centers for Environmental Information
+1-301-713-3277
NCEI.Info@noaa.gov
Dataset Point of Contact NOAA National Centers for Environmental Information
ncei.info@noaa.gov
Time Period 1982-01-01 to 2016-12-31
Spatial Bounding Box Coordinates
West: -180
East: 180
South: -89.5
North: 89.5
Spatial Coverage Map
General Documentation
Associated Resources
Publication Dates
  • publication: 2019-11-05
Data Presentation Form Digital table - digital representation of facts or figures systematically displayed, especially in columns
Dataset Progress Status Complete - production of the data has been completed
Historical archive - data has been stored in an offline storage facility
Data Update Frequency As needed
Purpose This dataset is available to the public for a wide variety of uses including scientific research and analysis.
Use Limitations
  • accessLevel: Public
  • Distribution liability: NOAA and NCEI make no warranty, expressed or implied, regarding these data, nor does the fact of distribution constitute such a warranty. NOAA and NCEI cannot assume liability for any damages caused by any errors or omissions in these data. If appropriate, NCEI can only certify that the data it distributes are an authentic copy of the records that were accepted for inclusion in the NCEI archives.
Dataset Citation
  • Cite as: Gregor, Luke; Lebehot, Alice D.; Kok, Schalk; Monteiro, Pedro M. S. (2019). Global surface-ocean partial pressure of carbon dioxide (pCO2) estimates from a machine learning ensemble: CSIR-ML6 v2019a (NCEI Accession 0206205). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.25921/z682-mn47. Accessed [date].
Cited Authors
Principal Investigators
Contributors
Resource Providers
Publishers
Acknowledgments
  • Funding Information: SOCCO, Council for Scientific and Industrial Research
Theme keywords NODC DATA TYPES THESAURUS NODC OBSERVATION TYPES THESAURUS WMO_CategoryCode
  • oceanography
Global Change Master Directory (GCMD) Science Keywords OCADS Study Type
  • Data synthesis product
  • Discrete measurement
  • Profile
Provider Variable Abbreviations
  • FCO2_raw (time, lat, lon)
  • FCO2_smooth (time, lat, lon)
  • Time
  • lat (180)
  • lon (360)
  • pCO2air (time, lat, lon)
  • pCO2sea_raw (time, lat, lon)
  • pCO2sea_smooth (time, lat, lon)
  • seamask (lat, lon)
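As a rough illustration of the grid these variables share, the sketch below rebuilds the coordinate axes from the bounds, resolution and time period given in this record. It is an assumption-laden sketch rather than a description of the file contents: the longitude cell centres (-179.5 to 179.5) are inferred from the 360 points spanning -180 to 180, the mid-month time stamps follow the stated minimum of 1982-01-15, and `decode_time` is a hypothetical helper for the stated time unit of seconds since 2000-01-01.

```python
import numpy as np

# Latitude cell centres, 1 degree apart: 180 values from -89.5 to 89.5.
lat = np.arange(-89.5, 90.0, 1.0)
# Longitude cell centres (assumed): 360 values from -179.5 to 179.5.
lon = np.arange(-179.5, 180.0, 1.0)
# One entry per month from January 1982 to December 2016.
months = np.arange("1982-01", "2017-01", dtype="datetime64[M]")
# Assumed mid-month stamps (15th of each month).
time = months.astype("datetime64[D]") + np.timedelta64(14, "D")

# Shape expected for the (time, lat, lon) variables and for seamask (lat, lon).
field_shape = (time.size, lat.size, lon.size)
mask_shape = (lat.size, lon.size)

EPOCH = np.datetime64("2000-01-01T00:00:00")


def decode_time(seconds):
    """Convert 'seconds since 2000-01-01' values to numpy datetime64[s]."""
    return EPOCH + np.asarray(seconds, dtype="int64").astype("timedelta64[s]")
```

Because the series starts in 1982, time values before the 2000-01-01 reference date are negative when expressed in this unit.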
Data Center keywords NODC COLLECTING INSTITUTION NAMES THESAURUS NODC SUBMITTING INSTITUTION NAMES THESAURUS
Platform keywords NODC PLATFORM NAMES THESAURUS
Instrument keywords NODC INSTRUMENT TYPES THESAURUS Global Change Master Directory (GCMD) Instrument Keywords
Place keywords NODC SEA AREA NAMES THESAURUS Global Change Master Directory (GCMD) Location Keywords Provider Geographic Names
  • Arctic Ocean
  • Atlantic Ocean
  • Indian Ocean
  • Pacific Ocean
  • Southern Ocean
Project keywords NODC PROJECT NAMES THESAURUS Cruise ID
  • Various
EXPOCODE
  • Various
Ocean Acidification Search Keywords
  • Ocean Carbon and Acidification Data System (OCADS) Project
Reference Section ID
  • Various
Keywords NCEI ACCESSION NUMBER
Use Constraints
  • Cite as: Gregor, Luke; Lebehot, Alice D.; Kok, Schalk; Monteiro, Pedro M. S. (2019). Global surface-ocean partial pressure of carbon dioxide (pCO2) estimates from a machine learning ensemble: CSIR-ML6 v2019a (NCEI Accession 0206205). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.25921/z682-mn47. Accessed [date].
Access Constraints
  • Use liability: NOAA and NCEI cannot provide any warranty as to the accuracy, reliability, or completeness of furnished data. Users assume responsibility to determine the usability of these data. The user is responsible for the results of any application of this data for other than its intended purpose.
Fees
  • In most cases, electronic downloads of the data are free. However, fees may apply for custom orders, data certifications, copies of analog materials, and data distribution on physical media.
Lineage information for: dataset
Processing Steps
  • 2019-11-05T19:14:30Z - NCEI Accession 0206205 v1.1 was published.
Output Datasets
Lineage information for: dataset
Processing Steps
  • Parameter or Variable: Time; Abbreviation: Time; Unit: seconds since 2000-01-01; Detailed sampling and analyzing information: min = 1982-01-15; max = 2016-12-15; step = month.
  • Parameter or Variable: Latitude; Abbreviation: lat (180); Unit: degrees_north; Detailed sampling and analyzing information: min = -89.5; max = 89.5; step = 1.0.
  • Parameter or Variable: Longitude; Abbreviation: lon (360); Unit: degrees_east; Detailed sampling and analyzing information: min = -180; max = 180; step = 1.0.
  • Parameter or Variable: partial pressure of surface ocean CO2 raw; Abbreviation: pCO2sea_raw (time, lat, lon); Unit: µatm; Detailed sampling and analyzing information: The ensemble mean of six machine learning methods that first cluster data and then apply regression to the clusters.
  • Parameter or Variable: smoothed partial pressure of surface ocean CO2; Abbreviation: pCO2sea_smooth (time, lat, lon); Unit: µatm; Detailed sampling and analyzing information: The discrete boundaries of the clustering approach result in semi-discrete discontinuities in pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.
  • Parameter or Variable: sea-air CO2 flux; Abbreviation: FCO2_raw (time, lat, lon); Unit: molC/m2/yr; Detailed sampling and analyzing information: sea-air CO2 flux calculated after Landschützer et al. (2016) as the product of the following: (pCO2sea - pCO2air), where positive values are outgassing from sea to air; the Wanninkhof (1992) parameterisation of k_w, scaled globally to 16 cm/hr, with wind from ERA-Interim (Dee et al., 2011); K_0 from Weiss (1974) using the OISSTv2 sea surface temperature product (Reynolds et al., 2007) and EN4 sea salinity (Good et al., 2013); (1 - ice_frac) as described in Butterworth and Miller (2016).
  • Parameter or Variable: smoothed sea-air CO2 flux; Abbreviation: FCO2_smooth (time, lat, lon); Unit: mol/m2/a; Detailed sampling and analyzing information: The discrete boundaries of the clustering approach result in semi-discrete discontinuities in pCO2 and FCO2 estimates. These are smoothed by applying a 3 ⨉ 3 ⨉ 3 convolution (moving average) to the dataset in time, latitude and longitude.
  • Parameter or Variable: partial pressure of atmospheric CO2; Abbreviation: pCO2air (time, lat, lon); Unit: µatm; Detailed sampling and analyzing information: Atmospheric pCO2 from CarboScope v1.7 (Rödenbeck et al., 2014); source: http://www.bgc-jena.mpg.de/CarboScope/.
  • Parameter or Variable: boolean mask where True is ocean and False is land or NULL; Abbreviation: seamask (lat, lon); Detailed sampling and analyzing information: source: https://www.ncei.noaa.gov/access/ocean-carbon-data-system/oceans/SPCO2_1982_present_ETH_SOM_FFN.html
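The flux description above amounts to a product of four terms: the sea-air pCO2 difference, the gas transfer velocity k_w, the solubility K_0, and the ice-free fraction. The sketch below shows only that product. It is not the originators' implementation: the function name `sea_air_flux` and its argument names are hypothetical, the Wanninkhof (1992) and Weiss (1974) parameterisations are not reproduced, and the caller is assumed to supply k_w and K_0 in mutually consistent units (for example m/yr and mol/m3/µatm, giving mol/m2/yr).

```python
import numpy as np


def sea_air_flux(pco2_sea, pco2_air, k_w, k0, ice_frac):
    """Bulk sea-air CO2 flux as the product of four terms.

    pco2_sea, pco2_air : partial pressures in µatm
    k_w                : gas transfer velocity (assumed m/yr here)
    k0                 : CO2 solubility (assumed mol/m3/µatm here)
    ice_frac           : sea-ice fraction between 0 and 1

    Positive values mean outgassing from sea to air, matching the sign
    convention stated in the record.
    """
    delta_pco2 = np.asarray(pco2_sea, dtype=float) - np.asarray(pco2_air, dtype=float)
    ice_free = 1.0 - np.asarray(ice_frac, dtype=float)
    return np.asarray(k_w, dtype=float) * np.asarray(k0, dtype=float) * ice_free * delta_pco2
```

With this convention a cell where pCO2sea exceeds pCO2air yields a positive flux, and a fully ice-covered cell (ice_frac = 1) yields zero flux regardless of the pCO2 difference.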
Acquisition Information (collection)
Instrument
  • showerhead equilibrator
Platform
  • VARIOUS CHARTERED VESSELS
Last Modified: 2024-03-08T13:21:39Z
For questions about the information on this page, please email: ncei.info@noaa.gov