Dataset Overview | National Centers for Environmental Information (NCEI)

PlanktonSet 1.0: Plankton imagery data collected from F.G. Walton Smith in Straits of Florida from 2014-06-03 to 2014-06-06 and used in the 2015 National Data Science Bowl (NCEI Accession 0127422)

Data presented here are a subset of a larger plankton imagery data set collected in the subtropical Straits of Florida from 2014-05-28 to 2014-06-14. Imagery data were collected using the In Situ Ichthyoplankton Imaging System (ISIIS-2) as part of an NSF-funded project to assess the biophysical drivers affecting fine-scale (spatial) interactions between larval fish, their prey, and predators in a subtropical pelagic marine ecosystem. This subset of images was used in the inaugural National Data Science Bowl (www.datasciencebowl.com), hosted by Kaggle and sponsored by Booz Allen Hamilton. Image segments extracted from the raw data were sorted into 121 plankton classes, split 50:50 into train and test data sets, and provided for the competition. No hierarchical relationships were explicit in the 121 plankton classes, though the class naming convention and a tree-like diagram (see file "Plankton Relationships.pdf") indicated relationships between classes, whether taxonomic or structural (size and shape). We intend for this dataset to be available to the machine learning and computer vision community as a standard machine learning benchmark. This “Plankton 1.0” dataset is a medium-size dataset with a fair amount of complexity where image classification improvements can still be made.
  • Cite as: Cowen, Robert K.; Sponaugle, Su; Robinson, Kelly L.; Luo, Jessica; Guigand, Cedric (2015). PlanktonSet 1.0: Plankton imagery data collected from F.G. Walton Smith in Straits of Florida from 2014-06-03 to 2014-06-06 and used in the 2015 National Data Science Bowl (NCEI Accession 0127422). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.7289/v5d21vjd. Accessed [date].
gov.noaa.nodc:0127422
Download Data
  • HTTPS (download)
    Navigate directly to the URL for data access and direct download.
  • FTP (download)
    These data are available through the File Transfer Protocol (FTP). FTP is no longer supported by most internet browsers. You may copy and paste the FTP link to the data into an FTP client (e.g., FileZilla or WinSCP).
Distribution Formats
  • Originator data format
Ordering Instructions Contact NCEI for other distribution options and instructions.
Distributor NOAA National Centers for Environmental Information
+1-301-713-3277
NCEI.Info@noaa.gov
Dataset Point of Contact NOAA National Centers for Environmental Information
ncei.info@noaa.gov
Time Period 2014-06-03 to 2014-06-06
Spatial Bounding Box Coordinates
West: -81.9
East: -79.2
South: 24.3
North: 26
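The bounding box above can be used to check whether a position falls inside the dataset's stated spatial coverage. The following is a minimal sketch, not part of the NCEI record: the `BoundingBox` class, the `PLANKTONSET_BBOX` name, and the sample coordinates are illustrative assumptions, and only the four corner values come from this page. Longitudes are in decimal degrees east (negative values are west), latitudes in decimal degrees north.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned lon/lat box in decimal degrees (hypothetical helper)."""
    west: float
    east: float
    south: float
    north: float

    def contains(self, lon: float, lat: float) -> bool:
        # Inclusive on all four edges; assumes the box does not cross
        # the antimeridian, which holds for the values used here.
        return (self.west <= lon <= self.east
                and self.south <= lat <= self.north)


# Corner values copied from the Spatial Bounding Box Coordinates above.
PLANKTONSET_BBOX = BoundingBox(west=-81.9, east=-79.2, south=24.3, north=26.0)

if __name__ == "__main__":
    # Sample points chosen for illustration only.
    print(PLANKTONSET_BBOX.contains(lon=-80.0, lat=25.0))  # inside the box
    print(PLANKTONSET_BBOX.contains(lon=-82.5, lat=25.0))  # west of the box
```

A plain rectangle test like this is adequate for a small region at these latitudes; it does not account for the actual cruise track within the box.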
General Documentation
Publication Dates
  • publication: 2015-04-28
  • revision: 2015-05-08
Data Presentation Form Digital table - digital representation of facts or figures systematically displayed, especially in columns
Dataset Progress Status Complete - production of the data has been completed
Historical archive - data has been stored in an offline storage facility
Data Update Frequency As needed
Supplemental Information
Submission Package ID: MLHFP1
In this Accession, NCEI has archived multiple versions of these data. The latest (and best) version of these data has the largest version number.
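Because the latest version is the one with the largest version number, the choice can be made by comparing version labels numerically rather than as strings. The sketch below assumes labels of the form `v<major>.<minor>`, as they appear under Processing Steps on this page (v1.1, v2.2, v2.3). The function names are hypothetical, and how version directories are actually named in the archive is not stated here.

```python
import re
from typing import Iterable, Tuple


def parse_version(label: str) -> Tuple[int, ...]:
    """Turn a label such as 'v2.3' into a tuple of integers, e.g. (2, 3)."""
    match = re.fullmatch(r"v(\d+(?:\.\d+)*)", label.strip())
    if match is None:
        raise ValueError(f"unrecognized version label: {label!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def latest_version(labels: Iterable[str]) -> str:
    """Return the label with the largest version number."""
    # Tuples compare element by element, so (2, 10) sorts after (2, 3),
    # which a plain string comparison would get wrong.
    return max(labels, key=parse_version)


if __name__ == "__main__":
    # Version labels taken from the Processing Steps on this page.
    print(latest_version(["v1.1", "v2.2", "v2.3"]))  # prints v2.3
```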
Purpose Data were originally collected to examine the biophysical drivers affecting fine-scale (spatial) interactions between larval fish, their prey, and predators in a subtropical pelagic marine ecosystem. Image segments extracted from the raw data were sorted into 121 plankton classes, split 50:50 into train and test data sets, and provided for a machine learning competition (the National Data Science Bowl). No hierarchical relationships were explicit in the 121 plankton classes, though the class naming convention and a tree-like diagram (see file "Plankton Relationships.pdf") indicated relationships between classes, whether taxonomic or structural (size and shape). We intend for this dataset to be available to the machine learning and computer vision community as a standard machine learning benchmark. This “Plankton 1.0” dataset is a medium-size dataset with a fair amount of complexity where image classification improvements can still be made.
Use Limitations
  • accessLevel: Public
  • Distribution liability: NOAA and NCEI make no warranty, expressed or implied, regarding these data, nor does the fact of distribution constitute such a warranty. NOAA and NCEI cannot assume liability for any damages caused by any errors or omissions in these data. If appropriate, NCEI can only certify that the data it distributes are an authentic copy of the records that were accepted for inclusion in the NCEI archives.
Dataset Citation
  • Cite as: Cowen, Robert K.; Sponaugle, Su; Robinson, Kelly L.; Luo, Jessica; Guigand, Cedric (2015). PlanktonSet 1.0: Plankton imagery data collected from F.G. Walton Smith in Straits of Florida from 2014-06-03 to 2014-06-06 and used in the 2015 National Data Science Bowl (NCEI Accession 0127422). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.7289/v5d21vjd. Accessed [date].
Acknowledgments
  • Related Funding Agency: National Science Foundation; Directorate for Geosciences - NSF Award 1419987
Theme keywords NODC DATA TYPES THESAURUS NODC OBSERVATION TYPES THESAURUS WMO_CategoryCode
  • oceanography
Global Change Master Directory (GCMD) Science Keywords
Data Center keywords NODC COLLECTING INSTITUTION NAMES THESAURUS NODC SUBMITTING INSTITUTION NAMES THESAURUS Global Change Master Directory (GCMD) Data Center Keywords
Platform keywords NODC PLATFORM NAMES THESAURUS ICES/SeaDataNet Ship Codes
Instrument keywords Provider Instruments
  • In situ Ichthyoplankton Imaging System (ISIIS)
Place keywords NODC SEA AREA NAMES THESAURUS Global Change Master Directory (GCMD) Location Keywords
Project keywords Provider Project Names
  • National Data Science Bowl (www.datasciencebowl.com)
  • Spatial variability of larval fish in relation to their prey and predator fields: Patterns and interactions from cm to 10s of km in a subtropical, pelagic environment - NSF Award 1419987
Keywords NCEI ACCESSION NUMBER
Use Constraints
  • Cite as: Cowen, Robert K.; Sponaugle, Su; Robinson, Kelly L.; Luo, Jessica; Guigand, Cedric (2015). PlanktonSet 1.0: Plankton imagery data collected from F.G. Walton Smith in Straits of Florida from 2014-06-03 to 2014-06-06 and used in the 2015 National Data Science Bowl (NCEI Accession 0127422). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.7289/v5d21vjd. Accessed [date].
Access Constraints
  • Use liability: NOAA and NCEI cannot provide any warranty as to the accuracy, reliability, or completeness of furnished data. Users assume responsibility to determine the usability of these data. The user is responsible for the results of any application of this data for other than its intended purpose.
Fees
  • In most cases, electronic downloads of the data are free. However, fees may apply for custom orders, data certifications, copies of analog materials, and data distribution on physical media.
Lineage information for: dataset
Processing Steps
  • 2015-04-28T19:54:38Z - NCEI Accession 0127422 v1.1 was published.
  • 2015-05-07T20:25:00Z - NCEI Accession 0127422 was revised and v2.2 was published.
    Rationale: Updates were received for this dataset. These updates were copied into the data/0-data/ directory of this accession. These updates may provide additional files or replace obsolete files. This version contains the most complete and up-to-date representation of this archival information package. All of the files received prior to this update are available in the preceding version of this accession.
  • 2015-05-08T18:23:50Z - NCEI Accession 0127422 was revised and v2.3 was published.
    Rationale: Additional metadata files were received or created for this dataset. These updates were copied into the about/ directory of this accession. These updates may provide additional files or replace obsolete files. This version contains the most complete and up-to-date representation of this archival information package. All of the files received prior to this update are available in the preceding version of this accession.
Output Datasets
Lineage information for: repository
Processing Steps
  • 2015-04-22T00:00:00 - NOAA created the National Centers for Environmental Information (NCEI) by merging NOAA's National Climatic Data Center (NCDC), National Geophysical Data Center (NGDC), and National Oceanographic Data Center (NODC), including the National Coastal Data Development Center (NCDDC), per the Consolidated and Further Continuing Appropriations Act, 2015, Public Law 113-235. NCEI launched publicly on April 22, 2015.
Acquisition Information (collection)
Platform
  • R/V F.G. Walton Smith
Last Modified: 2024-04-11T12:19:15Z
For questions about the information on this page, please email: ncei.info@noaa.gov