Platform Features

The features of the Ocean Data Platform are designed to facilitate data analysis and tool creation, making it easier to find and use liberated data to create a healthier ocean.


Ocean Data Platform Architecture schematic

Platform Architecture



SDK

  • An image of the Python logo

    Python

    The Python SDK provides functionality to pull all available data from the Ocean Data Platform and can be run using the Workspace feature, a JupyterHub environment.

    Users can run fast, efficient queries and join data from a variety of data sources in one environment.
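As a rough illustration of joining data from different sources in one environment, the sketch below merges two small pandas dataframes on a shared key. It does not use the SDK itself: the dataframes, column names, and values are invented for the example, and in practice the frames would come from the platform's data pulls.

```python
import pandas as pd

# Hypothetical stand-ins for two data sources; the column names and
# values are invented for illustration and are not platform data.
temperature = pd.DataFrame({
    "station": ["A", "B", "C"],
    "temperature_c": [8.1, 9.4, 7.6],
})
salinity = pd.DataFrame({
    "station": ["A", "B", "D"],
    "salinity_psu": [34.9, 35.1, 34.7],
})

# An inner join keeps only the stations present in both sources.
combined = temperature.merge(salinity, on="station", how="inner")
print(combined)
```

Switching `how` to `"outer"` would instead keep every station and leave missing measurements as NaN.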


Packages

  • The Ocean Data Platform environment is made up of many different open-source software packages.

    Each of these packages has its own repository and documentation.

  • An image of the Dask logo

    Dask is an open-source library for parallel computing. It can easily be used with pandas, NumPy and scikit-learn to speed up data processing through parallelization. Dask data structures can represent large datasets without loading them fully into memory, which allows workloads to scale.

    Website | GitHub

  • An image of the xarray logo

    Xarray is an open source Python package to work with multi-dimensional arrays of data. It is tailored to working with netCDF files and is widely used in earth sciences. It provides many functions for advanced analytics and visualizations. It can be integrated with Dask for parallel computing.

    Website | GitHub

  • An image of the Jupyter logo

    Project Jupyter is a community dedicated to developing open-source software and supporting open standards. Through Jupyter, users can interact with the data in the Ocean Data Platform and our resources.

    Website | GitHub

  • An image of the Pandas logo

    Pandas provides data in a dataframe format that makes it easy for users to manipulate data. Noteworthy highlights include tools for reading and writing data, reshaping datasets, group-by operations, merging and joining datasets, and time series functionality. Our data pulls return either pandas or GeoPandas dataframes, which are easy to work with and provide a single data format.

    Website | GitHub

  • An image of the Geopandas logo

    GeoPandas allows users to easily work with geospatial data by extending the datatypes found in pandas so that they support spatial operations. Geometric operations are performed by Shapely. Data in our example notebooks is pulled using a GeoPandas-to-SQL connector.

    Website | GitHub

  • An image of the Matplotlib logo

    Matplotlib is an open-source Python library for creating static, animated and interactive visualizations. It integrates easily with Cartopy and is used widely in our example notebooks.

    Website | GitHub

  • An image of the Cartopy logo

    Cartopy is a Python package for geospatial data processing, used to produce maps and other geospatial data analyses. It allows users to switch between different projections and can be easily integrated with Matplotlib. Many of the maps in our example notebooks are created using Cartopy. It is the successor to the older Basemap toolkit for Matplotlib.

    Website | GitHub
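The pandas highlights listed above (group-by operations, reshaping, and time series functionality) can be sketched on a small dataframe. This is a minimal example under stated assumptions: the observations, station names, and column names are invented, and it is not taken from the platform's example notebooks.

```python
import pandas as pd

# Hypothetical observations; values and column names are invented.
obs = pd.DataFrame({
    "time": pd.to_datetime(
        ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"]
    ),
    "station": ["A", "B", "A", "B"],
    "temperature_c": [8.0, 9.0, 8.4, 9.6],
})

# Group by: mean temperature per station.
station_mean = obs.groupby("station")["temperature_c"].mean()

# Reshaping: one row per time stamp, one column per station.
wide = obs.pivot(index="time", columns="station", values="temperature_c")

# Time series: daily mean across all stations.
daily_mean = obs.set_index("time")["temperature_c"].resample("D").mean()

print(station_mean)
print(wide)
print(daily_mean)
```

The same calls apply to a GeoPandas dataframe, since it extends the pandas dataframe type.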