The software architecture of PyCTOH: an on-line data distribution system for altimetry applications

The French Centre for Topography of the Oceans and the Hydrosphere (CTOH) applies innovative algorithms to process altimetric data for ocean, continental hydrology and glaciology applications. Building on over 20 years of experience, a new data distribution system, PyCTOH, provides uniform DAP and web access to along-track and gridded data.

The architecture of the PyCTOH system will be described from the bottom up: storage of datasets, catalogue indexing and access, data retrieval for front-end servers and end-users, and visualisation. Its main development guidelines are an optimised user experience, system scalability and sustainability.

Because of the new product opportunities available in CTOH, datasets are frequently updated. To avoid reprocessing the full archive, datasets were split at the lowest interaction level, while keeping common metadata among the resulting subsets. That is, when treating one altimetric data track, the dataset is not kept as a single file holding the full set of parameters (Ku range, ionospheric correction, ...); instead, each parameter is stored in its own file, and a relational database indexes each subset and tags it with metadata. When a new parameter is added to a dataset, we do not need to recreate every file of that dataset: we simply add the new file and update the catalogue and metadata indexes.
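The split-and-index idea can be illustrated with a minimal sketch. It is not the PyCTOH implementation: the table layout, the function names, the track identifier and the file paths are all hypothetical, and the standard-library sqlite3 module stands in for the relational database so that the example is self-contained.

```python
import sqlite3


def create_catalogue(conn):
    """Create a hypothetical catalogue: one row per (track, parameter) file."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS subset (
               track_id  TEXT NOT NULL,
               parameter TEXT NOT NULL,
               path      TEXT NOT NULL,
               PRIMARY KEY (track_id, parameter)
           )"""
    )


def register_parameter(conn, track_id, parameter, path):
    """Index one per-parameter file; existing files of the track are untouched."""
    conn.execute(
        "INSERT OR REPLACE INTO subset (track_id, parameter, path) VALUES (?, ?, ?)",
        (track_id, parameter, path),
    )


def parameters_for(conn, track_id):
    """Return {parameter: path} for every indexed subset of a track."""
    rows = conn.execute(
        "SELECT parameter, path FROM subset WHERE track_id = ? ORDER BY parameter",
        (track_id,),
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
create_catalogue(conn)

# Initial state: two parameters of one track, each in its own (assumed) file.
register_parameter(conn, "track_0001", "ku_range", "track_0001/ku_range.nc")
register_parameter(conn, "track_0001", "iono_corr", "track_0001/iono_corr.nc")

# Adding a new parameter later is one new file plus one new catalogue row.
register_parameter(conn, "track_0001", "new_param", "track_0001/new_param.nc")

print(parameters_for(conn, "track_0001"))
```

In this layout, adding a parameter never rewrites the files already stored for the track; only the catalogue grows by one row.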

Catalogue indexing and access rely on an industrial-strength relational database system, PostgreSQL. It handles the more than 50 million catalogue records very effectively, covering about 10 terabytes of data.

Data storage has to be scalable to that extent. A standard storage web-farm provides data splitting over several storage servers, and throughput aggregation with a cache hierarchy.

Data access is provided through two channels: a custom in-house DAP-based server and a standard web interface. The former provides live access to our data using the Data Access Protocol over HTTP, and is compatible with clients such as Matlab®, IDL®, and Ferret. The latter, based on the Plone Content Management System, provides a graphical interface for data extraction and pre-visualisation, while itself acting as a DAP client of the database DAP server.
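To make the DAP channel more concrete, the sketch below builds the kind of request URL a DAP 2 client sends over HTTP: a response suffix (.dds, .das or .dods) and an optional constraint expression selecting a variable and an index range. The host, dataset path and variable name are hypothetical and do not describe the actual PyCTOH endpoints; the function dap_url is likewise only an illustration.

```python
def dap_url(base_url, response="dods", variable=None, start=None, stop=None, stride=1):
    """Build a DAP 2 request URL with an optional hyperslab constraint.

    response: "dds" (structure), "das" (attributes) or "dods" (binary data).
    start/stop are inclusive indices along the variable's first dimension.
    """
    if response not in ("dds", "das", "dods"):
        raise ValueError(f"unsupported response type: {response!r}")
    url = f"{base_url}.{response}"
    if variable is None:
        return url  # whole dataset, no constraint expression
    if start is None and stop is None:
        return f"{url}?{variable}"  # whole variable
    if start is None or stop is None:
        raise ValueError("start and stop must be given together")
    return f"{url}?{variable}[{start}:{stride}:{stop}]"


# Hypothetical dataset location, used only for illustration.
base = "http://example.org/dap/track_0001/ku_range"

print(dap_url(base, "dds"))
print(dap_url(base, "dods", "ku_range", 0, 99))
# second line printed: http://example.org/dap/track_0001/ku_range.dods?ku_range[0:1:99]
```

Because the constraint is evaluated on the server, a client such as the web front end only transfers the slice it actually needs for extraction or pre-visualisation.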
