Rapid automatic clean-up toolkit for large corrupted tidal datasets
  • Vamsi Krishna Sridharan
University of California Santa Cruz

Corresponding Author: [email protected]


Tides are critical to coastal and oceanic processes. Although tidal data are readily available, they are often corrupted by various sources of error. A fast, automated MATLAB toolbox is developed to clean up corrupted tidal time series from estuarine and oceanic locations. The toolbox will substantially speed up delivery of quality-controlled tidal data. It will also reduce errors in quality control, which typically involves several manual tasks. The toolbox corrects poorly interpolated and noisy data, erroneous outliers, and instrumentation bias such as spurious jumps, drifts, spikes, and modulations in the true signal. Signal clean-up proceeds in multiple stages. First, thresholds are imposed on higher-order temporal derivatives of the signal to remove gross interpolations and noise-saturated signal chunks, followed by a moving median threshold to remove outliers. Then the surviving signal is filtered into tidal, subtidal, and long-period components, and the long-period component is subjected to a maximal overlap discrete wavelet transform, in which the transform coefficients corresponding to multi-scale edge features are removed. Subsequently, local information in the subtidal and tidal components is compared with that of the whole signal to correct spurious amplitude modulations and sudden biases. Finally, these components are added back together to recover the uncorrupted signal, and large data gaps are filled with short-term harmonic reconstruction. For estuarine locations, the correlation in the spectrogram between two nearby stations is used at the outset to quantify and remove river influence in the signal. Applications to datasets at multiple global locations demonstrate the value of the toolbox.
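
To make the moving median outlier step concrete, the following is a minimal sketch in Python rather than MATLAB, and it is not the toolbox's own code. The function name `flag_outliers_moving_median`, the window half-width, the deviation threshold, and the synthetic record (a 1 m tide with a 12.42 h period, roughly that of the M2 constituent, plus one injected spike) are all assumptions chosen for illustration.

```python
import math
from statistics import median


def flag_outliers_moving_median(values, half_window, max_deviation):
    """Return indices whose value differs from the local median by more than max_deviation.

    Hypothetical helper for illustration; the toolbox's actual thresholds are not given in the abstract.
    """
    flagged = []
    n = len(values)
    for i in range(n):
        # The window is centred on sample i and truncated at the ends of the record.
        start = max(0, i - half_window)
        stop = min(n, i + half_window + 1)
        local_median = median(values[start:stop])
        if abs(values[i] - local_median) > max_deviation:
            flagged.append(i)
    return flagged


# Synthetic hourly record (assumed): 1 m amplitude, 12.42 h period, 48 samples.
period_hours = 12.42
signal = [math.sin(2.0 * math.pi * t / period_hours) for t in range(48)]
signal[20] += 5.0  # artificial outlier injected for the demonstration

flagged = flag_outliers_moving_median(signal, half_window=3, max_deviation=2.0)

# Flagged samples are replaced with NaN so that a later gap-filling stage could treat them as missing.
cleaned = [float("nan") if i in flagged else v for i, v in enumerate(signal)]
```

The median is used instead of the mean because a single spike inside the window barely shifts it, so the spike itself stands out while its neighbours are left alone. In practice the window length and threshold would need tuning to the sampling interval and tidal range of each station.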