2.2.2 Self-Organizing Maps
SOMs are a non-linear artificial neural network technique based on
unsupervised training that allows multivariate analysis. The
results obtained using SOMs retain the original data’s topology while
projecting the data onto a two-dimensional grid for simplification. This allows
for visualizing, classifying, grouping, and detecting complex patterns
of any set of variables used in training simultaneously (Liu et al.,
2006). SOMs have been used to characterize long-term currents and
to study the possible hydrodynamic conditions in specific regions, such
as in Liu & Weisberg (2005, 2007), who obtained the current patterns on
the West Florida Shelf and established a relationship between local
winds and coastal up/downwelling processes, or as in Vilibić et al.
(2016), who used SOMs in a forecasting system for surface
currents. More recently, Orfila et al. (2021) used SOMs to establish the
patterns and seasonal dynamics of the southern CS. These examples show
the versatility and capacity of the method as a useful and robust
technique for pattern recognition and feature extraction in variables
where non-linearity is important, as may be the case in oceanographic
processes. Further details of the SOM method are given in Liu & Weisberg
(2011).
In this study, we determined the current patterns by applying SOMs to
the HYCOM 25-year climatology. The method uses a neighborhood function,
a unit search radius, and a linear initialization process. The training
algorithm employed a group series approach, with its parameters tuned
to yield the lowest quantization and topological errors,
following the best practices outlined by Meza-Padilla et al. (2019) and Liu
et al. (2006). Before the training process, each variable was spatially
and temporally normalized to prevent any single component from
dominating the map organization in cases where its magnitude is
disproportionately higher than that of the other components. This
normalization ensured that all variables contributed equally to the
SOMs, leading to a more balanced and accurate data representation. After
the training process, the components were denormalized and analyzed
in terms of each variable. The determination of each map
size (cluster) is a subjective and empirical process that depends on the
desired detail for the analysis (Liu, Weisberg, Lenes, et al., 2016;
Liu, Weisberg, Vignudelli, et al., 2016; Meza-Padilla et al., 2019;
Weisberg & Liu, 2017; Zeng et al., 2015). After a series of sensitivity
tests, we chose the map sizes to obtain the minimum number of patterns
without losing essential pattern variation. The sensitivity tests were
based on the quantization error (QE), which measures how closely the map
units reproduce the data (the mean distance between each data vector and
its best-matching unit); on the topological error (TE), which measures
how well the topology of the data space is preserved (the fraction of
data vectors whose two best-matching units are not adjacent on the map);
and on the variation percentage of each pattern.
This empirical procedure depends intrinsically on the study (Polzlbauer,
2004). We determined the optimal number of patterns by quantifying the
QE and TE in tests with different cluster arrangements: 2x2, 2x3, 3x3,
3x4, and 4x4. The results
showed that the QE decreases using the 3x3 cluster, and although the TE
increases with the number of patterns, it remains acceptably small in
terms of topology preservation. As a result,
the spatial characterization was done for a cluster of nine patterns
(3x3 cluster). An additional recommendation when using SOMs to integrate
trajectories is to maximize the number of spatial patterns so that the
temporal variability given by the best-matching units (BMUs) correctly
approximates the temporal variability of the data. At some point,
additional spatial patterns no longer provide new information. In our
case, one pattern of the 3x3 cluster has 0% frequency, which means that
eight spatial patterns are enough to extract the dominant patterns;
additional patterns would therefore not improve the spatial
or temporal representation of the data. We used the SOM Toolbox for
MATLAB developed by Vesanto & Alhoniemi (2000) at the Laboratory of
Computer and Information Science of the Helsinki University of Technology
(Laboratory of Computer and Information Science, Adaptative and
Informatics Research Center, 2015).
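As an illustration of the sensitivity measures described above, the
following is a minimal Python sketch, not the toolbox code used in this
study, of how QE, TE, and pattern frequency can be computed for a
trained map. It assumes Euclidean distance, data that are already
normalized, and a rectangular lattice in which only units sharing an
edge count as adjacent. The function names (sq_dist, two_best_units,
som_quality_sketch) and the toy 2x2 codebook are hypothetical and serve
only to make the example runnable.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def two_best_units(codebook, x):
    """Indices of the best and second-best matching units for sample x."""
    order = sorted(range(len(codebook)), key=lambda k: sq_dist(codebook[k], x))
    return order[0], order[1]


def som_quality_sketch(codebook, grid, data):
    """Return (QE, TE, frequency in %) for a trained map.

    codebook: list of unit weight vectors (one per map unit)
    grid:     list of (row, col) lattice positions, one per unit
    data:     list of input vectors (assumed already normalized)
    """
    n = len(data)
    qe_sum = 0.0
    te_count = 0
    hits = [0] * len(codebook)
    for x in data:
        b1, b2 = two_best_units(codebook, x)
        # QE: distance between the sample and its BMU
        qe_sum += math.sqrt(sq_dist(codebook[b1], x))
        hits[b1] += 1
        # TE: count samples whose two best units are not lattice neighbors
        # (assumption: only edge-sharing units are neighbors)
        (r1, c1), (r2, c2) = grid[b1], grid[b2]
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            te_count += 1
    frequency = [100.0 * h / n for h in hits]
    return qe_sum / n, te_count / n, frequency


# Toy 2x2 map and four samples (hypothetical values)
grid = [(0, 0), (0, 1), (1, 0), (1, 1)]
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, -3.0]]

qe, te, freq = som_quality_sketch(codebook, grid, data)
print(qe, te, freq)  # prints: 0.75 0.0 [50.0, 25.0, 25.0, 0.0]
```

In this toy case the fourth unit is never selected as a BMU, which is
analogous to the 0% frequency pattern found with the 3x3 cluster:
a unit that attracts no data adds no information to the description.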