Figure 7: Workflow for
model-based real-time monitoring of a chromatographic step, adapted from
Sauer et al. 2018. In addition to the sensors for pH, conductivity, UV
absorbance and pressure which a chromatography workstation is typically
equipped with, four online sensors have been implemented in the flow
after the column. Online data were obtained for 8 identical runs of a
chromatographic purification step. The eluates were aliquoted and
collected as 15 fractions and offline analysis was carried out to
determine the desired quality attributes, product quantity and impurity
content. Part of this data set was then used to establish mathematical
models for each quality attribute by relating the offline with the
online data. The models were selected by the lowest root mean squared
error (RMSE) and then evaluated by their predictive performance on
independent test data sets that had not been used for model training.
Implemented in the control software of the chromatographic workstation,
the established models give information on all quality attributes they
have been trained on in real time (< 1 s) and enable real-time
decision making, e.g., for pooling of the eluate for the next step.
Sauer et al. equipped a chromatographic workstation with multiple
sensors (Sauer et al., 2019). Besides standard detectors (UV, pH and
conductivity), multi-angle light scattering, refractive index,
attenuated total reflection Fourier-transform infrared and fluorescence
spectroscopy were included. The real-time monitoring system was used in
a cation exchange capture step of fibroblast growth factor 2 expressed
in E. coli. Eight training runs were performed where 15 fractions
of the eluate were analyzed for product
quantity, host cell protein and double-stranded DNA impurities, as well
as endotoxins and monomer/aggregate content (Figure 7). Prediction models were
generated for each individual response variable using cross-validation.
The same system was used for an antibody capture process (Walch et al.,
2019). The input data from the various devices were time-aligned,
taking into account the different void volumes and time resolutions.
Sensor-specific preprocessing methods and variable selection procedures
were applied to each sensor. Finally,
the online signals were averaged over the time intervals of the
collected fractions that were analyzed offline. A multi-sensor
approach is only feasible if the chromatographic workstation is equipped
with a central database (Oliveira, 2019; Steinwandter et al., 2019).
For these multiple sensors, the software solution XAMIris (Evon, Austria)
was used to record the various signals, start the
chromatographic runs, export data, perform time-alignment and
implement soft sensors for real-time monitoring of several CQAs
(Christler et al., 2021; Sauer et al., 2019; Walch et al., 2019).
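The time-alignment and fraction-averaging steps described above can be illustrated with a minimal Python sketch. It is not the XAMIris implementation or the procedure of the cited studies: the function name, the assumption of one constant delay per sensor (standing in for its void volume at a fixed flow rate) and the toy numbers are hypothetical and chosen for illustration only.

```python
from statistics import mean


def align_and_average(times, values, delay, fraction_edges):
    """Shift one sensor trace by its delay and average it per fraction.

    times, values  : sample times (s) and readings of one online sensor
    delay          : assumed constant transit time (s) to this sensor,
                     a stand-in for its void volume at a fixed flow rate
    fraction_edges : fraction boundaries (s), length = n_fractions + 1
    """
    # Time-alignment: express every sample on the common fraction time axis.
    shifted = [t - delay for t in times]
    averages = []
    for start, end in zip(fraction_edges[:-1], fraction_edges[1:]):
        # Collect all samples that fall into this fraction's interval.
        window = [v for t, v in zip(shifted, values) if start <= t < end]
        averages.append(mean(window) if window else None)
    return averages


# Toy trace (made-up numbers): the sensor sees the eluate 2 s late.
times = [0, 1, 2, 3, 4, 5, 6, 7]
values = [0.0, 0.0, 1.0, 3.0, 5.0, 7.0, 2.0, 0.0]
fraction_means = align_and_average(times, values, delay=2,
                                   fraction_edges=[0, 2, 4, 6])
print(fraction_means)
```

Each averaged value can then be paired with the offline result of the corresponding fraction. Sensors with different time resolutions simply contribute a different number of samples per interval.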
Impact of different sensors
In the same multi-sensor setup, UV/Vis spectroscopy was used because it
mainly reflects the primary structure, such as the content of aromatic amino
acids (UV 280 nm), the polypeptide backbone (UV 214 nm) or DNA content (UV 260 nm)
(Christler et al., 2021; Sauer et al., 2019; Walch et al., 2019). The
refractive index (RI) was included as it was previously used to quantify
protein (Zhao et al., 2011). ATR‐FTIR can distinguish between HCP and
target protein (Capito et al., 2013). Intrinsic fluorescence of the
aromatic amino acids can be used to probe the tertiary structure of
proteins and to detect structural changes induced by polarity
(Ghisaidoobe et al., 2014; Rathore et al., 2009). Light scattering
methods (Minton, 2016) are used to determine the quaternary structure
of proteins, for example, protein aggregation. Fluorescence spectroscopy, as well as
light scattering techniques, have been used for at‐line determination of
quality attributes (Patel et al., 2018; Rathore et al., 2009; Yu et al.,
2013).
Multiple CQA monitoring - single sensors vs. multiple sensors
If multiple CQAs are monitored using multiple sensors, careful model
selection is required. The impact of the
individual sensors on the prediction performance was investigated
extensively (Sauer et al.,
2019; Walch et al., 2019). Typically, a prediction model should be as simple as
possible but as complex as necessary. Therefore, models based solely on
a single sensor were compared to models with two, three and up to all
available sensors. The best model was selected based solely on the
prediction error, e.g., the root mean squared error (RMSE) of prediction
on an independent test set, a purely data-driven approach. However, if
the performance of an extensive model including fluorescence and/or
ATR-FTIR data only slightly outperformed a basic model, it was still
recommended to use the basic model (Sauer et al., 2019). For all
investigated CQAs, the models finally selected used more than one
sensor.
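The selection logic described above can be sketched in Python as follows. This is a minimal illustration, not the procedure of the cited studies: the sensor combinations, the predicted values and the 10 % relative tolerance used to decide that an extensive model "only slightly" outperforms a simpler one are assumptions.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error between offline values and predictions."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def select_model(test_rmse, tolerance=0.10):
    """Pick the model with the fewest sensors among near-best models.

    test_rmse : dict mapping a tuple of sensor names to its test-set RMSE
    tolerance : assumed relative margin; models within this margin of the
                lowest RMSE are treated as performing equally well
    """
    best = min(test_rmse.values())
    near_best = [s for s, e in test_rmse.items() if e <= best * (1 + tolerance)]
    # Prefer fewer sensors; break ties by the lower RMSE.
    return min(near_best, key=lambda s: (len(s), test_rmse[s]))


# Made-up offline values and predictions on an independent test set.
observed = [1.0, 2.0, 3.0, 4.0]
predictions = {
    ("UV",): [1.5, 2.5, 2.5, 3.5],
    ("UV", "RI"): [1.2, 1.8, 3.2, 3.8],
    ("UV", "RI", "fluorescence"): [1.19, 2.19, 2.81, 3.81],
}
test_rmse = {s: rmse(observed, p) for s, p in predictions.items()}
selected = select_model(test_rmse)
print(selected)
```

In this toy case the three-sensor model has the lowest RMSE, but the two-sensor model lies within the assumed margin and is therefore preferred, while the single-sensor model is clearly worse and is rejected.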
Robustness of the monitoring system - sensor fouling / sensor shift
For the set-up of a real-time monitoring system, it is recommended to
implement multiple prediction models as sensor fault, shift or fouling
can easily distort the input data and make the prediction models
useless. The choice of model affects the pooling decision (Walch et al.,
2019). Even though the performance of the more complex models was
superior on the independent test set, the simpler models were smoother
and more robust. An optimal monitoring system should always be based on
several prediction models that use different sensor combinations in
order to react to sensor fouling or sensor shifts. If one sensor fails,
the real-time monitoring can easily be based on an alternative model
in which that sensor is not used. The technology transfer of the
monitoring system with multiple sensors (Sauer et al., 2019; Walch et
al., 2019) revealed that only a subset of possible prediction models
could be used for real-time prediction at the different sites as the
fluorescence device was not robust enough (Christler et al., 2021).
Simpler models without fluorescence could still be used at the different
sites. However, due to the very specific properties of the individual
sensors, the performance of the prediction models could be considerably
improved by new model training.
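The fallback strategy described above, switching to an alternative model when a sensor fails, can be sketched in Python. The function name, the preference order and the sensor names are hypothetical and serve only to illustrate the idea. How a sensor fault, shift or fouling is detected is outside the scope of this sketch.

```python
def pick_active_model(models_by_preference, healthy_sensors):
    """Return the first model whose sensors are all healthy, else None.

    models_by_preference : sensor combinations, most preferred model first
    healthy_sensors      : names of sensors currently considered reliable
    """
    healthy = set(healthy_sensors)
    for sensors in models_by_preference:
        # A model is usable only if none of its input sensors has failed.
        if set(sensors) <= healthy:
            return sensors
    return None


# Assumed preference order of trained models for one quality attribute.
models = [("UV", "RI", "fluorescence"), ("UV", "RI"), ("UV",)]

# The fluorescence device is flagged as unreliable at this site.
active = pick_active_model(models, healthy_sensors={"UV", "RI"})
print(active)
```

With the fluorescence device excluded, monitoring continues with the two-sensor model, which mirrors the observation that simpler models without fluorescence could still be used at the different sites.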