Integrating recurrent neural networks with data assimilation for scalable data-driven state estimation