Benchmarking of machine learning ocean subgrid parameterizations in an idealized model