Climate models still need improvement in their ability to reproduce the present climate at both global and regional scales. The assessment of their performance depends on the datasets used as comparators. Reanalyses and gridded observational datasets (homogenized or not) have frequently been used for this purpose. However, none of these can be considered a reference dataset. Here, for the first time, daily temperature and precipitation from a suite of dynamically downscaled regional climate models (RCMs; driven by ERA-Interim) involved in NA-CORDEX are assessed against in-situ measurements from the NOAA U.S. Climate Reference Network (USCRN), a network of 139 stations with high-quality instruments deployed across the continental U.S. The assessment is also extended to recent, widely used reanalyses (ERA5, ERA-Interim, MERRA2, NARR) and gridded observational datasets (Daymet, PRISM, Livneh, CPC). Results show that biases for the different datasets depend mainly on season and subregion. On average, reanalyses and in-situ-based datasets are generally warmer than USCRN year-round, while models are colder in winter and warmer in summer. In-situ-based datasets perform best in most of the CONUS regions compared to reanalyses and models, but still show biases in regions such as the Midwest mountains and the Northwestern Pacific. Results also highlight that reanalyses do not outperform RCMs in most of the U.S. subregions. In addition, for both reanalyses and models, temperature and precipitation biases depend significantly on orography, with larger temperature biases for coarser model resolutions and larger precipitation biases for reanalyses.