Benchmarking numerical, process-based hydrological models against standardized reference datasets and statistical metrics is essential for evaluating how accurately a model represents hydrological processes and for building confidence in its simulated results at watershed, regional, and global scales. The International Land Model Benchmarking (ILAMB) project is a model-data intercomparison and integration project designed to improve the performance of the land component of Earth system models. The ILAMB package is open-source Python software that automates the generation of scalar metrics and plots comparing one or more models to a collection of observational data. Its methodology synthesizes multiple models and reference observations to give a high-level view of model performance relative to the references and to provide insight into model behavior. We extended ILAMB to benchmark hydrological processes; these capabilities are available in the ILAMB v2.7 release. Study regions used for analysis and integration, often watersheds, can now be defined by Shapefile or GeoJSON. The release also includes a model confrontation module that compares model output to stream gauge data retrieved automatically from USGS data servers. This extension supports comparison of gridded model output, including raw E3SM output, as well as model output in the form of hydrographs. We present results from E3SM, ATS, NWM, and SWAT and compare them to gauge observations for the American River watershed in Washington. As a starting point, we present comparisons using the Nash-Sutcliffe and Kling-Gupta efficiencies. These developments in ILAMB enable systematic benchmarking and comparison of hydrological models from watershed to global scales using a suite of gauge-based and gridded observations.
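For readers unfamiliar with the two efficiencies named above, the following is a minimal sketch of their standard textbook definitions, applied to a pair of equal-length streamflow series. It is not taken from the ILAMB source: the function names `nash_sutcliffe` and `kling_gupta` and the sample values are hypothetical, the Kling-Gupta form shown is the original three-component version (correlation, variability ratio, bias ratio), and the series are assumed to be already aligned in time with no missing values.

```python
import math


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared model
    error to the variance of the observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    obs_variability = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variability


def kling_gupta(obs, sim):
    """Kling-Gupta efficiency from correlation (r), variability ratio
    (alpha), and bias ratio (beta)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # ratio of standard deviations
    beta = mean_sim / mean_obs      # ratio of means
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical daily discharge values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.2, 1.9, 3.3, 3.8]
print(nash_sutcliffe(observed, simulated))
print(kling_gupta(observed, simulated))
```

Both scores equal 1 for a perfect match. A Nash-Sutcliffe value of 0 means the model does no better than predicting the observed mean, while the Kling-Gupta form separates timing, variability, and bias errors into distinct terms.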