Traditional approaches for comparing global climate models and observational data products typically fail to account for the geographic location of the underlying weather station data. For modern global high-resolution models with a horizontal resolution of tens of kilometers, this is an oversight since there are likely grid cells where the physical output of a climate model is compared with a statistically interpolated quantity instead of actual measurements of the climate system. In this paper, we quantify the impact of geographic sampling on the relative performance of high-resolution climate model representations of precipitation extremes in boreal winter (December–January–February) over the contiguous United States (CONUS), comparing model output from five early submissions to the HighResMIP subproject of the CMIP6 experiment. We find that properly accounting for the geographic sampling of weather stations can significantly change the assessment of model performance. Across the models considered, failing to account for sampling impacts the different metrics (extreme bias, spatial pattern correlation, and spatial variability) in different ways (both increasing and decreasing). We argue that the geographic sampling of weather stations should be accounted for in order to yield a more straightforward and appropriate comparison between models and observational data sets, particularly for high-resolution models with a horizontal resolution of tens of kilometers. While we focus on the CONUS in this paper, our results have important implications for other global land regions where the sampling problem is more severe.