Wind gusts, and in particular intense gusts, are societally relevant but extremely challenging to forecast. This study systematically assesses the skill enhancement that can be achieved using artificial neural networks (ANNs) for forecasting wind gust occurrence and magnitude. Geophysical predictors from the ERA5 reanalysis are used in conjunction with an autoregressive term in regression and ANN models with different predictors and varying model complexity. Models are derived and assessed for the warm (April–September) and cold (October–March) seasons for three high-passenger-volume airports in the United States. Model uncertainty is assessed by deriving models for 1000 different randomly selected training (70%) and testing (30%) subsets. Gust prediction fidelity in independent test samples is critically dependent on inclusion of an autoregressive term. Gust occurrence probabilities derived using five-layer ANNs exhibit consistently higher fidelity than those from regression models and shallower ANNs. Inclusion of the autoregressive term and increasing the number of hidden layers in ANNs from 1 to 5 also improve model performance for gust magnitudes (lower RMSE, increased correlation, and model standard deviations that more closely approximate observed values). Deeper ANNs (e.g., 20 hidden layers) exhibit higher skill in forecasting strong (17–25.7 m s⁻¹) and damaging (≥25.7 m s⁻¹) wind gusts. However, such deep networks exhibit evidence of overfitting and still substantially underestimate (by 50%) the frequency of strong and damaging wind gusts at the three airports considered herein.

Significance Statement

Improved short-term forecasting of wind gusts will enhance aviation safety and logistics and may offer other societal benefits. Here we present a rigorous investigation of the relative skill of models of wind gust occurrence and magnitude that employ different statistical methods. It is shown that artificial neural networks (ANNs) offer considerable skill enhancement over regression methods, particularly for strong and damaging wind gusts. For wind gust magnitudes in particular, application of deeper learning networks (e.g., five or more hidden layers) offers tangible improvements in forecast accuracy. However, deeper networks are vulnerable to overfitting, and their performance varies substantially with the specific training and testing data subset used. Moreover, even deep ANNs reproduce only half of strong and damaging wind gusts. These results indicate the need for future work to elucidate the dynamical mechanisms of intense wind gusts and advance solutions to their prediction.
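
The resampling design summarized in the abstract (1000 random 70%/30% training/testing subsets, with and without an autoregressive term) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic rather than ERA5 or airport observations, ordinary least-squares regression stands in for the regression and ANN models of the study, and the names `repeated_split_rmse` and `make_synthetic_gusts` are hypothetical.

```python
import numpy as np


def repeated_split_rmse(X, y, n_splits=1000, train_frac=0.7, seed=0):
    """Fit least-squares regression on random training subsets.

    Returns the test-set RMSE for each of the n_splits random splits.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_frac * n))
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    rmse = np.empty(n_splits)
    for i in range(n_splits):
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        resid = y[test] - A[test] @ coef
        rmse[i] = np.sqrt(np.mean(resid ** 2))
    return rmse


def make_synthetic_gusts(n=2000, seed=1):
    """Synthetic gust series (assumed AR(1) plus one exogenous predictor)."""
    rng = np.random.default_rng(seed)
    predictor = rng.normal(size=n)  # stand-in for a reanalysis predictor
    gust = np.empty(n)
    gust[0] = 10.0
    for t in range(1, n):
        noise = rng.normal(scale=0.5)
        gust[t] = 3.0 + 0.7 * gust[t - 1] + predictor[t] + noise
    return predictor, gust


predictor, gust = make_synthetic_gusts()
y = gust[1:]                                         # gust at time t
X_no_ar = predictor[1:, None]                        # predictor only
X_ar = np.column_stack([predictor[1:], gust[:-1]])   # plus gust at t-1

rmse_no_ar = repeated_split_rmse(X_no_ar, y)
rmse_ar = repeated_split_rmse(X_ar, y)

# The spread across splits gives a simple measure of model uncertainty.
print(f"no AR term:   RMSE {rmse_no_ar.mean():.2f} +/- {rmse_no_ar.std():.2f}")
print(f"with AR term: RMSE {rmse_ar.mean():.2f} +/- {rmse_ar.std():.2f}")
```

In this synthetic setting the autoregressive term lowers the test RMSE by construction, since the series is generated with persistence. The sketch shows the evaluation procedure only and says nothing about the magnitude of the effect in the observed data.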