Date

2023-12-15

Abstract

Interest in using Machine Learning (ML) algorithms such as Random Forests (RF) for severe weather prediction has grown over the past ten years. RF-based methods have even been shown to outperform human-generated operational convective outlook guidance in some cases. However, obstacles remain to fully integrating ML algorithms into the operational forecasting of severe wind, hail, and tornado events. For example, the perceived black-box nature of complex RF models can inhibit forecaster confidence in ML guidance for high-impact or atypical events. Because the error characteristics of predictors based on numerical weather prediction (NWP), and the relationships between these predictors and severe weather risk, can vary among flow patterns, there is a need to better understand how large-scale flow patterns affect RF model performance. In addition to improving confidence in RF-based forecast products, such understanding can be incorporated into the model-building process to further improve model performance.

This thesis discusses the development and evaluation of a flow-dependent approach to training RF models to produce severe weather convective outlook guidance. The work uses 53 real-time cases from the 2019 and 2021 real-time, convection-allowing, FV3-based ensemble forecasts produced by the University of Oklahoma (OU) Multi-scale data Assimilation and Predictability (MAP) Lab during the 2021 Hazardous Weather Testbed (HWT) forecasting experiments as model predictors. The study focuses mainly on the 29 cases from 2021. As a first step, Permutation Feature Importance was used to identify cases with relatively high and relatively low importance of key predictors, and the composite difference in large-scale flow between the two groups was calculated. These composite differences were used to evaluate whether discernible large-scale flow patterns could indicate when the non-flow-dependent model would perform best. Two methods of classifying cases by large-scale flow pattern are then evaluated for the purpose of training separate RF models on cases with similar flow patterns. The RF model appropriate to the pattern is then used to generate convective outlook guidance for a forecast case not included in the RF model training. In the first method, the convective available potential energy (CAPE)/shear parameter space over the region of interest is used as the classification metric. In the second, empirical orthogonal function (EOF) patterns that are qualitatively similar to the previously described composite flow patterns related to predictor importance are used as the classification metric. Finally, the sensitivity of each method's performance to sample size is tested by adding the 2019 cases. Both flow-dependent training methods are compared with the non-flow-dependent models and with each other. The results emphasize both objective impacts on forecast skill and physical explanations of the differences in performance among the RF training approaches.
Results show that both flow-dependent training methods initially improve severe weather forecasts relative to the non-flow-dependent model. However, as the sample size increases, the parameter-space classification continues to show significant skill, whereas not all EOF patterns retain significant skill.
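
For readers unfamiliar with Permutation Feature Importance, the idea can be sketched in a few lines: shuffle one predictor at a time and measure how much the forecast error grows. The following is a minimal, illustrative sketch only, not the thesis's implementation. The names `brier_score`, `permutation_importance`, and `toy_predict` are hypothetical, the Brier score is an assumed error metric, and the toy data stand in for real ensemble predictors.

```python
import random


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Per feature, the mean increase in Brier score after shuffling that feature."""
    rng = random.Random(seed)
    baseline = brier_score([predict(row) for row in X], y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j across samples, leaving other columns untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = brier_score([predict(row) for row in X_perm], y)
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for a trained model: only feature 0 affects the forecast.
def toy_predict(row):
    return 0.9 if row[0] > 0.5 else 0.1


# Synthetic data (assumption): the event occurs exactly when feature 0 exceeds 0.5.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

importances = permutation_importance(toy_predict, X, y)
```

In this toy setup the unused second feature receives zero importance, while shuffling the first feature degrades the score noticeably. Computing such importances case by case is one way the high- and low-importance groups described above could be formed.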

Keywords

Severe Weather, Random Forests, Flow Dependent Training
