
dc.contributor.advisor | McGovern, Amy
dc.contributor.author | Lagerquist, Ryan
dc.date.accessioned | 2016-08-18T21:22:58Z
dc.date.available | 2016-08-18T21:22:58Z
dc.date.issued | 2016-08-12
dc.identifier.uri | https://hdl.handle.net/11244/44921
dc.description.abstract | Thunderstorms, including straight-line (non-tornadic) winds, cause an average of over 100 deaths and $10 billion of insured damage per year in the United States. In the past decade, machine learning has led to significant improvements in the prediction of other convective hazards, such as tornadoes, hail, lightning, and convectively induced aircraft turbulence. However, very few studies have used machine learning specifically to predict damaging straight-line winds. We have developed machine-learning models to predict the probability of damaging straight-line wind, defined as a gust ≥ 50 kt (25.72 m s⁻¹), for a given storm cell. Predictions are made for three buffer distances around the storm cell (0, 5, and 10 km) and five lead-time windows ([0, 15]; [15, 30]; [30, 45]; [45, 60]; and [60, 90] minutes). Three types of data are used to train models: radar images from the Multi-year Reanalysis of Remotely Sensed Storms (MYRORSS); atmospheric soundings from the Rapid Update Cycle (RUC) model and North American Regional Reanalysis (NARR); and near-surface wind observations from the Meteorological Assimilation Data Ingest System (MADIS), Oklahoma Mesonet, one-minute meteorological aerodrome reports (METARs), and National Weather Service local storm reports. Radar images are used to determine the structural and hydrometeorological properties of storm cells, while soundings are used to determine properties of the near-storm environment, which are important for storm evolution. Both of these data types are used to create predictor variables. Meanwhile, near-surface wind observations are used as verification data (to determine which storm cells produced damaging straight-line winds). For each buffer distance and lead-time window, we experiment with five machine-learning algorithms: logistic regression, logistic regression with an elastic net, feed-forward neural nets, random forests, and gradient-boosted tree (GBT) ensembles.
Forecast probabilities from each model are calibrated with isotonic regression, which makes them more reliable. Forecasts are verified mainly with three scores: area under the receiver-operating-characteristic curve (AUC), maximum critical success index (CSI), and Brier skill score (BSS). AUC and maximum CSI range over [0, 1], where 0 is the worst score and 1 is a perfect score. BSS ranges over (−∞, 1], where −∞ is the worst score, 1 is a perfect score, and any value > 0 means that the model is better than climatology. Models are ranked by AUC. The best model (for a buffer distance of 0 km and lead time of [15, 30] minutes) has an AUC of 0.996, maximum CSI of 0.99, and BSS of 0.88. The worst model (for a buffer distance of 10 km and lead time of [60, 90] minutes) has an AUC of 0.89, maximum CSI of 0.20, and BSS of 0.12. All models outperform climatology. Finally, for each buffer distance and lead-time window, we use three methods to select the most important predictor variables: sequential forward selection, J-measures, and decision trees. | en_US
dc.language | en | en_US
dc.subject | Meteorology | en_US
dc.subject | Thunderstorms | en_US
dc.subject | Machine learning | en_US
dc.subject | Artificial intelligence | en_US
dc.title | Using Machine Learning to Predict Damaging Straight-line Convective Winds | en_US
dc.contributor.committeeMember | Smith, Travis
dc.contributor.committeeMember | Richman, Michael
dc.date.manuscript | 2016-08-18
dc.thesis.degree | Master of Science in Meteorology | en_US
ou.group | College of Atmospheric & Geographic Sciences::School of Meteorology | en_US
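
The three verification scores named in the abstract (AUC, maximum CSI, and BSS) can be illustrated with a minimal sketch. This is not the thesis code: the function names, the toy forecasts, and the choice to take climatology as the event frequency of the same sample are all assumptions made for illustration.

```python
def auc(probs, labels):
    """Area under the ROC curve, computed as the fraction of (event, non-event)
    pairs in which the event received the higher probability (ties count 0.5).
    Assumes at least one event and one non-event are present."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def max_csi(probs, labels):
    """Maximum critical success index over all probability thresholds.
    CSI = hits / (hits + false alarms + misses) at a given threshold."""
    best = 0.0
    for t in sorted(set(probs)):
        hits = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        false_alarms = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        misses = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        denom = hits + false_alarms + misses
        if denom > 0:
            best = max(best, hits / denom)
    return best


def brier_skill_score(probs, labels):
    """BSS = 1 - BS_model / BS_climatology.
    Assumption: climatology is the event frequency of this same sample,
    and the sample contains both events and non-events."""
    n = len(labels)
    climatology = sum(labels) / n
    bs_model = sum((p - y) ** 2 for p, y in zip(probs, labels)) / n
    bs_climo = sum((climatology - y) ** 2 for y in labels) / n
    return 1.0 - bs_model / bs_climo


# Toy forecasts for five hypothetical storm cells (1 = damaging wind observed).
toy_probs = [0.9, 0.8, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0]

print("AUC:", auc(toy_probs, toy_labels))
print("Max CSI:", max_csi(toy_probs, toy_labels))
print("BSS:", brier_skill_score(toy_probs, toy_labels))
```

A BSS above 0 on the toy data means the toy forecasts beat the constant climatological forecast, which is the sense in which the abstract says all models outperform climatology.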

