Show simple item record

dc.contributor.advisorLiu, Tieming
dc.contributor.authorPiri, Saeed
dc.date.accessioned2018-04-23T19:36:01Z
dc.date.available2018-04-23T19:36:01Z
dc.date.issued2017-07
dc.identifier.urihttps://hdl.handle.net/11244/299540
dc.description.abstractImproving healthcare is a top priority for all nations. US healthcare expenditure was $3 trillion in 2014. In the same year, the share of GDP assigned to healthcare expenditure was 17.5%. These statistics shows the importance of making improvement in healthcare delivery system. In this research, we developed several data mining methods and algorithms to address healthcare problems. These methods can also be applied to the problems in other domains.
dc.description.abstractThe first part of this dissertation is about rare item problem in association analysis. This problem deals with the discovering rare rules, which include rare items. In this study, we introduced a novel assessment metric, called adjusted support to address this problem. By applying this metric, we can retrieve rare rules without over-generating association rules. We applied this method to perform association analysis on complications of diabetes.
dc.description.abstractThe second part of this dissertation is developing a clinical decision support system for predicting retinopathy. Retinopathy is the leading cause of vision loss among American adults. In this research, we analyzed data from more than 1.4 million diabetic patients and developed four sets of predictive models: basic, comorbid, over-sampled, and ensemble models. The results show that incorporating comorbidity data and oversampling improved the accuracy of prediction. In addition, we developed a novel "confidence margin" ensemble approach that outperformed the existing ensemble models. In ensemble models, we also addressed the issue of tie in voting-based ensemble models by comparing the confidence margins of the base predictors.
dc.description.abstractThe third part of this dissertation addresses the problem of imbalanced data learning, which is a major challenge in machine learning. While a standard machine learning technique could have a good performance on balanced datasets, when applied to imbalanced datasets its performance deteriorates dramatically. This poor performance is rather troublesome especially in detecting the minority class that usually is the class of interest. In this study, we proposed a synthetic informative minority over-sampling (SIMO) algorithm embedded into support vector machine. We applied SIMO to 15 publicly available benchmark datasets and assessed its performance in comparison with seven existing approaches. The results showed that SIMO outperformed all existing approaches.
dc.formatapplication/pdf
dc.languageen_US
dc.rightsCopyright is held by the author who has granted the Oklahoma State University Library the non-exclusive right to share this material in its institutional repository. Contact Digital Library Services at lib-dls@okstate.edu or 405-744-9161 for the permission policy on the use, reproduction or distribution of this material.
dc.titleDeveloping and deploying data mining techniques in healthcare
dc.contributor.committeeMemberHeragu, Sunderesh
dc.contributor.committeeMemberYousefian, Farzad
dc.contributor.committeeMemberPaiva, William
dc.contributor.committeeMemberDulen, Dursun
osu.filenamePiri_okstate_0664D_15272.pdf
osu.accesstypeOpen Access
dc.type.genreDissertation
dc.type.materialText
thesis.degree.disciplineIndustrial Engineering and Management
thesis.degree.grantorOklahoma State University


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record