General supervised learning framework for open world classification

Bhavaraju, Sai Krishna Theja

View/Open

2020_Bhavaraju_Sai Krishna Theja_Dissertation.pdf (323.6Kb)

2020_Bhavaraju_Sai Krishna Theja_Dissertation.tex (129.7Kb)

Date

2020-12-18



Abstract

In machine learning, the most common classification setting assumes that the training set contains every possible class, and the algorithm learns to identify those classes. The setting in which the training data does not contain all classes is referred to as the open-world classification problem. In that setting, when the model is applied to new data, it is imperative to identify the instances that belong to unknown classes. While the literature addresses this issue, most of the work has been limited to the domain of computer vision, and the solution approaches are specific to a particular type of machine learning algorithm. It is equally important to categorize the identified instances into their classes to facilitate retraining, and to the best of our knowledge, no generalized approach provides a complete solution to the problem. This work proposes a framework that can identify instances of unseen classes in new data and also categorize the identified instances into their respective classes. We claim that this methodology works irrespective of the nature of the data and the type of classifier under consideration. To validate our claim, the methodology is tested on different types of data, such as image, text, and sensor data. Furthermore, the proposed framework is demonstrated on a case study: social media analytics for community resilience. Our results show that the methodology performs consistently well across the data sets and the case study considered.
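
The abstract does not say how the framework carries out either step, so the following is only a minimal sketch of the two-stage idea it describes (flag instances of unknown classes, then group the flagged instances into candidate classes). It is not the dissertation's method. The nearest-centroid classifier, the distance-based rejection rule, the greedy grouping, the function names, and the `reject_radius` and `link_radius` parameters are all assumptions chosen for illustration.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def fit_centroids(X, y):
    """Return {label: centroid} for the classes present in the training data."""
    by_label = {}
    for x, label in zip(X, y):
        by_label.setdefault(label, []).append(x)
    return {
        label: [sum(col) / len(pts) for col in zip(*pts)]
        for label, pts in by_label.items()
    }


def predict_open_world(centroids, x, reject_radius):
    """Return the nearest known label, or None if x is far from every known class."""
    label, dist = min(
        ((lab, euclidean(x, c)) for lab, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= reject_radius else None


def group_unknowns(points, link_radius):
    """Greedy grouping: a point joins the first group whose first member is within link_radius."""
    groups = []
    for p in points:
        for g in groups:
            if euclidean(p, g[0]) <= link_radius:
                g.append(p)
                break
        else:
            groups.append([p])  # no group is close enough, so start a new one
    return groups


# Toy data (hypothetical): two known classes in training, two unseen classes in new data.
X_train = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]]
y_train = ["a", "a", "b", "b"]
X_new = [[0.1, 0.0], [5.0, 5.1], [10.0, 0.0], [10.2, 0.1], [0.0, 10.0]]

centroids = fit_centroids(X_train, y_train)
predictions = [predict_open_world(centroids, x, reject_radius=1.0) for x in X_new]
unknown = [x for x, p in zip(X_new, predictions) if p is None]
candidate_classes = group_unknowns(unknown, link_radius=1.0)
```

In this toy run the first two new points are assigned to the known classes, and the remaining three are rejected and split into two candidate groups that could be labeled and added to the training set for retraining. Both radii are hand-picked for the toy data; a real application would need a principled way to set them.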