Integrating semantic knowledge to tackle zero-shot text classification
File(s)
1903.12626v1.pdf (441.24 KB)
Accepted version
Author(s)
Zhang, Jingqing
Lertvittayakumjorn, Piyawat
Guo, Yike
Type
Conference Paper
Abstract
Insufficient or even unavailable training data for emerging classes is a major
challenge in many classification tasks, including text classification.
Recognising text documents of classes that have never been seen in the learning
stage, so-called zero-shot text classification, is therefore difficult, and only
a few previous works have tackled this problem. In this paper, we propose a
two-phase framework together with data augmentation and feature augmentation to
solve this problem. Four kinds of semantic knowledge (word embeddings, class
descriptions, class hierarchy, and a general knowledge graph) are incorporated
into the proposed framework to deal with instances of unseen classes
effectively. Experimental results show that each of the two phases, as well as
their combination, achieves the best overall accuracy compared with baselines and
recent approaches in classifying real-world texts under the zero-shot scenario.
Date Acceptance
2019-06-02
Copyright Statement
© 2019 The Authors.
Identifier
http://arxiv.org/abs/1903.12626v1
Source
2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Subjects
cs.CL
Notes
Accepted NAACL-HLT 2019
Start Date
2019-06-03
Finish Date
2019-06-05
Coverage Spatial
Minneapolis