A Weakly Supervised and Deep Learning Method for an Additive Topic Analysis of Large Corpora


Yair Fogel-Dror, Sheafer, Tamir, and Shenhav, Shaul R. 2021. “A Weakly Supervised And Deep Learning Method For An Additive Topic Analysis Of Large Corpora”. Computational Communication Research, 31, Pp. 29-59. https://computationalcommunication.org/ccr/article/view/34.


The collaborative effort of theory-driven content analysis can benefit significantly from the use of topic analysis methods, which allow researchers to add more categories while developing or testing a theory. This additive approach enables the reuse of previous efforts of analysis or even the merging of separate research projects, thereby making these methods more accessible and increasing the discipline’s ability to create and share content analysis capabilities. This paper proposes a weakly supervised topic analysis method that uses both a low-cost unsupervised method to compile a training set and supervised deep learning as an additive and accurate text classification method. We test the validity of the method, specifically its additivity, by comparing the results of the method after adding 200 categories to an initial number of 450. We show that the suggested method provides a foundation for a low-cost solution for large-scale topic analysis.