The Stylometry of Maoism: Quantifying the Language of Mao Zedong

Maciej Kurzynski


Abstract
Recent advances in computational stylometry have enabled scholars to detect authorial signals with a high degree of precision, but the focus on accuracy comes at the expense of explainability: powerful black-box models are often of little use to traditional humanistic disciplines. With this in mind, we have conducted stylometric experiments on Maospeak, a language style shaped by the writings and speeches of Mao Zedong. We measure per-token perplexity across different GPT models, compute Kullback–Leibler divergences between local and global vocabulary distributions, and train a TF-IDF classifier to examine how the modern Chinese language has been transformed to convey the tenets of Maoist doctrine. We offer a computational interpretation of ideology as reduction in perplexity and increase in systematicity of language use.
Anthology ID:
2023.nlp4dh-1.9
Volume:
Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages
Month:
December
Year:
2023
Address:
Tokyo, Japan
Editors:
Mika Hämäläinen, Emily Öhman, Flammie Pirinen, Khalid Alnajjar, So Miyagawa, Yuri Bizzoni, Niko Partanen, Jack Rueter
Venues:
NLP4DH | IWCLUL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
76–81
Language:
URL:
https://aclanthology.org/2023.nlp4dh-1.9
DOI:
Bibkey:
Cite (ACL):
Maciej Kurzynski. 2023. The Stylometry of Maoism: Quantifying the Language of Mao Zedong. In Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages, pages 76–81, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal):
The Stylometry of Maoism: Quantifying the Language of Mao Zedong (Kurzynski, NLP4DH-IWCLUL 2023)
Copy Citation:
PDF:
https://aclanthology.org/2023.nlp4dh-1.9.pdf