Zhi Li

2023

pdf bib abs
Cultural Concept Adaptation on Multimodal Reasoning
Zhi Li | Yin Zhang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Developing cultural adaptation methods is important, which can improve the model performance on the low-resource ones and provide more equitable opportunities for everyone to benefit from advanced technology. Past methods primarily focused on multilingual and multimodal capabilities, and the improvement of multicultural competence is still an unexplored problem. This is largely due to the difficulty of data scarcity and expensive annotation. In this paper, we navigate this uncharted territory by leveraging high-resource cultures to facilitate comprehension of low-resource ones. We first introduce an annotation-free method for cultural-concept adaptation and construct a concept mapping set. To facilitate the model’s comprehension of cultural-concept mappings, we propose a new multimodal data augmentation called CultureMixup. This approach employs a three-tier code-switching strategy on textual sentences. Additionally, it uses a cultural concept-based mixup method for the images. This combination effectively generates new data instances across culture, phrase, word, and image levels. For visually grounded reasoning across languages and cultures, experimental results on five languages show that our method consistently improves performance for four existing multilingual and multimodal models on both zero-shot and few-shot settings.

pdf bib abs
CCL23-Eval 任务6系统报告:面向电信网络诈骗案件分类的优化策略(CCL23-Eval Task 6 System Report: Research on Optimization Strategies for Telecom Internet fraud Case Classification)
Junhui Yu (余俊晖) | Zhi Li (李智)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

“电信网络诈骗案件的激增给社会带来了巨大的安全威胁,因此准确、高效地分类和检测电信网络诈骗成为了当务之急。本研究旨在针对电信网络诈骗案件分类问题,探索了一系列优化策略,并在“电信网络诈骗案件分类评测”技术评测比赛中最终成绩排名第一。本研究基于文本分类模型,并采用了BERT的继续预训练、FreeLB的对抗训练和模型融合等trick。通过BERT的继续预训练,使模型具备更好的语义理解能力和特征提取能力。而通过FreeLB的对抗训练,增强了模型的鲁棒性,使其能够更好地应对噪声和干扰。此外,本文采用模型融合的方法将多个模型的预测结果进行融合,进一步提高了分类的准确性。实验结果表明,本文的优化策略在比赛中取得了显著的成绩,证明了其在电信网络诈骗案件分类中的有效性和优越性。本研究的成果对于提高电信网络诈骗案件的分类性能具有重要意义,为相关领域的研究和实践提供了有益的参考。”

2021

Natural language processing (NLP) models often require a massive number of parameters for word embeddings, which limits their application on mobile devices. Researchers have employed many approaches, e.g. adaptive inputs, to reduce the parameters of word embeddings. However, existing methods rarely pay attention to semantic information. In this paper, we propose a novel method called Unique and Class Embeddings (UnClE), which explicitly leverages semantic similarity with weight sharing to reduce the dimensionality of word embeddings. Inspired by the fact that words with similar semantic can share a part of weights, we divide the embeddings of words into two parts: unique embedding and class embedding. The former is one-to-one mapping like traditional embedding, while the latter is many-to-one mapping and learn the representation of class information. Our method is suitable for both word-level and sub-word level models and can be used to reduce both input and output embeddings. Experimental results on the standard WMT 2014 English-German dataset show that our method is able to reduce the parameters of word embeddings by more than 11x, with about 93% performance retaining in BLEU metrics. For language modeling task, our model can reduce word embeddings by 6x or 11x on PTB/WT2 dataset at the cost of a certain degree of performance degradation.

pdf bib abs
AutoChart: A Dataset for Chart-to-Text Generation Task
Jiawen Zhu | Jinye Ran | Roy Ka-Wei Lee | Zhi Li | Kenny Choo
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

The analytical description of charts is an exciting and important research area with many applications in academia and industry. Yet, this challenging task has received limited attention from the computational linguistics research community. This paper proposes AutoChart, a large dataset for the analytical description of charts, which aims to encourage more research into this important area. Specifically, we offer a novel framework that generates the charts and their analytical description automatically. We conducted extensive human and machine evaluation on the generated charts and descriptions and demonstrate that the generated texts are informative, coherent, and relevant to the corresponding charts.

There is a long history of research related to automated story generation, dating back as far as the 1970s. Recently, the rapid development of pre-trained language models has spurred great progresses in this field. Equipped with GPT-2 and the latest GPT-3, AI Dungeon has been seen as a famous example of the powerful text generation capabilities of large-scale pre-trained language models, and a possibility for future games. However, as a game, AI Dungeon lacks incentives to players and relies entirely on players to explore on their own. This makes players’ enthusiasm decline rapidly. In this paper, we present an open-ended text adventure game in Chinese, named as KuiLeiXi. In KuiLeiXi, players need to interact with the AI until the pre-determined plot goals are reached. By introducing the plot goals, players have a stronger incentive to explore ways to reach plot goals, while the AI’s abilities are not abused to generate harmful contents. This limited freedom allows this game to be integrated as a part of a romance simulation mobile game, Yu Jian Love. Since KuiLeiXi was launched, it has received a lot of positive feedbacks from more than 100,000 players. A demo video is available at https://youtu.be/DyYZhxMRrkk.

This paper presents BSTC (Baidu Speech Translation Corpus), a large-scale Chinese-English speech translation dataset. This dataset is constructed based on a collection of licensed videos of talks or lectures, including about 68 hours of Mandarin data, their manual transcripts and translations into English, as well as automated transcripts by an automatic speech recognition (ASR) model. We have further asked three experienced interpreters to simultaneously interpret the testing talks in a mock conference setting. This corpus is expected to promote the research of automatic simultaneous translation as well as the development of practical systems. We have organized simultaneous translation tasks and used this corpus to evaluate automatic simultaneous translation systems.