Zhaoji Liang


2023

pdf bib
Multi-Modal Knowledge Graph Transformer Framework for Multi-Modal Entity Alignment
Qian Li | Cheng Ji | Shu Guo | Zhaoji Liang | Lihong Wang | Jianxin Li
Findings of the Association for Computational Linguistics: EMNLP 2023

Multi-Modal Entity Alignment (MMEA) is a critical task that aims to identify equivalent entity pairs across multi-modal knowledge graphs (MMKGs). However, this task faces challenges due to the presence of different types of information, including neighboring entities, multi-modal attributes, and entity types. Directly incorporating the above information (e.g., concatenation or attention) can lead to an unaligned information space. To address these challenges, we propose a novel MMEA transformer, called Meaformer, that hierarchically introduces neighbor features, multi-modal attributes, and entity types to enhance the alignment task. Taking advantage of the transformer’s ability to better integrate multiple information, we design a hierarchical modifiable self-attention block in a transformer encoder to preserve the unique semantics of different information. Furthermore, we design two entity-type prefix injection methods to redintegrate entity-type information using type prefixes, which help to restrict the global information of entities not present in the MMKGs.