IMAGINE | NLP4Health Lab Amsterdam

Between 2019 and 2022, dr. Calixto was a Marie Skwodówska-Curie Global Fellow working on the IMAGINE: Improving Multi-modal lAnguage Generation wIth world kNowledgE project, which was a research project to investigate how to incorporate world knowledge into vision & language tasks within natural language generation.

Dr. Calixto spent ~2 years in New York University’s Courant Institute for Mathematical Sciences where he worked with prof. Kyunghyun Cho, and 1 year at the Institute for Logic, Language and Computation (ILLC) in the University of Amsterdam, where he worked with prof. Raquel Fernández.

References

2023

SemEval-2023 Task 1: Visual Word Sense Disambiguation

Alessandro Raganato , Iacer Calixto , Asahi Ushio , Jose Camacho-Collados , and Mohammad Taher Pilehvar

In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023) , Jul 2023

Abs Bib HTML

This paper presents the Visual Word Sense Disambiguation (Visual-WSD) task. The objective of Visual-WSD is to identify among a set of ten images the one that corresponds to the intended meaning of a given ambiguous word which is accompanied with minimal context. The task provides datasets for three different languages: English, Italian, and Farsi.We received a total of 96 different submissions. Out of these, 40 systems outperformed a strong zero-shot CLIP-based baseline. Participating systems proposed different zero- and few-shot approaches, often involving generative models and data augmentation. More information can be found on the task’s website: }urlhttps://raganato.github.io/vwsd/.
@inproceedings{raganato-etal-2023-semeval, title = {{S}em{E}val-2023 Task 1: Visual Word Sense Disambiguation}, author = {Raganato, Alessandro and Calixto, Iacer and Ushio, Asahi and Camacho-Collados, Jose and Pilehvar, Mohammad Taher}, booktitle = {Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)}, month = jul, year = {2023}, address = {Toronto, Canada}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2023.semeval-1.308}, doi = {10.18653/v1/2023.semeval-1.308}, pages = {2227--2234}, }

2022

Endowing language models with multimodal knowledge graph representations

Ningyuan Huang , Yash R Deshpande , Yibo Liu , Houda Alberts , Kyunghyun Cho , Clara Vania , and Iacer Calixto

arXiv preprint arXiv:2206.13163, Jul 2022

Abs arXiv Bib Code

We propose a method to make natural language understanding models more parameter efficient by storing knowledge in an external knowledge graph (KG) and retrieving from this KG using a dense index. Given (possibly multilingual) downstream task data, e.g., sentences in German, we retrieve entities from the KG and use their multimodal representations to improve downstream task performance. We use the recently released VisualSem KG as our external knowledge repository, which covers a subset of Wikipedia and WordNet entities, and compare a mix of tuple-based and graph-based algorithms to learn entity and relation representations that are grounded on the KG multimodal information. We demonstrate the usefulness of the learned entity representations on two downstream tasks, and show improved performance on the multilingual named entity recognition task by 0.3%–0.7% F1, while we achieve up to 2.5% improvement in accuracy on the visual sense disambiguation task. All our code and data are available in: \urlthis https URL.
@article{huang2022endowing, title = {Endowing language models with multimodal knowledge graph representations}, author = {Huang, Ningyuan and Deshpande, Yash R and Liu, Yibo and Alberts, Houda and Cho, Kyunghyun and Vania, Clara and Calixto, Iacer}, journal = {arXiv preprint arXiv:2206.13163}, year = {2022}, }