The Use of Machine Translation to Provide Resources for Under-Resourced Languages - Image Captioning Task

Saad M.
Last author
Member of the Collaboration Group
2021-01-01

Abstract

Image captioning is an NLP task with many applications, such as image search and retrieval. The task is challenging and requires a large amount of data (images paired with text captions), which may not be available for some languages. In this work, we investigate the use of a machine translation system to provide resources for a low-resourced language (Arabic) for the image captioning task. We train a model on captions automatically translated with the Google machine translation service. Performance is measured using the BLEU, ROUGE, CIDEr, and METEOR metrics, and we compare it with the performance of an English model. We also evaluate the generated captions against manually translated captions. The results show that machine translation can be good enough to create resources for low-resourced languages for the image captioning task, and that translating the training data and building a new model works better than translating the model's output.
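As an illustration of the first metric named in the abstract, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision up to 4-grams, uniform weights, brevity penalty, no smoothing). It is not the evaluation code used in the paper; the function names `ngrams` and `bleu`, the whitespace tokenization, and the example sentences are assumptions made for illustration only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing (illustrative)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        # Without smoothing, any zero n-gram precision makes the score zero.
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical generated caption scored against one reference caption.
generated = "the cat sat on the mat".split()
reference = "the cat sat on a mat".split()
score = bleu(generated, [reference])
```

In the setting the abstract describes, `generated` would be a caption produced by the model and `references` the machine-translated or manually translated captions for the same image. Reported results typically use corpus-level BLEU with an established toolkit and language-appropriate tokenization, so this sketch only conveys the idea behind the metric.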
2021
9781665436519
Use this identifier to cite or link to this document: https://hdl.handle.net/11587/561287