Automatic transliteration of romanized dialectal Arabic

  • Mohamed Al-Badrashiny
  • Ramy Eskander
  • Nizar Habash
  • Owen Rambow
  • George Washington University
  • Columbia University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

64 Scopus citations

Abstract

In this paper, we address the problem of converting Dialectal Arabic (DA) text that is written in the Latin script (called Arabizi) into Arabic script following the CODA convention for DA orthography. The presented system uses a finite state transducer trained at the character level to generate all possible transliterations for the input Arabizi words. We then filter the generated list using a DA morphological analyzer. After that, we pick the best choice for each input word using a language model. We achieve an accuracy of 69.4% on an unseen test set, compared to 63.1% for a system that represents a previously proposed approach.
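The three stages named in the abstract (generate candidates, filter them, pick the best) can be illustrated with a minimal sketch. This is not the authors' implementation: the character map, the word set standing in for the morphological analyzer, the unigram scores standing in for the language model, and all function names are hypothetical toy values chosen for illustration. The fallback to unfiltered candidates when nothing passes the filter is likewise an assumption, not something the abstract states.

```python
from itertools import product

# Hypothetical character-level mapping; the paper trains a finite state
# transducer for this step instead of using a hand-written table.
CHAR_MAP = {
    "k": ["ك"],
    "t": ["ت", "ط"],
    "a": ["ا", ""],  # a short vowel is often not written in Arabic script
    "b": ["ب"],
}

# Toy stand-in for the DA morphological analyzer: a set of accepted words.
LEXICON = {"كتاب", "كتب"}

# Made-up unigram scores standing in for the language model.
UNIGRAM_SCORE = {"كتاب": 0.7, "كتب": 0.3}


def generate_candidates(word):
    """Expand each Latin character into its possible Arabic-script outputs."""
    options = [CHAR_MAP.get(ch, [ch]) for ch in word.lower()]
    return sorted({"".join(parts) for parts in product(*options)})


def filter_candidates(candidates):
    """Keep candidates the toy analyzer accepts."""
    kept = [c for c in candidates if c in LEXICON]
    # Assumption: if nothing survives, fall back to the unfiltered list.
    return kept or candidates


def pick_best(candidates):
    """Choose the highest-scoring candidate under the toy unigram scores."""
    return max(candidates, key=lambda c: UNIGRAM_SCORE.get(c, 0.0))


def transliterate(sentence):
    """Run generate, filter, and pick independently for each input word."""
    return " ".join(
        pick_best(filter_candidates(generate_candidates(word)))
        for word in sentence.split()
    )


if __name__ == "__main__":
    print(generate_candidates("ktab"))
    print(transliterate("ktab"))
```

Scoring each word in isolation is a simplification; a language model that conditions on neighbouring words would choose among the filtered candidates in context.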

Original language: English
Title of host publication: CoNLL 2014 - 18th Conference on Computational Natural Language Learning, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 30-38
Number of pages: 9
ISBN (Electronic): 9781941643020
State: Published - 2014
Event: 18th Conference on Computational Natural Language Learning, CoNLL 2014 - Baltimore, United States
Duration: Jun 26 2014 to Jun 27 2014

Publication series

Name: CoNLL 2014 - 18th Conference on Computational Natural Language Learning, Proceedings

Conference

Conference: 18th Conference on Computational Natural Language Learning, CoNLL 2014
Country/Territory: United States
City: Baltimore
Period: 06/26/14 to 06/27/14
