Person name extraction from modern standard Arabic or colloquial text

Omnia H. Zayed, Samhaa R. El-Beltagy

Research output: Chapter in Book or Conference Publication/Proceeding; Conference Publication; peer-review

4 Citations (Scopus)

Abstract

Person name extraction from Arabic text is a challenging task. While most existing Arabic text is written in Modern Standard Arabic (MSA), the volume of colloquial Arabic text is increasing steadily with the widespread use of social media such as Facebook, Google Moderator, and Twitter. Previous work addressed extracting persons' names from MSA text only, especially from news articles, and relied on many resources such as gazetteers for places, organizations, verbs, and person names. In this paper we introduce a system for extracting persons' names from any type of Arabic text, whether MSA or colloquial, using very few resources. In our system, Natural Language Processing (NLP) techniques are integrated with a limited set of dictionaries to extract persons' names from Arabic text. The paper also presents the results of evaluating the system on two datasets, one for MSA and the other for colloquial Arabic. The results achieved were found to be satisfactory in terms of precision, recall, and F-measure.

Original language: English
Title of host publication: 2012 8th International Conference on Informatics and Systems, INFOS 2012
Pages: NLP44-NLP48
Publication status: Published - 2012
Externally published: Yes
Event: 2012 8th International Conference on Informatics and Systems, INFOS 2012 - Cairo, Egypt
Duration: 14 May 2012 to 16 May 2012

Publication series

Name: 2012 8th International Conference on Informatics and Systems, INFOS 2012

Conference

Conference: 2012 8th International Conference on Informatics and Systems, INFOS 2012
Country/Territory: Egypt
City: Cairo
Period: 14/05/12 to 16/05/12

Keywords

  • colloquial Arabic
  • Modern Standard Arabic
  • named entity recognition
  • natural language processing (NLP)
  • social media
