TY - GEN
T1 - Person name extraction from modern standard Arabic or colloquial text
AU - Zayed, Omnia H.
AU - El-Beltagy, Samhaa R.
PY - 2012
Y1 - 2012
N2 - Person name extraction from Arabic text is a challenging task. While most existing Arabic texts are written in Modern Standard Arabic (MSA), the volume of colloquial Arabic text is increasing progressively with the widespread use of social media such as Facebook, Google Moderator, and Twitter. Previous work addressed extracting persons' names from MSA text only, especially from news articles. It also relied on many resources, such as gazetteers for places, organizations, verbs, and person names. In this paper we introduce a system for extracting persons' names from any type of Arabic text, whether MSA or colloquial, using very few resources. In our system, Natural Language Processing (NLP) is integrated with a limited set of dictionaries to extract persons' names from Arabic text. The paper also presents the results of evaluating the system on two datasets, one for MSA and the other for colloquial Arabic. The results achieved were found to be satisfactory in terms of precision, recall, and F-measure.
AB - Person name extraction from Arabic text is a challenging task. While most existing Arabic texts are written in Modern Standard Arabic (MSA), the volume of colloquial Arabic text is increasing progressively with the widespread use of social media such as Facebook, Google Moderator, and Twitter. Previous work addressed extracting persons' names from MSA text only, especially from news articles. It also relied on many resources, such as gazetteers for places, organizations, verbs, and person names. In this paper we introduce a system for extracting persons' names from any type of Arabic text, whether MSA or colloquial, using very few resources. In our system, Natural Language Processing (NLP) is integrated with a limited set of dictionaries to extract persons' names from Arabic text. The paper also presents the results of evaluating the system on two datasets, one for MSA and the other for colloquial Arabic. The results achieved were found to be satisfactory in terms of precision, recall, and F-measure.
KW - colloquial Arabic
KW - Modern Standard Arabic
KW - named entity recognition
KW - natural language processing (NLP)
KW - social media
UR - https://www.scopus.com/pages/publications/84864860212
M3 - Conference Publication
AN - SCOPUS:84864860212
SN - 9789774035067
T3 - 2012 8th International Conference on Informatics and Systems, INFOS 2012
SP - NLP44-NLP48
BT - 2012 8th International Conference on Informatics and Systems, INFOS 2012
T2 - 2012 8th International Conference on Informatics and Systems, INFOS 2012
Y2 - 14 May 2012 through 16 May 2012
ER -