TY - GEN
T1 - Mining governmental collaboration through semantic profiling of open data catalogues and publishers
AU - Adel Rezk, Mohamed
AU - Ojo, Adegboyega
AU - Hassan, Islam A.
N1 - Publisher Copyright: © IFIP International Federation for Information Processing 2017.
PY - 2017
Y1 - 2017
N2 - Due to the increasing adoption of open data among governments worldwide, especially in the European Union area, a deeper analysis of the newly published data is becoming necessary. Apart from analyzing the published datasets themselves, we aimed at analyzing published dataset catalogues. A dataset catalogue, or dataset metadata, contains features that describe what the data is about in a textual representation. So, we first acquire data from open data portals, choose descriptive dataset catalogue features, and then construct an aggregated textual representation of the datasets. Afterwards, we enrich those textual representations using Natural Language Processing (NLP) methods to create a new comparable data feature, “Named Entities”. By mining the new data feature, we are able to produce dataset and publisher relatedness networks. Those networks are used to point out similarities between the published data across multiple open data portals. Pointing out all possible collaborations for integrating and standardizing data features and types would increase the value of data and ease its analysis process.
AB - Due to the increasing adoption of open data among governments worldwide, especially in the European Union area, a deeper analysis of the newly published data is becoming necessary. Apart from analyzing the published datasets themselves, we aimed at analyzing published dataset catalogues. A dataset catalogue, or dataset metadata, contains features that describe what the data is about in a textual representation. So, we first acquire data from open data portals, choose descriptive dataset catalogue features, and then construct an aggregated textual representation of the datasets. Afterwards, we enrich those textual representations using Natural Language Processing (NLP) methods to create a new comparable data feature, “Named Entities”. By mining the new data feature, we are able to produce dataset and publisher relatedness networks. Those networks are used to point out similarities between the published data across multiple open data portals. Pointing out all possible collaborations for integrating and standardizing data features and types would increase the value of data and ease its analysis process.
KW - Collaborative network
KW - Data mining
KW - E-government
KW - Open data
KW - Unstructured data analysis
UR - https://www.scopus.com/pages/publications/85029593119
U2 - 10.1007/978-3-319-65151-4_24
DO - 10.1007/978-3-319-65151-4_24
M3 - Conference Publication
AN - SCOPUS:85029593119
SN - 9783319651507
T3 - IFIP Advances in Information and Communication Technology
SP - 253
EP - 264
BT - IFIP Advances in Information and Communication Technology
A2 - Camarinha-Matos, Luis M.
A2 - Fornasiero, Rosanna
A2 - Afsarmanesh, Hamideh
PB - Springer New York LLC
T2 - 18th IFIP WG 5.5 Working Conference on Virtual Enterprises, PRO-VE 2017
Y2 - 18 September 2017 through 20 September 2017
ER -