TY - JOUR
T1 - Multimodal Event Processing
T2 - A Neural-Symbolic Paradigm for the Internet of Multimedia Things
AU - Curry, Edward
AU - Salwala, Dhaval
AU - Dhingra, Praneet
AU - Pontes, Felipe Arruda
AU - Yadav, Piyush
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2022/8/1
Y1 - 2022/8/1
N2 - With the Internet of Multimedia Things (IoMT) becoming a reality, new approaches are needed to process real-time multimodal event streams. Existing approaches to event processing give limited consideration to the challenges of multimodal events, including the need for complex content extraction and increased computational and memory costs. This article explores event processing as a basis for processing real-time IoMT data and introduces the multimodal event processing (MEP) paradigm, which provides a formal basis for combining native approaches to neural multimodal content analysis (i.e., computer vision, linguistics, and audio) with symbolic event processing rules, supporting real-time queries over multimodal data streams. The multimodal event processing language expresses single, primitive multimodal, and complex multimodal event patterns, and the content of multimodal streams is represented using multimodal event knowledge graphs that capture the semantic, spatial, and temporal content of the streams. The approach is implemented and evaluated within an MEP engine using single and multimodal queries, achieving near-real-time performance with a throughput of 30 frames processed per second (fps) and subsecond latency of 0.075-0.30 s for video streams with a 30-fps input rate. Support for higher input stream rates (45 fps) is achieved through content-aware load-shedding techniques, yielding a 127X latency improvement with only a minor decrease in accuracy.
AB - With the Internet of Multimedia Things (IoMT) becoming a reality, new approaches are needed to process real-time multimodal event streams. Existing approaches to event processing give limited consideration to the challenges of multimodal events, including the need for complex content extraction and increased computational and memory costs. This article explores event processing as a basis for processing real-time IoMT data and introduces the multimodal event processing (MEP) paradigm, which provides a formal basis for combining native approaches to neural multimodal content analysis (i.e., computer vision, linguistics, and audio) with symbolic event processing rules, supporting real-time queries over multimodal data streams. The multimodal event processing language expresses single, primitive multimodal, and complex multimodal event patterns, and the content of multimodal streams is represented using multimodal event knowledge graphs that capture the semantic, spatial, and temporal content of the streams. The approach is implemented and evaluated within an MEP engine using single and multimodal queries, achieving near-real-time performance with a throughput of 30 frames processed per second (fps) and subsecond latency of 0.075-0.30 s for video streams with a 30-fps input rate. Support for higher input stream rates (45 fps) is achieved through content-aware load-shedding techniques, yielding a 127X latency improvement with only a minor decrease in accuracy.
KW - Data management and analytics
KW - event processing
KW - Internet of Multimedia Things (IoMT)
KW - service middleware and platform
UR - https://www.scopus.com/pages/publications/85123310192
U2 - 10.1109/JIOT.2022.3143171
DO - 10.1109/JIOT.2022.3143171
M3 - Article
AN - SCOPUS:85123310192
SN - 2327-4662
VL - 9
SP - 13705
EP - 13724
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 15
ER -