Managing bulk sensor data for heterogeneous distributed systems
Maintaining the current U.S. transportation infrastructure requires tremendous investment because of critical roadway and pavement conditions. Cyber-Physical Systems (CPS) are a promising way to prioritize repair expenditures: through sensors and actuators, they provide intrinsic knowledge about infrastructure performance, such as roadway surface and subsurface deterioration. However, characterizing and quantifying the infrastructure’s time-varying behavior (infrastructure health and life cycle) in a cost-efficient and non-intrusive way requires an underlying framework for handling domain-specific big data in CPS.
This paper proposes a holistic approach to managing domain-specific bulk sensor data generated by heterogeneous distributed sensor systems, addressing the challenges that arise when CPS meets big data. We focus on big data handling in the Scalable Intelligent Roaming Multi-Modal Multi-Sensor (SIROM^3) Framework, which collects data about roadway conditions from multiple domains through mobile agents. A Heterogeneous Stream File-system Overlay (HSFO) is proposed as a platform-independent layer to uniformly define, organize, and manage the high volume of heterogeneous streaming data. Additionally, a flexible plugin system (PLEX) is introduced to simplify and automate data feature extraction, correlation, fusion, and visualization. Both HSFO and PLEX are designed for high scalability and adaptability. They can be executed on a wide range of platforms, from mobile systems to mainstream servers, with a common software/hardware stack. Our solution covers the big data pipeline from collection, storage, and aggregation to processing and knowledge discovery. The automation it embodies eliminates human intervention at every stage and increases overall efficiency.
Over 20 terabytes of data covering 300 miles have been collected, aggregated, and fused to understand the pavement dynamics of the entire city of Brockton, MA. The performance of data processing with and without HSFO was compared. The results indicate that processing data with HSFO incurs an overhead of 0.19 s/KB relative to processing without HSFO. The difference in CPU utilization between the two methods is less than 5%. This implies that HSFO and PLEX have quite low overhead and impose a negligible impediment to system performance. The unified automation they provide has demonstrated a nearly 30-fold increase in overall productivity, from data collection to processing. As a result, we have established foundational tools for managing big data from distributed multi-modal multi-sensor systems in civil infrastructure monitoring. They provide a rapid and comprehensive understanding of civil infrastructure health and support life cycle management.
Appeared in:
Electrical and Computer Engineering, Northeastern
Year:
2014
Presentation Place:
Boston