Building A Traffic Quality Control System, Part 1: Handling Big Data

This is the first part of an article series on the development of a traffic quality control system. This first part covers the system’s architecture.

Concept Reply GmbH
4 min read · Dec 6, 2022

A prominent customer of Concept Reply makes mobility safer, more efficient, and more sustainable by developing innovative mobility ecosystems and services for smart cities. Part of their portfolio is a traffic quality control system, which delivers real-time sensor data from intersections to traffic engineers, who analyze and assess the situation on the roads. This is a first step toward tackling traffic congestion, which negatively affects both a city’s economy and its environment [1].

When it comes to traffic monitoring, a monitor describes a measurable event that is used to evaluate the traffic situation. Monitors answer questions such as: How long does a traffic light stay green? How long does a vehicle have to wait at an intersection? How much time does a public transport vehicle need to cross an intersection?

A medium-sized city already generates thousands of data points per hour for each monitor. The system supports up to eight different monitors for hundreds of intersections within a city, so processing millions of data points on a daily basis becomes a challenge. Concept Reply develops algorithms that process this data efficiently and present the information accurately to the users of the system.
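For a rough sense of scale: assuming, purely for illustration, 1,000 data points per hour per monitor, eight monitors, and 100 intersections, the system already ingests 1,000 × 8 × 100 × 24 = 19.2 million data points per day.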

The aim of this article is to share the architectural decisions that enabled Concept Reply to create an extensible basis for developing new monitors while simultaneously ensuring the performance and resiliency of the system. Below is a diagram of the current architecture; it illustrates the points covered in the following sections.

Deployment

The customer must be able to use the traffic quality control system in different cities. Each city requires a high level of customization to adapt to its individual needs and complexity.

Each service is containerized and deployed to Amazon’s Elastic Kubernetes Service (EKS). In order to fulfill the customer’s requirements, the system has also been configured to be deployable on-premises. This hybrid approach offers great flexibility to both the development team and the customer.
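As a sketch of what the per-city customization can look like with Helm (which is part of the tech stack listed in the appendix; the chart and release names below are hypothetical), each city becomes its own release with a city-specific values file:

```bash
# Hypothetical example: one Helm release per city,
# each customized through its own values file.
helm upgrade --install tqcs-berlin ./charts/tqcs -f values/berlin.yaml
helm upgrade --install tqcs-munich ./charts/tqcs -f values/munich.yaml
```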

Data

The developed application allows traffic engineers to monitor the traffic situation in their city. Raw data collected from sensors throughout the city is published to a Kafka cluster prior to any processing. The cluster has different topics, and each topic stores data of a certain type. The type is defined by the kind of sensor that provides the raw data, e.g., a camera, a button, or a magnetic field detector. Once published to the cluster, the data is consumed and processed by the backend, which combines the different types of data.
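To make this concrete, a raw event from a magnetic field detector might look like the following. The payload shown here is an illustrative assumption; the actual schemas are not part of this article.

```json
{
  "sensorId": "D1",
  "sensorType": "magnetic_field",
  "intersectionId": "X-0042",
  "timestamp": "2022-12-06T08:15:30Z",
  "occupied": true
}
```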

Scalability and Extensibility

Flexibility in scale is key to making the system applicable in cities of different sizes. Additionally, the system needs to be extensible so that new monitors can be added without rewriting major parts of the application.

Following a microservice architecture, each monitor is deployed as a standalone Kafka Streams application (KSA). It combines the data coming from different Kafka topics, computes the required metrics according to the monitor’s definition, and publishes that information back to the Kafka cluster. Each application is written in Java and makes use of Kafka Streams.
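The skeleton of such a streaming application is sketched below. Topic names, record formats, and the metric computed here are illustrative assumptions, not the actual monitor definitions.

```java
// A minimal sketch of a monitor as a Kafka Streams application.
// Topic names, record formats, and the metric computed here are
// illustrative assumptions, not the actual monitor definitions.
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.StreamJoined;

public class WaitingTimeMonitor {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Raw sensor data, keyed by intersection id (assumed topic names).
        KStream<String, String> detectorEvents = builder.stream("raw.detector");
        KStream<String, String> signalEvents = builder.stream("raw.signal");

        // Join detector and signal events that arrive within one minute of
        // each other and derive the monitor's metric from the combined record.
        detectorEvents
            .join(signalEvents,
                  WaitingTimeMonitor::computeMetric,
                  JoinWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(1)),
                  StreamJoined.with(Serdes.String(), Serdes.String(), Serdes.String()))
            .to("monitor.waiting-time"); // enriched output, later picked up by Kafka Connect

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "waiting-time-monitor");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        new KafkaStreams(builder.build(), props).start();
    }

    // Placeholder for the monitor-specific computation.
    private static String computeMetric(String detectorEvent, String signalEvent) {
        return "{\"metric\":\"waiting_time\",\"detector\":" + detectorEvent
                + ",\"signal\":" + signalEvent + "}";
    }
}
```

Because every monitor follows this same pattern, adding a new one largely means writing a new topology and shipping it as its own container.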

Some of the advantages of this approach include:

  • Fast development of new monitors
  • Deployment of new monitors without affecting the active ones
  • Distribution of the processing load across multiple applications
  • Containment of a faulty monitor’s consequences to a single streaming application
  • Easy monitoring and debugging
  • Scaling according to the needs of a specific monitor

Persistence

After the raw data is enriched by a KSA, it is published back to the cluster. The output topic of each KSA is then forwarded by a Kafka Connect connector to a time-series database.
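The article does not name the specific connector in use; a common choice for this pattern is Confluent’s JDBC sink connector. Below is a minimal configuration sketch with assumed topic, database, and connector names:

```json
{
  "name": "monitor-metrics-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "monitor.waiting-time",
    "connection.url": "jdbc:postgresql://timescaledb:5432/traffic",
    "insert.mode": "insert",
    "auto.create": "false"
  }
}
```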

The system relies on a PostgreSQL database that is enhanced with TimescaleDB, which makes querying time-series data extremely performant compared to vanilla PostgreSQL while preserving all its functionality.
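As a sketch of how this looks in practice (the table and column names are assumptions), a regular table is turned into a hypertable with a single call and then queried with ordinary SQL:

```sql
-- Illustrative schema: one row per computed metric value.
CREATE TABLE monitor_metrics (
  time            TIMESTAMPTZ      NOT NULL,
  intersection_id TEXT             NOT NULL,
  monitor         TEXT             NOT NULL,
  value           DOUBLE PRECISION NOT NULL
);

-- Turn the plain table into a TimescaleDB hypertable, partitioned by time.
SELECT create_hypertable('monitor_metrics', 'time');

-- TimescaleDB's time_bucket() makes typical dashboard queries concise:
-- e.g., five-minute averages of a monitor's metric per intersection.
SELECT time_bucket('5 minutes', time) AS bucket,
       intersection_id,
       avg(value) AS avg_value
FROM monitor_metrics
WHERE monitor = 'waiting_time'
GROUP BY bucket, intersection_id
ORDER BY bucket;
```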

Backend Services

While each KSA is an independent service that serves a single purpose, the backend is split into two services with broader responsibilities: the data service and the configuration service. The data service provides the frontend with the requested data via a RESTful API. The configuration service handles data related to the actual configuration of the intersections and provides it to the user upon request, also via a RESTful API. An intersection configuration can be described with a JSON file that defines its structure, e.g., what kinds of detectors are placed at the intersection and how these detectors can be identified.
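A hypothetical example of such a configuration file (the structure shown here is an illustrative assumption):

```json
{
  "intersectionId": "X-0042",
  "version": 3,
  "detectors": [
    { "id": "D1", "type": "magnetic_field", "lane": "north-inbound" },
    { "id": "D2", "type": "camera", "lane": "west-inbound" }
  ],
  "signalGroups": ["K1", "K2"]
}
```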

A traffic engineer may need to adapt the configuration of an intersection, and this change should be reflected in the monitoring system; handling it is another task of the configuration service. Versioning of intersections and of the monitors themselves is crucial for traffic engineers, allowing them to analyze past data and fine-tune the programs controlling the traffic lights. Changes to an intersection’s configuration are tracked and accurately reflected in the metrics defined by the affected monitors.
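One way to realize such versioning (a minimal sketch with assumed table and column names, not necessarily the actual schema) is to store every configuration change as a new immutable version, so that past metrics can always be interpreted against the configuration that was active when they were recorded:

```sql
-- Each configuration change creates a new immutable version.
CREATE TABLE intersection_config (
  intersection_id TEXT        NOT NULL,
  version         INT         NOT NULL,
  valid_from      TIMESTAMPTZ NOT NULL,
  config          JSONB       NOT NULL, -- the JSON document described above
  PRIMARY KEY (intersection_id, version)
);
```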

Conclusion

Thanks to this solid foundation, the development team at Concept Reply could focus on delivering features on time and according to the customer’s needs. Progress has not been hindered by the complexity of the system or by technical requirements and limitations.

Follow us and stay tuned for the next part, which will focus on how the frontend of the system is built.

Appendix — Tools and Tech Stack

Here is an overview of the technology stack that Concept Reply has used:

  • Spring Boot for the development of the backend services
  • Kafka Streams for the development of streaming applications that enrich raw data
  • PostgreSQL with TimescaleDB for performant and efficient time-series data storage and retrieval
  • Lens for quickly accessing the deployed services in the cluster
  • Grafana for accessing the logs of the deployed services
  • Helm for managing the deployment of the services
  • Maven for managing the projects’ dependencies
  • React for building the frontend

Resources

[1] https://journals.sagepub.com/doi/abs/10.1177/0885412211409754

