
Urban traffic flow prediction has long been an important area of smart city development. With the development of edge computing in recent years, network edge nodes in smart cities can collect and process many types of urban traffic data in real time, which makes it possible to deploy intelligent traffic prediction with real-time analysis and timely feedback at the edge. Urban traffic flow is strongly nonlinear, depends on multiple dynamic and static factors, and is increasingly difficult to predict over short horizons in a metropolitan area. To address this, the paper proposes an urban traffic flow prediction model based on chaotic particle swarm optimization and a smooth support vector machine (CPSO/SSVM). The model constructs a new second-order smooth function to achieve better approximation and regression and further improves the computational efficiency of the smooth support vector machine algorithm through chaotic particle swarm optimization. Simulation results show that the model predicts urban traffic flow accurately.

The smart city concept has become popular in recent years because it shows great potential for improving urban management and people’s lives [

With the surge of data volume, cloud computing has encountered unforeseen challenges [

Edge computing is the paradigm proposed to handle these issues. “Edge” indicates proximity to the user or to the sources that generate data, so edge computing provides computation, data storage, and network support at the edge. Storage and computation tasks are moved from cloud servers to the edge, as indicated in Figure

Technical architecture for smart city edge computing.

Edge computing has demonstrated great advantages from the perspectives of wide connection, distributed computation, proximity to a data source, and low latency. Edge computing can provide better capability in the areas of data filtering and compression, situational awareness, and data classification, which has laid a strong technical foundation for the application of big data analysis, traffic management, and urban environmental monitoring [

There are some differences between edge data processing and traditional data processing techniques. Firstly, smart city data are heterogeneous in nature. They normally come from governance, public security, environment, transportation, the internet, and the IoT, all of which generate data from multiple sources in different modalities. Secondly, access restrictions on the edge nodes must be taken into consideration during edge processing. Lastly, the cooperation between edge and cloud needs to be taken into consideration [

Therefore, this paper focuses on urban data processing at the edge, particularly small-scale urban data that are modest in volume but significant in daily urban governance; successful application of these data would eventually lead to better-informed decisions in urban governance and planning. The contributions of the article are as follows: (1) it proposes a data application framework for urban edge computing based on an analysis of urban data processing requirements, and (2) it proposes a new CPSO/SSVM algorithm, which constructs a new second-order smooth function to achieve better approximation and regression, while CPSO further improves the efficiency of the SVM. The prediction results are satisfactory in accuracy and stability when processing traffic flow data from the edge. The rest of the paper is organized as follows: part 2 introduces related work on edge computing and traffic flow prediction, part 3 designs the urban traffic prediction model under the edge computing framework, part 4 tests the model, and part 5 concludes.

Big data and edge computing are both active research topics in academia, and some scholars have begun to pay attention to big data processing at the edge. Yang and Liu [

The application of data processing in smart cities has also received considerable attention in recent years. Lau et al. [

Prediction algorithms for urban traffic have also received considerable attention in recent years. Jiang and Adeli [

Based on the inherent properties of urban traffic flow and on factors that influence the efficiency of the intelligent transportation system (ITS), such as proximity to the edge, mobility, and heterogeneity, the paper proposes an application architecture for urban traffic data and designs a traffic prediction model to better handle traffic flow data from the edge. The overall architecture has five layers: edge data collection, edge computation, data storage, data processing and computing, and data analysis and visualization, as illustrated in Figure

Technical architecture for urban traffic big data.

The edge data collection layer is the “entrance” of data for the overall technical architecture; it is mainly responsible for collecting data from the road traffic network. Real-time traffic data are mainly generated by inductive loop detectors, toll checkpoints, in-vehicle GPS, etc. The data are large in scale and heterogeneous in nature. At this layer, it is necessary to build a channel for data acquisition so that traffic big data can converge to the edge nodes.

The edge computation layer provides a timely response from close proximity at the edge of the network. Data fusion methods are normally applied at this layer to deal with heterogeneous traffic information collected from multiple sources, and preliminary preprocessing is also required here for data quality control. The edge node is responsible for processing data from a larger area of the local area network and provides extensible data processing capabilities. The edge computation layer mainly performs tasks such as data cleaning, data integration, and data deduplication.
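The cleaning and deduplication tasks named above can be illustrated with a minimal sketch. The record layout (`intersection_id`, `timestamp`, `flow`), the validity rule, and the sample values are assumptions made for illustration only; the paper does not specify them.

```python
from typing import Dict, Iterable, List


def clean_and_deduplicate(records: Iterable[Dict]) -> List[Dict]:
    """Drop invalid readings and keep the first record per (intersection, timestamp)."""
    seen = set()
    cleaned = []
    for rec in records:
        flow = rec.get("flow")
        # Data cleaning (assumed rule): discard missing or negative flow counts.
        if flow is None or flow < 0:
            continue
        key = (rec.get("intersection_id"), rec.get("timestamp"))
        # Deduplication: the same reading may reach the edge node from several sources.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Hypothetical sample readings, not taken from the paper's data set.
raw = [
    {"intersection_id": 12, "timestamp": "08:00", "flow": 41},
    {"intersection_id": 12, "timestamp": "08:00", "flow": 41},  # duplicate
    {"intersection_id": 12, "timestamp": "08:05", "flow": -1},  # invalid
    {"intersection_id": 13, "timestamp": "08:00", "flow": 57},
]
result = clean_and_deduplicate(raw)
print(len(result), "records kept")
```

Keeping this step on the edge node means only validated records need to be forwarded to the storage layer.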

The data storage layer stores the data extracted from the edge data collection layer, either temporarily or permanently, in the edge device. In the urban traffic big data scenario, the data stored at this layer fall into three categories: traffic flow data, weather data, and street view data from different POIs. Historical traffic flow data are used to train traffic flow prediction models, and real-time traffic flow data are used to evaluate prediction quality; the model is then improved based on the evaluation results. This layer supplies the data required by the computing service layer.

The data processing and computing service layer is the core functional layer of the architecture, whose purpose is to provide users with accurate traffic flow prediction. This layer provides a traffic flow preprocessing function and a library of traffic flow prediction algorithms. Preprocessing is mainly carried out automatically by the system. A variety of prediction algorithms are provided, including the smooth SVM algorithm, the traditional support vector regression (SVR) algorithm, SVR with a chaotic genetic algorithm (CGA-SVR), the back propagation neural network (BPNN), the autoregressive integrated moving average (ARIMA) model, and other traffic flow prediction models; users can select the appropriate algorithm for different scenarios.

The data analysis and visualization service layer interacts with the analytical interface for ITS. With the support of the data processing and computing service layer, prediction results can be obtained more efficiently. Meanwhile, analytical and visualization methods enhance users’ ability to acquire in-depth information. A RESTful architecture is used between the computing service layer and the visualization service layer, which achieves loosely coupled connections between modules. Through the data analysis and visualization service layer, a more flexible and comprehensive interaction between the user and the data is achieved.

The combination of edge computing and data analytics is powerful because it can give edge users timely and accurate decision support for traffic management during rush hours. By deploying intelligent algorithms close to the edge computing layer, analytical results can be quickly shared across edge networks. This is vital to ITS because traffic flow management depends heavily on efficient information sharing, and road safety can be improved by providing accurate and timely traffic feedback to drivers. The urban traffic flow prediction model can be applied in the urban traffic data application architecture in two ways. Firstly, it can be deployed in the edge computing layer to carry out instant analytical tasks that predict fluctuating traffic flow; the results can be shared with drivers through edge networks in a timely manner. Secondly, the model can be applied in the data analysis and visualization service layer in the ITS enterprise cloud to analyze heterogeneous traffic data from multiple sources and put forward informed suggestions to the key stakeholders who oversee the entire metropolitan traffic network.

Urban traffic flow is a crucial part of smart city management and involves multiple influencing factors; in a metropolitan area, the problem is even more complex. The factors influencing urban traffic flow normally include flow from adjacent traffic nodes, weather conditions, and points of interest (POIs) such as nearby schools, hospitals, and shopping malls. These factors can be studied thanks to the many kinds of traffic data acquired from the edge nodes, including spatial data, geographical information, road network data, traffic flow data, weather condition data, and traffic management data. The paper is based on traffic flow data from Guiyang City, with a spatial span of 717 intersections and a temporal span of 6 months; the experimental data types are shown in Table

Experimental data type.

Data source | Data type | No. of indicators | Content of indicators |
---|---|---|---|
Traffic flow data | Continuous variable | 1 | Traffic flow |
Weather data | Discrete variable | 4 | Fog, haze, rain, snow |
POI data | Discrete variable | 10 | Adjacent traffic nodes within 500 meters, no. of shopping malls, no. of schools, no. of hospitals, no. of tourist sites, no. of bus stations, no. of restaurants, no. of hotels, no. of supermarkets |

The indicators of weather data are as follows:

Fog: the level of fog can be graded as minor fog, fog, heavy fog, dense fog, and heavy dense fog

Haze: the level of haze can be graded as light haze, haze, and heavy haze

Rain: the level of rain can be graded as light rain, medium rain, heavy rain, storm rain, and heavy storm rain

Snow: the level of snow can be graded as light snow, medium snow, and heavy snow

Urban traffic prediction models based on machine learning are key to the construction of ITS. They provide technical support for urban traffic management, especially flow control at busy traffic nodes in the metropolitan area. Urban traffic is a complex and highly dynamic system, which makes it difficult to analyze over short time horizons, and the increasing randomness within the urban intelligent system makes traffic flow prediction harder still. Short-term traffic flow is a key part of urban traffic big data; its basic characteristics include nonlinearity, randomness, and uncertainty.

Currently, urban traffic control and route guidance mostly operate in a preset manner; only a few cities apply a self-adaptive control mode based on traffic flow detection. To make up for this inefficiency, various machine learning methods have been introduced to build prediction techniques. SVM has been widely applied to traffic flow prediction. Based on structural risk minimization, SVM has clear advantages in handling small samples, nonlinearity, the curse of dimensionality, overfitting, and local minima, which simplifies classification and regression in traffic flow analysis. SVM therefore shows promise for intelligent traffic control and guidance, which can ease traffic congestion in the metropolitan area.

The paper proposes an urban traffic prediction model based on CPSO/SSVM, which predicts short-term traffic flow at city intersections by considering multiple factors, including POIs and weather conditions, and achieves better prediction results than the traditional SVM algorithm.

The standard SVM algorithm is as follows [

In 2005, Lee et al. from Taiwan University introduced the concept of a smooth function to improve SVM by replacing the nondifferentiable function [

The objective function contains a nondifferentiable function. The objective is strongly convex and has a unique solution; however, because the nondifferentiable function is not smooth, a smooth function is required to approximate it arbitrarily closely. Lee et al. integrated a sigmoid function [

By taking an integral function of

In SSVM, the nondifferentiable function is replaced by a smooth function to achieve smoothing. Different choices of smooth function with good approximation properties have in turn led to several SSVM algorithms.
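As an illustration of the smoothing idea, the sketch below evaluates the classical approximation obtained by integrating the sigmoid function $1/(1+e^{-\alpha x})$, namely $p(x,\alpha) = x + \frac{1}{\alpha}\ln(1+e^{-\alpha x})$, against the nondifferentiable plus function $x_{+} = \max(x, 0)$. This is the widely used textbook form and is assumed here; it is not necessarily the exact expression of any formula in this paper, and the function names and the values of $\alpha$ are chosen for illustration.

```python
import math


def plus(x: float) -> float:
    """Nondifferentiable plus function max(x, 0)."""
    return max(x, 0.0)


def smooth_plus(x: float, alpha: float) -> float:
    """Sigmoid-integral smoothing p(x, alpha) = x + ln(1 + exp(-alpha * x)) / alpha.

    Written in an algebraically equivalent form that avoids overflow
    for large negative x.
    """
    return max(x, 0.0) + math.log1p(math.exp(-alpha * abs(x))) / alpha


# The largest gap between the two functions occurs at x = 0 and equals ln(2) / alpha,
# so the approximation tightens as the smoothing parameter alpha grows.
for alpha in (1.0, 10.0, 100.0):
    gap = smooth_plus(0.0, alpha) - plus(0.0)
    print(f"alpha={alpha:6.1f}  gap at x=0: {gap:.6f}")
```

Because the smooth function is differentiable everywhere, the resulting objective can be minimized with fast gradient-based or Newton-type methods.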

In 2005, Yuan et al. from UESTC proposed the following function [

In 2013, Wu et al. from XUPT proposed two second-order smooth functions as follows [

By a piecewise smooth function from formula (^{nd} smooth function; the approximation effect and final regression effect of the function are more superior than formula (

The paper has set up a new SSVM algorithm (Ma-Liu Piecewise Smooth Support Vector Machine, MLSSVM) as follows:

The smooth function

^{nd}-order smooth

For

When ^{nd}-order smooth about

When

When

The smooth function in this paper has a good degree of approximation under the same

Approximation accuracy of four smooth functions.

To further improve the computing efficiency of the SSVM algorithm, the chaotic particle swarm algorithm, which has good optimization characteristics, is introduced to optimize the penalty coefficient, the insensitivity parameter, and the slack variable [

Chaos follows its own deterministic law and is also pseudorandom. The paper takes advantage of these two characteristics so that the search can traverse states without repetition. The paper applies the logistic equation to build a chaotic optimization sequence, as in formula (

In formula (

When
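For illustration, a chaotic sequence of this kind can be generated with the standard logistic map $z_{n+1} = \mu z_n (1 - z_n)$. The sketch below assumes this standard form with $\mu = 4$, the usual fully chaotic setting; the function names, the seed value, and the interval mapping are assumptions for illustration, not values taken from the paper.

```python
from typing import List


def logistic_sequence(z0: float, length: int, mu: float = 4.0) -> List[float]:
    """Iterate z_{n+1} = mu * z_n * (1 - z_n) starting from z0 in (0, 1)."""
    if not 0.0 < z0 < 1.0:
        raise ValueError("z0 must lie strictly between 0 and 1")
    seq = [z0]
    for _ in range(length - 1):
        seq.append(mu * seq[-1] * (1.0 - seq[-1]))
    return seq


def to_interval(z: float, lower: float, upper: float) -> float:
    """Map a chaotic variable in [0, 1] onto a parameter range [lower, upper]."""
    return lower + z * (upper - lower)


# Seeds such as 0.25, 0.5, and 0.75 are avoided because they fall onto fixed points.
sequence = logistic_sequence(0.3, 50)
print([round(z, 4) for z in sequence[:5]])
```

Mapping each chaotic variable onto the search range of a parameter gives deterministic but pseudorandom candidate values.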

This paper uses two characteristics of chaos, namely its pseudorandomness and its own deterministic law, to initialize the position and velocity of the particles and thereby enhance the search capability of the swarm. Assume that the following formulas hold:

In formula (

Assume objective function is as follows:

The optimized process for the particle swarm algorithm is shown in Figure

Diagram of the particle swarm algorithm.

The specific procedures of the adaptive optimization algorithm are as follows [

Chaos initialization of corresponding parameters of a particle swarm algorithm

Comparison and optimization of the fitness level obtained from step (1)

Comparison between optimal fitness

Update the particle’s position and velocity

Chaotic optimization of the optimal position

In the original solution space, obtain a feasible solution

After steps (1)-(7), if the set optimization conditions are satisfied, stop the search and output the optimal solution and the best position; otherwise, return to step (2) and repeat the operation
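The procedure above can be sketched as a compact chaotic particle swarm loop. This is a minimal illustration under stated assumptions, not the paper's implementation: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the search radius of the chaotic local search, and the sphere test objective are all hypothetical choices. In the paper's setting the objective would instead be the SSVM error as a function of its parameters.

```python
import random
from typing import Callable, List, Tuple


def logistic_step(z: float, mu: float = 4.0) -> float:
    """One iteration of the logistic map used as the chaos generator."""
    return mu * z * (1.0 - z)


def cpso_minimize(objective: Callable[[List[float]], float],
                  bounds: List[Tuple[float, float]],
                  n_particles: int = 20, n_iter: int = 50, seed: int = 0):
    rng = random.Random(seed)
    dim = len(bounds)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed PSO coefficients

    def scale(z: float, d: int) -> float:
        return bounds[d][0] + z * (bounds[d][1] - bounds[d][0])

    def clip(x: float, d: int) -> float:
        return min(max(x, bounds[d][0]), bounds[d][1])

    # Chaotic initialization of positions and velocities.
    z = [rng.uniform(0.05, 0.95) for _ in range(dim)]
    positions, velocities = [], []
    for _ in range(n_particles):
        z = [logistic_step(v) for v in z]
        positions.append([scale(z[d], d) for d in range(dim)])
        z = [logistic_step(v) for v in z]
        velocities.append([0.1 * (scale(z[d], d) - positions[-1][d]) for d in range(dim)])

    # Evaluate fitness and record personal and global bests.
    pbest = [p[:] for p in positions]
    pbest_val = [objective(p) for p in positions]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]

    for _ in range(n_iter):
        for i in range(n_particles):
            # Update velocity and position of each particle.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (pbest[i][d] - positions[i][d])
                                    + c2 * r2 * (gbest[d] - positions[i][d]))
                positions[i][d] = clip(positions[i][d] + velocities[i][d], d)
            val = objective(positions[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = positions[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = positions[i][:], val
        # Chaotic search around the current best position, mapped back to the solution space.
        z = [logistic_step(v) for v in z]
        candidate = [clip(gbest[d] + 0.1 * (2.0 * z[d] - 1.0) * (bounds[d][1] - bounds[d][0]), d)
                     for d in range(dim)]
        cand_val = objective(candidate)
        if cand_val < gbest_val:
            gbest, gbest_val = candidate, cand_val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x: List[float]) -> float:
    """Hypothetical test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f, history = cpso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_x, best_f)
```

The recorded `history` never increases, since the global best is replaced only by strictly better candidates; a fixed iteration count stands in for the stopping condition here.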

The SVM prediction method is applied to predict multisource urban traffic flow data. It inherits the relevant ideas of machine learning. Through continuous training and learning of the prediction model, the goal of effective prediction is finally achieved. The process mainly includes two parts, the training process and the testing process. Figure

Framework for the CPSO/SSVM-based urban traffic flow prediction model.

In this paper, multisource urban traffic flow data are used as the model input and go through five stages: data collection, data preprocessing, data normalization processing, SSVM construction, and optimization problem solving. The preprocessed data consist of training data and test data. The training data set is used to train the CPSO/SSVM model, and the test data set is then used to test the performance of the established prediction model. The performance of the model is improved by continual learning and adjustment, which eventually leads to automatic prediction of urban traffic flow. The execution steps of the urban traffic flow prediction model based on CPSO/SSVM are as follows.

Data collection stage: collect traffic flow data, weather condition data, and POI (points of interest) data from various sources

Data preprocessing stage: the collected multisource urban intersection traffic flow data go through data cleaning and preprocessing, taking into account the generality of the application scenarios of the algorithm. First, all 9,577,708 traffic records from 717 intersections were preprocessed on a Python 3.8 platform and all invalid records were removed. A descriptive statistical analysis was then carried out to select key urban intersections; those with a higher average flow rate and a larger number of surrounding POIs are used as the model input

Data normalization processing: normalize the multisource urban traffic flow data, which includes quantifying the collected POI information of each city intersection and the weather information of the day, and apply the normalization algorithm to all model variables to form a unified metric

Construct a smooth support vector machine: construct a smooth support vector machine algorithm model

Optimization problem solving: construct a second-order smooth kernel function and solve the optimization problem with the SSVM algorithm model to generate prediction results
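The quantification and normalization stage above can be sketched as follows. The paper does not state which encoding or normalization formula it uses, so this sketch assumes an ordinal encoding of the weather grades (with an added hypothetical "none" level for days without the phenomenon) and min-max scaling to [0, 1]; the sample flow values are made up.

```python
from typing import List, Sequence

# Grades as listed for the weather indicators; "none" is an assumed extra level.
FOG_LEVELS = ["none", "minor fog", "fog", "heavy fog", "dense fog", "heavy dense fog"]
RAIN_LEVELS = ["none", "light rain", "medium rain", "heavy rain", "storm rain", "heavy storm rain"]


def encode_level(label: str, levels: Sequence[str]) -> int:
    """Quantify a graded weather label as its ordinal position."""
    return levels.index(label)


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample: three flow readings and the fog grade on each day.
flows = [10, 20, 30]
fog = [encode_level(label, FOG_LEVELS) for label in ("none", "heavy fog", "fog")]

print(min_max_normalize(flows))
print(min_max_normalize(fog))
```

Putting every variable on the same scale prevents the traffic counts, which are numerically much larger than the grades, from dominating the kernel computation.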

On the MATLAB R2014a platform, this paper builds a traffic flow prediction algorithm using the optimized parameters from part 3 and uses particle swarm optimization-smooth support vector regression to predict traffic flow. The experimental data set is the cross-section flow data of Guiyang City, Guizhou Province (5 min interval). A total of 200 intersections with high average traffic flow were selected, with 10 flow records per intersection, giving 2000 flow records for model testing. Of these, 1989 records form the training set and the last 11 records form the test set. To verify the prediction performance of the proposed algorithm, this paper uses a genetic-BP neural network [

Analysis of prediction results.

It can be concluded from Figure ^{th}, 1996^{th}, 1998^{th}, and 1999^{th} sample data have selected for comparison under the same dimension. The comparison result is as in Figure

Comparison of sample data under the same dimension.

From Figure ^{th} and 1999^{th} sample data, LS-SVM algorithm has shown large deviation at 1996^{th} sample data. Therefore, the algorithm put forward by the paper has a better performance in prediction.

To facilitate the analysis, the relative error is introduced. The relative error data are shown in Table

Relative error of prediction results of three algorithms.

Algorithm | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 |
---|---|---|---|---|---|---|---|---|---|---|---|
GA-BP | 11.91 | 7.97 | 1.92 | 3.95 | 4.11 | 4.06 | 4.14 | 24.50 | 16.75 | 7.28 | 5.25 |
LS-SVM | 5.95 | 6.00 | 4.88 | 2.73 | 4.18 | 1.49 | 5.70 | 19.00 | 15.38 | 2.89 | 9.75 |
MLSSVM | 2.59 | 3.95 | 0.22 | 0.95 | 2.32 | 1.11 | 1.77 | 2.00 | 4.85 | 0.32 | 0.25 |
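The tabulated errors can be summarized with a short calculation. The sketch below assumes the usual definition of relative error, $|\hat{y} - y| / y \times 100$, and assumes the table values are percentages; the table itself does not state the unit. The per-sample values are copied from the table above.

```python
from statistics import mean


def relative_error(predicted: float, actual: float) -> float:
    """Relative error in percent (assumed definition)."""
    return 100.0 * abs(predicted - actual) / actual


# Relative errors for test samples 1990-2000, copied from the table.
errors = {
    "GA-BP": [11.91, 7.97, 1.92, 3.95, 4.11, 4.06, 4.14, 24.50, 16.75, 7.28, 5.25],
    "LS-SVM": [5.95, 6.00, 4.88, 2.73, 4.18, 1.49, 5.70, 19.00, 15.38, 2.89, 9.75],
    "MLSSVM": [2.59, 3.95, 0.22, 0.95, 2.32, 1.11, 1.77, 2.00, 4.85, 0.32, 0.25],
}

mean_error = {name: mean(values) for name, values in errors.items()}
max_error = {name: max(values) for name, values in errors.items()}

for name in errors:
    print(f"{name:7s} mean={mean_error[name]:.2f}  max={max_error[name]:.2f}")
```

Over the 11 test samples the mean relative error is about 8.35 for GA-BP, 7.09 for LS-SVM, and 1.85 for MLSSVM, and the largest single error of MLSSVM is 4.85.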

As can be concluded from Figure

Analysis of relative error.

To further analyze the characteristics of the three algorithms, the time cost of each algorithm during prediction was measured for different numbers of samples. The specific results are shown in Table

Comparison of the time overhead of the three algorithms.

Algorithm | No. of samples (500) time overhead ( | No. of samples (1000) time overhead ( | No. of samples (2000) time overhead ( |
---|---|---|---|
GA-BP | 0.0158 | 0.0276 | 0.0394 |
LS-SVM | 0.0162 | 0.0287 | 0.0299 |
Proposed algorithm | 0.0147 | 0.0196 | 0.0255 |
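To make the time comparison concrete, the relative reduction in time overhead of the algorithm of this paper (last table row) against each baseline can be computed directly from the tabulated values. The sketch below is only this arithmetic; the ratio does not depend on the time unit, and the dictionary key `"Proposed"` is a label chosen here.

```python
# Time overhead for 500, 1000, and 2000 samples, copied from the table.
time_overhead = {
    "GA-BP": {500: 0.0158, 1000: 0.0276, 2000: 0.0394},
    "LS-SVM": {500: 0.0162, 1000: 0.0287, 2000: 0.0299},
    "Proposed": {500: 0.0147, 1000: 0.0196, 2000: 0.0255},
}


def percent_reduction(baseline: float, proposed: float) -> float:
    """Percentage by which the proposed time is lower than the baseline time."""
    return 100.0 * (baseline - proposed) / baseline


reductions = {
    name: {n: percent_reduction(t, time_overhead["Proposed"][n]) for n, t in times.items()}
    for name, times in time_overhead.items()
    if name != "Proposed"
}

for name, by_size in reductions.items():
    summary = ", ".join(f"{n} samples: {r:.1f}%" for n, r in by_size.items())
    print(f"vs {name}: {summary}")
```

For every sample size in the table, the overhead of the proposed algorithm is lower than that of both baselines; the reduction ranges from about 7% (GA-BP, 500 samples) to about 35% (GA-BP, 2000 samples).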

It can be known from Table

In conclusion, the SSVM algorithm proposed in this paper achieves better prediction accuracy in traffic flow management, along with better robustness and rapid adaptability. The algorithm can meet the low-latency requirements of processing heterogeneous data at the edge, which can benefit future research that combines edge computing and big data analytics.

In this paper, a CPSO/SSVM model is constructed to predict traffic flow at intersections in Guiyang City. The CPSO/SSVM model achieves better approximation and regression by constructing a new second-order smooth function and, at the same time, further improves the computational efficiency of the SSVM regression algorithm through particle swarm optimization. The experimental results indicate that the CPSO/SSVM model produces more accurate results than the GA-BP and LS-SVM algorithms. The model has strong information processing and prediction capabilities and can be applied to complex nonlinear problems, especially traffic flow prediction at urban intersections, which usually involve complex scenes and various disturbance factors. The model provides an alternative solution for research on data-driven urban traffic flow forecasting and extends the application of the SVM algorithm to short-term urban traffic flow prediction. Its output accuracy is high, and it can be deployed in ITS to achieve short-term traffic flow prediction, which has high application value for smart city development and real-time traffic management in edge computing scenarios.

The paper is based on traffic flow data from Guiyang City, with a spatial span of 717 intersections and a temporal span of 6 months. The experimental data set is the cross-section flow data of Guiyang City, Guizhou Province (5 min interval).

The authors declare that there is no conflict of interest regarding the publication of this paper.

The research is part of the author’s employment to explore potential applications of big data analytics in smart city development and urban traffic planning. The employer is Zhejiang University.