Nicolas Saunier edited section_Introduction_The_use_of__.tex  almost 9 years ago

Commit id: 9a91e7ae3aa144bdecbbe3f59fa5c7dea93fbe5d


\section{Introduction}

The use of video data for automatic traffic data collection and analysis has been on an upward trend as more powerful computational tools and detection and tracking technology become available. Video sensors have long been able to emulate inductive loops to collect basic traffic variables such as counts and speed, as in the commercial system Autoscope \cite{michalopoulos91autoscope}; they can also provide higher-level information regarding road user behavior and interactions with increasing accuracy. Examples include pedestrian gait parameters \cite{saunier11stride-length-trr}, crowd dynamics~\cite{johansson08crowd} and surrogate safety analysis applied to motorized and non-motorized road users in various road facilities~\cite{St_Aubin_2013,Sakshaug_2010,Autey_2012}. Video sensors are relatively inexpensive and easy to install, or are already installed, for example by transportation agencies for traffic monitoring: large datasets can therefore be collected for large-scale or long-term traffic analysis. This so-called ``big data'' phenomenon offers opportunities to better understand transportation systems, while presenting its own set of challenges for data analysis~\cite{st-aubin15big-data}.

Despite the undeniable progress of video sensors and computer vision algorithms in their varied transportation applications, there is still a distinct lack of large-scale comparisons of the performance of video sensors in varied conditions, defined for example by the complexity of the traffic scene (movements and mix of road users), the characteristics of the cameras~\cite{Wan_2014} and their installation (height, angle), the environmental conditions (e.g.\ the weather)~\cite{Fu_2015}, etc. Such comparisons are particularly hampered by the poor characterization of the datasets used for performance evaluation and by the limited availability of benchmarks and public video datasets for transportation applications~\cite{saunier14dataset}.
Tracking performance is often reported using ad hoc and incomplete metrics such as ``detection rates'' instead of standard and more suitable metrics such as CLEAR MOT~\cite{Bernardin_2008}. Finally, computer vision algorithms are typically adjusted manually by trial and error using a small dataset covering few of the conditions that affect performance, and performance evaluated on that same dataset is therefore over-estimated: as is standard practice in other fields such as machine learning, algorithms should be systematically optimized on a calibration dataset, while performance should be reported for a separate validation dataset~\cite{ettehadieh15systematic}. While the performance of video sensors for simpler traffic variables has been studied more extensively, not all factors have been systematically analyzed, and issues with parameter optimization and the lack of separate calibration and validation datasets abound. Besides, the relationship between tracking performance and performance for traffic parameters has never been investigated.
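
For reference, the CLEAR MOT metrics cited above are commonly summarized by two quantities, the multiple object tracking accuracy (MOTA) and the multiple object tracking precision (MOTP). The following is a sketch of their usual definitions, with notation introduced here for illustration following~\cite{Bernardin_2008}: for each frame $t$, $g_t$ is the number of ground truth objects, $m_t$ the number of misses, $fp_t$ the number of false positives, $mme_t$ the number of identity mismatches, $c_t$ the number of matched object-hypothesis pairs, and $d_t^i$ the distance between matched object $i$ and its hypothesis.
\begin{equation}
\mathrm{MOTA} = 1 - \frac{\sum_t \left( m_t + fp_t + mme_t \right)}{\sum_t g_t}, \qquad
\mathrm{MOTP} = \frac{\sum_{i,t} d_t^i}{\sum_t c_t}
\end{equation}
Unlike a single detection rate, MOTA accounts for missed objects, false alarms and identity switches together, while MOTP measures localization error separately.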