\documentclass[num-refs]{wiley-article}
\usepackage{graphicx}
\usepackage[space]{grffile}
\usepackage{latexsym}
\usepackage{textcomp}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{booktabs,array,multirow}
\usepackage{amsfonts,amsmath,amssymb}
\usepackage{natbib}
\usepackage{url}
\usepackage{hyperref}
\hypersetup{colorlinks=false,pdfborder={0 0 0}}
\usepackage{etoolbox}
\makeatletter
\patchcmd\@combinedblfloats{\box\@outputbox}{\unvbox\@outputbox}{}{%
\errmessage{\noexpand\@combinedblfloats could not be patched}%
}%
\makeatother
% You can conditionalize code for latexml or normal latex using this.
\newif\iflatexml\latexmlfalse
\providecommand{\tightlist}{\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}%
\AtBeginDocument{\DeclareGraphicsExtensions{.pdf,.PDF,.eps,.EPS,.png,.PNG,.tif,.TIF,.jpg,.JPG,.jpeg,.JPEG}}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\papertype{Original Article}
\title{{Human-like Interactive Behavior Generation for Autonomous Vehicles: A
Bayesian Game-theoretic Approach with Turing Test}}
\author[1]{Yiran Zhang }
\author[2]{Peng Hang}
\author[3]{Chao Huang}
\author[4]{Chen Lv}
\affil[1]{Affiliation not available}
\affil[2]{Affiliation not available}
\affil[3]{Affiliation not available}
\affil[4]{Affiliation not available}
\runningauthor{Yiran Zhang }
\begin{document}
\maketitle
\selectlanguage{english}
\begin{abstract}
Interacting with surrounding road users is a key feature of vehicles and
is critical for intelligence testing of autonomous vehicles. Existing
interaction modalities in autonomous vehicle simulation and
testing are not sufficiently smart and can hardly reflect human-like
behaviors in real-world driving scenarios. To further improve the
technology, in this work we present a novel hierarchical
game-theoretical framework to represent naturalistic multi-modal
interactions among road users in simulation and testing, which is then
validated by the Turing test. Given that human drivers have no access to
complete information about the surrounding road users, Bayesian
game theory is used to model the decision-making process. Then, a
probing behavior is generated by the proposed game-theoretic model and
is further applied to control the vehicle via a Markov chain. To validate
the feasibility and effectiveness, the proposed method is tested through
a series of experiments and compared with existing approaches. In
addition, Turing tests are conducted to quantify the human-likeness of
the proposed algorithm. The experimental results show that the proposed
Bayesian game-theoretic framework can effectively generate
representative scenes of human-like decision-making during autonomous
vehicle interactions, demonstrating its feasibility and effectiveness.
Corresponding author(s) Email:~{ \emph{~lyuchen@ntu.edu.sg ~}}%
\end{abstract}%
\subsection*{ToC Figure}
{\label{243685}}\par\null\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.84\columnwidth]{figures/blank/Case-TOC-Kao-Bei-}
\caption{{~For the driving intelligence test, the road users in the test scene
should be interactive and human-like. A generic approach is proposed,
inspired by the behavior of real human drivers. First, the road
users estimate the aggressiveness of the tested vehicle by observation.
Then the observation is fed to the Bayesian game-theoretic decision
module. Based on the decision results, the road users can generate three
different behaviors: to yield, to fight, or to probe.
{\label{506762}}%
}}
\end{center}
\end{figure}
\section*{Introduction}
{\label{121439}}
In the context of future smart mobility, there is an intensifying demand
for naturalistic scene generation for automated vehicle simulation,
intelligence testing and algorithm validation\cite{Feng2021}. In the
foreseeable future, a mixture of human-driven vehicles, pedestrians, and
intelligent autonomous agents will be on the roads, sharing the right of
way and interacting with one another. In order to
generate high-fidelity scenes that represent this new transportation
modality, the interactive behaviors among heterogeneous traffic
participants should be carefully considered. Conventional human
traffic participants, including pedestrians, cyclists, and
human-driven cars, usually do not follow pre-defined trajectories or
patterns, and their behaviors are difficult to predict in the real world.
However, their decisions and actions are correlated, i.e. one's decision is
made based on the constraints imposed by surrounding road users, and one's
behavior in turn affects those nearby\cite{Huang_2021,Yu_2018,Hang_2020}.
Besides, as a human has limited perception capability, the information
one can obtain from the surrounding environment is limited
\cite{Dingus_2016,Li_2016,Kuo_2019,Hu_2021}. Further, their individual behaviors are usually
highly personalized, as different road users have diverse travel
demands, preferences and habits\cite{Fridman_2019,Marina_Martinez_2018,Sama_2020,Xing_2020}. Thus, for scene
generation in autonomous driving, it is worthwhile to explore
intelligent methods that can realize naturalistic and human-like
interactive behaviors between intelligent agents. Instead of
constructing comprehensive, large-scale scenes, we focus on
the intelligent representation of interacting moments. During
interactions, the specific decision or intent of a road user is
generally not available to the surrounding ones. However, through
driving performance observation or driving style recognition, it is
possible to infer their intents or possible actions using the trajectory
prediction or aggressiveness estimation\cite{Huang_2021a}, which is
crucial in competing for the right of way. Moreover, one's
aggressiveness or behavior pattern may change as the
situation and demand vary, which makes the interactions
game-like\cite{Hang_2020,Hang_2021,Liniger_2020}, with incomplete information.
The understanding and modeling of interaction modalities among various
road users, including cars, pedestrians, and cyclists, is critical,
because information exchanges, time-varying reactions, and mutual
influences would exert great impacts on the results of scene generation.
Considering the above facts, in the context of human-like interactive
scene generation for autonomous driving, open challenges remain: What
is the best strategy to win the right of way during interactions? And
what is the human's winning mechanism during interactions? To address
these questions, the decision logic behind the interaction, with
consideration of aggressiveness, should be explored first. Beyond
this, the representation of human-likeness and a method for quantifying
it should be investigated as well.
To be more specific, we list some representative interactions and
possible conflicts in \textbf{Figure~{\ref{243685}}}. The first situation is a vehicle-vehicle interaction,
occurring during lane changes and merging. The
vehicle-pedestrian interaction is also presented; it is
especially important in unstructured or unsignalized areas. The third
modality, i.e. the pedestrian-pedestrian interaction, which introduces more
uncertainty into autonomous driving scenarios, is included in the
proposed paradigm as well. The most challenging situation is when road
users conflict in their expected trajectories due to their
non-cooperative behaviors. For instance, in the vehicle-pedestrian
interaction case, the optimal solution for each of their trajectories
(the yellow and blue lines, respectively, shown in
Figure~{\ref{243685}}) is to not decelerate. However, if
both of them maintain their current speeds, a collision will be
inevitable.
\subsection*{The Decision-making Behind the
Scenes}
{\label{234522}}
Decision-making logics for autonomous vehicles and other road users can
be similar, and research on them can be mutually beneficial. Many scholars
have studied decision-making for autonomous
vehicles\cite{Kiran_2021} and pedestrians\cite{Kooij_2016}. Among
them, learning-based approaches are promising and gaining
popularity\cite{Kiran_2021}. Some studies suggest that human driving behaviors
can be extracted through machine learning algorithms, such as deep
learning\cite{Sama_2020,Huang_2020}, imitation learning\cite{christoph2017}, and
inverse reinforcement learning\cite{d2016}. However, due to the
inherent black-box nature of neural networks, the interpretability
of learning-based methods is not ideal. Inspired by the game-like
essence of road-user interactions, more interpretable game-theoretic
approaches have been investigated and are considered more reasonable and
practical. Some studies formulate the decision process as a
Stackelberg game\cite{Huang_2021,Yu_2018,Hang_2021,Hang_2020}, and they impose a strong assumption
on the availability of the leader as well as their utility function
during the game. Furthermore, as the opponent vehicles may not always
act as the formulated Stackelberg game expects, an online estimation
algorithm using historical data has been proposed to improve the game-based
interactions\cite{Zhang_2020,Zhang_2020a}. Additionally,
finding a Nash equilibrium poses a problem: there may be more
than one Nash equilibrium, and the equilibria might conflict with one
another\cite{Wang_2021,Spica_2020}.
Apart from the above-mentioned methods, the IDM-MOBIL model\cite{Kesting_2007} has also been
widely used and dominates the field of traffic scene
generation. It was originally designed to be collision-free. However,
after modifying some of the key parameters, such as the politeness,
acceleration, and the grid distance estimation, the algorithm can
generate adversarial behaviors for testing\cite{Feng2021,Lindorfer_2018}. Other
methods, including the risk field\cite{Kolekar2020}, artificial
potential field\cite{Rasekhipour_2017,Gao_2019}, constrained Delaunay
triangulation\cite{Huang_2021a,Huang_2021}, and scene
prediction\cite{Lawitzky_2013}, are capable of modeling a human driver's
cognitive states while considering safety. In general, the
aforementioned methods either make strong assumptions about the
availability of data and information, or are incomplete in representing
human behaviors, resulting in non-ideal scenes for driving tests.
Besides, whether adversarial or cautious, the generated behaviors
of road users in simulation testing should be human-like, to further
improve the fidelity of the simulation testing environment. Moreover,
both learning-based and game-based approaches currently require
heavy computational resources, which limits the implementation of advanced
scene generation algorithms.
\subsection*{The Estimation of Driving
Aggressiveness}
{\label{640501}}
Aggressiveness is an important factor for vehicles in the competition for
the right of way\cite{Huang_2021a}. It can be considered as a result of
the trade-off between driving safety and travel
efficiency\cite{Zhang_2020}. Due to the complexity of the problem,
there is no unified method for measuring
aggressiveness\cite{Marina_Martinez_2018}. Intuitively, relative speed,
acceleration, and the distance between vehicles can be used to
quantify the aggressiveness\cite{Huang_2021a,Colombo_2017,Li_2019,Wang_2017}. Nevertheless, using only
vehicle dynamics to measure driving styles is not comprehensive.
Thus, many studies have shifted to driver-behavior-oriented and
scene-specific methods for aggressiveness
estimation\cite{Solovey_2014,seon2020,Mole_2021}. In recent years, some new elements have been
introduced into the evaluation of aggressiveness. Many
new explorations have been conducted from the aspect of scenes, e.g.
straight roads\cite{Kolekar_2020}, curves\cite{Kolekar2020}, and
roundabouts\cite{Hang_2021a}, as well as from the aspect of human
factors, e.g. hand\cite{Muhlbacher_Karrer_2017}, eye\cite{Hu_2021a},
EEG\cite{Rupp_2019}, and so forth.
However, these methods have two main drawbacks. First, the
aggressiveness of a driver may not be consistent, due to varying
scenarios and travel demands. Second, from the energy management
perspective, although driving style recognition has been shown to be
beneficial for long-term strategy optimization\cite{Yang_2018},
obtaining the exact value of aggressiveness instantly may not be
necessary. Therefore, instead of pursuing an accurate and continuous
value for the estimate, we maintain that identifying the
relative competitiveness, i.e. classifying aggressiveness, is a feasible
and more pragmatic approach for autonomous driving.
\subsection*{The Human-like Behaviors}
{\label{637533}}
One of the challenges that distinguish autonomous driving from other
mobile platforms is traffic uncertainty. From this point of view,
representing human-like behaviors of road users is essential for
scene generation. The results reported in\cite{Feng2021} indicate
that generating rare cases, i.e. using a Markov model to randomly generate
initial scenes and the IDM-MOBIL model for adversarial behavior, can shorten
the overall testing time. But the drawback is also clear. The Markov
model is only used when the vehicle is cruising, thus there are no
significant interactions between the ego vehicle and the surrounding
ones. Besides, during the interactions, the IDM-MOBIL model cannot completely
represent human behaviors. Generating aggressive behaviors and possible
accidents is essential, but this can hardly be realized if only
non-human-like behaviors and interaction movements are produced in the
simulation environment. Learning-based approaches have been exploited as
approximators for human-like driving as well\cite{Li_2018,Zhang_2018}.
However, these methods suffer from the black-box nature of neural
networks, which can hardly be customized and interpreted for logic
analysis. Learning from datasets is an interesting and promising
methodology for realizing human-like driving\cite{Xu_2020}, but
the diversity and uncertainty of human driver behaviors should be
further considered. It is therefore difficult to formulate all human
decisions as a unified optimization problem, especially one that seeks a
global optimum. Moreover, because human drivers have limited and varied
sensing and motor abilities, their control performance is imperfect.
\section*{Experimental
Section/Methods}
{\label{956968}}
According to the aforementioned analysis, the formulated problem
consists of two main elements: 1) Conflict. There will be a severe
consequence if neither of the two interacting agents is willing to
deviate from its originally expected choice. Thus, one of them has to
yield eventually. 2) Alternatives. Each of the participants should have
at least two options, i.e. fight or yield. A conflict is defined as
a crossing of the expected trajectories of multiple agents. The expected
motion trajectory is predicted by assuming that the current
velocity and yaw angle are maintained. This is in line with the human-like concept, as
a human does not make complicated and precise predictions (a detailed
definition is given in Note S2, Supporting Information). In this work,
we mainly focus on the alternatives.
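\par As an illustrative sketch only (the precise definition is the one in Note S2, Supporting Information, and the notation here is an assumption), a straight-line extrapolation of agent \(i\) at its current velocity \(v_i\) and yaw angle \(\varphi_i\) over a look-ahead time \(\tau\) could be written as
\[
\hat{x}_i(t+\tau) = x_i(t) + v_i\,\tau\cos\varphi_i, \qquad
\hat{y}_i(t+\tau) = y_i(t) + v_i\,\tau\sin\varphi_i .
\]
For example, with hypothetical values, an agent at \((0,0)\) heading along \(\varphi_1 = 0\) at 10 m/s and another at \((20,-10)\) heading along \(\varphi_2 = \pi/2\) at 5 m/s have predicted paths that cross at \((20,0)\), and both agents reach that point at \(\tau = 2\) s, so a conflict would be flagged under this reading.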
\subsection*{Proposed Framework}
{\label{400713}}
The complete framework for interactive driving scene generation is shown in \textbf{Figure}
{\ref{832958}}. It is explicit and inspired by the human
decision-making process. Within the framework, the scheduler is used to
select players of interest. If an agent is not selected, it will follow
its own expected way-points, as how it moves has no direct impact on the
driving situation of the tested vehicle. However, when an agent is
selected, the candidate trajectory generator and the algorithm in
\textbf{Figure} {\ref{918425}} will be activated. The
trajectory generator can be RRT~\cite{LaValle_2001},
semi-reactive trajectory generation~\cite{Werling_2010}, and so forth, as
long as it can generate multiple possible trajectories for the player.
Then, the algorithm will determine whether there exists a conflict
between the expected trajectory and the prediction using the algorithm
presented. If there is no conflict, the best solution will be selected
from the candidates. If there exists a conflict, the algorithm will
determine whether there is still room available to fight. If the ego
player is not blocked, it will estimate the aggressiveness of the
surrounding agents using the proposed method. Then, the aggressiveness
will be updated for the decision-making module, which is based on
Bayesian game theory. If the decision is explicit, i.e. to fight or to
yield, then the agent will follow the decision. However, when the
decision is not clear, it will take some small probing steps,
inspired by human behavior. The small-step action carries a low risk
of collision, but it is enough to demonstrate the agent's
intention. The proposed framework is applicable to decision-making and
interactions among multi-modal agents, including vehicles, cyclists and
pedestrians, but in this work we mainly focus on the interactions
between vehicles.
\par\null\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.56\columnwidth]{figures/f3-1/f3-1}
\caption{{Scene generation framework
{\label{832958}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/f3-2/f3-2}
\caption{{Framework of interactive scene generation algorithm
{\label{918425}}%
}}
\end{center}
\end{figure}
\subsection*{Bayesian Game Based
Decision-making}
{\label{427278}}
Since the core algorithm for the interaction is the decision-making
process, and the aggressiveness observation and update module is designed
to serve this process, we present the decision-making algorithm
first, although it sits in the middle of the proposed framework. This
module is based on Bayesian game theory, which makes no assumptions
about the accessibility of the cost function of the opponent or leader of
the game. However, in order to simulate the subject in different
scenes, the cost function of Player One is still required, which will be
further formulated in the rest of this section.
Numerous scholars have studied game-theoretic approaches, in which an
N-player game with~\(N = 2,3,\dots,n\) can be simplified into a
two-player game, i.e.,~\(N = 2\). Each
player~\(i\in\{1,\dots,N\}\) has at least two alternative
solutions based on the problem definition, represented by a discrete set
\(A_i=\left\{a_{i,1},a_{i,2},\dots,a_{i,k},\dots,a_{i,K}\right\}\) with \(K\geq 2\), and a utility function given
by~\(u_i(a_{i,k},a_{i^{\prime},k^{\prime}})\). Based on the conflict
definition,~\(A_i\) comprises two clusters of strategies,
i.e. \( A_i = \left\{A_{i,F}, A_{i,Y}\right\}\). The action \(a_{i,F}\in A_{i,F}\) is the optimal or expected
trajectory that maximizes or minimizes the utility function among all the
alternatives to fight. The mechanism is the same for the yielding
type \(a_{i,Y}\in A_{i,Y}\). Since the best solution in each cluster
always dominates the remaining choices, it is not necessary to list all
the choices.
Considering the lack of information about other agents, relative
aggressiveness becomes important for decision-making and interaction.
Let the two players be Player One and Player Two. From Player
One's perspective, the aggressiveness of Player Two can be classified
into three types: equally aggressive, less aggressive, and more
aggressive. The probabilities of these three types,
\(p_j\) (\(j = 1,2,3\), \(\sum_j p_j = 1\)), follow a multinomial distribution. Different
types of Player Two have different forms of utility functions, which
are summarized in \textbf{Table} {\ref{t2}}.
\renewcommand{\arraystretch}{1.2}\selectlanguage{english}
\begin{table}[h]
\newcommand{\tabincell}[2]{\begin{tabular}{@{}#1@{}}#2\end{tabular}}
\centering
\caption{{Decision model based on the Bayesian game theory}}\label{t2}
\begin{tabular}{cccccccc}
\hline
\multicolumn{2}{c}{} & \multicolumn{2}{c}{\tabincell{c}{Equally aggressive \\ $(p_1)$}} & \multicolumn{2}{c}{\tabincell{c}{More aggressive \\ $(p_2)$}} & \multicolumn{2}{c}{\tabincell{c}{Less aggressive \\ $(p_3)$}}\\
\cline{3-8}
\multicolumn{2}{c}{} & \multicolumn{2}{c}{Player Two} & \multicolumn{2}{c}{Player Two} & \multicolumn{2}{c}{Player Two}\\
\cline{3-8}
\multicolumn{2}{c}{} & $a_{2,F}$ & $a_{2,Y}$ & $a_{2,F}$ & $a_{2,Y}$ & $a_{2,F}$ & $a_{2,Y}$\\
\hline
\multirow{2}*{Player One} & $a_{1,F}$ & $(u_{1},u_{2})$ & $(u_{1},u_{2})$ & $(u^{\prime}_{1},u_{2})$ & $(u_{1},u_{2})$ & $(u_{1},u^{\prime}_{2})$ & $(u_{1},u_{2})$ \\
\cline{2-8}
& $a_{1,Y}$ & $(u_{1},u_{2})$ & $(u_{1},u_{2})$ & $(u_{1},u_{2})$ & $(u_{1},u_{2})$ & $(u_{1},u_{2})$ & $(u_{1},u_{2})$ \\
\hline
\end{tabular}
\end{table}
For clarity, \(u_{i}\) is short for~\( u_{i}(a_{i,F/Y},a_{i^{\prime},Y/F})\). In the
table, \(u_1^{\prime}\) and \(u_2^{\prime}\) refer to the more
aggressive player's cost if she or he chooses to fight, which is
assumed to be much smaller than \(u_{i}\), because aggressive
road users tend to assume that there will be no collision. For
simplification, we define \(u_1^{\prime}(a_{1,F},a_{2,Y})= u_1(a_{1,F},a_{2,Y})\) and \(u_2^{\prime}(a_{1,F},a_{2,Y})= u_2(a_{1,F},a_{2,Y})\).
Meanwhile, if Player One chooses to yield, Player Two's choice (to yield
or to fight) makes no difference to Player One's cost, and vice versa.
This means that \(u_1(a_{1,Y},a_{2,F})= u_1(a_{1,Y},a_{2,Y})\) and \(u_2(a_{1,Y},a_{2,Y})= u_2(a_{1,F},a_{2,Y})\). To find the
Bayesian Nash equilibrium, we have to extend the table using Note S3,
Supporting Information. Let
\begin{equation}
\left\{
\begin{aligned}
f_1(U_1,P) &= \operatorname{sign}\big(u_1(a_{1,Y},a_{2,F})-p_2u_1(a_{1,F},a_{2,Y})-u_1(a_{1,F},a_{2,F})(p_1+p_3)\big)\\
f_2(U_1,P) &= \operatorname{sign}\big(u_1(a_{1,Y},a_{2,F})-p_3u_1(a_{1,F},a_{2,F})-u_1(a_{1,F},a_{2,Y})(p_1+p_2)\big)
\end{aligned}
\right.
\label{eq7}
\end{equation}
Under conditions (2) and (3) below, there is only one
equilibrium. Player One is confident to fight when
\begin{equation}
\begin{aligned}
f_1(U_1,P)>0,
f_2(U_1,P)>0\\
\end{aligned}
\label{eq8}
\end{equation}
Player one will choose to yield when
\par\null
\begin{equation}
\begin{aligned}
f_1(U_1,P)<0,
f_2(U_1,P)<0\\
\end{aligned}
\label{eq9}
\end{equation}
However, Player One will be uncertain, because there is more than one
equilibrium, if
\par\null
\begin{equation}
\left\{
\begin{aligned}
&f_1(U_1,P)<0,\ f_2(U_1,P)>0,\ \text{or}\\
&f_1(U_1,P)>0,\ f_2(U_1,P)<0
\end{aligned}
\right.
\label{eq10}
\end{equation}
Note that the final decision is determined by both the probabilities and the
cost function. For vehicle-vehicle interactions, the cost function
of the ego vehicle is given in Note S4, Supporting Information.
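\par To illustrate how (1)--(4) separate the three outcomes, consider a worked example with purely hypothetical costs (not the cost function of Note S4): \(u_1(a_{1,Y},a_{2,F}) = 4\), \(u_1(a_{1,F},a_{2,Y}) = 1\), and \(u_1(a_{1,F},a_{2,F}) = 10\). For the belief \((p_1,p_2,p_3) = (0.2, 0.7, 0.1)\),
\[
\begin{aligned}
f_1 &= \operatorname{sign}(4 - 0.7\cdot 1 - 10\cdot 0.3) = \operatorname{sign}(0.3) = +1,\\
f_2 &= \operatorname{sign}(4 - 0.1\cdot 10 - 1\cdot 0.9) = \operatorname{sign}(2.1) = +1,
\end{aligned}
\]
so condition (2) holds and Player One fights. For \((p_1,p_2,p_3) = (0.2, 0.1, 0.7)\),
\[
\begin{aligned}
f_1 &= \operatorname{sign}(4 - 0.1\cdot 1 - 10\cdot 0.9) = \operatorname{sign}(-5.1) = -1,\\
f_2 &= \operatorname{sign}(4 - 0.7\cdot 10 - 1\cdot 0.3) = \operatorname{sign}(-3.3) = -1,
\end{aligned}
\]
so condition (3) holds and Player One yields. For \((p_1,p_2,p_3) = (0.6, 0.3, 0.1)\),
\[
\begin{aligned}
f_1 &= \operatorname{sign}(4 - 0.3\cdot 1 - 10\cdot 0.7) = \operatorname{sign}(-3.3) = -1,\\
f_2 &= \operatorname{sign}(4 - 0.1\cdot 10 - 1\cdot 0.9) = \operatorname{sign}(2.1) = +1,
\end{aligned}
\]
which falls under (4): the signs disagree, so the decision is not explicit and the agent would resort to probing.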
\subsection*{Aggressiveness Estimation and Belief
Updating}
{\label{154993}}
The road user's decision is based on her/his observation of the
aggressiveness of the surrounding road users. Aggressiveness is defined
as a trade-off between safe distance and travel efficiency. Based on
this definition, the observed aggressiveness is given by a sum of the
longitudinal and lateral components, which aligns with research on the
risk field: a driver's reaction to surrounding obstacles is a function
of relative distance. Meanwhile, we also have to remove the part of the
distance variation that the subject driver generates. The risk field is
simplified with the Gaussian model in (5), where the parameters of the
risk field can be obtained via the proposed experiments.
\par\null
\begin{equation}
\begin{aligned}
\alpha_{bv} = A_f(v_{sv}, \Delta\varphi)\exp\left(-\frac{(x_{bv}-x_{sv}+v_{sv}\Delta t\cos(\varphi_{sv}))^2}{2\sigma^2_{X}(v_{sv})}-\frac{(y_{bv}-y_{sv}+v_{sv}\Delta t\sin(\varphi_{sv}))^2}{2\sigma^2_{Y}(\Delta\varphi_{sv})}\right)
\end{aligned}
\label{eq17}
\end{equation}
where~\( \alpha_{bv}\) is the estimated aggressiveness of the obstacle
vehicle, and \(\sigma_X\) and \(\sigma_Y\) define the shape of the aggressiveness
field. The velocity term in the numerator removes the distance variation
created by the subject vehicle itself. In order to
measure the parameters, we first assume that \(A_f(v_{sv}, \Delta\varphi)\),
\(\sigma_{X}(v_{sv})\), and \(\sigma_{Y}(\Delta\varphi_{sv})\) are linear, as below.
\par\null
\begin{equation}
\left\{
\begin{aligned}
A_f(v_{sv}, \Delta\varphi) &= \theta_1v_{sv} + \theta_2\cos(\Delta\varphi) + \theta_3\\
\sigma_{X}(v_{sv}) &= \theta_4v_{sv} + \theta_5\\
\sigma_{Y}(\Delta\varphi_{sv}) &= \theta_6\Delta\varphi + \theta_7\\
\end{aligned}
\right.
\label{eq18}
\end{equation}
Note that the relative velocity is not included in (5) because,
according to the experiment questionnaires, drivers report that they
are usually unable to estimate the relative velocity; it is,
however, reflected in the distance variations.
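\par As a worked illustration of (5), take hypothetical field parameters \(A_f = 1\), \(\sigma_X = 10\) m, and \(\sigma_Y = 2\) m (these are not the fitted values, which are reported in Note S5, Supporting Information), a subject vehicle travelling at \(v_{sv} = 7\) m/s with \(\varphi_{sv} = 0\), and an assumed time step \(\Delta t = 0.1\) s. If the obstacle vehicle is 9.3 m ahead and 2 m to the side, then
\[
\alpha_{bv} = \exp\left(-\frac{(9.3 + 7\cdot 0.1\cdot\cos 0)^2}{2\cdot 10^2} - \frac{(2 + 7\cdot 0.1\cdot\sin 0)^2}{2\cdot 2^2}\right) = \exp(-0.5 - 0.5) = e^{-1} \approx 0.37 .
\]
With the same parameters, an obstacle only 4.3 m ahead and 1 m to the side gives \(\exp(-0.125 - 0.125) = e^{-0.25} \approx 0.78\), so the estimate rises toward \(A_f\) as the obstacle closes in, as expected of a Gaussian risk field.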
After estimating the obstacle player's strategy, the subject player
updates her/his belief about the driving style probability distribution,
i.e.,~\(p_j\)~in Table {\ref{t2}}. The
probability vector \textbf{\emph{p}} follows a Dirichlet
distribution given by
\par\null
\begin{equation}
\begin{aligned}
p(\textit{\textbf{p}})\sim D(\beta_1,\dots,\beta_K) &= \frac{\Gamma(\sum_k\beta_k)}{\prod_k\Gamma(\beta_k)}\prod_kp^{\beta_k-1}_k,\\
\text{where}\quad p_k &> 0,\quad \sum_k p_k = 1
\end{aligned}
\label{eq20}
\end{equation}
The algorithm updates hyper-parameters \(\beta_k\) based on the
observation of the obstacle vehicle using (8).
\par\null
\begin{equation}
\left\{
\begin{aligned}
& \beta^{t+1}_1 = \beta^{t}_1 + \kappa_1\left|\alpha_{bv}-\alpha_{sv}\right|, \quad \text{if } \left|\alpha_{sv}-\alpha_{bv}\right|<\alpha_{th}\\
& \beta^{t+1}_2 = \beta^{t}_2 + \kappa_2(\alpha_{sv}-\alpha_{bv}), \quad \text{if } \alpha_{sv}-\alpha_{bv}\geq\alpha_{th}\\
& \beta^{t+1}_3 = \beta^{t}_3 + \kappa_3(\alpha_{bv}-\alpha_{sv}), \quad \text{if } \alpha_{bv}-\alpha_{sv}\geq\alpha_{th}
\end{aligned}
\right.
\label{eq21}
\end{equation}
where \(\alpha_{th}\) is the sensitivity threshold, i.e. the smallest difference in aggressiveness that the driver
can perceive, \(\alpha_{sv}\) and \(\alpha_{bv}\) are the aggressiveness of the subject
vehicle and the background vehicle, respectively, and \(\kappa_1, \kappa_2, \kappa_3\) are
the sensitivity parameters. In the decision-making algorithm, each
probability \(p_k\) is replaced by the expectation of the Dirichlet
distribution, given in (9):
\par\null
\begin{equation}
\begin{aligned}
E\left[p_k\right] = \frac{\beta_k}{\sum_i\beta_i}\\
\end{aligned}
\label{eq22}
\end{equation}
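\par For instance, under the purely illustrative assumptions \(\beta^{t} = (1, 1, 1)\), \(\kappa_2 = 1\), \(\alpha_{th} = 0.1\), \(\alpha_{sv} = 0.8\), and \(\alpha_{bv} = 0.3\), the difference \(\alpha_{sv}-\alpha_{bv} = 0.5 \geq \alpha_{th}\) triggers the second line of (8), giving \(\beta^{t+1}_2 = 1 + 1\cdot 0.5 = 1.5\). Assuming the other two hyper-parameters keep their previous values, the expectation in (9) becomes
\[
E\left[p_1\right] = \frac{1}{3.5} \approx 0.29, \qquad
E\left[p_2\right] = \frac{1.5}{3.5} \approx 0.43, \qquad
E\left[p_3\right] = \frac{1}{3.5} \approx 0.29 ,
\]
so a single observation shifts the belief toward the second type without discarding the other two.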
\textbf{Experiment of aggressiveness estimation}
In order to obtain the personalized parameters in (6), a series of
experiments is conducted. All the experiments in this work are carried
out on a 64-bit Windows 10 machine with an Intel Core i7 CPU, 32 GB
of memory, and two Logitech G29 steering wheels for driver input. The
experiment environment is based on Simulink and Unreal Engine with a 10 Hz
sampling rate. Aggressiveness is an abstract idea and is difficult to
quantify. Nevertheless, what we can obtain from the experiment is the
extreme of aggressiveness, i.e., aggressiveness = 1. Hence, this
experiment is designed to find the participant's limit and then
fit it to (6). In the experiment, the participant can observe the
subject and surrounding vehicles, but she or he has no control over the
subject vehicle except a stop button to terminate the experiment
(details can be found in Note S8, Supporting Information). The subject
vehicle and the obstacle vehicle are initialized with a constant steering
angle and velocity, which are designed to ensure a collision if the
experiment is not stopped by the participant. The participant is asked
to stop when she or he thinks the obstacle is too aggressive to
tolerate. When she or he stops, we record the current relative position
and yaw angle. The subject vehicle is controlled by a PID controller
with white Gaussian noise to enrich the dataset. After 5 repeated sets
of experiments, the vehicle starts at a new position as in
\textbf{Figure} {\ref{481784}}. There are a total of 6
different angles for collision. After that, we change the speed for
another set of experiments. The obstacle maintains a constant velocity and
yaw angle. There are a total of 5 different velocity references. The
experiment results, the fitting algorithm, and comparisons of the fitting
results can be found in Note S5, Supporting Information.
\par\null\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/f5/f5}
\caption{{Diagram of proposed aggressiveness estimation method
{\label{481784}}%
}}
\end{center}
\end{figure}
Examples of the aggressiveness estimation function are given in
\textbf{Figure~{\ref{735767}},
Figure~{\ref{766650}},
Figure~{\ref{980650}}, and
Figure~{\ref{700403}}}.
\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/case1/static}
\caption{{Case 1: velocity = 7 m/s, relative yaw angle = 0 rad
{\label{735767}}%
}}
\end{center}
\end{figure}
\textbf{~}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/case2/static}
\caption{{Case 2: velocity = 7 m/s, relative yaw angle = \(\pi/6\) rad
{\label{766650}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/case3/static}
\caption{{Case 3: velocity = 13 m/s, relative yaw angle = 0 rad
{\label{980650}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/case4/static}
\caption{{Case 4: velocity = 13 m/s, relative yaw angle = \(\pi/6\) rad
{\label{700403}}%
}}
\end{center}
\end{figure}
\subsection*{Probing Behavior
Generation}
{\label{703703}}
If the subject vehicle is uncertain about its aggressiveness relative to
the surrounding vehicles, it takes some small steps to test
the real aggressiveness of the obstacle vehicles. These steps must be neither
so large that they trigger collisions nor so small that they show no sign of its
own intention. This part is what distinguishes this work from
extant studies. When dealing with uncertainties, current methods tend
to give up or fight outright according to some fixed thresholds, such as
time-to-collision. In reality, however, people tend to fight to a certain
degree. If the obstacles are indeed more competitive, they will
eventually give up; but if not, they can win the right of way,
especially during a traffic jam, where human drivers tend to seek a
more advantageous position. The small step is given by the algorithm in
\textbf{Table} {\ref{t4}}.\selectlanguage{english}
\begin{table}[h]
\centering
\caption{{Algorithm of a small step}}
\label{t4}
\begin{tabular}{ll}
\hline
\multicolumn{2}{l}{SMALL\_STEP$(\alpha_{bv},\alpha_{sv\_th})$}\\
1 & \textbf{if} $\alpha_{bv} \leq \alpha_{sv\_th}$ \textbf{then}\\
2 & $\alpha_{sv}\leftarrow \alpha_{sv\_last}+(\alpha_{sv\_th} - \max(\alpha_{bv},\alpha_{sv\_last}))\textit{rand}$;\\
3 & \textbf{else}\\
4 & $\alpha_{sv}\leftarrow \alpha_{sv\_last}(1-\textit{rand})$;\\
5 & \textbf{end}\\
6 &$\alpha_{sv\_last}=\alpha_{sv};$\\
7 & \textbf{Return} $\alpha_{sv}$\\
\hline
\end{tabular}
\end{table}
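\par As a worked illustration of the algorithm in \textbf{Table} {\ref{t4}} with hypothetical numbers, let \(\alpha_{sv\_th} = 0.7\) and \(\alpha_{sv\_last} = 0.5\), and suppose the random draw happens to be \(\textit{rand} = 0.5\). If the observed obstacle aggressiveness is \(\alpha_{bv} = 0.4 \leq \alpha_{sv\_th}\), line 2 gives
\[
\alpha_{sv} = 0.5 + \left(0.7 - \max(0.4, 0.5)\right)\cdot 0.5 = 0.5 + 0.2\cdot 0.5 = 0.6 ,
\]
i.e. the subject moves part of the way toward its threshold without exceeding it. If instead \(\alpha_{bv} = 0.9 > \alpha_{sv\_th}\), line 4 gives
\[
\alpha_{sv} = 0.5\cdot(1 - 0.5) = 0.25 ,
\]
so the subject backs off. In both branches the result stays between 0 and \(\alpha_{sv\_th}\) for these inputs.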
\emph{rand} is a random number between 0 and 1, used to simulate the
randomness of reality in tests. The result of the above algorithm can
be viewed as the subject vehicle's strategy, or the aggressiveness that
the driver wants to impose on the surrounding road users. However, it
is only a value from 0 to 1 and is not yet connected to the final control output.
Hence, a Markov-based control strategy is proposed to build the
connection. First, a Markov transition matrix reflecting one's driving
habits is constructed from experiments. We insist on
experiments because they allow this matrix to be customized for different test
applications. For example, we can collect driving data from
different drivers and abstract them into this matrix; when a special
or random type of driving is required for the driving intelligence test,
the matrix can be called to generate personalized behavior. We asked
participants to repeat a lane change 40 times at different driving speeds
and recorded their behaviors. For longitudinal control, the
transitional probability to the next acceleration is conditioned on the
current velocity. For lateral control, the transitional matrix models
the probability of the yaw angle increment given the current velocity
and current yaw angle. The results for one participant are reported in Note
S6, Supporting Information.
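As a worked instance of the small-step update in Table~\ref{t4}, consider the
following values, which are illustrative assumptions chosen for easy
arithmetic rather than settings from our experiments:
\(\alpha_{sv\_th}=0.7\), \(\alpha_{sv\_last}=0.4\), and a draw
\(\textit{rand}=0.5\). The two branches of the algorithm then give
\begin{equation*}
\begin{aligned}
\alpha_{bv}=0.3\leq 0.7:&\quad \alpha_{sv}=0.4+\bigl(0.7-\max(0.3,0.4)\bigr)\times 0.5=0.55,\\
\alpha_{bv}=0.8>0.7:&\quad \alpha_{sv}=0.4\times(1-0.5)=0.2.
\end{aligned}
\end{equation*}
In the first case the strategy moves halfway from its previous value towards
the threshold, so it probes without exceeding \(\alpha_{sv\_th}\); in the
second case it is halved, that is, the subject vehicle backs off.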
Ideally, the interim strategy \(\alpha_{sv}\) would be embedded directly into the
Markov transitional probability. However, because
\(\alpha_{sv}\) is an abstract variable, the transitional probability
above is not used directly; instead, it is combined with the
Small-Step strategy. Inspired by \cite{Shin_2019}, the joint Markov chain is
defined by
\par\null
\begin{equation}
\begin{aligned}
{\pi_a}^{t+1}_{(i,j)} = P(a_{t+1}=a_j \mid v_{t}=v_i)\mathcal N(2a_{gridX}(\alpha_{sv}-\alpha_{M}),\delta^2_x)\\
\end{aligned}
\label{eq23}
\end{equation}
\begin{equation}
\begin{aligned}
{\pi_s}^{t+1}_{(i,j)} = P(y^{\prime}_{t+1}=a_j \mid v_{t}=v_i,\phi_{t}=\phi_i)\mathcal N(2a_{gridY}(\alpha_{sv}-\alpha_{M}),\delta^2_y)\\
\end{aligned}
\label{eq24}
\end{equation}
These two equations can be viewed as a combination of behavior and
strategy. \(\mathcal N\) is a normal distribution with a
mean of~\(2a_{gridX}(\alpha_{sv}-\alpha_{M})\) or~\(2a_{gridY}(\alpha_{sv}-\alpha_{M})\) and a standard
deviation of~\(\delta_x\) or~\(\delta_y\), which are tuned
according to the grid distance of the Markov transitional
matrix.~\(\alpha_{M}\) is the middle point of the
transitional probability, which equals 0.5 in this
work.~\(a_{gridX}\) and~\(a_{gridY}\) are given by the shape
of the transitional probability matrices. Ideally, the direction of
fight or yield should be defined according to the gradient of the
risk field~\cite{Kolekar2020}. For simplicity, the direction of fight is
taken as the direction that shortens the distance between the two subjects,
and the direction of yield is the opposite. The transitional probabilities
we use are shown in \textbf{Figure~{\ref{244402}}} and \textbf{Figure~{\ref{382635}}}.\selectlanguage{english}
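To illustrate how the strategy shifts the Gaussian weighting in
(\ref{eq23}), the mean can be evaluated for a few values of
\(\alpha_{sv}\) with \(\alpha_{M}=0.5\). The intermediate values 0.2 and
0.55 are illustrative assumptions, not experimental settings:
\begin{equation*}
2a_{gridX}(\alpha_{sv}-\alpha_{M})=
\begin{cases}
-a_{gridX}, & \alpha_{sv}=0,\\
-0.6\,a_{gridX}, & \alpha_{sv}=0.2,\\
0, & \alpha_{sv}=0.5,\\
0.1\,a_{gridX}, & \alpha_{sv}=0.55,\\
a_{gridX}, & \alpha_{sv}=1.
\end{cases}
\end{equation*}
The mean therefore sweeps the interval \([-a_{gridX},a_{gridX}]\) as
\(\alpha_{sv}\) ranges over \([0,1]\), and a neutral strategy
\(\alpha_{sv}=\alpha_{M}\) gives a zero mean. The same holds for the lateral
case in (\ref{eq24}) with \(a_{gridY}\).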
\begin{figure}[h!]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/Long/static}
\caption{{Longitudinal transitional probability (x: acceleration (m/s\(^2\)), y:
velocity (m/s), z: transitional probability (-))
{\label{244402}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/Lat/static}
\caption{{Lateral transitional probability (x: velocity (m/s),
y:~\(\psi\)~(degree), z: \(\psi'\) (degree/s), size:
transitional probability (-)). The figure shows only part of the data
because the original data set is large.
{\label{382635}}%
}}
\end{center}
\end{figure}
\subsection*{Experiment of the Turing
Test}
{\label{797366}}
The experiment environment is shown
in~\textbf{Figure~}{\ref{524748}} and \textbf{Figure}
{\ref{708405}}. Two participants sit in two simulators
separated by a barrier so that they cannot see each other. The scene
and parameters are also presented in Note S1, Supporting Information.
Vehicle 1 can be controlled by participant 2 or by our algorithm. Twenty
rounds of tests were carried out. The participants comprised 16 males and 4 females (driving license:
M = 4.56, SD = 2.6977; age: M = 27.44, SD = 2.56).~The study protocol and
consent form were approved by the Nanyang Technological University
Institutional Review Board (protocol number IRB-2018-11-025).
\par\null\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/f38/f38}
\caption{{Turing test environment
{\label{524748}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/f33/f33}
\caption{{Turing test framework
{\label{708405}}%
}}
\end{center}
\end{figure}
The detailed procedure of the experiment is as follows.
1) Before the test, we inform the participants of the following items.
\begin{itemize}
\tightlist
\item
  The goal of the experiment is for Driver B (V2) to distinguish, based on
  the interaction, whether the opponent driver (Driver A: V1) is a human
  or an algorithm.
\item
  The task of the scene is as depicted in Note S1, Supporting
  Information.
\item
  The driver should follow the traffic rules.
\item
  Their primary task is the task of the scene: they should drive
  normally. The secondary task is the goal of the experiment: they can
  try to test vehicle 1. This is important because we think the driver
  should behave reasonably but, at the same time, needs to
  put pressure on vehicle 1. If they drive too cautiously, without any
  conflict of interest, it is hard to evaluate the performance of
  vehicle 1. Also, the proposed algorithm is intended for scene generation,
  where a human-like way of triggering an accident is sometimes needed.
\item
  The driving style of the algorithm is randomly generated before each
  test.
\item
  A single test takes 20 seconds, and there are 10 tests in total
  for each participant.
\end{itemize}
2) The two participants complete a few practice rounds until they are familiar with the
simulator and the dynamic performance of the vehicles. 3) After enough
practice, Driver B chooses to drive in autonomous mode or fully manual
mode. If Driver B chooses autonomous mode, she or he also generates a
random driving style parameter (\(\alpha_{th}\)). Driver A cannot see
this process. 4) Driver B initiates one test and, before it starts,
informs Driver A so that Driver A can be well prepared
for the experiment. 5) Complete one test. 6) Fill in a questionnaire. 7)
Repeat steps 3) to 6) nine more times. 8) Score the entire experiment
using the table in Note S7, Supporting Information. 9) After one set of
tests, Driver B is asked what the criteria for their judgments were.
The questionnaire for Driver A consists of a single question with five
alternatives for each test (Question: rate the performance of Driver B.
Choices:~\emph{Robot driver, Somewhat robot-like, Not sure, Somewhat
human-like, Human driver}). The questionnaire for Driver B comprises three
questions (Question 1: the ground truth of whether this test was driven by
a human or an algorithm. Question 2: the driving style of Driver A.
Choices: \emph{Aggressive, Cautious, Normal}. Question 3: was there a
collision in this test and, if so, which driver was responsible for
it?).
\section*{Results}
{\label{831157}}
Three major results are reported in this section: the way to customize
the driving behavior, the comparisons with existing methods, and the
Turing test results.
\subsection*{Driving Behavior
Customization}
{\label{806724}}
This subsection demonstrates how the parameters of our algorithm affect the
interactive behavior. The ways to
manipulate the driving behavior by adjusting~\(\alpha_{svth}\)
and~\(\kappa\) are reported and compared.
First, different thresholds~\(\alpha_{svth}\) are compared
in~\textbf{Figure~}{\ref{113875}}. For comparison, the
surrounding vehicle (V1) strictly follows fixed predefined
way-points. These way-points, like all other information about V1, are unknown to the
subject vehicle (V2), which is modeled by setting the initial value~\(\beta^0_i = \frac{1}{3}, i=1,2,3\).~Also,
the costs in Table~{\ref{t2}} are set to a pair of
constants for comparison. Other parameters are given in Note S1,
Supporting Information, and~\cite{Werling_2010} is used as the candidate
trajectory generation algorithm.~
\par\null
Rich media available at \url{https://www.youtube.com/watch?v=5tDlAgcQ\_GA}
Different values of \(\kappa\) are compared as well. Note that (8) uses
separate \(\kappa\)s to accommodate different test
requirements: some people might be more cautious under certain
circumstances and reckless under others. For the comparison, we simplify
the cases by assuming \(\kappa = \kappa_1 = \kappa_2=\kappa_3\).
\par\null\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/fmerge2/fmerge2}
\caption{{Customizing cautiousness
{\label{896130}}%
}}
\end{center}
\end{figure}
\subsection*{Comparisons with Existing
Methods}
{\label{607823}}
To verify whether the proposed algorithm is reasonable and, furthermore,
human-like, it is first compared with extant decision-making
methodologies. To test its human-likeness, it is then compared with
studies whose methods have been shown to be human-like.
We compare our Bayesian-game-based approach with two widely accepted
lane change decision-making methodologies: 1) IDM and MOBIL (metric 1a),
which are the most frequently used methods to imitate human drivers'
behavior, as in~\textbf{Figure~}{\ref{907317}}; 2)
Xuemin's~\cite{Hu_2018} method (metric 1b), which generates
multiple alternatives and chooses one based on a cost function, as
in~\textbf{Figure~}{\ref{665197}}.\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/f21/f21}
\caption{{Comparison with IDM+MOBIL
{\label{907317}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/f22/static}
\caption{{Comparison with Xuemin et al.'s method (Blue: our method, Red: metric 1b
method)
{\label{665197}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/f26/f26}
\caption{{Comparison with Annemarie et al.'s method
{\label{693354}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.84\columnwidth]{figures/fmerge3/merge-Kao-Bei-}
\caption{{Comparison with Hang et al.'s method
{\label{539751}}%
}}
\end{center}
\end{figure}
To evaluate the human-likeness, the proposed method is compared with
methods that have been shown to be human-like. As the first comparison,
pedestrian trajectories are generated using game theory
in~\textbf{Figure~}{\ref{693354}} (metric 2a). The two
pedestrians' shortest trajectories to their respective goals conflict
with each other. Using the rapidly-exploring random tree method with B-splines,
multiple trajectories can be generated. The best trajectory to fight
in~\textbf{Figure~}{\ref{693354}} is set to
the shortest among all generated trajectories, while the
expected trajectory to yield is set to the shortest trajectory without
a possible collision. A comparison with the game-theoretic
approach~\cite{yang2020}~in Figure~{\ref{539751}}
(metric 2b) is also reported.
\subsection*{Turing Test}
{\label{461019}}
To verify the human-likeness of the proposed method, we conducted a
Turing test. The experiment results are summarized in \textbf{Table}
{\ref{t6}} and \textbf{Figure}
{\ref{428261}}. If the proposed algorithm were obviously
different from human driving behavior, the score would be close to
0 or 10.\selectlanguage{english}
\begin{figure}[h!]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/f39/f39}
\caption{{Turing test results
{\label{428261}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{table}
\centering
\caption{{Turing test results}}
\begin{tabular}{cc}
\hline
Range & Percentage\\
(8,10] & 0\%\\
(6,8] & 30\%\\
(4,6] & 50\%\\
(2,4] & 20\%\\
(0,2] & 0\%\\
\hline
Mean & 5.2625 \\
Standard deviation & 1.6110 \\
\hline
\end{tabular}
\label{t6}
\end{table}
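As a coarse consistency check on Table~\ref{t6}, the mean and standard
deviation can be approximated from the binned percentages by placing every
score at the midpoint of its range, reading the 20\% row as the range
\((2,4]\), which the neighbouring ranges imply. The midpoint placement is an
assumption made only for this check; the reported statistics are presumably
computed from the raw scores.
\begin{equation*}
\begin{aligned}
\bar{s} &\approx 0.3\times 7+0.5\times 5+0.2\times 3=5.2,\\
\sigma_s &\approx \sqrt{0.3\,(7-5.2)^2+0.5\,(5-5.2)^2+0.2\,(3-5.2)^2}=\sqrt{1.96}=1.4 .
\end{aligned}
\end{equation*}
Both values are close to the reported mean of 5.2625 and standard deviation
of 1.6110. The smaller approximate standard deviation is expected because
the midpoint placement ignores the spread within each range.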
Four selected experiment videos are shown
in~\textbf{Figure~{\ref{779927}}}.
\par\null
Rich media available at \url{https://youtu.be/SlhSdPmNHF0}
\section*{Discussion}
{\label{328721}}
As can be seen in Figure~{\ref{113875}}, all three
cases start with an ambiguous situation because, under the
incomplete-information assumption, V2 has no prior on V1.
When~\(\alpha_{svth}=0.5\), V2 believes that it is less aggressive than V1,
as can be seen in Figure~{\ref{113875}}~(a), but it
still accelerates slightly to confirm this belief. After a short
period of acceleration, it finds that it is indeed less
aggressive and therefore decelerates to give the right of way, as can be seen
in Figure~{\ref{113875}}~(d). On the other hand,
when~\(\alpha_{svth}=0.7\), V2 believes it is more aggressive than V1, as
can be seen in Figure~{\ref{113875}}~(b), where the
green region is the largest at the beginning. It accelerates for longer to
demonstrate its will to fight. After a turning point at around 3 s, the
subject vehicle begins to doubt whether it is really more aggressive than
the surrounding vehicle, because the obstacle vehicle follows predefined
way-points no matter what happens, which can be seen as extremely
aggressive. V2 eventually yields until there is no driving conflict.
Finally, when~\(\alpha_{svth}=0.9\), V2 again believes it is more aggressive than
the surrounding vehicle. It therefore accelerates intensively, in line with its
belief, shown as the green block in Figure~{\ref{113875}}~(c).
After a short period of probing, it chooses to fight. For the
same reason as before, however, the obstacle vehicle can be seen as extremely aggressive,
and V2 doubts itself even while it is driving parallel to the obstacle vehicle.
After a while, it concludes that it is less aggressive and accelerates to
get out of the situation. In this case, V2 is not more aggressive than
the obstacle vehicle, but while probing it
accelerates more and finally blocks the surrounding road user.
As shown in Figure {\ref{896130}}, when
\(\kappa = 1\) the driver does not rush a decision, in contrast
to \(\kappa = 2\): the relative aggressiveness varies less
intensively. By tuning the white-box parameters above, various complex
behaviors can be generated.
As in Figure {\ref{907317}}, when the ego vehicle is
less polite, or more aggressive as defined in this paper, the driver
tends to start a lane change earlier. However, our method is not
exactly monotonic. This is because, while the vehicle is probing, there
is a chance that the subject vehicle misjudges the obstacle vehicle's intention;
the subject vehicle can also block the obstacle vehicle, so that the
obstacle vehicle has to yield. This means the proposed method concurs with
MOBIL, which has proved effective in modeling large traffic
flows, while our method can generate microscopic and more complicated human
behavior. Also, the comparison with Xuemin's~\cite{Hu_2018} method in
Figure {\ref{665197}} indicates that our method can
be more aggressive. As can be seen in the early phase of the lane change,
the compared method is more conservative because, when there is a
conflict of interest, the subject vehicle chooses a less costly
candidate trajectory without conflict. The proposed method, in contrast, is
adversarial because the algorithm assumes that the obstacle driver will
eventually yield.
When we set the collision weight to infinity, the trajectories selected
by the algorithm are the same as those of the compared method, as validated in
Figure {\ref{693354}}. The value of
\(P\) does not matter because, when the collision cost is too high, our
method always chooses a conservative alternative, which aligns with
the literature.
We also compare our method with Hang et al.'s method. In
Figure~{\ref{539751}}b, vehicle 2 is assumed to be an
aggressive driver. In our case, with~\(\alpha_{th,v1} = 0.1\) and~\(\alpha_{th,v2} = 1\),
both methods indicate that the subject vehicle maintains a relatively
low velocity, but there are two major differences. In
Figure~{\ref{539751}}d, the subject vehicle of Hang
et al.'s method accelerates and maintains its speed at around 21 m/s.
In contrast, to enlarge the space for the lane change, our vehicle 1 decelerates
and then starts to accelerate at 5 s to restart the lane change. As can be
seen, our method achieves a shorter lane-changing time: proactive
deceleration increases the grid distance for a faster lane change. As for
driver 2's behavior, whereas vehicle 2 in Hang's method accelerates intensively and
maintains a speed of around 22 m/s, our vehicle 2 does not accelerate much
from 0 s to 1 s because vehicle 1 is faster and can therefore be seen as more aggressive.
Although the aggressiveness threshold of our vehicle 2 is 1, it
accelerates conservatively to ascertain driver 1's aggressiveness.
From 4 s to 8 s, however, our vehicle 2 accelerates to a relatively high velocity to
get rid of the lane-changing vehicle. The other situation is the
opposite of case one. As can be seen in
Figure~{\ref{539751}}c and
Figure~{\ref{539751}}e, vehicle 1 is quite sure that it
is more aggressive; it therefore starts a lane change in the early phase and
accelerates to 24 m/s at 3 s. However, vehicle 2 does not quit fighting
from 0 s to 1 s. This phenomenon aligns with Hang's method.
From 3 s to 8 s the two methods differ because, in the scene
generation context, the vehicles are forced to return
to their initial velocity after the interaction, which eliminates the
chance of collisions with irrelevant vehicles; this setting can easily be tuned to
match Hang's method. Additionally, although the trends in the two
methods are identical, the velocities and trajectories are not exactly the
same. This is partially because Hang et al. use MPC for vehicle control,
which is a strong assumption because human drivers cannot control
the vehicle perfectly.
The above comparisons show that, at the decision level,
the proposed method concurs with the state-of-the-art literature and is more
flexible and intelligent with respect to scene generation. Although we
compare the proposed method with methodologies that have been shown to
be human-like, a variation of the Turing test is conducted to further
evaluate its human-likeness.
In the Turing test, most participants are unable to distinguish whether the opponent
is a human or an algorithm: 50\% of them have approximately 50\% accuracy,
and the other participants' scores are close to 50\% as well. This
indicates that our method can confuse the participants so that they cannot
distinguish whether the behavior is algorithm-generated or controlled by a real
human driver. Thus the proposed method is human-like and effective for scene
generation for driving intelligence tests. Meanwhile, the accident rate
is 14\% (most accidents are rear-end collisions caused by participant 1),
which is higher than in normal driving~\cite{Feng2021}, indicating that the
participants were actively testing the subject vehicle. This aligns
with what we told the drivers before the
test: we wanted them to drive normally as a primary task but also needed
them to test the subject vehicle, which makes our results more reliable.
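For reference, the scores discussed above can be set against a simple
guessing baseline. Assume, purely for illustration, that each of the
\(n=10\) tests is scored as one point for a correct human-or-algorithm
identification and that a participant who cannot tell the two apart guesses
correctly with probability \(p=0.5\); the actual scoring follows the table in
Note S7 and may differ from this simplification. The score \(S\) is then
binomially distributed, with
\begin{equation*}
\mathbb{E}[S]=np=10\times 0.5=5,\qquad
\sqrt{\operatorname{Var}[S]}=\sqrt{np(1-p)}=\sqrt{2.5}\approx 1.58 .
\end{equation*}
The reported mean of 5.2625 and standard deviation of 1.6110 are both close
to this baseline, which is what one would expect if the participants were
largely guessing.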
However, the proposed work still has some drawbacks.
One major drawback reported by the participants is that there are
cases in which driver A acts very indecisively. This is because, when the random
driving style parameter is too small, the algorithm behaves more cautiously than
drivers do in reality. There is also an overshoot problem for
both the participants and the algorithm. Algorithm overshoot usually happens in
the early stage of lane changing because, similarly, the
control part is not constrained: when~\(\alpha_{th}\) is too small,
the driver tries to get away from the obstacle driver. For real
human drivers, the overshoot usually happens in the late stage of lane
changing because, when they accelerate too heavily, they cannot control
the vehicle properly. These cases are rare, however. Hence, we
think the results are generally valid. In future work, we may take the
control level into consideration to generate more human-like behavior.
We will also increase the number of participants and set a baseline
for the test.
\section*{Conclusion}
{\label{949011}}
This paper presents a human-like decision-making algorithm for driving
intelligence tests. The interaction model of road users is first
established using Bayesian game theory. Besides an extremely
conservative choice or an extremely aggressive choice, a probing behavior
can be generated using the proposed method based on the cost and
relative aggressiveness probability. To evaluate the aggressiveness of
the opponent, an observation model is established, and the way to
customize it is given by an experiment. Additionally, the driver's
probing strategy generation method is developed to test the real
aggressiveness of the background vehicle. The strategy is reflected in
the vehicle's behavior through a proposed Markov method. Next, the
proposed methodology is compared with commonly used approaches and the
state-of-the-art literature. The comparison indicates that our method concurs with
previous research while being capable of generating more complex and
human-like behavior. Finally, the human-likeness of our algorithm is
evaluated using the Turing test. The test results indicate that the
participants cannot distinguish human behavior from the behavior
generated by our algorithm.
Although the proposed method is designed for scene generation, it may
shed some light on autonomous driving algorithms as well. One of the
major challenges in autonomous driving is the uncertainty of traffic.
Instead of passively accepting the probability, we may actively take
small steps to reduce the entropy without compromising safety, as
shown in this paper. Current research focuses on prediction accuracy and
learning convergence, which involves a trade-off between
perception and computation burden on the one hand and accuracy on the other:
the more data are available and the more powerful the computer is, the better
the decision can be. In
this way, we may eventually be able to predict the future and thus obtain the
best decision, but this demand is endless. The decision algorithm, as
well as the prediction and aggressiveness estimation methodology in this
paper, is simple and direct, and thus computationally efficient, because we
do not insist on the globally best decision. The same holds for normal
human drivers: when they are confused, they simply try small
steps, which are simple but powerful.
Additionally, the Turing test framework given in this paper might be
applied to the evaluation of autonomous driving algorithms. As many
researchers and manufacturers are developing human-like self-driving
algorithms, this unified and objective method can be used for the
assessment of human-likeness.
Our future work will focus on the human control level. Other human
behaviors, such as distraction and control latency, will be considered
to generate more human-like behavior for autonomous driving tests. Also, the
Markov method will be replaced with a better approximator that can be
even more tightly connected to the strategy. Moreover, a more general
Turing test procedure with more participants might be our focus as well.
\section*{Acknowledgements}
{\label{802803}}
This work was supported in part by the SUG-NAP Grant (No. M4082268.050)
of Nanyang Technological University and the A*STAR Grant of Singapore (No.
1922500046).
\section*{Conflict of interest}
The authors declare no competing interests.
\par\null
\section*{Supporting Information}
{\label{948995}}
Supporting Information is available from the Wiley Online Library or
from the author.
\selectlanguage{english}
\FloatBarrier
\bibliography{bibliography/converted_to_latex.bib%
}
\end{document}