Bidding

When deciding which paths (bundles) to bid on, agents need to consider two factors: the path cost, denoted by \(c\), and the path's current price, denoted by \(p\). Agents’ valuations for bundles, or paths, are relative. For example, \(a_1\) is willing to pay up to one unit more for its shortest path, with \(c=1\), than for a path with \(c=2\) (e.g., a path in which it waits at its starting location, node A, for one turn and only then moves to its goal at node C).
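This relative valuation can be made explicit under the standard assumption (consistent with the best-response rule below) that an agent evaluates a path by the sum of its cost and its price: an agent prefers a path with cost \(c_1\) at price \(p_1\) over a path with cost \(c_2\) at price \(p_2\) whenever
\[
c_1 + p_1 < c_2 + p_2,
\]
so the premium it is willing to pay for the cheaper path is exactly the cost difference \(c_2 - c_1\).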

A reasonable strategy for the agents is the myopic best response strategy \cite{parkes1999bundle}. Under this strategy, agents bid at each iteration on their best bundles given the current prices, and always bid at the current ask price (i.e., they do not raise the price before the auctioneer has raised it). It has been shown that myopic best response is the best strategy for an agent when the other agents also follow it \cite{parkes2001iterative}. In MAPF, an agent’s best response is to bid on all paths that minimize \(c+p\) at their current ask price (\(p\)).
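A minimal sketch of this best-response computation is given below in Python; the path representation, the price lookup, and the function name are illustrative assumptions rather than part of the mechanism.

\begin{verbatim}
def myopic_best_response(candidate_paths, price):
    """Bid on all paths minimizing c + p, each at its current ask price.

    candidate_paths: iterable of (path, cost) pairs (assumed representation).
    price: callable mapping a path to its current ask price p.
    """
    scored = [(path, cost, price(path)) for path, cost in candidate_paths]
    best_total = min(cost + p for _, cost, p in scored)
    # The XOR bid contains every path achieving the minimum total cost,
    # offered at its current ask price (the agent never raises the price
    # itself).
    return [(path, p) for path, cost, p in scored if cost + p == best_total]
\end{verbatim}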

For example, in the first iteration of an auction for the problem shown in Figure \ref{fig:mapf}(a), \(a_1\)’s best response is to bid on its shortest path, \([\langle A,t_1\rangle,\langle C,t_2\rangle]\). \(a_2\)’s best response is to place a \(XOR\) bid on its three shortest paths: \([\langle B,t_1\rangle,\langle A,t_2\rangle,\langle C,t_3\rangle,\langle E,t_4\rangle]\); \([\langle B,t_1\rangle,\langle D,t_2\rangle,\langle C,t_3\rangle,\langle E,t_4\rangle]\); \([\langle B,t_1\rangle,\langle D,t_2\rangle,\langle G,t_3\rangle,\langle E,t_4\rangle]\). Similarly, \(a_3\)’s best response is to bid on all of its shortest paths (paths with \(c=4\)). Note that if an agent bids on paths that end at an earlier time than some of the other agents’ paths, we extend those paths by replicating their last location in the subsequent time steps.
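The path-extension step mentioned above can be implemented directly; the sketch below continues the assumed representation of a path as a list of (location, time) pairs and simply repeats the final location until a common horizon is reached.

\begin{verbatim}
def pad_path(path, horizon):
    """Extend a path to `horizon` time steps by repeating its last location."""
    loc, t = path[-1]
    return path + [(loc, t + i) for i in range(1, horizon - len(path) + 1)]

# Example: padding a_1's shortest path to a horizon of four time steps
# (matching a_2's bids above).
padded = pad_path([("A", 1), ("C", 2)], horizon=4)
# padded == [("A", 1), ("C", 2), ("C", 3), ("C", 4)]
\end{verbatim}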

As prices increase, agents begin to consider longer paths. Specifically, their best response includes all paths that reach the goal with a minimum value of \(c+p\). For example, if all paths with \(c=3\) that \(a_2\) can traverse to reach its goal are priced at 1 (\(p=1\)) while paths with \(c=4\) are priced at 0, then \(a_2\) bids on all paths with \(c=3\) at price 1 and on all paths with \(c=4\) at price 0, as they have the same total cost (\(c+p=4\)).
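Continuing the illustrative sketch above, this tie between the cost-3 and cost-4 paths follows directly from the \(c+p\) comparison (the path names here are placeholders):

\begin{verbatim}
paths  = [("P3", 3), ("P4", 4)]      # a cost-3 and a cost-4 path (placeholders)
prices = {"P3": 1, "P4": 0}          # current ask prices
bid = myopic_best_response(paths, prices.get)
# Both paths have c + p = 4, so both appear in the XOR bid.
assert {path for path, _ in bid} == {"P3", "P4"}
\end{verbatim}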

As the auction progresses, agents consider bidding on paths with higher costs. Importantly, the number of paths available to an agent grows exponentially with the cost considered, since the agent can wait at any of the locations on its lower-cost paths in addition to finding new paths of that cost. This growth in the number of bundles (paths) agents bid on is atypical of auction problems and calls for a compact bid representation. To address this problem, we propose using multi-value decision diagrams (MDDs) \cite{srinivasan1990algorithms}, which Sharon et al. \cite{sharon2012increasing} used in the context of MAPF to represent sets of paths. MDDs grow polynomially with cost rather than exponentially and thus provide a compact representation \cite{sharon2012increasing}.

An MDD for a single agent \(a\) and a given cost \(c\), denoted \(MDD_{a}^c\), represents all possible paths of that cost from the agent’s location to its goal. Agents can thus bid on complete MDDs rather than explicitly enumerating each path (each MDD can be seen as a \(XOR\) bid that includes all the paths it represents). Figure \ref{fig:mapf}(b-d) shows MDDs for the agents’ shortest paths to their goals. For example, \(MDD_{2}^3\) (Figure \ref{fig:mapf}(c)) describes all the paths that \(a_2\) can take from its location to its goal at a cost of 3. Each level of the MDD includes all of the locations that \(a_2\) might occupy at a specific time step when traversing a path of cost 3 to the goal. An edge connects an MDD node to each node at the next level whose location can be reached from the node’s location. For example, at level 1, \(a_2\) could be at node \(A\) or \(D\). From node \(D\) it could move to either node \(C\) or \(G\). To ensure agents do not conflict after they reach their goals, we align the MDDs, extending each to the length of the longest MDD by replicating the goal node (as shown in dashed lines in the figure).
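To make the construction concrete, the sketch below (in the same illustrative Python style as above) builds the levels and edges of \(MDD_{a}^c\) from an adjacency map of the graph, assuming agents may wait in place at each time step; the helper names are ours and not part of the formal definition.

\begin{verbatim}
from collections import deque

def bfs_dists(graph, source):
    """Shortest-path distance from `source` to every reachable location.
    `graph` maps each location to the set of its neighbours."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def build_mdd(graph, start, goal, c):
    """Levels 0..c of MDD_a^c: location v appears at level t iff some
    cost-c path from `start` to `goal` can visit v at time t."""
    d_s, d_g = bfs_dists(graph, start), bfs_dists(graph, goal)
    levels = [
        {v for v in graph
         if d_s.get(v, c + 1) <= t and d_g.get(v, c + 1) <= c - t}
        for t in range(c + 1)
    ]
    # An edge links v at level t to u at level t+1 if u is v (waiting)
    # or a neighbour of v, and u belongs to level t+1.
    edges = [
        {(v, u) for v in levels[t] for u in (graph[v] | {v})
         if u in levels[t + 1]}
        for t in range(c)
    ]
    return levels, edges

def align_mdd(levels, horizon):
    """Pad an MDD to `horizon` time steps by replicating its goal level."""
    return levels + [levels[-1]] * (horizon - (len(levels) - 1))
\end{verbatim}

On the graph of Figure \ref{fig:mapf}(a), this construction should reproduce the MDDs shown in Figure \ref{fig:mapf}(b-d), with \texttt{align\_mdd} adding the dashed goal-node levels.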