\subsection{Bidding}

Agents' valuations for bundles, or paths, are relative -- for example, $a_1$ will be willing to pay up to one unit more for its shortest path with $c=1$ than for a path with $c=2$ (for example, a path in which it waits at its starting location for one turn and only then moves to $(1,1)$). However, there is no absolute limit to the amount an agent would be willing to pay for a particular path.

We implemented a myopic best response bidding strategy~\cite{parkes1999bundle} for the agents. Under this strategy, agents bid at each iteration on their best bundles given the current prices, and always place bids at the current ask price (i.e., they do not consider raising the price before it has been raised by the auctioneer). This is a reasonable strategy for agents, and it has been shown that myopic best response is the best strategy for an agent given that the other agents act according to this strategy~\cite{parkes2001iterative}. In MAPF, myopic best response means that agents bid at each iteration on paths that minimize $c+p$, where $p$ denotes the current price of the path in the auction. They bid on these paths at the current ask price.

For example, in the first iteration of an auction solving the problem shown in Figure~\ref{fig:mapf}, $a_1$ would bid on its shortest path, $[\langle(0,1),t_1\rangle,\langle(1,1),t_2\rangle]$. $a_2$ would place a $XOR$ bid on its three shortest paths: $[\langle(0,2),t_1\rangle,\langle(0,1),t_2\rangle,\langle(1,1),t_3\rangle,\langle(2,1),t_4\rangle]$; $[\langle(0,2),t_1\rangle,\langle(1,2),t_2\rangle,\langle(1,1),t_3\rangle,\langle(2,1),t_4\rangle]$; $[\langle(0,2),t_1\rangle,\langle(1,2),t_2\rangle,\langle(2,2),t_3\rangle,\langle(2,1),t_4\rangle]$. Similarly, $a_3$ would bid on all of its shortest paths (paths with $c=4$). Note that if an agent bids on paths that end at an earlier time than some of the other agents' paths, we extend those paths by replicating the agent's last location in the subsequent time steps.

As prices increase, agents begin to consider longer paths. Specifically, their best response should include all paths that reach the goal with a minimum value of $c+p$. For example, if all paths with $c=3$ that $a_2$ can traverse to reach its goal are priced at 1 ($p=1$) and all paths with $c=4$ are priced at 0, then $a_2$ will bid on all paths with $c=3$ at price 1 and on all paths with $c=4$ at price 0, as they have the same total cost ($c+p=4$).

Importantly, the number of paths available to an agent can grow exponentially as the cost increases: the agent can consider waiting at any of the locations on its lower-cost paths, in addition to entirely new paths of the higher cost. This increase in the number of bundles (paths) that agents bid on is atypical of auction problems and poses a representation problem, as well as a computational difficulty for the winner determination problem (which is already computationally hard). To address these problems, agents bid on MDDs that compactly represent their desired paths. Note that the cost used in the MDDs refers to the travel cost ($c$) and does not account for the prices of the MDDs in the auction. Each bid can therefore be represented by the pair $\langle mdd,p_{mdd} \rangle$.
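The best-response computation can be sketched as follows. The Python fragment below is a minimal illustration rather than our implementation: \texttt{enumerate\_paths(agent, c)}, \texttt{price(path)}, and \texttt{agent.shortest\_path\_cost} are assumed interfaces for enumerating the agent's cost-$c$ paths, querying the current ask price, and obtaining the agent's minimal travel cost.

\begin{verbatim}
def myopic_best_response(agent, enumerate_paths, price, max_cost):
    """Collect every path minimizing c + p at the current prices;
    the agent bids on each such path at its current ask price."""
    best_total = float("inf")
    best_bids = []
    for c in range(agent.shortest_path_cost, max_cost + 1):
        if c > best_total:   # prices are non-negative, so no path
            break            # of cost c can achieve a lower c + p
        for path in enumerate_paths(agent, c):
            total = c + price(path)
            if total < best_total:
                best_total, best_bids = total, [(path, price(path))]
            elif total == best_total:
                best_bids.append((path, price(path)))
    return best_bids         # submitted together as a XOR bid
\end{verbatim}

On the running example above, this computation returns both $a_2$'s paths with $c=3$ (at price 1) and its paths with $c=4$ (at price 0), since both classes tie at $c+p=4$.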
Figure~\ref{fig:mapf}(b) shows the MDD for agent $a_2$ with $c=3$. This MDD represents all of the paths that $a_2$ can take from its location to its goal at a cost of 3. Each level of the MDD includes all of the locations that $a_2$ might occupy at a specific time step when traversing a cost-3 path to the goal. An edge connects an MDD node to the locations in the next level that can be reached from the current location. For example, at level 1, $a_2$ could be at either location $(1,2)$ or $(0,1)$. From $(1,2)$ it can move to either $(1,1)$ or $(2,2)$, while from $(0,1)$ it can only move to $(1,1)$ in order to reach the goal at cost 3. As explained above, we align the MDDs to have the same cost by replicating the goal node in subsequent time steps, as shown for agent $a_1$ in Figure~\ref{fig:mapf}(c) (dashed nodes and edges were added to match the length of $a_2$'s MDD).
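To make this construction concrete, the following Python sketch builds the levels and edges of such an MDD and aligns a shorter MDD to a longer one. It assumes a \texttt{neighbors(v)} function that returns the legal moves from location $v$, including waiting at $v$, and that moves are reversible (as on a grid); the function names are illustrative, not part of our system.

\begin{verbatim}
from collections import defaultdict

def build_mdd(start, goal, c, neighbors):
    """Levels and edges of the MDD of all cost-c start-goal paths.
    Level t holds the locations reachable from start in t steps from
    which the goal is still reachable in the remaining c - t steps."""
    fwd = [{start}]                      # forward reachability
    bwd = [{goal}]                       # backward reachability
    for t in range(c):
        fwd.append({u for v in fwd[t] for u in neighbors(v)})
        bwd.append({u for v in bwd[t] for u in neighbors(v)})
    levels = [fwd[t] & bwd[c - t] for t in range(c + 1)]
    edges = defaultdict(set)             # (t, v) -> successors at t+1
    for t in range(c):
        for v in levels[t]:
            edges[(t, v)] = {u for u in neighbors(v)
                             if u in levels[t + 1]}
    return levels, edges

def align_mdd(levels, edges, target_cost):
    """Pad a shorter MDD by replicating its goal node (the dashed
    nodes and edges in the figure) until it spans target_cost steps."""
    goal = next(iter(levels[-1]))        # last level is {goal}
    for t in range(len(levels) - 1, target_cost):
        edges[(t, goal)] = {goal}
        levels.append({goal})
    return levels, edges
\end{verbatim}

For instance, building $a_2$'s MDD with $c=3$ reproduces the levels of Figure~\ref{fig:mapf}(b), and aligning $a_1$'s cost-1 MDD to a target cost of 3 adds the replicated goal nodes shown in Figure~\ref{fig:mapf}(c).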