In the near future, teams of heterogeneous robots are expected to increase the capabilities, benefits, and robustness of a wide range of applications that require coordination. Examples include lunar exploration, search and rescue, construction sites, and coordinated mapping. The heterogeneity of agents can be captured by adapted reward signals from the environment, by different models of actions, and by individual decisions that rely on locally measurable internal resource states. These resource states correspond to continuous proprioceptive signals such as battery level, remaining operation time, remaining memory, or remaining hard-drive space.
Decision-theoretic planning provides a robust framework for decision-making under uncertainty. Although multiagent problems have received serious attention, existing formulations and algorithms do not exploit the explicit structure of local resource constraints. This prevents them from solving problems in which individual resources determine the configuration of the multiagent system.
Using a dynamic partitioning of the resource state spaces, we tackle the computational blowup introduced by continuous variables. We detail an algorithm that aggregates continuous regions in which one agent's strategy with respect to the rest of the team is identical. The algorithm considers far fewer points than a naive approach, while approximating the true value functions with controllable precision. We report on the application of our algorithm to a range of multi-robot exploration problems under continuous resource constraints.
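
To make the idea of aggregating regions with identical strategies concrete, the following is a minimal illustrative sketch, not the algorithm detailed in the paper. It assumes a single resource normalized to [0, 1] and one agent choosing between two hypothetical actions, `explore` and `return_to_base`, whose value curves are invented for the example. The helper names `best_action`, `partition`, and `merge` are likewise assumptions. The sketch bisects an interval only where the best action changes, so evaluations concentrate near strategy boundaries, and `tol` bounds the error on each boundary location.

```python
from typing import Callable, Dict, List, Tuple

# A region is (low, high, best_action) over the continuous resource axis.
Region = Tuple[float, float, str]
QFunctions = Dict[str, Callable[[float], float]]


def best_action(q: QFunctions, r: float) -> str:
    """Return the action with the highest value at resource level r."""
    return max(sorted(q), key=lambda a: q[a](r))


def merge(regions: List[Region]) -> List[Region]:
    """Aggregate adjacent regions that share the same best action."""
    merged: List[Region] = []
    for lo, hi, action in regions:
        if merged and merged[-1][2] == action:
            merged[-1] = (merged[-1][0], hi, action)
        else:
            merged.append((lo, hi, action))
    return merged


def partition(q: QFunctions, lo: float, hi: float, tol: float = 1e-3) -> List[Region]:
    """Bisect [lo, hi] only where the best action differs between samples.

    Caveat (assumption of this sketch): an interval is treated as uniform when
    both endpoints and the midpoint agree, so a strategy that switches twice
    between samples would be missed.
    """
    mid = 0.5 * (lo + hi)
    a_lo, a_mid, a_hi = (best_action(q, x) for x in (lo, mid, hi))
    if hi - lo <= tol or a_lo == a_mid == a_hi:
        return [(lo, hi, a_mid)]
    return merge(partition(q, lo, mid, tol) + partition(q, mid, hi, tol))


# Hypothetical value curves: exploring pays off only with enough battery left.
q_example: QFunctions = {
    "explore": lambda r: 10.0 * r - 2.0,
    "return_to_base": lambda r: 1.0,
}

regions = partition(q_example, 0.0, 1.0, tol=1e-3)
for low, high, action in regions:
    print(f"[{low:.4f}, {high:.4f}] -> {action}")
```

In this toy setting the two curves cross at a resource level of 0.3, so the partition collapses to two regions whose shared boundary lies within `tol` of that point, using far fewer evaluations than a uniform grid of spacing `tol` would need.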