Behavioral control task supervisor with memory based on reinforcement learning for human–multi-robot coordination systems