Observations, Actions and Coordination
Coordination
Pymasep supports three types of coordination:
MULTIPLAYER_COORDINATION_TURN : The order of the agents is defined by the AgentOrder system characteristic of the current state. The environment takes into account only the action of the agent whose turn it is, and sends this action to the game to calculate the next state.
MULTIPLAYER_COORDINATION_WAITING_ALL : When this coordination strategy is set, the engine waits until all agents have chosen their action, then calculates the next state. Waiting for the agents does not block the game engine loop.
MULTIPLAYER_COORDINATION_FREE_FOR_ALL : In this strategy, an agent can take an action whenever it wants, and the next state is calculated immediately. If several agents have chosen their action at the same time, all of these actions are sent to the next state calculation.
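The three strategies differ mainly in which pending actions are passed to the next state calculation. The following is a minimal sketch of that selection logic, not Pymasep's actual implementation: the function `actions_to_apply`, its parameters, and the string values of the constants are all hypothetical and only illustrate the behaviour described above.

```python
from typing import Dict, List, Optional

# Names taken from the text; the string values are placeholders.
TURN = "MULTIPLAYER_COORDINATION_TURN"
WAITING_ALL = "MULTIPLAYER_COORDINATION_WAITING_ALL"
FREE_FOR_ALL = "MULTIPLAYER_COORDINATION_FREE_FOR_ALL"


def actions_to_apply(mode: str,
                     pending: Dict[str, str],
                     agent_order: List[str],
                     turn_index: int) -> Optional[Dict[str, str]]:
    """Return the actions to pass to the next state calculation.

    Returns None when no new state should be computed yet; the caller
    simply continues its loop, so nothing here blocks.
    """
    if mode == TURN:
        current = agent_order[turn_index % len(agent_order)]
        # Only the action of the agent whose turn it is counts.
        if current in pending:
            return {current: pending[current]}
        return None
    if mode == WAITING_ALL:
        # Proceed only once every agent has chosen an action.
        if all(agent in pending for agent in agent_order):
            return dict(pending)
        return None
    if mode == FREE_FOR_ALL:
        # Any available action triggers a new state immediately;
        # simultaneous actions are sent together.
        return dict(pending) if pending else None
    raise ValueError(f"unknown coordination mode: {mode}")
```

For example, with `agent_order = ["a", "b"]` and `turn_index = 1`, the turn strategy ignores a pending action from agent "a" and keeps only the one from agent "b".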
One important point for agent coordination is the speed of the engine and interface loops. Make sure that the engine loop does not run faster than the interface loop; otherwise, the engine will send more observations than the interface is able to display. If no interactive agent is present, this is not really a problem, but if an interactive agent is present, it will not be able to act when it wants. The ‘engine.eps’ setting in app.yaml controls the speed of the engine loop, in loops per second.
Observations
In general, at each turn, the Engine sends an Observation to all Interfaces. The Observation contains the part of the environment observable by the agent handled by the interface, plus additional information such as rewards (for RL agents/environments). However, since the engine may run faster than the game turns, a mechanism ensures that the observation is sent only once per environment episode/step to each interface, in order to avoid overloading the interface.
Moreover, to avoid performance issues, Game.next_state() must set the ‘state_changed’ key to True in the additional_information dict of its result for the engine to send the observation. An unmodified state is not sent to the interfaces.
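The two conditions above (state actually changed, and observation not already sent for this step) can be pictured as a small gate in front of each interface. This is only an illustrative sketch: the class `ObservationGate`, its method, and the use of string interface identifiers are hypothetical and do not reflect the engine's internal code. Only the ‘state_changed’ key comes from the text.

```python
from typing import Any, Dict, Set, Tuple


class ObservationGate:
    """Decides whether an interface should receive an observation."""

    def __init__(self) -> None:
        # (interface_id, step) pairs for which an observation was sent.
        self._sent: Set[Tuple[str, int]] = set()

    def should_send(self, interface_id: str, step: int,
                    additional_information: Dict[str, Any]) -> bool:
        # An unchanged state is never sent.
        if not additional_information.get("state_changed", False):
            return False
        key = (interface_id, step)
        # Send at most once per step and per interface, even if the
        # engine loops several times during the same step.
        if key in self._sent:
            return False
        self._sent.add(key)
        return True
```

A game that forgets to set ‘state_changed’ would, under this sketch, never trigger an observation, which matches the requirement stated above.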
Actions
Each action has a type, which is defined by the Game, and is instantiated by the player's Controller. The Pymasep Game class defines two actions:
ACTION_INIT_OBJECT : Action sent by the interface to initialize one or more elements of an object. The parameter is an object whose characteristic values are initialized.
ACTION_INTENTION : Action used to send an intention to the engine, which checks whether the rules permit it. The parameters are other actions (with their own possible parameters).
Other actions can be defined in a specialized Game.
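To make the nesting of ACTION_INTENTION concrete, here is a toy model of an action carrying a type and parameters. It is a sketch under assumptions, not the Pymasep Action class: `ToyAction`, `make_intention`, the constant values, and the game-specific `ACTION_MOVE` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Names from the text; the values are placeholders.
ACTION_INIT_OBJECT = "ACTION_INIT_OBJECT"
ACTION_INTENTION = "ACTION_INTENTION"
# Hypothetical action that a specialized Game could define.
ACTION_MOVE = "ACTION_MOVE"


@dataclass
class ToyAction:
    """An action is a type plus optional parameters."""
    type: str
    parameters: Dict[str, Any] = field(default_factory=dict)


def make_intention(wanted: ToyAction) -> ToyAction:
    """Wrap the action the agent wants to do inside an intention."""
    return ToyAction(ACTION_INTENTION, {"action": wanted})


move = ToyAction(ACTION_MOVE, {"direction": "north"})
intention = make_intention(move)
```

Here the intention does not perform the move; it only carries it, so that the engine can check it against the rules first.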
Intentions and Interface
Intention handling in a game should follow these rules:
The agent/interface sends an intention containing the action it wants to perform.
According to the game rules, the engine returns, in the observation, the agent object containing the intention if the action can be performed in the current state.
With this information, the interface updates the screen to stay in sync with the observation.
If the intention does not respect the rules, the engine keeps the agent's last intention in the observation (if one exists).
If an intention has previously been validated by the game, it can be deleted only at the player's request, not because a new intention failed validation.
These points are guidelines to avoid some bugs with intentions.
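One way to read these guidelines is as a single decision function that the game applies each time an intention arrives. The sketch below assumes intentions can be represented as plain strings and that rule checking is available as a callable; `resolve_intention`, `is_allowed`, and `player_cancels` are hypothetical names, not part of Pymasep.

```python
from typing import Callable, Optional


def resolve_intention(previous: Optional[str],
                      requested: Optional[str],
                      is_allowed: Callable[[str], bool],
                      player_cancels: bool = False) -> Optional[str]:
    """Return the intention to place in the next observation."""
    if player_cancels:
        # Only an explicit request from the player removes an intention.
        return None
    if requested is not None and is_allowed(requested):
        # The rules permit the new intention: it replaces the old one.
        return requested
    # Rejected or absent request: keep the last intention, if any.
    return previous
```

With this shape, a rejected intention can never erase a validated one, which is the kind of bug the guidelines aim to prevent.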