DiscreteMicrogridEnv.step#

DiscreteMicrogridEnv.step(action)[source]#

Run one timestep of the environment’s dynamics.

When the end of the episode is reached, you are responsible for calling reset() to reset the environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters#

action : int

An action provided by the agent.

Returns#

observation : dict[str, list[float]] or np.ndarray, shape self.observation_space.shape

Observations of each module after applying the passed action. observation is a nested dict if flat_spaces is False and a one-dimensional numpy array otherwise.

reward : float

Reward/cost of running the microgrid. A positive value represents revenue; a negative value represents a cost.

done : bool

Whether the episode has terminated. Once done is True, further calls to step() are invalid until reset() is called.

info : dict

Additional information from this step.
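The step/reset contract above follows the classic Gym interface. Below is a minimal, self-contained sketch of an episode loop against that contract; StubDiscreteEnv is a hypothetical stand-in (not part of pymgrid) that mimics the (observation, reward, done, info) return signature so the loop structure is clear without a scenario file.

```python
import numpy as np

class StubDiscreteEnv:
    """Hypothetical stand-in mimicking DiscreteMicrogridEnv's step contract."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self._t = 0

    def reset(self):
        # Restart the episode and return the initial flat observation
        # (the flat_spaces=True case: a one-dimensional numpy array).
        self._t = 0
        return np.zeros(4)

    def step(self, action):
        # Advance one timestep and return (observation, reward, done, info).
        self._t += 1
        observation = np.full(4, float(self._t))
        reward = -1.0 * action          # negative reward models an operating cost
        done = self._t >= self.horizon  # episode ends after `horizon` steps
        return observation, reward, done, {"step": self._t}

env = StubDiscreteEnv()
obs = env.reset()
total_reward = 0.0
done = False
while not done:
    action = 0  # a fixed placeholder action; an agent would choose this
    obs, reward, done, info = env.step(action)
    total_reward += reward

# As documented above, the caller is responsible for calling reset()
# once the episode has ended before stepping again.
obs = env.reset()
```

With the real environment, `env` would instead be a `DiscreteMicrogridEnv` instance and `action` an integer drawn from its discrete action space; the loop itself is unchanged.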