DiscreteMicrogridEnv.step#

DiscreteMicrogridEnv.step(action)[source]#

Run one timestep of the environment’s dynamics.

When the end of the episode is reached, you are responsible for calling reset() to reset the environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters#

action : int

An action provided by the agent.

Returns#

observation : dict[str, list[float]] or np.ndarray, shape self.observation_space.shape

Observations of each module after applying the passed action. observation is a nested dict if flat_spaces is False and a one-dimensional numpy array otherwise.

reward : float

Reward/cost of running the microgrid. A positive value represents revenue; a negative value represents a cost.

done : bool

Whether the episode has terminated. Once done is True, further calls to step() are invalid until reset() is called.

info : dict

Additional information from this step.
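The step/reset contract above follows the classic Gym interface. Below is a minimal, self-contained sketch of an episode loop against that contract; StubDiscreteEnv is a hypothetical stand-in (not part of pymgrid) that mimics the (observation, reward, done, info) return signature so the loop structure is clear without a scenario file.

```python
import numpy as np

class StubDiscreteEnv:
    """Hypothetical stand-in mimicking DiscreteMicrogridEnv's step contract."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self._t = 0

    def reset(self):
        # Restart the episode and return the initial flat observation
        # (the flat_spaces=True case: a one-dimensional numpy array).
        self._t = 0
        return np.zeros(4)

    def step(self, action):
        # Advance one timestep and return (observation, reward, done, info).
        self._t += 1
        observation = np.full(4, float(self._t))
        reward = -1.0 * action          # negative reward models an operating cost
        done = self._t >= self.horizon  # episode ends after `horizon` steps
        return observation, reward, done, {"step": self._t}

env = StubDiscreteEnv()
obs = env.reset()
total_reward = 0.0
done = False
while not done:
    action = 0  # a fixed placeholder action; an agent would choose this
    obs, reward, done, info = env.step(action)
    total_reward += reward

# As documented above, the caller is responsible for calling reset()
# once the episode has ended before stepping again.
obs = env.reset()
```

With the real environment, `env` would instead be a `DiscreteMicrogridEnv` instance and `action` an integer drawn from its discrete action space; the loop itself is unchanged.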