DiscreteMicrogridEnv.step
- DiscreteMicrogridEnv.step(action)
Run one timestep of the environment’s dynamics.
When the end of the episode is reached, you are responsible for calling reset() to reset the environment’s state.
Accepts an action and returns a tuple (observation, reward, done, info).
Parameters
- action : int
An action provided by the agent.
Returns
- observation : dict[str, list[float]] or np.ndarray, shape self.observation_space.shape
Observations of each module after using the passed action. observation is a nested dict if flat_spaces is False and a one-dimensional numpy array otherwise.
- reward : float
Reward/cost of running the microgrid. A positive value implies revenue while a negative value is a cost.
- done : bool
Whether the episode has terminated.
- info : dict
Additional information from this step.
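A minimal usage sketch of the step loop described above. It assumes the pymgrid package and its DiscreteMicrogridEnv.from_scenario constructor for loading a benchmark microgrid; adapt the environment construction to however your microgrid is instantiated:

```python
from pymgrid.envs import DiscreteMicrogridEnv

# Load a benchmark microgrid (from_scenario and microgrid_number are assumed here;
# substitute your own environment construction if needed).
env = DiscreteMicrogridEnv.from_scenario(microgrid_number=4)

obs = env.reset()
done = False
total_reward = 0.0

while not done:
    action = env.action_space.sample()          # a random valid discrete action
    obs, reward, done, info = env.step(action)  # advance the microgrid one timestep
    total_reward += reward                      # positive reward = revenue, negative = cost

print(f"Episode return: {total_reward:.2f}")
```

Because step returns done rather than raising once the episode ends, the caller is responsible for checking it and calling reset() before stepping again.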