
Environment#

smart_control.environment.environment #

Controllable building RL environment to interact with TF-Agents.

RL environment in which the agent controls various setpoints with the goal of making the HVAC system more efficient.

ActionConfig #

ActionConfig(action_normalizers: ActionNormalizerMap)

Configures BaseActionNormalizers for each setpoint.

This class allows the user to configure a BaseActionNormalizer for any device_id/setpoint name tuple.

Only setpoints given as part of this config will be part of the action space.

Example

action_normalizers = { ('boiler_0', 'supply_water_setpoint'): ContinuousBaseActionNormalizer(args) }

This would set a ContinuousBaseActionNormalizer for the supply_water_setpoint setpoint on the device with id boiler_0.

get_action_normalizer #

get_action_normalizer(
    setpoint_name: FieldName,
) -> Optional[base_normalizer.BaseActionNormalizer]

Returns corresponding action normalizer if it exists.

Parameters:

Name Type Description Default
setpoint_name FieldName

Name of setpoint to get action normalizer for.

required
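As an illustrative sketch only (MiniActionConfig and FakeNormalizer below are hypothetical stand-ins, not the library's classes), the tuple-keyed configuration and setpoint-name lookup described above might behave like this:

```python
from typing import Optional


class FakeNormalizer:
    """Hypothetical stand-in for a BaseActionNormalizer."""

    def __init__(self, name: str):
        self.name = name


class MiniActionConfig:
    """Maps (device_id, setpoint_name) tuples to normalizers."""

    def __init__(self, action_normalizers: dict):
        self._action_normalizers = action_normalizers

    def get_action_normalizer(self, setpoint_name: str) -> Optional[FakeNormalizer]:
        # Look up by setpoint name alone, matching the documented signature.
        for (_, name), normalizer in self._action_normalizers.items():
            if name == setpoint_name:
                return normalizer
        return None


config = MiniActionConfig(
    {("boiler_0", "supply_water_setpoint"): FakeNormalizer("continuous")}
)
print(config.get_action_normalizer("supply_water_setpoint").name)  # continuous
print(config.get_action_normalizer("unknown"))  # None
```

Only setpoints present in the mapping yield a normalizer; any other name returns None.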

Environment #

Environment(
    building: BaseBuilding,
    reward_function: BaseRewardFunction,
    observation_normalizer: BaseObservationNormalizer,
    action_config: ActionConfig,
    discount_factor: float = 1,
    metrics_path: str | None = None,
    num_days_in_episode: int = 3,
    device_action_tuples: Sequence[DeviceActionTuple] | None = None,
    default_actions: DefaultActions | None = None,
    metrics_reporting_interval: float = 100,
    label: str = "episode_metrics",
    num_hod_features: int = 1,
    num_dow_features: int = 1,
    occupancy_normalization_constant: float = 0.0,
    run_command_predictors: Sequence[BaseRunCommandPredictor] | None = None,
    observation_histogram_reducer: HistogramReducer | None = None,
    time_zone: str = "US/Pacific",
    image_generator: BuildingImageGenerator | None = None,
    step_interval: Timedelta = pd.Timedelta(5, unit="minutes"),
    writer_factory: BaseWriterFactory | None = None,
)

Bases: PyEnvironment

Controllable building RL environment to interact with TF-Agents.

Environment constructor.

Parameters:

Name Type Description Default
building BaseBuilding

An implementation of BaseBuilding.

required
reward_function BaseRewardFunction

An implementation of BaseRewardFunction.

required
observation_normalizer BaseObservationNormalizer

Normalizer parameters for observations.

required
action_config ActionConfig

Parameters for actions: min, max, type, etc.

required
discount_factor float

Future reward discount, i.e., gamma.

1
metrics_path str | None

CNS directory to write environment data.

None
num_days_in_episode int

Episode duration.

3
device_action_tuples Sequence[DeviceActionTuple] | None

List of (device, setpoint) pairs for control.

None
default_actions DefaultActions | None

Initial actions.

None
metrics_reporting_interval float

Frequency of TensorBoard metrics.

100
label str

Episode label prepended to the episode output directory.

'episode_metrics'
num_hod_features int

Number of sin/cos pairs of time features for hour.

1
num_dow_features int

Number of sin/cos pairs of time features for day.

1
occupancy_normalization_constant float

Value used to normalize the occupancy signal.

0.0
run_command_predictors Sequence[BaseRunCommandPredictor] | None

Predictors for setting on/off in RunCommands.

None
observation_histogram_reducer HistogramReducer | None

Add histogram reduction to observations.

None
time_zone str

Time zone of the building/environment.

'US/Pacific'
image_generator BuildingImageGenerator | None

Building image generator that generates image encodings from observation responses.

None
step_interval Timedelta

Amount of time between environment steps.

Timedelta(5, unit='minutes')
writer_factory BaseWriterFactory | None

Used with metrics_path, factory for metrics writers.

None
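The num_hod_features and num_dow_features parameters describe sin/cos pairs of cyclical time features. As an illustration only (the library's exact encoding is not reproduced here), an hour-of-day encoding with one harmonic sin/cos pair per feature could look like:

```python
import math


def hod_features(hour: float, num_hod_features: int = 1) -> list[float]:
    """Hypothetical sin/cos hour-of-day encoding: one harmonic pair per feature.

    Each pair (sin, cos) at harmonic k maps the 24-hour cycle onto the unit
    circle, so midnight and 23:59 produce nearby feature values.
    """
    feats = []
    for k in range(1, num_hod_features + 1):
        angle = 2.0 * math.pi * k * hour / 24.0
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


# At 06:00 the first harmonic is a quarter of the way around the circle.
print(hod_features(6.0))
print(len(hod_features(12.0, num_hod_features=2)))  # 4 features: 2 pairs
```

Day-of-week features would follow the same pattern with a period of 7 instead of 24.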

all_actions_accepted #

all_actions_accepted(action_response: ActionResponse) -> bool

Returns True if all single action requests have response code ACCEPTED.

compute_action_regularization_cost #

compute_action_regularization_cost(action_history: Sequence[ndarray]) -> float

Applies a smoothing cost based on recent action history.

Returns the L2 norm of the actions as a penalty term for large changes.

Parameters:

Name Type Description Default
action_history Sequence[ndarray]

Sequential array of actions taken in the episode.

required

Returns:

Type Description
float

A smoothing cost applied to the reward function to penalize large changes.
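One plausible reading of this penalty (a sketch, not the library's implementation) is the L2 norm of the difference between the two most recent actions, so that large setpoint swings are costly and identical consecutive actions are free:

```python
import math


def action_regularization_cost(action_history) -> float:
    """Hypothetical smoothing penalty: L2 norm of the latest action change.

    action_history is a sequence of equal-length action vectors; with fewer
    than two actions there is no change to penalize.
    """
    if len(action_history) < 2:
        return 0.0
    prev, curr = action_history[-2], action_history[-1]
    return math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr)))


history = [[0.0, 0.0], [3.0, 4.0]]
print(action_regularization_cost(history))  # 5.0
```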

generate_field_id #

generate_field_id(
    device: DeviceId, field: FieldName, id_map: bidict
) -> DeviceFieldId

Returns new Id not already present in id_map.

Ids are created by joining the device and field: device_field.

If the same device and field are added again, the same id will be returned.

If a unique device/field generates the same id as a different device/field, the id will be concatenated with an integer if the id already exists.

Examples:

>>> generate_field_id(device='a_b', field='c') -> a_b_c
>>> generate_field_id(device='a_b', field='c') -> a_b_c
>>> generate_field_id(device='a', field='b_c') -> a_b_c_1

The first id is a_b_c. The second call is an exact duplicate of the first, so the same id is returned. When the third call is made, because a_b_c is already taken, an int is concatenated and the returned id is a_b_c_1.

Parameters:

Name Type Description Default
device DeviceId

Device id.

required
field FieldName

Measurement or setpoint name.

required
id_map bidict

Current mapping of device fields to ids.

required
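The collision rules above can be sketched as follows (using a plain dict keyed by (device, field) in place of the bidict; this is an illustration of the documented behavior, not the library's code):

```python
def generate_field_id(device: str, field: str, id_map: dict) -> str:
    """Sketch of the documented id scheme.

    Joins device and field with an underscore; reuses the id for an exact
    duplicate pair; appends an integer suffix when a different pair would
    collide with an existing id.
    """
    key = (device, field)
    if key in id_map:
        return id_map[key]  # exact duplicate: return the same id
    candidate = f"{device}_{field}"
    taken = set(id_map.values())
    suffix = 1
    field_id = candidate
    while field_id in taken:
        field_id = f"{candidate}_{suffix}"  # collision: concatenate an int
        suffix += 1
    id_map[key] = field_id
    return field_id


ids = {}
print(generate_field_id("a_b", "c", ids))  # a_b_c
print(generate_field_id("a_b", "c", ids))  # a_b_c (duplicate pair, same id)
print(generate_field_id("a", "b_c", ids))  # a_b_c_1 (collision, suffixed)
```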

replace_missing_observations_past #

replace_missing_observations_past(
    current_observation_response: ObservationResponse,
    past_observation_response: Optional[ObservationResponse],
) -> ObservationResponse

Replaces any missing observations with a past ObservationResponse.

Sometimes, the building doesn't report all the observations; however, the agent requires all fields to be populated. When a missing observation is encountered, impute the value from the most recent observation.

Parameters:

Name Type Description Default
current_observation_response ObservationResponse

Current observations from the building.

required
past_observation_response Optional[ObservationResponse]

Use this observation to fill in any missing observations.

required

Returns:

Type Description
ObservationResponse

A merged ObservationResponse, filled in from the past observation.

Raises: ValueError when a missing observation exists and there is no past observation.
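The imputation rule can be sketched with plain dicts standing in for ObservationResponse objects (a simplified illustration, not the library's implementation):

```python
from typing import Optional


def replace_missing_observations(
    current: dict, past: Optional[dict], required_fields: tuple
) -> dict:
    """Fill gaps in the current observation from the most recent past one.

    Raises ValueError if a field is missing and no past value exists,
    mirroring the documented behavior.
    """
    merged = dict(current)
    for field in required_fields:
        if field not in merged:
            if past is None or field not in past:
                raise ValueError(f"No value available for {field}")
            merged[field] = past[field]  # impute from the past observation
    return merged


past = {"zone_temp": 21.5, "supply_temp": 40.0}
current = {"zone_temp": 21.7}  # supply_temp missing this step
print(replace_missing_observations(current, past, ("zone_temp", "supply_temp")))
# {'zone_temp': 21.7, 'supply_temp': 40.0}
```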