
Environment#

smart_control.environment.environment #

Controllable building RL environment to interact with TF-Agents.

RL environment in which the agent controls various setpoints with the goal of making the HVAC system more efficient.

ActionConfig #

ActionConfig(action_normalizers: ActionNormalizerMap)

Configures BaseActionNormalizers for each setpoint.

This class allows the user to configure a BaseActionNormalizer for any device_id/setpoint name tuple.

Only setpoints given as part of this config will be part of the action space.

Example

action_normalizers = { ('boiler_0', 'supply_water_setpoint'): ContinuousBaseActionNormalizer(args) }

This would set a ContinuousBaseActionNormalizer for the supply_water_setpoint setpoint on the device with id boiler_0.

get_action_normalizer #

get_action_normalizer(
    setpoint_name: FieldName,
) -> Optional[base_normalizer.BaseActionNormalizer]

Returns corresponding action normalizer if it exists.

Parameters:

Name Type Description Default
setpoint_name FieldName

Name of setpoint to get action normalizer for.

required
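As an illustrative sketch only (MiniActionConfig and FakeNormalizer below are hypothetical stand-ins, not the library's classes), the tuple-keyed configuration and setpoint-name lookup described above might behave like this:

```python
from typing import Optional


class FakeNormalizer:
    """Hypothetical stand-in for a BaseActionNormalizer."""

    def __init__(self, name: str):
        self.name = name


class MiniActionConfig:
    """Maps (device_id, setpoint_name) tuples to normalizers."""

    def __init__(self, action_normalizers: dict):
        self._action_normalizers = action_normalizers

    def get_action_normalizer(self, setpoint_name: str) -> Optional[FakeNormalizer]:
        # Look up by setpoint name alone, matching the documented signature.
        for (_, name), normalizer in self._action_normalizers.items():
            if name == setpoint_name:
                return normalizer
        return None


config = MiniActionConfig(
    {("boiler_0", "supply_water_setpoint"): FakeNormalizer("continuous")}
)
print(config.get_action_normalizer("supply_water_setpoint").name)  # continuous
print(config.get_action_normalizer("unknown"))  # None
```

Only setpoints present in the mapping yield a normalizer; any other name returns None.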

Environment #

Environment(
    building: BaseBuilding,
    reward_function: BaseRewardFunction,
    observation_normalizer: BaseObservationNormalizer,
    action_config: ActionConfig,
    discount_factor: float = 1,
    metrics_path: str | None = None,
    num_days_in_episode: int = 3,
    device_action_tuples: Sequence[DeviceActionTuple] | None = None,
    default_actions: DefaultActions | None = None,
    metrics_reporting_interval: float = 100,
    label: str = "episode_metrics",
    num_hod_features: int = 1,
    num_dow_features: int = 1,
    occupancy_normalization_constant: float = 0.0,
    run_command_predictors: Sequence[BaseRunCommandPredictor] | None = None,
    observation_histogram_reducer: HistogramReducer | None = None,
    time_zone: str = "US/Pacific",
    image_generator: BuildingImageGenerator | None = None,
    step_interval: Timedelta = pd.Timedelta(5, unit="minutes"),
    writer_factory: BaseWriterFactory | None = None,
)

Bases: PyEnvironment

Controllable building RL environment to interact with TF-Agents.

Environment constructor.

Parameters:

Name Type Description Default
building BaseBuilding

An implementation of BaseBuilding.

required
reward_function BaseRewardFunction

An implementation of BaseRewardFunction.

required
observation_normalizer BaseObservationNormalizer

Normalizer parameters for observations.

required
action_config ActionConfig

Parameters for actions: min, max, type, etc.

required
discount_factor float

Future reward discount, i.e., gamma.

1
metrics_path str | None

CNS directory to write environment data.

None
num_days_in_episode int

Episode duration.

3
device_action_tuples Sequence[DeviceActionTuple] | None

List of (device, setpoint) pairs for control.

None
default_actions DefaultActions | None

Initial actions.

None
metrics_reporting_interval float

Frequency of TensorBoard metrics.

100
label str

Episode label prepended to the episode output directory.

'episode_metrics'
num_hod_features int

Number of sin/cos pairs of time features for hour.

1
num_dow_features int

Number of sin/cos pairs of time features for day.

1
occupancy_normalization_constant float

Value used to normalize the occupancy signal.

0.0
run_command_predictors Sequence[BaseRunCommandPredictor] | None

Predictors for setting on/off in RunCommands.

None
observation_histogram_reducer HistogramReducer | None

Add histogram reduction to observations.

None
time_zone str

Time zone of the building/environment.

'US/Pacific'
image_generator BuildingImageGenerator | None

Building image generator that generates image encodings from observation responses.

None
step_interval Timedelta

Amount of time between environment steps.

Timedelta(5, unit='minutes')
writer_factory BaseWriterFactory | None

Used with metrics_path, factory for metrics writers.

None
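The num_hod_features and num_dow_features parameters describe sin/cos pairs of cyclical time features. As an illustration only (the library's exact encoding is not reproduced here), an hour-of-day encoding with one harmonic sin/cos pair per feature could look like:

```python
import math


def hod_features(hour: float, num_hod_features: int = 1) -> list[float]:
    """Hypothetical sin/cos hour-of-day encoding: one harmonic pair per feature.

    Each pair (sin, cos) at harmonic k maps the 24-hour cycle onto the unit
    circle, so midnight and 23:59 produce nearby feature values.
    """
    feats = []
    for k in range(1, num_hod_features + 1):
        angle = 2.0 * math.pi * k * hour / 24.0
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


# At 06:00 the first harmonic is a quarter of the way around the circle.
print(hod_features(6.0))
print(len(hod_features(12.0, num_hod_features=2)))  # 4 features: 2 pairs
```

Day-of-week features would follow the same pattern with a period of 7 instead of 24.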

all_actions_accepted #

all_actions_accepted(action_response: ActionResponse) -> bool

Returns True if all single action requests have response code ACCEPTED.

compute_action_regularization_cost #

compute_action_regularization_cost(action_history: Sequence[ndarray]) -> float

Applies a smoothing cost based on recent action history.

Returns the L2 norm of the actions as a penalty term for large changes.

Parameters:

Name Type Description Default
action_history Sequence[ndarray]

Sequential array of actions taken in the episode.

required

Returns:

Type Description
float

A smoothing cost applied to the reward function to penalize large changes.
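One plausible reading of this penalty (a sketch, not the library's implementation) is the L2 norm of the difference between the two most recent actions, so that large setpoint swings are costly and identical consecutive actions are free:

```python
import math


def action_regularization_cost(action_history) -> float:
    """Hypothetical smoothing penalty: L2 norm of the latest action change.

    action_history is a sequence of equal-length action vectors; with fewer
    than two actions there is no change to penalize.
    """
    if len(action_history) < 2:
        return 0.0
    prev, curr = action_history[-2], action_history[-1]
    return math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr)))


history = [[0.0, 0.0], [3.0, 4.0]]
print(action_regularization_cost(history))  # 5.0
```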

generate_field_id #

generate_field_id(
    device: DeviceId, field: FieldName, id_map: bidict
) -> DeviceFieldId

Returns new Id not already present in id_map.

Ids are created by joining the device and field: device_field.

If the same device and field are added again, the same id will be returned.

If a unique device/field generates the same id as a different device/field, the id will be concatenated with an integer if the id already exists.

Examples:

>>> generate_field_id(device='a_b', field='c') -> a_b_c
>>> generate_field_id(device='a_b', field='c') -> a_b_c
>>> generate_field_id(device='a', field='b_c') -> a_b_c_1

The first id is a_b_c. The second call is an exact duplicate of the first, so the same id is returned. When the third call is made, because a_b_c is already taken, an int is concatenated and the returned id is a_b_c_1.

Parameters:

Name Type Description Default
device DeviceId

Device id.

required
field FieldName

Measurement or setpoint name.

required
id_map bidict

Current mapping of device fields to ids.

required
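The collision rules above can be sketched as follows (using a plain dict keyed by (device, field) in place of the bidict; this is an illustration of the documented behavior, not the library's code):

```python
def generate_field_id(device: str, field: str, id_map: dict) -> str:
    """Sketch of the documented id scheme.

    Joins device and field with an underscore; reuses the id for an exact
    duplicate pair; appends an integer suffix when a different pair would
    collide with an existing id.
    """
    key = (device, field)
    if key in id_map:
        return id_map[key]  # exact duplicate: return the same id
    candidate = f"{device}_{field}"
    taken = set(id_map.values())
    suffix = 1
    field_id = candidate
    while field_id in taken:
        field_id = f"{candidate}_{suffix}"  # collision: concatenate an int
        suffix += 1
    id_map[key] = field_id
    return field_id


ids = {}
print(generate_field_id("a_b", "c", ids))  # a_b_c
print(generate_field_id("a_b", "c", ids))  # a_b_c (duplicate pair, same id)
print(generate_field_id("a", "b_c", ids))  # a_b_c_1 (collision, suffixed)
```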

replace_missing_observations_past #

replace_missing_observations_past(
    current_observation_response: ObservationResponse,
    past_observation_response: Optional[ObservationResponse],
) -> ObservationResponse

Replaces any missing observations with a past ObservationResponse.

Sometimes, the building doesn't report all the observations; however, the agent requires all fields to be populated. When a missing observation is encountered, impute the value from the most recent observation.

Parameters:

Name Type Description Default
current_observation_response ObservationResponse

Current observations from the building.

required
past_observation_response Optional[ObservationResponse]

Use this observation to fill in any missing observations.

required

Returns:

Type Description
ObservationResponse

A merged ObservationResponse, filled in from the past observation.

Raises: ValueError when a missing observation exists and there is no past observation.
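The imputation rule can be sketched with plain dicts standing in for ObservationResponse objects (a simplified illustration, not the library's implementation):

```python
from typing import Optional


def replace_missing_observations(
    current: dict, past: Optional[dict], required_fields: tuple
) -> dict:
    """Fill gaps in the current observation from the most recent past one.

    Raises ValueError if a field is missing and no past value exists,
    mirroring the documented behavior.
    """
    merged = dict(current)
    for field in required_fields:
        if field not in merged:
            if past is None or field not in past:
                raise ValueError(f"No value available for {field}")
            merged[field] = past[field]  # impute from the past observation
    return merged


past = {"zone_temp": 21.5, "supply_temp": 40.0}
current = {"zone_temp": 21.7}  # supply_temp missing this step
print(replace_missing_observations(current, past, ("zone_temp", "supply_temp")))
# {'zone_temp': 21.7, 'supply_temp': 40.0}
```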