Environment
smart_control.environment.environment
Controllable building RL environment to interact with TF-Agents.
RL environment where the agent is able to control various setpoints with the goal of making the HVAC system more efficient.
ActionConfig
Configures BaseActionNormalizers for each setpoint.
This class allows the user to configure a BaseActionNormalizer for any device_id/setpoint name tuple.
Only setpoints given as part of this config will be part of the action space.
Example
action_normalizers = { ('boiler_0', 'supply_water_setpoint'): ContinuousBaseActionNormalizer(args) }
This would set a ContinuousBaseActionNormalizer for the supply_water_setpoint setpoint on the device with id boiler_0.
get_action_normalizer
get_action_normalizer(
setpoint_name: FieldName,
) -> Optional[base_normalizer.BaseActionNormalizer]
Returns corresponding action normalizer if it exists.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
setpoint_name | FieldName | Name of setpoint to get action normalizer for. | required |
Environment
Environment(
building: BaseBuilding,
reward_function: BaseRewardFunction,
observation_normalizer: BaseObservationNormalizer,
action_config: ActionConfig,
discount_factor: float = 1,
metrics_path: str | None = None,
num_days_in_episode: int = 3,
device_action_tuples: Sequence[DeviceActionTuple] | None = None,
default_actions: DefaultActions | None = None,
metrics_reporting_interval: float = 100,
label: str = "episode_metrics",
num_hod_features: int = 1,
num_dow_features: int = 1,
occupancy_normalization_constant: float = 0.0,
run_command_predictors: Sequence[BaseRunCommandPredictor] | None = None,
observation_histogram_reducer: HistogramReducer | None = None,
time_zone: str = "US/Pacific",
image_generator: BuildingImageGenerator | None = None,
step_interval: Timedelta = pd.Timedelta(5, unit="minutes"),
writer_factory: BaseWriterFactory | None = None,
)
Bases: PyEnvironment
Controllable building RL environment to interact with TF-Agents.
Environment constructor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
building | BaseBuilding | An implementation of BaseBuilding. | required |
reward_function | BaseRewardFunction | An implementation of BaseRewardFunction. | required |
observation_normalizer | BaseObservationNormalizer | Normalizer parameters for observations. | required |
action_config | ActionConfig | Parameters for actions: min, max, type, etc. | required |
discount_factor | float | Future reward discount, i.e., gamma. | 1 |
metrics_path | str \| None | CNS directory to write environment data. | None |
num_days_in_episode | int | Episode duration in days. | 3 |
device_action_tuples | Sequence[DeviceActionTuple] \| None | List of (device, setpoint) pairs for control. | None |
default_actions | DefaultActions \| None | Initial actions. | None |
metrics_reporting_interval | float | Frequency of TensorBoard metrics. | 100 |
label | str | Episode label prepended to the episode output directory. | 'episode_metrics' |
num_hod_features | int | Number of sin/cos pairs of time features for hour of day. | 1 |
num_dow_features | int | Number of sin/cos pairs of time features for day of week. | 1 |
occupancy_normalization_constant | float | Value used to normalize the occupancy signal. | 0.0 |
run_command_predictors | Sequence[BaseRunCommandPredictor] \| None | Predictors for setting on/off in RunCommands. | None |
observation_histogram_reducer | HistogramReducer \| None | Add histogram reduction to observations. | None |
time_zone | str | Time zone of the building/environment. | 'US/Pacific' |
image_generator | BuildingImageGenerator \| None | Building image generator that generates image encodings from observation responses. | None |
step_interval | Timedelta | Amount of time between environment steps. | Timedelta(5, unit='minutes') |
writer_factory | BaseWriterFactory \| None | Used with metrics_path, factory for metrics writers. | None |
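The num_hod_features and num_dow_features parameters count sin/cos pairs of cyclical time features. The following is a minimal sketch of what such an encoding could look like, not the environment's actual implementation. The function names `cyclical_features` and `time_features` are hypothetical, and the choice of using harmonic k for the k-th pair is an assumption.

```python
import math
from datetime import datetime
from typing import List


def cyclical_features(value: float, period: float, num_pairs: int) -> List[float]:
    """Encodes a position on a cycle as `num_pairs` (sin, cos) pairs.

    Assumption: the k-th pair (1-based) uses the k-th harmonic of the cycle.
    """
    features: List[float] = []
    for k in range(1, num_pairs + 1):
        angle = 2.0 * math.pi * k * value / period
        features.extend([math.sin(angle), math.cos(angle)])
    return features


def time_features(
    timestamp: datetime,
    num_hod_features: int = 1,
    num_dow_features: int = 1,
) -> List[float]:
    """Concatenates hour-of-day and day-of-week cyclical features."""
    hour_of_day = timestamp.hour + timestamp.minute / 60.0
    day_of_week = timestamp.weekday()  # Monday is 0 in the standard library.
    return cyclical_features(hour_of_day, 24.0, num_hod_features) + cyclical_features(
        day_of_week, 7.0, num_dow_features
    )


# Two hour-of-day pairs and one day-of-week pair yield six values.
features = time_features(datetime(2024, 1, 1, 6, 0), num_hod_features=2)
```

Encoding time on a circle keeps 23:55 and 00:00 close together in feature space, which a raw hour value would not.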
all_actions_accepted
Returns true if all single action requests have response code ACCEPTED.
compute_action_regularization_cost
Applies a smoothing cost based on recent action history.
Returns the L2 norm of the actions as a penalty term for large changes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
action_history | Sequence[ndarray] | Sequential array of actions taken in the episode. | required |
Returns:
Type | Description |
---|---|
float | A smoothing cost applied to the reward function to penalize large action changes. |
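A smoothing cost of this kind can be sketched as follows. This is an illustration under stated assumptions, not the method's actual body: it assumes the penalty is the L2 norm of the difference between the two most recent actions, and that the cost is zero when fewer than two actions exist. The name `action_regularization_cost` is hypothetical, and plain float sequences stand in for ndarray.

```python
import math
from typing import Sequence


def action_regularization_cost(action_history: Sequence[Sequence[float]]) -> float:
    """Returns the L2 norm of the change between the last two actions.

    Assumption: with fewer than two actions there is no change to penalize.
    """
    if len(action_history) < 2:
        return 0.0
    previous, latest = action_history[-2], action_history[-1]
    if len(previous) != len(latest):
        raise ValueError("Consecutive actions must have the same length.")
    squared = sum((float(a) - float(b)) ** 2 for a, b in zip(latest, previous))
    return math.sqrt(squared)


# A jump from (0, 0) to (3, 4) has an L2 change of 5.
cost = action_regularization_cost([[0.0, 0.0], [3.0, 4.0]])
```

A caller would typically scale this value by a small weight and subtract it from the reward, so that abrupt setpoint swings are discouraged.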
generate_field_id
Returns a new id not already present in id_map.
Ids are created by joining the device and field: device_field.
If the same device and field are added again, the same id will be returned.
If a different device/field pair generates an id that already exists, an integer is appended to the id to make it unique.
Examples:
>>> generate_field_id(device='a_b', field='c') -> a_b_c
>>> generate_field_id(device='a_b', field='c') -> a_b_c
>>> generate_field_id(device='a', field='b_c') -> a_b_c_1
The first id is a_b_c. The second call is an exact duplicate of the first, so the same id is returned. When the third call is made, because a_b_c is already taken, an int is concatenated and the returned id is a_b_c_1.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
device | DeviceId | Device id. | required |
field | FieldName | Measurement or setpoint name. | required |
id_map | bidict | Current mapping of device fields to ids. | required |
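The id-generation rules described above can be sketched as below. This is a simplified stand-in rather than the library's code: it uses a plain dict keyed by (device, field) instead of a bidict, it records the new id in the map itself (an assumption, since the reference does not say who updates id_map), and it assumes the integer suffix starts at 1 and increases until the id is free.

```python
from typing import Dict, Tuple


def generate_field_id(
    device: str, field: str, id_map: Dict[Tuple[str, str], str]
) -> str:
    """Returns a stable, unique id for a (device, field) pair."""
    key = (device, field)
    # A pair that was seen before keeps its existing id.
    if key in id_map:
        return id_map[key]

    base_id = f"{device}_{field}"
    candidate = base_id
    taken = set(id_map.values())
    suffix = 1
    # A different pair may join to the same string; append an integer
    # until the candidate no longer collides with a stored id.
    while candidate in taken:
        candidate = f"{base_id}_{suffix}"
        suffix += 1

    id_map[key] = candidate  # Assumption: the sketch stores the id itself.
    return candidate


id_map: Dict[Tuple[str, str], str] = {}
first = generate_field_id("a_b", "c", id_map)   # 'a_b_c'
second = generate_field_id("a_b", "c", id_map)  # 'a_b_c' (same pair)
third = generate_field_id("a", "b_c", id_map)   # 'a_b_c_1' (collision)
```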
replace_missing_observations_past
replace_missing_observations_past(
current_observation_response: ObservationResponse,
past_observation_response: Optional[ObservationResponse],
) -> ObservationResponse
Replaces any missing observations with a past ObservationResponse.
Sometimes, the building doesn't report all the observations; however, the agent requires all fields to be populated. When a missing observation is encountered, impute the value from the most recent observation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
current_observation_response | ObservationResponse | Current observations from the building. | required |
past_observation_response | Optional[ObservationResponse] | Use this observation to fill in any missing observations. | required |
Returns:
Type | Description |
---|---|
ObservationResponse | A merged ObservationResponse, with missing values filled in from the past observation. |
Raises: ValueError when a missing observation exists and there is no past observation.
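The imputation logic can be illustrated with a simplified model. In this sketch a dict keyed by (device_id, measurement_name) stands in for ObservationResponse, and a value of None marks a missing observation. Both are assumptions made for illustration, as are the function name `fill_missing_from_past` and the example device and measurement names. The sketch also raises when the past observation lacks the needed value, which goes slightly beyond the documented condition.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for ObservationResponse: None marks a missing value.
Observations = Dict[Tuple[str, str], Optional[float]]


def fill_missing_from_past(
    current: Observations, past: Optional[Observations]
) -> Observations:
    """Returns a copy of `current` with missing values taken from `past`."""
    merged: Observations = {}
    for key, value in current.items():
        if value is not None:
            merged[key] = value
            continue
        # A value is missing: it can only be filled from a past observation.
        if past is None or past.get(key) is None:
            raise ValueError(f"Missing observation {key} and no past value to use.")
        merged[key] = past[key]
    return merged


current = {
    ("vav_1", "zone_air_temperature"): None,  # Not reported this step.
    ("vav_1", "supply_air_flowrate"): 0.4,
}
past = {
    ("vav_1", "zone_air_temperature"): 294.0,
    ("vav_1", "supply_air_flowrate"): 0.3,
}
merged = fill_missing_from_past(current, past)
```

Values present in the current observation always win. The past observation is consulted only for the gaps.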