Making Home-Assistant Distributed, Modular State Machine Part 2

Minimalist Diagram of Home-Assistant Distributed

Home-Assistant Distributed: Deployment with Ansible

Why Ansible

Ansible is an open-source configuration management and application deployment tool; we use Ansible for all of our server configuration management in our home lab. It uses human-readable YAML configuration files that define the tasks, variables, and files to be deployed to a machine, making it fairly easy to pick up and work with and, when written correctly, idempotent (running the same playbook again should have no side effects) – perfect for the deployment of our Home-Assistant Distributed Cluster.

Ansible uses inventory files that define the groups that hosts are members of. For this project, I created a group under our home lab parent group named “automation-cluster” with child groups for each slave instance, e.g. tracker, hubs, settings, etc. Using this pattern facilitates scalability: whether the cluster is deployed to a single host or spread across multiple hosts makes no difference to the playbook.
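
As a rough sketch, the inventory might look something like this in Ansible's YAML inventory format (the host and group names here are placeholders, not the exact ones from our lab):

# inventory/homelab.yml – illustrative hosts and groups
all:
  children:
    automation-cluster:
      children:
        tracker:
          hosts:
            ha-node-01.lab.local:
        hubs:
          hosts:
            ha-node-01.lab.local:
        settings:
          hosts:
            ha-node-02.lab.local: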

Role: automation-cluster

Ansible operates using roles. A role defines the set of tasks to run, along with the variables, files, and file templates needed to configure the target host.

Tasks

The automation-cluster role is fairly simple (a sketch follows the list):

  1. Clone the git repository for each slave instance being deployed
  2. Generate a secrets.yaml file specific to each instance, providing only the secrets that instance needs
  3. Fix the file modes and ownership of the configuration directory for the service user account
  4. Run instance-specific tasks, e.g. create the plex.conf file for the media instance
  5. Generate the Docker container configurations for the instances being deployed
  6. Pull the Home-Assistant Docker image and run a Docker container for each slave instance
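
A minimal sketch of what a couple of these tasks might look like – the module names (git, template) are standard Ansible, but the paths and variable names are illustrative rather than copied from my role:

# roles/automation-cluster/tasks/main.yml (excerpt, illustrative)
- name: Clone the configuration repository for each instance
  git:
    repo: "{{ item.configuration_repo }}"
    dest: "{{ configuration_base_path }}/{{ item.name }}"
  with_items: "{{ instance_arguments }}"

- name: Generate a secrets.yaml specific to each instance
  template:
    src: secrets.yaml.j2
    dest: "{{ configuration_base_path }}/{{ item.name }}/secrets.yaml"
    owner: "{{ service_user }}"
    mode: "0640"
  with_items: "{{ instance_arguments }}"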

Variables (vars)

To make it easy to add new instances to the automation-cluster role in the future, I created a list variable named instance_arguments. Each element of the list represents one instance with the following properties (an illustrative example follows the list):

  • Name – The name of the instance which matches the automation-cluster child group name mentioned above
  • Description – The description to be applied to the Docker container
  • Port – The port to be exposed by the Docker container
  • Host – The host name or IP used for connecting the cluster instances together
  • Secrets – The list of Ansible variables to be inserted into the secrets.yaml file
  • Configuration Repo – The URL to the git repository for the instance configuration
  • Devices – Any devices to be mounted in the Docker container
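
A hypothetical entry in instance_arguments might look like the following; the property names and values are my illustration of the shape described above, not an exact copy of my vars file:

# roles/automation-cluster/vars/main.yml (illustrative)
instance_arguments:
  - name: hubs
    description: "Home-Assistant hubs instance (Z-Wave, Zigbee, Insteon)"
    port: 8124
    host: 192.168.1.20
    secrets:
      - zwave_network_key
    configuration_repo: "https://github.com/example/home-assistant-hubs-config.git"
    devices:
      - /dev/ttyACM0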

The other variables defined are used for populating template files and the secrets.yaml files.

Defaults

In Ansible, defaults are the lowest-precedence variables, meant to be overridden by group variables, host variables, or variables passed on the command line.
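
For example, the defaults might pin the Docker image and base paths that group or host variables can then override (the values here are illustrative):

# roles/automation-cluster/defaults/main.yml (illustrative)
home_assistant_image: "homeassistant/home-assistant:latest"
configuration_base_path: /opt/home-assistant
service_user: home-assistant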

Templates

Templates are files that are rendered via the Jinja2 templating engine. They are perfect for files that differ between groups or hosts based on Ansible variables. For secrets.yaml, my template looks like this:

#Core Config Variables
{% for secret in core_secrets %}
{{ secret }}: {{ hostvars[inventory_hostname][secret] }}
{% endfor %}
#Instance Config Variables
{% for secret in item.secrets %}
{{ secret }}: {{ hostvars[inventory_hostname][secret] }}
{% endfor %}

There is an Ansible list variable named core_secrets that contains the names of variables used in all slave instances, e.g. home latitude, home longitude, time zone, etc. The secrets variable is defined in each automation-cluster child group and is a list of the names of the variables that instance requires. This allows you to keep the secrets.yaml file out of the configuration repository, so an API key or password cannot be accidentally committed and become publicly available.

In the template, you’ll see hostvars[inventory_hostname][secret]. Since the loop variable, secret, is simply the name of an Ansible variable, we need to retrieve the actual value of that variable. The hostvars dictionary is keyed by hostname and contains all of the variables and facts currently known for each host, so hostvars[inventory_hostname] is a dictionary, keyed by variable name, of all of the variables that apply to the host currently being run against. In the end, hostvars[inventory_hostname][secret] returns the value of the variable whose name is stored in secret.
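
Tying it together, the two lists (which feed the template's core_secrets and item.secrets loops) and a rendered result might look roughly like this; the variable names and values are illustrative:

# group_vars/automation-cluster.yml – secrets every instance needs (illustrative)
core_secrets:
  - home_latitude
  - home_longitude
  - time_zone

# group_vars/tracker.yml – secrets only the tracker instance needs (illustrative)
secrets:
  - owntracks_api_key

# Rendered secrets.yaml for the tracker instance (values pulled from host/group vars)
home_latitude: 40.7128
home_longitude: -74.0060
time_zone: America/New_York
owntracks_api_key: abc123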

Handlers (WIP)

Handlers are tasks that only run when notified by another task, typically at the end of a play – perfect for restarting services, or for anything that should only happen when something actually changed. For the automation-cluster, a change to an instance configuration or secrets.yaml from a previous run should trigger a restart of its Docker container, loading the changes into the state machine.
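
Once implemented, the handler itself might look something like the following; docker_container is the standard Ansible module, while the container naming and image variable are illustrative. The configuration and secrets tasks would then carry a matching notify so the restart only fires on change:

# roles/automation-cluster/handlers/main.yml (illustrative)
- name: restart home-assistant containers
  docker_container:
    name: "home-assistant-{{ item.name }}"
    image: "{{ home_assistant_image }}"
    state: started
    restart: true
  with_items: "{{ instance_arguments }}"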

Role: Docker

Our Docker role simplifies the deployment of Docker containers on the target host by reducing boilerplate code and is a dependency of the automation-cluster role.
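
As a sketch, that dependency would be declared in the automation-cluster role's meta file; the role name here matches how I refer to it above, but treat it as illustrative:

# roles/automation-cluster/meta/main.yml (illustrative)
dependencies:
  - role: docker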

Gotchas Encountered

Home-Assistant deprecated API password authentication back in version 0.77 and now uses a more secure authentication framework that relies on refresh tokens for each user. In order to connect to the API from external services or, in our case, from the other instances, you must create a long-lived access token (used as a bearer token). Because of this, the main (aggregation) instance must be deployed after all of the other instances are deployed and a long-lived access token has been created for each of them. It is possible to deploy all instances on the first play, but the tokens will need to be added to the variables and the play re-run to regenerate the secrets.yaml file for the main instance.

If you are deploying multiple instances on the same host, you will have to deal with intercommunication between the instances. There are a few different solutions here:

  • Set the host for each instance in the instance_arguments variable to the Docker bridge gateway address
  • Add an entry to the /etc/hosts file in the necessary containers that aliases the Docker bridge gateway address to a resolvable hostname
  • Set the http.server_port option in each instance configuration to a discrete port and run the Docker containers in host network mode
  • Create a Docker network and deploy the instances on that network with a static IP set for each instance (sketched below)
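
A minimal sketch of that last option using Ansible's docker_network and docker_container modules – the subnet and addressing are made up for illustration:

# Illustrative: user-defined network with a static IP per instance
- name: Create the automation-cluster Docker network
  docker_network:
    name: automation-cluster
    ipam_config:
      - subnet: 172.28.0.0/24

- name: Run a Home-Assistant container per instance on that network
  docker_container:
    name: "home-assistant-{{ item.name }}"
    image: "{{ home_assistant_image }}"
    networks:
      - name: automation-cluster
        ipv4_address: "{{ item.host }}"   # item.host would be an address inside the subnet above
    published_ports:
      - "{{ item.port }}:8123"
  with_items: "{{ instance_arguments }}"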

Next Steps

  • Unlink all Z-Wave and Zigbee devices from my current setup and link them on the host housing the hubs instance
  • Remove the late-night /etc/hosts file hack and deploy the containers to a Docker network or host network mode
  • Implement handlers to restart or recreate the containers based on task results

Making Home-Assistant Distributed, Modular State Machine Part 1

Minimalistic Example of Home-Assistant Distributed

Why Make Home-Assistant Distributed

Home-Assistant is one of the largest open-source projects on GitHub, integrating with almost every IoT device, mesh protocol, internet platform, and more; it is highly extensible at its core, and easily configured. However, once your system reaches a certain size, many of its limitations become apparent – I believe making Home-Assistant distributed will resolve many of these limitations.

The biggest and most annoying limitation is the inability to change almost any of the configuration without restarting the system. This is compounded when Home-Assistant is your Z-Wave or Zigbee controller, because the mesh network gets re-initialized just because you added a new component or a new input slider for automations. Anecdotally, I have also noticed that integrations which rely on polling slow down as the number of services that need to update states increases, i.e. hundreds of devices across a multitude of platforms.

Separation of concerns is one engineering principle that facilitates maintainability. Home-Assistant has two protocols that facilitate multiple-instance communication: web-sockets and MQTT. MQTT (Message Queuing Telemetry Transport) is a lightweight pub/sub system used by many IoT devices and platforms. Home-Assistant does include an MQTT auto-discovery component, but it requires a decent amount of upfront planning and is not the easiest to work with. Home-Assistant’s web-socket API, on the other hand, is well documented and fairly easy to work with, especially given the custom component by Lukas Hetzenecker.

The delineation between modules is largely up to you; I decided to run the following instances:

  • Hubs – Z-Wave, Zigbee, and Insteon
  • Media – Televisions, set-top boxes, and receivers
  • Tracker – LAN-based presence detection, zones, and GPS tracking
  • Security – Security cameras, and alarm control panels
  • Communications – Notifiers, text-to-speech, and voice assistant intents
  • Appliances – Vacuums, litter box, and smart furniture
  • Planning – Shopping lists, calendars, and To-Do lists
  • Settings – Inputs for controlling, and configuring rules
  • AI – Image processing, face detection, and license plate recognition
  • Environmental – Weather, air quality, and neighborhood news
  • Main – Primary UI, people, and history recording

The main instance implements the Home-Assistant Remote Instance custom component, configured to connect to the IP and port of each of the slave instances. If you haven’t moved your automations/rules out of Home-Assistant’s YAML-based automation components, this is the time to investigate Node-RED or AppDaemon; Home-Assistant is an amazing aggregation of hundreds of integrations, but it is a state machine, and good engineering principles dictate atomic purpose. Following the philosophy of atomic purpose means everything handles one, and only one, purpose very well. YAML-based automations, while probably decent enough for most end-users, cannot compete with writing real applications around your state machine, and they break the atomic purpose philosophy by combining the state machine with the rules engine.
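
For reference, pointing the main instance at a couple of slaves looks roughly like this in configuration.yaml; I'm paraphrasing the remote_homeassistant component's options (host, port, access_token) from memory, so verify the exact schema against the component's README:

# configuration.yaml on the main instance (illustrative – verify against the component's docs)
remote_homeassistant:
  instances:
    - host: 192.168.1.20   # hubs instance
      port: 8124
      access_token: !secret hubs_token
    - host: 192.168.1.21   # tracker instance
      port: 8125
      access_token: !secret tracker_token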

You have several options for deploying a Home-Assistant distributed state machine: install the instances side-by-side on the same box using Python virtual environments, each configured on a different port; use Docker Compose, which allows you to describe all of the containers that make up the application; or use a configuration management system, like Ansible or Salt Stack, to deploy however you wish (Docker containers, separate virtual machines, etc.).
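
As an example of the Docker Compose route, a stripped-down compose file for two instances might look like this (the ports and config paths are placeholders):

# docker-compose.yml (illustrative)
version: "3"
services:
  hass-hubs:
    image: homeassistant/home-assistant:latest
    volumes:
      - ./hubs:/config
    ports:
      - "8124:8123"
    restart: unless-stopped
  hass-tracker:
    image: homeassistant/home-assistant:latest
    volumes:
      - ./tracker:/config
    ports:
      - "8125:8123"
    restart: unless-stopped
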

As of this post, I have started splitting up the configurations (see repositories below) and am waiting for a good time to unpair all of my Z-Wave and Zigbee devices from the controller – my family has grown quite used to the automations so this needs to happen when everyone is out or asleep. We currently use MaaS to provision our virtual machines; my current plan is to spin up an instance for AI, as it is fairly process-intensive, and either: one for the remaining slaves and one for main, or run the rest on one instance (will try both to see if there are any performance issues running so many instances on one box). Each instance will be deployed as a Docker container via Ansible playbooks. All code related to my Home-Assistant Distributed State-Machine will be hosted on GitHub: