Utilities for tracking hosts and ports and load balancing DAGs
airflow-balancer is a utility library for Apache Airflow that tracks host and port usage via YAML files. It enables you to:
- Track Hosts: Define and manage a pool of worker hosts with different capabilities (OS, queues, tags)
- Manage Ports: Track port usage across your host infrastructure to avoid conflicts
- Load Balance: Intelligently select hosts based on queues, operating systems, tags, or custom criteria (see the sketch after this list)
- Integrate with Airflow: Automatically create Airflow pools for each host and port
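Selection by queue is shown in the examples below. For custom criteria, one option is to filter the configured Host models yourself; the sketch below is illustrative only and assumes just the `name`, `os`, `size`, and `queues` fields that appear in the YAML examples later in this README:

```python
from airflow_balancer import Host

# Hosts as they might appear in a BalancerConfiguration (fields taken from
# the YAML examples in this README).
hosts = [
    Host(name="host1", os="ubuntu", size=16, queues=["primary"]),
    Host(name="host3", os="macos", size=8, queues=["workers"]),
]

# A custom criterion: keep only the larger Ubuntu hosts by filtering on the
# model fields directly.
big_ubuntu_hosts = [h for h in hosts if h.os == "ubuntu" and h.size >= 16]
```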
airflow-balancer is tightly integrated with the airflow-laminar ecosystem:
| Library | Integration |
|---|---|
| airflow-pydantic | Core data models (Host, Port, BalancerConfiguration) are defined in airflow-pydantic, providing full Pydantic validation, type checking, and JSON/YAML serialization support (see the sketch below this table) |
| airflow-config | Configuration loading via Hydra for hierarchical configs with defaults, overrides, and environment-specific settings |
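Because the models are ordinary Pydantic models, the usual Pydantic workflow applies to them. The sketch below uses only standard Pydantic v2 methods (`model_dump`, `model_dump_json`, `model_validate`); the Pydantic version is an assumption here, and these are not airflow-balancer-specific APIs:

```python
from airflow_balancer import Host

host = Host(name="host1", os="ubuntu", size=16, queues=["primary"])

# Serialize to JSON and round-trip through a plain dict (standard Pydantic v2).
as_json = host.model_dump_json()
restored = Host.model_validate(host.model_dump())
assert restored.name == "host1"
```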
With airflow-balancer, you can register host and port usage in configuration:
```yaml
_target_: airflow_balancer.BalancerConfiguration
default_username: timkpaine
hosts:
  - name: host1
    size: 16
    os: ubuntu
    queues: [primary]
  - name: host2
    os: ubuntu
    size: 16
    queues: [workers]
  - name: host3
    os: macos
    size: 8
    queues: [workers]
ports:
  - host: host1
    port: 8080
  - host_name: host2
    port: 8793
```

Either via airflow-config or directly, you can then select amongst available hosts for use in your DAGs.
```python
from airflow.providers.ssh.operators.ssh import SSHOperator  # from the Airflow SSH provider

from airflow_balancer import BalancerConfiguration, load

balancer_config: BalancerConfiguration = load("balancer.yaml")

host = balancer_config.select_host(queue="workers")
port = balancer_config.free_port(host=host)

...

operator = SSHOperator(ssh_hook=host.hook(), ...)
```

A viewer for the configuration and the host and port listings is built into the extension, available either from the top bar in the Airflow UI or as a standalone viewer (via the airflow-balancer-viewer CLI).
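The exact command-line interface of the standalone viewer is not shown here; as an assumption, it is pointed at the balancer YAML file (consult `airflow-balancer-viewer --help` for the actual arguments):

```bash
# Hypothetical invocation: open the standalone viewer against a balancer YAML file.
airflow-balancer-viewer balancer.yaml
```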
You can install via pip:

```bash
pip install airflow-balancer
```

For use with Apache Airflow 2.x:

```bash
pip install airflow-balancer[airflow]
```

For use with Apache Airflow 3.x:

```bash
pip install airflow-balancer[airflow3]
```

Or via conda:

```bash
conda install airflow-balancer -c conda-forge
```

The recommended approach is to use airflow-balancer as an extension within your airflow-config configuration:
```yaml
# config/config.yaml
# @package _global_
_target_: airflow_config.Configuration
defaults:
  - extensions/balancer@extensions.balancer
```

```yaml
# config/extensions/balancer.yaml
# @package extensions.balancer
_target_: airflow_balancer.BalancerConfiguration
default_username: airflow
default_key_file: /home/airflow/.ssh/id_rsa
hosts:
  - name: worker1
    size: 16
    os: ubuntu
    queues: [workers]
```

You can then load the configuration in your DAG file and pull the balancer extension from it:

```python
from airflow.providers.ssh.operators.ssh import SSHOperator  # from the Airflow SSH provider

from airflow_config import load_config
config = load_config("config", "config")
balancer = config.extensions["balancer"]
# Select a host and use its SSH hook
host = balancer.select_host(queue="workers")
operator = SSHOperator(ssh_hook=host.hook(), ...)
```

Since the core models are defined in airflow-pydantic, you can leverage its testing utilities:
```python
from airflow_balancer import BalancerConfiguration, Host
from airflow_balancer.testing import pools, variables
from airflow_pydantic import Variable

# Testing with mocked pools
with pools():
    config = BalancerConfiguration(
        hosts=[Host(name="test-host", size=8, queues=["test"])]
    )
    assert config.select_host(queue="test").name == "test-host"

# Using Airflow Variables for credentials
host = Host(
    name="secure-host",
    username="admin",
    password=Variable(key="host_password"),
)
```
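The variables helper imported above can be used in the same way as pools(); the sketch below assumes it is likewise a context manager that mocks Airflow Variable access, which is an assumption based on the pools() pattern rather than documented behavior:

```python
# Assumption: variables() mirrors pools(), mocking Airflow Variable lookups so
# that models referencing a Variable can be exercised without a live Airflow
# metadata database.
with variables():
    host = Host(
        name="secure-host",
        username="admin",
        password=Variable(key="host_password"),
    )
```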
This software is licensed under the Apache 2.0 license. See the LICENSE file for details.

Note
This library was generated using copier from the Base Python Project Template repository.


