Local Development#
This guide covers running Scythe workers locally using native processes or Docker Compose.
Prerequisites#
- A Hatchet instance -- either Hatchet Cloud or self-hosted
- An S3-compatible bucket with configured credentials
- Python 3.10+ with uv (recommended) or pip
Environment Setup#
Create a .env file with the required variables:
```shell
# Hatchet
HATCHET_CLIENT_TOKEN=<your-token>

# AWS / S3
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=<your-key>
AWS_SECRET_ACCESS_KEY=<your-secret>

# Scythe Storage
SCYTHE_STORAGE_BUCKET=my-research-bucket
SCYTHE_STORAGE_BUCKET_PREFIX=scythe

# Scythe Timeouts
SCYTHE_TIMEOUT_SCATTER_GATHER_SCHEDULE=10m
SCYTHE_TIMEOUT_SCATTER_GATHER_EXECUTION=20m
SCYTHE_TIMEOUT_EXPERIMENT_SCHEDULE=10m
SCYTHE_TIMEOUT_EXPERIMENT_EXECUTION=2m
```
See the Configuration Reference for the complete list.
Splitting Env Files#
For projects with multiple worker roles, it helps to split configuration into separate files:
```shell
# .env.scythe.simulations -- leaf worker (runs simulations)
SCYTHE_WORKER_DOES_LEAF=True
SCYTHE_WORKER_DOES_FAN=False
SCYTHE_WORKER_SLOTS=1
SCYTHE_TIMEOUT_SCATTER_GATHER_SCHEDULE=10h
SCYTHE_TIMEOUT_SCATTER_GATHER_EXECUTION=10h
SCYTHE_TIMEOUT_EXPERIMENT_SCHEDULE=10h
SCYTHE_TIMEOUT_EXPERIMENT_EXECUTION=30m
```

```shell
# .env.scythe.fanouts -- fan worker (runs scatter/gather)
SCYTHE_WORKER_DOES_LEAF=False
SCYTHE_WORKER_DOES_FAN=True
SCYTHE_WORKER_SLOTS=6
SCYTHE_TIMEOUT_SCATTER_GATHER_SCHEDULE=10h
SCYTHE_TIMEOUT_SCATTER_GATHER_EXECUTION=10h
SCYTHE_TIMEOUT_EXPERIMENT_SCHEDULE=10h
```
Then run workers with multiple env files:
```shell
uv run --env-file .env.aws --env-file .env.hatchet --env-file .env.scythe.storage --env-file .env.scythe.simulations worker
```
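When the same variable appears in more than one file, later `--env-file` flags take precedence, so role-specific files should come last. A minimal sketch of that layering semantic (the merge function and the sample contents are illustrative, not part of Scythe):

```python
# Sketch of layered KEY=VALUE env files: later files override earlier ones,
# which is why role-specific settings are passed after the shared ones.
def merge_env_files(*contents: str) -> dict[str, str]:
    merged: dict[str, str] = {}
    for text in contents:
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            key, _, value = line.partition("=")
            merged[key] = value  # later files win
    return merged

base = "SCYTHE_WORKER_SLOTS=1\nSCYTHE_WORKER_DOES_FAN=False\n"
fanouts = "SCYTHE_WORKER_SLOTS=6\nSCYTHE_WORKER_DOES_FAN=True\n"
print(merge_env_files(base, fanouts))
# {'SCYTHE_WORKER_SLOTS': '6', 'SCYTHE_WORKER_DOES_FAN': 'True'}
```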
Running Natively#
Single Worker (Both Roles)#
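A single process can handle both simulations and scatter/gather. Assuming both role flags are enabled in your `.env` and `main.py` is the worker entry point (as in the Makefile later in this guide), start it with:

```shell
uv run --env-file .env main.py
```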
Separate Leaf and Fan Workers#
In separate terminals:
```shell
# Terminal 1: Leaf worker (simulations)
uv run --env-file .env --env-file .env.scythe.simulations main.py

# Terminal 2: Fan worker (scatter/gather)
uv run --env-file .env --env-file .env.scythe.fanouts main.py
```
Allocating#
In a third terminal:
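Assuming `allocate.py` is your allocation script (as in the Makefile later in this guide), submit work with:

```shell
uv run --env-file .env allocate.py
```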
Using a Makefile#
A Makefile simplifies common operations:
```makefile
.PHONY: install
install:
	@uv sync --all-groups --all-extras

.PHONY: worker
worker:
	@uv run --env-file .env main.py

.PHONY: allocate
allocate:
	@uv run --env-file .env allocate.py
```
Then:
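The targets above cover the full local loop:

```shell
make install   # sync dependencies
make worker    # start a worker
make allocate  # submit work (in a separate terminal)
```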
Docker Compose#
Self-contained example
The scythe-example repository provides a complete Docker Compose setup that bundles Hatchet Lite, LocalStack (S3), and Scythe workers. A single make up command starts everything -- no external Hatchet instance or S3 bucket required. Use it as a starting point for your own projects.
For containerized local development:
Dockerfile#
```dockerfile
ARG PYTHON_VERSION=3.12
FROM python:${PYTHON_VERSION}-slim-bookworm AS main

COPY --from=ghcr.io/astral-sh/uv:0.6.16 /uv /uvx /bin/

WORKDIR /code
COPY uv.lock pyproject.toml README.md /code/
RUN uv sync --locked --no-install-project

COPY experiments /code/experiments/
COPY main.py /code/main.py
RUN uv sync --locked

CMD [ "uv", "run", "main.py" ]
```
Compose File#
```yaml
services:
  worker:
    build:
      context: .
      dockerfile: Dockerfile.worker
      args:
        - PYTHON_VERSION=${PYTHON_VERSION:-3.12}
    env_file:
      - .env
    deploy:
      mode: replicated
      replicas: 1
```
Running#
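Assuming the compose file above is saved under one of Compose's default names (e.g. `compose.yaml`), build and start the worker with:

```shell
docker compose up --build
```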
To scale workers:
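The `--scale` flag overrides the configured replica count for a service; for example, to run four copies of the `worker` service:

```shell
docker compose up --scale worker=4
```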
Separate Roles#
For production-like setups with separate leaf and fan workers:
```yaml
services:
  simulations:
    build:
      context: .
      dockerfile: Dockerfile.worker
    env_file:
      - .env
      - .env.scythe.storage
      - .env.scythe.simulations
    deploy:
      replicas: 4

  fanouts:
    build:
      context: .
      dockerfile: Dockerfile.worker
    env_file:
      - .env
      - .env.scythe.storage
      - .env.scythe.fanouts
    deploy:
      replicas: 1
```
Lab Cluster#
If you have access to a set of machines on a local network (e.g. lab desktops), you can run workers on each one. Each machine needs:
- Network access to your Hatchet instance
- AWS credentials for S3 access
- The worker code and dependencies installed
Since Hatchet handles task distribution, you simply start a worker on each machine and they will all pull tasks from the same queue. This is a cost-effective way to run large experiments overnight using idle compute resources.
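One simple way to fan out over a set of machines is an ssh loop; the hostnames, project path, and log file below are placeholders for your own setup:

```shell
# Start a detached worker on each lab machine (hypothetical hosts and path).
for host in lab-01 lab-02 lab-03; do
  ssh "$host" 'cd ~/scythe-project && nohup uv run --env-file .env main.py > worker.log 2>&1 &'
done
```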
Next Steps#
- See Cloud Deployment for AWS ECS and production infrastructure
- See Workers for detailed worker configuration