Configuration Reference

Scythe is configured entirely through environment variables, using pydantic-settings for validation and type coercion.

Storage Settings

Configured via ScytheStorageSettings (prefix: SCYTHE_STORAGE_):

| Variable | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| SCYTHE_STORAGE_BUCKET | str | -- | Yes | The S3 bucket name for storing experiment data |
| SCYTHE_STORAGE_BUCKET_PREFIX | str | "scythe" | No | Key prefix within the bucket for all Scythe data |

All experiment data is stored under s3://<BUCKET>/<BUCKET_PREFIX>/.
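As an illustration of how the two storage variables combine into that root location, here is a minimal stdlib-only sketch. The helper name `storage_root` is hypothetical and this is not how ScytheStorageSettings is implemented; it only mirrors the required/default behaviour described in the table.

```python
import os


def storage_root(environ=None):
    """Build the s3://<BUCKET>/<BUCKET_PREFIX>/ root (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # SCYTHE_STORAGE_BUCKET is required, so a missing value is an error.
    bucket = env.get("SCYTHE_STORAGE_BUCKET")
    if not bucket:
        raise KeyError("SCYTHE_STORAGE_BUCKET must be set")
    # SCYTHE_STORAGE_BUCKET_PREFIX is optional and defaults to "scythe".
    prefix = env.get("SCYTHE_STORAGE_BUCKET_PREFIX", "scythe")
    return f"s3://{bucket}/{prefix}/"


print(storage_root({"SCYTHE_STORAGE_BUCKET": "my-research-bucket"}))
# prints: s3://my-research-bucket/scythe/
```

Passing a plain dict instead of reading `os.environ` directly keeps the sketch easy to try without touching the real environment.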

Timeout Settings

Configured via TimeoutSettings (prefix: SCYTHE_TIMEOUT_):

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| SCYTHE_TIMEOUT_SCATTER_GATHER_SCHEDULE | Duration | 1h | Maximum time for a scatter/gather task to be picked up by a worker |
| SCYTHE_TIMEOUT_SCATTER_GATHER_EXECUTION | Duration | 1h | Maximum execution time for a scatter/gather task |
| SCYTHE_TIMEOUT_EXPERIMENT_SCHEDULE | Duration | 1h | Maximum time for an experiment task to be picked up by a worker |
| SCYTHE_TIMEOUT_EXPERIMENT_EXECUTION | Duration | 1m | Maximum execution time for an experiment task |

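Duration values accept Hatchet duration strings such as "30s", "5m", "2h", or "1d". Python timedelta objects are also accepted when the settings are constructed in code rather than read from the environment.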
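To make the string format concrete, the sketch below converts single-unit strings like the examples above into timedelta values. The `parse_duration` helper is hypothetical and covers only the four suffixes shown here; Hatchet's own parser may accept more forms or behave differently.

```python
import re
from datetime import timedelta

# Unit suffixes taken from the examples above; no other suffixes are assumed.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(value):
    """Convert "30s", "5m", "2h" or "1d" into a timedelta (hypothetical helper)."""
    # timedelta objects pass through unchanged.
    if isinstance(value, timedelta):
        return value
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    number, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(number)})


print(parse_duration("5m"))
# prints: 0:05:00
```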

Note

The EXPERIMENT_SCHEDULE and EXPERIMENT_EXECUTION timeouts are defaults used by @ExperimentRegistry.Register(). You can override them per-experiment using the schedule_timeout and execution_timeout parameters of the decorator.

Timeout Guidance

  • Schedule timeouts should be generous -- they cover the time a task waits in the queue. If workers are scaling up from zero, this can take minutes.
  • Execution timeouts should be set based on the expected runtime of your longest simulation, with some headroom.
  • Scatter/gather timeouts should account for the time to run all child tasks, since the parent waits for all children to complete.
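One rough way to turn the scatter/gather guidance above into a number is to assume the children run in "waves" that each fill every available slot. The function below is a back-of-the-envelope heuristic, not something Scythe computes; the function name, the uniform per-task runtime, and the 1.5x headroom factor are all assumptions, and it ignores queueing delay and nested scatter/gather levels.

```python
import math
from datetime import timedelta


def estimate_scatter_gather_timeout(n_tasks, total_slots, per_task, headroom=1.5):
    """Rough execution-timeout estimate for a parent task (hypothetical heuristic)."""
    if n_tasks < 1 or total_slots < 1:
        raise ValueError("n_tasks and total_slots must be positive")
    # Each wave runs up to total_slots children concurrently.
    waves = math.ceil(n_tasks / total_slots)
    # Scale by a headroom factor to absorb slow tasks and retries.
    return per_task * waves * headroom


# 100 four-minute tasks on 8 slots: 13 waves * 4 min * 1.5 headroom.
print(estimate_scatter_gather_timeout(100, 8, timedelta(minutes=4)))
# prints: 1:18:00
```

An estimate like this would suggest raising SCYTHE_TIMEOUT_SCATTER_GATHER_EXECUTION above its 1h default for that workload.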

Worker Settings

Configured via ScytheWorkerConfig (prefix: SCYTHE_WORKER_):

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| SCYTHE_WORKER_NAME | str \| None | Auto-detected | Custom name for the worker (shown in Hatchet dashboard) |
| SCYTHE_WORKER_SLOTS | int \| None | CPU count | Number of concurrent task slots |
| SCYTHE_WORKER_DURABLE_SLOTS | int \| None | 1000 | Number of durable (persistent) task slots |
| SCYTHE_WORKER_HIGH_MEMORY | bool | False | Advertise this worker as having high memory |
| SCYTHE_WORKER_HIGH_CPU | bool | False | Advertise this worker as having high CPU |
| SCYTHE_WORKER_HAS_GPU | bool | False | Advertise this worker as having a GPU |
| SCYTHE_WORKER_DOES_FAN | bool | True | Whether this worker handles scatter/gather tasks |
| SCYTHE_WORKER_DOES_LEAF | bool | True | Whether this worker handles experiment (leaf) tasks |

Worker Name Auto-Detection

If SCYTHE_WORKER_NAME is not set, the name is generated based on the hosting environment:

| Environment Variable | Detected Environment | Name Pattern |
| --- | --- | --- |
| AWS_BATCH_JOB_ARRAY_INDEX | AWS Batch | ScytheWorker--AWSBatch<index> |
| COPILOT_ENVIRONMENT_NAME | AWS Copilot | ScytheWorker--AWSCopilot<ENV> |
| FLY_REGION | Fly.io | ScytheWorker--Fly<REGION> |
| (none) | Local | ScytheWorker--Local |
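The detection rules above can be sketched as a simple chain of environment checks. The function name `default_worker_name` is hypothetical, and checking the variables in the table's row order is an assumption; the actual precedence inside ScytheWorkerConfig may differ.

```python
import os


def default_worker_name(environ=None):
    """Pick a worker name from the hosting environment (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # An explicit SCYTHE_WORKER_NAME always wins over auto-detection.
    if env.get("SCYTHE_WORKER_NAME"):
        return env["SCYTHE_WORKER_NAME"]
    # Assumption: variables are checked in the order the table lists them.
    if "AWS_BATCH_JOB_ARRAY_INDEX" in env:
        return f"ScytheWorker--AWSBatch{env['AWS_BATCH_JOB_ARRAY_INDEX']}"
    if "COPILOT_ENVIRONMENT_NAME" in env:
        return f"ScytheWorker--AWSCopilot{env['COPILOT_ENVIRONMENT_NAME']}"
    if "FLY_REGION" in env:
        return f"ScytheWorker--Fly{env['FLY_REGION']}"
    # No hosting marker found: treat the worker as local.
    return "ScytheWorker--Local"


print(default_worker_name({"FLY_REGION": "iad"}))
# prints: ScytheWorker--Flyiad
```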

Slot Calculation

If SCYTHE_WORKER_SLOTS is not set:

  • Machines with < 8 CPUs: slots = cpu_count
  • Machines with >= 8 CPUs: slots = cpu_count - 1
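The rule above amounts to reserving one core on larger machines. A minimal sketch, assuming a hypothetical `default_slots` helper and a fallback of one CPU when the count cannot be determined:

```python
import os


def default_slots(cpu_count=None):
    """Compute the default slot count from the CPU count (hypothetical helper)."""
    # os.cpu_count() may return None; falling back to 1 is an assumption.
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Fewer than 8 CPUs: use them all. Otherwise leave one core free.
    return cpus if cpus < 8 else cpus - 1


print(default_slots(4), default_slots(8), default_slots(16))
# prints: 4 7 15
```

Note that 7-CPU and 8-CPU machines both end up with 7 slots under this rule.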

Hatchet Settings

These are standard Hatchet SDK environment variables (not Scythe-specific):

| Variable | Description |
| --- | --- |
| HATCHET_CLIENT_TOKEN | Authentication token for the Hatchet instance |
| HATCHET_CLIENT_TLS_STRATEGY | TLS strategy (none, tls, mtls) |

See the Hatchet SDK documentation for the full list of client configuration options.

AWS Settings

Standard AWS SDK environment variables used by boto3 for S3 access:

| Variable | Description |
| --- | --- |
| AWS_REGION | AWS region for the S3 bucket |
| AWS_ACCESS_KEY_ID | AWS access key |
| AWS_SECRET_ACCESS_KEY | AWS secret key |
| AWS_SESSION_TOKEN | (Optional) Session token for temporary credentials |
| AWS_PROFILE | (Optional) Named profile from ~/.aws/credentials |

Workers running on AWS (ECS, EC2, Lambda) can use IAM roles instead of explicit credentials.

Example .env File

.env
```bash
# Hatchet
HATCHET_CLIENT_TOKEN=eyJhbGciOiJFUzI1NiIs...

# AWS
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...

# Storage
SCYTHE_STORAGE_BUCKET=my-research-bucket
SCYTHE_STORAGE_BUCKET_PREFIX=scythe

# Timeouts
SCYTHE_TIMEOUT_SCATTER_GATHER_SCHEDULE=10m
SCYTHE_TIMEOUT_SCATTER_GATHER_EXECUTION=20m
SCYTHE_TIMEOUT_EXPERIMENT_SCHEDULE=10m
SCYTHE_TIMEOUT_EXPERIMENT_EXECUTION=5m

# Worker (optional -- defaults are usually fine for local dev)
# SCYTHE_WORKER_DOES_LEAF=True
# SCYTHE_WORKER_DOES_FAN=True
# SCYTHE_WORKER_SLOTS=4
# SCYTHE_WORKER_HAS_GPU=False
```