Skip to main content

Retry Support for Vendoring and Source Operations

· 3 min read
Erik Osterman
Founder @ Cloud Posse

Atmos now supports automatic retry with exponential backoff for vendoring and source operations. This makes component downloads more resilient to transient network failures, connection resets, and GitHub API rate limits.

What Changed

We've added native retry support to git operations and HTTP downloads used by vendoring and the source provisioner. The retry behavior is configurable per-source using the new retry field:

components:
terraform:
vpc:
source:
uri: github.com/cloudposse/terraform-aws-components//modules/vpc
version: 1.450.0
retry:
max_attempts: 5
initial_delay: 2s
max_delay: 60s
backoff_strategy: exponential

Why This Matters

Network operations are inherently unreliable. Transient failures like connection resets, timeouts, and rate limits can cause vendoring to fail even when there's nothing wrong with your configuration.

Previously, you'd have to manually retry atmos vendor pull or atmos terraform source pull when these failures occurred. Now Atmos handles this automatically with intelligent retry logic:

  • Exponential backoff prevents thundering herd problems
  • Jitter adds randomness to avoid synchronized retries across parallel operations
  • Smart detection only retries transient errors (not auth failures or missing repos)
  • GitHub rate limit awareness waits for rate limit reset when limits are hit

Configuration Options

The retry field supports the following options:

FieldDefaultDescription
max_attempts3Maximum number of download attempts
initial_delay2sInitial delay before first retry
max_delay30sMaximum delay between retries
backoff_strategyexponentialStrategy: constant, linear, or exponential
multiplier2.0Backoff multiplier for exponential/linear strategies
random_jitter0.1Randomness added to delays (0.1 = 10%)
max_elapsed_time5mMaximum total time for all retries

How to Use It

In Source Configuration

Add retry to your component's source configuration:

components:
terraform:
vpc:
source:
uri: github.com/cloudposse/terraform-aws-components//modules/vpc
version: 1.450.0
retry:
max_attempts: 5
initial_delay: 2s
backoff_strategy: exponential

In Vendor Configuration

Add retry to your vendor source specifications:

# vendor.yaml
spec:
sources:
- component: vpc
source: github.com/cloudposse/terraform-aws-components//modules/vpc?ref=1.450.0
retry:
max_attempts: 5
initial_delay: 2s

Default Behavior

When no retry configuration is specified, sensible defaults are used:

FieldDefault
max_attempts3
initial_delay2s
max_delay30s
backoff_strategyexponential
multiplier2.0
random_jitter0.1 (10%)
max_elapsed_time5m

Retryable Errors

The retry logic automatically detects transient errors including:

  • Connection reset / connection refused
  • Timeouts and temporary failures
  • SSL/TLS handshake errors
  • GitHub rate limit exceeded (429 responses)
  • "Remote end hung up unexpectedly" during git operations

Non-retryable errors (like authentication failures or repository not found) fail immediately without retry.

Get Involved

We'd love to hear your feedback! Please open an issue if you have questions or encounter edge cases.

For more details, see the Source-Based Version Pinning design pattern and vendor configuration.