
Add OFFLINE_MODE for air-gapped deployments #2263

Draft
alexnorell wants to merge 1 commit into main from feat/offline-mode

Conversation


alexnorell (Contributor) commented on Apr 24, 2026

Summary

  • Adds OFFLINE_MODE env var that blocks all outbound Roboflow API traffic
  • Models and workflows load from local cache only -- no retries, no timeouts
  • Composes with existing feature flags (METRICS_ENABLED, DISABLE_VERSION_CHECK, ACTIVE_LEARNING_ENABLED, SINGLE_TENANT_WORKFLOW_CACHE) rather than adding a parallel guard system
  • Guards at chokepoints: _get_from_url() and post_to_roboflow_api() for synchronous requests, the individual POST functions for fire-and-forget operations, and AutoModel.from_pretrained() and get_model_from_provider() for inference_models model loading
  • Workflow specs fall back to the file cache naturally, via the existing except ConnectionError handler
  • Bumps inference-models to 0.27.3

How it works

Flag composition (env.py): OFFLINE_MODE=True auto-sets METRICS_ENABLED=False, DISABLE_VERSION_CHECK=True, ACTIVE_LEARNING_ENABLED=False, SINGLE_TENANT_WORKFLOW_CACHE=True.
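
A minimal sketch of how this composition might look in env.py -- the flag names come from the PR, but the str2bool helper and the non-offline defaults are assumptions:

# env.py -- sketch only; str2bool and the defaults shown are assumptions
import os

def str2bool(value: str) -> bool:
    return value.strip().lower() in {"true", "1", "yes"}

OFFLINE_MODE = str2bool(os.getenv("OFFLINE_MODE", "False"))

if OFFLINE_MODE:
    # Offline implies: no telemetry, no version pings, no active learning,
    # and workflow cache filenames that do not embed the API key hash.
    METRICS_ENABLED = False
    DISABLE_VERSION_CHECK = True
    ACTIVE_LEARNING_ENABLED = False
    SINGLE_TENANT_WORKFLOW_CACHE = True
else:
    METRICS_ENABLED = str2bool(os.getenv("METRICS_ENABLED", "True"))
    DISABLE_VERSION_CHECK = str2bool(os.getenv("DISABLE_VERSION_CHECK", "False"))
    ACTIVE_LEARNING_ENABLED = str2bool(os.getenv("ACTIVE_LEARNING_ENABLED", "True"))
    SINGLE_TENANT_WORKFLOW_CACHE = str2bool(
        os.getenv("SINGLE_TENANT_WORKFLOW_CACHE", "False")
    )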

Chokepoint guards (roboflow_api.py): _get_from_url() raises ConnectionError and post_to_roboflow_api() raises RoboflowAPIConnectionError, which flow through existing error handlers (e.g. get_workflow_specification falls back to file cache without any OFFLINE_MODE-specific code). Fire-and-forget endpoints (image upload, annotation, monitoring) return empty/no-op.
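
In sketch form, the guard pattern looks like this -- _get_from_url, post_to_roboflow_api, and the exception types are named in the PR, but the bodies and the fire-and-forget helper name here are assumptions:

# roboflow_api.py -- sketch of the chokepoint guards; only the guard logic is from the PR
OFFLINE_MODE = True  # in the real code this is imported from env.py

class RoboflowAPIConnectionError(ConnectionError):
    """Raised when the Roboflow API cannot be reached."""

def _get_from_url(url: str) -> dict:
    if OFFLINE_MODE:
        # ConnectionError flows into existing handlers, e.g. the file-cache
        # fallback in get_workflow_specification -- no offline-specific code there.
        raise ConnectionError(f"OFFLINE_MODE enabled; blocked GET {url}")
    raise NotImplementedError("normal request path elided in this sketch")

def post_to_roboflow_api(endpoint: str, payload: dict) -> dict:
    if OFFLINE_MODE:
        raise RoboflowAPIConnectionError(f"OFFLINE_MODE enabled; blocked POST {endpoint}")
    raise NotImplementedError("normal request path elided in this sketch")

def post_monitoring_event(payload: dict) -> None:  # hypothetical fire-and-forget endpoint
    if OFFLINE_MODE:
        return  # no-op: nothing to send, nothing to retry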

Usage tracking: UsageCollector._offload_to_api(), PlanDetails fetches, and WebRTC lookups all short-circuit silently.
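
The short-circuit is the same one-line check at the top of each path; a sketch, with the class and method names from the PR and the body assumed:

# usage collector -- short-circuit sketch; OFFLINE_MODE comes from env.py
OFFLINE_MODE = True

class UsageCollector:
    def _offload_to_api(self, payload: dict) -> None:
        if OFFLINE_MODE:
            return  # drop the payload silently; no queueing, no retries
        ...  # normal offload path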

inference_models: get_model_from_provider() raises ModelRetrievalError, download_files_to_directory() raises RuntimeError. AutoModel.from_pretrained() has a dedicated attempt_loading_model_from_offline_cache() fallback that scans {INFERENCE_HOME}/models-cache/ for cached packages. Also handles RetryError (API unreachable) with the same offline cache fallback even when OFFLINE_MODE is not set.
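
A sketch of that fallback -- the function name and the models-cache root come from the PR, but the on-disk naming scheme matched here is an assumption for illustration:

# offline cache fallback sketch; directory layout is assumed
import os
from pathlib import Path

INFERENCE_HOME = os.getenv("INFERENCE_HOME", os.path.expanduser("~/.inference"))

class ModelRetrievalError(Exception):
    pass

def attempt_loading_model_from_offline_cache(model_id: str) -> Path:
    cache_root = Path(INFERENCE_HOME) / "models-cache"
    # Scan cached packages for one matching the requested model id.
    for candidate in cache_root.glob("*"):
        if candidate.is_dir() and model_id.replace("/", "_") in candidate.name:
            return candidate
    raise ModelRetrievalError(
        f"{model_id} not found in offline cache at {cache_root}; "
        "run once with network access to populate the cache"
    )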

Customer usage

# docker-compose.yml -- phase 1: cache models with network access
services:
  inference:
    environment:
      ROBOFLOW_API_KEY: "your-key"
      SINGLE_TENANT_WORKFLOW_CACHE: "True"

# docker-compose.yml -- phase 2: run offline
services:
  inference:
    environment:
      OFFLINE_MODE: "True"
    # no ROBOFLOW_API_KEY needed
    # no network needed

Test plan

  • 138 roboflow_api unit tests pass
  • 836 inference_models unit tests pass (35 new)
  • 935 inference core unit tests pass
  • black, isort, flake8 clean
  • End-to-end container test: private model lildc-hardhat/4 loaded from cache with --network none, no API key
  • Workflow spec loaded from file cache offline
  • Uncached model/workflow fails immediately

Commit message

Adds an OFFLINE_MODE environment variable that disables all outbound
network requests to the Roboflow API. Models and workflows load
exclusively from local cache. Designed for air-gapped deployments
where the inference server has no network access.

When OFFLINE_MODE=True:
- METRICS_ENABLED, ACTIVE_LEARNING_ENABLED are set to False
- DISABLE_VERSION_CHECK is set to True
- SINGLE_TENANT_WORKFLOW_CACHE is set to True (drops API key
  hash from workflow cache filenames)
- All Roboflow API calls are blocked at chokepoints
- inference_models loads from the models-cache directory
- Workflow specs fall back to file cache via existing error handlers
- Uncached models/workflows fail immediately with clear errors

Bumps inference-models to 0.27.1.