
Add OFFLINE_MODE for air-gapped deployments #2263

Draft
alexnorell wants to merge 1 commit into main from feat/offline-mode

Conversation


alexnorell (Contributor) commented on Apr 24, 2026

Summary

  • Adds OFFLINE_MODE env var that blocks all outbound Roboflow API traffic
  • Models and workflows load from local cache only -- no retries, no timeouts
  • Composes with existing feature flags (METRICS_ENABLED, DISABLE_VERSION_CHECK, ACTIVE_LEARNING_ENABLED, SINGLE_TENANT_WORKFLOW_CACHE) rather than adding a parallel guard system
  • Guards at chokepoints: _get_from_url() and post_to_roboflow_api() for synchronous requests, the individual POST functions for fire-and-forget operations, and AutoModel.from_pretrained() and get_model_from_provider() for inference_models model loading
  • Workflow specs fall back to the file cache naturally, via the existing except ConnectionError handler
  • Bumps inference-models to 0.27.3

How it works

Flag composition (env.py): OFFLINE_MODE=True auto-sets METRICS_ENABLED=False, DISABLE_VERSION_CHECK=True, ACTIVE_LEARNING_ENABLED=False, SINGLE_TENANT_WORKFLOW_CACHE=True.
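
A minimal sketch of how this composition might look in env.py -- the flag names come from the PR, but the str2bool helper and the non-offline defaults are assumptions:

# env.py -- sketch only; str2bool and the defaults shown are assumptions
import os

def str2bool(value: str) -> bool:
    return value.strip().lower() in {"true", "1", "yes"}

OFFLINE_MODE = str2bool(os.getenv("OFFLINE_MODE", "False"))

if OFFLINE_MODE:
    # Offline implies: no telemetry, no version pings, no active learning,
    # and workflow cache filenames that do not embed the API key hash.
    METRICS_ENABLED = False
    DISABLE_VERSION_CHECK = True
    ACTIVE_LEARNING_ENABLED = False
    SINGLE_TENANT_WORKFLOW_CACHE = True
else:
    METRICS_ENABLED = str2bool(os.getenv("METRICS_ENABLED", "True"))
    DISABLE_VERSION_CHECK = str2bool(os.getenv("DISABLE_VERSION_CHECK", "False"))
    ACTIVE_LEARNING_ENABLED = str2bool(os.getenv("ACTIVE_LEARNING_ENABLED", "True"))
    SINGLE_TENANT_WORKFLOW_CACHE = str2bool(
        os.getenv("SINGLE_TENANT_WORKFLOW_CACHE", "False")
    )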

Chokepoint guards (roboflow_api.py): _get_from_url() raises ConnectionError and post_to_roboflow_api() raises RoboflowAPIConnectionError, which flow through existing error handlers (e.g. get_workflow_specification falls back to file cache without any OFFLINE_MODE-specific code). Fire-and-forget endpoints (image upload, annotation, monitoring) return empty/no-op.
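
In sketch form, the guard pattern looks like this -- _get_from_url, post_to_roboflow_api, and the exception types are named in the PR, but the bodies and the fire-and-forget helper name here are assumptions:

# roboflow_api.py -- sketch of the chokepoint guards; only the guard logic is from the PR
OFFLINE_MODE = True  # in the real code this is imported from env.py

class RoboflowAPIConnectionError(ConnectionError):
    """Raised when the Roboflow API cannot be reached."""

def _get_from_url(url: str) -> dict:
    if OFFLINE_MODE:
        # ConnectionError flows into existing handlers, e.g. the file-cache
        # fallback in get_workflow_specification -- no offline-specific code there.
        raise ConnectionError(f"OFFLINE_MODE enabled; blocked GET {url}")
    raise NotImplementedError("normal request path elided in this sketch")

def post_to_roboflow_api(endpoint: str, payload: dict) -> dict:
    if OFFLINE_MODE:
        raise RoboflowAPIConnectionError(f"OFFLINE_MODE enabled; blocked POST {endpoint}")
    raise NotImplementedError("normal request path elided in this sketch")

def post_monitoring_event(payload: dict) -> None:  # hypothetical fire-and-forget endpoint
    if OFFLINE_MODE:
        return  # no-op: nothing to send, nothing to retry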

Usage tracking: UsageCollector._offload_to_api(), PlanDetails fetches, and WebRTC lookups all short-circuit silently.
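
The short-circuit is the same one-line check at the top of each path; a sketch, with the class and method names from the PR and the body assumed:

# usage collector -- short-circuit sketch; OFFLINE_MODE comes from env.py
OFFLINE_MODE = True

class UsageCollector:
    def _offload_to_api(self, payload: dict) -> None:
        if OFFLINE_MODE:
            return  # drop the payload silently; no queueing, no retries
        ...  # normal offload path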

inference_models: get_model_from_provider() raises ModelRetrievalError, download_files_to_directory() raises RuntimeError. AutoModel.from_pretrained() has a dedicated attempt_loading_model_from_offline_cache() fallback that scans {INFERENCE_HOME}/models-cache/ for cached packages. Also handles RetryError (API unreachable) with the same offline cache fallback even when OFFLINE_MODE is not set.
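
A sketch of that fallback -- the function name and the models-cache root come from the PR, but the on-disk naming scheme matched here is an assumption for illustration:

# offline cache fallback sketch; directory layout is assumed
import os
from pathlib import Path

INFERENCE_HOME = os.getenv("INFERENCE_HOME", os.path.expanduser("~/.inference"))

class ModelRetrievalError(Exception):
    pass

def attempt_loading_model_from_offline_cache(model_id: str) -> Path:
    cache_root = Path(INFERENCE_HOME) / "models-cache"
    # Scan cached packages for one matching the requested model id.
    for candidate in cache_root.glob("*"):
        if candidate.is_dir() and model_id.replace("/", "_") in candidate.name:
            return candidate
    raise ModelRetrievalError(
        f"{model_id} not found in offline cache at {cache_root}; "
        "run once with network access to populate the cache"
    )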

Customer usage

# docker-compose.yml -- phase 1: cache models with network access
services:
  inference:
    environment:
      ROBOFLOW_API_KEY: "your-key"
      SINGLE_TENANT_WORKFLOW_CACHE: "True"

# docker-compose.yml -- phase 2: run offline
services:
  inference:
    environment:
      OFFLINE_MODE: "True"
    # no ROBOFLOW_API_KEY needed
    # no network needed

Test plan

  • 138 roboflow_api unit tests pass
  • 836 inference_models unit tests pass (35 new)
  • 935 inference core unit tests pass
  • black, isort, flake8 clean
  • End-to-end container test: private model lildc-hardhat/4 loaded from cache with --network none, no API key
  • Workflow spec loaded from file cache offline
  • Uncached model/workflow fails immediately

Commit message

Adds an OFFLINE_MODE environment variable that disables all outbound
network requests to the Roboflow API. Models and workflows load
exclusively from local cache. Designed for air-gapped deployments
where the inference server has no network access.

When OFFLINE_MODE=True:
- METRICS_ENABLED, ACTIVE_LEARNING_ENABLED are set to False
- DISABLE_VERSION_CHECK is set to True
- SINGLE_TENANT_WORKFLOW_CACHE is set to True (drops API key
  hash from workflow cache filenames)
- All Roboflow API calls are blocked at chokepoints
- inference_models loads from the models-cache directory
- Workflow specs fall back to file cache via existing error handlers
- Uncached models/workflows fail immediately with clear errors

Bumps inference-models to 0.27.1.