Commit ff15023

feat(fetch): add allowed_domains and blocked_domains filters
Lets operators restrict the fetch tool to a curated set of hosts (or deny a few sensitive ones), mirroring Anthropic's web-fetch tool and Claude Code's WebFetch permission model.

Patterns match the host and any subdomain by default; a leading dot restricts to strict subdomains. The check runs before any network call (including robots.txt) so blocked URLs never leak DNS or TCP traffic.

Assisted-By: docker-agent
1 parent 01a8b20 commit ff15023
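
The matching rules described in the commit message can be sketched as follows. This is a minimal illustration, not the code from this commit: `matchesDomain` is a hypothetical helper, and it assumes the host has already been extracted from the URL.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesDomain is a hypothetical helper (not taken from this commit) that
// applies the documented rules: case-insensitive comparison, a bare pattern
// matches the host and its subdomains, a leading dot matches subdomains only.
func matchesDomain(pattern, host string) bool {
	pattern = strings.ToLower(strings.TrimSpace(pattern))
	host = strings.ToLower(host)
	if pattern == "" || host == "" {
		return false
	}
	if strings.HasPrefix(pattern, ".") {
		// Leading dot: the host must be strictly longer than the pattern,
		// so the apex domain itself never matches.
		return len(host) > len(pattern) && strings.HasSuffix(host, pattern)
	}
	// Bare pattern: exact host, or a host ending in "." + pattern. The dot
	// boundary keeps "badexample.com" from matching "example.com".
	return host == pattern || strings.HasSuffix(host, "."+pattern)
}

func main() {
	cases := []struct{ pattern, host string }{
		{"example.com", "example.com"},
		{"example.com", "docs.example.com"},
		{"example.com", "badexample.com"},
		{".example.com", "example.com"},
		{".example.com", "a.b.example.com"},
		{"169.254.169.254", "169.254.169.254"},
	}
	for _, c := range cases {
		fmt.Printf("%-18s %-18s %v\n", c.pattern, c.host, matchesDomain(c.pattern, c.host))
	}
}
```

Comparing against `"." + pattern` rather than the bare suffix is what enforces the label boundary; IP literals fall out of the exact-match branch.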

9 files changed

Lines changed: 471 additions & 7 deletions

agent-schema.json

Lines changed: 22 additions & 0 deletions
@@ -1319,6 +1319,28 @@
           "description": "Timeout in seconds for the fetch tool",
           "minimum": 1
         },
+        "allowed_domains": {
+          "type": "array",
+          "description": "Allow-list of domains the fetch tool is permitted to fetch (only valid for type 'fetch'). A pattern matches the host exactly (case-insensitive) and any of its subdomains; e.g. 'example.com' matches 'example.com' and 'docs.example.com' but not 'badexample.com'. A leading dot ('.example.com') restricts the match to strict subdomains. Mutually exclusive with 'blocked_domains'.",
+          "items": {
+            "type": "string"
+          },
+          "examples": [
+            ["docker.com", "docs.docker.com"],
+            ["github.com", "raw.githubusercontent.com"]
+          ]
+        },
+        "blocked_domains": {
+          "type": "array",
+          "description": "Deny-list of domains the fetch tool is forbidden to fetch (only valid for type 'fetch'). Uses the same matching rules as 'allowed_domains'. Mutually exclusive with 'allowed_domains'.",
+          "items": {
+            "type": "string"
+          },
+          "examples": [
+            ["internal.example.com"],
+            ["169.254.169.254"]
+          ]
+        },
         "url": {
           "type": "string",
           "description": "URL for the a2a or openapi tool",

docs/tools/fetch/index.md

Lines changed: 36 additions & 3 deletions
@@ -28,9 +28,21 @@ toolsets:

 ### Options

-| Property  | Type | Default | Description                                                     |
-| --------- | ---- | ------- | --------------------------------------------------------------- |
-| `timeout` | int  | `30`    | Default request timeout in seconds (overridable per tool call). |
+| Property          | Type          | Default | Description |
+| ----------------- | ------------- | ------- | ----------- |
+| `timeout`         | int           | `30`    | Default request timeout in seconds (overridable per tool call). |
+| `allowed_domains` | array[string] | _none_  | Allow-list of hosts the tool may fetch. When set, every URL whose host is **not** in the list is rejected before any network call is made. Mutually exclusive with `blocked_domains`. |
+| `blocked_domains` | array[string] | _none_  | Deny-list of hosts the tool must not fetch. URLs whose host matches one of these patterns are rejected before any network call (including `robots.txt`) is made. Mutually exclusive with `allowed_domains`. |
+
+### Domain matching
+
+Domain patterns in `allowed_domains` and `blocked_domains` use the following rules (case-insensitive):
+
+- **Bare domain** — `example.com` matches the host `example.com` _and_ any subdomain such as `docs.example.com`. It does **not** match unrelated hosts that share a suffix (e.g. `badexample.com`).
+- **Leading dot** — `.example.com` matches **only** strict subdomains (`docs.example.com`, `a.b.example.com`), not the apex `example.com`.
+- **IP literal** — IP addresses are matched exactly (`169.254.169.254`).
+
+The lists are mutually exclusive: a single fetch toolset may set either `allowed_domains` or `blocked_domains`, but not both.

 ### Custom Timeout

@@ -40,6 +52,27 @@ toolsets:
     timeout: 60
 ```

+### Restrict to specific domains
+
+```yaml
+toolsets:
+  - type: fetch
+    allowed_domains:
+      - docker.com # docker.com and *.docker.com
+      - github.com # github.com and *.github.com
+      - .githubusercontent.com # only subdomains, e.g. raw.githubusercontent.com
+```
+
+### Block sensitive hosts
+
+```yaml
+toolsets:
+  - type: fetch
+    blocked_domains:
+      - 169.254.169.254 # cloud metadata endpoint
+      - internal.example.com # internal corporate hostnames
+```
+
 ## Tool Interface

 The toolset exposes a single tool, `fetch`, with the following parameters:
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+#!/usr/bin/env docker agent run
+
+# Demonstrates domain filtering for the fetch tool. The agent is only allowed
+# to fetch URLs whose host matches one of the entries in `allowed_domains`
+# (the host itself or any subdomain). Use `blocked_domains` instead to keep
+# fetch open by default while denying specific hosts.
+
+agents:
+  root:
+    model: anthropic/claude-sonnet-4-5
+    description: An agent restricted to Docker and GitHub documentation.
+    instruction: |
+      You are a documentation assistant. You may only fetch pages from the
+      docker.com and github.com families of domains. If asked about an
+      external resource, politely refuse and suggest a search instead.
+    toolsets:
+      - type: fetch
+        # Each entry matches the bare host AND any subdomain.
+        # Example: "docker.com" matches docker.com AND docs.docker.com.
+        # Use ".github.com" to match strict subdomains only.
+        allowed_domains:
+          - docker.com
+          - github.com
+          - raw.githubusercontent.com
+
+  # Alternative: keep fetch open but deny specific hosts (e.g. cloud
+  # metadata endpoints, internal services). Switch the active agent to
+  # `safe_fetch` with `-a safe_fetch` to try this variant.
+  safe_fetch:
+    model: anthropic/claude-sonnet-4-5
+    description: An agent that can fetch the open web except a few sensitive hosts.
+    instruction: |
+      You are a research assistant. Use the fetch tool to gather information
+      from the public web. Some hosts are blocked for safety; if a fetch is
+      rejected, do not retry and explain the situation to the user.
+    toolsets:
+      - type: fetch
+        blocked_domains:
+          - 169.254.169.254 # cloud instance metadata
+          - 100.100.100.200 # Alibaba Cloud instance metadata
+          - metadata.google.internal
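
The filtering this example relies on, where a URL is checked against the lists before any request (or `robots.txt` lookup) is made, might look roughly like the sketch below. `checkURL` and `hostMatches` are hypothetical names, not functions from this commit, and the error wording is invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// hostMatches reports whether host matches any pattern, using the documented
// rules (bare domain = host plus subdomains, leading dot = subdomains only).
// Hypothetical helper, not the implementation from this commit.
func hostMatches(patterns []string, host string) bool {
	for _, p := range patterns {
		p = strings.ToLower(strings.TrimSpace(p))
		switch {
		case p == "":
			continue
		case strings.HasPrefix(p, "."):
			if len(host) > len(p) && strings.HasSuffix(host, p) {
				return true
			}
		case host == p || strings.HasSuffix(host, "."+p):
			return true
		}
	}
	return false
}

// checkURL returns a non-nil error when rawURL must not be fetched. It only
// parses the URL string, so it performs no DNS lookup or network I/O.
func checkURL(rawURL string, allowed, blocked []string) error {
	u, err := url.Parse(rawURL)
	if err != nil {
		return fmt.Errorf("invalid URL: %w", err)
	}
	// Hostname strips any port and IPv6 brackets.
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return errors.New("URL has no host")
	}
	if len(allowed) > 0 && !hostMatches(allowed, host) {
		return fmt.Errorf("host %q is not in allowed_domains", host)
	}
	if hostMatches(blocked, host) {
		return fmt.Errorf("host %q is in blocked_domains", host)
	}
	return nil
}

func main() {
	allowed := []string{"docker.com", "github.com", "raw.githubusercontent.com"}
	for _, raw := range []string{
		"https://docs.docker.com/engine/",
		"https://example.org/page",
	} {
		fmt.Printf("%s -> %v\n", raw, checkURL(raw, allowed, nil))
	}
	blocked := []string{"169.254.169.254", "metadata.google.internal"}
	fmt.Println(checkURL("http://169.254.169.254/latest/meta-data/", nil, blocked))
}
```

Because the decision depends only on the parsed host string, a caller can run it ahead of the HTTP client and return the error directly, which is what keeps a rejected URL from generating DNS or TCP traffic.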

pkg/config/latest/types.go

Lines changed: 12 additions & 0 deletions
@@ -816,6 +816,18 @@ type Toolset struct {
 	// For the `fetch` tool
 	Timeout int `json:"timeout,omitempty"`

+	// For the `fetch` tool - allow-list of domains the tool is permitted to fetch.
+	// A pattern matches the host exactly (case-insensitive) and any of its subdomains;
+	// e.g. "example.com" matches "example.com" and "docs.example.com" but not
+	// "badexample.com". A leading dot (".example.com") restricts the match to
+	// strict subdomains. Mutually exclusive with `blocked_domains`.
+	AllowedDomains []string `json:"allowed_domains,omitempty" yaml:"allowed_domains,omitempty"`
+
+	// For the `fetch` tool - deny-list of domains the tool is forbidden to fetch.
+	// Uses the same matching rules as `allowed_domains`. Mutually exclusive with
+	// `allowed_domains`.
+	BlockedDomains []string `json:"blocked_domains,omitempty" yaml:"blocked_domains,omitempty"`
+
 	// For the `rag` tool
 	RAGConfig *RAGConfig `json:"rag_config,omitempty" yaml:"rag_config,omitempty"`

pkg/config/latest/validate.go

Lines changed: 9 additions & 0 deletions
@@ -79,6 +79,15 @@ func (t *Toolset) validate() error {
 	if len(t.FileTypes) > 0 && t.Type != "lsp" {
 		return errors.New("file_types can only be used with type 'lsp'")
 	}
+	if len(t.AllowedDomains) > 0 && t.Type != "fetch" {
+		return errors.New("allowed_domains can only be used with type 'fetch'")
+	}
+	if len(t.BlockedDomains) > 0 && t.Type != "fetch" {
+		return errors.New("blocked_domains can only be used with type 'fetch'")
+	}
+	if len(t.AllowedDomains) > 0 && len(t.BlockedDomains) > 0 {
+		return errors.New("allowed_domains and blocked_domains are mutually exclusive")
+	}
 	if len(t.Models) > 0 && t.Type != "model_picker" {
 		return errors.New("models can only be used with type 'model_picker'")
 	}

pkg/config/latest/validate_test.go

Lines changed: 100 additions & 0 deletions
@@ -217,6 +217,106 @@ agents:
 	}
 }

+func TestToolset_Validate_Fetch_Domains(t *testing.T) {
+	t.Parallel()
+
+	tests := []struct {
+		name    string
+		config  string
+		wantErr string
+	}{
+		{
+			name: "fetch with allowed_domains",
+			config: `
+version: "8"
+agents:
+  root:
+    model: "openai/gpt-4"
+    toolsets:
+      - type: fetch
+        allowed_domains:
+          - docker.com
+          - github.com
+`,
+			wantErr: "",
+		},
+		{
+			name: "fetch with blocked_domains",
+			config: `
+version: "8"
+agents:
+  root:
+    model: "openai/gpt-4"
+    toolsets:
+      - type: fetch
+        blocked_domains:
+          - 169.254.169.254
+`,
+			wantErr: "",
+		},
+		{
+			name: "fetch with both is rejected",
+			config: `
+version: "8"
+agents:
+  root:
+    model: "openai/gpt-4"
+    toolsets:
+      - type: fetch
+        allowed_domains:
+          - docker.com
+        blocked_domains:
+          - example.com
+`,
+			wantErr: "allowed_domains and blocked_domains are mutually exclusive",
+		},
+		{
+			name: "allowed_domains on non-fetch toolset is rejected",
+			config: `
+version: "8"
+agents:
+  root:
+    model: "openai/gpt-4"
+    toolsets:
+      - type: shell
+        allowed_domains:
+          - docker.com
+`,
+			wantErr: "allowed_domains can only be used with type 'fetch'",
+		},
+		{
+			name: "blocked_domains on non-fetch toolset is rejected",
+			config: `
+version: "8"
+agents:
+  root:
+    model: "openai/gpt-4"
+    toolsets:
+      - type: shell
+        blocked_domains:
+          - docker.com
+`,
+			wantErr: "blocked_domains can only be used with type 'fetch'",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
+
+			var cfg Config
+			err := yaml.Unmarshal([]byte(tt.config), &cfg)
+
+			if tt.wantErr != "" {
+				require.Error(t, err)
+				require.Contains(t, err.Error(), tt.wantErr)
+			} else {
+				require.NoError(t, err)
+			}
+		})
+	}
+}
+
 func TestToolset_Validate_MCP_RemoteOAuth_CallbackRedirectURL(t *testing.T) {
 	t.Parallel()

pkg/teamloader/registry.go

Lines changed: 6 additions & 0 deletions
@@ -291,6 +291,12 @@ func createFetchTool(_ context.Context, toolset latest.Toolset, _ string, _ *con
 		timeout := time.Duration(toolset.Timeout) * time.Second
 		opts = append(opts, builtin.WithTimeout(timeout))
 	}
+	if len(toolset.AllowedDomains) > 0 {
+		opts = append(opts, builtin.WithAllowedDomains(toolset.AllowedDomains))
+	}
+	if len(toolset.BlockedDomains) > 0 {
+		opts = append(opts, builtin.WithBlockedDomains(toolset.BlockedDomains))
+	}
 	return builtin.NewFetchTool(opts...), nil
 }
