Commit 3fdaa94

sergio-sisternes-epam, Sergio Sisternes, and Copilot authored
fix: marketplace build respects GITHUB_HOST for GHE repos (#1009)
* fix: marketplace build respects GITHUB_HOST for GHE repos (#1008)

Thread the existing default_host() / build_https_clone_url() / AuthResolver pattern (used by apm install) through the marketplace build pipeline.

Changes:
- RefResolver: accept optional host parameter, use build_https_clone_url() instead of hardcoded github.com for git ls-remote URLs
- MarketplaceBuilder: resolve tokens against configured host, use REST API for metadata fetch on GHES/GHE Cloud (raw.githubusercontent.com is github.com-only), skip metadata for non-GitHub hosts
- Fix AuthResolver import scoping so classify_host() works when auth_resolver is pre-injected
- Add GHE Cloud early-exit when no token (avoids pointless 401)

Tests:
- Update URL assertions to use urlparse (test convention)
- Add 4 RefResolver GHE host tests
- Add 3 metadata fetch path tests (GHES REST API, non-GitHub skip, GHE Cloud no-token skip)
- Add builder host env test

Docs:
- CHANGELOG: Fixed entry under [Unreleased]
- marketplace-authoring guide: GHES section
- apm-usage authentication skill: marketplace build example

Closes #1008

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(marketplace): decouple auth from resolution, reuse DependencyReference for URL sources

Phase B of #1008 -- decouples authentication from marketplace generation and reuses existing resolution infrastructure for cross-source compatibility.

Changes:
- RefResolver: accept optional token for authenticated git ls-remote
- Builder: extract lazy _ensure_auth() called from _get_resolver() so both resolve() and build() benefit from authenticated ls-remote
- Builder: eagerly init resolver before thread pool (race prevention)
- Builder: fix _host_info type annotation (Optional["HostInfo"] with TYPE_CHECKING guard)
- resolver.py: _resolve_url_source() now delegates to DependencyReference.parse() -- accepts any valid Git URL (GitHub, GHES, GitLab, Bitbucket, ADO, SSH) instead of github.com only
- 13 new tests covering token injection, lazy auth, and cross-source URL resolution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(marketplace): address review panel findings

- Add _auth_resolved sentinel to _ensure_auth() for true idempotency
- Clarify _resolve_url_source() docstring: host is not preserved (#1010)
- Split CHANGELOG #1008 entry into GHE fix + URL-source expansion
- Add test documenting host-is-ignored behaviour for non-GitHub URLs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(marketplace): address Copilot review findings

- Fix _ensure_auth() offline branch to set _auth_resolved sentinel
- Clarify CHANGELOG and docs: URL host is not preserved, GITHUB_HOST required
- Update marketplace-authoring.md to warn against cross-host URL reliance

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Sergio Sisternes <sergio.sisternes@epam.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent a26c2e7 commit 3fdaa94

10 files changed

Lines changed: 481 additions & 44 deletions

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Fixed
 
 - Remove redundant `seen` set from `_scan_patterns()` discovery walk (#918)
+- `apm marketplace build` now respects `GITHUB_HOST` for GitHub Enterprise repos -- ref resolution, token lookup, and metadata fetch all use the configured host instead of hardcoded `github.com`. `git ls-remote` is authenticated so private repos work without separate credential setup. (#1008)
+- `apm marketplace build` now accepts multiple Git URL forms (GitHub, GHES, GitLab, Bitbucket, ADO, SSH) for `type: url` parsing via `DependencyReference.parse()`. Host resolution is still driven by `GITHUB_HOST`, so non-`github.com` hosts require `GITHUB_HOST` to be set accordingly. (#1008)
 
 ## [0.10.0] - 2026-04-27
 
docs/src/content/docs/guides/marketplace-authoring.md

Lines changed: 11 additions & 0 deletions
@@ -269,6 +269,17 @@ Run it first when `build` or `publish` fails in an unfamiliar environment.
 | `No cached refs (offline)` | First-ever `--offline` build. | Run once online to populate the cache, then retry offline. |
 | `git ls-remote` auth failure | Private source without credentials. | Ensure your git credentials (SSH agent or `gh auth login`) can reach the source repo. |
 
+### GitHub Enterprise Server
+
+`apm marketplace build` respects the `GITHUB_HOST` environment variable. Set it before building to resolve packages from a GHES instance:
+
+```bash
+export GITHUB_HOST=github.company.com
+apm marketplace build
+```
+
+Token resolution and metadata fetch use the same host, so existing auth configuration (see [Authentication](../../getting-started/authentication/)) works automatically. `git ls-remote` calls are authenticated with the resolved token, so private GHES repos work without a separate git credential helper. `type: url` sources accept Git-style repository URLs as input, including HTTPS and SSH forms, but APM resolves auth and metadata against `GITHUB_HOST`. In practice, the URL host is ignored unless it matches `GITHUB_HOST`, so do not rely on `type: url` for true cross-host resolution.
+
 ## Discovering upgrades
 
 `apm marketplace outdated` compares the currently resolved version of each package (as captured in `marketplace.json`) against the latest tag available in the source repo.
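
The host rule described in the new GHES section can be sketched as follows. This is a minimal illustration, not APM's code: `effective_host` and `url_host_matches` are hypothetical names, and the fallback to `github.com` is taken from the `default_host() or "github.com"` expression in `builder.py`. The real `default_host()` may apply extra normalisation that is not shown in this commit.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse


def effective_host(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical stand-in: GITHUB_HOST if set and non-empty, else github.com."""
    env = os.environ if env is None else env
    host = (env.get("GITHUB_HOST") or "").strip()
    return host or "github.com"


def url_host_matches(url: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """True when an HTTPS source URL points at the host the build will actually use."""
    hostname = urlparse(url).hostname or ""
    return hostname.lower() == effective_host(env).lower()
```

A `type: url` source whose host fails this check would still be parsed, but resolution would target `GITHUB_HOST` rather than the URL's own host, which is the behaviour the docs warn about.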

packages/apm-guide/.apm/skills/apm-usage/authentication.md

Lines changed: 1 addition & 0 deletions
@@ -78,6 +78,7 @@ If `ADO_APM_PAT` is set but ADO returns 401, APM silently retries with the `az`
 export GITHUB_HOST=github.company.com
 export GITHUB_APM_PAT_MYORG=ghp_ghes_token
 apm install myorg/internal-package # resolves to github.company.com
+apm marketplace build # also resolves to github.company.com
 ```
 
 ## GHE Cloud data residency (*.ghe.com)

src/apm_cli/marketplace/builder.py

Lines changed: 92 additions & 12 deletions
@@ -26,7 +26,10 @@
 from concurrent.futures import ThreadPoolExecutor, as_completed
 from dataclasses import dataclass, field
 from pathlib import Path
-from typing import Any, Dict, List, Optional, Tuple
+from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple
+
+if TYPE_CHECKING:
+    from ..core.auth import HostInfo
 
 import yaml
 
@@ -42,6 +45,7 @@
 from .semver import SemVer, parse_semver, satisfies_range
 from .tag_pattern import build_tag_regex, render_tag
 from ..utils.path_security import ensure_path_within
+from ..utils.github_host import default_host
 from .yml_schema import MarketplaceYml, PackageEntry, load_marketplace_yml
 
 logger = logging.getLogger(__name__)
@@ -151,6 +155,9 @@ def __init__(
         self._auth_resolver = auth_resolver
         # Resolved once per build, used by worker threads (read-only).
         self._github_token: Optional[str] = None
+        self._host: str = default_host() or "github.com"
+        self._host_info: Optional["HostInfo"] = None
+        self._auth_resolved: bool = False
 
     # -- lazy loaders -------------------------------------------------------
 
@@ -161,12 +168,32 @@ def _load_yml(self) -> MarketplaceYml:
 
     def _get_resolver(self) -> RefResolver:
         if self._resolver is None:
+            self._ensure_auth()
             self._resolver = RefResolver(
                 timeout_seconds=self._options.timeout_seconds,
                 offline=self._options.offline,
+                host=self._host,
+                token=self._github_token,
             )
         return self._resolver
 
+    def _ensure_auth(self) -> None:
+        """Lazily resolve host classification and GitHub token.
+
+        Short-circuits when already resolved (even if no token was found)
+        or when running in offline mode. Offline mode is still marked as
+        resolved so repeated calls remain idempotent. Called by
+        ``_get_resolver()`` so both ``resolve()`` and ``build()`` benefit
+        from authenticated ``git ls-remote`` when available.
+        """
+        if self._auth_resolved:
+            return
+        if self._options.offline:
+            self._auth_resolved = True
+            return
+        self._github_token = self._resolve_github_token()
+        self._auth_resolved = True
+
     # -- output path --------------------------------------------------------
 
     def _output_path(self) -> Path:
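
The `_auth_resolved` sentinel matters because "no token found" is a valid outcome. Checking `self._github_token is None` alone would repeat the lookup on every call. A minimal sketch of the pattern, assuming a hypothetical `LazyAuth` class and an injected `lookup` callable (neither is part of APM):

```python
from typing import Callable, Optional


class LazyAuth:
    """Hypothetical illustration of sentinel-guarded lazy token resolution."""

    def __init__(self, lookup: Callable[[], Optional[str]], offline: bool = False) -> None:
        self._lookup = lookup
        self._offline = offline
        self.token: Optional[str] = None
        self._resolved = False  # separate flag: a None token is a legitimate result
        self.lookups = 0  # counter kept only to make the behaviour observable

    def ensure(self) -> None:
        if self._resolved:
            return
        if not self._offline:
            self.lookups += 1
            self.token = self._lookup()
        # Mark as resolved on every path, including offline, so repeat calls are no-ops.
        self._resolved = True
```

Calling `ensure()` twice on an instance whose lookup returns `None` performs a single lookup. In offline mode the lookup is never invoked.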
@@ -365,6 +392,11 @@ def resolve(self) -> ResolveResult:
         results: Dict[int, ResolvedPackage] = {}
         errors: List[Tuple[str, str]] = []
 
+        # Eagerly resolve auth + create the shared RefResolver before
+        # spawning workers -- avoids a race on _ensure_auth() and
+        # matches the pattern used in _prefetch_metadata().
+        self._get_resolver()
+
         with ThreadPoolExecutor(
             max_workers=min(self._options.concurrency, len(entries))
         ) as pool:
@@ -413,16 +445,60 @@ def _fetch_remote_metadata(self, pkg: ResolvedPackage) -> Optional[Dict[str, str
         When a GitHub token is available (via ``self._github_token``), it
         is included as an ``Authorization`` header so private repos can be
         accessed.
+
+        For non-github.com GitHub-family hosts (GHES, GHE Cloud), uses the
+        GitHub REST API instead of raw.githubusercontent.com (which is only
+        available for github.com). For non-GitHub hosts, metadata
+        enrichment is skipped.
         """
         try:
             path_prefix = f"{pkg.subdir}/" if pkg.subdir else ""
-            url = (
-                f"https://raw.githubusercontent.com/"
-                f"{pkg.source_repo}/{pkg.sha}/{path_prefix}apm.yml"
-            )
-            req = urllib.request.Request(url)
-            if self._github_token:
-                req.add_header("Authorization", f"token {self._github_token}")
+            file_path = f"{path_prefix}apm.yml"
+
+            # Determine URL strategy based on host kind
+            host_kind = self._host_info.kind if self._host_info else "github"
+
+            if host_kind not in ("github", "ghe_cloud", "ghes"):
+                # Non-GitHub hosts -- skip metadata enrichment
+                logger.debug(
+                    "Skipping metadata fetch for %s (non-GitHub host: %s)",
+                    pkg.name,
+                    self._host,
+                )
+                return None
+
+            if host_kind == "ghe_cloud" and not self._github_token:
+                logger.debug(
+                    "Skipping metadata fetch for %s (GHE Cloud requires auth)",
+                    pkg.name,
+                )
+                return None
+
+            if self._host == "github.com":
+                # github.com -- use fast raw.githubusercontent.com CDN
+                url = (
+                    f"https://raw.githubusercontent.com/"
+                    f"{pkg.source_repo}/{pkg.sha}/{file_path}"
+                )
+                req = urllib.request.Request(url)
+                if self._github_token:
+                    req.add_header("Authorization", f"token {self._github_token}")
+            else:
+                # GHES / GHE Cloud -- use REST API
+                api_base = (
+                    self._host_info.api_base
+                    if self._host_info
+                    else None
+                ) or f"https://{self._host}/api/v3"
+                url = (
+                    f"{api_base}/repos/{pkg.source_repo}/contents/{file_path}"
+                    f"?ref={pkg.sha}"
+                )
+                req = urllib.request.Request(url)
+                req.add_header("Accept", "application/vnd.github.raw")
+                if self._github_token:
+                    req.add_header("Authorization", f"token {self._github_token}")
+
             with urllib.request.urlopen(req, timeout=5) as resp:  # noqa: S310
                 raw = resp.read().decode("utf-8")
             data = yaml.safe_load(raw)
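
The branching in the hunk above can be read as a pure function from host facts to a request plan. The sketch below restates it without any network calls. `metadata_request` is a hypothetical helper, not a function in `builder.py`, and the host-kind strings are the ones that appear in the diff.

```python
from typing import Dict, Optional, Tuple

GITHUB_FAMILY = ("github", "ghe_cloud", "ghes")


def metadata_request(
    host: str,
    host_kind: str,
    repo: str,
    sha: str,
    file_path: str = "apm.yml",
    token: Optional[str] = None,
    api_base: Optional[str] = None,
) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (url, headers) for the apm.yml fetch, or None when the fetch is skipped."""
    if host_kind not in GITHUB_FAMILY:
        return None  # non-GitHub host: metadata enrichment is skipped
    if host_kind == "ghe_cloud" and not token:
        return None  # avoid a request that would fail without credentials
    headers: Dict[str, str] = {}
    if token:
        headers["Authorization"] = f"token {token}"
    if host == "github.com":
        url = f"https://raw.githubusercontent.com/{repo}/{sha}/{file_path}"
    else:
        base = api_base or f"https://{host}/api/v3"
        url = f"{base}/repos/{repo}/contents/{file_path}?ref={sha}"
        headers["Accept"] = "application/vnd.github.raw"
    return url, headers
```

Keeping the decision separate from `urllib.request.urlopen` would let the three paths (CDN, REST API, skip) be tested with plain assertions instead of mocked HTTP.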
@@ -460,13 +536,17 @@ def _resolve_github_token(self) -> Optional[str]:
         auth failures are logged at debug and silently ignored.
         """
         try:
+            from ..core.auth import AuthResolver  # lazy import
+
             resolver = self._auth_resolver
             if resolver is None:
-                from ..core.auth import AuthResolver  # lazy import
-
                 resolver = AuthResolver()
                 self._auth_resolver = resolver
-            ctx = resolver.resolve("github.com")  # type: ignore[union-attr]
+            # Always classify the host, regardless of token availability,
+            # so _fetch_remote_metadata() can branch on host kind.
+            if self._host_info is None:
+                self._host_info = AuthResolver.classify_host(self._host)
+            ctx = resolver.resolve(self._host)  # type: ignore[union-attr]
             if ctx.token:
                 logger.debug("Resolved GitHub token for metadata fetch (source=%s)", ctx.source)
                 return ctx.token
@@ -492,7 +572,7 @@ def _prefetch_metadata(
             return {}
 
         # Resolve token once -- threads read self._github_token (immutable).
-        self._github_token = self._resolve_github_token()
+        self._ensure_auth()
 
         results: Dict[str, Dict[str, str]] = {}
         workers = min(self._options.concurrency, len(resolved))

src/apm_cli/marketplace/ref_resolver.py

Lines changed: 16 additions & 3 deletions
@@ -27,6 +27,7 @@
 from .errors import GitLsRemoteError, OfflineMissError
 from ._git_utils import redact_token as _redact_token
 from .git_stderr import translate_git_stderr
+from ..utils.github_host import default_host, build_https_clone_url
 
 __all__ = [
     "RemoteRef",
@@ -136,6 +137,10 @@ class RefResolver:
     stderr_translator_enabled:
         When ``True`` (default), stderr from failed ``git`` calls is
         classified via ``translate_git_stderr``.
+    token:
+        Optional GitHub PAT to embed in the ``https://`` URL. When set
+        the URL uses ``x-access-token`` authentication; when ``None``
+        (default) git runs unauthenticated.
     """
 
     def __init__(
@@ -144,10 +149,14 @@ def __init__(
         timeout_seconds: float = 10.0,
         offline: bool = False,
         stderr_translator_enabled: bool = True,
+        host: Optional[str] = None,
+        token: Optional[str] = None,
     ) -> None:
         self._timeout = timeout_seconds
         self._offline = offline
         self._stderr_translator = stderr_translator_enabled
+        self._host: str = host or default_host() or "github.com"
+        self._token: Optional[str] = token
         self._cache = RefCache()
         self._lock = threading.Lock()
         # Per-remote locks to serialise calls to the same remote while
@@ -166,7 +175,7 @@ def _remote_lock(self, owner_repo: str) -> threading.Lock:
             return self._remote_locks[owner_repo]
 
     def list_remote_refs(self, owner_repo: str) -> List[RemoteRef]:
-        """Fetch all tags and heads from ``https://github.com/<owner_repo>.git``.
+        """Fetch all tags and heads from the configured Git host.
 
         Results are cached; subsequent calls for the same remote return
         the cached value until the TTL expires.
@@ -198,7 +207,9 @@ def list_remote_refs(self, owner_repo: str) -> List[RemoteRef]:
         if self._offline:
             raise OfflineMissError(package="", remote=owner_repo)
 
-        url = f"https://github.com/{owner_repo}.git"
+        url = build_https_clone_url(self._host, owner_repo, token=self._token)
+        if not url.endswith(".git"):
+            url += ".git"
         env = {**os.environ, "GIT_TERMINAL_PROMPT": "0", "GIT_ASKPASS": "echo"}
         try:
             result = subprocess.run(
@@ -273,7 +284,9 @@ def resolve_ref_sha(self, owner_repo: str, ref: str = "HEAD") -> str:
         GitLsRemoteError
             When the ref does not exist or the subprocess fails.
         """
-        url = f"https://github.com/{owner_repo}.git"
+        url = build_https_clone_url(self._host, owner_repo, token=self._token)
+        if not url.endswith(".git"):
+            url += ".git"
         env = {**os.environ, "GIT_TERMINAL_PROMPT": "0", "GIT_ASKPASS": "echo"}
         try:
             result = subprocess.run(

src/apm_cli/marketplace/resolver.py

Lines changed: 21 additions & 17 deletions
@@ -13,6 +13,7 @@
 from typing import Callable, Optional, Tuple
 
 from ..utils.path_security import PathTraversalError, validate_path_segments
+from ..models.dependency.reference import DependencyReference
 from .client import fetch_or_cache
 from .errors import MarketplaceFetchError, PluginNotFoundError
 from .models import MarketplacePlugin
@@ -95,25 +96,28 @@ def _resolve_github_source(source: dict) -> str:
 def _resolve_url_source(source: dict) -> str:
     """Resolve a ``url`` source type.
 
-    APM is Git-native -- URL sources that point to GitHub repos are
-    resolved to ``owner/repo``. Non-GitHub URLs are rejected.
+    Delegates to ``DependencyReference.parse()`` to extract the
+    ``owner/repo`` coordinate from any valid Git URL (GitHub, GHES, GitLab,
+    Bitbucket, ADO, SSH). The URL's host is *not* preserved -- downstream
+    resolution (``RefResolver``) uses the configured ``GITHUB_HOST`` for
+    ``git ls-remote``. True cross-host resolution is tracked in #1010.
     """
     url = source.get("url", "")
-    # Try to extract owner/repo from common GitHub URL patterns
-    for prefix in ("https://github.com/", "http://github.com/"):
-        if url.lower().startswith(prefix):
-            path = url[len(prefix) :].rstrip("/").split("?")[0]
-            # Remove .git suffix
-            if path.endswith(".git"):
-                path = path[:-4]
-            parts = path.split("/")
-            if len(parts) >= 2:
-                return f"{parts[0]}/{parts[1]}"
-
-    raise ValueError(
-        f"Cannot resolve URL source '{url}' to a Git coordinate. "
-        f"APM requires Git-based sources (owner/repo format)."
-    )
+    if not url:
+        raise ValueError("URL source requires a non-empty 'url' field")
+    try:
+        dep = DependencyReference.parse(url)
+    except ValueError as exc:
+        raise ValueError(
+            f"Cannot resolve URL source '{url}': {exc}"
+        ) from exc
+    if dep.is_local:
+        raise ValueError(
+            f"URL source '{url}' resolves to a local path, not a Git coordinate."
+        )
+    if dep.reference:
+        return f"{dep.repo_url}#{dep.reference}"
+    return dep.repo_url
 
 
 def _resolve_git_subdir_source(source: dict) -> str:
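
To make the "host is not preserved" caveat concrete, here is a small sketch that extracts `owner/repo` from HTTPS, `ssh://` and scp-style Git URLs and discards the host. `owner_repo_from_git_url` is a hypothetical helper written for this note. It does not reproduce `DependencyReference.parse()`, which also handles refs, local paths and host-specific layouts. Like the removed code, it keeps only the first two path segments.

```python
import re
from urllib.parse import urlparse

# scp-style form such as git@host:owner/repo.git (no scheme, colon before the path)
_SCP_LIKE = re.compile(r"^(?:[\w.-]+@)?[\w.-]+:(?P<path>[^/].*)$")


def owner_repo_from_git_url(url: str) -> str:
    """Hypothetical helper: return 'owner/repo' and drop the host."""
    url = url.strip()
    if "://" in url:
        path = urlparse(url).path
    else:
        match = _SCP_LIKE.match(url)
        if not match:
            raise ValueError(f"Not a recognised Git URL: {url!r}")
        path = match.group("path")
    path = path.strip("/")
    if path.endswith(".git"):
        path = path[:-4]
    parts = [segment for segment in path.split("/") if segment]
    if len(parts) < 2:
        raise ValueError(f"Cannot extract owner/repo from {url!r}")
    return f"{parts[0]}/{parts[1]}"
```

Two URLs on different hosts with the same `owner/repo` path produce the same coordinate here. That is the ambiguity the new docstring flags and defers to #1010.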

tests/unit/commands/test_marketplace_build.py

Lines changed: 20 additions & 0 deletions
@@ -427,3 +427,23 @@ def test_no_traceback_without_verbose(self, MockBuilder, runner, yml_cwd):
         assert result.exit_code == 1
         assert "Traceback" not in result.output
         assert "Build failed" in result.output
+
+
+# ---------------------------------------------------------------------------
+# GHE host support
+# ---------------------------------------------------------------------------
+
+
+class TestBuildGHEHost:
+    """build command -- GHE / custom host scenarios."""
+
+    def test_build_ghe_host_env(
+        self, monkeypatch: pytest.MonkeyPatch, tmp_path: Path
+    ) -> None:
+        """MarketplaceBuilder respects GITHUB_HOST for token resolution."""
+        monkeypatch.setenv("GITHUB_HOST", "corp.ghe.com")
+        from apm_cli.marketplace.builder import MarketplaceBuilder, BuildOptions
+        yml_path = tmp_path / "marketplace.yml"
+        yml_path.write_text("name: test\noutput: marketplace.json\npackages: []\n")
+        builder = MarketplaceBuilder(yml_path)
+        assert builder._host == "corp.ghe.com"
