feat: @actions/cache optional path validation during restore#2414
feat: @actions/cache optional path validation during restore#2414jasongin wants to merge 8 commits into
@actions/cache optional path validation during restore#2414Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds an opt-in integrity gate to @actions/cache restore that validates tar archive entry paths (and link targets) against the user-declared cache paths prior to extraction, surfacing violations as warnings or as a hard CacheIntegrityError depending on mode.
Changes:
- Introduces
DownloadOptions.pathValidation('off' | 'warn' | 'error') and plumbs it throughrestoreCache*()toextractTar(). - Adds an in-process archive listing/validation pass (
tarParser + custom path/link validation) and a newCacheIntegrityErrorfor parse/violation failures. - Updates package metadata (version bump,
tardependency, Node engine) and adds comprehensive unit/integration tests + test plan doc.
Show a summary per file
| File | Description |
|---|---|
| packages/cache/src/options.ts | Adds pathValidation option with defaults and validation in getDownloadOptions(). |
| packages/cache/src/internal/tar.ts | Runs optional pre-extraction validation and reports/throws based on mode. |
| packages/cache/src/internal/pathValidation.ts | Implements allowed-root derivation, entry/link validation, and violation formatting. |
| packages/cache/src/internal/listAndValidate.ts | Streams archives through tar Parser (plus zstd decompression) to collect violations. |
| packages/cache/src/internal/cacheIntegrityError.ts | Defines CacheIntegrityError + error codes for integrity failures. |
| packages/cache/src/cache.ts | Exposes CacheIntegrityError publicly and forwards pathValidation/paths into extraction. |
| packages/cache/package.json | Bumps version, adds tar dep, adds engines.node, adjusts exports. |
| packages/cache/package-lock.json | Updates lockfile for new version and dependency graph. |
| packages/cache/docs/path-validation-test-plan.md | Documents test strategy/coverage for the new feature. |
| packages/cache/tests/tarPathValidation.test.ts | Adds integration tests asserting extract behavior across modes. |
| packages/cache/tests/restoreCache.test.ts | Updates assertions for new extractTar argument plumbing (v1). |
| packages/cache/tests/restoreCacheV2.test.ts | Updates assertions for new extractTar argument plumbing (v2). |
| packages/cache/tests/pathValidation.test.ts | Adds extensive unit tests for root derivation + entry/link validation. |
| packages/cache/tests/options.test.ts | Validates new default/override behavior for pathValidation. |
| packages/cache/tests/listAndValidate.test.ts | Adds real-archive tests for gzip/zstd parsing + validation behavior. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Files not reviewed (1)
- packages/cache/package-lock.json: Language not supported
- Files reviewed: 14/15 changed files
- Comments generated: 3
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| * - 'error': surfaced as `CacheIntegrityError` with `code: 'PARSE_ERROR'` | ||
| * and extraction does not run. | ||
| * | ||
| * @default 'off' |
There was a problem hiding this comment.
This feels like a security bug fix.
I'm wondering if we should default to error. We could major-version the package if we're worried about a breaking change.
There was a problem hiding this comment.
There are consumers of this package other than actions/cache (primarily actions/setup-*), so I didn't want to risk impacting them automatically. Even with a major version update, they might not realize the implications of taking the new major version. So I think it's safer to require consumers to opt in explicitly. Or at best 'warn' could be the default with a major version update.
The linked PR actions/cache#1761 defaults to warn and I proposed updating the default to error in a major version update of the action (not the toolkit package). We should consider similar updates to actions/setup-* repos.
Optional path validation during cache restore
New
restoreCache()optionA new opt-in
DownloadOptions.pathValidationon@actions/cachethat streams a downloaded cache archive through an in-process validator before handing it to systemtarfor extraction. Each entry's path (and, for symlinks/hardlinks, its target) must resolve under one of the declared cachepaths— anything that escapes (../..., absolute, UNC, drive-relative, NUL bytes, unsupported entry types) is caught by the validator.'off'— legacy behavior, validation is skipped.'warn'— validate; on violations, log viacore.warning/core.debugand extract anyway.'error'— validate; on violations throwCacheIntegrityError(code: 'PATH_VIOLATION', withviolationsattached). No files are extracted.Architecture: node-tar for listing, system
tarfor extractionThis is the biggest design call in the PR.
Listing/validation uses node-tar v7's
Parserin-process.onReadEntry. No subprocess, no parsingtar -tvtext output that varies by GNU/BSD/Windows.tar: it surfacesTAR_BAD_ARCHIVE/TAR_ENTRY_INVALID/TAR_ENTRY_ERROR/TAR_ABORTas discrete codes we can promote to parse errors, which is exactly what a pre-extraction gate wants.zstdbinary so we get the same--long=30window behavior the extract path uses (Node's built-in zstd is still experimental).Extraction stays on system
tar.tar -P(--absolute-names) option must be used when extracting cache archives. We cannot usetarto validate that extracted paths are within a target directory, because cache paths frequently live outside$GITHUB_WORKSPACE(~/.npm,~/.cargo,~/.gradle/caches, etc.).cacheUtils.resolvePathsrewrites every path relative to the workspace, so anything outside becomes../../..., which GNU tar refuses without-P. node-tar's extractor refuses these by default — swapping it in wholesale would break essentially every common dependency cache.--force-local,--delay-directory-restore,zstdmt/zstd -T0, the BSD-tar-on-Windows-with-zstd workaround). Re-implementing all of that on node-tar's writer is a much larger, riskier change.Soundness of the split: The validator is strictly more conservative than system
tar: anything the validator approves,taraccepts; in'error'modetaris never invoked when the validator rejects.'warn'mode exists for the potential rare case where the validator is stricter than systemtar(e.g. unusual PAX headers it can't classify) — a deliberate ergonomic concession so opting in doesn't break working caches.Performance tradeoff: Validation is a full pre-pass over the archive: decompress + parse headers (entry bodies are drained, not read or written to disk), then system tar decompresses a second time to extract. Roughly a 2× decompress cost on the archive stream, no extra disk I/O at the destination. The extra pass is skipped when validation is
'off'.Other approaches considered but rejected:
Other design notes
'quarantine').paths.deriveAllowedRootsmirrorscacheUtils.resolvePaths: tilde / env-var expansion, glob-prefix stripping, etc.prepareAllowedRoots()pre-bakes per-root normalize/lowercase/trailing-sep forms once per archive, dropping the per-entry hot loop from O(N · R) string ops to O(N + R)..catch(() => undefined)on the pipeline / exit promises to suppress late rejections,try/finallySIGTERMif the child is still running on early exit.Known limitations
-Pabove).caseInsensitiveFs()) is a platform-level guess. On case-sensitive APFS or per-dir-case-sensitive NTFS it loosens (fewer violations) rather than tightens — error direction is "lets something through", not "miss at the comparison layer".Tests
115 total: 86 pure-string unit tests for
pathValidation.ts, 14 real-archive tests forlistAndValidate.ts(zstd testsdescribe.skipwhenzstdisn't onPATH, scratch space viamkdtempSync), 15 mocked integration tests fortar.tscovering both modes, parse-error handling per mode, and the "no extraction in'error'mode" assertion. Seedocs/path-validation-test-plan.mdin this PR for a more detailed description of the test coverage.Out of scope
tarfor extraction - see explanation above.actions/cacheupdateactions/cache#1761 is a related PR to actually use this path validation when restoring from cache. The path validation mode will be a new input to the action; initially the default will be
'warn', then we can consider changing the default to'error'in a major version update.