wip: model adaptor #2480

Draft
EAGzzyCSL wants to merge 1 commit into main from zzy/model-adaptor

Conversation

@EAGzzyCSL
Collaborator

No description provided.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a model adapter pattern that centralizes per-model-family behavior (JSON parsing, chat completion params, image preprocessing, planning, locate, reasoning) into a single registry. It also restructures packages/core/src/ai-model into models/, prompts/, shared/, and workflows/ subtrees, and pulls bbox normalization out of common.ts into a dedicated locate-result adapter framework. Service-level deep-locate flow is rebuilt around adapter.locate.supportsSearchArea and a new resolveLocateSearchArea helper. The PR title is marked "wip".
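The registry idea described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ModelAdapterSketch`, `registerAdapter`, `resolveAdapter`, `chooseLocateFlow`, and the `family-a` / `family-b` names are all hypothetical. Only the `locate.supportsSearchArea` flag is taken from the overview, and its exact shape here is an assumption.

```typescript
// Hypothetical shapes for illustration only; the PR's real types may differ.
interface ModelAdapterSketch {
  family: string;
  // Mirrors the adapter.locate.supportsSearchArea flag named in the overview.
  locate: { supportsSearchArea: boolean };
  // Per-family parsing of the model's raw text reply.
  parseJson(raw: string): unknown;
}

// Single registry keyed by model family (assumed design).
const adapterRegistry = new Map<string, ModelAdapterSketch>();

function registerAdapter(adapter: ModelAdapterSketch): void {
  adapterRegistry.set(adapter.family, adapter);
}

function resolveAdapter(family: string): ModelAdapterSketch {
  const adapter = adapterRegistry.get(family);
  if (!adapter) {
    throw new Error(`No adapter registered for model family: ${family}`);
  }
  return adapter;
}

type LocateFlow = 'search-area' | 'first-pass';

// Callers branch on the adapter's capability instead of on model names.
function chooseLocateFlow(adapter: ModelAdapterSketch): LocateFlow {
  return adapter.locate.supportsSearchArea ? 'search-area' : 'first-pass';
}

// Placeholder families; which real families support search-area locate
// is not stated here.
registerAdapter({
  family: 'family-a',
  locate: { supportsSearchArea: true },
  parseJson: (raw) => JSON.parse(raw.trim()),
});
registerAdapter({
  family: 'family-b',
  locate: { supportsSearchArea: false },
  parseJson: (raw) => JSON.parse(raw.trim()),
});
```

With a layout like this, supporting a new model family would mean registering one more adapter rather than touching each call site.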

Changes:

  • Introduce ResolvedModelAdapter plus per-family adapters (qwen, doubao, gemini, gpt, glm, auto-glm, ui-tars) and route planning/locate/reasoning/image-detail through them.
  • Replace adaptBbox* / pointToBbox / fillBboxParam / bboxDescription with a LocateResultAdapter (extractRawLocateResult → resolveLocateResult → normalizeResultToPixelBbox) and pixel mapping helpers; rename generateElementByRect/Point to createLocateResultElementFromRect/Point and move them to @/locate-result-element.
  • Restructure ai-model directory (prompt/ → prompts/, new workflows/{inspect,planning,generation,image-preprocess}, new shared/{json,model-locate-result}); rewrite locate/section-locate/planning around prepareModelImage and buildSearchAreaConfig; service now uses first-pass locate when a model does not support search-area locate, and ServiceDump.matchedElement becomes a single optional element.
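To make the three-stage locate-result flow above concrete, here is a minimal sketch. The function names are borrowed from the summary, but their signatures, the `bbox` response field, and the 0–1000 coordinate scale are assumptions made for illustration; the PR's actual implementation may look quite different.

```typescript
// [left, top, right, bottom]
type Bbox = [number, number, number, number];
interface ImageSize { width: number; height: number }

// Stage 1 (assumed): pull the raw locate payload out of the parsed response.
function extractRawLocateResult(response: Record<string, unknown>): unknown {
  return response.bbox; // the "bbox" field name is an assumption
}

// Stage 2 (assumed): validate the raw payload and narrow it to a bbox tuple.
function resolveLocateResult(raw: unknown): Bbox | undefined {
  if (!Array.isArray(raw) || raw.length !== 4) return undefined;
  if (!raw.every((v) => typeof v === 'number' && Number.isFinite(v))) {
    return undefined;
  }
  return raw as Bbox;
}

// Stage 3 (assumed): map model-space coordinates to image pixels.
// The default scale of 1000 is illustrative, not a claim about any model.
function normalizeResultToPixelBbox(
  bbox: Bbox,
  size: ImageSize,
  scale = 1000,
): Bbox {
  const toPixel = (value: number, extent: number): number =>
    Math.min(Math.max(Math.round((value / scale) * extent), 0), extent);
  const [left, top, right, bottom] = bbox;
  return [
    toPixel(left, size.width),
    toPixel(top, size.height),
    toPixel(right, size.width),
    toPixel(bottom, size.height),
  ];
}

// Chain the three stages; undefined means no usable element was located.
function locateToPixelBbox(
  response: Record<string, unknown>,
  size: ImageSize,
): Bbox | undefined {
  const resolved = resolveLocateResult(extractRawLocateResult(response));
  return resolved ? normalizeResultToPixelBbox(resolved, size) : undefined;
}
```

Under these assumptions, `locateToPixelBbox({ bbox: [100, 200, 500, 600] }, { width: 1920, height: 1080 })` yields `[192, 216, 960, 648]`, and a malformed payload yields `undefined`, which lines up with `matchedElement` becoming a single optional element.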

Reviewed changes

Copilot reviewed 107 out of 112 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `packages/core/src/ai-model/models/**` | New model adapter registry, types, ResolvedModelAdapter, per-family adapters. |
| `packages/core/src/ai-model/shared/{json,model-locate-result}/**` | Extracted JSON parser and locate-result adapter framework. |
| `packages/core/src/ai-model/workflows/{inspect,planning,generation,image-preprocess}/**` | New workflow modules replacing prior inspect/llm-planning/prompt files. |
| `packages/core/src/ai-model/prompts/**` | Renamed from prompt/; locate/section/planning prompts now use response-format descriptors. |
| `packages/core/src/ai-model/service-caller/{reasoning,image-detail,client,error}.ts` | Reasoning resolution rewritten over adapters; image-detail helper removed; client/error extracted. |
| `packages/core/src/ai-model/models/auto-glm/`, `models/ui-tars/` | Auto-GLM and UI-TARS adapters moved under models/; prompts split into per-locale exports. |
| `packages/core/src/service/index.ts` | Locate flow refactored around resolveLocateSearchArea, supportsSearchArea, single-element parseResult. |
| `packages/core/src/agent/{agent,tasks,task-builder,utils}.ts` | Replanning limit and plan selection driven by adapter; uses new createLocateResultElementFromRect. |
| `packages/core/src/{common,types,index,locate-result-element}.ts` | Removed bbox adapters from common; matchedElement is now optional single element; new public locate-result helpers. |
| `packages/core/src/ai-model/auto-glm/{index,util}.ts` | Old auto-glm helpers/util removed. |
| `packages/shared/src/extractor/{index,dom-util}.ts`, `src/img/transform.ts` | Removed generateElementByPoint/Rect and paddingImage option from cropByRect. |
| `packages/playground/src/server.ts` | Switched to createLocateResultElementFromPoint from @midscene/core. |
| `packages/evaluation/tests/llm-planning.test.ts` | Updated to adaptModelLocateResultToRect options shape. |
| `packages/core/tests/**` | Large test refactor: imports updated to new module paths, new tests for adapters/image-preprocess/locate-result, removed adapt_bbox and generate-element-by-rect tests, updated reasoning/empty-content expectations. |
| `packages/shared/tests/unit-test/iife-bundle.test.ts` | Removed generateElementByRect from required IIFE exports and documented usage status of the remaining exports. |

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented May 18, 2026

Deploying midscene with Cloudflare Pages

Latest commit: eba3b9e
Status: ✅  Deploy successful!
Preview URL: https://7d8bd7cc.midscene.pages.dev
Branch Preview URL: https://zzy-model-adaptor.midscene.pages.dev

@EAGzzyCSL EAGzzyCSL marked this pull request as draft May 18, 2026 05:08
2 participants