wip: model adaptor #2480
Pull request overview
This PR introduces a model adapter pattern that centralizes per-model-family behavior (JSON parsing, chat completion params, image preprocessing, planning, locate, reasoning) into a single registry. It also restructures packages/core/src/ai-model into models/, prompts/, shared/, and workflows/ subtrees, and pulls bbox normalization out of common.ts into a dedicated locate-result adapter framework. Service-level deep-locate flow is rebuilt around adapter.locate.supportsSearchArea and a new resolveLocateSearchArea helper. The PR title is marked "wip".
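As a rough illustration of the pattern described above, the sketch below shows a registry keyed by model family and a helper that falls back to first-pass locate when an adapter does not advertise search-area support. Everything in it is hypothetical: the type shapes, `registerAdapter`, `resolveAdapter`, `planDeepLocate`, and the two example family names are assumptions made for illustration, not the PR's actual code, and the capability values say nothing about real models.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not taken from the PR.
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface ModelAdapterSketch {
  family: string;
  locate: { supportsSearchArea: boolean };
}

// Single registry holding per-family behavior.
const adapterRegistry = new Map<string, ModelAdapterSketch>();

function registerAdapter(adapter: ModelAdapterSketch): void {
  adapterRegistry.set(adapter.family, adapter);
}

function resolveAdapter(family: string): ModelAdapterSketch {
  const adapter = adapterRegistry.get(family);
  if (!adapter) {
    throw new Error(`No adapter registered for "${family}"`);
  }
  return adapter;
}

type LocatePlan =
  | { mode: "search-area"; area: Rect }
  | { mode: "first-pass" };

// Use the search area only when the adapter supports it and an area is known;
// otherwise rely on the first-pass locate result.
function planDeepLocate(
  adapter: ModelAdapterSketch,
  candidateArea?: Rect,
): LocatePlan {
  if (adapter.locate.supportsSearchArea && candidateArea) {
    return { mode: "search-area", area: candidateArea };
  }
  return { mode: "first-pass" };
}

// Placeholder families with arbitrary capability values.
registerAdapter({ family: "example-with-area", locate: { supportsSearchArea: true } });
registerAdapter({ family: "example-without-area", locate: { supportsSearchArea: false } });
```

Keeping the capability flag on the adapter means the service code can branch on data instead of on model names, which seems to be the intent of routing deep-locate through `adapter.locate.supportsSearchArea`.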
Changes:
- Introduce `ResolvedModelAdapter` plus per-family adapters (qwen, doubao, gemini, gpt, glm, auto-glm, ui-tars) and route planning/locate/reasoning/image-detail through them.
- Replace `adaptBbox*`/`pointToBbox`/`fillBboxParam`/`bboxDescription` with a `LocateResultAdapter` (`extractRawLocateResult` → `resolveLocateResult` → `normalizeResultToPixelBbox`) and pixel mapping helpers; rename `generateElementByRect/Point` to `createLocateResultElementFromRect/Point` and move to `@/locate-result-element`.
- Restructure `ai-model` directory (`prompt/` → `prompts/`, new `workflows/{inspect,planning,generation,image-preprocess}`, new `shared/{json,model-locate-result}`); rewrite locate/section-locate/planning around `prepareModelImage` and `buildSearchAreaConfig`; service now uses first-pass locate when a model does not support search-area locate, and `ServiceDump.matchedElement` becomes a single optional element.
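The three-stage locate-result flow named in the list (extract, resolve, normalize to pixels) could look roughly like the sketch below. It is a minimal sketch under stated assumptions: the `Sketch`-suffixed function names, the `bbox` response field, the 0 to 1000 coordinate scale, and the point padding are all hypothetical choices made for illustration and are not confirmed by the PR.

```typescript
// Hypothetical sketch; field names, scale, and padding are assumptions.
interface PixelRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

type Box = [number, number, number, number];

type RawLocate =
  | { kind: "bbox"; values: Box }
  | { kind: "point"; values: [number, number] };

// Stage 1: pull the raw coordinates out of a parsed model response.
function extractRawSketch(response: unknown): RawLocate | undefined {
  if (typeof response !== "object" || response === null) return undefined;
  const bbox = (response as { bbox?: unknown }).bbox;
  if (!Array.isArray(bbox) || !bbox.every((v) => typeof v === "number")) {
    return undefined;
  }
  if (bbox.length === 4) return { kind: "bbox", values: bbox as Box };
  if (bbox.length === 2) {
    return { kind: "point", values: bbox as [number, number] };
  }
  return undefined;
}

// Stage 2: turn either shape into an ordered [x1, y1, x2, y2] box.
function resolveSketch(raw: RawLocate, pointPadding = 10): Box {
  if (raw.kind === "bbox") {
    const [x1, y1, x2, y2] = raw.values;
    return [Math.min(x1, x2), Math.min(y1, y2), Math.max(x1, x2), Math.max(y1, y2)];
  }
  const [x, y] = raw.values;
  return [x - pointPadding, y - pointPadding, x + pointPadding, y + pointPadding];
}

// Stage 3: map model-space coordinates (assumed 0..scale) to image pixels.
function normalizeToPixelSketch(
  box: Box,
  image: { width: number; height: number },
  scale = 1000,
): PixelRect {
  const toPixel = (value: number, size: number) =>
    Math.min(Math.max(Math.round((value * size) / scale), 0), size);
  const left = toPixel(box[0], image.width);
  const top = toPixel(box[1], image.height);
  const right = toPixel(box[2], image.width);
  const bottom = toPixel(box[3], image.height);
  return { left, top, width: right - left, height: bottom - top };
}

function locateSketch(
  response: unknown,
  image: { width: number; height: number },
): PixelRect | undefined {
  const raw = extractRawSketch(response);
  return raw ? normalizeToPixelSketch(resolveSketch(raw), image) : undefined;
}
```

Under these assumptions, `locateSketch({ bbox: [100, 200, 300, 400] }, { width: 2000, height: 1000 })` yields a rectangle at left 200, top 200 with width 400 and height 200. Splitting the stages this way lets a per-family adapter override only the step where its output format differs.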
Reviewed changes
Copilot reviewed 107 out of 112 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| packages/core/src/ai-model/models/** | New model adapter registry, types, ResolvedModelAdapter, per-family adapters. |
| packages/core/src/ai-model/shared/{json,model-locate-result}/** | Extracted JSON parser and locate-result adapter framework. |
| packages/core/src/ai-model/workflows/{inspect,planning,generation,image-preprocess}/** | New workflow modules replacing prior inspect/llm-planning/prompt files. |
| packages/core/src/ai-model/prompts/** | Renamed from prompt/; locate/section/planning prompts now use response-format descriptors. |
| packages/core/src/ai-model/service-caller/{reasoning,image-detail,client,error}.ts | Reasoning resolution rewritten over adapters; image-detail helper removed; client/error extracted. |
| packages/core/src/ai-model/models/auto-glm/, models/ui-tars/ | Auto-GLM and UI-TARS adapters moved under models/; prompts split into per-locale exports. |
| packages/core/src/service/index.ts | Locate flow refactored around resolveLocateSearchArea, supportsSearchArea, single-element parseResult. |
| packages/core/src/agent/{agent,tasks,task-builder,utils}.ts | Replanning limit and plan selection driven by adapter; uses new createLocateResultElementFromRect. |
| packages/core/src/{common,types,index,locate-result-element}.ts | Removed bbox adapters from common; matchedElement is now optional single element; new public locate-result helpers. |
| packages/core/src/ai-model/auto-glm/{index,util}.ts | Old auto-glm helpers/util removed. |
| packages/shared/src/extractor/{index,dom-util}.ts, src/img/transform.ts | Removed generateElementByPoint/Rect and paddingImage option from cropByRect. |
| packages/playground/src/server.ts | Switched to createLocateResultElementFromPoint from @midscene/core. |
| packages/evaluation/tests/llm-planning.test.ts | Updated to adaptModelLocateResultToRect options shape. |
| packages/core/tests/** | Large test refactor: imports updated to new module paths, new tests for adapters/image-preprocess/locate-result, removed adapt_bbox and generate-element-by-rect tests, updated reasoning/empty-content expectations. |
| packages/shared/tests/unit-test/iife-bundle.test.ts | Removed generateElementByRect from required IIFE exports and documented usage status of the remaining exports. |
Deploying midscene with Cloudflare Pages

| Latest commit: | eba3b9e |
|---|---|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://7d8bd7cc.midscene.pages.dev |
| Branch Preview URL: | https://zzy-model-adaptor.midscene.pages.dev |
No description provided.