Platform · Multimodal AI
VLM × VQA × VLA — three legs on one passport.
Vision-Language-Models, Visual Question Answering, and Vision-Language-Action sit on top of the same CoatingPassport. VLM produces it. VQA queries it. VLA acts on it. The data contract underneath stays stable as model generations land and as autonomous robotics matures.
VLM Vision-Language-Model
Live
Reads ROV / drone / handheld / fixed-camera / satellite / crawler / AUV footage and emits structured findings: category, severity, confidence, n_frames, image quality, source frame URIs, AI model version. The default backend is Claude; the VLMProvider interface lets us swap without refactoring the analysis pipeline.
Who uses it. Every CoatingPassport in production is produced by a VLM call. Operators don't call this directly — they consume the passport.
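A minimal TypeScript sketch of how a provider interface can keep the pipeline independent of the backend. The field names follow the list above, but the exact types, the severity grades other than "severe", the analyze signature, StubProvider, and keepCitable are all assumptions made for illustration, not the actual source.

```typescript
// Hypothetical shapes: names mirror the prose, types are assumed.
type Severity = "low" | "moderate" | "severe";

interface Finding {
  finding_id: string;
  category: string;
  severity: Severity;
  confidence: number; // assumed to lie in [0, 1]
  n_frames: number;
  image_quality: number; // assumed to lie in [0, 1]
  source_frame_uris: string[];
  ai_model_version: string;
}

// Assumed signature; the real VLMProvider interface may differ.
interface VLMProvider {
  readonly name: string;
  analyze(frameUris: string[]): Promise<Finding[]>;
}

// A stand-in backend that returns one fixed finding per non-empty batch.
class StubProvider implements VLMProvider {
  readonly name = "stub";
  async analyze(frameUris: string[]): Promise<Finding[]> {
    if (frameUris.length === 0) return [];
    return [
      {
        finding_id: "f-001",
        category: "coating-breakdown",
        severity: "moderate",
        confidence: 0.8,
        n_frames: frameUris.length,
        image_quality: 0.9,
        source_frame_uris: frameUris,
        ai_model_version: "stub-0",
      },
    ];
  }
}

// Keep only findings that carry at least one source frame, so that every
// finding in the passport can be cited later.
function keepCitable(findings: Finding[]): Finding[] {
  return findings.filter((f) => f.source_frame_uris.length > 0);
}

// The pipeline depends only on the interface, so swapping the backend
// means passing a different provider object.
async function runAnalysis(provider: VLMProvider, frameUris: string[]): Promise<Finding[]> {
  return keepCitable(await provider.analyze(frameUris));
}
```

Calling runAnalysis(new StubProvider(), uris) and runAnalysis(someOtherProvider, uris) would exercise the same pipeline code; only the provider argument changes.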
Endpoint
Internal; surfaces via POST /api/analyze and /api/export/pdf
VQA Visual Question Answering
Live
Takes an existing CoatingPassport and answers natural-language questions about it. Returns a structured answer with explicit confidence and citations to specific finding_ids and source frames. It refuses to invent data the passport doesn't carry; the grounded_in_passport flag is enforced.
Who uses it. Agentic stacks (Cognite Atlas AI, Equinor Echo, custom Claude / GPT agents) that need an interpretable answer that cites passport lineage instead of hallucinating.
Endpoint
POST /api/passports/{id}/qa body: { "question": "..." }
Example
Ask: "Which findings drive the most ETS exposure on this hull?" → grounded answer citing finding_ids + source frames + confidence.
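On the consuming side, an agent may want to verify grounding itself rather than trust the flag alone. The sketch below assumes a response shape built from the fields named on this page; the nesting of citations and the isGrounded helper are hypothetical.

```typescript
// Hypothetical response shape; only the field names come from the text.
interface VqaAnswer {
  answer: string;
  confidence: number;
  citations: { finding_id: string; source_frames: string[] }[];
  ai_model_version: string;
  grounded_in_passport: boolean;
}

// Accept an answer only if it is flagged as grounded, cites at least one
// finding, and every cited finding_id exists in the queried passport.
function isGrounded(answer: VqaAnswer, passportFindingIds: Set<string>): boolean {
  if (!answer.grounded_in_passport) return false;
  if (answer.citations.length === 0) return false;
  return answer.citations.every((c) => passportFindingIds.has(c.finding_id));
}
```

An agent could drop or escalate any answer for which isGrounded returns false instead of passing it downstream.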
VLA Vision-Language-Action
Co-developed
Takes a finding and orchestrates the next physical action: request a closeup capture, reposition the camera, trigger a Spot / ANYmal navigation to the location, or request human review. The VLAProvider interface is in src/lib/vlm/provider.ts; implementations are co-developed per operator at platform tier.
Who uses it. Energy majors running autonomous inspection robotics (Aker BP's Eureka, Equinor subsea autonomy, ConocoPhillips Ekofisk autonomy). Co-developed with the operator's engineering team.
Endpoint
Co-developed; not exposed on the public surface.
Example
On a severe-graded finding, VLA can trigger Spot to navigate to the location, capture a high-frame-rate closeup, and emit an updated passport revision — all in one loop, no human in the middle.
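One way to picture the decision step in such a loop is a pure function from a finding to a list of actions. This is a sketch only: the action names, the 0.5 thresholds, the severity grades other than "severe", and planActions itself are assumptions, and the real VLAProvider implementations are operator-specific.

```typescript
// Hypothetical action vocabulary, modelled on the four actions named above.
type VlaAction =
  | { kind: "request_closeup"; finding_id: string }
  | { kind: "reposition_camera"; finding_id: string }
  | { kind: "navigate_robot"; finding_id: string; robot: "Spot" | "ANYmal" }
  | { kind: "request_human_review"; finding_id: string };

interface FindingSummary {
  finding_id: string;
  severity: "low" | "moderate" | "severe";
  confidence: number; // assumed to lie in [0, 1]
  image_quality: number; // assumed to lie in [0, 1]
}

// Assumed policy, for illustration only:
// 1. low confidence goes to a human before any robot moves;
// 2. a severe finding triggers navigation followed by a closeup;
// 3. poor image quality triggers a camera reposition;
// 4. anything else needs no action.
function planActions(f: FindingSummary, robot: "Spot" | "ANYmal" = "Spot"): VlaAction[] {
  if (f.confidence < 0.5) {
    return [{ kind: "request_human_review", finding_id: f.finding_id }];
  }
  if (f.severity === "severe") {
    return [
      { kind: "navigate_robot", finding_id: f.finding_id, robot },
      { kind: "request_closeup", finding_id: f.finding_id },
    ];
  }
  if (f.image_quality < 0.5) {
    return [{ kind: "reposition_camera", finding_id: f.finding_id }];
  }
  return [];
}
```

Keeping the planner free of side effects makes the policy easy to review with the operator before any action is dispatched to hardware.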
Try VQA live
Pick any demo passport and ask a question with curl:
curl -X POST https://hullproof.com/api/passports/demo-offshore-jacket-001/qa \
-H "Content-Type: application/json" \
-d '{"question": "Which findings are most severe and which standards do they cite?"}'
Returns a structured response with answer, confidence, citations to finding_ids + source_frames, ai_model_version, and a grounded_in_passport flag.
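For agents written in TypeScript, the same request can be assembled in code. The helper below is a hypothetical sketch: buildQaRequest and its baseUrl parameter are illustrative names, while the path, method, header, and body mirror the curl command above.

```typescript
interface QaRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the URL and request options for the VQA endpoint without sending
// anything, so the request can be inspected or logged first.
function buildQaRequest(
  passportId: string,
  question: string,
  baseUrl: string = "https://hullproof.com",
): QaRequest {
  return {
    url: `${baseUrl}/api/passports/${encodeURIComponent(passportId)}/qa`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question }),
    },
  };
}
```

Passing url and init to fetch in a runtime that provides it sends the request; the parsed JSON body should then carry the fields listed above.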
For operators co-developing VLA
For programs such as Aker BP's Eureka, Equinor subsea autonomy, and ConocoPhillips Ekofisk autonomy, the VLA layer is where Hullproof findings drive Spot / ANYmal / AUV action loops. This is platform-tier scope, co-developed with the operator's robotics and engineering teams.