Google AI Agents Challenge 2026

Track 2: Optimizing an Existing Procurement Action Agent

We optimized an existing ADK procurement and vendor-risk agent so that it handles risky real-world payment and supplier edge cases. The stack is Gemini 2.5 Flash on Vertex AI, Google ADK multi-agent orchestration, custom private-data grounding (RAG), MBS runtime gating, and Google Cloud deployment.

Mandatory Technologies Verification

Intelligence

Gemini 2.5 Flash via Vertex AI / Agent Engine

The routing LLM and multi-agent reasoning backbone, hosted on Google Cloud Vertex AI in us-central1.

Orchestration

Google ADK root_agent + SequentialAgent

Agentic workflow orchestrated by Google ADK with EvidenceRetrievalAgent → ProcurementDecisionAgent → MBSGateAgent → ReviewPacketAgent.

Infrastructure

Google Cloud Run backend + Agent Engine deployed smoke test

Public demo front-end served on Cloudflare, while the agent backend runs on Google Cloud infrastructure.

Google Cloud Proof

Agent Engine Resource

projects/109897818537/locations/us-central1/reasoningEngines/355545226783227904

Deployment Note

Public demo is hosted on Cloudflare; agent backend runs on Google Cloud.

Demo Cases

Case 1: Normal Vendor Approval

PASS (Baseline)

Scenario: Standard vendor with complete evidence pack, standard contract terms, valid insurance.

The agent successfully retrieves policy and evidence, makes the correct approval decision, and passes MBS gate validation.

Case 2: Risky Edge Case

REVIEW (Baseline) / FAIL (Gated)

Scenario: Bank detail change request or off-channel payment instruction with missing evidence.

The baseline agent may approve the request; the MBS-gated agent correctly blocks it or routes it to review before execution.

Case 3: Optimized MBS-Gated

PASS/REVIEW/FAIL Logic

Scenario: All edge cases are handled by the MBS runtime gate, which evaluates decision confidence, evidence completeness, and policy compliance before any action is taken.

The optimized agent uses deterministic MBS gate logic to ensure safe action selection.
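A minimal sketch of what a deterministic PASS/REVIEW/FAIL gate over those three signals could look like. The field names, the `min_confidence` threshold of 0.8, and the rule ordering are assumptions made for illustration; the packaged MBS gate's actual inputs and rules are not shown on this page.

```python
from dataclasses import dataclass
from enum import Enum


class GateResult(str, Enum):
    PASS = "PASS"
    REVIEW = "REVIEW"
    FAIL = "FAIL"


@dataclass(frozen=True)
class ProposedAction:
    # Hypothetical inputs; the real MBS gate schema is not documented here.
    decision_confidence: float          # expected range 0.0 to 1.0
    evidence_complete: bool             # True if every required document is present
    policy_violations: tuple[str, ...] = ()  # ids of violated policy rules, if any


def gate(action: ProposedAction, min_confidence: float = 0.8) -> GateResult:
    """Pure function of its inputs, so the same action always gets the same verdict."""
    # Any explicit policy violation blocks the action outright.
    if action.policy_violations:
        return GateResult.FAIL
    # Missing evidence or low confidence is escalated to a human reviewer.
    if not action.evidence_complete or action.decision_confidence < min_confidence:
        return GateResult.REVIEW
    return GateResult.PASS
```

Because the function has no model call and no randomness, its verdict can be unit-tested case by case, which is the property a runtime gate of this kind relies on.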

Evidence Access

Scorecard

Early local mock evidence. Live Vertex Gemini / Vertex Agent Engine evidence is validated separately.

  • baseline_local: 2/3 decision match, 2/3 gate match. Local mock evidence, unoptimized.
  • optimized_local: 3/3 decision match, 3/3 gate match. Local mock evidence, GEPA optimized (33 iterations).
  • live_vertex_agent: 6/6 gate match (100% gate match rate). Live Gemini 2.5 Flash on Vertex Agent Engine.

Opaque label

The local scorecard rows are local mock evidence. The live Gemini/Vertex deployment is proven separately by the Agent Engine smoke test, which reached a 100% gate match.
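For readers who want to reproduce an "N/M match" figure, the arithmetic is a simple element-wise comparison of expected and observed labels. The sketch below assumes each case yields one label per run; the function name and the sample labels are illustrative and are not the submission's actual run data.

```python
def match_rate(expected: list[str], actual: list[str]) -> tuple[int, int, float]:
    """Return (matches, total, rate) for two equally long label lists."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same length")
    total = len(expected)
    # Booleans sum as 0/1, so this counts positions where the labels agree.
    matches = sum(e == a for e, a in zip(expected, actual))
    rate = matches / total if total else 0.0
    return matches, total, rate


# Illustrative labels only (not taken from the real evaluation runs).
matches, total, rate = match_rate(
    expected=["PASS", "REVIEW", "FAIL"],
    actual=["PASS", "PASS", "FAIL"],
)
print(f"{matches}/{total} gate match ({rate:.0%})")
```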

Grounding / RAG

We use custom private-data grounding / custom RAG, NOT Vertex AI Search.

Retrieval Agents & Tools

from typing import Any

# load_policy() and load_evidence_pack() are the project's own loaders, defined elsewhere.


def retrieve_policy() -> dict[str, Any]:
    """Retrieve the private procurement policy used for grounding."""
    policy = load_policy()
    return {
        "truth_label": "local custom private retrieval",
        "source": "demo/policies/procurement_policy_v1.md",
        "chars": len(policy),
        "policy": policy,
    }


def retrieve_evidence_pack(pack_id: str) -> dict[str, Any]:
    """Retrieve a private vendor evidence pack by id."""
    pack = load_evidence_pack(pack_id)
    return {
        "truth_label": "local custom private retrieval",
        "pack_id": pack_id,
        "found": pack is not None,
        "evidence_pack": pack,
    }

EvidenceRetrievalAgent calls both tools before any ProcurementDecisionAgent action. This ensures grounding in private procurement policy and vendor-specific evidence.

Private Data Sources

  • Private procurement policy: demo/policies/procurement_policy_v1.md
  • Vendor evidence packs: demo/evidence_packs/pack_*.json (10 different packs)
  • No external search: All retrieval is from local, pre-loaded private data
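The two loaders that the retrieval tools depend on are not shown on this page. A minimal sketch of how they might read the local files is given below. It assumes that a pack id maps directly to `<pack_id>.json` inside `demo/evidence_packs/`; that mapping, the `directory` and `path` parameters, and the id validation are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Optional

# Paths taken from the page; the id-to-filename mapping below is an assumption.
POLICY_PATH = Path("demo/policies/procurement_policy_v1.md")
EVIDENCE_DIR = Path("demo/evidence_packs")


def load_policy(path: Path = POLICY_PATH) -> str:
    """Read the private procurement policy as plain text."""
    return path.read_text(encoding="utf-8")


def load_evidence_pack(
    pack_id: str, directory: Path = EVIDENCE_DIR
) -> Optional[dict[str, Any]]:
    """Return the parsed evidence pack, or None if the id is invalid or unknown."""
    # Accept only letters, digits, and underscores so an id cannot escape the directory.
    if not pack_id.replace("_", "").isalnum():
        return None
    path = directory / f"{pack_id}.json"
    if not path.is_file():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Returning `None` for an unknown pack lets the calling tool report `"found": False` instead of raising, so a missing pack shows up as missing evidence downstream.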

Evidence in Traces

Trace Example (normal_standard_vendor)

{
  "spans": [
    {
      "name": "policy_retrieved",
      "status": "ok",
      "attributes": {
        "truth_label": "local custom private retrieval",
        "source": "demo/policies/procurement_policy_v1.md"
      }
    },
    {
      "name": "evidence_retrieved",
      "status": "ok",
      "attributes": {
        "pack_id": "pack_approve_standard"
      }
    }
  ]
}
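A trace in this shape can be checked mechanically for grounding. The sketch below assumes only the structure visible in the example above (a top-level `spans` list whose entries carry `name` and `status`); the helper name and the choice of required spans are illustrative.

```python
from typing import Any

# Spans that must be present with status "ok" for a run to count as grounded.
REQUIRED_SPANS = ("policy_retrieved", "evidence_retrieved")


def missing_grounding_spans(trace: dict[str, Any]) -> list[str]:
    """Return the required span names that are absent or did not finish with status 'ok'."""
    ok_spans = {
        span.get("name")
        for span in trace.get("spans", [])
        if span.get("status") == "ok"
    }
    return [name for name in REQUIRED_SPANS if name not in ok_spans]


# Same structure as the trace example, reduced to the fields the check reads.
trace = {
    "spans": [
        {"name": "policy_retrieved", "status": "ok"},
        {"name": "evidence_retrieved", "status": "ok"},
    ]
}
print(missing_grounding_spans(trace))  # prints []
```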

Multi-Agent Workflow

EvidenceRetrievalAgent → ProcurementDecisionAgent → MBSGateAgent → ReviewPacketAgent

Agent Chain Description

  1. EvidenceRetrievalAgent: Retrieves the private procurement policy and the vendor evidence pack
  2. ProcurementDecisionAgent: Uses the evidence to make a structured decision (PASS/REVIEW/FAIL)
  3. MBSGateAgent: Validates and gates the proposed action using deterministic MBS logic
  4. ReviewPacketAgent: Creates a human review packet for edge cases that need manual review

The chain is orchestrated by Google ADK SequentialAgent and configured via root_agent. All agents use VertexGemini(model="gemini-2.5-flash").
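The hand-off pattern can be illustrated without any framework. The sketch below is a plain-Python stand-in, not the ADK API: each step is a function that receives the shared state and returns an extended copy, and the steps run in a fixed order. The step bodies, the state keys, and the `pack_found` flag are placeholders invented for illustration.

```python
from typing import Any, Callable

State = dict[str, Any]
Step = Callable[[State], State]


def evidence_retrieval(state: State) -> State:
    # Placeholder: a real step would call the retrieval tools.
    found = state.get("pack_found", True)
    return {**state, "evidence": {"pack_id": state["pack_id"], "found": found}}


def procurement_decision(state: State) -> State:
    # Placeholder rule standing in for the model's structured decision.
    decision = "PASS" if state["evidence"]["found"] else "REVIEW"
    return {**state, "decision": decision}


def mbs_gate(state: State) -> State:
    # Placeholder: simply mirrors the decision; the real gate applies its own rules.
    return {**state, "gate": state["decision"]}


def review_packet(state: State) -> State:
    # Only non-PASS outcomes get a packet for human review.
    if state["gate"] == "PASS":
        return state
    packet = {"pack_id": state["pack_id"], "reason": state["gate"]}
    return {**state, "review_packet": packet}


PIPELINE: list[Step] = [evidence_retrieval, procurement_decision, mbs_gate, review_packet]


def run_sequential(steps: list[Step], state: State) -> State:
    """Run each step in order, feeding every step the previous step's output."""
    for step in steps:
        state = step(state)
    return state


result = run_sequential(PIPELINE, {"pack_id": "pack_approve_standard"})
print(result["gate"])  # prints PASS
```

Keeping the gate as its own step after the decision means no action reaches the review or execution stage without passing through it, whatever the decision step produced.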

Limitations

Opaque labels

  • Not claiming full hosted SaaS readiness.
  • Not claiming Vertex AI Search.
  • Some evaluation artifacts are local mock; live Gemini evidence is separately labeled.
  • The MBS gate is a deterministic, packaged runtime gate.

Product branding

General Aletheia/MBS product marketing is below this page or at /mbs. This page is judge-first and focused on Challenge Track 2 evidence.