Inside AI Agents / for engineers and architects

Right-sized models, and an interface the agents draw

Two ideas carry the experience: spend reasoning only where it earns its cost, and let an agent describe the interface so the frontend stays a pure renderer. Both are configuration-driven, not hardcoded.

Model tiering

Cheap where it can be, capable where it counts.

The most expensive habit in a multi-agent system is running routing and classification on a frontier model. One router table maps each agent to the smallest model that clears its bar. Retiering is a config edit, not a code change.

Proposed LLM and cloud stack, tentative
Agent or job Reasoning load Tier
Supervisor, sentiment, intent, profile extractionTrivialNova Micro
Generative UI payload, onboardingLow to mid, structuredNova Lite
Dr. Nila consultationMid to highClaude Sonnet
Deep research synthesisHighest, gatedClaude, deep tier
Voice sessionRealtime speechNova Sonic
Agents ask the router for a model by role and never name a model id. Escalate only when a cheaper tier demonstrably fails the bar.
AGUI, agent generated dynamic UI

Generative UI: the backend describes the screen

After a consultation turn, an agent emits a typed payload of components. The frontend holds a registry from component type to React component and mounts whatever arrives. One owner of the interface, a typed contract on both sides of the wire.

Dr. Nila turnthe consultation text Generative UIemits typed payload Component registrytype to React radar_chart comparison_table interactive_timeline
The payload is a typed schema on the backend; the frontend types are generated from it. A component the backend can emit is one the frontend knows how to render.
There is no free model

No Bedrock model is free. The cheapest is near zero per call but still draws account credit. The tiering target is near-zero cost on high-volume trivial work, not a free model, because none exists. A budget alarm guards the balance as a circuit breaker.