The Q3 release of PGX Stack — version 3.0 — brings two brand-new modules, a redesigned REST API, a 38% improvement in average latency, and a step-change in what agentic AI pipelines can do in production environments.
What Shipped This Quarter
This was the largest release in PGX Stack's history — not just in lines of code, but in architectural scope. Q3 marks the completion of PGX Stack v3.0, a ground-up rearchitecture of our intelligence layer designed to support the next generation of enterprise AI workloads: autonomous, agentic, and built for scale.
Two major new modules shipped to general availability. Four existing subsystems received significant performance upgrades. The public REST API was versioned to v2, with a clean separation of concerns and a migration path for all existing integrations. And end-to-end test coverage climbed from 87% to 94%.
New Modules
Agentic AI Pipeline Module
The most significant addition in v3.0 is native support for agentic AI workflows — multi-step sequences where the AI model selects and invokes tools, makes intermediate decisions, and adapts its plan based on observed results, all within a single API call.
- Define complex pipelines as declarative YAML or JSON configuration
- Built-in tool use: web retrieval, database queries, external API calls, code execution
- Configurable guard rails and output validation at each pipeline step
- Real-time execution telemetry via the `/v2/workflow/{id}/status` endpoint
- Full audit log for every tool invocation and model decision within a run
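As an illustration, a declarative pipeline definition might look like the sketch below. The field names (`steps`, `tool`, `guards`, `validate`, and so on) are hypothetical stand-ins, not the module's actual schema — consult the PGX Developer Portal for the real configuration reference.

```yaml
# Hypothetical pipeline definition -- field names are illustrative,
# not the actual PGX Stack schema.
name: support-ticket-triage
steps:
  - id: retrieve-context
    tool: web_retrieval          # built-in tool: fetch relevant documents
    guards:
      max_results: 5             # guard rail on tool output size
  - id: classify
    model: default               # model makes intermediate decisions here
    tools: [database_query, external_api]
    validate:
      schema: ticket_label       # per-step output validation
  - id: act
    tool: external_api
    on_failure: abort            # stop the run if this step fails
```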
Workflow Automation Engine
Complementing the agentic pipeline module is a persistent workflow orchestration layer — think event-driven automation, cron-scheduled tasks, and reactive pipelines triggered by external webhooks or internal system events.
- Event-driven triggers: webhooks, message queues, schedules, and API events
- Branching logic and conditional execution paths
- Retry policies, dead-letter queues, and failure handling
- Visual workflow builder in the PGX Enterprise dashboard (beta)
Performance Improvements
In addition to new features, Q3 delivered material latency and throughput improvements across the existing stack:
- LLM Router: -38% latency. A rewritten routing layer now evaluates model selection in under 5ms — down from an average of 28ms. This alone accounts for the majority of the headline latency improvement in end-to-end API calls.
- Vector Store: 2× indexing speed. A new HNSW index implementation cuts both write latency and query time in half for semantic search workloads, enabling sub-10ms retrieval for collections of up to 100 million vectors.
- Task Queue: +65% throughput. The internal task queue now sustains 12,000 requests per second at p99 latency under 80ms — up from 7,300 req/s in v2.x — enabling larger-scale concurrent pipeline workloads.
- Auth Layer: -40% overhead. Token validation is now handled at the edge, eliminating a round-trip to the central auth service for the majority of API calls.
REST API v2: What's Changed
PGX Stack v3.0 ships with a new version of the public REST API. The v1 API remains fully supported through Q2 next year, giving existing integrations a stable migration window. Key changes in v2:
- Unified authentication via Bearer tokens — API keys and session tokens merged into a single scheme
- Consistent response envelope: all endpoints return `{ data, meta, errors }`
- Streaming responses via Server-Sent Events for long-running pipeline operations
- Pagination standardized to cursor-based across all list endpoints
- New `/v2/pipeline/run` and `/v2/search/semantic` endpoints (no v1 equivalents)
Full migration documentation and a compatibility shim for common v1 patterns are available in the PGX Developer Portal.
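The combination of the response envelope and cursor-based pagination can be consumed with a small loop like the following Python sketch. Here `fetch_page` stands in for an HTTP GET against any v2 list endpoint, and the cursor field name (`meta.next_cursor`) is an assumption for illustration rather than the documented schema.

```python
def iterate_all(fetch_page, cursor=None):
    """Walk a cursor-paginated v2 list endpoint, yielding items from each page.

    `fetch_page(cursor)` is expected to return the v2 response envelope:
    {"data": [...], "meta": {"next_cursor": <str or None>}, "errors": [...]}.
    The "next_cursor" field name is assumed for illustration.
    """
    while True:
        envelope = fetch_page(cursor)
        if envelope.get("errors"):
            raise RuntimeError(f"API errors: {envelope['errors']}")
        yield from envelope["data"]
        cursor = envelope["meta"].get("next_cursor")
        if cursor is None:   # no further pages
            return

# Usage with a stubbed two-page endpoint (no network involved).
pages = {
    None: {"data": [1, 2], "meta": {"next_cursor": "c1"}, "errors": []},
    "c1": {"data": [3],    "meta": {"next_cursor": None}, "errors": []},
}
items = list(iterate_all(lambda c: pages[c]))
```

In production the stub would be replaced by an authenticated HTTP call carrying the Bearer token described above.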
What's Coming in Q4
In Active Development
- Multi-agent orchestration: coordinate multiple specialized agents within a single pipeline
- Fine-tuning API: bring-your-own-data model customization endpoints
- PGX Copilot SDK: drop-in AI assistant integration for third-party applications
In Design & Review
- Real-time collaboration layer for multi-user agentic workspaces
- On-premise deployment option for regulated industries
- Expanded observability: per-token cost attribution and usage analytics