The whole protocol, end to end
The load generator runs the payment flow and data flow in full, with PKCE, consent and polling. The simulator doesn't expose shortcut routes. The only way to receive a 200 is to walk through the correct protocol.
The regulated proxy sustains this throughput, 52k req/s, under realistic conditions calibrated with production telemetry. In practice, what's already in place handles roughly 1.9× the business-hour average of the entire Brazilian Open Finance ecosystem. The path to more capacity is already mapped.
The four numbers
All measurements below came out of the same environment, running full FAPI-BR flows with production-calibrated latency. Not paper throughput: this is what's already in place today.
Comparison
Based on internal analysis of public Dashboard do Cidadão data, with a conservative ~47% buffer added on top of the observed weekly peak. Cumbuca, as configured, absorbs almost twice the business-hour average of the entire ecosystem.
The ecosystem's largest initiator generated 26.06 million requests during its peak week in the reference period. At the throughput our infrastructure delivers today, that entire volume clears in under nine minutes of operation.
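As a back-of-envelope check, a minimal sketch using the 52k req/s ceiling discussed below:

```typescript
// Back-of-envelope: draining the largest initiator's peak week at today's throughput.
const PEAK_WEEK_REQUESTS = 26_060_000; // largest initiator, peak week of the reference period
const SUSTAINED_RPS = 52_000;          // current measured ceiling

const minutes = PEAK_WEEK_REQUESTS / SUSTAINED_RPS / 60;
console.log(`${minutes.toFixed(1)} minutes`); // ~8.4 minutes, i.e. under nine
```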
The 52k req/s ceiling is a limit of the current configuration, not of the architecture. The layer that saturates under synthetic load is the load balancer's mTLS termination, not the proxy application. The path to more capacity is already defined: load balancer sharding, incremental, without changing a line of the application.
How we measured
Every virtual user runs the whole protocol: payment consent, PAR, authorise, client credentials, auth code, PIX initiation, polling. Real PKCE, real consent IDs, properly-scoped Bearer tokens.
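As a minimal sketch of what one virtual user's iteration might look like in k6 (TypeScript): the hosts, client ID, payloads, and the direct code hand-back after authorisation are assumptions of the sketch, not the harness's actual script.

```typescript
import http from 'k6/http';
import crypto from 'k6/crypto';
import encoding from 'k6/encoding';
import { check, sleep } from 'k6';

// Hosts, client ID and payloads below are placeholders; the real values (and the
// mTLS client certificates, configured via k6's tlsAuth option) live in the
// private harness, not in this sketch.
const AUTH = 'https://auth.simulator.example';
const API = 'https://api.simulator.example';
const CLIENT_ID = 'example-client-id';

export default function () {
  // PKCE material: random verifier plus its S256 challenge.
  const verifier = encoding.b64encode(crypto.randomBytes(32), 'rawurl');
  const challenge = crypto.sha256(verifier, 'base64rawurl');

  // Client-credentials token used to create the payment consent.
  const ccToken = String(
    http
      .post(`${AUTH}/token`, { grant_type: 'client_credentials', client_id: CLIENT_ID, scope: 'payments' })
      .json('access_token'),
  );

  const consentId = String(
    http
      .post(`${API}/payments/v4/consents`, JSON.stringify({ /* consent payload */ }), {
        headers: { Authorization: `Bearer ${ccToken}`, 'Content-Type': 'application/json' },
      })
      .json('data.consentId'),
  );

  // Pushed authorisation request (PAR), then the code exchange with the PKCE verifier.
  const requestUri = String(
    http
      .post(`${AUTH}/par`, {
        client_id: CLIENT_ID,
        scope: `payments consent:${consentId}`,
        response_type: 'code',
        code_challenge: challenge,
        code_challenge_method: 'S256',
      })
      .json('request_uri'),
  );

  // In this sketch the simulator hands the authorisation code back directly.
  const code = String(http.get(`${AUTH}/authorize?client_id=${CLIENT_ID}&request_uri=${requestUri}`).json('code'));

  const token = String(
    http
      .post(`${AUTH}/token`, { grant_type: 'authorization_code', client_id: CLIENT_ID, code, code_verifier: verifier })
      .json('access_token'),
  );

  // PIX initiation, then polling until the simulator reports the payment as settled.
  const paymentId = String(
    http
      .post(`${API}/payments/v4/pix/payments`, JSON.stringify({ /* payment payload */ }), {
        headers: { Authorization: `Bearer ${token}`, 'Content-Type': 'application/json' },
      })
      .json('data.paymentId'),
  );

  let status = '';
  for (let i = 0; i < 10 && status !== 'ACSC'; i++) {
    sleep(1);
    status = String(
      http
        .get(`${API}/payments/v4/pix/payments/${paymentId}`, { headers: { Authorization: `Bearer ${token}` } })
        .json('data.status'),
    );
  }
  check(status, { 'payment settled': (s) => s === 'ACSC' });
}
```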
The simulator injects response latency drawn from a piecewise-linear distribution anchored to the quantiles we see in production: p50 at 130 ms, p90 at 350 ms, p99 at 1,350 ms. The reported throughput was measured against that latency, not against a zero-latency baseline.
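For illustration, one way to model that injected latency is inverse-CDF sampling with linear interpolation between the quoted quantile anchors; the floor and tail cap below are assumptions of the sketch, not measured values.

```typescript
// Inverse-CDF sampling over a piecewise-linear distribution anchored at the
// production quantiles quoted above. The 0th and 100th percentile anchors are
// assumptions for the sketch, not measurements.
const anchors: Array<[number, number]> = [
  [0.0, 40],    // assumed floor
  [0.5, 130],   // p50
  [0.9, 350],   // p90
  [0.99, 1350], // p99
  [1.0, 2500],  // assumed tail cap
];

function sampleLatencyMs(): number {
  const u = Math.random();
  for (let i = 1; i < anchors.length; i++) {
    const [p0, v0] = anchors[i - 1];
    const [p1, v1] = anchors[i];
    if (u <= p1) {
      // Linear interpolation between the two surrounding quantile anchors.
      return v0 + ((u - p0) / (p1 - p0)) * (v1 - v0);
    }
  }
  return anchors[anchors.length - 1][1];
}
```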
120 m8i.large instances run the proxy behind a managed ALB with mTLS termination. 16 m8i.large instances run the Open Finance simulator. Two c5.9xlarge instances run k6 in parallel as the load generator.
The ceiling and the path beyond
Around 52k req/s, failures show up that horizontal proxy scaling can't resolve. The cause is the load balancer's cryptographic handshake capacity. The bottleneck lives at the edge, not in the application.
The architectural path is pre-planned: multiple independent load balancers behind an L4 distributor. Each additional shard adds mTLS termination capacity linearly.
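As a rough illustration, if each shard terminates mTLS at roughly the ~52k req/s the current single load balancer sustains (an assumption for sizing, not a measured per-shard figure), the shard count for a target throughput is a simple division:

```typescript
// Rough shard sizing: assumes each mTLS-terminating shard sustains roughly what
// the single current load balancer does (~52k req/s). Illustrative only.
const PER_SHARD_RPS = 52_000;

function shardsFor(targetRps: number): number {
  return Math.ceil(targetRps / PER_SHARD_RPS);
}

console.log(shardsFor(150_000)); // 3 shards for an illustrative 150k req/s target
```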
Full report
English text, 38 pages, Confidential classification. Includes test topology, latency distribution, status codes, the mTLS bottleneck analysis, comparison against the entire ecosystem, and the sharding roadmap.
Drop your email and company. An engineer on the team delivers the PDF within one business day.
Next step
We give access to a sandbox that mirrors production. You run your own k6 suite against real infrastructure. Engineer in the loop from the first email.
Talk to engineering