Scalability is the evaluation criterion that's hardest to test in a trial and most important to get right before you're in production. A reporting layer that works well at 10 customers may not work well at 100 — and the failure mode is visible to your customers before it's visible to you.

This chapter covers what to look for in a platform's performance and scaling architecture, and a consideration most evaluations skip entirely: how your analytics platform billing model interacts with how you charge your own customers for analytics access.

What Degrades at Scale

Query performance against growing data. A report that renders in 400ms against a tenant with one year of transaction data may take 8 seconds against a tenant with five years. Analytical queries against large datasets are slow by nature — the question is what the platform does about it. In-memory caching is the primary answer: the first query runs against the database and the result is cached; subsequent requests for the same report serve the cached result instantly. Without caching, every report load hits the database, and load times scale with data volume.
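The read-through pattern described above can be sketched in a few lines. This is an illustrative sketch, not any platform's API — the class name and the injected query function are assumptions:

```python
class QueryCache:
    """Read-through in-memory cache: the first request for a report
    runs the query against the database; later requests for the same
    report serve the stored result from RAM."""

    def __init__(self, run_query):
        self._run_query = run_query   # the expensive database call
        self._store = {}              # report_id -> cached result

    def get(self, report_id):
        if report_id not in self._store:
            # Cache miss: hit the database once, keep the result.
            self._store[report_id] = self._run_query(report_id)
        # Cache hit: served from memory, independent of data volume.
        return self._store[report_id]
```

The point of the pattern is that load time stops scaling with data volume after the first request: only cache misses pay the query cost.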

Concurrency across tenants. In a multi-tenant deployment, tenants don't run reports at isolated times — they tend to cluster. Monday morning, end of month, right after a scheduled delivery window. When many tenants run heavy reports simultaneously, they compete for database resources. In a shared database architecture, this can create contention that slows everyone down. In a per-tenant database architecture, it's less of an issue — but the analytics platform itself can be a bottleneck if it doesn't have a connection pooling or queuing layer.
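A queuing layer of the kind mentioned above can be as simple as a bounded executor: requests beyond the cap wait their turn instead of piling onto the database. A minimal sketch, with the cap and class name as assumptions:

```python
import threading

class BoundedExecutor:
    """Caps how many report queries hit the database at once.
    When all slots are busy, additional requests block until a
    slot frees up, instead of adding to database contention."""

    def __init__(self, max_concurrent):
        self._slots = threading.Semaphore(max_concurrent)

    def run(self, query_fn):
        with self._slots:      # blocks while max_concurrent queries are in flight
            return query_fn()
```

Real connection pools add timeouts and fairness, but the core behavior is the same: Monday-morning bursts queue at the analytics layer rather than overwhelming the database.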

Scheduled report processing. A job queue processing 50 tenants' scheduled Monday morning reports needs to handle concurrency, failure, and retry gracefully. A naive sequential processor creates a delivery window that expands proportionally with tenant count — reports scheduled for 7am may arrive at 9am once you have enough tenants. A production-ready scheduler runs concurrent jobs with configurable worker counts and handles failures without dropping deliveries.
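The difference between a naive sequential processor and a concurrent one with retries can be sketched briefly. Everything here — function names, the retry count, the delivery callback — is illustrative, and a production scheduler would also need backoff, persistence, and dead-letter handling:

```python
from concurrent.futures import ThreadPoolExecutor

def run_scheduled_reports(tenant_ids, deliver, workers=4, max_attempts=3):
    """Run each tenant's scheduled delivery concurrently (bounded by
    `workers`); retry failed deliveries instead of silently dropping
    them. Returns (tenant_id, result, error) per tenant."""
    def attempt(tenant_id):
        for _ in range(max_attempts):
            try:
                return (tenant_id, deliver(tenant_id), None)
            except Exception as err:
                last_err = err
        # Retries exhausted: surface the failure rather than drop it.
        return (tenant_id, None, last_err)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(attempt, tenant_ids))
```

With `workers=4`, fifty tenants' reports complete in roughly the time of the thirteen slowest batches rather than fifty sequential runs — the delivery window grows much more slowly than tenant count.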

Caching Architecture — What to Look For

The most meaningful performance feature in an embedded analytics platform for ISVs is in-memory query caching with per-tenant isolation. Here's what that means in practice:

In-memory means results are stored in RAM, not on disk — retrieval is fast enough to be effectively instantaneous from the user's perspective. Disk-based caches are faster than re-running queries but still measurably slower than memory.

Per-tenant isolation means cached results from Tenant A are never served to Tenant B — the cache is namespaced by tenant identity. This is required for security as much as performance. A cache without tenant isolation is a data isolation failure waiting to happen.

Configurable invalidation means you can control when cached results are refreshed — on a schedule, on data change, or manually. A report that shows real-time inventory should invalidate frequently; a monthly summary report can cache for hours. If cache invalidation is all-or-nothing (everything expires together), you lose the performance benefit for slowly-changing reports while still hitting the database for fast-changing ones.
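The three properties above — in-memory storage, tenant-namespaced keys, and per-report TTLs — combine naturally in one structure. A minimal sketch with assumed names; the clock is injectable only to make the behavior testable:

```python
import time

class TenantCache:
    """In-memory cache keyed by (tenant, report), with a TTL per
    report. Results never cross tenant boundaries, and each report
    controls its own freshness window."""

    def __init__(self, ttls, clock=time.monotonic):
        self._ttls = ttls      # report_id -> seconds before refresh
        self._store = {}       # (tenant_id, report_id) -> (result, stored_at)
        self._clock = clock

    def get(self, tenant_id, report_id, run_query):
        # Namespaced key: Tenant A's entry can never answer Tenant B.
        key = (tenant_id, report_id)
        hit = self._store.get(key)
        if hit is not None:
            result, stored_at = hit
            if self._clock() - stored_at < self._ttls.get(report_id, 0):
                return result                     # still fresh: serve from memory
        result = run_query(tenant_id, report_id)  # expired or missing: re-query
        self._store[key] = (result, self._clock())
        return result
```

A real-time inventory report might get a TTL of seconds while a monthly summary gets hours — each report keeps its performance benefit without serving stale data.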

Yurbi FastCache

FastCache is Yurbi's in-memory caching engine — included in every plan. Query results are cached in memory with per-tenant isolation enforced. Cache invalidation is configurable per report. At scale, FastCache is the primary performance lever — ISVs with large data volumes or high concurrent user counts see the most significant impact. No additional configuration or add-on pricing required.

How Many Production Servers Do You Need?

Most ISV deployments start with a single production server. At larger scale — many tenants, high concurrent user counts, or customers who require their own dedicated instance — multiple production servers become necessary.

When evaluating platforms, understand the multi-server model before you need it:

Does the platform support horizontal scaling — multiple server instances behind a load balancer? What's the cost model for additional servers? Is there a limit on the number of production servers per license? Can individual customers be assigned to dedicated server instances for performance or data sovereignty reasons?

Yurbi's model: every plan includes one production server license. Additional servers are $500/server/year at list rate, with volume discounts above 10 additional servers. There's no limit on the number of servers you can run — one example deal in our sweet spot runs 160 servers. The per-server model means you can grow server count with your customer base at a predictable incremental cost.
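The arithmetic behind "predictable incremental cost" is simple enough to sketch. This uses the list rate and the one-included-server figure from above, and deliberately ignores the volume discounts that apply above 10 additional servers, so it overstates cost at scale:

```python
def annual_server_cost(total_servers, included=1, rate=500):
    """List-rate annual cost for additional production servers.
    Ignores volume discounts, which reduce the rate above 10
    additional servers -- treat this as an upper bound."""
    additional = max(0, total_servers - included)
    return additional * rate
```

Even at list rate, the cost curve is a straight, budgetable line: each new server is a known fixed amount, not a function of query volume or user count.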

Aligning Analytics Billing With Your Own Pricing

This is the strategic consideration most evaluations miss: how does your analytics platform billing interact with how you charge your customers for analytics access?

If you charge customers per user for analytics access, a per-user platform model is relatively straightforward to pass through — though your margin compresses as users grow unless you charge more per user than the platform costs per user. If you offer analytics as an included feature at a flat tier, a flat-tier platform cost is the cleanest model — your analytics cost is fixed regardless of how many users your customers provision.

The worst mismatch: a consumption-based platform model when you charge customers a flat fee. You've committed to a fixed price for your customers but your underlying cost varies with usage. A traffic spike or a heavy reporting month creates cost that you can't pass through and didn't budget for.
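The margin compression described above is easy to make concrete. All numbers here are illustrative, not any vendor's actual pricing:

```python
def analytics_margin(flat_customer_fee, platform_cost_per_user, users):
    """Monthly analytics margin when you charge a customer a flat fee
    but pay your platform per user. Illustrative numbers only."""
    return flat_customer_fee - platform_cost_per_user * users
```

At a hypothetical $500/month flat fee and $10/user platform cost, margin is $300 at 20 users but turns negative past 50 — the mismatch means your most successful customers erode your margin as they grow.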

| Your pricing to customers | Best-fit platform billing model | Avoid |
| --- | --- | --- |
| Flat tier (analytics included) | Flat-tier platform pricing | Consumption or per-user — both create variable cost against a fixed revenue commitment |
| Per-user analytics access | Per-user platform pricing (if margin works) or flat-tier with user cap | Consumption — unpredictable cost, no relationship to user count |
| Premium analytics tier / upsell | Flat-tier platform pricing — margin is cleaner, growth doesn't erode it | Per-user — margin compresses as premium tier users grow |
| Per-tenant analytics (each customer billed separately) | Per-deployment platform model with volume discounts at scale | Per-user — user count per tenant is outside your control |

Getting the billing alignment right at the start saves a painful repricing conversation with your customers later — and prevents a scenario where your most successful customers (the ones with the most active users) become your least profitable ones from an analytics margin perspective.

FastCache. Flat pricing. Predictable at scale.

In-memory caching with per-tenant isolation included in every plan. $500/server/year for additional production deployments. No consumption spikes, no per-user growth penalties.

See Full Pricing