Scalability is the evaluation criterion that's hardest to test in a trial and most important to get right before you're in production. A reporting layer that works well at 10 customers may not work well at 100 — and the failure mode is visible to your customers before it's visible to you.
This chapter covers what to look for in a platform's performance and scaling architecture, and a consideration most evaluations skip entirely: how your analytics platform billing model interacts with how you charge your own customers for analytics access.
## What Degrades at Scale
**Query performance against growing data.** A report that renders in 400ms against a tenant with one year of transaction data may take 8 seconds against a tenant with five years. Analytical queries against large datasets are slow by nature — the question is what the platform does about it. In-memory caching is the primary answer: the first query runs against the database and the result is cached; subsequent requests for the same report serve the cached result instantly. Without caching, every report load hits the database, and load times scale with data volume.
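The cache-hit flow described above can be sketched in a few lines. This is an illustration of the general pattern, not Yurbi's implementation; all names are hypothetical:

```python
class TenantQueryCache:
    """Minimal in-memory result cache, keyed by (tenant, report) so one
    tenant's cached rows can never be looked up under another tenant's key."""

    def __init__(self):
        self._store = {}  # (tenant_id, report_id) -> cached result

    def get(self, tenant_id, report_id):
        return self._store.get((tenant_id, report_id))

    def put(self, tenant_id, report_id, result):
        self._store[(tenant_id, report_id)] = result


def run_report(cache, tenant_id, report_id, query_db):
    cached = cache.get(tenant_id, report_id)
    if cached is not None:
        return cached                            # fast path: served from memory
    result = query_db(tenant_id, report_id)      # slow path: hits the database
    cache.put(tenant_id, report_id, result)
    return result
```

The first call for a given tenant and report pays the database cost; every later call returns from memory until the entry is invalidated.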
**Concurrency across tenants.** In a multi-tenant deployment, tenants don't run reports at isolated times — they tend to cluster. Monday morning, end of month, right after a scheduled delivery window. When many tenants run heavy reports simultaneously, they compete for database resources. In a shared database architecture, this can create contention that slows everyone down. In a per-tenant database architecture, it's less of an issue — but the analytics platform itself can be a bottleneck if it doesn't have a connection pooling or queuing layer.
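One common shape for that queuing layer is a concurrency gate that caps in-flight queries, so a burst queues briefly instead of saturating the database. A minimal sketch under those assumptions (the class and its cap are illustrative):

```python
import threading

class QueryGate:
    """Caps the number of analytical queries in flight at once.
    Requests beyond the cap block until a slot frees up."""

    def __init__(self, max_concurrent):
        self._slots = threading.Semaphore(max_concurrent)

    def run(self, query_fn, *args):
        with self._slots:           # blocks until another query finishes
            return query_fn(*args)
```

Every tenant's report request passes through the gate; the database only ever sees `max_concurrent` simultaneous queries, and the rest wait their turn.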
**Scheduled report processing.** A job queue processing 50 tenants' scheduled Monday morning reports needs to handle concurrency, failure, and retry gracefully. A naive sequential processor creates a delivery window that expands proportionally with tenant count: reports scheduled for 7am may arrive at 9am once you have enough tenants. A production-ready scheduler runs concurrent jobs with configurable worker counts and handles failures without dropping deliveries.
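The scheduler shape described above — concurrent workers plus bounded retries, with failures surfaced rather than dropped — can be sketched like this (function names are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def deliver_scheduled_reports(tenant_ids, deliver_fn, workers=8, max_retries=2):
    """Run tenant deliveries concurrently; retry failures instead of
    dropping them. A production scheduler would also persist job state."""

    def deliver_with_retry(tenant_id):
        last_error = None
        for _attempt in range(max_retries + 1):
            try:
                return (tenant_id, deliver_fn(tenant_id), None)
            except Exception as exc:          # sketch: catch-all for retry
                last_error = exc
        return (tenant_id, None, last_error)  # surfaced, not silently lost

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(deliver_with_retry, tenant_ids))
```

With 8 workers, 50 tenants' reports go out in roughly the time of the slowest batch of 8 rather than 50 sequential deliveries, and a transient SMTP failure gets retried instead of costing a tenant their Monday report.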
## Caching Architecture — What to Look For
The most meaningful performance feature in an embedded analytics platform for ISVs is in-memory query caching with per-tenant isolation. Here's what that means in practice:
**In-memory** means results are stored in RAM, not on disk — retrieval is fast enough to be effectively instantaneous from the user's perspective. Disk-based caches are faster than re-running queries but still measurably slower than memory.
**Per-tenant isolation** means cached results from Tenant A are never served to Tenant B — the cache is namespaced by tenant identity. This is required for security as much as performance. A cache without tenant isolation is a data isolation failure waiting to happen.
**Configurable invalidation** means you can control when cached results are refreshed — on a schedule, on data change, or manually. A report that shows real-time inventory should invalidate frequently; a monthly summary report can cache for hours. If cache invalidation is all-or-nothing (everything expires together), you lose the performance benefit for slowly-changing reports while still hitting the database for fast-changing ones.
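At its simplest, per-report invalidation means each report carries its own maximum cache age. A minimal sketch of that idea (illustrative only, not any platform's actual API):

```python
import time

class ReportCachePolicy:
    """Per-report invalidation: each report names its own max cache age,
    so a real-time inventory report refreshes often while a monthly
    summary can stay cached for hours."""

    def __init__(self):
        self._ttls = {}  # report_id -> max cache age in seconds

    def set_ttl(self, report_id, seconds):
        self._ttls[report_id] = seconds

    def is_fresh(self, report_id, cached_at, now=None):
        now = time.time() if now is None else now
        ttl = self._ttls.get(report_id, 0)  # unknown report: never serve stale
        return (now - cached_at) <= ttl
```

A cache lookup consults `is_fresh` before serving; a stale entry falls through to the database and is re-cached, so each report gets exactly the freshness its data demands.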
FastCache is Yurbi's in-memory caching engine — included in every plan. Query results are cached in memory with per-tenant isolation enforced. Cache invalidation is configurable per report. At scale, FastCache is the primary performance lever — ISVs with large data volumes or high concurrent user counts see the most significant impact. No additional configuration or add-on pricing required.
## How Many Production Servers Do You Need?
Most ISV deployments start with a single production server. At larger scale — many tenants, high concurrent user counts, or customers who require their own dedicated instance — multiple production servers become necessary.
When evaluating platforms, understand the multi-server model before you need it:
- Does the platform support horizontal scaling — multiple server instances behind a load balancer?
- What's the cost model for additional servers?
- Is there a limit on the number of production servers per license?
- Can individual customers be assigned to dedicated server instances for performance or data sovereignty reasons?
Yurbi's model: every plan includes one production server license. Additional servers are $500/server/year at list rate, with volume discounts above 10 additional servers. There's no limit on the number of servers you can run — the example deal in our sweet spot runs 160 servers. The per-server model means server count can grow with your customer base at a predictable incremental cost.
## Aligning Analytics Billing With Your Own Pricing
This is the strategic consideration most evaluations miss: how does your analytics platform billing interact with how you charge your customers for analytics access?
If you charge customers per user for analytics access, a per-user platform model is relatively straightforward to pass through — though your margin compresses as users grow unless you charge more per user than the platform costs per user. If you offer analytics as an included feature at a flat tier, a flat-tier platform cost is the cleanest model — your analytics cost is fixed regardless of how many users your customers provision.
The worst mismatch: a consumption-based platform model when you charge customers a flat fee. You've committed to a fixed price for your customers but your underlying cost varies with usage. A traffic spike or a heavy reporting month creates cost that you can't pass through and didn't budget for.
| Your pricing to customers | Best-fit platform billing model | Avoid |
|---|---|---|
| Flat tier (analytics included) | Flat-tier platform pricing | Consumption or per-user — both create variable cost against a fixed revenue commitment |
| Per-user analytics access | Per-user platform pricing (if margin works) or flat-tier with user cap | Consumption — unpredictable cost, no relationship to user count |
| Premium analytics tier / upsell | Flat-tier platform pricing — margin is cleaner, growth doesn't erode it | Per-user — margin compresses as premium tier users grow |
| Per-tenant analytics (each customer billed separately) | Per-deployment platform model with volume discounts at scale | Per-user — user count per tenant is outside your control |
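To make the margin mechanics in the table concrete, here is a small worked example with hypothetical numbers: a $200/month flat analytics fee charged to your customer, a $50 flat platform cost, and an $8 per-active-user platform cost. None of these are real prices.

```python
def analytics_margin(revenue, platform_cost):
    """What your analytics tier earns per customer after platform cost."""
    return revenue - platform_cost

def per_user_cost(users, rate=8):
    """Platform bill when the vendor charges per active user."""
    return users * rate

REVENUE = 200  # hypothetical flat monthly analytics fee you charge a customer

# Flat platform billing: margin is identical whether the customer
# provisions 5 users or 50.
flat_margin = analytics_margin(REVENUE, 50)

# Per-user platform billing: the same customer gets less profitable
# as adoption grows.
margin_5_users = analytics_margin(REVENUE, per_user_cost(5))    # 200 - 40  = 160
margin_20_users = analytics_margin(REVENUE, per_user_cost(20))  # 200 - 160 = 40
```

The per-user rows show the compression the table warns about: under a flat fee to your customer, each additional active user eats directly into your margin.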
Getting the billing alignment right at the start saves a painful repricing conversation with your customers later — and prevents a scenario where your most successful customers (the ones with the most active users) become your least profitable ones from an analytics margin perspective.
FastCache. Flat pricing. Predictable at scale.
In-memory caching with per-tenant isolation included in every plan. $500/server/year for additional production deployments. No consumption spikes, no per-user growth penalties.
See Full Pricing