Scalability is the evaluation criterion that's hardest to test in a trial and most important to get right before you're in production. A reporting layer that works well at 10 customers may not work well at 100 — and the failure mode is visible to your customers before it's visible to you.

This chapter covers what to look for in a platform's performance and scaling architecture, and a consideration most evaluations skip entirely: how your analytics platform billing model interacts with how you charge your own customers for analytics access.

What Degrades at Scale

Query performance against growing data. A report that renders in 400ms against a tenant with one year of transaction data may take 8 seconds against a tenant with five years. Analytical queries against large datasets are slow by nature — the question is what the platform does about it. In-memory caching is the primary answer: the first query runs against the database and the result is cached; subsequent requests for the same report serve the cached result instantly. Without caching, every report load hits the database, and load times scale with data volume.
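The read-through pattern described above can be sketched in a few lines. This is an illustrative sketch, not any platform's API — the class name and the injected query function are assumptions:

```python
class QueryCache:
    """Read-through in-memory cache: the first request for a report
    runs the query against the database; later requests for the same
    report serve the stored result from RAM."""

    def __init__(self, run_query):
        self._run_query = run_query   # the expensive database call
        self._store = {}              # report_id -> cached result

    def get(self, report_id):
        if report_id not in self._store:
            # Cache miss: hit the database once, keep the result.
            self._store[report_id] = self._run_query(report_id)
        # Cache hit: served from memory, independent of data volume.
        return self._store[report_id]
```

The point of the pattern is that load time stops scaling with data volume after the first request: only cache misses pay the query cost.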

Concurrency across tenants. In a multi-tenant deployment, tenants don't run reports at isolated times — they tend to cluster. Monday morning, end of month, right after a scheduled delivery window. When many tenants run heavy reports simultaneously, they compete for database resources. In a shared database architecture, this can create contention that slows everyone down. In a per-tenant database architecture, it's less of an issue — but the analytics platform itself can be a bottleneck if it doesn't have a connection pooling or queuing layer.
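A queuing layer of the kind mentioned above can be as simple as a bounded executor: requests beyond the cap wait their turn instead of piling onto the database. A minimal sketch, with the cap and class name as assumptions:

```python
import threading

class BoundedExecutor:
    """Caps how many report queries hit the database at once.
    When all slots are busy, additional requests block until a
    slot frees up, instead of adding to database contention."""

    def __init__(self, max_concurrent):
        self._slots = threading.Semaphore(max_concurrent)

    def run(self, query_fn):
        with self._slots:      # blocks while max_concurrent queries are in flight
            return query_fn()
```

Real connection pools add timeouts and fairness, but the core behavior is the same: Monday-morning bursts queue at the analytics layer rather than overwhelming the database.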

Scheduled report processing. A job queue processing 50 tenants' scheduled Monday morning reports needs to handle concurrency, failure, and retry gracefully. A naive sequential processor creates a delivery window that expands proportionally with tenant count — reports scheduled for 7am may arrive at 9am once you have enough tenants. A production-ready scheduler runs concurrent jobs with configurable worker counts and handles failures without dropping deliveries.
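The difference between a naive sequential processor and a concurrent one with retries can be sketched briefly. Everything here — function names, the retry count, the delivery callback — is illustrative, and a production scheduler would also need backoff, persistence, and dead-letter handling:

```python
from concurrent.futures import ThreadPoolExecutor

def run_scheduled_reports(tenant_ids, deliver, workers=4, max_attempts=3):
    """Run each tenant's scheduled delivery concurrently (bounded by
    `workers`); retry failed deliveries instead of silently dropping
    them. Returns (tenant_id, result, error) per tenant."""
    def attempt(tenant_id):
        for _ in range(max_attempts):
            try:
                return (tenant_id, deliver(tenant_id), None)
            except Exception as err:
                last_err = err
        # Retries exhausted: surface the failure rather than drop it.
        return (tenant_id, None, last_err)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(attempt, tenant_ids))
```

With `workers=4`, fifty tenants' reports complete in roughly the time of the thirteen slowest batches rather than fifty sequential runs — the delivery window grows much more slowly than tenant count.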

Caching Architecture — What to Look For

The most meaningful performance feature in an embedded analytics platform for ISVs is in-memory query caching with per-tenant isolation. Here's what that means in practice:

In-memory means results are stored in RAM, not on disk — retrieval is fast enough to be effectively instantaneous from the user's perspective. Disk-based caches are faster than re-running queries but still measurably slower than memory.

Per-tenant isolation means cached results from Tenant A are never served to Tenant B — the cache is namespaced by tenant identity. This is required for security as much as performance. A cache without tenant isolation is a data isolation failure waiting to happen.

Configurable invalidation means you can control when cached results are refreshed — on a schedule, on data change, or manually. A report that shows real-time inventory should invalidate frequently; a monthly summary report can cache for hours. If cache invalidation is all-or-nothing (everything expires together), you lose the performance benefit for slowly-changing reports while still hitting the database for fast-changing ones.
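The three properties above — in-memory storage, tenant-namespaced keys, and per-report TTLs — combine naturally in one structure. A minimal sketch with assumed names; the clock is injectable only to make the behavior testable:

```python
import time

class TenantCache:
    """In-memory cache keyed by (tenant, report), with a TTL per
    report. Results never cross tenant boundaries, and each report
    controls its own freshness window."""

    def __init__(self, ttls, clock=time.monotonic):
        self._ttls = ttls      # report_id -> seconds before refresh
        self._store = {}       # (tenant_id, report_id) -> (result, stored_at)
        self._clock = clock

    def get(self, tenant_id, report_id, run_query):
        # Namespaced key: Tenant A's entry can never answer Tenant B.
        key = (tenant_id, report_id)
        hit = self._store.get(key)
        if hit is not None:
            result, stored_at = hit
            if self._clock() - stored_at < self._ttls.get(report_id, 0):
                return result                     # still fresh: serve from memory
        result = run_query(tenant_id, report_id)  # expired or missing: re-query
        self._store[key] = (result, self._clock())
        return result
```

A real-time inventory report might get a TTL of seconds while a monthly summary gets hours — each report keeps its performance benefit without serving stale data.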

Yurbi FastCache

FastCache is Yurbi's in-memory caching engine — included in every plan. Query results are cached in memory with per-tenant isolation enforced. Cache invalidation is configurable per report. At scale, FastCache is the primary performance lever — ISVs with large data volumes or high concurrent user counts see the most significant impact. No additional configuration or add-on pricing required.

How Many Production Servers Do You Need?

Most ISV deployments start with a single production server. At larger scale — many tenants, high concurrent user counts, or customers who require their own dedicated instance — multiple production servers become necessary.

When evaluating platforms, understand the multi-server model before you need it:

Does the platform support horizontal scaling — multiple server instances behind a load balancer? What's the cost model for additional servers? Is there a limit on the number of production servers per license? Can individual customers be assigned to dedicated server instances for performance or data sovereignty reasons?

Yurbi's model: every plan includes one production server license. Additional servers are $500/server/year at list rate, with volume discounts above 10 additional servers. There's no limit on the number of servers you can run — one example deal in our sweet spot runs 160 servers. The per-server model means you can grow server count with your customer base at a predictable incremental cost.
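The arithmetic behind "predictable incremental cost" is simple enough to sketch. This uses the list rate and the one-included-server figure from above, and deliberately ignores the volume discounts that apply above 10 additional servers, so it overstates cost at scale:

```python
def annual_server_cost(total_servers, included=1, rate=500):
    """List-rate annual cost for additional production servers.
    Ignores volume discounts, which reduce the rate above 10
    additional servers -- treat this as an upper bound."""
    additional = max(0, total_servers - included)
    return additional * rate
```

Even at list rate, the cost curve is a straight, budgetable line: each new server is a known fixed amount, not a function of query volume or user count.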

Aligning Analytics Billing With Your Own Pricing

This is the strategic consideration most evaluations miss: how does your analytics platform billing interact with how you charge your customers for analytics access?

If you charge customers per user for analytics access, a per-user platform model is relatively straightforward to pass through — though your margin compresses as users grow unless you charge more per user than the platform costs per user. If you offer analytics as an included feature at a flat tier, a flat-tier platform cost is the cleanest model — your analytics cost is fixed regardless of how many users your customers provision.

The worst mismatch: a consumption-based platform model when you charge customers a flat fee. You've committed to a fixed price for your customers but your underlying cost varies with usage. A traffic spike or a heavy reporting month creates cost that you can't pass through and didn't budget for.
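The margin compression described above is easy to make concrete. All numbers here are illustrative, not any vendor's actual pricing:

```python
def analytics_margin(flat_customer_fee, platform_cost_per_user, users):
    """Monthly analytics margin when you charge a customer a flat fee
    but pay your platform per user. Illustrative numbers only."""
    return flat_customer_fee - platform_cost_per_user * users
```

At a hypothetical $500/month flat fee and $10/user platform cost, margin is $300 at 20 users but turns negative past 50 — the mismatch means your most successful customers erode your margin as they grow.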

| Your pricing to customers | Best-fit platform billing model | Avoid |
| --- | --- | --- |
| Flat tier (analytics included) | Flat-tier platform pricing | Consumption or per-user — both create variable cost against a fixed revenue commitment |
| Per-user analytics access | Per-user platform pricing (if margin works) or flat-tier with user cap | Consumption — unpredictable cost, no relationship to user count |
| Premium analytics tier / upsell | Flat-tier platform pricing — margin is cleaner, growth doesn't erode it | Per-user — margin compresses as premium tier users grow |
| Per-tenant analytics (each customer billed separately) | Per-deployment platform model with volume discounts at scale | Per-user — user count per tenant is outside your control |

Getting the billing alignment right at the start saves a painful repricing conversation with your customers later — and prevents a scenario where your most successful customers (the ones with the most active users) become your least profitable ones from an analytics margin perspective.

FastCache. Flat pricing. Predictable at scale.

In-memory caching with per-tenant isolation included in every plan. $500/server/year for additional production deployments. No consumption spikes, no per-user growth penalties.

See Full Pricing