Cutting BigQuery cost on a Firebase stack without slowing the product down
A practical FinOps case study on reducing BigQuery scan cost by moving the right reads off Firestore exports, simplifying joins, and debouncing search-heavy paths.
Costs rarely come down because of one dramatic rewrite.
In most products, the useful work is less glamorous:
- deciding which queries really need fresh data
- moving stable reads onto cheaper paths
- reducing how often expensive queries are triggered
- simplifying query shapes that grew more complex than the product actually needed
That was the situation on a retail workflow product built on Firebase and BigQuery.
The goal was not to make analytics infrastructure theoretically cleaner. The goal was to cut query cost hard without making the product feel slower, stale, or less useful.
All examples below are anonymized, but they reflect the real structure of the work.
The review surfaced three high-value optimizations. One was shipped immediately because it dominated request volume. The others were validated with direct query measurements and were clear follow-up changes.
The expensive default
The original setup leaned heavily on Firestore export tables in BigQuery.
That was convenient for one reason: the data was fresh.
It was also expensive for another: queries had to scan JSON-heavy payloads and repeatedly extract fields with functions like JSON_VALUE(...) and JSON_EXTRACT_STRING_ARRAY(...).
That tradeoff is easy to ignore at first. A single query looks fine. A product path that runs all day across kiosk and web surfaces is where it starts to hurt.
Three query families were doing most of the damage:
- loading the list of companies available to the current store
- searching companies while a user typed during visit or account creation
- searching contacts in the web app
The key lesson was simple: freshness should be spent intentionally.
Not every query deserves the most up-to-date source.
A better framing for FinOps work
Once the problem is framed as “make BigQuery cheaper,” it is tempting to focus only on SQL tricks.
That helps, but it misses the bigger lever.
The more useful framing is this:
Which product interactions actually require real-time data, and which ones only need data that is recent enough?
That question led to three practical changes:
- move some reads from raw Firestore export tables to structured daily backup tables
- debounce the company search so typing does not fire a query on every keystroke
- remove a join from contact search that was not pulling its weight anymore
1) Move store-scoped company lists off the raw export
One path loaded the list of companies accessible from the user’s current store. It was used in filters, especially around visit views.
The original version read from a raw export table:
SELECT
DISTINCT(JSON_VALUE(data, '$.company')) AS company
FROM
`project.analytics_export.accounts_raw_latest`
WHERE
'stores/STORE_ID' IN UNNEST(JSON_EXTRACT_STRING_ARRAY(data, '$.stores'))
AND JSON_VALUE(data, '$.company') IS NOT NULL
ORDER BY
company ASC
The measured replacement used a structured backup table instead:
SELECT
DISTINCT(company) AS company
FROM
`project.analytics_backup.accounts`
WHERE
'STORE_ID' IN UNNEST(stores.name)
AND company IS NOT NULL
ORDER BY
company ASC
That change came with one explicit compromise: the backup table was updated daily, so companies created today would not appear immediately.
For this product path, that was acceptable. Those entries were not the ones users depended on first, and the UI could tolerate a daily lag in that specific filter.
The measured gain was large enough to justify the tradeoff:
| Source | Bytes processed | Latency |
|---|---|---|
| Raw export | 446.7 MB | 1 s |
| Daily backup | 31.18 KB | 517 ms |
That is a 99.9% reduction in bytes processed for a query users were hitting regularly.
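How the structured backup tables themselves are produced was outside the scope of these changes, but for readers who do not already have one, a daily scheduled query over the raw export is one straightforward way to materialize such a table. The sketch below is an assumption about shape, not this project's actual pipeline; only a few of the fields used above are shown, and the store prefix handling mirrors the difference between the two WHERE clauses.
-- Illustrative daily scheduled query (assumption: the backup is rebuilt from the raw export).
-- Field names mirror the queries above; the real backup schema may differ.
CREATE OR REPLACE TABLE `project.analytics_backup.accounts` AS
SELECT
  document_id,
  JSON_VALUE(data, '$.company') AS company,
  JSON_VALUE(data, '$.role') AS role,
  CAST(JSON_VALUE(data, '$.disabled') AS BOOL) AS disabled,
  -- The export stores references as 'stores/STORE_ID'; keeping only the ID
  -- matches the 'STORE_ID' IN UNNEST(stores.name) filter used above.
  STRUCT(
    ARRAY(
      SELECT REGEXP_REPLACE(s, '^stores/', '')
      FROM UNNEST(JSON_EXTRACT_STRING_ARRAY(data, '$.stores')) AS s
    ) AS name
  ) AS stores
FROM
  `project.analytics_export.accounts_raw_latest`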
2) Fix the busiest path first: company search during typing
The most important query was not the most complex one. It was the one users triggered the most.
Company search ran while users typed into an input on the kiosk flow and a related web flow. Before any optimization, each input change fired a new request.
That meant the product was paying twice:
- the query itself was expensive
- the UI was generating a lot of unnecessary calls
In a sampled six-minute window of 100 requests, about 84 requests came from this search path alone. That made the priority obvious.
The original query joined raw export tables and extracted fields from JSON on both sides:
SELECT
JSON_VALUE(c.data, '$.label') AS label,
TIMESTAMP_SECONDS(CAST(JSON_VALUE(c.data, '$.creationDate._seconds') AS INT64)) AS creationDate,
JSON_VALUE(c.data, '$.pending') AS pending,
TIMESTAMP_SECONDS(CAST(JSON_VALUE(c.data, '$.lastUsedDate._seconds') AS INT64)) AS lastUsedDate,
COUNT(*) AS countAccounts,
c.document_id AS id
FROM
`project.analytics_export.companies_raw_latest` AS c
LEFT JOIN
`project.analytics_export.accounts_raw_latest` AS a
ON
JSON_VALUE(c.data, '$.label') = JSON_VALUE(a.data, '$.company')
WHERE
CAST(JSON_VALUE(c.data, '$.pending') AS BOOL) IS DISTINCT FROM TRUE
AND UPPER(REGEXP_REPLACE(bigfunctions.eu.remove_accents(JSON_VALUE(c.data, '$.label')), ' ', '')) LIKE '%SEARCH%'
GROUP BY
id,
label,
creationDate,
pending,
lastUsedDate
ORDER BY
countAccounts DESC,
lastUsedDate DESC,
label ASC
LIMIT 20
The replacement moved the same behavior onto structured backup tables:
SELECT
c.label AS label,
c.creationDate AS creationDate,
c.pending AS pending,
c.lastUsedDate AS lastUsedDate,
COUNT(*) AS countAccounts,
c.__key__.name AS id
FROM
`project.analytics_backup.companies` AS c
LEFT JOIN
`project.analytics_backup.accounts` AS a
ON
c.label = a.company
WHERE
c.pending IS DISTINCT FROM TRUE
AND UPPER(REGEXP_REPLACE(bigfunctions.eu.remove_accents(c.label), ' ', '')) LIKE '%SEARCH%'
GROUP BY
id,
label,
creationDate,
pending,
lastUsedDate
ORDER BY
countAccounts DESC,
lastUsedDate DESC,
label ASC
LIMIT 20
The query-level result was already strong:
| Source | Bytes processed | Latency |
|---|---|---|
| Raw export | 761.14 MB | 1 s |
| Daily backup | 2.49 MB | 408 ms |
That is another reduction of roughly 99.7% in bytes processed, plus a meaningful drop in response time.
But the bigger win came from one UI change: adding a 500 ms debounce to the search input.
That mattered because even a faster query is still wasteful if the client keeps firing it on every keystroke.
With the debounce in place:
- fewer requests were sent to BigQuery
- fewer expensive calls were made during fast typing
- the UI felt more stable because results stopped flashing on every letter
This was the most useful combination in the whole effort: cheaper query + fewer calls + smoother interaction.
At the time, this search endpoint represented roughly 75% of request volume. After switching it to the backup table and debouncing the input, the time to a useful result dropped to roughly a quarter of what it had been, even with the small delay introduced before firing the request.
That is a good reminder that performance work is not only about raw execution time. Sometimes the best improvement is to stop asking the backend the same question over and over.
3) Keep fresh data where it matters, but simplify the query
Contact search had a different constraint.
Unlike the previous two paths, this one had to stay close to real time because it powered a page where freshness mattered more. That meant keeping the query on the Firestore export instead of the daily backup.
The useful optimization was elsewhere.
The original version joined accounts to visitors in order to verify a visitor existed for each account:
SELECT
JSON_VALUE(a.data, '$.company') AS company,
JSON_VALUE(a.data, '$.lastName') AS lastName,
a.document_id AS id
FROM
`project.analytics_export.accounts_raw_latest` AS a
INNER JOIN
`project.analytics_export.visitors_raw_latest` AS v
ON
CONCAT('accounts/', a.document_id) = JSON_VALUE(v.data, '$.user')
WHERE
'stores/STORE_ID' IN UNNEST(JSON_EXTRACT_STRING_ARRAY(a.data, '$.stores'))
AND CAST(JSON_VALUE(a.data, '$.disabled') AS BOOL) IS DISTINCT FROM TRUE
AND JSON_VALUE(a.data, '$.role') = 'VISITOR'
GROUP BY
id,
company,
lastName
ORDER BY
company ASC
In practice, that join was doing a lot of work to confirm something the access model already made very likely.
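Reduced to accounts only, the query looks roughly like the sketch below: same fresh export, same filters, no visitors table. This is the join-free shape rather than the exact shipped statement.
-- Hedged sketch of the simplified version: the visitors join is gone,
-- and GROUP BY is no longer needed because each account appears once in the export.
SELECT
  JSON_VALUE(a.data, '$.company') AS company,
  JSON_VALUE(a.data, '$.lastName') AS lastName,
  a.document_id AS id
FROM
  `project.analytics_export.accounts_raw_latest` AS a
WHERE
  'stores/STORE_ID' IN UNNEST(JSON_EXTRACT_STRING_ARRAY(a.data, '$.stores'))
  AND CAST(JSON_VALUE(a.data, '$.disabled') AS BOOL) IS DISTINCT FROM TRUE
  AND JSON_VALUE(a.data, '$.role') = 'VISITOR'
ORDER BY
  company ASC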
Once the query was reduced to accounts only, the business result stayed aligned with the product need while the measured scan volume dropped substantially:
| Version | Bytes processed | Latency |
|---|---|---|
| With join | 913.65 MB | 2 s |
| Without join | 467.97 MB | 1 s |
That is a 48.8% reduction in bytes processed without changing the freshness model.
This part mattered because not every optimization needs a new data source. Sometimes the right answer is simply to remove work the query no longer needs to do.
BigQuery habits this reinforced
These changes lined up with a few BigQuery practices that are easy to agree with in theory and worth enforcing in real product code.
Reduce the amount of data processed
Avoid SELECT * when you only need a few fields.
On structured backup tables, the difference is immediate:
| Query | Bytes processed | Latency |
|---|---|---|
| SELECT * FROM visits LIMIT 100 | 488.94 KB | 1 s |
| SELECT __key__.name, checkinDate FROM visits LIMIT 100 | 6.54 KB | 449 ms |
The problem with raw Firestore export tables is that the useful fields are often buried inside one large data payload. Once you scan it, you are already paying for a lot of information you may not need.
That is why structured backup tables are such an effective lever when daily freshness is acceptable.
Prefer simpler string matching when you can
When a query only needs substring matching, LIKE is usually a better fit than REGEXP_CONTAINS.
REGEXP_CONTAINS uses a regular expression engine, which is useful for more complex matching patterns but usually unnecessary for a simple “contains this text” check.
Switching to LIKE will not always reduce bytes processed, but it often helps runtime and keeps the intent easier to read.
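As a minimal illustration on the backup tables above (the search term ACME is just a placeholder), both filters return labels containing the same substring; the LIKE form simply states that intent directly.
-- Regular expression engine invoked for what is really a substring check:
SELECT
  label
FROM
  `project.analytics_backup.companies`
WHERE
  REGEXP_CONTAINS(label, r'ACME')

-- Same result, simpler intent:
SELECT
  label
FROM
  `project.analytics_backup.companies`
WHERE
  label LIKE '%ACME%'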
Use approximate aggregation where exactness is unnecessary
For reporting and exploration, functions like APPROX_COUNT_DISTINCT(...) can be a better tradeoff than exact distinct counts.
APPROX_COUNT_DISTINCT returns a close estimate rather than a mathematically exact result, which is often enough for analytics while being cheaper to compute.
Exactness should be chosen deliberately, not by habit.
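A minimal sketch on the backup accounts table above, assuming a report only needs a close answer rather than an exact one:
-- Exact distinct count: BigQuery must track every distinct value precisely.
SELECT
  COUNT(DISTINCT company) AS exactCompanies
FROM
  `project.analytics_backup.accounts`

-- Approximate distinct count: a close estimate, cheaper to compute,
-- usually fine for dashboards and exploration.
SELECT
  APPROX_COUNT_DISTINCT(company) AS approxCompanies
FROM
  `project.analytics_backup.accounts`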
Write queries that are easy to reason about
Readable queries are easier to optimize.
Using WITH clauses, also known as Common Table Expressions (CTEs), to separate filtering, joining, grouping, and final ordering makes it much easier to see where bytes and complexity are coming from.
A CTE is just a named intermediate query you define at the top of a statement, then reuse below. In practice, it is often the simplest way to break a large query into readable steps.
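As an illustration, the company search from earlier could be split into named steps so it is obvious which part filters, which part counts, and which part only orders the output. This is a sketch of the shape, not the query that shipped, and the account counting differs slightly from the original join.
-- Step 1: filter companies by the typed search term.
WITH matching_companies AS (
  SELECT
    __key__.name AS id,
    label,
    creationDate,
    pending,
    lastUsedDate
  FROM
    `project.analytics_backup.companies`
  WHERE
    pending IS DISTINCT FROM TRUE
    AND UPPER(REGEXP_REPLACE(bigfunctions.eu.remove_accents(label), ' ', '')) LIKE '%SEARCH%'
),
-- Step 2: count accounts per company label.
account_counts AS (
  SELECT
    company,
    COUNT(*) AS countAccounts
  FROM
    `project.analytics_backup.accounts`
  GROUP BY
    company
)
-- Step 3: join the two small intermediate results and order the output.
SELECT
  c.label AS label,
  c.creationDate AS creationDate,
  c.pending AS pending,
  c.lastUsedDate AS lastUsedDate,
  IFNULL(a.countAccounts, 0) AS countAccounts,
  c.id AS id
FROM
  matching_companies AS c
LEFT JOIN
  account_counts AS a
ON
  c.label = a.company
ORDER BY
  countAccounts DESC,
  lastUsedDate DESC,
  label ASC
LIMIT 20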
That is not just a maintainability concern. Over time, readability affects cost because unreadable queries are harder to challenge and simplify.
Why request frequency mattered as much as query cost
The most important operational insight was not hidden inside SQL.
It came from looking at request frequency.
In that sampled 100-request window:
- searchCompanies represented 84 calls
- getCompaniesByStore represented 8 calls
- getContacts represented 8 calls
That distribution made the priority clear.
If one path dominates request volume, optimizing it first can outweigh several smaller wins elsewhere. In this case, the company search path was expensive per query and overrepresented in traffic.
That is why the debounce mattered so much. Reducing bytes per call was useful. Reducing the number of calls was just as important.
What changed in practice
The useful outcome was not one magic optimization. It was a stack of smaller decisions that worked together.
One change was already live in production during this work: the company search path moved onto the backup table and gained a debounce.
The other changes were validated directly through query measurements and were straightforward next steps:
- use structured daily backups for queries that do not need immediate freshness
- keep raw export tables for paths where freshness is genuinely part of the product requirement
- remove joins that no longer change the business outcome
- debounce search-heavy inputs so the client generates fewer redundant requests
That combination cut scan cost aggressively while improving the experience users actually felt.
The product did not become less responsive.
It became more disciplined.
Takeaways
On Firebase and BigQuery stacks, the expensive part is often not one obviously bad query. It is the quiet combination of convenience, freshness, and frequency.
The practical lessons I would keep are:
- treat real-time freshness as a product decision, not a default
- prefer structured backup tables when they are fresh enough for the job
- optimize high-frequency paths before chasing low-volume wins
- add debounce where live search would otherwise spam the backend
- make query readability part of FinOps discipline, not just code style
Google’s own BigQuery cost optimization guidance points in the same direction.
What this case study made clear is that the biggest gains often come from combining infrastructure choices, SQL simplification, and product-level interaction design.
That is what made this work effective: fewer bytes, fewer calls, and a smoother product at the same time.