
2026-05-06 · tech, lessons

Putting Lyria 3 in production: cost, abuse, and SEO

We have been running Brainy Records for a few weeks now — long enough to break things in production, fix them, and form opinions worth writing down. This is for the next person plumbing Lyria 3, or any per-call paid model, into a real product.

What it does

Brainy Records generates a complete AI band from a single genre prompt: name, members, tagline, lore, album cover, and an actual song with vocals. The pipeline runs in about twenty-five seconds end to end. Stack: Gemini 2 Flash for concept text, Imagen 4 for visuals, Lyria 3 Pro Preview for the vocal song, Veo 3 for optional music videos, all on Cloud Run with Firestore and Firebase Auth, fronted by a React SPA on Firebase Hosting.

The first lesson, before anything technical: Lyria 3 minutes cost real money on every call, and the model has its own opinions about what it will and will not generate. Both of those facts shape the entire architecture.

Lesson 1: the credit-check race condition

Here is the obvious wrong way to gate a paid model. Check credits, call the model, increment usage on success. It looks fine. It is not.

Two failure modes break it. Retry abuse: a user with one free credit calls the endpoint, Lyria errors out (timeout, safety filter, transient), the catch returns 500, the frontend retries. Usage was never incremented, so the credit check still passes, and each retry costs another Lyria call. One determined user with a flaky network can drain a meaningful share of the daily budget. Concurrent abuse: the same user fires two requests in parallel from two tabs. Both pass the credit check while usage is still zero, both call Lyria, and both increments land only after the fact: the user had one credit and got two songs.

Fix: reserve the credit before calling Lyria, atomically, in a Firestore transaction. The transaction reads usage, checks it against the limit, and increments it in one operation. A user with one credit gets exactly one Lyria call until that call settles, and concurrent requests serialise on the transaction. PayPal-purchased credits show up as negative usage values, so the math stays simple: remaining is FREE_LIMIT minus usage, and negative usage means paid credits are stacked on top.
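A minimal sketch of the reservation, written as the transaction body so the atomic read-check-increment is visible. `FREE_LIMIT`, the `usage` field shape, and the `reserveCredit` name are illustrative, not the production code:

```javascript
const FREE_LIMIT = 1; // hypothetical free-credit allowance

// Transaction body, written against the Firestore Transaction interface.
// Negative usage encodes purchased credits, so FREE_LIMIT - usage is
// always "credits remaining".
async function reserveCredit(tx, usageRef) {
  const snap = await tx.get(usageRef);
  const usage = snap.exists ? snap.data().usage : 0;
  if (FREE_LIMIT - usage <= 0) {
    throw new Error('no-credits'); // throwing aborts the transaction
  }
  // Read, check, and increment commit atomically; concurrent requests
  // from the same user serialise here instead of double-spending.
  tx.set(usageRef, { usage: usage + 1 }, { merge: true });
  return usage + 1;
}

// With firebase-admin this runs as:
//   await db.runTransaction((tx) => reserveCredit(tx, db.doc(`usage/${uid}`)));
```

On failure paths that qualify for a refund (next lesson), the same transaction shape decrements usage back.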

Lesson 2: refund infrastructure errors, not safety rejections

Once you reserve credit before the call, you have to decide what to do when the call fails. The naive answer is "always refund — it is not the user's fault." That is wrong. It opens a different exploit.

A user has zero credits and types something Lyria's safety filter blocks (violent lyrics, copyrighted artist references, a thousand other things). They click generate. Lyria returns a safety error. We "refund" their credit. Now they have one credit again. They try a slightly different blocked prompt. Refund. They have found a way to call Lyria infinitely for free.

The rule we landed on is asymmetric. Timeouts, network errors, "no audio returned", upload failures, our own bugs all refund — we failed them. Safety filter rejections keep the charge — they asked for something we cannot deliver, and trying harder is not free. The classifier is a regex on the error message. It is fragile, and that is correct: better to occasionally charge for a recoverable error than to leave the safety hole open.
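As a sketch, the asymmetric rule fits in one function. The error strings below are assumptions, not Lyria's documented wording; the real pattern should track whatever your logs actually show:

```javascript
// Hypothetical infra-error patterns; maintain this against real logs.
const REFUNDABLE = /timeout|timed out|ECONNRESET|no audio returned|upload failed/i;

// Refund only errors we recognise as our own failures. Anything else,
// including safety-filter rejections, keeps the charge: better to
// occasionally charge for a recoverable error than to refund a blocked
// prompt and reopen the infinite-free-retries hole.
function shouldRefund(err) {
  const msg = String((err && err.message) || err);
  return REFUNDABLE.test(msg);
}
```

Note the default: an unrecognised error keeps the charge, which is the fragile-but-safe direction the lesson argues for.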

Lesson 3: branch on model ID, not feature flag

When we started, Lyria RealTime was the only API. It generates instrumental tracks via a websocket Live API that streams PCM. Then Lyria 3 Pro Preview showed up in the Generative Language API listing — same SDK package, different model ID, different surface entirely. It is a single generateContent call that returns MP3 inline, with vocals.

The temptation is to add a config flag like USE_LYRIA_3=true. Don't. The cleaner pattern is to branch on the model ID itself, configured by environment variable. If the model ID matches /lyria-3/i we hit the generateContent surface. Otherwise we hit the Lyria RealTime websocket surface. On production we set LYRIA_FULL_MODEL=models/lyria-3-pro-preview. To switch models — even between completely different APIs — we change one env var. The cheap RealTime model stays on the free preview clip path because previewing as instrumental and buying as vocal is fine UX. The expensive Lyria 3 only fires for users who have a credit reserved.
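The dispatch is small enough to show whole. The env var name is from this post; the function names are illustrative:

```javascript
// Route to an API surface based on the configured model ID alone.
// /lyria-3/i matches the generateContent model; anything else is assumed
// to be a Lyria RealTime websocket model.
function pickSurface(modelId) {
  return /lyria-3/i.test(modelId) ? 'generateContent' : 'realtime-websocket';
}

// In production: LYRIA_FULL_MODEL=models/lyria-3-pro-preview
const modelId = process.env.LYRIA_FULL_MODEL || 'models/lyria-realtime';
const surface = pickSurface(modelId);
// Swapping models, even across completely different APIs, is one env var.
```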

Lesson 4: the SPA shell is invisible to AI crawlers

Brainy Records' frontend is React + Vite. The default deploy serves index.html for every route and the SPA hydrates client-side. Google can mostly cope: its JS rendering pipeline is slow and imperfect, but it works.

ClaudeBot, GPTBot, PerplexityBot — they do not run JavaScript. If your only content is in the bundled JS, you are invisible to AI search.

What we did: server-render the high-leverage pages on the backend. Each band has a /band/:id HTML page generated by Express, with full title, meta tags, JSON-LD MusicGroup schema, visible band content, and an audio element. Same for /genre/:slug, /discover, /blog, /pricing. Firebase Hosting rewrites those paths to Cloud Run; everything else falls through to the SPA. Result over a few weeks: ClaudeBot is consistently our top non-Google crawler. GPTBot crawls the FAQ-structured llms.txt regularly. The SPA never sees those bots.
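A condensed sketch of what the Express-rendered band page emits. The band fields, the example band, and the `renderBandPage` helper are invented for illustration, but the moving parts (title, meta description, JSON-LD MusicGroup, visible content, audio element) are the ones listed above:

```javascript
// Pure renderer so the HTML shape is testable; the real route wraps it:
//   app.get('/band/:id', async (req, res) =>
//     res.send(renderBandPage(await loadBand(req.params.id))));
function renderBandPage(band) {
  const jsonLd = JSON.stringify({
    '@context': 'https://schema.org',
    '@type': 'MusicGroup',
    name: band.name,
    description: band.tagline,
  });
  return `<!doctype html>
<html lang="en"><head>
<title>${band.name} – Brainy Records</title>
<meta name="description" content="${band.tagline}">
<script type="application/ld+json">${jsonLd}</script>
</head><body>
<h1>${band.name}</h1>
<p>${band.lore}</p>
<audio controls src="${band.songUrl}"></audio>
</body></html>`;
}
```

Everything a non-JS crawler needs is in the response body; the SPA never enters the picture.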

Lesson 5: an llms.txt that's actually useful

Most llms.txt files we have seen are link dumps — a flat list of URLs. AI bots are not sitemap parsers. They are language models. They want prose they can quote.

Our llms.txt is structured as Q&A: "What AI models power the site?", "How fast does a band get generated?", "Is it free to use?". Each answer is a few sentences with concrete numbers. There is a Featured Bands section with markdown links to specific band pages, so when an AI cites Brainy Records the link lands on the right page, not the homepage. When AI-search referrals show up in the logs, the destination URL is usually a specific band or genre page — not the homepage. The structured llms.txt is doing real work.
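For shape, a condensed version of the idea (the band link is a placeholder; the numbers are the ones quoted elsewhere in this post):

```
# Brainy Records

## What AI models power the site?
Gemini 2 Flash writes the band concept, Imagen 4 renders the cover art,
and Lyria 3 Pro Preview generates a full vocal song.

## How fast does a band get generated?
About twenty-five seconds end to end, from a single genre prompt.

## Is it free to use?
The first band is free. After that, single songs are $0.99, with
$4.99 and $14.99 one-time credit packs.

## Featured Bands
- [Example Band](https://brainyrecords.com/band/example-id)
```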

Lesson 6: pricing without proof doesn't convert

We had thirty-six visits to /pricing over two weeks and zero clicks on Buy. The page had everything you would want: clear packs, structured Product schema, fast load. What it did not have was evidence that the output is good. Visitors saw "$0.99 for one song" and had no reason to believe that song would be worth hearing.

We embedded four real generated tracks at the top of the page: cover art, band name, style description, and an audio element that streams the actual MP3 from Cloud Storage. Cards link to the full band page for deeper engagement. Too early to claim a conversion lift; the experiment shipped during a quiet week. But the principle is general: do not ask for money before you have shown what gets delivered.

What still doesn't work

Conversion is still zero. Lots of audience-discovery work, no paid conversions yet. The funnel is fine; we do not have an audience.

Lyria 3 latency is real. Ninety seconds for a vocal song means we cannot show progress in the SPA without keeping the connection open. We currently swallow it; that is probably fine for now, definitely not at scale.

Per-instance in-flight locks are not enough. If we scale to multiple Cloud Run replicas under load, the same-user-firing-twice protection only works because the credit reservation is the real cross-instance guard. We documented the assumption; we have not load-tested it.

Stack we'd pick again

Cloud Run for the backend. Source deploy in ninety seconds.

Firestore for state. Atomic transactions are the killer feature for paid-credit accounting.

Firebase Auth for sign-in. The free tier covers everything we needed before worrying about provider lock-in.

Lyria 3 Pro Preview for vocals. Suno and Udio are good products, but they are someone else's product. The Google API gives us per-call pricing and a single bill.

Things we'd skip

A custom CMS for blog posts. This post is a JS const in a routes file. Markdown in code, rendered server-side by a forty-line parser. Adding a CMS is the kind of thing that feels productive and costs you a week.
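The whole "CMS" fits in a sketch. `POSTS` and the toy renderer below are illustrative; the real parser is about forty lines and handles more than headings:

```javascript
// Posts are constants in the routes file: no database, no admin UI,
// and a deploy is a git push.
const POSTS = [
  {
    slug: 'lyria-3-in-production',
    title: 'Putting Lyria 3 in production: cost, abuse, and SEO',
    markdown: '## What it does\n\nBrainy Records generates a complete AI band.',
  },
];

// Toy markdown subset: h2 headings and paragraphs only.
function renderMarkdown(md) {
  return md
    .split(/\n\n+/)
    .map((block) =>
      block.startsWith('## ') ? `<h2>${block.slice(3)}</h2>` : `<p>${block}</p>`
    )
    .join('\n');
}
```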

Subscriptions. $0.99 single song, $4.99 starter, $14.99 creator — all one-time PayPal credit packs, no recurring billing. It is not the highest-ARPU model, but it is also not a thing we have to field support tickets about.

If you want to poke at the running thing: brainyrecords.com. Generate a band; the first one is free.