Brainy Records is three AI models in a trench coat. Here's what each one does and why we picked it.
Gemini — the concept writer
Gemini 2 Flash handles everything text-shaped:
- Band name, tagline, musical style paragraph
- Member roster with roles and personalities
- Concept/lore blurb
- Inline editing suggestions ("make this more metal", "less cheesy")
The reason for Flash over Pro: latency. A first-draft concept needs to arrive in under four seconds, or the "hit the button, watch magic happen" moment dies. Flash gets there. Pro we reserve for the judging panel at the competition stage.
Imagen 4 — the art director
Imagen 4 draws the visual identity — album cover, member portraits, optional stage concepts. Four images per band on the free tier, more on paid plans.
The prompts are templated from Gemini's band output: style descriptors, colour palette, era cues. Imagen's upgrade from 3 to 4 tightened compositional coherence on multi-subject prompts, which matters a lot for band photos with 3-5 members.
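As a rough illustration of that templating step, here is a minimal sketch. The `BandConcept` shape, its field names, and the template wording are all assumptions made for the example, not the production schema or prompt.

```typescript
// Hypothetical shape of the band output Gemini returns (field names are assumptions).
interface BandConcept {
  name: string;
  styleDescriptors: string[];
  colourPalette: string[];
  era: string;
}

// Fill a fixed template with the concept's style descriptors, palette, and era cues.
function buildCoverPrompt(band: BandConcept): string {
  return [
    `Album cover for the band "${band.name}".`,
    `Style: ${band.styleDescriptors.join(", ")}.`,
    `Colour palette: ${band.colourPalette.join(", ")}.`,
    `Era: ${band.era}.`,
  ].join(" ");
}

// Made-up example band, purely for illustration.
const exampleBand: BandConcept = {
  name: "Static Bloom",
  styleDescriptors: ["shoegaze", "dream pop"],
  colourPalette: ["teal", "magenta"],
  era: "early 1990s",
};

console.log(buildCoverPrompt(exampleBand));
```

Keeping the template a pure function of the concept means the same band output can feed the cover, the portraits, and the stage concepts with only the first sentence changing.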
Lyria 3 — the composer
Lyria 3 does the music. Two tiers:
- 30-second clip for anonymous visitors, to seed the concept.
- Full 3-minute song for logged-in users on a paid plan.
Lyria's long-form coherence is why we can ship actual songs instead of loops. Give it a style prompt, a tempo hint, and a mood, and it produces something that holds together from verse to chorus to bridge.
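A sketch of how the two tiers and the three inputs could be combined into one request. The `MusicRequest` shape is hypothetical and is not Lyria's actual API; treating a logged-in user without a paid plan like an anonymous visitor is also an assumption.

```typescript
// Who is asking for music. Both flags are illustrative, not a real session type.
interface Viewer {
  loggedIn: boolean;
  paidPlan: boolean;
}

// Durations taken from the two tiers described above.
const CLIP_SECONDS = 30;
const FULL_SONG_SECONDS = 180;

// Full song only for logged-in users on a paid plan; everyone else gets the clip (assumption).
function songDurationSeconds(viewer: Viewer): number {
  return viewer.loggedIn && viewer.paidPlan ? FULL_SONG_SECONDS : CLIP_SECONDS;
}

// Hypothetical request payload: style prompt, tempo hint, mood, and target length.
interface MusicRequest {
  stylePrompt: string;
  tempoBpm: number;
  mood: string;
  durationSeconds: number;
}

function buildMusicRequest(
  viewer: Viewer,
  stylePrompt: string,
  tempoBpm: number,
  mood: string,
): MusicRequest {
  return { stylePrompt, tempoBpm, mood, durationSeconds: songDurationSeconds(viewer) };
}

console.log(buildMusicRequest({ loggedIn: false, paidPlan: false }, "shoegaze", 96, "wistful"));
```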
Gluing it together
The backend is Node/Express on Cloud Run, Firestore for state, Cloud Storage for the audio and image blobs. Firebase Hosting serves the SPA and rewrites the API + SEO routes to Cloud Run.
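For readers who have not set up Hosting rewrites before, the sketch below mirrors the structure of a `firebase.json` hosting section as a typed object. The service ID, region, public directory, and route globs are placeholders, not the project's real values.

```typescript
// One entry in Firebase Hosting's "rewrites" array: either a Cloud Run target or a static file.
interface Rewrite {
  source: string;
  destination?: string;
  run?: { serviceId: string; region: string };
}

// Placeholder Cloud Run target (assumed names).
const apiService = { serviceId: "brainy-api", region: "us-central1" };

const firebaseConfig: { hosting: { public: string; rewrites: Rewrite[] } } = {
  hosting: {
    public: "dist",
    rewrites: [
      // API calls go to the Express app on Cloud Run.
      { source: "/api/**", run: apiService },
      // Hypothetical SEO route rendered by the same service.
      { source: "/band/**", run: apiService },
      // Everything else falls through to the SPA shell.
      { source: "**", destination: "/index.html" },
    ],
  },
};

// Serialising this object gives the JSON that would live in firebase.json.
console.log(JSON.stringify(firebaseConfig, null, 2));
```

Hosting applies the first rewrite whose glob matches, so the catch-all SPA rule has to come last.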
The full flow, end to end:
1. User picks a genre, clicks Create.
2. Gemini returns a concept (≈3s).
3. Imagen generates the cover (≈5s).
4. Lyria renders the 30-second clip (≈12s).
5. Everything gets saved to Firestore and displayed.
Total: under 25 seconds for a complete first-draft band. Hit the button on the home page and watch.
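The five steps above can be sketched as a single orchestration function. The `Generators` interface and its method names are hypothetical stand-ins for the real Gemini, Imagen, Lyria, and Firestore calls, and the stubs at the bottom exist only so the sketch runs on its own.

```typescript
// What gets saved and displayed at the end of the flow (illustrative shape).
interface Band {
  genre: string;
  concept: string;
  coverUrl: string;
  clipUrl: string;
}

// Hypothetical wrappers around the three models and the datastore.
interface Generators {
  writeConcept(genre: string): Promise<string>; // step 2: Gemini
  drawCover(concept: string): Promise<string>;  // step 3: Imagen
  renderClip(concept: string): Promise<string>; // step 4: Lyria
  save(band: Band): Promise<void>;              // step 5: Firestore
}

// Runs the steps in the order listed: concept, cover, clip, save.
async function createBand(genre: string, gen: Generators): Promise<Band> {
  const concept = await gen.writeConcept(genre);
  const coverUrl = await gen.drawCover(concept);
  const clipUrl = await gen.renderClip(concept);
  const band: Band = { genre, concept, coverUrl, clipUrl };
  await gen.save(band);
  return band;
}

// Stub generators so the example is runnable without any cloud services.
const stubGenerators: Generators = {
  writeConcept: async (genre) => `A ${genre} band concept`,
  drawCover: async () => "stub://cover.png",
  renderClip: async () => "stub://clip.mp3",
  save: async () => {},
};

createBand("synthwave", stubGenerators).then((band) => console.log(band));
```

In this sketch the cover and the clip both depend only on the concept, so they could in principle run concurrently with `Promise.all`; the sequential version is shown because it matches the step order listed above.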
The cost reality
Generation is not free. Lyria minutes and Imagen renders both cost real money, which is why anonymous visitors get only the concept and a 30-second clip while the paid tiers ($4.99 Starter, $14.99 Pro) unlock full songs. Moral: be ruthless about gating the expensive endpoints.
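One way to keep that gating in a single place is a small entitlement check that every expensive route consults before calling a model. This is a sketch under assumptions: the plan and feature names are illustrative, only the full song is modelled as paid-only, and the status codes are a choice made for the example.

```typescript
// Plans named in this post; "anonymous" covers visitors who are not logged in.
type Plan = "anonymous" | "starter" | "pro";

// Illustrative feature names, not real endpoint paths.
type Feature = "concept" | "fullSong";

const PAID_PLANS = new Set<Plan>(["starter", "pro"]);
const PAID_ONLY_FEATURES = new Set<Feature>(["fullSong"]);

// A feature is allowed unless it is paid-only and the caller is not on a paid plan.
function isAllowed(plan: Plan, feature: Feature): boolean {
  return !PAID_ONLY_FEATURES.has(feature) || PAID_PLANS.has(plan);
}

// Map the decision to an HTTP status a route handler could return (403 is an assumption).
function gateStatus(plan: Plan, feature: Feature): 200 | 403 {
  return isAllowed(plan, feature) ? 200 : 403;
}

console.log(gateStatus("anonymous", "fullSong"), gateStatus("starter", "fullSong"));
```

Checking the plan before any model call means a rejected request costs nothing beyond the lookup.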
What we want to build next
- Veo 3 music videos on the paid tier.
- Per-band discography (multiple songs per act).
- Stream the concept back token-by-token so the user sees Gemini think out loud.
If you want to poke the stack yourself, the API is documented — free key is read-only, paid plans unlock creation.