Someone is selling an AI A&R tool right now. Upload a song, get a score. Notes on the hook, the mix, the commercial potential, sometimes from a "producer" persona and an "A&R" persona stacked on top of each other.
My first reaction was everyone’s first reaction. AI cannot hear music. It reads a file the way a spreadsheet reads a file, then hands back the audio equivalent of a horoscope.
That reaction feels good. It is not quite right, and the accurate version turns out to be the sharper argument anyway.
Yes, it can process audio.
Start with the concession, because skipping it is how you lose the room on camera. Gemini’s newest models are not transcribing a song and reasoning about the transcript. The audio goes in as audio, handled natively by one model trained across text, images, sound, and video from the start, not a chain of smaller tools faking it. Say "AI cannot hear audio" on camera in 2026 and somebody screen-records a demo and hands you your own clip back.
Processing the sound is real now. It is still not the same job as an A&R ear.
The test is bigger than metadata.
A new benchmark called MuseBench was built to test something harder than a BPM tag: whether models can actually reason across music theory, written scores, and real performance audio. The same models that score in the mid-80s on music theory questions in plain text drop to roughly 53 to 56 percent, barely above a coin flip, the moment you hand them the actual recording and ask them to reason about the performance in it. The fix researchers found was not a better ear. It was handing the model the sheet music alongside the audio. Structure closed most of the gap that raw listening could not.
That is where the technology actually sits. Serious, moving fast, still not hearing the way a trained ear hears.
Most AI A&R tools track signals.
This is the part that connects straight back to whatever pitch you saw. The tools running inside major labels in 2026, Chartmetric, Soundcharts, Sodatone, Instrumental, and the rest of that stack, are not listening to a single track. They are watching TikTok velocity, Discover Weekly add rate, save-to-stream ratio, how a song’s numbers spread across platforms. An artist only gets flagged after the algorithm has already been circling for weeks.
Ask what the tools are actually for and the honest answer is already in print. The trade coverage of that stack includes one blunt admission: a "filter, not a crystal ball." No independent study shows that any of them can call a future star. Their real job is cutting a hundred thousand daily uploads down to a shortlist a human can sit with.
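To make "filter, not a crystal ball" concrete, here is a minimal sketch of what a signal filter does. Everything in it is an assumption for illustration: the field names, the thresholds, and the ranking rule are hypothetical and do not describe how Chartmetric, Soundcharts, Sodatone, or Instrumental work internally.

```python
from dataclasses import dataclass


@dataclass
class TrackSignals:
    # Hypothetical fields; real tools track many more signals.
    title: str
    streams: int
    saves: int
    weekly_growth: float  # fractional week-over-week stream growth, e.g. 0.40 = +40%


def save_to_stream_ratio(track: TrackSignals) -> float:
    """Saves divided by streams; 0.0 when there are no streams yet."""
    return track.saves / track.streams if track.streams else 0.0


def shortlist(tracks, min_ratio=0.05, min_growth=0.25, limit=3):
    """Keep tracks that clear both (made-up) thresholds, strongest ratio first."""
    keep = [
        t for t in tracks
        if save_to_stream_ratio(t) >= min_ratio and t.weekly_growth >= min_growth
    ]
    keep.sort(key=lambda t: (save_to_stream_ratio(t), t.weekly_growth), reverse=True)
    return keep[:limit]


uploads = [
    TrackSignals("A", streams=10_000, saves=900, weekly_growth=0.40),
    TrackSignals("B", streams=500_000, saves=5_000, weekly_growth=0.10),
    TrackSignals("C", streams=2_000, saves=300, weekly_growth=0.80),
    TrackSignals("D", streams=0, saves=0, weekly_growth=0.0),
    TrackSignals("E", streams=8_000, saves=480, weekly_growth=0.20),
]

picked = [t.title for t in shortlist(uploads)]
print(picked)
```

Notice what the sketch never touches: the audio. It narrows the pile using behavior around the song and leaves the listening to whoever opens the shortlist.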
Useful. Not a replacement for an ear.
Audio analysis is not taste.
Some tools do touch the actual sound, and it is worth being precise about what that buys you. Cyanite turns a track into a spectrogram, a picture of its frequencies over time, then trains a neural net to recognize visual patterns in that picture rather than listen to the song the way a person would. It comes back with genre, mood, energy, instrumentation. Real signal analysis, not a metadata trick.
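For readers who want to see what "a picture of its frequencies over time" means in practice, here is a minimal sketch of the spectrogram step only. It is not Cyanite's pipeline: the frame size, hop, window choice, and the synthetic test tone are all assumptions, and the naive transform is written for clarity rather than speed.

```python
import cmath
import math


def make_tone(freq_hz, sample_rate=8000, n_samples=256):
    """Synthetic sine wave standing in for real audio (an assumption for the demo)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]


def spectrogram(samples, frame_size=64, hop=32):
    """Slice the signal into overlapping frames and measure each frame's
    frequency content. Returns one list of bin magnitudes per frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window softens the frame edges to reduce spectral leakage.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        magnitudes = []
        for k in range(frame_size // 2 + 1):  # bins from 0 Hz up to Nyquist
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


tone = make_tone(1000)          # 1 kHz tone sampled at 8 kHz
spec = spectrogram(tone)
# Each bin spans sample_rate / frame_size = 125 Hz, so 1 kHz should peak in bin 8.
peak_bins = [frame.index(max(frame)) for frame in spec]
print(len(spec), peak_bins)
```

Each row is one slice of time and each column is one frequency band. A tagging model is trained on grids like this one, which is why the result is pattern recognition over a picture and not listening in the human sense.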
Cyanite’s own documentation makes the honest admission for us. Tempo and instrumentation, the model nails. Genre and mood, the subjective calls, remain harder. That is the company’s own account, not a critic’s.
Tagging a track is not the same as knowing why it matters.
Basic audio data got messy.
Pull the argument down to the plumbing and it gets even clearer. Spotify cut new apps off from its own audio_features and audio_analysis endpoints in November 2024. That was the tempo, key, energy, and mood data every song-analysis tool leaned on. Everything built on top of it has spent eighteen months patching the hole. The replacements that stepped in are upfront about the limits. Tempo, key, and loudness come from measured signal processing. Mood, valence, and energy are labeled approximate. Calibrate before you rely on them, their documentation says outright. In most of these systems, mood is just a two-axis plot of valence against energy. That is the entire model for how a song feels.
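The two-axis mood model is small enough to write out in full. This is a minimal sketch under stated assumptions: the 0-to-1 scale, the 0.5 midpoint, and the four quadrant labels are hypothetical choices for illustration, not any vendor's actual mapping.

```python
def mood_quadrant(valence: float, energy: float, midpoint: float = 0.5) -> str:
    """Map a (valence, energy) point to one of four coarse mood labels.

    Assumes both values are on a 0.0 to 1.0 scale. The labels are illustrative.
    """
    if not (0.0 <= valence <= 1.0 and 0.0 <= energy <= 1.0):
        raise ValueError("valence and energy must be between 0.0 and 1.0")
    if energy >= midpoint:
        return "happy/excited" if valence >= midpoint else "angry/tense"
    return "calm/content" if valence >= midpoint else "sad/subdued"


print(mood_quadrant(0.8, 0.9))  # bright and loud
print(mood_quadrant(0.2, 0.9))  # dark and loud
print(mood_quadrant(0.8, 0.2))  # bright and quiet
print(mood_quadrant(0.2, 0.2))  # dark and quiet
```

Two numbers in, one of four labels out. Finer grids exist, but the shape of the model is the same, which is the point: it describes where a song sits on a chart, not whether anyone will care.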
BPM is not a gut instinct. Key is not a hit prediction. Mood is not demand.
Not yet.
So, can AI tools replace the human? No. Not anytime soon.
The tool can score the song. It cannot make the bet.
A score tells you what a track measures like. It does not know the artist’s story, the timing of a release, whether the fans already there will bring the next ten thousand with them, or whether a room full of strangers feels something the first time the chorus lands. That is still a human call, and it is the call that decides who gets signed.
Data helps. It should never be confused with judgment.
That is the whole reason Before The Data exists. Not a tool that pretends to replace the ear. A system that helps a human operator see the signal early enough to still make the bet before it is obvious.
Before The Data helps humans make the bet earlier.
Track early artist signals, fan behavior, release momentum, and market movement before the obvious moment arrives.
Start the 7-day trial →