3.0 KiB
Audio-Reactive Animation
Drive visuals from music, voice, or sound. Any GSAP-animatable property can respond to pre-extracted audio data.
Audio Data Format
var AUDIO_DATA = {
fps: 30,
totalFrames: 900,
frames: [{ bands: [0.82, 0.45, 0.31, ...] }, ...]
};
frames[i].bands[]— frequency band amplitudes, 0-1. Index 0 = bass, higher = treble.- Each band normalized independently across the full track.
Mapping Audio to Visuals
| Audio signal | Visual property | Effect |
|---|---|---|
| Bass (bands[0]) | scale |
Pulse on beat |
| Treble (bands[12-14]) | textShadow, boxShadow |
Glow intensity |
| Overall amplitude | opacity, y, backgroundColor |
Breathe, lift, color shift |
| Mid-range (bands[4-8]) | borderRadius, width |
Shape morphing |
Any GSAP-tweenable property works — clipPath, filter, SVG attributes, CSS custom properties.
Content, Not Medium
Audio provides timing and intensity. The visual vocabulary comes from the narrative.
Never add: equalizer bars, spectrum analyzers, waveform displays, musical notes clip art, generic particle systems, rainbow color cycling, strobing white on beats, abstract pulsing orbs.
Instead: Let content guide the visual and audio drive its behavior. Bass makes warmth swell. Treble sharpens contrast. The visual choice comes from "what does this piece feel like?"
Sampling Pattern
Audio reactivity requires per-frame sampling via a for loop with tl.call(), not a single tween:
// ✅ Correct — sample every frame
for (var f = 0; f < AUDIO_DATA.totalFrames; f++) {
tl.call(
(function (frame) {
return function () {
draw(frame);
};
})(AUDIO_DATA.frames[f]),
[],
f / AUDIO_DATA.fps,
);
}
// ❌ Wrong — single tween, doesn't react to audio
gsap.to(".el", { scale: 1.2, duration: totalDuration });
Without per-frame sampling, the composition doesn't actually react to audio.
textShadow Gotcha
textShadow on a parent container with semi-transparent children (e.g., inactive caption words at rgba(255,255,255,0.3)) renders a visible glow rectangle behind all children. Fix: apply scale to the container for beat pulse, but apply textShadow to individual active words only.
Guidelines
- Subtlety for text — 3-6% scale variation, soft glow. Heavy pulsing makes text unreadable.
- Go bigger on non-text — backgrounds and shapes can handle 10-30% swings.
- Match the energy — corporate = subtle; music video = dramatic.
- Deterministic — pre-extracted data, no Web Audio API, no runtime analysis.
Constraints
- All audio data must be pre-extracted (use
extract-audio-data.pyfrom the gsap skill's scripts/) - No
Math.random()orDate.now() - Audio reactivity runs on the same GSAP timeline as everything else