Data Sonification Case Study: Turning Econometric Research into Music with AI and Reaper

How I turned an academic paper on movie word-of-mouth into a 4-layer musical composition using Python, MIDI, and Reaper. A Mulhacen Labs case study.

I took a 44-page academic paper about how weather affects movie ticket sales and turned it into a 2-minute 40-second musical composition where you can literally hear the data. Four musical layers, 8 automation parameters, 48 bars, all generated from a Python pipeline and mixed in Reaper. This is Momentum Cascade, a data sonification project built by Mulhacen Labs.

What Is Data Sonification?

Data sonification maps numbers to sound. Instead of reading a chart, you hear the data. A rising stock price becomes a rising pitch. Accelerating growth becomes an accelerating rhythm. It is not new (sonar is sonification), but AI and modern audio tools make it practical for research, accessibility, and creative applications.

| Aspect | Visualization | Sonification |
| --- | --- | --- |
| Sense | Sight | Hearing |
| Dimensions | 2-3 (x, y, color) | 6+ (pitch, volume, rhythm, timbre, stereo, density) |
| Time perception | Static or animated | Naturally temporal |
| Accessibility | Requires sight | Works for visually impaired |
| Engagement | Analytical | Emotional + analytical |

The Source Data

The source paper is "Something to Talk About: Social Spillovers in Movie Consumption" by Gilchrist and Sands (2016, Journal of Political Economy). The key finding: a positive weather shock on a movie's opening weekend creates a compounding wave of word-of-mouth. By week 6, a $1 shock to opening-weekend revenue has generated $2.14 in cumulative revenue: the initial $1 plus $1.14 in the weeks that follow.

| Week | Coefficient | Cumulative Multiplier |
| --- | --- | --- |
| 1 | 1.000 | 1.000 |
| 2 | 0.474 | 1.474 |
| 3 | 0.269 | 1.743 |
| 4 | 0.188 | 1.931 |
| 5 | 0.112 | 2.043 |
| 6 | 0.096 | 2.139 |

All coefficients are statistically significant at the 1% level. That cascading pattern is what I wanted to make audible.
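The cumulative multiplier is just the running sum of the weekly coefficients. A minimal sketch that reproduces the table's third column from its second (the variable names are illustrative, not taken from the project):

```python
from itertools import accumulate

# Weekly momentum coefficients from the table above (weeks 1 to 6).
coefficients = [1.000, 0.474, 0.269, 0.188, 0.112, 0.096]

# Running sum: each week's coefficient adds to the total revenue effect.
cumulative = [round(total, 3) for total in accumulate(coefficients)]

for week, (coef, total) in enumerate(zip(coefficients, cumulative), start=1):
    print(f"week {week}: coefficient {coef:.3f}, cumulative {total:.3f}")
```

The last value, 2.139, is the $2.14 figure quoted from the paper.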

The Pipeline

I built a 5-step pipeline: data extraction (Python + pdfplumber), parameter mapping, MIDI generation, automation control, and mixing/mastering in Reaper.

Step 1: Extract. Python scripts pull the tables and figures from the PDF into structured CSVs (momentum coefficients, viewership decay curves, quality splits).
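As a rough illustration of this step, here is a sketch that dumps every table pdfplumber detects into its own CSV file. It is not the project's extraction script: the function names, the output file naming, and the cleaning rules are assumptions, and tables in academic PDFs often need tuned table settings or manual cleanup before they are usable.

```python
import csv
from pathlib import Path


def clean_table(table):
    """Strip whitespace, turn None cells into empty strings, drop blank rows."""
    cleaned = []
    for row in table:
        cells = [(cell or "").strip() for cell in row]
        if any(cells):
            cleaned.append(cells)
    return cleaned


def extract_tables(pdf_path, out_dir):
    """Write each table pdfplumber finds in pdf_path to a CSV in out_dir."""
    import pdfplumber  # third-party package: pip install pdfplumber

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with pdfplumber.open(pdf_path) as pdf:
        for page_number, page in enumerate(pdf.pages, start=1):
            # extract_tables() returns a list of tables, each a list of rows.
            for table_index, table in enumerate(page.extract_tables(), start=1):
                rows = clean_table(table)
                if not rows:
                    continue
                # Hypothetical naming scheme: one file per page and table.
                out_path = out_dir / f"page{page_number:02d}_table{table_index}.csv"
                with out_path.open("w", newline="", encoding="utf-8") as handle:
                    csv.writer(handle).writerows(rows)
                written.append(out_path)
    return written
```

Calling `extract_tables("paper.pdf", "csv")` (both paths are placeholders) returns the list of CSV files it wrote.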

Step 2: Map. Each data dimension gets assigned to a musical parameter, grounded in The Sonification Handbook (Hermann, Hunt & Neuhoff, 2011). I did not make arbitrary choices. Every mapping follows established perceptual principles.
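The core of parameter mapping is a scaling function with an explicit polarity: does a larger data value mean a larger or a smaller musical value? A minimal sketch, assuming a simple clamped linear map (the `scale` helper and the velocity range of 40 to 110 are illustrative choices, not values from the project):

```python
def scale(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max], clamped."""
    if in_max == in_min:
        raise ValueError("input range must not be empty")
    t = (value - in_min) / (in_max - in_min)
    t = min(1.0, max(0.0, t))  # keep out-of-range data inside the output range
    return out_min + t * (out_max - out_min)


# Positive polarity: a larger cumulative multiplier gives a louder note.
quiet = round(scale(1.000, 1.0, 2.139, 40, 110))
loud = round(scale(2.139, 1.0, 2.139, 40, 110))

# Negative polarity: swap the output bounds so a larger value maps lower.
# MIDI notes 36 and 48 are C2 and C3, assuming middle C = C4 = note 60.
deep = round(scale(1.000, 0.096, 1.0, 48, 36))

print(quiet, loud, deep)
```

Keeping polarity in the output bounds rather than in a separate flag means one function covers both "more is higher" and "more is lower" mappings.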

Step 3: Generate. A custom Python script (raw_midi_generator.py, stdlib only, no dependencies) generates a Type 1 multi-track MIDI file at 480 PPQN. The file holds six tracks: four musical tracks, a conductor track, and an automation track.
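To show what "stdlib only" MIDI generation involves, here is a small sketch that writes a two-track Type 1 file with `struct`. It is not the contents of raw_midi_generator.py: the helper names, the single bass note, and the output file name are assumptions made for illustration.

```python
import struct
import tempfile
from pathlib import Path


def vlq(value):
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)  # continuation bit on earlier bytes
        value >>= 7
    return bytes(reversed(out))


def track_chunk(events):
    """Wrap (delta_ticks, message_bytes) pairs in an MTrk chunk."""
    body = b"".join(vlq(delta) + message for delta, message in events)
    body += vlq(0) + b"\xFF\x2F\x00"  # end-of-track meta event
    return b"MTrk" + struct.pack(">I", len(body)) + body


PPQN = 480
BPM = 72
tempo_us = round(60_000_000 / BPM)  # microseconds per quarter note

# Conductor track: only the set-tempo meta event.
conductor = track_chunk([(0, b"\xFF\x51\x03" + tempo_us.to_bytes(3, "big"))])

# One illustrative bass note: C2 (note 36) held for four quarter notes.
bass = track_chunk([
    (0, bytes([0x90, 36, 100])),       # note on, channel 1
    (4 * PPQN, bytes([0x80, 36, 0])),  # note off, 1920 ticks later
])

# Header chunk: length 6, format 1, two tracks, 480 ticks per quarter note.
header = b"MThd" + struct.pack(">IHHH", 6, 1, 2, PPQN)
smf = header + conductor + bass

out_path = Path(tempfile.gettempdir()) / "sketch.mid"  # placeholder location
out_path.write_bytes(smf)
```

The resulting file should open in any DAW that imports Standard MIDI Files. A full generator would add one `track_chunk` per musical layer and raise the track count in the header to match.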

Step 4: Automate. Eight continuous MIDI CC parameters control effects in real time, including filter cutoff, reverb, delay feedback, EQ, and stereo width. They are sent through the macOS IAC Driver at 32nd-note resolution.
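At 480 PPQN, a 32nd note is 60 ticks, so "32nd-note resolution" means one CC value every 60 ticks. The sketch below builds the raw Control Change messages for a linear ramp. The controller number (74), the ramp endpoints, and the `cc_ramp` helper are assumptions, and actually sending the bytes to the IAC Driver would need a MIDI I/O library that is not shown here.

```python
PPQN = 480
TICKS_PER_32ND = PPQN // 8  # a 32nd note is one eighth of a quarter note


def cc_ramp(controller, start, end, bars, channel=0, beats_per_bar=4):
    """Return (absolute_tick, message_bytes) pairs for a linear CC ramp.

    start and end are normalised to 0.0-1.0 and scaled to the 7-bit CC range.
    """
    total_ticks = bars * beats_per_bar * PPQN
    steps = total_ticks // TICKS_PER_32ND
    events = []
    for step in range(steps + 1):
        t = step / steps
        value = round((start + t * (end - start)) * 127)
        value = min(127, max(0, value))  # CC data bytes must stay in 0..127
        message = bytes([0xB0 | channel, controller, value])
        events.append((step * TICKS_PER_32ND, message))
    return events


# Hypothetical example: open a filter from 20% to 50% over one 8-bar section.
ramp = cc_ramp(controller=74, start=0.2, end=0.5, bars=8)
print(len(ramp), "messages, last at tick", ramp[-1][0])
```

Many neighbouring steps produce the same 7-bit value, so a real implementation might skip repeats to thin out the stream.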

Step 5: Mix. I loaded the MIDI file into Reaper, assigned instruments and effects per track, then mixed and mastered to a final stereo file.

The Four Musical Layers

| Layer | Data Dimension | Musical Parameter | Effect |
| --- | --- | --- | --- |
| Pad | Cumulative multiplier (1.0 to 2.14) | Chord density (1 to 5 voices) + velocity | Texture thickens as the cascade grows |
| Bass | Week coefficient (1.0 to 0.096) | Pitch (C2 to C3, lower = stronger) | Initial shock hits deep, echoes rise |
| Echo | Number of active echoes per week | Fragment count (1 to 6 ascending motifs) | Space fills up as word-of-mouth spreads |
| Pulse | Week progression | Note interval (whole to eighth notes) | Heartbeat accelerates with tension |
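To make the table concrete, here is one way the four mappings could be computed per week. This is a sketch under assumptions, not the project's mapping code: it uses linear interpolation, takes C2 and C3 as MIDI notes 36 and 48 (middle C = C4 = 60), snaps bass pitches to C natural minor, assumes one echo fragment per elapsed week, and picks hypothetical intermediate pulse intervals between a whole note and an eighth note.

```python
coefficients = [1.000, 0.474, 0.269, 0.188, 0.112, 0.096]
C2, C3 = 36, 48                           # assumes middle C = C4 = note 60
C_NATURAL_MINOR = {0, 2, 3, 5, 7, 8, 10}  # pitch classes relative to C
# Hypothetical pulse intervals in beats: whole note down to eighth note.
PULSE_BEATS = [4.0, 3.0, 2.0, 1.5, 1.0, 0.5]


def lerp(value, in_min, in_max, out_min, out_max):
    """Linear interpolation from the input range to the output range."""
    t = (value - in_min) / (in_max - in_min)
    return out_min + t * (out_max - out_min)


def snap_to_scale(note):
    """Move a MIDI note to the nearest C natural minor tone (ties go down)."""
    for offset in (0, -1, 1):
        if (note + offset) % 12 in C_NATURAL_MINOR:
            return note + offset
    return note


final_multiplier = sum(coefficients)
rows = []
cumulative = 0.0
for week, coef in enumerate(coefficients, start=1):
    cumulative += coef
    raw_bass = round(lerp(coef, coefficients[0], coefficients[-1], C2, C3))
    rows.append({
        "week": week,
        # Pad: cumulative multiplier -> 1 to 5 chord voices.
        "pad_voices": round(lerp(cumulative, 1.0, final_multiplier, 1, 5)),
        # Bass: the strongest coefficient gets the lowest pitch.
        "bass_note": snap_to_scale(raw_bass),
        # Echo: assumed to be one ascending fragment per elapsed week.
        "echo_fragments": week,
        # Pulse: shorter note interval as the weeks progress.
        "pulse_beats": PULSE_BEATS[week - 1],
    })

for row in rows:
    print(row)
```

With these assumptions the pad grows from 1 to 5 voices while the bass climbs from C2 to C3. Other interpolation curves (logarithmic, for example) would spread the middle weeks differently.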

The piece is in C natural minor at 72 BPM. It has six sections of 8 bars each, one per week of the data. It builds from sparse and quiet (week 1) to dense and urgent (week 6), mirroring the compounding cascade in the data.
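The structure and the running time follow from simple arithmetic. A short sketch, assuming 4/4 time (the time signature is not stated above, but it is the one consistent with 48 bars lasting 2 minutes 40 seconds at 72 BPM) and the 480 PPQN resolution from the pipeline:

```python
BPM = 72
PPQN = 480
BEATS_PER_BAR = 4  # assumption: 4/4 time
BARS_PER_SECTION = 8
SECTIONS = 6

ticks_per_bar = BEATS_PER_BAR * PPQN

sections = []
for week in range(1, SECTIONS + 1):
    first_bar = (week - 1) * BARS_PER_SECTION  # zero-based bar index
    sections.append({
        "week": week,
        "start_tick": first_bar * ticks_per_bar,
        "start_seconds": first_bar * BEATS_PER_BAR * 60 / BPM,
    })

total_bars = SECTIONS * BARS_PER_SECTION
total_seconds = total_bars * BEATS_PER_BAR * 60 / BPM
minutes, seconds = divmod(round(total_seconds), 60)
print(f"{total_bars} bars, {minutes} min {seconds} s")  # 48 bars, 2 min 40 s
```

Each section lasts 8 bars x 4 beats x 60/72 seconds, roughly 26.7 seconds per week of data.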

Technical Specs

| Property | Value |
| --- | --- |
| Duration | 48 bars, ~2 min 40 sec |
| Tempo | 72 BPM |
| Key | C natural minor (Aeolian) |
| MIDI resolution | 480 PPQN |
| Automation | 8 CC parameters, 32nd-note resolution |
| Tools | Python, Reaper, IAC Driver |
| Dependencies | Zero external Python libraries for MIDI generation |

Why This Matters

Data sonification is a growing field with applications in scientific research, accessibility (making data available to visually impaired users), financial monitoring, and creative arts. The techniques I used here (parameter mapping, multi-layer composition, automation from data) apply to any dataset.

If you have data that tells a story (and most data does), sonification can make that story felt, not just understood.

About Mulhacen Labs

I'm Barry Faassen, founder of Mulhacen Labs. I build at the intersection of software engineering, AI, and audio. 25+ years of experience across scientific computing (Deltares), geotechnical software (Fugro), and audio plugin development (C++/JUCE). Based in Granada, Spain.

Have a dataset you want to hear? Book a call.

Published April 22, 2026