Sound work performed live with real-time AI audio.
The computer music pioneer Joel Chadabe compared performing with a non-deterministic system to sailing a boat through stormy seas. A storm has its own chaotic agency, unmoved by whatever intentions I may have in harnessing its forces. In moments of desperation, it's tempting to think the storm is aware of my plight as it ushers me along or torments me.
If, like me, you are a purist who prefers to experience work with an untainted mind, you may want to listen to the recording and form your own impression before reading my interpretation here.
At the core of this piece are four versions of the audio-generating AI model RAVE. RAVE is an auto-encoder: it ingests sound, encoding it into its own internal language. It then decodes its own language back into sound. To train it, I give it hours of audio, and it optimises the encoding and decoding process to work well for that audio. (Think of a person training to listen, remember and vocalise a sound. They learn what to listen for, what details to remember and how to recreate a sound from those details.)
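For readers who want to see the moving parts: below is a minimal sketch of this encode/decode loop in Python, assuming a RAVE model exported to TorchScript (the file name is a placeholder; in performance the equivalent routing happens inside Max via the nn~ external).

```python
import torch

# Load a RAVE model exported to TorchScript.
# "rave_alan_watts.ts" is a placeholder name for illustration.
model = torch.jit.load("rave_alan_watts.ts")

# Mono audio at the model's sample rate, shape (batch, channels, samples).
# A random buffer stands in for a real recording here.
x = torch.randn(1, 1, 48000)

with torch.no_grad():
    z = model.encode(x)   # audio -> the model's internal language (latents)
    y = model.decode(z)   # latents -> audio again
```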
I trained four versions of the RAVE model. One is trained on a corpus of lectures by Alan Watts. Another on every sound I've recorded, including the few seconds attached to each Live Photo taken on my iPhone. A third on all the music and sound art I've ever made. The fourth is trained on a set of recordings of Adriana Minu, vocal performer extraordinaire and my wife. These were the first recordings of her emergent experimental vocal practice after 10 years of not singing. There is a hint of struggle and vulnerability in her early voice that becomes clearer as her practice continues to evolve.
I experimented with combining these models in new ways. If I feed the sound of Alan Watts through the model trained on his own voice, I get a slightly distorted version out. The distortion has an uncanny quality to my ears, less like analogue noise or digital glitch, and more like a skilful robotic imitator slipping up here and there.
Next, I tried running the models simultaneously and feeding the internal language encoded by one model into the decoder of a different model. The sound departs further from the original. The dynamics and rhythm remain. The timbre is reminiscent but not quite there.
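A rough offline analogue of that rerouting, under the same assumption of TorchScript exports: note that cross-decoding only works directly when the two models share the same number of latent dimensions.

```python
import torch

# Two RAVE models trained on different corpora (placeholder file names).
watts = torch.jit.load("rave_alan_watts.ts")
adriana = torch.jit.load("rave_adriana.ts")

x = torch.randn(1, 1, 48000)  # stand-in for a recording of Alan Watts

with torch.no_grad():
    z = watts.encode(x)     # listen with one model's ears...
    y = adriana.decode(z)   # ...speak with the other model's voice
```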
There was a magic moment where the focus of the piece became clear. I was gradually degrading the quality of the Alan Watts model by modulating its internal representation between encoder and decoder. When I fed these encodings from the Alan Watts model directly into the decoder of Adriana's model, something else emerged. These sounds were eerie. The struggle of the human combines with the uncanny valley of the AI. In the errors and distortions I hear the struggle of a living being: effort, intention, agency.
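One simple way to degrade the representation like this is to add noise and rescale the latents before they reach the other decoder. A sketch under the same assumptions as above; the actual modulations in the piece, performed live in Max, may differ.

```python
import torch

# Placeholder file names, as before.
watts = torch.jit.load("rave_alan_watts.ts")
adriana = torch.jit.load("rave_adriana.ts")

x = torch.randn(1, 1, 48000)  # stand-in for a recording of Alan Watts

with torch.no_grad():
    z = watts.encode(x)

    # Modulate the internal representation between encoder and decoder.
    # noise_amount and scale are performance parameters, varied live.
    noise_amount, scale = 0.5, 1.2
    z = z + noise_amount * torch.randn_like(z)
    z = z * scale

    y = adriana.decode(z)  # the degraded latents, voiced by the other model
```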
There's an ambiguity in where that struggle is rooted. Is it Adriana's struggle, appropriated by the AI? Is it the AI models trying to get through as I reroute their internals into each other, Frankenstein style?
Performances
- 26 May 2023, (de)Stabilizing Diffusions, Cafe SAT, Montreal.
Acknowledgements
Developed with support from the Machine Agencies research cluster of Concordia University. With thanks to Fenwick McKelvey and Maurice Jones.
Thank you to Adriana Minu for letting me use her recordings as training data.
The piece uses the RAVE model and the nn~ Max external, both developed by Antoine Caillon at IRCAM.