It's hard to make art with AI, but it's not impossible

But is it your art?

It’s tempting to think a new tool to automate creation, such as an AI image generator, is going to make it easier to make art. But often it actually makes it harder.

First, let me clarify what I mean by art. I’m talking about a form of expression that shares a glimpse of what it is to be us, to see the world like we do, to be a human. Art as a way of bridging the interiority of human consciousness. When someone does it to us in a way we haven’t experienced before, there can be a profound sense of connection. It’s an enmeshment of minds. This isn’t the only meaning I have of art, but it’s the one that’s relevant for what I have to say here.

Any tool I bring to the creative process holds within it some assumptions of what I’m going to do with it. The tool carries within an essence of the interiority of those who made it. It makes it slightly easier to say what has already been said by others with that tool.

Those of us who design and use creative tools wrestle with this paradox. Sometimes it seems like every time the machine makes something easy, the artist then has to find a way to make it difficult so they can find a voice that feels their own. Or more, everything machines make easy quickly becomes generic. The artist needs to find some level of fuckery to make it personal again.

Generative AI is the next generation in these tools. Something feels different because the range of things it can create is so broad, and its capacity for imitation is precise. It makes it so easy to generate an image that looks so polished, that it’s tempting to forget that we are merely navigating to a room within the interiority of the AI itself.[1]

At this point, it’s easy to get lost in a debate about whether generative AI is creative in its own right or simply an elaborate form of copy and paste. I think this is a distraction from a more useful question: does it allow me to infuse my inner world into some outer material form? Is that really my inner world I’m expressing?

This doesn’t nullify the possibility of expressing myself with AI. It’s just more like sharing an image I’ve found somewhere with someone. An image resonates with me, and hopefully with the person I’m sharing it with, and there’s a connection there. My choice gives a pinprick of information about who I am. But the full weirdness of my inner world remains locked inside my head.

Why can it feel so exposing to draw or sing or dance in front of people? Perhaps, because we cannot do these things without revealing something of who we are. The more connected we are to the process, the more our interior world leaks out.

Originality is easier when crafting a work rather than specifying it. If my creative process involves continuously shaping a medium towards a finished work, I can’t help but leave an imprint of myself within it. Even if I’m trying to make an imitation, the limits of my craft will leave flaws and these will be uniquely mine.

Computation brings power. It streamlines the act of making. But the more it streamlines, the more it detaches the outcome from the mind of the creator. You’ll end up with something that’s been done before, or with something random. But not something you.

Making art with AI is hard

In a recent essay, the sci-fi author Ted Chiang argued that it’s not possible to make art with AI because it leaves the artist so few choices. He gives the example of a novel. Every word of a novel is a choice made by a writer, and the work of art emerges through the nuance of these details. But type a 100 word prompt into an AI novel generator and you’ve made so few choices that what you end up with is more like a statistical averaging of existing works weighted to fit your description. There just aren’t enough choices.

I think he’s on the right track. But using AI to remove the countless choices from the creative process doesn’t make it impossible to make art. It just makes it much, much harder.

Experience is worth a lifetime of pontification, so if you’ve not already had a go with an image generator, then it’s time to pop your cherry. Head over now to one of the many online generators (this one doesn’t require registration), type in a prompt and watch it make you an artist. Pay attention to how you feel at each stage of the process. Then come back here.

Here’s how it is for me. I struggle to write when confronted with the potential to make absolutely any image we could describe. Every idea I come up with seems unoriginal and pointless. So, giving up on greatness, I think, whatever, and go with something I’m pretty sure I’ve seen before, like a dog in an astronaut costume. I was given a genie that could make any image I could imagine, and I asked it to make an image we already have, an image even that I already saw. My relationship with this image is so distant that I don’t even save it. (Even my random doodles I keep in a box on the shelf.)

It’s taken me years to make AI art that feels my own. It’s involved training many models on hundreds of gigabytes of my own data and coding systems that do unusual things with them. It’s a process of hunting for something that is me rather than merely the tools I’m using, which involves spending time to really understand what those tools do. It was as difficult as any other artform I’ve invested time in. In essence, I had to find a way to build a craft of my own from a landscape of streamlined processes.

Human expression is a dance between mind and matter

A new wonder technology has arrived that can transform human potential, but it seems to have brought a dystopian sense that there’s now less purpose for us humans. I think this perspective may be a knot we’ve tied ourselves into, one suggesting we’ve lost track of what it means to be a human.

There's something odd to me about how Chiang frames craft in terms of the number of choices. It reduces human intention into the one who chooses rather than the one who does, as if the distinction between writing and prompting were just a matter of scale. Perhaps this framing makes sense to a writer, and maybe a traditional pen-and-paper composer, operating in their symbolic realms. But I can’t quite connect it to drawing, dancing or playing music.

I consider the written word to be an early example of digital abstraction, where the continuous signal of speech is reduced to a finite set of discrete pieces: words or letters. (Digital because the pieces are finite; abstract because they can now be considered independent of their context.)

Bureaucracy and computers have continued this process to the situation we have today where the gamut of human intention is routinely distilled into button presses and swipes, and material reality into image and sound recordings. The more we use computers, the more we end up thinking that manipulating the abstract entities held within them is what thinking is all about.

This kind of computational thinking dominates us today. It makes it easy to see everything we do as a sequence of discrete choices. I think many of those building AI products see human endeavour in this way. If only we could simply express our intentions in a few words and have the assistant do the work for us. Isn’t that what we all want? To get things done without having to actually do them?

It’s similar to the neoliberals who define human freedom in terms of how many brands we have to choose from. Or the contemporary focus on consent as the ultimate arbiter of morality, from sex to GDPR cookie banners. Choice is definitely better than no choice, but it’s not the same thing as freedom.

Here’s how I would like instead to frame human expression: A continuous, mutually influential dance between mind, medium and environment where some aspect of the mind as it experiences its environment becomes imprinted on the medium. It doesn’t need to be conscious or even intentional.

If I draw a circle on a piece of paper, the path my pencil takes will be unique. It’s not so much that I’m making unique choices but more that the movement of my hand is continuously shaped by the activity of every other part of me. How open is the channel of influence between me and the world, and between me and my medium? How sensitive am I to how the process unfolds? Can I feel the breadth of my options? Can I try something a tiny bit to see how it feels then change my mind before anything serious happens? These are questions of freedom lost when we see our lives as a sequence of choices. Technological metaphors can still be useful: bandwidth, resolution, feedback loops, degrees of freedom.

When I choose 16 English words to prompt an AI image generator, I have more options (10^82) than there are atoms in the universe (10^80).[2] Language is not short of expressive power. Yet there’s a fair chance that I’ll type the exact same prompt as someone else. To prompt an AI to say something that is truly my own, I’ll have to tweak it and go again. And again, and again. It’s the slowest dance in the world. And if I find this process frustrating, I can only imagine what it’s like for the poor AI.
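
For anyone who wants to check the arithmetic behind that comparison, here is a minimal sketch in Python. It assumes a vocabulary of about 150,000 words, that any word can fill any of the 16 slots, and that word order matters. The variable names are only for illustration, and the figure for atoms in the universe is the commonly quoted rough estimate rather than anything precise.

```python
# Assumption: roughly 150,000 English words in active use.
VOCABULARY_SIZE = 150_000
PROMPT_LENGTH = 16

# One free choice from the vocabulary for each of the 16 positions,
# with repeats allowed and order significant.
possible_prompts = VOCABULARY_SIZE ** PROMPT_LENGTH

# Order of magnitude: number of decimal digits minus one.
prompt_exponent = len(str(possible_prompts)) - 1

# Assumption: the commonly quoted rough estimate of 10^80 atoms.
ATOMS_EXPONENT = 80

print(f"Possible 16-word prompts: about 10^{prompt_exponent}")
print(f"More prompts than atoms: {possible_prompts > 10 ** ATOMS_EXPONENT}")
```

The count ignores grammar and meaning, so the number of sensible prompts is far smaller. It is only meant to show the raw size of the space.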

Hence, generative AI has confronted us with a new craft: prompt writing. It may seem simple because we can quickly generate results that would take a lot of work with the old crafts. But the simpler a craft is, the more ingenuity is required to do something exceptional with it.

The delusion is to consider craft as a barrier to making art, rather than the channel that makes art possible. The smaller you make that channel, the harder it is to communicate through it. Hard, but not impossible.

Machines that help humans stay human

Once we see human expression less as a declaration of preference and more as a dance between mind and matter, we can see that our problems with AI are not necessarily of AI. They're problems of what we're trying to do with it: to create an assistant to whom we can delegate the details of what we’re doing. This framing aligns with a vision of us as the boss who calls the shots and makes the hard strategic decisions. Hence, we end up either struggling to distill our desires into a few words of instruction, or adapting our desires until we find one that can be expressed in that way. I’m not against this framing in the right circumstances. For example, I use AI assistance extensively when coding. There, it doesn’t have the challenges that emerge when I try to make art with it. There, what I’m trying to do can be broken down into explicit pieces that have an optimal implementation.

But with art, I need to immerse myself in my subject and my medium, so that these can infuse into me. I want to be dreaming about it, and I want those dreams to shape it in ways I’m not even conscious of. I want my tools to disappear inside of me, becoming extensions of my body. This is harder when using computers because my relationship with my medium is mediated through the abstractions of software.

In traditional software, every single abstraction - the ‘window’, the ‘post’, the ‘button’ - has been coded in advance. Breaking down processes into explicit abstractions is what we do as software engineers. But in the neural network of an AI, the structures through which it finds meaning in data emerge from that data itself. It can hold ambiguous ideas and connect concepts in ways that would be difficult to write out explicitly. Interacting with AI could come much closer to a dance between mind and medium than any of our existing hand-coded software.

This is bad news if you’re trying to automate a legal decision, or have AI fill out your accounts, or another procedure where there’s a real distinction between correct choices and wrong choices. But it opens the door for bringing the embodied parts of ourselves into our digital lives. Instead of detaching us from process, AI could entangle us with all the details in a way we can’t even imagine with regular software. Think how much we humans can communicate through even the most subtle sounds and movements. Imagine spending an hour with a generative AI which could understand and react to this side of you in real time.

It’s not easy to imagine, but I had a go in my project The Wilds.[3] I built a real-time, two-way connection between the moving body and a generative AI system, and started exploring 16-dimensional space with the shape of my body, releasing into my intuitive capacity for physical exploration. I remember the first time the dancer Catriona Robertson was confronted with an early version. She was going non-stop for 40 minutes, exploring, all while still being completely herself, until we interrupted her.

I find coding and improvised dance to be opposite parts of what it is to be human, representatives of the two halves of my brain. For all its risks, AI may open opportunities to enter the machine with my whole self. I don't mean this in a cyborg sense of becoming one with the machine. Quite the opposite: we've already become one with the machine as the only way to use a computer today is to think computationally through its abstractions. It's the potential to release us from this that I'm interested in.

If you see art as a sequence of choices, then yes AI probably does seem like an affront to human dignity. But if you see art as an enmeshment of minds, touching each other through unstable realms of matter and ideas, then the affront to human dignity was already there in the relentless bureaucratisation of human life into digital data. AI is computation that emerges through learning rather than through design. It holds the potential to become a technology that lets us be more human than we are now, as dancing animals rather than wannabe mid-management.

Art is safe but artists are not

Of course, I can’t quite end the story there. I started by narrowing my meaning of art to expressive acts that carry a glimpse into what it is to be a human in this world. After feeding everyone, I think this kind of creation is the most important thing a society does. However, artists also need to be fed, and few get by without freelancing out some part of their craft. Art may be safe from AI, but artists are not.

AI pundits like to point out how when photography supplanted the painter’s niche, it pushed painting into new and exciting realms. But they forget about the artists earning a living making woodcuts for newspapers or painting family portraits. Most images that are bought are bought to do a job.

Mass production, photography, the record, offshoring, and now AI all offer a cheap alternative to forms of human labour that feel intrinsically valuable. They bring economic transformations that centralise production, homogenise culture, increase the power of capital and transfer power from individuals to industrialists. What’s more, it took over a hundred years before a record could really sound indistinguishable from a live musician, and near that before a photo could capture colour like a painter. AI is moving much faster. There is less time to adapt and so the effects will be more violent.

This is to say nothing of how vulnerable we humans become as we come to rely on intelligent systems that we cannot predict and that hold a nuanced perspective on what makes each of us tick. We’re rushing into this during a moment where technology is centrally owned and controlled.

Nothing is inevitable.

Imagine if running a commercial AI image generator required a licence, similar to the one a pub needs to play music, where the fees go to pay the artists whose influence can be found in the image.

Imagine if we required each self-driving taxi to be matched to a qualified taxi driver, who could sit at home and watch the income come in, and generally take responsibility for keeping it clean and in good order, perhaps put a bit of personality into the interior.

Imagine if making automated decisions about people’s lives was outlawed, and every credit check required an individual to sift through the evidence dug up by an AI and then make the decision subjectively, which they might be required to justify in appeal.

Imagine if the government decided that the police can figure out how to do their job without face-recognition technology, given they’ve managed to do so for the past 195 years.

To appropriate the apocryphal Zizek quote: You don’t hate AI - you hate capitalism. And communism. And all the rest of them that prioritise efficiency over humanity.

In the meantime, I think our best bet is to invest in our humanity. That word used to mean what distinguishes us from other animals: compassion, love, art. But now it’s as much what distinguishes us from machines: individual difference, sensitivity, distraction, consciousness. The human artworks that will survive (and thrive) in the age of AI are those that can be confidently traced back, in some way, to the mind and will of a conscious being.

Tim
Glasgow, 5 November 2024

Notes

  1. If prompting an AI is merely navigating to a room within the interiority of the AI itself, then is the AI the artist? I’d say no: it’s more of a landscape that we're wandering through. The point where I would consider an AI to be an artist in its own right is the point where it creates a work, under its own will, that reveals to me what it’s like to experience the world as an AI.
  2. My estimate of 10^82 possible 16-word English prompts is based on the Oxford English Dictionary’s estimate of around 150,000 words in active use.
  3. The Wilds emerged from Sonified Body, which I created in collaboration with Panagiotis Tigas.