Collage for Generative AI

Portrait of Daniel Mason

It's easy to leverage what's been created when imagining something new.

TL;DR:

Check out the prototype

After hearing about AI, reading about AI, going to meetups about AI, and taking classes on AI, I wanted to design some stuff.

I identified a few gaps and audience types for which to design software that makes generative AI workable and intuitive.

This particular project is conceived for the amateur artist, designer, or creator: someone with a vision, a desire, and self-described limited technical talent.

This software imagines collage, a protected art form that is immediately accessible and intuitive, to enable creative expression via generative AI. It is well known that editing is easier than creating. If you don't believe you can draw a flower, then find an image of a flower and let the AI do the rest.

Results

Test the prototype

Image upload

Image upload screen

Because collages don't start from zero, the user needs a way to add images to work with. Enter the image upload.

This is a bog-standard pattern with no unnecessary adjustments on my part.

Benefit

  • Add your own images
  • Track upload progress
  • Multi-method upload

Value

  • Familiarity
  • Ease of use

Image Cutouts

A piece of an image being cut out

Because a key interaction with collage is removing parts and pieces of existing images, I included that interaction pattern.

The user selects the image, and a context-relevant menu then appears with the option to cut it.

Benefit

  • Provides control over what to include
  • Allows for mix and match imagery

Value

  • Using what's already provided

Style referencing

Selecting or adding style reference images

An important component of this AI is the ability to pull together disparate image styles into a cohesive whole. Giving style to your image is a highly desirable feature, though not terribly easy to accomplish.

I imagined a way for the user to select pretrained style models or add images to train a model in the moment.

Benefit

  • Customizable styles
  • Pick-up-and-go actions
  • The output will look different from the input

Value

  • Differentiation
  • Control over the style

Variations

Detail of the regiment interactions

Because a key feature of existing generative AIs is the ability to generate variations of images, I included the option here as well.

The AI generates four potential options for the user to choose from.

Benefit

  • Freedom to choose

Value

  • Options
  • Explorations

Check the results

Test the prototype

The story so far

Research

I scoured the internet, ChatGPT, Meetups, and my LinkedIn feed for software designed for image creation.

There are many such tools, and the list grows almost daily.

A spread of generative AI software

Insight

There are already many players in this game and they are already far ahead, nearly too far to be overtaken. Consider, though, that each and every one of them, save perhaps for Adobe and Canva, is built on the worst paradigm possible for the visual arts: language. And it is asinine to keep it that way.

Few creative people want to bang away trying to describe an image to a computer program. The relationship between creator and AI needs to shift from 'make me something' to 'here's something, make it better.'

Visual generative AI software plotted on a positioning graph

"What about reference images?" you ask. They are better than a text prompt but do not provide the control necessary to easily externalize what's in someone's head.

Opportunity

For people who don't think they can draw but want to create pretty pictures, collage combined with generative AI offers a way to do so with no barrier to entry.

Until the advent of generative AI, collage carried the stigma of unattractive and disparate styling. A collage looks like a collage. Generative AI can now pull the ideas from a collage into a cohesive and similarly styled whole. Drawing ability is no longer a limiting factor to creating art.

Why now?

We're in a race to win the market. Enthusiasm is high. Interest is up. But nobody has it figured out.

A sentiment commonly expressed by the average person, in the western world of North America at least, is that they can't draw. One doesn't have to look too far or wide to validate that statement. It is also known that editing is easier than creating. Collage is a form of art that allows someone with no artistic talent to edit instead of create.

Audience

Persona archetype

You'll likely complain that I'm doing this wrong and making everything up. I created this persona from a retrospective of the creative people I've encountered in my life to date, not through formal interviews. This serves as a starting point, not the answer.

The audience distills down to people in the WEIRD (Western, educated, industrialized, rich, democratic) culture who want to make art but are too afraid to try because they can't draw. They are tech sophisticates who tend to keep to themselves and worry too much about not being valued.

Value proposition

Value proposition canvas

Because the main barrier to making art for this audience is a perceived lack of talent, and their experience with generative AI to date hasn't satisfied their need for creative control, this software promises to remove both barriers by following the model of collage.

Use cases

While this list is not comprehensive, it does illustrate the breadth of applications a generative AI based on collage has.

  • Posters
  • Flyers
  • Social media graphics
  • Book covers
  • Banners
  • Billboards
  • Ad concepts
  • Storyboards
  • Comic books

Flow

Key screens in a line indicating a sequence

Because the core use loop for this software is using cutouts from uploaded images to create a new, stylistically cohesive image, that's the first flow I imagined. I begin all my flows with a goal and a context, and I call out the main features on display too.

Testing

Test the prototype

As this is the earliest-stage prototype, I focused the testing on how well the idea is communicated and how well it matches audience expectations.

My predictions

Though you likely haven't asked for my predictions, nor truly care about them, I'm going to go ahead and toss out a few software products that I think will emerge as the dominant players. Check them out when you have a chance.

But wait, why not Adobe? They are the incumbent in this race, having essentially defined the software paradigm for visual people, yet I think they'll see a customer exodus towards these more streamlined and independent offerings.

Recap

Check out the prototype

This particular project is conceived for the amateur artist, designer, or creator: someone with a vision, a desire, and self-described limited technical talent.

This software imagines collage, which is a protected art form as well as an immediately accessible and intuitive one, as the interaction model for visual generative AI.

It is well known that editing is easier than creating. If you don't believe you can draw a flower, then find an image of a flower and let the AI do the rest.

Let's work together.

Work with me✏️