Adobe Firefly + Generative Fill

Adobe takes the next step in building Generative AI features into its offerings.

This article appears in Issue 21 of CreativePro Magazine.

It’s a safe bet that 2023 will be remembered as the year Generative AI hit the mainstream. One major reason for that is Firefly, Adobe’s new automated image-generation technology. To ensure Firefly gets maximum exposure right out of the gate, Adobe has made it publicly available at firefly.adobe.com and baked the technology into a beta version of Photoshop with the Generative Fill feature. In this article, we’ll explore what you can do with both.

Firefly on the Web

Let’s begin with a look at what you can do (and will be able to do in the future) with the web version of Firefly. There are few tools to learn. Indeed, for many operations you don’t need any tools at all, as text prompts are enough to get you started creating fantasy images.

Text to Image

As the name implies, with Text to Image all you have to do is tell Firefly what you want to create, and it will do all the work for you. But before you start typing, pause for a moment and think about how to describe your desired image precisely and in detail. The more detailed your text prompt, the more likely Firefly is to produce something like what you had in mind. The prompt for Figure 1 was brain in a jar in a dark castle laboratory with lightning outside the window. Firefly produces four results for you to choose from, though you can always ask for more if none of the first batch hits the mark.

Figure 1. A brain in a jar, courtesy of Adobe Firefly
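If it helps to think about that advice programmatically, here is a tiny, purely illustrative Python sketch (not Adobe code; Firefly simply takes one free-text prompt) showing how a detailed prompt can be assembled from separate descriptors:

```python
def build_prompt(subject, setting="", detail="", lighting=""):
    """Join non-empty descriptors into one detailed prompt string."""
    parts = [subject, setting, detail, lighting]
    return ", ".join(p for p in parts if p)

# Roughly the prompt used for Figure 1
print(build_prompt("brain in a jar",
                   setting="in a dark castle laboratory",
                   detail="with lightning outside the window"))
```

The more of those slots you fill in, the more the generator has to work with.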

As you can see, Firefly has interpreted the prompt reasonably well. There was clearly some issue with the idea of the lightning being outside the window rather than in the room. One of the images has a person standing in the jar; another has what appears to be a mutant goose. But the overall impression in all four cases is striking. Firefly got the mood right, but not some of the details. If you make the prompt more detailed, you're likely to get even better results. Figure 2 shows the result of asking for highly detailed retro 3D printer on wooden desk in dusty study at twilight.
Figure 2. Imaginative renderings of a retro 3D printer

The lighting is beautiful and applied consistently throughout each of the generated scenes: flares, highlights, and shadows are surprisingly realistic. Firefly excels at cute fantasy images. The prompt in Figure 3 was dog wearing a crown sitting on a cloud, and that’s exactly what I got in three out of four images. 
Figure 3. A dog wearing a crown, sitting on a cloud

At this early stage in its development, Firefly already renders animals well—much better than people. Its real strength, however, is in the range of styles it’s able to generate. In addition to being able to choose between square, landscape, and portrait image shapes, you can also choose between photo, graphic, and art styles, as well as apply a range of effects, colors, and lighting to your images. Figure 4 shows the result of asking for layered paper animals in forest. The results are astonishingly good: Creating images like this using traditional Photoshop tools would take hours, if not days, even for an experienced Photoshop user. 
Figure 4. Layered paper animals in a forest

Firefly isn’t great at producing realistic images of people. In Figure 5, asking for Medieval woman wearing jewelry in a castle, highly detailed, with evening sun through window produced a quartet of painterly, but not quite photorealistic, images. Just a few months ago these images would have been startling, but now they don’t stack up very well against alternatives such as Midjourney. 
Figure 5. Firefly can produce painterly images of people, but not photographic ones.

Generative Fill

As well as creating entire images from scratch, Firefly can modify existing images with text prompts. For this first example, I started with an image of myself on a sunny day (Figure 6).

Figure 6. Steve Caplin on a sunny day

You can use the built-in brushes to select the area you want to replace. The area you paint is shown with a checkerboard pattern, just as if you had erased that region of a layer in Photoshop, as seen in Figure 7.
Figure 7. The shirt area selected, shown as a checkerboard pattern
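That checkerboard is simply the standard convention for displaying transparency. As a purely illustrative aside (this is not Adobe's code), here is how you might render the same preview yourself with NumPy, given an image and a selection mask:

```python
import numpy as np

def checkerboard_preview(image, selected, square=8):
    """Show `image` with selected pixels replaced by a gray checkerboard.

    image:    (H, W, 3) float array with values in 0..1
    selected: (H, W) boolean mask of the area to be replaced
    """
    h, w = selected.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Alternate light and dark gray squares of the given size
    checks = ((yy // square + xx // square) % 2) * 0.25 + 0.65
    board = np.repeat(checks[..., None], 3, axis=2)
    return np.where(selected[..., None], board, image)
```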

Once you’ve made your selection, you can type your text prompt. I chose business suit. Click the Generate button, and Firefly creates the image (Figure 8).
Figure 8. A convincing business suit, with appropriate shadows

The result is good: a perfectly fitting, rather sharp suit. And not just that: It's at exactly the right angle for the pose, even replicating the lighting direction with the shadow of my head on the collar. To replace the background, you just click the Select Background button beneath the image, and Adobe Sensei technology will usually do a good job of making the selection. In Figure 9, I chose to replace the clouds with a cityscape.
Figure 9. The clouds replaced with a cityscape

Text Effects

Another tool in Firefly’s toolbox is Text Effects, which creates stylized lettering based on the prompt you supply. You can choose the font from a menu of 16 (including Acumin Pro as well as four other sans-serif fonts, three serif fonts, and a few display fonts). To have Firefly generate text effects, simply enter the word(s) you want to use and describe the desired style. The results are impressive: Figure 10 and Figure 11 show spring flower and foil balloon effects, which are rendered as transparent PNG files so you can place them over the background of your choice.

Figure 10. Spring, rendered as spring flowers

Figure 11. Foil balloon lettering
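Because the effects arrive as transparent PNGs, dropping them onto a backdrop is a one-liner in most imaging tools. Here is a minimal Pillow sketch (the filenames are placeholders, and the lettering is assumed to be smaller than the background):

```python
from PIL import Image

background = Image.open("backdrop.jpg").convert("RGBA")
lettering = Image.open("spring_effect.png").convert("RGBA")  # transparent PNG

# Alpha-composite the lettering over the backdrop, centered
x = (background.width - lettering.width) // 2
y = (background.height - lettering.height) // 2
background.alpha_composite(lettering, dest=(x, y))
background.convert("RGB").save("composited.jpg")
```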

Generative Recolor

Perhaps the least impressive tool, at least for now, is Generative Recolor. As the name suggests, this recolors vector artwork based on your text prompts. The artwork must be saved as an SVG file first, which means an extra step in Illustrator before you can proceed. The results are often odd and not very useful, with colors shifted almost randomly. It’s hard to see how my drawing of a pirate (Figure 12) has been significantly improved in the process (Figure 13). 

Figure 12. My drawing of a pirate…

Figure 13. …recolored—but so what?
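To be fair, the underlying idea of remapping the fills in an SVG is sound; SVG is just text, after all. Here is a conceptual Python sketch of a simple palette swap. It is not how Adobe does it, and the filename and color values are made up:

```python
import re

# Hypothetical palette: map existing fill colors to a new scheme
palette = {"#c0392b": "#2e86ab", "#f1c40f": "#a0d2db", "#2c3e50": "#1b3a4b"}

svg = open("pirate.svg").read()
recolored = re.sub(
    r"#[0-9a-fA-F]{6}",  # match six-digit hex colors
    lambda m: palette.get(m.group(0).lower(), m.group(0)),
    svg,
)
open("pirate_recolored.svg", "w").write(recolored)
```

The hard part, of course, is choosing a palette that actually improves the artwork, which is exactly where Generative Recolor currently falls short.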

Firefly, Further

A number of additional Firefly technologies are forthcoming. The details are unknown at this point, because each one is referred to on the website with only an image and a brief text description. They include the following:

- Extend Image, which is available now in the beta version of Photoshop, as we’ll see later on
- Text to Brush, which generates Photoshop brushes
- Text to Pattern, which creates seamless tiled patterns
- 3D to Image, which aims to create textured objects from uploaded 3D models—and which could be a real game changer for 3D artists
- Text to Vector, which creates editable vector artwork from a text prompt
- Sketch to Image, which allows users to upload line art drawings and have them turned into full-color painterly images

All these putative tools are labeled as “in exploration,” with no anticipated arrival date. But if they perform as well as the sample images suggest, it’s definitely worth keeping an eye on the Firefly website to watch out for their appearance.

Photoshop + Firefly

With the meteoric rise of AI-generated images, you might be concerned that traditional Photoshop artists are being left behind. But that’s all set to change with the release of the first public beta of Photoshop to include the Firefly engine, which generates imagery to order via the Generative Fill feature. As with all beta software, Firefly in Photoshop is not yet ready for prime time. Indeed, Adobe’s GenAI User Guidelines specifically prohibit its use for commercial work while it’s in beta. In the meantime, here’s what it can do for you.

Outfilling

We’re all used to Content-Aware Fill, which does a reasonable job of extending and reshaping images. But now Photoshop’s new Generative Fill feature (powered by the same GenAI as Firefly) takes that to a whole new level. In Figure 14, a portrait shot of a street scene has been extended to the left by widening the canvas with the Crop tool.

Figure 14. The original street, with the canvas extended to the left
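Conceptually, this outfill setup is nothing more than a bigger canvas with empty pixels where the new content will go. A minimal Pillow sketch of that first step (the filename and the amount of extension are placeholders):

```python
from PIL import Image

src = Image.open("street.jpg").convert("RGBA")
extra = src.width // 2  # extend the canvas by half the original width

# New, wider canvas; transparent pixels mark the area to be generated
canvas = Image.new("RGBA", (src.width + extra, src.height), (0, 0, 0, 0))
canvas.paste(src, (extra, 0))  # original at the right, empty strip at the left
canvas.save("street_extended.png")
```

In Photoshop, the Crop tool and a selection do this for you; the generator then fills the empty region.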

Selecting any region in Photoshop—in this case, the white area—brings up the Generative Fill button in a new floating palette. Click the button, and you can enter a text prompt, as we’ll see later. Leave it blank and press the Enter key, and in a few seconds, you’ll be presented with three alternative images. Figure 15 shows a couple of attempts. The first has a curious construction at the end of the street, but the perspective of the new buildings opposite works well; the second has some bizarre architectural features where the building becomes confused with the trees.
Figure 15. Two views of the extended street

Figure 16 shows the best result: a thoroughly convincing street, with some added trees. Granted, there should realistically be a road to make the sidewalk narrower, but that’s something that can relatively easily be added using conventional Photoshop tools.
Figure 16. The extended street, with generated trees

Adding a Text Prompt

You can choose to fill a selected area with literally anything you can imagine. In the example in Figure 17, I started with a view of an ancient farmyard.  

Figure 17. A farmyard with an empty space in front of it

After making a selection, you’re given the option of adding text to tell Photoshop what you want to add to the scene. In this case, I asked Photoshop to generate a vintage car. Once again, three variants were created; two of them, as seen in Figure 18, are misshapen objects that are more crash than car. 
Figure 18. Two failed cars

But the third attempt produced a realistic car at just the right angle (Figure 19). I was particularly impressed by the white debris on the tires, which matches the soft chalky surface on which the car is parked. 
Figure 19. The car, in perfect perspective for the scene

It’s worth noting here that Photoshop doesn’t just add a new object; it also modifies the background behind that object. In the three examples shown here, you can clearly see how the windows in the wall above the car have been reconfigured. It’s unclear why this change takes place, and it appears even Photoshop’s engineers are scratching their heads over exactly why it happens.

The next step was to add a person to the scene. People are a particular difficulty for Generative Fill, with faces and hands proving most problematic. My prompt for Figure 20 was woman sitting, and this is the best of the three results. The most impressive part of this addition is that she’s clearly sitting on the car’s running board; Generative Fill shows a real awareness of context (but difficulty with hands).
Figure 20. A seated figure really does sit on the running board.

Next, I added a hay wagon behind the car (Figure 21). The angle and perspective are exactly right, and note how the back of the hay wagon is in shadow where it disappears beneath the archway. 
Figure 21. The added hay wagon

In the process of adding the hay wagon, however, Photoshop obliterated the woman’s face and arm. Once again, Generative Fill tends to alter backgrounds when adding new objects. But because each new generated object is created on its own, separate layer, you can easily paint on the mask that comes with that layer to reveal the layers beneath, as Figure 22 shows. 
Figure 22. The hay wagon layer masked to reveal the woman’s face again.
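Under the hood, a layer mask is simple arithmetic: each pixel of the composite is a blend of the upper and lower layers, weighted by the mask value. A minimal NumPy sketch of the idea (not Adobe's implementation):

```python
import numpy as np

def composite(top, bottom, mask):
    """Blend two (H, W, 3) layers using an (H, W) mask with values in 0..1.

    Where the mask is 1, the top layer (the hay wagon) shows;
    painting the mask toward 0 reveals the layer beneath (the woman).
    """
    m = mask[..., None]  # broadcast the single-channel mask over RGB
    return m * top + (1.0 - m) * bottom
```

That is why painting black on the hay wagon's mask brings the woman's face back without destroying either layer.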

Finally, I made a selection at the bottom of the image with the Lasso tool and told Photoshop to add a puddle. It was neatly generated, reflecting the car, the woman, and even the white rock (Figure 23). The randomly created shape of the puddle varies with each iteration, but all versions respect the fact that the road surface is made of pebbles, forming a totally convincing edge between the water and the stones.
Figure 23. The puddle produces largely accurate reflections.

Object Removal

Some versions ago, Photoshop introduced Content-Aware Fill, which enables objects in a scene to be replaced by texture sampled from the image. Generative Fill takes this process much further. In this example (Figure 24), I wanted to remove the line of cars in front of the building.

Figure 24. The line of cars spoils our view of this building.

After making a rough selection around the cars with the Lasso tool, I used Generative Fill with no text prompt. You can see the result in Figure 25. 
Figure 25. The cars, gone in an instant.

The cars are gone, the brick wall rebuilt. There are even piers supporting the brick capitals. But as before, Photoshop respects the existing image: Note how the trees and piers now have shadows on the sidewalk that match the lighting direction in the scene.

Horsing Around (with Mediocre Results)

You can try to produce photographic images of animals, although the results are often questionable at best. The image shown in Figure 26, of an abandoned mine entrance, makes the perfect backdrop for a demonstration. I started by making a rectangular selection where I wanted the animal to appear.

Figure 26. The original mine entrance

Adding the prompt horse and cart produced three results to choose from. Figure 27 is the most acceptable, if you ignore the odd cart construction. It’s worth noting that the filter doesn’t just add the horse and cart; it also modifies the background around it: See how the building behind has been remodeled in the process. It seems this is, at present, unavoidable.
Figure 27. An acceptable horse and cart

Sometimes, though, the software gets confused, producing images that are downright surreal (Figure 28). 
Figure 28. Oops! There are somewhat bizarre results in this horse and cart version.

Getting It Wrong

Although Generative Fill is undeniably powerful, there are times when it simply refuses to produce the results you want. In Figure 29, I asked Photoshop to add leaves to the tree (top left). None of the results are ideal, and some of them are downright weird. 

Figure 29. Adding leaves to a tree seems beyond Photoshop’s present capabilities.

As you saw in the vintage car sequence, Generative Fill has particular difficulty with people. In the example shown in Figure 30, I asked it to add a man in a deckchair to the backyard scene. 
Figure 30. A prompt for a man in a deckchair produces some scary results.

None of the figures are acceptable. Faces and hands are a particular problem; but feet, legs, and arms can go horribly wrong as well. Photoshop is also limited by the source material available. In another example, I started with the empty street scene (Figure 31). 
Figure 31. The empty street scene

I asked Photoshop to add a speeding car, which it managed well; the angle of view is exactly right (Figure 32). 
Figure 32. Love the speeding car, but who asked for a bush behind it?

But why has the bush been added behind the car? I asked the question of Photoshop engineers, who pointed out that most images of speeding cars have foliage behind them. And so Generative Fill, when asked to produce a speeding car, assumed that foliage was part of the requirement.

Why Can’t Photoshop Create Convincing People?

There are two sorts of artificial intelligence. Standard AI is what allows your car to know when another car is coming towards it and to dim the headlights. It does that job well, but you can’t ask it to render the plot of Moby Dick as a haiku.

The second sort, Artificial General Intelligence (AGI), aims to allow computers to do anything a human can do. Tools of this kind use what’s called a Large Language Model (LLM), which works on the basis of the most likely words to appear in any given context. So, if you say I’m going to take the kids to… then the word that follows is more likely to be Disneyland than, say, cauliflower or Iceland. This is a trivial example, but by now it’s probable that LLMs have read almost everything ever written. That’s what enables ChatGPT and the like to produce text on just about any subject. The results may or may not bear any resemblance to reality, but they almost always sound convincing.

Image-creation tools also use LLMs: They use a visual language, of course, but the process is the same. And tools such as Midjourney have consumed many millions of images from around the internet, which enables them to create thoroughly realistic synthetic images to illustrate just about anything you ask for.

So, more is better, right? Well, up to a point. The reason image generators are so good is that they’re indiscriminate about where they source their training material—as evidenced recently, when Getty Images sued Stability AI, maker of Stable Diffusion, for poaching 12 million of its copyrighted photographs in order to create new images. How did they know? Because many of the images produced by Stable Diffusion included the Getty Images watermark, as Figure 33 shows.

Figure 33. Evidence from the Getty Images lawsuit against Stability AI: The original Getty image (left) is reworked by Stable Diffusion (right), but the watermark is also retained.
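To make the next-word idea from this sidebar concrete, here is a toy Python sketch. The counts are invented for illustration; a real LLM learns equivalent statistics, at unimaginable scale, from its training text:

```python
import random

# Invented counts of which word follows a given context
next_word_counts = {
    "I'm going to take the kids to": {
        "Disneyland": 90, "school": 8, "Iceland": 1, "cauliflower": 1,
    },
}

def predict(context):
    """Sample the next word in proportion to how often it follows the context."""
    counts = next_word_counts[context]
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights, k=1)[0]

print(predict("I'm going to take the kids to"))  # almost always "Disneyland"
```

The same logic applied to visual elements, rather than words, is why a generator asked for a speeding car assumes you also want foliage behind it.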

The moral question is clear: Should LLMs be allowed to poach the work of artists and photographers without consent, credit, or compensation? Adobe, rightly, thinks not. So, Photoshop’s Generative AI is trained entirely on images from the Adobe photo library, which, although extensive, holds just a tiny fraction of the many millions of images on the internet. And that necessarily limits the extent of realistic image creation based on text prompts. Adobe says they’re “committed to compensating the people who are contributing their work to these databases.” This is good news for illustrators and photographers, but it does mean it will be a while before the photorealistic quality of Firefly-generated images can come close to that offered by less scrupulous players.

Firefly Takes Flight

Unlike other AI image-generation tools, Photoshop lets us integrate generated results with our own images, choose exactly where the new elements appear, and then use standard Photoshop tools to modify them. The Firefly website, meanwhile, allows you to create entirely new images from mere text prompts. Together, it’s like having an infinite, free photo library at your fingertips.

Where this leaves traditional Photoshop artists, illustrators, and photographers, though, is another question. Are the skills we’ve painstakingly learned over our careers now becoming obsolete? Or are these just new tools that we can learn to master and turn to our advantage? One thing is certain: The gap between inspiration and execution is getting shorter by the day. Stay tuned!
