Drawing pictures of animals with ChatGPT
I asked ChatGPT to draw some pictures for me by producing the SVG code to "sketch" out my requests.
What did I ask for?
- A cat
- A dog
- An elephant
- A walrus
- A monkey
- The Mona Lisa
- A smiley face
- A smiley face with arms (I'll explain)
- The words "Hello World"
What I learned is that ChatGPT is... not so great at this. That, or it has a really interesting interpretation of some of these things. If you want something approaching visual accuracy and usability, you'll probably want to use another tool like DALL-E. There are probably better tools out there that generate actual SVGs. I've heard them mentioned in various newsletters and papers, but I haven't actually stumbled across one I could use in the wild yet.
Consider this my digital refrigerator upon which I am hanging young ChatGPT's best artistic efforts.
Here's the cat:
This was the first image I requested. I laughed when I saw the output and thought "Well... okay. I guess that's an extremely minimal interpretation of a cat."
It got a little more surreal when I asked for a dog:
Oooh, now it's getting interesting! I feel like I'm seeing an obvious head, maybe the points at the bottom of the three "hills" could be the four arms or paws or something.
When I asked for an elephant, I started to notice a pattern:
This one was somehow even more abstract than the dog. The same "hills" and weird, iris-less and pupil-less "eye" shape floating in the bottom remained.
Because I still think about the adorable walruses I saw in Svalbard last summer, I decided to ask ChatGPT to draw a walrus next:
I've concluded maybe ChatGPT just thinks all animals (except for cats) have weird all-seeing eyes and hilltop shapes as their base. Weirdly I feel like I could almost squint and imagine this one as a single-tusked walrus, poking its head out of the water, taking a break from scouring the seabed for clams.
I decided to try a very different kind of animal and went for a monkey:
I had to double-check my prompt and make sure I hadn't made a mistake, because at first glance I thought this looked exactly like the elephant ChatGPT had given me earlier. Upon closer inspection, it basically was, except everything had been "condensed" a little bit, and some of the shapes in the base were subtly different.
I decided to try something harder and ask for the Mona Lisa next:
I'm not quite sure we captured her enigmatic smile here, but it's definitely more... symmetrical than the walrus?
I should pause here to note that I did try wilder, more surrealistic prompts—the type I'm used to submitting to DALL-E. When I asked for a monkey riding a rhino like a cowboy I got this in response:
Kudos for knowing your limits, I suppose.
After that I asked for something I assumed it would be able to give me—a yellow smiley face. I audibly laughed when I saw the render, as I was reminded of probably one of the first times I'd used MS Paint or something as a kid:
I asked for it to draw the same thing but with arms next. My hope was to recreate the little finger puppets I've used for years as avatars and in Zoom backdrops:
"A" for effort, I guess.
Finally, I decided to throw a curve ball and ask it to render an SVG drawing the words "Hello World." DALL-E is famously bad with text and I wondered how it would handle this.
The very first rendition included the <text> element, which I'd forgotten is part of the SVG spec. Ha! That is probably the most "correct" answer here, but not what I wanted. I adjusted my prompt asking for the same thing but explicitly requesting that ChatGPT not use that element.
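For reference, here is a minimal sketch of what a <text>-based "Hello World" SVG can look like. This is not ChatGPT's actual output; the function name, canvas size, coordinates, and font settings are all assumptions picked for illustration, and the SVG is built with Python's standard library.

```python
import xml.etree.ElementTree as ET


def hello_world_svg(message: str = "Hello World") -> str:
    """Return a small SVG document that renders `message` with a <text> element."""
    # Root <svg> element; the size and viewBox are arbitrary illustrative values.
    svg = ET.Element(
        "svg",
        {
            "xmlns": "http://www.w3.org/2000/svg",
            "width": "200",
            "height": "60",
            "viewBox": "0 0 200 60",
        },
    )
    # x/y set the start of the text baseline, not the top-left corner.
    text = ET.SubElement(
        svg,
        "text",
        {"x": "10", "y": "35", "font-family": "sans-serif", "font-size": "24"},
    )
    text.text = message
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(svg, encoding="unicode")


print(hello_world_svg())
```

Saving the printed string to a .svg file and opening it in a browser should show the phrase as real, selectable text, which is why this is arguably the "correct" answer.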
Not only did it return a new SVG using <path> elements instead, but it gave me a surprisingly confident-sounding explanation of how the code is accomplishing this!
It said:
This code creates the text "Hello World" by using three path elements, each with a different set of quadratic Bezier curves (Q) which forms the letters in the phrase. The first path is for the word "Hello", the second path is for the word "World" and the third one is for the baseline. The fill attribute is set to none and the stroke attribute is set to black and stroke-width is 2.
Please note that this is a more complex example, you can find more detailed svg codes for text by searching the internet, and you can customize the code to match your needs. Also this is not a scalable method for text as you will need to manually adjust the path for different font sizes.
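As an aside, the quadratic Bezier (Q) command mentioned in that explanation is simple to sketch. The snippet below is an illustration under my own assumptions, not the code ChatGPT produced: the helper names and coordinates are hypothetical, and it draws a single curve rather than letters. It reuses the fill, stroke, and stroke-width values from the explanation.

```python
def quad_bezier_point(p0, p1, p2, t):
    """Evaluate B(t) = (1-t)^2*P0 + 2(1-t)t*P1 + t^2*P2 for t in [0, 1]."""
    u = 1.0 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def quad_path_svg(p0, p1, p2):
    """Return an SVG with one <path> made of a single quadratic Bezier curve."""
    # "M" moves to the start point; "Q" takes a control point, then an end point.
    d = f"M {p0[0]} {p0[1]} Q {p1[0]} {p1[1]} {p2[0]} {p2[1]}"
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 100">'
        f'<path d="{d}" fill="none" stroke="black" stroke-width="2"/>'
        "</svg>"
    )


# Hypothetical coordinates: a single arch across the canvas.
print(quad_path_svg((10, 80), (95, 10), (180, 80)))
print(quad_bezier_point((10, 80), (95, 10), (180, 80), 0.5))
```

Each Q segment only bends toward one control point, so spelling out legible letters this way takes a lot of carefully placed segments.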
I opened the SVG with eager anticipation to reveal:
I guess my takeaway from all of this: while the security of traditional computer engineering jobs (among others) may be challenged as large language models evolve, SVG clipart artists seem relatively safe from the onslaught of AI 😂
At least for now.
😐
PS—for the curious, you can view and download all of the SVGs here in a GitHub repo: