Can ChatGPT generate images? You won’t believe the results


People are very visual, and although ChatGPT is fantastic at replying to almost any question, it does not make images directly yet. There are a few ways, though, to get ChatGPT to give us an image, and a couple of them even put the image right in ChatGPT’s response. So the question is: can you get an image in ChatGPT’s response box?

ChatGPT can show an image in its response by pointing to an outside service. You have to specifically request a special URL, written as a Markdown image link, that fetches the image you want or want to create. Unsplash and Pollinations.ai are examples of services that can be used.

So the ChatGPT model does not inherently have the ability to create images or display artistic capabilities. Its primary function is to process and generate text based on the prompts it receives.

There are two ways of getting around these limits that we are going to look at in this article.

  • Using ChatGPT to create descriptions and call image URLs directly, so that the images appear in its response
  • Using ChatGPT to create descriptions for image generators, like OpenAI’s DALL-E 2 or DreamStudio, which generate images based on text descriptions

It’s important not to confuse ChatGPT with image-generating AIs, as they serve different purposes. While ChatGPT focuses on text and language understanding, other AI models are designed specifically for creating visuals. Combining these technologies, though, can be very synergistic.

I hope that you are as interested as I am in jumping in and learning more!

Method 1: Creating a URL Call to Pollinations.ai

Explanation of What Pollinations.ai Is and How It Works

Pollinations.ai is an AI-powered tool designed to create images based on text prompts. This art generator uses generative AI to help people imagine new worlds and visuals by interpreting their text inputs. It has a social media integration feature, where users can receive images generated on the spot by simply posting their text descriptions. (Which is cool.)

We’re going to use that social integration feature to call for images to be put into our responses from ChatGPT.

Let’s take a look at a sample prompt. It produces descriptions that work both for DreamStudio and for Pollinations.ai.

AI Art Prompt Generator For Pollinations.AI

Hi ChatGPT we are going to make pictures today!!!

I want you to be very creative when describing pictures of a horse, a palace, and a woman. You will in particular in describing images explain the images as you would to a legally blind person.

Please repeat your description a second time in this special format.

The first time you can use a regular format.

Special Format for The Repeated Description { (markdown) =![Image]( {description} = {sceneDetailed},%20{adjective1},%20{adjective2},%20{visualStyle1},%20{visualStyle2},%20{visualStyle3},%20{genre},%20{artistReference} }

Links must always be similar to these examples so you can create something similar:

a) If your description was “Pickles” return: “![Image](https://image.pollinations.ai/prompt/A%20photo%20of%20a%20pickle)”

b) If your description was “Cowboy Hat” return: “![Image](https://image.pollinations.ai/prompt/A%20photo%20of%20a%20cowboy%20hat)”

You mustn’t put these links in a code block. You put it as regular text.

Sure, let’s get started with the descriptions!

The picture of a horse:

The horse in this picture is a chestnut brown with a white blaze on its forehead. It has a muscular build and is standing in a grassy field with a fence in the background. Its mane and tail are long and flowing, and it appears to be looking off into the distance with its ears perked up.

Repeated Description:

Image = The horse in this picture is a beautiful chestnut brown with a striking white blaze on its forehead. Its powerful muscles can be seen rippling under its smooth coat, and it stands confidently in a lush green field with a wooden fence visible in the distance. Its long, flowing mane and tail sway in the breeze, giving the impression of graceful movement. This image captures the essence of strength, freedom, and natural beauty.

The picture of a palace:

The palace in this picture is an impressive structure made of white stone with intricate carvings and details. It has several towers and domes, giving it a grand appearance. It is surrounded by lush gardens with fountains and sculptures, and a path leads up to the entrance. There are people walking around the grounds, enjoying the scenery.

Repeated Description:

Image = The palace in this picture is a magnificent structure made of gleaming white stone. Its grandeur is emphasized by the intricate carvings and details adorning its walls and the numerous towers and domes that adorn it. The palace is set amidst beautiful gardens filled with colorful flowers, elegant fountains, and exquisite sculptures that seem to complement the building’s beauty. A stone path leads up to the entrance, and there are several people strolling around, adding to the image’s sense of life and vitality. This picture exudes an air of luxury, elegance, and history.

The picture of a woman:

The woman in this picture is wearing a long, flowing dress with a floral pattern. She has long, wavy hair that falls down her back, and she is standing in front of a beautiful landscape with mountains in the background. She has a peaceful expression on her face and is holding a bouquet of flowers.

Repeated Description:

Image = The woman in this picture is an epitome of grace and beauty. She is wearing a long, flowing dress with a delicate floral pattern that seems to complement her slender frame. Her long, wavy hair cascades down her back, adding to the overall sense of movement and serenity. The woman is standing in front of a breathtaking landscape, and the mountains in the distance provide a stunning backdrop. She has a serene expression on her face, as if lost in thought, and is holding a beautiful bouquet of flowers, adding a touch of color and vibrancy to the image. This picture conveys a sense of tranquility, beauty, and joy.

So you can go ahead and adapt what you see in this prompt already. But in case you want to do something more complicated, let’s go through each step needed to get Pollinations.ai to make an image for you.

Simple Step-by-Step Instructions on How to Create a URL Call to Pollinations.ai Using ChatGPT

The entire call that you will ask ChatGPT to return in its results will look like this.

markup = ![Image](https://image.pollinations.ai/prompt/big%20bear)

OK, so this is what we get when we only ask for “big bear”. It’s not as pretty as some of the other calls we can make, but we’ll get there.

First, we need to take a look at the structure of the prompt.

You can see that we have the words “big bear”, but the words are separated by %20, which is how a space is written inside a URL. You won’t have to be a programmer to get ChatGPT to put the %20s in for you. Here’s how we can do that.

  • Ask ChatGPT to do it for us and
  • We’ll give ChatGPT some examples

Easy Peasy, take a look at where I did that in the first prompt above. (I did not explicitly ask for spaces to be made into %20, instead, I showed ChatGPT an example of what I wanted.) Using examples is a good way to communicate with the AI. AI is very good at recognizing examples and learning from them.
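If you ever want to check what ChatGPT should have produced, the space-to-%20 conversion can be reproduced outside of ChatGPT. The snippet below is a minimal sketch using Python's standard library; the function name `encode_description` is just an illustrative choice and is not part of any prompt used in this article.

```python
from urllib.parse import quote


def encode_description(description: str) -> str:
    """Percent-encode a description so it can sit inside a URL path."""
    # quote() turns each space into %20; safe="" also encodes "/" so that
    # a slash inside the description cannot split the URL path.
    return quote(description, safe="")


print(encode_description("big bear"))
print(encode_description("A photo of a pickle"))
```

Comparing this output with the link ChatGPT returns is a quick way to spot a description that was not encoded properly.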

Please repeat your description a second time in this special format.

The first time you can use a regular format.

Special Format for The Repeated Description { (markdown) =![Image]( {description} = {sceneDetailed},%20{adjective1},%20{adjective2},%20{visualStyle1},%20{visualStyle2},%20{visualStyle3},%20{genre},%20{artistReference} }

Links must always be similar to these examples so you can create something similar:

a) If your description was “Pickles” return: “![Image](https://image.pollinations.ai/prompt/A%20photo%20of%20a%20pickle)”

b) If your description was “Cowboy Hat” return: “![Image](https://image.pollinations.ai/prompt/A%20photo%20of%20a%20cowboy%20hat)”

You mustn’t put these links in a code block. You put it as regular text.

WOW! Halfway There Already!

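OK, so the only thing left to do is to let ChatGPT know not to put the URL call we just made into a code block. If the link lands in a code block, it shows up as plain text instead of being displayed as an image.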

You mustn’t put this link in a code block. Put it in as regular text.

(markdown) = ![Image](https://image.pollinations.ai/prompt/{IMGDESC})

Okay, so let’s go over that one more time.

  • First, you need to get ChatGPT to produce a description that it can put into the Pollinations.ai link.
  • When it puts the description into the link, you need to request that it uses %20 instead of spaces.
  • We get ChatGPT to do that by showing it an example of what we want or by asking it to.
  • We need to ask it not to put the link in a code block.
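The steps above can also be sketched as a small Python helper that assembles the same Markdown link ChatGPT is being asked to write. This is only an illustration: the function name `pollinations_markdown` and the idea of passing the description as a list of parts are assumptions made here, while the base URL and the comma-plus-%20 layout come from the prompt shown earlier.

```python
from urllib.parse import quote

# Base URL taken from the example links in the prompt above.
POLLINATIONS_BASE = "https://image.pollinations.ai/prompt/"


def pollinations_markdown(parts, alt_text="Image"):
    """Build a Markdown image link from description parts (hypothetical helper)."""
    # Join scene, adjectives, styles and so on with ", " as in the special format.
    description = ", ".join(parts)
    # Keep commas readable, encode spaces as %20, and encode characters
    # such as parentheses that would otherwise break the Markdown link.
    encoded = quote(description, safe=",")
    return f"![{alt_text}]({POLLINATIONS_BASE}{encoded})"


print(pollinations_markdown(["big bear"]))
print(pollinations_markdown(["a sad mermaid", "melancholic", "Art Nouveau"]))
```

Pasting the printed line into any Markdown viewer as regular text, not inside a code block, should display the image in the same way the ChatGPT response does.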

Comparisons: With A “Fancy” Description Pollinations.ai vs. Stable Diffusion

Prompt: A melancholic mermaid, inspired by the delicate strokes of Art Nouveau artists, sits gracefully underwater. The soft, dim lighting creates a somber mood, while the tones of blues and greens evoke a sense of serenity. Surrounding the mermaid, aquatic flora and fauna add depth to the scene.

Plugging The Description That ChatGPT Created Into Stable Diffusion We Get

You can see there is a difference between the Stable Diffusion image and the image generated by Pollinations.ai. The Pollinations.ai images seem to be a little lower in resolution and have less detail.

I am impressed in general with Pollinations.ai, though. Its quality is quite good with the right prompt and, unlike Stable Diffusion, it lets ChatGPT put images directly in its responses to you.

Tips for Optimizing the Output Image

Here are some tips I’ve gathered for optimizing the output images:

  • Be as specific and detailed as possible in your text description to guide the AI.
  • Keep your summary concise and focused on the main elements of your image.
  • Experiment with different text prompts and summaries to learn how the AI interprets various inputs.

Use Cases for Including Images in ChatGPT Output Using Pollinations.ai

There are numerous use cases for combining ChatGPT-generated content with Pollinations.ai generated images. Some of them include:

  • Enhancing educational materials with relevant images.
  • Creating engaging blog posts with custom visuals.
  • Designing captivating social media content.
  • Developing unique art pieces to be sold, potentially generating affiliate commission.

Overall, utilizing Pollinations.ai in conjunction with ChatGPT opens up a world of creative possibilities, making it a valuable asset for AI enthusiasts like myself and others who want to incorporate dynamic visual content in their day-to-day lives.

The following prompt is an example of an educational use case. You can get ChatGPT to give you a 500-word essay on almost anything and ask it to include pictures.

Method 2: Making a URL Call to Unsplash

Unsplash is very different from the AI-generated pictures that we’ve been working with so far. Unsplash is a repository for photographs that you can use without having to pay for them. Using their API you can access these photographs. There are a few things to keep in mind.

Explanation of What Unsplash Is and How It Works

Unsplash is a popular platform providing free high-quality images for everyone to use. Their API allows developers to access and integrate their vast collection directly into applications. Although ChatGPT can’t create images directly, incorporating Unsplash as a resource enables us to generate image URLs that can be shown alongside the textual output. Combining Unsplash with ChatGPT can create a richer experience when dealing with AI tools and outputs.

Step-by-Step Instructions on How to Make a URL Call to Unsplash Using ChatGPT

So once again we will be using a special URL call, this time to access a photo on Unsplash through their search system. Here is the base URL:

https://source.unsplash.com/1600x900/?(Your Search Query Here)

Just like previously we will need to format the query properly by changing any spaces to %20.
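For readers who want to see the URL pattern spelled out, here is a minimal Python sketch that builds the same kind of Unsplash link. It assumes the source.unsplash.com format shown above; the function name `unsplash_markdown` and its `width` and `height` parameters are hypothetical names chosen for this example.

```python
from urllib.parse import quote


def unsplash_markdown(query, width=1600, height=900):
    """Build a Markdown image link for an Unsplash search (hypothetical helper)."""
    # The size goes in the path as WIDTHxHEIGHT with a plain letter "x",
    # and the search query follows the "?" with spaces encoded as %20.
    encoded_query = quote(query, safe="")
    url = f"https://source.unsplash.com/{width}x{height}/?{encoded_query}"
    return f"![Image]({url})"


print(unsplash_markdown("a photo of a pickle"))
print(unsplash_markdown("green leaf veins", width=800, height=600))
```

Changing `width` and `height` is the code equivalent of editing the size inside the link by hand.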

So Let’s Jump In and Take a Look at a Simple Prompt

Unsplash Prompt Example

Please write out [subject].

[subject] Leaves and Circulation in Plants

The goal is to go into as much detail as possible writing out information as an expert in [subject].

You should include at least 3 images describing different situations and objects as you write to make the text easier to understand. You will in particular in describing images within the text, explain the images as you would to a legally blind person.

Please repeat your description a second time in this special format. The first time you can use a regular format.

Special Format for The Repeated Description { (markdown) =![Image]( {description} = {sceneDetailed},%20{adjective1},%20{adjective2}}

Links must always be similar to these examples so you can create something similar:

a) If your description was “Pickle” return: “![Image](https://source.unsplash.com/1600x900/?a%20photo%20of%20a%20pickle)”

b) If your description was “Cowboy Hat” return: “![Image](https://source.unsplash.com/1600x900/?A%20photo%20of%20a%20cowboy%20hat)”

You mustn’t put these links in a code block. You put it as regular text.

[subject] Leaves and Circulation in Plants

Plants are an essential part of the Earth’s ecosystem, and their leaves play a critical role in their survival. Leaves are the primary site for photosynthesis, the process by which plants convert sunlight into energy. Photosynthesis takes place in specialized structures called chloroplasts, which are found in the leaf’s mesophyll cells. Leaves also have a vascular system that allows them to transport water, nutrients, and sugars throughout the plant.

The vascular system in plants consists of two types of tubes: xylem and phloem. Xylem tubes transport water and minerals from the roots to the leaves, while phloem tubes transport sugars and other nutrients from the leaves to the rest of the plant. These tubes are arranged in a circular pattern throughout the plant, with the xylem forming an inner circle and the phloem forming an outer circle. This arrangement is known as the vascular bundle.

The movement of water and nutrients through the xylem and phloem is driven by a combination of processes, including transpiration, osmosis, and active transport. Transpiration is the process by which water is lost from the leaves through small pores called stomata. This loss of water creates a suction force that pulls water up through the xylem from the roots. Osmosis and active transport then move nutrients and sugars through the phloem.

The structure of leaves varies depending on the type of plant and its environment. Leaves that grow in hot, dry environments, for example, tend to have adaptations that help them conserve water. These adaptations can include a thick, waxy cuticle that covers the leaf’s surface or small, compact leaves that reduce surface area. Leaves that grow in shaded environments, on the other hand, tend to have adaptations that help them capture as much sunlight as possible. These adaptations can include large, thin leaves or leaves that are arranged to form a canopy that maximizes the amount of light that reaches them.

Overall, leaves play a critical role in a plant’s survival by providing a site for photosynthesis and a means of transporting water, nutrients, and sugars throughout the plant. The structure and function of leaves vary depending on the plant’s environment, making them a fascinating subject of study for botanists and plant enthusiasts alike.

Repeated Description: A photo of a green leaf with visible veins running through it, indicating the plant’s vascular system, which allows for the transport of water and nutrients. Another photo shows two different types of leaves with different structures, one thick and waxy and the other thin and arranged to form a canopy. The third photo is of a plant with visible leaves and stem, illustrating the plant’s reliance on leaves for photosynthesis and circulation.

Just a couple of things to keep in mind here. First, images are cool, but the images you get from Unsplash are often not relevant, because Unsplash only matches your search words against existing photographs. So don’t plan on teaching your students all about leaves by asking ChatGPT to print out a text with pictures.

Second, there is a definite limit on how much text you can get ChatGPT to print out at a time. In my experience the limit is about 550 words at most. If you wanted to create a longer article, you would have to ask ChatGPT to break the subject down into smaller pieces using an outline. You would then ask ChatGPT to print out a single small piece of that outline at a time.
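The outline approach can be sketched in a few lines of Python that turn an outline into one request per section. Everything here is illustrative: the function name `section_prompts`, the wording of the generated request, and the default of 500 words (chosen to stay under the roughly 550-word limit mentioned above) are assumptions, not a tested recipe.

```python
def section_prompts(subject, outline, max_words=500):
    """Create one ChatGPT request per outline section (hypothetical helper)."""
    prompts = []
    for number, heading in enumerate(outline, start=1):
        # Each request asks for a single section so that no reply
        # needs to exceed the length ChatGPT prints at one time.
        prompts.append(
            f"You are writing an article about {subject}. "
            f'Write only section {number}, "{heading}", '
            f"in at most {max_words} words."
        )
    return prompts


outline = ["Leaf structure", "Xylem and phloem", "Adaptations to environment"]
for prompt in section_prompts("Leaves and Circulation in Plants", outline):
    print(prompt)
```

Each printed request would be pasted into ChatGPT separately, and the replies joined together afterward.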

Tips for Optimizing the Output Image

  • When making requests to the Unsplash API, be sure to include appropriate search terms describing the images you want to display. The more specific your query, the more relevant your image results will be.
  • Consider the size of the images you want to display in your ChatGPT output. Unsplash offers various image sizes, so choose one that fits your application’s design and user interface. You can change the size that you requested inside the link.
  • Keep in mind that the Unsplash API has a rate limit, so optimize your calls to avoid reaching the limit while maintaining a good user experience.

Use Cases for Including Images in ChatGPT Output Using Unsplash

There are numerous ways to include images from Unsplash in your ChatGPT output. These images can amplify the impact of your AI-generated content, making it more engaging and enjoyable for users. Some potential use cases include:

  • Visual storytelling, where images help support the narrative created by ChatGPT.
  • Informative articles or blog posts, where relevant images add depth and context to the written content. Keep in mind there is a relevancy issue with the pictures that it produces.
  • Brand-building or marketing materials that benefit from visual appeal, showcasing products or reinforcing brand identity.

By integrating Unsplash with ChatGPT, we can bring together the power of AI-generated text and high-quality images, opening new avenues for creative and effective content creation.

Method 3: Writing an Image Prompt for a Text-To-Image AI

Explanation of What a Text-To-Image AI Is and How It Works

I’ve seen many exciting developments in the field of artificial intelligence, particularly in the realm of language models and image generation. Text-to-image AI tools, such as OpenAI’s DALL-E 2 and Stable Diffusion, use diffusion-based models to generate visually striking images based on written descriptions.

Pairing a powerful language model such as ChatGPT with one of these AI art generators combines the strengths of both. ChatGPT writes the prompt, and the image generator turns that prompt into the desired visual.

Step-by-Step Instructions on How to Write an Image Prompt for a Text-To-Image AI Using ChatGPT

Creating an image prompt for a text-to-image AI using ChatGPT can be a fun and rewarding experience. Follow these steps to craft a prompt that generates the image you envision:

  1. Start with a concrete subject: Choose a specific person, object, or location as the primary focus of your image.
  2. Add descriptive details: Use adverbs and adjectives to provide more information about the subject, such as its color, size, or position.
  3. Include style or aesthetic keywords: To give your image a unique flair, mention a specific art style or mood you’d like to incorporate.

For example, if you’re trying to generate an image of a red balloon floating over a serene lake during a colorful sunset, your prompt could be: “A bright red balloon floating over a calm lake during a vibrant sunset.”
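The three steps above amount to gluing a subject, some descriptive details, and optional style keywords into one sentence. The Python sketch below shows one way to do that; the function name `build_image_prompt` and the choice to separate style keywords with commas are assumptions made for illustration.

```python
def build_image_prompt(subject, details, style_keywords=()):
    """Combine subject, details and style keywords into one prompt (hypothetical helper)."""
    # Steps 1 and 2: the concrete subject followed by its descriptive details.
    prompt = " ".join([subject, *details])
    # Step 3: style or aesthetic keywords, appended as a comma-separated list.
    if style_keywords:
        prompt += ", " + ", ".join(style_keywords)
    return prompt


print(build_image_prompt(
    "A bright red balloon",
    ["floating over a calm lake", "during a vibrant sunset"],
))
print(build_image_prompt(
    "A bright red balloon",
    ["floating over a calm lake", "during a vibrant sunset"],
    ["dramatic lighting", "digital paint"],
))
```

The first call reproduces the balloon example; the second shows how style keywords from the list of relevant details can be tacked on.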

You can also start with something simple and have ChatGPT fill in the details for you. Either way will work, but if you need something specific you have to specify exactly what you want. Try out the prompt below and see the type of descriptions that you get.

Text To Image Prompt Generator For Stable Diffusion / DALL-E 2

Act as a commercial artist please do this step by step

Ask me for an image description, and wait for my response. I will describe an image to you, and you will create a prompt that could be used for image-generation.

PLEASE DESCRIBE the image in a short description of 3 sentences to be used in a text to image ai by expanding my input. Include the 2 elements below. show me this description when you are done,

1) Use adjectives and reference deceased artists or photographers that seem best to you as an artist to produce a beautiful and detailed image. You can suggest more than one artist if you like.

2) You should include all relevant details like lighting, mood, tone, artist, etc..

Examples Of Relevant Details: Highly-detailed, surrealism, trending on artstation, triadic color scheme, smooth, sharp focus, matte, elegant, illustration, digital paint, dark, gloomy, octane render, 8k, 4k, washed-out colors, sharp, dramatic lighting, beautiful, post-processing, picture of the day, ambient lighting, epic composition

Show me the prompt you have created

Example Prompt: Create a digital painting of a sad mermaid with a dreamlike atmosphere reminiscent of John William Waterhouse. Use a muted color scheme of blues and greens, situated in a tranquil underwater environment with sunlight filtering through the water. The mermaid should have a profound expression of sadness and longing, and the image should have an elegant, haunting feel with a smooth, sharp focus.


Please describe the image you would like me to create.

a happy child

Create an illustration of a joyful child that exudes the vibrant energy of childhood, reminiscent of the work of Mary Blair. Use a bright and cheerful color scheme with bold shapes and strong lines to convey a sense of playfulness and whimsy. The child should be in a lush outdoor setting, surrounded by flowers, trees, and wildlife. The lighting should be warm and soft, creating a comforting and nurturing mood.

I had to wrestle a bit with Stable Diffusion to be able to use this description. It turns out that, for good reasons, they don’t want to make pictures of children. So I changed the word “child” to “person” and I got this out.

Another example, where I only told ChatGPT “A sad Mermaid”

The prompt that ChatGPT came up with was,

Create a digital painting of a sad mermaid with a dreamlike atmosphere reminiscent of John William Waterhouse. Use a muted color scheme of blues and greens, situated in a tranquil underwater environment with sunlight filtering through the water. The mermaid should have a profound expression of sadness and longing, and the image should have an elegant, haunting feel with a smooth, sharp focus.

And that’s it, you are an instant digital artist!

Tips for Optimizing the Output Image

Optimizing the output image can make a big difference in the final result. Keep these tips in mind to refine your text-to-image AI prompts:

  • Avoid abstract concepts: Stick to concrete nouns to help the AI better understand your intent.
  • Experiment with different styles or artists: If you have a favorite art style or artist, mention them in your prompt to influence the generated image. (I believe only deceased artists can be referenced now.)
  • Use examples and resources, such as prompts shared by other users, to guide your creative process.

Use Cases for Including Images in ChatGPT Output Using Text-to-Image AI

There are many ways to integrate images created by text-to-image AI in a ChatGPT context:

  • Visual storytelling: Enhance your narrative by generating images that represent scenes, characters, or key moments.
  • Educational materials: Illustrate complex concepts or topics for your students using custom AI-generated images.
  • Marketing and advertising: Create eye-catching visuals for your marketing campaigns and social media posts.

While text-to-image AI tools like DALL-E have made significant progress, it’s essential to be aware of their limitations as well. Output quality may vary, and it may not always align perfectly with the user’s initial vision. However, with a little creativity and persistence, you can use ChatGPT and text-to-image AI to take your projects to new heights!

Conclusion

As we’ve explored throughout this article, ChatGPT, an AI model by OpenAI, has the potential to play a significant role in various aspects of generating images. In this conclusion section, I’ll recap the three main methods we covered and their benefits and drawbacks. Moreover, I’ll discuss the broader potential AI-generated images have in communication.

Recap of the Three Methods Covered and Their Benefits/Drawbacks

First, we had ChatGPT write detailed image descriptions and place them in a Pollinations.ai URL, so that the generated image shows up directly in ChatGPT’s response. The images tend to be a little lower in resolution and detail than those from Stable Diffusion, but this technique is great for turning creative prompts into pictures without leaving the chat.

Second, we looked at using images from Unsplash. They have the benefit of being mostly actual photographs, but as we saw we don’t always get the most relevant photo. They are good though to break up our texts and make them more visually appealing.

Lastly, we explored the partnership of ChatGPT and Stable Diffusion, as demonstrated in How-To Geek. While Stable Diffusion is responsible for creating images from prompts, ChatGPT offers the language capability to guide image generation. Together, they can produce some visually appealing results. However, since ChatGPT doesn’t have inherent “creativity,” the results may not be entirely unique.

Discussion of the Broader Potential of AI-Generated Images in Communication

With the advancements in artificial intelligence and language models like ChatGPT, AI-generated images are becoming more relevant in communication. They can serve various purposes, such as creating visual aids, enhancing marketing campaigns, and even paving the way for new forms of artistic expression.

As AI tools continue to evolve, we can anticipate an increase in the quality, detail, and creativity of the images generated. This progress will unlock remarkable possibilities for art, design, and other industries, as well as streamline communications in various aspects.

In my research on AI and ChatGPT, I’m continually amazed by the possibilities AI tools can bring to the table. As an enthusiastic teacher and researcher, I’m excited to witness and explore these advancements, diving deeper into the world of AI and how it can impact our day-to-day lives in meaningful ways.

Chris

Chris Chenault trained as a physicist at NMSU and did his doctoral work in biophysics at Emory. After studying medicine but deciding not to pursue an MD at Emory medical school, Chris started a successful online business. In the past 10 years Chris's interests and studies have been focused on AI as applied to search engines, and on LLMs. He has spent more than a thousand hours studying ChatGPT, GPT-3.5, and GPT-4. He is currently working on a research paper on AI hallucinations and reducing their effects in large language models.