Automating Blog Cover Image Generation: Leverage Playwright and GPT-4 with Midjourney for Programmatic SEO

Discover how we automated blog post cover image generation for using GPT-4 and Playwright. Learn from our process and see the top picks in our upcoming image gallery.

Automating Blog Cover Image Generation: Leverage Playwright and GPT-4 with Midjourney for Programmatic SEO
Sample Generated Blog Post Cover Images

My girlfriend and I took on this cool side project, We put our heads together and came up with a custom-built gift recommendation engine, all juiced up with OpenAI's GPT. It's a pretty neat setup: users throw in some filters like age, occasion, budget, and bam! They get 10 unique gift ideas. We've even got most of these linked up to Amazon with affiliate codes, so we can make a bit of cash on the side.

Wanna see how it works? Here's a demo of our Gift Recommendation Engine:


Demo of the Gift Recommendation Engine

We were all starry-eyed at launch, thinking users would just start flowing in. But, as you fellow indiehackers know, it doesn't really roll that way. So, we started thinking, why not try to reel in the crowd with some long-tail keywords related to gift ideas? This way, we get to drive some traffic to our blog and then direct folks to our free gift idea engine. Not a bad plan, right?

Challenge 1: Google Will Not Index Our Pages

Spent a good chunk of time crafting a neat data schema, and even had GPT-3 churn out some snazzy titles, subtitles, and descriptions for the gift ideas. Thought we'd nailed it, but Google had other plans: just wouldn't index those pages. Let's dig into this a bit more:

Google finding but not indexing pages.

To tell you the truth, I was pretty happy with the content, the headings, and the intros we put together. But, even then, our blog posts looked a bit thin - like a party without the balloons. What were we missing? A good, attention-grabbing blog post cover image.

Example of barebone blog post.

Challenge 2: Scoring Stock Images for Blog Post Covers

So, my first shot at this was to use the Unsplash API, thinking I could easily snag some good-looking cover images for our blog posts. Turns out, finding the perfect picture for each specific keyword was like finding a needle in a haystack. For instance, try finding the perfect image for Gift ideas 9yr old girl. Not as straightforward as I thought.

Unsplash Search Results Gift ideas 9yr old girl

Then I thought, why not try Stable diffusion? But yet again, the results left a lot to be desired. I just wasn't hitting the mark. I'd heard a lot of buzz around Midjourney, so I figured, why not give it a shot?

Well, it turns out that even with Midjourney, my initial prompts were a bit off-target and didn't quite give me the results I was hoping for:

Housewarming gift ideas, new homeowners, vibrant, inviting, decorated living room, warm and cozy, coffee table, potted plant, scented candles, personalized welcome mat, elegant coasters, stylish decorative vase, soft natural lighting, modern kitchen, comfortable seating area, thoughtful gestures, meaningful gifts
Bad prompt for housewarming gifts Midjourney

After doing some research though, I finally was able to use GPT 3.5 to start generating some good prompts based on a given blog post title.


Blog Post Title: 10 Unique Gift Ideas for a 9-Year-Old Girl


Generated Prompt: A photograph of a beaming 9 - year - old girl unwrapping a gift, surrounded by her friends at a colorful and lively birthday party, captured during midday with natural lighting streaming in through a large window. The picture is shot with a Canon EOS Rebel T7i camera, using a 35mm f/ 1. 8 lens, highlighting the excitement and joy of the moment.

Here's the final result of the picture. Not too shabby, right?

A photograph of a beaming 9-year-old girl unwrapping a gift, surrounded by her friends at a colorful and lively birthday party, captured during midday with natural lighting streaming in through a large window. The picture is shot with a Canon EOS Rebel T7i camera, using a 35mm f/1.8 lens, highlighting the excitement and joy of the moment.

By the way, if you're interested, we'll soon be offering the stock images prompt generator as a free tool on It could be just what you need for your blog posts.

Free SEO Tools by

Challenge 3: Automating the Midjourney Blog Post Cover Image Generation

Here's a curveball: Midjourney doesn't yet have an API for easy image generation. Let's look into this:

On top of that, instead of offering a straightforward interface, you end up sending a bunch of messages to a Discord bot. It's not exactly the most automation-friendly setup:

  1. You start with the /imagine [prompt] command
  2. You need then to upscale one of the provided versions
  3. Finally, you need to save the image from an url
Sample Discord Commands

Being a programmer who values efficiency (okay, maybe a little lazy), I hit up Google to find any existing solutions. I found a few overpriced unofficial API services and a Medium article that leaned heavily on PyAutoGUI. But since I wanted to keep working while the script was running, I started looking into web browser automation.

When I couldn't find exactly what I needed, I did what any self-respecting coder does: I decided to write the script myself, with a little assist from GPT-4.

Automating Midjourney Image Generation

So here's the game plan for the automation at a high level, along with a peek at the corresponding code block:

  1. Log in to discord using my email, password and two-factor auth_code
  2. Get the slug and blog_post_cover_prompt for all my blog posts from the database where there is no blog_post_cover_image_url set yet.
  3. Post the prompt with the slash command
  4. Wait for the 4 image variations to be generated and upscale the first version
  5. Wait for the upscale version to be available and retrieve the image_url
  6. Download the image to my local file system
  7. Upload the image to an S3 Bucket
  8. Update the row blog_post_cover_image_url with the newly set S3 Url
async def main() -> None:
    async with async_playwright() as playwright:
        # playwright = await async_playwright().start()

        engine = create_engine(DATABASE_URL)

        browser = await playwright.chromium.launch(headless=False)
        context = await browser.new_context()
        page = await context.new_page()

        records = get_records_with_null_cover_image(engine)

        await login_to_discord(

        for record in records[181:]:
            slug = record["slug"]
            prompt = record["blog_post_cover_prompt"]

            await post_prompt(

            await upscale_image(page=page)

            image_url = await get_image_url(page=page)

            local_image_path = IMAGE_PATH / f"{slug}.png"

            image_path = download_image(image_url=image_url, image_path=local_image_path)

            s3_path = upload_to_s3(


        await context.close()
        await browser.close()

A little known secret to automation is to actually write as little code as possible by leveraging GPT-4 and the playwright codegen utility.

I started of generating 70% of the code by running the following simple command:

playwright codegen[server_id]/[channel_id]

This process launches a window where you can start interacting with the Discord server - clicking, typing, you name it. The great thing is, as you're doing this, the code is being automatically generated and recorded for you. All you have to do is copy it and paste it into your script. Here's how it looks:

Recording a test
Playwright Codegen Test Generator

So, with the code generated by Playwright and a little help from GPT to refactor it, creating the final login_to_discord function becomes a breeze. Let's take a look:

And voilà! Here's what our final function looks like:

async def login_to_discord(
    page: Page,
    server_id: str,
    channel_id: str,
    email: Optional[str] = None,
    password: Optional[str] = None,
    auth_code: Optional[str] = None,
) -> None:
    Log in to Discord via a Playwright browser page.

        page (Page): Playwright browser page instance.
        server_id (str): Discord server ID to navigate to after login.
        channel_id (str): Discord channel ID to navigate to after login.
        email (Optional[str], optional): Email to use for logging in to Discord. Defaults to None.
        password (Optional[str], optional): Password to use for logging in to Discord. Defaults to None.
        auth_code (Optional[str], optional): Authentication code to use for logging in to Discord. Defaults to None.

        TimeoutError: If any of the page actions do not complete within the default timeout period.
    discord_channel_url = f"{server_id}/{channel_id}"
    await page.goto(discord_channel_url)

    await page.get_by_role("button", name="Continue in browser").click()
    await page.get_by_label("Email or Phone Number*").click()

    if not email:
        email = input("Please enter your email: ")
    await page.get_by_label("Email or Phone Number*").fill(email)

    await page.get_by_label("Email or Phone Number*").press("Tab")

    if not password:
        password = getpass("Please enter your password: ")
    await page.get_by_label("Password*").fill(password)

    await page.get_by_role("button", name="Log In").click()

    if not auth_code:
        auth_code = input("Please enter your authentication code: ")
    await page.get_by_placeholder("6-digit authentication code/8-digit backup code").fill(auth_code)

    await page.get_by_role("button", name="Log In").click()

Next, I followed the same process for the remaining parts. There was a bit of manual work involved - mainly finding the last Discord messages and figuring out the waiting logic to decide when to proceed.

GPT Engineering Wait Functions for Upscale Button

And we repeat the same process again for retrieving the image link 🫠, giving minimal feedback:

GPT Engineering Wait Functions for Image Link Retrieval

With a little bit of debugging, the final script was ready to roll. I've shared it in this GitHub Gist for you to take, tweak, and tailor to your needs.

To wrap things up, we embarked on a journey to automate blog post cover image generation. We navigated through challenges with Google's indexing, experimented with different solutions for sourcing images, and eventually automated the process using a blend of Playwright and GPT-4. The result? An efficient way to generate engaging, relevant cover images for our blog posts.

And there's more! As a little bonus, I'll be sharing a gallery of my favorite stock images generated with this method. Keep an eye out for that - it's proof of what a bit of coding and creativity can achieve. Happy coding, everyone!