Automating Blog Cover Image Generation: Leverage Playwright and GPT-4 with Midjourney for Programmatic SEO
Discover how we automated blog post cover image generation for GiftIdeasAI.xyz using GPT-4 and Playwright. Learn from our process and see the top picks in our upcoming image gallery.

My girlfriend and I took on this cool side project, GiftIdeasAI.xyz. We put our heads together and came up with a custom-built gift recommendation engine, all juiced up with OpenAI's GPT. It's a pretty neat setup: users throw in some filters like age, occasion, budget, and bam! They get 10 unique gift ideas. We've even got most of these linked up to Amazon with affiliate codes, so we can make a bit of cash on the side.
Wanna see how it works? Here's a demo of our Gift Recommendation Engine:
Demo of the Gift Recommendation Engine
We were all starry-eyed at launch, thinking users would just start flowing in. But, as you fellow indiehackers know, it doesn't really roll that way. So, we started thinking, why not try to reel in the crowd with some long-tail keywords related to gift ideas? This way, we get to drive some traffic to our blog and then direct folks to our free gift idea engine. Not a bad plan, right?
Challenge 1: Google Will Not Index Our Pages
Spent a good chunk of time crafting a neat data schema, and even had GPT-3 churn out some snazzy titles, subtitles, and descriptions for the gift ideas. Thought we'd nailed it, but Google had other plans: just wouldn't index those pages. Let's dig into this a bit more:

To tell you the truth, I was pretty happy with the content, the headings, and the intros we put together. But, even then, our blog posts looked a bit thin - like a party without the balloons. What were we missing? A good, attention-grabbing blog post cover image.

Challenge 2: Scoring Stock Images for Blog Post Covers
So, my first shot at this was to use the Unsplash API, thinking I could easily snag some good-looking cover images for our blog posts. Turns out, finding the perfect picture for each specific keyword was like finding a needle in a haystack. For instance, try finding the perfect image for Gift ideas 9yr old girl
. Not as straightforward as I thought.

Then I thought, why not try Stable diffusion? But yet again, the results left a lot to be desired. I just wasn't hitting the mark. I'd heard a lot of buzz around Midjourney, so I figured, why not give it a shot?
Well, it turns out that even with Midjourney, my initial prompts were a bit off-target and didn't quite give me the results I was hoping for:
Housewarming gift ideas, new homeowners, vibrant, inviting, decorated living room, warm and cozy, coffee table, potted plant, scented candles, personalized welcome mat, elegant coasters, stylish decorative vase, soft natural lighting, modern kitchen, comfortable seating area, thoughtful gestures, meaningful gifts

After doing some research though, I finally was able to use GPT 3.5 to start generating some good prompts based on a given blog post title.
Blog Post Title: 10 Unique Gift Ideas for a 9-Year-Old Girl
Generated Prompt: A photograph of a beaming 9 - year - old girl unwrapping a gift, surrounded by her friends at a colorful and lively birthday party, captured during midday with natural lighting streaming in through a large window. The picture is shot with a Canon EOS Rebel T7i camera, using a 35mm f/ 1. 8 lens, highlighting the excitement and joy of the moment.
Here's the final result of the picture. Not too shabby, right?

By the way, if you're interested, we'll soon be offering the stock images prompt generator as a free tool on BacklinkGPT.com. It could be just what you need for your blog posts.

Challenge 3: Automating the Midjourney Blog Post Cover Image Generation
Here's a curveball: Midjourney doesn't yet have an API for easy image generation. Let's look into this:

On top of that, instead of offering a straightforward interface, you end up sending a bunch of messages to a Discord bot. It's not exactly the most automation-friendly setup:
- You start with the
/imagine [prompt]
command - You need then to upscale one of the provided versions
- Finally, you need to save the image from an url

Being a programmer who values efficiency (okay, maybe a little lazy), I hit up Google to find any existing solutions. I found a few overpriced unofficial API services and a Medium article that leaned heavily on PyAutoGUI. But since I wanted to keep working while the script was running, I started looking into web browser automation.
When I couldn't find exactly what I needed, I did what any self-respecting coder does: I decided to write the script myself, with a little assist from GPT-4.
Automating Midjourney Image Generation
So here's the game plan for the automation at a high level, along with a peek at the corresponding code block:
- Log in to discord using my
email
,password
and two-factorauth_code
- Get the
slug
andblog_post_cover_prompt
for all my blog posts from the database where there is noblog_post_cover_image_url
set yet. - Post the
prompt
with the slash command - Wait for the 4 image variations to be generated and upscale the first version
- Wait for the upscale version to be available and retrieve the
image_url
- Download the image to my local file system
- Upload the image to an S3 Bucket
- Update the row
blog_post_cover_image_url
with the newly set S3 Url
async def main() -> None:
async with async_playwright() as playwright:
# playwright = await async_playwright().start()
engine = create_engine(DATABASE_URL)
browser = await playwright.chromium.launch(headless=False)
context = await browser.new_context()
page = await context.new_page()
records = get_records_with_null_cover_image(engine)
await login_to_discord(
page=page,
server_id=DISCORD_SERVER_ID,
channel_id=DISCORD_CHANEL_ID,
)
for record in records[181:]:
slug = record["slug"]
prompt = record["blog_post_cover_prompt"]
await post_prompt(
page=page,
prompt=prompt,
)
await upscale_image(page=page)
image_url = await get_image_url(page=page)
local_image_path = IMAGE_PATH / f"{slug}.png"
image_path = download_image(image_url=image_url, image_path=local_image_path)
s3_path = upload_to_s3(
image_path=image_path,
aws_access_key_id=S3_ACCESS_KEY_ID,
aws_secret_access_key=S3_SECRET_ACCESS_KEY,
bucket=S3_BUCKET_NAME,
region_name=S3_REGION_NAME,
s3_image_name=f"{slug}.png",
)
update_db_record(
engine=engine,
s3_path=s3_path,
keyword_value=slug,
)
await context.close()
await browser.close()
A little known secret to automation is to actually write as little code as possible by leveraging GPT-4
and the playwright codegen
utility.
I started of generating 70% of the code by running the following simple command:
playwright codegen
https://discord.com/channels/[server_id]/[channel_id]
This process launches a window where you can start interacting with the Discord server - clicking, typing, you name it. The great thing is, as you're doing this, the code is being automatically generated and recorded for you. All you have to do is copy it and paste it into your script. Here's how it looks:

So, with the code generated by Playwright and a little help from GPT to refactor it, creating the final login_to_discord function becomes a breeze. Let's take a look:

And voilà! Here's what our final function looks like:
async def login_to_discord(
page: Page,
server_id: str,
channel_id: str,
email: Optional[str] = None,
password: Optional[str] = None,
auth_code: Optional[str] = None,
) -> None:
"""
Log in to Discord via a Playwright browser page.
Args:
page (Page): Playwright browser page instance.
server_id (str): Discord server ID to navigate to after login.
channel_id (str): Discord channel ID to navigate to after login.
email (Optional[str], optional): Email to use for logging in to Discord. Defaults to None.
password (Optional[str], optional): Password to use for logging in to Discord. Defaults to None.
auth_code (Optional[str], optional): Authentication code to use for logging in to Discord. Defaults to None.
Raises:
TimeoutError: If any of the page actions do not complete within the default timeout period.
"""
discord_channel_url = f"https://discord.com/channels/{server_id}/{channel_id}"
await page.goto(discord_channel_url)
await page.get_by_role("button", name="Continue in browser").click()
await page.get_by_label("Email or Phone Number*").click()
if not email:
email = input("Please enter your email: ")
await page.get_by_label("Email or Phone Number*").fill(email)
await page.get_by_label("Email or Phone Number*").press("Tab")
if not password:
password = getpass("Please enter your password: ")
await page.get_by_label("Password*").fill(password)
await page.get_by_role("button", name="Log In").click()
if not auth_code:
auth_code = input("Please enter your authentication code: ")
await page.get_by_placeholder("6-digit authentication code/8-digit backup code").fill(auth_code)
await page.get_by_role("button", name="Log In").click()
Next, I followed the same process for the remaining parts. There was a bit of manual work involved - mainly finding the last Discord messages and figuring out the waiting logic to decide when to proceed.

And we repeat the same process again for retrieving the image link 🫠, giving minimal feedback:

With a little bit of debugging, the final script was ready to roll. I've shared it in this GitHub Gist for you to take, tweak, and tailor to your needs.
To wrap things up, we embarked on a journey to automate blog post cover image generation. We navigated through challenges with Google's indexing, experimented with different solutions for sourcing images, and eventually automated the process using a blend of Playwright and GPT-4. The result? An efficient way to generate engaging, relevant cover images for our blog posts.
And there's more! As a little bonus, I'll be sharing a gallery of my favorite stock images generated with this method. Keep an eye out for that - it's proof of what a bit of coding and creativity can achieve. Happy coding, everyone!









Some of my favorite blog post cover images.