See How Midjourney V6.1 Improves Image Creation

kylixie
Apr 30, 2025

Midjourney recently updated to version 6.1, bringing new features, improvements, and shifts in how it handles prompts compared to the previous version, V6. This comparison looks closely at V6.1 to see if it's a clear upgrade and what areas show the most change.

We'll compare V6 and V6.1 side by side on five fronts: understanding language, creating realistic images, handling details, rendering text, and speeding up your workflow. If you use Midjourney often, understanding these differences can help you get better results. Let's see how V6.1 stacks up.
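For readers who want to run a similar side-by-side test, here is a minimal Python sketch that builds one prompt per model version. It relies on Midjourney's `--v` and `--seed` prompt parameters; the helper name `build_pair` and the seed value are our own choices for illustration, and the comparison in this post does not state whether seeds were fixed.

```python
def build_pair(prompt: str, seed: int, versions=("6", "6.1")) -> list[str]:
    """Return the same prompt once per model version, with a shared seed.

    A shared seed removes one source of variation between runs. It does not
    make two different model versions produce the same image.
    """
    return [f"{prompt} --seed {seed} --v {version}" for version in versions]


# Example: one of the language tests from this comparison.
for line in build_pair("a horse is riding a man", seed=42):
    print(line)
```

Each printed line can be pasted into Midjourney as its own job, so the two results differ only in the model version.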

Understanding Your Words: Natural Language

How well Midjourney understands your prompt matters a lot. Better understanding means it can keep the subjects in your scene distinct and combine them the way you describe. We tested this with several challenges.

Challenges for Understanding Language

Basic prompt with a twist (e.g., "a horse is riding a man"): Both V6 and V6.1 struggled with this unusual setup, often showing a man riding a horse instead. More specific wording was needed to get the desired result.
Multi-character scenes: With a prompt asking for two women in a cafe with different outfits (red hair and leather vs. a suit), V6.1 did much better at keeping the characters distinct compared to V6, which sometimes blended features like hair color.
Unusual or odd concepts (e.g., "friendship of a whale and a dragon"): V6.1 separated the whale and dragon more clearly in most images. V6 sometimes blended the creatures.
Testing odd phrases ("reversed Egyptian pyramid"): Both models failed to create a reversed pyramid from this simple phrase, showing that unusual concepts sometimes need more detailed prompts.
Another odd concept ("a layered cake in the style of landscape"): Both versions handled this abstract idea well, though V6 slightly edged out V6.1 in showing clearer landscape layers.
Long prompts with many details (clothing, background): Both V6 and V6.1 did a good job with a detailed scene of a couple in a room with specific clutter.
Complex clothing combinations (t-shirt under hoodie): V6.1 was impressive here, correctly showing the t-shirt inside the hoodie with the right patterns. V6 had trouble with the layers and details.
Random word clusters: Both handled chaotic prompts reasonably well, finding ways to combine disparate ideas. V6.1 showed a stronger influence of the 'cyberpunk' keyword in one test.
World knowledge (character Tanjiro in sci-fi armor): Both versions depicted the character and scene well. V6.1 had facial details (like a scar) closer to the original character in more images.

Overall, V6.1 shows a noticeable improvement in understanding language, particularly with differentiating characters and handling specific outfit descriptions. This makes crafting prompts for complex scenes easier.
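To see how the nine language challenges add up, here is a small Python tally. The outcome labels are our own reading of the descriptions above, not scores published with the tests, so treat the split as a rough summary. For instance, "random word clusters" is counted as a tie even though V6.1 showed a stronger keyword influence in one test.

```python
from collections import Counter

# Labels are an interpretation of the written results, not official scores.
LANGUAGE_RESULTS = {
    "horse riding a man": "tie",        # both struggled
    "multi-character cafe": "v6.1",     # characters kept distinct
    "whale and dragon": "v6.1",         # creatures separated more clearly
    "reversed pyramid": "tie",          # both failed
    "layered cake landscape": "v6",     # V6 slightly ahead
    "long detailed prompt": "tie",      # both did well
    "t-shirt under hoodie": "v6.1",     # layers rendered correctly
    "random word clusters": "tie",      # both reasonable
    "Tanjiro in sci-fi armor": "v6.1",  # facial details closer more often
}


def tally(results: dict[str, str]) -> Counter:
    """Count how many challenges fall under each outcome label."""
    return Counter(results.values())


counts = tally(LANGUAGE_RESULTS)
for label in ("v6.1", "v6", "tie"):
    print(f"{label}: {counts[label]}")
```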

Managing prompts and generations manually can take time, especially with detailed comparisons like this. Consider exploring tools that can automate image creation. The Midjourney Automation Suite from TitanXT can help streamline your workflow and manage large batches of prompts.

How Realistic Are the Images? Photorealism

Midjourney is known for its realism, and V6 was already a big step up. We looked for improvements in V6.1, focusing on common elements and especially skin textures.

Challenges for Photorealism

Material realism (smoke, grass, water, debris)

Overall, V6.1 shows sharper details and improved realism in animal images (koalas, for example) and underwater scenes. Improvements in human skin realism were less dramatic than hoped, so this remains an area where both models have room to grow. We expect further realism improvements in future updates such as V6.2.

Are the Details Right? Accuracy

Getting small details right, from anatomy to objects, is a common challenge for AI image generators. We tested how well V6.1 handles these specifics.

Challenges for Accuracy

Hand and foot anatomy: V6.1 accurately depicted hands and feet themselves. However, both models struggled to show hands holding objects (like a burger) in a natural way. Feet in high heels were also mostly fine, though the surrounding scene sometimes exposed errors.
Holding weapons/objects (witch on broom, bow and arrow): V6.1 handled the witch on a broom well, with minor errors. The bow and arrow test showed mixed results for both, with hands holding the objects incorrectly in several attempts. V6 may have performed slightly better in the initial bow and arrow test.
Holding multiple objects (umbrella and cigarette): V6.1 correctly interpreted prompt instructions such as 'high angle shot'. However, both versions had trouble rendering hands that hold both items and positioning the umbrella correctly relative to the person.
Faces at a distance: Testing crowds and distant people showed that getting clear, accurate faces at a distance is still hard for both models. There were distortions and unclear faces in both V6.1 and V6 results, making it difficult to call a clear winner.
Scenes with art inside art (art gallery): Both models depicted art galleries with people and varied art styles. However, faces of people, especially at a distance, showed distortions in both versions, with no significant improvement seen in V6.1.
Complex multi-subject actions (team sports): Generating accurate sports scenes with players, nets, and audience remains a major challenge for generative AI. Both V6.1 and V6 showed significant anatomy issues, misplaced objects (nets, balls), and distorted faces. This highlights the current limits of both models.
Complex anatomy/action (artistic gymnastics): Showing a gymnast on a pommel horse was difficult for both models. The specific equipment often wasn't rendered, and anatomy could be off. Like team sports, this type of complex, action-oriented scene still requires improvement.

Overall, while V6.1 handles basic anatomy well, connecting anatomy with objects and rendering complex scenes with multiple subjects or dynamic action still presents significant challenges. Accuracy of details, especially faces at a distance and object interaction, remains an area where major improvement is still needed.

Simplifying complex prompts or generating many variations to find accurate details can be time-consuming. A tool like the Midjourney Automation Suite may help you manage bulk generations and test different prompt approaches to improve accuracy.

Text on Images: Text Rendering

Adding text consistently and correctly has historically been tough for AI image generators. Midjourney mentioned improvements here for V6.1. We tested with a product label prompt.

Prompt: "product photography hot sauce with brand jungle fire in a cactus bed."

V6.1 produced sharper text than V6. It also made fewer spelling and placement mistakes across the generated images. While not perfect, V6.1's text is substantially more accurate and legible.

Getting Work Done Quicker: Workflow Improvements

Generating images quickly is a big help for users. V6.1 is reported to be about 25% faster than V6 for standard jobs. The speed boost was noticeable during testing and shortens the wait for every generation.
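What "about 25% faster" means for a large batch depends on how you read the figure, so here is a short Python sketch of both readings. The 60-second baseline and the 200-job batch are made-up numbers for illustration, not measurements from this comparison.

```python
def v61_seconds(v6_seconds: float, speedup: float = 0.25,
                reading: str = "throughput") -> float:
    """Estimate V6.1 time per job from a V6 baseline.

    reading="throughput": V6.1 finishes 25% more jobs in the same time.
    reading="time":       each V6.1 job takes 25% less time.
    """
    if reading == "throughput":
        return v6_seconds / (1 + speedup)
    if reading == "time":
        return v6_seconds * (1 - speedup)
    raise ValueError(f"unknown reading: {reading!r}")


def batch_minutes(jobs: int, seconds_per_job: float) -> float:
    """Total minutes for a batch run one job after another."""
    return jobs * seconds_per_job / 60


V6_SECONDS = 60.0  # hypothetical baseline, not a measured value
JOBS = 200         # hypothetical batch size

print("V6:", batch_minutes(JOBS, V6_SECONDS), "min")
for reading in ("throughput", "time"):
    per_job = v61_seconds(V6_SECONDS, reading=reading)
    print(f"V6.1 ({reading}):", batch_minutes(JOBS, per_job), "min")
```

Under either reading the savings grow with batch size, which is why the speed increase matters most to people who generate many images.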

While other workflow features like image prompting and character/style references weren't extensively tested in this comparison, the increased generation speed alone offers a considerable advantage for users who generate many images.

Wrapping Up: V6.1 vs V6

Midjourney V6.1 brings clear improvements in specific areas.

It has a better understanding of natural language, particularly with distinguishing multiple characters and handling outfit descriptions.
Photorealism is slightly improved for animals and certain textures, but human skin realism didn't show a major leap.
Text rendering is significantly better, producing sharper and more accurate words on images.
Workflow is improved mainly through faster image generation speeds.

Detail accuracy, complex multi-subject scenes, and photorealistic human portraits remain challenging for V6.1, much as they were for V6. We expect future Midjourney updates to focus more on these areas.

Overall, V6.1 is a solid step forward, offering noticeable benefits in language understanding and text rendering, while maintaining the high-quality realism V6 was known for. The speed increase is also a valuable improvement for creators.

Speed up your creative process and manage your Midjourney generations more effectively by trying out automation tools. Check out the Midjourney Automation Suite from TitanXT to simplify batch processing and test different prompt versions based on comparisons like this one.