Prompt Mass: Characterizing What Makes a Good Prompt When Writing with AI
Everyone has access to the same AI writing models, yet the quality of AI-assisted writing ranges from sharp to slop. This post argues that the gap comes down to how much of the final text is determined by the model's training versus your prompt. When a prompt is light—little constraint, lots of room to drift—the model leans on its defaults and produces generic content. When a prompt is heavy—tight constraints, not much wiggle room—the output is pulled toward your intent. I call this prompt mass: a way to think about what makes a good prompt when using an LLM for writing.
Slop and the Gap in AI Writing Quality
When we prompt a model to write, the result is determined by two sources:
- The model’s parameters (what it learned during training)
- The prompt (the information and constraints we provide)
If our prompt does not place many conditions on the output, then most of what determines the text will come from the model's training. That training is shared across users, and different models often have similar defaults.[1] So when the prompt has little influence, the output tends to converge toward that shared default, which is generic by definition.
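One loose way to picture this (an informal framing, not a formal result): the model samples from a conditional distribution p(y | x, θ), where θ is what it learned in training and x is the prompt. When x pins down few decisions, samples stay close to what θ alone prefers, i.e. the same high-probability continuations for everyone; the more decisions x fixes, the further the output can move from that shared default.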
Even if we had a model with the writing ability of a Nobel laureate, a light prompt would still push it toward broadly similar prose for everyone. The problem is not only model capability. It is also how much direction we give.
But what makes one prompt better than another? That is what I will try to pin down here.
A Demonstration
Let’s say I ask an AI (Gemini 3 Pro) to write a paragraph explaining that the heavier the conditioning you include in your prompt, the better the result. What would that look like?
Each option below asks for the same thing (two sentences), but I progressively add different kinds of constraints.
Option 1: Bare request
Option 2: Adding the core idea
Option 3: Adding purpose and mechanism
Option 4: Adding context and metaphor priming
[The first part of this post pasted here]
I believe that language affects how we think, so in this blog post I want to convince the community that we should come up with better terms to define the spectrum of information a person provides in the instruction when writing with AI. I also aim to propose a metric, which I hope people will use, that quantifies this with the concept of instruction information—a measure of how much the prompt conditions the model's output.
In the second part of the post, I want to demonstrate in a convincing manner that such prompts lead to better outputs. For that, I need 2 sentences that convince the reader that prompts with more conditioning over the output are better...
Prime the reader for a later "mass/weight" concept by using a subtle physics metaphor (weight/scale/gravity) for how constraints pull the output toward the author's intent—but don't name the concept yet.
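If you want to reproduce this kind of comparison, here is a minimal sketch using the google-genai Python SDK. The model id is a placeholder, and the option strings are shortened stand-ins for the prompts above, not the exact text.

```python
# Minimal sketch: send each prompt variant to the same model and compare.
# Assumes the google-genai SDK and a GEMINI_API_KEY in the environment.
from google import genai

client = genai.Client()

# Shortened stand-ins for the four options above (not the exact prompts).
options = {
    "1_bare": "Write two sentences about prompting AI for writing.",
    "2_core_idea": "Write two sentences: constraints in a prompt act as "
                   "conditioning on the output.",
    "3_purpose": "Write two sentences on how output balances the model's "
                 "parameters against the prompt, for a blog post.",
    "4_primed": "As above, plus: use a subtle weight/scale/gravity metaphor "
                "without naming it.",
}

for name, prompt in options.items():
    response = client.models.generate_content(
        model="gemini-3-pro",  # placeholder id; use whatever model you test
        contents=prompt,
    )
    print(f"--- {name} ---\n{response.text}\n")
```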
Analysis: How Input Shapes Output
The basic request is the same in all four options, yet the outcomes are noticeably different.
Option 1 is so light that the model invents its own answer: “few-shot examples,” “tone, audience, and format”—none of which I asked for.
Option 2 says “constraints… as conditioning,” and the output echoes that directly: “explicit constraints that serve as conditioning mechanisms.” But the phrasing (“high-utility result,” “sculpt the raw output”) is still generic AI-speak.
Option 3 introduces “a balance between the model’s parameters and the prompt,” and the output picks it up almost verbatim: “a negotiation between the model’s static parameters and your specific prompt.” It even extends the metaphor on its own with “tip the scales”—but I didn’t ask for that phrasing.
Option 4 explicitly primes “weight/scale/gravity,” and the output now threads that vocabulary throughout—“a delicate balance on a scale,” “gravity,” “weight,” “tip that scale,” “drift into the orbit”—setting up the reader for what comes next.
Different levels of effort went into the instructions, and that affected the output. But “effort” is vague. How should we think about this spectrum more precisely?
Defining Prompt Mass
I suggest prompt mass as a useful term for this spectrum.
In the same way that an object with a lot of mass pulls a balance to its side, a prompt with more constraints has more control over what the output looks like. In the same way that big objects often have more mass, longer prompts often have more mass. In the same way that a small object can sometimes have more mass than a big one, a short prompt can sometimes have more mass than a long one.
A light prompt gives the model latitude to drift toward its defaults. A heavy prompt constrains that space, pulling the output toward your intent.
Length is only a proxy. A prompt gets heavier when it makes more of the important decisions in advance. A prompt stays light when it leaves those decisions to the model.
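I don't have a precise formula for mass. But to make "decisions made in advance" concrete, here is a toy heuristic that counts explicit constraints rather than tokens. Every rule in it is invented for illustration; it is not a validated metric.

```python
# Toy heuristic: estimate "prompt mass" by counting decisions the prompt
# makes in advance, rather than raw length. Every rule here is
# illustrative, not a validated metric.
import re

CONSTRAINT_MARKERS = [
    r"\baudience\b", r"\btone\b", r"\bformat\b", r"\bstyle\b",
    r"\bmust\b", r"\bavoid\b", r"\bexactly\b", r"\bdon'?t\b",
    r"\b\d+\s+(?:sentences?|words?|paragraphs?)\b",
]

def prompt_mass(prompt: str) -> int:
    """Rough count of explicit decisions a prompt makes in advance."""
    text = prompt.lower()
    score = sum(len(re.findall(m, text)) for m in CONSTRAINT_MARKERS)
    score += text.count('"') // 2  # quoted phrasing the output must keep
    score += len(re.findall(r"^\s*[-*] ", text, flags=re.MULTILINE))  # bulleted requirements
    return score

print(prompt_mass("Write something about prompting."))         # light: 0
print(prompt_mass('Audience: 10-year-olds. Exactly 2 sentences. '
                  'Avoid jargon and use a "scale" metaphor.'))  # heavier: 5
```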
Why This Framing Matters
With this concept, a few things become easier to see:
- Why we get slop. When someone uses a light prompt, the model defaults to generic patterns. Through this lens, slop is not an inherent property of AI writing. It is what you get when the prompt does not carry enough mass.
- Why autocomplete feels bland. When you accept an autocomplete suggestion, your only input is the text that came before. You add no constraints on what the next text should do. You get the average continuation.
- “I used AI” stops being a binary. “Used AI” hides the important question: how much of the final text was determined by the model’s defaults versus your own constraints and judgment.
What the Guidelines Have in Common
There’s no shortage of advice on how to prompt well: write a draft first, specify your audience, use frameworks like CO-STAR, iterate. But these tips can feel disconnected. If prompt mass is a useful concept, it should tie them together.
Start with your own content
Many guidelines emphasize that you shouldn’t begin with a blank page. User studies found that people who injected their own content produced significantly higher-quality outputs than those who asked the AI to write from scratch.[2]
Through the lens of prompt mass, a draft adds weight because it imports your actual claims and structure.
Specify your audience, tone, and purpose
A draft carries your arguments but typically lacks the implicit context: who the piece is for, how it should sound, what it is meant to do. Frameworks like CO-STAR (Context, Objective, Style, Tone, Audience, Response) remind you to make these elements explicit.[3]
Even just the audience matters. “Audience: PhD researchers” invites caveats. “Audience: 10-year-olds” forces simplification. Same draft, same instruction, different outputs.
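As a trivial sketch (the draft and field names are made up for illustration), this is all it takes to turn one draft into two differently weighted prompts:

```python
# Same draft, same instruction, different audience line. The audience
# field alone changes where the output can land. All strings are
# illustrative.
DRAFT = "Heavier prompts constrain the model more, so outputs drift less."

def build_prompt(audience: str) -> str:
    return (
        f"Audience: {audience}\n"
        "Task: rewrite the draft below in two sentences.\n"
        f"Draft: {DRAFT}"
    )

print(build_prompt("PhD researchers"))  # invites caveats and precision
print(build_prompt("10-year-olds"))     # forces simple words and analogies
```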
Refine over multiple rounds
Rarely does a single prompt contain everything. This is what Mollick calls “Cyborg” behavior: moving back and forth between human judgment and AI generation.[4]
Each round of feedback adds mass. “Too formal” constrains the space. “Focus on the cost argument” constrains it further. The weight can build up across a conversation even if no single prompt is heavy on its own.
Put differently, these are just ways of moving decisions from the model back into the input. Once you see that, a lot of prompting advice starts to look the same.
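Seen as code, the Cyborg pattern is just constraints accumulating across rounds. A minimal sketch, again with the google-genai SDK and a placeholder model id; the feedback strings are the examples above:

```python
# Minimal sketch of mass accumulating over a conversation: each round of
# feedback is appended to the input, narrowing the output space even
# though no single instruction is heavy on its own.
from google import genai

client = genai.Client()
draft = "Heavier prompts constrain the model more, so outputs drift less."
constraints = [f"Rewrite this draft in two sentences: {draft}"]

for feedback in ["Too formal. Make it conversational.",
                 "Focus on the cost argument."]:
    constraints.append(feedback)
    response = client.models.generate_content(
        model="gemini-3-pro",  # placeholder id
        contents="\n".join(constraints),
    )
    print(f"After '{feedback}':\n{response.text}\n")
```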
Closing
Prompt mass won’t tell you exactly what to write in your prompts—that depends on your specific goals and context. But it might serve as a more abstract guideline, one that captures at a conceptual level what you’re trying to do when prompting an LLM for writing.
1. Wu, Black & Chandrasekaran (2024). “Generative Monoculture in Large Language Models.” Explores how LLMs narrow output diversity across users.
2. Lee et al. (2024). “Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild.”
3. Teo, S. (2023). “How I Won Singapore’s GPT-4 Prompt Engineering Competition.” Towards Data Science.
4. Mollick, E. (2024). “I, Cyborg: Using Co-Intelligence.” One Useful Thing.