Should You Really Write Your Prompt Twice?

Rarely. Only for edge cases.

Maria Sukhareva

Feb 20, 2026


The paper “Prompt Repetition Improves Non-Reasoning LLMs” is suddenly going viral.

If you have not read it, here is a very nice summary I generated with NotebookLM of what the paper is about.

The main finding is that sending the same prompt twice improves results for non-reasoning models, as in:

“How many r’s are in strawberry? How many r’s are in strawberry?”

This should work better than:

“How many r’s are in strawberry?”

It might sound silly, but it actually works for non-reasoning models.
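The repetition itself is trivial to script. Here is a minimal sketch; the helper name `repeat_prompt` and the single-space separator are my own assumptions that mirror the example above, not necessarily the exact format used in the paper.

```python
def repeat_prompt(prompt: str, times: int = 2, separator: str = " ") -> str:
    """Return the prompt repeated `times` times, joined by `separator`."""
    if times < 1:
        raise ValueError("times must be at least 1")
    return separator.join([prompt.strip()] * times)


question = "How many r's are in strawberry?"
print(repeat_prompt(question))
```

The resulting string is what you would send to the model in place of the single question.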

I talked about this a month ago in the context of image generation. It works particularly well there. You can read why below:

ChatGPT Creates Monsters (Maria Sukhareva, Jan 18)

But let me explain what is happening, and why it is not that mind-blowing that sending the prompt twice improves the results.

The reason is something called causal masking.

Transformers, by definition, are not recurrent. They process the whole input at once. The original transformers were not designed to generate text. They were created for machine translation. In machine translation, it is completely acceptable to see the full input of the source language during training while generating the translation.

The original transformer was an encoder-decoder model, and its encoder encoded each token with both the preceding and the succeeding context.

But for simply predicting the next word, this would make no sense. The transformer needs to be artificially prevented from seeing what comes after the current token. Otherwise, it could simply copy the answer and would not learn anything meaningful. That is why causal masking was introduced.

The training example is the full sentence. At every training step, the network predicts the next word and then looks up what actually comes next. It calculates the error from how far the prediction was from the actual next word. The model is therefore never allowed to see which words follow the current word, so the following context does not contribute to the representation of that word.
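To make the training setup concrete, here is a small sketch of how one sentence turns into many next-word examples. Splitting on whitespace is an assumption for readability; real models work on subword tokens, and the function name is hypothetical.

```python
def next_token_examples(sentence: str) -> list[tuple[list[str], str]]:
    """Pair every prefix of the sentence with the word that actually follows it."""
    tokens = sentence.split()
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


for context, target in next_token_examples("This release beat all records."):
    print(context, "->", target)
```

Each pair is one prediction: the model only gets the context on the left, and the error is computed against the target on the right.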

So at every step of generation, a word is encoded only with the preceding context. It only sees the preceding words of the prompt.
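The mask itself is just a lower-triangular matrix. The sketch below builds one in plain Python and shows what a token can see; whitespace tokens and the helper names are assumptions for illustration, and note that a token also sees itself.

```python
def causal_mask(n: int) -> list[list[int]]:
    """Row i marks which positions token i may attend to (1 = visible, 0 = hidden)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def visible_tokens(tokens: list[str], position: int) -> list[str]:
    """Return the tokens that the token at `position` can see under the mask."""
    row = causal_mask(len(tokens))[position]
    return [tok for tok, allowed in zip(tokens, row) if allowed]


tokens = "This release of the latest album".split()
for tok, row in zip(tokens, causal_mask(len(tokens))):
    print(f"{tok:>8}", row)

print(visible_tokens(tokens, 1))  # ['This', 'release']
```

In a real model the hidden positions are not dropped but get their attention scores set to negative infinity before the softmax, which has the same effect.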

For example, the encoder of the original transformer would produce different encodings for the word “release”, the second token in each of these sentences:

“This release of the latest Taylor Swift album beat all records.”
“This release refers to a trigger designed to set the process in motion.”

Models like BERT would encode the context that follows the word “release” into its vector representation, and those vectors would be completely different.

In GPT, “release” would only have “This” contributing to its representation, so the encoding at that position would initially be the same in both sentences. The ambiguity is resolved later as more tokens are processed. That is the limitation.
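Here is a toy version of that comparison. It is a sketch under heavy assumptions: the embeddings are made-up functions of the character codes rather than learned vectors, there is a single attention readout with no learned weights and no positional encoding, and the function names are hypothetical. It only illustrates which tokens can influence the vector for “release”.

```python
import math


def embed(word: str, dim: int = 4) -> list[float]:
    """Toy deterministic embedding from character codes (stand-in for learned embeddings)."""
    seed = sum(ord(ch) for ch in word)
    return [math.sin(seed * (k + 1)) for k in range(dim)]


def encode(tokens: list[str], position: int, causal: bool) -> list[float]:
    """One softmax-attention readout for the token at `position`.

    With causal=True only positions 0..position are visible (GPT-style);
    with causal=False every position is visible (BERT-style).
    """
    vectors = [embed(tok) for tok in tokens]
    query = vectors[position]
    visible = vectors[: position + 1] if causal else vectors
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in visible]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [
        sum(w * vec[d] for w, vec in zip(weights, visible)) / total
        for d in range(len(query))
    ]


album = "This release of the latest Taylor Swift album beat all records.".split()
trigger = "This release refers to a trigger designed to set the process in motion.".split()

same_causal = encode(album, 1, causal=True) == encode(trigger, 1, causal=True)
same_bidirectional = encode(album, 1, causal=False) == encode(trigger, 1, causal=False)
print("causal encodings identical:", same_causal)
print("bidirectional encodings identical:", same_bidirectional)
```

With the causal setting both sentences feed exactly the same two tokens into the computation, so the vectors match. With the bidirectional setting the rest of each sentence flows in, so the vectors should differ.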

Now suppose we repeat each sentence:

“This release of the latest Taylor Swift album beat all records. This release of the latest Taylor Swift album beat all records.”
“This release refers to a trigger designed to set the process in motion. This release refers to a trigger designed to set the process in motion.”

In this case, the second occurrence of “release” appears after a full sentence. Therefore, its representation can incorporate the entire preceding context. When the continuation is generated and the vector for this second “release” is processed, the model can attend to the meaning differences encoded in the prior sentence.
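A quick way to see the difference between the two occurrences is to list what precedes each one. This is a minimal sketch with whitespace tokens, which is an assumption; the variable names are mine.

```python
sentence = "This release refers to a trigger designed to set the process in motion."
repeated = f"{sentence} {sentence}".split()

# Positions of both occurrences of "release" in the repeated prompt.
first, second = [i for i, tok in enumerate(repeated) if tok == "release"]

# Under causal masking, only the tokens before a position can inform it.
context_first = repeated[:first]
context_second = repeated[:second]

print(context_first)
print(" ".join(context_second))
print("trigger" in context_first, "trigger" in context_second)
```

The first “release” has only “This” before it, while the second one has the whole first copy of the sentence, including the disambiguating word “trigger”.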

Now try to think about why the improvement is observed primarily for non-reasoning models.

