5 Comments
Daniel Graetzer:

Maybe Morse code is the one true language of the future.

Joseph de Castelnau:

Maria, I think you are missing the point entirely. With reinforcement learning, our precious LLMs are starting to figure out that concise, precise answers are key. So em dashes go up (as they are a stylistic way to provide clarity). Be careful with endogeneity bias.

Maria Sukhareva:

LLMs don’t start “figuring anything out”. They don’t have cognition or common sense; what they do have is a loss function that sums up the probabilities during generation. The smaller the loss, the better. Em dashes make generation use fewer tokens, which means a better score. So now they put em dashes everywhere, where they belong and where they don’t. And the models by no means aim to provide short, concise answers; the so-called reasoning has made them enormous ramblers, but one of the training objectives is still to optimise token use.
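
Whether em dashes really cost fewer tokens than the alternatives is easy to check directly. Below is a minimal sketch of such a check, assuming Python with OpenAI’s tiktoken library and the cl100k_base encoding (the thread names no specific model or tokenizer); it prints the token count for the same clause joined by different punctuation, including the spaced and unspaced em dash variants raised further down the thread.

```python
# Compare how one tokenizer splits the same clause
# when it is joined by different punctuation.
# cl100k_base is an assumption; counts vary by encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

variants = {
    "em dash, no spaces": "clarity\u2014that is the goal",
    "em dash, spaced": "clarity \u2014 that is the goal",
    "comma": "clarity, that is the goal",
    "semicolon": "clarity; that is the goal",
}

for label, text in variants.items():
    token_ids = enc.encode(text)
    print(f"{label:20s} {len(token_ids):2d} tokens")
```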

Joseph de Castelnau:

I disagree. With a reinforcement learning loss function, we are heading towards clarity. Though you are correct that their primary token-based charging model will continue to have them rambling on to charge customers more.

Yong Zheng-Xin (Yong):

thanks for writing this piece. don’t the em dashes used by models lack connecting spaces? i don’t think the example you gave in the ‘cheapness’ argument uses the same em dash.
