Maria I think you are missing the point entirely. With reinforcement learning, our precious LLMs are starting to figure out that concise, precise answers are key. So em dashes go up. (They are a stylistic way to provide clarity.) Be careful with endogeneity bias.
LLMs don’t start “figuring anything out”. They don’t have cognition or common sense; what they do have is a loss function that sums the negative log-probabilities of the tokens they generate. The smaller the loss, the better. Em dashes make generation use fewer tokens, which means a better score. So now they put em dashes everywhere, where they belong and where they don’t. And models by no means aim to provide short, concise answers; so-called reasoning made them enormous ramblers, but one of the training objectives is still to optimise token use.
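The loss argument above can be sketched in a few lines. This is a simplified illustration, not any real model's training code: the per-token probabilities below are made up, and it only shows that a sequence loss summed over tokens is mechanically smaller when the same content fits into fewer tokens (e.g. one em-dash token instead of several connective words).

```python
import math

def sequence_loss(token_probs):
    """Negative log-likelihood summed over generated tokens.

    Lower is better; with similar per-token probabilities,
    fewer tokens means fewer positive terms in the sum."""
    return -sum(math.log(p) for p in token_probs)

# Hypothetical per-token probabilities for two phrasings of the
# same clause: one joined by a single em-dash token, one spelled
# out with extra connective tokens.
em_dash_version = [0.9, 0.8, 0.85]                # 3 tokens
spelled_out_version = [0.9, 0.8, 0.85, 0.8, 0.8]  # 5 tokens

print(sequence_loss(em_dash_version) < sequence_loss(spelled_out_version))  # True
```

Each -log(p) term is positive, so the shorter phrasing always sums to a smaller loss here; whether that pressure actually drives em-dash use in deployed models is the point being debated in this thread.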
I disagree. With a reinforcement learning loss function we are heading towards clarity. Though you are correct that their primary token-based charging model will keep them rambling on to charge customers more.
thanks for writing this piece. don’t the em dashes used by models come without connecting spaces? i don’t think the example you gave in the ‘cheapness’ argument uses the same em dash.
Maybe morse code is the one true language of the future.