Discussion about this post

michaelh

I don't understand why you would train a model that wastes 90% of its energy on *whether* it should respond at all rather than *how* best to respond.
