• Architeuthis
    9 months ago

    Still occasionally think about that bit in the o1 white paper where the OpenAI researchers innocuously pose the question: what if our benchmarks for detecting hallucinations are actually shit? Wouldn’t that be something.