• Architeuthis
    11 months ago

    Still occasionally think about that bit in the o1 white paper where the OpenAI researchers innocuously pose the question of what if our benchmarks for detecting hallucinations are shit actually, wouldn’t that be something.