“Notably, O3-MINI, despite being one of the best reasoning models, frequently skipped essential proof steps by labeling them as “trivial”, even when their validity was crucial.”

  • froztbyte
    8 months ago

    feels like the same thing as my “‘just’ is a weasel word” speech