“Notably, O3-MINI, despite being one of the best reasoning models, frequently skipped essential proof steps by labeling them as “trivial”, even when their validity was crucial.”

  • bitofhope
    8 months ago

    To be fair, the typesetting of the papers is quite pleasant and the pictures are nice.

    • froztbyte
      8 months ago

      they gotta make up for all those scary cave-wall pictures somehow