• David Gerard (OP) · 1 year ago

    there’s a research result that the precise choice of tokeniser makes bugger all difference — what matters is almost entirely the data you put in

    because LLMs are lossy compression for text
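The compression framing can be sketched with a toy stand-in (this is my illustration, not part of the comment): a compressor, like a language model, works by exploiting the statistical redundancy in text. zlib is lossless, so it only gestures at the idea, but it shows that the information content of the data, not its surface encoding, is what determines how small it can get:

```python
import zlib

# Toy illustration of the "LLMs are compression for text" framing:
# a compressor exploits statistical redundancy in text, which is
# roughly what a language model learns. zlib is lossless, so this
# is only a stand-in for the idea, not for an actual LLM.
text = ("the cat sat on the mat. " * 50).encode("utf-8")
compressed = zlib.compress(text, level=9)

# Highly redundant text shrinks dramatically; the compressor has
# "modelled" the repetition, independent of how you might tokenise it.
print(len(text), len(compressed))
```

On repetitive input like this, the compressed form is a small fraction of the original, which is the sense in which modelling and compression are two views of the same job.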