r/SneerClub archives

EDIT: Some users correctly pointed out that the title of this post does not literally reflect the content of the linked article, and is therefore misleading. See this comment and the answers below.

Language models seem to be much better than humans at next-token prediction - LessWrong

Humans were given the start of a random OpenWebText document, and they were asked to guess which token comes next. Once they had guessed, the true answer was revealed (to give them the opportunity to get better at the game), and they were again asked to guess which token followed the revealed one.
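For anyone who wants the mechanics spelled out, here is a minimal sketch of that loop in Python. It assumes a GPT-2-style BPE vocabulary via the tiktoken library, and the sample document and exact-match scoring are placeholders, not the game's actual code.

```python
# Minimal sketch of the guessing game: show a prefix, take a guess,
# reveal the true next token, repeat. Assumes the tiktoken library and
# a placeholder snippet standing in for a random OpenWebText document.
import tiktoken

enc = tiktoken.get_encoding("gpt2")
document = "Buried in Debt: How Student Loans Became a Crisis"  # placeholder text
tokens = enc.encode(document)

correct = 0
for i in range(1, len(tokens)):
    prefix = enc.decode(tokens[:i])
    truth = enc.decode([tokens[i]])
    guess = input(f"Document so far: {prefix!r}\nYour guess for the next token: ")
    if guess == truth:
        correct += 1
    # The answer is revealed after each guess so the player can adapt.
    print(f"Actual next token: {truth!r}")

print(f"Exact-match score: {correct}/{len(tokens) - 1}")
```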

Seems like the human and language model are both being asked to repeatedly guess the next token in a real document.

Yes, the title is maybe a bit unfair; however, to win this game you should predict whatever word typically comes next in an internet document (which I dubbed "the next word an AI would write" for comical effect).
I don't understand how that's "comical effect" and not just a blatant misrepresentation.
Token generation is the way an AI completes text. To compete at token generation, a human would have to complete the text the way an AI would (for example, no human would add "Enlarge this image toggle caption." after the headline of an article, but if you want to improve your score in this game, you should). I feel that the fact that AI turned out to be better than humans at the AI-specific task (token generation) does not tell us anything meaningful about whether humans are "better" or "worse" than AI at language, and it certainly does not imply (as the linked article claims) that GPT-3 can be accurately described as a "superhuman language modeler". I have tried to edit the title of the post, but apparently that is not possible. I will add a disclaimer before the link.
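To make that concrete, here is a rough sketch of what "token generation" looks like from the model's side, using the publicly available GPT-2 checkpoint from Hugging Face transformers (an assumption on my part; the post doesn't say which model the game actually scores against). It just prints the model's top guesses for the token after a prompt, and whatever lands at the top is what the game rewards you for guessing.

```python
# Rough sketch: ask GPT-2 for its most likely next tokens after a prompt.
# Assumes the transformers and torch packages; the prompt is illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Buried in Debt"
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]  # scores for the token right after the prompt

top = torch.topk(logits, 5).indices
print([tok.decode([int(i)]) for i in top])  # the model's top 5 next-token guesses
```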

in our opinion playing this game for half an hour gives you some useful perspective on what it’s like to be a language model.

ah yes, because there definitely is something that it is like to be a language model

LessWrongers reinventing animism
Based, unironically. A panpsychic or rational-animist endpoint is on the better end of the LessWrong spectrum-of-likely-outcomes.
Jesse, what the fuck are you talking about?
LessWrong is a theological project at this point, basically explicitly trying to make the case that all religions are nuts because God's in the future, not the past (all while not crediting Teilhard de Chardin). The possibility of plausibility or likelihood is irrelevant. One possible\* state this could end up in is that the doctrine of substrate independence, that intelligence is intelligence, code or cell, results in an animistic cosmology. After all, if intelligence is a process of computation, there's computation everywhere and therefore intelligence everywhere. This would be the hard mystic bend of the religious memplex\*\* that includes MIRI or LessWrong theology. You see the hard mystic bend periodically in major religions early in their lifespans: gnosticism pops up regularly in pagan, Christian and Muslim traditions, you have esoteric Buddhism, Chan, Zen and mystic Taoism, secret-society and participatory cults in Enlightenment and Revolutionary times, and it's to be expected in the nascent new religious system accreting around 20th-century cultural, historic and technological developments. It's already in the Silicon Valley ecosystem (quantum bullshit, holographic universe, many-worlds theory), so it'll Katamari Damacy its way into the eventual pseudo- and then actual religion that evolves out of this. \*theologically possible \*\*I'm sorry, but it's the best word I know; "paradigm" is even worse. Non-exclusively includes: race realism, g, timeless decision theory, Roko's pet, longtermism, specific utilitarian implementations
"demon in a box" but literal?
No, more like "everything has a soul, man. even this rock".

[deleted]

Playing the game for a bit shows it to be genuinely hard. I didn't play that much, but the *vast* majority of my mistakes were in guessing the wrong word, not playing catch-up with arbitrary and weird tokenization artifacts (a couple prompts out of 50+ were of the form "port","ing" or similar). It at least *looks* like they've tried to make the game fair by actually asking for prediction of words (where the language model presumably has to guess the equivalent sequence of tokens up to the next whitespace or punctuation). Or maybe modern tokenization methods mostly tokenize at word boundaries, I'm not sure. Honestly this looks like something that people just want to be mad at from reading the title, because anything that shows up on lesswrong must be dunked on. It wouldn't take much cleaning up to make this into a perfectly good paper.
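If you want to check the word-boundary point yourself, a quick sketch using the GPT-2 BPE vocabulary via tiktoken (my assumption about which vocabulary the game uses, not something the post states) shows where the sub-word splits land:

```python
# Quick check of where a GPT-2-style BPE tokenizer splits words.
# Assumes the tiktoken library; other vocabularies will split differently.
import tiktoken

enc = tiktoken.get_encoding("gpt2")
for text in ["porting", " porting", "unbelievable", " the"]:
    pieces = [enc.decode([t]) for t in enc.encode(text)]
    print(f"{text!r} -> {pieces}")
# Words without a leading space often break into sub-word pieces
# (something like 'port' + 'ing'), while common words preceded by a
# space tend to come out as single tokens.
```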
[deleted]
That looks like hidden text that would perhaps be below the headline of every article on NPR's website. That's a rigged game, for sure. Even if it were visible, we would tend to “fuzz” it when attempting to read the article. We would be able to ignore the boilerplate text in favor of reading the webpage for meaning.
> By the way, my very first task involved filling in the headline to an article that started with "Buried in Debt", it then spontaneously became "Enlarge this image toggle caption. Emily / Bogle / NPR Emily / Boggle NPR." No wonder humans aren't good at this task... I don't often predict my interlocutor will throw out forward slashes in the middle of conversation.

This is brilliant. The bottom line is, of course an empirical model would be better at predicting that kind of shit.
> say nothing interesting about human cognition

Speak for yourself
It's still worth doing this study on, say, some mTurkers (and not just friends) and fixing the tokenization. Nice PhD project for someone in AI. I'm actually curious now whether it's been done before.
As someone who is doing their PhD in a related field, I don't know what the paper would show beyond trivia. You can extend this to any domain: "predict the next pixel", "predict the next note", "predict the next controller input in a game of Smash Bros". Why do we expect humans to be better at predicting any of this than a machine trained to do just this task?
For sure the result alone wouldn't be worthy of a PhD paper; I was suggesting it would be part of a larger PhD thesis.
It is actually not too surprising a result. If you had humans who religiously studied OpenWebText like monks study koans 16 hours a day, they would approach or exceed this level. It does merit a paper to show the superhuman ability of models, but it demonstrates no reasoning, nor anything we haven't known for at least 40 years. A more interesting thing would be to test it against something not in its training set. Maybe some fan fiction or, even better, erotica.
Out of the many problems I could raise with your comment, come on: **Did you read their conclusions?**

[deleted]

It’s like they can’t possibly conceive of a lateral move in terms of ability or function. Everything is hierarchical: superhuman, human, subhuman. I’d be perfectly willing to concede that there’s a language which computers could learn to speak amongst themselves that humans would find difficult to parse (which is to say, code), and that it might even be possible for computers to invent more optimal code themselves that humans find impossible to parse, but that’s not a *higher* level of functioning, just a different one.
It's funny to apply this idiocy to other contexts, especially ones that ought to be available to said computer touchers. E.g., make their neural network predict the next byte in a binary: does that make it wildly super-gcc or super-clang (or insert another compiler here) at anything at all? Like, maybe predicting the next byte in a binary has absolutely nothing to do with compiling or modeling a compiler, and maybe for something more complex the connection is even weaker. It's not even that they use some computer analogies to think through this nonsense; it's that they don't do any reasoning at all here. Human brains are magic, the insides of the neural network are magic, a perfect model.

Here’s the game they used to collect the data: http://rr-lm-game.herokuapp.com/