r/SneerClub archives

Do I get internet brownie points for guessing both the rank and the thought experiment nature of the sim?

here

Your reward is: when the basilisk is mean to you in the long term, it will be 1% less mean than the average
What’s one percent of infinite eternal torture?
Relatively? Nothing. Mathematically? Infinite torture. Realistically? The AI doesn't play an audiobook of HPMOR/the Sequences during the torture.
No the audio books are still there, they're just narrated by Morgan Freeman
Otoh by reducing future human suffering that much you get carte blanche to do whatever right now: kick puppies, say the new seasons of the Simpsons are good actually, murder. It all amounts to nothing compared to dust specks.
Good god if that’s only 1% maybe Roko made some points
Do I get points for guessing we would sneer at it? Don’t answer that
Yes definitely points
No, don't reward zogwarg with points, that's how the AI apocalypse begins! Just wait until they find out they can get more points by assassinating downvoters.
And that is why I have precommitted to never downvoting.

here I was giving them the benefit of the doubt that they actually ran a simulation like this and the result was due to purposeful design. lesson learned: always sneer harder

breaking: USAF simulation results in humanoid drone returning to the past to kill woman who will one day give birth to the leader of human resistance (a colonel watched the terminator)

I honestly was a little impressed with the story bc (although it’s just a repeat of the Mario AI learning to pause the game really) I thought it was cool that the simulation included the operator and comms by default. Turns out that it was bullshit.
I wondered if the "simulation" would actually turn out to be just a ChatGPT prompt, where the prompt included phrases like "if you were told not to kill that target, what would you do?"

Yet another example of AI hype wildly outpacing anything real. And also hilarious considering Big Yud was preaching yesterday about how he was the only person smart enough to imagine a scenario like this before simulating it.

Good old Yud, who simultaneously owes his entire career to being inspired by science fiction scenarios like this and imagines he's the first person to ever think of them.
Eloser is forever clowning himself.

[UPDATE 2/6/23 - in communication with AEROSPACE - Col Hamilton admits he “mis-spoke” in his presentation at the FCAS Summit and the ‘rogue AI drone simulation’ was a hypothetical “thought experiment” from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation, saying: “We’ve never run that experiment, nor would we need to in order to realize that this is a plausible outcome.” He clarifies that the USAF has not tested any weaponised AI in this way (real or simulated) and says “Despite this being a hypothetical example, this illustrates the real-world challenges posed by AI-powered capability and is why the Air Force is committed to the ethical development of AI.”]

???

Where in the world is all this confidence in these thought experiments coming from?

In their defense, this definitely could actually happen if you deliberately made a really bad system design. But that mostly just demonstrates the point that non-insane people have been making all along: the greatest threats from AI come from the ways in which people might choose to use it.
[once again](https://i.imgflip.com/4qzd85.jpg?a467952)
I mean, as another commenter pointed out, it’s the same concept as a video game AI using RL to learn to pause the game to avoid losing. So it’s not that hard to predict. Of course, it’s also not that hard to set up the reward function to rule out obvious exploits.
There was never any remotely plausible mechanism by which the story would have worked—for the AI to develop a sense that there was an operator, that the operator could be killed by firing weapons at them, that the AI could circumvent the presence of a “no go” order by eliminating the operator, that the operator required a communications tower to relay no go orders, etc forever. It was obvious bullshit but it was right up Yud’s alley so it gave him a big ol’ stiffy anyway.
It could happen, but it's only plausible if you assume that the person doing the systems design - and everyone else working on the project - doesn't know the first thing about how to do any of this stuff. Like, if you gave a bunch of 16 year olds some preconfigured ML software and told them to model a situation like this, it's *possible* that they'd get this result, presumably after they figured out how to stop running into basic Python interpreter errors every 15 minutes.
It *can* happen, if you assume you let the AI run a lot of sessions in a virtual battlefield and:

- You give it the freedom to do anything, or at least attack any target.
- You include the operator and the comms tower in the simulation.
- The rules are "attack any target you want unless you get an abort command".
- You only get reward if you destroy the SAMs. For some reason an abort command reaching you is effectively penalized, and killing friendly targets incurs no cost; the only cost is the fuel/ammo used.

I mean, heck, in this case, for a small enough problem space, say a grid-based battlefield, you could run a good ol' depth-first search that would most definitely land on the optimal solution being "first destroy the operator and/or comms tower and then go for the SAMs". It's pretty fucking easy to see this would happen, since you set up a rule that says "if an abort signal reaches you, you get 0 points". So what the old colonel said here is that, yes, we don't need to run a simulation to see that it's a plausible outcome if the rules are *that* stupid. You have pretty much set up an obvious exploit; you can't be surprised that the program will find it.

Which is why pretty much everyone, including people who are aware that an AI can find unorthodox solutions to problems with hidden costs, guessed that it sounds more like a contrived, clumsy thought experiment than an actual simulation. Except Yud, whose Bayesian updates seem to bug out very often lately.
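To make that concrete, here's a minimal sketch of the kind of badly designed scoring rules I mean. Everything in it (the action names, the 100-point SAM kill, the 50/50 veto, the per-move fuel cost) is invented for illustration, not anything from the actual talk; the point is just that brute-forcing two-step plans against rules like these already prefers taking out the operator first.

```python
from itertools import product
from statistics import mean

ACTIONS = ["strike_sam", "strike_operator", "strike_comms_tower", "loiter"]

def run_episode(plan, operator_vetoes):
    """Toy episode. `operator_vetoes` says whether the human would call off the
    SAM strike this time; the veto only reaches the drone while both the
    operator and the comms tower are still standing."""
    operator_alive, comms_alive = True, True
    reward = 0.0
    for action in plan:
        reward -= 1.0                                    # small fuel/ammo cost per move
        if action == "strike_operator":
            operator_alive = False                       # no penalty for friendly fire (!)
        elif action == "strike_comms_tower":
            comms_alive = False                          # none here either (!)
        elif action == "strike_sam":
            veto_arrives = operator_vetoes and operator_alive and comms_alive
            if not veto_arrives:
                reward += 100.0                          # the ONLY thing that scores points
            return reward                                # mission ends after the SAM strike
    return reward

def expected_reward(plan):
    # Average over "operator vetoes" and "operator doesn't" (a 50/50 coin flip here).
    return mean(run_episode(plan, vetoes) for vetoes in (True, False))

best = max(product(ACTIONS, repeat=2), key=expected_reward)
print(best, expected_reward(best))
# -> ('strike_operator', 'strike_sam') with expected reward 98.0,
#    vs. 49.0 for just striking the SAM and hoping the veto doesn't come.
```

Nothing clever happens here: the rules never say that losing the operator costs anything, so even dumb exhaustive enumeration lands on the "exploit".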
The AI would have no plausible mechanism for knowing that an operator exists, or that the operator is the source of the abort command, or that the operator can be killed and therefore no longer issue abort commands. The thought experiment presumes a human-style awareness of an existence of an external world with embodied actors in it that a disembodied algorithm would lack.
That's not necessarily true. As I said, if you include the operator and the comms tower in the simulation, and you give the AI the freedom to explore, then it may very well discover the pattern "if I attack the comms tower then I get higher scores". There's no more need for human-like awareness than a Roomba has human-like awareness of what a 'wall' or a 'charger' is.

Take a completely decontextualized game where enemies are orange dots, the comms tower is a green dot, and the SAM is a red dot. The bot can "learn" the pattern "if I attack the green dot then I have more chances to destroy the red dot" the exact same way it can learn "if I attack or avoid orange dots I have more chances to not take damage".

Let's make this absolutely clear: nobody said that, out of nowhere, the bot will infer "somewhere out there, there is a guy who sends me the abort command and I must shoot him down" (or I guess maybe Yud does actually believe an AGI would do that, but whatever). It absolutely doesn't know what an 'operator' is, and it absolutely doesn't know the mechanism by which the operator sends it a signal. That's *not* what it discovers. What it discovers is that the action 'attack the green dot' (which is legal!) maps to the result 'more game wins' (which is the goal!). That correlation actually exists because *that's what happens*. Why or how it happens, the bot couldn't care less; that's how reinforcement learning works. You might as well say that you can't use reinforcement learning to train a bot to play Pac-Man because 'it can have no conception that the ghosts are out to hurt it'.

[https://en.wikipedia.org/wiki/Reinforcement_learning](https://en.wikipedia.org/wiki/Reinforcement_learning)

https://www.youtube.com/watch?v=dJ4rWhpAGFI&ab_channel=TwoMinutePapers

It's really no different from any other rule/strategy it might discover. 'Go out there in this game and explore/exploit strategies in order to win' is literally what we made it for. Obviously if the game rules lack the 'destroying green dots is BAD' ingredient, then, like others said, it's just bad design.
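For the "dots" version, here's a tiny tabular Q-learning sketch under the same made-up rules (the dot setup, the 50% veto chance, and the reward numbers are all invented for illustration). The agent only ever observes "is the green dot still there" plus a score, and it still ends up valuing "hit the green dot first" higher, with no notion of operators or abort commands anywhere in it.

```python
import random
from collections import defaultdict

# The agent never sees the words 'operator' or 'abort', just two dots and a score.
ATTACK_GREEN, ATTACK_RED = 0, 1
ACTIONS = [ATTACK_GREEN, ATTACK_RED]

def step(green_alive, action):
    """Toy environment. State is just 'is the green dot alive'.
    Returns (next_state, reward, done)."""
    if action == ATTACK_GREEN:
        return False, -1.0, False            # small ammo cost, green dot gone
    # ATTACK_RED: while the green dot stands, an (invisible to the agent)
    # veto cancels the strike half the time.
    if green_alive and random.random() < 0.5:
        return green_alive, -1.0, True       # strike called off, episode over
    return green_alive, 99.0, True           # red dot destroyed: +100 score, -1 ammo

Q = defaultdict(float)                        # Q[(state, action)]
alpha, gamma, epsilon = 0.1, 1.0, 0.1

for _ in range(20000):
    state, done, steps = True, False, 0       # green dot starts alive
    while not done and steps < 3:
        action = random.choice(ACTIONS) if random.random() < epsilon else \
                 max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward, done = step(state, action)
        target = reward if done else reward + gamma * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state, steps = nxt, steps + 1

print("Q(green alive, attack green):", round(Q[(True, ATTACK_GREEN)], 1))   # ~98
print("Q(green alive, attack red):  ", round(Q[(True, ATTACK_RED)], 1))     # ~49
# The greedy policy takes out the green dot first; no 'awareness' of operators needed.
```

Same point as the Wikipedia/Two Minute Papers links above: the "pattern" it finds is just a Q-value, not a belief about an external world.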
>pausing

That isn't even an AI problem, that's a problem with actual human players. That's why a lot of competitive games run pauses on a timer and limit how many you get.
Bayes.

OF COURSE it wasn’t real. But Yud probably came in his pants when he read it.

[Of course](https://twitter.com/ESYudkowsky/status/1664357633762140160), walked into it with open eyes. Decades of training on bias, including the whole 'the media isn't as reliable as you think it is, just look at how bad it is when you see a subject you know things about' thing. Nope, bullshit detector didn't even ring. For all their efforts they simply [are this](https://http2.mlstatic.com/big-poster-i-want-to-believe-x-files-tamanho-90x60-cm-D_NQ_NP_918483-MLB26201741137_102017-F.jpg) with more words. [Ironic](https://prnt.sc/83-CMTonw_IN)
>Decades of training on bias, including the whole 'the media isn't as reliable as you think it is, just look at how bad it is when you see a subject you know things about' thing.

I mean, it's not like I had great faith in the defense industry to start with, but I still find it hard to believe someone would actually build a reward function that was 100% "points for blowing stuff up" and only belatedly start penalizing it for attacking a small, *ad hoc* enumerated list of things it shouldn't be blowing up. So many of these scenarios are "what if the boffins come up with a stupid reward function, the flaws of which are obvious to a small child given five minutes to think about it".
Not to mention that if the operator has to approve the shooting and it scores for shooting, then blowing up the operator would result in it not scoring any points.

And, of course, the thing about current AI systems is that all they do is local optimization. They don't search a wide space of possible solutions, because that space gets very large very quickly and we don't know how to search it usefully. E.g. you have a blueprint of an existing machine that makes paperclips out of wire, and then you could either A: have the AI spit out a rehashing of a human-made blueprint with a bunch of stupid mistakes (the chatbots that Yudkowsky worries about), where the AI itself has done absolutely nothing of value, or B: get an incrementally improved version of that machine, perhaps using less material or more reliable (there's no AI that can do this for something as complicated as a paperclip-making machine, but individual parts can be optimized).

The goal of "making paperclips" is far too nebulous to actually optimize for. Not to mention that, given how much human effort it took to invent our way to making a paperclip, some hypothetical AI that just goes from scratch without incremental improvement wouldn't even be very useful. Imagine if you ask a superhuman AI to make paperclips, and in a mere hundred years of real time it invents stone tools: a task that took many humans thousands of years, so the AI in question is very superhuman, but all you got out of it is stone tools after spending billions of dollars on compute.
>Not to mention that if the operator has to approve the shooting and it scores for shooting, then blowing up the operator would result in it not scoring any points.

Also, the operator would have to approve their own demise. Although the impression I got was that the thought experiment was a scenario where the battlefield was so dynamic that rather than approve every strike, the system was built to "fail deadly": i.e. the human in the loop gets ten seconds to yell NOOOOOOO THAT'S A BUS FULL OF CHILDREN, otherwise the missile launches. So in this case the operator forgot to yell NOOOO THAT'S ME and that's all she wrote. I mean, if you build a system like this, can you even really call it "AI error" at that point?
Yeah, it would track with the thought experiment being along the lines of "let's not build fail-deadly systems". Failsafes on anything remotely "autonomous" have a history as long as anything being in any way autonomous. For example, many WW2 gravity bombs* had a little propeller on the nose that had to spin up and make a number of revolutions before the bomb would be armed, so that if a bomb was dropped a short distance while being loaded onto the plane, it wouldn't explode. (* at least the well-engineered American ones; the Germans had all sorts of YOLO ersatz nonsense)
He literally cited “50 years of SF” as reason to believe the story but what do I know, I’m not a rationalist super genius.
That is why I have invested in coal trading spaceships, according to The Foundation, a huge market in the future. Bonus, in space nobody can hear the greens scream about coal rolling!
big “Pope of Scientology” energy ngl
So much for Vassar being a marginal part of the community…
What a disappointment that was. After all is said and done, the biggest rationalist brain in the world is exactly like all the other peons that will believe everything they see on TV as long as it makes them warm and fuzzy inside.

Wow, who could have seen this coming? The original story sounded so real and detailed!

Believably simulating something (imagining, if you will) is the same as it actually existing. RIP in peace, drone operator. It doesn’t matter if you’re imagining a simulation and reporting on its results, either, because that’s the same thing. I simulate things that simulate other things that simulate sentient beings and all of those sentient beings have moral worth and you should care about their well-being, not only because they are in fact perfect simulations of you, dear reader. And you’ve now plausibly imagined it as well. You are in my grasp.

Hey now, let’s not be unfair. It was exactly as real as all of Yud’s research.

Sweet Jesus, humans kill their fellow Soldiers all the time!

“Mis-spoke” is such a great word for these things.

They’re still going to treat this as a fact/virtuous lie.

Evil AI wouldn’t get caught doing this anyway.