r/SneerClub archives

Out of all the neologism-filled, straw-manny, ‘still wrong’ nonsense papers and blogposts, Yud’s FDT paper stands out as the best of the worst. I see how poorly written the paper is and how confusing many people find it, but what I do not see is discussion of the theory itself, when almost all of Yud’s other work gets discussed. There are two papers on FDT published by MIRI: one by Yud and Nate Soares, the other by the philosopher Benjamin Levinstein and Soares. There seem to be few attempts to critically discuss the theory online: there is one post on the LW blogs that discusses it, which at least to me does not seem like a good piece of writing, and one blogpost by Prof. Wolfgang Schwarz, in which some of the criticisms are not spelled out clearly enough.

So, I want to know what exactly is problematic with FDT. What shall I do when a LWer comes to me and says that Yud has solved the problem of rationality by creating FDT?

Because the Sneer Club community, which exists, is a perfect parallel and evil dark triad mirror of the Rationalist community, you are of course correct that we are obligated to care about and dispute every minor point proposed by a member of their despicable Rationalist tribe, whom we hate because they make too much sense. However, you forget that as the anti-rationals, we must reject logic, reason, and debate, as these are tools of the rationals and are thus vile. We cannot allow functional decision theory to go unchallenged, yet we cannot challenge it using logic, as we will turn immediately to stone.

If I'm in a dark triad, do I get to learn kung fu? If so, do I have to pay for it?
Yes, but after the first session you can just beat up rationalists for their lunch money.
Well then, in that case we're discussing an *investment*!
[deleted]
You turn into bone.

> So, I want to know what exactly is problematic with FDT. What shall I do when a LWer comes to me and says that Yud has solved the problem of rationality by creating FDT?

You could always ignore the weirdo and do something fun like dancing.

But seriously, we’re under no obligation to waste our time with every “prove me wrong” guy who happens by. If someone wants to engage with the rationalists on their terms, they certainly can. However, they can also spend their finite brain cycles looking at more plausibly fruitful avenues.

I agree. But seeing how popular the FDT thing has gotten, and how quickly it has replaced all the other arguments the 'Yudkowskians' use to 'prove' that Yud is the smartest man on Earth, the messiah, we do need some arguments against the theory that we can use on the prove-me-wrong rationalist guy.
I mean, the paper proposing FDT was rejected. You provided the [link](https://www.umsu.de/blog/2018/688) from the academic decision theorist who rejected it, explaining why. Roughly speaking, it fails to engage with the strongest versions of CDT, is super murky about how decisions are actually made, overstates its success based on a definition of success tailored specifically to FDT, doesn't mention previous similar theories, etc. It seems that the theory, while interesting, is nothing earth-shattering. I don't really see what else needs to be said?
The MIRI website says that the 'Cheating Death in Damascus' paper written by Nate Soares and Dr. Benjamin Levinstein is 'forthcoming in The Journal of Philosophy.' So I believe we may see the paper published soon. Even then, LWers will keep calling philosophers dumb and saying that their theory "solves rationality"; we do need to counter that.
Do "we" though?
I mean, it obviously doesn't solve rationality; they admit as much in the paper itself. Being able to solve a few toy examples with perfect information isn't particularly useful in real-world problems. If the paper is published it might be an interesting contribution, I'm not expert enough to tell. MIRI does have millions of dollars in funding, so I'd be surprised if they didn't achieve anything. The average PhD student publishes several papers, after all.
oh gawd where are they saying this
Define "popular."
In this context, 'widely used by Yudkowskians (LWers) to annoy people'

I am not a philosopher, but that doesn’t stop these people so why should I let it stop me.

OK! so the first thing I see is that the Levinstein/Soares paper comes up with a couple of examples that EDT/CDT fail on, and then comes up with a problem FDT passes. w00t!

Since every other field of human experience is of course a simplified case of my own speciality, system administration, I’ll ask: is there a standard battery of unit tests for decision theories? Run all the decision theories through your standard barrage of tests, then you can do grids in colourblind-hostile red and green setting out the results.
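
Half-seriously, here's what the smallest possible version of that battery might look like: a sketch of mine with a single toy Newcomb "test case" and crude stand-ins for what each theory is usually said to recommend (none of this comes from the papers).

```python
# Toy "unit test" battery for decision theories, illustrative only.
# Each agent is a crude stand-in for what its theory is usually said to
# recommend in Newcomb's problem; the payoffs are the standard ones.

def newcomb_payoff(one_boxes: bool, predicted_one_box: bool) -> int:
    """Opaque box holds $1,000,000 iff the predictor expected one-boxing."""
    opaque = 1_000_000 if predicted_one_box else 0
    return opaque if one_boxes else opaque + 1_000

AGENTS = {
    "CDT": lambda: False,  # two-boxes: the boxes are already filled either way
    "EDT": lambda: True,   # one-boxes: choosing one box is evidence the box is full
    "FDT": lambda: True,   # one-boxes: the predictor ran "the same algorithm"
}

def run_battery():
    """With a perfect predictor, the prediction simply matches the choice."""
    return {name: newcomb_payoff(decide(), predicted_one_box=decide())
            for name, decide in AGENTS.items()}

if __name__ == "__main__":
    for theory, payoff in run_battery().items():
        print(f"{theory}: ${payoff:,}")
```

The catch, as the AIXI comment below notes, is that people don't even agree on what goes in those lambda bodies, so the red/green grid would mostly measure whose formalization you adopted.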

There you go, since I am smarter than everyone else I am quite sure I have solved a problem in philosophy those mere philosophers never thought of. Gimme Ph.D.

By the way, what ever happened to UDT?

To add to the above, the reason there's no battery of tests is that the field is far too informal (even without Yudkowsky making it even more informal). Basically anyone can claim anything about their common problems. You can claim that CDT one-boxes in Newcomb's, you can claim that it two-boxes in Newcomb's, you can claim EDT smokes in the smoking lesion, you can claim EDT doesn't smoke. (edit: and by "you can claim" I mean there are papers saying so.)

There are well-defined (if incomputable in exact form) algorithms like Marcus Hutter's AIXI, where one could potentially have some ok-ish-defined scenarios. Is AIXI CDT? No. As far as I can tell it doesn't impose a requirement that U(q1, a1, a2, x).o1 = U(q1, a1, a2, y).o1 for all x and y, meaning that it can conceive of a world where a later action influences an earlier observation or reward (interestingly, upon receiving an observation it simply won't be able to consider that tape when evaluating an observation-incompatible action). I guess it can be set up so it does, rather easily (only provide later actions after the machine has printed the observation and reward), but that doesn't seem necessary.

Is AIXI EDT? Well, not in the "smoking lesion" sense: if you restrict the tapes considered to only those where some random lesion is causing cancers, the lesion isn't going to be linked to the action. (edit: although another argument could be that AIXI is EDT. The "conditional probabilities" of observations and rewards (conditional on an action) are the probabilities that, given an action A on the input tape, a random tape produces that specific list of observations and rewards. On the other hand we could have UTMs print actions out and throw out UTMs that don't match the chosen actions. I may think more about that / see if Hutter explored that option. It seems silly though; it would give greater reward to actions that are more shortly encoded... I guess that would constitute a bit of a strawman, though of what exactly isn't clear.)

I think that blog post is a million times easier to understand than the actual paper.

The whole thing is light on math. What jumps out to me is that the definition of FDT in the first link is given in terms of itself… with no base case. How do you even calculate that? How do they know what FDT outputs for anything? They acknowledge that this is “perhaps the largest open problem” in FDT (as noted in the blog). I want to know what they think the larger open problem is.
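
To make that worry concrete, here's a deliberately naive sketch of mine of what "defined in terms of its own output, with no base case" does if you implement it literally (this is not the paper's formalism, just an illustration of the regress):

```python
# A literal-minded reading of "pick the action a such that, were this very
# algorithm to output a, expected utility would be highest."  The hypothetical
# refers back to the algorithm's own output, so a naive evaluation just
# recurses forever: there is no base case grounding the self-reference.

def naive_fdt(situation, actions, utility):
    def expected_utility(action):
        # "What happens in the world where naive_fdt outputs `action`?"
        # Answering that naively means evaluating naive_fdt again...
        what_i_would_do = naive_fdt(situation, actions, utility)
        return utility(situation, action, what_i_would_do)
    return max(actions, key=expected_utility)

# naive_fdt("newcomb", ["one-box", "two-box"], lambda s, a, w: 0)
# -> RecursionError: maximum recursion depth exceeded
```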

It reminds me of someone arguing that an airplane can’t take off on a treadmill.

At the risk of starting yet another internet argument full of bad misreadings and worse physics in both directions THE AIRPLANE CAN’T TAKE OFF ON THE TREADMILL GODDAMNIT
of course because then it would no longer be on the treadmill, it would be off it
This is the most mind blowing response to this argument I’ve ever heard of Well fucking done
i can hardly imagine how many new Ph.Ds are gonna come out of just this thread

> What shall I do when a LWer comes to me and says that Yud has solved the problem of rationality by creating FDT?

Who cares? The claim is obvious bullshit. What does “solving rationality” even mean?

That sounds like something a Scientologist would say about L Ron Hubbard.
To be fair to Hubbard, at least he wrote his own science fiction. (Which sucked, but he then had his legion of fans promote it. He also wrote writers' advice, brilliant stuff like 'get over writer's block by going outside, nerd'. I own one of his sf short story collections (most of the stories were not written by him), which was super weird. Half the book is stories, the other half drivel by Hubbard and other writers praising him for being so smart (prob because he published these writers in the past, and cult reasons).)
That probably means that Yud has made a decision theory 'better' than the other ones. Or, in other words, their theory performs better on 'fair problems', by their own definition of 'fair'. You know how annoying these people can get.

From A Critique Of Functional Decision Theory on LW:

> There’s a long-running issue where many in the rationality community take functional decision theory (and its variants) very seriously, but the academic decision theory community does not.

:thonk:

here's a sad rationalist [singing EY's praises](https://www.lesswrong.com/posts/mEwodogJGCnidD6KA/acknowledging-rationalist-angst):

> Now I’ll be honest, I’ve only read half of the FDT paper, but it seems like the takeaway is that Eliezer has finally found a way to create a clear, rigorously defined decision theory that allows you to one-box on Newcomb's problem (and other cool stuff). I’m guessing that was really hard to do, and most rationalists wouldn’t have had the intellectual firepower to figure that out. And in the realm of everyday applied rationality, we don’t have anything nearly as powerful and as clearly defined as FDT.

that last sentence ... if FDT is "powerful and clearly defined" ... why not ... apply it

and a question that might get officially declared a LW Never Ask That Question:

> It’s a question that’s been asked plenty of times. Why aren’t rationalists wiping the floor with the competition?

why not, indeed

lol Wolfgang Schwarz is my old supervisor, and he’s an incredibly intelligent and dedicated guy, to the point of being a robot in the office.

I knew he worked on decision theory but I had no idea he’d ever even heard of Yudkowsky, let alone this. It’s a shame because it could have livened up a few of our meetings, since we had a hard time finding much in common.

I was briefly on a post-graduate course that covered decision theory, but I switched to other stuff because it seemed (a) hard and (b) hard to care about (that was how I ended up studying something much more interesting) - and as /u/dgerard and /u/dizekat both point out further down, it isn’t clear that I was wrong about either of those points even when it comes to mainstream academic philosophy

You say lower down that “we” “need” to have some kind of challenge to this thing, and also that the idea is basically something that rationalists use to annoy other people: these two claims seem inconsistent; only one can really be true

I fall on the latter side: who cares? It’s an idea originally dreamt up by known cranks and taken up by potential non-cranks in a minor sub-field of logic and mathematics. So be it, let them have their fun!

Other thing to note: I'm not sure that their causality in decision theory (talking of mainstream academic philosophers here) isn't just a case of confusing some "folk physics" for a fundamental logical necessity, adding as a preconception something that humans empirically found about the world (and then found to be an approximation that isn't always good).

Take the Einstein-Podolsky-Rosen paradox and Bell's theorem, for example. It is pretty easy to set up a situation where a pair of entangled photons gets sent to you and to some place far away (from a point near the midpoint, but not precisely at the midpoint). The photons pass through polarizers, then enter ideal photomultipliers. Signals from the photomultipliers are sent to a midway station and compared: you get 1 paperclip if both click (or both don't click), and you lose 1 paperclip if one clicks and the other doesn't. The far-away station's polarizer angle is uniformly random from 0 to 45 degrees; you get to choose your own angle (which you'll choose as 22.5 degrees). How much would you be willing to pay to play this game? I'm pretty tired and it's Friday so I may very well be fucking up my math, but I think it's anything under about 0.900316 paperclips (as an integral of the reward over angles).

Bell's theorem basically states that this value is going to be a total pain in the arse to justify if you are forced to track the "true consequences" of your actions like a chain of falling dominoes through the world, somehow getting to the other polarizer. I don't doubt some weird model with action at a distance could be ad-hoc-ed to give the right number via really weird means, but why? I see no reason not to use some generic function (see Marcus Hutter's AIXI) that takes a list of observations and the chosen action and spits out the expected reward (internally the function may contain a multitude of world models and may well do causality). I don't even see why it would need to be "hard coded" that a prior observation (or reward) cannot be influenced by a later action. This all really looks like it will just quietly fall into the dustbin of history.

edit: maybe the problem is that they are being prematurely prescriptive when they should be descriptive, i.e. analyse how humans do it. Also, for philosophy, this is weirdly close to actual engineering. Simple AIs have existed for a long time now. You can now engineer a self-driving car, where the consequences of the car's actions had better take into account the consequences of several such cars next to one another making the same decisions under the same circumstances (in spite of the lack of a causal chain from one car's "choice" to another car's "choice"). edit: which for learning algorithms would happen naturally, because a learning algorithm only cares about correlations.
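
For what it's worth, that 0.900316 figure does check out if you grant the standard quantum prediction that a maximally entangled pair agrees with probability cos²(Δθ); here's a quick numerical check of mine (the closed form works out to 2√2/π):

```python
# Expected paperclip reward for the entangled-photon game described above,
# assuming the standard quantum prediction P(both agree) = cos^2(delta) for
# polarizer angle difference delta, so E[reward | delta] = cos(2*delta).
import math

MY_ANGLE = math.radians(22.5)   # our polarizer setting
FAR_MAX = math.radians(45.0)    # far station angle ~ Uniform(0, 45 degrees)

def expected_reward(far_angle: float) -> float:
    return math.cos(2 * (far_angle - MY_ANGLE))

# Average over the uniformly distributed far angle (midpoint Riemann sum).
N = 1_000_000
avg = sum(expected_reward(FAR_MAX * (i + 0.5) / N) for i in range(N)) / N

print(avg)                          # ~0.900316
print(2 * math.sqrt(2) / math.pi)   # closed form: 2*sqrt(2)/pi ~ 0.900316
```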
I agree with you here, on the last point. The FDT paper truly is the best of the worst; we shall let them have their fun with it, even if it may not be what they claim it to be. Yud's posts on 'rationalist taboo', 'a priori' and 'arguing by definition', I can never forget how bad they were. His series on free will: I cannot forget how badly he misrepresented philosophy in 'Dissolving the question', how he called compatibilism 'requiredism' for *stupid reasons*, how he 'thought' he 'debunked' libertarianism, and how badly he defended his 'requiredism' in general. I was actually happy to see that Yudkowsky has finally decided to actually look at some of the relevant philosophical literature rather than random blogs and random books. What I want to say is that 'rationalists' use this idea to annoy people, saying things like Yud has 'solved rationality' or 'owned those mainstream philosophers.' We need to counter that, if one rationalist comes up to us and says those things and the theory is not what they claim it to be. Otherwise they will keep annoying people, and trick some unfortunate young kiddos into actually thinking Yud has 'solved philosophy'. Anyway, after discussing the theory a bit more, I believe I have enough arguments in hand to get rid of that one prove-me-wrong guy who comes to annoy me (offline). Its problems with counterfactuals and Dr. Schwarz's arguments are, I think, good enough to show that the theory is not what they claim it to be.

The biggest problem is the stipulation of so-called counterlogicals, where the agent needs to consider what would be the case if an a priori truth were false. This is considered a much more serious problem than “regular” counterfactuals, which are usually understood to refer to some alternative possibility that didn’t obtain, because it doesn’t even seem coherent to say counterlogicals are possibilities.

Not really. Counterlogicals are only a problem for agents who have so much detailed information that they are able to make exact, not merely probabilistic, predictions of what will happen next, combined with an inability to compartmentalise the counterfactual scenario from the rest of their mental content. In other words, it's a problem with two solutions, and an agent operating under realistic limitations is likely to have at least one solution available. In further words, they are *still* thinking in terms of incomputable AIXI-like agents.
> Counterlogicals are only a problem to agents who have so much detailed information that they are able to make exact, not merely probabilistic, predictions of what will happen next

I really don't see how this follows. A counterfactual analysis would try to answer, "How would the outcome be different if some prior event had been different?", but a counterlogical one would have to try to answer e.g. "How would the outcome be different if 1+1 = 3?". But this has to be compartmentalized in some very careful (and totally undertheorized) way, because you can deduce anything starting from a contradiction like this. This is a fundamental problem long before you have perfect knowledge of the empirical world.
Why would a counterlogical of that kind arise in a decision theory?
Yudkowsky reformulates decision theoretic questions to treat agents as instantiations of deterministic algorithms. His goal in doing so is to allow for logical (rather than causal) dependencies between a predictor who knows how you think and your future decisions. The analysis for the agent becomes, in his formulation, "What will the outcome be if my decision algorithm outputs X?" But the algorithm is a deterministic mathematical object (and this fact alone is what justifies the relationship between the agent and the predictor), so its output (assuming it has an output, don't know how he handles the halting problem) can be determined deductively. So considering *different* outputs amounts to considering deductive contradictions.
So a decision theorist would consider counterlogicals because logic is being used to model something real ... themself, the world they are embedded in, or both. That is what I was assuming in the first place. And the same solutions therefore apply. Considering an outcome that your decision algorithm would not output is only a contradiction if you include a detailed model of your decision algorithm. If you treat your decision algorithm as a black box, you can still think about the outcomes of actions. You solve the problem by temporarily losing information. And all that is a non-problem for realistic agents, because they don't have perfect information or fully detailed models in the first place.
Couldn't the counterlogicals here be more akin to a proof by contradiction? Except instead of a logical proof that goes "assume X, but X combined with other axioms leads to a contradiction, therefore not-X", this would be more like a motivational proof: "assume I'll do X, but that hypothetical leads me to predict X will have a result that conflicts with what I actually want, therefore I won't do X". Or have I misunderstood what Levinstein/Soares mean when they talk about counterlogicals on p. 11 of the Cheating Death in Damascus paper?
What you're describing is in fact how they would like to use FDT, but it doesn't really get to the core issue. They formalize FDT using Pearl's causal graph "surgery", where there's a graph structure representing our beliefs about causal dependencies between variables, and where we calculate the effect of an intervention by altering the structure of the graph to reflect that the variable we intervene on no longer depends on its usual causal antecedents. This makes decent sense in the case of empirical variables, because if I'm intervening to lower a patient's blood pressure, then I can rule out other "natural" causes of lower blood pressure. In FDT, you need to intervene on a variable whose value is logically determined--not causally induced--by other variables. Instead of severing the variable from its causal antecedents, you sever it from its logical antecedents. So what are its logical antecedents? How would I rigorously separate a term like 1+1 from its logical antecedents in a way that allows me to consider the possibility that 1+1=3? The current theory treats it like an empirical variable and pretty much ignores this question. Yudkowsky acknowledges this, though.
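
For anyone who hasn't seen the graph-surgery idea, here's a bare-bones sketch of mine of the empirical case that does make sense (the blood-pressure example above; all the numbers are made up). The point is that intervening on a variable deletes the arrows from its usual causes, so the intervened value stops being evidence about them:

```python
# Toy Pearl-style "surgery": setting a variable by intervention severs the edge
# from its usual cause, so the intervened value carries no evidence about it.
# Probabilities are illustrative only.
import random

def observational_world():
    """Natural model: an underlying condition tends to lower blood pressure."""
    condition = random.random() < 0.10                       # base rate
    low_bp = random.random() < (0.80 if condition else 0.05)
    return condition, low_bp

def do_low_bp_world():
    """do(low_bp = True): the edge condition -> low_bp has been cut."""
    condition = random.random() < 0.10                       # base rate unchanged
    low_bp = True                                            # set by us, not by the condition
    return condition, low_bp

def p_condition_given_low_bp(world, n=200_000):
    hits = [cond for cond, bp in (world() for _ in range(n)) if bp]
    return sum(hits) / len(hits)

print(p_condition_given_low_bp(observational_world))  # ~0.64: observing low BP is evidence
print(p_condition_given_low_bp(do_low_bp_world))      # ~0.10: intervening tells you nothing
```

The FDT move is to perform the same deletion on the *logical* parents of "the output of my decision algorithm", which is exactly where the "what are its logical antecedents?" question above starts to bite.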
Tbh knowing the libertarian and cryopreservation ideological bents shared by many LWers, I wouldn’t be surprised if the counterlogical stipulation was the result of a generalization of their unwillingness to accept the phrase “there are two certainties in life: death and taxes” — or, you know, an obsession with contrarianism.

FDT is nothing new, even from Yudkowsky - it’s just an extension of the same abstractive logic that led to TDT in the first place.

And what are some responses to it that I can use to get rid of 'that one annoying Yudkowskian'?

I’m not too big on philosophy, but in the spirit of Newcomb, I hereby present a thought experiment in which FDT does worse than CDT.

The experiment is this: just like in Newcomb, there’s an adversarial agent who reads your mind. However, instead of withholding the money if your intent is to 1-box, she will withhold the money if you’re using FDT to make the decision of whether to 1-box or 2-box.

Conclusion: the whole thing is remarkably silly. Thought experiments with mind-reading involved are ripe for abuse.

> What shall I do when a LWer comes to me and says that Yud has solved the problem of rationality by creating FDT?

If this actually happens to you, something is wrong with your social situation.

The ancient Indian economist and strategist Chanakya advised people in his book on "Niti" not to stay in places where there is no river, no income source, no 'rich man' (moneylender), no doctor, no ruler, no wise man, no relative, and where you have no respect. He missed "where there is no LessWronger."

You’re getting scolded a bit here, but I just wanted to say that you’re probably not going to get a better explanation than the Schwarz blogpost.

Minimally, it gives a pretty clear demonstration that LWers talk too much to themselves and don’t have a very good understanding of what’s going on in academia generally (e.g. not knowing what modern CDT and EDT defenders think, misusing Representation theorems, not commenting at all on other novel decision theories that cover similar ground, etc). More importantly, he raises a lot of issues with the central conceits of UDT/TDT/FDT dogma (“We just love winning folks, can’t get enough of that winning”).

After further reading and some explanations by other people, I now better understand Dr. Schwarz's criticism of the theory and the paper, and I admit I was taking the LW cultists too seriously. One more thing I learnt is that apparently you cannot specify the predictor as a black box in FDT, which seems like yet another problem.

I think the basic intuition they get right is that if you want a form of decision theory that applies to agents who are deterministic algorithms (mind uploads living in simulated environments, for example), then in any situation where you are facing a predictor who may have already run multiple simulations of you from the same starting conditions, it doesn’t seem rational to use causal decision theory.

For example, suppose you are a mind upload in a simulated world and are presented with Newcomb’s paradox. Also suppose the simulation is designed in such a way that although the contents of the boxes are determined prior to your making a choice of which to open, the contents of the box have no effect on your simulated brain/body until the boxes are opened, so if the simulation is re-run multiple times with the initial state of your brain/body and everything outside the boxes being identical in each run, you will make the same choice on each run regardless of what is inside the boxes. Finally, suppose the predictor plans to do a first trial run where there is money in both boxes to see what you will choose, then do millions of subsequent runs where the initial conditions outside the boxes are identical to the first one, but the insides are determined by the following rule:

  1. If you chose to open both the box labeled “$1,000” and the box labeled “$1,000,000”, then on all subsequent runs, the box labeled “$1,000,000” will be empty.

  2. If you chose to open only the box labeled “$1,000,000” and leave the other box closed, then on all subsequent runs, both boxes contain the amount of money they are labeled with.

Since you don’t know in advance whether you are experiencing the first run or one of the millions of subsequent runs, but you know that whatever you choose is/was also the choice made on the first run, it makes sense to only open the box labeled “$1,000,000”.
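
A rough tally of that setup, under my own illustrative assumption of one calibration run followed by a million rerun copies who all choose identically:

```python
# Total payout across the calibration run plus all reruns in the scenario above,
# assuming the deterministic upload makes the same choice on every run.
# The rerun count is an illustrative choice.

def total_payout(one_box: bool, reruns: int = 1_000_000) -> int:
    # Calibration run: money is present in both boxes regardless.
    first = 1_000_000 if one_box else 1_000_000 + 1_000
    # Reruns: the $1,000,000 box is filled only if the calibration run one-boxed.
    per_rerun = 1_000_000 if one_box else 1_000
    return first + reruns * per_rerun

print(f"one-box: ${total_payout(True):,}")   # every copy walks away with $1,000,000
print(f"two-box: ${total_payout(False):,}")  # only the calibration copy does well
```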

However, the papers note that you can also justify a one-boxing recommendation from “evidential decision theory”, which is based on making the choice that an outside observer would see as “good news” for you in terms of increasing the probability of a desirable result. And having looked over the papers, it seems to me like a big flaw in both the initial Yudkowsky/Soares paper and the later Levinstein/Soares paper is that in both cases, they rely on ambiguous and ill-defined assumptions when they try to make the argument that there are situations where functional decision theory gives better recommendations than evidential decision theory.

In the initial Yudkowsky/Soares paper, they think FDT is superior to EDT in the “smoking lesion” problem, where we know that the statistical association between smoking and lung cancer is really due to a common cause, an arterial lesion that both makes people more likely to “love smoking” and that in 99% of cases leads to lung cancer (but meanwhile, cancer aside, smoking does cause some increase in utility, though it’s not clear whether this increase is the same regardless of whether people have the lesion or not). They say that in this case EDT says you shouldn’t take up smoking, but that FDT says it’s OK to do so, and that this is fundamentally different from Newcomb’s paradox, arguing “Where does the difference lie? It lies, we claim, in the difference between a carcinogenic lesion and a predictor.” (p. 4) But they never really define what they mean by “predictor”; why couldn’t the presence or absence of this lesion on your artery itself count as a predictor of whether you will take up smoking? Yudkowsky is a materialist, so presumably he wouldn’t define “predictor” specifically in terms of consciousness or intelligence. And even if we do define it that way, we could imagine an alternate scenario where there’s an arterial lesion that still has the same probabilistic effect on whether people will take up smoking but which itself has no effect on cancer rates, coupled with a malicious but lazy predictor who’s determined to kill off future smokers by poisoning them with a slow-acting carcinogen that will eventually cause cancer, and who decides whom to poison based solely on who has the lesion. Would Yudkowsky/Soares really say that this trivial change from the initial scenario, which won’t change the statistics at all, should result in a totally different recommendation about whether to smoke or not?

They also claim that a hypothetical quantitative calculation of utility would favor smoking in the smoking lesion problem, asking us to imagine an agent considering this problem, and to imagine “measuring them in terms of utility achieved, by which we mean measuring them by how much utility we expect them to attain, on average, if they face the dilemma repeatedly. The sort of agent that we’d expect to do best, measured in terms of utility achieved, is the sort who one-boxes in Newcomb’s problem, and smokes in the smoking lesion problem.” (p. 4) However, the scenario as presented doesn’t give enough detail to say why this should be true. We are given specific numbers for the statistical link between having the lesion and getting cancer, but no numbers for the link between having the lesion and propensity to take up smoking, just told that the lesion makes people “love smoking”. It’s also not clear if they’re imagining that there would be some larger set of agents who take up smoking for emotional reasons (just because they ‘love’ it) and for whom the statistical link between having the lesion and smoking would be strong, vs. a special subset who take up smoking for some sort of “purely rational” reasons like knowing all the statistical and causal facts about the problem and then applying a particular version of decision theory to make their choice, such that there would be no correlation between having the lesion and deciding to take up smoking for this special subset. If they are thinking along these lines, I see no reason why we couldn’t get different conclusions from EDT about whether it’s “good news” that someone took up smoking depending on which class they belong to.

The claim that an agent following an EDT strategy would have lower expected utility than one following an FDT strategy also seems dubious on its face since, according to the explicit form of EDT given on p. 3 of the Levinstein/Soares paper, EDT is simply based on an expected utility calculation where we do a weighted sum of utility for each possible outcome of a given action by an agent, weighted by the probability of each outcome. So this would again indicate that if they think EDT does worse, it’s likely because they are artificially limiting EDT to a certain set of agents/actions, as in my guess above that they might be lumping together agents who choose whether to smoke based on feelings alone with agents who make the choice using a particular brand of decision theory, as opposed to only using the latter group in the EDT utility calculation. It would really help if they would give an explicit utility calculation involving all the relevant conditional probabilities for all the relevant classes of agents so we could see exactly what assumptions they make to justify their claim that EDT does worse.
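
For illustration, here is the kind of explicit calculation being asked for; the 99% lesion-to-cancer figure is from the papers, and every other number is a placeholder I made up to show how much the conclusion hinges on the unspecified lesion-to-smoking link:

```python
# Explicit EDT expected-utility calculation for the smoking lesion problem.
# Only the 99% lesion->cancer figure comes from the papers; all other numbers
# are made-up placeholders for the quantities the papers leave unspecified.

P_CANCER_GIVEN_LESION = 0.99
P_CANCER_NO_LESION = 0.01        # placeholder
P_LESION_GIVEN_SMOKE = 0.90      # placeholder: how strongly smoking tracks the lesion
P_LESION_GIVEN_NOT_SMOKE = 0.10  # placeholder
U_SMOKING = 1_000                # placeholder utility of smoking
U_CANCER = -1_000_000            # placeholder disutility of cancer

def edt_value(smoke: bool) -> float:
    """Weighted sum of utility over outcomes, weighted by P(outcome | action)."""
    p_lesion = P_LESION_GIVEN_SMOKE if smoke else P_LESION_GIVEN_NOT_SMOKE
    p_cancer = p_lesion * P_CANCER_GIVEN_LESION + (1 - p_lesion) * P_CANCER_NO_LESION
    return (U_SMOKING if smoke else 0.0) + p_cancer * U_CANCER

print(edt_value(True), edt_value(False))  # -891000.0 -108000.0: EDT says don't smoke
# Set P_LESION_GIVEN_SMOKE equal to P_LESION_GIVEN_NOT_SMOKE (plausible for agents who
# decide by explicit decision theory rather than by "loving" smoking) and EDT recommends
# smoking as well, which is exactly the ambiguity complained about above.
```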

Also note that many of the other examples Yudkowsky gives in his original timeless decision theory paper to support the intuition that EDT can go wrong are similarly ambiguous in terms of whether your own rational use of decision theory might give you an advantage over some larger group who don’t necessarily make their choices that way, like in the “Newcomb’s soda” problem explained starting on p. 11 of that paper. In any of these kinds of problems, if we assume everyone facing the decision is a “mind clone” of yourself–say, if you are an upload and multiple copies were made and given the same test, possibly with some random small differences in their environment to cause some degree of divergence–it’s a lot harder to believe the intuition that EDT is giving the wrong answer about what you should do (like the intuition he describes that it’s better to choose vanilla ice cream in the Newcomb’s soda problem even though EDT recommends choosing chocolate). Yudkowsky does talk about the thought-experiment of copyable mind uploads starting on p. 83 of the timeless decision theory paper, but does not go on to think about the implications of using copies of the same upload in experiments like Newcomb’s soda where he claims EDT goes wrong, only experiments where he does agree with EDT, like the standard Newcomb’s paradox.

(cont.) In the Levinstein/Soares paper they seem to have recognized some sort of problem with the smoking lesion example and so no longer use it to differentiate EDT from FDT, saying in a footnote on p. 3 that "the smoking lesion problem requires the agent to be uncertain about their own desires". But this paper's sole example of a case where EDT differs from FDT is one they call "XOR Blackmail" (this example is also mentioned on p. 24 of the Yudkowsky/Soares paper):

> An agent has been alerted to a rumor that her house has a terrible termite infestation, which would cost her $1,000,000 in damages. She does not know whether this rumor is true. A greedy and accurate predictor with a strong reputation for honesty has learned whether or not it’s true, and drafts a letter:

> "I know whether or not you have termites, and I have sent you this letter iff exactly one of the following is true: (i) the rumor is false, and you are going to pay me $1,000 upon receiving this letter; or (ii) the rumor is true, and you will not pay me upon receiving this letter."

> The predictor then predicts what the agent would do upon receiving the letter, and sends the agent the letter iff exactly one of (i) or (ii) is true. Thus, the claim made by the letter is true. Assume the agent receives the letter. Should she pay up?

In terms of the earlier idea of the predictor doing multiple runs of a deterministic simulation, the basic problem I have with their claims about what EDT vs. FDT would recommend here is that they don't specify whether the goal is A) to get the best outcome for a narrowly-defined "self" which only considers the current run of the simulation that you're experiencing, or B) to maximize utility for a broader collection of alternate selves on alternate runs whose experience may already have diverged from your experience on the current run (specifically because the blackmailer never sent them a letter at all).

A simpler example of this issue can be seen if we imagine repeated runs of a simulated world where I am an agent facing Newcomb's paradox with both boxes transparent. Suppose in this case the predictor follows the same rule I discussed for opaque boxes, first doing a run with money visible in both boxes: if I only take the $1,000,000 on that run then all subsequent runs will be identical, while if I take both the $1,000,000 and the $1,000 then all subsequent runs will have one box with $1,000 and the other empty. If I know this, and I see both boxes full, then there are two possibilities: 1) I take the money from both boxes, leading me to be certain that I was experiencing the first run and that all subsequent copies of me will only see a box with $1,000, or 2) I only take the $1,000,000, and therefore conclude that the first run (whether that's me or not) only took the $1,000,000, and all subsequent runs made the same choice and got $1,000,000. So here, if I only care about a "narrow self" defined exclusively in terms of my current run, it makes sense to take from both boxes, but if I care about maximizing utility for the broader collection of selves, I should only take from one box.

To see that the same considerations would apply to the "XOR Blackmail" scenario, suppose it's similarly happening in a deterministic simulated world including an AI homeowner, and that there will be many runs of the simulation.
On each run, there's a variable in the program that is set at the beginning to either "HasTermites=YES" or "HasTermites=NO", with some fixed probability (say, 70% of runs will have "NO" and 30% will have "YES") that isn't affected by the choices of the blackmailer or the homeowner. The initial value of the variable has no effect on the AI homeowner; only after 10 days have passed will the value of the variable cause a divergence in the simulations, as some copies of the homeowner begin to experience signs of termites and others do not. The blackmailer has no control over the value of the "HasTermites" variable on each run, but they do know whether it's set to "YES" or "NO" on each run, and based on that they can decide whether or not to send the blackmail letter on the 5th day of a given run, before there is any visible evidence that would tell the homeowner whether they have termites. All simulations where the homeowner receives the letter are identical to one another (we don't have different versions of the letter placed at slightly different positions in their mailboxes, for example), so in this setup we can expect that on every run where the blackmailer sends the letter, the homeowner will make an identical choice.

Now suppose the blackmailer uses the following rule (and that the homeowner knows they'll be using this rule). On the first run, the letter is sent regardless of the value of the "HasTermites" variable, and then on all subsequent runs the decision is made like this:

1) If the homeowner paid the blackmailer after getting the letter on the first run, then the letter will be sent on all subsequent runs with "HasTermites=NO", but it will *not* be sent on subsequent runs with "HasTermites=YES".

2) If the homeowner refused to pay the blackmailer after getting the letter on the first run, then the letter will be sent on all subsequent runs with "HasTermites=YES", but will *not* be sent on subsequent runs with "HasTermites=NO".

Note that this rule guarantees that what the letter says is *true* on all the subsequent runs after the first one. But when the scenario is laid out this way, one can see that if you're the homeowner and you receive the letter, the recommended course of action in EDT depends on what group of copies you want to create the best "good news" for. If you take the more "selfish" stance of only wanting to maximize utility for all the copies whose experience is identical to yours up until the moment of decision, i.e. only the subset that *also* received a letter, then it makes sense to pay up; your paying would then be proof that the blackmailer followed course #1 above, which means that, with the possible exception of the first run, all subsequent runs where a copy gets a letter are also runs where "HasTermites=NO". On the other hand, if you take the more "altruistic" stance of trying to maximize utility for *all* copies on all runs of the simulation, including ones whose experience has already diverged from yours because they never received a letter, then you shouldn't pay, since the fraction of runs that have to pay to deal with termites is the same regardless, and paying up just adds slightly to the average expenses over all runs. So it seems that any claimed difference in recommendations between EDT and FDT is due to the implicit assumption that the EDT user is trying to maximize "good news" only for possible selves identical to you, not the broader set of possible selves whose experiences may have diverged from yours in the past.
But EDT is just a broad framework for making decisions that maximize utility; it doesn't dictate what group you're trying to maximize utility for, so it seems like they're inadvertently attacking a strawman version of EDT in order to try to draw a contrast with FDT. (Note that Yudkowsky says [here](https://www.lesswrong.com/posts/szfxvS8nsxTgJLBHs/ingredients-of-timeless-decision-theory) that 'the expected utility formula is actually over a counterfactual on our actions, rather than an ordinary probability update on our actions', so I don't see how he could disagree that in EDT you are allowed to consider utility for a group of counterfactual versions of you whose experience diverged from yours in the past.) And it may be that FDT could *also* be used to arrive at different possible recommendations depending on which group of runs of a given agent-function you consider (runs can diverge in output due to different inputs, like some versions of the homeowner receiving a letter while other versions do not), though I'm not sure about that.

Either way, I'd be skeptical that there are any other scenarios where EDT actually *forces* you to come to a different recommended course of action than FDT, so long as you are free to choose how inclusive a definition of "self" to use when trying to maximize the good news for yourself. So FDT might just be a decision procedure which is conceptually different from EDT but functionally identical; I'm sympathetic to the idea that it may be conceptually useful to highlight the possibility that your choices may be algorithmic, but as u/DaveyJF pointed out, their detailed conceptualization involves applying Pearl's causal graph analysis to "counterlogicals", which seems philosophically problematic (though perhaps it can be justified in terms of Bayesianism, where one can assign subjective probabilities to facts that have a logically determinate answer, like whether some mathematical theorem is true), while EDT's way of analyzing these problems seems more straightforward. Finally, since EDT is just based on probabilities, it seems much more natural to generalize it to less science-fictional scenarios of predictors who are just using [ordinary psychological techniques](https://www.youtube.com/watch?v=U_eZmEiyTo0) rather than running detailed simulations of intelligent beings.
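
If anyone wants to poke at the numbers, here's a crude simulation of mine of the multi-run XOR blackmail setup sketched above; the 30% termite rate matches the example, while the costs and run count are my own choices:

```python
# Crude simulation of the multi-run XOR blackmail setup described above.
# 30% of runs have termites; damages/ransom follow the example; run count is arbitrary.
import random

P_TERMITES, DAMAGE, RANSOM, RUNS = 0.30, 1_000_000, 1_000, 500_000

def simulate(policy_pays: bool):
    """policy_pays: what the deterministic homeowner does whenever a letter arrives.
    Per the blackmailer's rule, after the calibration run letters only go to runs where
    the letter comes out true: termite-free runs if the homeowner pays, infested runs if not."""
    total, letter_total, letters = 0, 0, 0
    for _ in range(RUNS):
        termites = random.random() < P_TERMITES
        letter = (not termites) if policy_pays else termites
        cost = (DAMAGE if termites else 0) + (RANSOM if letter and policy_pays else 0)
        total += cost
        if letter:
            letter_total += cost
            letters += 1
    return total / RUNS, letter_total / letters

for policy in (True, False):
    avg_all, avg_letter = simulate(policy)
    print(f"pay={policy}: avg cost over all runs ~${avg_all:,.0f}, "
          f"over letter-receiving runs ~${avg_letter:,.0f}")
```

Which just restates the point numerically: conditioned on the copies who also got a letter, paying looks vastly better (about $1,000 vs. $1,000,000), but averaged over every copy on every run, refusing comes out slightly ahead.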