Alien Implant: Newcomb’s Smoking Lesion

In an alternate universe, on an alternate earth, all smokers, and only smokers, get brain cancer. Everyone enjoys smoking, but many resist the temptation to smoke, in order to avoid getting cancer. For a long time, however, there was no known cause of the link between smoking and cancer.

Twenty years ago, autopsies revealed tiny black boxes implanted in the brains of dead persons, connected to their brains by means of intricate wiring. The source and function of the boxes and of the wiring, however, remains unknown. There is a dial on the outside of the boxes, pointing to one of two positions.

Scientists now know that these black boxes are universal: every human being has one. And in those humans who smoke and get cancer, in every case, the dial turns out to be pointing to the first position. Likewise, in those humans who do not smoke or get cancer, in every case, the dial turns out to be pointing to the second position.

It turns out that when the dial points to the first position, the black box releases dangerous chemicals into the brain which cause brain cancer.

Scientists first formed the reasonable hypothesis that smoking causes the dial to be set to the first position. Ten years ago, however, this hypothesis was definitively disproved. It is now known with certainty that the box is present, and the dial pointing to its position, well before a person ever makes a decision about smoking. Attempts to read the state of the dial during a person’s lifetime, however, result most unfortunately in an explosion of the equipment involved, and the gruesome death of the person.

Some believe that the black box must be reading information from the brain, and predicting a person’s choice. “This is Newcomb’s Problem,” they say. These persons choose not to smoke, and they do not get cancer. Their dials turn out to be set to the second position.

Others believe that such a prediction ability is unlikely. The black box is writing information into the brain, they believe, and causing a person’s choice. “This is literally the Smoking Lesion,” they say.  Accepting Andy Egan’s conclusion that one should smoke in such cases, these persons choose to smoke, and they die of cancer. Their dials turn out to be set to the first position.

Still others, more perceptive, note that the argument about prediction or causality is utterly irrelevant for all practical purposes. “The ritual of cognition is irrelevant,” they say. “What matters is winning.” Like the first group, these choose not to smoke, and they do not get cancer. Their dials, naturally, turn out to be set to the second position.

 

Wishful Thinking about Wishful Thinking

Cameron Harwick discusses an apparent relationship between “New Atheism” and group selection:

Richard Dawkins’ best-known scientific achievement is popularizing the theory of gene-level selection in his book The Selfish Gene. Gene-level selection stands apart from both traditional individual-level selection and group-level selection as an explanation for human cooperation. Steven Pinker, similarly, wrote a long article on the “false allure” of group selection and is an outspoken critic of the idea.

Dawkins and Pinker are also both New Atheists, whose characteristic feature is not only a disbelief in religious claims, but an intense hostility to religion in general. Dawkins is even better known for his popular books with titles like The God Delusion, and Pinker is a board member of the Freedom From Religion Foundation.

By contrast, David Sloan Wilson, a proponent of group selection but also an atheist, is much more conciliatory to the idea of religion: even if its factual claims are false, the institution is probably adaptive and beneficial.

Unrelated as these two questions might seem – the arcane scientific dispute on the validity of group selection, and one’s feelings toward religion – the two actually bear very strongly on one another in practice.

After some discussion of the scientific issue, Harwick explains the relationship he sees between these two questions:

Why would Pinker argue that human self-sacrifice isn’t genuine, contrary to introspection, everyday experience, and the consensus in cognitive science?

To admit group selection, for Pinker, is to admit the genuineness of human altruism. Barring some very strange argument, to admit the genuineness of human altruism is to admit the adaptiveness of genuine altruism and broad self-sacrifice. And to admit the adaptiveness of broad self-sacrifice is to admit the adaptiveness of those human institutions that coordinate and reinforce it – namely, religion!

By denying the conceptual validity of anything but gene-level selection, therefore, Pinker and Dawkins are able to brush aside the evidence on religion’s enabling role in the emergence of large-scale human cooperation, and conceive of it as merely the manipulation of the masses by a disingenuous and power-hungry elite – or, worse, a memetic virus that spreads itself to the detriment of its practicing hosts.

In this sense, the New Atheist’s fundamental axiom is irrepressibly religious: what is true must be useful, and what is false cannot be useful. But why should anyone familiar with evolutionary theory think this is the case?

As another example of the tendency Cameron Harwick is discussing, we can consider this post by Eliezer Yudkowsky:

Perhaps the real reason that evolutionary “just-so stories” got a bad name is that so many attempted stories are prima facie absurdities to serious students of the field.

As an example, consider a hypothesis I’ve heard a few times (though I didn’t manage to dig up an example).  The one says:  Where does religion come from?  It appears to be a human universal, and to have its own emotion backing it – the emotion of religious faith.  Religion often involves costly sacrifices, even in hunter-gatherer tribes – why does it persist?  What selection pressure could there possibly be for religion?

So, the one concludes, religion must have evolved because it bound tribes closer together, and enabled them to defeat other tribes that didn’t have religion.

This, of course, is a group selection argument – an individual sacrifice for a group benefit – and see the referenced posts if you’re not familiar with the math, simulations, and observations which show that group selection arguments are extremely difficult to make work.  For example, a 3% individual fitness sacrifice which doubles the fitness of the tribe will fail to rise to universality, even under unrealistically liberal assumptions, if the tribe size is as large as fifty.  Tribes would need to have no more than 5 members if the individual fitness cost were 10%.  You can see at a glance from the sex ratio in human births that, in humans, individual selection pressures overwhelmingly dominate group selection pressures.  This is an example of what I mean by prima facie absurdity.

It does not take much imagination to see that religion could have “evolved because it bound tribes closer together” without group selection in a technical sense having anything to do with this process. But I will not belabor this point, since Eliezer’s own answer regarding the origin of religion does not exactly keep his own feelings hidden:

So why religion, then?

Well, it might just be a side effect of our ability to do things like model other minds, which enables us to conceive of disembodied minds.  Faith, as an emotion, might just be co-opted hope.

But if faith is a true religious adaptation, I don’t see why it’s even puzzling what the selection pressure could have been.

Heretics were routinely burned alive just a few centuries ago.  Or stoned to death, or executed by whatever method local fashion demands.  Questioning the local gods is the notional crime for which Socrates was made to drink hemlock.

Conversely, Huckabee just won Iowa’s nomination for tribal-chieftain.

Why would you need to go anywhere near the accursèd territory of group selectionism in order to provide an evolutionary explanation for religious faith?  Aren’t the individual selection pressures obvious?

I don’t know whether to suppose that (1) people are mapping the question onto the “clash of civilizations” issue in current affairs, (2) people want to make religion out to have some kind of nicey-nice group benefit (though exterminating other tribes isn’t very nice), or (3) when people get evolutionary hypotheses wrong, they just naturally tend to get it wrong by postulating group selection.

Let me give my own extremely credible just-so story: Eliezer Yudkowsky wrote this not fundamentally to make a point about group selection, but because he hates religion, and cannot stand the idea that it might have some benefits. It is easy to see this from his use of language like “nicey-nice,” and his suggestion that the main selection pressure in favor of religion would be likely to be something like being burned at the stake, or that it might just have been a “side effect,” that is, that there was no advantage to it.

But as St. Paul says, “Therefore you have no excuse, whoever you are, when you judge others; for in passing judgment on another you condemn yourself, because you, the judge, are doing the very same things.” Yudkowsky believes that religion is just wishful thinking. But his belief that religion therefore cannot be useful is itself nothing but wishful thinking. In reality religion can be useful just as voluntary beliefs in general can be useful.

Eliezer Yudkowsky on AlphaGo

On his Facebook page, during the Go match between AlphaGo and Lee Sedol, Eliezer Yudkowsky writes:

At this point it seems likely that Sedol is actually far outclassed by a superhuman player. The suspicion is that since AlphaGo plays purely for *probability of long-term victory* rather than playing for points, the fight against Sedol generates boards that can falsely appear to a human to be balanced even as Sedol’s probability of victory diminishes. The 8p and 9p pros who analyzed games 1 and 2 and thought the flow of a seemingly Sedol-favoring game ‘eventually’ shifted to AlphaGo later, may simply have failed to read the board’s true state. The reality may be a slow, steady diminishment of Sedol’s win probability as the game goes on and Sedol makes subtly imperfect moves that *humans* think result in even-looking boards. (E.g., the analysis in https://gogameguru.com/alphago-shows-true-strength-3rd-vic…/ )

For all we know from what we’ve seen, AlphaGo could win even if Sedol were allowed a one-stone handicap. But AlphaGo’s strength isn’t visible to us – because human pros don’t understand the meaning of AlphaGo’s moves; and because AlphaGo doesn’t care how many points it wins by, it just wants to be utterly certain of winning by at least 0.5 points.

IF that’s what was happening in those 3 games – and we’ll know for sure in a few years, when there’s multiple superhuman machine Go players to analyze the play – then the case of AlphaGo is a helpful concrete illustration of these concepts:

He proceeds to suggest that AlphaGo’s victories confirm his various philosophical positions concerning the nature and consequences of AI. Among other things, he says,

Since Deepmind picked a particular challenge time in advance, rather than challenging at a point where their AI seemed just barely good enough, it was improbable that they’d make *exactly* enough progress to give Sedol a nearly even fight.

AI is either overwhelmingly stupider or overwhelmingly smarter than you. The more other AI progress and the greater the hardware overhang, the less time you spend in the narrow space between these regions. There was a time when AIs were roughly as good as the best human Go-players, and it was a week in late January.

In other words, according to his account, it was basically certain that AlphaGo would either be much better than Lee Sedol, or much worse than him. After Eliezer’s post, of course, AlphaGo lost the fourth game.

Eliezer responded on his Facebook page:

That doesn’t mean AlphaGo is only slightly above Lee Sedol, though. It probably means it’s “superhuman with bugs”.

We might ask what “superhuman with bugs” is supposed to mean. Deepmind explains their program:

We train the neural networks using a pipeline consisting of several stages of machine learning (Figure 1). We begin by training a supervised learning (SL) policy network, pσ, directly from expert human moves. This provides fast, efficient learning updates with immediate feedback and high quality gradients. Similar to prior work, we also train a fast policy pπ that can rapidly sample actions during rollouts. Next, we train a reinforcement learning (RL) policy network, pρ, that improves the SL policy network by optimising the final outcome of games of self-play. This adjusts the policy towards the correct goal of winning games, rather than maximizing predictive accuracy. Finally, we train a value network vθ that predicts the winner of games played by the RL policy network against itself. Our program AlphaGo efficiently combines the policy and value networks with MCTS.

In essence, like all such programs, AlphaGo is approximating a function. Deepmind describes the function being approximated, “All games of perfect information have an optimal value function, v ∗ (s), which determines the outcome of the game, from every board position or state s, under perfect play by all players.”

What would a “bug” in a program like this be? It would not be a bug simply because the program does not play perfectly, since no program will play perfectly. One could only reasonably describe the program as having bugs if it does not actually play the move recommended by its approximation.

And it is easy to see that it is quite unlikely that this is the case for AlphaGo. All programs have bugs, surely including AlphaGo. So there might be bugs that would crash the program under certain circumstances, or bugs that cause it to move more slowly than it should, or the like. But that it would randomly perform moves that are not recommended by its approximation function is quite unlikely. If there were such a bug, it would likely apply all the time, and thus the program would play consistently worse. And so it would not be “superhuman” at all.

In fact, Deepmind has explained how AlphaGo lost the fourth game:

To everyone’s surprise, including ours, AlphaGo won four of the five games. Commentators noted that AlphaGo played many unprecedented, creative, and even“beautiful” moves. Based on our data, AlphaGo’s bold move 37 in Game 2 had a 1 in 10,000 chance of being played by a human. Lee countered with innovative moves of his own, such as his move 78 against AlphaGo in Game 4—again, a 1 in 10,000 chance of being played—which ultimately resulted in a win.

In other words, the computer lost because it did not expect Lee Sedol’s move, and thus did not sufficiently consider the situation that would follow. AlphaGo proceeded to play a number of fairly bad moves in the remainder of the game. This does not require any special explanation implying that it was not following the recommendations of its usual strategy. As David Wu comments on Eliezer’s page:

The “weird” play of MCTS bots when ahead or behind is not special to AlphaGo, and indeed appears to have little to do with instrumental efficiency or such. The observed weirdness is shared by all MCTS Go bots and has been well-known ever since they first came on to the scene back in 2007.

In particular, Eliezer may not understand the meaning of the statement that AlphaGo plays to maximize its probability of victory. This does not mean maximizing an overall rational estimate of the its chances of winning, giving all of the circumstances, the board position, and its opponent. The program does not have such an estimate, and if it did, it would not change much from move to move. For example, with this kind of estimate, if Lee Sedol played a move apparently worse than it expected, rather than changing this estimate much, it would change its estimate of the probability that the move was a good one, and the probability of victory would remain relatively constant. Of course it would change slowly as the game went on, but it would be unlikely to change much after an individual move.

The actual “probability of victory” that the machine estimates is somewhat different. It is a learned estimate based on playing itself. This can change somewhat more easily, and is independent of the fact that it is playing a particular opponent; it is based on the board position alone. In its self-training, it may have rarely won starting from an apparently losing position, and this may have happened mainly by “luck,” not by good play. If this is the case, it is reasonable that its moves would be worse in a losing position than in a winning position, without any need to say that there are bugs in the algorithm. Psychologically, one might compare this to the case of a man in love with a woman who continues to attempt to maximize his chances of marrying her, after she has already indicated her unwillingness: he may engage in very bad behavior indeed.

Eliezer’s claim that AlphaGo is “superhuman with bugs” is simply a normal human attempt to rationalize evidence against his position. The truth is that, contrary to his expectations, AlphaGo is indeed in the same playing range as Lee Sedol, although apparently somewhat better. But not a lot better, and not superhuman. Eliezer in fact seems to have realized this after thinking about it for a while, and says:

It does seem that what we might call the Kasparov Window (the AI is mostly superhuman but has systematic flaws a human can learn and exploit) is wide enough that AlphaGo landed inside it as well. The timescale still looks compressed compared to computer chess, but not as much as I thought. I did update on the width of the Kasparov window and am now accordingly more nervous about similar phenomena in ‘weakly’ superhuman, non-self-improving AGIs trying to do large-scale things.

As I said here, people change their minds more often than they say that they do. They frequently describe the change as having more agreement with their previous position than it actually has. Yudkowsky is doing this here, by talking about AlphaGo as “mostly superhuman” but saying it “has systematic flaws.” This is just a roundabout way of admitting that AlphaGo is better than Lee Sedol, but not by much, the original possibility that he thought extremely unlikely.

The moral here is clear. Don’t assume that the facts will confirm your philosophical theories before this actually happens, because it may not happen at all.