Two philosophical problems
I'll start with two philosophical problems which I'd like you to consider as intuition pumps, i.e. "a thinking exercise often in the form of a short story - often figurative - that intends to shed light upon a usually difficult topic, conundrum, or mistake".
Parfit's Hitchhiker
Suppose you're out in the desert, running out of water, and soon to die - when someone in a motor vehicle drives up next to you. Furthermore, the driver of the motor vehicle is a perfectly selfish ideal game-theoretic agent, and even further, so are you; and what's more, the driver is Paul Ekman, who's really, really good at reading facial microexpressions. The driver says, "Well, I'll convey you to town if it's in my interest to do so - so will you give me $100 from an ATM when we reach town?". Note that the driver has no means of enforcing any agreement you make: if when you reach town you decide not to give him the $100, he has lost out.
When asked if you'll give him $100, the rational thing to do is to say yes. Unfortunately, Paul Ekman can read you like a book: if you're the sort of person who would renege on the deal once safely in town, he'll know, and he'll drive away without you.
What you really need to do is to be the sort of person who won't renege on your word. Then you can truthfully say you'll pay Ekman, he'll believe you, and you'll both be better off from the transaction.
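To make the payoff structure concrete, here is a minimal sketch in Python. Only the $100 fare comes from the story; the function name and the VALUE_OF_RESCUE figure are illustrative assumptions. Because the driver predicts your disposition perfectly, your disposition, not your in-town decision, determines the outcome:

```python
VALUE_OF_RESCUE = 1_000_000  # assumed utility of not dying in the desert
FARE = 100                   # the $100 from the story

def hitchhiker_payoff(keeps_word: bool) -> int:
    """Payoff to the hitchhiker, given their fixed disposition."""
    lift_given = keeps_word  # perfect predictor: lift iff disposed to pay
    if not lift_given:
        return 0             # left in the desert
    return VALUE_OF_RESCUE - FARE  # rescued, then pays the $100

print(hitchhiker_payoff(keeps_word=True))   # 999900: being trustworthy wins
print(hitchhiker_payoff(keeps_word=False))  # 0: the would-be reneger is never rescued
```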
Another thing that would work is a third party, such as a state, that enforces the contracts people agree to. But in the absence of such an enforcer, often the best the parties can do is try to understand each other's thoughts and predict each other's future behaviour. And indeed a lot of human intelligence, and one main reason we evolved it, is about predicting other humans' behaviour: knowing who can be trusted and who keeps their word.
The presence of a state makes people into the sort of people who keep their contracts. But in the absence of a state, it would be beneficial for someone to be able to change themselves in advance so that:
they always keep their word
other people verifiably know this
Yudkowsky's Blackmailer
A blackmailer has a nasty piece of information which incriminates both the blackmailer and the agent. She has written a computer program which, if run, will publish it on the internet, costing $1,000,000 in damages to both of them. If the program is run, the only way it can be stopped is for the agent to wire the blackmailer $1,000 within 24 hours -- the blackmailer will not be able to stop the program once it is running. The blackmailer would like the $1,000, but doesn't want to risk incriminating herself, so she only runs the program if she is quite sure that the agent will pay up. She is also a perfect predictor of the agent, and she runs the program (which, when run, automatically notifies her via a blackmail letter) if and only if she predicts that the agent would pay upon receiving the blackmail. Imagine that the agent receives the blackmail letter. Should she wire $1,000 to the blackmailer?
In this scenario the blackmailer could have run the program and sent the blackmail letter, or could have sent the blackmail letter without running the program. When the agent receives the letter, if she is the sort of person who does not give in to blackmail, she can reason: "I'm not the type of person who would give in to blackmail, and the blackmailer (who is a very good judge of character) knows this. Therefore, the blackmailer hasn't run the program, and I can safely not pay the $1,000."
But if the agent is the sort of person who would pay the blackmail, then by the same logic she's out of luck: the program has been started and she has to pay the $1,000 to avoid the much bigger loss of $1,000,000. And if blackmailers share lists of good targets, she's even worse off.
The agent would be better off if she could irrevocably commit beforehand to not paying any blackmail, in a way that's known by everyone. Then no rational actor would bother trying to blackmail her.
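Here is the same disposition-level logic as a minimal Python sketch (the function name is a hypothetical illustration; the dollar figures are the ones from the story). With a perfect predictor, the agent's fixed disposition fully determines her loss:

```python
LEAK_COST = 1_000_000  # damage if the information is published
RANSOM = 1_000         # the demanded payment

def agent_loss(pays_blackmail: bool) -> int:
    """Loss to the agent, given her fixed disposition toward blackmail."""
    program_run = pays_blackmail  # perfect predictor: run iff the agent would pay
    if not program_run:
        return 0  # no running program, so any letter can be safely ignored
    # The program is running: paying the $1,000 ransom beats the
    # $1,000,000 leak -- which is exactly why payers get targeted.
    return min(RANSOM, LEAK_COST)

print(agent_loss(pays_blackmail=False))  # 0: refusers are never blackmailed
print(agent_loss(pays_blackmail=True))   # 1000: payers always end up paying
```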
Again, it would benefit the agent to be able to change themselves in advance so that:
they never pay blackmail
other people verifiably know this
Nuclear deterrence
Imagine this scenario:
Lionland and Bisonland are allies. They are both peace-loving countries but unfortunately they are threatened by Bisonland's larger neighbour, the expansionist state of Bearland. Lionland and Bearland both have nuclear weapons, but Bisonland does not. One day Bearland invades Bisonland. Unable to conquer the country by conventional weapons, Bearland launches a nuclear strike on Bisonland.
Now Lionland has a dilemma. It can strike back at Bearland with its own nukes, which might end in a general nuclear war in which both Lionland and Bearland would suffer very serious damage. Or it could allow Bearland to conquer Bisonland, which is bad, as Lionland loses an ally, but not as bad as having half its population killed.
Of course, Bearland's leaders know all this. They know what sort of country Lionland is -- it's an open society, so it would be hard for Lionland to hide its weapons, military doctrine or intentions, and in any case Bearland, like all countries, has an extensive intelligence service geared to finding out what's going on in other countries.
It would have been better for Lionland to have had a declared policy of using nuclear weapons if an enemy attacks any of its allies with them, and to make that policy widely known. If Lionland's leaders succeeded in making potential adversaries believe they would use nukes, then it is much less likely that they would have to and much less likely that they or their allies would be attacked in the first place.
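The commitment logic can be made explicit with a toy backward-induction sketch in Python (all payoff numbers and function names below are illustrative assumptions, not from the scenario). Without commitment, Bearland reasons that Lionland's best response after an attack is to back down, so attacking pays; with a credible commitment, it doesn't:

```python
# Payoffs as (bearland, lionland); higher is better. Numbers are made up.
PAYOFFS = {
    ("no_attack",):          (0, 0),
    ("attack", "back_down"): (10, -10),     # Bisonland conquered, ally lost
    ("attack", "retaliate"): (-100, -100),  # general nuclear war
}

def bearland_decides(lionland_committed: bool) -> str:
    """Bearland attacks iff it expects to come out ahead."""
    if lionland_committed:
        response = "retaliate"  # the commitment binds Lionland's hands
    else:
        # Without commitment, Lionland picks its best response *after*
        # the attack: losing an ally (-10) beats nuclear war (-100).
        response = "back_down"
    attack_payoff = PAYOFFS[("attack", response)][0]
    return "attack" if attack_payoff > PAYOFFS[("no_attack",)][0] else "no_attack"

print(bearland_decides(lionland_committed=False))  # attack
print(bearland_decides(lionland_committed=True))   # no_attack
```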
What Lionland therefore needs is to change itself in advance so that:
it always keeps its word to retaliate in kind if it or an ally is nuked
potential adversaries verifiably know this
One step that might help deter an aggressor would be to write this policy into Lionland's constitution, to teach the underlying game-theoretic reasoning for it in all schools, and to require that all new citizens (for example, at age 16 in citizenship ceremonies) swear an oath to uphold the constitution. People joining the armed forces or civil service might have to swear a similar oath.
Another thing Lionland could do is base some of its nukes in Bisonland under a dual-key arrangement, so that in the event of war it would be ambiguous whether Bisonland's soldiers could physically take control of the weapons and launch them against Bearland without Lionland's explicit approval.
Of course, it would be possible to do both.
See also
Functional Decision Theory on Wikipedia and another explanation here
Newcomb's paradox, aka the one-box / two-box problem