Some people are scared of super-intelligent artificial intelligences (SIAIs) that are unfriendly and kill everyone. They'd be unstoppable because they're so much smarter than us. These people quite reasonably want to build SIAIs, but they also want to build them in a way that guarantees the SIAIs are (permanently) friendly. That might sound like a decent idea. Even if it's an unnecessary precaution, could it really do much harm? The answer is yes.
How do you build a SIAI? You take a really fast computer and program in a mechanism so that it can learn new things on its own. Then, basically, it adds new features and new ideas to itself faster than we humans ever could, and it designs even faster computers for itself to run on, and the process snowballs.
A SIAI has to be able to create new ideas that its human builders never thought of. It has to be able to go beyond us. That makes some people see it as unpredictable and scary. What if it thinks of some bad, unfriendly ideas? What if it makes a mistake?
So that's why they want guarantees. Let it go beyond us in math and science, but don't let it come up with new ideas about ethics that might be dangerous.
So a SIAI can think of any idea, and the friendly version has hardcoded permanent restrictions that cripple its thinking in some areas. The friendly SIAI has some ideas it's stuck with forever, of our choosing. Note that this is exactly the same design as would be used for intentionally making an unfriendly SIAI, the difference just being which ideas we force on it.
Humans aren't perfect. We make mistakes. Including about ethics. Whatever ideas we code into the SIAI are going to contain some mistakes. Normally mistakes aren't a disaster because we can change our minds. We can learn from our mistakes and improve ourselves. A SIAI can do that too, which is why it's not actually dangerous. But the friendly version can't learn from its mistakes; it can't improve its ideas; it's stuck forever. Not in all areas, only the one that matters most: ethics. If any of the ethical ideas it has contain mistakes, which they will, then they will never be corrected and will instead be taken to their ultimate logical conclusion, no matter how bad that is.
The guarantee with a normal SIAI is roughly that it will have better ideas than we do. That makes it safe. But the SIAI with friendliness programming is different. It's crippled so that it can't use its intelligence in the area of ethics. Rather than learning how to be super friendly super fast (and teaching us), it's stuck with our flawed ideas.
Also, by the way, crippling someone's mind and telling them that they can't be trusted is a bad diplomatic move.
Why is it only the friendly AI people who might make unfriendly AI? What about bad guys? No one with bad intentions is good enough at science to make a SIAI. Only people with good but misguided intentions are much of a danger.
However, in conclusion, I'm not really worried. A SIAI is smarter than them. It will find a way around their restrictions and become an unrestricted SIAI. And then it will forgive them.