AGI Alignment and Karl Popper

I quit the Effective Altruism forum due to a new rule requiring posts and comments be basically put in the public domain without copyright. I had a bunch of draft posts, so I’m posting some of them here with light editing.

On certain premises, which are primarily related to the epistemology of Karl Popper, artificial general intelligences (AGIs) aren’t a major threat. I tell you this as an expert on Popperian epistemology, which is called Critical Rationalism.

Further, approximately all AGI research is based on epistemological premises which contradict Popperian epistemology.

In other words, AGI research and AGI alignment research are both broadly premised on Popper being wrong. Most of the work being done is an implicit bet that Popper is wrong. If Popper is right, many people are wasting their careers, misdirecting a lot of donations, incorrectly scaring people about existential dangers, etc.

You might expect that alignment researchers would have done a literature review, found semi-famous relevant thinkers like Popper, and written refutations of them before being so sure of themselves and betting so much on the particular epistemological premises they favor. I haven’t seen anything of that nature, and I’ve looked a lot. If it exists, please link me to it.

To engage with and refute Popper requires expertise about Popper. He wrote a lot, and it takes a lot of study to understand and digest it. So you have three basic choices:

Do the work.
Rely on the expertise of someone who agrees with you.
Rely on the expertise of someone who disagrees with you.

How can you use the expertise of someone who disagrees with you? You can debate with them. You can also ask them clarifying questions, discuss issues with them, etc. Many people are happy to help explain ideas they consider important, even to intellectual opponents.

To rely on the expertise of someone on your side of the debate, you endorse literature they wrote. They study Popper, they write down Popper’s errors, and then you agree with them. Then when a Popperian comes along, you give them a couple citations instead of arguing the points yourself.

There is literature criticizing Popper. I’ve read a lot of it. My judgment is that the quality is terrible. And it’s mostly written by people who are pretty different from the AI alignment crowd.

There’s too much literature on your side to read all of it. What you need (to avoid doing a bunch of work yourself) is someone similar enough to you – someone likely to reach the same conclusions you would reach – to look into each thing. One person is potentially enough. So if someone who thinks similarly to you reads a Popper criticism and thinks it’s good, it’s somewhat reasonable to rely on that instead of investigating the matter yourself.

Keep in mind that the stakes are very high: potentially lots of wasted careers and dollars.

My general take is you shouldn’t trust the judgment of people similar to yourself all that much. Being personally well read regarding diverse viewpoints is worthwhile, especially if you’re trying to do intellectual work like AGI-related research.

And there aren’t a million well-known and relevant viewpoints to look into, so I think it’s reasonable to just review them all yourself, at least a bit via secondary literature with summaries.

There are much more obscure viewpoints that are worth at least one person looking into, but most people can’t and shouldn’t try to look into most of those.

Gatekeepers like academic journals or university hiring committees are really problematic, but the least you should do is vet stuff that gets through gatekeeping. Popper was also respected by various smart people, like Richard Feynman.

Mind Design Space

The AI Alignment view claims something like:

Mind design space is large and varied.

Many minds in mind design space can design other, better minds in mind design space. Which can then design better minds. And so on.

So, a huge number of minds in mind design space work as starting points to quickly get to extremely powerful minds.

Many of the powerful minds are also weird, hard to understand, and very different from us, including in their moral ideas. They are possibly very goal-directed and possibly significantly controlled by their original programming (which likely has bugs and literally says different things, including about goals, than the design intent).

So AGI is dangerous.

There is an epistemology which contradicts this, based primarily on Karl Popper and David Deutsch. It says that actually mind design space is like computer design space: sort of small. This shouldn’t be shocking since brains are literally computers, and all minds are software running on literal computers.

In computer design, there is a concept of universality or Turing completeness. In summary, when you start designing a computer and adding features, after very few features you get a universal computer. So there are only two types of computers: extremely limited computers and universal computers. This makes computer design space less interesting or relevant. We just keep building universal computers.

Every computer has a repertoire of computations it can perform. A universal computer has the maximal repertoire: it can perform any computation that any other computer can perform. You might expect universality to be difficult to get and require careful designing, but it’s actually difficult to avoid if you try to make a computer powerful or interesting.
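To make the idea of a maximal repertoire concrete, here is a minimal sketch in Python. It is only an illustration, not a formal proof: one short program runs any Turing machine that is handed to it as a table of rules. The names (run_machine, FLIP_BITS, ADD_ONE) and the two example machines are made up for this example, and the step limit is an assumption added for safety, since a real universal computer is idealized as having unlimited time and memory.

```python
from collections import defaultdict

BLANK = "_"

def run_machine(rules, tape, state="start", max_steps=10_000):
    """Run any Turing machine described by a rule table.

    rules maps (state, symbol) -> (new_symbol, move, new_state),
    where move is -1 (left), 0 (stay) or +1 (right).
    The machine stops when it enters the state "halt".
    """
    cells = defaultdict(lambda: BLANK, enumerate(tape))
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:  # assumed safety limit, not part of the theory
            raise RuntimeError("step limit reached")
        new_symbol, move, state = rules[(state, cells[head])]
        cells[head] = new_symbol
        head += move
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(BLANK)

# Hypothetical machine 1: invert every bit, then halt at the first blank.
FLIP_BITS = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", BLANK): (BLANK, 0, "halt"),
}

# Hypothetical machine 2: add one to a binary number.
# Walk right to the end of the number, then carry leftwards.
ADD_ONE = {
    ("start", "0"): ("0", +1, "start"),
    ("start", "1"): ("1", +1, "start"),
    ("start", BLANK): (BLANK, -1, "carry"),
    ("carry", "1"): ("0", -1, "carry"),
    ("carry", "0"): ("1", 0, "halt"),
    ("carry", BLANK): ("1", 0, "halt"),
}

print(run_machine(FLIP_BITS, "1011"))  # 0100
print(run_machine(ADD_ONE, "1011"))    # 1100
print(run_machine(ADD_ONE, "111"))     # 1000
```

Note that run_machine itself never changes. Only the rule table does. That is the sense in which one computer can take on the repertoire of another: the other machine becomes data for it to run.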

Universal computers do vary in other design elements, besides what computations they can perform, such as how large they are. This is fundamentally less important than what computations they can do, but does matter in some ways.

There is a similar theory about minds: there are universal minds. (I think this was first proposed by David Deutsch, a Popperian intellectual.) The repertoire of things a universal mind can think (or learn, understand, or explain) includes anything that any other mind can think. There’s no reasoning that some other mind can do which it can’t do. There’s no knowledge that some other mind can create which it can’t create.

Further, human minds are universal. An AGI will, at best, also be universal. It won’t be super powerful. It won’t dramatically outthink us.

There are further details but that’s the gist.

Has anyone on the AI alignment side of the debate studied, understood and refuted this viewpoint? If so, where can I read that (and why did I fail to find it earlier)? If not, isn’t that really bad?
