I wrote:

The thing to do [about AI] is figure out what programming constructs are necessary to implement guesses and criticism.

Zyn Evam replied (his comments are green):

Cool. Any leads? Can you tell me more? That's what I have problems with. I cannot think of anything other than evolution to implement guesses and criticism.

the right answer would have to involve evolution, b/c evolution is how knowledge is created. i wonder why you were looking for something else.

one of the hard problems is:

suppose you:

  1. represent ideas in code, in a general way
  2. represent criticism in code (this is actually implied by (1) since criticisms are ideas)
  3. have code which correctly detects which ideas contradict each other and which don't
  4. have code to brainstorm new ideas and variants of existing ideas

that's all hard. but you still have the following problem:

two ideas contradict. which one is wrong? (or both could be wrong.)

this is a problem which could use better philosophy writing about it, btw. i'd expect that philosophy work to happen before AI gets anywhere. it's related to what's sometimes called the Duhem-Quine problem, which Popper wrote about too.

one of my own ideas about epistemology is to look at symmetries. two ideas contradicting is symmetric.

what do you mean by symmetries? how is two ideas contradicting symmetric? could you give an example?

"X contradicts Y" means that "Y contradicts X". When two ideas contradict, you know at least one of them is mistaken, but not which one. (Actually it's harder than that because you could be mistaken that they contradict.)

Criticism fundamentally involves contradiction. Sometimes a criticism is right, and sometimes the idea being criticized is right, and how do you decide which from the mere fact that they contradict each other?

With no additional information beyond "X and Y contradict", you have no way to take sides. And labelling Y a criticism of X doesn't mean you should side with it. X and Y have symmetric (equal) status. In order to decide whether to judge X or Y positively you need some kind of method of breaking the symmetry, some way to differentiate them and take sides.
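
The symmetry can be sketched in Python. This is only an illustration: the Idea class and the mark_contradiction and contradicts helpers are hypothetical names, and nothing here detects real contradictions. It just records pairs which someone has already judged to contradict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Idea:
    text: str


# Hypothetical store of unordered pairs of ideas judged to contradict.
# Using frozenset makes the relation symmetric by construction.
contradictions = set()


def mark_contradiction(x, y):
    contradictions.add(frozenset((x, y)))


def contradicts(x, y):
    return frozenset((x, y)) in contradictions


x = Idea("The bridge is safe to cross")
y = Idea("The bridge will collapse under a car")
mark_contradiction(x, y)
```

The stored fact has no direction: contradicts(x, y) and contradicts(y, x) give the same answer, and neither says which idea is mistaken. Calling y "the criticism" would just be an extra label. Taking sides requires further information.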

Arguments are often symmetric too. E.g., "X is right because I said so" can be used equally well to argue for Y. And "X is imperfect" can be used equally well to argue against Y.

How to break this kind of symmetry is a major epistemology problem which is normally discussed in other terms like: When evidence contradicts a hypothesis, it's possible to claim the evidence is mistaken rather than the hypothesis. (And sometimes it is!) How do you decide?

So when two ideas contradict we know one of them at least is mistaken, but not which one. When we have evidence that seems to contradict a hypothesis we can never be sure that it indeed contradicts it. From the mere fact of contradiction, without additional information, we cannot decide which one is false. We need additional information.

Hypotheses are built on other hypotheses. We need to break the symmetry by looking at the hypotheses on which the contradicting ideas depend. And the question is: how would you do that? Is that right?

Mostly right. You can also look at the attributes of the contradicting ideas themselves, gather new observational data, or consider whatever else may be relevant.

And there are two separate questions:

  1. How do you evaluate criticisms at all?

  2. How do you evaluate criticisms formally, in code, for AIs?

I believe I know a lot about (1), and have something like a usable answer. I believe I know only a little about (2) and have nothing like a usable answer to it. I believe further progress on (1) -- refining, organizing, and clarifying the answer -- will help with solving (2).

Below I discuss some pieces of the answer to (1), which is quite complex in full. And there's even more complexity when you consider it as just one piece fitting into an evolutionary epistemology. I also discuss typical wrong answers to (1). Part of the difficulty is that what most people believe they know about (1) is false, and this gets in the way of understanding a better answer.

My answer is in the Popperian tradition. Some bits and pieces of Popper's thinking have fairly widespread influence. But his main ideas are largely misunderstood and consequently rejected.

Part of Popper's answer to (1) is to form critical preferences -- decide which ideas better survive criticism (especially evidentiary criticism from challenging test experiments).

But I reject scoring ideas in general epistemology. That's a pre-Popper holdover which Popper didn't change.

Note: Ideas can be scored when you have an explanation of why a particular scoring system will help you solve a particular problem. E.g. CPU benchmark scores. Scoring works when limited to a context or domain, and when the scores themselves are treated more like a piece of evidence to consider in your explanations and arguments, rather than a final conclusion. This kind of scoring is actually comparable to measuring the length of an object -- you define a measure and you decide how to evaluate the resulting length score. This is different than an epistemology score, universal idea goodness score, or truth score.

I further reject -- with Popper -- attempts to give ideas a probability-of-truth score or similar.

Scores -- like observations -- can be referenced in arguments, but can't directly make our decisions for us. We always must come up with an explanation of how to solve our problem(s) and expose it to criticism and act accordingly. Scores are not explanations.

This all makes the AI project harder than it appears to e.g. Bayesians. Scores would be easier to translate to code than explanations. E.g. you can store a score as a floating point number, but how do you store an explanation in a computer? And you can trivially compare two scores with a numerical comparison, but how do you have a computer compare two explanations?

Well, you don't directly compare explanations. You criticize explanations and give them a boolean score of refuted or non-refuted. You accept and act on a single non-refuted explanation for a particular problem or context. You must (contextually) refute all the other explanations, rather than have one explanation win a comparison against the others.
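
As a rough illustration of the refuted/non-refuted procedure, here is a minimal Python sketch. The Explanation class, its decisive_criticisms field, and the choose function are hypothetical names invented for this example. The sketch also assumes the hard part (judging which criticisms are decisive in context) has already been done by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Explanation:
    text: str
    # Criticisms already judged decisive in this problem context
    # (hypothetical field; the judging itself is not automated here).
    decisive_criticisms: list = field(default_factory=list)

    @property
    def refuted(self):
        return len(self.decisive_criticisms) > 0


def choose(explanations):
    """Return the single non-refuted explanation, or None if zero or several remain."""
    survivors = [e for e in explanations if not e.refuted]
    return survivors[0] if len(survivors) == 1 else None


candidates = [
    Explanation("Take the highway", ["The highway is closed today"]),
    Explanation("Take the back roads"),
    Explanation("Wait for teleportation", ["No explanation of how it gets us there"]),
]
chosen = choose(candidates)
```

There is no numerical comparison anywhere. If choose returns None, that means either everything is refuted (brainstorm new ideas) or several rivals remain (more criticism is needed). Treating both cases as "no answer yet" is a simplifying assumption of this sketch.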

This procedure doesn't need scores or Popper's somewhat vague and score-like critical preferences.

This view highlights the importance of correctly judging whether an idea refutes another idea or not. That's less crucial in scoring systems where criticism adds or subtracts points. If you evaluate one issue incorrectly and give an idea -5 points instead of +5 points, it could still end up winning by 100 points so your mistake didn't really matter. That's actually bad -- it essentially means that issue had no bearing on your conclusion. This allows for glossing over or ignoring criticisms.
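
To make the point-scoring problem concrete, here is a small Python sketch with made-up numbers. The issue names and point values are assumptions for illustration only.

```python
def total(points):
    """Additive score: just sum the points awarded per issue."""
    return sum(points.values())


# Hypothetical point values per issue; all numbers are invented.
idea_a = {"issue_1": 40, "issue_2": 35, "issue_3": 30, "disputed_issue": 5}
idea_b = {"issue_1": 5, "issue_2": 0, "issue_3": 0}

# The same idea, but with the disputed issue misjudged as -5 instead of +5.
idea_a_misjudged = dict(idea_a, disputed_issue=-5)

correct_total = total(idea_a)
misjudged_total = total(idea_a_misjudged)

winner_correct = "a" if correct_total > total(idea_b) else "b"
winner_misjudged = "a" if misjudged_total > total(idea_b) else "b"
```

Idea a wins either way, by about 100 points, so the judgement on the disputed issue never affected the conclusion. Under the boolean approach, a decisive criticism on that one issue would have refuted the idea outright.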

A correct criticism says why an idea fails to solve the problem(s) of interest. Why it does not work in context. So a correct criticism entirely refutes an idea! And if a criticism doesn't do that, then it's harmless. Translating this to points, a criticism should either subtract all the points or none, and thus using a scoring system correctly you end up back at the all-or-nothing boolean evaluation I advocate.

This effectively-boolean issue comes up with supporting evidence as well. Suppose some number of points is awarded for fitting with each piece of evidence. The points can even vary based on some judgement of how important each piece of evidence is. The importance judgement can be arbitrary; it doesn't even matter to my point. And consider evidence fitting with or supporting a theory to refer to non-contradiction, since the only known alternatives basically consist of biased human intuition (aka using unstated, ambiguous ideas without figuring out very clearly what they are).

So you have a million pieces of evidence, each worth some points. You may, with me, wish to score an idea at 0 points if it contradicts a single piece of evidence. That implies only two scores are possible: 0 or the sum total of the point value of every piece of evidence.
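
A quick Python sketch of why only two scores are possible under that rule. The score function and the weights are hypothetical, and five pieces of evidence stand in for the million.

```python
from itertools import product


def score(fits, points):
    """Score an idea: 0 if it contradicts any evidence, else the full total.

    fits[i] is True when the idea does not contradict evidence item i.
    """
    return sum(points) if all(fits) else 0


# Arbitrary, made-up importance weights for five pieces of evidence.
points = [3, 1, 4, 1, 5]

# Try every possible pattern of fitting or contradicting the evidence.
possible_scores = {
    score(fits, points)
    for fits in product([True, False], repeat=len(points))
}
```

However the weights are chosen, possible_scores contains only 0 and the sum of all the weights, so the score carries exactly one bit of information.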

But let's look at two ways people try to avoid that.

First, they simply don't add (or subtract) points for contradiction. The result is simple: some ideas get the maximum score, and the rest get a lower score. Only the maximum score ideas are of interest, and the rest can be lumped together as the bad (refuted) category. Since they won't be used at all anyway, it doesn't matter which of them outscore the others.

Second, they score ideas using different sets of evidence. Then two ideas can score maximum points, but one is scored using a larger set of evidence and gets a higher score. This is a really fucked up approach! Why should one rival theory be excluded from being considered against some of the evidence? (The answer is because people selectively evaluate each idea against a small set of evidence deemed relevant. How are the selections made? Biased intuition.)

There's an important fact here which Popper knew and many people today don't grasp. There are infinitely many theories which fit (don't contradict) any finite set of evidence. And these infinitely many theories include ones which offer up every possible conclusion. So there are always max-scoring theories, of some sort, for every position. Which makes this kind of scoring end up equivalent to the boolean evaluations I advocated in the first place. Max-score or not-max-score is boolean.

Most of these infinitely many theories are stupid which is why people try to ignore them. E.g. some of the form, "The following set of evidence is all correct, and also let's conclude X." X here is a completely unargued non sequitur conclusion. But this format of theory trivially allows a max-score theory for every conclusion.
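
Here is a toy Python sketch of that theory format. The make_theory and fit_score names are hypothetical, the evidence strings are invented, and "fits" is modelled crudely: a theory is treated as contradicting only the evidence it fails to accept.

```python
def make_theory(evidence, conclusion):
    """Build a theory of the form: 'all this evidence is correct, and also X'."""
    return {"accepts": frozenset(evidence), "conclusion": conclusion}


def fit_score(theory, evidence):
    """One point per piece of evidence the theory does not contradict.

    Toy assumption: a theory contradicts only evidence it fails to accept.
    """
    return sum(1 for item in evidence if item in theory["accepts"])


evidence = [
    "the sun rose today",
    "water boiled at 100 C at sea level",
    "the coin landed heads",
]
conclusions = ["X", "not X", "the moon is made of cheese"]

# Every conclusion, however arbitrary, gets a theory with the maximum score.
scores = {c: fit_score(make_theory(evidence, c), evidence) for c in conclusions}
```

Every conclusion ends up with a max-scoring theory, so the evidence score alone can't separate them. What's wrong with these theories is the unargued conclusion, which the score never looks at.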

The real solution to this problem is that, as Deutsch clearly explained in FoR (with the grass cure for the cold example), most bad ideas are rejected without experimental testing. Most ideas are refuted on grounds like:

  1. bad explanation

I was going to make a longer list, but everything else on my list can be considered a type of bad explanation. The categorizations aren't fundamental anyway, it's just organizing ideas for human convenience. A non sequitur is a type of bad explanation (non explanation). And a self-contradictory idea is a type of bad explanation too. And having a bad explanation (including none) of how it solves the problem it's supposed to solve is another important case. That gets into something else important which is understood by Popper and partly by Rand, but isn't well known:

Ideas are contextual. And the context is, specifically, that they address problems. Whether a criticism refutes an idea has to be evaluated in a particular context. The same idea (as stated in English) can solve one problem and fail to solve another problem. One way to approach this is to bundle ideas with their context and consider that whole thing the idea.
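
One way to sketch that bundling in Python is to key every evaluation on an (idea, problem) pair rather than on the idea alone. The IdeaInContext class and the example judgements below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdeaInContext:
    idea: str     # the idea as stated in English
    problem: str  # the problem it is supposed to solve


# Hypothetical record of bundles already judged refuted.
refuted = {
    IdeaInContext("Use a hammer", "Tighten a screw"),
}


def is_refuted(idea, problem):
    return IdeaInContext(idea, problem) in refuted
```

With this layout, is_refuted("Use a hammer", "Tighten a screw") and is_refuted("Use a hammer", "Drive a nail") can give different answers even though the English statement of the idea is identical.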

Getting back to the previous point, it's only the ideas which survive our initial criticism (including not blatantly contradicting evidence we know offhand) that we take more interest in and start carefully comparing against the evidence and testing experimentally. Testing helps settle a small number of important cases, but isn't a primary method. (Popper only partly understood this, and Deutsch got it right.)

The whole quest -- to judge ideas by how well (degree, score) they fit evidence -- is a mistake. That's a dead end and distraction. Scores are a bad idea, and evidence isn't the place to focus. The really important thing is evaluating criticism in general, most of which broadly relates to: what makes explanations bad?

BTW, what is an explanation? Loosely it's the kind of statement which answers why or how. The word "because" is the most common signal of explanations in English.

Solving problems requires some understanding of 1) how to solve the problem and 2) why that solution will work (so you can judge if the solution is correct). So explanation is required at a basic level.

So, backing up, how do you address all those stupid evidence-fitting rival ideas? You criticize them (by the category, not individually) for being bad explanations. In order to fit the evidence and have a dumb conclusion, they have to have a dumb part you can criticize (unless the rival idea actually isn't so dumb as you thought, a case you have to be vigilant for). It's just not an evidence-based criticism (and nor should the criticism be done with unstated, biased commonsense intuitions combined with frustration at the perversity of the person bringing an arbitrary, dumb idea into the discussion). And how do you address the non-evidence-fitting rival ideas? By rejecting them for contradicting the evidence (with no scoring).

Broadly it's important to take seriously that every flaw with an idea (such as contradicting evidence, having a self-contradiction, having a non sequitur, or having no explanation of how or why it solves the problem it claims to solve) either 1) ruins it for the problem context or 2) doesn't ruin it. So every criticism is either decisive or (contextually) a non-criticism. So evaluations of ideas have to be boolean.

There is no such thing as weak criticism. Either the criticism implies the idea doesn't solve the problem (strong criticism), or it doesn't (no criticism). Anything else is, at best, more like margin notes which may be something like useful clues to think about further and may lead to a criticism in the future.

The original question of interest was how to take sides between two contradicting ideas, such as an idea and a criticism of it. The answer requires a lot of context (only part of which I've covered above), but then it's short: reject the bad explanations! (Another important issue I haven't discussed is creating variants of current ideas. A typical reaction to a criticism is to quickly and cheaply make a new idea which is a little different in such a way that the criticism no longer applies to it. If you can do this without ruining the original idea, great. But sometimes attempts to do this run into problems, such as every variant with the traits needed to address the criticism also ruining the explanation in the original idea.)

Elliot Temple on June 14, 2017

