FI Learning

For learning with practice. Posts are not private and could end up on Bing.


I decided to go back to my work on SICP. SICP is the book Structure and Interpretation of Computer Programs, which is here

I had taken a detour from my SICP work to make another website on which to post my SICP answers. I ended up making a website with GitHub Pages, which until then had seemed too scary and complicated for me to try. I did successfully make that website and make some formatting modifications. It is here. I put answers to Exercises 2.40, 2.41 and 2.42 there.

Then I wanted to learn more about html and css and liquid in order to better understand how websites are made. I learned a little about those. But I’ve decided I’d rather be working on SICP for now, although I want to know more about those other things eventually.

My plan for SICP is to go back to the beginning and redo my earlier exercise answers in order to post them to my new website. I’m curious how much I’ll remember of what I did before and if I’ll understand some things better the second time around.

As before, my SICP work is mostly to understand the concepts they’re teaching in the book. But it’s also to practice putting things on a website and to practice writing clearly.

Comments & Events

Anne B
From the Preface to SICP:

Second, we believe that the essential material to be addressed by a subject at this level is not the syntax of particular programming-language constructs, nor clever algorithms for computing particular functions efficiently, nor even the mathematical analysis of algorithms and the foundations of computing, but rather the techniques used to control the intellectual complexity of large software systems.

I like this idea. 

Other things in life besides large software systems are complex. We have techniques to control that complexity.

The first two chapters of the book are Building Abstractions with Procedures and Building Abstractions with Data. Building abstractions is a way to handle complex things.

In Section 1.1 they say:

A powerful programming language is more than just a means for instructing a computer to perform tasks. The language also serves as a framework within which we organize our ideas about processes. Thus, when we describe a language, we should pay particular attention to the means that the language provides for combining simple ideas to form more complex ideas. Every powerful language has three mechanisms for accomplishing this:
  • primitive expressions, which represent the simplest entities the language is concerned with,
  • means of combination, by which compound elements are built from simpler ones, and
  • means of abstraction, by which compound elements can be named and manipulated as units.
We can think of things other than programming languages this way too. Instead of primitive expressions, we can think of little parts. We can combine little parts to make bigger things (compound elements). Abstracting the bigger things means thinking of them as wholes, without considering the parts they’re made from. It’s like collapsing the nodes below them in a tree, as Elliot talks about here and elsewhere.
Anne B
Note that I might change the outline of my computers project as I go. I don't yet know all the things I'll want to learn or how they'll fit together.
Anne B
I continued my work on changing the formatting on the above website. I didn't want to study html and css in detail, but I figured out how to do enough with them to make the changes I wanted:
  • changed the title at the top of the page, making the two parts of it different sizes
  • decided to consistently use code blocks when talking about arguments to procedures
  • made a link to Home on every page
  • changed the color of the text in blockquotes
Feedback welcome on the formatting of the site and pages.

In the tree diagram I posted on March 26, I added a connecting arrow from "attitude" to "SICP Project". I gained some confidence by successfully making these formatting changes that seemed too hard a few weeks ago. One of my goals with the Computers Project is to feel less intimidated by computers.
You can make the answer to exercise 1.3 more readable by defining min and max procedures. I also suggest you name your procedure less ambiguously. The name "sum-larger-squares" could be taken to mean you are summing the two largest squares, rather than summing the squares of the two largest numbers. This difference matters when you call the procedure with negative numbers.
(define (square x) (* x x))

(define (sum-of-squares x y)
  (+ (square x) (square y)))

(define (min x y) (if (< x y) x y))

(define (max x y) (if (> x y) x y))

(define (sum-of-squares-of-two-largest-numbers x y z)
  (sum-of-squares (max x y) (max z (min x y))))
Elliot, Fallible Ideas 👍
Anne B
You can make the answer to exercise 1.3 more readable by defining min and max procedures.

I had been thinking about the readability of my English but not the readability of my code. I agree that your version is more readable than mine. I should start thinking more about code readability.

I also suggest you name your procedure less ambiguously. 

Good catch. I think “sum-larger-squares” is incorrect rather than ambiguous. When I first wrote the procedure I wasn’t thinking about negative numbers. Then later, when I remembered to consider negative numbers as input, I didn’t re-look at my procedure name with that in mind. I like your procedure name suggestion.
Anne B


I want to change my procedure and its name in Exercise 1.3 to give credit to Firebench. How about something like “Thanks to Firebench for some readability suggestions.”? I can’t share a link to this thread because only members can see it.
I don't need credit. I'd rather you didn't credit me. I'm happy my comments helped you.
Anne B 👍
Anne B

Breaking down procedures

I’ve wondered how much to break down procedures into smaller procedures. 

Firebench’s suggested procedure not only is more readable than mine, but also breaks things down more. I thought of a way to break it down even more:

My new answer #1:

(define (largest-of-three-numbers x y z)
  (max (max x y) (max y z)))

(define (second-largest-of-three-numbers x y z) (if (< x z) (min (max x y) z) (max (min x y) z)))
(define (sum-of-squares-of-two-largest-numbers x y z) (sum-of-squares (largest-of-three-numbers x y z) (second-largest-of-three-numbers x y z)))

Does breaking it down like this add too much code for the amount of clarity benefit?

How do I decide how much breaking down to do? 

It seems like the procedure has two main things it's doing: finding the two largest numbers out of three numbers and finding the sum of their squares. 

I thought of another option that does those two things but it seems clunky because it turns individual numbers into a list and then turns the list into a number. My new answer #2:

(define (list-of-largest-two-of-three-numbers x y z)
  (list (max x y)
        (max z (min x y))))
(define (sum-of-squares-of-two-largest-numbers x y z) (sum-of-squares (car (list-of-largest-two-of-three-numbers x y z)) (car (cdr (list-of-largest-two-of-three-numbers x y z)))))

I prefer my new answer #1 and am leaning towards that. But maybe Firebench’s answer is better because it’s the shortest readable answer.
Anne B
I also figured out how to change the colors in the syntax highlighting. The default colors look like this:
I don't like how some are bold and some aren't. And I don't like how similar the blue and the teal are.

The new colors look like this:

These colors fit in better with the other colors on the site.

I chose the black and the indigo, which are close to each other, for special forms and procedures that are already defined in scheme. Then the teal, which is sort of close to those, is for user-defined procedures. I'm not sure what to call the category that comes out orange, since it included both my variables and the already-defined "nil".

If you follow my link in the previous comment to Exercise 1.3, you can see that the syntax highlighting made the procedure names "min" and "max" indigo, even though they are not pre-defined in my version of Scheme. Maybe the parsing isn't perfect, maybe due to different versions of Scheme, or maybe there's something I'm not understanding.

(Note: I say "orange" because it looks orange to me on my screens, but it's actually called "coral".)

Comments welcome about colors and syntax highlighting.
How do I decide how much breaking down to do?

I make these judgments intuitively. I guess my approach is along the lines of "if the code is hard to comprehend then refactor (e.g. extract methods) until it's easy to comprehend". That is, I can read it and "know" it's going to work. Sometimes extracting methods results in code that is harder to comprehend because there's more of it—more boilerplate, more named concepts to hold onto etc.

Other reasons to extract methods are to make it easier to test parts of the logic on their own, and to enable reuse. For example, the "min" and "max" procedures seem  to me like they could be useful in many situations, whereas "second-largest-of-three-numbers" seems to be more narrowly reuseful.
Anne B Thank you.
Anne B
I’m having trouble with part of SICP Exercise 1.7.

They give a set of programs for finding square roots. They ask us to explain this:

Also, in real computers, arithmetic operations are almost always performed with limited precision. This makes our test inadequate for very large numbers.

Here are the programs:

(define (sqrt x)
  (sqrt-iter 1.0 x))
(define (sqrt-iter guess x) (if (good-enough? guess x) guess (sqrt-iter (improve guess x) x)))
(define (improve guess x) (average guess (/ x guess)))
(define (good-enough? guess x) (< (abs (- (square guess) x)) 0.001))

When I try this with very large numbers, the computer gets stuck because the guesses don’t change after a certain point. But I don’t understand why the improved guess is too close to the previous guess for the computer to register them as different but the square of the improved guess is not too close to x for the computer to register them as different. The numbers near x that can be represented in the computer will be even more spread out than the numbers near the guess, but I don’t have an idea of how much more spread out.

The answers below all seem unsurprising to me, but I don’t understand why they’re true, if they are. Anyone have an idea that will help me understand this? Or an idea for how to approach the problem, given what I’ve done so far?

On this page, Owen writes:

For a large number x, and this is where people get confused, the real reason for hanging is because (improve guess x) never actually improve the result because of the limitation of bits, the "improved guess" will simply be equal to "old guess" at some point, results in (- y^2 x) never changes and hence never reach inside the tolerance range. 

Same page, Maggyero:

For large radicands, the procedure sqrt-iter enters an infinite recursion because the tolerance is not scaled up to the large radicands and floating-point numbers are represented with limited precision so the absolute error at that scale is always greater than the tolerance.

Because of rounding errors, the function (improve guess x) can’t improve the guess anymore as the smallest possible difference between guess2 and x is larger than 0.001. This is because with numbers of this order magnitude, the distance between two consecutive floating point numbers is larger than 0.001.

When performing arithmetic operations with limited precision, the difference between two successive numbers increases as the magnitude of the numbers increases. It's common to call this difference ε. For any number x(n) and its limited-precision successor x(n+1) whose ε is greater than or equal to 0.001, good-enough? will never return #t, meaning that a square root implementation which uses good-enough? will never find a value for numbers >= x(n).

The problem with large numbers is a little more tricky. The square root computations must be performed in floating point arithmetic, which has limited precision in CL systems (on the other hand, integer arithmetic is arbitrarily large). Floating point numbers are usually represented with an exponent and a mantissa, and when the exponent grows large, the “quantas” the number can advance in become large. In our example, after a few iterations the number 3.039737E8 was reached and from there the process entered infinite recursion because (improve 3.039737E8) returns 3.039737E8 itself.

The floating-point numbers used in computers are not mathematically equivalent to the real numbers. Real numbers are infinitely many (uncountably infinitely many, to be more precise). A 64-bit floating-point number can only assume 2^64 distinct discrete values. In addition the distribution of these values is not uniform along the real line. Reasonably close to zero they are densely packed. The distance between two consecutive floating-point values gradually increases as their magnitude increases.

At some point this distance may become larger than our tolerance. In this case we are almost guaranteed that our algorithm will never terminate. Almost, because there is a small chance, that the representation of the square of our guess will be the same as the representation of x after rounding in the floating-point unit.

In the example above, the smallest rounding error is much larger than the cited 0.001. This makes it impossible for the condition
(< (abs (- (square guess) x)) 0.001))
to yield #t.
With big numbers, the correct square root is either not exactly representable, or the math operations to improve the guess do not allow the correct answer to be reached exactly. This is due to the way floating point numbers are implemented (with a limited number of bits). The bigger the numbers, the less precision you have in the decimal places.

The upshot is that with big numbers, the square of the best guess may not be within the required accuracy (0.001).
Anne B
It makes sense to me that with big numbers,
  • there is less precision
  • the correct square root is not exactly representable
  • the math operations to improve the guess do not allow the correct answer to be reached exactly

What I don’t understand is, with big numbers, why the best approximation the program comes to (which may not be within 0.001 of the actual square root) usually won’t pass the good-enough? test. It seems to me that since, around the very large radicand, the numbers that the computer can represent are far apart, the square of the guess could easily evaluate to the same number as the radicand.
I see what you mean and I am not sure of the answer. 

My guess is that the result of each arithmetic operation is rounded to a representable value. And when you string several arithmetic operations together, rounding errors will accumulate, thus reducing the chance that the final figure will match the radicand's representable value.

Are you able to single-step through the code during its execution to see what's happening?
Anne B
Is this what you mean? I had forgotten that I could do something like this and it was good for me to look up how to do it.

Here are partial results for two very large numbers.

This one terminates and these are the last few entries:

> (sqrt 9999999999999999999999999999999999999999)

[...] guess:1.0000149206922578e+20 guess-squared:1.0000298416071425e+40 x:9999999999999999999999999999999999999999 x-divided-by-guess:9.99985079530366e+19 guess:1.0000000001113119e+20 guess-squared:1.0000000002226238e+40 x:9999999999999999999999999999999999999999 x-divided-by-guess:9.999999998886881e+19 guess:1e+20 guess-squared:1e+40 x:9999999999999999999999999999999999999999 1e+20

This one gets stuck. These are the last few entries that change and the first few that don't change:

> (sqrt 99999999999999999999999999999999999999999999999999)








You can see in this example that the guess isn't changing. And you can see that the square of the guess is further than 0.001 from x. However, I don't understand why this will almost always be the case. 

Well, I see why if the square of x is different from x, it'll always be further from it than 0.001. The computer doesn't go to that many decimal places with such large numbers. But I don't see why it doesn't round to x.

Maybe this is too hard for me to figure out now? I can leave it as an unanswered question for now.
Isn't it simply the case that there's no representable number just below 1e+25, so when it reaches 1e+25, it can't reduce the guess further?

Coupled with the fact that the multiplication operation for square rounds up, so 1e+25 squared doesn't produce 99999999999999999999999999999999999999999999999999, but 1.0000000000000003e+50, even though the former is representable and would be much closer to the right answer.
Anne B
Hmm. Here's a bigger number where the program does terminate:

> (sqrt 1000000000000000000000000000000000000000000000000000000.001)







Anne B
Does multiplication always round up or something?
Anne B
I didn't find an answer to my question above about SICP Exercise 1.7. I noted my question in my post about it and moved on. If anyone has further ideas about my question, feel free to post them.

I started a document for myself of Unanswered Questions about SICP. I'm hoping this will help me keep track of them better.
Anne B
SICP Exercise 1.10 gives this procedure:

(define (A x y)
  (cond ((= y 0) 0)
        ((= x 0) (* 2 y))
        ((= y 1) 2)
        (else (A (- x 1)
                 (A x (- y 1))))))

I wondered what a tree for using this procedure would look like. I came up with a series of trees for the simple example of (A 2 2).

Any thoughts on other ways to represent this kind of thing?
I find that tree representation easy to follow.
Anne B
My answer to SICP Exercise 1.13 is here. My intended audience is other people who are doing the SICP exercises.

This particular exercise is about math, not about coding, in case that makes a difference in whether you want to read it.

I especially want feedback on this exercise because I did parts of it differently than other people did and other people mostly didn’t do it the same as each other. So I have no check on whether I did it correctly other than my own reading of what I and others wrote. For correctness, I wonder more about the second part than about the first part. For clarity, I wonder about all of it. 

This is the answer I wrote to this exercise last year. I think my new answer is better organized, in part because of the outline I put at the top.  I made lots of edits, trying to make it more clear. But it still doesn’t seem super clear to me.

Here are links to some other people's answers:
Alisa Zinov’yevna Rosenbaum
Your solution to 1.13 looks good to me.