Thursday, October 22, 2009

OOP maybe should have been called Oops

http://www.informit.com/articles/article.aspx?p=1327762

"Designing object-oriented software is hard—designing reusable object-oriented software is even harder. Experienced object-oriented designers will tell you that a reusable and flexible design is difficult, if not impossible, to get "right" the first time, and that multiple attempts at reuse with subsequent redesigns are normal."

Now there's a statement about Object Oriented Design I can agree with :-)

Yes, I realize that Design Patterns tried to give a set of reusable designs for certain problem cases or environments, and that's all great and whatnot, but I think software design perhaps has to abstract these ideas even further, into "modules of services".

It shouldn't matter so much whether you use a methodology like OOP or Functional Programming, or any old imperative approach: as long as the pieces are small and purposeful, and the code is written clearly enough, it should be fairly straightforward to adapt it to new situations.

That said...

I've had success writing a service in Haskell that doesn't really use much in the way of object oriented design. In fact, it's mainly done with monads, and an EDSL that looks a lot like "Expect". I've been able to tweak this EDSL so that I can write the solution to the problem I'm solving in the language *I* want.

Because the program is designed to hand all communication off to an external program, it's isolated from the mechanics of communication, and only implements the logic around parsing data and issuing commands.
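
As a rough illustration only (this is a hedged sketch, not the actual service code; Session, expect, send, and the two-handle transport are all hypothetical names), an expect-style EDSL along these lines might look something like this:


import System.IO (Handle, hGetLine, hPutStrLn)
import Data.List (isInfixOf)

-- Hypothetical sketch, not the actual service code.
-- A session action runs against a pair of handles supplied by some
-- external transport (ssh, socat, a serial line...); the logic layer
-- never knows which one it's talking over.
newtype Session a = Session { runSession :: (Handle, Handle) -> IO a }

instance Functor Session where
    fmap f (Session m) = Session (fmap f . m)

instance Applicative Session where
    pure x = Session (\_ -> return x)
    Session mf <*> Session mx =
        Session (\hs -> mf hs >>= \f -> fmap f (mx hs))

instance Monad Session where
    Session m >>= k =
        Session (\hs -> m hs >>= \x -> runSession (k x) hs)

-- Block until a line containing the pattern arrives.
expect :: String -> Session String
expect pat = Session go
  where
    go hs@(inp, _) = do
        line <- hGetLine inp
        if pat `isInfixOf` line then return line else go hs

-- Issue a command over the output handle.
send :: String -> Session ()
send cmd = Session (\(_, out) -> hPutStrLn out cmd)


With primitives like those, the problem logic reads like an Expect script; a hypothetical login exchange:


login :: String -> String -> Session ()
login user pass = do
    _ <- expect "login:"
    send user
    _ <- expect "Password:"
    send pass
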

This has proven very powerful for testing systems over various interconnects. I've used a bizarre combination of socat hooked up to serial ports, looped back through another program speaking a proprietary embedded serial streaming protocol, and back up into the Haskell code, all with exactly the same ease as launching an ssh client session or running socat straight across a serial line.

Thursday, October 15, 2009

Wednesday, October 14, 2009

I don't really know what to say...

...except that Tim Newsham is the man.

Out of the blue one day he wrote up a very nice 9P implementation in Haskell. I started to look at it, then ran out of time before I had to start working on other things. A few days ago I thought I'd get back to it, but that didn't work out.

So a couple of days ago he sends me *another* implementation of 9P in Haskell. Instead of just tackling straight 9P, he goes and does 9P.u too (the Unix extensions, involving extra fields for the stat message and such). And not only that: he adds an example client program that works with a 9P service running on the Google Android platform, reading sensor data from the phone.

Now that's cool... I need to try it against some of my Inferno service stuff.

Tuesday, October 6, 2009

More Y

Y is trivial to write in Haskell. The Wikipedia entry on fixed point combinators shows that, via some reduction steps, you end up with the simple definition: Y g = g (Y g)

To write this in Haskell is really straightforward:


let y g = g (y g)


Yes. Just that easy. (It works because Haskell is lazy; the inner y g is only evaluated on demand. A strict language needs the eta-expanded form used in the Scheme post below.)

The type of the function y is now:


y :: (t -> t) -> t


That is to say, y is a function that takes a function from type "t" to type "t", and returns a value of type "t".

This type is consistent with being a fixed point calculator for functions (see the top of the Wikipedia article linked earlier for an explanation of what a fixed point of a function is).
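
A quick sanity check of that reading: every value is a fixed point of a constant function, so applying y to (const 42) should just hand back 42, and thanks to laziness it does, without looping:


(y (const 42))
42
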

Can this be used to compute a fibonacci function from a fibonacci step? You bet!


let fs = (\n -> (\x -> if x <= 1 then 1 else (n (x - 1)) + (n (x - 2))))


Now, just like before in Scheme, we can use this to manually unroll the recursion via the "n" argument, which is the function to be run next.


((fs (fs (fs (fs (fs undefined))))) 5)
8


So let's run it with Y


((y fs) 5)
8


or something larger

((y fs) 10)
89


Strictly speaking, writing trace in a pure functional language is not allowed, unless you break the rules. Debug.Trace.trace lets you do this, and since I'm tracing anyway...


let tracestep label = (\f -> (\g -> (\n -> let res = (f g) n in (Debug.Trace.trace (label ++ " " ++ (show res)) res))))


Can I still compose this with "y"?


((y (tracestep "fib" fs)) 5)
fib 1
fib 1
fib 2
fib 1
fib 3
fib 1
fib 1
fib 2
fib 5
fib 1
fib 1
fib 2
fib 1
fib 3
fib 8
8


Yes!

I was going to write about memoizing this as in the Scheme example, but I'm quite tired, and my wife and I watch So You Think You Can Dance, so that's what I'm going to do instead.

Monday, October 5, 2009

Some Scheme and Y

Y combinators are pretty cool. They allow for the definition of recursive code without needing support for self-reference directly.

You can find some good links about it here (JavaScript), here (Wikipedia), and there's a ton of results if you search with Google.

In a nutshell it's a fixed point combinator that can generate a recursive function from a function that is not defined recursively.

Take the following tried and true example of recursion, the factorial!



(define factorial
  (lambda (x)
    (cond ((= x 0) 1)
          (else (* x (factorial (- x 1)))))))



factorial is a function defined in terms of itself (references factorial).

What if this wasn't allowed in your language though? How could you write this?

Well if you had the Y combinator you could write a step of the factorial calculation and add an argument which would be the next operation to run after this step is complete.


(define factstep
  (lambda (next)
    (lambda (n)
      (if (= n 0) 1 (* n (next (- n 1)))))))


Now if you want to use factstep to compute factorials, you must first pass in the "next" operation, which yields another function that takes the factorial's argument.


; factorial 1 (the innermost 'blah stands in for "next" and is never invoked)
((factstep (factstep 'blah)) 1)

; factorial 3
((factstep (factstep (factstep (factstep 'blah)))) 3)


As you can see this gets really tedious without recursion.

Enter the Y combinator:


(define Y
  (lambda (le)
    ((lambda (f) (f f))
     (lambda (f)
       (le (lambda (x) ((f f) x)))))))


Looks like complete nonsense if you don't know how it's derived. There are many websites that go into that, so I'm not going to. I'm looking for a practical use of this thing.

Well it turns out by applying Y to the function "factstep" you can create the factorial function you're after.


(define factorial (Y factstep))


Now (factorial 6) still yields 720. (Yes Y seems quite magical at this point if you don't understand it, so I suggest reading up on that if it is confusing to you)

So what good is this if we can just define functions recursively to begin with?

Well what if we want to weave in some other behaviors?

Let's look at a more interesting primitive recursive function like fibonacci.


(define fibstep
  (lambda (next)
    (lambda (n)
      (cond
        ((<= n 1) 1)
        (else (+ (next (- n 1)) (next (- n 2))))))))



This is known to be horribly inefficient, as it redoes a lot of the same work. Want to watch how inefficient it is? Ok, let's add a trace step.


(define tracestep
  (lambda (label f)
    (lambda (g)
      (lambda (n)
        (let ((res ((f g) n)))
          (display label) (display " ") (display res) (newline)
          res)))))


This simply passes through the previous steps, stores an intermediate value, and prints it using a label that's passed in. Here's how to use it.


(define traced-fib (Y (tracestep "fib" fibstep)))


Now if we run it for say "traced-fib 6"


> (traced-fib 6)
fib 1
fib 1
fib 2
fib 1
fib 3
fib 1
fib 1
fib 2
fib 5
fib 1
fib 1
fib 2
fib 1
fib 3
fib 8
fib 1
fib 1
fib 2
fib 1
fib 3
fib 1
fib 1
fib 2
fib 5
fib 13
13


Look at all of those steps!!! Clearly this sucks, and if you try something like (traced-fib 100) you'll be waiting a long time for completion. It expands its "recursion" via Y like a bomb!

How can we improve on this? Well, without modifying fibstep, we can add memoization. First I'll make a global memoization table (forgive the possibly poor choice of data structures here).

Memoization is a technique for avoiding computing the same value twice, which, per the trace, we're doing a lot above. (Again, this has been explained on the web a lot; searching around for why it works may prove enlightening.)


(define memoization-table (make-vector 100 0))


This is a vector of 100 0s. I'm only going to memoize 100 values of this computation statically.

So now I'll write the memoization step.

(define memoize
  (lambda (f)
    (lambda (g)
      (lambda (n)
        (let ((ref (vector-ref memoization-table n)))
          (if (= 0 ref)
              (let ((val ((f g) n)))
                (vector-set! memoization-table n val)
                val)
              ref))))))


The memoization table stores f(i) at index i, where f(i) is the ith fibonacci number: f(1) = 1, f(2) = 1, f(3) = 2, f(4) = 3... etc.

Basically, once we compute the value for i, we insert it into the table so we can look it up again later. If the entry for i is 0, it means we haven't computed it yet, but we'll only ever do so one time.

So how do I stack all of this up with Y?


(define traced-memo-fib (Y (memoize (tracestep "fib" fibstep))))


or if you don't want tracing


(define memo-fib (Y (memoize fibstep)))


Let's watch it run for 6 now


> (traced-memo-fib 6)
fib 1
fib 1
fib 2
fib 3
fib 5
fib 8
fib 13
13


How about something bigger?


> (traced-memo-fib 8)
fib 21
fib 34
34


It didn't have to recompute anything up through 13, so it just did fib(7) and fib(8). If we reset the memoization table (by reloading the program) we can see all the steps again.


> (traced-memo-fib 8)
fib 1
fib 1
fib 2
fib 3
fib 5
fib 8
fib 13
fib 21
fib 34
34


Compare this to the unmemoized fibonacci:


> (traced-fib 8)
fib 1
fib 1
fib 2
fib 1
fib 3
fib 1
fib 1
fib 2
fib 5
fib 1
fib 1
fib 2
fib 1
fib 3
fib 8
fib 1
fib 1
fib 2
fib 1
fib 3
fib 1
fib 1
fib 2
fib 5
fib 13
fib 1
fib 1
fib 2
fib 1
fib 3
fib 1
fib 1
fib 2
fib 5
fib 1
fib 1
fib 2
fib 1
fib 3
fib 8
fib 21
fib 1
fib 1
fib 2
fib 1
fib 3
fib 1
fib 1
fib 2
fib 5
fib 1
fib 1
fib 2
fib 1
fib 3
fib 8
fib 1
fib 1
fib 2
fib 1
fib 3
fib 1
fib 1
fib 2
fib 5
fib 13
fib 34
34


Yikes! That's a lot of waste there.

What I've been able to do with Y is write the steps of a computation in near isolation from each other, and then compose them with Y into a useful function (arguably... it's just fibonacci), optimizing and augmenting behavior by weaving the steps together.

Now I'm not sure what this means for Y in terms of practicality, but it's kind of neat. Obviously one has to obey certain rules to use Y compositionally, but it's totally possible for some classes of problems. I'm not sure how my coworkers would react to my using this in any sort of production code. Could anyone read it? :-)

Ah well, that's why this blog is just stuff that interests me.

Sunday, October 4, 2009

A few random thoughts

In conversations with people, it occurs to me that the concept of Functional Programming really scares some folks off. I feel this is the effect of the object oriented paradigm having been sold as the almighty hammer, with the whole world a nail. Shame on us.

It's interesting to note that people often know more functional programming than they realize. Take the C++ STL, for example. It's built around generic algorithms that supply the outermost layer of an algorithm, requiring the coder to specify only the smaller steps that make the outer algorithm mean anything useful at all. Even in C we have qsort as a standard call (and mergesort on BSD systems). These are a great start to addressing the "who needs Functional Programming?" question.

The Standard C Library's qsort is a well documented and useful sort function with some very interesting properties, or no one would have bothered putting it into a library. Implementing it correctly the first time, from memory, is not likely to happen, so to save us time the library implements all of the tricky outer logic for us, requiring only a pointer to a function that implements the comparison, used to determine whether one item in the collection being sorted is less than, greater than, or equal to another.

I do not think anyone can argue that this isn't a very powerful concept. With it, one can create an arbitrarily complex data type, and as long as one can specify the comparison criteria, one gets qsort without doing the heavy lifting. Isn't that what OOP was supposed to provide? Isn't this the same reuse goal we want from libraries to begin with?

Note that this also means one can "lie" to qsort about which value should come first: by simply swapping the values the compare function returns for less-than and greater-than, one can reverse-sort a collection (descending order vs. ascending order), again without touching the outer, more complex qsort code.
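
To tie it back to Haskell (the language used elsewhere on this blog), the same trick works with sortBy from Data.List: the comparison is just an argument, and flipping it reverses the order. The people list here is made-up sample data:


import Data.List (sortBy)
import Data.Ord (comparing)

people :: [(String, Int)]
people = [("alice", 33), ("bob", 25), ("carol", 41)]

-- Sort by the age field, ascending; then "lie" about the comparison
-- (flip its arguments) to get descending order from the same sort.
ascending, descending :: [(String, Int)]
ascending  = sortBy (comparing snd) people
descending = sortBy (flip (comparing snd)) people
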

How great is that?

The STL and the C++ standard library let one define the "allocation" step for getting more memory as a separate "policy" from the rest of the code. I've written my own C++ compliant allocators that get memory from "mmap" instead of malloc, and then run benchmarks against each just for fun. The algorithms for the collections are the same code; the only difference is dropping in a template parameter for the allocation class. This, even though it's C++, is not solely the domain of OOP languages. In fact, Functional Programming is built for exactly this kind of flexibility.

Is it any wonder, then, that people claim Functional Programming is about managing complexity by thinking about operations rather than the data?

I've written a good bit of Erlang code over the last 2 years, and there are no mutable variables (sort of) in that language, so to replace a value in a list I basically have to emit an entirely new list with the value in question updated. I could have written all of that code myself using primitive recursion, counting an index down to 0 or 1, changing the value in question, and appending the rest of the list. But with functions like "foldl" and "foldr" that's not necessary: I can write just the "guts" of the algorithm and let the traversal of the list be handled by a well-defined, reusable function. For lists this may seem a trivial operation, but consider that one could fold over a tree in a few different traversal orders, and I think one can begin to see the power of this approach.
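
As a sketch of the same pattern in Haskell (replaceAt is my own name for illustration, not a library function): the fold owns the list traversal, and we supply only the per-element decision.


-- Rebuild the list with a fold, swapping in the new value at index i.
-- The fold drives the traversal; the step function only decides what
-- each position contributes.
replaceAt :: Int -> a -> [a] -> [a]
replaceAt i v = foldr step [] . zip [0..]
  where
    step (j, x) acc = (if j == i then v else x) : acc

-- replaceAt 2 99 [1,2,3,4,5]  ==>  [1,2,99,4,5]
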

If all of that evidence doesn't turn one on to at least appreciating Functional Programming, perhaps the sheer growth of its popularity at well respected software institutions, and its application to mission critical software, could convince one to give it a second look.