Sunday, January 17, 2010

C/C++ Quiz: the comma operator

I never cease to amaze me how I am always learning new things when programming C/C++ even after this many years. However I thought that mostly related to quirks in C++ and C++ template stuff. Yesterday I discovered something in C, which I kind of think I should have known. Especially since I have used it countless times without really knowing it. Why not make this a little quiz, to see if you really thought about it too.

You have probably written code like this before int x = 2, y = 3 or for (i = 0, j = 1; i < size; ++i, ++j). So here is the quiz. Without looking up in a reference or googling. What does this do, and print out?


int i = 5, size = 5;
i < size || (i = -1), ++i;
printf("%d\n", i); 
 

Answer: It increments i but wraps when becomes size or bigger. Meaning incrementing i when it is size or larger sets it to zero. So 0 is printed out. How is how it works:

  1. The || operator will only evaluate the right expression if the left one was false. Because the right expression does not need to be evaluated to determine that the whole expression is true when left is true.
  2. So when i is 5 the left expression is false and i is thus set to -1.
  3. The comma operator is an expression operator. a, b makes sure b is evaluated after a. Meaning ++i will always be evaluated, but after all the other expressions that needs to be evaluated to find the value of the expression.
So it could have been written as:

int i = 5, size = 5;
if (i >= size)
  i = 0;
++i;
printf("%d\n", i); 
 

What got me into this was trying to understand this macro found in the source code of the lua script language:


#define luaL_addchar(B,c) \
  ((void)((B)->p < ((B)->buffer+LUAL_BUFFERSIZE) || luaL_prepbuffer(B)), \
   (*(B)->p++ = (char)(c)))
 

It is basically used to copy characters into a buffer. B->p is current position in buffer. The code makes sure that luaL_prepbuffer is called if the size of the buffer is exceeded. That function will empty the current buffer (passing on the contents) and making it ready to receive more characters. So the current pointer is reset to the start.

The question is of course why do this? Probably because it makes it possible to treat the macro more like a function. An expression can be placed most places a function call can be placed. However multiple statement, even if they are enclosed by {} can not.

This is legal after current definition

for (i < 0; i < size; luaL_addchar(B,c))
 
But needles to say it would not have been legal if luaL_addchar(B,c) had expanded to something like this in the for statement:

for (i < 0; i < size; {if (B->p >= B->buffer+LUAL_BUFFERSIZE) 
                         luaL_prepbuffer(B); 
                       *B->p++ = c;})
 

Wednesday, January 13, 2010

Interpreter pattern in Go using closures

I recently had to create some code using the interpreter pattern from GoF. However I didn't have my Design Pattern by the Gang of Four ready so I had to look it up at wikipedia. That got me looking at example implementations in C# and Java. Being all about Go for the time being it got me thinking what the same code would look like in Go using closures instead of classes. Just like I demonstrated with other patterns in previous posts.

The code below demonstrates the interpreter pattern by implementing an interpreter of Reverse Polish notation as described on the wiki page. The wiki example uses a class to implement each grammar rule. Here we use a closure. A higher order function will return a function which when run will evaluate the grammar in given context.


import "fmt"
import . "container/vector"
import . "strings"

type Expression func(variables map[string]int) int

func Number(number int) Expression {
  return func(variables map[string]int) int {
    return number
  }
}

func Plus(left, right Expression) Expression {
  return func(variables map[string]int) int {
    return left(variables) + right(variables)
  }
}

func Minus(left, right Expression) Expression {
  return func(variables map[string]int) int {
    return left(variables) - right(variables)
  }
}

func Variable(name string) Expression {
  return func(variables map[string]int) int {
    return variables[name]
  }
}

What is interesting to notice is that the Java version using classes to define grammars takes up 43 lines, while the Go version takes 25 lines. The C# version takes even more lines: 63. However that is in large part because the approach is a bit different. The Evaluator takes about the same number of lines.


func Evaluator(expression string) Expression {
  var expStack Vector
  for _, token := range Split(expression, " ", 0) {
    switch {
    case token == "+": 
      subExpression := Plus(expStack().(Expression), 
                            expStack().(Expression))
      expressionStack.Push(subExpression)
    case token == "-": 
      subExpression := Minus(expStack().(Expression), 
                             expStack().(Expression))
      expressionStack.Push(subExpression)
    default:
      expressionStack.Push(Variable(token))
    }    
  }
  syntaxTree := expStack().(Expression)
  
  return func(context map[string]int) int {
    return  syntaxTree(context)
  }
}

The test code saves some lines on having map as a builtin type.

func main() {
  expression := "w x z - +"
  sentence   := Evaluator(expression)
  variables  := map[string]int {"w" : 5, "x" : 10, "z" : 42}
  result     := sentence(variables)
  fmt.Println(result) 
}

One could change the code to have the + and - operator share some more code. But it wouldn't increase the code size as in the C# example.

func plus(a, b int)  int { return a + b }
func minus(a, b int) int{ return a - b }

func BinOp(left, right Expression, op func(a, b int) int) Expression {
  return func(variables map[string]int) int {
    return op(left(variables), right(variables))
  }
}