Sunday, November 22, 2009

How does Go fit in with the C-family of languages

The C-family of languages is a pretty large group. One could debate which languages belong in it and which don't, but I think it's worthwhile to include C, C++, Objective-C, Java, D and C#.

First we got C++, which tried to extend C into an object-oriented language by taking inspiration from Simula. Then came Objective-C, which took a more dynamic approach to object-oriented programming and essentially tried to embed Smalltalk into C.

Years later we got Java, which tried to retain the simplicity of Objective-C compared to C++ but gave it a more C++-looking syntax. C# tried to create a better Java by using the basic ideas of Java while allowing more of the power of C++.

Right in the middle of all this came D, which tries to be what C++ should have been, or rather could have been if we could throw the nasty bits out.

A lot has been centered around C++, and I feel Go is about going back to the roots. Go feels a lot more like C:

  • Fairly simple and clean language, compared to C++ with its huge feature list
  • No inheritance
  • No classes
  • No constructors and destructors. No RAII
  • You just allocate structs. No code is run as in C++. As in C, you have to create separate initialization functions
  • The standard library is very similar to the C standard library. The way you deal with IO, strings etc. feels much more like C than C++
  • In C++ structs are not just simple structs. They can contain invisible fields you didn't explicitly put there, like a vtable pointer. In Go structs are just like C structs. There is no vtable pointer inside them
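
To make the points about allocation and initialization concrete, here is a minimal sketch. The names Point and NewPoint are made up for illustration; the only convention assumed is that an ordinary function takes the place of a constructor.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Point is a plain struct (hypothetical example type). Its memory holds
// exactly the two fields below and nothing else.
type Point struct {
	X, Y int
}

// NewPoint is a hypothetical initialization function. Go has no
// constructors, so a plain function plays that role.
func NewPoint(x, y int) *Point {
	return &Point{X: x, Y: y}
}

func main() {
	var zero Point      // allocated and zeroed; no user code runs
	p := NewPoint(3, 4) // explicit initialization, as in C
	fmt.Println(zero.X, zero.Y, p.X, p.Y)

	// The struct is exactly as large as its two int fields:
	// there is no hidden vtable pointer.
	fmt.Println(unsafe.Sizeof(zero) == 2*unsafe.Sizeof(int(0)))
}
```

Declaring a variable only gives you zeroed memory. Anything beyond that has to be done by a function you call yourself.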

Unlike Java and C#, there is no virtual machine; Go compiles to native code. In this respect Go is more in the C++ and D camp than in the Java and C# camp. However, like Java and C# and unlike C++, you are not left out in the cold if a crash happens: you get a stack backtrace on the console.

While Java, C# and D have been about keeping the basic ideas from C++ and fixing them, Go seems to be more about bringing the duck typing found in languages like Python and Ruby into the C-family. Or perhaps one could say the designers skipped the whole class and inheritance thing, looked at all the great stuff done with C++ templates, which essentially give compile-time duck typing, and decided that this was the model on which Go's dynamic dispatch should be built.

Benefits of Go's interface type

Go doesn't use inheritance but achieves much the same through interfaces. However, interfaces have some benefits over the traditional approach that I'd like to highlight. Below are two code examples showing a struct or class B with a corresponding interface named A. First the C++ version:
 struct A {
   virtual int alfa(int a) = 0;
   virtual int beta(int b) = 0;  
 };

 struct B : public A {
   int alfa(int a);  
   int beta(int b);  
   int d;
 };

 int B::alfa(int a) {
  return d + a;
 }

 int B::beta(int b) {
  return d + b + 3;
 }
Below is the Go version of the code above:
 type A interface {
  alfa(a int) int
  beta(b int) int
 }

 type B struct {
  d int
 }

 func (m B) alfa(a int) int {
  return m.d + a
 }

 func (m B) beta(b int) int {
  return m.d + b + 3
 }
In C++ we can call the methods defined on the struct B through its interface A like this:
 int x, y;
 B b;
 b.d = 3;
 A *a = &b;

 x = a->alfa(1);
 y = a->beta(2);
Likewise in Go:
 var x, y int
 b := B{3}
 var a A = b
 x = a.alfa(1)
 y = a.beta(2)
On the surface this looks very similar, but there are some notable differences. In C++ the inheritance hierarchy can be arbitrarily deep, and every object of a class with virtual functions carries a hidden pointer to its class's vtable; a virtual call looks up the function pointer in that table. In Go there is no inheritance tree and the struct carries no such pointer. The table of method pointers belongs to the interface value instead, and each method is accessed through a single function pointer. In this way Go is more similar to how you typically create some kind of polymorphism in C, where it is common to define structs containing function pointers that are set depending on type. The other difference is that if you change the Go code to this:
 var x, y int
 b := B{3}
 x = b.alfa(1)
 y = b.beta(2)
Then the calls are resolved statically. There is no function pointer lookup, simply because there are no virtual functions in Go. If you call a method on a struct in Go, the compiler knows exactly which method you are calling. However, if you change the C++ code likewise (see below), you are still making a virtual method call. To be fair, in some cases the compiler can figure out the right method to call. But the bp pointer could have been passed around, and in general you can't know whether it points to a B or to a subclass of B. With Go this problem doesn't exist since structs don't have subclasses, which makes the job much easier for the compiler.
 int x, y;
 B b;
 b.d = 3;
 B *bp = &b;

 x = bp->alfa(1);
 y = bp->beta(2);
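
To tie this back to the duck typing point, here is a small self-contained sketch in Go. A and B are the types from above. C and callBoth are names I made up for illustration. Note that C never mentions A anywhere, yet it can be used wherever an A is expected.

```go
package main

import "fmt"

type A interface {
	alfa(a int) int
	beta(b int) int
}

type B struct{ d int }

func (m B) alfa(a int) int { return m.d + a }
func (m B) beta(b int) int { return m.d + b + 3 }

// C is a hypothetical second type. It does not declare that it
// implements A; having the right methods is enough.
type C struct{}

func (C) alfa(a int) int { return a * 2 }
func (C) beta(b int) int { return b * 3 }

// callBoth (hypothetical helper) dispatches dynamically through
// the interface value.
func callBoth(a A) int { return a.alfa(1) + a.beta(2) }

func main() {
	b := B{3}
	fmt.Println(b.alfa(1))     // static call on the concrete type: 4
	fmt.Println(callBoth(b))   // (3+1) + (3+2+3) = 12
	fmt.Println(callBoth(C{})) // 1*2 + 2*3 = 8
}
```

Only the calls inside callBoth go through a function pointer. The direct call b.alfa(1) is resolved at compile time.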

Better consistency in type system

Having methods on structs be statically resolved also makes it trivial to support methods on types based on basic types like ints and floats. (You declare your own named type on top of int or float64 and attach methods to it.) This isn't possible in C++ or Java, where methods belong to classes, and ints and floats are not classes and have no vtables. In Go you don't need a vtable to have methods, so it is not a problem. In my view this blurs the distinction between basic types like ints and objects, which in Java, for example, are two clearly different things. That is a good thing since it creates better consistency. There are fewer special cases.
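
As a rough sketch of what this looks like in practice: the type Celsius and the interface Converter below are made-up names, not anything from a library. The method is attached to a type whose underlying type is float64.

```go
package main

import "fmt"

// Celsius is a hypothetical named type built on the basic type float64.
type Celsius float64

// Fahrenheit is an ordinary method on Celsius. No vtable is involved;
// a direct call is resolved at compile time.
func (c Celsius) Fahrenheit() float64 {
	return float64(c)*9/5 + 32
}

// Converter is a hypothetical interface. Celsius satisfies it just
// like a struct would.
type Converter interface {
	Fahrenheit() float64
}

func main() {
	boiling := Celsius(100)
	fmt.Println(boiling.Fahrenheit()) // 212

	var c Converter = Celsius(0)
	fmt.Println(c.Fahrenheit()) // 32
}
```

A value of type Celsius is still just a float64 in memory, but it can be passed around behind an interface like any struct.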

How to deal with missing features in Go

A lot of people will probably complain about all those C++ features not found in Go. E.g. without constructors and destructors, how does one handle resources safely through RAII? Go doesn't really need RAII because it has closures. Closures are used extensively in e.g. Ruby to get the same benefits. Here is some code I wrote that opens a file, reads one line at a time and closes the file.
 ReadLines("struct-template.h", func(line string) {
  // Print rather than Printf, so the line is not treated as a format string
  fmt.Print(doStuffWithLine(line))
 })
The ReadLines function was implemented like this:
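 // Needs the packages bufio, fmt and os. Written against the current Go API.
 func ReadLines(fileName string, fn func(line string)) {
  file, err := os.Open(fileName) // opens the file read-only
  if err != nil {
   fmt.Printf("can't open file; err=%s\n", err)
   os.Exit(1)
  }
  // Defer only after the error check, so Close runs on a file that was actually opened
  defer file.Close()

  in := bufio.NewReader(file)
  for {
   s, err := in.ReadString('\n')
   if s != "" {
    fn(s) // also delivers a final line that lacks a trailing newline
   }
   if err != nil {
    break // io.EOF or a read error ends the loop
   }
  }
 }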
This shows another way to get the effect of RAII: defer. A deferred call runs when the surrounding function returns, no matter which return path is taken. Exceptions are not present in Go either, but you can mimic the kind of error handling you do with exceptions by using multiple return values, where one signals the error, together with the named return value feature.
 func ProcessFile(fileName string) (err error) {
  file, err := os.Open(fileName)
  // ...
  return
 }
In the simple example above err is the named return value, so the result of os.Open is stored directly in it. If the function doesn't handle the error in err, the bare return hands it to the calling function automatically.
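
Named return values and defer also combine nicely. The sketch below is only an illustration written against the current Go API: fileSize is a made-up helper, and the file name in main is assumed not to exist. The deferred closure can still update the named result err, so an error from Close is not silently lost.

```go
package main

import (
	"fmt"
	"os"
)

// fileSize is a hypothetical helper that returns the size of a file.
// Both results are named, so a bare return hands back their current values.
func fileSize(name string) (size int64, err error) {
	file, err := os.Open(name)
	if err != nil {
		return // propagates the error from os.Open unchanged
	}
	// Runs when fileSize returns. It may overwrite the named result,
	// but only if no earlier error is already being reported.
	defer func() {
		if cerr := file.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()

	info, err := file.Stat()
	if err != nil {
		return
	}
	size = info.Size()
	return
}

func main() {
	// The file name is an assumption for the example; it should not exist.
	if _, err := fileSize("no-such-file-for-this-example.txt"); err != nil {
		fmt.Println("error propagated to caller:", err)
	}
}
```

The caller still has to check err explicitly. Nothing unwinds the stack the way an exception would.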

First impressions from the Go programming language

I spent the last week or so learning the new programming language released by Google called Go. The first program I have written is a simple text processing tool. That is the kind of tool I've previously written in Python and Ruby. I would say that Python is still better suited for this kind of job. However, that is mainly due to the less functional regular expression and string libraries found in Go. It is also unavoidable that using a statically typed language brings a bit more overhead, in particular with respect to typing.

Compared to regular statically typed languages

However, for a statically typed language I can't find anything that can compare. I could write code in a manner that reminded me a lot of how it feels to write code in a scripting language. Other languages I'm familiar with, like C++, Objective-C and Java, are much more verbose and clunky to use. Those are languages which encourage much more planning and feel better suited for larger applications.

C is a simple language, but it lacks so many features that it becomes cumbersome to do string processing. There is no string class, and a bit too much code is spent managing memory.

Compared to Haskell and C#

Of course there are other statically typed languages like Haskell and C#. Now Haskell is probably a more innovative and elegant language than Go. But it is also a language that requires much more understanding before it can be used productively. Most developers are not intimate with functional languages, especially pure functional languages like Haskell. With Go, on the other hand, I could use the skills I had developed while using languages like Python, C++ and C. That meant I could be productive quite quickly. With C# I am afraid to comment, because so much has happened with that language in the last years and I haven't used it in years. I know at least that the C# I used to use could not compete with Go in ease of use.

Compared to Scheme

I have written text processing utilities in Scheme previously. The whole development process is nicer than in most other languages, I think, mainly because of the interactive style of development that Scheme allows. I could quickly and easily test segments of my code in the interactive shell because everything in Scheme is an expression. In this respect Go is as cumbersome as any other traditional language. You run and test your program one file at a time.

Of course Scheme has the same problem as Haskell. You can't easily reuse programming skills developed while using C and C++ for many years. Reading Scheme code is cumbersome at first, and it takes time to get used to what functions are called and how to print special characters like newline etc.

Advantages of C over C++

It has been claimed that C++ is a better C than C. This is taken to mean that when switching to C++ you can continue to code more or less as you did in C and use a little extra C++ functionality for convenience. The problem with that is that a lot of things which are perfectly safe to do in C are not safe to do in C++. So here is my list of issues not found in C. You can avoid many of these issues in C++ by limiting what features you use. But you never have any guarantees. You can't pick up random C++ code, look at it and be certain whether it is doing something safe or not when e.g. statically initializing a variable.
  • Static initialization is safe in C but not in C++, because in C++ static initialization can cause code to run which depends on other variables having been statically initialized. It can also cause cleanup code (destructors) to run at shutdown in a sequence you can't control.
  • C gives you better control over what happens when your code is executed. When reading C code it is fairly straightforward to decipher when code is getting executed and when memory is just reserved or primitive operations are performed. In C++, on the other hand, you have to deal with several potential problems:
    • A simple variable definition can cause code to run (constructors and destructors)
    • Implicitly generated and called functions. If you didn't define constructors, destructors and operator= you will get them generated for you.
    See hidden cost of C++ or Defective C++
  • C supports variable-sized arrays on the stack (a C99 feature), which are much faster to allocate than arrays on the heap.
  • No name mangling. If you intend to read generated assembly code, this makes that much easier. It can be useful when trying to optimize code.
  • De facto standard application binary interface (ABI). Code produced by different compilers can easily be combined.
  • Much easier to interface with other languages. A lot of languages will let you call C functions directly. Binding to a C++ library is usually a much more elaborate job.
  • Compiling C programs is faster than compiling C++ programs, because parsing C is much easier than parsing C++.
  • Varargs cannot safely be used in C++. They're not entirely safe in C either, but they are much less safe in C++, to the point that they are prohibited in C++ Coding Standards (Sutter, Alexandrescu).
  • C requires less runtime support. Makes it more suitable for low-level environments such as embedded systems or OS components.
  • The standard way to do encapsulation in C is to forward declare a struct and only allow access to its data through functions. This method also creates compile time encapsulation. Compile time encapsulation allows us to change the data structure's members without recompilation of client code (other code using our interface). The standard way of doing encapsulation in C++ (using classes), on the other hand, requires recompilation of client code when adding or removing private member variables.
Disliking C++ is not a fringe thing. It does not mean that one is not capable of understanding complex languages. Quite a lot of respected computer science people and language designers aren't fond of C++. See C++ coders