Saturday, February 04, 2006

Object Oriented programming in C

Object Oriented programming in C has been covered by many others before, and whole frameworks are in fact based on it, e.g. Gtk (the API on Gnome) and Core Foundation (an API on OS X). So I am not going to show how every aspect of Object Oriented Programming (OOP) can be done, just a few, to demonstrate that it is in fact possible to do OOP in C. Perhaps it will then be clearer what the benefits of OOP are, and that it is more a matter of style and a way of thinking than of language syntax.

Of course C++ and Java weren't created for no reason. They have syntax that assists one in thinking about and designing programs around Object Oriented (OO) principles. But that is all they do: assist. They do not magically make your program OO.

Procedural example

To understand the benefits of OOP, let's take an example of a piece of code written in a procedural way, and then later in an OOP way. Our task is simple. We want to display a button on the screen that the user can click. When the user clicks the button, the program exits. We assume here that we have the needed functions, and we skip initializing the graphics hardware etc.
void draw_box(int x, int y, int width, int height);
void draw_string(int x, int y, const char *text);
/* Takes pointers so that it can report the mouse position in *x and *y. */
int is_mouse_button_down(int *x, int *y);
int is_inside(int x, int y, int x1, int y1, int width, int height);

int main(int argc, char const *argv[])
{
    draw_box(10, 10, 60, 30);
    draw_string(12, 12, "Quit");

    int done = 0;
    while (!done) {
        int x, y;
        /* Pass the addresses of x and y, not their (uninitialized) values. */
        if (!is_mouse_button_down(&x, &y))
            continue;
        if (is_inside(x, y, 10, 10, 60, 30))
            done = 1;
    }
    return 0;
}
We assume here that the functions declared above main() exist and do what their names imply. The program simply draws a box that represents our button on screen and then draws a string on top of it, so that the button gets a label/caption. It is obvious that if we want to create more than one button, it is tedious to keep passing almost the same coordinates again to draw the string on the button. So we create a new function, draw_button(), which gives us the following code:
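#include <stdio.h>

/* Draws the box and its caption in one call. */
void draw_button(int x, int y, int width, int height, const char *text);

int main(int argc, char const *argv[])
{
    /* Width 60 and height 30, matching the is_inside() checks below. */
    draw_button(10, 10, 60, 30, "Quit");
    draw_button(50, 10, 60, 30, "Button1");

    int done = 0;
    while (!done) {
        int x, y;
        if (!is_mouse_button_down(&x, &y))
            continue;
        if (is_inside(x, y, 10, 10, 60, 30))
            done = 1;
        if (is_inside(x, y, 50, 10, 60, 30))
            printf("Clicked Button1\n");
    }
    return 0;
}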
However, let us say that each time one clicks Button1, we want it to move 5 pixels down and to the right. We can change the code as follows:
int main(int argc, char const *argv[])
{
    int bx = 50, by = 10;   /* current position of Button1 */

    draw_button(10, 10, 60, 30, "Quit");
    draw_button(bx, by, 60, 30, "Button1");

    int done = 0;
    while (!done) {
        int x, y;
        if (!is_mouse_button_down(&x, &y))
            continue;
        if (is_inside(x, y, 10, 10, 60, 30))
            done = 1;
        if (is_inside(x, y, bx, by, 60, 30)) {
            bx += 5;
            by += 5;
            draw_button(bx, by, 60, 30, "Button1");
            printf("Clicked Button1\n");
        }
    }
    return 0;
}
Now there are a number of problems with this code.
  1. There is no enforced relationship between drawing the button and checking for clicks in it. It is possible to change the drawing position of a button and forget to update the mouse check code as well.
  2. If we start making more buttons, say 20 more, each of which moves in a different direction when clicked, it becomes a nightmare to keep track of where each button is.
Object Oriented Approach

Instead of thinking of the problem in terms of actions, that is, what needs to be drawn, where, and so on, one should focus on the button as an entity or unit in the program. That is, we should move to a more data-centric way of thinking. Instead of thinking about drawing, checking for mouse clicks, etc., we think about creating, placing and moving button objects.
#include <stdio.h>
#include <stdlib.h>

/* draw_button(), is_inside() and is_mouse_button_down() are assumed to exist, as before. */

typedef struct {
    int x, y;
    int width, height;
    const char *caption;
} Button;

void button_draw(Button *this)
{
    draw_button(this->x, this->y, this->width, this->height, this->caption);
}

Button *button_create(int x, int y, int width, int height, const char *caption)
{
    Button *b = malloc(sizeof(Button));
    b->x = x;
    b->y = y;
    b->width = width;
    b->height = height;
    b->caption = caption;
    button_draw(b);
    return b;
}

int button_is_inside(Button *this, int x, int y)
{
    return is_inside(x, y, this->x, this->y, this->width, this->height);
}

void button_move(Button *this, int dx, int dy)
{
    this->x += dx;
    this->y += dy;
    button_draw(this);
}

int main(int argc, char const *argv[])
{
    Button *quit = button_create(10, 10, 60, 20, "Quit");
    Button *b1 = button_create(50, 10, 60, 20, "Button1");

    int done = 0;
    while (!done) {
        int x, y;
        if (!is_mouse_button_down(&x, &y))
            continue;
        if (button_is_inside(quit, x, y))
            done = 1;
        if (button_is_inside(b1, x, y)) {
            button_move(b1, 5, 5);
            printf("Clicked Button1\n");
        }
    }

    /* Release what button_create() allocated. */
    free(b1);
    free(quit);
    return 0;
}
As one can see, we can now easily create lots of buttons and move them around without losing track of them. But this is not all there is to OOP. So far I have made it look like it is just about collecting all variables in a struct so that it is easier to see which variables belong together. To show another important aspect of OOP, I will give another code example. Here we create an Array object. The point of this object is that, unlike regular C arrays, it keeps track of its size, so we can query it in the rest of the program.
#include <stdlib.h>

typedef struct {
    int *data;
    int size;
} Array;

Array *array_create(int size)
{
    int *a = malloc(size * sizeof(int));
    Array *array = malloc(sizeof(Array));
    array->data = a;
    array->size = size;
    return array;
}

int main(int argc, char const *argv[])
{
    Array *a = array_create(10);

    /* Fill array with values */
    for (int i = 0; i < a->size; i++) {
        a->data[i] = i;
    }

    free(a->data);
    free(a);
    return 0;
}
Now this is all fine. But let's say that we want to iterate over the whole array using pointers. We decide that in order to do this it would be better to change the data structure for the Array. Instead of storing the size of the array, we will store a pointer to the end of the array. So we change the data structure to the following:
typedef struct { int *begin; int *end; } Array;
Except there is one problem with this. Suddenly our code in main() is broken. We have to change a->size to a->end - a->begin, and a->data to a->begin. That is of course quick for us to do in this example. But what if we had written 10000 lines of code and iterated over the array in loads of places? We would have to go through all that source code and make changes!

Data encapsulation

Data encapsulation is the solution to this problem, and it is one of the cornerstones of OOP. Instead of letting users of our Array object access its data directly, we hide the internal representation of the object by requiring its users to access its properties through function calls. The code below shows this approach:
#include <stdio.h>
#include <stdlib.h>

Array *array_create(int size)
{
    /* No extra element is needed: a pointer one past the end is valid as long as it is not dereferenced. */
    int *a = malloc(size * sizeof(int));
    Array *array = malloc(sizeof(Array));
    array->begin = a;
    array->end = a + size;
    return array;
}

int array_size(Array *this)
{
    return (int)(this->end - this->begin);
}

int *array_begin(Array *this)
{
    return this->begin;
}

int *array_end(Array *this)
{
    return this->end;
}

void array_set_at(Array *this, int index, int value)
{
    this->begin[index] = value;
}

int main(int argc, char const *argv[])
{
    Array *a = array_create(10);

    /* Fill array with values */
    for (int i = 0; i < array_size(a); i++) {
        array_set_at(a, i, i);
    }

    /* Print values to screen */
    for (int *it = array_begin(a); it != array_end(a); it++) {
        printf("%d\n", *it);
    }
    return 0;
}
Inheritance

Another important aspect of OOP is inheritance. This is a mechanism or style of programming that allows one to reuse a lot of code. Of course the code examples I show here are rather small, so they don't fully show how much rewriting there is to be saved. But it doesn't require too much imagination to see that this can be a big benefit when writing larger programs. To illustrate the usage of inheritance I will use geometric primitives. E.g. a point has a location in space. We can imagine functions to move the point in space relative to its current position, or to set it at an absolute position.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int x, y;
} Point;

Point *point_alloc(void)
{
    return malloc(sizeof(Point));
}

/* Returns this so that allocation and initialization can be chained. */
Point *point_init(Point *this, int x, int y)
{
    this->x = x;
    this->y = y;
    return this;
}

void point_move(Point *this, int dx, int dy)
{
    this->x += dx;
    this->y += dy;
}

float point_distance_to_origo(Point *this)
{
    return sqrt((double)(this->x * this->x + this->y * this->y));
}

int main(int argc, char const *argv[])
{
    Point *p = point_init(point_alloc(), 10, 10);
    printf("Distance to origin: %f\n", point_distance_to_origo(p));
    free(p);
    return 0;
}
Now we want to define a Circle object, but we do not want to have to rewrite the code for moving the center of the circle, since it is basically the same as moving a point. How can we reuse the point_move() function with Circle? All we need to do is make it look to point_move() as if it is dealing with a Point data structure as its first argument. This is actually not as hard as it seems:
typedef struct {
    Point position;   /* must be the first member */
    int radius;
} Circle;
If we have a pointer to a circle structure we can now access the x and y coordinates like this:
Circle *c;   /* assume c points to a valid Circle */
int x = c->position.x;   /* position is a struct, not a pointer, so use '.' */
However this is not very interesting for us since it does not allow us to treat a Circle as a point. What is interesting is that we can do the following:
Circle *c; Point* p = (Point*)c; int x = p->x;
How is this possible? The reason this works has to do with how C lays out structs. When you define a struct, the C compiler records an offset for each member of the struct. When you write int x = p->x, the compiler generates machine code that takes the start address of the struct pointed to by p and adds the offset of x to it. The result is the address at which the member x is found.

To keep the numbers small, let us count memory in int-sized slots (typically 32 bits, i.e. 4 bytes, each) and take the Point data structure as an example. Say that p points to memory location 11. Since x is the first member, it is at offset 0, which means it is at location 11 too. y, on the other hand, is the second member, so it has offset 1, meaning it is at location 11+1=12. p->y is thus converted to address 12. (Counted in actual bytes, with 4-byte ints the offset of y would be 4.)

The bottom line is that offsets are relative to the start of the struct. Thus if we put Point at the top of the Circle definition, the offsets of x and y are still valid for the Circle struct. C in fact guarantees that a pointer to a struct, suitably converted, points to its first member. Of course the compiler will not make that conversion for us, so we must ask for it with a cast to Point*. Otherwise it will complain that Circle does not have any members named x and y.

So to demonstrate inheritance, here is a short program again:
/* Builds on the Point code above; M_PI comes from <math.h> on POSIX systems. */

Circle *circle_alloc(void)
{
    return malloc(sizeof(Circle));
}

Circle *circle_init(Circle *this, int x, int y, int radius)
{
    /* Reuse the Point initializer for the embedded Point. */
    point_init((Point *)this, x, y);
    this->radius = radius;
    return this;
}

float circle_area(Circle *this)
{
    return this->radius * this->radius * M_PI;
}

int main(int argc, char const *argv[])
{
    Circle *c = circle_init(circle_alloc(), 10, 10, 5);
    printf("Distance to origin: %f\n", point_distance_to_origo((Point *)c));
    printf("Area of circle: %f\n", circle_area(c));
    free(c);
    return 0;
}
There is of course a lot more to OOP, but several other people on the web have written much more extensive explanations, even books, on doing OOP in C. My intention here was just to give an introduction.

So you might wonder what the point is of using a dedicated OOP language like C++ or Java. First of all, there is the syntactic sugar. Instead of writing point_move(p, x, y) you can write p->move(x, y) in C++. C++ passes p as a hidden first argument to move, but putting it in front makes it clear that p is the object of focus. The other important aspect is that one is not required to prefix the function names with, say, point_ to avoid clutter in the namespace. C++ makes sure that each function called move() is associated with a type. And then there are things like polymorphism, which would just get very ugly looking in C.

I do of course not advise anybody to do OO in C. There is no need for that when we have so many nice OOP languages to choose from. Doing OOP in C does, however, make it clear that OOP is a paradigm and not a language feature. To look at it from the other side, one could choose to program in a procedure-oriented way in Java or C++, for instance by:
  1. Declaring member variables public and not using accessor methods (thus breaking the OO principle of data encapsulation and abstraction).
  2. Using switch-case statements instead of polymorphism.
  3. Not reusing code through inheritance, and making all methods static.
Personally I believe OOP thinking is best understood by learning it either through a procedure-oriented language or through an OOP language that takes OO to the extreme, like Smalltalk. Smalltalk simply forces the user to think more about OO than C++ or Java do. Java and C++ let programmers easily slip into procedure-oriented ways of thinking. And I believe the reason for this is that both C++ and Java are procedure-oriented at the low level. Instances of primitive types like int, float and char are not objects. if, while and for statements have procedure-like semantics. And these are the statements and objects that newbie programmers are exposed to first, thus locking them into a procedure-oriented way of thinking early on.

1 comment:

Unknown said...

I know it's been a long time since this was posted, but there are a couple issues I have with this page.

First, the formatting needs fixing.

Second, and significantly more important, is that every call to "malloc" *must* have a matching call to "free"! To take from C++ nomenclature, you have a "constructor", where is your "destructor"? Massive memory leaks are going to occur if people create hundreds of buttons using this.