Someone asked:
hey can anybody tell me what do i choose for programing either GNU c or JAVA for implementing some networkprograms using some linux systemcalls
Being a C++ evangelist (for what I believe are very good reasons), I suggested that the person started his project in C++ instead of Java or C.
Now, the language wars will never end, and everyone who enjoys programming have their own favourite language. I do not believe that I will ever manage to convince the unwashed masses that C++ is a superior language - yet, when someone cries out for advice, as was the case, I will of course give my best recommendations - and since I firmly believe that C++ is a superior general purpose programming language, I am going to recommend it. That was the start of "Beowulf Language War 2004". Q1 I guess ;)
The C++ programming language was initially created by Bjarne Stroustrup, as a "better C". C++ was an attempt to extend and (to some extent) change the C programming language to overcome some of the common and big problems with the language. C++ was first standardized in 1998, and as such, is a much younger language than C.
For the vast majority of problems out there, I state that C++ provides no significant downsides and a number of significant improvements. Bold? Some people seem to think so, but it's really the case. Let's start out by clearing up a few very common C++ misunderstandings:
struct element_t { struct element_t *next, *prev; void *element; };C++ allows one to write something like:
template <typename T> struct element_t { element_t<T> *next, *prev; T element; };Not only does the C++ implementation prevent common programming errors (like putting an element of the wrong type on the list), it also allows better optimization by the compiler! For example, a generic sort implementation is available in both C and C++ - the C routine is defined as:
void qsort(void *base, size_t nmemb, size_t size, int(*compar)(const void *, const void *));whereas the C++ routine is defined as
templateThe difference being, that for example sorting an array of integers, would, in the C case, require a function call for every single compare, whereas the C++ implementation would allow the compiler to inline the integer comparison calls, as the actual sort routine is instantiated at compile time, automatically by the compiler, with the correct types inserted in the template arguments.void sort(RandomAccessIterator first, RandomAccessIterator last);
Note, that I am not arguing that everything is rewritten in C++. There are many large projects out there which are written in C - I do not believe that it is a good idea to just "convert" them to C++. C++ allows for cleaner solutions than C does, for a great many problems. Doing a minimal conversion of a solution which is "as clean as it gets" in C, to C++, would convert "good C" code into "poor C++". That is not a change to the better!
I have converted code from C to C++. But such a conversion has always ended up being a complete rewrite. From scratch. A redesign as such may not be necessary - after all, C++ does not force any new design principles down your throat, and if the original C code was well designed, there may not be a need to redo that.
However, for new projects starting out today, I fail to see why anyone (who knows C++ or is willing to learn it) would use C rather than C++. Unless, of course, that one of the two problems stated earlier are important.
The problem: Write a program that reads text from standard-input, and returns (prints) the total number of distinct words found in the text.
In short, I presented a C++ solution to the problem, and challenged someone to do it in C. I even wrote:
I still guarantee you two things: 1) Your code will be longer 2) Your program will be slowerHey, you gotta be bold to keep these things interesting!
#include <set> #include <string> #include <iostream> int main(int argc, char **argv) { // Declare and Initialize some variables std::string word; std::set<std::string> wordcount; // Read words and insert in rb-tree while (std::cin >> word) wordcount.insert(word); // Print the result std::cout << "Words: " << wordcount.size() << std::endl; return 0; }Personally, I think this is elegant. Anyone who knows C++ will be able to glance at the code and understand completely what it does, within a few tens of seconds. The code is short, and only extremely simple well-understood operations are performed on the data - it is quickly established that the code is correct. If the beholder understands the language, of course, but this is a problem with all languages.
Note, the actual code presented here is not the exact suggestion sent to the list - I changed the printing of the result at the bottom of the code, so that the program prints out the number of distinct words - the original code was made to print out an unsorted list of word frequencies. This change only touches a few of the very last lines in the code, it touches nothing in the central parts of the program at all.
Unlike my initial solution, this program uses a hash to store the distinct words. A hash can be faster than an RB-tree - a well dimensioned hash can exhibit O(1) complexity for insertions where an RB-tree exhibits O(log(n)) complexity.
The change of algorithm is unfortunate for a language-performance comparison point of view - as the C implementation will exhibit different run-time performance than the C++ implementation, for reasons that are not related to the choice of language at all. However, I'm not going to bitch about it, I'm going to evaluate the suggestion and still prove my point; that C++ is a better language for solving general purpose programming problems.
Let's have a look at the C code shall we;
/* Copyright (C) 1999 Lucent Technologies */ /* From 'Programming Pearls' by Jon Bentley */ /* wordfreq.c -- list of words in file, with counts */ #include <stdio.h> #include <stdlib.h> #include <string.h> typedef struct node *nodeptr; typedef struct node { char *word; int count; nodeptr next; } node; #define NHASH 29989 #define MULT 31 nodeptr bin[NHASH]; unsigned int hash(char *p) { unsigned int h = 0; for ( ; *p; p++) h = MULT * h + *p; return h % NHASH; } #define NODEGROUP 1000 int nodesleft = 0; nodeptr freenode; nodeptr nmalloc() { if (nodesleft == 0) { freenode = malloc(NODEGROUP*sizeof(node)); nodesleft = NODEGROUP; } nodesleft--; return freenode++; } #define CHARGROUP 10000 int charsleft = 0; char *freechar; char *smalloc(int n) { if (charsleft < n) { freechar = malloc(n+CHARGROUP); charsleft = n+CHARGROUP; } charsleft -= n; freechar += n; return freechar - n; } void incword(char *s) { nodeptr p; int h = hash(s); for (p = bin[h]; p != NULL; p = p->next) if (strcmp(s, p->word) == 0) { (p->count)++; return; } p = nmalloc(); p->count = 1; p->word = smalloc(strlen(s)+1); strcpy(p->word, s); p->next = bin[h]; bin[h] = p; } int main() { int i; nodeptr p; char buf[16384]; unsigned total = 0; for (i = 0; i < NHASH; i++) bin[i] = NULL; while (scanf("%s", buf) != EOF) incword(buf); for (i = 0; i < NHASH; i++) for (p = bin[i]; p != NULL; p = p->next) total += p->count ? 1 : 0; printf("Words: %d\n", total); return 0; }Well there you have it. An interesting exchange happened on the list after this was posted:
> As in maybe it's just me, but when I look at Jakob's code I have no idea > how it works. What are these routines? What do they do? Where's the > logic? Oh, I can read the comments and see the keywords and sort of > guess that all sorts of dark magic is being done; I just have no idea > how it does it, any more than I would if one took Jakob's code or my > code or anybody's code for the same task and wrapped it up in a single > routine: And here I argue that this is where C++'s strength becomes evident. STL is _standard_ (hence its name). Thus, for a C++ programmer, Jakob's code is comprehensible in one 15second long glimpse and this programmer can immediately use (trust?) the code. With the equivalent C code from the posted link, I personally have the problem that I need at least 15 minutes to read and understand it, and even then, I can't be sure I identified all gotchas and thus I can use it without major crashes and problems. I argue that the biggest problem with C up until glib and gsl (thus very recently) was exactly the lack of high level routines collected in a ubiquous library.
C and C++ code was compiled with GCC 3.3.2, using the following switches:
-O3 -Wall -g -march=pentium2Only -O3 -march=pentium2 have performance impacts - basically they enable all optimization in the compiler, and tells the compiler to generate code for an Intel PentiumII processor. I run the tests on a PentiumII Celeron machine.
Run-time performance | ||
Solution | Lines of code | Run-time |
C++ RB-tree | 15 | 235 sec. |
C Hash | 85 | 36 sec. |
Wow! Quite a difference! At 567% the code-size, the C implementation gets the work done in 15% of the time of the C++ solution. Quite a difference, in run-time as well as implementation-time.
But this benchmark is no good for the purpose of the "C versus C++" argument - it's two completely different approaces that yield completely different results, for reasons that have nothing to do with the languages used.
With the time that I spent on this, I was not able to achieve all of those goals. It was a tradeoff between code length/clarity and speed, basically.
As it turned out, making the code more generic did not affect run-time performance, and did not significantly affect code size.
The main thing that affected code size, was, that I made the code reliable and reusable - I changed the C implementation which used global variables and #defines (thus making the implementation unusable if for example someone wanted to use two different hashes in a program), into a generic template class.
A short table summarizing the changes:
C versus C++, hash implementation | ||
Quality | C implementation | C++ implementation |
Generality | Hash parameters are defines, everything-global | Hash parameters specified as template parameters, nothing-global |
Usability | No memory cleanup - customized allocation routines make addition of cleanup slightly complicated | Full cleanup in hash destructor |
One of the nifty features of the C++ implementation is, that the hash parameters are specified at hash instantiation, as template parameters. So, a program needing two different hashes with different size parameters, would simply declare them as:
my_hash_t<> one; // Uses default parameters my_hash_t<11,3> two; // NHASH=11, MULT=3 in this instantiationWhile it would of course be possible to adapt the C implementation so that one could use more than one hash in a given program, it would be rather more difficult to adapt the code to actually accept different hash parameters. In fact, for a solution that would be run-time performance comparable to the C++ implementation presented here, one would have to make all routines using the NHASH, MULT, NODEGROUP and CHARGROUP defines (which are template parameters in the C++ implementation) macros!. Now macros ain't pretty - but an entire hash implementation in macros, that's, well, not pretty.
/* * Rewrite in C++ by Jakob OestergaardAs you can see, in order to duplicate malloc tricks from the C implementation in a clean manner (which also gives us trivial memory cleanup), I implemented the alist_t template structure - this is an allocation helper which will allocate a number of elements (of some given type), and allows for very efficient expansion of the allocated area (while maintaining the trivial-cleanup property). This accounts for 20 lines of code - the remaining 15 lines of code goes to some extra blank lines, the extra comments in the beginning of the file, and the template class wrapping of the hash implementation.* of: (original copyright follows) */ /* Copyright (C) 1999 Lucent Technologies */ /* From 'Programming Pearls' by Jon Bentley */ /* wordfreq.c -- list of words in file, with counts */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <vector> struct node_t { char *word; int count; node_t *next; }; template <typename T> struct alist_t { alist_t(size_t elms) : next(0) { area = new T[elms]; } ~alist_t() { if (next) delete next; delete[] area; } T* allocate(size_t elms) { alist_t *n = new alist_t<T>(elms); n->next = next; next = n; return n->area; } T *area; alist_t<T> *next; }; template <unsigned NHASH = 29989, unsigned MULT = 31, unsigned NODEGROUP = 1000, unsigned CHARGROUP = 1000> class my_hash_t { public: my_hash_t() : bin(NHASH,0), nodesleft(0), charsleft(0), nglist(NODEGROUP), cglist(CHARGROUP) { } unsigned hash(const char *p) const { unsigned int h = 0; while (*p) h = MULT * h + *p++; return h % NHASH; } void incword(const char *s) { node_t* p; int h = hash(s); for (p = bin[h]; p; p = p->next) if (strcmp(s, p->word) == 0) { ++p->count; return; } p = nmalloc(); p->count = 1; p->word = smalloc(strlen(s)+1); strcpy(p->word, s); p->next = bin[h]; bin[h] = p; } node_t *const * begin() const { return &bin[0]; } node_t *const * end() const { return &bin[NHASH]; } private: std::vector<node_t*> bin; int nodesleft; node_t* freenode; int charsleft; char *freechar; alist_t<node_t> nglist; alist_t<char> cglist; node_t* nmalloc() { if (nodesleft == 0) { freenode = nglist.allocate(NODEGROUP); nodesleft = NODEGROUP; } nodesleft--; return freenode++; } char *smalloc(int n) { if (charsleft < n) { freechar = cglist.allocate(n+CHARGROUP); charsleft = n+CHARGROUP; } charsleft -= n; freechar += n; return freechar - n; } }; int main() { my_hash_t<> my_hash; char buf[16384]; while (scanf("%s", buf) != EOF) my_hash.incword(buf); unsigned total = 0; for (node_t*const* p = my_hash.begin(); p != my_hash.end(); ++p) for (const node_t *e = *p; e; e = e->next) total += e->count ? 1 : 0; printf("Words: %d\n", total); return 0; }
Now, what you may not realize if you're an old-school C programmer is, that you can actually stick the entire hash implementation - everything in the above listing except for the main routine - in a header file. Yep, that's right - C++ will handle this, the linker will not barf. You can then include this header file in every project you do, where you may need a hash implementation. You define the properties of the has when you instantiate the hash - in your own code. The header file just provides the template implementation. C'mon, that's clever!
Run-time performance | ||
Solution | Lines of code | Run-time |
C++ Hash | 121 | 34.697 sec. |
C Hash | 85 | 35.929 sec. |
I believe that I have demonstrated how C++ constructs (the template) allows for generic re-usable and flexible implementations without run-time performance overhead, and (the constructor/destructor) allows for automatic initialization and cleanup of data structures - making it simpler for the user of generic data structures, such as the hash presented, to write real-world programs that are reliable and do not leak memory.
I also believe that I have proven, that C++ does not force any particular type of design down anyone's throat. For example, the node_t in my code is almost identical to the struct node in the original C code - there simply was no need to change it, and therefore it was left basically the way that it was.
The reasons for the beating turned out to be pretty interesting.
std::string a = "abc"; // easy initialization from C strings const char *b = a.c_str(); // easy to get a C string out again std::string b = "foo" + a; // operations on strings are easy too! size_t b_len = b.size(); // O(1) behavior for some operations bool a_eq_b = a == b; // string comparison, done rightThe std::string maintains the length of the string internally - this has several advantages: it is possible to have a string that contains '\0' characters anywhere - '\0' is not special, and of course retrieving the length of a string is fast.
I discovered that for string comparisons (which is the number-one CPU consuming operation in both the C hash implementation and the C++ RB-tree implementation) in my STL use the memcmp routine, whereas the C implementation directly calls the strcmp routine. Now, for some odd reason, the strcmp routine on my system is heavily optimized for the Pentium II processor, while the memcmp routine used when I specify Pentium II optimizations is optimized for the i486 processor. This was one cause for the slowdown - my RB-tree implementation simply could not compare strings as fast as the C implementation, even though it ought to be faster at comparing since it already has O(1) access to the string lengths.
But this was not the whole explanation. It turns out, that using std::cin >> buf is just horribly inefficient. I did not dig further into this to discover why - I guess that all in all, maintaining the length of the string explicitly as the STL std::string does, is just not a win in this scenario. Further, I guess that std::cin is just not horribly well optimized, in the STL that I use. I guess this is understandable - the STL is big, and I/O may just not have been the number one priority for the STL people - understandable, I guess, since the standard C library offers fine routines for doing I/O, and especially when high-performance I/O is a concern, one will use the "raw" read and write system calls anyway.
So, I cooked up a very simple implementation that still uses std::string (because it gives us safe memory management and fully automatic cleanup), but takes the C approach to reading in data from the text file. The code is presented below:
#include <set> #include <string> #include <iostream> #include <stdio.h> int main(int argc, char **argv) { // Declare and Initialize some variables char buf[16384]; std::set<std::string> wordcount; // Read words and insert in rb-tree while (scanf("%s", buf) != EOF) wordcount.insert(buf); // Print the result std::cout << "Words: " << wordcount.size() << std::endl; return 0; }The run-time performance of this code ("C++ RB-tree (II)") is presented along with the other measurements in the table below.
Run-time performance | ||
Solution | Lines of code | Run-time |
C++ RB-tree (I) | 15 | 235 sec. |
C++ Hash | 121 | 34.697 sec. |
C Hash | 85 | 35.929 sec. |
C++ RB-tree (II) | 15 | 63.970 sec. |
It really serves no purpose. Except perhaps to boost my ego by presenting something that is faster than the "programming pearl" presented earlier ;)
The following solution uses something that is almost-but-not-quite a digital trie. In my tests, it uses about four times as much memory as the hash implementations did. It sacrifices memory footprint for performance.
#include <unistd.h> #include <vector> #include <cctype> #include <stdint.h> #include <iostream> const size_t pagesize = 4096; struct knot_t { knot_t() : exists(false), childcount(0), children(0) { } knot_t* child(unsigned c) { // Look for a matching child if (children) { unsigned i = childcount / 2; unsigned leftpos = 0; unsigned rightpos = childcount; while (children[i].letter != uint8_t(c)) { if (uint8_t(c) < children[i].letter) { rightpos = i; i = (leftpos + i) / 2; } else { leftpos = i+1; i = (rightpos + i) / 2; } if (rightpos == leftpos) goto nomatch; } return children + i; } nomatch: // Create matching child - insert in order knot_t *nary = new knot_t[childcount + 1]; knot_t *destp = nary; knot_t *sourcep = children; knot_t *endsource = children + childcount; while (sourcep != endsource && sourcep->letter < c) *destp++ = *sourcep++; knot_t *outp = destp++; while (sourcep != endsource) *destp++ = *sourcep++; delete[] children; children = nary; outp->letter = c; childcount++; return outp; } unsigned distinctcount() const { unsigned total = exists ? 1 : 0; for (unsigned i = 0; i != childcount; ++i) total += children[i].distinctcount(); return total; } bool exists; private: uint8_t letter; unsigned short childcount; knot_t *children; }; int main(int argc, char **argv) { knot_t base; char buffer[pagesize]; knot_t *curnot = 0; while (ssize_t r = read(0, buffer, pagesize)) { if (r <= 0) break; for (unsigned bufpos = 0; bufpos != unsigned(r); ++bufpos) { uint8_t c = buffer[bufpos]; if (isspace(c)) { if (!curnot) continue; curnot->exists = true; curnot = 0; continue; } if (!curnot) { curnot = base.child(c); continue; } curnot = curnot->child(c); } } if (curnot) curnot->exists = true; std::cout << "Distinct: " << base.distinctcount() << std::endl; return 0; }
Run-time performance | ||
Solution | Lines of code | Run-time |
C++ RB-tree (I) | 15 | 235 sec. |
C++ Hash | 121 | 34.697 sec. |
C Hash | 85 | 35.929 sec. |
C++ RB-tree (II) | 15 | 63.970 sec. |
C++ Ego-booster | 93 | 18.108 sec. |
Of course, the ego-booster could also be implemented in C. Probably with negligible run-time penalty (as per the previous tests). This code really uses virtually none of the niceties that C++ provides, in mainly serves to prove that C++ can be used for real performance critical code, where old-timers would cry "C" if asked. Well, I'm not asking, I was too busy coding ;)
The above code spends about half of the run-time waiting for a register stall, because I store the letter as a uint8_t instead of an unsigned. It would be simple to change, but it would cost about 30% extra on the memory footprint. Ah, tradeoffs...