Comparing the performance of C# vs. C++

February 2010


Most of my work over the years has been in C++, but several things about the language annoy me. The lack of garbage collection makes life more difficult at times, and without reflection it is difficult and tedious to save the state of a complex program. Also, although separating programs into declarations (headers) and implementation (source) is sometimes clearer, it is often just annoying to copy code around. Every once in a while I consider switching languages, and over the last few months I made a serious attempt at C#.

In some ways C# is nicer than C++ because of the flexibility that a virtual machine allows. Reflection and dynamic types are especially nice. However, C# is no panacea. The inability to declare arrays on the stack or in a struct leads either to bloated memory use and excessive allocations, or to a proliferation of local variables named p0, p1, p2, p3, etc. Constantly typing ref and out for structs that I pass around is extremely annoying, as is having to explicitly declare every member, field, and class public. I am willing to live with a certain amount of annoyance for a more powerful and flexible language, but the real question for me is the performance cost.

Proponents of C# and Java claim that with their JIT compilers, their code will run nearly as fast as code written in C++. In fact, they will sometimes even claim that code can run faster in these high-level languages thanks to profile-guided optimization. Of course, I need some sort of hard evidence before changing the language I use for hobby game programming, homework, and research. Unfortunately, such evidence is nearly impossible to find on the Internet. Programs in the Great Language Shootout are written completely differently from one language to the next, and are therefore useless for this purpose. The rest of the language comparisons are unsubstantiated opinion from the sort of people who like to debate languages on forums. It became clear what I had to do. After writing a few nontrivial programs, I decided to code the ray tracer for my Image Synthesis class in C#. Once I had a ray tracer with a reasonable number of features, I ported the code over to C++, modifying it as little as possible to give a fair speed comparison. This writeup presents my results.


While coding the original C# ray tracer I did not worry too much about optimization, but rather about overall design. Because I did not want to worry about numerics (and because of the inflexibility of C#), I use only integers and doubles to store numbers. Also, although I mostly use triangles in my scenes, I treat triangles generically by inheriting attributes from a base shape class. Similarly, I calculate all pixel colors through generic shaders that inherit from a base class.
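As a rough illustration of the design described above, here is a minimal C++ sketch of shapes that share a virtual base class and store everything in doubles. All names (`Vec3`, `Ray`, `Shape`, `Sphere`, `Plane`, `nearestHit`, `kMiss`) are assumptions made for this sketch and are not taken from the original source code.

```cpp
#include <cmath>
#include <limits>
#include <memory>
#include <vector>

// Hypothetical minimal vector type; only doubles are used, as in the text.
struct Vec3 { double x, y, z; };
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin; Vec3 dir; };

// Sentinel distance returned when a ray misses a shape (an assumption of this sketch).
const double kMiss = std::numeric_limits<double>::infinity();

// Generic base class: every shape is intersected through a virtual call.
class Shape {
public:
    virtual ~Shape() = default;
    // Returns the distance along the ray to the nearest hit in front of the origin, or kMiss.
    virtual double intersect(const Ray& ray) const = 0;
};

class Sphere : public Shape {
    Vec3 center_;
    double radius_;
public:
    Sphere(const Vec3& center, double radius) : center_(center), radius_(radius) {}
    double intersect(const Ray& ray) const override {
        // Solve |origin + t*dir - center|^2 = radius^2 for t.
        Vec3 oc = ray.origin - center_;
        double a = dot(ray.dir, ray.dir);
        double b = 2.0 * dot(oc, ray.dir);
        double c = dot(oc, oc) - radius_ * radius_;
        double disc = b * b - 4.0 * a * c;
        if (disc < 0.0) return kMiss;
        double root = std::sqrt(disc);
        double t = (-b - root) / (2.0 * a);      // nearer root first
        if (t <= 0.0) t = (-b + root) / (2.0 * a);
        return t > 0.0 ? t : kMiss;
    }
};

class Plane : public Shape {
    Vec3 point_;
    Vec3 normal_;
public:
    Plane(const Vec3& point, const Vec3& normal) : point_(point), normal_(normal) {}
    double intersect(const Ray& ray) const override {
        double denom = dot(normal_, ray.dir);
        if (std::fabs(denom) < 1e-12) return kMiss;  // ray parallel to the plane
        double t = dot(normal_, point_ - ray.origin) / denom;
        return t > 0.0 ? t : kMiss;
    }
};

// Brute-force nearest hit over all shapes; each test goes through the virtual interface.
double nearestHit(const std::vector<std::unique_ptr<Shape>>& shapes, const Ray& ray) {
    double best = kMiss;
    for (const auto& shape : shapes) {
        double t = shape->intersect(ray);
        if (t < best) best = t;
    }
    return best;
}
```

A shader hierarchy could follow the same pattern, with a virtual method that returns a color for a hit point. The point of the sketch is only that every intersection test costs one virtual call in either language.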

Once I had a bug-free ray tracer in C#, I froze development and ported the code to C++. I introduced a few minor changes to the way things are calculated, but otherwise kept the two versions as close to identical as possible for a fair comparison. If anyone would like to review my code or reproduce my results, they can use my source code and resources (the original C++ source is lost, and I substituted a later version).


I traced a couple of different types of scenes that vary in code paths, amount of calculation, and memory access patterns. All testing was performed on an Intel Core 2 Quad 2.66 GHz processor with 3 GB of memory on Windows 7 64-bit. I used Visual Studio 2010 Beta 2 for both C# and C++. I could not find any optimization options for C#, but set C++ to do full optimization, favor speed, and inline any appropriate functions. For each scene I measured initialization time and actual ray tracing time separately from within the program, and I sum the two to find the total. I ignore any startup and shutdown time for the program. All times are measured in seconds.
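The measurement scheme described above (time initialization and tracing separately from inside the program, then add them) can be sketched in C++ roughly as follows. This is only an illustration using `std::chrono::steady_clock`; the helper names `secondsTaken`, `Timings`, and `measure` are hypothetical, and the original programs may have used a different timer.

```cpp
#include <chrono>

// Hypothetical helper: runs a callable once and returns the elapsed wall-clock time in seconds.
template <typename Work>
double secondsTaken(Work&& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Initialization and tracing are timed separately; the total is their sum.
struct Timings {
    double init;
    double trace;
    double total() const { return init + trace; }
};

// Times the two phases back to back. Program startup and shutdown fall
// outside both measurements, matching the setup described in the text.
template <typename Init, typename Trace>
Timings measure(Init&& init, Trace&& trace) {
    Timings t;
    t.init = secondsTaken(init);
    t.trace = secondsTaken(trace);
    return t;
}
```

In use, the two callables would wrap scene loading and the render loop, for example `measure([&]{ /* build scene */ }, [&]{ /* trace all pixels */ })`.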

The simplest scene that I tested has two point lights shining on a sphere and an infinite plane with no texturing. In this scene I expect everything to fit in cache.


The textured scene is identical to the simplest scene, except that bilinearly filtered textures are applied to the surfaces.
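For readers unfamiliar with bilinear filtering, here is a minimal C++ sketch of one common formulation: the four texels surrounding the sample point are blended by their fractional distances. The `Texture` layout (single-channel doubles, row-major), the clamp-to-edge addressing, and the mapping of [0, 1] onto texel coordinates are all assumptions of this sketch, not details confirmed by the original code.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical single-channel texture stored row-major as doubles.
struct Texture {
    int width;
    int height;
    std::vector<double> texels;  // width * height values

    // Clamp-to-edge lookup (an assumption; wrapping would be equally valid).
    double at(int x, int y) const {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return texels[y * width + x];
    }
};

// Samples the texture at (u, v) in [0, 1] x [0, 1] with bilinear filtering.
double sampleBilinear(const Texture& tex, double u, double v) {
    // Map [0, 1] so that 0 and 1 land exactly on the first and last texel.
    double x = u * (tex.width - 1);
    double y = v * (tex.height - 1);
    int x0 = static_cast<int>(std::floor(x));
    int y0 = static_cast<int>(std::floor(y));
    double fx = x - x0;  // horizontal blend weight
    double fy = y - y0;  // vertical blend weight

    // Blend horizontally along the two neighbouring rows, then vertically.
    double rowA = (1.0 - fx) * tex.at(x0, y0)     + fx * tex.at(x0 + 1, y0);
    double rowB = (1.0 - fx) * tex.at(x0, y0 + 1) + fx * tex.at(x0 + 1, y0 + 1);
    return (1.0 - fy) * rowA + fy * rowB;
}
```

Each sample reads four texels instead of one, which is why the textured scene exercises memory access more heavily than the untextured one.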


Lastly, I ray traced an armadillo man composed of 30,000 triangles. Ray-triangle intersection is accelerated by a k-d tree, and intersection with an individual triangle is tested by calculating the barycentric coordinates of the ray-plane intersection point. Unlike most of the other method calls, the tree methods are not virtual.
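The per-triangle test mentioned above can be sketched in C++ as follows: intersect the ray with the triangle's plane, then express the hit point in barycentric coordinates and check that it lies inside the triangle. This is one standard way to write that test and is not necessarily how the original code did it; `Vec3` and `intersectTriangle` are hypothetical names, and the k-d tree traversal is omitted.

```cpp
#include <cmath>

// Hypothetical minimal vector type and helpers for this sketch.
struct Vec3 { double x, y, z; };
inline Vec3 operator-(const Vec3& a, const Vec3& b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator+(const Vec3& a, const Vec3& b) { return Vec3{a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(const Vec3& a, double s) { return Vec3{a.x * s, a.y * s, a.z * s}; }
inline double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Returns true and writes the hit distance to tHit if the ray origin + t*dir
// hits triangle (a, b, c) at some t > 0.
bool intersectTriangle(const Vec3& origin, const Vec3& dir,
                       const Vec3& a, const Vec3& b, const Vec3& c, double& tHit) {
    Vec3 e1 = b - a;
    Vec3 e2 = c - a;
    Vec3 n = cross(e1, e2);  // plane normal (unnormalized)

    // Step 1: ray-plane intersection.
    double denom = dot(n, dir);
    if (std::fabs(denom) < 1e-12) return false;  // parallel ray or degenerate triangle
    double t = dot(n, a - origin) / denom;
    if (t <= 0.0) return false;                  // plane is behind the ray origin
    Vec3 p = origin + dir * t;

    // Step 2: barycentric coordinates (u, v) such that p = a + u*e1 + v*e2,
    // found by solving the 2x2 system built from dot products.
    Vec3 w = p - a;
    double d11 = dot(e1, e1), d12 = dot(e1, e2), d22 = dot(e2, e2);
    double w1 = dot(w, e1), w2 = dot(w, e2);
    double det = d11 * d22 - d12 * d12;
    double u = (d22 * w1 - d12 * w2) / det;
    double v = (d11 * w2 - d12 * w1) / det;

    // Step 3: the point is inside the triangle iff u >= 0, v >= 0, and u + v <= 1.
    if (u < 0.0 || v < 0.0 || u + v > 1.0) return false;
    tHit = t;
    return true;
}
```

A production version would typically precompute the per-triangle quantities (`n`, `d11`, `d12`, `d22`, `det`) once rather than on every ray.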



C# is slow. I really hoped that C# would be competitive, but on average it seems to be approximately five times slower than identical C++ code. That C# performed so poorly on this test is particularly bad, because I suspect that the C++ code has a lot more room for improvement through tuning than the C# code does. Also, the Microsoft compiler and runtime are the best available for C#, but the Microsoft C++ compiler is pretty bad compared to its competition. For C++, the Intel compiler almost always produces code that is at least 1.1x faster, and occasionally up to 2x faster, than Microsoft's offering.