In gamedev articles about Entity-Component-System (ECS), data locality is often cited as a major reason to use this design pattern. The underlying data structures of an ECS are cache friendly, which allows much better performance when iterating over large numbers of game objects. I knew that cache-friendly memory usage (sequential memory access, for example) would yield better performance, but I was curious: how much better would it be?
In case you have never heard of Entity-Component-System, this article is a good place to read about it.
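To make the data-locality claim a little more concrete, here is a minimal sketch of how an ECS might keep each component type in its own contiguous array. The names (`Position`, `Velocity`, `World`, `integrate`) are hypothetical and not taken from any particular ECS library; real implementations add entity IDs and more elaborate storage on top of this idea.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical components: plain data, no behaviour.
struct Position { float x, y; };
struct Velocity { float dx, dy; };

// Hypothetical storage: one contiguous array per component type,
// where index i in each array belongs to the same entity.
struct World {
    std::vector<Position> positions;
    std::vector<Velocity> velocities;
};

// A "system": walks both arrays front to back, so consecutive
// iterations touch adjacent memory.
inline void integrate(World& world, float dt) {
    assert(world.positions.size() == world.velocities.size());
    const std::size_t count = world.positions.size();
    for (std::size_t i = 0; i < count; ++i) {
        world.positions[i].x += world.velocities[i].dx * dt;
        world.positions[i].y += world.velocities[i].dy * dt;
    }
}
```

The point of the layout is that `integrate` reads and writes memory in order, which is the access pattern the tests in this post measure.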
The Tests
To find out, I decided to benchmark iteration over a 2D array in C# and C++, using BenchmarkDotNet and Google Benchmark respectively.
In C# (running .NET Core 3.1), I used the following test:
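The C++ version of the test:

    #include <memory>
    #include <vector>

    using namespace std;

    class array2d_benchmark
    {
    private:
        shared_ptr<vector<vector<unsigned long long>>> array2d;
        int _array_size;

    public:
        array2d_benchmark(int array_size) : _array_size(array_size)
        {
            // array_size rows, each a separately allocated vector of array_size zeros
            array2d = make_shared<vector<vector<unsigned long long>>>(
                array_size, vector<unsigned long long>(array_size, 0));
        }

        // Inner loop varies the first index: every write lands in a different row vector.
        void benchmark_column_first() const
        {
            for (unsigned long long y = 0; y < _array_size; y++)
                for (unsigned long long x = 0; x < _array_size; x++)
                    (*array2d)[x][y] = x + y;
        }

        // Inner loop varies the second index: writes walk one row vector sequentially.
        void benchmark_row_first() const
        {
            for (unsigned long long x = 0; x < _array_size; x++)
                for (unsigned long long y = 0; y < _array_size; y++)
                    (*array2d)[x][y] = x + y;
        }
    };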
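For a quick sanity check without a benchmarking framework, the two loop orders can also be timed with `std::chrono`. This is a rough sketch, not the Google Benchmark setup used for the numbers in this post. The names `Grid`, `fill_row_first`, `fill_column_first`, and `time_ms` are made up for the example, and a single timed run is far noisier than a proper benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<unsigned long long>>;

// Inner loop walks along one row vector: sequential writes.
inline void fill_row_first(Grid& grid) {
    const std::size_t n = grid.size();
    for (std::size_t x = 0; x < n; ++x)
        for (std::size_t y = 0; y < n; ++y)
            grid[x][y] = x + y;
}

// Inner loop jumps to a different row vector on every write.
inline void fill_column_first(Grid& grid) {
    const std::size_t n = grid.size();
    for (std::size_t y = 0; y < n; ++y)
        for (std::size_t x = 0; x < n; ++x)
            grid[x][y] = x + y;
}

// Times a single call of fill(grid) and returns the elapsed milliseconds.
template <typename Fill>
double time_ms(Fill fill, Grid& grid) {
    assert(!grid.empty() && grid.size() == grid.front().size());
    const auto start = std::chrono::steady_clock::now();
    fill(grid);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_ms(fill_row_first, grid)` and `time_ms(fill_column_first, grid)` on a `Grid(n, std::vector<unsigned long long>(n, 0))` and printing both values gives a first impression of the gap; the exact ratio depends on the machine, the array size, and the compiler flags.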
Unsurprisingly, performance was much better for “row-first” iteration, because memory access in this case is sequential and causes far fewer cache misses, as can be seen from the C# benchmark (a really awesome feature of BenchmarkDotNet!). What did surprise me was how much faster sequential memory access actually is, even for such a simple use case. For 16384x16384 arrays, running time improves by 7x in C# and by approximately 22x in C++!
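The difference between the two orders shows up in the address arithmetic. The sketch below models a contiguous row-major n x n array of 8-byte elements and counts how often two consecutive accesses fall on different cache lines. Both the 64-byte line size and the contiguous layout are assumptions: the `vector` of `vector`s in the benchmark stores each row in a separate allocation, so this is a simplified model rather than an exact description of the test. `line_changes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kCacheLineBytes = 64;  // assumed cache line size
constexpr std::size_t kElementBytes = 8;     // assumed size of one element

// Counts how often two consecutive accesses land on different cache lines
// while visiting every element of a contiguous row-major n x n array.
inline std::size_t line_changes(std::size_t n, bool row_first) {
    assert(n > 0);
    std::size_t changes = 0;
    std::size_t previous_line = 0;
    bool first = true;
    for (std::size_t outer = 0; outer < n; ++outer) {
        for (std::size_t inner = 0; inner < n; ++inner) {
            const std::size_t x = row_first ? outer : inner;  // row index
            const std::size_t y = row_first ? inner : outer;  // column index
            const std::size_t byte_offset = (x * n + y) * kElementBytes;
            const std::size_t line = byte_offset / kCacheLineBytes;
            if (!first && line != previous_line) ++changes;
            previous_line = line;
            first = false;
        }
    }
    return changes;
}
```

For n = 64, `line_changes(64, true)` returns 511 and `line_changes(64, false)` returns 4095: row-first moves to a new line once every eight writes, column-first on every single write. A line change is not necessarily a cache miss, but the count gives a feel for how much more memory traffic the column-first order asks for.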
Also, the run-time difference between C++ and C# for row-first iteration over 16384x16384 arrays is almost 3x, much more than I expected. Overall, this was an interesting experiment that proved to me the value of Entity-Component-System as a performance optimization. The next step would probably be to test how much C#’s array bounds checks affect performance, and to check whether the C++ code benefits from automatic vectorization (which it probably does!).
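As a possible starting point for the bounds-check question, C++ has a rough analog: `std::vector::at` validates the index and throws `std::out_of_range`, while `operator[]` does no checking. The sketch below is only an approximation of what the C# runtime does, and `sum_unchecked`, `sum_checked`, and `check_equivalence` are hypothetical names; timing the two loops would give a first estimate of what checking every access costs.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Unchecked access: operator[] performs no index validation.
inline unsigned long long sum_unchecked(const std::vector<unsigned long long>& values) {
    unsigned long long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
        total += values[i];
    return total;
}

// Checked access: at() validates the index and throws std::out_of_range.
inline unsigned long long sum_checked(const std::vector<unsigned long long>& values) {
    unsigned long long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
        total += values.at(i);
    return total;
}

// Sanity check to run before timing: both loops must compute the same sum.
inline void check_equivalence(const std::vector<unsigned long long>& values) {
    assert(sum_checked(values) == sum_unchecked(values));
}
```

An optimizing compiler may be able to prove the index is always in range and drop the check, so any measured difference should be read with that in mind.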
If you are interested in playing around with the code, you can find it in its repository.