A few days ago I was reading about compiler optimization flags and architecture-specific optimizations in gcc, and I came up with the idea of squeezing the maximum performance out of my Pentium 4 server by compiling Minetest with custom settings. Since I have been running a dev version on my server and Debian has not updated Minetest to 5.4.0 in its repositories, I thought it was a good idea.
To begin with, these are my specs:
Hardware:
CPU: Intel Pentium 4 HT 524 3066 MHz 1 MiB L2 Cache x86_64 Hyperthreaded 84 W TDP (not sure if it is a 524 though, but it is very likely)
RAM: 1220 MiB
Software: GNU/Linux Debian 10 amd64 using gcc 8.3.0 and g++ 8.3.0
First, I benchmarked my dev build of commit d1ec5117d9095c75aca26a98690e4fcc5385e98c, the one used in the announcement of 5.4.0 reaching feature freeze. This build was compiled on the same OS running in a VirtualBox virtual machine on a Core 2 Duo machine, with the client build disabled and spatialindex support enabled. I am pretty sure I forgot to install the libraries for gettext, so that is not enabled. My goal was to compare hyperthreading enabled vs disabled. These are the results of running:
$ time minetestserver --run-unittests
3.5246 s average (HT on)
3.4804 s average (HT off)
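If you want to repeat this kind of timing yourself, something like the following could do the averaging for you. It is only a sketch: avg_time is a helper name I made up for illustration, it measures wall-clock time only, and it assumes GNU date (for the %N nanosecond format), which is what Debian ships.

```shell
# avg_time RUNS CMD [ARGS...]
# Hypothetical helper: runs CMD RUNS times and prints the mean wall-clock
# time in seconds. The body runs in a subshell so its variables stay local.
avg_time() (
    runs=$1
    shift
    total_ns=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s%N)      # GNU date: seconds + nanoseconds since the epoch
        "$@" > /dev/null 2>&1    # discard the command's own output
        end=$(date +%s%N)
        total_ns=$((total_ns + end - start))
        i=$((i + 1))
    done
    awk -v t="$total_ns" -v n="$runs" 'BEGIN { printf "%.4f\n", t / n / 1e9 }'
)
```

Usage would be something like `avg_time 5 minetestserver --run-unittests`, which prints a single number of seconds.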
For the mapgen test I used these settings:
static_spawnpoint= 0,0,0
mapgen v7 default flags
seed: testserver
Procedure: wait for server to initialize and idle, then join, wait for server to idle, execute /emergeblocks here 50, log results, delete world folder and repeat
mods:
airtanks
ambience
backpacks
basic_materials
bonemeal
border
clean
commons
compassgps
craftguide
dfcaverns
i3
3darmor
dmobs
doc
doors
ethereal
farming
footprints
hangglider
hazmat_suit
hbhunger_1.0.1
hot_air_balloons
hudbars_2.3.2
lucky_block
magma_conduits
mapgen_helper
mesecons
mob_horse
mobs_animal
mobs_monster
mobs_redo
mobs_water
moreblocks
moreores
multi_ip
orienteering_1.6
painting
pipeworks
radiant_damage
rangedweapons
redef
regional_weather_modpack
ropes
subterrane
technic
travelnet
vote
wardrobe
wielded_light
xpanes
The following results are the average of three runs:
55619 ms average (HT on)
50261 ms average (HT off)
So for this reason, all benchmarks from now on are run without hyperthreading.
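To put a number on the hyperthreading difference in the mapgen test, the relative change can be worked out from the two averages. This is just the arithmetic, with pct_less_time being a name I picked for the example:

```shell
# pct_less_time BASELINE_MS NEW_MS
# Prints how much less wall time NEW_MS is than BASELINE_MS, in percent.
pct_less_time() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

pct_less_time 55619 50261   # HT on vs HT off, prints 9.6
```

So turning hyperthreading off cut the mapgen time by a bit under 10% on this machine.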
Now here comes the gcc/g++ stuff.
In total I compiled the Minetest 5.4.0 release version (commit f3e51dca155ce1d1062a339cf925f41d7c751df8) 5 times. The average compilation time on this CPU is 23 minutes, and every time I compiled it I swear I felt the room getting hotter.
The minetest_game version used was commit 0a90bd8a0ec530f48e1bd9a438e24bd85cc9cd66.
All builds I made have spatialindex support and the only database backend is sqlite3.
These are the settings for all my setups and their respective benchmarks:
Normal minetest 5.4.0
$ cmake . -DRUN_IN_PLACE=TRUE -DBUILD_SERVER=TRUE -DBUILD_CLIENT=FALSE -DIRRLICHT_INCLUDE_DIR=path
3.468 s average unittests
52157 ms average mapgen test
$ cmake . -DRUN_IN_PLACE=TRUE -DBUILD_SERVER=TRUE -DBUILD_CLIENT=FALSE -DIRRLICHT_INCLUDE_DIR=path
3.4652 s average unittests
48619 ms average mapgen test
$ cmake . -DRUN_IN_PLACE=TRUE -DBUILD_SERVER=TRUE -DBUILD_CLIENT=FALSE -DIRRLICHT_INCLUDE_DIR=path -DCMAKE_BUILD_TYPE=MinSizeRel
19.1106 s average unittests
59755 ms mapgen test (executed only once)
To optimize for this CPU I used the same cmake flags:
$ cmake . -DRUN_IN_PLACE=TRUE -DBUILD_SERVER=TRUE -DBUILD_CLIENT=FALSE -DIRRLICHT_INCLUDE_DIR=path
The size of the minetestserver binary was 6.2 MiB for this one and the next build.
3.7758 s average unittests
48673 ms average mapgen test
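The compiler flags themselves are not shown in the commands above, so here is one way architecture flags could be passed to a build like this. Everything specific in it is an assumption for illustration, not what was actually used for these benchmarks: the -march=nocona value (GCC's target for 64-bit Pentium 4 class CPUs; -march=native would let GCC detect the build machine instead), the configure_p4 name, and the CMAKE override, which only exists so the command can be printed instead of executed.

```shell
# ASSUMPTION: illustrative flags only, not the ones used for the builds above.
P4_FLAGS="-march=nocona"

# configure_p4 IRRLICHT_INCLUDE_DIR
# Hypothetical wrapper around the cmake call from this post, with the
# architecture flags passed through CMake's standard CMAKE_C_FLAGS and
# CMAKE_CXX_FLAGS variables. Set CMAKE=echo to print the arguments only.
configure_p4() {
    "${CMAKE:-cmake}" . -DRUN_IN_PLACE=TRUE -DBUILD_SERVER=TRUE -DBUILD_CLIENT=FALSE \
        -DIRRLICHT_INCLUDE_DIR="$1" \
        -DCMAKE_C_FLAGS="$P4_FLAGS" -DCMAKE_CXX_FLAGS="$P4_FLAGS"
}
```

Running `CMAKE=echo configure_p4 path` prints the full argument list without touching the source tree, which is handy for checking the flags before committing to a 23 minute build.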
Pentium 4 optimized minetest 5.4.0 without gettext
Same flags
$ cmake . -DRUN_IN_PLACE=TRUE -DBUILD_SERVER=TRUE -DBUILD_CLIENT=FALSE -DIRRLICHT_INCLUDE_DIR=path
3.7832 s average unittests
50554 ms average mapgen test
This is the version which comes as the minetest-server package.
30710 ms mapgen test (one run)
UPDATE: I tried a new configuration which is the best so far
Pentium 4 optimized with gettext and native luajit and jsoncpp libraries Minetest 5.4.0
This time I forced the build to use jsoncpp from the system and installed the libraries for luajit. The size of the minetestserver binary is 5.4 MiB, perfect.
$ cmake . -DRUN_IN_PLACE=TRUE -DBUILD_SERVER=TRUE -DBUILD_CLIENT=FALSE -DIRRLICHT_INCLUDE_DIR=path -DENABLE_SYSTEM_JSONCPP=ON
29336 ms average mapgen test
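For comparing binary sizes between builds, a small helper like this could print the size in MiB. It is a sketch, and size_mib is a name I made up; it only uses wc and awk.

```shell
# size_mib FILE
# Hypothetical helper: prints the size of FILE in MiB with one decimal.
# The body runs in a subshell so its variable stays local.
size_mib() (
    bytes=$(wc -c < "$1") || exit 1
    awk -v b="$bytes" 'BEGIN { printf "%.1f\n", b / 1048576 }'
)
```

Assuming the binary ends up in bin/ as it normally does for a RUN_IN_PLACE build, `size_mib bin/minetestserver` would give the figure quoted above. Something like `ldd bin/minetestserver` should also show whether the system luajit and jsoncpp libraries were really linked in.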
So this is all. This is my best try without hyperthreading and I will be using it on my server for now. It gives me 90% more performance than my dev build running with hyperthreading, so it is really worth it. The reason I was fiddling around with gettext is that I was trying to figure out why my dev build performed better than my normal 5.4.0 release build. It seems that disabling gettext improves performance, but I would not be so sure. If someone, maybe a core dev, knows whether gettext internationalisation support is really needed on a server build, please let me know.
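The 90% figure comes from the two mapgen averages, 55619 ms for the dev build with hyperthreading and 29336 ms for the final build. As a quick check of the arithmetic (speedup is just a name chosen for this example):

```shell
# speedup OLD_MS NEW_MS
# Prints how many times faster NEW_MS is than OLD_MS, and the same thing
# expressed as percent more work done per unit of time.
speedup() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        r = old / new
        printf "%.2fx faster, %.1f%% more throughput\n", r, (r - 1) * 100
    }'
}

speedup 55619 29336   # prints: 1.90x faster, 89.6% more throughput
```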
I tried to be as specific as possible so that if you want to compare with my results you can perform your own tests. However, this is not an appropriate test because it involves many specific mods. It would be best to try with only devtest or minetest_game if we wanted to make an official benchmark database, which is not the goal of this post.