Optimizing updateAllFastFaceRows performance on Raspberry Pi

by **saldavonschwartz** » Fri Apr 07, 2017 02:18

Hi,

I am trying to get Minetest to run on a Raspberry Pi 3 at an acceptable framerate. A gprof run shows that a considerable amount of time is spent in updateAllFastFaceRows, which is in mapblock_mesh.cpp and is called as part of the MapBlockMesh constructor.

I tried a somewhat naive approach: multithreading the three loops in that method, each of which sequentially processes 16x16 faces via updateFastFaceRow. That is, assigning each of the loops that look like this:

Code:

for (s16 y = 0; y < MAP_BLOCKSIZE; y++) {
	for (s16 z = 0; z < MAP_BLOCKSIZE; z++) {
		updateFastFaceRow(data,
				v3s16(0,y,z),
				v3s16(1,0,0), //dir
				v3f  (1,0,0),
				v3s16(0,1,0), //face dir
				v3f  (0,1,0),
				dest);
	}
}

to a separate thread and then joining the results of the three threads back together. My idea was that, while this would not reduce the time complexity of the code, it would at least cut the running time by up to a factor of 3. But it did not result in any improvement.
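
For readers following along, the split described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Face`, `kBlockSize`, `buildFacesForAxis` and `buildAllFaces` are hypothetical stand-ins, not Minetest code, and the per-axis pass just emits one dummy face per row. Each thread gets its own output buffer, and the buffers are merged after the joins, so the threads never write to a shared `dest`.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in, NOT a Minetest type: a "face" only records the
// axis pass that produced it and its two row coordinates.
struct Face {
	int axis;
	int a;
	int b;
};

constexpr int kBlockSize = 16; // assumed to mirror MAP_BLOCKSIZE

// Hypothetical per-axis pass: emits one dummy face per row into a private
// buffer, in place of the real updateFastFaceRow calls.
void buildFacesForAxis(int axis, std::vector<Face> &out)
{
	assert(axis >= 0 && axis < 3);
	for (int a = 0; a < kBlockSize; a++)
		for (int b = 0; b < kBlockSize; b++)
			out.push_back(Face{axis, a, b});
}

// Runs the three passes on separate threads, each writing to its own
// buffer, then concatenates the buffers once every thread has been joined.
std::vector<Face> buildAllFaces()
{
	std::vector<Face> parts[3];
	std::vector<std::thread> workers;
	for (int axis = 0; axis < 3; axis++)
		workers.emplace_back(buildFacesForAxis, axis, std::ref(parts[axis]));
	for (std::thread &t : workers)
		t.join();

	std::vector<Face> merged;
	for (const std::vector<Face> &part : parts)
		merged.insert(merged.end(), part.begin(), part.end());
	return merged;
}
```

Build with thread support (for example `-pthread` on GCC). One possible reason a split like this shows no gain is that the work per block is small, so the cost of creating and joining three threads for every block may be comparable to the work itself; that is a guess, not something measured here.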

I also tried another approach, making each MapBlockMesh instantiation asynchronous (detached). That is, given the mesh update loop:

Code:

void MeshUpdateThread::doUpdate()
{
	QueuedMeshUpdate *q;
	while ((q = m_queue_in.pop())) {
		if (m_generation_interval)
			sleep_ms(m_generation_interval);
		ScopeProfiler sp(g_profiler, "Client: Mesh making");

		MapBlockMesh *mesh_new = new MapBlockMesh(q->data, m_camera_offset);

		MeshUpdateResult r;
		r.p = q->p;
		r.mesh = mesh_new;
		r.ack_block_to_server = q->ack_block_to_server;

		m_queue_out.push_back(r);

		delete q;
	}
}

I turned it into something like:

Code:

void MeshUpdateThread::doUpdate()
{
	QueuedMeshUpdate *q;
	while ((q = m_queue_in.pop())) {
		if (m_generation_interval)
			sleep_ms(m_generation_interval);
		ScopeProfiler sp(g_profiler, "Client: Mesh making");

		MapBlockMesh::createMesh(q->data, m_camera_offset, [=](MapBlockMesh* mesh_new) {
			MeshUpdateResult r;
			r.p = q->p;
			r.mesh = mesh_new;
			r.ack_block_to_server = q->ack_block_to_server;

			m_queue_out.push_back(r);

			delete q;
		});
	}
}

But this did not help either.

I am wondering if anyone:

1. Knows whether the mesh creation is actually CPU bound or is really GPU bound; gprof only reports CPU time, though I can confirm that commenting out updateAllFastFaceRows does bump up the framerate by approx. 40%.

2. Has any ideas or hints on what I might be able to optimize.
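
On the first question, one way to get a rough answer without gprof is to compare wall-clock time with the calling thread's CPU time around the suspect call. The sketch below is POSIX-only (it relies on `clock_gettime` with `CLOCK_THREAD_CPUTIME_ID`, which should be available on Linux on the Pi); `Timing`, `threadCpuMs`, `measure` and `sleepyWork` are hypothetical helper names, not part of Minetest. If the two numbers are close, the call is keeping the CPU busy; if wall time is much larger, the thread is mostly waiting on something else.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <time.h> // POSIX: clock_gettime, CLOCK_THREAD_CPUTIME_ID

// Hypothetical result type: elapsed wall-clock time and CPU time actually
// consumed by the calling thread, both in milliseconds.
struct Timing {
	double wall_ms;
	double cpu_ms;
};

// CPU time used so far by the calling thread only (POSIX).
static double threadCpuMs()
{
	timespec ts;
	int rc = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
	assert(rc == 0);
	(void)rc; // keeps release builds (NDEBUG) free of unused warnings
	return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

// Runs `work` once on the calling thread and reports both timings.
template <typename Fn>
Timing measure(Fn &&work)
{
	auto wall_start = std::chrono::steady_clock::now();
	double cpu_start = threadCpuMs();
	work();
	double cpu_end = threadCpuMs();
	auto wall_end = std::chrono::steady_clock::now();

	Timing t;
	t.wall_ms = std::chrono::duration<double, std::milli>(
			wall_end - wall_start).count();
	t.cpu_ms = cpu_end - cpu_start;
	return t;
}

// Example workload that mostly waits instead of computing.
void sleepyWork()
{
	std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
```

Calling `measure(sleepyWork)` should report a wall time of at least about 50 ms with far less CPU time. Wrapping the mesh construction in `measure` the same way would be the analogous check, with the caveat that it only covers the thread doing the measuring and says nothing direct about GPU load.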

by **Fixer** » Tue Apr 11, 2017 21:12

Is this related to your problem? https://github.com/minetest/minetest/pull/5239

by **paramat** » Wed Apr 12, 2017 16:43

Related commit https://github.com/minetest/minetest/co ... 75131f2ecc
