optimizing updateAllFastFaceRows performance in raspberry pi

For people working on the C++ code.
saldavonschwartz
New member
 
Posts: 2
Joined: Thu Mar 23, 2017 05:49
GitHub: saldavonschwartz

by saldavonschwartz » Fri Apr 07, 2017 02:18

Hi,

I am trying to get Minetest to run on a Raspberry Pi 3 at an acceptable framerate. A gprof run suggests that a considerable amount of time is spent in updateAllFastFaceRows, which is in mapblock_mesh.cpp and is called as part of the MapBlockMesh constructor.

I tried a somewhat naive approach: multithreading the 3 loops in the method, each of which makes 16x16 sequential calls to updateFastFaceRow. That is, assigning each of the loops that look like this:

Code: Select all
for (s16 y = 0; y < MAP_BLOCKSIZE; y++) {
   for (s16 z = 0; z < MAP_BLOCKSIZE; z++) {
      updateFastFaceRow(data,
            v3s16(0, y, z),
            v3s16(1, 0, 0), // dir
            v3f  (1, 0, 0),
            v3s16(0, 1, 0), // face dir
            v3f  (0, 1, 0),
            dest);
   }
}


to a separate thread and then joining the 3 threads to combine their results. My idea was that, while this would not reduce the time complexity of the code, it would at least cut the running time by up to a factor of 3. But this didn't result in any improvement.
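For reference, the split described above might look roughly like the following minimal sketch. Face, buildAxisFaces and buildAllFaces are hypothetical stand-ins for illustration, not Minetest code, and the body of each sweep is a placeholder for the real updateFastFaceRow work. The one design choice shown is that each thread fills its own buffer, so nothing needs to lock a shared dest while the sweeps run.

```cpp
#include <array>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the engine's face data; not a Minetest type.
struct Face { int axis; int a; int b; };

constexpr int kBlockSize = 16; // mirrors MAP_BLOCKSIZE from the post

// Hypothetical stand-in for one 16x16 sweep of updateFastFaceRow calls.
// It only writes to `out`, which is owned by exactly one thread.
void buildAxisFaces(int axis, std::vector<Face> &out)
{
    for (int a = 0; a < kBlockSize; a++)
        for (int b = 0; b < kBlockSize; b++)
            out.push_back(Face{axis, a, b});
}

// Runs the three axis sweeps concurrently, then merges them in axis order.
std::vector<Face> buildAllFaces()
{
    std::array<std::vector<Face>, 3> partial;
    std::array<std::thread, 3> workers;

    for (int axis = 0; axis < 3; axis++)
        workers[axis] = std::thread(buildAxisFaces, axis, std::ref(partial[axis]));
    for (auto &w : workers)
        w.join(); // wait for every sweep before touching the buffers

    std::vector<Face> merged;
    merged.reserve(3 * kBlockSize * kBlockSize);
    for (const auto &p : partial)
        merged.insert(merged.end(), p.begin(), p.end());
    return merged;
}
```

Note that this spawns three threads per block. Thread creation has a cost of its own, so for sweeps this small the overhead may cancel out whatever is gained by running them in parallel.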

I also tried another approach, where I make each MapBlockMesh instantiation async / detached. That is, for the mesh update loop:

Code: Select all
void MeshUpdateThread::doUpdate()
{
   QueuedMeshUpdate *q;
   while ((q = m_queue_in.pop())) {
      if (m_generation_interval)
         sleep_ms(m_generation_interval);
      ScopeProfiler sp(g_profiler, "Client: Mesh making");

      MapBlockMesh *mesh_new = new MapBlockMesh(q->data, m_camera_offset);

      MeshUpdateResult r;
      r.p = q->p;
      r.mesh = mesh_new;
      r.ack_block_to_server = q->ack_block_to_server;

      m_queue_out.push_back(r);

      delete q;
   }
}


I turned it into something like:

Code: Select all
void MeshUpdateThread::doUpdate()
{
   QueuedMeshUpdate *q;
   while ((q = m_queue_in.pop())) {
      if (m_generation_interval)
         sleep_ms(m_generation_interval);
      ScopeProfiler sp(g_profiler, "Client: Mesh making");

      MapBlockMesh::createMesh(q->data, m_camera_offset, [=](MapBlockMesh* mesh_new) {
         MeshUpdateResult r;
         r.p = q->p;
         r.mesh = mesh_new;
         r.ack_block_to_server = q->ack_block_to_server;

         m_queue_out.push_back(r);

         delete q;
      });
   }
}


But this did not improve the framerate either.
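To make the idea of building several meshes concurrently concrete, here is a minimal sketch that uses std::async and futures instead of the callback-style createMesh above (whose implementation is not shown in this post). MeshInput, MeshResult, buildMesh and buildMeshesAsync are hypothetical names, and buildMesh is only a placeholder for the real mesh construction.

```cpp
#include <future>
#include <vector>

// Hypothetical stand-ins; not Minetest types.
struct MeshInput { int block_id; };
struct MeshResult { int block_id; int face_count; };

// Placeholder for the expensive mesh construction of a single block.
MeshResult buildMesh(MeshInput in)
{
    return MeshResult{in.block_id, 3 * 16 * 16};
}

// Starts one asynchronous build per input, then collects results in input order.
std::vector<MeshResult> buildMeshesAsync(const std::vector<MeshInput> &inputs)
{
    std::vector<std::future<MeshResult>> pending;
    pending.reserve(inputs.size());
    for (const MeshInput &in : inputs)
        pending.push_back(std::async(std::launch::async, buildMesh, in));

    std::vector<MeshResult> results;
    results.reserve(pending.size());
    for (auto &f : pending)
        results.push_back(f.get()); // blocks until that build has finished
    return results;
}
```

Unlike detached threads, the futures keep ownership of each result explicit, so nothing is freed or pushed to a queue from a thread whose lifetime is unknown. Whether this helps at all still depends on how many cores are actually idle while meshes are being built.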

I am wondering if anyone:

1. Knows whether mesh creation is actually CPU-bound or really GPU-bound; gprof only tells me CPU time, though I can confirm that commenting out updateAllFastFaceRows does indeed bump up the framerate by approx. 40%.

2. Has any ideas or hints on what I might be able to optimize.
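One rough way to approach the first question is to compare wall-clock time with CPU time around the suspect call. The sketch below is only an illustration under that assumption: Timing and measure are hypothetical helpers, not part of Minetest or gprof, and std::clock is taken to report process CPU time as it does on POSIX systems such as Raspbian.

```cpp
#include <chrono>
#include <ctime>

// Hypothetical helper type: elapsed wall-clock and CPU time in milliseconds.
struct Timing { double wall_ms; double cpu_ms; };

// Runs `work` once and reports how long it took on both clocks.
template <typename Fn>
Timing measure(Fn &&work)
{
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    work();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.wall_ms = std::chrono::duration<double, std::milli>(wall_end - wall_start).count();
    // std::clock counts CPU time for the whole process, summed over all threads.
    t.cpu_ms = 1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

If cpu_ms is close to wall_ms for the measured section, the work is most likely CPU-bound. If wall_ms is much larger, the thread is probably spending its time waiting (sleeping, blocked on a lock, or stalled on the graphics driver). Because CPU time is summed over all threads, other busy threads in the process will inflate cpu_ms, so the comparison is only a first hint.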
 


