Thursday, 2 March 2017

SSE2 QEF implementation

I've ported the existing QEF implementation to SSE2. You can grab the code here: https://github.com/nickgildea/qef/blob/master/qef_simd.h

It should be possible to drop that into your project as a replacement, as long as you enable the SSE2 extensions when building. I saw a ~2.5x speedup in my tests.

9 comments:

  1. Okay, noob question:

    What exactly does a QEF do?

    Replies
    1. You can think of it as doing something similar to the "line of best fit" you might have drawn on a graph in school, except in 3D and with a point of best fit rather than a line. You have the input points, and the QEF minimises the error introduced by approximating them with a single new position/vertex, giving a "point of best fit".

      Something like this, but in 3D: https://en.wikipedia.org/wiki/Least_squares
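
To make the "point of best fit" idea concrete, here is a minimal sketch. It is not the code in qef_simd.h: it assumes each input is a surface point plus a unit normal, and it swaps in a plain normal-equations solve with Cramer's rule for brevity. The names `Vec3`, `PlaneSample`, `det3` and `solveQefNaive` are made up for this illustration.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

// Hypothetical input type: a point on the surface and the unit normal there.
struct PlaneSample { Vec3 point; Vec3 normal; };

// Determinant of a 3x3 matrix by cofactor expansion along the first row.
static double det3(const double m[3][3])
{
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Minimises sum_i (n_i . (x - p_i))^2 by solving (A^T A) x = A^T b,
// where row i of A is n_i and b_i = n_i . p_i.
// Returns false when A^T A is (near) singular, e.g. all normals parallel.
bool solveQefNaive(const std::vector<PlaneSample>& samples, Vec3& out)
{
    double ata[3][3] = {};
    double atb[3] = {};

    for (const PlaneSample& s : samples) {
        const double n[3] = { s.normal.x, s.normal.y, s.normal.z };
        const double d = n[0] * s.point.x + n[1] * s.point.y + n[2] * s.point.z;
        for (int r = 0; r < 3; ++r) {
            for (int c = 0; c < 3; ++c) {
                ata[r][c] += n[r] * n[c];
            }
            atb[r] += n[r] * d;
        }
    }

    const double det = det3(ata);
    if (std::fabs(det) < 1e-12) {
        return false;
    }

    // Cramer's rule: replace column k with A^T b and take the determinant ratio.
    double result[3];
    for (int k = 0; k < 3; ++k) {
        double m[3][3];
        for (int r = 0; r < 3; ++r) {
            for (int c = 0; c < 3; ++c) {
                m[r][c] = (c == k) ? atb[r] : ata[r][c];
            }
        }
        result[k] = det3(m) / det;
    }

    out.x = result[0];
    out.y = result[1];
    out.z = result[2];
    return true;
}
```

Three mutually perpendicular planes that all pass through one point give back exactly that point. When the normals are all parallel there is no unique minimiser and this sketch simply reports failure, which is the kind of case a more robust solver has to handle.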

  2. How does this compare to the OpenCL version? Also, do you have any plans to extend this to other instruction sets, like AVX?

    Replies
    1. The use case is the main difference. The OpenCL version has a lot of overhead attached if you want to use OpenCL _just_ to calculate the QEFs (i.e. everything else is on the CPU). This version is for when the code is all running on the CPU.

      I had a look but didn't see many opportunities for taking advantage of the wider registers in AVX. The obvious case is the matrix multiply, but that didn't really make any difference.

  3. Nice to see this blog is still going! Quick question: how are you managing chunk creation/destruction and chunk positions in relation to the camera distance? A chunk manager class? If so, does it also handle LOD, or do the chunks themselves do that?

    Replies
    1. Sorry I missed your comment until now! Yeah, I have a Clipmap class which maintains the chunks in an octree, and the LOD is decided by walking the octree and selecting the "active" nodes which will be drawn, based on the distance from each node to the camera. If you check the code on GitHub, it's all in clipmap.cpp. Hope that helps.
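
The octree walk described above could look roughly like the sketch below. This is not the code from clipmap.cpp: the `Node` layout, the `selectActiveNodes` name and the `size * lodFactor` distance rule are all assumptions made for illustration.

```cpp
#include <cmath>
#include <memory>
#include <vector>

// Hypothetical octree node: a cube with up to eight children.
struct Node {
    float centre[3];
    float size;                          // edge length of the cube
    std::unique_ptr<Node> children[8];   // all null for a leaf
};

// Walks the tree and collects the nodes to draw. Assumed rule: a node is
// coarse enough to draw as-is once the camera is farther away than
// size * lodFactor; otherwise its children are visited instead.
void selectActiveNodes(const Node* node, const float camera[3],
                       float lodFactor, std::vector<const Node*>& active)
{
    if (!node) {
        return;
    }

    const float dx = node->centre[0] - camera[0];
    const float dy = node->centre[1] - camera[1];
    const float dz = node->centre[2] - camera[2];
    const float dist = std::sqrt(dx * dx + dy * dy + dz * dz);

    bool hasChildren = false;
    for (const auto& child : node->children) {
        if (child) {
            hasChildren = true;
            break;
        }
    }

    // Leaves are always drawn; interior nodes only when far enough away.
    if (!hasChildren || dist > node->size * lodFactor) {
        active.push_back(node);
        return;
    }

    for (const auto& child : node->children) {
        selectActiveNodes(child.get(), camera, lodFactor, active);
    }
}
```

Calling this once per frame from the root yields a list of nodes in which distant regions are covered by large, coarse nodes and nearby regions by small, detailed ones.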
