Tuesday, November 5, 2019

Parallel programming is really powerful.

Making a digital human usually requires a lot of image processing. Each image is typically larger than 3000 x 5000 pixels, and with around 10 such images, combining them takes anywhere from a few seconds to a few minutes.

When I implement something, I usually get the logic working as quickly as possible and optimize it afterwards. When I profiled my image processing logic, it took around 2 seconds.

for ( int y = 0; y < height; ++y ) {
    for ( int x = 0; x < width; ++x ) {
        // processing logic here
    }
}

<<Traditional 2D image processing driver code>>

My processing logic was simple: grab a pixel from several different textures, combine them, and write the result to a specific texture. Since each output pixel can be computed without depending on the others, the logic could be parallelized, so I changed it and profiled again.
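To make the shape of that logic concrete, here is a minimal serial sketch. The Image struct, combinePixel, and combineImages are hypothetical names, and averaging two single-channel pixels is only an assumed combine rule, not the actual logic used for the digital human. The point is that each destination pixel reads only the matching pixel of each source, which is what makes the loop safe to parallelize.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-channel image; real textures would carry more channels.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels; // row-major, width * height entries

    Image(int w, int h, std::uint8_t fill = 0)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, fill) {}

    std::uint8_t& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Assumed combine rule for illustration: average the two source pixels.
inline std::uint8_t combinePixel(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a + b) / 2);
}

// Serial driver: reads pixel (x, y) of each source and writes pixel (x, y) of dst.
// No iteration touches data written by another iteration.
inline void combineImages(const Image& a, const Image& b, Image& dst) {
    assert(a.width == b.width && a.height == b.height);
    assert(a.width == dst.width && a.height == dst.height);
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            dst.at(x, y) = combinePixel(a.at(x, y), b.at(x, y));
        }
    }
}
```

If the processing logic instead read neighbouring pixels of the destination while writing it, the iterations would no longer be independent and the loop could not be parallelized this directly.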

The original logic took 2 seconds and the new parallelized logic took 0.3 seconds, which is roughly 7 times faster!

How to change
First of all, you should include ppl.h, the header for the Parallel Patterns Library that ships with Visual C++.

#include <ppl.h>

parallel_for, which lives in the concurrency namespace, is the function we are going to use.

concurrency::parallel_for(0, 100, [](int value) {
    // processing logic here; value runs over 0..99 in no particular order,
    // and this block is called in parallel
});

The lambda is called 100 times, and the order of the calls is not determined. As we saw in the code block above, traditional 2D image processing uses x and y to access a pixel, but parallel_for takes a single 1D value. We can use a mapping function to convert between them. (This is also a traditional conversion in the game industry.)

const int linearCount = height * width;
concurrency::parallel_for(0, linearCount, [&](int linearIndex) {
    // recover the 2D pixel coordinate from the 1D index (row-major order)
    int y = linearIndex / width;
    int x = linearIndex % width;

    // processing logic here.
} );

Using the / and % operators, we can recover x and y from the 1D value, and the whole loop can be parallelized.
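The mapping can be checked in isolation with a small sketch. PixelCoord, toPixelCoord, toLinearIndex, and coversEveryPixelOnce are hypothetical helper names introduced here for illustration, and a plain serial loop stands in for parallel_for so that the snippet compiles without ppl.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type holding a 2D pixel coordinate.
struct PixelCoord {
    int x;
    int y;
};

// 1D index -> 2D coordinate, assuming row-major layout.
inline PixelCoord toPixelCoord(int linearIndex, int width) {
    return PixelCoord{ linearIndex % width, linearIndex / width };
}

// 2D coordinate -> 1D index, the inverse of toPixelCoord.
inline int toLinearIndex(int x, int y, int width) {
    return y * width + x;
}

// Walks every linear index and counts how often each pixel is reached.
// Returns true if each pixel of the image was visited exactly once.
inline bool coversEveryPixelOnce(int width, int height) {
    assert(width > 0 && height > 0);
    std::vector<int> visits(static_cast<std::size_t>(width) * height, 0);
    const int linearCount = width * height;
    for (int linearIndex = 0; linearIndex < linearCount; ++linearIndex) {
        const PixelCoord p = toPixelCoord(linearIndex, width);
        if (p.x < 0 || p.x >= width || p.y < 0 || p.y >= height) {
            return false;
        }
        ++visits[static_cast<std::size_t>(toLinearIndex(p.x, p.y, width))];
    }
    for (int count : visits) {
        if (count != 1) {
            return false;
        }
    }
    return true;
}
```

For example, with a width of 5, linear index 7 maps to x = 2, y = 1. One thing to keep in mind: height * width has to fit in an int. A 3000 x 5000 image gives 15,000,000, which is fine, but much larger images would need a wider index type.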

Try applying it to your project!
This is a very simple and easy way to improve performance :)

Another tip
Measuring performance in modern C++ is really easy with std::chrono.

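#include <chrono>

// take a timestamp before the logic
auto t1 = std::chrono::high_resolution_clock::now();

// logic

// take a timestamp after the logic and convert the difference to milliseconds
auto t2 = std::chrono::high_resolution_clock::now();
auto t = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();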
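If you time several variants, the same pattern can be wrapped in a small helper. This is only a sketch: measureMilliseconds is a hypothetical name, and it uses std::chrono::steady_clock rather than high_resolution_clock, because steady_clock is guaranteed to be monotonic while high_resolution_clock may be an alias for the system clock on some implementations.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed wall time
// in whole milliseconds.
template <typename Func>
long long measureMilliseconds(Func&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Func>(work)();
    const auto end = std::chrono::steady_clock::now();
    assert(end >= start); // steady_clock never goes backwards
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

You would pass the code under test as a lambda, for example measureMilliseconds([&] { /* logic */ }), once for the serial loop and once for the parallel one, and compare the two results.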

