Thread pool

Motivation / Goal

My motivation for making a thread pool was to optimize my projects through threading. During my latest game project I told myself that I want to be able to play our next game project on a potato, because the games would lag and stutter when I played them on my old laptop. With a thread pool I should be able to optimize the game, both for a smoother experience and to make use of more of the computer's hardware.

Background / Reason

In some of my previous game projects, I've noticed that when played on lower-end hardware, certain levels can suffer from low frame rates. To address this, I wanted to develop a tool that boosts performance not just on high-end systems but also on older and slower hardware.

Breakdown

My thread pool is written in C++. Threads are created at startup and joined when the software closes. The number of threads can be specified at startup. Jobs are pushed as lambda functions, as shown below.

// With the use of my thread pool.

threadpool.PushJob([this, aTimeDelta]() {
	myParticleManager.SetCameraTransform(&camera.GetTGACamera().GetTransform());
	myParticleManager.Update(aTimeDelta);
});

// Without the use of my thread pool.
myParticleManager.SetCameraTransform(&camera.GetTGACamera().GetTransform());
myParticleManager.Update(aTimeDelta);

Since I wanted to create a tool for both myself and my group, it had to be intuitive and easy to push jobs to the thread pool. For that reason I decided to use lambda functions, as they act as a wrapper around code that is already there. This makes it easy to implement and to test whether a function is viable for threading.
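To make the description above concrete, here is a minimal sketch of what a pool with this kind of interface could look like. It is not my actual implementation: the class name SketchThreadPool, its members and the SumWithPool example are hypothetical and only illustrate the idea of creating the threads up front, pushing lambdas as jobs and joining the threads on shutdown. Unlike the worker loop shown further down, this sketch finishes all queued jobs before it shuts down.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool; names are illustrative only.
class SketchThreadPool
{
public:
    // Threads are created once, at startup.
    explicit SketchThreadPool(std::size_t aThreadCount)
    {
        assert(aThreadCount > 0);
        for (std::size_t i = 0; i < aThreadCount; ++i)
        {
            myThreads.emplace_back([this]() { WorkerLoop(); });
        }
    }

    // Threads are joined when the pool is destroyed.
    ~SketchThreadPool()
    {
        {
            std::lock_guard<std::mutex> lock(myMutex);
            myIsStopping = true;
        }
        myCondition.notify_all();
        for (std::thread& thread : myThreads)
        {
            thread.join();
        }
    }

    // Any callable without arguments can be pushed, typically a lambda.
    void PushJob(std::function<void()> aJob)
    {
        {
            std::lock_guard<std::mutex> lock(myMutex);
            myJobs.push(std::move(aJob));
        }
        myCondition.notify_one();
    }

private:
    void WorkerLoop()
    {
        while (true)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(myMutex);
                myCondition.wait(lock, [this]() { return myIsStopping || !myJobs.empty(); });
                if (myJobs.empty())
                {
                    return; // Stopping and nothing left to run.
                }
                job = std::move(myJobs.front());
                myJobs.pop();
            }
            job(); // Run the job outside the lock so other workers can continue.
        }
    }

    std::mutex myMutex;
    std::condition_variable myCondition;
    std::queue<std::function<void()>> myJobs;
    bool myIsStopping = false;
    std::vector<std::thread> myThreads;
};

// Usage example: add the numbers 1..100 with one job per number.
int SumWithPool()
{
    std::atomic<int> sum{0};
    {
        SketchThreadPool pool(4);
        for (int i = 1; i <= 100; ++i)
        {
            pool.PushJob([&sum, i]() { sum += i; });
        }
    } // The destructor drains the queue and joins the workers.
    return sum.load();
}
```

Because a job is just a lambda, moving existing code onto the pool only requires wrapping it, which is the property I was after. The captured data still has to be safe to touch from another thread, which is why the example uses std::atomic for the shared sum.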

While developing the thread pool, I encountered various scenarios that I hadn't anticipated. One common issue when working with threading and condition variables is spurious wakeups. A spurious wakeup occurs when a thread waiting on a condition variable wakes up without having been notified. A related case is a woken thread finding that another worker has already taken the job. In both cases the thread was woken at an inopportune time, so it has to re-check the condition under the lock after every wakeup and go back to sleep if there is still nothing to do. This is what the predicate passed to wait in the worker loop below is for.

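while (true)
{
    {
        // Sleep until there is a job or the pool is shutting down. The predicate is
        // re-checked under the lock on every wakeup, so a spurious wakeup just goes
        // back to sleep. Checking the queue only while holding the lock also avoids
        // reading myJobQueue while another thread modifies it.
        std::unique_lock<std::mutex> lock(myJobQueueMutex);
        myConditionalVariable.wait(lock, [this]() { return myDoJoinAllThreads || myJobQueue.size() != 0; });
        if (myDoJoinAllThreads)
        {
            return;
        }
    } // The lock is released here, before the job is fetched and run.

    // GetJob() returns an empty optional if another worker took the job first.
    std::optional unresolvedJob = GetJob();
    if (unresolvedJob.has_value())
    {
        (*unresolvedJob)();
    }
}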
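The predicate form of wait used in the loop above behaves like re-checking the condition in a while loop around a plain wait. The sketch below spells that out on a much smaller case. ReadyFlag, WaitUntilReady, SignalReady and RunReadyExample are hypothetical names made up for this illustration and are not part of the thread pool.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper type: a flag protected by a mutex.
struct ReadyFlag
{
    std::mutex mutex;
    std::condition_variable condition;
    bool isReady = false;
};

void WaitUntilReady(ReadyFlag& aFlag)
{
    std::unique_lock<std::mutex> lock(aFlag.mutex);
    // Re-check the condition after every wakeup. A wakeup without a matching
    // notify (a spurious wakeup) finds isReady still false and waits again.
    while (!aFlag.isReady)
    {
        aFlag.condition.wait(lock);
    }
    // Shorthand with the same behavior:
    // aFlag.condition.wait(lock, [&aFlag]() { return aFlag.isReady; });
    assert(aFlag.isReady); // Holds here no matter why the thread woke up.
}

void SignalReady(ReadyFlag& aFlag)
{
    {
        // Change the condition under the same mutex the waiter uses.
        std::lock_guard<std::mutex> lock(aFlag.mutex);
        aFlag.isReady = true;
    }
    aFlag.condition.notify_all();
}

// Usage example: one thread waits, the calling thread signals.
bool RunReadyExample()
{
    ReadyFlag flag;
    bool sawReady = false;
    std::thread waiter([&flag, &sawReady]() {
        WaitUntilReady(flag);
        sawReady = flag.isReady;
    });
    SignalReady(flag);
    waiter.join();
    return sawReady;
}
```

The same loop also covers the case where the signal arrives before the waiter has started waiting: the condition is already true, so the thread never goes to sleep and no notification is lost.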

To test my thread pool I wanted something close to a real game, but without the complexity. To highlight the thread pool I set up a test with one animated model blending between two animation states, one set of lines connecting the joints of the animated model, two models with animated shaders, and some particles.
By combining all the models and particles into one module, I could run the test with multiple instances of the module and show the results of both a single-threaded solution and a multi-threaded solution using the thread pool.
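The shape of such a comparison can be sketched as follows. This is not the actual test: FakeModule is a hypothetical stand-in for the real module, and std::async is used in place of my thread pool so that the example stays self-contained. The point is only to show the two update paths and how the time from the start to the end of an update can be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for one module (model, lines, shaders, particles).
struct FakeModule
{
    float state = 0.0f;
    void Update(float aTimeDelta) { state += aTimeDelta; }
};

// Single-threaded path: update every module in order.
void UpdateSequential(std::vector<FakeModule>& someModules, float aTimeDelta)
{
    for (FakeModule& module : someModules)
    {
        module.Update(aTimeDelta);
    }
}

// Multi-threaded path: split the modules into chunks and update the chunks in
// parallel. std::async is an assumption standing in for the thread pool.
void UpdateParallel(std::vector<FakeModule>& someModules, float aTimeDelta, std::size_t aTaskCount)
{
    assert(aTaskCount > 0);
    const std::size_t chunkSize = (someModules.size() + aTaskCount - 1) / aTaskCount;
    std::vector<std::future<void>> tasks;
    for (std::size_t begin = 0; begin < someModules.size(); begin += chunkSize)
    {
        const std::size_t end = std::min(begin + chunkSize, someModules.size());
        tasks.push_back(std::async(std::launch::async, [&someModules, aTimeDelta, begin, end]() {
            for (std::size_t i = begin; i < end; ++i)
            {
                someModules[i].Update(aTimeDelta);
            }
        }));
    }
    // The update is not finished until every chunk has finished.
    for (std::future<void>& task : tasks)
    {
        task.get();
    }
}

// Returns how long anUpdate took, in milliseconds.
template <typename UpdateFunction>
double MeasureMilliseconds(UpdateFunction&& anUpdate)
{
    const auto start = std::chrono::steady_clock::now();
    anUpdate();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

A call such as MeasureMilliseconds([&]() { UpdateParallel(modules, 0.016f, 4); }) then gives one sample for the multi-threaded path. With a module as trivial as FakeModule the parallel path will most likely be slower, since starting the tasks costs more than the work itself. Threading only pays off when each job does a meaningful amount of work.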

The "module"

Results

After finishing the thread pool and measuring the results, I am pleased with the performance increase it gave.

I want to highlight that this test was made without any significant performance optimizations other than the thread pool. No optimizations were made on the graphics side of this test. The delta time shown below is measured from the start of the update loop to the end of the update loop. There are still many possible improvements, both within the thread pool and, above all, in the graphics pipeline, as it is the current bottleneck.

With the thread pool

Modules   FPS (avg)   Delta time   CPU usage
1000      41          24.2 ms      33%
750       52          22.6 ms      33%
500       76          14.8 ms      33%
250       145         8.6 ms       30%

Without the thread pool

Modules   FPS (avg)   Delta time   CPU usage
1000      11          86.2 ms      6%
750       14          64.8 ms      6%
500       22          42.8 ms      6%
250       46          22.1 ms      6%
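
To put a number on the improvement, the speedup for each module count can be computed as the delta time without the thread pool divided by the delta time with it. The sketch below does that arithmetic with the delta times read from the tables above. The names Measurement, locMeasurements, Speedup and PrintSpeedups are hypothetical and only exist for this example.

```cpp
#include <array>
#include <cstdio>

// One row of the results: delta times in milliseconds, as read from the tables.
struct Measurement
{
    int modules;
    double withPoolMs;
    double withoutPoolMs;
};

constexpr std::array<Measurement, 4> locMeasurements = {{
    {1000, 24.2, 86.2},
    {750, 22.6, 64.8},
    {500, 14.8, 42.8},
    {250, 8.6, 22.1},
}};

// Speedup = time without the pool / time with the pool.
double Speedup(const Measurement& aMeasurement)
{
    return aMeasurement.withoutPoolMs / aMeasurement.withPoolMs;
}

// Prints one line per module count.
void PrintSpeedups()
{
    for (const Measurement& measurement : locMeasurements)
    {
        std::printf("%4d modules: %.2fx faster\n", measurement.modules, Speedup(measurement));
    }
}
```

With these numbers the update loop is roughly 2.6 to 3.6 times faster with the thread pool, with the largest gain at 1000 modules.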