Multithreading a Function is Slow? [SOLVED]

:information_source: Attention Topic was automatically imported from the old Question2Answer platform.
:bust_in_silhouette: Asked By qdeanc

I have this script that splits a task between all available CPU threads:

extends Node

export(int)var count_total = 120000000
export(bool)var multithreading = true
onready var CPU_thread_count = OS.get_processor_count()
var threads = []

func _ready():
	# Fall back to a single worker when multithreading is disabled.
	if not multithreading:
		CPU_thread_count = 1

	# Split the total count evenly between the workers (integer division).
	var count_per_thread = count_total / CPU_thread_count
	for thread_index in range(0, CPU_thread_count):
		var thread = Thread.new()
		threads.append(thread)
		var userdata = [thread_index, count_per_thread]
		thread.start(self, "_thread_function", userdata)
		print("Thread " + str(thread_index) + " dispatched")

func _thread_function(userdata):
	var time_started = OS.get_ticks_msec()

	# The "work": count down to zero.
	var count = userdata[1]
	while count > 0:
		count -= 1

	var time_finished = OS.get_ticks_msec()
	var process_time = (time_finished - time_started) / 1000.0
	print("Thread process time: " + str(process_time))

	# Join the thread from the main thread once this function has returned.
	var thread_index = userdata[0]
	call_deferred("_exit_thread", threads[thread_index])

func _exit_thread(thread: Thread):
	thread.wait_to_finish()

When I run the script, it takes about 6.5 seconds for a single thread to complete the task.
But when I enable multithreading, each thread takes about 4 seconds to complete its share of the task; if we add up those processing times across all 12 threads, that’s a total of 48 seconds of processing!

Is there something I’m doing wrong here? Shouldn’t multithreading be faster?

OS: Windows 10
CPU: Intel Core i7-8700K @ 3.7GHz, 6 cores, 12 threads

:bust_in_silhouette: Reply From: kelaia

I don’t have an answer to the question, but I can say that this is not unique to GDScript. I wrote a script in C# just to test it, and there too the task takes more time to run using multiple threads.

Code: using System;using System.Diagnostics;using System.Security.Cryptography;u - Pastebin.com
Results: (image not preserved)

What we can conclude from this is that to use multiple threads to solve a problem, we first need to know what we are doing. Splitting a task between threads does not always give a better result.

Thanks for confirming this further; that’s really concerning…

This essentially means that every thread I add will cause all other threads to slow down considerably. I don’t know how we’re supposed to take that into account when applying multithreading optimizations :frowning:

qdeanc | 2022-02-16 00:56

I personally like to use multithreading as a way to offload some tasks. This way I can have the UI thread always “free”.
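A minimal sketch of that offloading pattern in Godot 3.x GDScript, in case it helps. The names `worker`, `_do_heavy_work` and `_on_work_done` are made up for illustration, and the summing loop is only a stand-in for real work:

```gdscript
extends Node

var worker = Thread.new()

func _ready():
	# Start the heavy job in the background; _ready() returns immediately,
	# so the main (UI) thread keeps processing frames.
	worker.start(self, "_do_heavy_work", 5000000)

func _do_heavy_work(iterations):
	# Placeholder workload: sum the integers below `iterations`.
	var total = 0
	for i in range(iterations):
		total += i
	# Hand the result back to the main thread.
	call_deferred("_on_work_done", total)

func _on_work_done(total):
	# Runs on the main thread; join the worker so it is cleaned up.
	worker.wait_to_finish()
	print("Background result: " + str(total))
```

The point here is responsiveness rather than raw speed: the job takes just as long, but the main thread is not blocked while it runs.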

kelaia | 2022-02-16 01:46

:bust_in_silhouette: Reply From: rossunger

Multithreading is not a trivial task, and you probably don’t need to use it manually unless you have a CPU bottleneck somewhere, or something that really benefits from parallel execution.

Spinning up a new thread has overhead. Accessing heap memory has overhead. If you’re only using it for a few seconds, it’s not going to be worth it. It doesn’t look like your function actually does anything either, so it’s not a good indicator of multithreading efficiency. You’re getting all overhead, no work!

Dispatching 12 threads and iteratively decrementing an integer per-thread causes that much overhead?

I tried doubling the decrement count and each thread now takes 7 times longer (84 seconds) to process than if it were single-threaded. Is that all overhead?

Isn’t multithreading designed to handle iterative processes like this? Like A* pathfinding?
Isn’t it counter-intuitive if it results in such extreme overhead and slows down other processes massively?

qdeanc | 2022-02-16 01:43

Is this a debug build? Or a release build? Did you try the profiler to see where the time is being spent?

Decrementing an integer is not a real world use-case for multithreading. Have you considered trying something more complicated?

I definitely am no expert on multithreading, so if you actually know what you are talking about please let me know and I’ll step back from this conversation.

rossunger | 2022-02-16 03:47

As mentioned above, I’m using 3.4.2 - stable, which is the latest release build.

Decrementing an integer isn’t a real world use-case, but I’d like to know why it’s hurting performance anyways; and in 8+ hours of reading articles and forum threads on multithreading bottlenecks for general programming, I haven’t come across any mention of overhead from heap allocation. The profiler shows minimal memory usage and expected CPU usage, as the threading maxes out all of my CPU cores.

That being said, I’ve swapped out the decrement function for a function that counts prime numbers within a range, which is a lot like decrementing but with a division operation and conditional statement. As far as the logic goes, it’s very similar to a pathfinding algorithm (lots of conditional statements and memory allocations), which is a common function used in multithreading. The performance still suffers from multithreading, though.

I’m still learning all this myself, and I appreciate your efforts, but I think that blaming the performance hit on general overhead is leading me in the wrong direction. That is unless Godot specifically suffers from multithreading overhead.

qdeanc | 2022-02-16 04:50

If you’re running in editor, that’s a debug build… If you export the project and run without editor that’s a release build in my mind.
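A quick way to check which kind of build a benchmark is actually running in, as a rough sketch for Godot 3.x. `describe_build` is a hypothetical helper name; `OS.has_feature("editor")` and `OS.is_debug_build()` are the engine calls it relies on:

```gdscript
extends Node

# Returns a short label for the kind of build this script is running in.
func describe_build() -> String:
	# The "editor" feature tag is set when the project runs on an editor binary,
	# which includes pressing Play inside the editor.
	if OS.has_feature("editor"):
		return "editor"
	# Exported with a debug export template.
	if OS.is_debug_build():
		return "exported debug build"
	return "exported release build"

func _ready():
	print("Benchmark running in: " + describe_build())
```

Printing this next to the timings makes it obvious when numbers from an editor run are being compared with numbers from an exported build.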

Didn’t mean to point you in the wrong direction! Sorry. I’m kinda curious myself now. Please let us know if you find anything.

rossunger | 2022-02-16 04:55

Also, I’m curious whether you get the same result if you try the same thing in C++ via a GDNative plugin, and also in a standalone C++ app.

rossunger | 2022-02-16 04:57

Oh. Crap.

I just tried exporting the game to an exe (debug build) and the multithreading works fine.
Splitting the decrementing operation between 6 threads now brings the processing time down from 6 seconds to roughly 1.2 seconds per thread.
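For comparing runs like this, it can help to time the whole batch instead of reading each thread's own timer. The sketch below is one way to do that in Godot 3.x; the thread count of 6 and the name `_count_down` are assumptions for illustration, and the workload is the same countdown as in the original script. Going by the figures above, 6 seconds down to 1.2 seconds is roughly a 5x speedup on 6 threads.

```gdscript
extends Node

const COUNT_TOTAL = 120000000  # same total as in the original script

func _ready():
	var thread_count = 6  # assumption: one worker per physical core
	var count_per_thread = COUNT_TOTAL / thread_count
	var threads = []

	var started = OS.get_ticks_msec()
	for i in range(thread_count):
		var thread = Thread.new()
		threads.append(thread)
		thread.start(self, "_count_down", count_per_thread)

	# Block until every worker has finished, then stop the clock.
	for thread in threads:
		thread.wait_to_finish()
	var elapsed = (OS.get_ticks_msec() - started) / 1000.0
	print("Wall-clock time for the whole batch: " + str(elapsed))

func _count_down(count):
	while count > 0:
		count -= 1
```

Note that joining inside `_ready()` freezes the main thread until the workers are done, which is fine for a benchmark but not something to do in a real game loop.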

Thank you for reminding me. I guess that solves everything. Sorry, I should have tried that before even posting!!

qdeanc | 2022-02-16 05:21

And yeah it’s totally overhead from running it in the editor ;-;

qdeanc | 2022-02-16 05:55

:bust_in_silhouette: Reply From: qdeanc

GDScript struggles to call two or more functions at the same time. Note that this is an issue with calling the functions; the logic within the functions can run concurrently. This is true even for Godot 4.0. For efficient multithreaded function calling, you have to use C++. This means using GDNative or a Module. (I use GDNative.)

GDScript can perform multithreading efficiently, as long as you avoid concurrent function calling. Doing so is pretty limiting, though.
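To illustrate what avoiding function calls inside a thread can look like, here is a rough sketch with two versions of the prime-counting test mentioned earlier. All function names are made up for this example, and how much the inlined version helps is not something this sketch measures; it only shows the shape of the change.

```gdscript
extends Node

# Variant A: calls a helper function once per candidate number.
# range_data is [first, last_exclusive].
func count_primes_with_calls(range_data):
	var found = 0
	for n in range(range_data[0], range_data[1]):
		if _is_prime(n):
			found += 1
	return found

func _is_prime(n):
	if n < 2:
		return false
	var d = 2
	while d * d <= n:
		if n % d == 0:
			return false
		d += 1
	return true

# Variant B: the same trial-division logic written inline,
# so the hot loop makes no GDScript function calls.
func count_primes_inline(range_data):
	var found = 0
	for n in range(range_data[0], range_data[1]):
		if n < 2:
			continue
		var is_prime = true
		var d = 2
		while d * d <= n:
			if n % d == 0:
				is_prime = false
				break
			d += 1
		if is_prime:
			found += 1
	return found
```

Either variant can be handed to a thread with something like `thread.start(self, "count_primes_inline", [1, 100000])`, and `thread.wait_to_finish()` returns the count. Timing both across several threads in an exported build should show whether the per-call cost matters for a given workload.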

If your game is slow due to repeated function calling, GDScript multithreading is not the solution. You should first port your code to C++ (which should give you a significant performance boost). Once your code is in C++, then you can benefit from multithreading.

For more information, please read this post: Poor multithreading performance when calling functions within a thread · Issue #58279 · godotengine/godot · GitHub