Multithreading worker issue #21959

Open
SoftwareMechanic opened this issue May 17, 2024 · 3 comments

@SoftwareMechanic

SoftwareMechanic commented May 17, 2024

Version: 3.1.39

I am compiling and running a program that uses a third-party C++ library.

The library is multithreaded. In theory it uses the number of processors as its maximum number of threads (I pass that number in as an input, obtained from std::thread::hardware_concurrency()).
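
A side note on the thread count: std::thread::hardware_concurrency() is allowed to return 0 when the value cannot be determined, so it may be worth clamping it before handing it to the library. The sketch below uses a hypothetical helper, pick_thread_count, which is not part of the library; it also caps the count at the number of tasks, in the same spirit as the process_concurrently function shown further down.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the library): choose a worker thread count.
// `hardware_threads` would typically be std::thread::hardware_concurrency(),
// which may legitimately be 0 when the value is unknown.
std::size_t pick_thread_count(std::size_t hardware_threads, std::size_t task_count) {
    // Never go below one thread.
    std::size_t n = std::max<std::size_t>(hardware_threads, 1);
    // There is no point in having more threads than tasks.
    if (task_count > 0) {
        n = std::min(n, task_count);
    }
    assert(n >= 1);
    return n;
}
```

A call might look like pick_thread_count(std::thread::hardware_concurrency(), tasks.size()), where tasks stands for whatever container holds the work items.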

However, when this program is compiled with Emscripten and run in the browser, it seems to create a new web worker for each iteration instead of reusing the threads/workers that were already created, as it should.

Multithreading starts at init_future_ = std::async(std::launch::async, [this]() { process_concurrently(); });:

bool initialize(){
...
if (num_threads_ != 1) {
	collect();

	init_future_ = std::async(std::launch::async, [this]() { process_concurrently(); });

	// wait for the first element, because after init(), get() can be called.
	// so the element conversion must succeed
	initialization_outcome_ = wait_for_element();
} else {
	initialization_outcome_ = create();
}
...
}

This is the process_concurrently function:

void process_concurrently() {
	size_t conc_threads = num_threads_;
	if (conc_threads > tasks_.size()) {
		conc_threads = tasks_.size();
	}

	kernel_pool.reserve(conc_threads);
	for (unsigned i = 0; i < conc_threads; ++i) {
		kernel_pool.push_back(new MAKE_TYPE_NAME(Kernel)(kernel));
	}

	std::vector<std::future<geometry_conversion_task*>> threadpool;			
	
	for (auto& rep : tasks_) {
		MAKE_TYPE_NAME(Kernel)* K = nullptr;
		if (threadpool.size() < kernel_pool.size()) {
			K = kernel_pool[threadpool.size()];
		}

		while (threadpool.size() == conc_threads) {
			for (int i = 0; i < (int)threadpool.size(); i++) {
				auto& fu = threadpool[i];
				std::future_status status;
				status = fu.wait_for(std::chrono::seconds(0));
				if (status == std::future_status::ready) {
					process_finished_rep(fu.get());							

					std::swap(threadpool[i], threadpool.back());
					threadpool.pop_back();
					std::swap(kernel_pool[i], kernel_pool.back());
					K = kernel_pool.back();
					break;
				} // if
			}   // for
		}     // while

		std::future<geometry_conversion_task*> fu = std::async(
			std::launch::async, [this](
				IfcGeom::MAKE_TYPE_NAME(Kernel)* kernel,
				const IfcGeom::IteratorSettings& settings,
				geometry_conversion_task* rep) {
					this->create_element_(kernel, settings, rep); 
					return rep;
				},
			K,
			std::ref(settings),
			&rep);

		if (terminating_) {
			break;
		}

		threadpool.emplace_back(std::move(fu));
	}

	for (auto& fu : threadpool) {
		process_finished_rep(fu.get());
	}

	finished_ = true;

	if (!terminating_) {
		Logger::Status("\rDone creating geometry (" + boost::lexical_cast<std::string>(all_processed_elements_.size()) +
			" objects)                                ");
	}
}

I think this could be somehow related to https://github.com/emscripten-core/emscripten/issues/8201

P.S. I tried setting PTHREAD_POOL_SIZE to 490 (just as a test), and the iteration then gets through roughly 490 objects.

This is an example of how I use the iterator:

if (geom_iterator_serializer.initialize()) {
    do {
        // Do my stuff.
        // Every iteration here seems to use a new thread/worker,
        // which leads to the errors below.
    } while (geom_iterator_serializer.next());
}

Errors:

Tried to spawn a new thread, but the thread pool is exhausted.
This might result in a deadlock unless some threads eventually exit or the code explicitly breaks out to the event loop.
If you want to increase the pool size, use setting `-sPTHREAD_POOL_SIZE=...`.

worker.js onmessage() captured an uncaught exception: std::__2::system_error: thread constructor failed: Resource temporarily unavailable

Uncaught exception from main loop: ffbd51c6-1c0a-44cc-b833-e08ba3ca66c0:9:99729

unwind

Halting program

Pthread 0x4c147998 sent an error! blob:http://localhost:58010/ffbd51c6-1c0a-44cc-b833-e08ba3ca66c0:9: Error: thread constructor failed: Resource temporarily unavailable

Any tips would be much appreciated!
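
For comparison with the one-std::async-per-task pattern above, here is a minimal sketch of a fixed-size pool that creates its threads once and reuses them for every task. It is an illustration only, not how the library works: FixedPool and sum_of_squares are hypothetical names, and the tasks are reduced to functions returning int. Under Emscripten it would presumably still need -sPTHREAD_POOL_SIZE to be at least the number of pool threads.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: spawns `count` threads once and reuses them.
class FixedPool {
public:
    explicit FixedPool(std::size_t count) {
        assert(count > 0);
        for (std::size_t i = 0; i < count; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    ~FixedPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_all();
        for (auto& w : workers_) w.join();  // every thread exits before we return
    }

    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Queue a task; the future becomes ready once a worker has run it.
    std::future<int> submit(std::function<int()> fn) {
        auto task = std::make_shared<std::packaged_task<int()>>(std::move(fn));
        std::future<int> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push([task] { (*task)(); });
        }
        ready_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to do
                job = std::move(queue_.front());
                queue_.pop();
            }
            job();  // run outside the lock so other workers can pick up jobs
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> queue_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};

// Example use: run n squaring tasks on `threads` reusable workers, sum the results.
int sum_of_squares(int n, std::size_t threads) {
    FixedPool pool(threads);
    std::vector<std::future<int>> results;
    for (int i = 1; i <= n; ++i) {
        results.push_back(pool.submit([i] { return i * i; }));
    }
    int total = 0;
    for (auto& r : results) total += r.get();
    return total;
}
```

The point of the design is that the number of threads ever created equals the pool size, no matter how many tasks are submitted.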

@SoftwareMechanic SoftwareMechanic changed the title Threads management Threads errors with std::async May 17, 2024
@sbc100
Collaborator

sbc100 commented May 17, 2024

Emscripten will only create new workers when there are no unused workers in the pool.

Workers are returned to the "unused" pool each time a C++ thread exits.

If you are seeing the pool continue to grow, then that means your C++ threads are probably not exiting correctly and Emscripten thinks that they are still running. Perhaps there is some issue with the way C++ threads are built on top of pthreads? Perhaps libc++ is pooling pthreads at some level?

I wonder if you could share a simple example of a program that continues to grow the worker pool even though the number of active C++ threads is not going up?
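
One possible shape for such a test, as a sketch only: keep at most a fixed number of std::async tasks in flight and record the highest number of tasks observed running at once. If the worker pool keeps growing while this peak stays at or below the limit, the growth is not explained by the number of active C++ threads. The name peak_concurrency and the 1 ms sleep are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Hypothetical probe: runs `tasks` short jobs with at most `limit` in flight
// and returns the highest number of jobs seen running at the same time.
int peak_concurrency(int tasks, std::size_t limit) {
    assert(limit > 0);
    std::atomic<int> active{0};
    std::atomic<int> peak{0};
    std::vector<std::future<void>> in_flight;

    for (int t = 0; t < tasks; ++t) {
        // Wait for the oldest job before launching another one.
        if (in_flight.size() == limit) {
            in_flight.front().get();
            in_flight.erase(in_flight.begin());
        }
        in_flight.push_back(std::async(std::launch::async, [&active, &peak] {
            int now = ++active;
            // Raise `peak` to `now` if `now` is larger (lock-free max).
            int seen = peak.load();
            while (now > seen && !peak.compare_exchange_weak(seen, now)) {
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            --active;
        }));
    }
    // Drain the remaining jobs so the captured references stay valid.
    for (auto& f : in_flight) f.get();
    return peak.load();
}
```

In a browser build, the returned peak could be compared against the number of workers the runtime reports having created.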

@SoftwareMechanic
Author

SoftwareMechanic commented May 18, 2024

Thank you for the response, @sbc100!

I found that even the following code will, at some point, try to create new workers:

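std::cout << "Test threads!" << std::endl;

while (true) {
    std::thread test1(func);
    test1.detach();
    // Explicit destructor call, kept as in my test. It is not needed:
    // test1 is destroyed again automatically at the end of each iteration.
    test1.~thread();
    std::this_thread::sleep_for(std::chrono::milliseconds(1000));
}

// Thread entry point; it must be declared before the loop above.
void func()
{
    std::cout << "I'm a thread!\n";
}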

I can see the output printed, but after some iterations it fails with the same error.
Shouldn't detach free the worker?

I also tried disallowing memory growth, but it doesn't change the result.

I am sorry if this is a noob error, I am new to Emscripten, so please bear with me.

EDIT:
Setting -sPTHREAD_POOL_SIZE_STRICT=0 gives the same behaviour, but the errors are no longer printed.

EDIT2:
Using std::thread::join() instead of detach works with that simple example, and every thread created has the same ID.

if you need more info don't hesitate to ask!
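
For reference, a minimal sketch of the join-based variant mentioned in EDIT2. Each thread is joined before the next one is created, so at most one extra thread exists at any time. The function name run_sequential_threads and the counter are hypothetical and stand in for the real work.

```cpp
#include <cassert>
#include <thread>

// Hypothetical variant of the test loop: join each thread before creating
// the next one, then report how many thread functions ran to completion.
int run_sequential_threads(int iterations) {
    assert(iterations >= 0);
    int completed = 0;
    for (int i = 0; i < iterations; ++i) {
        std::thread worker([&completed] { ++completed; });
        // join() returns only after the thread function has finished,
        // so reading and writing `completed` here is race-free.
        worker.join();
    }
    return completed;
}
```

Unlike detach, join gives the caller a definite point at which the thread has finished, which matches the observation that the simple example works with it.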

@SoftwareMechanic SoftwareMechanic changed the title Threads errors with std::async Multithreading worker issue May 18, 2024
@xrui94

xrui94 commented May 19, 2024

You can see this explanation, and I believe you will get a good solution.
