Start with man pthreads and look over what's available. Threads, since they all live in one address space, can communicate with each other through boolean flags, e.g. for things like "please stop doing what you're doing now." (Declare such a flag atomic, or read and write it under a mutex, so that the accesses are well-defined.) To coordinate their activities, they use mutexes (mutual-exclusion locks) and condition variables.
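Here is a minimal sketch of the "please stop" flag, assuming a C11 compiler with <stdatomic.h>. The names stop_requested, iterations, busy_worker and run_until_stopped are made up for illustration. An atomic_bool is used because a plain bool shared between threads without any synchronization would be a data race.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: a stop flag and a progress counter. */
static atomic_bool stop_requested;
static atomic_long iterations;

/* The worker keeps "working" until another thread asks it to stop. */
static void *busy_worker(void *arg) {
    (void)arg;
    while (!atomic_load(&stop_requested))
        atomic_fetch_add(&iterations, 1);   /* stand-in for real work */
    return NULL;
}

/* Starts one worker, lets it make some progress, then stops it.
   Returns the number of iterations the worker completed, or -1 on error. */
long run_until_stopped(void) {
    pthread_t tid;

    atomic_store(&stop_requested, false);
    atomic_store(&iterations, 0);
    if (pthread_create(&tid, NULL, busy_worker, NULL) != 0)
        return -1;

    /* Wait until the worker has done at least 1000 iterations. */
    while (atomic_load(&iterations) < 1000)
        sched_yield();

    atomic_store(&stop_requested, true);    /* "please stop now" */
    pthread_join(tid, NULL);
    return atomic_load(&iterations);
}
```

The flag is fine for a simple stop request like this. For anything where a thread has to sleep until something happens, a mutex and a condition variable are the better tools, since polling a flag burns CPU.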
As a general, high-level overview: first of all, there should not be a one-to-one correspondence between 'a request' and 'a thread.' Don't take the "flaming arrow" approach, wherein each time a new request comes in you light up another flaming arrow (a new thread...) and shoot it into the air. Instead, place the new request onto a queue, using a mutex to protect it, then signal a condition variable that a fixed, controllable number of worker threads are waiting on. One of the threads will wake up, dequeue the request, execute it, and place the result on a completed-queue for disposal. Usually, one master thread has the chore of accepting new work, placing it on the to-do list, and harvesting the completed work from the completed-queue.
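The arrangement described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a production design: requests are plain ints, "executing" a request just squares it, and the names queue_t, worker and run_pool are invented for the example. The to-do queue is deliberately small so the master blocks when it is full, and the completed-queue is sized to hold every result so the workers never block on it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Bounded FIFO of ints guarded by one mutex and two condition variables. */
typedef struct {
    int *items;
    int cap, head, count, closed;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} queue_t;

static void queue_init(queue_t *q, int cap) {
    q->items = malloc((size_t)cap * sizeof *q->items);
    assert(q->items != NULL);
    q->cap = cap;
    q->head = q->count = q->closed = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

static void queue_push(queue_t *q, int v) {
    pthread_mutex_lock(&q->lock);
    while (q->count == q->cap)                 /* wait for free space */
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % q->cap] = v;
    q->count++;
    pthread_cond_signal(&q->not_empty);        /* wake one waiting thread */
    pthread_mutex_unlock(&q->lock);
}

/* Blocks until an item is available; returns 0 once closed and drained. */
static int queue_pop(queue_t *q, int *out) {
    int ok;
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    ok = q->count > 0;
    if (ok) {
        *out = q->items[q->head];
        q->head = (q->head + 1) % q->cap;
        q->count--;
        pthread_cond_signal(&q->not_full);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Marks the queue as finished and wakes every thread waiting in queue_pop. */
static void queue_close(queue_t *q) {
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

typedef struct { queue_t *todo, *done; } pool_t;

static void *worker(void *arg) {
    pool_t *p = arg;
    int req;
    while (queue_pop(p->todo, &req))
        queue_push(p->done, req * req);        /* stand-in for real work */
    return NULL;
}

/* Master: feeds requests 1..nrequests to nworkers threads, then harvests
   the completed-queue and returns the sum of the results. */
long run_pool(int nworkers, int nrequests) {
    queue_t todo, done;
    pool_t pool = { &todo, &done };
    pthread_t *tids;
    long total = 0;
    int i, r;

    assert(nworkers > 0);
    tids = malloc((size_t)nworkers * sizeof *tids);
    assert(tids != NULL);
    queue_init(&todo, 8);
    queue_init(&done, nrequests > 0 ? nrequests : 1);
    for (i = 0; i < nworkers; i++)
        pthread_create(&tids[i], NULL, worker, &pool);
    for (i = 1; i <= nrequests; i++)
        queue_push(&todo, i);
    queue_close(&todo);                        /* no more work is coming */
    for (i = 0; i < nworkers; i++)
        pthread_join(tids[i], NULL);
    queue_close(&done);
    while (queue_pop(&done, &r))
        total += r;
    free(todo.items); free(done.items); free(tids);
    return total;
}
```

Calling run_pool(4, 100) pushes one hundred requests through four workers. However many requests arrive, no more than four are ever being executed at once; the rest simply wait in the queue. A real server would harvest results while it is still accepting work, and would check the return values of the pthread calls.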
The advantage of this approach is that (what IBM calls) the "multiprogramming level" (MPL), i.e. the number of requests being worked on at the same time, is always controlled and always predictable. This prevents thrashing. The MPL is set according to what the system can reasonably withstand, and if the instantaneous workload exceeds that, some work piles up in the queue.
You can learn a lot about this sort of thing by working in, or watching, a well-run McDonald's restaurant. Pick a shop that's close to a factory, walk in about noon, buy a greasy burger (what t'heck), and watch a well-designed workload management system operating at overload.
Watch how the front-line servers manage the workload, "parking" orders and taking the next one. Watch how the manager responds, and how the back-line employees do their work. The queues may be piling up but the system is not breaking down.
(Unless... the cash-registers break down... McD's employees can't do simple math.)