What is the difference between a mutex and a semaphore?

I answered that with a mutex only one process can be inside a critical section at a time, whereas with a semaphore multiple processes can access a common resource in parallel. I also said that a semaphore is initialized with the number of processes that can access the critical section at the same time.


A mutex is a binary semaphore.


Short Answer:

A binary semaphore is functionally the same as a mutex. However, a semaphore is a more general programming construct than a mutex: it covers the functionality of a mutex and of what is known as a condition variable. Mutexes are most commonly used to provide atomicity, whereas semaphores are more often used to enforce ordering among instructions executing on different cores. That said, a semaphore can also meet atomicity requirements if it is initialized appropriately (with a count of one), thereby emulating a mutex’s behaviour.

Long Answer:

There are many issues unique to multithreaded programming, the most prominent being atomicity and ordering. What’s important to realize is that these two are completely unrelated to each other.
Atomicity: The illusion that a section of code either executes completely or doesn’t execute at all. This illusion can be provided by allowing only one thread to execute that code at a time. Atomicity is a key requirement for generating consistent results with respect to a memory location. It is particularly useful for the parts of your program that modify shared state, say globals.
Ordering: Different threads might be running on different cores. Since there is no such thing as a global clock, it is at times imperative, for correctness, to enforce an order among instructions across cores. For example, it might be a correctness requirement to execute instruction X of thread T1 running on core C1 before instruction Y of thread T2 running on core C2.

A mutex is used to meet the atomicity requirement. However, it does not impose any ordering. In other words, given two threads, a mutex cannot specify which thread will acquire it first and hence execute the critical section before the other. The only assurance is that while one thread is executing the critical section, the other will be kept out of it.
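As a rough illustration of the atomicity guarantee described above, here is a minimal sketch in Python, assuming `threading.Lock` plays the role of the mutex. The names `counter`, `counter_lock`, `add_many` and `run` are made up for this example. The lock keeps each increment from interleaving with another thread's, but nothing in the code decides which thread gets the lock first.

```python
import threading

counter = 0                      # shared state modified by several threads
counter_lock = threading.Lock()  # the mutex guarding `counter`


def add_many(times):
    """Increment the shared counter `times` times, one locked step at a time."""
    global counter
    for _ in range(times):
        # Only one thread can hold the lock, so this read-modify-write
        # is never interleaved with another thread's update.
        with counter_lock:
            counter += 1


def run(num_threads=4, times=10_000):
    """Reset the counter, run the workers to completion, return the total."""
    global counter
    counter = 0
    workers = [threading.Thread(target=add_many, args=(times,))
               for _ in range(num_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter


total = run()
print(total)  # 40000
```

The final total is the same on every run, yet the order in which the threads entered the critical section is left entirely to the scheduler.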

On the other hand, a semaphore can be used to impose ordering constraints on your execution. In the example above, you can block thread T2 just before it executes instruction Y until T1 is done executing instruction X. This is done by making T2 wait on a semaphore (initialized to zero) that T1 signals after X. It is the same idea as waiting on a condition variable, and it is precisely the abstraction that the semaphore provides through its wait() and signal() operations.
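The X-before-Y example could be sketched roughly as follows, assuming Python's `threading.Semaphore`, where `acquire()` corresponds to wait() and `release()` to signal(). The names `run_ordered`, `x_done` and `events` are hypothetical and only serve the illustration.

```python
import threading


def run_ordered():
    """Run T1 and T2 and return the order in which X and Y executed."""
    events = []                       # record of what ran, in order
    x_done = threading.Semaphore(0)   # no permits yet: wait() will block

    def t1():
        events.append("X")            # instruction X
        x_done.release()              # signal(): X has finished

    def t2():
        x_done.acquire()              # wait(): block until T1 signals
        events.append("Y")            # instruction Y

    # T2 is started first on purpose; it still cannot run Y before X.
    thread2 = threading.Thread(target=t2)
    thread1 = threading.Thread(target=t1)
    thread2.start()
    thread1.start()
    thread1.join()
    thread2.join()
    return events


print(run_ordered())  # ['X', 'Y']
```

Because the semaphore starts at zero, T2 cannot get past its wait until T1 has signalled, no matter which thread the scheduler happens to run first.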

Thus, a mutex can only be used to maintain atomicity whereas a semaphore can be used for both ordering and atomicity.
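To illustrate the atomicity half of that claim, here is a small sketch, again assuming Python's `threading.Semaphore`, in which a semaphore initialized to one stands in for a mutex. The names `count_with_semaphore`, `gate` and `state` are invented for this example.

```python
import threading


def count_with_semaphore(num_threads=4, times=10_000):
    """Guard a shared counter with a semaphore that holds a single permit."""
    state = {"counter": 0}
    gate = threading.Semaphore(1)   # one permit: at most one thread inside

    def worker():
        for _ in range(times):
            gate.acquire()          # wait(): take the only permit
            try:
                state["counter"] += 1
            finally:
                gate.release()      # signal(): hand the permit back

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]


print(count_with_semaphore())  # 40000
```

With an initial count of one the semaphore admits a single thread at a time, which is the mutex behaviour. With an initial count of zero the same primitive gives the ordering behaviour described earlier.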