author | Manfred Spraul <manfred@colorfullife.com> | 2014-12-12 19:58:11 -0500 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2014-12-13 15:42:52 -0500 |
commit | 2e094abfd1f29a08a60523b42d4508281b8dee0e (patch) | |
tree | 60c10635e14ebc3065b1a40e62517244d929409b /ipc/sem.c | |
parent | a060bfe032bcb8522b470f8a7a16e225a9fe5dd6 (diff) |
ipc/sem.c: change memory barrier in sem_lock() to smp_rmb()
When I fixed bugs in the sem_lock() logic, I was more conservative than
necessary. Therefore it is safe to replace the smp_mb() with smp_rmb().
And: With smp_rmb(), semop() syscalls are up to 10% faster.
The race we must protect against is:
sem->lock is free
sma->complex_count = 0
sma->sem_perm.lock held by thread B
thread A:
A: spin_lock(&sem->lock)
B: sma->complex_count++; (now 1)
B: spin_unlock(&sma->sem_perm.lock);
A: spin_is_locked(&sma->sem_perm.lock);
A: XXXXX memory barrier
A: if (sma->complex_count == 0)
Thread A must read the increased complex_count value, i.e. the read must
not be reordered with the read of sem_perm.lock done by spin_is_locked().
Since it's about ordering of reads, smp_rmb() is sufficient.
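The ordering argument above can be sketched in userspace C11 as a rough analogue. This is not the kernel code: `struct sem_array_model`, `b_mark_complex_and_unlock()`, `a_may_use_fast_path()` and `replay_race()` are hypothetical names invented for illustration, the spinlock is modelled as a plain atomic flag, and `atomic_thread_fence(memory_order_acquire)` stands in for `smp_rmb()` (it is somewhat stronger, since it also orders later stores). The replay runs the interleaving from the commit message on a single thread, so it only shows the intended outcome and does not demonstrate the reordering itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct sem_array. */
struct sem_array_model {
	atomic_bool global_locked;	/* models sma->sem_perm.lock */
	atomic_int complex_count;	/* models sma->complex_count */
};

/* Thread B's side: bump complex_count, then drop the global lock. */
static void b_mark_complex_and_unlock(struct sem_array_model *sma)
{
	atomic_fetch_add_explicit(&sma->complex_count, 1, memory_order_relaxed);
	/* Release store: the increment becomes visible before the unlock. */
	atomic_store_explicit(&sma->global_locked, false, memory_order_release);
}

/* Thread A's side, assumed to run with the per-semaphore lock held. */
static bool a_may_use_fast_path(struct sem_array_model *sma)
{
	/* Models spin_is_locked(): a plain read with no ordering of its own. */
	if (atomic_load_explicit(&sma->global_locked, memory_order_relaxed))
		return false;	/* global lock busy: take the slow path */

	/* Assumed analogue of smp_rmb(): the read above stays before the read below. */
	atomic_thread_fence(memory_order_acquire);

	return atomic_load_explicit(&sma->complex_count, memory_order_relaxed) == 0;
}

/* Replays the interleaving from the commit message, sequentially. */
static bool replay_race(void)
{
	struct sem_array_model sma;

	atomic_init(&sma.global_locked, true);	/* held by thread B */
	atomic_init(&sma.complex_count, 0);
	assert(atomic_load(&sma.global_locked));

	b_mark_complex_and_unlock(&sma);	/* B: complex_count++, unlock */
	return a_may_use_fast_path(&sma);	/* A: lock check, barrier, recheck */
}
```

In this model `replay_race()` returns `false`: once A sees the global lock free, it also sees the incremented count and does not take the fast path. Only reads on A's side need ordering, which is why a read barrier is enough there.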
[akpm@linux-foundation.org: update sem_lock() comment, from Davidlohr]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'ipc/sem.c')
-rw-r--r-- | ipc/sem.c | 13 |
1 files changed, 10 insertions, 3 deletions
@@ -326,10 +326,17 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
 
 		/* Then check that the global lock is free */
 		if (!spin_is_locked(&sma->sem_perm.lock)) {
-			/* spin_is_locked() is not a memory barrier */
-			smp_mb();
+			/*
+			 * The ipc object lock check must be visible on all
+			 * cores before rechecking the complex count. Otherwise
+			 * we can race with another thread that does:
+			 *	complex_count++;
+			 *	spin_unlock(sem_perm.lock);
+			 */
+			smp_rmb();
 
-			/* Now repeat the test of complex_count:
+			/*
+			 * Now repeat the test of complex_count:
 			 * It can't change anymore until we drop sem->lock.
 			 * Thus: if is now 0, then it will stay 0.
 			 */