Diffstat (limited to 'Documentation/memory-barriers.txt')
-rw-r--r-- | Documentation/memory-barriers.txt | 68 |
1 files changed, 48 insertions, 20 deletions
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index f8550310a6d5..92f0056d928c 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt | |||
@@ -610,6 +610,7 @@ loads. Consider the following sequence of events: | |||
610 | 610 | ||
611 | CPU 1 CPU 2 | 611 | CPU 1 CPU 2 |
612 | ======================= ======================= | 612 | ======================= ======================= |
613 | { B = 7; X = 9; Y = 8; C = &Y } | ||
613 | STORE A = 1 | 614 | STORE A = 1 |
614 | STORE B = 2 | 615 | STORE B = 2 |
615 | <write barrier> | 616 | <write barrier> |
@@ -651,7 +652,20 @@ In the above example, CPU 2 perceives that B is 7, despite the load of *C | |||
651 | (which would be B) coming after the LOAD of C. | 652 | (which would be B) coming after the LOAD of C. |
652 | 653 | ||
653 | If, however, a data dependency barrier were to be placed between the load of C | 654 | If, however, a data dependency barrier were to be placed between the load of C |
654 | and the load of *C (ie: B) on CPU 2, then the following will occur: | 655 | and the load of *C (ie: B) on CPU 2: |
656 | |||
657 | CPU 1 CPU 2 | ||
658 | ======================= ======================= | ||
659 | { B = 7; X = 9; Y = 8; C = &Y } | ||
660 | STORE A = 1 | ||
661 | STORE B = 2 | ||
662 | <write barrier> | ||
663 | STORE C = &B LOAD X | ||
664 | STORE D = 4 LOAD C (gets &B) | ||
665 | <data dependency barrier> | ||
666 | LOAD *C (reads B) | ||
667 | |||
668 | then the following will occur: | ||
655 | 669 | ||
656 | +-------+ : : : : | 670 | +-------+ : : : : |
657 | | | +------+ +-------+ | 671 | | | +------+ +-------+ |
@@ -829,8 +843,8 @@ There are some more advanced barrier functions: | |||
829 | (*) smp_mb__after_atomic_inc(); | 843 | (*) smp_mb__after_atomic_inc(); |
830 | 844 | ||
831 | These are for use with atomic add, subtract, increment and decrement | 845 | These are for use with atomic add, subtract, increment and decrement |
832 | functions, especially when used for reference counting. These functions | 846 | functions that don't return a value, especially when used for reference |
833 | do not imply memory barriers. | 847 | counting. These functions do not imply memory barriers. |
834 | 848 | ||
835 | As an example, consider a piece of code that marks an object as being dead | 849 | As an example, consider a piece of code that marks an object as being dead |
836 | and then decrements the object's reference count: | 850 | and then decrements the object's reference count: |
@@ -1263,15 +1277,17 @@ else. | |||
1263 | ATOMIC OPERATIONS | 1277 | ATOMIC OPERATIONS |
1264 | ----------------- | 1278 | ----------------- |
1265 | 1279 | ||
1266 | Though they are technically interprocessor interaction considerations, atomic | 1280 | Whilst they are technically interprocessor interaction considerations, atomic |
1267 | operations are noted specially as they do _not_ generally imply memory | 1281 | operations are noted specially as some of them imply full memory barriers and |
1268 | barriers. The possible offenders include: | 1282 | some don't, but they're very heavily relied on as a group throughout the |
1283 | kernel. | ||
1284 | |||
1285 | Any atomic operation that modifies some state in memory and returns information | ||
1286 | about the state (old or new) implies an SMP-conditional general memory barrier | ||
1287 | (smp_mb()) on each side of the actual operation. These include: | ||
1269 | 1288 | ||
1270 | xchg(); | 1289 | xchg(); |
1271 | cmpxchg(); | 1290 | cmpxchg(); |
1272 | test_and_set_bit(); | ||
1273 | test_and_clear_bit(); | ||
1274 | test_and_change_bit(); | ||
1275 | atomic_cmpxchg(); | 1291 | atomic_cmpxchg(); |
1276 | atomic_inc_return(); | 1292 | atomic_inc_return(); |
1277 | atomic_dec_return(); | 1293 | atomic_dec_return(); |
@@ -1282,21 +1298,31 @@ barriers. The possible offenders include: | |||
1282 | atomic_sub_and_test(); | 1298 | atomic_sub_and_test(); |
1283 | atomic_add_negative(); | 1299 | atomic_add_negative(); |
1284 | atomic_add_unless(); | 1300 | atomic_add_unless(); |
1301 | test_and_set_bit(); | ||
1302 | test_and_clear_bit(); | ||
1303 | test_and_change_bit(); | ||
1285 | 1304 | ||
1286 | These may be used for such things as implementing LOCK operations or controlling | 1305 | These are used for such things as implementing LOCK-class and UNLOCK-class |
1287 | the lifetime of objects by decreasing their reference counts. In such cases | 1306 | operations and adjusting reference counters towards object destruction, and as |
1288 | they need preceding memory barriers. | 1307 | such the implicit memory barrier effects are necessary. |
1289 | 1308 | ||
1290 | The following may also be possible offenders as they may be used as UNLOCK | ||
1291 | operations. | ||
1292 | 1309 | ||
1310 | The following operations are potential problems as they do _not_ imply memory | ||
1311 | barriers, but might be used for implementing such things as UNLOCK-class | ||
1312 | operations: | ||
1313 | |||
1314 | atomic_set(); | ||
1293 | set_bit(); | 1315 | set_bit(); |
1294 | clear_bit(); | 1316 | clear_bit(); |
1295 | change_bit(); | 1317 | change_bit(); |
1296 | atomic_set(); | ||
1297 | 1318 | ||
1319 | With these the appropriate explicit memory barrier should be used if necessary | ||
1320 | (smp_mb__before_clear_bit() for instance). | ||
1298 | 1321 | ||
1299 | The following are a little tricky: | 1322 | |
1323 | The following also do _not_ imply memory barriers, and so may require explicit | ||
1324 | memory barriers under some circumstances (smp_mb__before_atomic_dec() for | ||
1325 | instance): | ||
1300 | 1326 | ||
1301 | atomic_add(); | 1327 | atomic_add(); |
1302 | atomic_sub(); | 1328 | atomic_sub(); |
@@ -1317,10 +1343,12 @@ specific order. | |||
1317 | 1343 | ||
1318 | 1344 | ||
1319 | Basically, each usage case has to be carefully considered as to whether memory | 1345 | Basically, each usage case has to be carefully considered as to whether memory |
1320 | barriers are needed or not. The simplest rule is probably: if the atomic | 1346 | barriers are needed or not. |
1321 | operation is protected by a lock, then it does not require a barrier unless | 1347 | |
1322 | there's another operation within the critical section with respect to which an | 1348 | [!] Note that special memory barrier primitives are available for these |
1323 | ordering must be maintained. | 1349 | situations because on some CPUs the atomic instructions used imply full memory |
1350 | barriers, and so barrier instructions are superfluous in conjunction with them, | ||
1351 | and in such cases the special barrier primitives will be no-ops. | ||
1324 | 1352 | ||
1325 | See Documentation/atomic_ops.txt for more information. | 1353 | See Documentation/atomic_ops.txt for more information. |
1326 | 1354 | ||
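
The distinction the revised ATOMIC OPERATIONS text draws, between atomics that return a value (and so imply barriers) and those that do not (and so may need an explicit one), can be sketched in userspace C11. This is only an analogy under stated assumptions: it uses <stdatomic.h> rather than the kernel's atomic_t API, the sequentially consistent read-modify-write only roughly stands in for the smp_mb() on each side that the text describes, and struct obj, obj_put_returning() and obj_put_plain() are hypothetical names that appear nowhere in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object, for illustration only. */
struct obj {
    int dead;             /* plain field, written before the reference is dropped */
    atomic_int refcount;
};

/*
 * Analogue of a value-returning atomic such as atomic_dec_and_test():
 * the seq_cst read-modify-write keeps the store to ->dead ordered
 * before the decrement, so no separate barrier is written out.
 * Returns true when the last reference was dropped.
 */
bool obj_put_returning(struct obj *o)
{
    assert(o != NULL);
    o->dead = 1;
    return atomic_fetch_sub_explicit(&o->refcount, 1,
                                     memory_order_seq_cst) == 1;
}

/*
 * Analogue of a non-value-returning atomic such as atomic_dec():
 * the relaxed decrement provides no ordering by itself, so an explicit
 * fence is placed before it, in the role that
 * smp_mb__before_atomic_dec() plays in the text.
 */
void obj_put_plain(struct obj *o)
{
    assert(o != NULL);
    o->dead = 1;
    atomic_thread_fence(memory_order_seq_cst);
    atomic_fetch_sub_explicit(&o->refcount, 1, memory_order_relaxed);
}
```

A caller would initialise refcount with atomic_init() and drop references through one of the two helpers. Which helper fits depends on whether the caller needs to learn that the count reached zero, which mirrors the split between the two lists of operations in the patched text.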