aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorStefan Richter <stefanr@s5r6.in-berlin.de>2008-07-12 08:49:19 -0400
committerStefan Richter <stefanr@s5r6.in-berlin.de>2008-07-14 07:06:04 -0400
commit792a61021c6043f6c2b24b1cdd42be5753b3e54c (patch)
tree5d83079ac9d2fe89735feee5d14c9d8e9532f1cd
parenta7ea67823af4a7e442e92064b0fab46603a588f6 (diff)
firewire: fix race of bus reset with request transmission
Reported by Jay Fenlason: A bus reset tasklet may call fw_flush_transactions and touch transactions (call their callback which will free them) while the context which submitted the transaction is still inserting it into the transmission queue. A simple solution to this problem is to _not_ "flush" the transactions because of a bus reset (complete the transcations as 'cancelled'). They will now simply time out (completed as 'cancelled' by the split-timeout timer). Jay Fenlason thought of this fix too but I was quicker to type it out. :-) Background: Contexts which access an instance of struct fw_transaction are: 1. the submitter, until it inserted the packet which is embedded in the transaction into the AT req DMA, 2. the AsReqTrContext tasklet when the request packet was acked by the responder node or transmission to the responder failed, 3. the AsRspRcvContext tasklet when it found a request which matched an incoming response, 4. the card->flush_timer when it picks up timed-out transactions to cancel them, 5. the bus reset tasklet when it cancels transactions (this access is eliminated by this patch), 6. a process which shuts down an fw_card (unregisters it from fw-core when the controller is unbound from fw-ohci) --- although in this case there shouldn't really be any transactions anymore because we wait until all card users finished their business with the card. All of these contexts run concurrently (except for the 6th, presumably). The 1st is safe against the 2nd and 3rd because of the way how a request packet is carefully submitted to the hardware. A race between 2nd and 3rd has been fixed a while ago (bug 9617). The 4th is almost safe against 1st, 2nd, 3rd; there are issues with it if huge scheduling latencies occur, to be fixed separately. The 5th looks safe against 2nd, 3rd, and 4th but is unsafe against 1st. Maybe this could be fixed with an explicit state variable in struct fw_transaction. But this would require fw_transaction to be rewritten as only dynamically allocatable object with reference counting --- not a good solution if we also can simply kill this 5th accessing context (replace it by the 4th). Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
-rw-r--r--drivers/firewire/fw-topology.c2
1 files changed, 0 insertions, 2 deletions
diff --git a/drivers/firewire/fw-topology.c b/drivers/firewire/fw-topology.c
index 213b0ff8f3d6..c1b81077c4a8 100644
--- a/drivers/firewire/fw-topology.c
+++ b/drivers/firewire/fw-topology.c
@@ -510,8 +510,6 @@ fw_core_handle_bus_reset(struct fw_card *card,
510 struct fw_node *local_node; 510 struct fw_node *local_node;
511 unsigned long flags; 511 unsigned long flags;
512 512
513 fw_flush_transactions(card);
514
515 spin_lock_irqsave(&card->lock, flags); 513 spin_lock_irqsave(&card->lock, flags);
516 514
517 /* 515 /*