diff options
Diffstat (limited to 'Documentation/scsi/libsas.txt')
-rw-r--r-- | Documentation/scsi/libsas.txt | 484 |
1 files changed, 484 insertions, 0 deletions
diff --git a/Documentation/scsi/libsas.txt b/Documentation/scsi/libsas.txt new file mode 100644 index 000000000000..9e2078b2a615 --- /dev/null +++ b/Documentation/scsi/libsas.txt | |||
@@ -0,0 +1,484 @@ | |||
1 | SAS Layer | ||
2 | --------- | ||
3 | |||
4 | The SAS Layer is a management infrastructure which manages | ||
5 | SAS LLDDs. It sits between SCSI Core and SAS LLDDs. The | ||
6 | layout is as follows: while SCSI Core is concerned with | ||
7 | SAM/SPC issues, and a SAS LLDD+sequencer is concerned with | ||
8 | phy/OOB/link management, the SAS layer is concerned with: | ||
9 | |||
10 | * SAS Phy/Port/HA event management (LLDD generates, | ||
11 | SAS Layer processes), | ||
12 | * SAS Port management (creation/destruction), | ||
13 | * SAS Domain discovery and revalidation, | ||
14 | * SAS Domain device management, | ||
15 | * SCSI Host registration/unregistration, | ||
16 | * Device registration with SCSI Core (SAS) or libata | ||
17 | (SATA), and | ||
18 | * Expander management and exporting expander control | ||
19 | to user space. | ||
20 | |||
21 | A SAS LLDD is a PCI device driver. It is concerned with | ||
22 | phy/OOB management, and vendor specific tasks and generates | ||
23 | events to the SAS layer. | ||
24 | |||
25 | The SAS Layer does most SAS tasks as outlined in the SAS 1.1 | ||
26 | spec. | ||
27 | |||
28 | The sas_ha_struct describes the SAS LLDD to the SAS layer. | ||
29 | Most of it is used by the SAS Layer but a few fields need to | ||
30 | be initialized by the LLDDs. | ||
31 | |||
32 | After initializing your hardware, from the probe() function | ||
33 | you call sas_register_ha(). It will register your LLDD with | ||
34 | the SCSI subsystem, creating a SCSI host and it will | ||
35 | register your SAS driver with the sysfs SAS tree it creates. | ||
36 | It will then return. Then you enable your phys to actually | ||
37 | start OOB (at which point your driver will start calling the | ||
38 | notify_* event callbacks). | ||
39 | |||
40 | Structure descriptions: | ||
41 | |||
42 | struct sas_phy -------------------- | ||
43 | Normally this is statically embedded to your driver's | ||
44 | phy structure: | ||
45 | struct my_phy { | ||
46 | blah; | ||
47 | struct sas_phy sas_phy; | ||
48 | bleh; | ||
49 | }; | ||
50 | And then all the phys are an array of my_phy in your HA | ||
51 | struct (shown below). | ||
52 | |||
53 | Then as you go along and initialize your phys you also | ||
54 | initialize the sas_phy struct, along with your own | ||
55 | phy structure. | ||
56 | |||
57 | In general, the phys are managed by the LLDD and the ports | ||
58 | are managed by the SAS layer. So the phys are initialized | ||
59 | and updated by the LLDD and the ports are initialized and | ||
60 | updated by the SAS layer. | ||
61 | |||
62 | There is a scheme where the LLDD can RW certain fields, | ||
63 | and the SAS layer can only read such ones, and vice versa. | ||
64 | The idea is to avoid unnecessary locking. | ||
65 | |||
66 | enabled -- must be set (0/1) | ||
67 | id -- must be set [0,MAX_PHYS) | ||
68 | class, proto, type, role, oob_mode, linkrate -- must be set | ||
69 | oob_mode -- you set this when OOB has finished and then notify | ||
70 | the SAS Layer. | ||
71 | |||
72 | sas_addr -- this normally points to an array holding the sas | ||
73 | address of the phy, possibly somewhere in your my_phy | ||
74 | struct. | ||
75 | |||
76 | attached_sas_addr -- set this when you (LLDD) receive an | ||
77 | IDENTIFY frame or a FIS frame, _before_ notifying the SAS | ||
78 | layer. The idea is that sometimes the LLDD may want to fake | ||
79 | or provide a different SAS address on that phy/port and this | ||
80 | allows it to do this. At best you should copy the sas | ||
81 | address from the IDENTIFY frame or maybe generate a SAS | ||
82 | address for SATA directly attached devices. The Discover | ||
83 | process may later change this. | ||
84 | |||
85 | frame_rcvd -- this is where you copy the IDENTIFY/FIS frame | ||
86 | when you get it; you lock, copy, set frame_rcvd_size and | ||
87 | unlock the lock, and then call the event. It is a pointer | ||
88 | since there's no way to know your hw frame size _exactly_, | ||
89 | so you define the actual array in your phy struct and let | ||
90 | this pointer point to it. You copy the frame from your | ||
91 | DMAable memory to that area holding the lock. | ||
92 | |||
93 | sas_prim -- this is where primitives go when they're | ||
94 | received. See sas.h. Grab the lock, set the primitive, | ||
95 | release the lock, notify. | ||
96 | |||
97 | port -- this points to the sas_port if the phy belongs | ||
98 | to a port -- the LLDD only reads this. It points to the | ||
99 | sas_port this phy is part of. Set by the SAS Layer. | ||
100 | |||
101 | ha -- may be set; the SAS layer sets it anyway. | ||
102 | |||
103 | lldd_phy -- you should set this to point to your phy so you | ||
104 | can find your way around faster when the SAS layer calls one | ||
105 | of your callbacks and passes you a phy. If the sas_phy is | ||
106 | embedded you can also use container_of -- whatever you | ||
107 | prefer. | ||
108 | |||
109 | |||
110 | struct sas_port -------------------- | ||
111 | The LLDD doesn't set any fields of this struct -- it only | ||
112 | reads them. They should be self explanatory. | ||
113 | |||
114 | phy_mask is 32 bit, this should be enough for now, as I | ||
115 | haven't heard of a HA having more than 8 phys. | ||
116 | |||
117 | lldd_port -- I haven't found use for that -- maybe other | ||
118 | LLDD who wish to have internal port representation can make | ||
119 | use of this. | ||
120 | |||
121 | |||
122 | struct sas_ha_struct -------------------- | ||
123 | It normally is statically declared in your own LLDD | ||
124 | structure describing your adapter: | ||
125 | struct my_sas_ha { | ||
126 | blah; | ||
127 | struct sas_ha_struct sas_ha; | ||
128 | struct my_phy phys[MAX_PHYS]; | ||
129 | struct sas_port sas_ports[MAX_PHYS]; /* (1) */ | ||
130 | bleh; | ||
131 | }; | ||
132 | |||
133 | (1) If your LLDD doesn't have its own port representation. | ||
134 | |||
135 | What needs to be initialized (sample function given below). | ||
136 | |||
137 | pcidev | ||
138 | sas_addr -- since the SAS layer doesn't want to mess with | ||
139 | memory allocation, etc, this points to statically | ||
140 | allocated array somewhere (say in your host adapter | ||
141 | structure) and holds the SAS address of the host | ||
142 | adapter as given by you or the manufacturer, etc. | ||
143 | sas_port | ||
144 | sas_phy -- an array of pointers to structures. (see | ||
145 | note above on sas_addr). | ||
146 | These must be set. See more notes below. | ||
147 | num_phys -- the number of phys present in the sas_phy array, | ||
148 | and the number of ports present in the sas_port | ||
149 | array. There can be a maximum num_phys ports (one per | ||
150 | port) so we drop the num_ports, and only use | ||
151 | num_phys. | ||
152 | |||
153 | The event interface: | ||
154 | |||
155 | /* LLDD calls these to notify the class of an event. */ | ||
156 | void (*notify_ha_event)(struct sas_ha_struct *, enum ha_event); | ||
157 | void (*notify_port_event)(struct sas_phy *, enum port_event); | ||
158 | void (*notify_phy_event)(struct sas_phy *, enum phy_event); | ||
159 | |||
160 | When sas_register_ha() returns, those are set and can be | ||
161 | called by the LLDD to notify the SAS layer of such events | ||
162 | the SAS layer. | ||
163 | |||
164 | The port notification: | ||
165 | |||
166 | /* The class calls these to notify the LLDD of an event. */ | ||
167 | void (*lldd_port_formed)(struct sas_phy *); | ||
168 | void (*lldd_port_deformed)(struct sas_phy *); | ||
169 | |||
170 | If the LLDD wants notification when a port has been formed | ||
171 | or deformed it sets those to a function satisfying the type. | ||
172 | |||
173 | A SAS LLDD should also implement at least one of the Task | ||
174 | Management Functions (TMFs) described in SAM: | ||
175 | |||
176 | /* Task Management Functions. Must be called from process context. */ | ||
177 | int (*lldd_abort_task)(struct sas_task *); | ||
178 | int (*lldd_abort_task_set)(struct domain_device *, u8 *lun); | ||
179 | int (*lldd_clear_aca)(struct domain_device *, u8 *lun); | ||
180 | int (*lldd_clear_task_set)(struct domain_device *, u8 *lun); | ||
181 | int (*lldd_I_T_nexus_reset)(struct domain_device *); | ||
182 | int (*lldd_lu_reset)(struct domain_device *, u8 *lun); | ||
183 | int (*lldd_query_task)(struct sas_task *); | ||
184 | |||
185 | For more information please read SAM from T10.org. | ||
186 | |||
187 | Port and Adapter management: | ||
188 | |||
189 | /* Port and Adapter management */ | ||
190 | int (*lldd_clear_nexus_port)(struct sas_port *); | ||
191 | int (*lldd_clear_nexus_ha)(struct sas_ha_struct *); | ||
192 | |||
193 | A SAS LLDD should implement at least one of those. | ||
194 | |||
195 | Phy management: | ||
196 | |||
197 | /* Phy management */ | ||
198 | int (*lldd_control_phy)(struct sas_phy *, enum phy_func); | ||
199 | |||
200 | lldd_ha -- set this to point to your HA struct. You can also | ||
201 | use container_of if you embedded it as shown above. | ||
202 | |||
203 | A sample initialization and registration function | ||
204 | can look like this (called last thing from probe()) | ||
205 | *but* before you enable the phys to do OOB: | ||
206 | |||
207 | static int register_sas_ha(struct my_sas_ha *my_ha) | ||
208 | { | ||
209 | int i; | ||
210 | static struct sas_phy *sas_phys[MAX_PHYS]; | ||
211 | static struct sas_port *sas_ports[MAX_PHYS]; | ||
212 | |||
213 | my_ha->sas_ha.sas_addr = &my_ha->sas_addr[0]; | ||
214 | |||
215 | for (i = 0; i < MAX_PHYS; i++) { | ||
216 | sas_phys[i] = &my_ha->phys[i].sas_phy; | ||
217 | sas_ports[i] = &my_ha->sas_ports[i]; | ||
218 | } | ||
219 | |||
220 | my_ha->sas_ha.sas_phy = sas_phys; | ||
221 | my_ha->sas_ha.sas_port = sas_ports; | ||
222 | my_ha->sas_ha.num_phys = MAX_PHYS; | ||
223 | |||
224 | my_ha->sas_ha.lldd_port_formed = my_port_formed; | ||
225 | |||
226 | my_ha->sas_ha.lldd_dev_found = my_dev_found; | ||
227 | my_ha->sas_ha.lldd_dev_gone = my_dev_gone; | ||
228 | |||
229 | my_ha->sas_ha.lldd_max_execute_num = lldd_max_execute_num; (1) | ||
230 | |||
231 | my_ha->sas_ha.lldd_queue_size = ha_can_queue; | ||
232 | my_ha->sas_ha.lldd_execute_task = my_execute_task; | ||
233 | |||
234 | my_ha->sas_ha.lldd_abort_task = my_abort_task; | ||
235 | my_ha->sas_ha.lldd_abort_task_set = my_abort_task_set; | ||
236 | my_ha->sas_ha.lldd_clear_aca = my_clear_aca; | ||
237 | my_ha->sas_ha.lldd_clear_task_set = my_clear_task_set; | ||
238 | my_ha->sas_ha.lldd_I_T_nexus_reset= NULL; (2) | ||
239 | my_ha->sas_ha.lldd_lu_reset = my_lu_reset; | ||
240 | my_ha->sas_ha.lldd_query_task = my_query_task; | ||
241 | |||
242 | my_ha->sas_ha.lldd_clear_nexus_port = my_clear_nexus_port; | ||
243 | my_ha->sas_ha.lldd_clear_nexus_ha = my_clear_nexus_ha; | ||
244 | |||
245 | my_ha->sas_ha.lldd_control_phy = my_control_phy; | ||
246 | |||
247 | return sas_register_ha(&my_ha->sas_ha); | ||
248 | } | ||
249 | |||
250 | (1) This is normally a LLDD parameter, something of the | ||
251 | lines of a task collector. What it tells the SAS Layer is | ||
252 | whether the SAS layer should run in Direct Mode (default: | ||
253 | value 0 or 1) or Task Collector Mode (value greater than 1). | ||
254 | |||
255 | In Direct Mode, the SAS Layer calls Execute Task as soon as | ||
256 | it has a command to send to the SDS, _and_ this is a single | ||
257 | command, i.e. not linked. | ||
258 | |||
259 | Some hardware (e.g. aic94xx) has the capability to DMA more | ||
260 | than one task at a time (interrupt) from host memory. Task | ||
261 | Collector Mode is an optional feature for HAs which support | ||
262 | this in their hardware. (Again, it is completely optional | ||
263 | even if your hardware supports it.) | ||
264 | |||
265 | In Task Collector Mode, the SAS Layer would do _natural_ | ||
266 | coalescing of tasks and at the appropriate moment it would | ||
267 | call your driver to DMA more than one task in a single HA | ||
268 | interrupt. DMBS may want to use this by insmod/modprobe | ||
269 | setting the lldd_max_execute_num to something greater than | ||
270 | 1. | ||
271 | |||
272 | (2) SAS 1.1 does not define I_T Nexus Reset TMF. | ||
273 | |||
274 | Events | ||
275 | ------ | ||
276 | |||
277 | Events are _the only way_ a SAS LLDD notifies the SAS layer | ||
278 | of anything. There is no other method or way a LLDD to tell | ||
279 | the SAS layer of anything happening internally or in the SAS | ||
280 | domain. | ||
281 | |||
282 | Phy events: | ||
283 | PHYE_LOSS_OF_SIGNAL, (C) | ||
284 | PHYE_OOB_DONE, | ||
285 | PHYE_OOB_ERROR, (C) | ||
286 | PHYE_SPINUP_HOLD. | ||
287 | |||
288 | Port events, passed on a _phy_: | ||
289 | PORTE_BYTES_DMAED, (M) | ||
290 | PORTE_BROADCAST_RCVD, (E) | ||
291 | PORTE_LINK_RESET_ERR, (C) | ||
292 | PORTE_TIMER_EVENT, (C) | ||
293 | PORTE_HARD_RESET. | ||
294 | |||
295 | Host Adapter event: | ||
296 | HAE_RESET | ||
297 | |||
298 | A SAS LLDD should be able to generate | ||
299 | - at least one event from group C (choice), | ||
300 | - events marked M (mandatory) are mandatory (only one), | ||
301 | - events marked E (expander) if it wants the SAS layer | ||
302 | to handle domain revalidation (only one such). | ||
303 | - Unmarked events are optional. | ||
304 | |||
305 | Meaning: | ||
306 | |||
307 | HAE_RESET -- when your HA got internal error and was reset. | ||
308 | |||
309 | PORTE_BYTES_DMAED -- on receiving an IDENTIFY/FIS frame | ||
310 | PORTE_BROADCAST_RCVD -- on receiving a primitive | ||
311 | PORTE_LINK_RESET_ERR -- timer expired, loss of signal, loss | ||
312 | of DWS, etc. (*) | ||
313 | PORTE_TIMER_EVENT -- DWS reset timeout timer expired (*) | ||
314 | PORTE_HARD_RESET -- Hard Reset primitive received. | ||
315 | |||
316 | PHYE_LOSS_OF_SIGNAL -- the device is gone (*) | ||
317 | PHYE_OOB_DONE -- OOB went fine and oob_mode is valid | ||
318 | PHYE_OOB_ERROR -- Error while doing OOB, the device probably | ||
319 | got disconnected. (*) | ||
320 | PHYE_SPINUP_HOLD -- SATA is present, COMWAKE not sent. | ||
321 | |||
322 | (*) should set/clear the appropriate fields in the phy, | ||
323 | or alternatively call the inlined sas_phy_disconnected() | ||
324 | which is just a helper, from their tasklet. | ||
325 | |||
326 | The Execute Command SCSI RPC: | ||
327 | |||
328 | int (*lldd_execute_task)(struct sas_task *, int num, | ||
329 | unsigned long gfp_flags); | ||
330 | |||
331 | Used to queue a task to the SAS LLDD. @task is the tasks to | ||
332 | be executed. @num should be the number of tasks being | ||
333 | queued at this function call (they are linked listed via | ||
334 | task::list), @gfp_mask should be the gfp_mask defining the | ||
335 | context of the caller. | ||
336 | |||
337 | This function should implement the Execute Command SCSI RPC, | ||
338 | or if you're sending a SCSI Task as linked commands, you | ||
339 | should also use this function. | ||
340 | |||
341 | That is, when lldd_execute_task() is called, the command(s) | ||
342 | go out on the transport *immediately*. There is *no* | ||
343 | queuing of any sort and at any level in a SAS LLDD. | ||
344 | |||
345 | The use of task::list is two-fold, one for linked commands, | ||
346 | the other discussed below. | ||
347 | |||
348 | It is possible to queue up more than one task at a time, by | ||
349 | initializing the list element of struct sas_task, and | ||
350 | passing the number of tasks enlisted in this manner in num. | ||
351 | |||
352 | Returns: -SAS_QUEUE_FULL, -ENOMEM, nothing was queued; | ||
353 | 0, the task(s) were queued. | ||
354 | |||
355 | If you want to pass num > 1, then either | ||
356 | A) you're the only caller of this function and keep track | ||
357 | of what you've queued to the LLDD, or | ||
358 | B) you know what you're doing and have a strategy of | ||
359 | retrying. | ||
360 | |||
361 | As opposed to queuing one task at a time (function call), | ||
362 | batch queuing of tasks, by having num > 1, greatly | ||
363 | simplifies LLDD code, sequencer code, and _hardware design_, | ||
364 | and has some performance advantages in certain situations | ||
365 | (DBMS). | ||
366 | |||
367 | The LLDD advertises if it can take more than one command at | ||
368 | a time at lldd_execute_task(), by setting the | ||
369 | lldd_max_execute_num parameter (controlled by "collector" | ||
370 | module parameter in aic94xx SAS LLDD). | ||
371 | |||
372 | You should leave this to the default 1, unless you know what | ||
373 | you're doing. | ||
374 | |||
375 | This is a function of the LLDD, to which the SAS layer can | ||
376 | cater to. | ||
377 | |||
378 | int lldd_queue_size | ||
379 | The host adapter's queue size. This is the maximum | ||
380 | number of commands the lldd can have pending to domain | ||
381 | devices on behalf of all upper layers submitting through | ||
382 | lldd_execute_task(). | ||
383 | |||
384 | You really want to set this to something (much) larger than | ||
385 | 1. | ||
386 | |||
387 | This _really_ has absolutely nothing to do with queuing. | ||
388 | There is no queuing in SAS LLDDs. | ||
389 | |||
390 | struct sas_task { | ||
391 | dev -- the device this task is destined to | ||
392 | list -- must be initialized (INIT_LIST_HEAD) | ||
393 | task_proto -- _one_ of enum sas_proto | ||
394 | scatter -- pointer to scatter gather list array | ||
395 | num_scatter -- number of elements in scatter | ||
396 | total_xfer_len -- total number of bytes expected to be transfered | ||
397 | data_dir -- PCI_DMA_... | ||
398 | task_done -- callback when the task has finished execution | ||
399 | }; | ||
400 | |||
401 | When an external entity, entity other than the LLDD or the | ||
402 | SAS Layer, wants to work with a struct domain_device, it | ||
403 | _must_ call kobject_get() when getting a handle on the | ||
404 | device and kobject_put() when it is done with the device. | ||
405 | |||
406 | This does two things: | ||
407 | A) implements proper kfree() for the device; | ||
408 | B) increments/decrements the kref for all players: | ||
409 | domain_device | ||
410 | all domain_device's ... (if past an expander) | ||
411 | port | ||
412 | host adapter | ||
413 | pci device | ||
414 | and up the ladder, etc. | ||
415 | |||
416 | DISCOVERY | ||
417 | --------- | ||
418 | |||
419 | The sysfs tree has the following purposes: | ||
420 | a) It shows you the physical layout of the SAS domain at | ||
421 | the current time, i.e. how the domain looks in the | ||
422 | physical world right now. | ||
423 | b) Shows some device parameters _at_discovery_time_. | ||
424 | |||
425 | This is a link to the tree(1) program, very useful in | ||
426 | viewing the SAS domain: | ||
427 | ftp://mama.indstate.edu/linux/tree/ | ||
428 | I expect user space applications to actually create a | ||
429 | graphical interface of this. | ||
430 | |||
431 | That is, the sysfs domain tree doesn't show or keep state if | ||
432 | you e.g., change the meaning of the READY LED MEANING | ||
433 | setting, but it does show you the current connection status | ||
434 | of the domain device. | ||
435 | |||
436 | Keeping internal device state changes is responsibility of | ||
437 | upper layers (Command set drivers) and user space. | ||
438 | |||
439 | When a device or devices are unplugged from the domain, this | ||
440 | is reflected in the sysfs tree immediately, and the device(s) | ||
441 | removed from the system. | ||
442 | |||
443 | The structure domain_device describes any device in the SAS | ||
444 | domain. It is completely managed by the SAS layer. A task | ||
445 | points to a domain device, this is how the SAS LLDD knows | ||
446 | where to send the task(s) to. A SAS LLDD only reads the | ||
447 | contents of the domain_device structure, but it never creates | ||
448 | or destroys one. | ||
449 | |||
450 | Expander management from User Space | ||
451 | ----------------------------------- | ||
452 | |||
453 | In each expander directory in sysfs, there is a file called | ||
454 | "smp_portal". It is a binary sysfs attribute file, which | ||
455 | implements an SMP portal (Note: this is *NOT* an SMP port), | ||
456 | to which user space applications can send SMP requests and | ||
457 | receive SMP responses. | ||
458 | |||
459 | Functionality is deceptively simple: | ||
460 | |||
461 | 1. Build the SMP frame you want to send. The format and layout | ||
462 | is described in the SAS spec. Leave the CRC field equal 0. | ||
463 | open(2) | ||
464 | 2. Open the expander's SMP portal sysfs file in RW mode. | ||
465 | write(2) | ||
466 | 3. Write the frame you built in 1. | ||
467 | read(2) | ||
468 | 4. Read the amount of data you expect to receive for the frame you built. | ||
469 | If you receive different amount of data you expected to receive, | ||
470 | then there was some kind of error. | ||
471 | close(2) | ||
472 | All this process is shown in detail in the function do_smp_func() | ||
473 | and its callers, in the file "expander_conf.c". | ||
474 | |||
475 | The kernel functionality is implemented in the file | ||
476 | "sas_expander.c". | ||
477 | |||
478 | The program "expander_conf.c" implements this. It takes one | ||
479 | argument, the sysfs file name of the SMP portal to the | ||
480 | expander, and gives expander information, including routing | ||
481 | tables. | ||
482 | |||
483 | The SMP portal gives you complete control of the expander, | ||
484 | so please be careful. | ||