Diffstat (limited to 'Documentation/blockdev/README.DAC960')
-rw-r--r-- | Documentation/blockdev/README.DAC960 | 756 |
1 files changed, 756 insertions, 0 deletions
diff --git a/Documentation/blockdev/README.DAC960 b/Documentation/blockdev/README.DAC960
new file mode 100644
index 000000000000..0e8f618ab534
--- /dev/null
+++ b/Documentation/blockdev/README.DAC960
@@ -0,0 +1,756 @@ | |||
1 | Linux Driver for Mylex DAC960/AcceleRAID/eXtremeRAID PCI RAID Controllers | ||
2 | |||
3 | Version 2.2.11 for Linux 2.2.19 | ||
4 | Version 2.4.11 for Linux 2.4.12 | ||
5 | |||
6 | PRODUCTION RELEASE | ||
7 | |||
8 | 11 October 2001 | ||
9 | |||
10 | Leonard N. Zubkoff | ||
11 | Dandelion Digital | ||
12 | lnz@dandelion.com | ||
13 | |||
14 | Copyright 1998-2001 by Leonard N. Zubkoff <lnz@dandelion.com> | ||
15 | |||
16 | |||
17 | INTRODUCTION | ||
18 | |||
19 | Mylex, Inc. designs and manufactures a variety of high performance PCI RAID | ||
20 | controllers. Mylex Corporation is located at 34551 Ardenwood Blvd., Fremont, | ||
21 | California 94555, USA and can be reached at 510.796.6100 or on the World Wide | ||
22 | Web at http://www.mylex.com. Mylex Technical Support can be reached by | ||
23 | electronic mail at mylexsup@us.ibm.com, by voice at 510.608.2400, or by FAX at | ||
24 | 510.745.7715. Contact information for offices in Europe and Japan is available | ||
25 | on their Web site. | ||
26 | |||
27 | The latest information on Linux support for DAC960 PCI RAID Controllers, as | ||
28 | well as the most recent release of this driver, will always be available from | ||
29 | my Linux Home Page at URL "http://www.dandelion.com/Linux/". The Linux DAC960 | ||
30 | driver supports all current Mylex PCI RAID controllers including the new | ||
31 | eXtremeRAID 2000/3000 and AcceleRAID 352/170/160 models which have an entirely | ||
32 | new firmware interface from the older eXtremeRAID 1100, AcceleRAID 150/200/250, | ||
33 | and DAC960PJ/PG/PU/PD/PL. See below for a complete controller list as well as | ||
34 | minimum firmware version requirements. For simplicity, in most places this | ||
35 | documentation refers to DAC960 generically rather than explicitly listing all | ||
36 | the supported models. | ||
37 | |||
38 | Driver bug reports should be sent via electronic mail to "lnz@dandelion.com". | ||
39 | Please include with the bug report the complete configuration messages reported | ||
40 | by the driver at startup, along with any subsequent system messages relevant to | ||
41 | the controller's operation, and a detailed description of your system's | ||
42 | hardware configuration. Driver bugs are actually quite rare; if you encounter | ||
43 | problems with disks being marked offline, for example, please contact Mylex | ||
44 | Technical Support as the problem is related to the hardware configuration | ||
45 | rather than the Linux driver. | ||
46 | |||
47 | Please consult the RAID controller documentation for detailed information | ||
48 | regarding installation and configuration of the controllers. This document | ||
49 | primarily provides information specific to the Linux support. | ||
50 | |||
51 | |||
52 | DRIVER FEATURES | ||
53 | |||
54 | The DAC960 RAID controllers are supported solely as high performance RAID | ||
55 | controllers, not as interfaces to arbitrary SCSI devices. The Linux DAC960 | ||
56 | driver operates at the block device level, the same level as the SCSI and IDE | ||
57 | drivers. Unlike other RAID controllers currently supported on Linux, the | ||
58 | DAC960 driver is not dependent on the SCSI subsystem, and hence avoids all the | ||
59 | complexity and unnecessary code that would be associated with an implementation | ||
60 | as a SCSI driver. The DAC960 driver is designed for as high a performance as | ||
61 | possible with no compromises or extra code for compatibility with lower | ||
62 | performance devices. The DAC960 driver includes extensive error logging and | ||
63 | online configuration management capabilities. Except for initial configuration | ||
64 | of the controller and adding new disk drives, almost everything can be handled | ||
65 | from Linux while the system is operational. | ||
66 | |||
67 | The DAC960 driver is architected to support up to 8 controllers per system. | ||
68 | Each DAC960 parallel SCSI controller can support up to 15 disk drives per | ||
69 | channel, for a maximum of 60 drives on a four channel controller; the fibre | ||
70 | channel eXtremeRAID 3000 controller supports up to 125 disk drives per loop for | ||
71 | a total of 250 drives. The drives installed on a controller are divided into | ||
72 | one or more "Drive Groups", and then each Drive Group is subdivided further | ||
73 | into 1 to 32 "Logical Drives". Each Logical Drive has a specific RAID Level | ||
74 | and caching policy associated with it, and it appears to Linux as a single | ||
75 | block device. Logical Drives are further subdivided into up to 7 partitions | ||
76 | through the normal Linux and PC disk partitioning schemes. Logical Drives are | ||
77 | also known as "System Drives", and Drive Groups are also called "Packs". Both | ||
78 | terms are in use in the Mylex documentation; I have chosen to standardize on | ||
79 | the more generic "Logical Drive" and "Drive Group". | ||
80 | |||
81 | DAC960 RAID disk devices are named in the style of the obsolete Device File | ||
82 | System (DEVFS). The device corresponding to Logical Drive D on Controller C | ||
83 | is referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1 | ||
84 | through /dev/rd/cCdDp7. For example, partition 3 of Logical Drive 5 on | ||
85 | Controller 2 is referred to as /dev/rd/c2d5p3. Note that unlike with SCSI | ||
86 | disks the device names will not change in the event of a disk drive failure. | ||
87 | The DAC960 driver is assigned major numbers 48 - 55 with one major number per | ||
88 | controller. The 8 bits of minor number are divided into 5 bits for the Logical | ||
89 | Drive and 3 bits for the partition. | ||
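
The naming and numbering scheme above can be sketched as a small shell
function. This is only an illustration: the function name is hypothetical, and
it assumes that the Logical Drive number occupies the upper 5 bits of the minor
number, the partition the lower 3 bits, and that partition 0 denotes the whole
Logical Drive.

```shell
#!/bin/sh
# Map (controller, logical drive, partition) to a DAC960 device name and
# its major/minor numbers.
# Assumptions (not stated in the text): the logical drive number is in the
# upper 5 bits of the minor number, the partition in the lower 3 bits, and
# partition 0 means the whole logical drive.
dac960_devnum() {
    ctrl=$1
    drive=$2
    part=${3:-0}

    # Limits from the text: 8 controllers, 32 logical drives per controller,
    # 3 bits of partition number.
    if [ "$ctrl" -lt 0 ] || [ "$ctrl" -gt 7 ] ||
       [ "$drive" -lt 0 ] || [ "$drive" -gt 31 ] ||
       [ "$part" -lt 0 ] || [ "$part" -gt 7 ]; then
        echo "dac960_devnum: argument out of range" >&2
        return 1
    fi

    major=$((48 + ctrl))          # one major number per controller, 48 - 55
    minor=$((drive * 8 + part))   # 5 bits of drive, 3 bits of partition

    name="/dev/rd/c${ctrl}d${drive}"
    if [ "$part" -ne 0 ]; then
        name="${name}p${part}"
    fi
    echo "$name $major $minor"
}

dac960_devnum 2 5 3    # prints: /dev/rd/c2d5p3 50 43
```
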
90 | |||
91 | |||
92 | SUPPORTED DAC960/AcceleRAID/eXtremeRAID PCI RAID CONTROLLERS | ||
93 | |||
94 | The following list comprises the supported DAC960, AcceleRAID, and eXtremeRAID | ||
95 | PCI RAID Controllers as of the date of this document. It is recommended that | ||
96 | anyone purchasing a Mylex PCI RAID Controller not in the following table | ||
97 | contact the author beforehand to verify that it is or will be supported. | ||
98 | |||
99 | eXtremeRAID 3000 | ||
100 | 1 Wide Ultra-2/LVD SCSI channel | ||
101 | 2 External Fibre FC-AL channels | ||
102 | 233MHz StrongARM SA 110 Processor | ||
103 | 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) | ||
104 | 32MB/64MB ECC SDRAM Memory | ||
105 | |||
106 | eXtremeRAID 2000 | ||
107 | 4 Wide Ultra-160 LVD SCSI channels | ||
108 | 233MHz StrongARM SA 110 Processor | ||
109 | 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) | ||
110 | 32MB/64MB ECC SDRAM Memory | ||
111 | |||
112 | AcceleRAID 352 | ||
113 | 2 Wide Ultra-160 LVD SCSI channels | ||
114 | 100MHz Intel i960RN RISC Processor | ||
115 | 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) | ||
116 | 32MB/64MB ECC SDRAM Memory | ||
117 | |||
118 | AcceleRAID 170 | ||
119 | 1 Wide Ultra-160 LVD SCSI channel | ||
120 | 100MHz Intel i960RM RISC Processor | ||
121 | 16MB/32MB/64MB ECC SDRAM Memory | ||
122 | |||
123 | AcceleRAID 160 (AcceleRAID 170LP) | ||
124 | 1 Wide Ultra-160 LVD SCSI channel | ||
125 | 100MHz Intel i960RS RISC Processor | ||
126 | Built in 16M ECC SDRAM Memory | ||
127 | PCI Low Profile Form Factor - fit for 2U height | ||
128 | |||
129 | eXtremeRAID 1100 (DAC1164P) | ||
130 | 3 Wide Ultra-2/LVD SCSI channels | ||
131 | 233MHz StrongARM SA 110 Processor | ||
132 | 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) | ||
133 | 16MB/32MB/64MB Parity SDRAM Memory with Battery Backup | ||
134 | |||
135 | AcceleRAID 250 (DAC960PTL1) | ||
136 | Uses onboard Symbios SCSI chips on certain motherboards | ||
137 | Also includes one onboard Wide Ultra-2/LVD SCSI Channel | ||
138 | 66MHz Intel i960RD RISC Processor | ||
139 | 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory | ||
140 | |||
141 | AcceleRAID 200 (DAC960PTL0) | ||
142 | Uses onboard Symbios SCSI chips on certain motherboards | ||
143 | Includes no onboard SCSI Channels | ||
144 | 66MHz Intel i960RD RISC Processor | ||
145 | 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory | ||
146 | |||
147 | AcceleRAID 150 (DAC960PRL) | ||
148 | Uses onboard Symbios SCSI chips on certain motherboards | ||
149 | Also includes one onboard Wide Ultra-2/LVD SCSI Channel | ||
150 | 33MHz Intel i960RP RISC Processor | ||
151 | 4MB Parity EDO Memory | ||
152 | |||
153 | DAC960PJ 1/2/3 Wide Ultra SCSI-3 Channels | ||
154 | 66MHz Intel i960RD RISC Processor | ||
155 | 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory | ||
156 | |||
157 | DAC960PG 1/2/3 Wide Ultra SCSI-3 Channels | ||
158 | 33MHz Intel i960RP RISC Processor | ||
159 | 4MB/8MB ECC EDO Memory | ||
160 | |||
161 | DAC960PU 1/2/3 Wide Ultra SCSI-3 Channels | ||
162 | Intel i960CF RISC Processor | ||
163 | 4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory | ||
164 | |||
165 | DAC960PD 1/2/3 Wide Fast SCSI-2 Channels | ||
166 | Intel i960CF RISC Processor | ||
167 | 4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory | ||
168 | |||
169 | DAC960PL 1/2/3 Wide Fast SCSI-2 Channels | ||
170 | Intel i960 RISC Processor | ||
171 | 2MB/4MB/8MB/16MB/32MB DRAM Memory | ||
172 | |||
173 | DAC960P 1/2/3 Wide Fast SCSI-2 Channels | ||
174 | Intel i960 RISC Processor | ||
175 | 2MB/4MB/8MB/16MB/32MB DRAM Memory | ||
176 | |||
177 | For the eXtremeRAID 2000/3000 and AcceleRAID 352/170/160, firmware version | ||
178 | 6.00-01 or above is required. | ||
179 | |||
180 | For the eXtremeRAID 1100, firmware version 5.06-0-52 or above is required. | ||
181 | |||
182 | For the AcceleRAID 250, 200, and 150, firmware version 4.06-0-57 or above is | ||
183 | required. | ||
184 | |||
185 | For the DAC960PJ and DAC960PG, firmware version 4.06-0-00 or above is required. | ||
186 | |||
187 | For the DAC960PU, DAC960PD, DAC960PL, and DAC960P, either firmware version | ||
188 | 3.51-0-04 or above is required (for dual Flash ROM controllers), or firmware | ||
189 | version 2.73-0-00 or above is required (for single Flash ROM controllers). | ||
190 | |||
191 | Please note that not all SCSI disk drives are suitable for use with DAC960 | ||
192 | controllers, and only particular firmware versions of any given model may | ||
193 | actually function correctly. Similarly, not all motherboards have a BIOS that | ||
194 | properly initializes the AcceleRAID 250, AcceleRAID 200, AcceleRAID 150, | ||
195 | DAC960PJ, and DAC960PG because the Intel i960RD/RP is a multi-function device. | ||
196 | If in doubt, contact Mylex RAID Technical Support (mylexsup@us.ibm.com) to | ||
197 | verify compatibility. Mylex makes available a hard disk compatibility list at | ||
198 | http://www.mylex.com/support/hdcomp/hd-lists.html. | ||
199 | |||
200 | |||
201 | DRIVER INSTALLATION | ||
202 | |||
203 | This distribution was prepared for Linux kernel version 2.2.19 or 2.4.12. | ||
204 | |||
205 | To install the DAC960 RAID driver, you may use the following commands, | ||
206 | replacing "/usr/src" with wherever you keep your Linux kernel source tree: | ||
207 | |||
208 | cd /usr/src | ||
209 | tar -xvzf DAC960-2.2.11.tar.gz (or DAC960-2.4.11.tar.gz) | ||
210 | mv README.DAC960 linux/Documentation | ||
211 | mv DAC960.[ch] linux/drivers/block | ||
212 | patch -p0 < DAC960.patch (if DAC960.patch is included) | ||
213 | cd linux | ||
214 | make config | ||
215 | make bzImage (or zImage) | ||
216 | |||
217 | Then install "arch/i386/boot/bzImage" or "arch/i386/boot/zImage" as your | ||
218 | standard kernel, run lilo if appropriate, and reboot. | ||
219 | |||
220 | To create the necessary devices in /dev, the "make_rd" script included in | ||
221 | "DAC960-Utilities.tar.gz" from http://www.dandelion.com/Linux/ may be used. | ||
222 | LILO 21 and FDISK v2.9 include DAC960 support; also included in this archive | ||
223 | are patches to LILO 20 and FDISK v2.8 that add DAC960 support, along with | ||
224 | statically linked executables of LILO and FDISK. This modified version of LILO | ||
225 | will allow booting from a DAC960 controller and/or mounting the root file | ||
226 | system from a DAC960. | ||
227 | |||
228 | Red Hat Linux 6.0 and SuSE Linux 6.1 include support for Mylex PCI RAID | ||
229 | controllers. Installing directly onto a DAC960 may be problematic from other | ||
230 | Linux distributions until their installation utilities are updated. | ||
231 | |||
232 | |||
233 | INSTALLATION NOTES | ||
234 | |||
235 | Before installing Linux or adding DAC960 logical drives to an existing Linux | ||
236 | system, the controller must first be configured to provide one or more logical | ||
237 | drives using the BIOS Configuration Utility or DACCF. Please note that since | ||
238 | there are only at most 6 usable partitions on each logical drive, systems | ||
239 | requiring more partitions should subdivide a drive group into multiple logical | ||
240 | drives, each of which can have up to 6 usable partitions. Also, note that with | ||
241 | large disk arrays it is advisable to enable the 8GB BIOS Geometry (255/63) | ||
242 | rather than accepting the default 2GB BIOS Geometry (128/32); failing to do so | ||
243 | will cause the logical drive geometry to have more than 65535 cylinders which | ||
244 | will make it impossible for FDISK to be used properly. The 8GB BIOS Geometry | ||
245 | can be enabled by configuring the DAC960 BIOS, which is accessible via Alt-M | ||
246 | during the BIOS initialization sequence. | ||
247 | |||
248 | For maximum performance and the most efficient E2FSCK operation, it is | ||
249 | recommended that EXT2 file systems be built with a 4KB block size and 16 block | ||
250 | stride to match the DAC960 controller's 64KB default stripe size. The command | ||
251 | "mke2fs -b 4096 -R stride=16 <device>" is appropriate. Unless there will be a | ||
252 | large number of small files on the file systems, it is also beneficial to add | ||
253 | the "-i 16384" option to increase the bytes per inode parameter thereby | ||
254 | reducing the file system metadata. Finally, on systems that will only be run | ||
255 | with Linux 2.2 or later kernels it is beneficial to enable sparse superblocks | ||
256 | with the "-s 1" option. | ||
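
The stride in the command above is simply the stripe size divided by the file
system block size (64KB / 4KB = 16). The following sketch shows that
arithmetic; the function names are hypothetical, and the command is only
printed, never executed.

```shell
#!/bin/sh
# Compute the EXT2 stride (in file system blocks) for a given controller
# stripe size in KB and file system block size in bytes.
ext2_stride() {
    stripe_kb=$1
    block_bytes=$2
    stripe_bytes=$((stripe_kb * 1024))

    if [ "$block_bytes" -le 0 ] || [ $((stripe_bytes % block_bytes)) -ne 0 ]; then
        echo "ext2_stride: stripe size must be a multiple of the block size" >&2
        return 1
    fi
    echo $((stripe_bytes / block_bytes))
}

# Print (do not run) the matching mke2fs command. The defaults are the 64KB
# stripe size and 4KB block size discussed in the text.
mke2fs_command() {
    device=$1
    stripe_kb=${2:-64}
    block_bytes=${3:-4096}

    stride=$(ext2_stride "$stripe_kb" "$block_bytes") || return 1
    echo "mke2fs -b $block_bytes -R stride=$stride $device"
}

mke2fs_command /dev/rd/c0d0p1    # prints: mke2fs -b 4096 -R stride=16 /dev/rd/c0d0p1
```

If the controller has been configured with a stripe size other than the
default, pass it as the second argument so that the stride still matches.
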
257 | |||
258 | |||
259 | DAC960 ANNOUNCEMENTS MAILING LIST | ||
260 | |||
261 | The DAC960 Announcements Mailing List provides a forum for informing Linux | ||
262 | users of new driver releases and other announcements regarding Linux support | ||
263 | for DAC960 PCI RAID Controllers. To join the mailing list, send a message to | ||
264 | "dac960-announce-request@dandelion.com" with the line "subscribe" in the | ||
265 | message body. | ||
266 | |||
267 | |||
268 | CONTROLLER CONFIGURATION AND STATUS MONITORING | ||
269 | |||
270 | The DAC960 RAID controllers running firmware 4.06 or above include a Background | ||
271 | Initialization facility so that system downtime is minimized both for initial | ||
272 | installation and subsequent configuration of additional storage. The BIOS | ||
273 | Configuration Utility (accessible via Alt-R during the BIOS initialization | ||
274 | sequence) is used to quickly configure the controller, and then the logical | ||
275 | drives that have been created are available for immediate use even while they | ||
276 | are still being initialized by the controller. The primary need for online | ||
277 | configuration and status monitoring is then to avoid system downtime when disk | ||
278 | drives fail and must be replaced. Mylex's online monitoring and configuration | ||
279 | utilities are being ported to Linux and will become available at some point in | ||
280 | the future. Note that with a SAF-TE (SCSI Accessed Fault-Tolerant Enclosure) | ||
281 | enclosure, the controller is able to rebuild failed drives automatically as | ||
282 | soon as a drive replacement is made available. | ||
283 | |||
284 | The primary interfaces for controller configuration and status monitoring are | ||
285 | special files created in the /proc/rd/... hierarchy along with the normal | ||
286 | system console logging mechanism. Whenever the system is operating, the DAC960 | ||
287 | driver queries each controller for status information every 10 seconds, and | ||
288 | checks for additional conditions every 60 seconds. The initial status of each | ||
289 | controller is always available for controller N in /proc/rd/cN/initial_status, | ||
290 | and the current status as of the last status monitoring query is available in | ||
291 | /proc/rd/cN/current_status. In addition, status changes are also logged by the | ||
292 | driver to the system console and will appear in the log files maintained by | ||
293 | syslog. The progress of asynchronous rebuild or consistency check operations | ||
294 | is also available in /proc/rd/cN/current_status, and progress messages are | ||
295 | logged to the system console at most every 60 seconds. | ||
296 | |||
297 | Starting with the 2.2.3/2.0.3 versions of the driver, the status information | ||
298 | available in /proc/rd/cN/initial_status and /proc/rd/cN/current_status has been | ||
299 | augmented to include the vendor, model, revision, and serial number (if | ||
300 | available) for each physical device found connected to the controller: | ||
301 | |||
302 | ***** DAC960 RAID Driver Version 2.2.3 of 19 August 1999 ***** | ||
303 | Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> | ||
304 | Configuring Mylex DAC960PRL PCI RAID Controller | ||
305 | Firmware Version: 4.07-0-07, Channels: 1, Memory Size: 16MB | ||
306 | PCI Bus: 1, Device: 4, Function: 1, I/O Address: Unassigned | ||
307 | PCI Address: 0xFE300000 mapped at 0xA0800000, IRQ Channel: 21 | ||
308 | Controller Queue Depth: 128, Maximum Blocks per Command: 128 | ||
309 | Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 | ||
310 | Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 | ||
311 | SAF-TE Enclosure Management Enabled | ||
312 | Physical Devices: | ||
313 | 0:0 Vendor: IBM Model: DRVS09D Revision: 0270 | ||
314 | Serial Number: 68016775HA | ||
315 | Disk Status: Online, 17928192 blocks | ||
316 | 0:1 Vendor: IBM Model: DRVS09D Revision: 0270 | ||
317 | Serial Number: 68004E53HA | ||
318 | Disk Status: Online, 17928192 blocks | ||
319 | 0:2 Vendor: IBM Model: DRVS09D Revision: 0270 | ||
320 | Serial Number: 13013935HA | ||
321 | Disk Status: Online, 17928192 blocks | ||
322 | 0:3 Vendor: IBM Model: DRVS09D Revision: 0270 | ||
323 | Serial Number: 13016897HA | ||
324 | Disk Status: Online, 17928192 blocks | ||
325 | 0:4 Vendor: IBM Model: DRVS09D Revision: 0270 | ||
326 | Serial Number: 68019905HA | ||
327 | Disk Status: Online, 17928192 blocks | ||
328 | 0:5 Vendor: IBM Model: DRVS09D Revision: 0270 | ||
329 | Serial Number: 68012753HA | ||
330 | Disk Status: Online, 17928192 blocks | ||
331 | 0:6 Vendor: ESG-SHV Model: SCA HSBP M6 Revision: 0.61 | ||
332 | Logical Drives: | ||
333 | /dev/rd/c0d0: RAID-5, Online, 89640960 blocks, Write Thru | ||
334 | No Rebuild or Consistency Check in Progress | ||
335 | |||
336 | To simplify the monitoring process for custom software, the special file | ||
337 | /proc/rd/status returns "OK" when all DAC960 controllers in the system are | ||
338 | operating normally and no failures have occurred, or "ALERT" if any logical | ||
339 | drives are offline or critical or any non-standby physical drives are dead. | ||
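
A monitoring script built on /proc/rd/status might look like the following
sketch. The function name and the exit code convention (0 for "OK", 1 for
"ALERT", 2 for anything else) are assumptions made for illustration, not part
of the driver.

```shell
#!/bin/sh
# Check the overall DAC960 status file.
# Returns 0 if it reads "OK", 1 if it reads "ALERT", and 2 if the file is
# unreadable or contains anything else (hypothetical convention).
dac960_health() {
    status_file=${1:-/proc/rd/status}

    if [ ! -r "$status_file" ]; then
        echo "dac960_health: cannot read $status_file" >&2
        return 2
    fi

    # read returns non-zero at end of file without a newline but still
    # assigns the variable, so its exit status is ignored here.
    read -r status < "$status_file" || true

    case "$status" in
        OK)
            return 0 ;;
        ALERT)
            echo "DAC960 ALERT: check /proc/rd/cN/current_status" >&2
            return 1 ;;
        *)
            echo "dac960_health: unexpected status '$status'" >&2
            return 2 ;;
    esac
}
```

Run periodically, for example from cron, a non-zero return value can be used
to trigger whatever notification mechanism is in place on the system.
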
340 | |||
341 | Configuration commands for controller N are available via the special file | ||
342 | /proc/rd/cN/user_command. A human readable command can be written to this | ||
343 | special file to initiate a configuration operation, and the results of the | ||
344 | operation can then be read back from the special file in addition to being | ||
345 | logged to the system console. The shell command sequence | ||
346 | |||
347 | echo "<configuration-command>" > /proc/rd/c0/user_command | ||
348 | cat /proc/rd/c0/user_command | ||
349 | |||
350 | is typically used to execute configuration commands. The configuration | ||
351 | commands are: | ||
352 | |||
353 | flush-cache | ||
354 | |||
355 | The "flush-cache" command flushes the controller's cache. The system | ||
356 | automatically flushes the cache at shutdown or if the driver module is | ||
357 | unloaded, so this command is only needed to be certain a write back cache | ||
358 | is flushed to disk before the system is powered off by a command to a UPS. | ||
359 | Note that the flush-cache command also stops an asynchronous rebuild or | ||
360 | consistency check, so it should not be used except when the system is being | ||
361 | halted. | ||
362 | |||
363 | kill <channel>:<target-id> | ||
364 | |||
365 | The "kill" command marks the physical drive <channel>:<target-id> as DEAD. | ||
366 | This command is provided primarily for testing, and should not be used | ||
367 | during normal system operation. | ||
368 | |||
369 | make-online <channel>:<target-id> | ||
370 | |||
371 | The "make-online" command changes the physical drive <channel>:<target-id> | ||
372 | from status DEAD to status ONLINE. In cases where multiple physical drives | ||
373 | have been killed simultaneously, this command may be used to bring all but | ||
374 | one of them back online, after which a rebuild to the final drive is | ||
375 | necessary. | ||
376 | |||
377 | Warning: make-online should only be used on a dead physical drive that is | ||
378 | an active part of a drive group, never on a standby drive. The command | ||
379 | should never be used on a dead drive that is part of a critical logical | ||
380 | drive; rebuild should be used if only a single drive is dead. | ||
381 | |||
382 | make-standby <channel>:<target-id> | ||
383 | |||
384 | The "make-standby" command changes physical drive <channel>:<target-id> | ||
385 | from status DEAD to status STANDBY. It should only be used in cases where | ||
386 | a dead drive was replaced after an automatic rebuild was performed onto a | ||
387 | standby drive. It cannot be used to add a standby drive to the controller | ||
388 | configuration if one was not created initially; the BIOS Configuration | ||
389 | Utility must be used for that currently. | ||
390 | |||
391 | rebuild <channel>:<target-id> | ||
392 | |||
393 | The "rebuild" command initiates an asynchronous rebuild onto physical drive | ||
394 | <channel>:<target-id>. It should only be used when a dead drive has been | ||
395 | replaced. | ||
396 | |||
397 | check-consistency <logical-drive-number> | ||
398 | |||
399 | The "check-consistency" command initiates an asynchronous consistency check | ||
400 | of <logical-drive-number> with automatic restoration. It can be used | ||
401 | whenever it is desired to verify the consistency of the redundancy | ||
402 | information. | ||
403 | |||
404 | cancel-rebuild | ||
405 | cancel-consistency-check | ||
406 | |||
407 | The "cancel-rebuild" and "cancel-consistency-check" commands cancel any | ||
408 | rebuild or consistency check operations previously initiated. | ||
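
The echo/cat sequence and the command list above can be wrapped in a small
helper that rejects anything other than the documented command names before
writing to the special file. This is a sketch only: the function name and the
DAC960_PROC_ROOT variable are hypothetical, the latter existing solely so the
helper can be tried against an ordinary directory.

```shell
#!/bin/sh
# Send a configuration command to controller N and print the result.
# Usage: dac960_cmd N COMMAND [ARGUMENT]
# DAC960_PROC_ROOT is a hypothetical override of /proc/rd for dry runs.
dac960_cmd() {
    if [ $# -lt 2 ]; then
        echo "usage: dac960_cmd N COMMAND [ARGUMENT]" >&2
        return 1
    fi
    ctrl=$1
    shift
    cmd_file="${DAC960_PROC_ROOT:-/proc/rd}/c${ctrl}/user_command"

    # Accept only the command names documented above; "want" is the expected
    # number of words including the command itself.
    case "$1" in
        flush-cache|cancel-rebuild|cancel-consistency-check)
            want=1 ;;
        kill|make-online|make-standby|rebuild|check-consistency)
            want=2 ;;
        *)
            echo "dac960_cmd: unknown command '$1'" >&2
            return 1 ;;
    esac
    if [ $# -ne "$want" ]; then
        echo "dac960_cmd: wrong number of arguments for '$1'" >&2
        return 1
    fi

    # Write the command, then read the result back from the same file.
    echo "$*" > "$cmd_file" || return 1
    cat "$cmd_file"
}
```

For example, "dac960_cmd 0 rebuild 1:1" is equivalent to the two-line sequence
shown earlier with "rebuild 1:1" as the configuration command.
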
409 | |||
410 | |||
411 | EXAMPLE I - DRIVE FAILURE WITHOUT A STANDBY DRIVE | ||
412 | |||
413 | The following annotated logs demonstrate the controller configuration and | ||
414 | online status monitoring capabilities of the Linux DAC960 Driver. The test | ||
415 | configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a | ||
416 | DAC960PJ controller. The physical drives are configured into a single drive | ||
417 | group without a standby drive, and the drive group has been configured into two | ||
418 | logical drives, one RAID-5 and one RAID-6. Note that these logs are from an | ||
419 | earlier version of the driver and the messages have changed somewhat with newer | ||
420 | releases, but the functionality remains similar. First, here is the current | ||
421 | status of the RAID configuration: | ||
422 | |||
423 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
424 | ***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** | ||
425 | Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> | ||
426 | Configuring Mylex DAC960PJ PCI RAID Controller | ||
427 | Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB | ||
428 | PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned | ||
429 | PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 | ||
430 | Controller Queue Depth: 128, Maximum Blocks per Command: 128 | ||
431 | Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 | ||
432 | Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 | ||
433 | Physical Devices: | ||
434 | 0:1 - Disk: Online, 2201600 blocks | ||
435 | 0:2 - Disk: Online, 2201600 blocks | ||
436 | 0:3 - Disk: Online, 2201600 blocks | ||
437 | 1:1 - Disk: Online, 2201600 blocks | ||
438 | 1:2 - Disk: Online, 2201600 blocks | ||
439 | 1:3 - Disk: Online, 2201600 blocks | ||
440 | Logical Drives: | ||
441 | /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru | ||
442 | /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru | ||
443 | No Rebuild or Consistency Check in Progress | ||
444 | |||
445 | gwynedd:/u/lnz# cat /proc/rd/status | ||
446 | OK | ||
447 | |||
448 | The above messages indicate that everything is healthy, and /proc/rd/status | ||
449 | returns "OK" indicating that there are no problems with any DAC960 controller | ||
450 | in the system. For demonstration purposes, while I/O is active Physical Drive | ||
451 | 1:1 is now disconnected, simulating a drive failure. The failure is noted by | ||
452 | the driver within 10 seconds of the controller's having detected it, and the | ||
453 | driver logs the following console status messages indicating that Logical | ||
454 | Drives 0 and 1 are now CRITICAL as a result of Physical Drive 1:1 being DEAD: | ||
455 | |||
456 | DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 | ||
457 | DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 | ||
458 | DAC960#0: Physical Drive 1:1 killed because of timeout on SCSI command | ||
459 | DAC960#0: Physical Drive 1:1 is now DEAD | ||
460 | DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL | ||
461 | DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL | ||
462 | |||
463 | The Sense Keys logged here are just Check Condition / Unit Attention conditions | ||
464 | arising from a SCSI bus reset that is forced by the controller during its error | ||
465 | recovery procedures. Concurrently with the above, the driver status available | ||
466 | from /proc/rd also reflects the drive failure. The status message in | ||
467 | /proc/rd/status has changed from "OK" to "ALERT": | ||
468 | |||
469 | gwynedd:/u/lnz# cat /proc/rd/status | ||
470 | ALERT | ||
471 | |||
472 | and /proc/rd/c0/current_status has been updated: | ||
473 | |||
474 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
475 | ... | ||
476 | Physical Devices: | ||
477 | 0:1 - Disk: Online, 2201600 blocks | ||
478 | 0:2 - Disk: Online, 2201600 blocks | ||
479 | 0:3 - Disk: Online, 2201600 blocks | ||
480 | 1:1 - Disk: Dead, 2201600 blocks | ||
481 | 1:2 - Disk: Online, 2201600 blocks | ||
482 | 1:3 - Disk: Online, 2201600 blocks | ||
483 | Logical Drives: | ||
484 | /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru | ||
485 | /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru | ||
486 | No Rebuild or Consistency Check in Progress | ||
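
A status listing like the one above can be scanned for dead drives with a
short awk filter. The sketch below is illustrative only. It handles the
single-line "Disk: Dead," format shown here and also assumes, without an
example in this document to confirm it, that newer driver versions report a
dead drive as "Disk Status: Dead," on a line following the channel:target
line. The function name is hypothetical.

```shell
#!/bin/sh
# List the <channel>:<target-id> of every physical drive reported as Dead
# in a current_status file (controller 0 by default).
dac960_dead_drives() {
    awk '
        # Remember the most recent channel:target identifier seen.
        $1 ~ /^[0-9]+:[0-9]+$/ { id = $1 }
        # Older format: "1:1 - Disk: Dead, ..." on one line.
        # Newer format (assumed): "Disk Status: Dead, ..." on a later line.
        /Disk( Status)?: Dead,/ { print id }
    ' "${1:-/proc/rd/c0/current_status}"
}
```
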
487 | |||
488 | Since there are no standby drives configured, the system can continue to access | ||
489 | the logical drives in a performance degraded mode until the failed drive is | ||
490 | replaced and a rebuild operation completed to restore the redundancy of the | ||
491 | logical drives. Once Physical Drive 1:1 is replaced with a properly | ||
492 | functioning drive, or if the physical drive was killed without having failed | ||
493 | (e.g., due to electrical problems on the SCSI bus), the user can instruct the | ||
494 | controller to initiate a rebuild operation onto the newly replaced drive: | ||
495 | |||
496 | gwynedd:/u/lnz# echo "rebuild 1:1" > /proc/rd/c0/user_command | ||
497 | gwynedd:/u/lnz# cat /proc/rd/c0/user_command | ||
498 | Rebuild of Physical Drive 1:1 Initiated | ||
499 | |||
500 | The echo command instructs the controller to initiate an asynchronous rebuild | ||
501 | operation onto Physical Drive 1:1, and the status message that results from the | ||
502 | operation is then available for reading from /proc/rd/c0/user_command, as well | ||
503 | as being logged to the console by the driver. | ||
504 | |||
505 | Within 10 seconds of this command the driver logs the initiation of the | ||
506 | asynchronous rebuild operation: | ||
507 | |||
508 | DAC960#0: Rebuild of Physical Drive 1:1 Initiated | ||
509 | DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01 | ||
510 | DAC960#0: Physical Drive 1:1 is now WRITE-ONLY | ||
511 | DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 1% completed | ||
512 | |||
513 | and /proc/rd/c0/current_status is updated: | ||
514 | |||
515 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
516 | ... | ||
517 | Physical Devices: | ||
518 | 0:1 - Disk: Online, 2201600 blocks | ||
519 | 0:2 - Disk: Online, 2201600 blocks | ||
520 | 0:3 - Disk: Online, 2201600 blocks | ||
521 | 1:1 - Disk: Write-Only, 2201600 blocks | ||
522 | 1:2 - Disk: Online, 2201600 blocks | ||
523 | 1:3 - Disk: Online, 2201600 blocks | ||
524 | Logical Drives: | ||
525 | /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru | ||
526 | /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru | ||
527 | Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 6% completed | ||
528 | |||
529 | As the rebuild progresses, the current status in /proc/rd/c0/current_status is | ||
530 | updated every 10 seconds: | ||
531 | |||
532 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
533 | ... | ||
534 | Physical Devices: | ||
535 | 0:1 - Disk: Online, 2201600 blocks | ||
536 | 0:2 - Disk: Online, 2201600 blocks | ||
537 | 0:3 - Disk: Online, 2201600 blocks | ||
538 | 1:1 - Disk: Write-Only, 2201600 blocks | ||
539 | 1:2 - Disk: Online, 2201600 blocks | ||
540 | 1:3 - Disk: Online, 2201600 blocks | ||
541 | Logical Drives: | ||
542 | /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru | ||
543 | /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru | ||
544 | Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 15% completed | ||
545 | |||
546 | and every minute a progress message is logged to the console by the driver: | ||
547 | |||
548 | DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 32% completed | ||
549 | DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 63% completed | ||
550 | DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 94% completed | ||
551 | DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 94% completed | ||
552 | |||
553 | Finally, the rebuild completes successfully. The driver logs the status of the | ||
554 | logical and physical drives and the rebuild completion: | ||
555 | |||
556 | DAC960#0: Rebuild Completed Successfully | ||
557 | DAC960#0: Physical Drive 1:1 is now ONLINE | ||
558 | DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE | ||
559 | DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE | ||
560 | |||
561 | /proc/rd/c0/current_status is updated: | ||
562 | |||
563 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
564 | ... | ||
565 | Physical Devices: | ||
566 | 0:1 - Disk: Online, 2201600 blocks | ||
567 | 0:2 - Disk: Online, 2201600 blocks | ||
568 | 0:3 - Disk: Online, 2201600 blocks | ||
569 | 1:1 - Disk: Online, 2201600 blocks | ||
570 | 1:2 - Disk: Online, 2201600 blocks | ||
571 | 1:3 - Disk: Online, 2201600 blocks | ||
572 | Logical Drives: | ||
573 | /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru | ||
574 | /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru | ||
575 | Rebuild Completed Successfully | ||
576 | |||
577 | and /proc/rd/status indicates that everything is healthy once again: | ||
578 | |||
579 | gwynedd:/u/lnz# cat /proc/rd/status | ||
580 | OK | ||
581 | |||
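Because /proc/rd/status reads "OK" only when no DAC960 controller in the
system has a problem, a monitoring job can reduce the health check to a single
string comparison. The following Python sketch is illustrative only and is not
part of the driver; the function names are hypothetical, and it assumes that
any content other than "OK" (for example "ALERT") should be treated as
requiring attention.

```python
from pathlib import Path


def controllers_healthy(status_text: str) -> bool:
    """Return True only when the status text is exactly "OK".

    Surrounding whitespace (such as the trailing newline) is ignored.
    Anything else, including "ALERT", counts as unhealthy.
    """
    return status_text.strip() == "OK"


def read_status(path: str = "/proc/rd/status") -> str:
    """Read the combined status of all DAC960 controllers.

    The default path is the one used throughout this document; it exists
    only while the DAC960 driver is loaded.
    """
    return Path(path).read_text()
```

A cron job or similar could call controllers_healthy(read_status()) and raise
an alarm whenever it returns False.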
582 | |||
583 | EXAMPLE II - DRIVE FAILURE WITH A STANDBY DRIVE | ||
584 | |||
585 | The following annotated logs demonstrate the controller configuration and | ||
586 | online status monitoring capabilities of the Linux DAC960 Driver. The test | ||
587 | configuration comprises six 1GB Quantum Atlas I disk drives on two channels of a | ||
588 | DAC960PJ controller. The physical drives are configured into a single drive | ||
589 | group with a standby drive, and the drive group has been configured into two | ||
590 | logical drives, one RAID-5 and one RAID-6. Note that these logs are from an | ||
591 | earlier version of the driver and the messages have changed somewhat with newer | ||
592 | releases, but the functionality remains similar. First, here is the current | ||
593 | status of the RAID configuration: | ||
594 | |||
595 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
596 | ***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** | ||
597 | Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> | ||
598 | Configuring Mylex DAC960PJ PCI RAID Controller | ||
599 | Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB | ||
600 | PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned | ||
601 | PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 | ||
602 | Controller Queue Depth: 128, Maximum Blocks per Command: 128 | ||
603 | Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 | ||
604 | Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 | ||
605 | Physical Devices: | ||
606 | 0:1 - Disk: Online, 2201600 blocks | ||
607 | 0:2 - Disk: Online, 2201600 blocks | ||
608 | 0:3 - Disk: Online, 2201600 blocks | ||
609 | 1:1 - Disk: Online, 2201600 blocks | ||
610 | 1:2 - Disk: Online, 2201600 blocks | ||
611 | 1:3 - Disk: Standby, 2201600 blocks | ||
612 | Logical Drives: | ||
613 | /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru | ||
614 | /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru | ||
615 | No Rebuild or Consistency Check in Progress | ||
616 | |||
617 | gwynedd:/u/lnz# cat /proc/rd/status | ||
618 | OK | ||
619 | |||
620 | The above messages indicate that everything is healthy, and /proc/rd/status | ||
621 | returns "OK" indicating that there are no problems with any DAC960 controller | ||
622 | in the system. For demonstration purposes, while I/O is active, Physical Drive | ||
623 | 1:2 is now disconnected, simulating a drive failure. The failure is noted by | ||
624 | the driver within 10 seconds of the controller's having detected it, and the | ||
625 | driver logs the following console status messages: | ||
626 | |||
627 | DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 | ||
628 | DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 | ||
629 | DAC960#0: Physical Drive 1:2 killed because of timeout on SCSI command | ||
630 | DAC960#0: Physical Drive 1:2 is now DEAD | ||
631 | DAC960#0: Physical Drive 1:2 killed because it was removed | ||
632 | DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL | ||
633 | DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL | ||
634 | |||
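The "is now STATE" lines in the console messages above are regular enough to
be picked out of a system log mechanically. The Python sketch below is an
assumption-laden illustration rather than anything the driver provides: the
regular expression is derived solely from the sample messages in this
document, the function name is hypothetical, and, as noted earlier, message
wording differs between driver releases.

```python
import re

# Pattern derived from the sample log lines in this document only;
# other driver versions may word these messages differently.
STATE_CHANGE = re.compile(
    r"^DAC960#(?P<ctrl>\d+): "
    r"(?P<kind>Physical|Logical) Drive (?P<id>\S+)"
    r"(?: \((?P<dev>/dev/rd/\S+)\))? is now (?P<state>[A-Z-]+)$"
)


def parse_state_change(line):
    """Return (controller, kind, drive_id, state) or None.

    Lines that do not announce a state change, such as the
    "killed because ..." and "Error Log" lines, yield None.
    """
    m = STATE_CHANGE.match(line.strip())
    if m is None:
        return None
    return (int(m["ctrl"]), m["kind"], m["id"], m["state"])
```

For example, the line reporting that Physical Drive 1:2 is now DEAD yields
(0, "Physical", "1:2", "DEAD").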
635 | Since a standby drive is configured, the controller automatically begins | ||
636 | rebuilding onto the standby drive: | ||
637 | |||
638 | DAC960#0: Physical Drive 1:3 is now WRITE-ONLY | ||
639 | DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed | ||
640 | |||
641 | Concurrently with the above, the driver status available from /proc/rd also | ||
642 | reflects the drive failure and automatic rebuild. The status message in | ||
643 | /proc/rd/status has changed from "OK" to "ALERT": | ||
644 | |||
645 | gwynedd:/u/lnz# cat /proc/rd/status | ||
646 | ALERT | ||
647 | |||
648 | and /proc/rd/c0/current_status has been updated: | ||
649 | |||
650 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
651 | ... | ||
652 | Physical Devices: | ||
653 | 0:1 - Disk: Online, 2201600 blocks | ||
654 | 0:2 - Disk: Online, 2201600 blocks | ||
655 | 0:3 - Disk: Online, 2201600 blocks | ||
656 | 1:1 - Disk: Online, 2201600 blocks | ||
657 | 1:2 - Disk: Dead, 2201600 blocks | ||
658 | 1:3 - Disk: Write-Only, 2201600 blocks | ||
659 | Logical Drives: | ||
660 | /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru | ||
661 | /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru | ||
662 | Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed | ||
663 | |||
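The Physical Devices and Logical Drives sections of current_status follow a
fixed line layout, so the state of each drive can be extracted with two
patterns. This Python sketch is only an illustration; the patterns are
inferred from the sample output in this document, the function name is
hypothetical, and only disk entries in the form shown here are recognized.

```python
import re

# "1:2 - Disk: Dead, 2201600 blocks"
PHYSICAL = re.compile(r"^\s*(\d+:\d+) - Disk: ([\w-]+), (\d+) blocks")
# "/dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru"
LOGICAL = re.compile(r"^\s*(/dev/rd/c\d+d\d+): (RAID-\d+), (\w+), (\d+) blocks")


def parse_current_status(text):
    """Map each physical and logical drive to its reported state.

    Returns two dicts: {"channel:target": state} and {device: state}.
    Lines that match neither pattern are ignored.
    """
    physical, logical = {}, {}
    for line in text.splitlines():
        m = PHYSICAL.match(line)
        if m:
            physical[m.group(1)] = m.group(2)
            continue
        m = LOGICAL.match(line)
        if m:
            logical[m.group(1)] = m.group(3)
    return physical, logical
```

Applied to the output above, the result would show 1:2 as "Dead", 1:3 as
"Write-Only", and both logical drives as "Critical".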
664 | As the rebuild progresses, the current status in /proc/rd/c0/current_status is | ||
665 | updated every 10 seconds: | ||
666 | |||
667 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
668 | ... | ||
669 | Physical Devices: | ||
670 | 0:1 - Disk: Online, 2201600 blocks | ||
671 | 0:2 - Disk: Online, 2201600 blocks | ||
672 | 0:3 - Disk: Online, 2201600 blocks | ||
673 | 1:1 - Disk: Online, 2201600 blocks | ||
674 | 1:2 - Disk: Dead, 2201600 blocks | ||
675 | 1:3 - Disk: Write-Only, 2201600 blocks | ||
676 | Logical Drives: | ||
677 | /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru | ||
678 | /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru | ||
679 | Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed | ||
680 | |||
681 | and every minute a progress message is logged on the console by the driver: | ||
682 | |||
683 | DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed | ||
684 | DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 76% completed | ||
685 | DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 66% completed | ||
686 | DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 84% completed | ||
687 | |||
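The same "Rebuild in Progress" wording appears both in current_status and in
the console messages, so one pattern can track progress from either source.
The Python sketch below is a hedged illustration based only on the sample
lines in this document; the function name is hypothetical.

```python
import re

# Matches e.g.
# "Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed"
REBUILD = re.compile(
    r"Rebuild in Progress: Logical Drive (\d+) \((/dev/rd/\S+)\) (\d+)% completed"
)


def rebuild_progress(text):
    """Return (logical_drive, device, percent) for the last progress
    report found in text, or None if no rebuild is being reported."""
    last = None
    for m in REBUILD.finditer(text):
        last = (int(m.group(1)), m.group(2), int(m.group(3)))
    return last
```

Note that the percentage is reported per logical drive: it restarts when the
rebuild moves on from Logical Drive 0 to Logical Drive 1, so a caller should
compare the drive number as well as the percentage.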
688 | Finally, the rebuild completes successfully. The driver logs the status of the | ||
689 | logical and physical drives and the rebuild completion: | ||
690 | |||
691 | DAC960#0: Rebuild Completed Successfully | ||
692 | DAC960#0: Physical Drive 1:3 is now ONLINE | ||
693 | DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE | ||
694 | DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE | ||
695 | |||
696 | /proc/rd/c0/current_status is updated: | ||
697 | |||
698 | ***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** | ||
699 | Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> | ||
700 | Configuring Mylex DAC960PJ PCI RAID Controller | ||
701 | Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB | ||
702 | PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned | ||
703 | PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 | ||
704 | Controller Queue Depth: 128, Maximum Blocks per Command: 128 | ||
705 | Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 | ||
706 | Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 | ||
707 | Physical Devices: | ||
708 | 0:1 - Disk: Online, 2201600 blocks | ||
709 | 0:2 - Disk: Online, 2201600 blocks | ||
710 | 0:3 - Disk: Online, 2201600 blocks | ||
711 | 1:1 - Disk: Online, 2201600 blocks | ||
712 | 1:2 - Disk: Dead, 2201600 blocks | ||
713 | 1:3 - Disk: Online, 2201600 blocks | ||
714 | Logical Drives: | ||
715 | /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru | ||
716 | /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru | ||
717 | Rebuild Completed Successfully | ||
718 | |||
719 | and /proc/rd/status indicates that everything is healthy once again: | ||
720 | |||
721 | gwynedd:/u/lnz# cat /proc/rd/status | ||
722 | OK | ||
723 | |||
724 | Note that the absence of a viable standby drive does not create an "ALERT" | ||
725 | status. Once dead Physical Drive 1:2 has been replaced, the controller must be | ||
726 | told that this has occurred and that the newly replaced drive should become the | ||
727 | new standby drive: | ||
728 | |||
729 | gwynedd:/u/lnz# echo "make-standby 1:2" > /proc/rd/c0/user_command | ||
730 | gwynedd:/u/lnz# cat /proc/rd/c0/user_command | ||
731 | Make Standby of Physical Drive 1:2 Succeeded | ||
732 | |||
733 | The echo command instructs the controller to make Physical Drive 1:2 into a | ||
734 | standby drive, and the status message that results from the operation is then | ||
735 | available for reading from /proc/rd/c0/user_command, as well as being logged to | ||
736 | the console by the driver. Within 60 seconds of this command the driver logs: | ||
737 | |||
738 | DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01 | ||
739 | DAC960#0: Physical Drive 1:2 is now STANDBY | ||
740 | DAC960#0: Make Standby of Physical Drive 1:2 Succeeded | ||
741 | |||
742 | and /proc/rd/c0/current_status is updated: | ||
743 | |||
744 | gwynedd:/u/lnz# cat /proc/rd/c0/current_status | ||
745 | ... | ||
746 | Physical Devices: | ||
747 | 0:1 - Disk: Online, 2201600 blocks | ||
748 | 0:2 - Disk: Online, 2201600 blocks | ||
749 | 0:3 - Disk: Online, 2201600 blocks | ||
750 | 1:1 - Disk: Online, 2201600 blocks | ||
751 | 1:2 - Disk: Standby, 2201600 blocks | ||
752 | 1:3 - Disk: Online, 2201600 blocks | ||
753 | Logical Drives: | ||
754 | /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru | ||
755 | /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru | ||
756 | Rebuild Completed Successfully | ||
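
The make-standby exchange shown earlier (write the command to user_command,
then read the result back from the same file) can also be scripted. The
Python sketch below is illustrative only: the function names are
hypothetical, the default path assumes controller 0 as in this example, and
it has not been tested against real hardware. It must run with sufficient
privileges to write to /proc/rd.

```python
from pathlib import Path


def make_standby_command(channel: int, target: int) -> str:
    """Build the command string used above, e.g. "make-standby 1:2"."""
    if channel < 0 or target < 0:
        raise ValueError("channel and target must be non-negative")
    return f"make-standby {channel}:{target}"


def run_user_command(command: str,
                     path: str = "/proc/rd/c0/user_command") -> str:
    """Write a command to user_command and return the status message.

    Mirrors the echo/cat pair in the example: the trailing newline
    matches what echo would send, and the subsequent read returns the
    result text, such as
    "Make Standby of Physical Drive 1:2 Succeeded".
    """
    p = Path(path)
    p.write_text(command + "\n")
    return p.read_text().strip()
```

A caller would then check whether the returned message ends in "Succeeded"
before relying on the new standby drive.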