author | Tejun Heo <htejun@gmail.com> | 2005-10-01 22:54:29 -0400 |
---|---|---|
committer | Jeff Garzik <jgarzik@pobox.com> | 2005-10-03 22:11:29 -0400 |
commit | fe998aa7e27f125f6768ec6b137b0ce2c9790509 (patch) | |
tree | 124543efd939e2238d1b09a044969adbbef9b4bc /Documentation | |
parent | 31961943e3110c5a1c36b1e0069c29f7c4380e51 (diff) |
[PATCH] libata: add ATA exceptions chapter to doc
Hello, Jeff.
This patch adds ATA errors & exceptions chapter to
Documentation/DocBook/libata.tmpl. As suggested, the chapter is
placed before low level driver specific chapters. Contents are
unchanged from the last posting.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/DocBook/libata.tmpl | 716 |
1 files changed, 716 insertions, 0 deletions
diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl index b2ec780bcda1..d260d92089ad 100644 --- a/Documentation/DocBook/libata.tmpl +++ b/Documentation/DocBook/libata.tmpl | |||
@@ -787,6 +787,722 @@ and other resources, etc. | |||
787 | !Idrivers/scsi/libata-scsi.c | 787 | !Idrivers/scsi/libata-scsi.c |
788 | </chapter> | 788 | </chapter> |
789 | 789 | ||
790 | <chapter id="ataExceptions"> | ||
791 | <title>ATA errors & exceptions</title> | ||
792 | |||
793 | <para> | ||
794 | This chapter tries to identify what error/exception conditions exist | ||
795 | for ATA/ATAPI devices and describe how they should be handled in | ||
796 | an implementation-neutral way. | ||
797 | </para> | ||
798 | |||
799 | <para> | ||
800 | The term 'error' is used to describe conditions where either an | ||
801 | explicit error condition is reported from device or a command has | ||
802 | timed out. | ||
803 | </para> | ||
804 | |||
805 | <para> | ||
806 | The term 'exception' is either used to describe exceptional | ||
807 | conditions which are not errors (say, power or hotplug events), or | ||
808 | to describe both errors and non-error exceptional conditions. Where | ||
809 | explicit distinction between error and exception is necessary, the | ||
810 | term 'non-error exception' is used. | ||
811 | </para> | ||
812 | |||
813 | <sect1 id="excat"> | ||
814 | <title>Exception categories</title> | ||
815 | <para> | ||
816 | Exceptions are described primarily with respect to legacy | ||
817 | taskfile + bus master IDE interface. If a controller provides | ||
818 | other better mechanism for error reporting, mapping those into | ||
819 | categories described below shouldn't be difficult. | ||
820 | </para> | ||
821 | |||
822 | <para> | ||
823 | In the following sections, two recovery actions - reset and | ||
824 | reconfiguring transport - are mentioned. These are described | ||
825 | further in <xref linkend="exrec"/>. | ||
826 | </para> | ||
827 | |||
828 | <sect2 id="excatHSMviolation"> | ||
829 | <title>HSM violation</title> | ||
830 | <para> | ||
831 | This error is indicated when the STATUS value doesn't match HSM | ||
832 | requirements while issuing or executing any ATA/ATAPI command. | ||
833 | </para> | ||
834 | |||
835 | <itemizedlist> | ||
836 | <title>Examples</title> | ||
837 | |||
838 | <listitem> | ||
839 | <para> | ||
840 | ATA_STATUS doesn't contain !BSY && DRDY && !DRQ while trying | ||
841 | to issue a command. | ||
842 | </para> | ||
843 | </listitem> | ||
844 | |||
845 | <listitem> | ||
846 | <para> | ||
847 | !BSY && !DRQ during PIO data transfer. | ||
848 | </para> | ||
849 | </listitem> | ||
850 | |||
851 | <listitem> | ||
852 | <para> | ||
853 | DRQ on command completion. | ||
854 | </para> | ||
855 | </listitem> | ||
856 | |||
857 | <listitem> | ||
858 | <para> | ||
859 | !BSY && ERR after CDB transfer starts but before the | ||
860 | last byte of CDB is transferred. ATA/ATAPI standard states | ||
861 | that "The device shall not terminate the PACKET command | ||
862 | with an error before the last byte of the command packet has | ||
863 | been written" in the error outputs description of PACKET | ||
864 | command and the state diagram doesn't include such | ||
865 | transitions. | ||
866 | </para> | ||
867 | </listitem> | ||
868 | |||
869 | </itemizedlist> | ||
870 | |||
871 | <para> | ||
872 | In these cases, HSM is violated and not much information | ||
873 | regarding the error can be acquired from STATUS or ERROR | ||
874 | register. IOW, this error can be anything - driver bug, | ||
875 | faulty device, controller and/or cable. | ||
876 | </para> | ||
877 | |||
878 | <para> | ||
879 | As HSM is violated, reset is necessary to restore known state. | ||
880 | Reconfiguring transport for lower speed might be helpful too | ||
881 | as transmission errors sometimes cause this kind of error. | ||
882 | </para> | ||
883 | </sect2> | ||
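The STATUS checks listed in the HSM violation examples can be sketched in C. This is a minimal illustration, not libata code: the helper names `hsm_ready_for_command` and `hsm_completion_ok` are hypothetical, and only the STATUS bit positions (BSY, DRDY, DRQ, ERR) come from the ATA/ATAPI standard.

```c
#include <stdbool.h>
#include <stdint.h>

/* STATUS register bits as defined by the ATA/ATAPI standard. */
#define ATA_BUSY 0x80u /* BSY */
#define ATA_DRDY 0x40u /* DRDY */
#define ATA_DRQ  0x08u /* DRQ */
#define ATA_ERR  0x01u /* ERR (CHK for ATAPI) */

/*
 * Hypothetical helper: true if STATUS allows issuing a new command,
 * i.e. !BSY && DRDY && !DRQ. Anything else is an HSM violation.
 */
bool hsm_ready_for_command(uint8_t status)
{
	return (status & (ATA_BUSY | ATA_DRDY | ATA_DRQ)) == ATA_DRDY;
}

/*
 * Hypothetical helper: true if STATUS is acceptable at command
 * completion. BSY or DRQ still set at that point violates the HSM.
 */
bool hsm_completion_ok(uint8_t status)
{
	return (status & (ATA_BUSY | ATA_DRQ)) == 0;
}
```

A caller that sees either helper return false would treat the condition as an HSM violation and reset, as described above.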
884 | |||
885 | <sect2 id="excatDevErr"> | ||
886 | <title>ATA/ATAPI device error (non-NCQ / non-CHECK CONDITION)</title> | ||
887 | |||
888 | <para> | ||
889 | These are errors detected and reported by ATA/ATAPI devices | ||
890 | indicating device problems. For this type of error, the STATUS | ||
891 | and ERROR register values are valid and describe the error | ||
892 | condition. Note that some ATA bus errors are detected by | ||
893 | ATA/ATAPI devices and reported using the same mechanism as | ||
894 | device errors. Those cases are described later in this | ||
895 | section. | ||
896 | </para> | ||
897 | |||
898 | <para> | ||
899 | For ATA commands, this type of error is indicated by !BSY | ||
900 | && ERR during command execution and on completion. | ||
901 | </para> | ||
902 | |||
903 | <para>For ATAPI commands,</para> | ||
904 | |||
905 | <itemizedlist> | ||
906 | |||
907 | <listitem> | ||
908 | <para> | ||
909 | !BSY && ERR && ABRT right after issuing PACKET | ||
910 | indicates that PACKET command is not supported and falls in | ||
911 | this category. | ||
912 | </para> | ||
913 | </listitem> | ||
914 | |||
915 | <listitem> | ||
916 | <para> | ||
917 | !BSY && ERR(==CHK) && !ABRT after the last | ||
918 | byte of CDB is transferred indicates CHECK CONDITION and | ||
919 | doesn't fall in this category. | ||
920 | </para> | ||
921 | </listitem> | ||
922 | |||
923 | <listitem> | ||
924 | <para> | ||
925 | !BSY && ERR(==CHK) && ABRT after the last byte | ||
926 | of CDB is transferred *probably* indicates CHECK CONDITION and | ||
927 | doesn't fall in this category. | ||
928 | </para> | ||
929 | </listitem> | ||
930 | |||
931 | </itemizedlist> | ||
932 | |||
933 | <para> | ||
934 | Of the errors detected as above, the following are not ATA/ATAPI | ||
935 | device errors but ATA bus errors and should be handled | ||
936 | according to <xref linkend="excatATAbusErr"/>. | ||
937 | </para> | ||
938 | |||
939 | <variablelist> | ||
940 | |||
941 | <varlistentry> | ||
942 | <term>CRC error during data transfer</term> | ||
943 | <listitem> | ||
944 | <para> | ||
945 | This is indicated by ICRC bit in the ERROR register and | ||
946 | means that corruption occurred during data transfer. Up to | ||
947 | ATA/ATAPI-7, the standard specifies that this bit is only | ||
948 | applicable to UDMA transfers but ATA/ATAPI-8 draft revision | ||
949 | 1f says that the bit may be applicable to multiword DMA and | ||
950 | PIO. | ||
951 | </para> | ||
952 | </listitem> | ||
953 | </varlistentry> | ||
954 | |||
955 | <varlistentry> | ||
956 | <term>ABRT error during data transfer or on completion</term> | ||
957 | <listitem> | ||
958 | <para> | ||
959 | Up to ATA/ATAPI-7, the standard specifies that ABRT could be | ||
960 | set on ICRC errors and on cases where a device is not able | ||
961 | to complete a command. Combined with the fact that MWDMA | ||
962 | and PIO transfer errors aren't allowed to use ICRC bit up to | ||
963 | ATA/ATAPI-7, it seems to imply that ABRT bit alone could | ||
964 | indicate transfer errors. | ||
965 | </para> | ||
966 | <para> | ||
967 | However, ATA/ATAPI-8 draft revision 1f removes the part | ||
968 | that ICRC errors can turn on ABRT. So, this is something of a | ||
969 | gray area. Some heuristics are needed here. | ||
970 | </para> | ||
971 | </listitem> | ||
972 | </varlistentry> | ||
973 | |||
974 | </variablelist> | ||
975 | |||
976 | <para> | ||
977 | ATA/ATAPI device errors can be further categorized as follows. | ||
978 | </para> | ||
979 | |||
980 | <variablelist> | ||
981 | |||
982 | <varlistentry> | ||
983 | <term>Media errors</term> | ||
984 | <listitem> | ||
985 | <para> | ||
986 | This is indicated by UNC bit in the ERROR register. ATA | ||
987 | devices report a UNC error only after a certain number of | ||
988 | retries cannot recover the data, so there's nothing much | ||
989 | else to do other than notifying the upper layer. | ||
990 | </para> | ||
991 | <para> | ||
992 | READ and WRITE commands report CHS or LBA of the first | ||
993 | failed sector but ATA/ATAPI standard specifies that the | ||
994 | amount of transferred data on error completion is | ||
995 | indeterminate, so we cannot assume that sectors preceding | ||
996 | the failed sector have been transferred and thus cannot | ||
997 | complete those sectors successfully as SCSI does. | ||
998 | </para> | ||
999 | </listitem> | ||
1000 | </varlistentry> | ||
1001 | |||
1002 | <varlistentry> | ||
1003 | <term>Media changed / media change requested error</term> | ||
1004 | <listitem> | ||
1005 | <para> | ||
1006 | <<TODO: fill here>> | ||
1007 | </para> | ||
1008 | </listitem> | ||
1009 | </varlistentry> | ||
1010 | |||
1011 | <varlistentry><term>Address error</term> | ||
1012 | <listitem> | ||
1013 | <para> | ||
1014 | This is indicated by IDNF bit in the ERROR register. | ||
1015 | Report to upper layer. | ||
1016 | </para> | ||
1017 | </listitem> | ||
1018 | </varlistentry> | ||
1019 | |||
1020 | <varlistentry><term>Other errors</term> | ||
1021 | <listitem> | ||
1022 | <para> | ||
1023 | This can be invalid command or parameter indicated by ABRT | ||
1024 | ERROR bit or some other error condition. Note that ABRT | ||
1025 | bit can indicate a lot of things including ICRC and Address | ||
1026 | errors. Heuristics needed. | ||
1027 | </para> | ||
1028 | </listitem> | ||
1029 | </varlistentry> | ||
1030 | |||
1031 | </variablelist> | ||
1032 | |||
1033 | <para> | ||
1034 | Depending on commands, not all STATUS/ERROR bits are | ||
1035 | applicable. These non-applicable bits are marked with | ||
1036 | "na" in the output descriptions but up to ATA/ATAPI-7 | ||
1037 | no definition of "na" can be found. However, | ||
1038 | ATA/ATAPI-8 draft revision 1f describes "N/A" as | ||
1039 | follows. | ||
1040 | </para> | ||
1041 | |||
1042 | <blockquote> | ||
1043 | <variablelist> | ||
1044 | <varlistentry><term>3.2.3.3a N/A</term> | ||
1045 | <listitem> | ||
1046 | <para> | ||
1047 | A keyword the indicates a field has no defined value in | ||
1048 | this standard and should not be checked by the host or | ||
1049 | device. N/A fields should be cleared to zero. | ||
1050 | </para> | ||
1051 | </listitem> | ||
1052 | </varlistentry> | ||
1053 | </variablelist> | ||
1054 | </blockquote> | ||
1055 | |||
1056 | <para> | ||
1057 | So, it seems reasonable to assume that "na" bits are | ||
1058 | cleared to zero by devices and thus need no explicit masking. | ||
1059 | </para> | ||
1060 | |||
1061 | </sect2> | ||
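The ERROR register categories described in this section can be sketched as a small classifier. This is only an illustration under stated assumptions: the enum and the function `classify_error` are hypothetical names, the bit positions follow the ATA/ATAPI-7 ERROR register layout, and the check order (ICRC first, ABRT last) is one possible heuristic rather than anything the text mandates.

```c
#include <stdint.h>

/* ERROR register bits (ATA/ATAPI-7 layout). */
#define ATA_ICRC 0x80u /* interface CRC error */
#define ATA_UNC  0x40u /* uncorrectable media error */
#define ATA_IDNF 0x10u /* ID (address) not found */
#define ATA_ABRT 0x04u /* command aborted */

/* Hypothetical categories mirroring the text of this section. */
enum ata_err_class {
	ERR_BUS_CRC,  /* handle as ATA bus error */
	ERR_MEDIA,    /* report to upper layer */
	ERR_ADDRESS,  /* report to upper layer */
	ERR_ABORTED,  /* ambiguous: needs heuristics */
	ERR_OTHER
};

/*
 * Classify an ERROR register value read after !BSY && ERR.
 * ABRT is checked last because it may accompany other bits.
 */
enum ata_err_class classify_error(uint8_t error)
{
	if (error & ATA_ICRC)
		return ERR_BUS_CRC;
	if (error & ATA_UNC)
		return ERR_MEDIA;
	if (error & ATA_IDNF)
		return ERR_ADDRESS;
	if (error & ATA_ABRT)
		return ERR_ABORTED;
	return ERR_OTHER;
}
```

An `ERR_ABORTED` result is deliberately not resolved further here; as noted above, ABRT alone may mean an unsupported command, a transfer error, or something else.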
1062 | |||
1063 | <sect2 id="excatATAPIcc"> | ||
1064 | <title>ATAPI device CHECK CONDITION</title> | ||
1065 | |||
1066 | <para> | ||
1067 | ATAPI device CHECK CONDITION error is indicated by set CHK bit | ||
1068 | (ERR bit) in the STATUS register after the last byte of CDB is | ||
1069 | transferred for a PACKET command. For this kind of error, | ||
1070 | sense data should be acquired to gather information regarding | ||
1071 | the errors. REQUEST SENSE packet command should be used to | ||
1072 | acquire sense data. | ||
1073 | </para> | ||
1074 | |||
1075 | <para> | ||
1076 | Once sense data is acquired, this type of error can be | ||
1077 | handled similarly to other SCSI errors. Note that sense data | ||
1078 | may indicate ATA bus error (e.g. Sense Key 04h HARDWARE ERROR | ||
1079 | && ASC/ASCQ 47h/00h SCSI PARITY ERROR). In such | ||
1080 | cases, the error should be considered as an ATA bus error and | ||
1081 | handled according to <xref linkend="excatATAbusErr"/>. | ||
1082 | </para> | ||
1083 | |||
1084 | </sect2> | ||
1085 | |||
1086 | <sect2 id="excatNCQerr"> | ||
1087 | <title>ATA device error (NCQ)</title> | ||
1088 | |||
1089 | <para> | ||
1090 | NCQ command error is indicated by cleared BSY and set ERR bit | ||
1091 | during NCQ command phase (one or more NCQ commands | ||
1092 | outstanding). Although STATUS and ERROR registers will | ||
1093 | contain valid values describing the error, READ LOG EXT is | ||
1094 | required to clear the error condition, determine which command | ||
1095 | has failed and acquire more information. | ||
1096 | </para> | ||
1097 | |||
1098 | <para> | ||
1099 | READ LOG EXT Log Page 10h reports which tag has failed and | ||
1100 | taskfile register values describing the error. With this | ||
1101 | information the failed command can be handled as a normal ATA | ||
1102 | command error as in <xref linkend="excatDevErr"/> and all | ||
1103 | other in-flight commands must be retried. Note that this | ||
1104 | retry should not be counted - it's likely that commands | ||
1105 | retried this way would have completed normally if it were not | ||
1106 | for the failed command. | ||
1107 | </para> | ||
1108 | |||
1109 | <para> | ||
1110 | Note that ATA bus errors can be reported as ATA device NCQ | ||
1111 | errors. This should be handled as described in <xref | ||
1112 | linkend="excatATAbusErr"/>. | ||
1113 | </para> | ||
1114 | |||
1115 | <para> | ||
1116 | If READ LOG EXT Log Page 10h fails or reports NQ, we're | ||
1117 | thoroughly screwed. This condition should be treated | ||
1118 | according to <xref linkend="excatHSMviolation"/>. | ||
1119 | </para> | ||
1120 | |||
1121 | </sect2> | ||
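Decoding the READ LOG EXT Log Page 10h result might look like the sketch below. The byte layout used here (byte 0: NQ in bit 7 and the failed tag in bits 4:0, byte 2: STATUS, byte 3: ERROR) is an assumption taken from the SATA queued error log definition and should be checked against the specification; `struct ncq_error_info` and `parse_ncq_error_log` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields EH cares about. */
struct ncq_error_info {
	uint8_t tag;    /* tag of the failed NCQ command */
	uint8_t status; /* STATUS value for the failed command */
	uint8_t error;  /* ERROR value for the failed command */
};

/*
 * Parse the start of a Log Page 10h buffer (assumed layout, see above).
 * Returns false if the page reports NQ, in which case the caller
 * should fall back to HSM violation handling.
 */
bool parse_ncq_error_log(const uint8_t *page, struct ncq_error_info *out)
{
	assert(page != NULL && out != NULL);

	if (page[0] & 0x80) /* NQ: error was not for a queued command */
		return false;

	out->tag = page[0] & 0x1f;
	out->status = page[2];
	out->error = page[3];
	return true;
}
```

On success, the STATUS and ERROR values can be fed into ordinary device error handling for the failed tag, while all other in-flight commands are retried without counting the retry.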
1122 | |||
1123 | <sect2 id="excatATAbusErr"> | ||
1124 | <title>ATA bus error</title> | ||
1125 | |||
1126 | <para> | ||
1127 | ATA bus error means that data corruption occurred during | ||
1128 | transmission over ATA bus (SATA or PATA). This type of error | ||
1129 | can be indicated by | ||
1130 | </para> | ||
1131 | |||
1132 | <itemizedlist> | ||
1133 | |||
1134 | <listitem> | ||
1135 | <para> | ||
1136 | ICRC or ABRT error as described in <xref linkend="excatDevErr"/>. | ||
1137 | </para> | ||
1138 | </listitem> | ||
1139 | |||
1140 | <listitem> | ||
1141 | <para> | ||
1142 | Controller-specific error completion with error information | ||
1143 | indicating transmission error. | ||
1144 | </para> | ||
1145 | </listitem> | ||
1146 | |||
1147 | <listitem> | ||
1148 | <para> | ||
1149 | On some controllers, command timeout. In this case, there may | ||
1150 | be a mechanism to determine that the timeout is due to | ||
1151 | transmission error. | ||
1152 | </para> | ||
1153 | </listitem> | ||
1154 | |||
1155 | <listitem> | ||
1156 | <para> | ||
1157 | Unknown/random errors, timeouts and all sorts of weirdities. | ||
1158 | </para> | ||
1159 | </listitem> | ||
1160 | |||
1161 | </itemizedlist> | ||
1162 | |||
1163 | <para> | ||
1164 | As described above, transmission errors can cause a wide variety | ||
1165 | of symptoms ranging from device ICRC error to random device | ||
1166 | lockup, and, for many cases, there is no way to tell if an | ||
1167 | error condition is due to transmission error or not; | ||
1168 | therefore, it's necessary to employ some kind of heuristic | ||
1169 | when dealing with errors and timeouts. For example, | ||
1170 | encountering repetitive ABRT errors for a known supported | ||
1171 | command is likely to indicate an ATA bus error. | ||
1172 | </para> | ||
1173 | |||
1174 | <para> | ||
1175 | Once it's determined that ATA bus errors have possibly | ||
1176 | occurred, lowering ATA bus transmission speed is one of the | ||
1177 | actions which may alleviate the problem. See <xref | ||
1178 | linkend="exrecReconf"/> for more information. | ||
1179 | </para> | ||
1180 | |||
1181 | </sect2> | ||
1182 | |||
1183 | <sect2 id="excatPCIbusErr"> | ||
1184 | <title>PCI bus error</title> | ||
1185 | |||
1186 | <para> | ||
1187 | Data corruption or other failures during transmission over PCI | ||
1188 | (or other system bus). For standard BMDMA, this is indicated | ||
1189 | by Error bit in the BMDMA Status register. This type of | ||
1190 | error must be logged as it indicates something is very wrong | ||
1191 | with the system. Resetting host controller is recommended. | ||
1192 | </para> | ||
1193 | |||
1194 | </sect2> | ||
1195 | |||
1196 | <sect2 id="excatLateCompletion"> | ||
1197 | <title>Late completion</title> | ||
1198 | |||
1199 | <para> | ||
1200 | This occurs when timeout occurs and the timeout handler finds | ||
1201 | out that the timed out command has completed successfully or | ||
1202 | with error. This is usually caused by lost interrupts. This | ||
1203 | type of error must be logged. Resetting host controller is | ||
1204 | recommended. | ||
1205 | </para> | ||
1206 | |||
1207 | </sect2> | ||
1208 | |||
1209 | <sect2 id="excatUnknown"> | ||
1210 | <title>Unknown error (timeout)</title> | ||
1211 | |||
1212 | <para> | ||
1213 | This is when timeout occurs and the command is still | ||
1214 | processing or the host and device are in unknown state. When | ||
1215 | this occurs, HSM could be in any valid or invalid state. To | ||
1216 | bring the device to known state and make it forget about the | ||
1217 | timed out command, resetting is necessary. The timed out | ||
1218 | command may be retried. | ||
1219 | </para> | ||
1220 | |||
1221 | <para> | ||
1222 | Timeouts can also be caused by transmission errors. Refer to | ||
1223 | <xref linkend="excatATAbusErr"/> for more details. | ||
1224 | </para> | ||
1225 | |||
1226 | </sect2> | ||
1227 | |||
1228 | <sect2 id="excatHoplugPM"> | ||
1229 | <title>Hotplug and power management exceptions</title> | ||
1230 | |||
1231 | <para> | ||
1232 | <<TODO: fill here>> | ||
1233 | </para> | ||
1234 | |||
1235 | </sect2> | ||
1236 | |||
1237 | </sect1> | ||
1238 | |||
1239 | <sect1 id="exrec"> | ||
1240 | <title>EH recovery actions</title> | ||
1241 | |||
1242 | <para> | ||
1243 | This section discusses several important recovery actions. | ||
1244 | </para> | ||
1245 | |||
1246 | <sect2 id="exrecClr"> | ||
1247 | <title>Clearing error condition</title> | ||
1248 | |||
1249 | <para> | ||
1250 | Many controllers require their error registers to be cleared by | ||
1251 | error handler. Different controllers may have different | ||
1252 | requirements. | ||
1253 | </para> | ||
1254 | |||
1255 | <para> | ||
1256 | For SATA, it's strongly recommended to clear at least SError | ||
1257 | register during error handling. | ||
1258 | </para> | ||
1259 | </sect2> | ||
1260 | |||
1261 | <sect2 id="exrecRst"> | ||
1262 | <title>Reset</title> | ||
1263 | |||
1264 | <para> | ||
1265 | During EH, resetting is necessary in the following cases. | ||
1266 | </para> | ||
1267 | |||
1268 | <itemizedlist> | ||
1269 | |||
1270 | <listitem> | ||
1271 | <para> | ||
1272 | HSM is in unknown or invalid state | ||
1273 | </para> | ||
1274 | </listitem> | ||
1275 | |||
1276 | <listitem> | ||
1277 | <para> | ||
1278 | HBA is in unknown or invalid state | ||
1279 | </para> | ||
1280 | </listitem> | ||
1281 | |||
1282 | <listitem> | ||
1283 | <para> | ||
1284 | EH needs to make HBA/device forget about in-flight commands | ||
1285 | </para> | ||
1286 | </listitem> | ||
1287 | |||
1288 | <listitem> | ||
1289 | <para> | ||
1290 | HBA/device behaves weirdly | ||
1291 | </para> | ||
1292 | </listitem> | ||
1293 | |||
1294 | </itemizedlist> | ||
1295 | |||
1296 | <para> | ||
1297 | Resetting during EH might be a good idea regardless of error | ||
1298 | condition to improve EH robustness. Whether to reset both or | ||
1299 | either one of HBA and device depends on situation but the | ||
1300 | following scheme is recommended. | ||
1301 | </para> | ||
1302 | |||
1303 | <itemizedlist> | ||
1304 | |||
1305 | <listitem> | ||
1306 | <para> | ||
1307 | When it's known that HBA is in ready state but ATA/ATAPI | ||
1308 | device is in unknown state, reset only device. | ||
1309 | </para> | ||
1310 | </listitem> | ||
1311 | |||
1312 | <listitem> | ||
1313 | <para> | ||
1314 | If HBA is in unknown state, reset both HBA and device. | ||
1315 | </para> | ||
1316 | </listitem> | ||
1317 | |||
1318 | </itemizedlist> | ||
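The recommended scheme above reduces to a small decision function. The following is a sketch only: the enums and `choose_reset_scope` are hypothetical names, and the `RESET_NONE` outcome is an assumption, since the text also notes that resetting regardless of error condition may improve robustness.

```c
/* Hypothetical view of what EH knows about each side. */
enum unit_state { STATE_READY, STATE_UNKNOWN };

/* Hypothetical result of the decision. */
enum reset_scope { RESET_NONE, RESET_DEVICE, RESET_HBA_AND_DEVICE };

/*
 * HBA in unknown state: reset both HBA and device.
 * HBA ready but device unknown: reset only the device.
 * Both ready: no reset is strictly required (assumption).
 */
enum reset_scope choose_reset_scope(enum unit_state hba,
				    enum unit_state dev)
{
	if (hba == STATE_UNKNOWN)
		return RESET_HBA_AND_DEVICE;
	if (dev == STATE_UNKNOWN)
		return RESET_DEVICE;
	return RESET_NONE;
}
```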
1319 | |||
1320 | <para> | ||
1321 | HBA resetting is implementation specific. For a controller | ||
1322 | complying to taskfile/BMDMA PCI IDE, stopping active DMA | ||
1323 | transaction may be sufficient iff BMDMA state is the only HBA | ||
1324 | context. But even mostly taskfile/BMDMA PCI IDE complying | ||
1325 | controllers may have implementation specific requirements and | ||
1326 | mechanism to reset themselves. This must be addressed by | ||
1327 | specific drivers. | ||
1328 | </para> | ||
1329 | |||
1330 | <para> | ||
1331 | OTOH, ATA/ATAPI standard describes in detail ways to reset | ||
1332 | ATA/ATAPI devices. | ||
1333 | </para> | ||
1334 | |||
1335 | <variablelist> | ||
1336 | |||
1337 | <varlistentry><term>PATA hardware reset</term> | ||
1338 | <listitem> | ||
1339 | <para> | ||
1340 | This is hardware initiated device reset signalled with | ||
1341 | asserted PATA RESET- signal. There is no standard way to | ||
1342 | initiate hardware reset from software although some | ||
1343 | hardware provides registers that allow driver to directly | ||
1344 | tweak the RESET- signal. | ||
1345 | </para> | ||
1346 | </listitem> | ||
1347 | </varlistentry> | ||
1348 | |||
1349 | <varlistentry><term>Software reset</term> | ||
1350 | <listitem> | ||
1351 | <para> | ||
1352 | This is achieved by turning CONTROL SRST bit on for at | ||
1353 | least 5us. Both PATA and SATA support it but, in case of | ||
1354 | SATA, this may require controller-specific support as the | ||
1355 | second Register FIS to clear SRST should be transmitted | ||
1356 | while BSY bit is still set. Note that on PATA, this resets | ||
1357 | both master and slave devices on a channel. | ||
1358 | </para> | ||
1359 | </listitem> | ||
1360 | </varlistentry> | ||
1361 | |||
1362 | <varlistentry><term>EXECUTE DEVICE DIAGNOSTIC command</term> | ||
1363 | <listitem> | ||
1364 | <para> | ||
1365 | Although ATA/ATAPI standard doesn't describe it exactly, EDD | ||
1366 | implies some level of resetting, possibly a level similar | ||
1367 | to software reset. Host-side EDD protocol can be handled | ||
1368 | with normal command processing and most SATA controllers | ||
1369 | should be able to handle EDD's just like other commands. | ||
1370 | As in software reset, EDD affects both devices on a PATA | ||
1371 | bus. | ||
1372 | </para> | ||
1373 | <para> | ||
1374 | Although EDD does reset devices, this doesn't suit error | ||
1375 | handling as EDD cannot be issued while BSY is set and it's | ||
1376 | unclear how it will act when device is in unknown/weird | ||
1377 | state. | ||
1378 | </para> | ||
1379 | </listitem> | ||
1380 | </varlistentry> | ||
1381 | |||
1382 | <varlistentry><term>ATAPI DEVICE RESET command</term> | ||
1383 | <listitem> | ||
1384 | <para> | ||
1385 | This is very similar to software reset except that reset | ||
1386 | can be restricted to the selected device without affecting | ||
1387 | the other device sharing the cable. | ||
1388 | </para> | ||
1389 | </listitem> | ||
1390 | </varlistentry> | ||
1391 | |||
1392 | <varlistentry><term>SATA phy reset</term> | ||
1393 | <listitem> | ||
1394 | <para> | ||
1395 | This is the preferred way of resetting a SATA device. In | ||
1396 | effect, it's identical to PATA hardware reset. Note that | ||
1397 | this can be done with the standard SCR Control register. | ||
1398 | As such, it's usually easier to implement than software | ||
1399 | reset. | ||
1400 | </para> | ||
1401 | </listitem> | ||
1402 | </varlistentry> | ||
1403 | |||
1404 | </variablelist> | ||
1405 | |||
1406 | <para> | ||
1407 | One more thing to consider when resetting devices is that | ||
1408 | resetting clears certain configuration parameters and they | ||
1409 | need to be set to their previous or newly adjusted values | ||
1410 | after reset. | ||
1411 | </para> | ||
1412 | |||
1413 | <para> | ||
1414 | The affected parameters are: | ||
1415 | </para> | ||
1416 | |||
1417 | <itemizedlist> | ||
1418 | |||
1419 | <listitem> | ||
1420 | <para> | ||
1421 | CHS set up with INITIALIZE DEVICE PARAMETERS (seldom used) | ||
1422 | </para> | ||
1423 | </listitem> | ||
1424 | |||
1425 | <listitem> | ||
1426 | <para> | ||
1427 | Parameters set with SET FEATURES including transfer mode setting | ||
1428 | </para> | ||
1429 | </listitem> | ||
1430 | |||
1431 | <listitem> | ||
1432 | <para> | ||
1433 | Block count set with SET MULTIPLE MODE | ||
1434 | </para> | ||
1435 | </listitem> | ||
1436 | |||
1437 | <listitem> | ||
1438 | <para> | ||
1439 | Other parameters (SET MAX, MEDIA LOCK...) | ||
1440 | </para> | ||
1441 | </listitem> | ||
1442 | |||
1443 | </itemizedlist> | ||
1444 | |||
1445 | <para> | ||
1446 | ATA/ATAPI standard specifies that some parameters must be | ||
1447 | maintained across hardware or software reset, but doesn't | ||
1448 | strictly specify all of them. Always reconfiguring needed | ||
1449 | parameters after reset is required for robustness. Note that | ||
1450 | this also applies when resuming from deep sleep (power-off). | ||
1451 | </para> | ||
1452 | |||
1453 | <para> | ||
1454 | Also, ATA/ATAPI standard requires that IDENTIFY DEVICE / | ||
1455 | IDENTIFY PACKET DEVICE is issued after any configuration | ||
1456 | parameter is updated or a hardware reset and the result used | ||
1457 | for further operation. OS driver is required to implement | ||
1458 | revalidation mechanism to support this. | ||
1459 | </para> | ||
1460 | |||
1461 | </sect2> | ||
1462 | |||
1463 | <sect2 id="exrecReconf"> | ||
1464 | <title>Reconfigure transport</title> | ||
1465 | |||
1466 | <para> | ||
1467 | For both PATA and SATA, a lot of corners are cut for cheap | ||
1468 | connectors, cables or controllers and it's quite common to see | ||
1469 | high transmission error rate. This can be mitigated by | ||
1470 | lowering transmission speed. | ||
1471 | </para> | ||
1472 | |||
1473 | <para> | ||
1474 | The following is a possible scheme Jeff Garzik suggested. | ||
1475 | </para> | ||
1476 | |||
1477 | <blockquote> | ||
1478 | <para> | ||
1479 | If more than $N (3?) transmission errors happen in 15 minutes, | ||
1480 | </para> | ||
1481 | <itemizedlist> | ||
1482 | <listitem> | ||
1483 | <para> | ||
1484 | if SATA, decrease SATA PHY speed. if speed cannot be decreased, | ||
1485 | </para> | ||
1486 | </listitem> | ||
1487 | <listitem> | ||
1488 | <para> | ||
1489 | decrease UDMA xfer speed. if at UDMA0, switch to PIO4, | ||
1490 | </para> | ||
1491 | </listitem> | ||
1492 | <listitem> | ||
1493 | <para> | ||
1494 | decrease PIO xfer speed. if at PIO3, complain, but continue | ||
1495 | </para> | ||
1496 | </listitem> | ||
1497 | </itemizedlist> | ||
1498 | </blockquote> | ||
1499 | |||
1500 | </sect2> | ||
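The speed-down scheme quoted above can be sketched as an error counter over a sliding window plus a step-down function. Everything named here is hypothetical (`struct link_state`, `record_xfer_error`, `speed_down`), the threshold of 3 errors is the tentative $N from the quote, and timestamps are plain seconds supplied by the caller. MWDMA modes are not covered because the quoted scheme doesn't mention them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XFER_ERR_LIMIT  3   /* $N in the quoted scheme (tentative) */
#define XFER_ERR_WINDOW 900 /* 15 minutes, in seconds */

enum xfer_class { XFER_PIO, XFER_UDMA };
enum speed_down_result { SPEED_LOWERED, SPEED_AT_MINIMUM };

struct link_state {
	bool is_sata;
	int sata_gen;         /* 1 is the lowest PHY speed */
	enum xfer_class xfer;
	int mode;             /* UDMA or PIO mode number */
	long err_time[XFER_ERR_LIMIT + 1]; /* ring of recent error times */
	size_t nerr;          /* errors recorded since last speed-down */
};

/* Record one transmission error; true if more than
 * XFER_ERR_LIMIT errors fell within the window. */
bool record_xfer_error(struct link_state *ls, long now)
{
	const size_t slots = XFER_ERR_LIMIT + 1;

	assert(ls != NULL);
	ls->err_time[ls->nerr % slots] = now;
	ls->nerr++;
	if (ls->nerr < slots)
		return false;
	/* The slot about to be overwritten holds the oldest error. */
	return now - ls->err_time[ls->nerr % slots] <= XFER_ERR_WINDOW;
}

/* Apply one step of the quoted scheme and restart error counting. */
enum speed_down_result speed_down(struct link_state *ls)
{
	assert(ls != NULL);
	ls->nerr = 0;
	if (ls->is_sata && ls->sata_gen > 1) {
		ls->sata_gen--;          /* decrease SATA PHY speed */
		return SPEED_LOWERED;
	}
	if (ls->xfer == XFER_UDMA) {
		if (ls->mode > 0) {
			ls->mode--;      /* decrease UDMA xfer speed */
		} else {
			ls->xfer = XFER_PIO; /* UDMA0: switch to PIO4 */
			ls->mode = 4;
		}
		return SPEED_LOWERED;
	}
	if (ls->mode > 3) {
		ls->mode--;              /* decrease PIO xfer speed */
		return SPEED_LOWERED;
	}
	return SPEED_AT_MINIMUM;         /* PIO3: complain, but continue */
}
```

A caller would invoke `speed_down` whenever `record_xfer_error` returns true, then reset and reconfigure the device with the new settings; on `SPEED_AT_MINIMUM` it would log a complaint and carry on.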
1501 | |||
1502 | </sect1> | ||
1503 | |||
1504 | </chapter> | ||
1505 | |||
790 | <chapter id="PiixInt"> | 1506 | <chapter id="PiixInt"> |
791 | <title>ata_piix Internals</title> | 1507 | <title>ata_piix Internals</title> |
792 | !Idrivers/scsi/ata_piix.c | 1508 | !Idrivers/scsi/ata_piix.c |