aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/networking
diff options
context:
space:
mode:
authorAlexei Starovoitov <ast@plumgrid.com>2014-09-26 03:17:02 -0400
committerDavid S. Miller <davem@davemloft.net>2014-09-26 15:05:14 -0400
commit51580e798cb61b0fc63fa3aa6c5c975375aa0550 (patch)
tree2b608f048ba6415a28be79135af26f28ba7ebd5b /Documentation/networking
parent0a542a86d73b1577e7d4f55fc95dcffd3fe62643 (diff)
bpf: verifier (add docs)
this patch adds all of eBPF verfier documentation and empty bpf_check() The end goal for the verifier is to statically check safety of the program. Verifier will catch: - loops - out of range jumps - unreachable instructions - invalid instructions - uninitialized register access - uninitialized stack access - misaligned stack access - out of range stack access - invalid calling convention More details in Documentation/networking/filter.txt Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'Documentation/networking')
-rw-r--r--Documentation/networking/filter.txt224
1 files changed, 224 insertions, 0 deletions
diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index 4a01d71785e9..5ce4d07406a5 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -1001,6 +1001,99 @@ instruction that loads 64-bit immediate value into a dst_reg.
1001Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM which loads 1001Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM which loads
100232-bit immediate value into a register. 100232-bit immediate value into a register.
1003 1003
1004eBPF verifier
1005-------------
1006The safety of the eBPF program is determined in two steps.
1007
1008First step does DAG check to disallow loops and other CFG validation.
1009In particular it will detect programs that have unreachable instructions.
1010(though classic BPF checker allows them)
1011
1012Second step starts from the first insn and descends all possible paths.
1013It simulates execution of every insn and observes the state change of
1014registers and stack.
1015
1016At the start of the program the register R1 contains a pointer to context
1017and has type PTR_TO_CTX.
1018If verifier sees an insn that does R2=R1, then R2 has now type
1019PTR_TO_CTX as well and can be used on the right hand side of expression.
1020If R1=PTR_TO_CTX and insn is R2=R1+R1, then R2=UNKNOWN_VALUE,
1021since addition of two valid pointers makes invalid pointer.
1022(In 'secure' mode verifier will reject any type of pointer arithmetic to make
1023sure that kernel addresses don't leak to unprivileged users)
1024
1025If register was never written to, it's not readable:
1026 bpf_mov R0 = R2
1027 bpf_exit
1028will be rejected, since R2 is unreadable at the start of the program.
1029
1030After kernel function call, R1-R5 are reset to unreadable and
1031R0 has a return type of the function.
1032
1033Since R6-R9 are callee saved, their state is preserved across the call.
1034 bpf_mov R6 = 1
1035 bpf_call foo
1036 bpf_mov R0 = R6
1037 bpf_exit
1038is a correct program. If there was R1 instead of R6, it would have
1039been rejected.
1040
1041load/store instructions are allowed only with registers of valid types, which
1042are PTR_TO_CTX, PTR_TO_MAP, FRAME_PTR. They are bounds and alignment checked.
1043For example:
1044 bpf_mov R1 = 1
1045 bpf_mov R2 = 2
1046 bpf_xadd *(u32 *)(R1 + 3) += R2
1047 bpf_exit
1048will be rejected, since R1 doesn't have a valid pointer type at the time of
1049execution of instruction bpf_xadd.
1050
1051At the start R1 type is PTR_TO_CTX (a pointer to generic 'struct bpf_context')
1052A callback is used to customize verifier to restrict eBPF program access to only
1053certain fields within ctx structure with specified size and alignment.
1054
1055For example, the following insn:
1056 bpf_ld R0 = *(u32 *)(R6 + 8)
1057intends to load a word from address R6 + 8 and store it into R0
1058If R6=PTR_TO_CTX, via is_valid_access() callback the verifier will know
1059that offset 8 of size 4 bytes can be accessed for reading, otherwise
1060the verifier will reject the program.
1061If R6=FRAME_PTR, then access should be aligned and be within
1062stack bounds, which are [-MAX_BPF_STACK, 0). In this example offset is 8,
1063so it will fail verification, since it's out of bounds.
1064
1065The verifier will allow eBPF program to read data from stack only after
1066it wrote into it.
1067Classic BPF verifier does similar check with M[0-15] memory slots.
1068For example:
1069 bpf_ld R0 = *(u32 *)(R10 - 4)
1070 bpf_exit
1071is invalid program.
1072Though R10 is correct read-only register and has type FRAME_PTR
1073and R10 - 4 is within stack bounds, there were no stores into that location.
1074
1075Pointer register spill/fill is tracked as well, since four (R6-R9)
1076callee saved registers may not be enough for some programs.
1077
1078Allowed function calls are customized with bpf_verifier_ops->get_func_proto()
1079The eBPF verifier will check that registers match argument constraints.
1080After the call register R0 will be set to return type of the function.
1081
1082Function calls is a main mechanism to extend functionality of eBPF programs.
1083Socket filters may let programs to call one set of functions, whereas tracing
1084filters may allow completely different set.
1085
1086If a function made accessible to eBPF program, it needs to be thought through
1087from safety point of view. The verifier will guarantee that the function is
1088called with valid arguments.
1089
1090seccomp vs socket filters have different security restrictions for classic BPF.
1091Seccomp solves this by two stage verifier: classic BPF verifier is followed
1092by seccomp verifier. In case of eBPF one configurable verifier is shared for
1093all use cases.
1094
1095See details of eBPF verifier in kernel/bpf/verifier.c
1096
1004eBPF maps 1097eBPF maps
1005--------- 1098---------
1006'maps' is a generic storage of different types for sharing data between kernel 1099'maps' is a generic storage of different types for sharing data between kernel
@@ -1040,6 +1133,137 @@ The map is defined by:
1040 . key size in bytes 1133 . key size in bytes
1041 . value size in bytes 1134 . value size in bytes
1042 1135
1136Understanding eBPF verifier messages
1137------------------------------------
1138
1139The following are few examples of invalid eBPF programs and verifier error
1140messages as seen in the log:
1141
1142Program with unreachable instructions:
1143static struct bpf_insn prog[] = {
1144 BPF_EXIT_INSN(),
1145 BPF_EXIT_INSN(),
1146};
1147Error:
1148 unreachable insn 1
1149
1150Program that reads uninitialized register:
1151 BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
1152 BPF_EXIT_INSN(),
1153Error:
1154 0: (bf) r0 = r2
1155 R2 !read_ok
1156
1157Program that doesn't initialize R0 before exiting:
1158 BPF_MOV64_REG(BPF_REG_2, BPF_REG_1),
1159 BPF_EXIT_INSN(),
1160Error:
1161 0: (bf) r2 = r1
1162 1: (95) exit
1163 R0 !read_ok
1164
1165Program that accesses stack out of bounds:
1166 BPF_ST_MEM(BPF_DW, BPF_REG_10, 8, 0),
1167 BPF_EXIT_INSN(),
1168Error:
1169 0: (7a) *(u64 *)(r10 +8) = 0
1170 invalid stack off=8 size=8
1171
1172Program that doesn't initialize stack before passing its address into function:
1173 BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
1174 BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
1175 BPF_LD_MAP_FD(BPF_REG_1, 0),
1176 BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
1177 BPF_EXIT_INSN(),
1178Error:
1179 0: (bf) r2 = r10
1180 1: (07) r2 += -8
1181 2: (b7) r1 = 0x0
1182 3: (85) call 1
1183 invalid indirect read from stack off -8+0 size 8
1184
1185Program that uses invalid map_fd=0 while calling to map_lookup_elem() function:
1186 BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
1187 BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
1188 BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
1189 BPF_LD_MAP_FD(BPF_REG_1, 0),
1190 BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
1191 BPF_EXIT_INSN(),
1192Error:
1193 0: (7a) *(u64 *)(r10 -8) = 0
1194 1: (bf) r2 = r10
1195 2: (07) r2 += -8
1196 3: (b7) r1 = 0x0
1197 4: (85) call 1
1198 fd 0 is not pointing to valid bpf_map
1199
1200Program that doesn't check return value of map_lookup_elem() before accessing
1201map element:
1202 BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
1203 BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
1204 BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
1205 BPF_LD_MAP_FD(BPF_REG_1, 0),
1206 BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
1207 BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
1208 BPF_EXIT_INSN(),
1209Error:
1210 0: (7a) *(u64 *)(r10 -8) = 0
1211 1: (bf) r2 = r10
1212 2: (07) r2 += -8
1213 3: (b7) r1 = 0x0
1214 4: (85) call 1
1215 5: (7a) *(u64 *)(r0 +0) = 0
1216 R0 invalid mem access 'map_value_or_null'
1217
1218Program that correctly checks map_lookup_elem() returned value for NULL, but
1219accesses the memory with incorrect alignment:
1220 BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
1221 BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
1222 BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
1223 BPF_LD_MAP_FD(BPF_REG_1, 0),
1224 BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
1225 BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
1226 BPF_ST_MEM(BPF_DW, BPF_REG_0, 4, 0),
1227 BPF_EXIT_INSN(),
1228Error:
1229 0: (7a) *(u64 *)(r10 -8) = 0
1230 1: (bf) r2 = r10
1231 2: (07) r2 += -8
1232 3: (b7) r1 = 1
1233 4: (85) call 1
1234 5: (15) if r0 == 0x0 goto pc+1
1235 R0=map_ptr R10=fp
1236 6: (7a) *(u64 *)(r0 +4) = 0
1237 misaligned access off 4 size 8
1238
1239Program that correctly checks map_lookup_elem() returned value for NULL and
1240accesses memory with correct alignment in one side of 'if' branch, but fails
1241to do so in the other side of 'if' branch:
1242 BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
1243 BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
1244 BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
1245 BPF_LD_MAP_FD(BPF_REG_1, 0),
1246 BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
1247 BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
1248 BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
1249 BPF_EXIT_INSN(),
1250 BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 1),
1251 BPF_EXIT_INSN(),
1252Error:
1253 0: (7a) *(u64 *)(r10 -8) = 0
1254 1: (bf) r2 = r10
1255 2: (07) r2 += -8
1256 3: (b7) r1 = 1
1257 4: (85) call 1
1258 5: (15) if r0 == 0x0 goto pc+2
1259 R0=map_ptr R10=fp
1260 6: (7a) *(u64 *)(r0 +0) = 0
1261 7: (95) exit
1262
1263 from 5 to 8: R0=imm0 R10=fp
1264 8: (7a) *(u64 *)(r0 +0) = 1
1265 R0 invalid mem access 'imm'
1266
1043Testing 1267Testing
1044------- 1268-------
1045 1269