author    David S. Miller <davem@davemloft.net> 2015-04-14 18:51:19 -0400
committer David S. Miller <davem@davemloft.net> 2015-04-14 18:51:19 -0400
commit    bae97d84100ae7a8dc3b79233ecd3a8f7c19ea57
tree      975f812d346f61d988a8dc5a0989539293700ad9 /include/net
parent    87ffabb1f055e14e7d171c6599539a154d647904
parent    97bb43c3e06e9bfdc9e3140a312004df462685b9

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

A final pull request, I know it's very late but this time I think it's
worth a bit of rush.

The following patchset contains Netfilter/nf_tables updates for net-next,
more specifically concatenation support and dynamic stateful expression
instantiation.

This also comes with a couple of small patches. One to fix the ebtables.h
userspace header and another to get rid of an obsolete example file in
tree that describes a nf_tables expression.

This time, I decided to paste the original descriptions. This will result
in a rather large commit description, but I think these bytes are worth
keeping.

Patrick McHardy says:

====================
netfilter: nf_tables: concatenation support

The following patches add support for concatenations, which allow
multi-dimensional exact matches in O(1).

The basic idea is to split the data registers, currently consisting of
4 registers of 16 bytes each, into smaller units, 16 registers of 4 bytes
each, and to make sure each register store always leaves the full 32 bits
in a well-defined state, meaning smaller stores will zero the remaining
bits.

Based on that, we can load multiple adjacent registers with different
values, thereby building a concatenated bigger value, and use that value
for set lookups.

Sets are changed to use variable sized extensions for their key and data
values, removing the fixed limit of 16 bytes while saving memory if less
space is needed.

As a side effect, these patches will allow some nice optimizations in the
future, like using jhash2 in nft_hash, removing the masking in
nft_cmp_fast, optimized data comparison using 32 bit word size etc.
These are not done so far, however.

The patches are split up as follows:

* the first five patches add length validation to register loads and
  stores to make sure we stay within bounds and prepare the validation
  functions for the new addressing mode

* the next patches prepare for changing to 32 bit addressing by
  introducing a struct nft_regs, which holds the verdict register as
  well as the data registers. The verdict members are moved to a new
  struct nft_verdict so that struct nft_data can be pulled out of the
  stack.

* the next patches contain preparatory conversions of expressions and
  sets to use 32 bit addressing

* the next patch introduces so far unused register conversion helpers
  for parsing and dumping register numbers over netlink

* following is the real conversion to 32 bit addressing, consisting of
  replacing struct nft_data in struct nft_regs by an array of u32s and
  actually translating and validating the new register numbers

* the final two patches add support for variable sized data items and
  variable sized keys / data in set elements

The patches have been verified to work correctly with nft binaries using
both old and new addressing.
====================

Patrick McHardy says:

====================
netfilter: nf_tables: dynamic stateful expression instantiation

The following patches are the grand finale of my nf_tables set work,
using all the building blocks put in place by the previous patches to
support something like iptables hashlimit, but a lot more powerful.

Sets are extended to allow attaching expressions to set elements. The
dynset expression dynamically instantiates these expressions based on
a template when creating new set elements and evaluates them for all
new or updated set members.

In combination with concatenations this effectively creates state tables
for arbitrary combinations of keys, using the existing expression types
to maintain that state. Regular set GC takes care of purging expired
states.

We currently support two different stateful expressions, counter and limit.
Using limit as a template we can express the functionality of hashlimit,
but completely unrestricted in the combination of keys. Using counter we
can perform accounting for arbitrary flows.

The following examples from patch 5/5 show some possibilities. Userspace
syntax is still WIP; in particular, the listing of state tables will most
likely be separated from normal set listings and use a more structured
format:

1. Limit the rate of new SSH connections per host, similar to iptables
   hashlimit:

	flow ip saddr timeout 60s \
	limit 10/second \
	accept

2. Account network traffic between each set of /24 networks:

	flow ip saddr & 255.255.255.0 . ip daddr & 255.255.255.0 \
	counter

3. Account traffic to each host per user:

	flow skuid . ip daddr \
	counter

4. Account traffic for each combination of source address and TCP flags:

	flow ip saddr . tcp flags \
	counter

The resulting set content after a Xmas-scan looks like this:

{
	192.168.122.1 . fin | psh | urg : counter packets 1001 bytes 40040,
	192.168.122.1 . ack : counter packets 74 bytes 3848,
	192.168.122.1 . psh | ack : counter packets 35 bytes 3144
}

In the future the "expressions attached to elements" will be extended to
also support user created non-stateful expressions, making it possible to
efficiently select between a set of parameter sets, e.g. a set of log
statements with different prefixes based on the interface, which
currently require one rule each. This will most likely have to wait until
the next kernel version though.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
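
The per-element state described in the message above can be illustrated with a small userspace sketch. This is not the kernel implementation: flow_table, flow_elem, counter_state, flow_lookup and flow_update are hypothetical names, the table is a fixed array scanned linearly, and timeouts and GC are left out. It only shows the idea of copying a template expression into each newly created element and evaluating that private copy for every packet matching the key.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stateful expression such as "counter". */
struct counter_state {
	uint64_t packets;
	uint64_t bytes;
};

/* One set element: a concatenated key plus its own copy of the state. */
struct flow_elem {
	uint32_t key[2];		/* e.g. source address . TCP flags */
	struct counter_state expr;	/* instantiated from the template */
	int used;
};

#define FLOW_TABLE_SIZE 64

struct flow_table {
	struct counter_state tmpl;	/* template copied into new elements */
	struct flow_elem elems[FLOW_TABLE_SIZE];
};

void flow_table_init(struct flow_table *t)
{
	memset(t, 0, sizeof(*t));
}

/* Linear scan keeps the sketch short; real set backends hash or tree. */
struct flow_elem *flow_lookup(struct flow_table *t, const uint32_t key[2])
{
	for (int i = 0; i < FLOW_TABLE_SIZE; i++) {
		if (t->elems[i].used &&
		    memcmp(t->elems[i].key, key, sizeof(t->elems[i].key)) == 0)
			return &t->elems[i];
	}
	return NULL;
}

/*
 * Find the element for @key, creating it from the template if it does not
 * exist yet, then evaluate its private counter for one packet.
 * Returns NULL when the table is full.
 */
struct flow_elem *flow_update(struct flow_table *t, const uint32_t key[2],
			      uint32_t pkt_len)
{
	struct flow_elem *e;

	assert(t != NULL && key != NULL);

	e = flow_lookup(t, key);
	if (e == NULL) {
		for (int i = 0; i < FLOW_TABLE_SIZE && e == NULL; i++) {
			if (!t->elems[i].used)
				e = &t->elems[i];
		}
		if (e == NULL)
			return NULL;
		memcpy(e->key, key, sizeof(e->key));
		/* Clone the template, in the spirit of nft_expr_clone(). */
		memcpy(&e->expr, &t->tmpl, sizeof(e->expr));
		e->used = 1;
	}
	e->expr.packets++;
	e->expr.bytes += pkt_len;
	return e;
}
```

Calling flow_update() twice with the same key and once with a different key leaves two elements, each with its own packet and byte counts, which is the behaviour the Xmas-scan listing above shows per address and flag combination.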
Diffstat (limited to 'include/net')
-rw-r--r-- include/net/netfilter/nf_tables.h | 103
-rw-r--r-- include/net/netfilter/nft_meta.h  |   4
2 files changed, 75 insertions, 32 deletions
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index d6a2f0ed5130..e6bcf55dcf20 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1,6 +1,7 @@
 #ifndef _NET_NF_TABLES_H
 #define _NET_NF_TABLES_H
 
+#include <linux/module.h>
 #include <linux/list.h>
 #include <linux/netfilter.h>
 #include <linux/netfilter/nfnetlink.h>
@@ -36,29 +37,43 @@ static inline void nft_set_pktinfo(struct nft_pktinfo *pkt,
 	pkt->xt.family = ops->pf;
 }
 
+/**
+ * struct nft_verdict - nf_tables verdict
+ *
+ * @code: nf_tables/netfilter verdict code
+ * @chain: destination chain for NFT_JUMP/NFT_GOTO
+ */
+struct nft_verdict {
+	u32 code;
+	struct nft_chain *chain;
+};
+
 struct nft_data {
 	union {
 		u32 data[4];
-		struct {
-			u32 verdict;
-			struct nft_chain *chain;
-		};
+		struct nft_verdict verdict;
 	};
 } __attribute__((aligned(__alignof__(u64))));
 
-static inline int nft_data_cmp(const struct nft_data *d1,
-			       const struct nft_data *d2,
-			       unsigned int len)
-{
-	return memcmp(d1->data, d2->data, len);
-}
+/**
+ * struct nft_regs - nf_tables register set
+ *
+ * @data: data registers
+ * @verdict: verdict register
+ *
+ * The first four data registers alias to the verdict register.
+ */
+struct nft_regs {
+	union {
+		u32 data[20];
+		struct nft_verdict verdict;
+	};
+};
 
-static inline void nft_data_copy(struct nft_data *dst,
-				 const struct nft_data *src)
+static inline void nft_data_copy(u32 *dst, const struct nft_data *src,
+				 unsigned int len)
 {
-	BUILD_BUG_ON(__alignof__(*dst) != __alignof__(u64));
-	*(u64 *)&dst->data[0] = *(u64 *)&src->data[0];
-	*(u64 *)&dst->data[2] = *(u64 *)&src->data[2];
+	memcpy(dst, src, len);
 }
 
 static inline void nft_data_debug(const struct nft_data *data)
@@ -96,7 +111,8 @@ struct nft_data_desc {
 	unsigned int len;
 };
 
-int nft_data_init(const struct nft_ctx *ctx, struct nft_data *data,
+int nft_data_init(const struct nft_ctx *ctx,
+		  struct nft_data *data, unsigned int size,
 		  struct nft_data_desc *desc, const struct nlattr *nla);
 void nft_data_uninit(const struct nft_data *data, enum nft_data_types type);
 int nft_data_dump(struct sk_buff *skb, int attr, const struct nft_data *data,
@@ -112,12 +128,14 @@ static inline enum nft_registers nft_type_to_reg(enum nft_data_types type)
 	return type == NFT_DATA_VERDICT ? NFT_REG_VERDICT : NFT_REG_1;
 }
 
-int nft_validate_input_register(enum nft_registers reg);
-int nft_validate_output_register(enum nft_registers reg);
-int nft_validate_data_load(const struct nft_ctx *ctx, enum nft_registers reg,
-			   const struct nft_data *data,
-			   enum nft_data_types type);
+unsigned int nft_parse_register(const struct nlattr *attr);
+int nft_dump_register(struct sk_buff *skb, unsigned int attr, unsigned int reg);
 
+int nft_validate_register_load(enum nft_registers reg, unsigned int len);
+int nft_validate_register_store(const struct nft_ctx *ctx,
+				enum nft_registers reg,
+				const struct nft_data *data,
+				enum nft_data_types type, unsigned int len);
 
 /**
  * struct nft_userdata - user defined data associated with an object
@@ -141,7 +159,10 @@ struct nft_userdata {
  * @priv: element private data and extensions
  */
 struct nft_set_elem {
-	struct nft_data key;
+	union {
+		u32 buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)];
+		struct nft_data val;
+	} key;
 	void *priv;
 };
 
@@ -216,15 +237,15 @@ struct nft_expr;
  */
 struct nft_set_ops {
 	bool (*lookup)(const struct nft_set *set,
-		       const struct nft_data *key,
+		       const u32 *key,
 		       const struct nft_set_ext **ext);
 	bool (*update)(struct nft_set *set,
-		       const struct nft_data *key,
+		       const u32 *key,
 		       void *(*new)(struct nft_set *,
 				    const struct nft_expr *,
-				    struct nft_data []),
+				    struct nft_regs *),
 		       const struct nft_expr *expr,
-		       struct nft_data data[],
+		       struct nft_regs *regs,
 		       const struct nft_set_ext **ext);
 
 	int (*insert)(const struct nft_set *set,
@@ -350,6 +371,7 @@ void nf_tables_unbind_set(const struct nft_ctx *ctx, struct nft_set *set,
  * @NFT_SET_EXT_TIMEOUT: element timeout
  * @NFT_SET_EXT_EXPIRATION: element expiration time
  * @NFT_SET_EXT_USERDATA: user data associated with the element
+ * @NFT_SET_EXT_EXPR: expression assiociated with the element
  * @NFT_SET_EXT_NUM: number of extension types
  */
 enum nft_set_extensions {
@@ -359,6 +381,7 @@ enum nft_set_extensions {
 	NFT_SET_EXT_TIMEOUT,
 	NFT_SET_EXT_EXPIRATION,
 	NFT_SET_EXT_USERDATA,
+	NFT_SET_EXT_EXPR,
 	NFT_SET_EXT_NUM
 };
 
@@ -470,6 +493,11 @@ static inline struct nft_userdata *nft_set_ext_userdata(const struct nft_set_ext
 	return nft_set_ext(ext, NFT_SET_EXT_USERDATA);
 }
 
+static inline struct nft_expr *nft_set_ext_expr(const struct nft_set_ext *ext)
+{
+	return nft_set_ext(ext, NFT_SET_EXT_EXPR);
+}
+
 static inline bool nft_set_elem_expired(const struct nft_set_ext *ext)
 {
 	return nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION) &&
@@ -484,8 +512,7 @@ static inline struct nft_set_ext *nft_set_elem_ext(const struct nft_set *set,
 
 void *nft_set_elem_init(const struct nft_set *set,
 			const struct nft_set_ext_tmpl *tmpl,
-			const struct nft_data *key,
-			const struct nft_data *data,
+			const u32 *key, const u32 *data,
 			u64 timeout, gfp_t gfp);
 void nft_set_elem_destroy(const struct nft_set *set, void *elem);
 
@@ -556,6 +583,7 @@ static inline void nft_set_gc_batch_add(struct nft_set_gc_batch *gcb,
  * @policy: netlink attribute policy
  * @maxattr: highest netlink attribute number
  * @family: address family for AF-specific types
+ * @flags: expression type flags
  */
 struct nft_expr_type {
 	const struct nft_expr_ops *(*select_ops)(const struct nft_ctx *,
@@ -567,8 +595,11 @@ struct nft_expr_type {
 	const struct nla_policy *policy;
 	unsigned int maxattr;
 	u8 family;
+	u8 flags;
 };
 
+#define NFT_EXPR_STATEFUL 0x1
+
 /**
  * struct nft_expr_ops - nf_tables expression operations
  *
@@ -584,7 +615,7 @@ struct nft_expr_type {
 struct nft_expr;
 struct nft_expr_ops {
 	void (*eval)(const struct nft_expr *expr,
-		      struct nft_data data[NFT_REG_MAX + 1],
+		      struct nft_regs *regs,
 		      const struct nft_pktinfo *pkt);
 	unsigned int size;
 
@@ -622,6 +653,18 @@ static inline void *nft_expr_priv(const struct nft_expr *expr)
 	return (void *)expr->data;
 }
 
+struct nft_expr *nft_expr_init(const struct nft_ctx *ctx,
+			       const struct nlattr *nla);
+void nft_expr_destroy(const struct nft_ctx *ctx, struct nft_expr *expr);
+int nft_expr_dump(struct sk_buff *skb, unsigned int attr,
+		  const struct nft_expr *expr);
+
+static inline void nft_expr_clone(struct nft_expr *dst, struct nft_expr *src)
+{
+	__module_get(src->ops->type->owner);
+	memcpy(dst, src, src->ops->size);
+}
+
 /**
  * struct nft_rule - nf_tables rule
  *
diff --git a/include/net/netfilter/nft_meta.h b/include/net/netfilter/nft_meta.h
index 0ee47c3e2e31..711887a09e91 100644
--- a/include/net/netfilter/nft_meta.h
+++ b/include/net/netfilter/nft_meta.h
@@ -26,11 +26,11 @@ int nft_meta_set_dump(struct sk_buff *skb,
 		      const struct nft_expr *expr);
 
 void nft_meta_get_eval(const struct nft_expr *expr,
-		       struct nft_data data[NFT_REG_MAX + 1],
+		       struct nft_regs *regs,
 		       const struct nft_pktinfo *pkt);
 
 void nft_meta_set_eval(const struct nft_expr *expr,
-		       struct nft_data data[NFT_REG_MAX + 1],
+		       struct nft_regs *regs,
 		       const struct nft_pktinfo *pkt);
 
 #endif
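
The register model that the commit message describes and that struct nft_regs introduces can be sketched in userspace as follows. This is a simplified illustration under stated assumptions, not kernel code: sketch_regs, sketch_reg_store and sketch_key_matches are hypothetical names, registers are indexed from 0, and the verdict register that aliases the first words of struct nft_regs is left out. The sketch shows sixteen 4-byte registers, stores that zero the unused bytes of the last register they touch, and a concatenated key read back from adjacent registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NUM_REGS 16	/* sixteen 4-byte data registers */

struct sketch_regs {
	uint32_t data[SKETCH_NUM_REGS];
};

/*
 * Store @len bytes starting at register @reg. The store always covers whole
 * 32-bit registers: bytes past @len in the last register are zeroed, so the
 * register contents are fully defined. Returns 0 on success, -1 if the store
 * is empty or would run past the last register.
 */
int sketch_reg_store(struct sketch_regs *regs, unsigned int reg,
		     const void *src, unsigned int len)
{
	unsigned int nregs = (len + sizeof(uint32_t) - 1) / sizeof(uint32_t);

	assert(regs != NULL && src != NULL);
	if (len == 0 || reg >= SKETCH_NUM_REGS || nregs > SKETCH_NUM_REGS - reg)
		return -1;

	memset(&regs->data[reg], 0, nregs * sizeof(uint32_t));
	memcpy(&regs->data[reg], src, len);
	return 0;
}

/* Compare @len bytes of adjacent registers starting at @reg with a set key. */
int sketch_key_matches(const struct sketch_regs *regs, unsigned int reg,
		       const uint32_t *key, unsigned int len)
{
	return memcmp(&regs->data[reg], key, len) == 0;
}
```

Storing a 4-byte address into register 0 and a 1-byte flags value into register 1 yields an 8-byte key in which the three unused bytes of register 1 are zero, so a single memcmp() over both registers is a well-defined exact match.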