tipc: fix changeover issues due to large packet

When the interfaces' MTU is changed (especially with bonding), the TIPC links are brought up and down within a short time. In that situation a couple of issues were detected with the current link changeover mechanism:

1) When one link comes up but is immediately forced down again, the failover procedure is carried out in order to fail over all the messages in the link's transmq queue onto the other working link. The link and node state is also set to FAILINGOVER as part of the process. Each message is transmitted in the form of a FAILOVER_MSG, so its size grows by 40 bytes (= the message header size). There is no problem if the original message size is not larger than the link's MTU - 40, and indeed this is the max size of a normal payload message. However, in the situation above, because the link has only just come up, the messages in the link's transmq are mostly SYNCH_MSGs generated by the link synching procedure, and their size might already have reached the max value. When a FAILOVER_MSG is built on top of such a SYNCH_MSG, its size exceeds the link's MTU. As a result, the messages are dropped silently and the failover procedure never completes: the link cannot exit the FAILINGOVER state, so it cannot be re-established.

2) The same scenario can happen more easily when the MTUs of the links are set differently or are being changed. In that case, as long as a large message in the failed link's transmq queue was built and fragmented with that link's MTU, and that MTU is larger than the other link's, the issue will happen (no prior link synching is needed).

3) The link synching procedure also faces the same issue, but since link synching is only started upon receipt of a SYNCH_MSG, dropping the message does not result in a state deadlock. It is still not the designed behaviour.

Issues 1) and 3) are resolved by the previous commit, which generates only a dummy SYNCH_MSG (i.e. without data) at link synching, so the size of a FAILOVER_MSG, if any, will never exceed the link's MTU.

For issue 2), the only solution is to fragment the messages in the failed link's transmq queue according to the working link's MTU so that they can be failed over. A new function is added to accomplish this. The result is still a TUNNEL PROTOCOL/FAILOVER MSG, but if the original message is too large, it is fragmented and then reassembled at the receiving side.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
author: Tuong Lien <tuong.t.lien@dektech.com.au> 2019-07-23 21:56:12 -0400
committer: David S. Miller <davem@davemloft.net> 2019-07-25 18:55:47 -0400
commit: 2320bcdae62887555701ea78a46b640ff6b63868 (patch)
tree: 9823f2ab5b41438842517ae8647904c9aa7b943b /net/tipc/msg.c
parent: 4929a932be334d68d333089872bc67e4f1d97475 (diff)
1 files changed, 59 insertions, 0 deletions
diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index f48e5857210f..e6d49cdc61b4 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -244,6 +244,65 @@ bool tipc_msg_validate(struct sk_buff **_skb)
 }
 
 /**
+ * tipc_msg_fragment - build a fragment skb list for TIPC message
+ *
+ * @skb: TIPC message skb
+ * @hdr: internal msg header to be put on the top of the fragments
+ * @pktmax: max size of a fragment incl. the header
+ * @frags: returned fragment skb list
+ *
+ * Returns 0 if the fragmentation is successful, otherwise: -EINVAL
+ * or -ENOMEM
+ */
+int tipc_msg_fragment(struct sk_buff *skb, const struct tipc_msg *hdr,
+                      int pktmax, struct sk_buff_head *frags)
+{
+        int pktno, nof_fragms, dsz, dmax, eat;
+        struct tipc_msg *_hdr;
+        struct sk_buff *_skb;
+        u8 *data;
+
+        /* Non-linear buffer? */
+        if (skb_linearize(skb))
+                return -ENOMEM;
+
+        data = (u8 *)skb->data;
+        dsz = msg_size(buf_msg(skb));
+        dmax = pktmax - INT_H_SIZE;
+        if (dsz <= dmax || !dmax)
+                return -EINVAL;
+
+        nof_fragms = dsz / dmax + 1;
+        for (pktno = 1; pktno <= nof_fragms; pktno++) {
+                if (pktno < nof_fragms)
+                        eat = dmax;
+                else
+                        eat = dsz % dmax;
+                /* Allocate a new fragment */
+                _skb = tipc_buf_acquire(INT_H_SIZE + eat, GFP_ATOMIC);
+                if (!_skb)
+                        goto error;
+                skb_orphan(_skb);
+                __skb_queue_tail(frags, _skb);
+                /* Copy header & data to the fragment */
+                skb_copy_to_linear_data(_skb, hdr, INT_H_SIZE);
+                skb_copy_to_linear_data_offset(_skb, INT_H_SIZE, data, eat);
+                data += eat;
+                /* Update the fragment's header */
+                _hdr = buf_msg(_skb);
+                msg_set_fragm_no(_hdr, pktno);
+                msg_set_nof_fragms(_hdr, nof_fragms);
+                msg_set_size(_hdr, INT_H_SIZE + eat);
+        }
+        return 0;
+
+error:
+        __skb_queue_purge(frags);
+        __skb_queue_head_init(frags);
+        return -ENOMEM;
+}
+
+/**
 * tipc_msg_build - create buffer chain containing specified header and data
 * @mhdr: Message header, to be prepended to data
 * @m: User message