litmus-rt-ext-res.git/net/sunrpc, branch EXT-RES

svcauth_gss: Close connection when dropping an incoming message

2017-04-12T10:41:17+00:00

[ Upstream commit 4d712ef1db05c3aa5c3b690a50c37ebad584c53f ]

S5.3.3.1 of RFC 2203 requires that an incoming GSS-wrapped message
whose sequence number lies outside the current window is dropped.
The rationale is:

  The reason for discarding requests silently is that the server
  is unable to determine if the duplicate or out of range request
  was due to a sequencing problem in the client, network, or the
  operating system, or due to some quirk in routing, or a replay
  attack by an intruder.  Discarding the request allows the client
  to recover after timing out, if indeed the duplication was
  unintentional or well intended.

However, clients may rely on the server dropping the connection to
indicate that a retransmit is needed. Without a connection reset, a
client can wait forever without retransmitting, and the workload
just stops dead. I've reproduced this behavior by running xfstests
generic/323 on an NFSv4.0 mount with proto=rdma and sec=krb5i.

To address this issue, have the server close the connection when it
silently discards an incoming message due to a GSS sequence number
problem.

There are a few other places where the server will never reply.
Change those spots in a similar fashion.

Signed-off-by: Chuck Lever 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

xprtrdma: Squelch kbuild sparse complaint

2017-03-26T11:05:57+00:00

commit eed50879d64ab1b9f76445dbab822e43a098b309 upstream.

New complaint from kbuild for 4.9.y:

net/sunrpc/xprtrdma/verbs.c:489:19: sparse: incompatible types in
    comparison expression (different type sizes)

verbs.c:
489	max_sge = min(ia->ri_device->attrs.max_sge, RPCRDMA_MAX_SEND_SGES);

I can't reproduce this running sparse here. Likewise, "make W=1
net/sunrpc/xprtrdma/verbs.o" never indicated any issue.

A little poking suggests that because the range of its values is
small, gcc can make the actual width of RPCRDMA_MAX_SEND_SGES
smaller than the width of an unsigned integer.

Fixes: 16f906d66cd7 ("xprtrdma: Reduce required number of send SGEs")
Signed-off-by: Chuck Lever 
Signed-off-by: Anna Schumaker 
Signed-off-by: Greg Kroah-Hartman

xprtrdma: Reduce required number of send SGEs

2017-03-12T05:41:52+00:00

commit 16f906d66cd76fb9895cbc628f447532a7ac1faa upstream.

The MAX_SEND_SGES check introduced in commit 655fec6987be
("xprtrdma: Use gathered Send for large inline messages") fails
for devices that have a small max_sge.

Instead of checking for a large fixed maximum number of SGEs,
check for a minimum small number. RPC-over-RDMA will switch to
using a Read chunk if an xdr_buf has more pages than can fit in
the device's max_sge limit. This is considerably better than
failing all together to mount the server.

This fix supports devices that have as few as three send SGEs
available.

Reported-by: Selvin Xavier 
Reported-by: Devesh Sharma 
Reported-by: Honggang Li 
Reported-by: Ram Amrani 
Fixes: 655fec6987be ("xprtrdma: Use gathered Send for large ...")
Tested-by: Honggang Li 
Tested-by: Ram Amrani 
Tested-by: Steve Wise 
Reviewed-by: Parav Pandit 
Signed-off-by: Chuck Lever 
Signed-off-by: Anna Schumaker 
Signed-off-by: Greg Kroah-Hartman

xprtrdma: Disable pad optimization by default

2017-03-12T05:41:52+00:00

commit c95a3c6b88658bcb8f77f85f31a0b9d9036e8016 upstream.

Commit d5440e27d3e5 ("xprtrdma: Enable pad optimization") made the
Linux client omit XDR round-up padding in normal Read and Write
chunks so that the client doesn't have to register and invalidate
3-byte memory regions that contain no real data.

Unfortunately, my cheery 2014 assessment that this optimization "is
supported now by both Linux and Solaris servers" was premature.
We've found bugs in Solaris in this area since commit d5440e27d3e5
("xprtrdma: Enable pad optimization") was merged (SYMLINK is the
main offender).

So for maximum interoperability, I'm disabling this optimization
again. If a CM private message is exchanged when connecting, the
client recognizes that the server is Linux, and enables the
optimization for that connection.

Until now the Solaris server bugs did not impact common operations,
and were thus largely benign. Soon, less capable devices on Linux
NFS/RDMA clients will make use of Read chunks more often, and these
Solaris bugs will prevent interoperation in more cases.

Fixes: 677eb17e94ed ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever 
Signed-off-by: Anna Schumaker 
Signed-off-by: Greg Kroah-Hartman

xprtrdma: Per-connection pad optimization

2017-03-12T05:41:52+00:00

commit b5f0afbea4f2ea52c613ac2b06cb6de2ea18cb6d upstream.

Pad optimization is changed by echoing into
/proc/sys/sunrpc/rdma_pad_optimize. This is a global setting,
affecting all RPC-over-RDMA connections to all servers.

The marshaling code picks up that value and uses it for decisions
about how to construct each RPC-over-RDMA frame. Having it change
suddenly in mid-operation can result in unexpected failures. And
some servers a client mounts might need chunk round-up, while
others don't.

So instead, copy the pad_optimize setting into each connection's
rpcrdma_ia when the transport is created, and use the copy, which
can't change during the life of the connection, instead.

This also removes a hack: rpcrdma_convert_iovs was using
the remote-invalidation-expected flag to predict when it could leave
out Write chunk padding. This is because the Linux server handles
implicit XDR padding on Write chunks correctly, and only Linux
servers can set the connection's remote-invalidation-expected flag.

It's more sensible to use the pad optimization setting instead.

Fixes: 677eb17e94ed ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever 
Signed-off-by: Anna Schumaker 
Signed-off-by: Greg Kroah-Hartman

xprtrdma: Fix Read chunk padding

2017-03-12T05:41:52+00:00

commit 24abdf1be15c478e2821d6fc903a4a4440beff02 upstream.

When pad optimization is disabled, rpcrdma_convert_iovs still
does not add explicit XDR round-up padding to a Read chunk.

Commit 677eb17e94ed ("xprtrdma: Fix XDR tail buffer marshalling")
incorrectly short-circuited the test for whether round-up padding
is needed that appears later in rpcrdma_convert_iovs.

However, if this is indeed a regular Read chunk (and not a
Position-Zero Read chunk), the tail iovec _always_ contains the
chunk's padding, and never anything else.

So, it's easy to just skip the tail when padding optimization is
enabled, and add the tail in a subsequent Read chunk segment, if
disabled.

Fixes: 677eb17e94ed ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever 
Signed-off-by: Anna Schumaker 
Signed-off-by: Greg Kroah-Hartman

svcrpc: fix oops in absence of krb5 module

2017-02-09T07:08:27+00:00

commit 034dd34ff4916ec1f8f74e39ca3efb04eab2f791 upstream.

Olga Kornievskaia says: "I ran into this oops in the nfsd (below)
(4.10-rc3 kernel). To trigger this I had a client (unsuccessfully) try
to mount the server with krb5 where the server doesn't have the
rpcsec_gss_krb5 module built."

The problem is that rsci.cred is copied from a svc_cred structure that
gss_proxy didn't properly initialize.  Fix that.

[120408.542387] general protection fault: 0000 [#1] SMP
...
[120408.565724] CPU: 0 PID: 3601 Comm: nfsd Not tainted 4.10.0-rc3+ #16
[120408.567037] Hardware name: VMware, Inc. VMware Virtual =
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[120408.569225] task: ffff8800776f95c0 task.stack: ffffc90003d58000
[120408.570483] RIP: 0010:gss_mech_put+0xb/0x20 [auth_rpcgss]
...
[120408.584946]  ? rsc_free+0x55/0x90 [auth_rpcgss]
[120408.585901]  gss_proxy_save_rsc+0xb2/0x2a0 [auth_rpcgss]
[120408.587017]  svcauth_gss_proxy_init+0x3cc/0x520 [auth_rpcgss]
[120408.588257]  ? __enqueue_entity+0x6c/0x70
[120408.589101]  svcauth_gss_accept+0x391/0xb90 [auth_rpcgss]
[120408.590212]  ? try_to_wake_up+0x4a/0x360
[120408.591036]  ? wake_up_process+0x15/0x20
[120408.592093]  ? svc_xprt_do_enqueue+0x12e/0x2d0 [sunrpc]
[120408.593177]  svc_authenticate+0xe1/0x100 [sunrpc]
[120408.594168]  svc_process_common+0x203/0x710 [sunrpc]
[120408.595220]  svc_process+0x105/0x1c0 [sunrpc]
[120408.596278]  nfsd+0xe9/0x160 [nfsd]
[120408.597060]  kthread+0x101/0x140
[120408.597734]  ? nfsd_destroy+0x60/0x60 [nfsd]
[120408.598626]  ? kthread_park+0x90/0x90
[120408.599448]  ret_from_fork+0x22/0x30

Fixes: 1d658336b05f "SUNRPC: Add RPC based upcall mechanism for RPCGSS auth"
Cc: Simo Sorce 
Reported-by: Olga Kornievskaia 
Tested-by: Olga Kornievskaia 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman

SUNRPC: cleanup ida information when removing sunrpc module

2017-02-01T07:33:09+00:00

commit c929ea0b910355e1876c64431f3d5802f95b3d75 upstream.

After removing sunrpc module, I get many kmemleak information as,
unreferenced object 0xffff88003316b1e0 (size 544):
  comm "gssproxy", pid 2148, jiffies 4294794465 (age 4200.081s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [] kmemleak_alloc+0x4a/0xa0
    [] kmem_cache_alloc+0x15e/0x1f0
    [] ida_pre_get+0xaa/0x150
    [] ida_simple_get+0xad/0x180
    [] nlmsvc_lookup_host+0x4ab/0x7f0 [lockd]
    [] lockd+0x4d/0x270 [lockd]
    [] param_set_timeout+0x55/0x100 [lockd]
    [] svc_defer+0x114/0x3f0 [sunrpc]
    [] svc_defer+0x2d7/0x3f0 [sunrpc]
    [] rpc_show_info+0x8a/0x110 [sunrpc]
    [] proc_reg_write+0x7f/0xc0
    [] __vfs_write+0xdf/0x3c0
    [] vfs_write+0xef/0x240
    [] SyS_write+0xad/0x130
    [] entry_SYSCALL_64_fastpath+0x1a/0xa9
    [] 0xffffffffffffffff

I found, the ida information (dynamic memory) isn't cleanup.

Signed-off-by: Kinglong Mee 
Fixes: 2f048db4680a ("SUNRPC: Add an identifier for struct rpc_clnt")
Signed-off-by: Trond Myklebust 
Signed-off-by: Greg Kroah-Hartman

xprtrdma: Squelch "max send, max recv" messages at connect time

2017-01-26T07:24:43+00:00

commit 6d6bf72de914059b304f7b99530a7856e5c846aa upstream.

Clean up: This message was intended to be a dprintk, as it is on the
server-side.

Fixes: 87cfb9a0c85c ('xprtrdma: Client-side support for ...')
Signed-off-by: Chuck Lever 
Signed-off-by: Anna Schumaker 
Signed-off-by: Greg Kroah-Hartman

xprtrdma: Make FRWR send queue entry accounting more accurate

2017-01-26T07:24:43+00:00

commit 8d38de65644d900199f035277aa5f3da4aa9fc17 upstream.

Verbs providers may perform house-keeping on the Send Queue during
each signaled send completion. It is necessary therefore for a verbs
consumer (like xprtrdma) to occasionally force a signaled send
completion if it runs unsignaled most of the time.

xprtrdma does not require signaled completions for Send or FastReg
Work Requests, but does signal some LocalInv Work Requests. To
ensure that Send Queue house-keeping can run before the Send Queue
is more than half-consumed, xprtrdma forces a signaled completion
on occasion by counting the number of Send Queue Entries it
consumes. It currently does this by counting each ib_post_send as
one Entry.

Commit c9918ff56dfb ("xprtrdma: Add ro_unmap_sync method for FRWR")
introduced the ability for frwr_op_unmap_sync to post more than one
Work Request with a single post_send. Thus the underlying assumption
of one Send Queue Entry per ib_post_send is no longer true.

Also, FastReg Work Requests are currently never signaled. They
should be signaled once in a while, just as Send is, to keep the
accounting of consumed SQEs accurate.

While we're here, convert the CQCOUNT macros to the currently
preferred kernel coding style, which is inline functions.

Fixes: c9918ff56dfb ("xprtrdma: Add ro_unmap_sync method for FRWR")
Signed-off-by: Chuck Lever 
Signed-off-by: Anna Schumaker 
Signed-off-by: Greg Kroah-Hartman