torvalds-linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-01-11 17:10:13 +00:00

Author	SHA1	Message	Date
NeilBrown	fb321998de	nfsd: use correct loop termination in nfsd4_revoke_states() The loop in nfsd4_revoke_states() stops one too early because the end value given is CLIENT_HASH_MASK where it should be CLIENT_HASH_SIZE. This means that an admin request to drop all locks for a filesystem will miss locks held by clients which hash to the maximum possible hash value. Fixes: 1ac3629bf012 ("nfsd: prepare for supporting admin-revocation of state") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2026-01-02 13:49:38 -05:00
NeilBrown	2857bd59fe	nfsd: provide locking for v4_end_grace Writing to v4_end_grace can race with server shutdown and result in memory being accessed after it was freed - reclaim_str_hashtbl in particularly. We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is held while client_tracking_op->init() is called and that can wait for an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a deadlock. nfsd4_end_grace() is also called by the landromat work queue and this doesn't require locking as server shutdown will stop the work and wait for it before freeing anything that nfsd4_end_grace() might access. However, we must be sure that writing to v4_end_grace doesn't restart the work item after shutdown has already waited for it. For this we add a new flag protected with nn->client_lock. It is set only while it is safe to make client tracking calls, and v4_end_grace only schedules work while the flag is set with the spinlock held. So this patch adds a nfsd_net field "client_tracking_active" which is set as described. Another field "grace_end_forced", is set when v4_end_grace is written. After this is set, and providing client_tracking_active is set, the laundromat is scheduled. This "grace_end_forced" field bypasses other checks for whether the grace period has finished. This resolves a race which can result in use-after-free. Reported-by: Li Lingfeng <lilingfeng3@huawei.com> Closes: https://lore.kernel.org/linux-nfs/20250623030015.2353515-1-neil@brown.name/T/#t Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neil@brown.name> Tested-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2026-01-02 13:48:22 -05:00
Haoxiang Li	1f941b2c23	nfsd: Drop the client reference in client_states_open() In error path, call drop_client() to drop the reference obtained by get_nfsdfs_clp(). Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-12-24 21:33:12 -05:00
Jeff Layton	8f9e967830	nfsd: use ATTR_DELEG in nfsd4_finalize_deleg_timestamps() When finalizing timestamps that have never been updated and preparing to release the delegation lease, the notify_change() call can trigger a delegation break, and fail to update the timestamps. When this happens, there will be messages like this in dmesg: [ 2709.375785] Unable to update timestamps on inode 00:39:263: -11 Since this code is going to release the lease just after updating the timestamps, breaking the delegation is undesirable. Fix this by setting ATTR_DELEG in ia_valid, in order to avoid the delegation break. Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-12-24 21:32:34 -05:00
Chuck Lever	8072e34e13	nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if the client already has a SHARE_ACCESS_READ open from a previous OPEN operation, this action overwrites the existing pointer without releasing its reference, orphaning the previous reference. Additionally, the function originally stored the same nfsd_file pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with only a single reference. When put_deleg_file() runs, it clears fi_rdeleg_file and calls nfs4_file_put_access() to release the file. However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when the fi_access[O_RDONLY] counter drops to zero. If another READ open exists on the file, the counter remains elevated and the nfsd_file reference from the delegation is never released. This potentially causes open conflicts on that file. Then, on server shutdown, these leaks cause __nfsd_file_cache_purge() to encounter files with an elevated reference count that cannot be cleaned up, ultimately triggering a BUG() in kmem_cache_destroy() because there are still nfsd_file objects allocated in that cache. Fixes: e7a8ebc305f2 ("NFSD: Offer write delegation for OPEN with OPEN4_SHARE_ACCESS_WRITE") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-12-24 21:29:24 -05:00
NeilBrown	fceb8734e7	nfsd: stop pretending that we cache the SEQUENCE reply. nfsd does not cache the reply to a SEQUENCE. As the comment above nfsd4_replay_cache_entry() says: * The sequence operation is not cached because we can use the slot and * session values. The comment above nfsd4_cache_this() suggests otherwise. * The session reply cache only needs to cache replies that the client * actually asked us to. But it's almost free for us to cache compounds * consisting of only a SEQUENCE op, so we may as well cache those too. * Also, the protocol doesn't give us a convenient response in the case * of a replay of a solo SEQUENCE op that wasn't cached The code in nfsd4_store_cache_entry() makes it clear that only responses beyond 'cstate.data_offset' are actually cached, and data_offset is set at the end of nfsd4_encode_sequence() after the sequence response has been encoded. This patch simplifies code and removes the confusing comments. - nfsd4_is_solo_sequence() is discarded as not-useful. - nfsd4_cache_this() is now trivial so it too is discarded with the code placed in-line at the one call-site in nfsd4_store_cache_entry(). - nfsd4_enc_sequence_replay() is open-coded in to nfsd4_replay_cache_entry(), and then simplified to (hopefully) make the process of replaying a reply clearer. Signed-off-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-11-17 08:46:12 -05:00
Matvey Kovalev	bfce8e4273	nfsd: delete unreachable confusing code in nfs4_open_delegation() op_delegate_type is assigned OPEN_DELEGATE_NONE just before the if-block where condition specifies it not be equal to OPEN_DELEGATE_NONE. Compiler treats the block as unreachable and optimizes it out from the resulting executable. In that aspect commit d08d32e6e5c0 ("nfsd4: return delegation immediately if lease fails") notably makes no difference. Seems it's better to just drop this code instead of fiddling with memory barriers or atomics. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Matvey Kovalev <matvey.kovalev@ispras.ru> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-11-16 18:20:11 -05:00
NeilBrown	4552f4e3f2	nfsd: change nfs4_client_to_reclaim() to allocate data The calling convention for nfs4_client_to_reclaim() is clumsy in that the caller needs to free memory if the function fails. It is much cleaner if the function frees its own memory. This patch changes nfs4_client_to_reclaim() to re-allocate the .data fields to be stored in the newly allocated struct nfs4_client_reclaim, and to free everything on failure. __cld_pipe_inprogress_downcall() needs to allocate the data anyway to copy it from user-space, so now that data is allocated twice. I think that is a small price to pay for a cleaner interface. Signed-off-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-11-16 18:20:11 -05:00
NeilBrown	1cff14b7fc	nfsd: ensure SEQUENCE replay sends a valid reply. nfsd4_enc_sequence_replay() uses nfsd4_encode_operation() to encode a new SEQUENCE reply when replaying a request from the slot cache - only ops after the SEQUENCE are replayed from the cache in ->sl_data. However it does this in nfsd4_replay_cache_entry() which is called before nfsd4_sequence() has filled in reply fields. This means that in the replayed SEQUENCE reply: maxslots will be whatever the client sent target_maxslots will be -1 (assuming init to zero, and nfsd4_encode_sequence() subtracts 1) status_flags will be zero The incorrect maxslots value, in particular, can cause the client to think the slot table has been reduced in size so it can discard its knowledge of current sequence number of the later slots, though the server has not discarded those slots. When the client later wants to use a later slot, it can get NFS4ERR_SEQ_MISORDERED from the server. This patch moves the setup of the reply into a new helper function and call it before nfsd4_replay_cache_entry() is called. Only one of the updated fields was used after this point - maxslots. So the nfsd4_sequence struct has been extended to have separate maxslots for the request and the response. Reported-by: Olga Kornievskaia <okorniev@redhat.com> Closes: https://lore.kernel.org/linux-nfs/20251010194449.10281-1-okorniev@redhat.com/ Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-11-10 09:31:52 -05:00
Chuck Lever	c96573c0d7	NFSD: Never cache a COMPOUND when the SEQUENCE operation fails RFC 8881 normatively mandates that operations where the initial SEQUENCE operation in a compound fails must not modify the slot's replay cache. nfsd4_cache_this() doesn't prevent such caching. So when SEQUENCE fails, cstate.data_offset is not set, allowing read_bytes_from_xdr_buf() to access uninitialized memory. Reported-by: rtm@csail.mit.edu Closes: https://lore.kernel.org/linux-nfs/c3628d57-94ae-48cf-8c9e-49087a28cec9@oracle.com/T/#t Fixes: 468de9e54a90 ("nfsd41: expand solo sequence check") Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-11-10 09:31:52 -05:00
Olga Kornievskaia	4aa17144d5	NFSD: free copynotify stateid in nfs4_free_ol_stateid() Typically copynotify stateid is freed either when parent's stateid is being close/freed or in nfsd4_laundromat if the stateid hasn't been used in a lease period. However, in case when the server got an OPEN (which created a parent stateid), followed by a COPY_NOTIFY using that stateid, followed by a client reboot. New client instance while doing CREATE_SESSION would force expire previous state of this client. It leads to the open state being freed thru release_openowner-> nfs4_free_ol_stateid() and it finds that it still has copynotify stateid associated with it. We currently print a warning and is triggerred WARNING: CPU: 1 PID: 8858 at fs/nfsd/nfs4state.c:1550 nfs4_free_ol_stateid+0xb0/0x100 [nfsd] This patch, instead, frees the associated copynotify stateid here. If the parent stateid is freed (without freeing the copynotify stateids associated with it), it leads to the list corruption when laundromat ends up freeing the copynotify state later. [ 1626.839430] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP [ 1626.842828] Modules linked in: nfnetlink_queue nfnetlink_log bluetooth cfg80211 rpcrdma rdma_cm iw_cm ib_cm ib_core nfsd nfs_acl lockd grace nfs_localio ext4 crc16 mbcache jbd2 overlay uinput snd_seq_dummy snd_hrtimer qrtr rfkill vfat fat uvcvideo snd_hda_codec_generic videobuf2_vmalloc videobuf2_memops snd_hda_intel uvc snd_intel_dspcfg videobuf2_v4l2 videobuf2_common snd_hda_codec snd_hda_core videodev snd_hwdep snd_seq mc snd_seq_device snd_pcm snd_timer snd soundcore sg loop auth_rpcgss vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock xfs 8021q garp stp llc mrp nvme ghash_ce e1000e nvme_core sr_mod nvme_keyring nvme_auth cdrom vmwgfx drm_ttm_helper ttm sunrpc dm_mirror dm_region_hash dm_log iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse dm_multipath dm_mod nfnetlink [ 1626.855594] CPU: 2 UID: 0 PID: 199 Comm: kworker/u24:33 Kdump: loaded Tainted: G B W 6.17.0-rc7+ #22 PREEMPT(voluntary) [ 1626.857075] Tainted: [B]=BAD_PAGE, [W]=WARN [ 1626.857573] Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS VMW201.00V.24006586.BA64.2406042154 06/04/2024 [ 1626.858724] Workqueue: nfsd4 laundromat_main [nfsd] [ 1626.859304] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 1626.860010] pc : __list_del_entry_valid_or_report+0x148/0x200 [ 1626.860601] lr : __list_del_entry_valid_or_report+0x148/0x200 [ 1626.861182] sp : ffff8000881d7a40 [ 1626.861521] x29: ffff8000881d7a40 x28: 0000000000000018 x27: ffff0000c2a98200 [ 1626.862260] x26: 0000000000000600 x25: 0000000000000000 x24: ffff8000881d7b20 [ 1626.862986] x23: ffff0000c2a981e8 x22: 1fffe00012410e7d x21: ffff0000920873e8 [ 1626.863701] x20: ffff0000920873e8 x19: ffff000086f22998 x18: 0000000000000000 [ 1626.864421] x17: 20747562202c3839 x16: 3932326636383030 x15: 3030666666662065 [ 1626.865092] x14: 6220646c756f6873 x13: 0000000000000001 x12: ffff60004fd9e4a3 [ 1626.865713] x11: 1fffe0004fd9e4a2 x10: ffff60004fd9e4a2 x9 : dfff800000000000 [ 1626.866320] x8 : 00009fffb0261b5e x7 : ffff00027ecf2513 x6 : 0000000000000001 [ 1626.866938] x5 : ffff00027ecf2510 x4 : ffff60004fd9e4a3 x3 : 0000000000000000 [ 1626.867553] x2 : 0000000000000000 x1 : ffff000096069640 x0 : 000000000000006d [ 1626.868167] Call trace: [ 1626.868382] __list_del_entry_valid_or_report+0x148/0x200 (P) [ 1626.868876] _free_cpntf_state_locked+0xd0/0x268 [nfsd] [ 1626.869368] nfs4_laundromat+0x6f8/0x1058 [nfsd] [ 1626.869813] laundromat_main+0x24/0x60 [nfsd] [ 1626.870231] process_one_work+0x584/0x1050 [ 1626.870595] worker_thread+0x4c4/0xc60 [ 1626.870893] kthread+0x2f8/0x398 [ 1626.871146] ret_from_fork+0x10/0x20 [ 1626.871422] Code: aa1303e1 aa1403e3 910e8000 97bc55d7 (d4210000) [ 1626.871892] SMP: stopping secondary CPUs Reported-by: rtm@csail.mit.edu Closes: https://lore.kernel.org/linux-nfs/d8f064c1-a26f-4eed-b4f0-1f7f608f415f@oracle.com/T/#t Fixes: 624322f1adc5 ("NFSD add COPY_NOTIFY operation") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-11-10 09:31:52 -05:00
Chuck Lever	3e7f011c25	Revert "NFSD: Remove the cap on number of operations per NFSv4 COMPOUND" I've found that pynfs COMP6 now leaves the connection or lease in a strange state, which causes CLOSE9 to hang indefinitely. I've dug into it a little, but I haven't been able to root-cause it yet. However, I bisected to commit 48aab1606fa8 ("NFSD: Remove the cap on number of operations per NFSv4 COMPOUND"). Tianshuo Han also reports a potential vulnerability when decoding an NFSv4 COMPOUND. An attacker can place an arbitrarily large op count in the COMPOUND header, which results in: [ 51.410584] nfsd: vmalloc error: size 1209533382144, exceeds total pages, mode:0xdc0(GFP_KERNEL\|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0 when NFSD attempts to allocate the COMPOUND op array. Let's restore the operation-per-COMPOUND limit, but increased to 200 for now. Reported-by: tianshuo han <hantianshuo233@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: stable@vger.kernel.org Tested-by: Tianshuo Han <hantianshuo233@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-10-21 11:03:50 -04:00
Jeff Layton	e5e9b24ab8	nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation Instead of allowing the ctime to roll backward with a WRITE_ATTRS delegation, set FMODE_NOCMTIME on the file and have it skip mtime and ctime updates. It is possible that the client will never send a SETATTR to set the times before returning the delegation. Add two new bools to struct nfs4_delegation: dl_written: tracks whether the file has been written since the delegation was granted. This is set in the WRITE and LAYOUTCOMMIT handlers. dl_setattr: tracks whether the client has sent at least one valid mtime that can also update the ctime in a SETATTR. When unlocking the lease for the delegation, clear FMODE_NOCMTIME. If the file has been written, but no setattr for the delegated mtime and ctime has been done, update the timestamps to current_time(). Suggested-by: NeilBrown <neil@brown.name> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-09-21 19:24:50 -04:00
Jeff Layton	b40b1ba37a	nfsd: fix timestamp updates in CB_GETATTR When updating the local timestamps from CB_GETATTR, the updated values are not being properly vetted. Compare the update times vs. the saved times in the delegation rather than the current times in the inode. Also, ensure that the ctime is properly vetted vs. its original value. Fixes: 6ae30d6eb26b ("nfsd: add support for delegated timestamps") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-09-21 19:24:50 -04:00
Jeff Layton	3952f1cbcb	nfsd: fix SETATTR updates for delegated timestamps SETATTRs containing delegated timestamp updates are currently not being vetted properly. Since we no longer need to compare the timestamps vs. the current timestamps, move the vetting of delegated timestamps wholly into nfsd. Rename the set_cb_time() helper to nfsd4_vet_deleg_time(), and make it non-static. Add a new vet_deleg_attrs() helper that is called from nfsd4_setattr that uses nfsd4_vet_deleg_time() to properly validate the all the timestamps. If the validation indicates that the update should be skipped, unset the appropriate flags in ia_valid. Fixes: 7e13f4f8d27d ("nfsd: handle delegated timestamps in SETATTR") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-09-21 19:24:50 -04:00
Jeff Layton	7663e963a5	nfsd: track original timestamps in nfs4_delegation As Trond points out [1], the "original time" mentioned in RFC 9754 refers to the timestamps on the files at the time that the delegation was granted, and not the current timestamp of the file on the server. Store the current timestamps for the file in the nfs4_delegation when granting one. Add STATX_ATIME and STATX_MTIME to the request mask in nfs4_delegation_stat(). When granting OPEN_DELEGATE_READ_ATTRS_DELEG, do a nfs4_delegation_stat() and save the correct atime. If the stat() fails for any reason, fall back to granting a normal read deleg. [1]: https://lore.kernel.org/linux-nfs/47a4e40310e797f21b5137e847b06bb203d99e66.camel@kernel.org/ Fixes: 7e13f4f8d27d ("nfsd: handle delegated timestamps in SETATTR") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-09-21 19:24:50 -04:00
Jeff Layton	c066ff58e5	nfsd: use ATTR_CTIME_SET for delegated ctime updates Ensure that notify_change() doesn't clobber a delegated ctime update with current_time() by setting ATTR_CTIME_SET for those updates. Don't bother setting the timestamps in cb_getattr_update_times() in the non-delegated case. notify_change() will do that itself. Fixes: 7e13f4f8d27d ("nfsd: handle delegated timestamps in SETATTR") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-09-21 19:24:50 -04:00
Chuck Lever	48aab1606f	NFSD: Remove the cap on number of operations per NFSv4 COMPOUND This limit has always been a sanity check; in nearly all cases a large COMPOUND is a sign of a malfunctioning client. The only real limit on COMPOUND size and complexity is the size of NFSD's send and receive buffers. However, there are a few cases where a large COMPOUND is sane. For example, when a client implementation wants to walk down a long file pathname in a single round trip. A small risk is that now a client can construct a COMPOUND request that can keep a single nfsd thread busy for quite some time. Suggested-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-07-14 12:46:41 -04:00
Dai Ngo	9c65001c57	NFSD: detect mismatch of file handle and delegation stateid in OPEN op When the client sends an OPEN with claim type CLAIM_DELEG_CUR_FH or CLAIM_DELEGATION_CUR, the delegation stateid and the file handle must belong to the same file, otherwise return NFS4ERR_INVAL. Note that RFC8881, section 8.2.4, mandates the server to return NFS4ERR_BAD_STATEID if the selected table entry does not match the current filehandle. However returning NFS4ERR_BAD_STATEID in the OPEN causes the client to retry the operation and therefor get the client into a loop. To avoid this situation we return NFS4ERR_INVAL instead. Reported-by: Petro Pavlov <petro.pavlov@vastdata.com> Fixes: c44c5eeb2c02 ("[PATCH] nfsd4: add open state code for CLAIM_DELEGATE_CUR") Cc: stable@vger.kernel.org Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-07-14 12:46:40 -04:00
Jeff Layton	908e4ead7f	nfsd: handle get_client_locked() failure in nfsd4_setclientid_confirm() Lei Lu recently reported that nfsd4_setclientid_confirm() did not check the return value from get_client_locked(). a SETCLIENTID_CONFIRM could race with a confirmed client expiring and fail to get a reference. That could later lead to a UAF. Fix this by getting a reference early in the case where there is an extant confirmed client. If that fails then treat it as if there were no confirmed client found at all. In the case where the unconfirmed client is expiring, just fail and return the result from get_client_locked(). Reported-by: lei lu <llfamsec@gmail.com> Closes: https://lore.kernel.org/linux-nfs/CAEBF3_b=UvqzNKdnfD_52L05Mqrqui9vZ2eFamgAbV0WG+FNWQ@mail.gmail.com/ Fixes: d20c11d86d8f ("nfsd: Protect session creation and client confirm using client_lock") Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-07-14 12:46:39 -04:00
Dai Ngo	3b8737ce5b	NFSD: release read access of nfs4_file when a write delegation is returned When a write delegation is returned, check if read access was added to nfs4_file when client opens file with WRONLY, and release it. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-07-14 12:46:36 -04:00
Dai Ngo	e7a8ebc305	NFSD: Offer write delegation for OPEN with OPEN4_SHARE_ACCESS_WRITE RFC8881, section 9.1.2 says: "In the case of READ, the server may perform the corresponding check on the access mode, or it may choose to allow READ for OPEN4_SHARE_ACCESS_WRITE, to accommodate clients whose WRITE implementation may unavoidably do reads (e.g., due to buffer cache constraints)." and in section 10.4.1: "Similarly, when closing a file opened for OPEN4_SHARE_ACCESS_WRITE/ OPEN4_SHARE_ACCESS_BOTH and if an OPEN_DELEGATE_WRITE delegation is in effect" This patch allows READ using write delegation stateid granted on OPENs with OPEN4_SHARE_ACCESS_WRITE only, to accommodate clients whose WRITE implementation may unavoidably do (e.g., due to buffer cache constraints). For write delegation granted for OPEN with OPEN4_SHARE_ACCESS_WRITE a new nfsd_file and a struct file are allocated to use for reads. The nfsd_file is freed when the file is closed by release_all_access. Suggested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-07-14 12:46:36 -04:00
Chuck Lever	58d721684d	NFSD: Remove NFSD_BUFSIZE Clean up: The documenting comment for NFSD_BUFSIZE is quite stale. NFSD_BUFSIZE is used only for NFSv4 Reply these days; never for NFSv2 or v3, and never for RPC Calls. Even so, the byte count estimate does not include the size of the NFSv4 COMPOUND Reply HEADER or the RPC auth flavor. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-05-15 16:16:27 -04:00
Chuck Lever	281ae67c94	NFSD: Implement CB_SEQUENCE referring call lists The slot index number of the current COMPOUND has, until now, not been needed outside of nfsd4_sequence(). But to record the tuple that represents a referring call, the slot number will be needed when processing subsequent operations in the COMPOUND. Refactor the code that allocates a new struct nfsd4_slot to ensure that the new sl_index field is always correctly initialized. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-05-11 19:48:21 -04:00
Li Lingfeng	a1d14d931b	nfsd: decrease sc_count directly if fail to queue dl_recall A deadlock warning occurred when invoking nfs4_put_stid following a failed dl_recall queue operation: T1 T2 nfs4_laundromat nfs4_get_client_reaplist nfs4_anylock_blockers __break_lease spin_lock // ctx->flc_lock spin_lock // clp->cl_lock nfs4_lockowner_has_blockers locks_owner_has_blockers spin_lock // flctx->flc_lock nfsd_break_deleg_cb nfsd_break_one_deleg nfs4_put_stid refcount_dec_and_lock spin_lock // clp->cl_lock When a file is opened, an nfs4_delegation is allocated with sc_count initialized to 1, and the file_lease holds a reference to the delegation. The file_lease is then associated with the file through kernel_setlease. The disassociation is performed in nfsd4_delegreturn via the following call chain: nfsd4_delegreturn --> destroy_delegation --> destroy_unhashed_deleg --> nfs4_unlock_deleg_lease --> kernel_setlease --> generic_delete_lease The corresponding sc_count reference will be released after this disassociation. Since nfsd_break_one_deleg executes while holding the flc_lock, the disassociation process becomes blocked when attempting to acquire flc_lock in generic_delete_lease. This means: 1) sc_count in nfsd_break_one_deleg will not be decremented to 0; 2) The nfs4_put_stid called by nfsd_break_one_deleg will not attempt to acquire cl_lock; 3) Consequently, no deadlock condition is created. Given that sc_count in nfsd_break_one_deleg remains non-zero, we can safely perform refcount_dec on sc_count directly. This approach effectively avoids triggering deadlock warnings. Fixes: 230ca758453c ("nfsd: put dl_stid if fail to queue dl_recall") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-04-13 16:39:42 -04:00
Chuck Lever	26a8076215	NFSD: Add a Kconfig setting to enable delegated timestamps After three tries, we still see test failures with delegated timestamps. Disable them by default, but leave the implementation intact so that development can continue. Cc: stable@vger.kernel.org # v6.14 Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-14 10:49:47 -04:00
Jeff Layton	261e3bbf97	nfsd: use a long for the count in nfsd4_state_shrinker_count() If there are no courtesy clients then the return value from the atomic_long_read() could overflow an int. Use a long to store the value instead. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:12 -04:00
Jeff Layton	387625808c	nfsd: remove obsolete comment from nfs4_alloc_stid idr_alloc_cyclic() is what guarantees that now, not this long-gone trick. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:12 -04:00
Jeff Layton	49bdbdb11f	nfsd: replace CB_GETATTR_BUSY with NFSD4_CALLBACK_RUNNING These flags serve essentially the same purpose and get set and cleared at the same time. Drop CB_GETATTR_BUSY and just use NFSD4_CALLBACK_RUNNING instead. For this to work, we must use clear_and_wake_up_bit(), but doing that on for other types of callbacks is wasteful. Declare a new NFSD4_CALLBACK_WAKE flag in cb_flags to indicate that wake_up is needed, and only set that for CB_GETATTRs. Also, make the wait use a TASK_UNINTERRUPTIBLE sleep. This is done in the context of an nfsd thread, and it should never need to deal with signals. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:10 -04:00
Jeff Layton	424dd3df1f	nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY deleg_reaper() will walk the client_lru list and put any suitable entries onto "cblist" using the cl_ra_cblist pointer. It then walks the objects outside the spinlock and queues callbacks for them. None of the operations that deleg_reaper() does outside the nn->client_lock are blocking operations. Just queue their workqueue jobs under the nn->client_lock instead. Also, the NFSD4_CLIENT_CB_RECALL_ANY and NFSD4_CALLBACK_RUNNING flags serve an identical purpose now. Drop the NFSD4_CLIENT_CB_RECALL_ANY flag and just use the one in the callback. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:09 -04:00
Jeff Layton	1054e8ffc5	nfsd: prevent callback tasks running concurrently The nfsd4_callback workqueue jobs exist to queue backchannel RPCs to rpciod. Because they run in different workqueue contexts, the rpc_task can run concurrently with the workqueue job itself, should it become requeued. This is problematic as there is no locking when accessing the fields in the nfsd4_callback. Add a new unsigned long to nfsd4_callback and declare a new NFSD4_CALLBACK_RUNNING flag to be set in it. When attempting to run a workqueue job, do a test_and_set_bit() on that flag first, and don't queue the workqueue job if it returns true. Clear NFSD4_CALLBACK_RUNNING in nfsd41_destroy_cb(). This also gives us a more reliable mechanism for handling queueing failures in codepaths where we have to take references under spinlocks. We can now do the test_and_set_bit on NFSD4_CALLBACK_RUNNING first, and only take references to the objects if that returns false. Most of the nfsd4_run_cb() callers are converted to use this new flag or the nfsd4_try_run_cb() wrapper. The main exception is the callback channel probe, which has its own synchronization. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:09 -04:00
Mike Snitzer	9254c8ae9b	nfsd: disallow file locking and delegations for NFSv4 reexport We do not and cannot support file locking with NFS reexport over NFSv4.x for the same reason we don't do it for NFSv3: NFS reexport server reboot cannot allow clients to recover locks because the source NFS server has not rebooted, and so it is not in grace. Since the source NFS server is not in grace, it cannot offer any guarantees that the file won't have been changed between the locks getting lost and any attempt to recover/reclaim them. The same applies to delegations and any associated locks, so disallow them too. Clients are no longer allowed to get file locks or delegations from a reexport server, any attempts will fail with operation not supported. Update the "Reboot recovery" section accordingly in Documentation/filesystems/nfs/reexport.rst Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:08 -04:00
Li Lingfeng	230ca75845	nfsd: put dl_stid if fail to queue dl_recall Before calling nfsd4_run_cb to queue dl_recall to the callback_wq, we increment the reference count of dl_stid. We expect that after the corresponding work_struct is processed, the reference count of dl_stid will be decremented through the callback function nfsd4_cb_recall_release. However, if the call to nfsd4_run_cb fails, the incremented reference count of dl_stid will not be decremented correspondingly, leading to the following nfs4_stid leak: unreferenced object 0xffff88812067b578 (size 344): comm "nfsd", pid 2761, jiffies 4295044002 (age 5541.241s) hex dump (first 32 bytes): 01 00 00 00 6b 6b 6b 6b b8 02 c0 e2 81 88 ff ff ....kkkk........ 00 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 ad 4e ad de .kkkkkkk.....N.. backtrace: kmem_cache_alloc+0x4b9/0x700 nfsd4_process_open1+0x34/0x300 nfsd4_open+0x2d1/0x9d0 nfsd4_proc_compound+0x7a2/0xe30 nfsd_dispatch+0x241/0x3e0 svc_process_common+0x5d3/0xcc0 svc_process+0x2a3/0x320 nfsd+0x180/0x2e0 kthread+0x199/0x1d0 ret_from_fork+0x30/0x50 ret_from_fork_asm+0x1b/0x30 unreferenced object 0xffff8881499f4d28 (size 368): comm "nfsd", pid 2761, jiffies 4295044005 (age 5541.239s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 30 4d 9f 49 81 88 ff ff ........0M.I.... 30 4d 9f 49 81 88 ff ff 20 00 00 00 01 00 00 00 0M.I.... ....... backtrace: kmem_cache_alloc+0x4b9/0x700 nfs4_alloc_stid+0x29/0x210 alloc_init_deleg+0x92/0x2e0 nfs4_set_delegation+0x284/0xc00 nfs4_open_delegation+0x216/0x3f0 nfsd4_process_open2+0x2b3/0xee0 nfsd4_open+0x770/0x9d0 nfsd4_proc_compound+0x7a2/0xe30 nfsd_dispatch+0x241/0x3e0 svc_process_common+0x5d3/0xcc0 svc_process+0x2a3/0x320 nfsd+0x180/0x2e0 kthread+0x199/0x1d0 ret_from_fork+0x30/0x50 ret_from_fork_asm+0x1b/0x30 Fix it by checking the result of nfsd4_run_cb and call nfs4_put_stid if fail to queue dl_recall. Cc: stable@vger.kernel.org Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:00 -04:00
Jeff Layton	d1bc15b147	nfsd: allow SC_STATUS_FREEABLE when searching via nfs4_lookup_stateid() The pynfs DELEG8 test fails when run against nfsd. It acquires a delegation and then lets the lease time out. It then tries to use the deleg stateid and expects to see NFS4ERR_DELEG_REVOKED, but it gets bad NFS4ERR_BAD_STATEID instead. When a delegation is revoked, it's initially marked with SC_STATUS_REVOKED, or SC_STATUS_ADMIN_REVOKED and later, it's marked with the SC_STATUS_FREEABLE flag, which denotes that it is waiting for s FREE_STATEID call. nfs4_lookup_stateid() accepts a statusmask that includes the status flags that a found stateid is allowed to have. Currently, that mask never includes SC_STATUS_FREEABLE, which means that revoked delegations are (almost) never found. Add SC_STATUS_FREEABLE to the always-allowed status flags, and remove it from nfsd4_delegreturn() since it's now always implied. Fixes: 8dd91e8d31fe ("nfsd: fix race between laundromat and free_stateid") Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:11:00 -04:00
Chuck Lever	8a388c1fab	NFSD: Skip sending CB_RECALL_ANY when the backchannel isn't up NFSD sends CB_RECALL_ANY to clients when the server is low on memory or that client has a large number of delegations outstanding. We've seen cases where NFSD attempts to send CB_RECALL_ANY requests to disconnected clients, and gets confused. These calls never go anywhere if a backchannel transport to the target client isn't available. Before the server can send any backchannel operation, the client has to connect first and then do a BIND_CONN_TO_SESSION. This patch doesn't address the root cause of the confusion, but there's no need to queue up these optional operations if they can't go anywhere. Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition") Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:10:55 -04:00
Olga Kornievskaia	45de52d034	nfsd: adjust WARN_ON_ONCE in revoke_delegation A WARN_ON_ONCE() is added to revoke delegations to make sure that the state has been marked for revocation. However, that's only true for 4.1+ stateids. For 4.0 stateids, in unhash_delegation_locked() the sc_status is set to SC_STATUS_CLOSED. Modify the check to reflect it, otherwise a WARN_ON_ONCE is erronously triggered. Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-03-10 09:10:55 -04:00
NeilBrown	5fb2516121	nfsd: fix uninitialised slot info when a request is retried A recent patch moved the assignment of seq->maxslots from before the test for a resent request (which ends with a goto) to after, resulting in it not being run in that case. This results in the server returning bogus "high slot id" and "target high slot id" values. The assignments to ->maxslots and ->target_maxslots need to be after the out: label so that the correct values are returned in replies to requests that are served from cache. Fixes: 60aa6564317d ("nfsd: allocate new session-based DRC slots on demand.") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-02-10 13:30:55 -05:00
Jeff Layton	d3edfd9ed1	nfsd: implement OPEN_ARGS_SHARE_ACCESS_WANT_OPEN_XOR_DELEGATION Allow clients to request getting a delegation xor an open stateid if a delegation isn't available. This allows the client to avoid sending a final CLOSE for the (useless) open stateid, when it is granted a delegation. If this flag is requested by the client and there isn't already a new open stateid, discard the new open stateid before replying. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-21 15:30:01 -05:00
Jeff Layton	7e13f4f8d2	nfsd: handle delegated timestamps in SETATTR Allow SETATTR to handle delegated timestamps. This patch assumes that only the delegation holder has the ability to set the timestamps in this way, so we allow this only if the SETATTR stateid refers to a *_ATTRS_DELEG delegation. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-21 15:30:01 -05:00
Jeff Layton	6ae30d6eb2	nfsd: add support for delegated timestamps Add support for the delegated timestamps on write delegations. This allows the server to proxy timestamps from the delegation holder to other clients that are doing GETATTRs vs. the same inode. When OPEN4_SHARE_ACCESS_WANT_DELEG_TIMESTAMPS bit is set in the OPEN call, set the dl_type to the *_ATTRS_DELEG flavor of delegation. Add timespec64 fields to nfs4_cb_fattr and decode the timestamps into those. Vet those timestamps according to the delstid spec and update the inode attrs if necessary. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-21 15:30:01 -05:00
Jeff Layton	cee9b4ef42	nfsd: rework NFS4_SHARE_WANT_* flag handling The delstid draft adds new NFS4_SHARE_WANT_TYPE_MASK values that don't fit neatly into the existing WANT_MASK or WHEN_MASK. Add a new NFS4_SHARE_WANT_MOD_MASK value and redefine NFS4_SHARE_WANT_MASK to include it. Also fix the checks in nfsd4_deleg_xgrade_none_ext() to check for the flags instead of equality, since there may be modifier flags in the value. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-21 15:30:01 -05:00
Jeff Layton	fbd5573d0d	nfsd: prepare delegation code for handing out _ATTRS_DELEG delegations Add some preparatory code to various functions that handle delegation types to allow them to handle the OPEN_DELEGATE__ATTRS_DELEG constants. Add helpers for detecting whether it's a read or write deleg, and whether the attributes are delegated. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-21 15:30:01 -05:00
Jeff Layton	c9c99a33e2	nfsd: rename NFS4_SHARE_WANT_* constants to OPEN4_SHARE_ACCESS_WANT_* Add the OPEN4_SHARE_ACCESS_WANT constants from the nfs4.1 and delstid draft into the nfs4_1.x file, and regenerate the headers and source files. Do a mass renaming of NFS4_SHARE_WANT_* to OPEN4_SHARE_ACCESS_WANT_* in the nfsd directory. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-21 15:30:01 -05:00
Jeff Layton	8dfbea8bde	nfsd: switch to autogenerated definitions for open_delegation_type4 Rename the enum with the same name in include/linux/nfs4.h, add the proper enum to nfs4_1.x and regenerate the headers and source files. Do a mass rename of all NFS4_OPEN_DELEGATE_* to OPEN_DELEGATE_* in the nfsd directory. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-21 15:30:01 -05:00
NeilBrown	35e34642b5	nfsd: add shrinker to reduce number of slots allocated per session Add a shrinker which frees unused slots and may ask the clients to use fewer slots on each session. We keep a global count of the number of freeable slots, which is the sum of one less than the current "target" slots in all sessions in all clients in all net-namespaces. This number is reported by the shrinker. When the shrinker is asked to free some, we call xxx on each session in a round-robin asking each to reduce the slot count by 1. This will reduce the "target" so the number reported by the shrinker will reduce immediately. The memory will only be freed later when the client confirmed that it is no longer needed. We use a global list of sessions and move the "head" to after the last session that we asked to reduce, so the next callback from the shrinker will move on to the next session. This pressure should be applied "evenly" across all sessions over time. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-06 09:37:39 -05:00
NeilBrown	fc8738c68d	nfsd: add support for freeing unused session-DRC slots Reducing the number of slots in the session slot table requires confirmation from the client. This patch adds reduce_session_slots() which starts the process of getting confirmation, but never calls it. That will come in a later patch. Before we can free a slot we need to confirm that the client won't try to use it again. This involves returning a lower cr_maxrequests in a SEQUENCE reply and then seeing a ca_maxrequests on the same slot which is not larger than we limit we are trying to impose. So for each slot we need to remember that we have sent a reduced cr_maxrequests. To achieve this we introduce a concept of request "generations". Each time we decide to reduce cr_maxrequests we increment the generation number, and record this when we return the lower cr_maxrequests to the client. When a slot with the current generation reports a low ca_maxrequests, we commit to that level and free extra slots. We use an 16 bit generation number (64 seems wasteful) and if it cycles we iterate all slots and reset the generation number to avoid false matches. When we free a slot we store the seqid in the slot pointer so that it can be restored when we reactivate the slot. The RFC can be read as suggesting that the slot number could restart from one after a slot is retired and reactivated, but also suggests that retiring slots is not required. So when we reactive a slot we accept with the next seqid in sequence, or 1. When decoding sa_highest_slotid into maxslots we need to add 1 - this matches how it is encoded for the reply. se_dead is moved in struct nfsd4_session to remove a hole. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-06 09:37:38 -05:00
NeilBrown	60aa656431	nfsd: allocate new session-based DRC slots on demand. If a client ever uses the highest available slot for a given session, attempt to allocate more slots so there is room for the client to use them if wanted. GFP_NOWAIT is used so if there is not plenty of free memory, failure is expected - which is what we want. It also allows the allocation while holding a spinlock. Each time we increase the number of slots by 20% (rounded up). This allows fairly quick growth while avoiding excessive over-shoot. We would expect to stablise with around 10% more slots available than the client actually uses. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-06 09:37:38 -05:00
NeilBrown	601c8cb349	nfsd: add session slot count to /proc/fs/nfsd/clients/*/info Each client now reports the number of slots allocated in each session. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-06 09:37:38 -05:00
NeilBrown	b5fba969a2	nfsd: remove artificial limits on the session-based DRC Rather than guessing how much space it might be safe to use for the DRC, simply try allocating slots and be prepared to accept failure. The first slot for each session is allocated with GFP_KERNEL which is unlikely to fail. Subsequent slots are allocated with the addition of __GFP_NORETRY which is expected to fail if there isn't much free memory. This is probably too aggressive but clears the way for adding a shrinker interface to free extra slots when memory is tight. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-06 09:37:37 -05:00
NeilBrown	0b6e142426	nfsd: use an xarray to store v4.1 session slots Using an xarray to store session slots will make it easier to change the number of active slots based on demand, and removes an unnecessary limit. To achieve good throughput with a high-latency server it can be helpful to have hundreds of concurrent writes, which means hundreds of slots. So increase the limit to 2048 (twice what the Linux client will currently use). This limit is only a sanity check, not a hard limit. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2025-01-06 09:37:37 -05:00

1 2 3 4 5 ...

1373 Commits