Age | Commit message (Collapse) | Author |
|
Improve the efficiency of buffered reads in a number of ways:
(1) Overhaul the algorithm in general so that it's a lot more compact and
split the read submission code between buffered and unbuffered
versions. The unbuffered version can be vastly simplified.
(2) Read-result collection is handed off to a work queue rather than being
done in the I/O thread. Multiple subrequests can be processes
simultaneously.
(3) When a subrequest is collected, any folios it fully spans are
collected and "spare" data on either side is donated to either the
previous or the next subrequest in the sequence.
Notes:
(*) Readahead expansion is massively slows down fio, presumably because it
causes a load of extra allocations, both folio and xarray, up front
before RPC requests can be transmitted.
(*) RDMA with cifs does appear to work, both with SIW and RXE.
(*) PG_private_2-based reading and copy-to-cache is split out into its own
file and altered to use folio_queue. Note that the copy to the cache
now creates a new write transaction against the cache and adds the
folios to be copied into it. This allows it to use part of the
writeback I/O code.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-20-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Make the netfs write-side routines use the new folio_queue struct to hold a
rolling buffer of folios, with the issuer adding folios at the tail and the
collector removing them from the head as they're processed instead of using
an xarray.
This will allow a subsequent patch to simplify the write collector.
The primary mark (as tested by folioq_is_marked()) is used to note if the
corresponding folio needs putting.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-16-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Record statistics for contention upon the writeback serialisation lock that
prevents racing writeback calls from causing each other to interleave their
writebacks. These can be viewed in /proc/fs/netfs/stats on the WbLock line,
with skip=N indicating the number of non-SYNC writebacks skipped and wait=N
indicating the number of SYNC writebacks that waited.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-5-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Adjust the labels in /proc/fs/netfs/stats that refer to netfs-specific
counters. These currently all begin with "Netfs", but change them to begin
with more specific labels.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240814203850.2240469-4-dhowells@redhat.com/ # v2
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add some write-side stats to count buffered writes, buffered writethrough,
and writepages calls.
Whilst we're at it, clean up the naming on some of the existing stats
counters and organise the output into two sets.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
|
|
An issue can occur between write-streaming (storing dirty data in partial
non-uptodate pages) and a cachefiles object being culled to make space.
The problem occurs because the cache object is only marked in use while
there are files open using it. Once it has been released, it can be culled
and the cookie marked disabled.
At this point, a streaming write is permitted to occur (if the cache is
active, we require pages to be prefetched and cached), but the cache can
become active again before this gets flushed out - and then two effects can
occur:
(1) The cache may be asked to write out a region that's less than its DIO
block size (assumed by cachefiles to be PAGE_SIZE) - and this causes
one of two debugging statements to be emitted.
(2) netfs_how_to_modify() gets confused because it sees a page that isn't
allowed to be non-uptodate being uptodate and tries to prefetch it -
leading to a warning that PG_fscache is set twice.
Fix this by the following means:
(1) Add a netfs_inode flag to disallow write-streaming to an inode and set
it if we ever do local caching of that inode. It remains set for the
lifetime of that inode - even if the cookie becomes disabled.
(2) If the no-write-streaming flag is set, then make netfs_how_to_modify()
always want to prefetch instead.
(3) If netfs_how_to_modify() decides it wants to prefetch a folio, but
that folio has write-streamed data in it, then it requires the folio
be flushed first.
(4) Export a counter of the number of times we wanted to prefetch a
non-uptodate page, but found it had write-streamed data in it.
(5) Export a counter of the number of times we cancelled a write to the
cache because it didn't DIO align and remove the debug statements.
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
cc: linux-erofs@lists.ozlabs.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
|
|
Provide a counter for DIO writes to match that for DIO reads.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
|
|
Implement support for unbuffered and DIO reads in the netfs library,
utilising the existing read helper code to do block splitting and
individual queuing. The code also handles extraction of the destination
buffer from the supplied iterator, allowing async unbuffered reads to take
place.
The read will be split up according to the rsize setting and, if supplied,
the ->clamp_length() method. Note that the next subrequest will be issued
as soon as issue_op returns, without waiting for previous ones to finish.
The network filesystem needs to pause or handle queuing them if it doesn't
want to fire them all at the server simultaneously.
Once all the subrequests have finished, the state will be assessed and the
amount of data to be indicated as having being obtained will be
determined. As the subrequests may finish in any order, if an intermediate
subrequest is short, any further subrequests may be copied into the buffer
and then abandoned.
In the future, this will also take care of doing an unbuffered read from
encrypted content, with the decryption being done by the library.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
|
|
Modify the netfs_io_request struct to act as a point around which writes
can be coordinated. It represents and pins a range of pages that need
writing and a list of regions of dirty data in that range of pages.
If RMW is required, the original data can be downloaded into the bounce
buffer, decrypted if necessary, the modifications made, then the modified
data can be reencrypted/recompressed and sent back to the server.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
|
|
Rename /proc/fs/fscache to "netfs" and make a symlink from fscache to that.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: Christian Brauner <christian@brauner.io>
cc: linux-fsdevel@vger.kernel.org
cc: linux-cachefs@redhat.com
|
|
Add a netfs_i_context struct that should be included in the network
filesystem's own inode struct wrapper, directly after the VFS's inode
struct, e.g.:
struct my_inode {
struct {
/* These must be contiguous */
struct inode vfs_inode;
struct netfs_i_context netfs_ctx;
};
};
The netfs_i_context struct so far contains a single field for the network
filesystem to use - the cache cookie:
struct netfs_i_context {
...
struct fscache_cookie *cache;
};
Three functions are provided to help with this:
(1) void netfs_i_context_init(struct inode *inode,
const struct netfs_request_ops *ops);
Initialise the netfs context and set the operations.
(2) struct netfs_i_context *netfs_i_context(struct inode *inode);
Find the netfs context from the VFS inode.
(3) struct inode *netfs_inode(struct netfs_i_context *ctx);
Find the VFS inode from the netfs context.
Changes
=======
ver #4)
- Fix netfs_is_cache_enabled() to check cookie->cache_priv to see if a
cache is present[3].
- Fix netfs_skip_folio_read() to zero out all of the page, not just some
of it[3].
ver #3)
- Split out the bit to move ceph cap-getting on readahead into
ceph_init_request()[1].
- Stick in a comment to the netfs inode structs indicating the contiguity
requirements[2].
ver #2)
- Adjust documentation to match.
- Use "#if IS_ENABLED()" in netfs_i_cookie(), not "#ifdef".
- Move the cap check from ceph_readahead() to ceph_init_request() to be
called from netfslib.
- Remove ceph_readahead() and use netfs_readahead() directly instead.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/8af0d47f17d89c06bbf602496dd845f2b0bf25b3.camel@kernel.org/ [1]
Link: https://lore.kernel.org/r/beaf4f6a6c2575ed489adb14b257253c868f9a5c.camel@kernel.org/ [2]
Link: https://lore.kernel.org/r/3536452.1647421585@warthog.procyon.org.uk/ [3]
Link: https://lore.kernel.org/r/164622984545.3564931.15691742939278418580.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/164678213320.1200972.16807551936267647470.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/164692909854.2099075.9535537286264248057.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/306388.1647595110@warthog.procyon.org.uk/ # v4
|
|
Add a helper to do the pre-reading work for the netfs write_begin address
space op.
Changes
v6:
- Fixed a missing rreq put in netfs_write_begin()[3].
- Use DEFINE_READAHEAD()[4].
v5:
- Made the wait for PG_fscache in netfs_write_begin() killable[2].
v4:
- Added flag to netfs_subreq_terminated() to indicate that the caller may
have been running async and stuff that might sleep needs punting to a
workqueue (can't use in_softirq()[1]).
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-tested-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
Tested-By: Marc Dionne <marc.dionne@auristor.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-mm@kvack.org
cc: linux-cachefs@redhat.com
cc: linux-afs@lists.infradead.org
cc: linux-nfs@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20210216084230.GA23669@lst.de/ [1]
Link: https://lore.kernel.org/r/2499407.1616505440@warthog.procyon.org.uk/ [2]
Link: https://lore.kernel.org/r/161781042127.463527.9154479794406046987.stgit@warthog.procyon.org.uk/ [3]
Link: https://lore.kernel.org/r/1234933.1617886271@warthog.procyon.org.uk/ [4]
Link: https://lore.kernel.org/r/160588543960.3465195.2792938973035886168.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/161118140165.1232039.16418853874312234477.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/161161035539.2537118.15674887534950908530.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161340398368.1303470.11242918276563276090.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/161539541541.286939.1889738674057013729.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/161653798616.2770958.17213315845968485563.stgit@warthog.procyon.org.uk/ # v5
Link: https://lore.kernel.org/r/161789080530.6155.1011847312392330491.stgit@warthog.procyon.org.uk/ # v6
|
|
Gather statistics from the netfs interface that can be exported through a
seqfile. This is intended to be called by a later patch when viewing
/proc/fs/fscache/stats.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-tested-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Dave Wysochanski <dwysocha@redhat.com>
Tested-By: Marc Dionne <marc.dionne@auristor.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-mm@kvack.org
cc: linux-cachefs@redhat.com
cc: linux-afs@lists.infradead.org
cc: linux-nfs@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: ceph-devel@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/161118139247.1232039.10556850937548511068.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/161161034669.2537118.2761232524997091480.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161340397101.1303470.17581910581108378458.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/161539539959.286939.6794352576462965914.stgit@warthog.procyon.org.uk/ # v4
Link: https://lore.kernel.org/r/161653797700.2770958.5801990354413178228.stgit@warthog.procyon.org.uk/ # v5
Link: https://lore.kernel.org/r/161789079281.6155.17141344853277186500.stgit@warthog.procyon.org.uk/ # v6
|