From 2d13e6ca429c0a6fbc82750acbece829facceec5 Mon Sep 17 00:00:00 2001
From: Noam Camus
Date: Tue, 11 Oct 2016 13:51:35 -0700
Subject: lib/bitmap.c: enhance bitmap syntax

Today there are platforms with many CPUs (up to 4K). Trying to boot only
part of the CPUs may result in too long a string.

For example, take the NPS platform that is part of arch/arc. This
platform has an SMP system with 256 cores, each with 16 HW threads (SMT
machine), where each HW thread appears as a CPU to the kernel. In this
example there is a total of 4K CPUs. When one tries to boot only part of
the HW threads from each core, the string representing the map may be
long. For example, if for the sake of performance we decide to boot only
the first half of the HW threads of each core, the map will look like:
0-7,16-23,32-39,...,4080-4087

This patch introduces a new syntax to accommodate such a use case. I added
an optional postfix to a range of CPUs which selects, according to a given
modulo, the desired range of remainders, i.e.:
:used_size/group_size

For example, the above map can be described in the new syntax like this:
0-4095:8/16

Note that this patch is backward compatible with the current syntax.

[akpm@linux-foundation.org: rework documentation]
Link: http://lkml.kernel.org/r/1473579629-4283-1-git-send-email-noamca@mellanox.com
Signed-off-by: Noam Camus
Cc: David Decotigny
Cc: Ben Hutchings
Cc: David S. Miller
Cc: Pan Xinhui
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/kernel-parameters.txt | 50 ++++++++++++++++++++++++++-----------
 1 file changed, 36 insertions(+), 14 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 705fb915cbf7..a1489e14f8ee 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -33,6 +33,37 @@ can also be entered as
 Double-quotes can be used to protect spaces in values, e.g.:
 	param="spaces in here"
 
+cpu lists:
+----------
+
+Some kernel parameters take a list of CPUs as a value, e.g. isolcpus,
+nohz_full, irqaffinity, rcu_nocbs. The format of this list is:
+
+	<cpu number>,...,<cpu number>
+
+or
+
+	<cpu number>-<cpu number>
+	(must be a positive range in ascending order)
+
+or a mixture
+
+<cpu number>,...,<cpu number>-<cpu number>
+
+Note that for the special case of a range one can split the range into equal
+sized groups and for each group use some amount from the beginning of that
+group:
+
+	<cpu number>-<cpu number>:<used size>/<group size>
+
+For example one can add to the command line following parameter:
+
+	isolcpus=1,2,10-20,100-2000:2/25
+
+where the final item represents CPUs 100,101,125,126,150,151,...
+
+
+
 This document may not be entirely up to date and comprehensive. The command
 "modinfo -p ${modulename}" shows a current list of all parameters of a loadable
 module. Loadable modules, after being loaded into the running kernel, also
@@ -1789,13 +1820,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			See Documentation/filesystems/nfs/nfsroot.txt.
 
 	irqaffinity=	[SMP] Set the default irq affinity mask
-			Format:
-			<cpu number>,...,<cpu number>
-			or
-			<cpu number>-<cpu number>
-			(must be a positive range in ascending order)
-			or a mixture
-			<cpu number>,...,<cpu number>-<cpu number>
+			The argument is a cpu list, as described above.
 
 	irqfixup	[HW]
 			When an interrupt is not handled search all handlers
@@ -1812,13 +1837,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			Format: ,,,
 
 	isolcpus=	[KNL,SMP] Isolate CPUs from the general scheduler.
-			Format:
-			<cpu number>,...,<cpu number>
-			or
-			<cpu number>-<cpu number>
-			(must be a positive range in ascending order)
-			or a mixture
-			<cpu number>,...,<cpu number>-<cpu number>
+			The argument is a cpu list, as described above.
 
 			This option can be used to specify one or more CPUs
 			to isolate from the general SMP balancing and scheduling
@@ -2680,6 +2699,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			Default: on
 
 	nohz_full=	[KNL,BOOT]
+			The argument is a cpu list, as described above.
 			In kernels built with CONFIG_NO_HZ_FULL=y, set
 			the specified list of CPUs whose tick will be stopped
 			whenever possible. The boot CPU will be forced outside
@@ -3285,6 +3305,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			See Documentation/blockdev/ramdisk.txt.
 
 	rcu_nocbs=	[KNL]
+			The argument is a cpu list, as described above.
+
 			In kernels built with CONFIG_RCU_NOCB_CPU=y, set
 			the specified list of CPUs to be no-callback CPUs.
 			Invocation of these CPUs' RCU callbacks will
--
cgit v1.2.3

From e662145f5c1276d35e8955b3df7a68da306ee498 Mon Sep 17 00:00:00 2001
From: Tomohiro Kusumi
Date: Tue, 11 Oct 2016 13:52:25 -0700
Subject: autofs: fix typos in Documentation/filesystems/autofs4.txt

plus minor whitespace fixes.

Link: http://lkml.kernel.org/r/20160812024734.12352.17122.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi
Signed-off-by: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/filesystems/autofs4.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/filesystems/autofs4.txt b/Documentation/filesystems/autofs4.txt
index 39d02e19fb62..8fac3fe7b8c9 100644
--- a/Documentation/filesystems/autofs4.txt
+++ b/Documentation/filesystems/autofs4.txt
@@ -203,9 +203,9 @@ initiated or is being considered, otherwise it returns 0.
 Mountpoint expiry
 -----------------
 
-The VFS has a mechansim for automatically expiring unused mounts,
+The VFS has a mechanism for automatically expiring unused mounts,
 much as it can expire any unused dentry information from the dcache.
-This is guided by the MNT_SHRINKABLE flag. This only applies to
+This is guided by the MNT_SHRINKABLE flag. This only applies to
 mounts that were created by `d_automount()` returning a filesystem to be
 mounted. As autofs doesn't return such a filesystem but leaves the
 mounting to the automount daemon, it must involve the automount daemon
@@ -298,7 +298,7 @@ remove directories and symlinks using normal filesystem operations.
 autofs knows whether a process requesting some operation is the daemon
 or not based on its process-group id number (see getpgid(1)).
 
-When an autofs filesystem it mounted the pgid of the mounting
+When an autofs filesystem is mounted the pgid of the mounting
 processes is recorded unless the "pgrp=" option is given, in which case
 that number is recorded instead. Any request arriving from a process
 in that process group is considered to come from the daemon.
@@ -450,7 +450,7 @@ Commands are:
     numbers for existing filesystems can be found in
     `/proc/self/mountinfo`.
 - **AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD**: same as `close(ioctlfd)`.
-- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in
+- **AUTOFS_DEV_IOCTL_SETPIPEFD_CMD**: if the filesystem is in
     catatonic mode, this can provide the write end of a new pipe
     in `arg1` to re-establish communication with a daemon. The
     process group of the calling process is used to identify the
--
cgit v1.2.3

From d873284103dacbe90ace2f3de20dff02fafcfef0 Mon Sep 17 00:00:00 2001
From: Tomohiro Kusumi
Date: Tue, 11 Oct 2016 13:52:53 -0700
Subject: autofs: fix Documentation regarding devid on ioctl

The explanation on how ioctl handles devid seems incorrect. Userspace who
calls this ioctl has no input regarding devid, and ioctl implementation
retrieves devid via superblock.

Link: http://lkml.kernel.org/r/20160812024825.12352.13486.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi
Signed-off-by: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/filesystems/autofs4-mount-control.txt | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt
index aff22113a986..540d9a7e252d 100644
--- a/Documentation/filesystems/autofs4-mount-control.txt
+++ b/Documentation/filesystems/autofs4-mount-control.txt
@@ -323,9 +323,8 @@ mount on the given path dentry.
 
 The call requires an initialized struct autofs_dev_ioctl with the path
 field set to the mount point in question and the size field adjusted
-appropriately as well as the arg1 field set to the device number of the
-containing autofs mount. Upon return the struct field arg1 contains the
-uid and arg2 the gid.
+appropriately. Upon return the struct field arg1 contains the uid and
+arg2 the gid.
 
 When reconstructing an autofs mount tree with active mounts we need to
 re-connect to mounts that may have used the original process uid and
--
cgit v1.2.3

From bf72eda5f9c593495127a34d3444b2ec5939e837 Mon Sep 17 00:00:00 2001
From: Tomohiro Kusumi
Date: Tue, 11 Oct 2016 13:52:56 -0700
Subject: autofs: update struct autofs_dev_ioctl in Documentation

Sync with changes made by commit 730c9eeca980 ("autofs4: improve parameter
usage") which introduced an union for various ioctl commands instead of
having statically named arg1,2.

This commit simply replaces arg1,2 with the corresponding fields without
changing semantics.

Link: http://lkml.kernel.org/r/20160812024831.12352.24667.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi
Signed-off-by: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 .../filesystems/autofs4-mount-control.txt | 70 +++++++++++++---------
 1 file changed, 42 insertions(+), 28 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt
index 540d9a7e252d..50a3e01a36f8 100644
--- a/Documentation/filesystems/autofs4-mount-control.txt
+++ b/Documentation/filesystems/autofs4-mount-control.txt
@@ -179,8 +179,19 @@ struct autofs_dev_ioctl {
 				 * including this struct */
 	__s32 ioctlfd;		/* automount command fd */
 
-	__u32 arg1;		/* Command parameters */
-	__u32 arg2;
+	union {
+		struct args_protover protover;
+		struct args_protosubver protosubver;
+		struct args_openmount openmount;
+		struct args_ready ready;
+		struct args_fail fail;
+		struct args_setpipefd setpipefd;
+		struct args_timeout timeout;
+		struct args_requester requester;
+		struct args_expire expire;
+		struct args_askumount askumount;
+		struct args_ismountpoint ismountpoint;
+	};
 
 	char path[0];
 };
@@ -192,8 +203,8 @@ optionally be used to check a specific mount corresponding to a given
 mount point file descriptor, and when requesting the uid and gid of the
 last successful mount on a directory within the autofs file system.
 
-The fields arg1 and arg2 are used to communicate parameters and results of
-calls made as described below.
+The union is used to communicate parameters and results of calls made
+as described below.
 
 The path field is used to pass a path where it is needed and the size field
 is used account for the increased structure length when translating the
@@ -245,9 +256,9 @@ AUTOFS_DEV_IOCTL_PROTOVER_CMD and AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD
 Get the major and minor version of the autofs4 protocol version understood
 by loaded module. This call requires an initialized struct autofs_dev_ioctl
 with the ioctlfd field set to a valid autofs mount point descriptor
-and sets the requested version number in structure field arg1. These
-commands return 0 on success or one of the negative error codes if
-validation fails.
+and sets the requested version number in version field of struct args_protover
+or sub_version field of struct args_protosubver. These commands return
+0 on success or one of the negative error codes if validation fails.
 
 
 AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT
@@ -256,9 +267,9 @@ AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT
 Obtain and release a file descriptor for an autofs managed mount point
 path. The open call requires an initialized struct autofs_dev_ioctl with
 the path field set and the size field adjusted appropriately as well
-as the arg1 field set to the device number of the autofs mount. The
-device number can be obtained from the mount options shown in
-/proc/mounts. The close call requires an initialized struct
+as the devid field of struct args_openmount set to the device number of
+the autofs mount. The device number can be obtained from the mount options
+shown in /proc/mounts. The close call requires an initialized struct
 autofs_dev_ioct with the ioctlfd field set to the descriptor obtained
 from the open call. The release of the file descriptor can also be done
 with close(2) so any open descriptors will also be closed at process exit.
@@ -272,10 +283,10 @@ AUTOFS_DEV_IOCTL_READY_CMD and AUTOFS_DEV_IOCTL_FAIL_CMD
 Return mount and expire result status from user space to the kernel.
 Both of these calls require an initialized struct autofs_dev_ioctl
 with the ioctlfd field set to the descriptor obtained from the open
-call and the arg1 field set to the wait queue token number, received
-by user space in the foregoing mount or expire request. The arg2 field
-is set to the status to be returned. For the ready call this is always
-0 and for the fail call it is set to the errno of the operation.
+call and the token field of struct args_ready or struct args_fail set
+to the wait queue token number, received by user space in the foregoing
+mount or expire request. The status field of struct args_fail is set to
+the errno of the operation. It is set to 0 on success.
 
 
 AUTOFS_DEV_IOCTL_SETPIPEFD_CMD
@@ -290,9 +301,10 @@ mount be catatonic (see next call).
 
 The call requires an initialized struct autofs_dev_ioctl with the
 ioctlfd field set to the descriptor obtained from the open call and
-the arg1 field set to descriptor of the pipe. On success the call
-also sets the process group id used to identify the controlling process
-(eg. the owning automount(8) daemon) to the process group of the caller.
+the pipefd field of struct args_setpipefd set to descriptor of the pipe.
+On success the call also sets the process group id used to identify the
+controlling process (eg. the owning automount(8) daemon) to the process
+group of the caller.
 
 
 AUTOFS_DEV_IOCTL_CATATONIC_CMD
@@ -323,8 +335,8 @@ mount on the given path dentry.
 
 The call requires an initialized struct autofs_dev_ioctl with the path
 field set to the mount point in question and the size field adjusted
-appropriately. Upon return the struct field arg1 contains the uid and
-arg2 the gid.
+appropriately. Upon return the uid field of struct args_requester contains
+the uid and gid field the gid.
 
 When reconstructing an autofs mount tree with active mounts we need to
 re-connect to mounts that may have used the original process uid and
@@ -342,8 +354,9 @@ this ioctl is called until no further expire candidates are found.
 The call requires an initialized struct autofs_dev_ioctl with the
 ioctlfd field set to the descriptor obtained from the open call. In
 addition an immediate expire, independent of the mount timeout, can be
-requested by setting the arg1 field to 1. If no expire candidates can
-be found the ioctl returns -1 with errno set to EAGAIN.
+requested by setting the how field of struct args_expire to 1. If no
+expire candidates can be found the ioctl returns -1 with errno set to
+EAGAIN.
 
 This call causes the kernel module to check the mount corresponding
 to the given ioctlfd for mounts that can be expired, issues an expire
@@ -356,7 +369,8 @@ Checks if an autofs mount point is in use.
 
 The call requires an initialized struct autofs_dev_ioctl with the
 ioctlfd field set to the descriptor obtained from the open call and
-it returns the result in the arg1 field, 1 for busy and 0 otherwise.
+it returns the result in the may_umount field of struct args_askumount,
+1 for busy and 0 otherwise.
 
 
 AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD
@@ -368,12 +382,12 @@ The call requires an initialized struct autofs_dev_ioctl. There are two
 possible variations. Both use the path field set to the path of the mount
 point to check and the size field adjusted appropriately. One uses the
 ioctlfd field to identify a specific mount point to check while the other
-variation uses the path and optionally arg1 set to an autofs mount type.
-The call returns 1 if this is a mount point and sets arg1 to the device
-number of the mount and field arg2 to the relevant super block magic
-number (described below) or 0 if it isn't a mountpoint. In both cases
-the the device number (as returned by new_encode_dev()) is returned
-in field arg1.
+variation uses the path and optionally in.type field of struct args_ismountpoint
+set to an autofs mount type. The call returns 1 if this is a mount point
+and sets out.devid field to the device number of the mount and out.magic
+field to the relevant super block magic number (described below) or 0 if
+it isn't a mountpoint. In both cases the the device number (as returned
+by new_encode_dev()) is returned in out.devid field.
 
 If supplied with a file descriptor we're looking for a specific mount,
 not necessarily at the top of the mounted stack. In this case the path
--
cgit v1.2.3

From a9a62c9384417545620aee1b5ad1d9357350c17a Mon Sep 17 00:00:00 2001
From: Mauricio Faria de Oliveira
Date: Tue, 11 Oct 2016 13:54:14 -0700
Subject: dma-mapping: introduce the DMA_ATTR_NO_WARN attribute

Introduce the DMA_ATTR_NO_WARN attribute, and document it.

Link: http://lkml.kernel.org/r/1470092390-25451-2-git-send-email-mauricfo@linux.vnet.ibm.com
Signed-off-by: Mauricio Faria de Oliveira
Cc: Keith Busch
Cc: Jens Axboe
Cc: Benjamin Herrenschmidt
Cc: Michael Ellerman
Cc: Krzysztof Kozlowski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/DMA-attributes.txt | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
index 2d455a5cf671..98bf7ac29aad 100644
--- a/Documentation/DMA-attributes.txt
+++ b/Documentation/DMA-attributes.txt
@@ -126,3 +126,20 @@ means that we won't try quite as hard to get them.
 
 NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM,
 though ARM64 patches will likely be posted soon.
+
+DMA_ATTR_NO_WARN
+----------------
+
+This tells the DMA-mapping subsystem to suppress allocation failure reports
+(similarly to __GFP_NOWARN).
+
+On some architectures allocation failures are reported with error messages
+to the system logs. Although this can help to identify and debug problems,
+drivers which handle failures (eg, retry later) have no problems with them,
+and can actually flood the system logs with error messages that aren't any
+problem at all, depending on the implementation of the retry mechanism.
+
+So, this provides a way for drivers to avoid those error messages on calls
+where allocation failures are not a problem, and shouldn't bother the logs.
+
+NOTE: At the moment DMA_ATTR_NO_WARN is only implemented on PowerPC.
--
cgit v1.2.3

From 9099daed9c6991a512c1f74b92ec49daf9408cda Mon Sep 17 00:00:00 2001
From: Catalin Marinas
Date: Tue, 11 Oct 2016 13:55:11 -0700
Subject: mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping

Some of the kmemleak_*() callbacks in memblock, bootmem, CMA convert a
physical address to a virtual one using __va(). However, such physical
addresses may sometimes be located in highmem and using __va() is
incorrect, leading to inconsistent object tracking in kmemleak.

The following functions have been added to the kmemleak API and they take
a physical address as the object pointer. They only perform the
corresponding action if the address has a lowmem mapping:

kmemleak_alloc_phys
kmemleak_free_part_phys
kmemleak_not_leak_phys
kmemleak_ignore_phys

The affected calling places have been updated to use the new kmemleak API.

Link: http://lkml.kernel.org/r/1471531432-16503-1-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Reported-by: Vignesh R
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/dev-tools/kmemleak.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/dev-tools/kmemleak.rst b/Documentation/dev-tools/kmemleak.rst
index 1788722d5495..b2391b829169 100644
--- a/Documentation/dev-tools/kmemleak.rst
+++ b/Documentation/dev-tools/kmemleak.rst
@@ -162,6 +162,15 @@ See the include/linux/kmemleak.h header for the functions prototype.
 - ``kmemleak_alloc_recursive`` - as kmemleak_alloc but checks the recursiveness
 - ``kmemleak_free_recursive`` - as kmemleak_free but checks the recursiveness
 
+The following functions take a physical address as the object pointer
+and only perform the corresponding action if the address has a lowmem
+mapping:
+
+- ``kmemleak_alloc_phys``
+- ``kmemleak_free_part_phys``
+- ``kmemleak_not_leak_phys``
+- ``kmemleak_ignore_phys``
+
 Dealing with false positives/negatives
 --------------------------------------
 
--
cgit v1.2.3

From 3989144f863ac576e6efba298d24b0b02a10d4bb Mon Sep 17 00:00:00 2001
From: Petr Mladek
Date: Tue, 11 Oct 2016 13:55:20 -0700
Subject: kthread: kthread worker API cleanup

A good practice is to prefix the names of functions by the name of the
subsystem.

The kthread worker API is a mix of classic kthreads and workqueues. Each
worker has a dedicated kthread. It runs a generic function that processes
queued works. It is implemented as part of the kthread subsystem.

This patch renames the existing kthread worker API to use the
corresponding name from the workqueues API prefixed by kthread_:

__init_kthread_worker() -> __kthread_init_worker()
init_kthread_worker() -> kthread_init_worker()
init_kthread_work() -> kthread_init_work()
insert_kthread_work() -> kthread_insert_work()
queue_kthread_work() -> kthread_queue_work()
flush_kthread_work() -> kthread_flush_work()
flush_kthread_worker() -> kthread_flush_worker()

Note that the names of DEFINE_KTHREAD_WORK*() macros stay as they are. It
is common that the "DEFINE_" prefix has precedence over the subsystem
names.

Note that INIT() macros and init() functions use different naming scheme.
There is no good solution. There are several reasons for this solution:

+ "init" in the function names stands for the verb "initialize"
  aka "initialize worker". While "INIT" in the macro names
  stands for the noun "INITIALIZER" aka "worker initializer".
+ INIT() macros are used only in DEFINE() macros
+ init() functions are used close to the other kthread()
  functions. It looks much better if all the functions
  use the same scheme.
+ There will be also kthread_destroy_worker() that will
  be used close to kthread_cancel_work(). It is related
  to the init() function. Again it looks better if all
  functions use the same naming scheme.
+ there are several precedents for such init() function
  names, e.g. amd_iommu_init_device(), free_area_init_node(),
  jump_label_init_type(), regmap_init_mmio_clk(),
+ It is not an argument but it was inconsistent even before.

[arnd@arndb.de: fix linux-next merge conflict]
Link: http://lkml.kernel.org/r/20160908135724.1311726-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/1470754545-17632-3-git-send-email-pmladek@suse.com
Suggested-by: Andrew Morton
Signed-off-by: Petr Mladek
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: "Paul E. McKenney"
Cc: Josh Triplett
Cc: Thomas Gleixner
Cc: Jiri Kosina
Cc: Borislav Petkov
Cc: Michal Hocko
Cc: Vlastimil Babka
Signed-off-by: Arnd Bergmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/RCU/lockdep-splat.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/RCU/lockdep-splat.txt b/Documentation/RCU/lockdep-splat.txt
index bf9061142827..238e9f61352f 100644
--- a/Documentation/RCU/lockdep-splat.txt
+++ b/Documentation/RCU/lockdep-splat.txt
@@ -57,7 +57,7 @@ Call Trace:
  [] kernel_thread_helper+0x4/0x10
  [] ? finish_task_switch+0x80/0x110
  [] ? retint_restore_args+0xe/0xe
- [] ? __init_kthread_worker+0x70/0x70
+ [] ? __kthread_init_worker+0x70/0x70
  [] ? gs_change+0xb/0xb
 
 Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows:
--
cgit v1.2.3
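
Editorial illustration (not part of any patch in this series): the cpu-list syntax documented by the first patch, "lib/bitmap.c: enhance bitmap syntax", can be sketched as a small user-space expander. This is only a model of the documented behaviour, not the kernel's parser in lib/bitmap.c. The function name expand_cpu_list and its validation rules are assumptions made for this sketch; the semantics (groups counted from the start of the range, the first "used" CPUs of each group selected) follow the documentation text and its 100-2000:2/25 example.

```python
def expand_cpu_list(spec):
    """Expand a cpu list such as '1,2,10-20,100-2000:2/25' into sorted CPU numbers.

    Hypothetical helper for illustration only; the error handling below is an
    assumption of this sketch, not a description of the kernel's checks.
    """
    cpus = set()
    for item in spec.split(","):
        # Split off the optional ':used/group' postfix.
        rng, _, pattern = item.partition(":")
        start_s, dash, end_s = rng.partition("-")
        start = int(start_s)
        end = int(end_s) if dash else start
        if end < start:
            raise ValueError("range must be ascending: %r" % item)
        if pattern:
            used_s, _, group_s = pattern.partition("/")
            used, group = int(used_s), int(group_s)
            if group <= 0 or used <= 0 or used > group:
                raise ValueError("bad used/group sizes: %r" % item)
        else:
            # No postfix: every CPU in the range is selected.
            used = group = 1
        for cpu in range(start, end + 1):
            # Groups are counted from the start of the range; keep the
            # first 'used' CPUs of each group of 'group' CPUs.
            if (cpu - start) % group < used:
                cpus.add(cpu)
    return sorted(cpus)
```

Under these assumptions, expand_cpu_list("0-4095:8/16") selects the same CPUs as the long form 0-7,16-23,32-39,...,4080-4087 quoted in the commit message, and the final item of isolcpus=1,2,10-20,100-2000:2/25 begins 100,101,125,126,150,151 as the documentation states.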