summaryrefslogtreecommitdiff
path: root/net/core/devlink.c
AgeCommit message (Collapse)Author
2021-11-04devlink: fix flexible_array.cocci warningGuo Zhengkui
Fix following coccicheck warning: ./net/core/devlink.c:69:6-10: WARNING use flexible-array member instead Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Link: https://lore.kernel.org/r/20211103121607.27490-1-guozhengkui@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01ethtool: don't drop the rtnl_lock half way thru the ioctlJakub Kicinski
devlink compat code needs to drop rtnl_lock to take devlink->lock to ensure correct lock ordering. This is problematic because we're not strictly guaranteed that the netdev will not disappear after we re-lock. It may open a possibility of nested ->begin / ->complete calls. Instead of calling into devlink under rtnl_lock take a ref on the devlink instance and make the call after we've dropped rtnl_lock. We (continue to) assume that netdevs have an implicit reference on the devlink returned from ndo_get_devlink_port Note that ndo_get_devlink_port will now get called under rtnl_lock. That should be fine since none of the drivers seem to be taking serious locks inside ndo_get_devlink_port. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01devlink: expose get/put functionsJakub Kicinski
Allow those who hold implicit reference on a devlink instance to try to take a full ref on it. This will be used from netdev code which has an implicit ref because of driver call ordering. Note that after recent changes devlink_unregister() may happen before netdev unregister, but devlink_free() should still happen after, so we are safe to try, but we can't just refcount_inc() and assume it's not zero. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-29devlink: make all symbols GPL-onlyJakub Kicinski
devlink_alloc() and devlink_register() are both GPL. A non-GPL module won't get far, so for consistency we can make all symbols GPL without risking any real life breakage. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28devlink: Simplify internal devlink params implementationLeon Romanovsky
Reduce extra indirection from devlink_params_*() API. Such change makes it clear that we can drop devlink->lock from these flows, because everything is executed when the devlink is not registered yet. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-27Revert "devlink: Remove not-executed trap policer notifications"Leon Romanovsky
This reverts commit 22849b5ea5952d853547cc5e0651f34a246b2a4f as it revealed that mlxsw and netdevsim (copy/paste from mlxsw) reregisters devlink objects during another devlink user triggered command. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-27Revert "devlink: Remove not-executed trap group notifications"Leon Romanovsky
This reverts commit 8bbeed4858239ac956a78e5cbaf778bd6f3baef8 as it revealed that mlxsw and netdevsim (copy/paste from mlxsw) reregisters devlink objects during another devlink user triggered command. Fixes: 22849b5ea595 ("devlink: Remove not-executed trap policer notifications") Reported-by: syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22devlink: Clean not-executed param notificationsLeon Romanovsky
The parameters are registered before devlink_register() and all the notifications are delayed. This patch removes not-possible parameters notifications along with addition of code annotation logic. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22devlink: Remove not-executed trap group notificationsLeon Romanovsky
The trap logic is registered before devlink_register() and all the notifications are delayed. This patch removes not-possible trap group notifications along with addition of code annotation logic. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22devlink: Remove not-executed trap policer notificationsLeon Romanovsky
The trap policer logic is registered before devlink_register() and all the notifications are delayed. This patch removes not-possible notifications along with addition of code annotation logic. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22devlink: Delete obsolete parameters publish APILeon Romanovsky
The change of devlink_register() to be last devlink command together with delayed notification logic made the publish API to be obsolete. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-19devlink: Remove extra device_lock assert checksLeon Romanovsky
PCI core code in the pci_call_probe() has a path that doesn't hold device_lock. It happens because the ->probe() is called through the workqueue mechanism. 349 static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev, 350 const struct pci_device_id *id) 351 { 352 .... 377 if (cpu < nr_cpu_ids) 378 error = work_on_cpu(cpu, local_pci_probe, &ddi); Luckily enough, the core still ensures that only single flow is executed, so it safe to remove the assert checks that anyway were added for annotations purposes. Fixes: b88f7b1203bf ("devlink: Annotate devlink API calls") Reported-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12devlink: Delete reload enable/disable interfaceLeon Romanovsky
Commit a0c76345e3d3 ("devlink: disallow reload operation during device cleanup") added devlink_reload_{enable,disable}() APIs to prevent reload operation from racing with device probe/dismantle. After recent changes to move devlink_register() to the end of device probe and devlink_unregister() to the beginning of device dismantle, these races can no longer happen. Reload operations will be denied if the devlink instance is unregistered and devlink_unregister() will block until all in-flight operations are done. Therefore, remove these devlink_reload_{enable,disable}() APIs. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12devlink: Allow control devlink ops behavior through feature maskLeon Romanovsky
Introduce new devlink call to set feature mask to control devlink behavior during device initialization phase after devlink_alloc() is already called. This allows us to set reload ops based on device property which is not known at the beginning of driver initialization. For the sake of simplicity, this API lacks any type of locking and needs to be called before devlink_register() to make sure that no parallel access to the ops is possible at this stage. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12devlink: Annotate devlink API callsLeon Romanovsky
Initial annotation patch to separate calls that needs to be executed before or after devlink_register(). Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12devlink: Move netdev_to_devlink helpers to devlink.cLeon Romanovsky
Both netdev_to_devlink and netdev_to_devlink_port are used in devlink.c only, so move them in order to reduce their scope. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12devlink: Reduce struct devlink exposureLeon Romanovsky
The declaration of struct devlink in general header provokes the situation where internal fields can be accidentally used by the driver authors. In order to reduce such possible situations, let's reduce the namespace exposure of struct devlink. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-01devlink: report maximum number of snapshots with regionsJacob Keller
Each region has an independently configurable number of maximum snapshots. This information is not reported to userspace, making it not very discoverable. Fix this by adding a new DEVLINK_ATTR_REGION_MAX_SNAPSHOST attribute which is used to report this maximum. Ex: $devlink region pci/0000:af:00.0/nvm-flash: size 10485760 snapshot [] max 1 pci/0000:af:00.0/device-caps: size 4096 snapshot [] max 10 pci/0000:af:00.1/nvm-flash: size 10485760 snapshot [] max 1 pci/0000:af:00.1/device-caps: size 4096 snapshot [] max 10 This information enables users to understand why a new region command may fail due to having too many existing snapshots. Reported-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-29devlink: Add missed notifications iteratorsLeon Romanovsky
The commit mentioned in Fixes line missed a couple of notifications that were registered before devlink_register() and should be delayed too. As such, the too early placed WARN_ON() check spotted it. WARNING: CPU: 1 PID: 6540 at net/core/devlink.c:5158 devlink_nl_region_notify+0x184/0x1e0 net/core/devlink.c:5158 Modules linked in: CPU: 1 PID: 6540 Comm: syz-executor.0 Not tainted 5.15.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:devlink_nl_region_notify+0x184/0x1e0 net/core/devlink.c:5158 Code: 38 41 b8 c0 0c 00 00 31 d2 48 89 ee 4c 89 e7 e8 72 1a 26 00 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e e9 01 bd 41 fa e8 fc bc 41 fa <0f> 0b e9 f7 fe ff ff e8 f0 bc 41 fa 0f 0b eb da 4c 89 e7 e8 c4 18 RSP: 0018:ffffc90002d6f658 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88801f08d580 RSI: ffffffff87344e94 RDI: 0000000000000003 RBP: ffff88801ee42100 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff87344d8a R11: 0000000000000000 R12: ffff88801c1dc000 R13: 0000000000000000 R14: 000000000000002c R15: ffff88801c1dc070 FS: 0000555555e8e400(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dd7c590310 CR3: 0000000069a09000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: devlink_region_create+0x39f/0x4c0 net/core/devlink.c:10327 nsim_dev_dummy_region_init drivers/net/netdevsim/dev.c:481 [inline] nsim_dev_probe+0x5f6/0x1150 drivers/net/netdevsim/dev.c:1479 call_driver_probe drivers/base/dd.c:517 [inline] really_probe+0x245/0xcc0 drivers/base/dd.c:596 __driver_probe_device+0x338/0x4d0 drivers/base/dd.c:751 driver_probe_device+0x4c/0x1a0 drivers/base/dd.c:781 __device_attach_driver+0x20b/0x2f0 drivers/base/dd.c:898 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:427 __device_attach+0x228/0x4a0 drivers/base/dd.c:969 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:487 device_add+0xc35/0x21b0 drivers/base/core.c:3359 nsim_bus_dev_new drivers/net/netdevsim/bus.c:435 [inline] new_device_store+0x48b/0x770 drivers/net/netdevsim/bus.c:302 bus_attr_store+0x72/0xa0 drivers/base/bus.c:122 sysfs_kf_write+0x110/0x160 fs/sysfs/file.c:139 kernfs_fop_write_iter+0x342/0x500 fs/kernfs/file.c:296 call_write_iter include/linux/fs.h:2163 [inline] new_sync_write+0x429/0x660 fs/read_write.c:507 vfs_write+0x7cf/0xae0 fs/read_write.c:594 ksys_write+0x12d/0x250 fs/read_write.c:647 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f328409d3ef Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 99 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 cc fd ff ff 48 RSP: 002b:00007ffdc6851140 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f328409d3ef RDX: 0000000000000003 RSI: 00007ffdc6851190 RDI: 0000000000000004 RBP: 0000000000000004 R08: 0000000000000000 R09: 00007ffdc68510e0 R10: 0000000000000000 R11: 0000000000000293 R12: 00007f3284144971 R13: 00007ffdc6851190 R14: 0000000000000000 R15: 00007ffdc6851860 Fixes: cf530217408e ("devlink: Notify users when objects are accessible") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/2ed1159291f2a589b013914f2b60d8172fc525c1.1632925030.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-27devlink: Notify users when objects are accessibleLeon Romanovsky
The devlink core code notified users about add/remove objects without relation if this object can be accessible or not. In this patch we unify such user visible notifications in one place. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24devlink: Remove single line function obfuscationsLeon Romanovsky
There is no need in extra one line functions to call relevant functions only once. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-24devlink: Delete not used port parameters APIsLeon Romanovsky
There is no in-kernel users for the devlink port parameters API, so let's remove it. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-22devlink: Make devlink_register to be voidLeon Romanovsky
devlink_register() can't fail and always returns success, but all drivers are obligated to check returned status anyway. This adds a lot of boilerplate code to handle impossible flow. Make devlink_register() void and simplify the drivers that use that API call. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Simon Horman <simon.horman@corigine.com> Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17devlink: Delete not-used devlink APIsLeon Romanovsky
Devlink core exported generously the functions calls that were used by netdevsim tests or not used at all. Delete such APIs with one exception - devlink_alloc_ns(). That function should be spared from deleting because it is a special form of devlink_alloc() needed for the netdevsim. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15devlink: Delete not-used single parameter notification APIsLeon Romanovsky
There is no need in specific devlink_param_*publish(), because same output can be achieved by using devlink_params_*publish() in correct places. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14devlink: Clear whole devlink_flash_notify structLeon Romanovsky
The { 0 } doesn't clear all fields in the struct, but tells to the compiler to set all fields to zero and doesn't touch any sub-fields if they exists. The {} is an empty initialiser that instructs to fully initialize whole struct including sub-fields, which is error-prone for future devlink_flash_notify extensions. Fixes: 6700acc5f1fe ("devlink: collect flash notify params into a struct") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14devlink: Use xarray to store devlink instancesLeon Romanovsky
We can use xarray instead of linearly organized linked lists for the devlink instances. This will let us revise the locking scheme in favour of internal xarray locking that protects database. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14devlink: Count struct devlink consumersLeon Romanovsky
The struct devlink itself is protected by internal lock and doesn't need global lock during operation. That global lock is used to protect addition/removal new devlink instances from the global list in use by all devlink consumers in the system. The future conversion of linked list to be xarray will allow us to actually delete that lock, but first we need to count all struct devlink users. The reference counting provides us a way to ensure that no new user space commands success to grab devlink instance which is going to be destroyed makes it is safe to access it without lock. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14devlink: Remove check of always valid devlink pointerLeon Romanovsky
Devlink objects are accessible only after they were registered and have valid devlink_*->devlink pointers. Remove that check and simplify respective fill functions as an outcome of such change. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14devlink: Simplify devlink_pernet_pre_exit callLeon Romanovsky
The devlink_pernet_pre_exit() will be called if net namespace exits. That routine is relevant for devlink instances that were assigned to that namespaces first. This assignment is possible only with the following command: "devlink reload DEV netns ...", which already checks reload support. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11devlink: Add APIs to publish, unpublish individual parameterParav Pandit
Enable drivers to publish/unpublish individual parameter. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11devlink: Add API to register and unregister single parameterParav Pandit
Currently device configuration parameters can be registered as an array. Due to this a constant array must be registered. A single driver supporting multiple devices each with different device capabilities end up registering all parameters even if it doesn't support it. One possible workaround a driver can do is, it registers multiple single entry arrays to overcome such limitation. Better is to provide a API that enables driver to register/unregister a single parameter. This also further helps in two ways. (1) to reduce the memory of devlink_param_entry by avoiding in registering parameters which are not supported by the device. (2) avoid generating multiple parameter add, delete, publish, unpublish, init value notifications for such unsupported parameters Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11devlink: Create a helper function for one parameter registrationParav Pandit
Create and use a helper function for one parameter registration. Subsequent patch also will reuse this for driver facing routine to register a single parameter. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11devlink: Add new "enable_vnet" generic device paramParav Pandit
Add new device generic parameter to enable/disable creation of VDPA net auxiliary device and associated device functionality in the devlink instance. User who prefers to disable such functionality can disable it using below example. $ devlink dev param set pci/0000:06:00.0 \ name enable_vnet value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point devlink instance do not create auxiliary device for the VDPA net functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11devlink: Add new "enable_rdma" generic device paramParav Pandit
Add new device generic parameter to enable/disable creation of RDMA auxiliary device and associated device functionality in the devlink instance. User who prefers to disable such functionality can disable it using below example. $ devlink dev param set pci/0000:06:00.0 \ name enable_rdma value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point devlink instance do not create auxiliary device for the RDMA functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-11devlink: Add new "enable_eth" generic device paramParav Pandit
Add new device generic parameter to enable/disable creation of Ethernet auxiliary device and associated device functionality in the devlink instance. User who prefers to disable such functionality can disable it using below example. $ devlink dev param set pci/0000:06:00.0 \ name enable_eth value false cmode driverinit $ devlink dev reload pci/0000:06:00.0 At this point devlink instance do not create auxiliary device for the Ethernet functionality. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09devlink: Fix port_type_set function pointer checkLeon Romanovsky
Fix a typo when checking existence of port_type_set function pointer. Fixes: 82564f6c706a ("devlink: Simplify devlink port API calls") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09devlink: Set device as early as possibleLeon Romanovsky
All kernel devlink implementations call to devlink_alloc() during initialization routine for specific device which is used later as a parent device for devlink_register(). Such late device assignment causes to the situation which requires us to call to device_register() before setting other parameters, but that call opens devlink to the world and makes accessible for the netlink users. Any attempt to move devlink_register() to be the last call generates the following error due to access to the devlink->dev pointer. [ 8.758862] devlink_nl_param_fill+0x2e8/0xe50 [ 8.760305] devlink_param_notify+0x6d/0x180 [ 8.760435] __devlink_params_register+0x2f1/0x670 [ 8.760558] devlink_params_register+0x1e/0x20 The simple change of API to set devlink device in the devlink_alloc() instead of devlink_register() fixes all this above and ensures that prior to call to devlink_register() everything already set. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-08devlink: Simplify devlink port API callsLeon Romanovsky
Devlink port already has pointer to the devlink instance and all API calls that forward these devlink ports to the drivers perform same "devlink_port->devlink" assignment before actual call. This patch removes useless parameter and allows us in the future to create specific devlink_port_ops to manage user space access with reliable ops assignment. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicting commits, all resolutions pretty trivial: drivers/bus/mhi/pci_generic.c 5c2c85315948 ("bus: mhi: pci-generic: configurable network interface MRU") 56f6f4c4eb2a ("bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean") drivers/nfc/s3fwrn5/firmware.c a0302ff5906a ("nfc: s3fwrn5: remove unnecessary label") 46573e3ab08f ("nfc: s3fwrn5: fix undefined parameter values in dev_err()") 801e541c79bb ("nfc: s3fwrn5: fix undefined parameter values in dev_err()") MAINTAINERS 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") 8a7b46fa7902 ("MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-30devlink: Allocate devlink directly in requested net namespaceLeon Romanovsky
There is no need in extra call indirection and check from impossible flow where someone tries to set namespace without prior call to devlink_alloc(). Instead of this extra logic and additional EXPORT_SYMBOL, use specialized devlink allocation function that receives net namespace as an argument. Such specialized API allows clear view when devlink initialized in wrong net namespace and/or kernel users don't try to change devlink namespace under the hood. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-30devlink: Break parameter notification sequence to be before/after ↵Leon Romanovsky
unload/load driver The change of namespaces during devlink reload calls to driver unload before it accesses devlink parameters. The commands below causes to use-after-free bug when trying to get flow steering mode. * ip netns add n1 * devlink dev reload pci/0000:00:09.0 netns n1 ================================================================== BUG: KASAN: use-after-free in mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] Read of size 4 at addr ffff888009d04308 by task devlink/275 CPU: 6 PID: 275 Comm: devlink Not tainted 5.12.0-rc2+ #2853 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x93/0xc2 print_address_description.constprop.0+0x18/0x140 ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] kasan_report.cold+0x7c/0xd8 ? mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] mlx5_devlink_fs_mode_get+0x96/0xa0 [mlx5_core] devlink_nl_param_fill+0x1c8/0xe80 ? __free_pages_ok+0x37a/0x8a0 ? devlink_flash_update_timeout_notify+0xd0/0xd0 ? lock_acquire+0x1a9/0x6d0 ? fs_reclaim_acquire+0xb7/0x160 ? lock_is_held_type+0x98/0x110 ? 0xffffffff81000000 ? lock_release+0x1f9/0x6c0 ? fs_reclaim_release+0xa1/0xf0 ? lock_downgrade+0x6d0/0x6d0 ? lock_is_held_type+0x98/0x110 ? lock_is_held_type+0x98/0x110 ? memset+0x20/0x40 ? __build_skb_around+0x1f8/0x2b0 devlink_param_notify+0x6d/0x180 devlink_reload+0x1c3/0x520 ? devlink_remote_reload_actions_performed+0x30/0x30 ? mutex_trylock+0x24b/0x2d0 ? devlink_nl_cmd_reload+0x62b/0x1070 devlink_nl_cmd_reload+0x66d/0x1070 ? devlink_reload+0x520/0x520 ? devlink_get_from_attrs+0x1bc/0x260 ? devlink_nl_pre_doit+0x64/0x4d0 genl_family_rcv_msg_doit+0x1e9/0x2f0 ? mutex_lock_io_nested+0x1130/0x1130 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240 ? security_capable+0x51/0x90 genl_rcv_msg+0x27f/0x4a0 ? genl_get_cmd+0x3c0/0x3c0 ? lock_acquire+0x1a9/0x6d0 ? devlink_reload+0x520/0x520 ? lock_release+0x6c0/0x6c0 netlink_rcv_skb+0x11d/0x340 ? genl_get_cmd+0x3c0/0x3c0 ? netlink_ack+0x9f0/0x9f0 ? lock_release+0x1f9/0x6c0 genl_rcv+0x24/0x40 netlink_unicast+0x433/0x700 ? netlink_attachskb+0x730/0x730 ? _copy_from_iter_full+0x178/0x650 ? __alloc_skb+0x113/0x2b0 netlink_sendmsg+0x6f1/0xbd0 ? netlink_unicast+0x700/0x700 ? lock_is_held_type+0x98/0x110 ? netlink_unicast+0x700/0x700 sock_sendmsg+0xb0/0xe0 __sys_sendto+0x193/0x240 ? __x64_sys_getpeername+0xb0/0xb0 ? do_sys_openat2+0x10b/0x370 ? __up_read+0x1a1/0x7b0 ? do_user_addr_fault+0x219/0xdc0 ? __x64_sys_openat+0x120/0x1d0 ? __x64_sys_open+0x1a0/0x1a0 __x64_sys_sendto+0xdd/0x1b0 ? syscall_enter_from_user_mode+0x1d/0x50 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fc69d0af14a Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c RSP: 002b:00007ffc1d8292f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fc69d0af14a RDX: 0000000000000038 RSI: 0000555f57c56440 RDI: 0000000000000003 RBP: 0000555f57c56410 R08: 00007fc69d17b200 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Allocated by task 146: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0x99/0xc0 mlx5_init_fs+0xf0/0x1c50 [mlx5_core] mlx5_load+0xd2/0x180 [mlx5_core] mlx5_init_one+0x2f6/0x450 [mlx5_core] probe_one+0x47d/0x6e0 [mlx5_core] pci_device_probe+0x2a0/0x4a0 really_probe+0x20a/0xc90 driver_probe_device+0xd8/0x380 device_driver_attach+0x1df/0x250 __driver_attach+0xff/0x240 bus_for_each_dev+0x11e/0x1a0 bus_add_driver+0x309/0x570 driver_register+0x1ee/0x380 0xffffffffa06b8062 do_one_initcall+0xd5/0x410 do_init_module+0x1c8/0x760 load_module+0x6d8b/0x9650 __do_sys_finit_module+0x118/0x1b0 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae Freed by task 275: kasan_save_stack+0x1b/0x40 kasan_set_track+0x1c/0x30 kasan_set_free_info+0x20/0x30 __kasan_slab_free+0x102/0x140 slab_free_freelist_hook+0x74/0x1b0 kfree+0xd7/0x2a0 mlx5_unload+0x16/0xb0 [mlx5_core] mlx5_unload_one+0xae/0x120 [mlx5_core] mlx5_devlink_reload_down+0x1bc/0x380 [mlx5_core] devlink_reload+0x141/0x520 devlink_nl_cmd_reload+0x66d/0x1070 genl_family_rcv_msg_doit+0x1e9/0x2f0 genl_rcv_msg+0x27f/0x4a0 netlink_rcv_skb+0x11d/0x340 genl_rcv+0x24/0x40 netlink_unicast+0x433/0x700 netlink_sendmsg+0x6f1/0xbd0 sock_sendmsg+0xb0/0xe0 __sys_sendto+0x193/0x240 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff888009d04300 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 8 bytes inside of 128-byte region [ffff888009d04300, ffff888009d04380) The buggy address belongs to the page: page:0000000086a64ecc refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888009d04000 pfn:0x9d04 head:0000000086a64ecc order:1 compound_mapcount:0 flags: 0x4000000000010200(slab|head) raw: 4000000000010200 ffffea0000203980 0000000200000002 ffff8880050428c0 raw: ffff888009d04000 000000008020001d 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888009d04200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888009d04280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff888009d04300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888009d04380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888009d04400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== The right solution to devlink reload is to notify about deletion of parameters, unload driver, change net namespaces, load driver and notify about addition of parameters. Fixes: 070c63f20f6c ("net: devlink: allow to change namespaces during reload") Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-28devlink: Remove duplicated registration checkLeon Romanovsky
Both registered flag and devlink pointer are set at the same time and indicate the same thing - devlink/devlink_port are ready. Instead of checking ->registered use devlink pointer as an indication. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25devlink: Fix phys_port_name of virtual port and merge errorParav Pandit
Merge commit cited in fixes tag was incorrect. Due to it phys_port_name of the virtual port resulted in incorrect name. Also the phys_port_name of the physical port was written twice due to the merge error. Fix it by removing the old code and inserting back the misplaced code. Related commits of interest in net and net-next branches that resulted in merge conflict are: in net-next branch: commit f285f37cb1e6 ("devlink: append split port number to the port name") in net branch: commit b28d8f0c25a9 ("devlink: Correct VIRTUAL port to not have phys_port attributes") Fixes: 126285651b7 ("Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net") Signed-off-by: Parav Pandit <parav@nvidia.com> Reported-by: Niklas Schnelle <schnelle@linux.ibm.com> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23devlink: Protect rate list with lock while switching modesDmytro Linkin
Devlink eswitch set command doesn't hold devlink->lock, which makes possible race condition between rate list traversing and others devlink rate KAPI calls, like devlink_rate_nodes_destroy(). Hold devlink lock while traversing the list. Fixes: a8ecb93ef03d ("devlink: Introduce rate nodes") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23devlink: Remove eswitch mode check for mode set callDmytro Linkin
When eswitch is disabled, querying its current mode results in error. Due to this when trying to set the eswitch mode for mlx5 devices, it fails to set the eswitch switchdev mode. Hence remove such check. Fixes: a8ecb93ef03d ("devlink: Introduce rate nodes") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23devlink: Decrease refcnt of parent rate object on leaf destroyDmytro Linkin
Port functions, like SFs, can be deleted by the user when its leaf rate object has parent node. In such case node refcnt won't be decreased which blocks the node from deletion later. Do simple refcnt decrease, since driver in cleanup stage. This: 1) assumes that driver took proper internal parent unset action; 2) allows to avoid nested callbacks call and deadlock. Fixes: d75559845078 ("devlink: Allow setting parent node of rate objects") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14net: core: devlink: add dropped stats traps fieldOleksandr Mazur
Whenever query statistics is issued for trap, devlink subsystem would also fill-in statistics 'dropped' field. This field indicates the number of packets HW dropped and failed to report to the device driver, and thus - to the devlink subsystem itself. In case if device driver didn't register callback for hard drop statistics querying, 'dropped' field will be omitted and not filled. Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09devlink: Fix error message in devlink_rate_set_ops_supported()Dan Carpenter
The WARN_ON() macro takes a condition, it doesn't take a message. Use WARN() instead. Fixes: 1897db2ec310 ("devlink: Allow setting tx rate for devlink rate leaf objects") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Bug fixes overlapping feature additions and refactoring, mostly. Signed-off-by: David S. Miller <davem@davemloft.net>