summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-07-29LoongArch: Simplify "BEQ/BNE foo, zero" with BEQZ/BNEZWANG Xuerui
While B{EQ,NE}Z and B{EQ,NE} are different instructions, and the vastly expanded range for branch destination does not really matter in the few cases touched, use the B{EQ,NE}Z where possible for shorter lines and better consistency (e.g. some places used "BEQ foo, zero", while some used "BEQ zero, foo"). Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29LoongArch: Use the "move" pseudo-instruction where applicableWANG Xuerui
Some of the assembly code in the LoongArch port likely originated from a time when the assembler did not support pseudo-instructions like "move" or "jr", so the desugared form was used and readability suffers (to a minor degree) as a result. As the upstream toolchain supports these pseudo-instructions from the beginning, migrate the existing few usages to them for better readability. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29LoongArch: Use the "jr" pseudo-instruction where applicableWANG Xuerui
Some of the assembly code in the LoongArch port likely originated from a time when the assembler did not support pseudo-instructions like "move" or "jr", so the desugared form was used and readability suffers (to a minor degree) as a result. As the upstream toolchain supports these pseudo-instructions from the beginning, migrate the existing few usages to them for better readability. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29LoongArch: Use ABI names of registers where appropriateWANG Xuerui
Some of the assembly in the LoongArch port seem to come from a prehistoric time, when the assembler didn't even have support for the ABI names we all come to know and love, thus used raw register numbers which hampered readability. The usages are found with a regex match inside arch/loongarch, then manually adjusted for those non-definitions. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29ARM: findbit: fix overflowing offsetRussell King (Oracle)
When offset is larger than the size of the bit array, we should not attempt to access the array as we can perform an access beyond the end of the array. Fix this by changing the pre-condition. Using "cmp r2, r1; bhs ..." covers us for the size == 0 case, since this will always take the branch when r1 is zero, irrespective of the value of r2. This means we can fix this bug without adding any additional code! Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-07-29x86/bugs: Do not enable IBPB at firmware entry when IBPB is not availableThadeu Lima de Souza Cascardo
Some cloud hypervisors do not provide IBPB on very recent CPU processors, including AMD processors affected by Retbleed. Using IBPB before firmware calls on such systems would cause a GPF at boot like the one below. Do not enable such calls when IBPB support is not present. EFI Variables Facility v0.08 2004-May-17 general protection fault, maybe for address 0x1: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 24 Comm: kworker/u2:1 Not tainted 5.19.0-rc8+ #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 Workqueue: efi_rts_wq efi_call_rts RIP: 0010:efi_call_rts Code: e8 37 33 58 ff 41 bf 48 00 00 00 49 89 c0 44 89 f9 48 83 c8 01 4c 89 c2 48 c1 ea 20 66 90 b9 49 00 00 00 b8 01 00 00 00 31 d2 <0f> 30 e8 7b 9f 5d ff e8 f6 f8 ff ff 4c 89 f1 4c 89 ea 4c 89 e6 48 RSP: 0018:ffffb373800d7e38 EFLAGS: 00010246 RAX: 0000000000000001 RBX: 0000000000000006 RCX: 0000000000000049 RDX: 0000000000000000 RSI: ffff94fbc19d8fe0 RDI: ffff94fbc1b2b300 RBP: ffffb373800d7e70 R08: 0000000000000000 R09: 0000000000000000 R10: 000000000000000b R11: 000000000000000b R12: ffffb3738001fd78 R13: ffff94fbc2fcfc00 R14: ffffb3738001fd80 R15: 0000000000000048 FS: 0000000000000000(0000) GS:ffff94fc3da00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff94fc30201000 CR3: 000000006f610000 CR4: 00000000000406f0 Call Trace: <TASK> ? __wake_up process_one_work worker_thread ? rescuer_thread kthread ? kthread_complete_and_exit ret_from_fork </TASK> Modules linked in: Fixes: 28a99e95f55c ("x86/amd: Use IBPB for firmware calls") Reported-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220728122602.2500509-1-cascardo@canonical.com
2022-07-28Merge tag 'drm-fixes-2022-07-29' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fix from Dave Airlie: "Quiet extra week, just a single fix for i915 workaround with execlist backend. i915: - Further reset robustness improvements for execlists [Wa_22011802037]" * tag 'drm-fixes-2022-07-29' of git://anongit.freedesktop.org/drm/drm: drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend
2022-07-29Merge tag 'drm-intel-fixes-2022-07-28-1' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Further reset robustness improvements for execlists [Wa_22011802037] (Umesh Nerlige Ramappa) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YuJIWaEbKcs/q0NY@tursulin-desk
2022-07-28nouveau/svm: Fix to migrate all requested pagesAlistair Popple
Users may request that pages from an OpenCL SVM allocation be migrated to the GPU with clEnqueueSVMMigrateMem(). In Nouveau this will call into nouveau_dmem_migrate_vma() to do the migration. If the total range to be migrated exceeds SG_MAX_SINGLE_ALLOC the pages will be migrated in chunks of size SG_MAX_SINGLE_ALLOC. However a typo in updating the starting address means that only the first chunk will get migrated. Fix the calculation so that the entire range will get migrated if possible. Signed-off-by: Alistair Popple <apopple@nvidia.com> Fixes: e3d8b0890469 ("drm/nouveau/svm: map pages after migration") Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220720062745.960701-1-apopple@nvidia.com Cc: <stable@vger.kernel.org> # v5.8+
2022-07-28Merge tag 'net-5.19-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter, no known blockers for the release. Current release - regressions: - wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop(), fix taking the lock before its initialized - Bluetooth: mgmt: fix double free on error path Current release - new code bugs: - eth: ice: fix tunnel checksum offload with fragmented traffic Previous releases - regressions: - tcp: md5: fix IPv4-mapped support after refactoring, don't take the pure v6 path - Revert "tcp: change pingpong threshold to 3", improving detection of interactive sessions - mld: fix netdev refcount leak in mld_{query | report}_work() due to a race - Bluetooth: - always set event mask on suspend, avoid early wake ups - L2CAP: fix use-after-free caused by l2cap_chan_put - bridge: do not send empty IFLA_AF_SPEC attribute Previous releases - always broken: - ping6: fix memleak in ipv6_renew_options() - sctp: prevent null-deref caused by over-eager error paths - virtio-net: fix the race between refill work and close, resulting in NAPI scheduled after close and a BUG() - macsec: - fix three netlink parsing bugs - avoid breaking the device state on invalid change requests - fix a memleak in another error path Misc: - dt-bindings: net: ethernet-controller: rework 'fixed-link' schema - two more batches of sysctl data race adornment" * tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits) stmmac: dwmac-mediatek: fix resource leak in probe ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr net: ping6: Fix memleak in ipv6_renew_options(). net/funeth: Fix fun_xdp_tx() and XDP packet reclaim sctp: leave the err path free in sctp_stream_init to sctp_stream_free sfc: disable softirqs for ptp TX ptp: ocp: Select CRC16 in the Kconfig. tcp: md5: fix IPv4-mapped support virtio-net: fix the race between refill work and close mptcp: Do not return EINPROGRESS when subflow creation succeeds Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put Bluetooth: Always set event mask on suspend Bluetooth: mgmt: Fix double free on error path wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop() ice: do not setup vlan for loopback VSI ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS) ice: Fix VSIs unable to share unicast MAC ice: Fix tunnel checksum offload with fragmented traffic ice: Fix max VLANs available for VF netfilter: nft_queue: only allow supported familes and hooks ...
2022-07-28stmmac: dwmac-mediatek: fix resource leak in probeDan Carpenter
If mediatek_dwmac_clks_config() fails, then call stmmac_remove_config_dt() before returning. Otherwise it is a resource leak. Fixes: fa4b3ca60e80 ("stmmac: dwmac-mediatek: fix clock issue") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YuJ4aZyMUlG6yGGa@kili Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptrZiyang Xuan
Change net device's MTU to smaller than IPV6_MIN_MTU or unregister device while matching route. That may trigger null-ptr-deref bug for ip6_ptr probability as following. ========================================================= BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134 Read of size 4 at addr 0000000000000308 by task ping6/263 CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14 Call trace: dump_backtrace+0x1a8/0x230 show_stack+0x20/0x70 dump_stack_lvl+0x68/0x84 print_report+0xc4/0x120 kasan_report+0x84/0x120 __asan_load4+0x94/0xd0 find_match.part.0+0x70/0x134 __find_rr_leaf+0x408/0x470 fib6_table_lookup+0x264/0x540 ip6_pol_route+0xf4/0x260 ip6_pol_route_output+0x58/0x70 fib6_rule_lookup+0x1a8/0x330 ip6_route_output_flags_noref+0xd8/0x1a0 ip6_route_output_flags+0x58/0x160 ip6_dst_lookup_tail+0x5b4/0x85c ip6_dst_lookup_flow+0x98/0x120 rawv6_sendmsg+0x49c/0xc70 inet_sendmsg+0x68/0x94 Reproducer as following: Firstly, prepare conditions: $ip netns add ns1 $ip netns add ns2 $ip link add veth1 type veth peer name veth2 $ip link set veth1 netns ns1 $ip link set veth2 netns ns2 $ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1 $ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2 $ip netns exec ns1 ifconfig veth1 up $ip netns exec ns2 ifconfig veth2 up $ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1 $ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1 Secondly, execute the following two commands in two ssh windows respectively: $ip netns exec ns1 sh $while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done $ip netns exec ns1 sh $while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly, then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check. cpu0 cpu1 fib6_table_lookup __find_rr_leaf addrconf_notify [ NETDEV_CHANGEMTU ] addrconf_ifdown RCU_INIT_POINTER(dev->ip6_ptr, NULL) find_match ip6_ignore_linkdown So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to fix the null-ptr-deref bug. Fixes: dcd1f572954f ("net/ipv6: Remove fib6_idev") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220728013307.656257-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28net: ping6: Fix memleak in ipv6_renew_options().Kuniyuki Iwashima
When we close ping6 sockets, some resources are left unfreed because pingv6_prot is missing sk->sk_prot->destroy(). As reported by syzbot [0], just three syscalls leak 96 bytes and easily cause OOM. struct ipv6_sr_hdr *hdr; char data[24] = {0}; int fd; hdr = (struct ipv6_sr_hdr *)data; hdr->hdrlen = 2; hdr->type = IPV6_SRCRT_TYPE_4; fd = socket(AF_INET6, SOCK_DGRAM, NEXTHDR_ICMP); setsockopt(fd, IPPROTO_IPV6, IPV6_RTHDR, data, 24); close(fd); To fix memory leaks, let's add a destroy function. Note the socket() syscall checks if the GID is within the range of net.ipv4.ping_group_range. The default value is [1, 0] so that no GID meets the condition (1 <= GID <= 0). Thus, the local DoS does not succeed until we change the default value. However, at least Ubuntu/Fedora/RHEL loosen it. $ cat /usr/lib/sysctl.d/50-default.conf ... -net.ipv4.ping_group_range = 0 2147483647 Also, there could be another path reported with these options, and some of them require CAP_NET_RAW. setsockopt IPV6_ADDRFORM (inet6_sk(sk)->pktoptions) IPV6_RECVPATHMTU (inet6_sk(sk)->rxpmtu) IPV6_HOPOPTS (inet6_sk(sk)->opt) IPV6_RTHDRDSTOPTS (inet6_sk(sk)->opt) IPV6_RTHDR (inet6_sk(sk)->opt) IPV6_DSTOPTS (inet6_sk(sk)->opt) IPV6_2292PKTOPTIONS (inet6_sk(sk)->opt) getsockopt IPV6_FLOWLABEL_MGR (inet6_sk(sk)->ipv6_fl_list) For the record, I left a different splat with syzbot's one. unreferenced object 0xffff888006270c60 (size 96): comm "repro2", pid 231, jiffies 4294696626 (age 13.118s) hex dump (first 32 bytes): 01 00 00 00 44 00 00 00 00 00 00 00 00 00 00 00 ....D........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000f6bc7ea9>] sock_kmalloc (net/core/sock.c:2564 net/core/sock.c:2554) [<000000006d699550>] do_ipv6_setsockopt.constprop.0 (net/ipv6/ipv6_sockglue.c:715) [<00000000c3c3b1f5>] ipv6_setsockopt (net/ipv6/ipv6_sockglue.c:1024) [<000000007096a025>] __sys_setsockopt (net/socket.c:2254) [<000000003a8ff47b>] __x64_sys_setsockopt (net/socket.c:2265 net/socket.c:2262 net/socket.c:2262) [<000000007c409dcb>] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) [<00000000e939c4a9>] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) [0]: https://syzkaller.appspot.com/bug?extid=a8430774139ec3ab7176 Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.") Reported-by: syzbot+a8430774139ec3ab7176@syzkaller.appspotmail.com Reported-by: Ayushman Dutta <ayudutta@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220728012220.46918-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28watch_queue: Fix missing locking in add_watch_to_object()Linus Torvalds
If a watch is being added to a queue, it needs to guard against interference from addition of a new watch, manual removal of a watch and removal of a watch due to some other queue being destroyed. KEYCTL_WATCH_KEY guards against this for the same {key,queue} pair by holding the key->sem writelocked and by holding refs on both the key and the queue - but that doesn't prevent interaction from other {key,queue} pairs. While add_watch_to_object() does take the spinlock on the event queue, it doesn't take the lock on the source's watch list. The assumption was that the caller would prevent that (say by taking key->sem) - but that doesn't prevent interference from the destruction of another queue. Fix this by locking the watcher list in add_watch_to_object(). Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reported-by: syzbot+03d7b43290037d1f87ca@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: keyrings@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-28watch_queue: Fix missing rcu annotationDavid Howells
Since __post_watch_notification() walks wlist->watchers with only the RCU read lock held, we need to use RCU methods to add to the list (we already use RCU methods to remove from the list). Fix add_watch_to_object() to use hlist_add_head_rcu() instead of hlist_add_head() for that list. Fixes: c73be61cede5 ("pipe: Add general notification queue support") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-28net/funeth: Fix fun_xdp_tx() and XDP packet reclaimDimitris Michailidis
The current implementation of fun_xdp_tx(), used for XPD_TX, is incorrect in that it takes an address/length pair and later releases it with page_frag_free(). It is OK for XDP_TX but the same code is used by ndo_xdp_xmit. In that case it loses the XDP memory type and releases the packet incorrectly for some of the types. Assorted breakage follows. Change fun_xdp_tx() to take xdp_frame and rely on xdp_return_frame() in reclaim. Fixes: db37bc177dae ("net/funeth: add the data path") Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Link: https://lore.kernel.org/r/20220726215923.7887-1-dmichail@fungible.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-27Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-07-26 This series contains updates to ice driver only. Przemyslaw corrects accounting for VF VLANs to allow for correct number of VLANs for untrusted VF. He also correct issue with checksum offload on VXLAN tunnels. Ani allows for two VSIs to share the same MAC address. Maciej corrects checked bits for descriptor completion of loopback * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: do not setup vlan for loopback VSI ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS) ice: Fix VSIs unable to share unicast MAC ice: Fix tunnel checksum offload with fragmented traffic ice: Fix max VLANs available for VF ==================== Link: https://lore.kernel.org/r/20220726204646.2171589-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27sctp: leave the err path free in sctp_stream_init to sctp_stream_freeXin Long
A NULL pointer dereference was reported by Wei Chen: BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:__list_del_entry_valid+0x26/0x80 Call Trace: <TASK> sctp_sched_dequeue_common+0x1c/0x90 sctp_sched_prio_dequeue+0x67/0x80 __sctp_outq_teardown+0x299/0x380 sctp_outq_free+0x15/0x20 sctp_association_free+0xc3/0x440 sctp_do_sm+0x1ca7/0x2210 sctp_assoc_bh_rcv+0x1f6/0x340 This happens when calling sctp_sendmsg without connecting to server first. In this case, a data chunk already queues up in send queue of client side when processing the INIT_ACK from server in sctp_process_init() where it calls sctp_stream_init() to alloc stream_in. If it fails to alloc stream_in all stream_out will be freed in sctp_stream_init's err path. Then in the asoc freeing it will crash when dequeuing this data chunk as stream_out is missing. As we can't free stream out before dequeuing all data from send queue, and this patch is to fix it by moving the err path stream_out/in freeing in sctp_stream_init() to sctp_stream_free() which is eventually called when freeing the asoc in sctp_association_free(). This fix also makes the code in sctp_process_init() more clear. Note that in sctp_association_init() when it fails in sctp_stream_init(), sctp_association_free() will not be called, and in that case it should go to 'stream_free' err path to free stream instead of 'fail_init'. Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations") Reported-by: Wei Chen <harperchen1110@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/831a3dc100c4908ff76e5bcc363be97f2778bc0b.1658787066.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27sfc: disable softirqs for ptp TXAlejandro Lucero
Sending a PTP packet can imply to use the normal TX driver datapath but invoked from the driver's ptp worker. The kernel generic TX code disables softirqs and preemption before calling specific driver TX code, but the ptp worker does not. Although current ptp driver functionality does not require it, there are several reasons for doing so: 1) The invoked code is always executed with softirqs disabled for non PTP packets. 2) Better if a ptp packet transmission is not interrupted by softirq handling which could lead to high latencies. 3) netdev_xmit_more used by the TX code requires preemption to be disabled. Indeed a solution for dealing with kernel preemption state based on static kernel configuration is not possible since the introduction of dynamic preemption level configuration at boot time using the static calls functionality. Fixes: f79c957a0b537 ("drivers: net: sfc: use netdev_xmit_more helper") Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com> Link: https://lore.kernel.org/r/20220726064504.49613-1-alejandro.lucero-palau@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27ptp: ocp: Select CRC16 in the Kconfig.Jonathan Lemon
The crc16() function is used to check the firmware validity, but the library was not explicitly selected. Fixes: 3c3673bde50c ("ptp: ocp: Add firmware header checks") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Vadim Fedorenko <vadfed@fb.com> Link: https://lore.kernel.org/r/20220726220604.1339972-1-jonathan.lemon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27clk: sunxi-ng: Fix H6 RTC clock definitionJernej Skrabec
While RTC clock was added in H616 ccu_common list, it was not in H6 list. That caused invalid pointer dereference like this: Unable to handle kernel NULL pointer dereference at virtual address 000000000000020c Mem abort info: ESR = 0x96000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=000000004d574000 [000000000000020c] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 96000004 [#1] PREEMPT SMP CPU: 3 PID: 339 Comm: cat Tainted: G B 5.18.0-rc1+ #1352 Hardware name: Tanix TX6 (DT) pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : ccu_gate_is_enabled+0x48/0x74 lr : ccu_gate_is_enabled+0x40/0x74 sp : ffff80000c0b76d0 x29: ffff80000c0b76d0 x28: 00000000016e3600 x27: 0000000000000000 x26: 0000000000000000 x25: 0000000000000002 x24: ffff00000952fe08 x23: ffff800009611400 x22: ffff00000952fe79 x21: 0000000000000000 x20: 0000000000000001 x19: ffff80000aad6f08 x18: 0000000000000000 x17: 2d2d2d2d2d2d2d2d x16: 2d2d2d2d2d2d2d2d x15: 2d2d2d2d2d2d2d2d x14: 0000000000000000 x13: 00000000f2f2f2f2 x12: ffff700001816e89 x11: 1ffff00001816e88 x10: ffff700001816e88 x9 : dfff800000000000 x8 : ffff80000c0b7447 x7 : 0000000000000001 x6 : ffff700001816e88 x5 : ffff80000c0b7440 x4 : 0000000000000001 x3 : ffff800008935c50 x2 : dfff800000000000 x1 : 0000000000000000 x0 : 000000000000020c Call trace: ccu_gate_is_enabled+0x48/0x74 clk_core_is_enabled+0x7c/0x1c0 clk_summary_show_subtree+0x1dc/0x334 clk_summary_show_subtree+0x250/0x334 clk_summary_show_subtree+0x250/0x334 clk_summary_show_subtree+0x250/0x334 clk_summary_show_subtree+0x250/0x334 clk_summary_show+0x90/0xdc seq_read_iter+0x248/0x6d4 seq_read+0x17c/0x1fc full_proxy_read+0x90/0xf0 vfs_read+0xdc/0x28c ksys_read+0xc8/0x174 __arm64_sys_read+0x44/0x5c invoke_syscall+0x60/0x190 el0_svc_common.constprop.0+0x7c/0x160 do_el0_svc+0x38/0xa0 el0_svc+0x68/0x160 el0t_64_sync_handler+0x10c/0x140 el0t_64_sync+0x18c/0x190 Code: d1006260 97e5c981 785e8260 8b0002a0 (b9400000) ---[ end trace 0000000000000000 ]--- Fix that by adding rtc clock to H6 ccu_common list too. Fixes: 38d321b61bda ("clk: sunxi-ng: h6-r: Add RTC gate clock") Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/r/20220719183725.2605141-1-jernej.skrabec@gmail.com Reviewed-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-27tcp: md5: fix IPv4-mapped supportEric Dumazet
After the blamed commit, IPv4 SYN packets handled by a dual stack IPv6 socket are dropped, even if perfectly valid. $ nstat | grep MD5 TcpExtTCPMD5Failure 5 0.0 For a dual stack listener, an incoming IPv4 SYN packet would call tcp_inbound_md5_hash() with @family == AF_INET, while tp->af_specific is pointing to tcp_sock_ipv6_specific. Only later when an IPv4-mapped child is created, tp->af_specific is changed to tcp_sock_ipv6_mapped_specific. Fixes: 7bbb765b7349 ("net/tcp: Merge TCP-MD5 inbound callbacks") Reported-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Dmitry Safonov <dima@arista.com> Tested-by: Leonard Crestez <cdleonard@gmail.com> Link: https://lore.kernel.org/r/20220726115743.2759832-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27ARM: 9216/1: Fix MAX_DMA_ADDRESS overflowFlorian Fainelli
Commit 26f09e9b3a06 ("mm/memblock: add memblock memory allocation apis") added a check to determine whether arm_dma_zone_size is exceeding the amount of kernel virtual address space available between the upper 4GB virtual address limit and PAGE_OFFSET in order to provide a suitable definition of MAX_DMA_ADDRESS that should fit within the 32-bit virtual address space. The quantity used for comparison was off by a missing trailing 0, leading to MAX_DMA_ADDRESS to be overflowing a 32-bit quantity. This was caught thanks to CONFIG_DEBUG_VIRTUAL on the bcm2711 platform where we define a dma_zone_size of 1GB and we have a PAGE_OFFSET value of 0xc000_0000 (CONFIG_VMSPLIT_3G) leading to MAX_DMA_ADDRESS being 0x1_0000_0000 which overflows the unsigned long type used throughout __pa() and then __virt_addr_valid(). Because the virtual address passed to __virt_addr_valid() would now be 0, the function would loudly warn and flood the kernel log, thus making the platform unable to boot properly. Fixes: 26f09e9b3a06 ("mm/memblock: add memblock memory allocation apis") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-07-27Merge tag 'asm-generic-fixes-5.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic fixes from Arnd Bergmann: "Two more bug fixes for asm-generic, one addressing an incorrect Kconfig symbol reference and another one fixing a build failure for the perf tool on mips and possibly others" * tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: remove a broken and needless ifdef conditional tools: Fixed MIPS builds due to struct flock re-definition
2022-07-27Merge tag 'soc-fixes-5.19-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "One last set of changes for the soc tree: - fix clock frequency on lan966x - fix incorrect GPIO numbers on some pxa machines - update Baolin's email address" * tag 'soc-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: pxa2xx: Fix GPIO descriptor tables mailmap: update Baolin Wang's email ARM: dts: lan966x: fix sys_clk frequency
2022-07-27Revert "x86/sev: Expose sev_es_ghcb_hv_call() for use by HyperV"Borislav Petkov
This reverts commit 007faec014cb5d26983c1f86fd08c6539b41392e. Now that hyperv does its own protocol negotiation: 49d6a3c062a1 ("x86/Hyper-V: Add SEV negotiate protocol support in Isolation VM") revert this exposure of the sev_es_ghcb_hv_call() helper. Cc: Wei Liu <wei.liu@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by:Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/r/20220614014553.1915929-1-ltykernel@gmail.com
2022-07-27x86/configs: Update configs in x86_debug.configLukas Bulwahn
Commit 4675ff05de2d ("kmemcheck: rip it out") removed kmemcheck and its corresponding build config KMEMCHECK. Commit 0f620cefd775 ("objtool: Rename "VMLINUX_VALIDATION" -> "NOINSTR_VALIDATION"") renamed the debug config option. Adjust x86_debug.config to those changes in debug configs. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220722121815.27535-1-lukas.bulwahn@gmail.com
2022-07-27Merge tag 'nvme-5.19-2022-07-27' of git://git.infradead.org/nvme into block-5.19Jens Axboe
Pull NVMe fix from Christoph: "nvme fix for Linux 5.19 - yet another duplicate ID quirk (Tobias Gruetzmacher)" * tag 'nvme-5.19-2022-07-27' of git://git.infradead.org/nvme: nvme-pci: Crucial P2 has bogus namespace ids
2022-07-27perf bpf: Remove undefined behavior from bpf_perf_object__next()Ian Rogers
bpf_perf_object__next() folded the last element in the list test with the empty list test. However, this meant that offsets were computed against null and that a struct list_head was compared against a 'struct bpf_perf_object'. Working around this with clang's undefined behavior sanitizer required -fno-sanitize=null and -fno-sanitize=object-size. Remove the undefined behavior by using the regular Linux list APIs and handling the starting case separately from the end testing case. Looking at uses like bpf_perf_object__for_each(), as the constant NULL or non-NULL argument can be constant propagated, the code is no less efficient. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Christy Lee <christylee@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27perf symbol: Skip symbols if SHF_ALLOC flag is not setLeo Yan
Some symbols are observed with the 'st_value' field zeroed. E.g. libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which resides in the '.gnu.warning.getwd' section. Unlike normal sections, such kind of sections are used for linker warning when a file calls deprecated functions, but they are not part of memory images, the symbols in these sections should be dropped. This patch checks the section attribute SHF_ALLOC bit, if the bit is not set, it skips symbols to avoid spurious ones. Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chang Rui <changruinj@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27perf symbol: Correct address for bss symbolsLeo Yan
When using 'perf mem' and 'perf c2c', an issue is observed that tool reports the wrong offset for global data symbols. This is a common issue on both x86 and Arm64 platforms. Let's see an example, for a test program, below is the disassembly for its .bss section which is dumped with objdump: ... Disassembly of section .bss: 0000000000004040 <completed.0>: ... 0000000000004080 <buf1>: ... 00000000000040c0 <buf2>: ... 0000000000004100 <thread>: ... First we used 'perf mem record' to run the test program and then used 'perf --debug verbose=4 mem report' to observe what's the symbol info for 'buf1' and 'buf2' structures. # ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8 # ./perf --debug verbose=4 mem report ... dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028 symbol__new: buf2 0x30a8-0x30e8 ... dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028 symbol__new: buf1 0x3068-0x30a8 ... The perf tool relies on libelf to parse symbols, in executable and shared object files, 'st_value' holds a virtual address; 'sh_addr' is the address at which section's first byte should reside in memory, and 'sh_offset' is the byte offset from the beginning of the file to the first byte in the section. The perf tool uses below formula to convert a symbol's memory address to a file address: file_address = st_value - sh_addr + sh_offset ^ ` Memory address We can see the final adjusted address ranges for buf1 and buf2 are [0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is incorrect, in the code, the structure for 'buf1' and 'buf2' specifies compiler attribute with 64-byte alignment. The problem happens for 'sh_offset', libelf returns it as 0x3028 which is not 64-byte aligned, combining with disassembly, it's likely libelf doesn't respect the alignment for .bss section, therefore, it doesn't return the aligned value for 'sh_offset'. Suggested by Fangrui Song, ELF file contains program header which contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD segments contain the execution info. A better choice for converting memory address to file address is using the formula: file_address = st_value - p_vaddr + p_offset This patch introduces elf_read_program_header() which returns the program header based on the passed 'st_value', then it uses the formula above to calculate the symbol file address; and the debugging log is updated respectively. After applying the change: # ./perf --debug verbose=4 mem report ... dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28 symbol__new: buf2 0x30c0-0x3100 ... dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28 symbol__new: buf1 0x3080-0x30c0 ... Fixes: f17e04afaff84b5c ("perf report: Fix ELF symbol parsing") Reported-by: Chang Rui <changruinj@gmail.com> Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220724060013.171050-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27perf scripts python: Let script to be python2 compliantLeo Yan
The mainline kernel can be used for relative old distros, e.g. RHEL 7. The distro doesn't upgrade from python2 to python3, this causes the building error that the python script is not python2 compliant. To fix the building failure, this patch changes from the python f-string format to traditional string format. Fixes: 12fdd6c009da0d02 ("perf scripts python: Support Arm CoreSight trace data disassembly") Reported-by: Akemi Yagi <toracat@elrepo.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: ElRepo <contact@elrepo.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27tools headers cpufeatures: Sync with the kernel sourcesArnaldo Carvalho de Melo
To pick the changes from: 28a99e95f55c6185 ("x86/amd: Use IBPB for firmware calls") This only causes these perf files to be rebuilt: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o And addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org Link: https://lore.kernel.org/lkml/Yt6oWce9UDAmBAtX@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-27virtio-net: fix the race between refill work and closeJason Wang
We try using cancel_delayed_work_sync() to prevent the work from enabling NAPI. This is insufficient since we don't disable the source of the refill work scheduling. This means an NAPI poll callback after cancel_delayed_work_sync() can schedule the refill work then can re-enable the NAPI that leads to use-after-free [1]. Since the work can enable NAPI, we can't simply disable NAPI before calling cancel_delayed_work_sync(). So fix this by introducing a dedicated boolean to control whether or not the work could be scheduled from NAPI. [1] ================================================================== BUG: KASAN: use-after-free in refill_work+0x43/0xd4 Read of size 2 at addr ffff88810562c92e by task kworker/2:1/42 CPU: 2 PID: 42 Comm: kworker/2:1 Not tainted 5.19.0-rc1+ #480 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Workqueue: events refill_work Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0xbb/0x6ac ? _printk+0xad/0xde ? refill_work+0x43/0xd4 kasan_report+0xa8/0x130 ? refill_work+0x43/0xd4 refill_work+0x43/0xd4 process_one_work+0x43d/0x780 worker_thread+0x2a0/0x6f0 ? process_one_work+0x780/0x780 kthread+0x167/0x1a0 ? kthread_exit+0x50/0x50 ret_from_fork+0x22/0x30 </TASK> ... Fixes: b2baed69e605c ("virtio_net: set/cancel work on ndo_open/ndo_stop") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-27EDAC/ghes: Set the DIMM label unconditionallyToshi Kani
The commit cb51a371d08e ("EDAC/ghes: Setup DIMM label from DMI and use it in error reports") enforced that both the bank and device strings passed to dimm_setup_label() are not NULL. However, there are BIOSes, for example on a HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019 which don't populate both strings: Handle 0x0020, DMI type 17, 84 bytes Memory Device Array Handle: 0x0013 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 32 GB Form Factor: DIMM Set: None Locator: PROC 1 DIMM 1 <===== device Bank Locator: Not Specified <===== bank This results in a buffer overflow because ghes_edac_register() calls strlen() on an uninitialized label, which had non-zero values left over from krealloc_array(): detected buffer overflow in __fortify_strlen ------------[ cut here ]------------ kernel BUG at lib/string_helpers.c:983! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 1 Comm: swapper/0 Tainted: G I 5.18.6-200.fc36.x86_64 #1 Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019 RIP: 0010:fortify_panic ... Call Trace: <TASK> ghes_edac_register.cold ghes_probe platform_probe really_probe __driver_probe_device driver_probe_device __driver_attach ? __device_attach_driver bus_for_each_dev bus_add_driver driver_register acpi_ghes_init acpi_init ? acpi_sleep_proc_init do_one_initcall The label contains garbage because the commit in Fixes reallocs the DIMMs array while scanning the system but doesn't clear the newly allocated memory. Change dimm_setup_label() to always initialize the label to fix the issue. Set it to the empty string in case BIOS does not provide both bank and device so that ghes_edac_register() can keep the default label given by edac_mc_alloc_dimms(). [ bp: Rewrite commit message. ] Fixes: b9cae27728d1f ("EDAC/ghes: Scan the system once on driver init") Co-developed-by: Robert Richter <rric@kernel.org> Signed-off-by: Robert Richter <rric@kernel.org> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Robert Elliott <elliott@hpe.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220719220124.760359-1-toshi.kani@hpe.com
2022-07-27Add Seth Forshee as co-maintainer for idmapped mountsChristian Brauner
Seth has been integral in the design and implementation of idmapped mounts and was the main architect behind the s_user_ns work which ultimately made filesystems such as FUSE and overlayfs available in containers. He continues to be active in both development and review. I'm very happy he decided to maintain this feature. He has my full trust. Link: https://lore.kernel.org/r/20220726141615.1046027-1-brauner@kernel.org Cc: Seth Forshee <sforshee@digitalocean.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-07-26mptcp: Do not return EINPROGRESS when subflow creation succeedsMat Martineau
New subflows are created within the kernel using O_NONBLOCK, so EINPROGRESS is the expected return value from kernel_connect(). __mptcp_subflow_connect() has the correct logic to consider EINPROGRESS to be a successful case, but it has also used that error code as its return value. Before v5.19 this was benign: all the callers ignored the return value. Starting in v5.19 there is a MPTCP_PM_CMD_SUBFLOW_CREATE generic netlink command that does use the return value, so the EINPROGRESS gets propagated to userspace. Make __mptcp_subflow_connect() always return 0 on success instead. Fixes: ec3edaa7ca6c ("mptcp: Add handling of outgoing MP_JOIN requests") Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/20220725205231.87529-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski
Florian Westphal says: ==================== netfilter updates for net Three late fixes for netfilter: 1) If nf_queue user requests packet truncation below size of l3 header, we corrupt the skb, then crash. Reject such requests. 2) add cond_resched() calls when doing cycle detection in the nf_tables graph. This avoids softlockup warning with certain rulesets. 3) Reject rulesets that use nftables 'queue' expression in family/chain combinations other than those that are supported. Currently the ruleset will load, but when userspace attempts to reinject you get WARN splat + packet drops. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_queue: only allow supported familes and hooks netfilter: nf_tables: add rescheduling points during loop detection walks netfilter: nf_queue: do not allow packet truncation below transport header offset ==================== Link: https://lore.kernel.org/r/20220726192056.13497-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26Merge tag 'for-net-2022-07-26' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix early wakeup after suspend - Fix double free on error - Fix use-after-free on l2cap_chan_put * tag 'for-net-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put Bluetooth: Always set event mask on suspend Bluetooth: mgmt: Fix double free on error path ==================== Link: https://lore.kernel.org/r/20220726221328.423714-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26Merge tag 'mm-hotfixes-stable-2022-07-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Thirteen hotfixes. Eight are cc:stable and the remainder are for post-5.18 issues or are too minor to warrant backporting" * tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: update Gao Xiang's email addresses userfaultfd: provide properly masked address for huge-pages Revert "ocfs2: mount shared volume without ha stack" hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte fs: sendfile handles O_NONBLOCK of out_fd ntfs: fix use-after-free in ntfs_ucsncmp() secretmem: fix unhandled fault in truncate mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range() mm: fix missing wake-up event for FSDAX pages mm: fix page leak with multiple threads mapping the same page mailmap: update Seth Forshee's email address tmpfs: fix the issue that the mount and remount results are inconsistent. mm: kfence: apply kmemleak_ignore_phys on early allocated pool
2022-07-26scsi: ufs: core: Fix a race condition related to device managementBart Van Assche
If a device management command completion happens after wait_for_completion_timeout() times out and before ufshcd_clear_cmds() is called, then the completion code may crash on the complete() call in __ufshcd_transfer_req_compl(). Fix the following crash: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 Call trace: complete+0x64/0x178 __ufshcd_transfer_req_compl+0x30c/0x9c0 ufshcd_poll+0xf0/0x208 ufshcd_sl_intr+0xb8/0xf0 ufshcd_intr+0x168/0x2f4 __handle_irq_event_percpu+0xa0/0x30c handle_irq_event+0x84/0x178 handle_fasteoi_irq+0x150/0x2e8 __handle_domain_irq+0x114/0x1e4 gic_handle_irq.31846+0x58/0x300 el1_irq+0xe4/0x1c0 efi_header_end+0x110/0x680 __irq_exit_rcu+0x108/0x124 __handle_domain_irq+0x118/0x1e4 gic_handle_irq.31846+0x58/0x300 el1_irq+0xe4/0x1c0 cpuidle_enter_state+0x3ac/0x8c4 do_idle+0x2fc/0x55c cpu_startup_entry+0x84/0x90 kernel_init+0x0/0x310 start_kernel+0x0/0x608 start_kernel+0x4ec/0x608 Link: https://lore.kernel.org/r/20220720170228.1598842-1-bvanassche@acm.org Fixes: 5a0b0cb9bee7 ("[SCSI] ufs: Add support for sending NOP OUT UPIU") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Avri Altman <avri.altman@wdc.com> Cc: Bean Huo <beanhuo@micron.com> Cc: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-26scsi: core: Fix warning in scsi_alloc_sgtables()Jason Yan
As explained in SG_IO howto[1]: "If iovec_count is non-zero then 'dxfer_len' should be equal to the sum of iov_len lengths. If not, the minimum of the two is the transfer length." When iovec_count is non-zero and dxfer_len is zero, the sg_io() just genarated a null bio, and finally caused a warning below. To fix it, skip generating a bio for this request if dxfer_len is zero. [1] https://tldp.org/HOWTO/SCSI-Generic-HOWTO/x198.html WARNING: CPU: 2 PID: 3643 at drivers/scsi/scsi_lib.c:1032 scsi_alloc_sgtables+0xc7d/0xf70 drivers/scsi/scsi_lib.c:1032 Modules linked in: CPU: 2 PID: 3643 Comm: syz-executor397 Not tainted 5.17.0-rc3-syzkaller-00316-gb81b1829e7e3 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-204/01/2014 RIP: 0010:scsi_alloc_sgtables+0xc7d/0xf70 drivers/scsi/scsi_lib.c:1032 Code: e7 fc 31 ff 44 89 f6 e8 c1 4e e7 fc 45 85 f6 0f 84 1a f5 ff ff e8 93 4c e7 fc 83 c5 01 0f b7 ed e9 0f f5 ff ff e8 83 4c e7 fc <0f> 0b 41 bc 0a 00 00 00 e9 2b fb ff ff 41 bc 09 00 00 00 e9 20 fb RSP: 0018:ffffc90000d07558 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff88801bfc96a0 RCX: 0000000000000000 RDX: ffff88801c876000 RSI: ffffffff849060bd RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff849055b9 R11: 0000000000000000 R12: ffff888012b8c000 R13: ffff88801bfc9580 R14: 0000000000000000 R15: ffff88801432c000 FS: 00007effdec8e700(0000) GS:ffff88802cc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007effdec6d718 CR3: 00000000206d6000 CR4: 0000000000150ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> scsi_setup_scsi_cmnd drivers/scsi/scsi_lib.c:1219 [inline] scsi_prepare_cmd drivers/scsi/scsi_lib.c:1614 [inline] scsi_queue_rq+0x283e/0x3630 drivers/scsi/scsi_lib.c:1730 blk_mq_dispatch_rq_list+0x6ea/0x22e0 block/blk-mq.c:1851 __blk_mq_sched_dispatch_requests+0x20b/0x410 block/blk-mq-sched.c:299 blk_mq_sched_dispatch_requests+0xfb/0x180 block/blk-mq-sched.c:332 __blk_mq_run_hw_queue+0xf9/0x350 block/blk-mq.c:1968 __blk_mq_delay_run_hw_queue+0x5b6/0x6c0 block/blk-mq.c:2045 blk_mq_run_hw_queue+0x30f/0x480 block/blk-mq.c:2096 blk_mq_sched_insert_request+0x340/0x440 block/blk-mq-sched.c:451 blk_execute_rq+0xcc/0x340 block/blk-mq.c:1231 sg_io+0x67c/0x1210 drivers/scsi/scsi_ioctl.c:485 scsi_ioctl_sg_io drivers/scsi/scsi_ioctl.c:866 [inline] scsi_ioctl+0xa66/0x1560 drivers/scsi/scsi_ioctl.c:921 sd_ioctl+0x199/0x2a0 drivers/scsi/sd.c:1576 blkdev_ioctl+0x37a/0x800 block/ioctl.c:588 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7effdecdc5d9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 81 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007effdec8e2f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007effded664c0 RCX: 00007effdecdc5d9 RDX: 0000000020002300 RSI: 0000000000002285 RDI: 0000000000000004 RBP: 00007effded34034 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003 R13: 00007effded34054 R14: 2f30656c69662f2e R15: 00007effded664c8 Link: https://lore.kernel.org/r/20220720025120.3226770-1-yanaijie@huawei.com Fixes: 25636e282fe9 ("block: fix SG_IO vector request data length handling") Reported-by: syzbot+d44b35ecfb807e5af0b5@syzkaller.appspotmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-26scsi: ufs: host: Hold reference returned by of_parse_phandle()Liang He
In ufshcd_populate_vreg(), we should hold the reference returned by of_parse_phandle() and then use it to call of_node_put() for refcount balance. Link: https://lore.kernel.org/r/20220719071529.1081166-1-windhl@126.com Fixes: aa4976130934 ("ufs: Add regulator enable support") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-26scsi: mpt3sas: Stop fw fault watchdog work item during system shutdownDavid Jeffery
During system shutdown or reboot, mpt3sas will reset the firmware back to ready state. However, the driver leaves running a watchdog work item intended to keep the firmware in operational state. This causes a second, unneeded reset on shutdown and moves the firmware back to operational instead of in ready state as intended. And if the mpt3sas_fwfault_debug module parameter is set, this extra reset also panics the system. mpt3sas's scsih_shutdown needs to stop the watchdog before resetting the firmware back to ready state. Link: https://lore.kernel.org/r/20220722142448.6289-1-djeffery@redhat.com Fixes: fae21608c31c ("scsi: mpt3sas: Transition IOC to Ready state during shutdown") Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-07-26mailmap: update Gao Xiang's email addressesGao Xiang
I've been in Alibaba Cloud for more than one year, mainly to address cloud-native challenges (such as high-performance container images) for open source communities. Update my email addresses on behalf of my current employer (Alibaba Cloud) to support all my (team) work in this area. Also add an outdated @redhat.com address of me. Link: https://lkml.kernel.org/r/20220719154246.62970-1-xiang@kernel.org Signed-off-by: Gao Xiang <xiang@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-26userfaultfd: provide properly masked address for huge-pagesNadav Amit
Commit 824ddc601adc ("userfaultfd: provide unmasked address on page-fault") was introduced to fix an old bug, in which the offset in the address of a page-fault was masked. Concerns were raised - although were never backed by actual code - that some userspace code might break because the bug has been around for quite a while. To address these concerns a new flag was introduced, and only when this flag is set by the user, userfaultfd provides the exact address of the page-fault. The commit however had a bug, and if the flag is unset, the offset was always masked based on a base-page granularity. Yet, for huge-pages, the behavior prior to the commit was that the address is masked to the huge-page granulrity. While there are no reports on real breakage, fix this issue. If the flag is unset, use the address with the masking that was done before. Link: https://lkml.kernel.org/r/20220711165906.2682-1-namit@vmware.com Fixes: 824ddc601adc ("userfaultfd: provide unmasked address on page-fault") Signed-off-by: Nadav Amit <namit@vmware.com> Reported-by: James Houghton <jthoughton@google.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: James Houghton <jthoughton@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-26Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_putLuiz Augusto von Dentz
This fixes the following trace which is caused by hci_rx_work starting up *after* the final channel reference has been put() during sock_close() but *before* the references to the channel have been destroyed, so instead the code now rely on kref_get_unless_zero/l2cap_chan_hold_unless_zero to prevent referencing a channel that is about to be destroyed. refcount_t: increment on 0; use-after-free. BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0 Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705 CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S W 4.14.234-00003-g1fb6d0bd49a4-dirty #28 Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT) Workqueue: hci0 hci_rx_work Call trace: dump_backtrace+0x0/0x378 show_stack+0x20/0x2c dump_stack+0x124/0x148 print_address_description+0x80/0x2e8 __kasan_report+0x168/0x188 kasan_report+0x10/0x18 __asan_load4+0x84/0x8c refcount_dec_and_test+0x20/0xd0 l2cap_chan_put+0x48/0x12c l2cap_recv_frame+0x4770/0x6550 l2cap_recv_acldata+0x44c/0x7a4 hci_acldata_packet+0x100/0x188 hci_rx_work+0x178/0x23c process_one_work+0x35c/0x95c worker_thread+0x4cc/0x960 kthread+0x1a8/0x1c4 ret_from_fork+0x10/0x18 Cc: stable@kernel.org Reported-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-26Bluetooth: Always set event mask on suspendAbhishek Pandit-Subedi
When suspending, always set the event mask once disconnects are successful. Otherwise, if wakeup is disallowed, the event mask is not set before suspend continues and can result in an early wakeup. Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier") Cc: stable@vger.kernel.org Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-26Bluetooth: mgmt: Fix double free on error pathDan Carpenter
Don't call mgmt_pending_remove() twice (double free). Fixes: 6b88eff43704 ("Bluetooth: hci_sync: Refactor remove Adv Monitor") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-07-26wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()Tetsuo Handa
lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1], for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped") guards clear_bit() using fq.lock even before fq_init() from ieee80211_txq_setup_flows() initializes this spinlock. According to discussion [2], Toke was not happy with expanding usage of fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we can instead use synchronize_rcu() for flushing ieee80211_wake_txqs(). Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1] Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2] Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped") Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> Acked-by: Toke Høiland-Jørgensen <toke@kernel.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/9cc9b81d-75a3-3925-b612-9d0ad3cab82b@I-love.SAKURA.ne.jp [ pick up commit 3598cb6e1862 ("wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()") from -next] Link: https://lore.kernel.org/all/87o7xcq6qt.fsf@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>