Dev/cvadrevu/v5.15.26 rt34#55
Merged
gratian merged 642 commits intoni:nilrt/master/5.15from Mar 11, 2022
Merged
Conversation
commit 1de9770 upstream. The callback functions of clcsock will be saved and replaced during the fallback. But if the fallback happens more than once, then the copies of these callback functions will be overwritten incorrectly, resulting in a loop call issue: clcsk->sk_error_report |- smc_fback_error_report() <------------------------------| |- smc_fback_forward_wakeup() | (loop) |- clcsock_callback() (incorrectly overwritten) | |- smc->clcsk_error_report() ------------------| So this patch fixes the issue by saving these function pointers only once in the fallback and avoiding overwriting. Reported-by: syzbot+4de3c0e8a263e1e499bc@syzkaller.appspotmail.com Fixes: 341adee ("net/smc: Forward wakeup to smc socket waitqueue after fallback") Link: https://lore.kernel.org/r/0000000000006d045e05d78776f6@google.com Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 525b108 upstream. The function mt7531_phy_mode_supported in the DSA driver set supported mode to PHY_INTERFACE_MODE_GMII instead of PHY_INTERFACE_MODE_INTERNAL for the internal PHY, so this check breaks the PHY initialization: mt7530 mdio-bus:00 wan (uninitialized): failed to connect to PHY: -EINVAL Remove the check to make it work again. Reported-by: Hauke Mehrtens <hauke@hauke-m.de> Fixes: e40d2cc ("net: phy: add MediaTek Gigabit Ethernet PHY driver") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com> Tested-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf8e59f upstream. If NIC had packets in tx queue at the moment link down event happened, it could result in tx timeout when link got back up. Since device has more than one tx queue we need to reset them accordingly. Fixes: 057f4af ("atl1c: add 4 RX/TX queue support for Mikrotik 10/25G NIC") Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com> Link: https://lore.kernel.org/r/20220211065123.4187615-1-gatis@mikrotik.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 032062f upstream. When a link comes up we add its presence to the name table to make it possible for users to subscribe for link up/down events. However, after a previous call signature change the binding is wrongly published with the peer node as publishing node, instead of the own node as it should be. This has the effect that the command 'tipc name table show' will list the link binding (service type 2) with node scope and a peer node as originator, something that obviously is impossible. We correct this bug here. Fixes: 50a3499 ("tipc: simplify signature of tipc_namtbl_publish()") Signed-off-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20220214013852.2803940-1-jmaloy@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a36ed7 upstream. Clang static analysis reports this representative problem dpaa2-switch-flower.c:616:24: warning: The right operand of '==' is a garbage value tmp->cfg.vlan_id == vlan) { ^ ~~~~ vlan is set in dpaa2_switch_flower_parse_mirror_key(). However this function can return success without setting vlan. So change the default return to -EOPNOTSUPP. Fixes: 0f3faec ("dpaa2-switch: add VLAN based mirroring") Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07dd448 upstream. 1588 Single Step Timestamping code path uses a mutex to enforce atomicity for two events: - update of ptp single step register - transmit ptp event packet Before this patch the mutex was not initialized. This caused unexpected crashes in the Tx function. Fixes: c552118 ("dpaa2-eth: support PTP Sync packet one-step timestamping") Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…g gets disabled commit c832962 upstream. Whenever bridge driver hits the max capacity of MDBs, it disables the MC processing (by setting corresponding bridge option), but never notifies switchdev about such change (the notifiers are called only upon explicit setting of this option, through the registered netlink interface). This could lead to situation when Software MDB processing gets disabled, but this event never gets offloaded to the underlying Hardware. Fix this by adding a notify message in such case. Fixes: 147c1e9 ("switchdev: bridge: Offload multicast disabled") Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20220215165303.31908-1-oleksandr.mazur@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31ded15 upstream. This was detected by the gcc in Fedora Rawhide's gcc: 50 11.01 fedora:rawhide : FAIL gcc version 12.0.1 20220205 (Red Hat 12.0.1-0) (GCC) inlined from 'bpf__config_obj' at util/bpf-loader.c:1242:9: util/bpf-loader.c:1225:34: error: pointer 'map_opt' may be used after 'free' [-Werror=use-after-free] 1225 | *key_scan_pos += strlen(map_opt); | ^~~~~~~~~~~~~~~ util/bpf-loader.c:1223:9: note: call to 'free' here 1223 | free(map_name); | ^~~~~~~~~~~~~~ cc1: all warnings being treated as errors So do the calculations on the pointer before freeing it. Fixes: 04f9bf2 ("perf bpf-loader: Add missing '*' for key_scan_pos") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang ShaoBo <bobo.shaobowang@huawei.com> Link: https://lore.kernel.org/lkml/Yg1VtQxKrPpS3uNA@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7e793a upstream. non-regular file needs to be compiled and then copied to the output directory. Remove it from TEST_PROGS and add it to TEST_GEN_PROGS. This removes error thrown by rsync when non-regular object isn't found: rsync: [sender] link_stat "/linux/tools/testing/selftests/exec/non-regular" failed: No such file or directory (2) rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3] Fixes: 0f71241 ("selftests/exec: add file type errno tests") Reported-by: "kernelci.org bot" <bot@kernelci.org> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f6de67 upstream. In commit: 114945d ("arm64: Fix labels in el2_setup macros") We renamed a label from '1' to '.Lskip_gicv3_\@', but failed to update a branch to it, which now targets a later label also called '1'. The branch is taken rarely, when GICv3 is present but SRE is disabled at EL3, causing a boot-time crash. Update the caller to the new label name. Fixes: 114945d ("arm64: Fix labels in el2_setup macros") Cc: <stable@vger.kernel.org> # 5.12.x Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Link: https://lore.kernel.org/r/20220214175643.21931-1-joakim.tjernlund@infinera.com Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…k Ultra commit 19d20c7 upstream. Commit 83b7dcb introduced a generic implicit feedback parser, which fails to execute for M-Audio FastTrack Ultra sound cards. The issue is with the ENDPOINT_SYNCTYPE check in add_generic_implicit_fb() where the SYNCTYPE is ADAPTIVE instead of ASYNC. The reason is that the sync type of the FastTrack output endpoints are set to adaptive in the quirks table since commit 65f0444. Fixes: 83b7dcb ("ALSA: usb-audio: Add generic implicit fb parsing") Signed-off-by: Matteo Martelli <matteomartelli3@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220211224913.20683-2-matteomartelli3@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c07f2c7 upstream. Legion Y9000X 2019 has the same speaker with Y9000X 2020, but with a different quirk address. Add one quirk entry to make the speaker work on Y9000X 2019 too. Signed-off-by: Yu Huang <diwang90@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220212160835.165065-1-diwang90@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a84583 upstream. The recently introduced coef_mutex for Realtek codec seems causing a deadlock when the relevant code is invoked from the power-off state; then the HD-audio core tries to power-up internally, and this kicks off the codec runtime PM code that tries to take the same coef_mutex. In order to avoid the deadlock, do the temporary power up/down around the coef_mutex acquisition and release. This assures that the power-up sequence runs before the mutex, hence no re-entrance will happen. Fixes: b837a9f ("ALSA: hda: realtek: Fix race at concurrent COEF updates") Reported-and-tested-by: Julian Wollrath <jwollrath@web.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220214132838.4db10fca@schienar Link: https://lore.kernel.org/r/20220214130410.21230-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6317f74 upstream. The forced probe mask via probe_mask 0x100 bit doesn't work any longer as expected since the bus init code was moved and it's clearing the codec_mask value that was set beforehand. This patch fixes the long-time regression by moving the check_probe_mask() call. Fixes: a41d122 ("ALSA: hda - Embed bus into controller object") Reported-by: dmummenschanz@web.de Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/trinity-f018660b-95c9-442b-a2a8-c92a56eb07ed-1644345967148@3c-app-webde-bap22 Link: https://lore.kernel.org/r/20220214100020.8870-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd8e5b1 upstream. By some unknown reason, BIOS on Shenker Dock 15 doesn't set up the codec mask properly for the onboard audio. Let's set the forced codec mask to enable the codec discovery. Reported-by: dmummenschanz@web.de Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/trinity-f018660b-95c9-442b-a2a8-c92a56eb07ed-1644345967148@3c-app-webde-bap22 Link: https://lore.kernel.org/r/20220214100020.8870-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 564778d upstream. When writing out a stereo control we discard the change notification from the first channel, meaning that events are only generated based on changes to the second channel. Ensure that we report a change if either channel has changed. Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-2-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 650204d upstream. When writing out a stereo control we discard the change notification from the first channel, meaning that events are only generated based on changes to the second channel. Ensure that we report a change if either channel has changed. Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-4-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f3d90a upstream. When writing out a stereo control we discard the change notification from the first channel, meaning that events are only generated based on changes to the second channel. Ensure that we report a change if either channel has changed. Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-3-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b7c463 upstream. When writing out a stereo control we discard the change notification from the first channel, meaning that events are only generated based on changes to the second channel. Ensure that we report a change if either channel has changed. Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-5-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd5a927 upstream. 'setcifsacl -g <SID>' silently fails to set the group SID on server. Actually, the bug existed since commit 438471b ("CIFS: Add support for setting owner info, dos attributes, and create time"), but this fix will not apply cleanly to kernel versions <= v5.10. Fixes: 3970acf ("SMB3: Add support for getting and setting SACLs") Cc: stable@vger.kernel.org # 5.11+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9bb162f upstream. Allthough kernel text is always mapped with BATs, we still have inittext mapped with pages, so TLB miss handling is required when CONFIG_DEBUG_PAGEALLOC or CONFIG_KFENCE is set. The final solution should be to set a BAT that also maps inittext but that BAT then needs to be cleared at end of init, and it will require more changes to be able to do it properly. As DEBUG_PAGEALLOC or KFENCE are debugging, performance is not a big deal so let's fix it simply for now to enable easy stable application. Fixes: 035b19a ("powerpc/32s: Always map kernel text and rodata with BATs") Cc: stable@vger.kernel.org # v5.11+ Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/aea33b4813a26bdb9378b5f273f00bd5d4abe240.1638857364.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe663df upstream. Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian 2.37.90.20220207) the following build error shows up: {standard input}: Assembler messages: {standard input}:2088: Error: unrecognized opcode: `ptesync' make[3]: *** [/builds/linux/scripts/Makefile.build:287: arch/powerpc/lib/sstep.o] Error 1 Add the 'ifdef CONFIG_PPC64' around the 'ptesync' in function 'emulate_update_regs()' to like it is in 'analyse_instr()'. Since it looks like it got dropped inadvertently by commit 3cdfcbf ("powerpc: Change analyse_instr so it doesn't modify *regs"). A key detail is that analyse_instr() will never recognise lwsync or ptesync on 32-bit (because of the existing ifdef), and as a result emulate_update_regs() should never be called with an op specifying either of those on 32-bit. So removing them from emulate_update_regs() should be a nop in terms of runtime behaviour. Fixes: 3cdfcbf ("powerpc: Change analyse_instr so it doesn't modify *regs") Cc: stable@vger.kernel.org # v4.14+ Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> [mpe: Add last paragraph of change log mentioning analyse_instr() details] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220211005113.1361436-1-anders.roxell@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9161f36 upstream. If gpmi_nfc_apply_timings() fails, the PM runtime usage counter must be dropped. Reported-by: Pavel Machek <pavel@denx.de> Fixes: f53d4c1 ("mtd: rawnand: gpmi: Add ERR007117 protection for nfc_apply_timings") Signed-off-by: Christian Eggers <ceggers@arri.de> Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220125081619.6286-1-ceggers@arri.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9405b5f upstream. The conversion to the new API broke the snapshot mount option due to 32 vs. 64 bit type mismatch Fixes: 24e0a1e ("cifs: switch to new mount api") Cc: stable@vger.kernel.org # 5.11+ Reported-by: <ruckajan10@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c08e584 upstream. The previous bug fix had an unfortunate side effect that broke distribution of binding table entries between nodes. The updated tipc_sock_addr struct is also used further down in the same function, and there the old value is still the correct one. Fixes: 032062f ("tipc: fix wrong publisher node address in link publications") Signed-off-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20220216020009.3404578-1-jmaloy@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d77ea82 upstream. Commit 7252a36 ("scsi: ufs: Avoid busy-waiting by eliminating tag conflicts") guarantees that 'tag' is not in use by any SCSI command. Remove the check that returns early if a conflict occurs. Link: https://lore.kernel.org/r/20211203231950.193369-6-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 945c3cc upstream. The following deadlock has been observed on a test setup: - All tags allocated - The SCSI error handler calls ufshcd_eh_host_reset_handler() - ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler() - ufshcd_err_handler() locks up as follows: Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt Call trace: __switch_to+0x298/0x5d8 __schedule+0x6cc/0xa94 schedule+0x12c/0x298 blk_mq_get_tag+0x210/0x480 __blk_mq_alloc_request+0x1c8/0x284 blk_get_request+0x74/0x134 ufshcd_exec_dev_cmd+0x68/0x640 ufshcd_verify_dev_init+0x68/0x35c ufshcd_probe_hba+0x12c/0x1cb8 ufshcd_host_reset_and_restore+0x88/0x254 ufshcd_reset_and_restore+0xd0/0x354 ufshcd_err_handler+0x408/0xc58 process_one_work+0x24c/0x66c worker_thread+0x3e8/0xa4c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30 Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved request. Link: https://lore.kernel.org/r/20211203231950.193369-10-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 307f314 upstream. Per TAS2770 datasheet there must be a 1 ms delay from reset to first command. So insert delays into the driver where appropriate. Fixes: 1a476ab ("tas2770: add tas2770 smart PA kernel driver") Signed-off-by: Martin Povišer <povik+lin@cutebit.org> Link: https://lore.kernel.org/r/20220204095301.5554-1-povik+lin@cutebit.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8d251f upstream. In commit da0363f ("ASoC: qcom: Fix for DMA interrupt clear reg overwriting") we changed regmap_write() to regmap_update_bits() so that we can avoid overwriting bits that we didn't intend to modify. Unfortunately this change breaks the case where a register is writable but not readable, which is exactly how the HDMI irq clear register is designed (grep around LPASS_HDMITX_APP_IRQCLEAR_REG to see how it's write only). That's because regmap_update_bits() tries to read the register from the hardware and if it isn't readable it looks in the regmap cache to see what was written there last time to compare against what we want to write there. Eventually, we're unable to modify this register at all because the bits that we're trying to set are already set in the cache. This is doubly bad for the irq clear register because you have to write the bit to clear an interrupt. Given the irq is level triggered, we see an interrupt storm upon plugging in an HDMI cable and starting audio playback. The irq storm is so great that performance degrades significantly, leading to CPU soft lockups. Fix it by using regmap_write_bits() so that we really do write the bits in the clear register that we want to. This brings the number of irqs handled by lpass_dma_interrupt_handler() down from ~150k/sec to ~10/sec. Fixes: da0363f ("ASoC: qcom: Fix for DMA interrupt clear reg overwriting") Cc: Srinivasa Rao Mandadapu <srivasam@codeaurora.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20220209232520.4017634-1-swboyd@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e92bc4c upstream. Now that we disable wbt by set WBT_STATE_OFF_DEFAULT in wbt_disable_default() when switch elevator to bfq. And when we remove scsi device, wbt will be enabled by wbt_enable_default. If it become false positive between wbt_wait() and wbt_track() when submit write request. The following is the scenario that triggered the problem. T1 T2 T3 elevator_switch_mq bfq_init_queue wbt_disable_default <= Set rwb->enable_state (OFF) Submit_bio blk_mq_make_request rq_qos_throttle <= rwb->enable_state (OFF) scsi_remove_device sd_remove del_gendisk blk_unregister_queue elv_unregister_queue wbt_enable_default <= Set rwb->enable_state (ON) q_qos_track <= rwb->enable_state (ON) ^^^^^^ this request will mark WBT_TRACKED without inflight add and will lead to drop rqw->inflight to -1 in wbt_done() which will trigger IO hung. Fix this by move wbt_enable_default() from elv_unregister to bfq_exit_queue(). Only re-enable wbt when bfq exit. Fixes: 76a8040 ("blk-wbt: make sure throttle is enabled properly") Remove oneline stale comment, and kill one oneshot local variable. Signed-off-by: Ming Lei <ming.lei@rehdat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/linux-block/20211214133103.551813-1-qiulaibin@huawei.com/ Signed-off-by: Laibin Qiu <qiulaibin@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22e2100 upstream. The trace_hardirqs_{on,off}() require the caller to setup frame pointer properly. This because these two functions use macro 'CALLER_ADDR1' (aka. __builtin_return_address(1)) to acquire caller info. If the $fp is used for other purpose, the code generated this macro (as below) could trigger memory access fault. 0xffffffff8011510e <+80>: ld a1,-16(s0) 0xffffffff80115112 <+84>: ld s2,-8(a1) # <-- paging fault here The oops message during booting if compiled with 'irqoff' tracer enabled: [ 0.039615][ T0] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000f8 [ 0.041925][ T0] Oops [#1] [ 0.042063][ T0] Modules linked in: [ 0.042864][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc1-00233-g9a20c48d1ed2 ni#29 [ 0.043568][ T0] Hardware name: riscv-virtio,qemu (DT) [ 0.044343][ T0] epc : trace_hardirqs_on+0x56/0xe2 [ 0.044601][ T0] ra : restore_all+0x12/0x6e [ 0.044721][ T0] epc : ffffffff80126a5c ra : ffffffff80003b94 sp : ffffffff81403db0 [ 0.044801][ T0] gp : ffffffff8163acd8 tp : ffffffff81414880 t0 : 0000000000000020 [ 0.044882][ T0] t1 : 0098968000000000 t2 : 0000000000000000 s0 : ffffffff81403de0 [ 0.044967][ T0] s1 : 0000000000000000 a0 : 0000000000000001 a1 : 0000000000000100 [ 0.045046][ T0] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 [ 0.045124][ T0] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000054494d45 [ 0.045210][ T0] s2 : ffffffff80003b94 s3 : ffffffff81a8f1b0 s4 : ffffffff80e27b50 [ 0.045289][ T0] s5 : ffffffff81414880 s6 : ffffffff8160fa00 s7 : 00000000800120e8 [ 0.045389][ T0] s8 : 0000000080013100 s9 : 000000000000007f s10: 0000000000000000 [ 0.045474][ T0] s11: 0000000000000000 t3 : 7fffffffffffffff t4 : 0000000000000000 [ 0.045548][ T0] t5 : 0000000000000000 t6 : ffffffff814aa368 [ 0.045620][ T0] status: 0000000200000100 badaddr: 00000000000000f8 cause: 000000000000000d [ 0.046402][ T0] [<ffffffff80003b94>] restore_all+0x12/0x6e This because the $fp(aka. $s0) register is not used as frame pointer in the assembly entry code. resume_kernel: REG_L s0, TASK_TI_PREEMPT_COUNT(tp) bnez s0, restore_all REG_L s0, TASK_TI_FLAGS(tp) andi s0, s0, _TIF_NEED_RESCHED beqz s0, restore_all call preempt_schedule_irq j restore_all To fix above issue, here we add one extra level wrapper for function trace_hardirqs_{on,off}() so they can be safely called by low level entry code. Signed-off-by: Changbin Du <changbin.du@gmail.com> Fixes: 3c46979 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 737b0ef upstream. n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010. See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516 The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to the newer 27.010 here. Chapter 5.4.6.3.7 describes the encoding of the control signal octet used by the MSC (modem status command). The same encoding is also used in convergence layer type 2 as described in chapter 5.5.2. Table 7 and 24 both require the DV (data valid) bit to be set 1 for outgoing control signal octets sent by the DTE (data terminal equipment), i.e. for the initiator side. Currently, the DV bit is only set if CD (carrier detect) is on, regardless of the side. This patch fixes this behavior by setting the DV bit on the initiator side unconditionally. Fixes: e1eaea4 ("tty: n_gsm line discipline") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-1-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3b7468 upstream. Trying to open a DLCI by sending a SABM frame may fail with a timeout. The link is closed on the initiator side without informing the responder about this event. The responder assumes the link is open after sending a UA frame to answer the SABM frame. The link gets stuck in a half open state. This patch fixes this by initiating the proper link termination procedure after link setup timeout instead of silently closing it down. Fixes: e1eaea4 ("tty: n_gsm line discipline") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-3-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96b169f upstream. The here fixed commit made the tty hangup asynchronous to avoid a circular locking warning. I could not reproduce this warning. Furthermore, due to the asynchronous hangup the function call now gets queued up while the underlying tty is being freed. Depending on the timing this results in a NULL pointer access in the global work queue scheduler. To be precise in process_one_work(). Therefore, the previous commit made the issue worse which it tried to fix. This patch fixes this by falling back to the old behavior which uses a blocking tty hangup call before freeing up the associated tty. Fixes: 7030082 ("tty: n_gsm: avoid recursive locking with async port hangup") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-4-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c19d935 upstream. tty flow control is handled via gsmtty_throttle() and gsmtty_unthrottle(). Both functions propagate the outgoing hardware flow control state to the remote side via MSC (modem status command) frames. The local state is taken from the RTS (ready to send) flag of the tty. However, RTS gets mapped to DTR (data terminal ready), which is wrong. This patch corrects this by mapping RTS to RTS. Fixes: e1eaea4 ("tty: n_gsm line discipline") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-5-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 687f9ad upstream. The function gsm_process_modem() exists to handle modem status bits of incoming frames. This includes incoming MSC (modem status command) frames and convergence layer type 2 data frames. The function, however, was only designed to handle MSC frames as it expects the command length. Within gsm_dlci_data() it is wrongly assumed that this is the same as the data frame length. This is only true if the data frame contains only 1 byte of payload. This patch names the length parameter of gsm_process_modem() in a generic manner to reflect its association. It also corrects all calls to the function to handle the variable number of modem status octets correctly in both cases. Fixes: 7263287 ("tty: n_gsm: Fixed logic to decode break signal from modem status") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-6-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2ab75b upstream. In the current implementation the user may open a virtual tty which then could fail to establish the underlying DLCI. The function gsmtty_open() gets stuck in tty_port_block_til_ready() while waiting for a carrier rise. This happens if the remote side fails to acknowledge the link establishment request in time or completely. At some point gsm_dlci_close() is called to abort the link establishment attempt. The function tries to inform the associated virtual tty by performing a hangup. But the blocking loop within tty_port_block_til_ready() is not informed about this event. The patch proposed here fixes this by resetting the initialization state of the virtual tty to ensure the loop exits and triggering it to make tty_port_block_til_ready() return. Fixes: e1eaea4 ("tty: n_gsm line discipline") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220218073123.2121-7-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba2ab85 upstream. The loop exited too early so the k210_pinconf_drive_strength[0] array element was never used. Fixes: d4c34d0 ("pinctrl: Add RISC-V Canaan Kendryte K210 FPIOA driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Sean Anderson <seanga2@gmail.com> Link: https://lore.kernel.org/r/20220209180804.GA18385@kili Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e9f7b92 upstream. Using bias-pull-up would actually cause the pin to have its pull-down enabled. Fix this. Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Fixes: d4c34d0 ("pinctrl: Add RISC-V Canaan Kendryte K210 FPIOA driver") Link: https://lore.kernel.org/r/20220209182822.640905-1-seanga2@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1e972a upstream. The tegra186 GPIO driver makes the assumption that the pointer returned by irq_data_get_irq_chip_data() is a pointer to a tegra_gpio structure. Unfortunately, it is actually a pointer to the inner gpio_chip structure, as mandated by the gpiolib infrastructure. Nice try. The saving grace is that the gpio_chip is the first member of tegra_gpio, so the bug has gone undetected since... forever. Fix it by performing a container_of() on the pointer. This results in no additional code, and makes it possible to understand how the whole thing works. Fixes: 5b2b135 ("gpio: Add Tegra186 support") Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/r/20220211093904.1112679-1-maz@kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c94afc4 upstream. memblock.{reserved,memory}.regions may be allocated using kmalloc() in memblock_double_array(). Use kfree() to release these kmalloced regions indicated by memblock_{reserved,memory}_in_slab. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Fixes: 3010f87 ("mm: discard memblock data later") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6ba527 upstream. The VF can be configured via the PF's ndo ops at the same time the PF is receiving/handling virtchnl messages. This has many issues, with one of them being the ndo op could be actively resetting a VF (i.e. resetting it to the default state and deleting/re-adding the VF's VSI) while a virtchnl message is being handled. The following error was seen because a VF ndo op was used to change a VF's trust setting while the VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing: [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5 [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6 Fix this by making sure the virtchnl handling and VF ndo ops that trigger VF resets cannot run concurrently. This is done by adding a struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex will be locked around the critical operations and VFR. Since the ndo ops will trigger a VFR, the virtchnl thread will use mutex_trylock(). This is done because if any other thread (i.e. VF ndo op) has the mutex, then that means the current VF message being handled is no longer valid, so just ignore it. This issue can be seen using the following commands: for i in {0..50}; do rmmod ice modprobe ice sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on sleep 2 echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on done Fixes: 7c71086 ("ice: Add handlers for VF netdevice operations") Cc: <stable@vger.kernel.org> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> [I had to fix the cherry-pick manually as the patch added a line around some context that was missing.] Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fadead8 upstream. Commit c503e63 ("ice: Stop processing VF messages during teardown") introduced a driver state flag, ICE_VF_DEINIT_IN_PROGRESS, which is intended to prevent some issues with concurrently handling messages from VFs while tearing down the VFs. This change was motivated by crashes caused while tearing down and bringing up VFs in rapid succession. It turns out that the fix actually introduces issues with the VF driver caused because the PF no longer responds to any messages sent by the VF during its .remove routine. This results in the VF potentially removing its DMA memory before the PF has shut down the device queues. Additionally, the fix doesn't actually resolve concurrency issues within the ice driver. It is possible for a VF to initiate a reset just prior to the ice driver removing VFs. This can result in the remove task concurrently operating while the VF is being reset. This results in similar memory corruption and panics purportedly fixed by that commit. Fix this concurrency at its root by protecting both the reset and removal flows using the existing VF cfg_lock. This ensures that we cannot remove the VF while any outstanding critical tasks such as a virtchnl message or a reset are occurring. This locking change also fixes the root cause originally fixed by commit c503e63 ("ice: Stop processing VF messages during teardown"), so we can simply revert it. Note that I kept these two changes together because simply reverting the original commit alone would leave the driver vulnerable to worse race conditions. Fixes: c503e63 ("ice: Stop processing VF messages during teardown") Cc: <stable@vger.kernel.org> # e6ba527: ice: Fix race conditions between virtchnl handling and VF ndo ops Cc: <stable@vger.kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220228172347.614588246@linuxfoundation.org Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Allen Pais <apais@linux.microsoft.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Slade Watkins <slade@sladewatkins.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Clark Williams <williams@redhat.com>
This is the 5.15.26 stable release
Signed-off-by: Clark Williams <williams@redhat.com>
Linux 5.15.26-rt34 Signed-off-by: Chaitanya Vadrevu <chaitanya.vadrevu@ni.com>
amstewart
approved these changes
Mar 8, 2022
gratian
approved these changes
Mar 9, 2022
gratian
pushed a commit
that referenced
this pull request
May 10, 2022
[ Upstream commit 059a47f ] After rx/tx ring buffer size is changed, kernel panic occurs when it acts XDP_TX or XDP_REDIRECT. When tx/rx ring buffer size is changed(ethtool -G), sfc driver reallocates and reinitializes rx and tx queues and their buffer (tx_queue->buffer). But it misses reinitializing xdp queues(efx->xdp_tx_queues). So, while it is acting XDP_TX or XDP_REDIRECT, it uses the uninitialized tx_queue->buffer. A new function efx_set_xdp_channels() is separated from efx_set_channels() to handle only xdp queues. Splat looks like: BUG: kernel NULL pointer dereference, address: 000000000000002a #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#4] PREEMPT SMP NOPTI RIP: 0010:efx_tx_map_chunk+0x54/0x90 [sfc] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G D 5.17.0+ #55 e8beeee8289528f11357029357cf Code: 48 8b 8d a8 01 00 00 48 8d 14 52 4c 8d 2c d0 44 89 e0 48 85 c9 74 0e 44 89 e2 4c 89 f6 48 80 RSP: 0018:ffff92f121e45c60 EFLAGS: 00010297 RIP: 0010:efx_tx_map_chunk+0x54/0x90 [sfc] RAX: 0000000000000040 RBX: ffff92ea506895c0 RCX: ffffffffc0330870 RDX: 0000000000000001 RSI: 00000001139b10ce RDI: ffff92ea506895c0 RBP: ffffffffc0358a80 R08: 00000001139b110d R09: 0000000000000000 R10: 0000000000000001 R11: ffff92ea414c0088 R12: 0000000000000040 R13: 0000000000000018 R14: 00000001139b10ce R15: ffff92ea506895c0 FS: 0000000000000000(0000) GS:ffff92f121ec0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Code: 48 8b 8d a8 01 00 00 48 8d 14 52 4c 8d 2c d0 44 89 e0 48 85 c9 74 0e 44 89 e2 4c 89 f6 48 80 CR2: 000000000000002a CR3: 00000003e6810004 CR4: 00000000007706e0 RSP: 0018:ffff92f121e85c60 EFLAGS: 00010297 PKRU: 55555554 RAX: 0000000000000040 RBX: ffff92ea50689700 RCX: ffffffffc0330870 RDX: 0000000000000001 RSI: 00000001145a90ce RDI: ffff92ea50689700 RBP: ffffffffc0358a80 R08: 00000001145a910d R09: 0000000000000000 R10: 0000000000000001 R11: ffff92ea414c0088 R12: 0000000000000040 R13: 0000000000000018 R14: 00000001145a90ce R15: ffff92ea50689700 FS: 0000000000000000(0000) GS:ffff92f121e80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000002a CR3: 00000003e6810005 CR4: 00000000007706e0 PKRU: 55555554 Call Trace: <IRQ> efx_xdp_tx_buffers+0x12b/0x3d0 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5] __efx_rx_packet+0x5c3/0x930 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5] efx_rx_packet+0x28c/0x2e0 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5] efx_ef10_ev_process+0x5f8/0xf40 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5] ? enqueue_task_fair+0x95/0x550 efx_poll+0xc4/0x360 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5] Fixes: 3990a8f ("sfc: allocate channels for XDP tx queues") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
gratian
pushed a commit
to gratian/linux
that referenced
this pull request
Feb 26, 2024
[ Upstream commit dbc153f ] A crash was found when dumping SMC-D connections. It can be reproduced by following steps: - run nginx/wrk test: smc_run nginx smc_run wrk -t 16 -c 1000 -d <duration> -H 'Connection: Close' <URL> - continuously dump SMC-D connections in parallel: watch -n 1 'smcss -D' BUG: kernel NULL pointer dereference, address: 0000000000000030 CPU: 2 PID: 7204 Comm: smcss Kdump: loaded Tainted: G E 6.7.0+ ni#55 RIP: 0010:__smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag] Call Trace: <TASK> ? __die+0x24/0x70 ? page_fault_oops+0x66/0x150 ? exc_page_fault+0x69/0x140 ? asm_exc_page_fault+0x26/0x30 ? __smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag] ? __kmalloc_node_track_caller+0x35d/0x430 ? __alloc_skb+0x77/0x170 smc_diag_dump_proto+0xd0/0xf0 [smc_diag] smc_diag_dump+0x26/0x60 [smc_diag] netlink_dump+0x19f/0x320 __netlink_dump_start+0x1dc/0x300 smc_diag_handler_dump+0x6a/0x80 [smc_diag] ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag] sock_diag_rcv_msg+0x121/0x140 ? __pfx_sock_diag_rcv_msg+0x10/0x10 netlink_rcv_skb+0x5a/0x110 sock_diag_rcv+0x28/0x40 netlink_unicast+0x22a/0x330 netlink_sendmsg+0x1f8/0x420 __sock_sendmsg+0xb0/0xc0 ____sys_sendmsg+0x24e/0x300 ? copy_msghdr_from_user+0x62/0x80 ___sys_sendmsg+0x7c/0xd0 ? __do_fault+0x34/0x160 ? do_read_fault+0x5f/0x100 ? do_fault+0xb0/0x110 ? __handle_mm_fault+0x2b0/0x6c0 __sys_sendmsg+0x4d/0x80 do_syscall_64+0x69/0x180 entry_SYSCALL_64_after_hwframe+0x6e/0x76 It is possible that the connection is in process of being established when we dump it. Assumed that the connection has been registered in a link group by smc_conn_create() but the rmb_desc has not yet been initialized by smc_buf_create(), thus causing the illegal access to conn->rmb_desc. So fix it by checking before dump. Fixes: 4b1b7d3 ("net/smc: add SMC-D diag support") Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
gratian
pushed a commit
to gratian/linux
that referenced
this pull request
Feb 28, 2024
[ Upstream commit dbc153f ] A crash was found when dumping SMC-D connections. It can be reproduced by following steps: - run nginx/wrk test: smc_run nginx smc_run wrk -t 16 -c 1000 -d <duration> -H 'Connection: Close' <URL> - continuously dump SMC-D connections in parallel: watch -n 1 'smcss -D' BUG: kernel NULL pointer dereference, address: 0000000000000030 CPU: 2 PID: 7204 Comm: smcss Kdump: loaded Tainted: G E 6.7.0+ ni#55 RIP: 0010:__smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag] Call Trace: <TASK> ? __die+0x24/0x70 ? page_fault_oops+0x66/0x150 ? exc_page_fault+0x69/0x140 ? asm_exc_page_fault+0x26/0x30 ? __smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag] ? __kmalloc_node_track_caller+0x35d/0x430 ? __alloc_skb+0x77/0x170 smc_diag_dump_proto+0xd0/0xf0 [smc_diag] smc_diag_dump+0x26/0x60 [smc_diag] netlink_dump+0x19f/0x320 __netlink_dump_start+0x1dc/0x300 smc_diag_handler_dump+0x6a/0x80 [smc_diag] ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag] sock_diag_rcv_msg+0x121/0x140 ? __pfx_sock_diag_rcv_msg+0x10/0x10 netlink_rcv_skb+0x5a/0x110 sock_diag_rcv+0x28/0x40 netlink_unicast+0x22a/0x330 netlink_sendmsg+0x1f8/0x420 __sock_sendmsg+0xb0/0xc0 ____sys_sendmsg+0x24e/0x300 ? copy_msghdr_from_user+0x62/0x80 ___sys_sendmsg+0x7c/0xd0 ? __do_fault+0x34/0x160 ? do_read_fault+0x5f/0x100 ? do_fault+0xb0/0x110 ? __handle_mm_fault+0x2b0/0x6c0 __sys_sendmsg+0x4d/0x80 do_syscall_64+0x69/0x180 entry_SYSCALL_64_after_hwframe+0x6e/0x76 It is possible that the connection is in process of being established when we dump it. Assumed that the connection has been registered in a link group by smc_conn_create() but the rmb_desc has not yet been initialized by smc_buf_create(), thus causing the illegal access to conn->rmb_desc. So fix it by checking before dump. Fixes: 4b1b7d3 ("net/smc: add SMC-D diag support") Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
usercw88
pushed a commit
to usercw88/linux
that referenced
this pull request
Feb 4, 2025
…_bind returns err [ Upstream commit 36684e9 ] The pointer need to be set to NULL, otherwise KASAN complains about use-after-free. Because in mtk_drm_bind, all private's drm are set as follows. private->all_drm_private[i]->drm = drm; And drm will be released by drm_dev_put in case mtk_drm_kms_init returns failure. However, the shutdown path still accesses the previous allocated memory in drm_atomic_helper_shutdown. [ 84.874820] watchdog: watchdog0: watchdog did not stop! [ 86.512054] ================================================================== [ 86.513162] BUG: KASAN: use-after-free in drm_atomic_helper_shutdown+0x33c/0x378 [ 86.514258] Read of size 8 at addr ffff0000d46fc068 by task shutdown/1 [ 86.515213] [ 86.515455] CPU: 1 UID: 0 PID: 1 Comm: shutdown Not tainted 6.13.0-rc1-mtk+gfa1a78e5d24b-dirty ni#55 [ 86.516752] Hardware name: Unknown Product/Unknown Product, BIOS 2022.10 10/01/2022 [ 86.517960] Call trace: [ 86.518333] show_stack+0x20/0x38 (C) [ 86.518891] dump_stack_lvl+0x90/0xd0 [ 86.519443] print_report+0xf8/0x5b0 [ 86.519985] kasan_report+0xb4/0x100 [ 86.520526] __asan_report_load8_noabort+0x20/0x30 [ 86.521240] drm_atomic_helper_shutdown+0x33c/0x378 [ 86.521966] mtk_drm_shutdown+0x54/0x80 [ 86.522546] platform_shutdown+0x64/0x90 [ 86.523137] device_shutdown+0x260/0x5b8 [ 86.523728] kernel_restart+0x78/0xf0 [ 86.524282] __do_sys_reboot+0x258/0x2f0 [ 86.524871] __arm64_sys_reboot+0x90/0xd8 [ 86.525473] invoke_syscall+0x74/0x268 [ 86.526041] el0_svc_common.constprop.0+0xb0/0x240 [ 86.526751] do_el0_svc+0x4c/0x70 [ 86.527251] el0_svc+0x4c/0xc0 [ 86.527719] el0t_64_sync_handler+0x144/0x168 [ 86.528367] el0t_64_sync+0x198/0x1a0 [ 86.528920] [ 86.529157] The buggy address belongs to the physical page: [ 86.529972] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff0000d46fd4d0 pfn:0x1146fc [ 86.531319] flags: 0xbfffc0000000000(node=0|zone=2|lastcpupid=0xffff) [ 86.532267] raw: 0bfffc0000000000 0000000000000000 dead000000000122 0000000000000000 [ 86.533390] raw: ffff0000d46fd4d0 0000000000000000 00000000ffffffff 0000000000000000 [ 86.534511] page dumped because: kasan: bad access detected [ 86.535323] [ 86.535559] Memory state around the buggy address: [ 86.536265] ffff0000d46fbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.537314] ffff0000d46fbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.538363] >ffff0000d46fc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.544733] ^ [ 86.551057] ffff0000d46fc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.557510] ffff0000d46fc100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.563928] ================================================================== [ 86.571093] Disabling lock debugging due to kernel taint [ 86.577642] Unable to handle kernel paging request at virtual address e0e9c0920000000b [ 86.581834] KASAN: maybe wild-memory-access in range [0x0752049000000058-0x075204900000005f] ... Fixes: 1ef7ed4 ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support") Signed-off-by: Guoqing Jiang <guoqing.jiang@canonical.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241223023227.1258112-1-guoqing.jiang@canonical.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
chaitu236
pushed a commit
to chaitu236/linux
that referenced
this pull request
May 1, 2025
commit b150654 upstream. Commit <d74169ceb0d2> ("iommu/vt-d: Allocate DMAR fault interrupts locally") moved the call to enable_drhd_fault_handling() to a code path that does not hold any lock while traversing the drhd list. Fix it by ensuring the dmar_global_lock lock is held when traversing the drhd list. Without this fix, the following warning is triggered: ============================= WARNING: suspicious RCU usage 6.14.0-rc3 ni#55 Not tainted ----------------------------- drivers/iommu/intel/dmar.c:2046 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 2 locks held by cpuhp/1/23: #0: ffffffff84a67c50 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x87/0x2c0 #1: ffffffff84a6a380 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x87/0x2c0 stack backtrace: CPU: 1 UID: 0 PID: 23 Comm: cpuhp/1 Not tainted 6.14.0-rc3 ni#55 Call Trace: <TASK> dump_stack_lvl+0xb7/0xd0 lockdep_rcu_suspicious+0x159/0x1f0 ? __pfx_enable_drhd_fault_handling+0x10/0x10 enable_drhd_fault_handling+0x151/0x180 cpuhp_invoke_callback+0x1df/0x990 cpuhp_thread_fun+0x1ea/0x2c0 smpboot_thread_fn+0x1f5/0x2e0 ? __pfx_smpboot_thread_fn+0x10/0x10 kthread+0x12a/0x2d0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x4a/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Holding the lock in enable_drhd_fault_handling() triggers a lockdep splat about a possible deadlock between dmar_global_lock and cpu_hotplug_lock. This is avoided by not holding dmar_global_lock when calling iommu_device_register(), which initiates the device probe process. Fixes: d74169c ("iommu/vt-d: Allocate DMAR fault interrupts locally") Reported-and-tested-by: Ido Schimmel <idosch@nvidia.com> Closes: https://lore.kernel.org/linux-iommu/Zx9OwdLIc_VoQ0-a@shredder.mtl.com/ Tested-by: Breno Leitao <leitao@debian.org> Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20250218022422.2315082-1-baolu.lu@linux.intel.com Tested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is an additional 2022Q1.3
-nextbranch upgrade from5.15.21-rt30to5.15.26-rt34.There were no merge conflicts and no additional NI-specific patches are required.
Smoke tests: