merge from thesofproject#1
Merged
bardliao merged 22 commits intobardliao:topic/sof-devfrom Sep 17, 2018
Merged
Conversation
This patch adds the handling of snd_soc_dapm_effect that was missing. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch updates the maximum size into the same as used currently in DSP side. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch updates the definitions and structs to indentical as used on DSP side. It will help using the equalizer setup blobs in user space. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
This patch adds the binary control into topology parsing and provides for ASoC the SOF ext bytes put and get functions. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Remove a stray character and add a missing header. The GPIO IRQ is requested as a managed resource during probing, therefore there's no need to explicitly release it during driver unbinding. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@intel.com>
Checking for the IRQ GPIO level once an IRQ has been triggered shouldn't be necessary, once an IRQ has been triggered, so this might end up being superfluous. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@intel.com>
The SOF DT doesn't compile yet, it's only provided as an example, disable it for now to prevent automated builds from failing. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@intel.com>
The snd_sgbuf_aligned_pages() inline function is currently only defined if CONFIG_SND_DMA_SGBUF is enabled. This isn't needed, the function doesn't need SG DMA to be valid. Move it outside of #ifdef. Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@intel.com>
oops case:
the snd_card{} would be freed ahead of dynamic kcontrol
remove process, in this process, the snd_card{} will be
used, then the oops will be hit.
the solution:
the operation of removing the dynamic kcontrol should
be ahead of snd_card{} free.
Signed-off-by: Wu Zhigang <zhigang.wu@linux.intel.com>
remove the dynamic object during the topology free stage. to avoid multi remove operation. Signed-off-by: Wu Zhigang <zhigang.wu@linux.intel.com>
This patch adds support for retreiving byte control data after the DSP comes out of D3. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
If use IPC poistion update, read position from IPC position. For Playback, Use DPIB register from HDA space which reflects the actual data transferred. For Capture, Use the position buffer for pointer, as DPIB is not accurate enough, its update may be completed earlier than the data written to DDR. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Which will make it possible to retrieve stream_tag and change host_period_bytes in for HDA hs_params(), the latter will make disable position update IPC for HDA stream possbile. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
In DPIB/posbuf position pointer mode, we don't have position update IPC messages, we may need(depends on if no_irq is set) set different IOC flags for each BDL entry. Without IOC setting for BDL entries, irq mode won't work as there is no interrupt happens at each period finished. Last, we set BDL IOC at DPIB/pos mode only. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
pcm_runtime is not ready yet at hw_params() stage, so we need to retrieve period settings from params here. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Signed-off-by: Zhu Yingjiang <yingjiang.zhu@linux.intel.com>
bardliao
pushed a commit
that referenced
this pull request
Dec 19, 2018
Remove no_pcm check to invoke pcm_new() for backend dai-links too. This fixes crash in hdmi codec driver during hdmi_codec_startup() while accessing chmap_info struct. chmap_info struct memory is allocated in pcm_new() of hdmi codec driver which is not invoked in case of DPCM when hdmi codec driver is part of backend dai-link. Below is the crash stack: [ 61.635493] Unable to handle kernel NULL pointer dereference at virtual address 00000018 .. [ 61.666696] CM = 0, WnR = 1 [ 61.669778] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffc0d6633000 [ 61.676526] [0000000000000018] *pgd=0000000153fc8003, *pud=0000000153fc8003, *pmd=0000000000000000 [ 61.685793] Internal error: Oops: 96000046 [#1] PREEMPT SMP [ 61.722955] CPU: 7 PID: 2238 Comm: aplay Not tainted 4.14.72 thesofproject#21 .. [ 61.740269] PC is at hdmi_codec_startup+0x124/0x164 [ 61.745308] LR is at hdmi_codec_startup+0xe4/0x164 Signed-off-by: Rohit kumar <rohitkr@codeaurora.org> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit de17f14)
bardliao
pushed a commit
that referenced
this pull request
Jan 10, 2019
We can concurrently try to open the same sub-channel from 2 paths: path #1: vmbus_onoffer() -> vmbus_process_offer() -> handle_sc_creation(). path #2: storvsc_probe() -> storvsc_connect_to_vsp() -> -> storvsc_channel_init() -> handle_multichannel_storage() -> -> vmbus_are_subchannels_present() -> handle_sc_creation(). They conflict with each other, but it was not an issue before the recent commit ae6935e ("vmbus: split ring buffer allocation from open"), because at the beginning of vmbus_open() we checked newchannel->state so only one path could succeed, and the other would return with -EINVAL. After ae6935e, the failing path frees the channel's ringbuffer by vmbus_free_ring(), and this causes a panic later. Commit ae6935e itself is good, and it just reveals the longstanding race. We can resolve the issue by removing path #2, i.e. removing the second vmbus_are_subchannels_present() in handle_multichannel_storage(). BTW, the comment "Check to see if sub-channels have already been created" in handle_multichannel_storage() is incorrect: when we unload the driver, we first close the sub-channel(s) and then close the primary channel, next the host sends rescind-offer message(s) so primary->sc_list will become empty. This means the first vmbus_are_subchannels_present() in handle_multichannel_storage() is never useful. Fixes: ae6935e ("vmbus: split ring buffer allocation from open") Cc: stable@vger.kernel.org Cc: Long Li <longli@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
bardliao
pushed a commit
that referenced
this pull request
Jan 10, 2019
__qdisc_drop_all() accesses skb->prev to get to the tail of the segment-list. With commit 68d2f84 ("net: gro: properly remove skb from list") the skb-list handling has been changed to set skb->next to NULL and set the list-poison on skb->prev. With that change, __qdisc_drop_all() will panic when it tries to dereference skb->prev. Since commit 992cba7 ("net: Add and use skb_list_del_init().") __list_del_entry is used, leaving skb->prev unchanged (thus, pointing to the list-head if it's the first skb of the list). This will make __qdisc_drop_all modify the next-pointer of the list-head and result in a panic later on: [ 34.501053] general protection fault: 0000 [#1] SMP KASAN PTI [ 34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp thesofproject#108 [ 34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011 [ 34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90 [ 34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04 [ 34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202 [ 34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6 [ 34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038 [ 34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062 [ 34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000 [ 34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008 [ 34.512082] FS: 0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000 [ 34.513036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0 [ 34.514593] Call Trace: [ 34.514893] <IRQ> [ 34.515157] napi_gro_receive+0x93/0x150 [ 34.515632] receive_buf+0x893/0x3700 [ 34.516094] ? __netif_receive_skb+0x1f/0x1a0 [ 34.516629] ? virtnet_probe+0x1b40/0x1b40 [ 34.517153] ? __stable_node_chain+0x4d0/0x850 [ 34.517684] ? kfree+0x9a/0x180 [ 34.518067] ? __kasan_slab_free+0x171/0x190 [ 34.518582] ? detach_buf+0x1df/0x650 [ 34.519061] ? lapic_next_event+0x5a/0x90 [ 34.519539] ? virtqueue_get_buf_ctx+0x280/0x7f0 [ 34.520093] virtnet_poll+0x2df/0xd60 [ 34.520533] ? receive_buf+0x3700/0x3700 [ 34.521027] ? qdisc_watchdog_schedule_ns+0xd5/0x140 [ 34.521631] ? htb_dequeue+0x1817/0x25f0 [ 34.522107] ? sch_direct_xmit+0x142/0xf30 [ 34.522595] ? virtqueue_napi_schedule+0x26/0x30 [ 34.523155] net_rx_action+0x2f6/0xc50 [ 34.523601] ? napi_complete_done+0x2f0/0x2f0 [ 34.524126] ? kasan_check_read+0x11/0x20 [ 34.524608] ? _raw_spin_lock+0x7d/0xd0 [ 34.525070] ? _raw_spin_lock_bh+0xd0/0xd0 [ 34.525563] ? kvm_guest_apic_eoi_write+0x6b/0x80 [ 34.526130] ? apic_ack_irq+0x9e/0xe0 [ 34.526567] __do_softirq+0x188/0x4b5 [ 34.527015] irq_exit+0x151/0x180 [ 34.527417] do_IRQ+0xdb/0x150 [ 34.527783] common_interrupt+0xf/0xf [ 34.528223] </IRQ> This patch makes sure that skb->prev is set to NULL when entering netem_enqueue. Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Fixes: 68d2f84 ("net: gro: properly remove skb from list") Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bardliao
pushed a commit
that referenced
this pull request
Jan 10, 2019
It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] thesofproject#6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] thesofproject#7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] thesofproject#8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 thesofproject#9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] thesofproject#10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 thesofproject#11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa thesofproject#12 [ffff881ff43eff18] sys_read at ffffffff811fda62 thesofproject#13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David Howells <dhowells@redhat.com>
bardliao
pushed a commit
that referenced
this pull request
Jan 10, 2019
There are actually two kinds of discard merge: - one is the normal discard merge, just like normal read/write request, and call it single-range discard - another is the multi-range discard, queue_max_discard_segments(rq->q) > 1 For the former case, queue_max_discard_segments(rq->q) is 1, and we should handle this kind of discard merge like the normal read/write request. This patch fixes the following kernel panic issue[1], which is caused by not removing the single-range discard request from elevator queue. Guangwu has one raid discard test case, in which this issue is a bit easier to trigger, and I verified that this patch can fix the kernel panic issue in Guangwu's test case. [1] kernel panic log from Jens's report BUG: unable to handle kernel NULL pointer dereference at 0000000000000148 PGD 0 P4D 0. Oops: 0000 [#1] SMP PTI CPU: 37 PID: 763 Comm: kworker/37:1H Not tainted \ 4.20.0-rc3-00649-ge64d9a554a91-dirty thesofproject#14 Hardware name: Wiwynn \ Leopard-Orv2/Leopard-DDR BW, BIOS LBM08 03/03/2017 Workqueue: kblockd \ blk_mq_run_work_fn RIP: \ 0010:blk_mq_get_driver_tag+0x81/0x120 Code: 24 \ 10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 00 00 00 \ 0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 20 72 37 \ f6 87 b0 00 00 00 02 RSP: 0018:ffffc90004aabd30 EFLAGS: 00010246 \ RAX: 0000000000000003 RBX: ffff888465ea1300 RCX: ffffc90004aabde8 RDX: 00000000ffffffff RSI: ffffc90004aabde8 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff888465ea1348 R09: 0000000000000000 R10: 0000000000001000 R11: 00000000ffffffff R12: ffff888465ea1300 R13: 0000000000000000 R14: ffff888465ea1348 R15: ffff888465d10000 FS: 0000000000000000(0000) GS:ffff88846f9c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000148 CR3: 000000000220a003 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: blk_mq_dispatch_rq_list+0xec/0x480 ? elv_rb_del+0x11/0x30 blk_mq_do_dispatch_sched+0x6e/0xf0 blk_mq_sched_dispatch_requests+0xfa/0x170 __blk_mq_run_hw_queue+0x5f/0xe0 process_one_work+0x154/0x350 worker_thread+0x46/0x3c0 kthread+0xf5/0x130 ? process_one_work+0x350/0x350 ? kthread_destroy_worker+0x50/0x50 ret_from_fork+0x1f/0x30 Modules linked in: sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel \ kvm switchtec irqbypass iTCO_wdt iTCO_vendor_support efivars cdc_ether usbnet mii \ cdc_acm i2c_i801 lpc_ich mfd_core ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq \ button sch_fq_codel nfsd nfs_acl lockd grace auth_rpcgss oid_registry sunrpc nvme \ nvme_core fuse sg loop efivarfs autofs4 CR2: 0000000000000148 \ ---[ end trace 340a1fb996df1b9b ]--- RIP: 0010:blk_mq_get_driver_tag+0x81/0x120 Code: 24 10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 \ 00 00 00 0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 \ 20 72 37 f6 87 b0 00 00 00 02 Fixes: 445251d ("blk-mq: fix discard merge with scheduler attached") Reported-by: Jens Axboe <axboe@kernel.dk> Cc: Guangwu Zhang <guazhang@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
bardliao
pushed a commit
that referenced
this pull request
Jan 10, 2019
When we use direct_IO with an NFS backing store, we can trigger a WARNING in __set_page_dirty(), as below, since we're dirtying the page unnecessarily in nfs_direct_read_completion(). To fix, replicate the logic in commit 53cbf3b ("fs: direct-io: don't dirtying pages for ITER_BVEC/ITER_KVEC direct read"). Other filesystems that implement direct_IO handle this; most use blockdev_direct_IO(). ceph and cifs have similar logic. mount 127.0.0.1:/export /nfs dd if=/dev/zero of=/nfs/image bs=1M count=200 losetup --direct-io=on -f /nfs/image mkfs.btrfs /dev/loop0 mount -t btrfs /dev/loop0 /mnt/ kernel: WARNING: CPU: 0 PID: 8067 at fs/buffer.c:580 __set_page_dirty+0xaf/0xd0 kernel: Modules linked in: loop(E) nfsv3(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) fuse(E) tun(E) ip6t_rpfilter(E) ipt_REJECT(E) nf_ kernel: snd_seq(E) snd_seq_device(E) snd_pcm(E) video(E) snd_timer(E) snd(E) soundcore(E) ip_tables(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) cdrom(E) ata_generic(E) pata_acpi(E) crc32c_intel(E) ahci(E) li kernel: CPU: 0 PID: 8067 Comm: kworker/0:2 Tainted: G E 4.20.0-rc1.master.20181111.ol7.x86_64 #1 kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 kernel: Workqueue: nfsiod rpc_async_release [sunrpc] kernel: RIP: 0010:__set_page_dirty+0xaf/0xd0 kernel: Code: c3 48 8b 02 f6 c4 04 74 d4 48 89 df e8 ba 05 f7 ff 48 89 c6 eb cb 48 8b 43 08 a8 01 75 1f 48 89 d8 48 8b 00 a8 04 74 02 eb 87 <0f> 0b eb 83 48 83 e8 01 eb 9f 48 83 ea 01 0f 1f 00 eb 8b 48 83 e8 kernel: RSP: 0000:ffffc1c8825b7d78 EFLAGS: 00013046 kernel: RAX: 000fffffc0020089 RBX: fffff2b603308b80 RCX: 0000000000000001 kernel: RDX: 0000000000000001 RSI: ffff9d11478115c8 RDI: ffff9d11478115d0 kernel: RBP: ffffc1c8825b7da0 R08: 0000646f6973666e R09: 8080808080808080 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff9d11478115d0 kernel: R13: ffff9d11478115c8 R14: 0000000000003246 R15: 0000000000000001 kernel: FS: 0000000000000000(0000) GS:ffff9d115ba00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00007f408686f640 CR3: 0000000104d8e004 CR4: 00000000000606f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: Call Trace: kernel: __set_page_dirty_buffers+0xb6/0x110 kernel: set_page_dirty+0x52/0xb0 kernel: nfs_direct_read_completion+0xc4/0x120 [nfs] kernel: nfs_pgio_release+0x10/0x20 [nfs] kernel: rpc_free_task+0x30/0x70 [sunrpc] kernel: rpc_async_release+0x12/0x20 [sunrpc] kernel: process_one_work+0x174/0x390 kernel: worker_thread+0x4f/0x3e0 kernel: kthread+0x102/0x140 kernel: ? drain_workqueue+0x130/0x130 kernel: ? kthread_stop+0x110/0x110 kernel: ret_from_fork+0x35/0x40 kernel: ---[ end trace 01341980905412c9 ]--- Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> [forward-ported to v4.20] Signed-off-by: Calum Mackay <calum.mackay@oracle.com> Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
bardliao
pushed a commit
that referenced
this pull request
Jan 10, 2019
Frank Schreiner reported, that since kernel 4.18 he faces sysfs-warnings when loading modules on a 32-bit kernel. Here is one such example: sysfs: cannot create duplicate filename '/module/nfs/sections/.text' CPU: 0 PID: 98 Comm: modprobe Not tainted 4.18.0-2-parisc #1 Debian 4.18.10-2 Backtrace: [<1017ce2c>] show_stack+0x3c/0x50 [<107a7210>] dump_stack+0x28/0x38 [<103f900c>] sysfs_warn_dup+0x88/0xac [<103f8b1c>] sysfs_add_file_mode_ns+0x164/0x1d0 [<103f9e70>] internal_create_group+0x11c/0x304 [<103fa0a0>] sysfs_create_group+0x48/0x60 [<1022abe8>] load_module.constprop.35+0x1f9c/0x23b8 [<1022b278>] sys_finit_module+0xd0/0x11c [<101831dc>] syscall_exit+0x0/0x14 This warning gets triggered by the fact, that due to commit 24b6c22 ("parisc: Build kernel without -ffunction-sections") we now get multiple .text sections in the kernel modules for which sysfs_create_group() can't create multiple virtual files. This patch works around the problem by re-enabling the -ffunction-sections compiler option for modules, while keeping it disabled for the non-module kernel code. Reported-by: Frank Scheiner <frank.scheiner@web.de> Fixes: 24b6c22 ("parisc: Build kernel without -ffunction-sections") Cc: <stable@vger.kernel.org> # v4.18+ Signed-off-by: Helge Deller <deller@gmx.de>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
KASAN has found use-after-free in sockfs_setattr. The existed commit 6d8c50d ("socket: close race condition between sock_close() and sockfs_setattr()") is to fix this simillar issue, but it seems to ignore that crypto module forgets to set the sk to NULL after af_alg_release. KASAN report details as below: BUG: KASAN: use-after-free in sockfs_setattr+0x120/0x150 Write of size 4 at addr ffff88837b956128 by task syz-executor0/4186 CPU: 2 PID: 4186 Comm: syz-executor0 Not tainted xxx + #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: dump_stack+0xca/0x13e print_address_description+0x79/0x330 ? vprintk_func+0x5e/0xf0 kasan_report+0x18a/0x2e0 ? sockfs_setattr+0x120/0x150 sockfs_setattr+0x120/0x150 ? sock_register+0x2d0/0x2d0 notify_change+0x90c/0xd40 ? chown_common+0x2ef/0x510 chown_common+0x2ef/0x510 ? chmod_common+0x3b0/0x3b0 ? __lock_is_held+0xbc/0x160 ? __sb_start_write+0x13d/0x2b0 ? __mnt_want_write+0x19a/0x250 do_fchownat+0x15c/0x190 ? __ia32_sys_chmod+0x80/0x80 ? trace_hardirqs_on_thunk+0x1a/0x1c __x64_sys_fchownat+0xbf/0x160 ? lockdep_hardirqs_on+0x39a/0x5e0 do_syscall_64+0xc8/0x580 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462589 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fb4b2c83c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000104 RAX: ffffffffffffffda RBX: 000000000072bfa0 RCX: 0000000000462589 RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000007 RBP: 0000000000000005 R08: 0000000000001000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb4b2c846bc R13: 00000000004bc733 R14: 00000000006f5138 R15: 00000000ffffffff Allocated by task 4185: kasan_kmalloc+0xa0/0xd0 __kmalloc+0x14a/0x350 sk_prot_alloc+0xf6/0x290 sk_alloc+0x3d/0xc00 af_alg_accept+0x9e/0x670 hash_accept+0x4a3/0x650 __sys_accept4+0x306/0x5c0 __x64_sys_accept4+0x98/0x100 do_syscall_64+0xc8/0x580 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 4184: __kasan_slab_free+0x12e/0x180 kfree+0xeb/0x2f0 __sk_destruct+0x4e6/0x6a0 sk_destruct+0x48/0x70 __sk_free+0xa9/0x270 sk_free+0x2a/0x30 af_alg_release+0x5c/0x70 __sock_release+0xd3/0x280 sock_close+0x1a/0x20 __fput+0x27f/0x7f0 task_work_run+0x136/0x1b0 exit_to_usermode_loop+0x1a7/0x1d0 do_syscall_64+0x461/0x580 entry_SYSCALL_64_after_hwframe+0x49/0xbe Syzkaller reproducer: r0 = perf_event_open(&(0x7f0000000000)={0x0, 0x70, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext}, 0x0, 0x0, 0xffffffffffffffff, 0x0) r1 = socket$alg(0x26, 0x5, 0x0) getrusage(0x0, 0x0) bind(r1, &(0x7f00000001c0)=@alg={0x26, 'hash\x00', 0x0, 0x0, 'sha256-ssse3\x00'}, 0x80) r2 = accept(r1, 0x0, 0x0) r3 = accept4$unix(r2, 0x0, 0x0, 0x0) r4 = dup3(r3, r0, 0x0) fchownat(r4, &(0x7f00000000c0)='\x00', 0x0, 0x0, 0x1000) Fixes: 6d8c50d ("socket: close race condition between sock_close() and sockfs_setattr()") Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
The compound IOMMU group rework moved iommu_register_group() together in pnv_pci_ioda_setup_iommu_api() (which is a part of ppc_md.pcibios_fixup). As the result, pnv_ioda_setup_bus_iommu_group() does not create groups any more, it only adds devices to groups. This works fine for boot time devices. However IOMMU groups for SRIOV's VFs were added by pnv_ioda_setup_bus_iommu_group() so this got broken: pnv_tce_iommu_bus_notifier() expects a group to be registered for VF and it is not. This adds missing group registration and adds a NULL pointer check into the bus notifier so we won't crash if there is no group, although it is not expected to happen now because of the change above. Example oops seen prior to this patch: $ echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/sriov_numvfs Unable to handle kernel paging request for data at address 0x00000030 Faulting instruction address: 0xc0000000004a6018 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA PowerNV CPU: 46 PID: 7006 Comm: bash Not tainted 4.15-ish NIP: c0000000004a6018 LR: c0000000004a6014 CTR: 0000000000000000 REGS: c000008fc876b400 TRAP: 0300 Not tainted (4.15-ish) MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CFAR: c000000000d0be20 DAR: 0000000000000030 DSISR: 40000000 SOFTE: 1 ... NIP sysfs_do_create_link_sd.isra.0+0x68/0x150 LR sysfs_do_create_link_sd.isra.0+0x64/0x150 Call Trace: pci_dev_type+0x0/0x30 (unreliable) iommu_group_add_device+0x8c/0x600 iommu_add_device+0xe8/0x180 pnv_tce_iommu_bus_notifier+0xb0/0xf0 notifier_call_chain+0x9c/0x110 blocking_notifier_call_chain+0x64/0xa0 device_add+0x524/0x7d0 pci_device_add+0x248/0x450 pci_iov_add_virtfn+0x294/0x3e0 pci_enable_sriov+0x43c/0x580 mlx5_core_sriov_configure+0x15c/0x2f0 [mlx5_core] sriov_numvfs_store+0x180/0x240 dev_attr_store+0x3c/0x60 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1ac/0x240 __vfs_write+0x3c/0x70 vfs_write+0xd8/0x220 SyS_write+0x6c/0x110 system_call+0x58/0x6c Fixes: 0bd9716 ("powerpc/powernv/npu: Add compound IOMMU groups") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reported-by: Santwana Samantray <santwana.samantray@in.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
Evaluating page_mapping() on a poisoned page ends up dereferencing junk
and making PF_POISONED_CHECK() considerably crashier than intended:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000006
Mem abort info:
ESR = 0x96000005
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000005
CM = 0, WnR = 0
user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c2f6ac38
[0000000000000006] pgd=0000000000000000, pud=0000000000000000
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 491 Comm: bash Not tainted 5.0.0-rc1+ #1
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Dec 17 2018
pstate: 00000005 (nzcv daif -PAN -UAO)
pc : page_mapping+0x18/0x118
lr : __dump_page+0x1c/0x398
Process bash (pid: 491, stack limit = 0x000000004ebd4ecd)
Call trace:
page_mapping+0x18/0x118
__dump_page+0x1c/0x398
dump_page+0xc/0x18
remove_store+0xbc/0x120
dev_attr_store+0x18/0x28
sysfs_kf_write+0x40/0x50
kernfs_fop_write+0x130/0x1d8
__vfs_write+0x30/0x180
vfs_write+0xb4/0x1a0
ksys_write+0x60/0xd0
__arm64_sys_write+0x18/0x20
el0_svc_common+0x94/0xf8
el0_svc_handler+0x68/0x70
el0_svc+0x8/0xc
Code: f9400401 d1000422 f240003f 9a801040 (f9400402)
---[ end trace cdb5eb5bf435cecb ]---
Fix that by not inspecting the mapping until we've determined that it's
likely to be valid. Now the above condition still ends up stopping the
kernel, but in the correct manner:
page:ffffffbf20000000 is uninitialized and poisoned
raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
------------[ cut here ]------------
kernel BUG at ./include/linux/mm.h:1006!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 483 Comm: bash Not tainted 5.0.0-rc1+ #3
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Dec 17 2018
pstate: 40000005 (nZcv daif -PAN -UAO)
pc : remove_store+0xbc/0x120
lr : remove_store+0xbc/0x120
...
Link: http://lkml.kernel.org/r/03b53ee9d7e76cda4b9b5e1e31eea080db033396.1550071778.git.robin.murphy@arm.com
Fixes: 1c6fb1d ("mm: print more information about mapping in __dump_page")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
We've been seeing hard-to-trigger psi crashes when running inside VM
instances:
divide error: 0000 [#1] SMP PTI
Modules linked in: [...]
CPU: 0 PID: 212 Comm: kworker/0:2 Not tainted 4.16.18-119_fbk9_3817_gfe944c98d695 thesofproject#119
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Workqueue: events psi_clock
RIP: 0010:psi_update_stats+0x270/0x490
RSP: 0018:ffffc90001117e10 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800a35a13f8
RDX: 0000000000000000 RSI: ffff8800a35a1340 RDI: 0000000000000000
RBP: 0000000000000658 R08: ffff8800a35a1470 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 00000000000f8502
FS: 0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbe370fa000 CR3: 00000000b1e3a000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
psi_clock+0x12/0x50
process_one_work+0x1e0/0x390
worker_thread+0x2b/0x3c0
? rescuer_thread+0x330/0x330
kthread+0x113/0x130
? kthread_create_worker_on_cpu+0x40/0x40
? SyS_exit_group+0x10/0x10
ret_from_fork+0x35/0x40
Code: 48 0f 47 c7 48 01 c2 45 85 e4 48 89 16 0f 85 e6 00 00 00 4c 8b 49 10 4c 8b 51 08 49 69 d9 f2 07 00 00 48 6b c0 64 4c 8b 29 31 d2 <48> f7 f7 49 69 d5 8d 06 00 00 48 89 c5 4c 69 f0 00 98 0b 00 48
The Code-line points to `period` being 0 inside update_stats(), and we
divide by that when calculating that period's pressure percentage.
The elapsed period should never be 0. The reason this can happen is due
to an off-by-one in the idle time / missing period calculation combined
with a coarse sched_clock() in the virtual machine.
The target time for aggregation is advanced into the future on a fixed
grid to prevent clock drift. So when an aggregation runs after some idle
period, we can not just set it to "now + psi_period", but have to
calculate the downtime and advance the target time relative to itself.
However, if the aggregator was disabled exactly one psi_period (ns), we
drop one idle period in the calculation due to a > when we should do >=.
In that case, next_update will be advanced from 'now - psi_period' to
'now' when it should be moved to 'now + psi_period'. The run finishes
with last_update == next_update == sched_clock().
With hardware clocks, this exact nanosecond match isn't likely in the
first place; but if it does happen, the clock will still have moved on and
the period non-zero by the time the worker runs. A pointlessly short
period, but besides the extra work, no harm no foul. However, a slow
sched_clock() like we have on VMs might not have advanced either by the
time the worker runs again. And when we calculate the elapsed period, the
result, our pressure divisor, will be 0. Ouch.
Fix this by correctly handling the situation when the elapsed time between
aggregation runs is precisely two periods, and advance the expiration
timestamp correctly to period into the future.
Link: http://lkml.kernel.org/r/20190214193157.15788-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Łukasz Siudut <lsiudut@fb.com
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
In process_slab(), "p = get_freepointer()" could return a tagged
pointer, but "addr = page_address()" always return a native pointer. As
the result, slab_index() is messed up here,
return (p - addr) / s->size;
All other callers of slab_index() have the same situation where "addr"
is from page_address(), so just need to untag "p".
# cat /sys/kernel/slab/hugetlbfs_inode_cache/alloc_calls
Unable to handle kernel paging request at virtual address 2bff808aa4856d48
Mem abort info:
ESR = 0x96000007
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000007
CM = 0, WnR = 0
swapper pgtable: 64k pages, 48-bit VAs, pgdp = 0000000002498338
[2bff808aa4856d48] pgd=00000097fcfd0003, pud=00000097fcfd0003, pmd=00000097fca30003, pte=00e8008b24850712
Internal error: Oops: 96000007 [#1] SMP
CPU: 3 PID: 79210 Comm: read_all Tainted: G L 5.0.0-rc7+ thesofproject#84
Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.0.6 07/10/2018
pstate: 00400089 (nzcv daIf +PAN -UAO)
pc : get_map+0x78/0xec
lr : get_map+0xa0/0xec
sp : aeff808989e3f8e0
x29: aeff808989e3f940 x28: ffff800826200000
x27: ffff100012d47000 x26: 9700000000002500
x25: 0000000000000001 x24: 52ff8008200131f8
x23: 52ff8008200130a0 x22: 52ff800820013098
x21: ffff800826200000 x20: ffff100013172ba0
x19: 2bff808a8971bc00 x18: ffff1000148f5538
x17: 000000000000001b x16: 00000000000000ff
x15: ffff1000148f5000 x14: 00000000000000d2
x13: 0000000000000001 x12: 0000000000000000
x11: 0000000020000002 x10: 2bff808aa4856d48
x9 : 0000020000000000 x8 : 68ff80082620ebb0
x7 : 0000000000000000 x6 : ffff1000105da1dc
x5 : 0000000000000000 x4 : 0000000000000000
x3 : 0000000000000010 x2 : 2bff808a8971bc00
x1 : ffff7fe002098800 x0 : ffff80082620ceb0
Process read_all (pid: 79210, stack limit = 0x00000000f65b9361)
Call trace:
get_map+0x78/0xec
process_slab+0x7c/0x47c
list_locations+0xb0/0x3c8
alloc_calls_show+0x34/0x40
slab_attr_show+0x34/0x48
sysfs_kf_seq_show+0x2e4/0x570
kernfs_seq_show+0x12c/0x1a0
seq_read+0x48c/0xf84
kernfs_fop_read+0xd4/0x448
__vfs_read+0x94/0x5d4
vfs_read+0xcc/0x194
ksys_read+0x6c/0xe8
__arm64_sys_read+0x68/0xb0
el0_svc_handler+0x230/0x3bc
el0_svc+0x8/0xc
Code: d3467d2a 9ac92329 8b0a0e6a f9800151 (c85f7d4b)
---[ end trace a383a9a44ff13176 ]---
Kernel panic - not syncing: Fatal exception
SMP: stopping secondary CPUs
SMP: failed to stop secondary CPUs 1-7,32,40,127
Kernel Offset: disabled
CPU features: 0x002,20000c18
Memory Limit: none
---[ end Kernel panic - not syncing: Fatal exception ]---
Link: http://lkml.kernel.org/r/20190220020251.82039-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
Rong Chen has reported the following boot crash:
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 239 Comm: udevd Not tainted 5.0.0-rc4-00149-gefad4e4 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
RIP: 0010:page_mapping+0x12/0x80
Code: 5d c3 48 89 df e8 0e ad 02 00 85 c0 75 da 89 e8 5b 5d c3 0f 1f 44 00 00 53 48 89 fb 48 8b 43 08 48 8d 50 ff a8 01 48 0f 45 da <48> 8b 53 08 48 8d 42 ff 83 e2 01 48 0f 44 c3 48 83 38 ff 74 2f 48
RSP: 0018:ffff88801fa87cd8 EFLAGS: 00010202
RAX: ffffffffffffffff RBX: fffffffffffffffe RCX: 000000000000000a
RDX: fffffffffffffffe RSI: ffffffff820b9a20 RDI: ffff88801e5c0000
RBP: 6db6db6db6db6db7 R08: ffff88801e8bb000 R09: 0000000001b64d13
R10: ffff88801fa87cf8 R11: 0000000000000001 R12: ffff88801e640000
R13: ffffffff820b9a20 R14: ffff88801f145258 R15: 0000000000000001
FS: 00007fb2079817c0(0000) GS:ffff88801dd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000006 CR3: 000000001fa82000 CR4: 00000000000006a0
Call Trace:
__dump_page+0x14/0x2c0
is_mem_section_removable+0x24c/0x2c0
removable_show+0x87/0xa0
dev_attr_show+0x25/0x60
sysfs_kf_seq_show+0xba/0x110
seq_read+0x196/0x3f0
__vfs_read+0x34/0x180
vfs_read+0xa0/0x150
ksys_read+0x44/0xb0
do_syscall_64+0x5e/0x4a0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
and bisected it down to commit efad4e4 ("mm, memory_hotplug:
is_mem_section_removable do not pass the end of a zone").
The reason for the crash is that the mapping is garbage for poisoned
(uninitialized) page. This shouldn't happen as all pages in the zone's
boundary should be initialized.
Later debugging revealed that the actual problem is an off-by-one when
evaluating the end_page. 'start_pfn + nr_pages' resp 'zone_end_pfn'
refers to a pfn after the range and as such it might belong to a
differen memory section.
This along with CONFIG_SPARSEMEM then makes the loop condition
completely bogus because a pointer arithmetic doesn't work for pages
from two different sections in that memory model.
Fix the issue by reworking is_pageblock_removable to be pfn based and
only use struct page where necessary. This makes the code slightly
easier to follow and we will remove the problematic pointer arithmetic
completely.
Link: http://lkml.kernel.org/r/20190218181544.14616-1-mhocko@kernel.org
Fixes: efad4e4 ("mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: <rong.a.chen@intel.com>
Tested-by: <rong.a.chen@intel.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
The default value of ARCH_SLAB_MINALIGN in "include/linux/slab.h" is "__alignof__(unsigned long long)" which for ARC unexpectedly turns out to be 4. This is not a compiler bug, but as defined by ARC ABI [1] Thus slab allocator would allocate a struct which is 32-bit aligned, which is generally OK even if struct has long long members. There was however potetial problem when it had any atomic64_t which use LLOCKD/SCONDD instructions which are required by ISA to take 64-bit addresses. This is the problem we ran into [ 4.015732] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null) [ 4.167881] Misaligned Access [ 4.172356] Path: /bin/busybox.nosuid [ 4.176004] CPU: 2 PID: 171 Comm: rm Not tainted 4.19.14-yocto-standard #1 [ 4.182851] [ 4.182851] [ECR ]: 0x000d0000 => Check Programmer's Manual [ 4.190061] [EFA ]: 0xbeaec3fc [ 4.190061] [BLINK ]: ext4_delete_entry+0x210/0x234 [ 4.190061] [ERET ]: ext4_delete_entry+0x13e/0x234 [ 4.202985] [STAT32]: 0x80080002 : IE K [ 4.207236] BTA: 0x9009329c SP: 0xbe5b1ec4 FP: 0x00000000 [ 4.212790] LPS: 0x9074b118 LPE: 0x9074b120 LPC: 0x00000000 [ 4.218348] r00: 0x00000040 r01: 0x00000021 r02: 0x00000001 ... ... [ 4.270510] Stack Trace: [ 4.274510] ext4_delete_entry+0x13e/0x234 [ 4.278695] ext4_rmdir+0xe0/0x238 [ 4.282187] vfs_rmdir+0x50/0xf0 [ 4.285492] do_rmdir+0x9e/0x154 [ 4.288802] EV_Trap+0x110/0x114 The fix is to make sure slab allocations are 64-bit aligned. Do note that atomic64_t is __attribute__((aligned(8)) which means gcc does generate 64-bit aligned references, relative to beginning of container struct. However the issue is if the container itself is not 64-bit aligned, atomic64_t ends up unaligned which is what this patch ensures. [1] https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/wiki/files/ARCv2_ABI.pdf Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: <stable@vger.kernel.org> # 4.8+ Signed-off-by: Vineet Gupta <vgupta@synopsys.com> [vgupta: reworked changelog, added dependency on LL64+LLSC]
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
The SHA256 code we adopted from the OpenSSL project uses a rather peculiar way to take the address of the round constant table: it takes the address of the sha256_block_data_order() routine, and substracts a constant known quantity to arrive at the base of the table, which is emitted by the same assembler code right before the routine's entry point. However, recent versions of binutils have helpfully changed the behavior of references emitted via an ADR instruction when running in Thumb2 mode: it now takes the Thumb execution mode bit into account, which is bit 0 af the address. This means the produced table address also has bit 0 set, and so we end up with an address value pointing 1 byte past the start of the table, which results in crashes such as Unable to handle kernel paging request at virtual address bf825000 pgd = 42f44b1 [bf825000] *pgd=80000040206003, *pmd=5f1bd003, *pte=00000000 Internal error: Oops: 207 [#1] PREEMPT SMP THUMB2 Modules linked in: sha256_arm(+) sha1_arm_ce sha1_arm ... CPU: 7 PID: 396 Comm: cryptomgr_test Not tainted 5.0.0-rc6+ thesofproject#144 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 PC is at sha256_block_data_order+0xaaa/0xb30 [sha256_arm] LR is at __this_module+0x17fd/0xffffe800 [sha256_arm] pc : [<bf820bca>] lr : [<bf824ffd>] psr: 800b0033 sp : ebc8bbe ip : faaabe1c fp : 2fdd3433 r10: 4c5f1692 r9 : e43037df r8 : b04b0a5a r7 : c369d722 r6 : 39c3693e r5 : 7a013189 r4 : 1580d26b r3 : 8762a9b0 r2 : eea9c2cd r1 : 3e9ab536 r0 : 1dea4ae7 Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment user Control: 70c5383d Table: 6b8467c0 DAC: dbadc0de Process cryptomgr_test (pid: 396, stack limit = 0x69e1fe23) Stack: (0xebc8bbe8 to 0xebc8c000) ... unwind: Unknown symbol address bf820bca unwind: Index not found bf820bca Code: 441a ea80 40f9 440a (f85e) 3b04 ---[ end trace e560cce92700ef8a ]--- Given that this affects older kernels as well, in case they are built with a recent toolchain, apply a minimal backportable fix, which is to emit another non-code label at the start of the routine, and reference that instead. (This is similar to the current upstream state of this file in OpenSSL) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
The SHA512 code we adopted from the OpenSSL project uses a rather peculiar way to take the address of the round constant table: it takes the address of the sha256_block_data_order() routine, and substracts a constant known quantity to arrive at the base of the table, which is emitted by the same assembler code right before the routine's entry point. However, recent versions of binutils have helpfully changed the behavior of references emitted via an ADR instruction when running in Thumb2 mode: it now takes the Thumb execution mode bit into account, which is bit 0 af the address. This means the produced table address also has bit 0 set, and so we end up with an address value pointing 1 byte past the start of the table, which results in crashes such as Unable to handle kernel paging request at virtual address bf825000 pgd = 42f44b1 [bf825000] *pgd=80000040206003, *pmd=5f1bd003, *pte=00000000 Internal error: Oops: 207 [#1] PREEMPT SMP THUMB2 Modules linked in: sha256_arm(+) sha1_arm_ce sha1_arm ... CPU: 7 PID: 396 Comm: cryptomgr_test Not tainted 5.0.0-rc6+ thesofproject#144 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 PC is at sha256_block_data_order+0xaaa/0xb30 [sha256_arm] LR is at __this_module+0x17fd/0xffffe800 [sha256_arm] pc : [<bf820bca>] lr : [<bf824ffd>] psr: 800b0033 sp : ebc8bbe ip : faaabe1c fp : 2fdd3433 r10: 4c5f1692 r9 : e43037df r8 : b04b0a5a r7 : c369d722 r6 : 39c3693e r5 : 7a013189 r4 : 1580d26b r3 : 8762a9b0 r2 : eea9c2cd r1 : 3e9ab536 r0 : 1dea4ae7 Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment user Control: 70c5383d Table: 6b8467c0 DAC: dbadc0de Process cryptomgr_test (pid: 396, stack limit = 0x69e1fe23) Stack: (0xebc8bbe8 to 0xebc8c000) ... unwind: Unknown symbol address bf820bca unwind: Index not found bf820bca Code: 441a ea80 40f9 440a (f85e) 3b04 ---[ end trace e560cce92700ef8a ]--- Given that this affects older kernels as well, in case they are built with a recent toolchain, apply a minimal backportable fix, which is to emit another non-code label at the start of the routine, and reference that instead. (This is similar to the current upstream state of this file in OpenSSL) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
…version Fix a possible NULL pointer dereference in ip6erspan_set_version checking nlattr data pointer kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 7549 Comm: syz-executor432 Not tainted 5.0.0-rc6-next-20190218 thesofproject#37 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:ip6erspan_set_version+0x5c/0x350 net/ipv6/ip6_gre.c:1726 Code: 07 38 d0 7f 08 84 c0 0f 85 9f 02 00 00 49 8d bc 24 b0 00 00 00 c6 43 54 01 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 9a 02 00 00 4d 8b ac 24 b0 00 00 00 4d 85 ed 0f RSP: 0018:ffff888089ed7168 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff8880869d6e58 RCX: 0000000000000000 RDX: 0000000000000016 RSI: ffffffff862736b4 RDI: 00000000000000b0 RBP: ffff888089ed7180 R08: 1ffff11010d3adcb R09: ffff8880869d6e58 R10: ffffed1010d3add5 R11: ffff8880869d6eaf R12: 0000000000000000 R13: ffffffff8931f8c0 R14: ffffffff862825d0 R15: ffff8880869d6e58 FS: 0000000000b3d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000184 CR3: 0000000092cc5000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip6erspan_newlink+0x66/0x7b0 net/ipv6/ip6_gre.c:2210 __rtnl_newlink+0x107b/0x16c0 net/core/rtnetlink.c:3176 rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3234 rtnetlink_rcv_msg+0x465/0xb00 net/core/rtnetlink.c:5192 netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2485 rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5210 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1925 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xdd/0x130 net/socket.c:631 ___sys_sendmsg+0x806/0x930 net/socket.c:2136 __sys_sendmsg+0x105/0x1d0 net/socket.c:2174 __do_sys_sendmsg net/socket.c:2183 [inline] __se_sys_sendmsg net/socket.c:2181 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2181 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440159 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffa69156e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440159 RDX: 0000000000000000 RSI: 0000000020001340 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000001 R09: 00000000004002c8 R10: 0000000000000011 R11: 0000000000000246 R12: 00000000004019e0 R13: 0000000000401a70 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace 09f8a7d13b4faaa1 ]--- RIP: 0010:ip6erspan_set_version+0x5c/0x350 net/ipv6/ip6_gre.c:1726 Code: 07 38 d0 7f 08 84 c0 0f 85 9f 02 00 00 49 8d bc 24 b0 00 00 00 c6 43 54 01 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 9a 02 00 00 4d 8b ac 24 b0 00 00 00 4d 85 ed 0f RSP: 0018:ffff888089ed7168 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff8880869d6e58 RCX: 0000000000000000 RDX: 0000000000000016 RSI: ffffffff862736b4 RDI: 00000000000000b0 RBP: ffff888089ed7180 R08: 1ffff11010d3adcb R09: ffff8880869d6e58 R10: ffffed1010d3add5 R11: ffff8880869d6eaf R12: 0000000000000000 R13: ffffffff8931f8c0 R14: ffffffff862825d0 R15: ffff8880869d6e58 FS: 0000000000b3d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000184 CR3: 0000000092cc5000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 4974d5f ("net: ip6_gre: initialize erspan_ver just for erspan tunnels") Reported-and-tested-by: syzbot+30191cf1057abd3064af@syzkaller.appspotmail.com Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
Commit 5738459 ("iommu/vt-d: Store bus information in RMRR PCI device path") changed the type of the path data, however, the change in path type was not reflected in size calculations. Update to use the correct type and prevent a buffer overflow. This bug manifests in systems with deep PCI hierarchies, and can lead to an overflow of the static allocated buffer (dmar_pci_notify_info_buf), or can lead to overflow of slab-allocated data. BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0 Write of size 1 at addr ffffffff90445d80 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.14.87-rt49-02406-gd0a0e96 #1 Call Trace: ? dump_stack+0x46/0x59 ? print_address_description+0x1df/0x290 ? dmar_alloc_pci_notify_info+0x1d5/0x2e0 ? kasan_report+0x256/0x340 ? dmar_alloc_pci_notify_info+0x1d5/0x2e0 ? e820__memblock_setup+0xb0/0xb0 ? dmar_dev_scope_init+0x424/0x48f ? __down_write_common+0x1ec/0x230 ? dmar_dev_scope_init+0x48f/0x48f ? dmar_free_unused_resources+0x109/0x109 ? cpumask_next+0x16/0x20 ? __kmem_cache_create+0x392/0x430 ? kmem_cache_create+0x135/0x2f0 ? e820__memblock_setup+0xb0/0xb0 ? intel_iommu_init+0x170/0x1848 ? _raw_spin_unlock_irqrestore+0x32/0x60 ? migrate_enable+0x27a/0x5b0 ? sched_setattr+0x20/0x20 ? migrate_disable+0x1fc/0x380 ? task_rq_lock+0x170/0x170 ? try_to_run_init_process+0x40/0x40 ? locks_remove_file+0x85/0x2f0 ? dev_prepare_static_identity_mapping+0x78/0x78 ? rt_spin_unlock+0x39/0x50 ? lockref_put_or_lock+0x2a/0x40 ? dput+0x128/0x2f0 ? __rcu_read_unlock+0x66/0x80 ? __fput+0x250/0x300 ? __rcu_read_lock+0x1b/0x30 ? mntput_no_expire+0x38/0x290 ? e820__memblock_setup+0xb0/0xb0 ? pci_iommu_init+0x25/0x63 ? pci_iommu_init+0x25/0x63 ? do_one_initcall+0x7e/0x1c0 ? initcall_blacklisted+0x120/0x120 ? kernel_init_freeable+0x27b/0x307 ? rest_init+0xd0/0xd0 ? kernel_init+0xf/0x120 ? rest_init+0xd0/0xd0 ? ret_from_fork+0x1f/0x40 The buggy address belongs to the variable: dmar_pci_notify_info_buf+0x40/0x60 Fixes: 5738459 ("iommu/vt-d: Store bus information in RMRR PCI device path") Signed-off-by: Julia Cartwright <julia@ni.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
bardliao
pushed a commit
that referenced
this pull request
Mar 11, 2019
It can be reproduced by following steps: 1. virtio_net NIC is configured with gso/tso on 2. configure nginx as http server with an index file bigger than 1M bytes 3. use tc netem to produce duplicate packets and delay: tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90% 4. continually curl the nginx http server to get index file on client 5. BUG_ON is seen quickly [10258690.371129] kernel BUG at net/core/skbuff.c:4028! [10258690.371748] invalid opcode: 0000 [#1] SMP PTI [10258690.372094] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 5.0.0-rc6 #2 [10258690.372094] RSP: 0018:ffffa05797b43da0 EFLAGS: 00010202 [10258690.372094] RBP: 00000000000005ea R08: 0000000000000000 R09: 00000000000005ea [10258690.372094] R10: ffffa0579334d800 R11: 00000000000002c0 R12: 0000000000000002 [10258690.372094] R13: 0000000000000000 R14: ffffa05793122900 R15: ffffa0578f7cb028 [10258690.372094] FS: 0000000000000000(0000) GS:ffffa05797b40000(0000) knlGS:0000000000000000 [10258690.372094] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [10258690.372094] CR2: 00007f1a6dc00868 CR3: 000000001000e000 CR4: 00000000000006e0 [10258690.372094] Call Trace: [10258690.372094] <IRQ> [10258690.372094] skb_to_sgvec+0x11/0x40 [10258690.372094] start_xmit+0x38c/0x520 [virtio_net] [10258690.372094] dev_hard_start_xmit+0x9b/0x200 [10258690.372094] sch_direct_xmit+0xff/0x260 [10258690.372094] __qdisc_run+0x15e/0x4e0 [10258690.372094] net_tx_action+0x137/0x210 [10258690.372094] __do_softirq+0xd6/0x2a9 [10258690.372094] irq_exit+0xde/0xf0 [10258690.372094] smp_apic_timer_interrupt+0x74/0x140 [10258690.372094] apic_timer_interrupt+0xf/0x20 [10258690.372094] </IRQ> In __skb_to_sgvec(), the skb->len is not equal to the sum of the skb's linear data size and nonlinear data size, thus BUG_ON triggered. Because the skb is cloned and a part of nonlinear data is split off. Duplicate packet is cloned in netem_enqueue() and may be delayed some time in qdisc. When qdisc len reached the limit and returns NET_XMIT_DROP, the skb will be retransmit later in write queue. the skb will be fragmented by tso_fragment(), the limit size that depends on cwnd and mss decrease, the skb's nonlinear data will be split off. The length of the skb cloned by netem will not be updated. When we use virtio_net NIC and invoke skb_to_sgvec(), the BUG_ON trigger. To fix it, netem returns NET_XMIT_SUCCESS to upper stack when it clones a duplicate packet. Fixes: 35d889d ("sch_netem: fix skb leak in netem_enqueue()") Signed-off-by: Sheng Lan <lansheng@huawei.com> Reported-by: Qin Ji <jiqin.ji@huawei.com> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bardliao
pushed a commit
that referenced
this pull request
Apr 8, 2019
If codec registration fails after the ASoC Intel SST driver has been probed, the kernel will Oops and crash at suspend/resume. general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 PID: 2811 Comm: cat Tainted: G W 4.19.30 thesofproject#15 Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014 RIP: 0010:snd_soc_suspend+0x5a/0xd21 Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89 fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b RSP: 0018:ffff888035407750 EFLAGS: 00010202 RAX: 0000000000000034 RBX: 00000000000001a0 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88805c417098 RBP: ffff8880354077b0 R08: dffffc0000000000 R09: ffffed100b975718 R10: 0000000000000001 R11: ffffffff949ea4a3 R12: 1ffff1100b975746 R13: dffffc0000000000 R14: ffff88805cba4588 R15: dffffc0000000000 FS: 0000794a78e91b80(0000) GS:ffff888068d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007bd5283ccf58 CR3: 000000004b7aa000 CR4: 00000000001006e0 Call Trace: ? dpm_complete+0x67b/0x67b ? i915_gem_suspend+0x14d/0x1ad sst_soc_prepare+0x91/0x1dd ? sst_be_hw_params+0x7e/0x7e dpm_prepare+0x39a/0x88b dpm_suspend_start+0x13/0x9d suspend_devices_and_enter+0x18f/0xbd7 ? arch_suspend_enable_irqs+0x11/0x11 ? printk+0xd9/0x12d ? lock_release+0x95f/0x95f ? log_buf_vmcoreinfo_setup+0x131/0x131 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? __pm_pr_dbg+0x186/0x190 ? pm_notifier_call_chain+0x39/0x39 ? suspend_test+0x9d/0x9d pm_suspend+0x2f4/0x728 ? trace_suspend_resume+0x3da/0x3da ? lock_release+0x95f/0x95f ? kernfs_fop_write+0x19f/0x32d state_store+0xd8/0x147 ? sysfs_kf_read+0x155/0x155 kernfs_fop_write+0x23e/0x32d __vfs_write+0x108/0x608 ? vfs_read+0x2e9/0x2e9 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? debug_smp_processor_id+0x10/0x10 ? selinux_file_permission+0x1c5/0x3c8 ? rcu_sync_lockdep_assert+0x6a/0xad ? __sb_start_write+0x129/0x2ac vfs_write+0x1aa/0x434 ksys_write+0xfe/0x1be ? __ia32_sys_read+0x82/0x82 do_syscall_64+0xcd/0x120 entry_SYSCALL_64_after_hwframe+0x49/0xbe In the observed situation, the problem is seen because the codec driver failed to probe due to a hardware problem. max98090 i2c-193C9890:00: Failed to read device revision: -1 max98090 i2c-193C9890:00: ASoC: failed to probe component -1 cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1 cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1 cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1 The problem is similar to the problem solved with commit 2fc995a ("ASoC: intel: Fix crash at suspend/resume without card registration"), but codec registration fails at a later point. At that time, the pointer checked with the above mentioned commit is already set, but it is not cleared if the device is subsequently removed. Adding a remove function to clear the pointer fixes the problem. Cc: stable@vger.kernel.org Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Curtis Malainey <cujomalainey@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 8f71370)
bardliao
pushed a commit
that referenced
this pull request
Apr 25, 2019
If codec registration fails after the ASoC Intel SST driver has been probed, the kernel will Oops and crash at suspend/resume. general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 PID: 2811 Comm: cat Tainted: G W 4.19.30 thesofproject#15 Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014 RIP: 0010:snd_soc_suspend+0x5a/0xd21 Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89 fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b RSP: 0018:ffff888035407750 EFLAGS: 00010202 RAX: 0000000000000034 RBX: 00000000000001a0 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88805c417098 RBP: ffff8880354077b0 R08: dffffc0000000000 R09: ffffed100b975718 R10: 0000000000000001 R11: ffffffff949ea4a3 R12: 1ffff1100b975746 R13: dffffc0000000000 R14: ffff88805cba4588 R15: dffffc0000000000 FS: 0000794a78e91b80(0000) GS:ffff888068d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007bd5283ccf58 CR3: 000000004b7aa000 CR4: 00000000001006e0 Call Trace: ? dpm_complete+0x67b/0x67b ? i915_gem_suspend+0x14d/0x1ad sst_soc_prepare+0x91/0x1dd ? sst_be_hw_params+0x7e/0x7e dpm_prepare+0x39a/0x88b dpm_suspend_start+0x13/0x9d suspend_devices_and_enter+0x18f/0xbd7 ? arch_suspend_enable_irqs+0x11/0x11 ? printk+0xd9/0x12d ? lock_release+0x95f/0x95f ? log_buf_vmcoreinfo_setup+0x131/0x131 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? __pm_pr_dbg+0x186/0x190 ? pm_notifier_call_chain+0x39/0x39 ? suspend_test+0x9d/0x9d pm_suspend+0x2f4/0x728 ? trace_suspend_resume+0x3da/0x3da ? lock_release+0x95f/0x95f ? kernfs_fop_write+0x19f/0x32d state_store+0xd8/0x147 ? sysfs_kf_read+0x155/0x155 kernfs_fop_write+0x23e/0x32d __vfs_write+0x108/0x608 ? vfs_read+0x2e9/0x2e9 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? debug_smp_processor_id+0x10/0x10 ? selinux_file_permission+0x1c5/0x3c8 ? rcu_sync_lockdep_assert+0x6a/0xad ? __sb_start_write+0x129/0x2ac vfs_write+0x1aa/0x434 ksys_write+0xfe/0x1be ? __ia32_sys_read+0x82/0x82 do_syscall_64+0xcd/0x120 entry_SYSCALL_64_after_hwframe+0x49/0xbe In the observed situation, the problem is seen because the codec driver failed to probe due to a hardware problem. max98090 i2c-193C9890:00: Failed to read device revision: -1 max98090 i2c-193C9890:00: ASoC: failed to probe component -1 cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1 cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1 cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1 The problem is similar to the problem solved with commit 2fc995a ("ASoC: intel: Fix crash at suspend/resume without card registration"), but codec registration fails at a later point. At that time, the pointer checked with the above mentioned commit is already set, but it is not cleared if the device is subsequently removed. Adding a remove function to clear the pointer fixes the problem. Cc: stable@vger.kernel.org Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Curtis Malainey <cujomalainey@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 8f71370)
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
When a target sends Check Condition, whilst initiator is busy xmiting re-queued data, could lead to race between iscsi_complete_task() and iscsi_xmit_task() and eventually crashing with the following kernel backtrace. [3326150.987523] ALERT: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078 [3326150.987549] ALERT: IP: [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.987571] WARN: PGD 569c8067 PUD 569c9067 PMD 0 [3326150.987582] WARN: Oops: 0002 [#1] SMP [3326150.987593] WARN: Modules linked in: tun nfsv3 nfs fscache dm_round_robin [3326150.987762] WARN: CPU: 2 PID: 8399 Comm: kworker/u32:1 Tainted: G O 4.4.0+2 #1 [3326150.987774] WARN: Hardware name: Dell Inc. PowerEdge R720/0W7JN5, BIOS 2.5.4 01/22/2016 [3326150.987790] WARN: Workqueue: iscsi_q_13 iscsi_xmitworker [libiscsi] [3326150.987799] WARN: task: ffff8801d50f3800 ti: ffff8801f5458000 task.ti: ffff8801f5458000 [3326150.987810] WARN: RIP: e030:[<ffffffffa05ce70d>] [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.987825] WARN: RSP: e02b:ffff8801f545bdb0 EFLAGS: 00010246 [3326150.987831] WARN: RAX: 00000000ffffffc3 RBX: ffff880282d2ab20 RCX: ffff88026b6ac480 [3326150.987842] WARN: RDX: 0000000000000000 RSI: 00000000fffffe01 RDI: ffff880282d2ab20 [3326150.987852] WARN: RBP: ffff8801f545bdc8 R08: 0000000000000000 R09: 0000000000000008 [3326150.987862] WARN: R10: 0000000000000000 R11: 000000000000fe88 R12: 0000000000000000 [3326150.987872] WARN: R13: ffff880282d2abe8 R14: ffff880282d2abd8 R15: ffff880282d2ac08 [3326150.987890] WARN: FS: 00007f5a866b4840(0000) GS:ffff88028a640000(0000) knlGS:0000000000000000 [3326150.987900] WARN: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [3326150.987907] WARN: CR2: 0000000000000078 CR3: 0000000070244000 CR4: 0000000000042660 [3326150.987918] WARN: Stack: [3326150.987924] WARN: ffff880282d2ad58 ffff880282d2ab20 ffff880282d2abe8 ffff8801f545be18 [3326150.987938] WARN: ffffffffa05cea90 ffff880282d2abf8 ffff88026b59cc80 ffff88026b59cc00 [3326150.987951] WARN: ffff88022acf32c0 ffff880289491800 ffff880255a80800 0000000000000400 [3326150.987964] WARN: Call Trace: [3326150.987975] WARN: [<ffffffffa05cea90>] iscsi_xmitworker+0x2f0/0x360 [libiscsi] [3326150.987988] WARN: [<ffffffff8108862c>] process_one_work+0x1fc/0x3b0 [3326150.987997] WARN: [<ffffffff81088f95>] worker_thread+0x2a5/0x470 [3326150.988006] WARN: [<ffffffff8159cad8>] ? __schedule+0x648/0x870 [3326150.988015] WARN: [<ffffffff81088cf0>] ? rescuer_thread+0x300/0x300 [3326150.988023] WARN: [<ffffffff8108ddf5>] kthread+0xd5/0xe0 [3326150.988031] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110 [3326150.988040] WARN: [<ffffffff815a0bcf>] ret_from_fork+0x3f/0x70 [3326150.988048] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110 [3326150.988127] ALERT: RIP [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.988138] WARN: RSP <ffff8801f545bdb0> [3326150.988144] WARN: CR2: 0000000000000078 [3326151.020366] WARN: ---[ end trace 1c60974d4678d81b ]--- Commit 6f8830f ("scsi: libiscsi: add lock around task lists to fix list corruption regression") introduced "taskqueuelock" to fix list corruption during the race, but this wasn't enough. Re-setting of conn->task to NULL, could race with iscsi_xmit_task(). iscsi_complete_task() { .... if (conn->task == task) conn->task = NULL; } conn->task in iscsi_xmit_task() could be NULL and so will be task. __iscsi_get_task(task) will crash (NullPtr de-ref), trying to access refcount. iscsi_xmit_task() { struct iscsi_task *task = conn->task; __iscsi_get_task(task); } This commit will take extra conn->session->back_lock in iscsi_xmit_task() to ensure iscsi_xmit_task() waits for iscsi_complete_task(), if iscsi_complete_task() wins the race. If iscsi_xmit_task() wins the race, iscsi_xmit_task() increments task->refcount (__iscsi_get_task) ensuring iscsi_complete_task() will not iscsi_free_task(). Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Acked-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
The compound IOMMU group rework moved iommu_register_group() together in pnv_pci_ioda_setup_iommu_api() (which is a part of ppc_md.pcibios_fixup). As the result, pnv_ioda_setup_bus_iommu_group() does not create groups any more, it only adds devices to groups. This works fine for boot time devices. However IOMMU groups for SRIOV's VFs were added by pnv_ioda_setup_bus_iommu_group() so this got broken: pnv_tce_iommu_bus_notifier() expects a group to be registered for VF and it is not. This adds missing group registration and adds a NULL pointer check into the bus notifier so we won't crash if there is no group, although it is not expected to happen now because of the change above. Example oops seen prior to this patch: $ echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/sriov_numvfs Unable to handle kernel paging request for data at address 0x00000030 Faulting instruction address: 0xc0000000004a6018 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA PowerNV CPU: 46 PID: 7006 Comm: bash Not tainted 4.15-ish NIP: c0000000004a6018 LR: c0000000004a6014 CTR: 0000000000000000 REGS: c000008fc876b400 TRAP: 0300 Not tainted (4.15-ish) MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CFAR: c000000000d0be20 DAR: 0000000000000030 DSISR: 40000000 SOFTE: 1 ... NIP sysfs_do_create_link_sd.isra.0+0x68/0x150 LR sysfs_do_create_link_sd.isra.0+0x64/0x150 Call Trace: pci_dev_type+0x0/0x30 (unreliable) iommu_group_add_device+0x8c/0x600 iommu_add_device+0xe8/0x180 pnv_tce_iommu_bus_notifier+0xb0/0xf0 notifier_call_chain+0x9c/0x110 blocking_notifier_call_chain+0x64/0xa0 device_add+0x524/0x7d0 pci_device_add+0x248/0x450 pci_iov_add_virtfn+0x294/0x3e0 pci_enable_sriov+0x43c/0x580 mlx5_core_sriov_configure+0x15c/0x2f0 [mlx5_core] sriov_numvfs_store+0x180/0x240 dev_attr_store+0x3c/0x60 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1ac/0x240 __vfs_write+0x3c/0x70 vfs_write+0xd8/0x220 SyS_write+0x6c/0x110 system_call+0x58/0x6c Fixes: 0bd9716 ("powerpc/powernv/npu: Add compound IOMMU groups") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reported-by: Santwana Samantray <santwana.samantray@in.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
Attempting to avoid cloning the skb when broadcasting by inflating the refcount with sock_hold/sock_put while under RCU lock is dangerous and violates RCU principles. It leads to subtle race conditions when attempting to free the SKB, as we may reference sockets that have already been freed by the stack. Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c4b [006b6b6b6b6b6c4b] address between user and kernel address ranges Internal error: Oops: 96000004 [#1] PREEMPT SMP task: fffffff78f65b380 task.stack: ffffff8049a88000 pc : sock_rfree+0x38/0x6c lr : skb_release_head_state+0x6c/0xcc Process repro (pid: 7117, stack limit = 0xffffff8049a88000) Call trace: sock_rfree+0x38/0x6c skb_release_head_state+0x6c/0xcc skb_release_all+0x1c/0x38 __kfree_skb+0x1c/0x30 kfree_skb+0xd0/0xf4 pfkey_broadcast+0x14c/0x18c pfkey_sendmsg+0x1d8/0x408 sock_sendmsg+0x44/0x60 ___sys_sendmsg+0x1d0/0x2a8 __sys_sendmsg+0x64/0xb4 SyS_sendmsg+0x34/0x4c el0_svc_naked+0x34/0x38 Kernel panic - not syncing: Fatal exception Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
…version Fix a possible NULL pointer dereference in ip6erspan_set_version checking nlattr data pointer kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 7549 Comm: syz-executor432 Not tainted 5.0.0-rc6-next-20190218 thesofproject#37 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:ip6erspan_set_version+0x5c/0x350 net/ipv6/ip6_gre.c:1726 Code: 07 38 d0 7f 08 84 c0 0f 85 9f 02 00 00 49 8d bc 24 b0 00 00 00 c6 43 54 01 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 9a 02 00 00 4d 8b ac 24 b0 00 00 00 4d 85 ed 0f RSP: 0018:ffff888089ed7168 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff8880869d6e58 RCX: 0000000000000000 RDX: 0000000000000016 RSI: ffffffff862736b4 RDI: 00000000000000b0 RBP: ffff888089ed7180 R08: 1ffff11010d3adcb R09: ffff8880869d6e58 R10: ffffed1010d3add5 R11: ffff8880869d6eaf R12: 0000000000000000 R13: ffffffff8931f8c0 R14: ffffffff862825d0 R15: ffff8880869d6e58 FS: 0000000000b3d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000184 CR3: 0000000092cc5000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip6erspan_newlink+0x66/0x7b0 net/ipv6/ip6_gre.c:2210 __rtnl_newlink+0x107b/0x16c0 net/core/rtnetlink.c:3176 rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3234 rtnetlink_rcv_msg+0x465/0xb00 net/core/rtnetlink.c:5192 netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2485 rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5210 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1925 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xdd/0x130 net/socket.c:631 ___sys_sendmsg+0x806/0x930 net/socket.c:2136 __sys_sendmsg+0x105/0x1d0 net/socket.c:2174 __do_sys_sendmsg net/socket.c:2183 [inline] __se_sys_sendmsg net/socket.c:2181 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2181 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440159 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffa69156e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440159 RDX: 0000000000000000 RSI: 0000000020001340 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000001 R09: 00000000004002c8 R10: 0000000000000011 R11: 0000000000000246 R12: 00000000004019e0 R13: 0000000000401a70 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace 09f8a7d13b4faaa1 ]--- RIP: 0010:ip6erspan_set_version+0x5c/0x350 net/ipv6/ip6_gre.c:1726 Code: 07 38 d0 7f 08 84 c0 0f 85 9f 02 00 00 49 8d bc 24 b0 00 00 00 c6 43 54 01 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 9a 02 00 00 4d 8b ac 24 b0 00 00 00 4d 85 ed 0f RSP: 0018:ffff888089ed7168 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff8880869d6e58 RCX: 0000000000000000 RDX: 0000000000000016 RSI: ffffffff862736b4 RDI: 00000000000000b0 RBP: ffff888089ed7180 R08: 1ffff11010d3adcb R09: ffff8880869d6e58 R10: ffffed1010d3add5 R11: ffff8880869d6eaf R12: 0000000000000000 R13: ffffffff8931f8c0 R14: ffffffff862825d0 R15: ffff8880869d6e58 FS: 0000000000b3d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000184 CR3: 0000000092cc5000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 4974d5f ("net: ip6_gre: initialize erspan_ver just for erspan tunnels") Reported-and-tested-by: syzbot+30191cf1057abd3064af@syzkaller.appspotmail.com Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
The SHA256 code we adopted from the OpenSSL project uses a rather peculiar way to take the address of the round constant table: it takes the address of the sha256_block_data_order() routine, and substracts a constant known quantity to arrive at the base of the table, which is emitted by the same assembler code right before the routine's entry point. However, recent versions of binutils have helpfully changed the behavior of references emitted via an ADR instruction when running in Thumb2 mode: it now takes the Thumb execution mode bit into account, which is bit 0 af the address. This means the produced table address also has bit 0 set, and so we end up with an address value pointing 1 byte past the start of the table, which results in crashes such as Unable to handle kernel paging request at virtual address bf825000 pgd = 42f44b1 [bf825000] *pgd=80000040206003, *pmd=5f1bd003, *pte=00000000 Internal error: Oops: 207 [#1] PREEMPT SMP THUMB2 Modules linked in: sha256_arm(+) sha1_arm_ce sha1_arm ... CPU: 7 PID: 396 Comm: cryptomgr_test Not tainted 5.0.0-rc6+ thesofproject#144 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 PC is at sha256_block_data_order+0xaaa/0xb30 [sha256_arm] LR is at __this_module+0x17fd/0xffffe800 [sha256_arm] pc : [<bf820bca>] lr : [<bf824ffd>] psr: 800b0033 sp : ebc8bbe ip : faaabe1c fp : 2fdd3433 r10: 4c5f1692 r9 : e43037df r8 : b04b0a5a r7 : c369d722 r6 : 39c3693e r5 : 7a013189 r4 : 1580d26b r3 : 8762a9b0 r2 : eea9c2cd r1 : 3e9ab536 r0 : 1dea4ae7 Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment user Control: 70c5383d Table: 6b8467c0 DAC: dbadc0de Process cryptomgr_test (pid: 396, stack limit = 0x69e1fe23) Stack: (0xebc8bbe8 to 0xebc8c000) ... unwind: Unknown symbol address bf820bca unwind: Index not found bf820bca Code: 441a ea80 40f9 440a (f85e) 3b04 ---[ end trace e560cce92700ef8a ]--- Given that this affects older kernels as well, in case they are built with a recent toolchain, apply a minimal backportable fix, which is to emit another non-code label at the start of the routine, and reference that instead. (This is similar to the current upstream state of this file in OpenSSL) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
The SHA512 code we adopted from the OpenSSL project uses a rather peculiar way to take the address of the round constant table: it takes the address of the sha256_block_data_order() routine, and substracts a constant known quantity to arrive at the base of the table, which is emitted by the same assembler code right before the routine's entry point. However, recent versions of binutils have helpfully changed the behavior of references emitted via an ADR instruction when running in Thumb2 mode: it now takes the Thumb execution mode bit into account, which is bit 0 af the address. This means the produced table address also has bit 0 set, and so we end up with an address value pointing 1 byte past the start of the table, which results in crashes such as Unable to handle kernel paging request at virtual address bf825000 pgd = 42f44b1 [bf825000] *pgd=80000040206003, *pmd=5f1bd003, *pte=00000000 Internal error: Oops: 207 [#1] PREEMPT SMP THUMB2 Modules linked in: sha256_arm(+) sha1_arm_ce sha1_arm ... CPU: 7 PID: 396 Comm: cryptomgr_test Not tainted 5.0.0-rc6+ thesofproject#144 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 PC is at sha256_block_data_order+0xaaa/0xb30 [sha256_arm] LR is at __this_module+0x17fd/0xffffe800 [sha256_arm] pc : [<bf820bca>] lr : [<bf824ffd>] psr: 800b0033 sp : ebc8bbe ip : faaabe1c fp : 2fdd3433 r10: 4c5f1692 r9 : e43037df r8 : b04b0a5a r7 : c369d722 r6 : 39c3693e r5 : 7a013189 r4 : 1580d26b r3 : 8762a9b0 r2 : eea9c2cd r1 : 3e9ab536 r0 : 1dea4ae7 Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment user Control: 70c5383d Table: 6b8467c0 DAC: dbadc0de Process cryptomgr_test (pid: 396, stack limit = 0x69e1fe23) Stack: (0xebc8bbe8 to 0xebc8c000) ... unwind: Unknown symbol address bf820bca unwind: Index not found bf820bca Code: 441a ea80 40f9 440a (f85e) 3b04 ---[ end trace e560cce92700ef8a ]--- Given that this affects older kernels as well, in case they are built with a recent toolchain, apply a minimal backportable fix, which is to emit another non-code label at the start of the routine, and reference that instead. (This is similar to the current upstream state of this file in OpenSSL) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
Commit 5738459 ("iommu/vt-d: Store bus information in RMRR PCI device path") changed the type of the path data, however, the change in path type was not reflected in size calculations. Update to use the correct type and prevent a buffer overflow. This bug manifests in systems with deep PCI hierarchies, and can lead to an overflow of the static allocated buffer (dmar_pci_notify_info_buf), or can lead to overflow of slab-allocated data. BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0 Write of size 1 at addr ffffffff90445d80 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.14.87-rt49-02406-gd0a0e96 #1 Call Trace: ? dump_stack+0x46/0x59 ? print_address_description+0x1df/0x290 ? dmar_alloc_pci_notify_info+0x1d5/0x2e0 ? kasan_report+0x256/0x340 ? dmar_alloc_pci_notify_info+0x1d5/0x2e0 ? e820__memblock_setup+0xb0/0xb0 ? dmar_dev_scope_init+0x424/0x48f ? __down_write_common+0x1ec/0x230 ? dmar_dev_scope_init+0x48f/0x48f ? dmar_free_unused_resources+0x109/0x109 ? cpumask_next+0x16/0x20 ? __kmem_cache_create+0x392/0x430 ? kmem_cache_create+0x135/0x2f0 ? e820__memblock_setup+0xb0/0xb0 ? intel_iommu_init+0x170/0x1848 ? _raw_spin_unlock_irqrestore+0x32/0x60 ? migrate_enable+0x27a/0x5b0 ? sched_setattr+0x20/0x20 ? migrate_disable+0x1fc/0x380 ? task_rq_lock+0x170/0x170 ? try_to_run_init_process+0x40/0x40 ? locks_remove_file+0x85/0x2f0 ? dev_prepare_static_identity_mapping+0x78/0x78 ? rt_spin_unlock+0x39/0x50 ? lockref_put_or_lock+0x2a/0x40 ? dput+0x128/0x2f0 ? __rcu_read_unlock+0x66/0x80 ? __fput+0x250/0x300 ? __rcu_read_lock+0x1b/0x30 ? mntput_no_expire+0x38/0x290 ? e820__memblock_setup+0xb0/0xb0 ? pci_iommu_init+0x25/0x63 ? pci_iommu_init+0x25/0x63 ? do_one_initcall+0x7e/0x1c0 ? initcall_blacklisted+0x120/0x120 ? kernel_init_freeable+0x27b/0x307 ? rest_init+0xd0/0xd0 ? kernel_init+0xf/0x120 ? rest_init+0xd0/0xd0 ? ret_from_fork+0x1f/0x40 The buggy address belongs to the variable: dmar_pci_notify_info_buf+0x40/0x60 Fixes: 5738459 ("iommu/vt-d: Store bus information in RMRR PCI device path") Signed-off-by: Julia Cartwright <julia@ni.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
It can be reproduced by following steps: 1. virtio_net NIC is configured with gso/tso on 2. configure nginx as http server with an index file bigger than 1M bytes 3. use tc netem to produce duplicate packets and delay: tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90% 4. continually curl the nginx http server to get index file on client 5. BUG_ON is seen quickly [10258690.371129] kernel BUG at net/core/skbuff.c:4028! [10258690.371748] invalid opcode: 0000 [#1] SMP PTI [10258690.372094] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 5.0.0-rc6 #2 [10258690.372094] RSP: 0018:ffffa05797b43da0 EFLAGS: 00010202 [10258690.372094] RBP: 00000000000005ea R08: 0000000000000000 R09: 00000000000005ea [10258690.372094] R10: ffffa0579334d800 R11: 00000000000002c0 R12: 0000000000000002 [10258690.372094] R13: 0000000000000000 R14: ffffa05793122900 R15: ffffa0578f7cb028 [10258690.372094] FS: 0000000000000000(0000) GS:ffffa05797b40000(0000) knlGS:0000000000000000 [10258690.372094] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [10258690.372094] CR2: 00007f1a6dc00868 CR3: 000000001000e000 CR4: 00000000000006e0 [10258690.372094] Call Trace: [10258690.372094] <IRQ> [10258690.372094] skb_to_sgvec+0x11/0x40 [10258690.372094] start_xmit+0x38c/0x520 [virtio_net] [10258690.372094] dev_hard_start_xmit+0x9b/0x200 [10258690.372094] sch_direct_xmit+0xff/0x260 [10258690.372094] __qdisc_run+0x15e/0x4e0 [10258690.372094] net_tx_action+0x137/0x210 [10258690.372094] __do_softirq+0xd6/0x2a9 [10258690.372094] irq_exit+0xde/0xf0 [10258690.372094] smp_apic_timer_interrupt+0x74/0x140 [10258690.372094] apic_timer_interrupt+0xf/0x20 [10258690.372094] </IRQ> In __skb_to_sgvec(), the skb->len is not equal to the sum of the skb's linear data size and nonlinear data size, thus BUG_ON triggered. Because the skb is cloned and a part of nonlinear data is split off. Duplicate packet is cloned in netem_enqueue() and may be delayed some time in qdisc. When qdisc len reached the limit and returns NET_XMIT_DROP, the skb will be retransmit later in write queue. the skb will be fragmented by tso_fragment(), the limit size that depends on cwnd and mss decrease, the skb's nonlinear data will be split off. The length of the skb cloned by netem will not be updated. When we use virtio_net NIC and invoke skb_to_sgvec(), the BUG_ON trigger. To fix it, netem returns NET_XMIT_SUCCESS to upper stack when it clones a duplicate packet. Fixes: 35d889d ("sch_netem: fix skb leak in netem_enqueue()") Signed-off-by: Sheng Lan <lansheng@huawei.com> Reported-by: Qin Ji <jiqin.ji@huawei.com> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
If codec registration fails after the ASoC Intel SST driver has been probed, the kernel will Oops and crash at suspend/resume. general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 PID: 2811 Comm: cat Tainted: G W 4.19.30 thesofproject#15 Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014 RIP: 0010:snd_soc_suspend+0x5a/0xd21 Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89 fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b RSP: 0018:ffff888035407750 EFLAGS: 00010202 RAX: 0000000000000034 RBX: 00000000000001a0 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88805c417098 RBP: ffff8880354077b0 R08: dffffc0000000000 R09: ffffed100b975718 R10: 0000000000000001 R11: ffffffff949ea4a3 R12: 1ffff1100b975746 R13: dffffc0000000000 R14: ffff88805cba4588 R15: dffffc0000000000 FS: 0000794a78e91b80(0000) GS:ffff888068d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007bd5283ccf58 CR3: 000000004b7aa000 CR4: 00000000001006e0 Call Trace: ? dpm_complete+0x67b/0x67b ? i915_gem_suspend+0x14d/0x1ad sst_soc_prepare+0x91/0x1dd ? sst_be_hw_params+0x7e/0x7e dpm_prepare+0x39a/0x88b dpm_suspend_start+0x13/0x9d suspend_devices_and_enter+0x18f/0xbd7 ? arch_suspend_enable_irqs+0x11/0x11 ? printk+0xd9/0x12d ? lock_release+0x95f/0x95f ? log_buf_vmcoreinfo_setup+0x131/0x131 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? __pm_pr_dbg+0x186/0x190 ? pm_notifier_call_chain+0x39/0x39 ? suspend_test+0x9d/0x9d pm_suspend+0x2f4/0x728 ? trace_suspend_resume+0x3da/0x3da ? lock_release+0x95f/0x95f ? kernfs_fop_write+0x19f/0x32d state_store+0xd8/0x147 ? sysfs_kf_read+0x155/0x155 kernfs_fop_write+0x23e/0x32d __vfs_write+0x108/0x608 ? vfs_read+0x2e9/0x2e9 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? debug_smp_processor_id+0x10/0x10 ? selinux_file_permission+0x1c5/0x3c8 ? rcu_sync_lockdep_assert+0x6a/0xad ? __sb_start_write+0x129/0x2ac vfs_write+0x1aa/0x434 ksys_write+0xfe/0x1be ? __ia32_sys_read+0x82/0x82 do_syscall_64+0xcd/0x120 entry_SYSCALL_64_after_hwframe+0x49/0xbe In the observed situation, the problem is seen because the codec driver failed to probe due to a hardware problem. max98090 i2c-193C9890:00: Failed to read device revision: -1 max98090 i2c-193C9890:00: ASoC: failed to probe component -1 cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1 cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1 cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1 The problem is similar to the problem solved with commit 2fc995a ("ASoC: intel: Fix crash at suspend/resume without card registration"), but codec registration fails at a later point. At that time, the pointer checked with the above mentioned commit is already set, but it is not cleared if the device is subsequently removed. Adding a remove function to clear the pointer fixes the problem. Cc: stable@vger.kernel.org Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Curtis Malainey <cujomalainey@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 8f71370)
bardliao
pushed a commit
that referenced
this pull request
Apr 26, 2019
If codec registration fails after the ASoC Intel SST driver has been probed, the kernel will Oops and crash at suspend/resume. general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 PID: 2811 Comm: cat Tainted: G W 4.19.30 thesofproject#15 Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014 RIP: 0010:snd_soc_suspend+0x5a/0xd21 Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89 fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b RSP: 0018:ffff888035407750 EFLAGS: 00010202 RAX: 0000000000000034 RBX: 00000000000001a0 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88805c417098 RBP: ffff8880354077b0 R08: dffffc0000000000 R09: ffffed100b975718 R10: 0000000000000001 R11: ffffffff949ea4a3 R12: 1ffff1100b975746 R13: dffffc0000000000 R14: ffff88805cba4588 R15: dffffc0000000000 FS: 0000794a78e91b80(0000) GS:ffff888068d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007bd5283ccf58 CR3: 000000004b7aa000 CR4: 00000000001006e0 Call Trace: ? dpm_complete+0x67b/0x67b ? i915_gem_suspend+0x14d/0x1ad sst_soc_prepare+0x91/0x1dd ? sst_be_hw_params+0x7e/0x7e dpm_prepare+0x39a/0x88b dpm_suspend_start+0x13/0x9d suspend_devices_and_enter+0x18f/0xbd7 ? arch_suspend_enable_irqs+0x11/0x11 ? printk+0xd9/0x12d ? lock_release+0x95f/0x95f ? log_buf_vmcoreinfo_setup+0x131/0x131 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? __pm_pr_dbg+0x186/0x190 ? pm_notifier_call_chain+0x39/0x39 ? suspend_test+0x9d/0x9d pm_suspend+0x2f4/0x728 ? trace_suspend_resume+0x3da/0x3da ? lock_release+0x95f/0x95f ? kernfs_fop_write+0x19f/0x32d state_store+0xd8/0x147 ? sysfs_kf_read+0x155/0x155 kernfs_fop_write+0x23e/0x32d __vfs_write+0x108/0x608 ? vfs_read+0x2e9/0x2e9 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? debug_smp_processor_id+0x10/0x10 ? selinux_file_permission+0x1c5/0x3c8 ? rcu_sync_lockdep_assert+0x6a/0xad ? __sb_start_write+0x129/0x2ac vfs_write+0x1aa/0x434 ksys_write+0xfe/0x1be ? __ia32_sys_read+0x82/0x82 do_syscall_64+0xcd/0x120 entry_SYSCALL_64_after_hwframe+0x49/0xbe In the observed situation, the problem is seen because the codec driver failed to probe due to a hardware problem. max98090 i2c-193C9890:00: Failed to read device revision: -1 max98090 i2c-193C9890:00: ASoC: failed to probe component -1 cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1 cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1 cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1 The problem is similar to the problem solved with commit 2fc995a ("ASoC: intel: Fix crash at suspend/resume without card registration"), but codec registration fails at a later point. At that time, the pointer checked with the above mentioned commit is already set, but it is not cleared if the device is subsequently removed. Adding a remove function to clear the pointer fixes the problem. Cc: stable@vger.kernel.org Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Curtis Malainey <cujomalainey@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 8f71370)
bardliao
pushed a commit
that referenced
this pull request
Apr 28, 2019
If codec registration fails after the ASoC Intel SST driver has been probed, the kernel will Oops and crash at suspend/resume. general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 PID: 2811 Comm: cat Tainted: G W 4.19.30 thesofproject#15 Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014 RIP: 0010:snd_soc_suspend+0x5a/0xd21 Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89 fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b RSP: 0018:ffff888035407750 EFLAGS: 00010202 RAX: 0000000000000034 RBX: 00000000000001a0 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88805c417098 RBP: ffff8880354077b0 R08: dffffc0000000000 R09: ffffed100b975718 R10: 0000000000000001 R11: ffffffff949ea4a3 R12: 1ffff1100b975746 R13: dffffc0000000000 R14: ffff88805cba4588 R15: dffffc0000000000 FS: 0000794a78e91b80(0000) GS:ffff888068d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007bd5283ccf58 CR3: 000000004b7aa000 CR4: 00000000001006e0 Call Trace: ? dpm_complete+0x67b/0x67b ? i915_gem_suspend+0x14d/0x1ad sst_soc_prepare+0x91/0x1dd ? sst_be_hw_params+0x7e/0x7e dpm_prepare+0x39a/0x88b dpm_suspend_start+0x13/0x9d suspend_devices_and_enter+0x18f/0xbd7 ? arch_suspend_enable_irqs+0x11/0x11 ? printk+0xd9/0x12d ? lock_release+0x95f/0x95f ? log_buf_vmcoreinfo_setup+0x131/0x131 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? __pm_pr_dbg+0x186/0x190 ? pm_notifier_call_chain+0x39/0x39 ? suspend_test+0x9d/0x9d pm_suspend+0x2f4/0x728 ? trace_suspend_resume+0x3da/0x3da ? lock_release+0x95f/0x95f ? kernfs_fop_write+0x19f/0x32d state_store+0xd8/0x147 ? sysfs_kf_read+0x155/0x155 kernfs_fop_write+0x23e/0x32d __vfs_write+0x108/0x608 ? vfs_read+0x2e9/0x2e9 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? debug_smp_processor_id+0x10/0x10 ? selinux_file_permission+0x1c5/0x3c8 ? rcu_sync_lockdep_assert+0x6a/0xad ? __sb_start_write+0x129/0x2ac vfs_write+0x1aa/0x434 ksys_write+0xfe/0x1be ? __ia32_sys_read+0x82/0x82 do_syscall_64+0xcd/0x120 entry_SYSCALL_64_after_hwframe+0x49/0xbe In the observed situation, the problem is seen because the codec driver failed to probe due to a hardware problem. max98090 i2c-193C9890:00: Failed to read device revision: -1 max98090 i2c-193C9890:00: ASoC: failed to probe component -1 cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1 cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1 cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1 The problem is similar to the problem solved with commit 2fc995a ("ASoC: intel: Fix crash at suspend/resume without card registration"), but codec registration fails at a later point. At that time, the pointer checked with the above mentioned commit is already set, but it is not cleared if the device is subsequently removed. Adding a remove function to clear the pointer fixes the problem. Cc: stable@vger.kernel.org Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Curtis Malainey <cujomalainey@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> (cherry picked from commit 8f71370)
bardliao
pushed a commit
that referenced
this pull request
Apr 29, 2019
With the following commit: 73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS") ... the hotplug code attempted to detect when SMT was disabled by BIOS, in which case it reported SMT as permanently disabled. However, that code broke a virt hotplug scenario, where the guest is booted with only primary CPU threads, and a sibling is brought online later. The problem is that there doesn't seem to be a way to reliably distinguish between the HW "SMT disabled by BIOS" case and the virt "sibling not yet brought online" case. So the above-mentioned commit was a bit misguided, as it permanently disabled SMT for both cases, preventing future virt sibling hotplugs. Going back and reviewing the original problems which were attempted to be solved by that commit, when SMT was disabled in BIOS: 1) /sys/devices/system/cpu/smt/control showed "on" instead of "notsupported"; and 2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning. I'd propose that we instead consider #1 above to not actually be a problem. Because, at least in the virt case, it's possible that SMT wasn't disabled by BIOS and a sibling thread could be brought online later. So it makes sense to just always default the smt control to "on" to allow for that possibility (assuming cpuid indicates that the CPU supports SMT). The real problem is #2, which has a simple fix: change vmx_vm_init() to query the actual current SMT state -- i.e., whether any siblings are currently online -- instead of looking at the SMT "control" sysfs value. So fix it by: a) reverting the original "fix" and its followup fix: 73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS") bc2d8d2 ("cpu/hotplug: Fix SMT supported evaluation") and b) changing vmx_vm_init() to query the actual current SMT state -- instead of the sysfs control value -- to determine whether the L1TF warning is needed. This also requires the 'sched_smt_present' variable to exported, instead of 'cpu_smt_control'. Fixes: 73d5e2b ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Joe Mario <jmario@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
bardliao
pushed a commit
that referenced
this pull request
Apr 29, 2019
Lockdep found a potential deadlock between cpu_hotplug_lock, bpf_event_mutex, and cpuctx_mutex: [ 13.007000] WARNING: possible circular locking dependency detected [ 13.007587] 5.0.0-rc3-00018-g2fa53f892422-dirty thesofproject#477 Not tainted [ 13.008124] ------------------------------------------------------ [ 13.008624] test_progs/246 is trying to acquire lock: [ 13.009030] 0000000094160d1d (tracepoints_mutex){+.+.}, at: tracepoint_probe_register_prio+0x2d/0x300 [ 13.009770] [ 13.009770] but task is already holding lock: [ 13.010239] 00000000d663ef86 (bpf_event_mutex){+.+.}, at: bpf_probe_register+0x1d/0x60 [ 13.010877] [ 13.010877] which lock already depends on the new lock. [ 13.010877] [ 13.011532] [ 13.011532] the existing dependency chain (in reverse order) is: [ 13.012129] [ 13.012129] -> #4 (bpf_event_mutex){+.+.}: [ 13.012582] perf_event_query_prog_array+0x9b/0x130 [ 13.013016] _perf_ioctl+0x3aa/0x830 [ 13.013354] perf_ioctl+0x2e/0x50 [ 13.013668] do_vfs_ioctl+0x8f/0x6a0 [ 13.014003] ksys_ioctl+0x70/0x80 [ 13.014320] __x64_sys_ioctl+0x16/0x20 [ 13.014668] do_syscall_64+0x4a/0x180 [ 13.015007] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.015469] [ 13.015469] -> #3 (&cpuctx_mutex){+.+.}: [ 13.015910] perf_event_init_cpu+0x5a/0x90 [ 13.016291] perf_event_init+0x1b2/0x1de [ 13.016654] start_kernel+0x2b8/0x42a [ 13.016995] secondary_startup_64+0xa4/0xb0 [ 13.017382] [ 13.017382] -> #2 (pmus_lock){+.+.}: [ 13.017794] perf_event_init_cpu+0x21/0x90 [ 13.018172] cpuhp_invoke_callback+0xb3/0x960 [ 13.018573] _cpu_up+0xa7/0x140 [ 13.018871] do_cpu_up+0xa4/0xc0 [ 13.019178] smp_init+0xcd/0xd2 [ 13.019483] kernel_init_freeable+0x123/0x24f [ 13.019878] kernel_init+0xa/0x110 [ 13.020201] ret_from_fork+0x24/0x30 [ 13.020541] [ 13.020541] -> #1 (cpu_hotplug_lock.rw_sem){++++}: [ 13.021051] static_key_slow_inc+0xe/0x20 [ 13.021424] tracepoint_probe_register_prio+0x28c/0x300 [ 13.021891] perf_trace_event_init+0x11f/0x250 [ 13.022297] perf_trace_init+0x6b/0xa0 [ 13.022644] perf_tp_event_init+0x25/0x40 [ 13.023011] perf_try_init_event+0x6b/0x90 [ 13.023386] perf_event_alloc+0x9a8/0xc40 [ 13.023754] __do_sys_perf_event_open+0x1dd/0xd30 [ 13.024173] do_syscall_64+0x4a/0x180 [ 13.024519] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.024968] [ 13.024968] -> #0 (tracepoints_mutex){+.+.}: [ 13.025434] __mutex_lock+0x86/0x970 [ 13.025764] tracepoint_probe_register_prio+0x2d/0x300 [ 13.026215] bpf_probe_register+0x40/0x60 [ 13.026584] bpf_raw_tracepoint_open.isra.34+0xa4/0x130 [ 13.027042] __do_sys_bpf+0x94f/0x1a90 [ 13.027389] do_syscall_64+0x4a/0x180 [ 13.027727] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.028171] [ 13.028171] other info that might help us debug this: [ 13.028171] [ 13.028807] Chain exists of: [ 13.028807] tracepoints_mutex --> &cpuctx_mutex --> bpf_event_mutex [ 13.028807] [ 13.029666] Possible unsafe locking scenario: [ 13.029666] [ 13.030140] CPU0 CPU1 [ 13.030510] ---- ---- [ 13.030875] lock(bpf_event_mutex); [ 13.031166] lock(&cpuctx_mutex); [ 13.031645] lock(bpf_event_mutex); [ 13.032135] lock(tracepoints_mutex); [ 13.032441] [ 13.032441] *** DEADLOCK *** [ 13.032441] [ 13.032911] 1 lock held by test_progs/246: [ 13.033239] #0: 00000000d663ef86 (bpf_event_mutex){+.+.}, at: bpf_probe_register+0x1d/0x60 [ 13.033909] [ 13.033909] stack backtrace: [ 13.034258] CPU: 1 PID: 246 Comm: test_progs Not tainted 5.0.0-rc3-00018-g2fa53f892422-dirty thesofproject#477 [ 13.034964] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 [ 13.035657] Call Trace: [ 13.035859] dump_stack+0x5f/0x8b [ 13.036130] print_circular_bug.isra.37+0x1ce/0x1db [ 13.036526] __lock_acquire+0x1158/0x1350 [ 13.036852] ? lock_acquire+0x98/0x190 [ 13.037154] lock_acquire+0x98/0x190 [ 13.037447] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.037876] __mutex_lock+0x86/0x970 [ 13.038167] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.038600] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.039028] ? __mutex_lock+0x86/0x970 [ 13.039337] ? __mutex_lock+0x24a/0x970 [ 13.039649] ? bpf_probe_register+0x1d/0x60 [ 13.039992] ? __bpf_trace_sched_wake_idle_without_ipi+0x10/0x10 [ 13.040478] ? tracepoint_probe_register_prio+0x2d/0x300 [ 13.040906] tracepoint_probe_register_prio+0x2d/0x300 [ 13.041325] bpf_probe_register+0x40/0x60 [ 13.041649] bpf_raw_tracepoint_open.isra.34+0xa4/0x130 [ 13.042068] ? __might_fault+0x3e/0x90 [ 13.042374] __do_sys_bpf+0x94f/0x1a90 [ 13.042678] do_syscall_64+0x4a/0x180 [ 13.042975] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 13.043382] RIP: 0033:0x7f23b10a07f9 [ 13.045155] RSP: 002b:00007ffdef42fdd8 EFLAGS: 00000202 ORIG_RAX: 0000000000000141 [ 13.045759] RAX: ffffffffffffffda RBX: 00007ffdef42ff70 RCX: 00007f23b10a07f9 [ 13.046326] RDX: 0000000000000070 RSI: 00007ffdef42fe10 RDI: 0000000000000011 [ 13.046893] RBP: 00007ffdef42fdf0 R08: 0000000000000038 R09: 00007ffdef42fe10 [ 13.047462] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 [ 13.048029] R13: 0000000000000016 R14: 00007f23b1db4690 R15: 0000000000000000 Since tracepoints_mutex will be taken in tracepoint_probe_register/unregister() there is no need to take bpf_event_mutex too. bpf_event_mutex is protecting modifications to prog array used in kprobe/perf bpf progs. bpf_raw_tracepoints don't need to take this mutex. Fixes: c4f6699 ("bpf: introduce BPF_RAW_TRACEPOINT") Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.