Skip to content

Odroidxu4 v4.14#1

Closed
tobetter wants to merge 3 commits intomihailescu2m:odroidxu4-4.14.yfrom
tobetter:odroidxu4-v4.14
Closed

Odroidxu4 v4.14#1
tobetter wants to merge 3 commits intomihailescu2m:odroidxu4-4.14.yfrom
tobetter:odroidxu4-v4.14

Conversation

@tobetter
Copy link
Copy Markdown

3 commits are added to enable LCD and IR receiver on CloudShell.

@mihailescu2m
Copy link
Copy Markdown
Owner

hi, i did a force update to fix the frequencies, and this patch does not apply anymore.
can you please resubmit?

Signed-off-by: Dongjin Kim <tobetter@gmail.com>
This driver helps to register the device of GPIO based IR receiver, "gpio-ir-recv",
with the gpio number and pulse trigger when driver is loading. For example,

	# modprobe gpio-ir-recv
	# modprobe gpioplug-ir-recv gpio_nr=24 active_low=1

Change-Id: Ia0cf06140f51eb3f43d89dfe68f442d3e1f1a57e
Signed-off-by: Dongjin Kim <tobetter@gmail.com>
Change-Id: I65e9f38395dddfbe14daf7300a34a955453b5cb4
Signed-off-by: Dongjin Kim <tobetter@gmail.com>
@tobetter tobetter closed this Oct 15, 2017
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
An out of bounds error was detected on an ARM64 target with
Android based kernel 4.9. This occurs while trying to
restore mark on a skb from an inet request socket.

BUG: KASAN: slab-out-of-bounds in socket_match.isra.2+0xc8/0x1f0 net/netfilter/xt_socket.c:248
Read of size 4 at addr ffffffc06a8d824c by task syz-fuzzer/1532
CPU: 7 PID: 1532 Comm: syz-fuzzer Tainted: G        W  O    4.9.41+ #1
Call trace:
[<ffffff900808d2f8>] dump_backtrace+0x0/0x440 arch/arm64/kernel/traps.c:76
[<ffffff900808d760>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226
[<ffffff90085f7dc8>] __dump_stack lib/dump_stack.c:15 [inline]
[<ffffff90085f7dc8>] dump_stack+0xe4/0x134 lib/dump_stack.c:51
[<ffffff900830f358>] print_address_description+0x68/0x258 mm/kasan/report.c:248
[<ffffff900830f770>] kasan_report_error mm/kasan/report.c:347 [inline]
[<ffffff900830f770>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371
[<ffffff900830fdec>] kasan_report+0x5c/0x70 mm/kasan/report.c:372
[<ffffff900830de98>] check_memory_region_inline mm/kasan/kasan.c:308 [inline]
[<ffffff900830de98>] __asan_load4+0x88/0xa0 mm/kasan/kasan.c:740
[<ffffff90097498f8>] socket_match.isra.2+0xc8/0x1f0 net/netfilter/xt_socket.c:248
[<ffffff9009749a5c>] socket_mt4_v1_v2_v3+0x3c/0x48 net/netfilter/xt_socket.c:272
[<ffffff90097f7e4c>] ipt_do_table+0x54c/0xad8 net/ipv4/netfilter/ip_tables.c:311
[<ffffff90097fcf14>] iptable_mangle_hook+0x6c/0x220 net/ipv4/netfilter/iptable_mangle.c:90
...
Allocated by task 1532:
 save_stack_trace_tsk+0x0/0x2a0 arch/arm64/kernel/stacktrace.c:131
 save_stack_trace+0x28/0x38 arch/arm64/kernel/stacktrace.c:215
 save_stack mm/kasan/kasan.c:495 [inline]
 set_track mm/kasan/kasan.c:507 [inline]
 kasan_kmalloc+0xd8/0x188 mm/kasan/kasan.c:599
 kasan_slab_alloc+0x14/0x20 mm/kasan/kasan.c:537
 slab_post_alloc_hook mm/slab.h:417 [inline]
 slab_alloc_node mm/slub.c:2728 [inline]
 slab_alloc mm/slub.c:2736 [inline]
 kmem_cache_alloc+0x14c/0x2e8 mm/slub.c:2741
 reqsk_alloc include/net/request_sock.h:87 [inline]
 inet_reqsk_alloc+0x4c/0x238 net/ipv4/tcp_input.c:6236
 tcp_conn_request+0x2b0/0xea8 net/ipv4/tcp_input.c:6341
 tcp_v4_conn_request+0xe0/0x100 net/ipv4/tcp_ipv4.c:1256
 tcp_rcv_state_process+0x384/0x18a8 net/ipv4/tcp_input.c:5926
 tcp_v4_do_rcv+0x2f0/0x3e0 net/ipv4/tcp_ipv4.c:1430
 tcp_v4_rcv+0x1278/0x1350 net/ipv4/tcp_ipv4.c:1709
 ip_local_deliver_finish+0x174/0x3e0 net/ipv4/ip_input.c:216

v1->v2: Change socket_mt6_v1_v2_v3() as well as mentioned by Eric
v2->v3: Put the correct fixes tag

Fixes: 01555e7 ("netfilter: xt_socket: add XT_SOCKET_RESTORESKMARK flag")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
Check for a NULL dsaddr in filelayout_free_lseg() before calling
nfs4_fl_put_deviceid().  This fixes the following oops:

[ 1967.645207] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 1967.646010] IP: [<ffffffffc06d6aea>] nfs4_put_deviceid_node+0xa/0x90 [nfsv4]
[ 1967.646010] PGD c08bc067 PUD 915d3067 PMD 0
[ 1967.753036] Oops: 0000 [#1] SMP
[ 1967.753036] Modules linked in: nfs_layout_nfsv41_files ext4 mbcache jbd2 loop rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache amd64_edac_mod ipmi_ssif edac_mce_amd edac_core kvm_amd sg kvm ipmi_si ipmi_devintf irqbypass pcspkr k8temp ipmi_msghandler i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common amdkfd amd_iommu_v2 radeon i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mptsas ttm scsi_transport_sas mptscsih drm mptbase serio_raw i2c_core bnx2 dm_mirror dm_region_hash dm_log dm_mod
[ 1967.790031] CPU: 2 PID: 1370 Comm: ls Not tainted 3.10.0-709.el7.test.bz1463784.x86_64 #1
[ 1967.790031] Hardware name: IBM BladeCenter LS21 -[7971AC1]-/Server Blade, BIOS -[BAE155AUS-1.10]- 06/03/2009
[ 1967.790031] task: ffff8800c42a3f40 ti: ffff8800c4064000 task.ti: ffff8800c4064000
[ 1967.790031] RIP: 0010:[<ffffffffc06d6aea>]  [<ffffffffc06d6aea>] nfs4_put_deviceid_node+0xa/0x90 [nfsv4]
[ 1967.790031] RSP: 0000:ffff8800c4067978  EFLAGS: 00010246
[ 1967.790031] RAX: ffffffffc062f000 RBX: ffff8801d468a540 RCX: dead000000000200
[ 1967.790031] RDX: ffff8800c40679f8 RSI: ffff8800c4067a0c RDI: 0000000000000000
[ 1967.790031] RBP: ffff8800c4067980 R08: ffff8801d468a540 R09: 0000000000000000
[ 1967.790031] R10: 0000000000000000 R11: ffffffffffffffff R12: ffff8801d468a540
[ 1967.790031] R13: ffff8800c40679f8 R14: ffff8801d5645300 R15: ffff880126f15ff0
[ 1967.790031] FS:  00007f11053c9800(0000) GS:ffff88012bd00000(0000) knlGS:0000000000000000
[ 1967.790031] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1967.790031] CR2: 0000000000000030 CR3: 0000000094b55000 CR4: 00000000000007e0
[ 1967.790031] Stack:
[ 1967.790031]  ffff8801d468a540 ffff8800c4067990 ffffffffc062d2fe ffff8800c40679b0
[ 1967.790031]  ffffffffc062b5b4 ffff8800c40679f8 ffff8801d468a540 ffff8800c40679d8
[ 1967.790031]  ffffffffc06d39af ffff8800c40679f8 ffff880126f16078 0000000000000001
[ 1967.790031] Call Trace:
[ 1967.790031]  [<ffffffffc062d2fe>] nfs4_fl_put_deviceid+0xe/0x10 [nfs_layout_nfsv41_files]
[ 1967.790031]  [<ffffffffc062b5b4>] filelayout_free_lseg+0x24/0x90 [nfs_layout_nfsv41_files]
[ 1967.790031]  [<ffffffffc06d39af>] pnfs_free_lseg_list+0x5f/0x80 [nfsv4]
[ 1967.790031]  [<ffffffffc06d5a67>] _pnfs_return_layout+0x157/0x270 [nfsv4]
[ 1967.790031]  [<ffffffffc06c17dd>] nfs4_evict_inode+0x4d/0x70 [nfsv4]
[ 1967.790031]  [<ffffffff8121de19>] evict+0xa9/0x180
[ 1967.790031]  [<ffffffff8121e729>] iput+0xf9/0x190
[ 1967.790031]  [<ffffffffc0652cea>] nfs_dentry_iput+0x3a/0x50 [nfs]
[ 1967.790031]  [<ffffffff8121ab4f>] shrink_dentry_list+0x20f/0x490
[ 1967.790031]  [<ffffffff8121b018>] d_invalidate+0xd8/0x150
[ 1967.790031]  [<ffffffffc065446b>] nfs_readdir_page_filler+0x40b/0x600 [nfs]
[ 1967.790031]  [<ffffffffc0654bbd>] nfs_readdir_xdr_to_array+0x20d/0x3b0 [nfs]
[ 1967.790031]  [<ffffffff811f3482>] ? __mem_cgroup_commit_charge+0xe2/0x2f0
[ 1967.790031]  [<ffffffff81183208>] ? __add_to_page_cache_locked+0x48/0x170
[ 1967.790031]  [<ffffffffc0654d60>] ? nfs_readdir_xdr_to_array+0x3b0/0x3b0 [nfs]
[ 1967.790031]  [<ffffffffc0654d82>] nfs_readdir_filler+0x22/0x90 [nfs]
[ 1967.790031]  [<ffffffff8118351f>] do_read_cache_page+0x7f/0x190
[ 1967.790031]  [<ffffffff81215d30>] ? fillonedir+0xe0/0xe0
[ 1967.790031]  [<ffffffff8118366c>] read_cache_page+0x1c/0x30
[ 1967.790031]  [<ffffffffc0654f9b>] nfs_readdir+0x1ab/0x6b0 [nfs]
[ 1967.790031]  [<ffffffffc06bd1c0>] ? nfs4_xdr_dec_layoutget+0x270/0x270 [nfsv4]
[ 1967.790031]  [<ffffffff81215d30>] ? fillonedir+0xe0/0xe0
[ 1967.790031]  [<ffffffff81215c20>] vfs_readdir+0xb0/0xe0
[ 1967.790031]  [<ffffffff81216045>] SyS_getdents+0x95/0x120
[ 1967.790031]  [<ffffffff816b9449>] system_call_fastpath+0x16/0x1b
[ 1967.790031] Code: 90 31 d2 48 89 d0 5d c3 85 f6 74 f5 8d 4e 01 89 f0 f0 0f b1 0f 39 f0 74 e2 89 c6 eb eb 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 53 <48> 8b 47 30 48 89 fb a8 04 74 3b 8b 57 60 83 fa 02 74 19 8d 4a
[ 1967.790031] RIP  [<ffffffffc06d6aea>] nfs4_put_deviceid_node+0xa/0x90 [nfsv4]
[ 1967.790031]  RSP <ffff8800c4067978>
[ 1967.790031] CR2: 0000000000000030

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Fixes: 1ebf980 ("NFS/filelayout: Fix racy setting of fl->dsaddr...")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
ppp_release() tries to ensure that netdevices are unregistered before
decrementing the unit refcount and running ppp_destroy_interface().

This is all fine as long as the the device is unregistered by
ppp_release(): the unregister_netdevice() call, followed by
rtnl_unlock(), guarantee that the unregistration process completes
before rtnl_unlock() returns.

However, the device may be unregistered by other means (like
ppp_nl_dellink()). If this happens right before ppp_release() calling
rtnl_lock(), then ppp_release() has to wait for the concurrent
unregistration code to release the lock.
But rtnl_unlock() releases the lock before completing the device
unregistration process. This allows ppp_release() to proceed and
eventually call ppp_destroy_interface() before the unregistration
process completes. Calling free_netdev() on this partially unregistered
device will BUG():

 ------------[ cut here ]------------
 kernel BUG at net/core/dev.c:8141!
 invalid opcode: 0000 [#1] SMP

 CPU: 1 PID: 1557 Comm: pppd Not tainted 4.14.0-rc2+ #4
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014

 Call Trace:
  ppp_destroy_interface+0xd8/0xe0 [ppp_generic]
  ppp_disconnect_channel+0xda/0x110 [ppp_generic]
  ppp_unregister_channel+0x5e/0x110 [ppp_generic]
  pppox_unbind_sock+0x23/0x30 [pppox]
  pppoe_connect+0x130/0x440 [pppoe]
  SYSC_connect+0x98/0x110
  ? do_fcntl+0x2c0/0x5d0
  SyS_connect+0xe/0x10
  entry_SYSCALL_64_fastpath+0x1a/0xa5

 RIP: free_netdev+0x107/0x110 RSP: ffffc28a40573d88
 ---[ end trace ed294ff0cc40eeff ]---

We could set the ->needs_free_netdev flag on PPP devices and move the
ppp_destroy_interface() logic in the ->priv_destructor() callback. But
that'd be quite intrusive as we'd first need to unlink from the other
channels and units that depend on the device (the ones that used the
PPPIOCCONNECT and PPPIOCATTACH ioctls).

Instead, we can just let the netdevice hold a reference on its
ppp_file. This reference is dropped in ->priv_destructor(), at the very
end of the unregistration process, so that neither ppp_release() nor
ppp_disconnect_channel() can call ppp_destroy_interface() in the interim.

Reported-by: Beniamino Galvani <bgalvani@redhat.com>
Fixes: 8cb775b ("ppp: fix device unregistration upon netns deletion")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
While line6_probe() may kick off URB for a control MIDI endpoint, the
function doesn't clean up it properly at its error path.  This results
in a leftover URB action that is eventually triggered later and causes
an Oops like:
  general protection fault: 0000 [#1] PREEMPT SMP KASAN
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted
  RIP: 0010:usb_fill_bulk_urb ./include/linux/usb.h:1619
  RIP: 0010:line6_start_listen+0x3fe/0x9e0 sound/usb/line6/driver.c:76
  Call Trace:
   <IRQ>
   line6_data_received+0x1f7/0x470 sound/usb/line6/driver.c:326
   __usb_hcd_giveback_urb+0x2e0/0x650 drivers/usb/core/hcd.c:1779
   usb_hcd_giveback_urb+0x337/0x420 drivers/usb/core/hcd.c:1845
   dummy_timer+0xba9/0x39f0 drivers/usb/gadget/udc/dummy_hcd.c:1965
   call_timer_fn+0x2a2/0x940 kernel/time/timer.c:1281
   ....

Since the whole clean-up procedure is done in line6_disconnect()
callback, we can simply call it in the error path instead of
open-coding the whole again.  It'll fix such an issue automagically.

The bug was spotted by syzkaller.

Fixes: eedd0e9 ("ALSA: line6: Don't forget to call driver's destructor at error path")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
Since commit:

  1fd7e41 ("perf/core: Remove perf_cpu_context::unique_pmu")

... when a PMU is unregistered then its associated ->pmu_cpu_context is
unconditionally freed. Whilst this is fine for dynamically allocated
context types (i.e. those registered using perf_invalid_context), this
causes a problem for sharing of static contexts such as
perf_{sw,hw}_context, which are used by multiple built-in PMUs and
effectively have a global lifetime.

Whilst testing the ARM SPE driver, which must use perf_sw_context to
support per-task AUX tracing, unregistering the driver as a result of a
module unload resulted in:

 Unable to handle kernel NULL pointer dereference at virtual address 00000038
 Internal error: Oops: 96000004 [#1] PREEMPT SMP
 Modules linked in: [last unloaded: arm_spe_pmu]
 PC is at ctx_resched+0x38/0xe8
 LR is at perf_event_exec+0x20c/0x278
 [...]
 ctx_resched+0x38/0xe8
 perf_event_exec+0x20c/0x278
 setup_new_exec+0x88/0x118
 load_elf_binary+0x26c/0x109c
 search_binary_handler+0x90/0x298
 do_execveat_common.isra.14+0x540/0x618
 SyS_execve+0x38/0x48

since the software context has been freed and the ctx.pmu->pmu_disable_count
field has been set to NULL.

This patch fixes the problem by avoiding the freeing of static PMU contexts
altogether. Whilst the sharing of dynamic contexts is questionable, this
actually requires the caller to share their context pointer explicitly
and so the burden is on them to manage the object lifetime.

Reported-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 1fd7e41 ("perf/core: Remove perf_cpu_context::unique_pmu")
Link: http://lkml.kernel.org/r/1507040450-7730-1-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
Tetsuo Handa and Fengguang Wu reported a panic in the unwinder:

  BUG: unable to handle kernel NULL pointer dereference at 000001f2
  IP: update_stack_state+0xd4/0x340
  *pde = 00000000

  Oops: 0000 [#1] PREEMPT SMP
  CPU: 0 PID: 18728 Comm: 01-cpu-hotplug Not tainted 4.13.0-rc4-00170-gb09be67 #592
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
  task: bb0b53c0 task.stack: bb3ac000
  EIP: update_stack_state+0xd4/0x340
  EFLAGS: 00010002 CPU: 0
  EAX: 0000a570 EBX: bb3adccb ECX: 0000f401 EDX: 0000a570
  ESI: 00000001 EDI: 000001ba EBP: bb3adc6b ESP: bb3adc3f
   DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
  CR0: 80050033 CR2: 000001f2 CR3: 0b3a7000 CR4: 00140690
  DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
  DR6: fffe0ff0 DR7: 00000400
  Call Trace:
   ? unwind_next_frame+0xea/0x400
   ? __unwind_start+0xf5/0x180
   ? __save_stack_trace+0x81/0x160
   ? save_stack_trace+0x20/0x30
   ? __lock_acquire+0xfa5/0x12f0
   ? lock_acquire+0x1c2/0x230
   ? tick_periodic+0x3a/0xf0
   ? _raw_spin_lock+0x42/0x50
   ? tick_periodic+0x3a/0xf0
   ? tick_periodic+0x3a/0xf0
   ? debug_smp_processor_id+0x12/0x20
   ? tick_handle_periodic+0x23/0xc0
   ? local_apic_timer_interrupt+0x63/0x70
   ? smp_trace_apic_timer_interrupt+0x235/0x6a0
   ? trace_apic_timer_interrupt+0x37/0x3c
   ? strrchr+0x23/0x50
  Code: 0f 95 c1 89 c7 89 45 e4 0f b6 c1 89 c6 89 45 dc 8b 04 85 98 cb 74 bc 88 4d e3 89 45 f0 83 c0 01 84 c9 89 04 b5 98 cb 74 bc 74 3b <8b> 47 38 8b 57 34 c6 43 1d 01 25 00 00 02 00 83 e2 03 09 d0 83
  EIP: update_stack_state+0xd4/0x340 SS:ESP: 0068:bb3adc3f
  CR2: 00000000000001f2
  ---[ end trace 0d147fd4aba8ff50 ]---
  Kernel panic - not syncing: Fatal exception in interrupt

On x86-32, after decoding a frame pointer to get a regs address,
regs_size() dereferences the regs pointer when it checks regs->cs to see
if the regs are user mode.  This is dangerous because it's possible that
what looks like a decoded frame pointer is actually a corrupt value, and
we don't want the unwinder to make things worse.

Instead of calling regs_size() on an unsafe pointer, just assume they're
kernel regs to start with.  Later, once it's safe to access the regs, we
can do the user mode check and corresponding safety check for the
remaining two regs.

Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: LKP <lkp@01.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 5ed8d8b ("x86/unwind: Move common code into update_stack_state()")
Link: http://lkml.kernel.org/r/7f95b9a6993dec7674b3f3ab3dcd3294f7b9644d.1507597785.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
The dummy-hcd driver calls the gadget driver's disconnect callback
under the wrong conditions.  It should invoke the callback when Vbus
power is turned off, but instead it does so when the D+ pullup is
turned off.

This can cause a deadlock in the composite core when a gadget driver
is unregistered:

[   88.361471] ============================================
[   88.362014] WARNING: possible recursive locking detected
[   88.362580] 4.14.0-rc2+ #9 Not tainted
[   88.363010] --------------------------------------------
[   88.363561] v4l_id/526 is trying to acquire lock:
[   88.364062]  (&(&cdev->lock)->rlock){....}, at: [<ffffffffa0547e03>] composite_disconnect+0x43/0x100 [libcomposite]
[   88.365051]
[   88.365051] but task is already holding lock:
[   88.365826]  (&(&cdev->lock)->rlock){....}, at: [<ffffffffa0547b09>] usb_function_deactivate+0x29/0x80 [libcomposite]
[   88.366858]
[   88.366858] other info that might help us debug this:
[   88.368301]  Possible unsafe locking scenario:
[   88.368301]
[   88.369304]        CPU0
[   88.369701]        ----
[   88.370101]   lock(&(&cdev->lock)->rlock);
[   88.370623]   lock(&(&cdev->lock)->rlock);
[   88.371145]
[   88.371145]  *** DEADLOCK ***
[   88.371145]
[   88.372211]  May be due to missing lock nesting notation
[   88.372211]
[   88.373191] 2 locks held by v4l_id/526:
[   88.373715]  #0:  (&(&cdev->lock)->rlock){....}, at: [<ffffffffa0547b09>] usb_function_deactivate+0x29/0x80 [libcomposite]
[   88.374814]  #1:  (&(&dum_hcd->dum->lock)->rlock){....}, at: [<ffffffffa05bd48d>] dummy_pullup+0x7d/0xf0 [dummy_hcd]
[   88.376289]
[   88.376289] stack backtrace:
[   88.377726] CPU: 0 PID: 526 Comm: v4l_id Not tainted 4.14.0-rc2+ #9
[   88.378557] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   88.379504] Call Trace:
[   88.380019]  dump_stack+0x86/0xc7
[   88.380605]  __lock_acquire+0x841/0x1120
[   88.381252]  lock_acquire+0xd5/0x1c0
[   88.381865]  ? composite_disconnect+0x43/0x100 [libcomposite]
[   88.382668]  _raw_spin_lock_irqsave+0x40/0x54
[   88.383357]  ? composite_disconnect+0x43/0x100 [libcomposite]
[   88.384290]  composite_disconnect+0x43/0x100 [libcomposite]
[   88.385490]  set_link_state+0x2d4/0x3c0 [dummy_hcd]
[   88.386436]  dummy_pullup+0xa7/0xf0 [dummy_hcd]
[   88.387195]  usb_gadget_disconnect+0xd8/0x160 [udc_core]
[   88.387990]  usb_gadget_deactivate+0xd3/0x160 [udc_core]
[   88.388793]  usb_function_deactivate+0x64/0x80 [libcomposite]
[   88.389628]  uvc_function_disconnect+0x1e/0x40 [usb_f_uvc]

This patch changes the code to test the port-power status bit rather
than the port-connect status bit when deciding whether to isue the
callback.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: David Tulloh <david@tulloh.id.au>
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
Stack trace output during a stress test:
 [    4.310049] Freeing initrd memory: 22592K
[    4.310646] rtas_flash: no firmware flash support
[    4.313341] cpuhp/64: page allocation failure: order:0, mode:0x14480c0(GFP_KERNEL|__GFP_ZERO|__GFP_THISNODE), nodemask=(null)
[    4.313465] cpuhp/64 cpuset=/ mems_allowed=0
[    4.313521] CPU: 64 PID: 392 Comm: cpuhp/64 Not tainted 4.11.0-39.el7a.ppc64le #1
[    4.313588] Call Trace:
[    4.313622] [c000000f1fb1b8e0] [c000000000c09388] dump_stack+0xb0/0xf0 (unreliable)
[    4.313694] [c000000f1fb1b920] [c00000000030ef6c] warn_alloc+0x12c/0x1c0
[    4.313753] [c000000f1fb1b9c0] [c00000000030ff68] __alloc_pages_nodemask+0xea8/0x1000
[    4.313823] [c000000f1fb1bbb0] [c000000000113a8c] core_imc_mem_init+0xbc/0x1c0
[    4.313892] [c000000f1fb1bc00] [c000000000113cdc] ppc_core_imc_cpu_online+0x14c/0x170
[    4.313962] [c000000f1fb1bc90] [c000000000125758] cpuhp_invoke_callback+0x198/0x5d0
[    4.314031] [c000000f1fb1bd00] [c00000000012782c] cpuhp_thread_fun+0x8c/0x3d0
[    4.314101] [c000000f1fb1bd60] [c0000000001678d0] smpboot_thread_fn+0x290/0x2a0
[    4.314169] [c000000f1fb1bdc0] [c00000000015ee78] kthread+0x168/0x1b0
[    4.314229] [c000000f1fb1be30] [c00000000000b368] ret_from_kernel_thread+0x5c/0x74
[    4.314313] Mem-Info:
[    4.314356] active_anon:0 inactive_anon:0 isolated_anon:0

core_imc_mem_init() at system boot use alloc_pages_node() to get memory
and alloc_pages_node() throws this stack dump when tried to allocate
memory from a node which has no memory behind it. Add a ___GFP_NOWARN
flag in allocation request as a fix.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Venkat R.B <venkatb3@in.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
…p state

Driver calls request_firmware() whenever the device is opened for the
first time. As the device gets opened and closed, dev->num_inst == 1
is true several times. This is not necessary since the firmware is saved
in the fw_buf. s5p_mfc_load_firmware() copies the buffer returned by
the request_firmware() to dev->fw_buf.

fw_buf sticks around until it gets released from s5p_mfc_remove(), hence
there is no need to keep requesting firmware and copying it to fw_buf.

This might have been overlooked when changes are made to free fw_buf from
the device release interface s5p_mfc_release().

Fix s5p_mfc_load_firmware() to call request_firmware() once and keep state.
Change _probe() to load firmware once fw_buf has been allocated.

s5p_mfc_open() and it continues to call s5p_mfc_load_firmware() and init
hardware which is the step where firmware is written to the device.

This addresses the mfc_mutex contention due to repeated request_firmware()
calls from open() in the following circular locking warning:

[  552.194115] qtdemux0:sink/2710 is trying to acquire lock:
[  552.199488]  (&dev->mfc_mutex){+.+.}, at: [<bf145544>] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[  552.207459]
               but task is already holding lock:
[  552.213264]  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[  552.220284]
               which lock already depends on the new lock.

[  552.228429]
               the existing dependency chain (in reverse order) is:
[  552.235881]
               -> #2 (&mm->mmap_sem){++++}:
[  552.241259]        __might_fault+0x80/0xb0
[  552.245331]        filldir64+0xc0/0x2f8
[  552.249144]        call_filldir+0xb0/0x14c
[  552.253214]        ext4_readdir+0x768/0x90c
[  552.257374]        iterate_dir+0x74/0x168
[  552.261360]        SyS_getdents64+0x7c/0x1a0
[  552.265608]        ret_fast_syscall+0x0/0x28
[  552.269850]
               -> #1 (&type->i_mutex_dir_key#2){++++}:
[  552.276180]        down_read+0x48/0x90
[  552.279904]        lookup_slow+0x74/0x178
[  552.283889]        walk_component+0x1a4/0x2e4
[  552.288222]        link_path_walk+0x174/0x4a0
[  552.292555]        path_openat+0x68/0x944
[  552.296541]        do_filp_open+0x60/0xc4
[  552.300528]        file_open_name+0xe4/0x114
[  552.304772]        filp_open+0x28/0x48
[  552.308499]        kernel_read_file_from_path+0x30/0x78
[  552.313700]        _request_firmware+0x3ec/0x78c
[  552.318291]        request_firmware+0x3c/0x54
[  552.322642]        s5p_mfc_load_firmware+0x54/0x150 [s5p_mfc]
[  552.328358]        s5p_mfc_open+0x4e4/0x550 [s5p_mfc]
[  552.333394]        v4l2_open+0xa0/0x104 [videodev]
[  552.338137]        chrdev_open+0xa4/0x18c
[  552.342121]        do_dentry_open+0x208/0x310
[  552.346454]        path_openat+0x28c/0x944
[  552.350526]        do_filp_open+0x60/0xc4
[  552.354512]        do_sys_open+0x118/0x1c8
[  552.358586]        ret_fast_syscall+0x0/0x28
[  552.362830]
               -> #0 (&dev->mfc_mutex){+.+.}:
               -> #0 (&dev->mfc_mutex){+.+.}:
[  552.368379]        lock_acquire+0x6c/0x88
[  552.372364]        __mutex_lock+0x68/0xa34
[  552.376437]        mutex_lock_interruptible_nested+0x1c/0x24
[  552.382086]        s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[  552.386939]        v4l2_mmap+0x54/0x88 [videodev]
[  552.391601]        mmap_region+0x3a8/0x638
[  552.395673]        do_mmap+0x330/0x3a4
[  552.399400]        vm_mmap_pgoff+0x90/0xb8
[  552.403472]        SyS_mmap_pgoff+0x90/0xc0
[  552.407632]        ret_fast_syscall+0x0/0x28
[  552.411876]
               other info that might help us debug this:

[  552.419848] Chain exists of:
                 &dev->mfc_mutex --> &type->i_mutex_dir_key#2 --> &mm->mmap_sem

[  552.431200]  Possible unsafe locking scenario:

[  552.437092]        CPU0                    CPU1
[  552.441598]        ----                    ----
[  552.446104]   lock(&mm->mmap_sem);
[  552.449484]                                lock(&type->i_mutex_dir_key#2);
[  552.456329]                                lock(&mm->mmap_sem);
[  552.462222]   lock(&dev->mfc_mutex);
[  552.465775]
                *** DEADLOCK ***

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap().
Fix it to not take the lock. The following lockdep warning is fixed
with this change.

[ 1990.972058] ======================================================
[ 1990.978172] WARNING: possible circular locking dependency detected
[ 1990.984327] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted
[ 1990.989783] ------------------------------------------------------
[ 1990.995937] qtdemux0:sink/2765 is trying to acquire lock:
[ 1991.001309]  (&gsc->lock){+.+.}, at: [<bf1729f0>] gsc_m2m_mmap+0x24/0x5c [exynos_gsc]
[ 1991.009108]
               but task is already holding lock:
[ 1991.014913]  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 1991.021932]
               which lock already depends on the new lock.

[ 1991.030078]
               the existing dependency chain (in reverse order) is:
[ 1991.037530]
               -> #1 (&mm->mmap_sem){++++}:
[ 1991.042913]        __might_fault+0x80/0xb0
[ 1991.047096]        video_usercopy+0x1cc/0x510 [videodev]
[ 1991.052297]        v4l2_ioctl+0xa4/0xdc [videodev]
[ 1991.057036]        do_vfs_ioctl+0xa0/0xa18
[ 1991.061102]        SyS_ioctl+0x34/0x5c
[ 1991.064834]        ret_fast_syscall+0x0/0x28
[ 1991.069072]
               -> #0 (&gsc->lock){+.+.}:
[ 1991.074193]        lock_acquire+0x6c/0x88
[ 1991.078179]        __mutex_lock+0x68/0xa34
[ 1991.082247]        mutex_lock_interruptible_nested+0x1c/0x24
[ 1991.087888]        gsc_m2m_mmap+0x24/0x5c [exynos_gsc]
[ 1991.093029]        v4l2_mmap+0x54/0x88 [videodev]
[ 1991.097673]        mmap_region+0x3a8/0x638
[ 1991.101743]        do_mmap+0x330/0x3a4
[ 1991.105470]        vm_mmap_pgoff+0x90/0xb8
[ 1991.109542]        SyS_mmap_pgoff+0x90/0xc0
[ 1991.113702]        ret_fast_syscall+0x0/0x28
[ 1991.117945]
               other info that might help us debug this:

[ 1991.125918]  Possible unsafe locking scenario:

[ 1991.131810]        CPU0                    CPU1
[ 1991.136315]        ----                    ----
[ 1991.140821]   lock(&mm->mmap_sem);
[ 1991.144201]                                lock(&gsc->lock);
[ 1991.149833]                                lock(&mm->mmap_sem);
[ 1991.155725]   lock(&gsc->lock);
[ 1991.158845]
                *** DEADLOCK ***

[ 1991.164740] 1 lock held by qtdemux0:sink/2765:
[ 1991.169157]  #0:  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 1991.176609]
               stack backtrace:
[ 1991.180946] CPU: 2 PID: 2765 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4
[ 1991.189608] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 1991.195686] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 1991.203393] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4)
[ 1991.210586] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410)
[ 1991.218644] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938)
[ 1991.227049] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc)
[ 1991.235281] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88)
[ 1991.242993] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34)
[ 1991.250620] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24)
[ 1991.259812] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf1729f0>] (gsc_m2m_mmap+0x24/0x5c [exynos_gsc])
[ 1991.270159] [<bf1729f0>] (gsc_m2m_mmap [exynos_gsc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev])
[ 1991.279510] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638)
[ 1991.287792] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4)
[ 1991.294986] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8)
[ 1991.302178] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0)
[ 1991.309977] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28)

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Oct 16, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap().
Fix it to not take the lock. The following lockdep warning is fixed
with this change.

[ 2106.181412] ======================================================
[ 2106.187563] WARNING: possible circular locking dependency detected
[ 2106.193718] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted
[ 2106.199175] ------------------------------------------------------
[ 2106.205328] qtdemux0:sink/2614 is trying to acquire lock:
[ 2106.210701]  (&dev->mfc_mutex){+.+.}, at: [<bf175544>] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[ 2106.218672]
[ 2106.218672] but task is already holding lock:
[ 2106.224477]  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 2106.231497]
[ 2106.231497] which lock already depends on the new lock.
[ 2106.231497]
[ 2106.239642]
[ 2106.239642] the existing dependency chain (in reverse order) is:
[ 2106.247095]
[ 2106.247095] -> #1 (&mm->mmap_sem){++++}:
[ 2106.252473]        __might_fault+0x80/0xb0
[ 2106.256567]        video_usercopy+0x1cc/0x510 [videodev]
[ 2106.261845]        v4l2_ioctl+0xa4/0xdc [videodev]
[ 2106.266596]        do_vfs_ioctl+0xa0/0xa18
[ 2106.270667]        SyS_ioctl+0x34/0x5c
[ 2106.274395]        ret_fast_syscall+0x0/0x28
[ 2106.278637]
[ 2106.278637] -> #0 (&dev->mfc_mutex){+.+.}:
[ 2106.284186]        lock_acquire+0x6c/0x88
[ 2106.288173]        __mutex_lock+0x68/0xa34
[ 2106.292244]        mutex_lock_interruptible_nested+0x1c/0x24
[ 2106.297893]        s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[ 2106.302747]        v4l2_mmap+0x54/0x88 [videodev]
[ 2106.307409]        mmap_region+0x3a8/0x638
[ 2106.311480]        do_mmap+0x330/0x3a4
[ 2106.315207]        vm_mmap_pgoff+0x90/0xb8
[ 2106.319279]        SyS_mmap_pgoff+0x90/0xc0
[ 2106.323439]        ret_fast_syscall+0x0/0x28
[ 2106.327683]
[ 2106.327683] other info that might help us debug this:
[ 2106.327683]
[ 2106.335656]  Possible unsafe locking scenario:
[ 2106.335656]
[ 2106.341548]        CPU0                    CPU1
[ 2106.346053]        ----                    ----
[ 2106.350559]   lock(&mm->mmap_sem);
[ 2106.353939]                                lock(&dev->mfc_mutex);
[ 2106.353939]                                lock(&dev->mfc_mutex);
[ 2106.365897]   lock(&dev->mfc_mutex);
[ 2106.369450]
[ 2106.369450]  *** DEADLOCK ***
[ 2106.369450]
[ 2106.375344] 1 lock held by qtdemux0:sink/2614:
[ 2106.379762]  #0:  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 2106.387214]
[ 2106.387214] stack backtrace:
[ 2106.391550] CPU: 7 PID: 2614 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4
[ 2106.400213] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 2106.406285] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 2106.413995] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4)
[ 2106.421187] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410)
[ 2106.429245] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938)
[ 2106.437651] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc)
[ 2106.445883] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88)
[ 2106.453596] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34)
[ 2106.461221] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24)
[ 2106.470425] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf175544>] (s5p_mfc_mmap+0x28/0xd4 [s5p_mfc])
[ 2106.480494] [<bf175544>] (s5p_mfc_mmap [s5p_mfc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev])
[ 2106.489575] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638)
[ 2106.497875] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4)
[ 2106.505068] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8)
[ 2106.512260] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0)
[ 2106.520059] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28)

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Oct 24, 2017
Add missing init_list_head for the registered buffer list.
Absence of the init could lead to a unhandled kernel paging
request as below, when streamon/streamoff are called in row.

[338046.571321] Unable to handle kernel paging request at virtual address fffffffffffffe00
[338046.574849] pgd = ffff800034820000
[338046.582381] [fffffffffffffe00] *pgd=00000000b60f5003[338046.582545]
, *pud=00000000b1f31003
, *pmd=0000000000000000[338046.592082]
[338046.597754] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[338046.601671] Modules linked in: venus_enc venus_dec venus_core
usb_f_ecm g_ether usb_f_rndis u_ether libcomposite ipt_MASQUERADE
nf_nat_masquerade_ipv4 arc4 wcn36xx mac80211 btqcomsmd btqca iptable_nat
nf_co]
[338046.662408] CPU: 0 PID: 5433 Comm: irq/160-venus Tainted: G        W
4.9.39+ hardkernel#232
[338046.668024] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC
(DT)
[338046.675268] task: ffff80003541cb00 task.stack: ffff800026e20000
[338046.682097] PC is at venus_helper_release_buf_ref+0x28/0x88
[venus_core]
[338046.688282] LR is at vdec_event_notify+0xe8/0x150 [venus_dec]
[338046.695029] pc : [<ffff000000af6c48>] lr : [<ffff000000a6fc60>]
pstate: a0000145
[338046.701256] sp : ffff800026e23bc0
[338046.708494] x29: ffff800026e23bc0 x28: 0000000000000000
[338046.718853] x27: ffff000000afd4f8 x26: ffff800031faa700
[338046.729253] x25: ffff000000afd790 x24: ffff800031faa618
[338046.739664] x23: ffff800003e18138 x22: ffff800002fc9810
[338046.750109] x21: ffff800026e23c28 x20: 0000000000000001
[338046.760592] x19: ffff80002a13b800 x18: 0000000000000010
[338046.771099] x17: 0000ffffa3d01600 x16: ffff000008100428
[338046.781654] x15: 0000000000000006 x14: ffff000089045ba7
[338046.792250] x13: ffff000009045bb6 x12: 00000000004f37c8
[338046.802894] x11: 0000000000267211 x10: 0000000000000000
[338046.813574] x9 : 0000000000032000 x8 : 00000000dc400000
[338046.824274] x7 : 0000000000000000 x6 : ffff800031faa728
[338046.835005] x5 : ffff80002a13b850 x4 : 0000000000000000
[338046.845793] x3 : fffffffffffffdf8 x2 : 0000000000000000
[338046.856602] x1 : 0000000000000003 x0 : ffff80002a13b800

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
mihailescu2m pushed a commit that referenced this pull request Oct 24, 2017
Thomas reported that 'perf buildid-list' gets a SEGFAULT due to NULL
pointer deref when he ran it on a data with namespace events.  It was
because the buildid_id__mark_dso_hit_ops lacks the namespace event
handler and perf_too__fill_default() didn't set it.

  Program received signal SIGSEGV, Segmentation fault.
  0x0000000000000000 in ?? ()
  Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.s390x bzip2-libs-1.0.6-21.fc25.s390x elfutils-libelf-0.169-1.fc25.s390x
  +elfutils-libs-0.169-1.fc25.s390x libcap-ng-0.7.8-1.fc25.s390x numactl-libs-2.0.11-2.ibm.fc25.s390x openssl-libs-1.1.0e-1.1.ibm.fc25.s390x perl-libs-5.24.1-386.fc25.s390x
  +python-libs-2.7.13-2.fc25.s390x slang-2.3.0-7.fc25.s390x xz-libs-5.2.3-2.fc25.s390x zlib-1.2.8-10.fc25.s390x
  (gdb) where
  #0  0x0000000000000000 in ?? ()
  #1  0x00000000010fad6a in machines__deliver_event (machines=<optimized out>, machines@entry=0x2c6fd18,
      evlist=<optimized out>, event=event@entry=0x3fffdf00470, sample=0x3ffffffe880, sample@entry=0x3ffffffe888,
      tool=tool@entry=0x1312968 <build_id.mark_dso_hit_ops>, file_offset=1136) at util/session.c:1287
  #2  0x00000000010fbf4e in perf_session__deliver_event (file_offset=1136, tool=0x1312968 <build_id.mark_dso_hit_ops>,
      sample=0x3ffffffe888, event=0x3fffdf00470, session=0x2c6fc30) at util/session.c:1340
  #3  perf_session__process_event (session=0x2c6fc30, session@entry=0x0, event=event@entry=0x3fffdf00470,
      file_offset=file_offset@entry=1136) at util/session.c:1522
  #4  0x00000000010fddde in __perf_session__process_events (file_size=11880, data_size=<optimized out>,
      data_offset=<optimized out>, session=0x0) at util/session.c:1899
  #5  perf_session__process_events (session=0x0, session@entry=0x2c6fc30) at util/session.c:1953
  #6  0x000000000103b2ac in perf_session__list_build_ids (with_hits=<optimized out>, force=<optimized out>)
      at builtin-buildid-list.c:83
  #7  cmd_buildid_list (argc=<optimized out>, argv=<optimized out>) at builtin-buildid-list.c:115
  #8  0x00000000010a026c in run_builtin (p=0x1311f78 <commands+24>, argc=argc@entry=2, argv=argv@entry=0x3fffffff3c0)
      at perf.c:296
  #9  0x000000000102bc00 in handle_internal_command (argv=<optimized out>, argc=2) at perf.c:348
  hardkernel#10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:392
  hardkernel#11 main (argc=<optimized out>, argv=0x3fffffff3c0) at perf.c:536
  (gdb)

Fix it by adding a stub event handler for namespace event.

Committer testing:

Further clarifying, plain using 'perf buildid-list' will not end up in a
SEGFAULT when processing a perf.data file with namespace info:

  # perf record -a --namespaces sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.024 MB perf.data (1058 samples) ]
  # perf buildid-list | wc -l
  38
  # perf buildid-list | head -5
  e2a171c7b905826fc8494f0711ba76ab6abbd604 /lib/modules/4.14.0-rc3+/build/vmlinux
  874840a02d8f8a31cedd605d0b8653145472ced3 /lib/modules/4.14.0-rc3+/kernel/arch/x86/kvm/kvm-intel.ko
  ea7223776730cd8a22f320040aae4d54312984bc /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/i915/i915.ko
  5961535e6732a8edb7f22b3f148bb2fa2e0be4b9 /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/drm.ko
  f045f54aa78cf1931cc893f78b6cbc52c72a8cb1 /usr/lib64/libc-2.25.so
  #

It is only when one asks for checking what of those entries actually had
samples, i.e. when we use either -H or --with-hits, that we will process
all the PERF_RECORD_ events, and since tools/perf/builtin-buildid-list.c
neither explicitely set a perf_tool.namespaces() callback nor the
default stub was set that we end up, when processing a
PERF_RECORD_NAMESPACE record, causing a SEGFAULT:

  # perf buildid-list -H
  Segmentation fault (core dumped)
  ^C
  #

Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Fixes: f3b3614 ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
Link: http://lkml.kernel.org/r/20171017132900.11043-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
mihailescu2m pushed a commit that referenced this pull request Oct 24, 2017
When an EMAD is transmitted, a timeout work item is scheduled with a
delay of 200ms, so that another EMAD will be retried until a maximum of
five retries.

In certain situations, it's possible for the function waiting on the
EMAD to be associated with a work item that is queued on the same
workqueue (`mlxsw_core`) as the timeout work item. This results in
flushing a work item on the same workqueue.

According to commit e159489 ("workqueue: relax lockdep annotation
on flush_work()") the above may lead to a deadlock in case the workqueue
has only one worker active or if the system in under memory pressure and
the rescue worker is in use. The latter explains the very rare and
random nature of the lockdep splats we have been seeing:

[   52.730240] ============================================
[   52.736179] WARNING: possible recursive locking detected
[   52.742119] 4.14.0-rc3jiri+ #4 Not tainted
[   52.746697] --------------------------------------------
[   52.752635] kworker/1:3/599 is trying to acquire lock:
[   52.758378]  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c4fa4>] flush_work+0x3a4/0x5e0
[   52.767837]
               but task is already holding lock:
[   52.774360]  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.784495]
               other info that might help us debug this:
[   52.791794]  Possible unsafe locking scenario:
[   52.798413]        CPU0
[   52.801144]        ----
[   52.803875]   lock(mlxsw_core_driver_name);
[   52.808556]   lock(mlxsw_core_driver_name);
[   52.813236]
                *** DEADLOCK ***
[   52.819857]  May be due to missing lock nesting notation
[   52.827450] 3 locks held by kworker/1:3/599:
[   52.832221]  #0:  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.842846]  #1:  ((&(&bridge->fdb_notify.dw)->work)){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.854537]  #2:  (rtnl_mutex){+.+.}, at: [<ffffffff822ad8e7>] rtnl_lock+0x17/0x20
[   52.863021]
               stack backtrace:
[   52.867890] CPU: 1 PID: 599 Comm: kworker/1:3 Not tainted 4.14.0-rc3jiri+ #4
[   52.875773] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
[   52.886267] Workqueue: mlxsw_core mlxsw_sp_fdb_notify_work [mlxsw_spectrum]
[   52.894060] Call Trace:
[   52.909122]  __lock_acquire+0xf6f/0x2a10
[   53.025412]  lock_acquire+0x158/0x440
[   53.047557]  flush_work+0x3c4/0x5e0
[   53.087571]  __cancel_work_timer+0x3ca/0x5e0
[   53.177051]  cancel_delayed_work_sync+0x13/0x20
[   53.182142]  mlxsw_reg_trans_bulk_wait+0x12d/0x7a0 [mlxsw_core]
[   53.194571]  mlxsw_core_reg_access+0x586/0x990 [mlxsw_core]
[   53.225365]  mlxsw_reg_query+0x10/0x20 [mlxsw_core]
[   53.230882]  mlxsw_sp_fdb_notify_work+0x2a3/0x9d0 [mlxsw_spectrum]
[   53.237801]  process_one_work+0x8f1/0x12f0
[   53.321804]  worker_thread+0x1fd/0x10c0
[   53.435158]  kthread+0x28e/0x370
[   53.448703]  ret_from_fork+0x2a/0x40
[   53.453017] mlxsw_spectrum 0000:01:00.0: EMAD retries (2/5) (tid=bf4549b100000774)
[   53.453119] mlxsw_spectrum 0000:01:00.0: EMAD retries (5/5) (tid=bf4549b100000770)
[   53.453132] mlxsw_spectrum 0000:01:00.0: EMAD reg access failed (tid=bf4549b100000770,reg_id=200b(sfn),type=query,status=0(operation performed))
[   53.453143] mlxsw_spectrum 0000:01:00.0: Failed to get FDB notifications

Fix this by creating another workqueue for EMAD timeouts, thereby
preventing the situation of a work item trying to flush a work item
queued on the same workqueue.

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Oct 24, 2017
When syzkaller team brought us a C repro for the crash [1] that
had been reported many times in the past, I finally could find
the root cause.

If FlowLabel info is merged by fl6_merge_options(), we leave
part of the opt_space storage provided by udp/raw/l2tp with random value
in opt_space.tot_len, unless a control message was provided at sendmsg()
time.

Then ip6_setup_cork() would use this random value to perform a kzalloc()
call. Undefined behavior and crashes.

Fix is to properly set tot_len in fl6_merge_options()

At the same time, we can also avoid consuming memory and cpu cycles
to clear it, if every option is copied via a kmemdup(). This is the
change in ip6_setup_cork().

[1]
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 6613 Comm: syz-executor0 Not tainted 4.14.0-rc4+ hardkernel#127
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801cb64a100 task.stack: ffff8801cc350000
RIP: 0010:ip6_setup_cork+0x274/0x15c0 net/ipv6/ip6_output.c:1168
RSP: 0018:ffff8801cc357550 EFLAGS: 00010203
RAX: dffffc0000000000 RBX: ffff8801cc357748 RCX: 0000000000000010
RDX: 0000000000000002 RSI: ffffffff842bd1d9 RDI: 0000000000000014
RBP: ffff8801cc357620 R08: ffff8801cb17f380 R09: ffff8801cc357b10
R10: ffff8801cb64a100 R11: 0000000000000000 R12: ffff8801cc357ab0
R13: ffff8801cc357b10 R14: 0000000000000000 R15: ffff8801c3bbf0c0
FS:  00007f9c5c459700(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020324000 CR3: 00000001d1cf2000 CR4: 00000000001406f0
DR0: 0000000020001010 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Call Trace:
 ip6_make_skb+0x282/0x530 net/ipv6/ip6_output.c:1729
 udpv6_sendmsg+0x2769/0x3380 net/ipv6/udp.c:1340
 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 SYSC_sendto+0x358/0x5a0 net/socket.c:1750
 SyS_sendto+0x40/0x50 net/socket.c:1718
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4520a9
RSP: 002b:00007f9c5c458c08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 00000000004520a9
RDX: 0000000000000001 RSI: 0000000020fd1000 RDI: 0000000000000016
RBP: 0000000000000086 R08: 0000000020e0afe4 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004bb1ee
R13: 00000000ffffffff R14: 0000000000000016 R15: 0000000000000029
Code: e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 ea 0f 00 00 48 8d 79 04 48 b8 00 00 00 00 00 fc ff df 45 8b 74 24 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85
RIP: ip6_setup_cork+0x274/0x15c0 net/ipv6/ip6_output.c:1168 RSP: ffff8801cc357550

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Oct 24, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap().
Fix it to not take the lock. The following lockdep warning is fixed
with this change.

[ 2106.181412] ======================================================
[ 2106.187563] WARNING: possible circular locking dependency detected
[ 2106.193718] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted
[ 2106.199175] ------------------------------------------------------
[ 2106.205328] qtdemux0:sink/2614 is trying to acquire lock:
[ 2106.210701]  (&dev->mfc_mutex){+.+.}, at: [<bf175544>] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[ 2106.218672]
[ 2106.218672] but task is already holding lock:
[ 2106.224477]  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 2106.231497]
[ 2106.231497] which lock already depends on the new lock.
[ 2106.231497]
[ 2106.239642]
[ 2106.239642] the existing dependency chain (in reverse order) is:
[ 2106.247095]
[ 2106.247095] -> #1 (&mm->mmap_sem){++++}:
[ 2106.252473]        __might_fault+0x80/0xb0
[ 2106.256567]        video_usercopy+0x1cc/0x510 [videodev]
[ 2106.261845]        v4l2_ioctl+0xa4/0xdc [videodev]
[ 2106.266596]        do_vfs_ioctl+0xa0/0xa18
[ 2106.270667]        SyS_ioctl+0x34/0x5c
[ 2106.274395]        ret_fast_syscall+0x0/0x28
[ 2106.278637]
[ 2106.278637] -> #0 (&dev->mfc_mutex){+.+.}:
[ 2106.284186]        lock_acquire+0x6c/0x88
[ 2106.288173]        __mutex_lock+0x68/0xa34
[ 2106.292244]        mutex_lock_interruptible_nested+0x1c/0x24
[ 2106.297893]        s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[ 2106.302747]        v4l2_mmap+0x54/0x88 [videodev]
[ 2106.307409]        mmap_region+0x3a8/0x638
[ 2106.311480]        do_mmap+0x330/0x3a4
[ 2106.315207]        vm_mmap_pgoff+0x90/0xb8
[ 2106.319279]        SyS_mmap_pgoff+0x90/0xc0
[ 2106.323439]        ret_fast_syscall+0x0/0x28
[ 2106.327683]
[ 2106.327683] other info that might help us debug this:
[ 2106.327683]
[ 2106.335656]  Possible unsafe locking scenario:
[ 2106.335656]
[ 2106.341548]        CPU0                    CPU1
[ 2106.346053]        ----                    ----
[ 2106.350559]   lock(&mm->mmap_sem);
[ 2106.353939]                                lock(&dev->mfc_mutex);
[ 2106.353939]                                lock(&dev->mfc_mutex);
[ 2106.365897]   lock(&dev->mfc_mutex);
[ 2106.369450]
[ 2106.369450]  *** DEADLOCK ***
[ 2106.369450]
[ 2106.375344] 1 lock held by qtdemux0:sink/2614:
[ 2106.379762]  #0:  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 2106.387214]
[ 2106.387214] stack backtrace:
[ 2106.391550] CPU: 7 PID: 2614 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4
[ 2106.400213] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 2106.406285] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 2106.413995] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4)
[ 2106.421187] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410)
[ 2106.429245] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938)
[ 2106.437651] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc)
[ 2106.445883] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88)
[ 2106.453596] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34)
[ 2106.461221] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24)
[ 2106.470425] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf175544>] (s5p_mfc_mmap+0x28/0xd4 [s5p_mfc])
[ 2106.480494] [<bf175544>] (s5p_mfc_mmap [s5p_mfc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev])
[ 2106.489575] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638)
[ 2106.497875] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4)
[ 2106.505068] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8)
[ 2106.512260] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0)
[ 2106.520059] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28)

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Oct 24, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap().
Fix it to not take the lock. The following lockdep warning is fixed
with this change.

[ 1990.972058] ======================================================
[ 1990.978172] WARNING: possible circular locking dependency detected
[ 1990.984327] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted
[ 1990.989783] ------------------------------------------------------
[ 1990.995937] qtdemux0:sink/2765 is trying to acquire lock:
[ 1991.001309]  (&gsc->lock){+.+.}, at: [<bf1729f0>] gsc_m2m_mmap+0x24/0x5c [exynos_gsc]
[ 1991.009108]
               but task is already holding lock:
[ 1991.014913]  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 1991.021932]
               which lock already depends on the new lock.

[ 1991.030078]
               the existing dependency chain (in reverse order) is:
[ 1991.037530]
               -> #1 (&mm->mmap_sem){++++}:
[ 1991.042913]        __might_fault+0x80/0xb0
[ 1991.047096]        video_usercopy+0x1cc/0x510 [videodev]
[ 1991.052297]        v4l2_ioctl+0xa4/0xdc [videodev]
[ 1991.057036]        do_vfs_ioctl+0xa0/0xa18
[ 1991.061102]        SyS_ioctl+0x34/0x5c
[ 1991.064834]        ret_fast_syscall+0x0/0x28
[ 1991.069072]
               -> #0 (&gsc->lock){+.+.}:
[ 1991.074193]        lock_acquire+0x6c/0x88
[ 1991.078179]        __mutex_lock+0x68/0xa34
[ 1991.082247]        mutex_lock_interruptible_nested+0x1c/0x24
[ 1991.087888]        gsc_m2m_mmap+0x24/0x5c [exynos_gsc]
[ 1991.093029]        v4l2_mmap+0x54/0x88 [videodev]
[ 1991.097673]        mmap_region+0x3a8/0x638
[ 1991.101743]        do_mmap+0x330/0x3a4
[ 1991.105470]        vm_mmap_pgoff+0x90/0xb8
[ 1991.109542]        SyS_mmap_pgoff+0x90/0xc0
[ 1991.113702]        ret_fast_syscall+0x0/0x28
[ 1991.117945]
               other info that might help us debug this:

[ 1991.125918]  Possible unsafe locking scenario:

[ 1991.131810]        CPU0                    CPU1
[ 1991.136315]        ----                    ----
[ 1991.140821]   lock(&mm->mmap_sem);
[ 1991.144201]                                lock(&gsc->lock);
[ 1991.149833]                                lock(&mm->mmap_sem);
[ 1991.155725]   lock(&gsc->lock);
[ 1991.158845]
                *** DEADLOCK ***

[ 1991.164740] 1 lock held by qtdemux0:sink/2765:
[ 1991.169157]  #0:  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 1991.176609]
               stack backtrace:
[ 1991.180946] CPU: 2 PID: 2765 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4
[ 1991.189608] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 1991.195686] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 1991.203393] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4)
[ 1991.210586] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410)
[ 1991.218644] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938)
[ 1991.227049] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc)
[ 1991.235281] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88)
[ 1991.242993] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34)
[ 1991.250620] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24)
[ 1991.259812] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf1729f0>] (gsc_m2m_mmap+0x24/0x5c [exynos_gsc])
[ 1991.270159] [<bf1729f0>] (gsc_m2m_mmap [exynos_gsc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev])
[ 1991.279510] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638)
[ 1991.287792] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4)
[ 1991.294986] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8)
[ 1991.302178] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0)
[ 1991.309977] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28)

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
During modeset cleanup on driver unload we may have a pending
hotplug work. This needs to be canceled early during the teardown
so that it does not fire after we have freed the connector.
We do this after drm_kms_helper_poll_fini(dev) since this might
trigger modeset retry work due to link retrain and before
intel_fbdev_fini() since this work requires the lock from fbdev.

If this is not done we may see something like:
DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock))
 ------------[ cut here ]------------
 WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60
 Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178
+a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid
+[last unloaded: snd_hda_intel]
 CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G     U          4.14.0-rc3-CI-CI_DRM_3186+ #1
 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
 task: ffff8803c827aa40 task.stack: ffffc90000520000
 RIP: 0010:mutex_destroy+0x4e/0x60
 RSP: 0018:ffffc90000523d58 EFLAGS: 0001029
 RAX: 000000000000002a RBX: ffff88044fbef648 RCX: 0000000000000000
 RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0cf0
 RBP: ffffc90000523d60 R08: 0000000000000001 R09: 0000000000000001
 R10: 000000000f21cb81 R11: 0000000000000000 R12: ffff88044f71efc8
 R13: ffffffffa02b3d20 R14: ffffffffa02b3d90 R15: ffff880459b29308
 FS:  00007f5df4d6e8c0(0000) GS:ffff88045d300000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055ec51f00a18 CR3: 0000000451782006 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  drm_fb_helper_fini+0xd9/0x130
  intel_fbdev_destroy+0x12/0x60 [i915]
  intel_fbdev_fini+0x28/0x30 [i915]
  intel_modeset_cleanup+0x45/0xa0 [i915]
  i915_driver_unload+0x92/0x180 [i915]
  i915_pci_remove+0x19/0x30 [i915]
  i915_driver_unload+0x92/0x180 [i915]
  i915_pci_remove+0x19/0x30 [i915]
  pci_device_remove+0x39/0xb0
  device_release_driver_internal+0x15d/0x220
  driver_detach+0x40/0x80
  bus_remove_driver+0x58/0xd0
  driver_unregister+0x2c/0x40
  pci_unregister_driver+0x36/0xb0
  i915_exit+0x1a/0x8b [i915]
  SyS_delete_module+0x18c/0x1e0
  entry_SYSCALL_64_fastpath+0x1c/0xb1
 RIP: 0033:0x7f5df3286287
 RSP: 002b:00007fff8e107cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: ffffffff81493a03 RCX: 00007f5df3286287
 RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564c7be02e48
 RBP: ffffc90000523f88 R08: 0000000000000000 R09: 0000000000000080
 R10: 00007f5df4d6e8c0 R11: 0000000000000246 R12: 0000000000000000
 R13: 00007fff8e107eb0 R14: 0000000000000000 R15: 0000000000000000
Or a GPF like:

 general protection fault: 0000 [#1] PREEMPT SMP
 Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178
+a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid
+[last unloaded: snd_hda_intel]
 CPU: 0 PID: 82 Comm: kworker/0:1 Tainted: G     U  W       4.14.0-rc3-CI-CI_DRM_3186+ #1
 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
 Workqueue: events intel_dp_modeset_retry_work_fn [i915]
 task: ffff88045a5caa40 task.stack: ffffc90000378000
 RIP: 0010:drm_setup_crtcs+0x143/0xbf0
 RSP: 0018:ffffc9000037bd20 EFLAGS: 00010202
 RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000001
 RDX: 0000000000000001 RSI: 0000000000000780 RDI: 00000000ffffffff
 RBP: ffffc9000037bdb8 R08: 0000000000000001 R09: 0000000000000001
 R10: 0000000000000780 R11: 0000000000000000 R12: 0000000000000002
 R13: ffff88044fbef4e8 R14: 0000000000000780 R15: 0000000000000438
 FS:  0000000000000000(0000) GS:ffff88045d200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055ec51ee5168 CR3: 000000044c89d003 CR4: 00000000003606f0
 Call Trace:
  drm_fb_helper_hotplug_event.part.18+0x7e/0xc0
  drm_fb_helper_hotplug_event+0x1a/0x20
  intel_fbdev_output_poll_changed+0x1a/0x20 [i915]
  drm_kms_helper_hotplug_event+0x27/0x30
  intel_dp_modeset_retry_work_fn+0x77/0x80 [i915]
  process_one_work+0x233/0x660
  worker_thread+0x206/0x3b0
  kthread+0x152/0x190
  ? process_one_work+0x660/0x660
  ? kthread_create_on_node+0x40/0x40
  ret_from_fork+0x27/0x40
 Code: 06 00 00 45 8b 45 20 31 db 45 31 e4 45 85 c0 0f 8e 91 06 00 00 44 8b 75 94 44 8b 7d 90 49 8b 45 28 49 63 d4 44 89 f6 41 83 c4 01 <48> 8b 04 d0 44
+89 fa 48 8b 38 48 8b 87 a8 01 00 00 ff 50 20 01
 RIP: drm_setup_crtcs+0x143/0xbf0 RSP: ffffc9000037bd20
 ---[ end trace 08901ff1a77d30c7 ]---

v2:
* Rename it to intel_hpd_poll_fini() and call drm_kms_helper_fini() inside it
as the first step before cancel work (Chris Wilson)
* Add GPF trace in commit message and make the function static (Maarten Lankhorst)

Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 9301397 ("drm/i915: Implement Link Rate fallback on Link training failure")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tony Cheng <tony.cheng@amd.com>
Cc: Harry Wentland <Harry.wentland@amd.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1509054720-25325-1-git-send-email-manasi.d.navare@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 886c6b8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
Kasan spotted

    [IGT] gem_tiled_pread_pwrite: exiting, ret=0
    ==================================================================
    BUG: KASAN: use-after-free in __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
    Read of size 8 at addr ffff8801359da310 by task kworker/3:2/182

    CPU: 3 PID: 182 Comm: kworker/3:2 Tainted: G     U          4.14.0-rc6-CI-Custom_3340+ #1
    Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
    Workqueue: events __i915_gem_free_work [i915]
    Call Trace:
     dump_stack+0x68/0xa0
     print_address_description+0x78/0x290
     ? __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
     kasan_report+0x23d/0x350
     __asan_report_load8_noabort+0x19/0x20
     __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
     ? i915_gem_object_truncate+0x100/0x100 [i915]
     ? lock_acquire+0x380/0x380
     __i915_gem_object_put_pages+0x30d/0x530 [i915]
     __i915_gem_free_objects+0x551/0xbd0 [i915]
     ? lock_acquire+0x13e/0x380
     __i915_gem_free_work+0x4e/0x70 [i915]
     process_one_work+0x6f6/0x1590
     ? pwq_dec_nr_in_flight+0x2b0/0x2b0
     worker_thread+0xe6/0xe90
     ? pci_mmcfg_check_reserved+0x110/0x110
     kthread+0x309/0x410
     ? process_one_work+0x1590/0x1590
     ? kthread_create_on_node+0xb0/0xb0
     ret_from_fork+0x27/0x40

    Allocated by task 1801:
     save_stack_trace+0x1b/0x20
     kasan_kmalloc+0xee/0x190
     kasan_slab_alloc+0x12/0x20
     kmem_cache_alloc+0xdc/0x2e0
     radix_tree_node_alloc.constprop.12+0x48/0x330
     __radix_tree_create+0x274/0x480
     __radix_tree_insert+0xa2/0x610
     i915_gem_object_get_sg+0x224/0x670 [i915]
     i915_gem_object_get_page+0xb5/0x1c0 [i915]
     i915_gem_pread_ioctl+0x822/0xf60 [i915]
     drm_ioctl_kernel+0x13f/0x1c0
     drm_ioctl+0x6cf/0x980
     do_vfs_ioctl+0x184/0xf30
     SyS_ioctl+0x41/0x70
     entry_SYSCALL_64_fastpath+0x1c/0xb1

    Freed by task 37:
     save_stack_trace+0x1b/0x20
     kasan_slab_free+0xaf/0x190
     kmem_cache_free+0xbf/0x340
     radix_tree_node_rcu_free+0x79/0x90
     rcu_process_callbacks+0x46d/0xf40
     __do_softirq+0x21c/0x8d3

    The buggy address belongs to the object at ffff8801359da0f0
    which belongs to the cache radix_tree_node of size 576
    The buggy address is located 544 bytes inside of
    576-byte region [ffff8801359da0f0, ffff8801359da330)
    The buggy address belongs to the page:
    page:ffffea0004d67600 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
    flags: 0x8000000000008100(slab|head)
    raw: 8000000000008100 0000000000000000 0000000000000000 0000000100110011
    raw: ffffea0004b52920 ffffea0004b38020 ffff88015b416a80 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
     ffff8801359da200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff8801359da280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff8801359da300: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
			     ^
     ffff8801359da380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff8801359da400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ==================================================================
    Disabling lock debugging due to kernel taint

which looks like the slab containing the radixtree iter was freed as we
traversed the tree, taking the rcu read lock across the loop should
prevent that (deferring all the frees until the end).

Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Fixes: 96d7763 ("drm/i915: Use a radixtree for random access to the object's backing storage")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026130032.10677-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
(cherry picked from commit bea6e98)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
Kasan spotted

    [IGT] gem_tiled_pread_pwrite: exiting, ret=0
    ==================================================================
    BUG: KASAN: use-after-free in __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
    Read of size 8 at addr ffff8801359da310 by task kworker/3:2/182

    CPU: 3 PID: 182 Comm: kworker/3:2 Tainted: G     U          4.14.0-rc6-CI-Custom_3340+ #1
    Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
    Workqueue: events __i915_gem_free_work [i915]
    Call Trace:
     dump_stack+0x68/0xa0
     print_address_description+0x78/0x290
     ? __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
     kasan_report+0x23d/0x350
     __asan_report_load8_noabort+0x19/0x20
     __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
     ? i915_gem_object_truncate+0x100/0x100 [i915]
     ? lock_acquire+0x380/0x380
     __i915_gem_object_put_pages+0x30d/0x530 [i915]
     __i915_gem_free_objects+0x551/0xbd0 [i915]
     ? lock_acquire+0x13e/0x380
     __i915_gem_free_work+0x4e/0x70 [i915]
     process_one_work+0x6f6/0x1590
     ? pwq_dec_nr_in_flight+0x2b0/0x2b0
     worker_thread+0xe6/0xe90
     ? pci_mmcfg_check_reserved+0x110/0x110
     kthread+0x309/0x410
     ? process_one_work+0x1590/0x1590
     ? kthread_create_on_node+0xb0/0xb0
     ret_from_fork+0x27/0x40

    Allocated by task 1801:
     save_stack_trace+0x1b/0x20
     kasan_kmalloc+0xee/0x190
     kasan_slab_alloc+0x12/0x20
     kmem_cache_alloc+0xdc/0x2e0
     radix_tree_node_alloc.constprop.12+0x48/0x330
     __radix_tree_create+0x274/0x480
     __radix_tree_insert+0xa2/0x610
     i915_gem_object_get_sg+0x224/0x670 [i915]
     i915_gem_object_get_page+0xb5/0x1c0 [i915]
     i915_gem_pread_ioctl+0x822/0xf60 [i915]
     drm_ioctl_kernel+0x13f/0x1c0
     drm_ioctl+0x6cf/0x980
     do_vfs_ioctl+0x184/0xf30
     SyS_ioctl+0x41/0x70
     entry_SYSCALL_64_fastpath+0x1c/0xb1

    Freed by task 37:
     save_stack_trace+0x1b/0x20
     kasan_slab_free+0xaf/0x190
     kmem_cache_free+0xbf/0x340
     radix_tree_node_rcu_free+0x79/0x90
     rcu_process_callbacks+0x46d/0xf40
     __do_softirq+0x21c/0x8d3

    The buggy address belongs to the object at ffff8801359da0f0
    which belongs to the cache radix_tree_node of size 576
    The buggy address is located 544 bytes inside of
    576-byte region [ffff8801359da0f0, ffff8801359da330)
    The buggy address belongs to the page:
    page:ffffea0004d67600 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
    flags: 0x8000000000008100(slab|head)
    raw: 8000000000008100 0000000000000000 0000000000000000 0000000100110011
    raw: ffffea0004b52920 ffffea0004b38020 ffff88015b416a80 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
     ffff8801359da200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff8801359da280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff8801359da300: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
			     ^
     ffff8801359da380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff8801359da400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ==================================================================
    Disabling lock debugging due to kernel taint

which looks like the slab containing the radixtree iter was freed as we
traversed the tree, taking the rcu read lock across the loop should
prevent that (deferring all the frees until the end).

Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Fixes: d1b48c1 ("drm/i915: Replace execbuf vma ht with an idr")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026130032.10677-2-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
(cherry picked from commit 547da76)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
…-text symbols"

This reverts commit 83e840c ("powerpc64/elfv1: Only dereference
function descriptor for non-text symbols").

Chandan reported that on newer kernels, trying to enable function_graph
tracer on ppc64 (BE) locks up the system with the following trace:

  Unable to handle kernel paging request for data at address 0x600000002fa30010
  Faulting instruction address: 0xc0000000001f1300
  Thread overran stack, or stack corrupted
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
  Modules linked in:
  CPU: 1 PID: 6586 Comm: bash Not tainted 4.14.0-rc3-00162-g6e51f1f-dirty hardkernel#20
  task: c000000625c07200 task.stack: c000000625c07310
  NIP:  c0000000001f1300 LR: c000000000121cac CTR: c000000000061af8
  REGS: c000000625c088c0 TRAP: 0380   Not tainted  (4.14.0-rc3-00162-g6e51f1f-dirty)
  MSR:  8000000000001032 <SF,ME,IR,DR,RI>  CR: 28002848  XER: 00000000
  CFAR: c0000000001f1320 SOFTE: 0
  ...
  NIP [c0000000001f1300] .__is_insn_slot_addr+0x30/0x90
  LR [c000000000121cac] .kernel_text_address+0x18c/0x1c0
  Call Trace:
  [c000000625c08b40] [c0000000001bd040] .is_module_text_address+0x20/0x40 (unreliable)
  [c000000625c08bc0] [c000000000121cac] .kernel_text_address+0x18c/0x1c0
  [c000000625c08c50] [c000000000061960] .prepare_ftrace_return+0x50/0x130
  [c000000625c08cf0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
  [c000000625c08d60] [c000000000121b40] .kernel_text_address+0x20/0x1c0
  [c000000625c08df0] [c000000000061960] .prepare_ftrace_return+0x50/0x130
  ...
  [c000000625c0ab30] [c000000000061960] .prepare_ftrace_return+0x50/0x130
  [c000000625c0abd0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
  [c000000625c0ac40] [c000000000121b40] .kernel_text_address+0x20/0x1c0
  [c000000625c0acd0] [c000000000061960] .prepare_ftrace_return+0x50/0x130
  [c000000625c0ad70] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
  [c000000625c0ade0] [c000000000121b40] .kernel_text_address+0x20/0x1c0

This is because ftrace is using ppc_function_entry() for obtaining the
address of return_to_handler() in prepare_ftrace_return(). The call to
kernel_text_address() itself gets traced and we end up in a recursive
loop.

Fixes: 83e840c ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols")
Cc: stable@vger.kernel.org # v4.13+
Reported-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
…lization

Call trace observed during boot:

  nest_capp0_imc performance monitor hardware support registered
  nest_capp1_imc performance monitor hardware support registered
  core_imc memory allocation for cpu 56 failed
  Unable to handle kernel paging request for data at address 0xffa400010
  Faulting instruction address: 0xc000000000bf3294
  0:mon> e
  cpu 0x0: Vector: 300 (Data Access) at [c000000ff38ff8d0]
      pc: c000000000bf3294: mutex_lock+0x34/0x90
      lr: c000000000bf3288: mutex_lock+0x28/0x90
      sp: c000000ff38ffb50
     msr: 9000000002009033
     dar: ffa400010
   dsisr: 80000
    current = 0xc000000ff383de00
    paca    = 0xc000000007ae0000	 softe: 0	 irq_happened: 0x01
      pid   = 13, comm = cpuhp/0
  Linux version 4.11.0-39.el7a.ppc64le (mockbuild@ppc-058.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Tue Oct 3 07:42:44 EDT 2017
  0:mon> t
  [c000000ff38ffb80] c0000000002ddfac perf_pmu_migrate_context+0xac/0x470
  [c000000ff38ffc40] c00000000011385c ppc_core_imc_cpu_offline+0x1ac/0x1e0
  [c000000ff38ffc90] c000000000125758 cpuhp_invoke_callback+0x198/0x5d0
  [c000000ff38ffd00] c00000000012782c cpuhp_thread_fun+0x8c/0x3d0
  [c000000ff38ffd60] c0000000001678d0 smpboot_thread_fn+0x290/0x2a0
  [c000000ff38ffdc0] c00000000015ee78 kthread+0x168/0x1b0
  [c000000ff38ffe30] c00000000000b368 ret_from_kernel_thread+0x5c/0x74

While registering the cpuhoplug callbacks for core-imc, if we fails
in the cpuhotplug online path for any random core (either because opal call to
initialize the core-imc counters fails or because memory allocation fails for
that core), ppc_core_imc_cpu_offline() will get invoked for other cpus who
successfully returned from cpuhotplug online path.

But in the ppc_core_imc_cpu_offline() path we are trying to migrate the event
context, when core-imc counters are not even initialized. Thus creating the
above stack dump.

Add a check to see if core-imc counters are enabled or not in the cpuhotplug
offline path before migrating the context to handle this failing scenario.

Fixes: 885dcd7 ("powerpc/perf: Add nest IMC PMU support")
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.

Found by syzkaller.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c hardkernel#400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
 usb_suspend_interface drivers/usb/core/driver.c:1209
 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
…ully

The 0day bot reports the below failure which happens occasionally, with
their randconfig testing (once every ~100 boots).  The Code points at
the private pointer ->driver_data being NULL, which hints at a race of
sorts where the private driver_data descriptor has disappeared by the
time we get to run the workqueue.

So let's check that pointer before we continue with issuing the command
to the drive.

This fix is of the brown paper bag nature but considering that IDE is
long deprecated, let's do that so that random testing which happens to
enable CONFIG_IDE during randconfig builds, doesn't fail because of
this.

Besides, failing the TEST_UNIT_READY command because the drive private
data is gone is something which we could simply do anyway, to denote
that there was a problem communicating with the device.

  BUG: unable to handle kernel NULL pointer dereference at 000001c0
  IP: cdrom_check_status
  *pde = 00000000
  Oops: 0000 [#1] SMP
  CPU: 1 PID: 155 Comm: kworker/1:2 Not tainted 4.14.0-rc8 hardkernel#127
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
  Workqueue: events_freezable_power_ disk_events_workfn
  task: 4fe90980 task.stack: 507ac000
  EIP: cdrom_check_status+0x2c/0x90
  EFLAGS: 00210246 CPU: 1
  EAX: 00000000 EBX: 4fefec00 ECX: 00000000 EDX: 00000000
  ESI: 00000003 EDI: ffffffff EBP: 467a9340 ESP: 507aded0
   DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
  CR0: 80050033 CR2: 000001c0 CR3: 06e0f000 CR4: 00000690
  Call Trace:
   ? ide_cdrom_check_events_real
   ? cdrom_check_events
   ? disk_check_events
   ? process_one_work
   ? process_one_work
   ? worker_thread
   ? kthread
   ? process_one_work
   ? __kthread_create_on_node
   ? ret_from_fork
  Code: 53 83 ec 14 89 c3 89 d1 be 03 00 00 00 65 a1 14 00 00 00 89 44 24 10 31 c0 8b 43 18 c7 44 24 04 00 00 00 00 c7 04 24 00 00 00 00 <8a> 80 c0 01 00 00 c7 44 24 08 00 00 00 00 83 e0 03 c7 44 24 0c
  EIP: cdrom_check_status+0x2c/0x90 SS:ESP: 0068:507aded0
  CR2: 00000000000001c0
  ---[ end trace 2410e586dd8f88b2 ]---

Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
… updates

Commit 5e98596 ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing
implementation", 2016-12-20) added code that tries to exclude any use
or update of the hashed page table (HPT) while the HPT resizing code
is iterating through all the entries in the HPT.  It does this by
taking the kvm->lock mutex, clearing the kvm->arch.hpte_setup_done
flag and then sending an IPI to all CPUs in the host.  The idea is
that any VCPU task that tries to enter the guest will see that the
hpte_setup_done flag is clear and therefore call kvmppc_hv_setup_htab_rma,
which also takes the kvm->lock mutex and will therefore block until
we release kvm->lock.

However, any VCPU that is already in the guest, or is handling a
hypervisor page fault or hypercall, can re-enter the guest without
rechecking the hpte_setup_done flag.  The IPI will cause a guest exit
of any VCPUs that are currently in the guest, but does not prevent
those VCPU tasks from immediately re-entering the guest.

The result is that after resize_hpt_rehash_hpte() has made a HPTE
absent, a hypervisor page fault can occur and make that HPTE present
again.  This includes updating the rmap array for the guest real page,
meaning that we now have a pointer in the rmap array which connects
with pointers in the old rev array but not the new rev array.  In
fact, if the HPT is being reduced in size, the pointer in the rmap
array could point outside the bounds of the new rev array.  If that
happens, we can get a host crash later on such as this one:

[91652.628516] Unable to handle kernel paging request for data at address 0xd0000000157fb10c
[91652.628668] Faulting instruction address: 0xc0000000000e2640
[91652.628736] Oops: Kernel access of bad area, sig: 11 [#1]
[91652.628789] LE SMP NR_CPUS=1024 NUMA PowerNV
[91652.628847] Modules linked in: binfmt_misc vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas i2c_opal ipmi_powernv ipmi_devintf i2c_core ipmi_msghandler powernv_op_panel nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm_hv kvm_pr kvm scsi_dh_alua dm_service_time dm_multipath tg3 ptp pps_core [last unloaded: stap_552b612747aec2da355051e464fa72a1_14259]
[91652.629566] CPU: 136 PID: 41315 Comm: CPU 21/KVM Tainted: G           O    4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le #1
[91652.629684] task: c0000007a419e400 task.stack: c0000000028d8000
[91652.629750] NIP:  c0000000000e2640 LR: d00000000c36e498 CTR: c0000000000e25f0
[91652.629829] REGS: c0000000028db5d0 TRAP: 0300   Tainted: G           O     (4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le)
[91652.629932] MSR:  900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022422  XER: 00000000
[91652.630034] CFAR: d00000000c373f84 DAR: d0000000157fb10c DSISR: 40000000 SOFTE: 1
[91652.630034] GPR00: d00000000c36e498 c0000000028db850 c000000001403900 c0000007b7960000
[91652.630034] GPR04: d0000000117fb100 d000000007ab00d8 000000000033bb10 0000000000000000
[91652.630034] GPR08: fffffffffffffe7f 801001810073bb10 d00000000e440000 d00000000c373f70
[91652.630034] GPR12: c0000000000e25f0 c00000000fdb9400 f000000003b24680 0000000000000000
[91652.630034] GPR16: 00000000000004fb 00007ff7081a0000 00000000000ec91a 000000000033bb10
[91652.630034] GPR20: 0000000000010000 00000000001b1190 0000000000000001 0000000000010000
[91652.630034] GPR24: c0000007b7ab8038 d0000000117fb100 0000000ec91a1190 c000001e6a000000
[91652.630034] GPR28: 00000000033bb100 000000000073bb10 c0000007b7960000 d0000000157fb100
[91652.630735] NIP [c0000000000e2640] kvmppc_add_revmap_chain+0x50/0x120
[91652.630806] LR [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
[91652.630884] Call Trace:
[91652.630913] [c0000000028db850] [c0000000028db8b0] 0xc0000000028db8b0 (unreliable)
[91652.630996] [c0000000028db8b0] [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
[91652.631091] [c0000000028db9e0] [d00000000c36a078] kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv]
[91652.631179] [c0000000028dbb30] [d00000000c2248c4] kvmppc_vcpu_run+0x34/0x50 [kvm]
[91652.631266] [c0000000028dbb50] [d00000000c220d54] kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
[91652.631351] [c0000000028dbbd0] [d00000000c2139d8] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
[91652.631433] [c0000000028dbd40] [c0000000003832e0] do_vfs_ioctl+0xd0/0x8c0
[91652.631501] [c0000000028dbde0] [c000000000383ba4] SyS_ioctl+0xd4/0x130
[91652.631569] [c0000000028dbe30] [c00000000000b8e0] system_call+0x58/0x6c
[91652.631635] Instruction dump:
[91652.631676] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ffa1 2fa70000 793d0020 e9432110
[91652.631814] 7bbf26e4 7c7e1b78 7feafa14 409e0094 <807f000c> 786326e4 7c6a1a14 93a40008
[91652.631959] ---[ end trace ac85ba6db72e5b2e ]---

To fix this, we tighten up the way that the hpte_setup_done flag is
checked to ensure that it does provide the guarantee that the resizing
code needs.  In kvmppc_run_core(), we check the hpte_setup_done flag
after disabling interrupts and refuse to enter the guest if it is
clear (for a HPT guest).  The code that checks hpte_setup_done and
calls kvmppc_hv_setup_htab_rma() is moved from kvmppc_vcpu_run_hv()
to a point inside the main loop in kvmppc_run_vcpu(), ensuring that
we don't just spin endlessly calling kvmppc_run_core() while
hpte_setup_done is clear, but instead have a chance to block on the
kvm->lock mutex.

Finally we also check hpte_setup_done inside the region in
kvmppc_book3s_hv_page_fault() where the HPTE is locked and we are about
to update the HPTE, and bail out if it is clear.  If another CPU is
inside kvm_vm_ioctl_resize_hpt_commit) and has cleared hpte_setup_done,
then we know that either we are looking at a HPTE
that resize_hpt_rehash_hpte() has not yet processed, which is OK,
or else we will see hpte_setup_done clear and refuse to update it,
because of the full barrier formed by the unlock of the HPTE in
resize_hpt_rehash_hpte() combined with the locking of the HPTE
in kvmppc_book3s_hv_page_fault().

Fixes: 5e98596 ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation")
Cc: stable@vger.kernel.org # v4.10+
Reported-by: Satheesh Rajendran <satheera@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will
cause a divide error in usbnet_probe:

divide error: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f hardkernel#56
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bef5c00 task.stack: ffff88006bf60000
RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
Call Trace:
 usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
 qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
 usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
 really_probe drivers/base/dd.c:413
 driver_probe_device+0x522/0x740 drivers/base/dd.c:557

Fix by simply ignoring the bogus descriptor, as it is optional
for QMI devices anyway.

Fixes: 423ce8c ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
When we receive a packet on a QMI device in raw IP mode, we should call
skb_reset_mac_header() to ensure that skb->mac_header contains a valid
offset in the packet. While it shouldn't really matter, the packets have
no MAC header and the interface is configured as-such, it seems certain
parts of the network stack expects a "good" value in skb->mac_header.

Without the skb_reset_mac_header() call added in this patch, for example
shaping traffic (using tc) triggers the following oops on the first
received packet:

[  303.642957] skbuff: skb_under_panic: text:8f137918 len:177 put:67 head:8e4b0f00 data:8e4b0eff tail:0x8e4b0fb0 end:0x8e4b1520 dev:wwan0
[  303.655045] Kernel bug detected[#1]:
[  303.658622] CPU: 1 PID: 1002 Comm: logd Not tainted 4.9.58 #0
[  303.664339] task: 8fdf05e0 task.stack: 8f15c000
[  303.668844] $ 0   : 00000000 00000001 0000007a 00000000
[  303.674062] $ 4   : 8149a2fc 8149a2fc 8149ce20 00000000
[  303.679284] $ 8   : 00000030 3878303a 31623465 20303235
[  303.684510] $12   : ded731e3 2626a277 00000000 03bd0000
[  303.689747] $16   : 8ef62b40 00000043 8f137918 804db5fc
[  303.694978] $20   : 00000001 00000004 8fc13800 00000003
[  303.700215] $24   : 00000001 8024ab10
[  303.705442] $28   : 8f15c000 8fc19cf0 00000043 802cc920
[  303.710664] Hi    : 00000000
[  303.713533] Lo    : 74e58000
[  303.716436] epc   : 802cc920 skb_panic+0x58/0x5c
[  303.721046] ra    : 802cc920 skb_panic+0x58/0x5c
[  303.725639] Status: 11007c03 KERNEL EXL IE
[  303.729823] Cause : 50800024 (ExcCode 09)
[  303.733817] PrId  : 0001992f (MIPS 1004Kc)
[  303.737892] Modules linked in: rt2800pci rt2800mmio rt2800lib qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 mt76x2i
Process logd (pid: 1002, threadinfo=8f15c000, task=8fdf05e0, tls=77b3eee4)
[  303.962509] Stack : 00000000 80408990 8f137918 000000b1 00000043 8e4b0f00 8e4b0eff 8e4b0fb0
[  303.970871]         8e4b1520 8fec1800 00000043 802cd2a4 6e000045 00000043 00000000 8ef62000
[  303.979219]         8eef5d00 8ef62b40 8fea7300 8f137918 00000000 00000000 0002bb01 793e5664
[  303.987568]         8ef08884 00000001 8fea7300 00000002 8fc19e80 8eef5d00 00000006 00000003
[  303.995934]         00000000 8030ba90 00000003 77ab3fd0 8149dc80 8004d1bc 8f15c000 8f383700
[  304.004324]         ...
[  304.006767] Call Trace:
[  304.009241] [<802cc920>] skb_panic+0x58/0x5c
[  304.013504] [<802cd2a4>] skb_push+0x78/0x90
[  304.017783] [<8f137918>] 0x8f137918
[  304.021269] Code: 0060282  0c02a3b4  24842888 <000c000d> 8c870060  8c8200a0  0007382b  00070336  8c88005c
[  304.031034]
[  304.032805] ---[ end trace b778c482b3f0bda9 ]---
[  304.041384] Kernel panic - not syncing: Fatal exception in interrupt
[  304.051975] Rebooting in 3 seconds..

While the oops is for a 4.9-kernel, I was able to trigger the same oops with
net-next as of yesterday.

Fixes: 32f7adf ("net: qmi_wwan: support "raw IP" mode")
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
syzkaller reported a NULL pointer dereference in asn1_ber_decoder().  It
can be reproduced by the following command, assuming
CONFIG_PKCS7_TEST_KEY=y:

        keyctl add pkcs7_test desc '' @s

The bug is that if the data buffer is empty, an integer underflow occurs
in the following check:

        if (unlikely(dp >= datalen - 1))
                goto data_overrun_error;

This results in the NULL data pointer being dereferenced.

Fix it by checking for 'datalen - dp < 2' instead.

Also fix the similar check for 'dp >= datalen - n' later in the same
function.  That one possibly could result in a buffer overread.

The NULL pointer dereference was reproducible using the "pkcs7_test" key
type but not the "asymmetric" key type because the "asymmetric" key type
checks for a 0-length payload before calling into the ASN.1 decoder but
the "pkcs7_test" key type does not.

The bug report was:

    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    PGD 7b708067 P4D 7b708067 PUD 7b6ee067 PMD 0
    Oops: 0000 [#1] SMP
    Modules linked in:
    CPU: 0 PID: 522 Comm: syz-executor1 Not tainted 4.14.0-rc8 #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
    task: ffff9b6b3798c040 task.stack: ffff9b6b37970000
    RIP: 0010:asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    RSP: 0018:ffff9b6b37973c78 EFLAGS: 00010216
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000021c
    RDX: ffffffff814a04ed RSI: ffffb1524066e000 RDI: ffffffff910759e0
    RBP: ffff9b6b37973d60 R08: 0000000000000001 R09: ffff9b6b3caa4180
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS:  00007f10ed1f2700(0000) GS:ffff9b6b3ea00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000007b6f3000 CR4: 00000000000006f0
    Call Trace:
     pkcs7_parse_message+0xee/0x240 crypto/asymmetric_keys/pkcs7_parser.c:139
     verify_pkcs7_signature+0x33/0x180 certs/system_keyring.c:216
     pkcs7_preparse+0x41/0x70 crypto/asymmetric_keys/pkcs7_key_type.c:63
     key_create_or_update+0x180/0x530 security/keys/key.c:855
     SYSC_add_key security/keys/keyctl.c:122 [inline]
     SyS_add_key+0xbf/0x250 security/keys/keyctl.c:62
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x4585c9
    RSP: 002b:00007f10ed1f1bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000f8
    RAX: ffffffffffffffda RBX: 00007f10ed1f2700 RCX: 00000000004585c9
    RDX: 0000000020000000 RSI: 0000000020008ffb RDI: 0000000020008000
    RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff1b2260ae
    R13: 00007fff1b2260af R14: 00007f10ed1f2700 R15: 0000000000000000
    Code: dd ca ff 48 8b 45 88 48 83 e8 01 4c 39 f0 0f 86 a8 07 00 00 e8 53 dd ca ff 49 8d 46 01 48 89 85 58 ff ff ff 48 8b 85 60 ff ff ff <42> 0f b6 0c 30 89 c8 88 8d 75 ff ff ff 83 e0 1f 89 8d 28 ff ff
    RIP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: ffff9b6b37973c78
    CR2: 0000000000000000

Fixes: 42d5ec2 ("X.509: Add an ASN.1 decoder")
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
gcc-4.4 complains about:

	struct sgt_dma iter = {
		.sg = vma->pages->sgl,
		.dma = sg_dma_address(iter.sg),
		.max = iter.dma + iter.sg->length,
	};

drivers/gpu/drm/i915/i915_gem_gtt.c: In function ‘gen8_ppgtt_insert_4lvl’:
drivers/gpu/drm/i915/i915_gem_gtt.c:938: error: ‘iter.sg’ is used uninitialized in this function
drivers/gpu/drm/i915/i915_gem_gtt.c:939: error: ‘iter.dma’ is used uninitialized in this function

and worse generates invalid code that triggers a GPF:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: snd_aloop nf_conntrack_ipv6 nf_defrag_ipv6 nf_log_ipv6 ip6table_filter ip6_tables ctr ccm xt_state nf_log_ipv4
nf_log_common xt_LOG xt_limit xt_recent xt_owner xt_addrtype iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c ip_tables dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm
irqbypass uas usb_storage hid_multitouch btusb btrtl uvcvideo videobuf2_v4l2 videobuf2_core videodev media videobuf2_vmalloc videobuf2_memops
sg ppdev dell_wmi sparse_keymap mei_wdt sd_mod iTCO_wdt iTCO_vendor_support rtsx_pci_ms memstick rtsx_pci_sdmmc mmc_core dell_smm_hwmon hwmon
dell_laptop dell_smbios dcdbas joydev input_leds hci_uart btintel btqca btbcm bluetooth parport_pc parport i2c_hid
  intel_lpss_acpi intel_lpss pcspkr wmi int3400_thermal acpi_thermal_rel dell_rbtn mei_me mei snd_hda_codec_hdmi snd_hda_codec_realtek
snd_hda_codec_generic ahci libahci acpi_pad xhci_pci xhci_hcd snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device
snd_pcm snd_timer snd soundcore int3403_thermal arc4 e1000e ptp pps_core i2c_i801 iwlmvm mac80211 rtsx_pci iwlwifi cfg80211 rfkill
intel_pch_thermal processor_thermal_device int340x_thermal_zone intel_soc_dts_iosf i915 video fjes
CPU: 2 PID: 2408 Comm: X Not tainted 4.10.0-rc5+ #1
Hardware name: Dell Inc. Latitude E7470/0T6HHJ, BIOS 1.11.3 11/09/2016
task: ffff880219fe4740 task.stack: ffffc90005f98000
RIP: 0010:gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915]
RSP: 0018:ffffc90005f9b8c8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8802167d8000 RCX: 0000000000000001
RDX: 00000000ffff7000 RSI: ffff880219f94140 RDI: ffff880228444000
RBP: ffffc90005f9b948 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000080
R13: 0000000000000001 R14: ffffc90005f9bcd7 R15: ffff88020c9a83c0
FS:  00007fb53e1ee920(0000) GS:ffff88024dd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000022ef95000 CR4: 00000000003406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  ppgtt_bind_vma+0x40/0x50 [i915]
  i915_vma_bind+0xcb/0x1c0 [i915]
  __i915_vma_do_pin+0x6e/0xd0 [i915]
  i915_gem_execbuffer_reserve_vma+0x162/0x1d0 [i915]
  i915_gem_execbuffer_reserve+0x4fc/0x510 [i915]
  ? __kmalloc+0x134/0x250
  ? i915_gem_wait_for_error+0x25/0x100 [i915]
  ? i915_gem_wait_for_error+0x25/0x100 [i915]
  i915_gem_do_execbuffer+0x2df/0xa00 [i915]
  ? drm_malloc_gfp.clone.0+0x42/0x80 [i915]
  ? path_put+0x22/0x30
  ? __check_object_size+0x62/0x1f0
  ? terminate_walk+0x44/0x90
  i915_gem_execbuffer2+0x95/0x1e0 [i915]
  drm_ioctl+0x243/0x490
  ? handle_pte_fault+0x1d7/0x220
  ? i915_gem_do_execbuffer+0xa00/0xa00 [i915]
  ? handle_mm_fault+0x10d/0x2a0
  vfs_ioctl+0x18/0x30
  do_vfs_ioctl+0x14b/0x3f0
  SyS_ioctl+0x92/0xa0
  entry_SYSCALL_64_fastpath+0x1a/0xa9
RIP: 0033:0x7fb53b4fcb77
RSP: 002b:00007ffe0c572898 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fb53e17c038 RCX: 00007fb53b4fcb77
RDX: 00007ffe0c572900 RSI: 0000000040406469 RDI: 000000000000000b
RBP: 00007fb5376d67e0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000028 R11: 0000000000003246 R12: 0000000000000000
R13: 0000000000000000 R14: 000055eecb314d00 R15: 000055eecb315460
Code: 0f 84 5d ff ff ff eb a2 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 58 0f 1f 44 00 00 31 c0 89 4d b0 <4c>
8b 60 10 44 8b 70 0c 48 89 d0 4c 8b 2e 48 c1 e8 27 25 ff 01
RIP: gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915] RSP: ffffc90005f9b8c8
CR2: 0000000000000010

Recent gccs, such as 4.9, 6.3 or 7.2, do not generate the warning nor do
they explode on use. If we manually create the struct using locals from
the stack, this should eliminate this issue, and does not alter code
generation with gcc-7.2.

Fixes: 894cceb ("drm/i915: Micro-optimise gen8_ppgtt_insert_entries()")
Reported-by: Kelly French <kfrench@federalhill.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kelly French <kfrench@federalhill.net>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171106211128.12538-1-chris@chris-wilson.co.uk
Tested-by: Kelly French <kfrench@federalhill.net>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit 5684514)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.

Similar issue is present in asix_resume(), this patch fixes it as well.

Found by syzkaller.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c hardkernel#400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
 usb_suspend_interface drivers/usb/core/driver.c:1209
 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
rds_ib_recv_refill() is a function that refills an IB receive
queue. It can be called from both the CQE handler (tasklet) and a
worker thread.

Just after the call to ib_post_recv(), a debug message is printed with
rdsdebug():

            ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
            rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
                     recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
                     (long) ib_sg_dma_address(
                            ic->i_cm_id->device,
                            &recv->r_frag->f_sg),
                    ret);

Now consider an invocation of rds_ib_recv_refill() from the worker
thread, which is preemptible. Further, assume that the worker thread
is preempted between the ib_post_recv() and rdsdebug() statements.

Then, if the preemption is due to a receive CQE event, the
rds_ib_recv_cqe_handler() will be invoked. This function processes
receive completions, including freeing up data structures, such as the
recv->r_frag.

In this scenario, rds_ib_recv_cqe_handler() will process the receive
WR posted above. That implies, that the recv->r_frag has been freed
before the above rdsdebug() statement has been executed. When it is
later executed, we will have a NULL pointer dereference:

[ 4088.068008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 4088.076754] IP: rds_ib_recv_refill+0x87/0x620 [rds_rdma]
[ 4088.082686] PGD 0 P4D 0
[ 4088.085515] Oops: 0000 [#1] SMP
[ 4088.089015] Modules linked in: rds_rdma(OE) rds(OE) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) mlx4_ib(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_core(E) binfmt_misc(E) sb_edac(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) crypto_simd(E) iTCO_wdt(E) glue_helper(E) iTCO_vendor_support(E) sg(E) cryptd(E) pcspkr(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) shpchp(E) ioatdma(E) i2c_i801(E) wmi(E) lpc_ich(E) mei_me(E) mei(E) mfd_core(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E) fscrypto(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
[ 4088.168486]  fb_sys_fops(E) ahci(E) ixgbe(E) libahci(E) ttm(E) mdio(E) ptp(E) pps_core(E) drm(E) sd_mod(E) libata(E) crc32c_intel(E) mlx4_core(E) i2c_core(E) dca(E) megaraid_sas(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: rds]
[ 4088.193442] CPU: 20 PID: 1244 Comm: kworker/20:2 Tainted: G           OE   4.14.0-rc7.master.20171105.ol7.x86_64 #1
[ 4088.205097] Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
[ 4088.216074] Workqueue: ib_cm cm_work_handler [ib_cm]
[ 4088.221614] task: ffff885fa11d0000 task.stack: ffffc9000e598000
[ 4088.228224] RIP: 0010:rds_ib_recv_refill+0x87/0x620 [rds_rdma]
[ 4088.234736] RSP: 0018:ffffc9000e59bb68 EFLAGS: 00010286
[ 4088.240568] RAX: 0000000000000000 RBX: ffffc9002115d050 RCX: ffffc9002115d050
[ 4088.248535] RDX: ffffffffa0521380 RSI: ffffffffa0522158 RDI: ffffffffa0525580
[ 4088.256498] RBP: ffffc9000e59bbf8 R08: 0000000000000005 R09: 0000000000000000
[ 4088.264465] R10: 0000000000000339 R11: 0000000000000001 R12: 0000000000000000
[ 4088.272433] R13: ffff885f8c9d8000 R14: ffffffff81a0a060 R15: ffff884676268000
[ 4088.280397] FS:  0000000000000000(0000) GS:ffff885fbec80000(0000) knlGS:0000000000000000
[ 4088.289434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4088.295846] CR2: 0000000000000020 CR3: 0000000001e09005 CR4: 00000000001606e0
[ 4088.303816] Call Trace:
[ 4088.306557]  rds_ib_cm_connect_complete+0xe0/0x220 [rds_rdma]
[ 4088.312982]  ? __dynamic_pr_debug+0x8c/0xb0
[ 4088.317664]  ? __queue_work+0x142/0x3c0
[ 4088.321944]  rds_rdma_cm_event_handler+0x19e/0x250 [rds_rdma]
[ 4088.328370]  cma_ib_handler+0xcd/0x280 [rdma_cm]
[ 4088.333522]  cm_process_work+0x25/0x120 [ib_cm]
[ 4088.338580]  cm_work_handler+0xd6b/0x17aa [ib_cm]
[ 4088.343832]  process_one_work+0x149/0x360
[ 4088.348307]  worker_thread+0x4d/0x3e0
[ 4088.352397]  kthread+0x109/0x140
[ 4088.355996]  ? rescuer_thread+0x380/0x380
[ 4088.360467]  ? kthread_park+0x60/0x60
[ 4088.364563]  ret_from_fork+0x25/0x30
[ 4088.368548] Code: 48 89 45 90 48 89 45 98 eb 4d 0f 1f 44 00 00 48 8b 43 08 48 89 d9 48 c7 c2 80 13 52 a0 48 c7 c6 58 21 52 a0 48 c7 c7 80 55 52 a0 <4c> 8b 48 20 44 89 64 24 08 48 8b 40 30 49 83 e1 fc 48 89 04 24
[ 4088.389612] RIP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] RSP: ffffc9000e59bb68
[ 4088.397772] CR2: 0000000000000020
[ 4088.401505] ---[ end trace fe922e6ccf004431 ]---

This bug was provoked by compiling rds out-of-tree with
EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
between the rdsdebug() and ib_ib_port_recv() statements:

   	       /* XXX when can this fail? */
	       ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
+		if (can_wait)
+			usleep_range(1000, 5000);
	       rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
			recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
			(long) ib_sg_dma_address(

The fix is simply to move the rdsdebug() statement up before the
ib_post_recv() and remove the printing of ret, which is taken care of
anyway by the non-debug code.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
When IOMMU support was enabled, dma-buf import in Exynos DRM was broken
since commit f43c359 ("drm/exynos: use real device for DMA-mapping
operations") due to using wrong struct device in drm_gem_prime_import()
function. This patch fixes following kernel BUG caused by incorrect buffer
mapping to DMA address space:

exynos-sysmmu 14650000.sysmmu: 14450000.mixer: PAGE FAULT occurred at 0xb2e00000
------------[ cut here ]------------
kernel BUG at drivers/iommu/exynos-iommu.c:449!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc4-next-20171016-00033-g990d723669fd #3165
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
task: c0e0b7c0 task.stack: c0e00000
PC is at exynos_sysmmu_irq+0x1d0/0x24c
LR is at exynos_sysmmu_irq+0x154/0x24c
------------[ cut here ]------------

Reported-by: Marian Mihailescu <mihailescu2m@gmail.com>
Fixes: f43c359 ("drm/exynos: use real device for DMA-mapping operations")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap().
Fix it to not take the lock. The following lockdep warning is fixed
with this change.

[ 2106.181412] ======================================================
[ 2106.187563] WARNING: possible circular locking dependency detected
[ 2106.193718] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted
[ 2106.199175] ------------------------------------------------------
[ 2106.205328] qtdemux0:sink/2614 is trying to acquire lock:
[ 2106.210701]  (&dev->mfc_mutex){+.+.}, at: [<bf175544>] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[ 2106.218672]
[ 2106.218672] but task is already holding lock:
[ 2106.224477]  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 2106.231497]
[ 2106.231497] which lock already depends on the new lock.
[ 2106.231497]
[ 2106.239642]
[ 2106.239642] the existing dependency chain (in reverse order) is:
[ 2106.247095]
[ 2106.247095] -> #1 (&mm->mmap_sem){++++}:
[ 2106.252473]        __might_fault+0x80/0xb0
[ 2106.256567]        video_usercopy+0x1cc/0x510 [videodev]
[ 2106.261845]        v4l2_ioctl+0xa4/0xdc [videodev]
[ 2106.266596]        do_vfs_ioctl+0xa0/0xa18
[ 2106.270667]        SyS_ioctl+0x34/0x5c
[ 2106.274395]        ret_fast_syscall+0x0/0x28
[ 2106.278637]
[ 2106.278637] -> #0 (&dev->mfc_mutex){+.+.}:
[ 2106.284186]        lock_acquire+0x6c/0x88
[ 2106.288173]        __mutex_lock+0x68/0xa34
[ 2106.292244]        mutex_lock_interruptible_nested+0x1c/0x24
[ 2106.297893]        s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]
[ 2106.302747]        v4l2_mmap+0x54/0x88 [videodev]
[ 2106.307409]        mmap_region+0x3a8/0x638
[ 2106.311480]        do_mmap+0x330/0x3a4
[ 2106.315207]        vm_mmap_pgoff+0x90/0xb8
[ 2106.319279]        SyS_mmap_pgoff+0x90/0xc0
[ 2106.323439]        ret_fast_syscall+0x0/0x28
[ 2106.327683]
[ 2106.327683] other info that might help us debug this:
[ 2106.327683]
[ 2106.335656]  Possible unsafe locking scenario:
[ 2106.335656]
[ 2106.341548]        CPU0                    CPU1
[ 2106.346053]        ----                    ----
[ 2106.350559]   lock(&mm->mmap_sem);
[ 2106.353939]                                lock(&dev->mfc_mutex);
[ 2106.353939]                                lock(&dev->mfc_mutex);
[ 2106.365897]   lock(&dev->mfc_mutex);
[ 2106.369450]
[ 2106.369450]  *** DEADLOCK ***
[ 2106.369450]
[ 2106.375344] 1 lock held by qtdemux0:sink/2614:
[ 2106.379762]  #0:  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 2106.387214]
[ 2106.387214] stack backtrace:
[ 2106.391550] CPU: 7 PID: 2614 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4
[ 2106.400213] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 2106.406285] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 2106.413995] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4)
[ 2106.421187] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410)
[ 2106.429245] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938)
[ 2106.437651] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc)
[ 2106.445883] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88)
[ 2106.453596] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34)
[ 2106.461221] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24)
[ 2106.470425] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf175544>] (s5p_mfc_mmap+0x28/0xd4 [s5p_mfc])
[ 2106.480494] [<bf175544>] (s5p_mfc_mmap [s5p_mfc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev])
[ 2106.489575] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638)
[ 2106.497875] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4)
[ 2106.505068] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8)
[ 2106.512260] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0)
[ 2106.520059] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28)

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Hans Verkuil <hansverk@cisco.com>
Acked-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Nov 17, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap().
Fix it to not take the lock. The following lockdep warning is fixed
with this change.

[ 1990.972058] ======================================================
[ 1990.978172] WARNING: possible circular locking dependency detected
[ 1990.984327] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted
[ 1990.989783] ------------------------------------------------------
[ 1990.995937] qtdemux0:sink/2765 is trying to acquire lock:
[ 1991.001309]  (&gsc->lock){+.+.}, at: [<bf1729f0>] gsc_m2m_mmap+0x24/0x5c [exynos_gsc]
[ 1991.009108]
               but task is already holding lock:
[ 1991.014913]  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 1991.021932]
               which lock already depends on the new lock.

[ 1991.030078]
               the existing dependency chain (in reverse order) is:
[ 1991.037530]
               -> #1 (&mm->mmap_sem){++++}:
[ 1991.042913]        __might_fault+0x80/0xb0
[ 1991.047096]        video_usercopy+0x1cc/0x510 [videodev]
[ 1991.052297]        v4l2_ioctl+0xa4/0xdc [videodev]
[ 1991.057036]        do_vfs_ioctl+0xa0/0xa18
[ 1991.061102]        SyS_ioctl+0x34/0x5c
[ 1991.064834]        ret_fast_syscall+0x0/0x28
[ 1991.069072]
               -> #0 (&gsc->lock){+.+.}:
[ 1991.074193]        lock_acquire+0x6c/0x88
[ 1991.078179]        __mutex_lock+0x68/0xa34
[ 1991.082247]        mutex_lock_interruptible_nested+0x1c/0x24
[ 1991.087888]        gsc_m2m_mmap+0x24/0x5c [exynos_gsc]
[ 1991.093029]        v4l2_mmap+0x54/0x88 [videodev]
[ 1991.097673]        mmap_region+0x3a8/0x638
[ 1991.101743]        do_mmap+0x330/0x3a4
[ 1991.105470]        vm_mmap_pgoff+0x90/0xb8
[ 1991.109542]        SyS_mmap_pgoff+0x90/0xc0
[ 1991.113702]        ret_fast_syscall+0x0/0x28
[ 1991.117945]
               other info that might help us debug this:

[ 1991.125918]  Possible unsafe locking scenario:

[ 1991.131810]        CPU0                    CPU1
[ 1991.136315]        ----                    ----
[ 1991.140821]   lock(&mm->mmap_sem);
[ 1991.144201]                                lock(&gsc->lock);
[ 1991.149833]                                lock(&mm->mmap_sem);
[ 1991.155725]   lock(&gsc->lock);
[ 1991.158845]
                *** DEADLOCK ***

[ 1991.164740] 1 lock held by qtdemux0:sink/2765:
[ 1991.169157]  #0:  (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8
[ 1991.176609]
               stack backtrace:
[ 1991.180946] CPU: 2 PID: 2765 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4
[ 1991.189608] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 1991.195686] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 1991.203393] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4)
[ 1991.210586] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410)
[ 1991.218644] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938)
[ 1991.227049] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc)
[ 1991.235281] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88)
[ 1991.242993] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34)
[ 1991.250620] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24)
[ 1991.259812] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf1729f0>] (gsc_m2m_mmap+0x24/0x5c [exynos_gsc])
[ 1991.270159] [<bf1729f0>] (gsc_m2m_mmap [exynos_gsc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev])
[ 1991.279510] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638)
[ 1991.287792] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4)
[ 1991.294986] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8)
[ 1991.302178] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0)
[ 1991.309977] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28)

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Suggested-by: Hans Verkuil <hansverk@cisco.com>
Acked-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
Commit 308c6ca ("net: hns: All ports can not work when insmod hns ko
after rmmod.") add phy_stop in hns_nic_init_phy(), In the branch of "net",
this method is effective, but in the branch of "net-next", it will cause
a WARNING when hns modules loaded, reference to commit 2b3e88e ("net:
phy: improve phy state checking"):

[10.092168] ------------[ cut here ]------------
[10.092171] called from state READY
[10.092189] WARNING: CPU: 4 PID: 1 at ../drivers/net/phy/phy.c:854
                phy_stop+0x90/0xb0
[10.092192] Modules linked in:
[10.092197] CPU: 4 PID:1 Comm:swapper/0 Not tainted 4.20.0-rc7-next-20181220 #1
[10.092200] Hardware name: Huawei TaiShan 2280 /D05, BIOS Hisilicon D05 UEFI
                16.12 Release 05/15/2017
[10.092202] pstate: 60000005 (nZCv daif -PAN -UAO)
[10.092205] pc : phy_stop+0x90/0xb0
[10.092208] lr : phy_stop+0x90/0xb0
[10.092209] sp : ffff00001159ba90
[10.092212] x29: ffff00001159ba90 x28: 0000000000000007
[10.092215] x27: ffff000011180068 x26: ffff0000110a5620
[10.092218] x25: ffff0000113b6000 x24: ffff842f96dac000
[10.092221] x23: 0000000000000000 x22: 0000000000000000
[10.092223] x21: ffff841fb8425e18 x20: ffff801fb3a56438
[10.092226] x19: ffff801fb3a56000 x18: ffffffffffffffff
[10.092228] x17: 0000000000000000 x16: 0000000000000000
[10.092231] x15: ffff00001122d6c8 x14: ffff00009159b7b7
[10.092234] x13: ffff00001159b7c5 x12: ffff000011245000
[10.092236] x11: 0000000005f5e0ff x10: ffff00001159b750
[10.092239] x9 : 00000000ffffffd0 x8 : 0000000000000465
[10.092242] x7 : ffff0000112457f8 x6 : ffff0000113bd7ce
[10.092245] x5 : 0000000000000000 x4 : 0000000000000000
[10.092247] x3 : 00000000ffffffff x2 : ffff000011245828
[10.092250] x1 : 4b5860bd05871300 x0 : 0000000000000000
[10.092253] Call trace:
[10.092255]  phy_stop+0x90/0xb0
[10.092260]  hns_nic_init_phy+0xf8/0x110
[10.092262]  hns_nic_try_get_ae+0x4c/0x3b0
[10.092264]  hns_nic_dev_probe+0x1fc/0x480
[10.092268]  platform_drv_probe+0x50/0xa0
[10.092271]  really_probe+0x1f4/0x298
[10.092273]  driver_probe_device+0x58/0x108
[10.092275]  __driver_attach+0xdc/0xe0
[10.092278]  bus_for_each_dev+0x74/0xc8
[10.092280]  driver_attach+0x20/0x28
[10.092283]  bus_add_driver+0x1b8/0x228
[10.092285]  driver_register+0x60/0x110
[10.092288]  __platform_driver_register+0x40/0x48
[10.092292]  hns_nic_dev_driver_init+0x18/0x20
[10.092296]  do_one_initcall+0x5c/0x180
[10.092299]  kernel_init_freeable+0x198/0x240
[10.092303]  kernel_init+0x10/0x108
[10.092306]  ret_from_fork+0x10/0x18
[10.092308] ---[ end trace 1396dd0278e397eb ]---

This WARNING occurred because of calling phy_stop before phy_start.

The root cause of the problem in commit '308c6cafde01' is:

Reference to hns_nic_init_phy, the flag phydev->supported is changed after
phy_connect_direct. The flag phydev->supported is 0x6ff when hns modules is
loaded, so will not change Fiber Port power(Reference to marvell.c), which
is power on at default.
Then the flag phydev->supported is changed to 0x6f, so Fiber Port power is
off when removing hns modules.
When hns modules installed again, the flag phydev->supported is default
value 0x6ff, so will not change Fiber Port power(now is off), causing mac
link not up problem.

So the solution is change phy flags before phy_connect_direct.

Fixes: 308c6ca ("net: hns: All ports can not work when insmod hns ko after rmmod.")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
When enable SLUB debug, than remove hns_enet_drv module, SLUB debug will
identify a use after free bug:

[134.189505] Unable to handle kernel paging request at virtual address
		006b6b6b6b6b6b6b
[134.197553] Mem abort info:
[134.200381]   ESR = 0x96000004
[134.203487]   Exception class = DABT (current EL), IL = 32 bits
[134.209497]   SET = 0, FnV = 0
[134.212596]   EA = 0, S1PTW = 0
[134.215777] Data abort info:
[134.218701]   ISV = 0, ISS = 0x00000004
[134.222596]   CM = 0, WnR = 0
[134.225606] [006b6b6b6b6b6b6b] address between user and kernel address ranges
[134.232851] Internal error: Oops: 96000004 [#1] SMP
[134.237798] CPU: 21 PID: 27834 Comm: rmmod Kdump: loaded Tainted: G
		OE     4.19.5-1.2.34.aarch64 #1
[134.247856] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.58 10/24/2018
[134.255181] pstate: 20000005 (nzCv daif -PAN -UAO)
[134.260044] pc : hns_ae_put_handle+0x38/0x60
[134.264372] lr : hns_ae_put_handle+0x24/0x60
[134.268700] sp : ffff00001be93c50
[134.272054] x29: ffff00001be93c50 x28: ffff802faaec8040
[134.277442] x27: 0000000000000000 x26: 0000000000000000
[134.282830] x25: 0000000056000000 x24: 0000000000000015
[134.288284] x23: ffff0000096fe098 x22: ffff000001050070
[134.293671] x21: ffff801fb3c044a0 x20: ffff80afb75ec098
[134.303287] x19: ffff80afb75ec098 x18: 0000000000000000
[134.312945] x17: 0000000000000000 x16: 0000000000000000
[134.322517] x15: 0000000000000002 x14: 0000000000000000
[134.332030] x13: dead000000000100 x12: ffff7e02bea3c988
[134.341487] x11: ffff80affbee9e68 x10: 0000000000000000
[134.351033] x9 : 6fffff8000008101 x8 : 0000000000000000
[134.360569] x7 : dead000000000100 x6 : ffff000009579748
[134.370059] x5 : 0000000000210d00 x4 : 0000000000000000
[134.379550] x3 : 0000000000000001 x2 : 0000000000000000
[134.388813] x1 : 6b6b6b6b6b6b6b6b x0 : 0000000000000000
[134.397993] Process rmmod (pid: 27834, stack limit = 0x00000000d474b7fd)
[134.408498] Call trace:
[134.414611]  hns_ae_put_handle+0x38/0x60
[134.422208]  hnae_put_handle+0xd4/0x108
[134.429563]  hns_nic_dev_remove+0x60/0xc0 [hns_enet_drv]
[134.438342]  platform_drv_remove+0x2c/0x70
[134.445958]  device_release_driver_internal+0x174/0x208
[134.454810]  driver_detach+0x70/0xd8
[134.461913]  bus_remove_driver+0x64/0xe8
[134.469396]  driver_unregister+0x34/0x60
[134.476822]  platform_driver_unregister+0x20/0x30
[134.485130]  hns_nic_dev_driver_exit+0x14/0x6e4 [hns_enet_drv]
[134.494634]  __arm64_sys_delete_module+0x238/0x290

struct hnae_handle is a member of struct hnae_vf_cb, so when vf_cb is
freed, than use hnae_handle will cause use after free panic.

This patch frees vf_cb after hnae_handle used.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
syzbot was able to crash one host with the following stack trace :

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 8625 Comm: syz-executor4 Not tainted 4.20.0+ #8
RIP: 0010:dev_net include/linux/netdevice.h:2169 [inline]
RIP: 0010:icmp6_send+0x116/0x2d30 net/ipv6/icmp.c:426
 icmpv6_send
 smack_socket_sock_rcv_skb
 security_sock_rcv_skb
 sk_filter_trim_cap
 __sk_receive_skb
 dccp_v6_do_rcv
 release_sock

This is because a RX packet found socket owned by user and
was stored into socket backlog. Before leaving RCU protected section,
skb->dev was cleared in __sk_receive_skb(). When socket backlog
was finally handled at release_sock() time, skb was fed to
smack_socket_sock_rcv_skb() then icmp6_send()

We could fix the bug in smack_socket_sock_rcv_skb(), or simply
make icmp6_send() more robust against such possibility.

In the future we might provide to icmp6_send() the net pointer
instead of infering it.

Fixes: d66a8ac ("Smack: Inform peer that IPv6 traffic has been blocked")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Piotr Sawicki <p.sawicki2@partner.samsung.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
Keys for "authenc" AEADs are formatted as an rtattr containing a 4-byte
'enckeylen', followed by an authentication key and an encryption key.
crypto_authenc_extractkeys() parses the key to find the inner keys.

However, it fails to consider the case where the rtattr's payload is
longer than 4 bytes but not 4-byte aligned, and where the key ends
before the next 4-byte aligned boundary.  In this case, 'keylen -=
RTA_ALIGN(rta->rta_len);' underflows to a value near UINT_MAX.  This
causes a buffer overread and crash during crypto_ahash_setkey().

Fix it by restricting the rtattr payload to the expected size.

Reproducer using AF_ALG:

	#include <linux/if_alg.h>
	#include <linux/rtnetlink.h>
	#include <sys/socket.h>

	int main()
	{
		int fd;
		struct sockaddr_alg addr = {
			.salg_type = "aead",
			.salg_name = "authenc(hmac(sha256),cbc(aes))",
		};
		struct {
			struct rtattr attr;
			__be32 enckeylen;
			char keys[1];
		} __attribute__((packed)) key = {
			.attr.rta_len = sizeof(key),
			.attr.rta_type = 1 /* CRYPTO_AUTHENC_KEYA_PARAM */,
		};

		fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
		bind(fd, (void *)&addr, sizeof(addr));
		setsockopt(fd, SOL_ALG, ALG_SET_KEY, &key, sizeof(key));
	}

It caused:

	BUG: unable to handle kernel paging request at ffff88007ffdc000
	PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 2e05067 PTE 0
	Oops: 0000 [#1] SMP
	CPU: 0 PID: 883 Comm: authenc Not tainted 4.20.0-rc1-00108-g00c9fe37a7f27 hardkernel#13
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
	RIP: 0010:sha256_ni_transform+0xb3/0x330 arch/x86/crypto/sha256_ni_asm.S:155
	[...]
	Call Trace:
	 sha256_ni_finup+0x10/0x20 arch/x86/crypto/sha256_ssse3_glue.c:321
	 crypto_shash_finup+0x1a/0x30 crypto/shash.c:178
	 shash_digest_unaligned+0x45/0x60 crypto/shash.c:186
	 crypto_shash_digest+0x24/0x40 crypto/shash.c:202
	 hmac_setkey+0x135/0x1e0 crypto/hmac.c:66
	 crypto_shash_setkey+0x2b/0xb0 crypto/shash.c:66
	 shash_async_setkey+0x10/0x20 crypto/shash.c:223
	 crypto_ahash_setkey+0x2d/0xa0 crypto/ahash.c:202
	 crypto_authenc_setkey+0x68/0x100 crypto/authenc.c:96
	 crypto_aead_setkey+0x2a/0xc0 crypto/aead.c:62
	 aead_setkey+0xc/0x10 crypto/algif_aead.c:526
	 alg_setkey crypto/af_alg.c:223 [inline]
	 alg_setsockopt+0xfe/0x130 crypto/af_alg.c:256
	 __sys_setsockopt+0x6d/0xd0 net/socket.c:1902
	 __do_sys_setsockopt net/socket.c:1913 [inline]
	 __se_sys_setsockopt net/socket.c:1910 [inline]
	 __x64_sys_setsockopt+0x1f/0x30 net/socket.c:1910
	 do_syscall_64+0x4a/0x180 arch/x86/entry/common.c:290
	 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: e236d4a ("[CRYPTO] authenc: Move enckeylen into key itself")
Cc: <stable@vger.kernel.org> # v2.6.25+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
Recent changes - probably DMA API related (generic and/or arm64-specific) -
exposed a case where driver maps a zero-length buffer:
ahash_init()->ahash_update()->ahash_final() with a zero-length string to
hash

kernel BUG at kernel/dma/swiotlb.c:475!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 1823 Comm: cryptomgr_test Not tainted 4.20.0-rc1-00108-g00c9fe37a7f2 #1
Hardware name: LS1046A RDB Board (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO)
pc : swiotlb_tbl_map_single+0x170/0x2b8
lr : swiotlb_map_page+0x134/0x1f8
sp : ffff00000f79b8f0
x29: ffff00000f79b8f0 x28: 0000000000000000
x27: ffff0000093d0000 x26: 0000000000000000
x25: 00000000001f3ffe x24: 0000000000200000
x23: 0000000000000000 x22: 00000009f2c538c0
x21: ffff800970aeb410 x20: 0000000000000001
x19: ffff800970aeb410 x18: 0000000000000007
x17: 000000000000000e x16: 0000000000000001
x15: 0000000000000019 x14: c32cb8218a167fe8
x13: ffffffff00000000 x12: ffff80097fdae348
x11: 0000800976bca000 x10: 0000000000000010
x9 : 0000000000000000 x8 : ffff0000091fd6c8
x7 : 0000000000000000 x6 : 00000009f2c538bf
x5 : 0000000000000000 x4 : 0000000000000001
x3 : 0000000000000000 x2 : 00000009f2c538c0
x1 : 00000000f9fff000 x0 : 0000000000000000
Process cryptomgr_test (pid: 1823, stack limit = 0x(____ptrval____))
Call trace:
 swiotlb_tbl_map_single+0x170/0x2b8
 swiotlb_map_page+0x134/0x1f8
 ahash_final_no_ctx+0xc4/0x6cc
 ahash_final+0x10/0x18
 crypto_ahash_op+0x30/0x84
 crypto_ahash_final+0x14/0x1c
 __test_hash+0x574/0xe0c
 test_hash+0x28/0x80
 __alg_test_hash+0x84/0xd0
 alg_test_hash+0x78/0x144
 alg_test.part.30+0x12c/0x2b4
 alg_test+0x3c/0x68
 cryptomgr_test+0x44/0x4c
 kthread+0xfc/0x128
 ret_from_fork+0x10/0x18
Code: d34bfc18 2a1a03f7 1a9f8694 35fff89a (d4210000)

Cc: <stable@vger.kernel.org>
Signed-off-by: Aymen Sghaier <aymen.sghaier@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
Authencesn template in decrypt path unconditionally calls aead_request_complete
after ahash_verify which leads to following kernel panic in after decryption.

[  338.539800] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  338.548372] PGD 0 P4D 0
[  338.551157] Oops: 0000 [#1] SMP PTI
[  338.554919] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G        W I       4.19.7+ hardkernel#13
[  338.564431] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0        07/29/10
[  338.572212] RIP: 0010:esp_input_done2+0x350/0x410 [esp4]
[  338.578030] Code: ff 0f b6 68 10 48 8b 83 c8 00 00 00 e9 8e fe ff ff 8b 04 25 04 00 00 00 83 e8 01 48 98 48 8b 3c c5 10 00 00 00 e9 f7 fd ff ff <8b> 04 25 04 00 00 00 83 e8 01 48 98 4c 8b 24 c5 10 00 00 00 e9 3b
[  338.598547] RSP: 0018:ffff911c97803c00 EFLAGS: 00010246
[  338.604268] RAX: 0000000000000002 RBX: ffff911c4469ee00 RCX: 0000000000000000
[  338.612090] RDX: 0000000000000000 RSI: 0000000000000130 RDI: ffff911b87c20400
[  338.619874] RBP: 0000000000000000 R08: ffff911b87c20498 R09: 000000000000000a
[  338.627610] R10: 0000000000000001 R11: 0000000000000004 R12: 0000000000000000
[  338.635402] R13: ffff911c89590000 R14: ffff911c91730000 R15: 0000000000000000
[  338.643234] FS:  0000000000000000(0000) GS:ffff911c97800000(0000) knlGS:0000000000000000
[  338.652047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  338.658299] CR2: 0000000000000004 CR3: 00000001ec20a000 CR4: 00000000000006f0
[  338.666382] Call Trace:
[  338.669051]  <IRQ>
[  338.671254]  esp_input_done+0x12/0x20 [esp4]
[  338.675922]  chcr_handle_resp+0x3b5/0x790 [chcr]
[  338.680949]  cpl_fw6_pld_handler+0x37/0x60 [chcr]
[  338.686080]  chcr_uld_rx_handler+0x22/0x50 [chcr]
[  338.691233]  uldrx_handler+0x8c/0xc0 [cxgb4]
[  338.695923]  process_responses+0x2f0/0x5d0 [cxgb4]
[  338.701177]  ? bitmap_find_next_zero_area_off+0x3a/0x90
[  338.706882]  ? matrix_alloc_area.constprop.7+0x60/0x90
[  338.712517]  ? apic_update_irq_cfg+0x82/0xf0
[  338.717177]  napi_rx_handler+0x14/0xe0 [cxgb4]
[  338.722015]  net_rx_action+0x2aa/0x3e0
[  338.726136]  __do_softirq+0xcb/0x280
[  338.730054]  irq_exit+0xde/0xf0
[  338.733504]  do_IRQ+0x54/0xd0
[  338.736745]  common_interrupt+0xf/0xf

Fixes: 104880a ("crypto: authencesn - Convert to new AEAD...")
Signed-off-by: Harsh Jain <harsh@chelsio.com>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
mihailescu2m pushed a commit that referenced this pull request Jan 21, 2019
This patch modified test_btf pretty print test to cover
the bitfield with struct member equal to or greater 256.

Without the previous kernel patch fix, the modified test will fail:

  $ test_btf -p
  ......
  BTF pretty print array(#1)......unexpected pprint output
  expected: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x1}
      read: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x0}

  BTF pretty print array(#2)......unexpected pprint output
  expected: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x1}
      read: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x0}

  PASS:6 SKIP:0 FAIL:2

With the kernel fix, the modified test will succeed:
  $ test_btf -p
  ......
  BTF pretty print array(#1)......OK
  BTF pretty print array(#2)......OK
  PASS:8 SKIP:0 FAIL:0

Fixes: 9d5f9f7 ("bpf: btf: fix struct/union/fwd types with kind_flag")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants