build: fix build with gcc 7 #1
Closed
JeanMarcCoic wants to merge 1 commit into
Conversation
When building the generated source from flex using gcc 7, there is a
warning about an implicit fallthrough:
```
iopc/iopc-lex.c: In function ‘iopc_lex’:
iopc/iopc-lex.l:267:19: error: this statement may fall through [-Werror=implicit-fallthrough=]
yyextra->len = yyleng;
iopc/iopc-lex.c:1507:2: note: in expansion of macro ‘YY_USER_ACTION’
#endif
^~~
iopc/iopc-lex.l:503:1: note: in expansion of macro ‘YY_RULE_SETUP’
<<EOF>> { ERROR("unterminated doxygen param"); }
^ ~~~
iopc/iopc-lex.l:504:1: note: here
.|^","|","{HS}*"," {
^
cc1: all warnings being treated as errors
```
Interestingly, gcc 8 doesn't complain about this fallthrough. To fix it, we
simply define two separate error messages instead of relying on an implicit
fallthrough.
vthib
approved these changes
Aug 29, 2019
skyj
approved these changes
Sep 3, 2019
Contributor
Pull request merged: 32a28fc
vthib
added a commit
that referenced
this pull request
Jan 6, 2020
The consistency of the QPS blocks could get corrupted in some realloc situations. A block may end up marked as having its previous block free while it isn't. Use of the freelist might then end up handing out the wrong blocks, and then... you get 0x1010101 in all your handles. Makes sense!

To describe the bug, here is a quick summary of QPS, or rather of QPS's implementation of the TLSF allocator. Allocated blocks have an associated header which indicates the status of the block. Notably, two bits are used to indicate whether the block is used or free, and whether the previous block is used or free.

Here is a buggy situation:

* First block is used, with size N
* Second block is free, with size 1
* Third block is used, with size M

In the qps->hdrs, this ends up as:

```
+---------------+---------------+--------------------------+
| size: N, USED | size: 1, FREE | size: M, PREV_FREE, USED |
+---------------+---------------+--------------------------+
```

The first block is reallocated to size N+1. As the next block is free and (block_size + next_block_size) = (N + 1) >= asked_size, the second block is removed from the freelist and the first block's size is adjusted:

```
+-------------------------------+--------------------------+
| size: N + 1, USED             | size: M, PREV_FREE, USED |
+-------------------------------+--------------------------+
```

The bug is right here: we should update the PREV_FREE flag of the new next block, but we didn't. This leads to a corruption of the invariants of the allocator (clearly detected by the check_invariants routine). What this ends up causing probably depends a lot on the access pattern of the QPS. This is too big a bug not to have been triggered regularly, but as direct access through the freelist is still fine, we are probably safe unless we end up reallocating or using blocks around the corrupted one.
As for the 0x1010101 value, well, here's the backtrace:

```
#0 qhat_flatten_leaf8 at qps-hat.in.c:110
#1 qhat_set_path8 at qps-hat.in.c:268
#2 qhat_set_path at qps-hat.h:286
```

We end up writing the byte value 1 repeatedly in a uint8_t array, at the wrong address, with a wrong len. My guess is that the bug is triggered by repeated +1 reallocs followed by repeated frees. We end up corrupting the headers, then using a wrong header when reallocating, and thus an invalid ptr & len when setting a simple value "1". This ends up corrupting a handle's list with the repeated 0x1010101 value, which can get copied around but leads to segfaults when dereferenced.

Change-Id: I4650d9666c5cab3af8ab824b98a300b9095ead8a
rip-it: 39cffc8 uprooted
nicopauss
added a commit
that referenced
this pull request
Jan 18, 2021
The Azure pipelines on Ubuntu seem to deadlock randomly. The issue can be reproduced on docker with an Ubuntu image. The deadlocks seem to occur inside the ASAN library, so they are very difficult to debug:

```
(gdb) thread apply all bt

Thread 2 (Thread 0x7f10ed3fc700 (LWP 11433)):
#0  0x00000000004bde50 in __sanitizer::BlockingMutex::Lock() ()
#1  0x00000000004355c0 in __sanitizer::SizeClassAllocator64<__asan::AP64<__sanitizer::LocalAddressSpaceView> >::GetFromAllocator(__sanitizer::AllocatorStats*, unsigned long, unsigned int*, unsigned long) ()
#2  0x00000000004354c3 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__asan::AP64<__sanitizer::LocalAddressSpaceView> > >::Refill(__sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__asan::AP64<__sanitizer::LocalAddressSpaceView> > >::PerClass*, __sanitizer::SizeClassAllocator64<__asan::AP64<__sanitizer::LocalAddressSpaceView> >*, unsigned long) ()
#3  0x0000000000435112 in __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator64<__asan::AP64<__sanitizer::LocalAddressSpaceView> >, __sanitizer::LargeMmapAllocatorPtrArrayDynamic>::Allocate(__sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__asan::AP64<__sanitizer::LocalAddressSpaceView> > >*, unsigned long, unsigned long) ()
#4  0x0000000000434ee1 in __sanitizer::QuarantineCache<__asan::QuarantineCallback>::Enqueue(__asan::QuarantineCallback, void*, unsigned long) ()
#5  0x0000000000434d53 in __asan::Allocator::QuarantineChunk(__asan::AsanChunk*, void*, __sanitizer::BufferedStackTrace*) ()
#6  0x00000000004a7692 in free ()
#7  0x00007f10f3f7ce51 in __pthread_attr_destroy (attr=<optimized out>) at pthread_attr_destroy.c:38
#8  0x00000000004c4654 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ()
#9  0x00000000004c4aaa in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ()
#10 0x00000000004b2b0e in __asan::AsanThread::SetThreadStackAndTls(__asan::AsanThread::InitOptions const*) ()
#11 0x00000000004b270d in __asan::AsanThread::Init(__asan::AsanThread::InitOptions const*) ()
#12 0x00000000004b2bd8 in __asan::AsanThread::ThreadStart(unsigned long long, __sanitizer::atomic_uintptr_t*) ()
#13 0x00007f10f479c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#14 0x00007f10f4007293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f10f202d980 (LWP 11432)):
#0  0x00000000004bd8a7 in __sanitizer::internal_sched_yield() ()
#1  0x0000000000492705 in pthread_create ()
#2  0x00000000010f664a in thr_create (thread=0x7fff14247480, attr=0x7fff142472e0, fn=0x10f3540 <thr_job_main>, arg=0x621000044d00) at src/core/thr.c:111
#3  0x00000000010ecfe2 in thr_fork_threads () at src/core/thr-job.blk:949
#4  0x00000000010ed910 in thr_initialize (arg=0x0) at src/core/thr-job.blk:1066
#5  0x0000000001063db1 in module_require (module=0x60f000000310, required_by=0x60f000000220) at src/core/module.c:282
#6  0x0000000001063a66 in module_require (module=0x60f000000220, required_by=0x0) at src/core/module.c:276
#7  0x0000000000a2e669 in z_qps_hat () at tests/zchk-hat.blk:394
#8  0x0000000000745aa6 in z_run () at src/core/z.blk:1181
#9  0x00000000008dd459 in main (argc=1, argv=0x7fff1424a2f8) at tests/zchk.c:1010
```

We never encountered such a deadlock on our buildbots. There is one difference between our buildbots and the Azure pipelines: on the buildbots we set the environment variable ASAN_OPTIONS to 'handle_segv=0:detect_leaks=1'. With this variable set, the deadlocks disappear. So let's use this variable for the Azure pipelines too.

Change-Id: I2c33526422717ddcbf808fd618e17b8f15532c17
rip-it: adb53e9
nicopauss
pushed a commit
that referenced
this pull request
Feb 12, 2026
Because our el_wake API didn't use explicit memory barriers between the
read/write synchronization points, TSAN reported data races such as the
one below:
```
WARNING: ThreadSanitizer: data race
Write of size 8 by main thread:
#0 close <null>
#1 el_fd_unregister ./src/core/el-epoll.in.c:109:13
#2 el_wake_unregister ./src/core/el.blk:1645:9
#3 el_unregister ./src/core/el.blk:1894:36
#4 dns_resolv_ctx_wipe ./src/net/addr.blk:239:5
#5 dns_resolv_ctx_delete ./src/net/addr.blk:245:1
#6 ____addr_info_async_block_invoke ./src/net/addr.blk:297:9
#7 el_wake_on_event ./src/core/el.blk:1607:9
#8 el_fd_fire ./src/core/el.blk:1307:13
#9 el_fds_loop ./src/core/el.blk:1461:17
#10 z_connect_ics_from_addr_and_wait ./tests/zchk-iop-rpc.c:316:13
#11 __z_iop_rpc_block_invoke_6 ./tests/zchk-iop-rpc.c:608:9
#12 z_iop_rpc ./tests/zchk-iop-rpc.c:640:7
#13 z_run ./src/core/z.blk:1545:9
#14 main ./tests/zchk.c:1206:12

Previous read of size 8 by thread T8:
#0 write <null>
#1 el_wake_fire ./src/core/el.blk:1660:5
#2 ____addr_info_async_block_invoke_2 ./src/net/addr.blk:331:9
#3 job_run ./src/core/thr-job.blk:281:9
#4 thr_run_deque_entry ./src/core/thr-job.blk:381:12
#5 thr_job_try_steal ./src/core/thr-job.blk:473:20
#6 thr_job_steal ./src/core/thr-job.blk:513:15
#7 thr_job_main ./src/core/thr-job.blk:874:13
#8 thr_hooks_wrapper ./src/core/thr.c:89:11
```
Indeed, even if the eventfd() API ensures the thread safety of the
write/read sequence itself, we still have to ensure that no further use
of the fd is possible before closing it.
Thus we add an explicit memory barrier in the form of an atomic counter.
Change-Id: I519147361d912e155341eec76f1fb4018699235c
Priv-Id: b1aef4b55b87f433f478950e7b79207846a0f623
nicopauss
pushed a commit
that referenced
this pull request
Feb 12, 2026
thr_queue_drain() seemed to assume that no other thread was running the
queue when it is called, and thus started with an `atomic_store(&q->running_on,
id)` without taking care of the current `running_on` value.
But when destroying a queue (`thr_queue_destroy`) TSAN reported this
race:
```
WARNING: ThreadSanitizer: data race
Write of size 8 by main thread:
#0 free <null>
#1 libc_free ./src/core/mem.blk:140:9
#2 mp_ifree ./src/core/mem.blk:351:5
#3 thr_queue_delete ./src/core/thr-job.blk:574:1
#4 thr_queue_drain ./src/core/thr-job.blk:618:9
#5 thr_queue_sync ./src/core/thr-job.blk:705:13
#6 thr_queue_destroy ./src/core/thr-job.blk:730:9
#7 test_queue ./tests/zchk-thrjob.blk:381:9
#8 __z_thrjobs_block_invoke_4 ./tests/zchk-thrjob.blk:647:9
#9 z_thrjobs ./tests/zchk-thrjob.blk:648:7
#10 z_run ./src/core/z.blk:1545:9
#11 main ./tests/zchk.c:1206:12

Previous atomic write of size 8 at 0x7210000001e0 by thread T28:
#0 thr_queue_drain ./src/core/thr-job.blk:614:5
#1 thr_queue_run ./src/core/thr-job.blk:624:5
#2 job_run ./src/core/thr-job.blk:285:9
#3 thr_run_deque_entry ./src/core/thr-job.blk:381:12
#4 thr_job_try_steal ./src/core/thr-job.blk:473:20
#5 thr_job_steal ./src/core/thr-job.blk:513:15
#6 thr_job_main ./src/core/thr-job.blk:884:25
#7 thr_hooks_wrapper ./src/core/thr.c:89:11
```
Indeed, when finishing draining the queue, the other thread would finish with:

```
do {
    […]
} while (!mpsc_queue_drain_end(&it, &thr_qnode_destroy));
atomic_compare_exchange_strong(&q->running_on, &id,
                               THR_QUEUE_NOT_RUNNING);
```
So after removing the last element of the queue (`mpsc_queue_drain_end`)
the queue's `running_on` is reset to THR_QUEUE_NOT_RUNNING.
But when destroying, the sole condition to immediately drain the queue
is:

```
if (mpsc_queue_push(&q->q, &n->qnode)) {
    thr_queue_drain(q);
```
So the race is obvious here: as soon as the queue is emptied, the
destroying thread could already be freeing the queue while the previous
thread is still trying to reset `q->running_on`.
To fix this, we now actually wait for the queue to be released before
draining it again.
Change-Id: I159d6426ec7ace01d0e7aaf685a7e32a6bf31749
Priv-Id: 07021192cda3217760c2ba85fc0fa12af17ef20f