[MLIR][OpenMP] Updates to initial Taskloop Bounds Implementation #3
Stylie777 wants to merge 3 commits into
Conversation
- Force the first 3 entries to the StructArg to be the bounds info
- Ensure it will work when executing the tasks in parallel
tblah
left a comment
Looks great to me beyond some minor nits. I see O3 can optimize away the extra geps and stores from outlining so I think it is okay to leave them.
| PostOutlineCBTy PostOutlineCB; | ||
| BasicBlock *EntryBB, *ExitBB, *OuterAllocaBB; | ||
| SmallVector<Value *, 2> ExcludeArgsFromAggregate; | ||
| SetVector<Value *> Inputs, Outputs; |
There was a problem hiding this comment.
- I know the other members don't have documentation but I think it would be good to add something because I don't think the use of these is immediately obvious.
- I don't think we can use Outputs. IIRC there's an assertion somewhere that there are no live out values. Better to remove it.
There was a problem hiding this comment.
- Added a comment about the use of Inputs
- We still need some definition of Outputs somewhere, as the CodeExtractor's extractCodeRegion expects there to be a SetVector for both Inputs and Outputs. The API gives 2 options: one where you just pass the CEAC value, and another that also takes the inputs and outputs. I am happy to exclude the Outputs from the OutlineInfo struct, but a SetVector will then need to be created before extracting the code region in OpenMPIRBuilder::finalize.
| llvm::SmallVectorImpl<Instruction *> &ToBeDeleted, | ||
| OpenMPIRBuilder::InsertPointTy InnerAllocaIP, | ||
| const Twine &Name = "", bool AsPtr = true) { | ||
| const Twine &Name = "", bool AsPtr = true, bool Is64Bit = false) { |
There was a problem hiding this comment.
| const Twine &Name = "", bool AsPtr = true, bool Is64Bit = false) { | |
| const Twine &Name = "", bool AsPtr = true, IntegerType *IntTy = nullptr) { | |
| Builder.restoreIP(OuterAllocaIP); | |
| IntTy = IntTy ? IntTy : Builder.getInt32Ty(); |
More flexible.
| } else { | ||
| UseFakeVal = | ||
| cast<BinaryOperator>(Builder.CreateAdd(FakeVal, Builder.getInt32(10))); | ||
| cast<BinaryOperator>(Builder.CreateAdd(FakeVal, Is64Bit ? Builder.getInt64(10) : Builder.getInt32(10))); |
There was a problem hiding this comment.
You could do this from IntTy with llvm::ConstantInt::get or IRBuilderBase::getIntN
There was a problem hiding this comment.
Done with ConstantInt::get
| OI.ExcludeArgsFromAggregate.push_back(createFakeIntVal( | ||
| Builder, AllocaIP, ToBeDeleted, TaskloopAllocaIP, "global.tid", false)); | ||
| Value *FakeLB = createFakeIntVal(Builder, AllocaIP, ToBeDeleted, TaskloopAllocaIP, | ||
| "lb", false, true); |
There was a problem hiding this comment.
| "lb", false, true); | |
| "lb", /*AsPtr=*/false, Builder.getInt64Ty()); |
It isn't obvious what the bool values mean without some extra help
| "ub", false, true); | ||
| Value *FakeStep = createFakeIntVal(Builder, AllocaIP, ToBeDeleted, TaskloopAllocaIP, | ||
| "step", false, true); | ||
| /* For Taskloop, we want to force the bounds being the first 3 inputs in the aggregate struct*/ |
There was a problem hiding this comment.
nit: llvm style is to use C++ style comments
| // HasShareds is true if any variables are captured in the outlined region, | ||
| // false otherwise. | ||
| bool HasShareds = StaleCI->arg_size() > 1; | ||
| /* Create the casting for the Bounds Values that can be used when outlining to replace the uses of the fakes with real values */ |
| /*sizeof_task=*/TaskSize, /*sizeof_shared=*/SharedsSize, | ||
| /*task_func=*/&OutlinedFn}); | ||
|
|
||
| Value *Shareds = StaleCI->getArgOperand(1); |
There was a problem hiding this comment.
This could be moved above line 2035 to make that section clearer
There was a problem hiding this comment.
I've moved this above the declaration for ArgStructAlloca so this can use the Shareds variable rather than calling the getArgOperand function
| if (ConstantInt *CI = dyn_cast<ConstantInt>(Gep.getOperand(2))) { | ||
| switch (CI->getZExtValue()) { | ||
| case 0: | ||
| TaskLB = &I; |
There was a problem hiding this comment.
It would also be good to check that the value being indexed is the right one, not just the numeric value of the index.
There was a problem hiding this comment.
Added a check to make sure the GEP instruction being inspected uses Shareds as its first operand.
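The check discussed in this thread can be modelled outside LLVM. The following is only a sketch: `FieldAccess`, `BoundsSlots` and `classifyBounds` are hypothetical stand-ins for the GEP instructions and the TaskLB/TaskUB/TaskStep bookkeeping, not LLVM API. It shows why the base pointer has to be compared as well as the constant index.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a GEP: a base object plus a constant field index.
struct FieldAccess {
  const void *Base;
  unsigned Index;
};

// Hypothetical record of which accesses address the three bounds slots.
struct BoundsSlots {
  const FieldAccess *LB = nullptr;
  const FieldAccess *UB = nullptr;
  const FieldAccess *Step = nullptr;
};

// Accept an access only if it indexes the expected base (the shareds struct).
// An access with index 0 into some other aggregate is ignored.
BoundsSlots classifyBounds(const std::vector<FieldAccess> &Accesses,
                           const void *Shareds) {
  assert(Shareds && "expected a non-null shareds pointer");
  BoundsSlots Slots;
  for (const FieldAccess &A : Accesses) {
    if (A.Base != Shareds)
      continue;
    switch (A.Index) {
    case 0:
      Slots.LB = &A;
      break;
    case 1:
      Slots.UB = &A;
      break;
    case 2:
      Slots.Step = &A;
      break;
    default:
      break;
    }
  }
  return Slots;
}
```

Without the `Base` comparison, the first access in a list such as `{Other, 0}, {Shareds, 0}` would be taken as the lower bound even though it points into an unrelated aggregate.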
kaviya2510
left a comment
Thank you for the work @Stylie777.
Some changes are still needed in how the arguments are passed to the runtime call __kmpc_taskloop(..). Kindly address those.
Rest of the work looks good to me.
| @@ -2015,20 +2033,18 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTaskloop( | |||
| Value *TaskSize = Builder.getInt64( | |||
| divideCeil(M.getDataLayout().getTypeSizeInBits(Taskloop), 8)); | |||
There was a problem hiding this comment.
| divideCeil(M.getDataLayout().getTypeSizeInBits(Taskloop), 8)); | |
| divideCeil(M.getDataLayout().getTypeSizeInBits(Task), 8)); |
There was a problem hiding this comment.
As we are utilizing structArg to store the loop bound values, the struct __OMP_STRUCT_TYPE(Taskloop, kmp_task_info, false, VoidPtr, VoidPtr, Int32, VoidPtr, VoidPtr, Int64, Int64, Int64) is no longer needed.
The required size for storing loop bounds can be reserved in kmp_task_t by structArg itself.
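To illustrate the point about the aggregate reserving the space itself, here is a small standalone sketch. `TaskArgs` is a hypothetical layout (three 64-bit bounds followed by one captured pointer) and `divideCeilModel` is a local helper that performs a ceiling division; neither is taken from the patch or from LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical aggregate: the bounds occupy the first three slots and any
// captured values follow them.
struct TaskArgs {
  std::uint64_t LB;
  std::uint64_t UB;
  std::int64_t Step;
  void *Captured;
};

// Ceiling division, used here to convert a size in bits to whole bytes.
std::uint64_t divideCeilModel(std::uint64_t Num, std::uint64_t Den) {
  assert(Den != 0 && "division by zero");
  return (Num + Den - 1) / Den;
}

// Number of bytes an allocation for TaskArgs needs, computed from its bit size.
std::uint64_t taskArgsBytes() {
  return divideCeilModel(sizeof(TaskArgs) * 8, 8);
}

// The three bounds sit at fixed offsets at the front of the aggregate.
static_assert(offsetof(TaskArgs, LB) == 0, "lb is the first slot");
static_assert(offsetof(TaskArgs, UB) == 8, "ub is the second slot");
static_assert(offsetof(TaskArgs, Step) == 16, "step is the third slot");
```

Because the bounds are ordinary leading fields, the size computed for the aggregate already covers them and no separate bounds struct has to be sized.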
| Builder.CreateStore(Step_ext, step); | ||
| llvm::Value *loadstep = Builder.CreateLoad(Builder.getInt64Ty(), step); | ||
| llvm::Value *Lb = Builder.CreateStructGEP(ArgStructType, TaskShareds, 0); | ||
| Builder.CreateStore(CastedLBVal, Lb); |
There was a problem hiding this comment.
| Builder.CreateStore(CastedLBVal, Lb); | |
| auto *Idx0 = Builder.getInt32(0); | |
| llvm::Value *Lb = Builder.CreateGEP(ArgStructType, TaskShareds, {Idx0, Builder.getInt32(0)}); |
The values of lb, ub and step are already populated in StructArg. You can access them directly and pass the pointers to the runtime call __kmpc_taskloop(...)
| SharedsSize); | ||
| } | ||
| llvm::Value *Ub = Builder.CreateStructGEP(ArgStructType, TaskShareds, 1); | ||
| Builder.CreateStore(CastedUBVal, Ub); |
There was a problem hiding this comment.
Same here. Remove the store instruction.
| Builder.CreateMemCpy(TaskShareds, Alignment, Shareds, Alignment, | ||
| SharedsSize); | ||
| } | ||
| llvm::Value *Ub = Builder.CreateStructGEP(ArgStructType, TaskShareds, 1); |
There was a problem hiding this comment.
| llvm::Value *Ub = Builder.CreateStructGEP(ArgStructType, TaskShareds, 1); | |
| llvm::Value *Ub = Builder.CreateGEP(ArgStructType, TaskShareds, {Idx0, Builder.getInt32(1)}); |
There was a problem hiding this comment.
GEP to StructArg and get the upper bound value.
| llvm::Value *Ub = Builder.CreateStructGEP(ArgStructType, TaskShareds, 1); | ||
| Builder.CreateStore(CastedUBVal, Ub); | ||
|
|
||
| llvm::Value *Step = Builder.CreateStructGEP(ArgStructType, TaskShareds, 2); |
There was a problem hiding this comment.
| llvm::Value *Step = Builder.CreateStructGEP(ArgStructType, TaskShareds, 2); | |
| llvm::Value *Step = Builder.CreateGEP(ArgStructType, TaskShareds, {Idx0, Builder.getInt32(2)}); |
| Builder.CreateStore(CastedUBVal, Ub); | ||
|
|
||
| llvm::Value *Step = Builder.CreateStructGEP(ArgStructType, TaskShareds, 2); | ||
| Builder.CreateStore(CastedStepVal, Step); |
| @@ -2127,12 +2161,15 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTaskloop( | |||
| if (Add->getOpcode() == llvm::Instruction::Add) { | |||
There was a problem hiding this comment.
Tom raised a concern that this add instruction pattern might also match other unrelated add instructions, and we discussed this in my PR: llvm#166903 (comment)
He suggested looking at the wsloop and distribute implementations for guidance on how this is handled there. I have not had a chance to dig into that yet. Could you please check this once?
Comments at Stylie777#3
Using code/ideas from the x86 backend to optimize a select on a bitcast integer. The previous aarch64 approach was to individually extract the bits from the mask, which is kind of terrible.

https://rust.godbolt.org/z/576sndT66

```llvm
define void @if_then_else8(ptr %out, i8 %mask, ptr %if_true, ptr %if_false) {
start:
  %t = load <8 x i32>, ptr %if_true, align 4
  %f = load <8 x i32>, ptr %if_false, align 4
  %m = bitcast i8 %mask to <8 x i1>
  %s = select <8 x i1> %m, <8 x i32> %t, <8 x i32> %f
  store <8 x i32> %s, ptr %out, align 4
  ret void
}
```

turned into

```asm
if_then_else8: // @if_then_else8
  sub sp, sp, #16
  ubfx w8, w1, #4, #1
  and w11, w1, #0x1
  ubfx w9, w1, #5, #1
  fmov s1, w11
  ubfx w10, w1, #1, #1
  fmov s0, w8
  ubfx w8, w1, #6, #1
  ldp q5, q2, [x3]
  mov v1.h[1], w10
  ldp q4, q3, [x2]
  mov v0.h[1], w9
  ubfx w9, w1, #2, #1
  mov v1.h[2], w9
  ubfx w9, w1, #3, #1
  mov v0.h[2], w8
  ubfx w8, w1, #7, #1
  mov v1.h[3], w9
  mov v0.h[3], w8
  ushll v1.4s, v1.4h, #0
  ushll v0.4s, v0.4h, #0
  shl v1.4s, v1.4s, #31
  shl v0.4s, v0.4s, #31
  cmlt v1.4s, v1.4s, #0
  cmlt v0.4s, v0.4s, #0
  bsl v1.16b, v4.16b, v5.16b
  bsl v0.16b, v3.16b, v2.16b
  stp q1, q0, [x0]
  add sp, sp, #16
  ret
```

With this PR that instead emits

```asm
if_then_else8:
  adrp x8, .LCPI0_1
  dup v0.4s, w1
  ldr q1, [x8, :lo12:.LCPI0_1]
  adrp x8, .LCPI0_0
  ldr q2, [x8, :lo12:.LCPI0_0]
  ldp q4, q3, [x2]
  and v1.16b, v0.16b, v1.16b
  and v0.16b, v0.16b, v2.16b
  ldp q5, q2, [x3]
  cmeq v1.4s, v1.4s, #0
  cmeq v0.4s, v0.4s, #0
  bsl v1.16b, v2.16b, v3.16b
  bsl v0.16b, v5.16b, v4.16b
  stp q0, q1, [x0]
  ret
```

So substantially shorter. Instead of building the mask element-by-element, this approach (by virtue of not splitting) instead splats the mask value into all vector lanes, performs a bitwise and with powers of 2, and compares with zero to construct the mask vector.

cc rust-lang/rust#122376
cc llvm#175769
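The splat, and, compare sequence described above can be followed in scalar form. This is a sketch of the idea only, written as plain C++ rather than as the backend lowering; `Vec8`, `laneBit` and `selectByBitmask` are names made up for the illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec8 = std::array<std::int32_t, 8>;

// The per-lane constant: lane i is tested against 2^i.
std::uint32_t laneBit(std::size_t Lane) {
  assert(Lane < 8 && "lane out of range for an 8-bit mask");
  return 1u << Lane;
}

// Scalar model of an 8-lane select driven by an 8-bit mask.
Vec8 selectByBitmask(std::uint8_t Mask, const Vec8 &IfTrue,
                     const Vec8 &IfFalse) {
  Vec8 Out{};
  for (std::size_t Lane = 0; Lane < Out.size(); ++Lane) {
    std::uint32_t Splat = Mask;                 // every lane holds the full mask
    std::uint32_t Bit = Splat & laneBit(Lane);  // and with this lane's power of 2
    bool IsZero = (Bit == 0);                   // compare with zero
    // A zero result means the lane's bit was clear, so take the false operand.
    Out[Lane] = IsZero ? IfFalse[Lane] : IfTrue[Lane];
  }
  return Out;
}
```

Because the comparison is against zero, a set lane corresponds to the mask bit being clear, which is consistent with the operands of the final `bsl` instructions appearing in the opposite order from the old sequence.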
`SE.getUMaxExpr` causes assertion failure due to type mismatch here: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Analysis/LoopAccessAnalysis.cpp#L253 Running `opt -S -p loop-vectorize -debug-only=loop-vectorize llvm/test/Analysis/LoopAccessAnalysis/type-mismatch-in-scalar-evolution.ll ` without the changes made in LoopAccessAnalysis.cpp causes assertion failure. Attaching the stack dump for reference: ``` LV: Checking a loop in 'loop_contains_store_assumed_bounds' from input.ll LV: Loop hints: force=? width=4 interleave=0 LV: Found a loop: for.body LV: Found an induction variable. opt: /home/kshitij/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:3918: const llvm::SCEV* llvm::ScalarEvolution::getMinMaxExpr(llvm::SCEVTypes, llvm::SmallVectorImpl<const llvm::SCEV*>&): Assertion `getEffectiveSCEVType(Ops[i]->getType()) == ETy && "Operand types don't match!"' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug. Stack dump: 0. Program arguments: opt -S -passes=loop-vectorize -debug-only=loop-vectorize -force-vector-width=4 -disable-output input.ll 1. Running pass "function(loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>)" on module "input.ll" 2. 
Running pass "loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>" on function "loop_contains_store_assumed_bounds" #0 0x000058ee97c5e652 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/opt+0x4f44652) #1 0x000058ee97c5af0f llvm::sys::RunSignalHandlers() (/usr/local/bin/opt+0x4f40f0f) #2 0x000058ee97c5b05c SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0 #3 0x00007c49d4c45330 (/lib/x86_64-linux-gnu/libc.so.6+0x45330) llvm#4 0x00007c49d4c9eb2c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 llvm#5 0x00007c49d4c9eb2c __pthread_kill_internal ./nptl/pthread_kill.c:78:10 llvm#6 0x00007c49d4c9eb2c pthread_kill ./nptl/pthread_kill.c:89:10 llvm#7 0x00007c49d4c4527e raise ./signal/../sysdeps/posix/raise.c:27:6 llvm#8 0x00007c49d4c288ff abort ./stdlib/abort.c:81:7 llvm#9 0x00007c49d4c2881b _nl_load_domain ./intl/loadmsgcat.c:1177:9 llvm#10 0x00007c49d4c3b517 (/lib/x86_64-linux-gnu/libc.so.6+0x3b517) llvm#11 0x000058ee98003fdb llvm::ScalarEvolution::getMinMaxExpr(llvm::SCEVTypes, llvm::SmallVectorImpl<llvm::SCEV const*>&) (/usr/local/bin/opt+0x52e9fdb) llvm#12 0x000058ee98004507 llvm::ScalarEvolution::getUMaxExpr(llvm::SCEV const*, llvm::SCEV const*) (/usr/local/bin/opt+0x52ea507) llvm#13 0x000058ee980dc728 llvm::getStartAndEndForAccess(llvm::Loop const*, llvm::SCEV const*, llvm::Type*, llvm::SCEV const*, llvm::SCEV const*, llvm::ScalarEvolution*, llvm::DenseMap<std::pair<llvm::SCEV const*, llvm::Type*>, std::pair<llvm::SCEV const*, llvm::SCEV const*>, llvm::DenseMapInfo<std::pair<llvm::SCEV const*, llvm::Type*>, void>, llvm::detail::DenseMapPair<std::pair<llvm::SCEV const*, llvm::Type*>, std::pair<llvm::SCEV const*, llvm::SCEV const*>>>*, llvm::DominatorTree*, llvm::AssumptionCache*, std::optional<llvm::ScalarEvolution::LoopGuards>&) (/usr/local/bin/opt+0x53c2728) llvm#14 0x000058ee9814008b llvm::isDereferenceableAndAlignedInLoop(llvm::LoadInst*, llvm::Loop*, llvm::ScalarEvolution&, llvm::DominatorTree&, 
llvm::AssumptionCache*, llvm::SmallVectorImpl<llvm::SCEVPredicate const*>*) (/usr/local/bin/opt+0x542608b) llvm#15 0x000058ee9a0fa1ca llvm::LoopVectorizationLegality::canUncountableExitConditionLoadBeMoved(llvm::BasicBlock*) (/usr/local/bin/opt+0x73e01ca) llvm#16 0x000058ee9a0faee0 llvm::LoopVectorizationLegality::isVectorizableEarlyExitLoop() (/usr/local/bin/opt+0x73e0ee0) llvm#17 0x000058ee9a104678 llvm::LoopVectorizationLegality::canVectorize(bool) (/usr/local/bin/opt+0x73ea678) llvm#18 0x000058ee9a08c953 llvm::LoopVectorizePass::processLoop(llvm::Loop*) (/usr/local/bin/opt+0x7372953) llvm#19 0x000058ee9a090e21 llvm::LoopVectorizePass::runImpl(llvm::Function&) (/usr/local/bin/opt+0x7376e21) llvm#20 0x000058ee9a0914e0 llvm::LoopVectorizePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/opt+0x73774e0) llvm#21 0x000058ee99e419a5 llvm::detail::PassModel<llvm::Function, llvm::LoopVectorizePass, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) PassBuilderPipelines.cpp:0:0 llvm#22 0x000058ee97f18905 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/usr/local/bin/opt+0x51fe905) llvm#23 0x000058ee995d70d5 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) AMDGPUTargetMachine.cpp:0:0 llvm#24 0x000058ee97f17051 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/opt+0x51fd051) llvm#25 0x000058ee995d7775 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) AMDGPUTargetMachine.cpp:0:0 llvm#26 0x000058ee97f1783d llvm::PassManager<llvm::Module, 
llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/local/bin/opt+0x51fd83d) llvm#27 0x000058ee9c153909 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool, bool) (/usr/local/bin/opt+0x9439909) llvm#28 0x000058ee97c3f380 optMain (/usr/local/bin/opt+0x4f25380) llvm#29 0x00007c49d4c2a1ca __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3 llvm#30 0x00007c49d4c2a28b call_init ./csu/../csu/libc-start.c:128:20 llvm#31 0x00007c49d4c2a28b __libc_start_main ./csu/../csu/libc-start.c:347:5 llvm#32 0x000058ee97c309a5 _start (/usr/local/bin/opt+0x4f169a5) ``` This is caused by a type mismatch between `SE.getSCEV(DerefRK.IRArgValue)` and `DerefBytesSCEV`. Fixing this by extending them to the wider type.
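The shape of this fix can be sketched without ScalarEvolution. In the model below, `TypedValue` is a hypothetical stand-in for a typed expression, `umaxSameType` mimics an operation that asserts on mismatched operand types, and zero-extension is an assumption made for the illustration (the description above only says the operands are extended to the wider type).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a typed integer expression: a value and its width.
struct TypedValue {
  std::uint64_t Value;
  unsigned Bits;
};

// Zero-extend to a wider width; the value is unchanged, only the type grows.
TypedValue zeroExtendTo(TypedValue V, unsigned Bits) {
  assert(Bits >= V.Bits && "can only extend to a wider type");
  return {V.Value, Bits};
}

// Unsigned max that, like the assertion in the report, insists on equal widths.
TypedValue umaxSameType(TypedValue A, TypedValue B) {
  assert(A.Bits == B.Bits && "operand types don't match");
  return {std::max(A.Value, B.Value), A.Bits};
}

// Bring both operands to the wider of the two widths before taking the max.
TypedValue umaxWidened(TypedValue A, TypedValue B) {
  unsigned Wide = std::max(A.Bits, B.Bits);
  return umaxSameType(zeroExtendTo(A, Wide), zeroExtendTo(B, Wide));
}
```

Calling `umaxSameType` directly on a 32-bit and a 64-bit operand would trip the assertion, while `umaxWidened` normalises the widths first.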
…bols add' (llvm#188377) Context: lldb might crash when running to a debuggee crashing state and do a target symbols add command. Backtrace: ``` #0 0x000055ca6790dc65 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:848:11 #1 0x000055ca6790e434 PrintStackTraceSignalHandler(void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:931:1 #2 0x000055ca6790b839 llvm::sys::RunSignalHandlers() /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Signals.cpp:104:5 #3 0x000055ca6790ff6b SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:430:38 llvm#4 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0 llvm#5 0x00007fe9e5f25649 syscall /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/misc/../sysdeps/unix/sysv/linux/x86_64/syscall.S:38:0 llvm#6 0x00007fe9ec649170 SignalHandler(int, siginfo_t*, void*) /home/hyubo/osmeta/external/llvm-project/llvm/lib/Support/Unix/Signals.inc:429:7 llvm#7 0x00007fe9e5e44560 __restore_rt /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:13:0 llvm#8 0x00007fe9ebb77bf0 lldb_private::operator<(lldb_private::StackID const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackID.cpp:99:16 llvm#9 0x00007fe9ebb6863d CompareStackID(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:683:3 llvm#10 0x00007fe9ebb6d049 bool __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>::operator()<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, 
std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/predefined_ops.h:196:4 llvm#11 0x00007fe9ebb6cefe __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::__lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, __gnu_cxx::__ops::_Iter_comp_val<bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algobase.h:1464:8 llvm#12 0x00007fe9ebb6cdfc __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>> std::lower_bound<__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, 
std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(__gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, __gnu_cxx::__normal_iterator<std::shared_ptr<lldb_private::StackFrame>*, std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>>, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /mnt/gvfs/third-party2/libgcc/d1129753c8361ac8e9453c0f4291337a4507ebe6/11.x/platform010/5684a5a/include/c++/11.x/bits/stl_algo.h:2062:14 llvm#13 0x00007fe9ebb685fa auto llvm::lower_bound<std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)>(std::vector<std::shared_ptr<lldb_private::StackFrame>, std::allocator<std::shared_ptr<lldb_private::StackFrame>>>&, lldb_private::StackID const&, bool (*)(std::shared_ptr<lldb_private::StackFrame> const&, lldb_private::StackID const&)) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/STLExtras.h:2001:10 llvm#14 0x00007fe9ebb68441 lldb_private::StackFrameList::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/StackFrameList.cpp:697:11 llvm#15 0x00007fe9ebbee395 lldb_private::Thread::GetFrameWithStackID(lldb_private::StackID const&) /home/hyubo/osmeta/external/llvm-project/lldb/include/lldb/Target/Thread.h:459:7 llvm#16 0x00007fe9ebac7cf7 lldb_private::ExecutionContextRef::GetFrameSP() const 
/home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:643:25 llvm#17 0x00007fe9ebac80e1 lldb_private::GetStoppedExecutionContext(lldb_private::ExecutionContextRef const*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Target/ExecutionContext.cpp:164:34 llvm#18 0x00007fe9eb8903fa lldb_private::Statusline::Redraw(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Statusline.cpp:139:7 llvm#19 0x00007fe9eb7ac8be lldb_private::Debugger::RedrawStatusline(std::optional<lldb_private::ExecutionContextRef>) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1233:3 llvm#20 0x00007fe9eb804d1e lldb_private::IOHandlerEditline::RedrawCallback() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:446:3 llvm#21 0x00007fe9eb80aa81 lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2::operator()() const /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:262:73 llvm#22 0x00007fe9eb80aa5d void llvm::detail::UniqueFunctionBase<void>::CallImpl<lldb_private::IOHandlerEditline::IOHandlerEditline(lldb_private::Debugger&, lldb_private::IOHandler::Type, std::shared_ptr<lldb_private::File> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, std::shared_ptr<lldb_private::LockableStreamFile> const&, unsigned int, char const*, llvm::StringRef, llvm::StringRef, bool, bool, unsigned int, lldb_private::IOHandlerDelegate&)::$_2>(void*) /home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:213:5 llvm#23 0x00007fe9eb93bfbf llvm::unique_function<void ()>::operator()() 
/home/hyubo/osmeta/external/llvm-project/llvm/include/llvm/ADT/FunctionExtras.h:365:5 llvm#24 0x00007fe9eb93bb80 lldb_private::Editline::GetCharacter(wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:0:5 llvm#25 0x00007fe9eb941a18 lldb_private::Editline::ConfigureEditor(bool)::$_0::operator()(editline*, wchar_t*) const /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1287:5 llvm#26 0x00007fe9eb9419e2 lldb_private::Editline::ConfigureEditor(bool)::$_0::__invoke(editline*, wchar_t*) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1286:27 llvm#27 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:439:14 llvm#28 0x00007fe9f3384e26 el_getc /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:400:1 llvm#29 0x00007fe9f3384f90 read_getcmd /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:247:14 llvm#30 0x00007fe9f3384f90 el_gets /home/engshare/third-party2/libedit/3.1/src/libedit/src/read.c:586:14 llvm#31 0x00007fe9eb9409f3 lldb_private::Editline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Host/common/Editline.cpp:1636:16 llvm#32 0x00007fe9eb8044d7 lldb_private::IOHandlerEditline::GetLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>&, bool&) /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:339:5 llvm#33 0x00007fe9eb805609 lldb_private::IOHandlerEditline::Run() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/IOHandler.cpp:600:11 llvm#34 0x00007fe9eb7b214c lldb_private::Debugger::RunIOHandlers() /home/hyubo/osmeta/external/llvm-project/lldb/source/Core/Debugger.cpp:1280:16 llvm#35 0x00007fe9eb98f00f lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) 
/home/hyubo/osmeta/external/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:3620:16 llvm#36 0x00007fe9eb4f0e09 lldb::SBDebugger::RunCommandInterpreter(bool, bool) /home/hyubo/osmeta/external/llvm-project/lldb/source/API/SBDebugger.cpp:1234:42 llvm#37 0x000055ca6788d6b0 Driver::MainLoop() /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:677:3 llvm#38 0x000055ca6788e226 main /home/hyubo/osmeta/external/llvm-project/lldb/tools/driver/Driver.cpp:887:17 llvm#39 0x00007fe9e5e2c657 __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16 llvm#40 0x00007fe9e5e2c718 call_init /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:128:20 llvm#41 0x00007fe9e5e2c718 __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:379:5 llvm#42 0x000055ca67889a11 _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:118:0 Segmentation fault (core dumped) ``` When `target symbols add` is run, `Symtab::AddSymbol()` can reallocate the underlying `std::vector<Symbol>` and resize it, invalidating all existing Symbol* pointers. While `Process::Flush()` clears stale stack frames, the statusline caches its own `ExecutionContextRef` containing a `StackID` with a `SymbolContextScope*` (which can be a `Symbol*`). This cached reference is not cleared by `Process::Flush()`, so the next statusline redraw accesses a dangling pointer and crashes. Fix this by adding `Statusline::Flush()` which clears the cached frame, `Debugger::Flush()` which forwards to it under the statusline mutex, and calling `Debugger::Flush()` from `Process::Flush()` so that all flush paths (symbol add, exec, module load) also invalidate the statusline's stale state. 
After this fix, lldb is not crashing anymore, new symbols from a symbol file are correctly loaded --------- Co-authored-by: George Hu <georgehuyubo@gmail.com>
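The failure mode and the flush-based fix can be reproduced in miniature. The sketch below is a simplified model, not lldb code: `SymbolTable`, `StatusCache` and `addSymbol` are hypothetical names, and the cache holds a pointer into a `std::vector` in the same way the statusline holds a reference that ultimately points at a `Symbol`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical symbol table whose storage may reallocate when symbols are added.
struct SymbolTable {
  std::vector<std::string> Symbols;
};

// Hypothetical cache that remembers a pointer into the table, like a status
// line remembering the frame it last drew.
class StatusCache {
  const std::string *Cached = nullptr;

public:
  void remember(const std::string &S) { Cached = &S; }
  void flush() { Cached = nullptr; }
  bool hasCached() const { return Cached != nullptr; }
  // Callers check hasCached() first; after flush() there is nothing to read.
  const std::string &get() const {
    assert(Cached && "cache was flushed");
    return *Cached;
  }
};

// Adding a symbol may move every element, so the cache is flushed first.
void addSymbol(SymbolTable &Table, StatusCache &Cache, std::string Name) {
  Cache.flush();
  Table.Symbols.push_back(std::move(Name));
}
```

If `addSymbol` skipped the `flush()` call, a later `get()` could read through a pointer into storage that `push_back` has already released.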
llvm#183506 revealed a pre-existing use-after-scope in createInstrInfo (MSan bot: https://lab.llvm.org/buildbot/#/builders/164/builds/21562 [*]). This patch fixes the issue by changing the stack-allocated AArch64Subtarget (which goes out of scope once createInstrInfo() returns) into heap-allocated, allowing it to be safely stored in the returned AArch64InstrInfo.

-----

[*] WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55555666fabd in llvm::AArch64InstrInfo::getInstSizeInBytes(llvm::MachineInstr const&) const /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:247:5
... /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:85:3
#9 0x555556508559 in InstSizes_MOVaddrTagged_Test::TestBody() /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:301:3
...
Member fields were destroyed
#0 0x555556498a1d in __sanitizer_dtor_callback_fields /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1074:5
#1 0x5555564fbda6 in ~Triple /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/TargetParser/Triple.h:348:12
#2 0x5555564fbda6 in ~Triple /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/TargetParser/Triple.h:47:7
#3 0x5555564fbda6 in llvm::AArch64Subtarget::~AArch64Subtarget() /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Target/AArch64/AArch64Subtarget.h:38:7
#4 0x555556503396 in (anonymous namespace)::createInstrInfo(llvm::TargetMachine*) /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:38:1
#5 0x5555565084cb in InstSizes_MOVaddrTagged_Test::TestBody() /home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:299:42
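The lifetime problem has a compact general form: a factory returns an object that refers to one of the factory's locals. The sketch below shows one way to keep the referent alive by heap-allocating it. `Subtarget`, `InstrInfo`, `OwnedInstrInfo` and `createInstrInfoModel` are hypothetical stand-ins, and how the actual test stores the heap-allocated subtarget is not shown in the description above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the subtarget and the object that refers to it.
struct Subtarget {
  std::string CPU;
};

struct InstrInfo {
  const Subtarget &ST;
  explicit InstrInfo(const Subtarget &S) : ST(S) {}
};

// Bundle that keeps the subtarget alive for as long as the info that refers
// to it. Members are destroyed in reverse order, so II goes before ST.
struct OwnedInstrInfo {
  std::unique_ptr<Subtarget> ST;
  std::unique_ptr<InstrInfo> II;
};

// The subtarget lives on the heap, so the reference inside InstrInfo stays
// valid after this function returns. A local Subtarget here would not.
OwnedInstrInfo createInstrInfoModel(std::string CPU) {
  OwnedInstrInfo Result;
  Result.ST = std::make_unique<Subtarget>(Subtarget{std::move(CPU)});
  Result.II = std::make_unique<InstrInfo>(*Result.ST);
  return Result;
}
```

Moving the returned bundle only moves the two pointers, so the address held by the `InstrInfo` reference remains the address of the heap object.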
This approach was agreed in the original Taskloop PR here: llvm#166903 (comment). This uses the structArg to transfer the data between outlined functions to better support the bounds in Taskloop.