[Support/BLAKE3] Make g_cpu_features thread safe#147948
Conversation
|
@llvm/pr-subscribers-llvm-support Author: Dmitry Vasilyev (slydiman) Changes
Full diff: https://github.com/llvm/llvm-project/pull/147948.diff 1 Files Affected:
diff --git a/llvm/lib/Support/BLAKE3/blake3_dispatch.c b/llvm/lib/Support/BLAKE3/blake3_dispatch.c
index e96e714225f41..6c156f123dd5b 100644
--- a/llvm/lib/Support/BLAKE3/blake3_dispatch.c
+++ b/llvm/lib/Support/BLAKE3/blake3_dispatch.c
@@ -14,6 +14,36 @@
#endif
#endif
+/* Atomic access abstraction (since MSVC does not do C11 yet) */
+#if defined(_MSC_VER) && !defined(__clang__)
+#if !defined(IS_X86)
+#include <intrin.h>
+#endif
+#pragma warning(disable : 5105)
+#ifndef FORCEINLINE
+#define FORCEINLINE inline __forceinline
+#endif
+typedef volatile long atomic32_t;
+static FORCEINLINE int32_t atomic_load32(atomic32_t *src) {
+ return _InterlockedOr(src, 0);
+}
+static FORCEINLINE void atomic_store32(atomic32_t *dst, int32_t val) {
+ _InterlockedExchange(dst, val);
+}
+#else
+#include <stdatomic.h>
+#ifndef FORCEINLINE
+#define FORCEINLINE inline __attribute__((__always_inline__))
+#endif
+typedef volatile _Atomic(int32_t) atomic32_t;
+static FORCEINLINE int32_t atomic_load32(atomic32_t *src) {
+ return atomic_load_explicit(src, memory_order_relaxed);
+}
+static FORCEINLINE void atomic_store32(atomic32_t *dst, int32_t val) {
+ atomic_store_explicit(dst, val, memory_order_relaxed);
+}
+#endif
+
#define MAYBE_UNUSED(x) (void)((x))
#if defined(IS_X86)
@@ -76,7 +106,7 @@ enum cpu_feature {
#if !defined(BLAKE3_TESTING)
static /* Allow the variable to be controlled manually for testing */
#endif
- enum cpu_feature g_cpu_features = UNDEFINED;
+ atomic32_t g_cpu_features = UNDEFINED;
LLVM_ATTRIBUTE_USED
#if !defined(BLAKE3_TESTING)
@@ -84,9 +114,10 @@ static
#endif
enum cpu_feature
get_cpu_features(void) {
-
- if (g_cpu_features != UNDEFINED) {
- return g_cpu_features;
+ enum cpu_feature _cpu_features;
+ _cpu_features = (enum cpu_feature)atomic_load32(&g_cpu_features);
+ if (_cpu_features != UNDEFINED) {
+ return _cpu_features;
} else {
#if defined(IS_X86)
uint32_t regs[4] = {0};
@@ -125,10 +156,11 @@ static
}
}
}
- g_cpu_features = features;
+ atomic_store32(&g_cpu_features, (int32_t)features);
return features;
#else
/* How to detect NEON? */
+ atomic_store32(&g_cpu_features, 0);
return 0;
#endif
}
|
|
This seems to have been fixed upstream: BLAKE3-team/BLAKE3@12823b8 Update instead? |
Done. Thanks. |
|
Why not update the entire BLAKE3? Wie generally do not want to fork third-party software, but update it as the need arises. Every LLVM-private change will make eventually updating it more difficult. This incliudes cherry-picked patches from the upstream repository. See also https://reviews.llvm.org/D121510#3383116 |
akyrtzi
left a comment
There was a problem hiding this comment.
Thank you for updating to latest BLAKE3!
Are the first 2 commits of this PR redundant now, could you merge down to one commit?
Otherwise LGTM.
Meinersbur
left a comment
There was a problem hiding this comment.
LGTM
@akyrtzi The LLVM repository only allows squashed merges. It even is encouraged to never force-push to a PR branch since it may cause GitHub to lose track of comments on source lines.
|
This causes linker error when linking both libblake3 and LLVM statically. |
This symbol was introduced in llvm#147948, but not prefixed, resulting in conflicts if libblake3 and LLVM are both linked statically into the same binary.
|
I think #148607 should fix that. |
This symbol was introduced in #147948, but not prefixed, resulting in conflicts if libblake3 and LLVM are both linked statically into the same binary.
Added by llvm#147948, blake3_xof_many and blake3_compress_subtree_wide cause conflicts when linking llvm and blake3 statically into the same binary. Similar to llvm#148607.
This was dropped in llvm#147948 and causes symbol conflicts if libblake3 is also linked.
This was dropped in #147948 and causes symbol conflicts if libblake3 is also linked.
g_cpu_featurescan be updated multiple times byget_cpu_features(), which reports a thread sanitizer error when used with multiple lld threads.This PR updates BLAKE3 to v1.8.2.