feat: Add NETopKV function. #1251
cb2af0c to 1f0778c
 */
void configure(const ITensor *predictions, const ITensor *targets, ITensor *output, const unsigned int k);

/** Static function to check if given info will lead to a valid configuration of @ref CPPTopKVKernel
This should be CpuTopKVKernel.
Done in next patchset
89cd2b8 to cd41038
e0090dc to 11238b6
} // namespace detail

void topkv_fp32_neon(const ITensor *in1, const ITensor *in2, ITensor *out, uint32_t k, const Window &win)
Can we change these to predictions & targets in all classes and functions? (Except the operator list.dox unfortunately due to the convention there). There is a mix-up of in1/src0 in different files.
done in next patch
tests/datasets/TopKVDataset.h
Outdated
#include "tests/framework/datasets/Datasets.h"

#include <type_traits>
Why is type_traits here? I think we only need , am I missing anything?
removed
tests/validation/NEON/TopKV.cpp
Outdated
TEST_SUITE(NEON)
TEST_SUITE(TopKVLayer)
// clang-format on
I don't think we need these two lines.
We need them so that the test name shows up under NEON/TopKVLayer, for example:
NEON/TopKVLayer/S32/RunLarge@Shape=1000,32000:K=4:DataType=S32
I meant the following lines:
// clang-format on
// INDENT-ON
This platform doesn't highlight them :/
Removed them in next patch
tests/validation/NEON/TopKV.cpp
Outdated
TEST_SUITE(TopKVLayer)

template <typename T>
using CPPTopKVLayerFixture = TopKVValidationFixture<Tensor, Accessor, CPPTopKV, T>;
I see that this was useful for comparing the two implementations, but we never put a CPP test suite in the NEON/ directory. The build system might fail to compile this test, because we might ignore (we may already be ignoring) any files under tests/validation/NEON when the neon=0 option is passed to SCons.
I think these tests are valuable. Let's keep them but put them under tests/validation/CPP/TopKV.cpp
src/cpu/kernels/CpuTopKVKernel.cpp
Outdated
// targets must match batch
// targets is expected to contain N elements (shape [N])
ARM_COMPUTE_RETURN_ERROR_ON_MSG(src1.num_dimensions() < 1, "targets must have at least 1 dimension");
I think we only need to copy what's already inside CPPTopKVKernel.cpp:
ARM_COMPUTE_RETURN_ERROR_ON(predictions->num_dimensions() > 2);
ARM_COMPUTE_RETURN_ERROR_ON(targets->num_dimensions() > 1);
Done in next patch
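For readers without the library at hand, the two dimension checks suggested in this thread can be sketched in standalone C++. This is only an illustration: the `Shape` alias and `validate_topkv_shapes` are hypothetical names, and a plain vector of extents stands in for the library's tensor info type.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a tensor shape: one extent per dimension.
using Shape = std::vector<std::size_t>;

// Mirrors the two checks quoted in the review comment.
// Returns an empty string when both shapes pass, otherwise a message.
std::string validate_topkv_shapes(const Shape &predictions, const Shape &targets)
{
    if (predictions.size() > 2)
    {
        return "predictions must have at most 2 dimensions";
    }
    if (targets.size() > 1)
    {
        return "targets must have at most 1 dimension";
    }
    return "";
}
```

With the workload quoted in this PR, `validate_topkv_shapes({1000, 32000}, {32000})` passes, while a 3D predictions shape or a 2D targets shape is rejected.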
11238b6 to b0f0bdb
* The Neon(TM) implementation of TopKV reduces execution time from 447.8 ms (CPP) to 11.65 ms for the same workload (F32, C=1000, N=32000, k=3, 6 threads), achieving an approximate 38× speedup. This gain comes from SIMD vectorization, removal of per-element branches, and a more efficient inner loop.
* Resolves ARMCL-1227.

Change-Id: Ifdf161ce4254dc5ecd57aff9ae22410facd31705
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
b0f0bdb to b94b4e9
The Neon(TM) implementation of TopKV reduces execution time from 447.8 ms (scalar CPP) to 11.65 ms for the same workload (F32, C=1000, N=32000, k=3, 6 threads), achieving an approximate 38× speedup. This gain comes from SIMD vectorization, removal of per-element branches, and a more efficient inner loop.
Resolves ARMCL-1227
Change-Id: Ifdf161ce4254dc5ecd57aff9ae22410facd31705
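As a rough illustration of what "removal of per-element branches" can look like, here is a minimal scalar sketch of a TopKV-style check. It is not the kernel from this PR. The function name `topkv_reference`, the row-major [sample][class] layout, and the strict "greater than" tie handling are all assumptions made for the example. For each sample it counts the scores that beat the target's score, adding the comparison result instead of branching on it, and reports 1 when fewer than k scores rank higher.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper, not the library's kernel.
// predictions: row-major [num_samples][num_classes] scores (assumed layout).
// targets:     one class index per sample.
// Returns 1 for each sample whose target class is among the k highest scores.
std::vector<std::uint8_t> topkv_reference(const std::vector<float>         &predictions,
                                          const std::vector<std::uint32_t> &targets,
                                          std::size_t                       num_classes,
                                          std::size_t                       k)
{
    assert(num_classes > 0);
    assert(predictions.size() == targets.size() * num_classes);

    std::vector<std::uint8_t> output(targets.size());
    for (std::size_t n = 0; n < targets.size(); ++n)
    {
        assert(targets[n] < num_classes);
        const float *row          = predictions.data() + n * num_classes;
        const float  target_score = row[targets[n]];

        // Count scores strictly greater than the target's score.
        // The comparison result (0 or 1) is added directly, so the loop body
        // has no data-dependent branch and is easy for a compiler to vectorize.
        std::size_t rank = 0;
        for (std::size_t c = 0; c < num_classes; ++c)
        {
            rank += static_cast<std::size_t>(row[c] > target_score);
        }
        output[n] = static_cast<std::uint8_t>(rank < k);
    }
    return output;
}
```

For two samples with scores {0.1, 0.7, 0.2, 0.0} and {0.5, 0.1, 0.3, 0.1} and targets {2, 0}, the first target has one score above it, so it is rejected for k=1 and accepted for k=2; the second target is the top score and is accepted in both cases.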