fix(agent-data-plane): support ARMv7 compile target#1866

Draft
thieman wants to merge 4 commits into main from thieman/linux-armv7-adp

thieman commented Jun 12, 2026

Summary

Adds build and smoke-test support for Agent Data Plane on the Datadog IoT Agent's current Linux ARMv7 hard-float package target.

The current IoT Agent ARMv7 packages map to Debian armhf / RPM armv7hl, so this PR targets Rust's armv7-unknown-linux-gnueabihf target.

Key changes

  • Allows stringtheory on 32-bit little-endian platforms and makes its inline capacity documentation/tests word-size aware.
  • Fixes 32-bit compile-time assumptions in resource-accounting and Unix ancillary credential sizing.
  • Disables jemalloc for ARM Linux ADP builds after local ARMv7 runtime testing found the global thp:never jemalloc config aborts before CLI startup.
  • Adds an ARMv7 hard-float target profile for the refactored ADP Docker build scripts.
  • Teaches local ADP image builds to use a bookworm-based Rust build image and --platform linux/arm/v7 for BUILD_TARGET=armv7-unknown-linux-gnueabihf, producing an actual ARMv7 image while cross-compiling from the build platform.
  • Makes ADP build-prep tolerate build images that already provide rustup.
  • Avoids staging bogus profiling helpers for ARMv7 internal/local images when ddprof-arm is unavailable.
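
As an illustration of the word-size-aware inline capacity mentioned in the first bullet, here is a minimal sketch. It assumes a hypothetical three-word inline layout with one byte reserved for a length tag; the actual stringtheory layout is not shown in this description, and `inline_capacity`, `INLINE_CAPACITY`, and `fits_inline` are illustrative names only.

```rust
use std::mem::size_of;

/// Inline capacity in bytes for a hypothetical three-word string
/// representation that reserves one byte for a length/tag.
/// (Assumed layout, not taken from stringtheory itself.)
pub const fn inline_capacity(word_bytes: usize) -> usize {
    3 * word_bytes - 1
}

/// Capacity for the target being compiled, derived from the pointer
/// width rather than from a hard-coded 64-bit constant.
pub const INLINE_CAPACITY: usize = inline_capacity(size_of::<usize>());

/// Returns true when `s` would fit inline on the current target.
pub fn fits_inline(s: &str) -> bool {
    s.len() <= INLINE_CAPACITY
}
```

Under that assumed layout the capacity is 23 bytes on a 64-bit target and 11 bytes on ARMv7, which is why documentation and tests that hard-code the 64-bit figure stop being valid on 32-bit.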

Out of scope for this draft

  • Release tarball fanout for linux/arm/v7 / armv7.
  • Public/internal multi-arch manifest fanout in CI.
  • Full IoT Agent deployment testing on physical ARMv7 hardware.

Validation

  • Registered ARMv7 binfmt/QEMU in local Colima with docker run --privileged --rm tonistiigi/binfmt --install arm.
  • Verified ARMv7 userspace execution: docker run --rm --platform linux/arm/v7 arm32v7/debian:bookworm-slim sh -c 'uname -m; dpkg --print-architecture; test -e /lib/ld-linux-armhf.so.3'.
  • Inspected current Datadog IoT Agent 7.80.3-1_armhf.deb; its main Agent binary is ARM EABI5 and only references up to GLIBC_2.8.
  • make fmt
  • cargo test --package stringtheory --package resource-accounting --package saluki-io
  • cargo check --workspace
  • cargo check --workspace --tests
  • BUILD_TARGET=armv7-unknown-linux-gnueabihf make build-adp-image-release
  • docker image inspect saluki-images/agent-data-plane:testing-release --format 'os={{.Os}} arch={{.Architecture}} variant={{.Variant}}' (output: os=linux arch=arm variant=v7)
  • docker run --rm --platform linux/arm/v7 saluki-images/agent-data-plane:testing-release version (output: v1.3.0-3abca41ef7)
  • ARMv7 disabled-mode startup smoke under QEMU/bookworm-compatible image:
docker run --rm --platform linux/arm/v7 \
  -e DD_DATA_PLANE_ENABLED=false \
  -e DD_DATA_PLANE_STANDALONE_MODE=false \
  -e DD_DATA_PLANE_REMOTE_AGENT_ENABLED=false \
  -e DD_DATA_PLANE_USE_NEW_CONFIG_STREAM_ENDPOINT=false \
  --entrypoint sh \
  saluki-images/agent-data-plane:testing-release \
  -euxc 'printf "{}\n" > /tmp/datadog.yaml; /usr/local/bin/agent-data-plane --config /tmp/datadog.yaml run 2>&1 | tee /tmp/adp-run.log; grep -E "Agent Data Plane starting|Agent Data Plane is not enabled" /tmp/adp-run.log'

Notes from local ARMv7 investigation

  • Docker/Colima did not initially have ARMv7 binfmt registered; ARMv7 containers failed with exec format error until installing tonistiigi/binfmt --install arm.
  • A default Ubuntu 24.04 GNU ARMv7 build produced a binary requiring GLIBC_2.38, which failed on Debian bookworm armhf. The local ARMv7 Makefile path now builds with rust:1.96-bookworm to keep the runtime glibc requirement compatible with bookworm.
  • A static glibc ARMv7 experiment removed dynamic glibc requirements but bus-errored under QEMU before CLI startup, so this PR keeps a dynamic GNU hard-float build.
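
The glibc mismatch in the second note comes down to comparing the highest GLIBC symbol version a binary references against the runtime's glibc. A rough sketch of that comparison follows; the function names are hypothetical, and the symbol list would come from a tool such as `objdump -T`, which is not part of this sketch.

```rust
/// Parses a symbol version such as "GLIBC_2.38" into (major, minor).
/// Any patch component ("GLIBC_2.2.5") is ignored, and non-numeric
/// versions such as "GLIBC_PRIVATE" yield None.
pub fn parse_glibc_version(sym: &str) -> Option<(u32, u32)> {
    let rest = sym.strip_prefix("GLIBC_")?;
    let mut parts = rest.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}

/// Highest GLIBC version referenced by a set of symbol versions.
pub fn max_required_glibc(symbols: &[&str]) -> Option<(u32, u32)> {
    symbols.iter().copied().filter_map(parse_glibc_version).max()
}

/// A binary can load only if the runtime glibc is at least as new as
/// the newest symbol version it references.
pub fn can_run_on(required: (u32, u32), runtime: (u32, u32)) -> bool {
    required <= runtime
}
```

Debian bookworm ships glibc 2.36, so a binary requiring GLIBC_2.38 fails `can_run_on((2, 38), (2, 36))`, while one that only references up to GLIBC_2.8 passes.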

@dd-octo-sts dd-octo-sts Bot added area/io General I/O and networking. area/memory Memory bounds and memory management. labels Jun 12, 2026

pr-commenter Bot commented Jun 12, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: cf03337 · Comparison: 1899d62 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 40.50 MiB (baseline) vs 40.50 MiB (comparison)
Size Change: +2.15 KiB (+0.01%)

✅ Binary size difference within threshold

Changes by Module
Module File Size Symbols
anon.d41443d9b4e8e19d69b001be3e706067.20.llvm.14570049942322470332 +13.81 KiB 1
anon.df25a89916046317e78fdabfd67ac98f.16.llvm.9921130204754883941 -13.81 KiB 1
anon.af6803e350cde8e8fde38a9f631ccf6b.10.llvm.15866020932687486436 +4.65 KiB 1
anon.5c6ebeb8a8dae37d9fd2e13c1c5baf91.8.llvm.7221663532278332205 -4.65 KiB 1
anon.be1ef3cfeb22a9397bdb440589661b1f.92.llvm.2955540337442524778 +3.92 KiB 1
anon.75571037cd0723bbe125d64ea26c3745.17.llvm.6756009048372103969 -3.91 KiB 1
anon.cccee0d713080b3a81e943fc589a7ed9.5.llvm.15146293729238759243 -3.51 KiB 1
anon.8dc588bb37061571a41a5004abbad7c7.404.llvm.15392843726178276514 +3.51 KiB 1
anon.582e16924be0b5360507e30233c02a75.345.llvm.10738046353486614011 +2.98 KiB 1
anon.582e16924be0b5360507e30233c02a75.345.llvm.11444254857378069292 -2.98 KiB 1
anon.b0664815aeef65ed84d11558fc99cdc5.3.llvm.13040176013401945301 -2.96 KiB 1
anon.eb9554d305715fb69e4f5784fa60396a.2.llvm.3040268805220341097 +2.96 KiB 1
anon.8077ff3c69f97319c2e42d2edd274628.133.llvm.7728992980506180869 +2.50 KiB 1
anon.57fede7ae0ad6de79d4970d61ee2c06a.277.llvm.5589318379930140287 -2.50 KiB 1
anon.fa437768b57c739f727cdfc2f52aef9a.771.llvm.4281978802789094180 +2.38 KiB 1
anon.4ff6ecd5a501c89600f3486c9656f7b1.24.llvm.16784999995774596236 -2.38 KiB 1
anon.4bf2db15733070e7ae37527381ac919c.86.llvm.16197326569580186361 +2.31 KiB 1
anon.289b3ad0cb3220b52cf9bdf05881984c.328.llvm.15707378778291209455 -2.31 KiB 1
anon.615a1611bd10a0e4d3c056e58c425776.6.llvm.2993557316941394640 +2.04 KiB 1
anon.016807120c200da2088cfb6c1dad9183.243.llvm.677954183733296466 -2.04 KiB 1
Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW] +13.8Ki  [NEW]     +82    anon.d41443d9b4e8e19d69b001be3e706067.20.llvm.14570049942322470332
  [NEW] +5.28Ki  [NEW]     +33    core::ptr::drop_in_place<http_body_util::combinators::map_err::MapErr<http_body_util::combinators::map_err::MapErr<http_body_util::combinators::map_frame::MapFrame<tonic::service::interceptor::ResponseBody<tonic::body::Body>,tonic::codec::decode::Streaming<datadog_protos::agent_include::datadog::remoteagent::v1::RegisterRemoteAgentResponse>::new<tonic::service::interceptor::ResponseBody<tonic::body::Body>,tonic_prost::codec::ProstDecoder<datadog_protos::agent_include::datadog::remoteagent::v1::RegisterRemoteAgentResponse>>::{{closure}}>,tonic::codec::decode::Streaming<datadog_protos::agent_include::datadog::remoteagent::v1::RegisterRemoteAgentResponse>::new<tonic::service::interceptor::ResponseBody<tonic::body::Body>,tonic_prost::codec::ProstDecoder<datadog_protos::agent_include::datadog::remoteagent::v1::RegisterRemoteAgentResponse>>::{{closure}}>,tonic::status::Status::map_error<tonic::statu
  [NEW] +4.65Ki  [NEW]     +74    anon.af6803e350cde8e8fde38a9f631ccf6b.10.llvm.15866020932687486436
  [NEW] +3.92Ki  [NEW]     +16    anon.be1ef3cfeb22a9397bdb440589661b1f.92.llvm.2955540337442524778
  [NEW] +3.51Ki  [NEW]     +80    anon.8dc588bb37061571a41a5004abbad7c7.404.llvm.15392843726178276514
  [NEW] +2.98Ki  [NEW] +2.89Ki    anon.582e16924be0b5360507e30233c02a75.345.llvm.10738046353486614011
  [NEW] +2.96Ki  [NEW]     +75    anon.eb9554d305715fb69e4f5784fa60396a.2.llvm.3040268805220341097
  [NEW] +2.95Ki  [NEW]     +61    core::ptr::drop_in_place<std::sync::poison::PoisonError<std::sync::poison::rwlock::RwLockReadGuard<quick_cache::shard::CacheShard<saluki_context::hash::ContextKey,saluki_context::context::Context,saluki_common::cache::weight::WrappedWeighter<saluki_common::cache::weight::ItemCountWeighter>,saluki_common::hash::NoopU64BuildHasher,saluki_common::cache::expiry::ExpiryCapableLifecycle<saluki_context::hash::ContextKey>,alloc::sync::Arc<quick_cache::sync_placeholder::Placeholder<saluki_context::context::Context>>>>>>::h939de28ceb0e893a
  [NEW] +2.80Ki  [NEW]    +129    core::ptr::drop_in_place<std::sync::poison::PoisonError<std::sync::poison::rwlock::RwLockWriteGuard<quick_cache::shard::CacheShard<saluki_context::hash::ContextKey,saluki_context::context::Context,saluki_common::cache::weight::WrappedWeighter<saluki_common::cache::weight::ItemCountWeighter>,saluki_common::hash::NoopU64BuildHasher,saluki_common::cache::expiry::ExpiryCapableLifecycle<saluki_context::hash::ContextKey>,alloc::sync::Arc<quick_cache::sync_placeholder::Placeholder<saluki_context::context::Context>>>>>>::h8e3217846b977450
  [NEW] +2.50Ki  [NEW]    +101    anon.8077ff3c69f97319c2e42d2edd274628.133.llvm.7728992980506180869
  +0.0% +2.14Ki  +0.0%    +792    [11330 Others]
  [DEL] -2.50Ki  [DEL]    -101    anon.57fede7ae0ad6de79d4970d61ee2c06a.277.llvm.5589318379930140287
  [DEL] -2.81Ki  [DEL]    -129    core::ptr::drop_in_place<std::sync::poison::PoisonError<std::sync::poison::rwlock::RwLockWriteGuard<quick_cache::shard::CacheShard<alloc::string::String,saluki_components::sources::otlp::metrics::cache::Extrema,saluki_common::cache::weight::WrappedWeighter<saluki_common::cache::weight::ItemCountWeighter>,foldhash::quality::RandomState,saluki_common::cache::expiry::ExpiryCapableLifecycle<alloc::string::String>,alloc::sync::Arc<quick_cache::sync_placeholder::Placeholder<saluki_components::sources::otlp::metrics::cache::Extrema>>>>>>::h866530fe0d3f15c7
  [DEL] -2.95Ki  [DEL]     -61    core::ptr::drop_in_place<std::sync::poison::PoisonError<std::sync::poison::rwlock::RwLockReadGuard<figment::figment::Figment>>>::ha2ee7fba0e18f97a
  [DEL] -2.96Ki  [DEL]     -75    anon.b0664815aeef65ed84d11558fc99cdc5.3.llvm.13040176013401945301
  [DEL] -2.98Ki  [DEL] -2.89Ki    anon.582e16924be0b5360507e30233c02a75.345.llvm.11444254857378069292
  [DEL] -3.51Ki  [DEL]     -80    anon.cccee0d713080b3a81e943fc589a7ed9.5.llvm.15146293729238759243
  [DEL] -3.91Ki  [DEL]     -16    anon.75571037cd0723bbe125d64ea26c3745.17.llvm.6756009048372103969
  [DEL] -4.65Ki  [DEL]     -74    anon.5c6ebeb8a8dae37d9fd2e13c1c5baf91.8.llvm.7221663532278332205
  [DEL] -5.27Ki  [DEL]     -33    core::ptr::drop_in_place<http_body_util::combinators::map_err::MapErr<http_body_util::combinators::map_err::MapErr<http_body_util::combinators::map_frame::MapFrame<tonic::service::interceptor::ResponseBody<tonic::body::Body>,tonic::codec::decode::Streaming<datadog_protos::agent_include::datadog::remoteagent::v1::RefreshRemoteAgentResponse>::new<tonic::service::interceptor::ResponseBody<tonic::body::Body>,tonic_prost::codec::ProstDecoder<datadog_protos::agent_include::datadog::remoteagent::v1::RefreshRemoteAgentResponse>>::{{closure}}>,tonic::codec::decode::Streaming<datadog_protos::agent_include::datadog::remoteagent::v1::RefreshRemoteAgentResponse>::new<tonic::service::interceptor::ResponseBody<tonic::body::Body>,tonic_prost::codec::ProstDecoder<datadog_protos::agent_include::datadog::remoteagent::v1::RefreshRemoteAgentResponse>>::{{closure}}>,tonic::status::Status::map_error<tonic::status::S
  [DEL] -13.8Ki  [DEL]     -82    anon.df25a89916046317e78fdabfd67ac98f.16.llvm.9921130204754883941
  +0.0% +2.15Ki  +0.0%    +792    TOTAL

# Conflicts:
#	docker/Dockerfile.agent-data-plane

pr-commenter Bot commented Jun 29, 2026

Regression Detector (Agent Data Plane)

Run ID: 284934ad-c27a-48e8-8852-e66cf2ec133c
Baseline: cf03337c · Comparison: 1899d627 · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (5)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

experiment goal Δ mean % links
quality_gates_rss_dsd_medium memory ⚪ +0.28 metrics profiles logs
quality_gates_rss_dsd_low memory ⚪ +0.05 metrics profiles logs
quality_gates_rss_idle memory ⚪ +0.01 metrics profiles logs
quality_gates_rss_dsd_ultraheavy memory ⚪ -0.02 metrics profiles logs
quality_gates_rss_dsd_heavy memory ⚪ -0.03 metrics profiles logs
Bounds Checks: ✅ Passed (5)
experiment check replicates observed links
quality_gates_rss_dsd_heavy memory_usage 10/10 ✅ 137 MiB ≤ 140 MiB metrics profiles logs
quality_gates_rss_dsd_low memory_usage 10/10 ✅ 42.4 MiB ≤ 50 MiB metrics profiles logs
quality_gates_rss_dsd_medium memory_usage 10/10 ✅ 65.5 MiB ≤ 75 MiB metrics profiles logs
quality_gates_rss_dsd_ultraheavy memory_usage 10/10 ✅ 192 MiB ≤ 200 MiB metrics profiles logs
quality_gates_rss_idle memory_usage 10/10 ✅ 28 MiB ≤ 40 MiB metrics profiles logs
Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
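
The decision rule described above can be sketched as a small classifier. This is only an approximation of the stated criteria: the `Goal` and `Verdict` types are hypothetical, and `smp_is_improvement` is an assumed counterpart to the `is_regression` flag, since the explanation does not name the improvement flag.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Goal {
    Cpu,
    Memory,
    IngressThroughput,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Verdict {
    Ignored,
    Regression,
    Improvement,
    Neutral,
}

/// Threshold on |delta mean %| taken from the explanation text.
pub const THRESHOLD_PCT: f64 = 5.0;

pub fn classify(
    goal: Goal,
    delta_mean_pct: f64,
    configured_erratic: bool,
    smp_is_regression: bool,
    smp_is_improvement: bool, // assumed flag name, not from the source
) -> Verdict {
    // Experiments configured `erratic: true` are skipped outright.
    if configured_erratic {
        return Verdict::Ignored;
    }
    if delta_mean_pct.abs() <= THRESHOLD_PCT {
        return Verdict::Neutral;
    }
    // CPU and memory get worse when they grow; throughput when it shrinks.
    let worse_is_increase = !matches!(goal, Goal::IngressThroughput);
    let moved_worse = (delta_mean_pct > 0.0) == worse_is_increase;
    if moved_worse && smp_is_regression {
        Verdict::Regression
    } else if !moved_worse && smp_is_improvement {
        Verdict::Improvement
    } else {
        Verdict::Neutral
    }
}
```

With the values in this run, every memory experiment sits well inside the ±5% band (the largest is +0.28), so each one classifies as neutral.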

pub(crate) mod state;

-#[cfg(all(target_os = "linux", not(system_allocator)))]
+#[cfg(all(target_os = "linux", not(target_arch = "arm"), not(system_allocator)))]

Note that target_arch = "arm" here matches only 32-bit ARM; it does not include arm64 (aarch64).
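
To make the distinction concrete, the predicate in the updated attribute can be modeled as an ordinary function and evaluated for several targets. This is a sketch only: it assumes the attribute gates the jemalloc setup, as the summary suggests, and `jemalloc_enabled` is a hypothetical helper rather than code from this PR.

```rust
/// Models the cfg predicate
/// `all(target_os = "linux", not(target_arch = "arm"), not(system_allocator))`
/// as a plain function. `arch` uses Rust's `target_arch` names, where
/// "arm" is 32-bit ARM and "aarch64" is 64-bit ARM.
pub fn jemalloc_enabled(os: &str, arch: &str, system_allocator: bool) -> bool {
    os == "linux" && arch != "arm" && !system_allocator
}

/// Evaluates the same predicate for the target this code was built for,
/// using the OS and architecture names exposed by the standard library.
pub fn jemalloc_enabled_here(system_allocator: bool) -> bool {
    jemalloc_enabled(std::env::consts::OS, std::env::consts::ARCH, system_allocator)
}
```

For example, `jemalloc_enabled("linux", "aarch64", false)` is true while `jemalloc_enabled("linux", "arm", false)` is false, so arm64 builds are unaffected by the change.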

thieman commented Jun 29, 2026

ARMv7 emulated Rust tests update

Validated Rust test binaries under ARMv7 emulation with an explicit QEMU runner:

docker run --rm \
  -v "$PWD:/work" \
  -v "$PWD/target/docker-cargo-home:/usr/local/cargo/registry" \
  -v "$PWD/target/docker-target:/work/target" \
  -w /work \
  -e CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_LINKER=arm-linux-gnueabihf-gcc \
  -e CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_RUNNER='qemu-arm -L /usr/arm-linux-gnueabihf' \
  -e CC_armv7_unknown_linux_gnueabihf=arm-linux-gnueabihf-gcc \
  -e CXX_armv7_unknown_linux_gnueabihf=arm-linux-gnueabihf-g++ \
  -e AR_armv7_unknown_linux_gnueabihf=arm-linux-gnueabihf-ar \
  -e RUSTFLAGS='--cfg tokio_unstable' \
  rust:1.96-bookworm \
  bash -euxo pipefail -c '
    apt-get update >/dev/null
    apt-get install -y --no-install-recommends \
      gcc-arm-linux-gnueabihf \
      g++-arm-linux-gnueabihf \
      libc6-dev-armhf-cross \
      qemu-user \
      pkg-config \
      cmake \
      protobuf-compiler \
      perl \
      make \
      file >/dev/null
    rustup target add armv7-unknown-linux-gnueabihf
    cargo test --package stringtheory --package resource-accounting --package saluki-io --target armv7-unknown-linux-gnueabihf
  '

Result: passed.

This initially exposed 32-bit layout assumptions in stringtheory and saluki-io tests; those tests now derive capacities/sizes from actual type layout instead of hard-coded 64-bit sizes.
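
As an example of deriving a size from layout instead of hard-coding it, the sketch below computes the control-message buffer space for Unix credentials as a function of word size. It is modeled on the shape of the C `CMSG_SPACE` macro and the usual Linux header layout, which is an assumption here; the real saluki-io code may rely on libc's own helpers, and all names below are hypothetical.

```rust
/// Rounds `len` up to a multiple of `align` (which must be a power of two).
pub const fn align_up(len: usize, align: usize) -> usize {
    (len + align - 1) & !(align - 1)
}

/// Size of a Linux `cmsghdr`-like header for a given word size:
/// one word-sized length field followed by two 32-bit integers.
pub const fn cmsg_header_len(word_bytes: usize) -> usize {
    word_bytes + 4 + 4
}

/// Buffer space for one control message carrying `payload_len` bytes,
/// with header and payload each aligned to the word size.
pub const fn cmsg_space(payload_len: usize, word_bytes: usize) -> usize {
    align_up(cmsg_header_len(word_bytes), word_bytes) + align_up(payload_len, word_bytes)
}

/// Three 32-bit fields (pid, uid, gid), as in Linux `struct ucred`.
pub const UCRED_LEN: usize = 12;

/// Credential buffer space for the target being compiled.
pub const CRED_SPACE: usize = cmsg_space(UCRED_LEN, std::mem::size_of::<usize>());
```

Under these assumptions the space is 32 bytes with 8-byte words but 24 bytes with 4-byte words, so a test asserting the 64-bit figure would fail on ARMv7 even though the code is correct.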

thieman commented Jun 29, 2026

ARMv7 standalone runtime smoke

Validated the full ADP image in standalone mode under ARMv7 QEMU emulation:

docker run --rm --platform linux/arm/v7 \
  -v "$CERT_DIR:/mnt/cert:ro" \
  -e DD_API_KEY=test-api-key \
  -e DD_HOSTNAME=armv7-standalone-smoke \
  -e DD_DD_URL=http://127.0.0.1:9 \
  -e DD_DATA_PLANE_ENABLED=true \
  -e DD_DATA_PLANE_STANDALONE_MODE=true \
  -e DD_DATA_PLANE_REMOTE_AGENT_ENABLED=false \
  -e DD_DATA_PLANE_USE_NEW_CONFIG_STREAM_ENDPOINT=false \
  --entrypoint sh \
  saluki-images/agent-data-plane:testing-release \
  -euxc '...
    timeout 15s /usr/local/bin/agent-data-plane --config /tmp/datadog.yaml run 2>&1 | tee /tmp/adp-standalone.log
    grep "Agent Data Plane running" /tmp/adp-standalone.log
    grep "Serving unprivileged API" /tmp/adp-standalone.log
    grep "Serving privileged API" /tmp/adp-standalone.log
    grep "DogStatsD listener started" /tmp/adp-standalone.log
    grep "Topology healthy. Waiting for interrupt" /tmp/adp-standalone.log
    ! grep -Ei "panic|panicked|bus error|segmentation fault|Failed to|No data pipelines" /tmp/adp-standalone.log
  '

Result: passed. The ARMv7 binary reached standalone mode, started the unprivileged and privileged APIs, started DogStatsD, reported topology healthy, and waited for interrupt with no panic/bus error/segfault/failure log.
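
The grep checks in the smoke script amount to "all expected markers present, no failure pattern present". A rough Rust equivalent is sketched below, using the marker strings from the script; `check_smoke_log` is a hypothetical helper, and it uses plain substring matching rather than grep's regular expressions.

```rust
/// Lines the standalone smoke expects to see in the log.
pub const REQUIRED: [&str; 5] = [
    "Agent Data Plane running",
    "Serving unprivileged API",
    "Serving privileged API",
    "DogStatsD listener started",
    "Topology healthy. Waiting for interrupt",
];

/// Failure patterns, lowercased to mimic the case-insensitive grep.
pub const FORBIDDEN: [&str; 6] = [
    "panic",
    "panicked",
    "bus error",
    "segmentation fault",
    "failed to",
    "no data pipelines",
];

/// Returns Ok(()) when every required marker is present and no
/// forbidden pattern appears anywhere in the log.
pub fn check_smoke_log(log: &str) -> Result<(), String> {
    for marker in REQUIRED.iter() {
        if !log.contains(*marker) {
            return Err(format!("missing expected line: {}", marker));
        }
    }
    let lowered = log.to_lowercase();
    for pattern in FORBIDDEN.iter() {
        if lowered.contains(*pattern) {
            return Err(format!("found failure pattern: {}", pattern));
        }
    }
    Ok(())
}
```

A check like this could run against the captured log in a test harness instead of chaining grep calls in the container shell.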

thieman commented Jun 30, 2026

Self-review follow-up

I ran a focused self-review plus fresh reviewer passes. Two concrete issues were found and fixed:

  1. stringtheory::PackedLengthCapacity was unsafe on 32-bit: the old packed usize representation would have limited each field to 16 bits on ARMv7 and could truncate production interner capacities. Fixed by preserving the existing packed single-usize representation on 64-bit and using explicit u32 capacity/length fields only on 32-bit. Added a regression test for multi-MiB values under ARMv7 emulation.
  2. ARMv7 internal/local images skipped ddprof because no ddprof-arm artifact exists, but also lacked /maybe-profile.sh. Fixed by staging a no-op wrapper for TARGETARCH=arm that simply execs the target command; amd64/arm64 behavior is unchanged.
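
The first fix above can be illustrated with a small sketch. It is not the stringtheory implementation: `LenCap` and `split_field_max` are hypothetical names, and the exact bit layout on 64-bit is assumed. The sketch shows why splitting one `usize` evenly breaks on 32-bit, and one way to keep a packed word on 64-bit while using explicit `u32` fields on 32-bit.

```rust
/// Largest value one field can hold if a single word of `usize_bits`
/// bits is split evenly between length and capacity.
pub const fn split_field_max(usize_bits: u32) -> u64 {
    (1u64 << (usize_bits / 2)) - 1
}

#[cfg(target_pointer_width = "64")]
mod repr {
    /// 64-bit: both 32-bit fields packed into one usize (assumed layout).
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct LenCap(usize);

    impl LenCap {
        pub const fn new(len: u32, cap: u32) -> Self {
            LenCap(((cap as usize) << 32) | (len as usize))
        }
        pub const fn len(self) -> u32 {
            self.0 as u32 // low 32 bits
        }
        pub const fn capacity(self) -> u32 {
            (self.0 >> 32) as u32 // high 32 bits
        }
    }
}

#[cfg(target_pointer_width = "32")]
mod repr {
    /// 32-bit: explicit fields, so neither value is cut to 16 bits.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct LenCap {
        len: u32,
        cap: u32,
    }

    impl LenCap {
        pub const fn new(len: u32, cap: u32) -> Self {
            LenCap { len, cap }
        }
        pub const fn len(self) -> u32 {
            self.len
        }
        pub const fn capacity(self) -> u32 {
            self.cap
        }
    }
}

pub use repr::LenCap;
```

With an even split, a 32-bit word leaves only 16 bits per field (a maximum of 65,535), far below a multi-MiB interner capacity; with the representation above, values such as `LenCap::new(3 << 20, 8 << 20)` round-trip on both pointer widths.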

Post-fix validation:

  • make fmt
  • Host: cargo test --package stringtheory --package resource-accounting --package saluki-io
  • Host: cargo check --workspace
  • Host: cargo check --workspace --tests
  • ARMv7 emulated Rust tests: cargo test --package stringtheory --package resource-accounting --package saluki-io --target armv7-unknown-linux-gnueabihf with CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_RUNNER='qemu-arm -L /usr/arm-linux-gnueabihf'
  • ARMv7 image build: BUILD_TARGET=armv7-unknown-linux-gnueabihf make build-adp-image-release
  • ARMv7 image inspect: os=linux arch=arm variant=v7
  • ARMv7 wrapper check: /ddprof absent, /maybe-profile.sh executable, wrapper successfully ran agent-data-plane version
  • ARMv7 standalone smoke reached Agent Data Plane running, served privileged/unprivileged APIs, started DogStatsD, and reported topology healthy with no panic/bus error/segfault/failure log.

Remaining intentional scope gap: CI/release fanout for publishing ARMv7 artifacts is still not added in this draft. This PR proves the local build/runtime/test path; the CI/release matrix can be added once we decide the exact artifact publishing shape.
