feat(agent-data-plane): emit point telemetry #1638

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 7 commits into main from travis/dadp-71-point-telemetry
May 14, 2026
Conversation

@thieman commented May 13, 2026 (Contributor)

Summary

Emit ADP-side point telemetry for metric payloads so Core Agent can consume point.sent and point.dropped via Remote Agent Registry.

Key changes:

  • Track metric data point counts through payload and transaction metadata.
  • Count point sends on successful Datadog forwarder delivery.
  • Count point drops for permanent HTTP/send failures and retry queue drops.
  • Emit the ADP point telemetry instruments as gauges while preserving cumulative semantics.
  • Expose ADP point telemetry through RAR remapping as point.sent / point.dropped with the domain tag. RAR handles remote-agent attribution.
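
The counting scheme in the list above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `TransactionMetadata`, `Outcome`, `PointTelemetry`, and `record` are hypothetical names, and the real change goes through saluki's telemetry instruments rather than bare atomics. It only illustrates a per-transaction data point count being added to a cumulative sent or dropped total, with the running total being the value reported as a gauge.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the metadata carried by each transaction.
#[derive(Clone, Copy, Debug)]
pub struct TransactionMetadata {
    /// Number of metric data points encoded into the payload.
    pub data_points: u64,
}

/// Hypothetical outcome of trying to deliver one transaction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Outcome {
    /// The forwarder delivered the payload successfully.
    Delivered,
    /// Permanent HTTP/send failure, or the retry queue dropped the entry.
    PermanentlyDropped,
    /// Assumed transient failure: the transaction is retried, so nothing is counted yet.
    WillRetry,
}

/// Cumulative totals. Publishing the current total as a gauge value keeps
/// cumulative semantics, because the value only ever grows.
#[derive(Default)]
pub struct PointTelemetry {
    sent: AtomicU64,
    dropped: AtomicU64,
}

impl PointTelemetry {
    /// Adds the transaction's data point count to the matching total.
    pub fn record(&self, metadata: &TransactionMetadata, outcome: Outcome) {
        match outcome {
            Outcome::Delivered => {
                self.sent.fetch_add(metadata.data_points, Ordering::Relaxed);
            }
            Outcome::PermanentlyDropped => {
                self.dropped.fetch_add(metadata.data_points, Ordering::Relaxed);
            }
            Outcome::WillRetry => {}
        }
    }

    /// Current cumulative number of points sent, as it would be reported in a gauge.
    pub fn sent_gauge(&self) -> u64 {
        self.sent.load(Ordering::Relaxed)
    }

    /// Current cumulative number of points dropped, as it would be reported in a gauge.
    pub fn dropped_gauge(&self) -> u64 {
        self.dropped.load(Ordering::Relaxed)
    }
}
```

In this sketch a transaction that will be retried contributes to neither total until it is finally delivered or permanently dropped, which is one plausible reading of the "permanent" qualifier in the list.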

Change Type

  • [ ] Bug fix
  • [x] New feature
  • [ ] Non-functional (chore, refactoring, docs)
  • [ ] Performance

How did you test this PR?

  • make fmt
  • cargo check --workspace
  • cargo check --workspace --tests
  • cargo check -p saluki-components --lib
  • cargo check -p saluki-components --tests
  • cargo check -p saluki-io --lib
  • cargo test -p agent-data-plane state::metrics::tests::test_render_rar_telemetry --bin agent-data-plane
  • cargo test -p saluki-components common::datadog::request_builder --lib
  • cargo test -p saluki-components common::datadog::request_builder::tests::split_oversized_request_tracks_data_points_written --lib
  • cargo test -p saluki-components common::datadog::transaction::tests::basic_transaction_ser_deser_roundtrip --lib
  • cargo test -p saluki-components encoders::datadog::metrics::tests::input_data_point_count_tracks_metric_values --lib
  • cargo test -p saluki-io net::util::retry::queue --lib
  • Pre-commit checks run by git commit:
    • Rust formatting
    • Cargo.toml formatting
    • Clippy lints
    • third-party license freshness
    • dependency advisories/license/source checks
    • API documentation build

References

  • DADP-71

dd-octo-sts Bot added the area/core, area/io, area/components, encoder/datadog-events, encoder/datadog-logs, encoder/datadog-metrics, encoder/datadog-service-checks, encoder/datadog-stats, encoder/datadog-traces, and forwarder/datadog labels May 13, 2026
pr-commenter Bot commented May 13, 2026

Binary Size Analysis (Agent Data Plane)

Target: 3ad3408 (baseline) vs 987b2eb (comparison) diff
Analysis Type: Stripped binaries (debug symbols excluded)
Baseline Size: 37.15 MiB
Comparison Size: 37.23 MiB
Size Change: +81.33 KiB (+0.21%)
Pass/Fail Threshold: +5%
Result: PASSED ✅
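
As a quick arithmetic check on the figures above, the reported percentage follows from dividing the size change by the baseline size. The sketch below is illustrative only: `size_change_percent` and `passes_threshold` are hypothetical helpers, not part of the size-analysis tooling, and the pass condition is an assumed reading of the threshold line.

```rust
/// Percentage change relative to a baseline, with both sizes in KiB.
pub fn size_change_percent(baseline_kib: f64, delta_kib: f64) -> f64 {
    delta_kib / baseline_kib * 100.0
}

/// Assumed rule: the check passes when growth does not exceed the threshold.
pub fn passes_threshold(change_percent: f64, threshold_percent: f64) -> bool {
    change_percent <= threshold_percent
}
```

With a baseline of 37.15 MiB (37.15 × 1024 = 38041.6 KiB) and a change of 81.33 KiB, `size_change_percent(38041.6, 81.33)` comes out at roughly 0.21, well under the 5% threshold.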

Changes by Module

| Module | File Size | Symbols |
| --- | --- | --- |
| saluki_components::config_registry::datadog | +43.22 KiB | 3 |
| core | -32.04 KiB | 1849 |
| [sections] | +15.17 KiB | 8 |
| figment | +10.65 KiB | 94 |
| alloc | +8.66 KiB | 116 |
| bytes | +6.95 KiB | 8 |
| saluki_components::common::datadog | +6.77 KiB | 51 |
| tonic | +6.54 KiB | 50 |
| serde_core | +5.77 KiB | 69 |
| saluki_components::forwarders::otlp | -5.28 KiB | 3 |
| hashbrown | +4.79 KiB | 29 |
| tokio_util | -4.30 KiB | 11 |
| saluki_config::GenericConfiguration::as_typed | -3.87 KiB | 16 |
| tower | +3.77 KiB | 46 |
| [Unmapped] | +3.64 KiB | 1 |
| http_body_util | -3.53 KiB | 14 |
| saluki_io::net::server | -3.45 KiB | 2 |
| std | +3.09 KiB | 52 |
| hyper_util | -3.07 KiB | 17 |
| serde | -2.98 KiB | 6 |

Detailed Symbol Changes

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +1.1% +87.4Ki  +1.0% +62.6Ki    [8617 Others]
  [NEW] +41.6Ki  [NEW] +41.5Ki    saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::hc63512f5c44efe66
  [NEW] +41.6Ki  [NEW] +41.5Ki    saluki_components::config_registry::datadog::SUPPORTED_ANNOTATIONS::_{{closure}}::h23d11e78cfbe0815
  +226% +12.7Ki  +230% +12.7Ki    h2::proto::connection::DynConnection<B>::recv_frame::hfe9e3743004c5b69
  [NEW] +9.96Ki  [NEW] +9.82Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::h8a9881bd2979145e
  [NEW] +9.46Ki  [NEW] +9.32Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::h9f87a7461f01f41a
  [NEW] +8.88Ki  [NEW] +8.74Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::hd41d0998474d20fe
  +5.2% +8.19Ki  +5.2% +8.19Ki    [section .rodata]
  [NEW] +8.14Ki  [NEW] +8.02Ki    agent_data_plane::state::metrics::rules::dogstatsd::get_dogstatsd_remappings::hd9a85f5e0d91a339
  [NEW] +7.63Ki  [NEW] +7.48Ki    saluki_io::net::util::retry::queue::persisted::PersistedQueue<T>::push::_{{closure}}::h141b42adfcb26889
  [NEW] +7.51Ki  [NEW] +7.37Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::h895c688f9d390108
  [DEL] -7.46Ki  [DEL] -7.32Ki    saluki_io::net::util::retry::queue::persisted::PersistedQueue<T>::push::_{{closure}}::hd64b90a811288913
  [DEL] -7.66Ki  [DEL] -7.52Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::h5f1bd23083011079
 -63.8% -8.79Ki -64.4% -8.79Ki    saluki_components::transforms::trace_obfuscation::sql::obfuscate_sql_string::h390ed7ef0924ab0a
  [DEL] -8.84Ki  [DEL] -8.70Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::h951127abd2c7ac8f
  [DEL] -9.32Ki  [DEL] -9.18Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::h69967d59221b5eec
  [DEL] -9.38Ki  [DEL] -9.23Ki    _<tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::hc112d540227d69ee
  [DEL] -13.7Ki  [DEL] -13.6Ki    prost::encoding::message::encode::h316276b7ccf4a4e6
 -96.7% -14.4Ki -97.9% -14.4Ki    _<otlp_protos::otlp_include::opentelemetry::proto::metrics::v1::ResourceMetrics as prost::message::Message>::encode_raw::h040a4e27c338283f
  [DEL] -40.8Ki  [DEL] -40.7Ki    saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::hbf856a9e9cd12bd2
 -99.7% -41.4Ki -99.9% -41.4Ki    core::ops::function::FnOnce::call_once::h734b67589c43d342
  +0.2% +81.3Ki  +0.2% +56.4Ki    TOTAL

pr-commenter Bot commented May 13, 2026

Regression Detector (Agent Data Plane)

Run ID: adf8d1dd-359c-420b-906d-a15ade5167e6
Baseline: 3ad3408f · Comparison: 987b2eb3 · Diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % | links |
| --- | --- | --- | --- |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ +6.47 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ +4.16 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ +3.55 | metrics profiles logs |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ +2.45 | metrics profiles logs |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ +1.80 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.66 | metrics profiles logs |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ +0.61 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ +0.45 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ +0.41 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ +0.41 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.40 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ +0.31 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ +0.24 | metrics profiles logs |
| quality_gates_rss_idle | memory | ⚪ +0.21 | metrics profiles logs |
| quality_gates_rss_dsd_heavy | memory | ⚪ +0.20 | metrics profiles logs |
| otlp_ingest_traces_5mb_memory | memory | ⚪ +0.20 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ +0.13 | metrics profiles logs |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ -0.12 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ -0.06 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ +0.03 | metrics profiles logs |
| quality_gates_rss_dsd_low | memory | ⚪ +0.01 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ +0.00 | metrics profiles logs |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ +0.02 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ +0.07 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ -0.09 | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ -0.14 | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory | ⚪ -0.17 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ -0.33 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ +0.90 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ -1.21 | metrics profiles logs |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ -1.98 | metrics profiles logs |

Bounds Checks: ✅ Passed (5)

| experiment | check | replicates | observed | links |
| --- | --- | --- | --- | --- |
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 ✅ | 122 MiB ≤ 140 MiB | metrics profiles logs |
| quality_gates_rss_dsd_low | memory_usage | 10/10 ✅ | 39.6 MiB ≤ 50 MiB | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 ✅ | 61.1 MiB ≤ 75 MiB | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 ✅ | 178 MiB ≤ 200 MiB | metrics profiles logs |
| quality_gates_rss_idle | memory_usage | 10/10 ✅ | 26.9 MiB ≤ 40 MiB | metrics profiles logs |

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
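
The decision rule in the explanation above can be written out as a small function. This is a sketch under stated assumptions, not the detector's actual code: `Experiment`, `Goal`, `Verdict`, and `classify` are hypothetical names, and `smp_is_improvement` is an assumed counterpart to the `is_regression` flag that the text mentions.

```rust
/// Threshold on |Δ mean %| taken from the explanation text.
pub const THRESHOLD_PCT: f64 = 5.0;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Goal {
    Cpu,
    Memory,
    Throughput,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Verdict {
    Improvement,
    Regression,
    Neutral,
    Ignored,
}

/// Hypothetical per-experiment record.
pub struct Experiment {
    pub goal: Goal,
    pub delta_mean_pct: f64,
    /// `erratic: true` in the experiment configuration, shown as "(ignored)".
    pub configured_erratic: bool,
    /// SMP's `is_regression` flag.
    pub smp_is_regression: bool,
    /// Assumed matching flag for the improving direction.
    pub smp_is_improvement: bool,
}

pub fn classify(e: &Experiment) -> Verdict {
    // Experiments configured as erratic are skipped outright.
    if e.configured_erratic {
        return Verdict::Ignored;
    }
    // Changes at or below the threshold are neutral regardless of direction.
    if e.delta_mean_pct.abs() <= THRESHOLD_PCT {
        return Verdict::Neutral;
    }
    // Lower CPU or memory is better; lower ingress throughput is worse.
    let got_worse = match e.goal {
        Goal::Cpu | Goal::Memory => e.delta_mean_pct > 0.0,
        Goal::Throughput => e.delta_mean_pct < 0.0,
    };
    match (got_worse, e.smp_is_regression, e.smp_is_improvement) {
        (true, true, _) => Verdict::Regression,
        (false, _, true) => Verdict::Improvement,
        _ => Verdict::Neutral,
    }
}
```

Under this reading, `otlp_ingest_logs_5mb_memory` at +6.47 is ignored despite exceeding 5% because it is configured as erratic, while every other row in the table stays neutral because its change is below the threshold.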

Comment thread bin/agent-data-plane/src/state/metrics/rules/transaction.rs Outdated
Comment thread lib/saluki-components/src/common/datadog/io.rs Outdated
Comment thread lib/saluki-components/src/common/datadog/request_builder.rs Outdated
Comment thread lib/saluki-components/src/common/datadog/telemetry.rs Outdated
Comment thread lib/saluki-io/src/net/util/retry/queue/mod.rs Outdated
// Listen for transactions to forward, and send a copy of each one to each endpoint I/O task.
while let Some(transaction) = transactions_rx.recv().await {
    for (endpoint_url, endpoint_tx) in &endpoint_txs {
        if endpoint_tx.send(transaction.clone()).await.is_err() {

Do we need to account for this as a permanently dropped transaction and increment point.dropped? Right now we only log it below.

@thieman commented May 14, 2026 (Contributor, Author)

[GPT 5.5] Agreed — this is a permanent drop for that endpoint because the endpoint I/O task receiver is gone, so the transaction will never be enqueued or retried there. Fixed in e462a4c by tracking it via track_permanently_failed_transaction(transaction.metadata(), None, endpoint_domain) when the fanout send fails.
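
The accounting discussed in this thread can be sketched as follows. This is a simplified illustration, not the fix in e462a4c: it uses the synchronous `std::sync::mpsc` channel in place of the async channel in the snippet above, and `Transaction`, `fan_out`, and the `dropped_by_domain` map are hypothetical stand-ins for the real types and for `track_permanently_failed_transaction`, whose actual signature is not shown here. A send fails only when the receiving endpoint task is gone, so the transaction's points are charged to that endpoint's domain as dropped.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;

/// Hypothetical transaction carrying its metric data point count.
#[derive(Clone, Debug)]
pub struct Transaction {
    pub data_points: u64,
}

/// Sends a copy of the transaction to every endpoint. If an endpoint's receiver
/// has been dropped, the copy can never be enqueued or retried there, so its
/// points are added to the dropped total for that endpoint's domain.
pub fn fan_out(
    transaction: &Transaction,
    endpoint_txs: &[(String, Sender<Transaction>)],
    dropped_by_domain: &mut HashMap<String, u64>,
) {
    for (endpoint_domain, endpoint_tx) in endpoint_txs {
        if endpoint_tx.send(transaction.clone()).is_err() {
            *dropped_by_domain.entry(endpoint_domain.clone()).or_insert(0) +=
                transaction.data_points;
        }
    }
}
```

Because the drop is recorded per endpoint, a transaction that reaches one endpoint but not another counts as dropped only for the endpoint whose task has gone away.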

@thieman thieman marked this pull request as ready for review May 14, 2026 09:54
@thieman thieman requested a review from a team as a code owner May 14, 2026 09:54
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 0408d5c into main May 14, 2026
77 checks passed
dd-octo-sts Bot pushed a commit that referenced this pull request May 14, 2026
Co-authored-by: jszwedko <jesse.szwedko@datadoghq.com> 0408d5c
tobz pushed a commit that referenced this pull request Jun 1, 2026
Co-authored-by: jszwedko <jesse.szwedko@datadoghq.com>
@tobz tobz deleted the travis/dadp-71-point-telemetry branch June 15, 2026 13:58
Labels

  • area/components: Sources, transforms, and destinations.
  • area/core: Core functionality, event model, etc.
  • area/io: General I/O and networking.
  • encoder/datadog-events: Datadog events encoder.
  • encoder/datadog-logs: Datadog Logs encoder.
  • encoder/datadog-metrics: Datadog Metrics encoder.
  • encoder/datadog-service-checks: Datadog Service Checks encoder.
  • encoder/datadog-stats: Datadog APM Stats encoder.
  • encoder/datadog-traces: Datadog Traces encoder.
  • forwarder/datadog: Datadog forwarder.
  • mergequeue-status: done

4 participants