fix(sidecar): configure OTLP endpoint for FFE metrics#2076
Conversation
🎉 All green!🧪 All tests passed 🎯 Code Coverage (details) 🔗 Commit SHA: 5e86692 | Docs | Datadog PR Page | Give us feedback! |
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2076 +/- ##
==========================================
+ Coverage 73.43% 73.60% +0.16%
==========================================
Files 470 470
Lines 78340 78449 +109
==========================================
+ Hits 57532 57741 +209
+ Misses 20808 20708 -100
🚀 New features to boost your workflow:
|
| parent_session_id: ffi::CharSlice, | ||
| ) -> MaybeError { | ||
| let session_id_str: String = session_id.to_utf8_lossy().into(); | ||
| let mut otlp_metrics_endpoint = unsafe { otlp_metrics_endpoint.as_ref().cloned() }; |
There was a problem hiding this comment.
nit: Was a bit confused by this variable shadowing approach, though I guess it's not uncommon.
libdatadog/datadog-sidecar-ffi/src/lib.rs
Lines 278 to 282 in 091c98d
Maybe you could add a : Option<Endpoint> type hint to differentiate this from *const Endpoint ?
| ..Default::default() | ||
| }, | ||
| &Endpoint::default(), | ||
| null(), |
There was a problem hiding this comment.
An observation: no non-null case means that this code is not tested
libdatadog/datadog-sidecar-ffi/src/lib.rs
Lines 643 to 647 in 091c98d
Not sure if there are any challenges to testing that path, so up to you if it's worth the trouble.
There was a problem hiding this comment.
Thanks for the careful review!
yannham
left a comment
There was a problem hiding this comment.
Can't say about the PHP/sidecar's side of things, but the Rust part LGTM
bwoebi
left a comment
There was a problem hiding this comment.
Looks reasonable, approving!
Motivation
Companion PHP evaluation-metrics work needs FFE metrics to follow the sidecar architecture used by the rest of the tracer: configure endpoints once on the session, then send typed sidecar actions without carrying endpoint strings on every flush.
The previous FFE metrics path accepted an OTLP endpoint alongside each metrics batch. That diverged from the normal
Endpointflow, made test-session-token propagation easy to miss, and forced PHP to keep fetching endpoint configuration during request flushes.Design context: https://docs.google.com/document/d/1NvMfTpZWLBlFmEFNjdnlMyeVpy5l7KD8qujGFco6w2w/edit?tab=t.0
Companion PHP PR: DataDog/dd-trace-php#3911
Changes
Endpointto sidecar session configuration.ddog_sidecar_session_set_configand session state.SidecarAction::FfeEvaluationMetricsthrough the session's configured OTLP metrics endpoint instead of an endpoint passed with each metrics batch.test_tokenwhen the OTLP metrics endpoint does not already have one.x-datadog-test-session-tokenthroughEndpointstandard headers.Decisions
FFE metrics remain caller-driven. This PR does not make shared evaluator calls auto-emit metrics, so SDKs that still log metrics in host language implementations can continue to coexist without double counting.
The sidecar owns metrics transport, aggregation, OTLP/protobuf encoding, and endpoint/header handling. PHP remains responsible for constructing the appropriate OTLP metrics endpoint from its existing configuration resolver and supplying it once during sidecar session setup.
The test-session-token behavior is intentionally implemented through
Endpoint, not by manually adding FFE-specific headers. Production endpoints normally have no test token, so production behavior is unchanged.Validation
Local targeted test:
cargo test -p datadog-sidecar ffe_metric_actions_dispatch_without_registered_applicationResult: passed locally.
PHP integration smoke test:
Companion PHP PR: DataDog/dd-trace-php#3911
The local PHP branch backing that PR was updated to this libdatadog PR head (
091c98df6): PHP passes the OTLP metricsEndpointonce during sidecar session setup, and FFE metric flushes send only context plus metric payloads.docker-compose run --no-deps 8.3-bookworm bash -lc \ 'export PATH=/rust/cargo/bin:$PATH TMPDIR=/home/circleci/app/tmp TMP=/home/circleci/app/tmp TEMP=/home/circleci/app/tmp; make MAX_TEST_PARALLELISM=1 test_c TESTS=tests/ext/ffe/evaluation_metrics_unix_agent_endpoint.phpt'Result:
tests/ext/ffe/evaluation_metrics_unix_agent_endpoint.phptpassed locally under Docker on PHP 8.3/Linux aarch64:Tests passed 1 (100.0%).That PHPT validates the PHP sidecar integration over
DD_TRACE_AGENT_URL=unix://...: the request reaches/v1/metrics, usesapplication/x-protobuf, and carries the request-replayer test-session token through the configured endpoint path.Negative validation during review:
Endpoint.test_tokenfails with a missingx-datadog-test-session-tokenheader.Endpointstandard headers.CI on commit
091c98dreports all tests passed and no new flaky tests.End-to-End Evidence
The companion PHP branch for DataDog/dd-trace-php#3911 was validated locally against this libdatadog PR head (
091c98df6). That path exercises the full PHP metrics flow:PHP API -> tracer request buffer -> sidecar session-configured OTLP metrics Endpoint -> sidecar FFE metrics action -> request-replayer UDS intakeDocker command used from the PHP repo:
docker-compose run --no-deps 8.3-bookworm bash -lc \ 'export PATH=/rust/cargo/bin:$PATH TMPDIR=/home/circleci/app/tmp TMP=/home/circleci/app/tmp TEMP=/home/circleci/app/tmp; make MAX_TEST_PARALLELISM=1 test_c TESTS=tests/ext/ffe/evaluation_metrics_unix_agent_endpoint.phpt'Observed result:
The passing PHPT confirms PHP can record and flush an FFE evaluation metric through the sidecar over
DD_TRACE_AGENT_URL=unix://..., and that the captured request reaches/v1/metricswithContent-Type: application/x-protobufplus the request-replayer test-session token propagated through the configuredEndpoint.