Skip to content

Log injection rework#2440

Merged
randomanderson merged 15 commits into
masterfrom
landerson/log-injection-rework
Mar 16, 2021
Merged

Log injection rework#2440
randomanderson merged 15 commits into
masterfrom
landerson/log-injection-rework

Conversation

@randomanderson
Copy link
Copy Markdown
Contributor

@randomanderson randomanderson commented Feb 25, 2021

This PR reworks all of the log injection instrumentations.

Problems

  • ScopeListener is called quite often. For some backends, the entire MDC is being duplicated on every scope transition!!
  • The thread local instrumentation for unified service tagging was brittle forcing us to have it was off by default.
  • The Slf4j instrumentation was too broad. We should be instrumenting backends only

Solutions

This PR borrows heavily from the opentelemetry instrumentation.

  • Unifed service tagging is on by default

For Logback, log4j1, and JBoss logging

  • Added Logback specific instrumentation
  • Attach the span context to the logging "event" before it is "published" using the InstrumentationContext
  • Read the context and unified service tags when MDC values are read from the "event"

Great part about this approach work is only done when it needs to be. The "publish" is guarded by if (LogLevel >= INFO){} type constructs. No work is done adding/removing contexts when there's no logging happening.

For Log4j2 versions >=2.7
Log4j2 has a built-in mechanism for adding to the context. The instrumentation implements a ContextDataInjector

For Log4j2 versions <2.7
This still uses the old ScopeListener mechanism. I only cleaned up the interface to not use reflection. The interactions of LoggerConfig and AsyncLoggerConfig (where callAppenders is located) is quite complex. In the future, it's something that can be looked into. (Unfortunately, this means there's no unified service tagging for this version range)

Testing

The previous tests made assumptions that are no longer true. The smoke tests are a much better fit for testing logging. This PR tests a base version and the latest version for all of the logging backends. This brings the total number of combinations tested to 48.

Future work

  • Since there are 48 smoke tests, they take quite a while to complete. Our current build does not allow smoke tests to run in parallel. Some changes could be made to special-case these smoke tests
  • InstrumentationContext support for Log4j2 versions <2.7 . The current implementation uses the old ScopeListener mechanism

@randomanderson randomanderson added the tag: do not merge Do not merge changes label Feb 25, 2021
@randomanderson randomanderson force-pushed the landerson/log-injection-rework branch 2 times, most recently from 2ccc84c to 6373c86 Compare March 3, 2021 03:39
@randomanderson randomanderson removed the tag: do not merge Do not merge changes label Mar 3, 2021
@randomanderson randomanderson marked this pull request as ready for review March 3, 2021 04:04
@randomanderson randomanderson requested a review from a team as a code owner March 3, 2021 04:04
@richardstartin
Copy link
Copy Markdown
Contributor

It looks like this mostly addresses the concerns I expressed previously. I'm just interested in the UnionMap - maybe we can specialise that somewhat. Instead of copying the context values into the map, we could have a switch statement in get which would access our context values when cases are hit, and fallback the other map when they're missed.

@randomanderson randomanderson force-pushed the landerson/log-injection-rework branch from 066d274 to 75b93e1 Compare March 3, 2021 20:24
Copy link
Copy Markdown
Contributor

@bantonsson bantonsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice cleanup

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not relevant, but we only need to do the switch if the key starts with dd.

@randomanderson randomanderson force-pushed the landerson/log-injection-rework branch 2 times, most recently from 619c43a to f48ae7d Compare March 12, 2021 20:09
@randomanderson randomanderson force-pushed the landerson/log-injection-rework branch from f48ae7d to 3d276d5 Compare March 16, 2021 14:56
@randomanderson randomanderson merged commit cf71e7c into master Mar 16, 2021
@randomanderson randomanderson deleted the landerson/log-injection-rework branch March 16, 2021 16:04
@github-actions github-actions Bot added this to the 0.76.0 milestone Mar 16, 2021
@randallt
Copy link
Copy Markdown

@randomanderson @richardstartin :
Apologies for commenting on this old issue. We are a current Datadog customer, but we are stuck on a custom Java tracing agent until we fully transition away from our previous APM provider (which won't happen for a year). I'm trying to add log integration into our DDTracer piece, but I'm hitting a ByteBuddy issue trying to instrument log4j1 and I'm wondering if you remember how the Datadog Java Agent got around it (if at all). Here is the error I get:

java.lang.VerifyError: (class: org/apache/log4j/MDC, method: <clinit> signature: ()V) Illegal type in constant pool
	at org.apache.log4j.spi.LoggingEvent.getMDC(LoggingEvent.java:338)
	at org.apache.log4j.helpers.PatternParser$MDCPatternConverter.convert(PatternParser.java:479)
	at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65)
	at org.apache.log4j.PatternLayout.format(PatternLayout.java:506)
	at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
	at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
	at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
	at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
	at org.apache.log4j.Category.callAppenders(Category.java:206)
	at org.apache.log4j.Category.forcedLog(Category.java:391)
	at org.apache.log4j.Category.log(Category.java:856)
	at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:305)

From what I've been able to find, this happens because log4j1 is compiled using Java < 1.5, which ByteBuddy doesn't support. Any ideas? Also, are you guys on a public Slack / MS Teams channel somewhere for ad-hoc questions?

@tylerbenson
Copy link
Copy Markdown
Contributor

@randallt We have a slack that you can join chat.datadoghq.com, but this seems like something better suited for a support@datadoghq.com email.

@randallt
Copy link
Copy Markdown

I haven't figured out yet how to get through L1 Support to a dev team for something this technical--it just goes back and forth with them asking for more information, etc. Is there a keyword I can use to avoid that when I know I need a developer/Team lead?

This is a very specific question for the implementors of the Datadog Java agent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants