Skip to content

heap_mem_leak_fix - restart heap profiling with every call to profile…#263

Open
steve-wortham-wwt wants to merge 1 commit into
grafana:mainfrom
steve-wortham-wwt:heap_mem_leak_fix
Open

heap_mem_leak_fix - restart heap profiling with every call to profile…#263
steve-wortham-wwt wants to merge 1 commit into
grafana:mainfrom
steve-wortham-wwt:heap_mem_leak_fix

Conversation

@steve-wortham-wwt
Copy link
Copy Markdown

… function - SW

@steve-wortham-wwt steve-wortham-wwt requested review from a team as code owners May 22, 2026 16:47
@cla-assistant
Copy link
Copy Markdown

cla-assistant Bot commented May 22, 2026

CLA assistant check
All committers have signed the CLA.

@steve-wortham-wwt
Copy link
Copy Markdown
Author

I ran some tests locally and found that this change solves the memory leak mentioned here: #28

@steve-wortham-wwt
Copy link
Copy Markdown
Author

Prior to this change my memory usage would climb steadily and the GC never reclaimed that memory. But with this fix my memory usage has some more natural ups and downs:

image

@bryanhuhta
Copy link
Copy Markdown
Contributor

@steve-wortham-wwt Thanks for the contribution. This looks very promising. I've been working on this bug on and off for a while now, but was never able to reproduce the leak. Do you have any reproduction steps I can try? Or did you reproduce this in your prod environment?

@bryanhuhta
Copy link
Copy Markdown
Contributor

I ran a small test script for 4 hours with the following variations (only memory profiling):

I got these results:

viz

Two things jump out at me:

  • Indeed this SDK does leak memory
  • This PR significantly improves the leak, but we still seem to be leaking memory

In main, we were leaking memory through ArrayBuffers. This PR seems to fix that. However, we can see that there's still a steady growth in heap size in both this PR and in main, but not when running the test script with profiling disabled.

I think this PR is worth merging, but there's still work to be done to pin down the other memory leak(s).

Test script

const VARIANT = process.env.VARIANT;
const IS_CONTROL = VARIANT === 'control';

let Pyroscope = null;
if (!IS_CONTROL) {
  const LIB_PATH = VARIANT
    ? `./libs/${VARIANT}/index.js`
    : '../dist/esm/index.js';
  Pyroscope = (await import(LIB_PATH)).default;
}

const DURATION_SECONDS = Number(process.env.DURATION_SECONDS ?? 600);
const DURATION_MS = DURATION_SECONDS * 1000;
const FLUSH_INTERVAL_MS = Number(process.env.FLUSH_INTERVAL_MS ?? 1000);
const SAMPLE_INTERVAL_MS = Number(process.env.SAMPLE_INTERVAL_MS ?? 1000);
const ALLOC_BATCH = Number(process.env.ALLOC_BATCH ?? 5000);
const ALLOC_SIZE = Number(process.env.ALLOC_SIZE ?? 1024);
const SAMPLING_INTERVAL_BYTES = Number(
  process.env.SAMPLING_INTERVAL_BYTES ?? 64 * 1024
);
const SERVER_ADDRESS = process.env.SERVER_ADDRESS ?? 'http://localhost:4040';
const LABEL = process.env.LABEL ?? 'run';

if (Pyroscope) {
  Pyroscope.init({
    appName: 'leak-repro',
    serverAddress: SERVER_ADDRESS,
    flushIntervalMs: FLUSH_INTERVAL_MS,
    heap: {
      samplingIntervalBytes: SAMPLING_INTERVAL_BYTES,
      stackDepth: 64,
    },
    tags: { run: LABEL },
  });

  Pyroscope.startHeapProfiling();
}

const startedAt = Date.now();
process.stdout.write(
  'elapsed_ms,rss,heap_used,heap_total,external,array_buffers\n'
);

function logMemory() {
  const m = process.memoryUsage();
  const elapsed = Date.now() - startedAt;
  process.stdout.write(
    `${elapsed},${m.rss},${m.heapUsed},${m.heapTotal},${m.external},${m.arrayBuffers}\n`
  );
}

let sink = null;
function allocate() {
  const batch = new Array(ALLOC_BATCH);
  for (let i = 0; i < ALLOC_BATCH; i++) {
    batch[i] = {
      id: i,
      payload: Buffer.alloc(ALLOC_SIZE),
      blob: new Array(16).fill(Math.random()),
    };
  }
  sink = batch[batch.length - 1];
}

const allocTimer = setInterval(allocate, 5);
const sampleTimer = setInterval(logMemory, SAMPLE_INTERVAL_MS);

setTimeout(() => {
  clearInterval(allocTimer);
  clearInterval(sampleTimer);
  logMemory();
  const shutdown = Pyroscope
    ? Pyroscope.stopHeapProfiling()
    : Promise.resolve();
  shutdown.finally(() => {
    void sink;
    process.exit(0);
  });
}, DURATION_MS);

@steve-wortham-wwt
Copy link
Copy Markdown
Author

Nice find. And yes, I can't share my exact example since it's production code.

Either I didn't run my application for long enough or there's something about my setup that isn't hitting the remaining memory leak in the same way. In any case, this is promising. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants