Improve heap profile memory usage by lazily loading js objects#260
Improve heap profile memory usage by lazily loading js objects#260IlyasShabi merged 18 commits intomainfrom
Conversation
Overall package sizeSelf size: 1.86 MB Dependency sizes| name | version | self size | total size | |------|---------|-----------|------------| | source-map | 0.7.6 | 185.63 kB | 185.63 kB | | pprof-format | 2.2.1 | 163.06 kB | 163.06 kB | | p-limit | 3.1.0 | 7.75 kB | 13.78 kB | | node-gyp-build | 3.9.0 | 8.81 kB | 8.81 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe |
BenchmarksBenchmark execution time: 2026-02-19 11:20:15 Comparing candidate commit a683217 in PR branch Found 1 performance improvements and 0 performance regressions! Performance is the same for 88 metrics, 31 unstable metrics. scenario:profiler-heavy-load-no-wall-profiler-24
|
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d2982ce329
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Some thoughts:
|
| }; | ||
| } // namespace | ||
|
|
||
| std::shared_ptr<Node> TranslateAllocationProfileToCpp( |
There was a problem hiding this comment.
TranslateAllocationProfileToCpp and TranslateAllocationProfileToExternal are ~similar. While templating could reduce duplication, the different string assignment mechanisms would add complexity. Keeping both functions for clarity
There was a problem hiding this comment.
+1. Maybe add this as a comment in the source code?
|
|
||
| export interface AllocationProfileNode extends ProfileNode { | ||
| allocations: Allocation[]; | ||
| children: AllocationProfileNode[]; |
7e1921f to
b0caf9a
Compare
|
Currently we have this processing flow for a heap profile:
IIUC the proposed change should decrease the JS memory used during the second step but at the expense of creating a copy of the entire C++ node tree. It would be nice to have benchmark / test cases that exhibits the behaviour that need fixing and measure the improvement. |
|
I added a benchmark for the second step that:
Results I got from Local and CI across platforms: After applying callback approach and avoid copying the root to cpp with larger profilers: |
|
For your consideration, I think you could also introduce a small encapsulation for the common part in all the getters: diff --git a/bindings/allocation-profile-node.cc b/bindings/allocation-profile-node.cc
index 876d119..3112db2 100644
--- a/bindings/allocation-profile-node.cc
+++ b/bindings/allocation-profile-node.cc
@@ -19,6 +19,15 @@
namespace dd {
+template <typename F>
+void ExternalAllocationNode::mapExternalNode(
+ const Nan::PropertyCallbackInfo<v8::Value>& info,
+ F&& mapper) {
+ auto* wrapper =
+ Nan::ObjectWrap::Unwrap<ExternalAllocationNode>(info.Holder());
+ info.GetReturnValue().Set(mapper(wrapper->node_.get()));
+}
+
NAN_MODULE_INIT(ExternalAllocationNode::Init) {
v8::Local<v8::FunctionTemplate> tpl = Nan::New<v8::FunctionTemplate>();
tpl->SetClassName(Nan::New("AllocationProfileNode").ToLocalChecked());
@@ -58,37 +67,35 @@ v8::Local<v8::Object> ExternalAllocationNode::New(
}
NAN_GETTER(ExternalAllocationNode::GetName) {
- auto* wrapper =
- Nan::ObjectWrap::Unwrap<ExternalAllocationNode>(info.Holder());
- auto* isolate = v8::Isolate::GetCurrent();
- info.GetReturnValue().Set(
- v8::Local<v8::String>::New(isolate, wrapper->node_->name));
+ mapExternalNode(info, [](ExternalNode* node) {
+ auto* isolate = v8::Isolate::GetCurrent();
+ return v8::Local<v8::String>::New(isolate, node->name);
+ });
}
NAN_GETTER(ExternalAllocationNode::GetScriptName) {
- auto* wrapper =
- Nan::ObjectWrap::Unwrap<ExternalAllocationNode>(info.Holder());
- auto* isolate = v8::Isolate::GetCurrent();
- info.GetReturnValue().Set(
- v8::Local<v8::String>::New(isolate, wrapper->node_->script_name));
+ mapExternalNode(info, [](ExternalNode* node) {
+ auto* isolate = v8::Isolate::GetCurrent();
+ return v8::Local<v8::String>::New(isolate, node->script_name);
+ });
}
NAN_GETTER(ExternalAllocationNode::GetScriptId) {
- auto* wrapper =
- Nan::ObjectWrap::Unwrap<ExternalAllocationNode>(info.Holder());
- info.GetReturnValue().Set(Nan::New(wrapper->node_->script_id));
+ mapExternalNode(info, [](ExternalNode* node) {
+ return Nan::New(node->script_id);
+ });
}
NAN_GETTER(ExternalAllocationNode::GetLineNumber) {
- auto* wrapper =
- Nan::ObjectWrap::Unwrap<ExternalAllocationNode>(info.Holder());
- info.GetReturnValue().Set(Nan::New(wrapper->node_->line_number));
+ mapExternalNode(info, [](ExternalNode* node) {
+ return Nan::New(node->line_number);
+ });
}
NAN_GETTER(ExternalAllocationNode::GetColumnNumber) {
- auto* wrapper =
- Nan::ObjectWrap::Unwrap<ExternalAllocationNode>(info.Holder());
- info.GetReturnValue().Set(Nan::New(wrapper->node_->column_number));
+ mapExternalNode(info, [](ExternalNode* node) {
+ return Nan::New(node->column_number);
+ });
}
NAN_GETTER(ExternalAllocationNode::GetAllocations) {
diff --git a/bindings/allocation-profile-node.hh b/bindings/allocation-profile-node.hh
index 0db6bfb..5c289cf 100644
--- a/bindings/allocation-profile-node.hh
+++ b/bindings/allocation-profile-node.hh
@@ -32,6 +32,11 @@ class ExternalAllocationNode : public Nan::ObjectWrap {
private:
ExternalAllocationNode(std::shared_ptr<ExternalNode> node) : node_(node) {}
+ template <typename F>
+ static void mapExternalNode(
+ const Nan::PropertyCallbackInfo<v8::Value>& info,
+ F&& mapper);
+
static NAN_GETTER(GetName);
static NAN_GETTER(GetScriptName);
static NAN_GETTER(GetScriptId); |
|
Nice work 🔥 |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1a66a02db7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return v8ProfileV2(root => { | ||
| return convertProfile(root, ignoreSamplePath, sourceMapper, generateLabels); |
There was a problem hiding this comment.
Avoid feeding immutable V2 nodes into convertProfile
profileV2() now forwards the lazy native nodes directly to convertProfile(), but the serializer path still mutates node.scriptName to strip file:// prefixes (see serialize() in ts/src/profile-serializer.ts). The new V2 nodes expose scriptName via getter-only accessors, so in strict mode this assignment throws for ESM/mjs frames, causing heap.profileV2() to fail instead of producing a profile whenever a sampled frame has a file:// script URL.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
@IlyasShabi
Might be worth checking that.
Weirdly I could not make it throw an error, it seems that setting node.scriptName creates a property on the JS object that bypasses the getter when it is accessed later on.
There was a problem hiding this comment.
I don't think it is read only, you need to pass an extra v8::ReadOnly property attribute to SetNativeDataProperty to make it read-only.
With this attribute, I get an error if I try to mutate the property.
There was a problem hiding this comment.
IMO it's better to not touch the node at all and pass scriptName as a separate param to getLocation
|
@codex review |
What does this PR do?:
Introduces a new lazy heap profiling API
profileV2that reduces memory usage during heap profile serialization. Instead of materializing the entire V8 allocation profile as JS object upfront, the new API traverses the profile root tree on-demandMotivation:
When collecting heap profiles, the current
profile()API callsGetAllocationProfile()to get the profile data, then recursively translate the entire profile tree into JS objects, this creates a full JS objects tree in heap memory which can be significant for large apps.The new
profileV2()API introduces a lazy traversal pattern, getting JS object is done on-demand using getters to get children, name, scriptName and others. On the TS side, we only request for current node JS object which allow us to gradually build the tree and reduce memory usage at this operation.This approach is using a callback to keep V8 local string alive during the entire process