Skip to content

rustc_query_system: reduce dependency graph memory usage#79589

Merged
bors merged 8 commits intorust-lang:masterfrom
tgnottingham:shared_dep_graph
Dec 24, 2020
Merged

rustc_query_system: reduce dependency graph memory usage#79589
bors merged 8 commits intorust-lang:masterfrom
tgnottingham:shared_dep_graph

Conversation

@tgnottingham
Copy link
Copy Markdown
Contributor

This change implements, at a high level, two space optimizations to the dependency graph.

The first optimization is sharing graph data with the previous dependency graph. Whenever we intern a node, we know whether that node is new (not in the previous graph) or not, and if not, the color of the node in the previous graph.

Red and green nodes have their DepNode present in the previous graph, so for that piece of node data, we can just store the index of the node in the previous graph rather than duplicate the DepNode. Green nodes additionally have the the same result Fingerprint, so we can avoid duplicating that too. Finally, we distinguish between "light" and "dark" green nodes, where the latter are nodes that were marked green because all of their dependencies were marked green. These nodes can additionally share edges with the previous graph, because we know that their set of dependencies is the same (technically, light green and red nodes can have the same dependencies too, but we don't try to figure out whether or not that's the case).

Also, some effort is made to pack data tightly, and to avoid storing DepNodes as map keys more than once.

The second optimization is storing edges in a more compact representation, as in the SerializedDepGraph, that is, in a single vector, rather than one EdgesVec per node. An EdgesVec is a SmallVec with an inline buffer for 8 elements. Each EdgesVec is, at minimum, 40 bytes, and has a per-node overhead of up to 40 bytes. In the ideal case of exactly 8 edges, then 32 bytes are used for edges, and the overhead is 8 bytes. But most of the time, the overhead is higher.

In contrast, using a single vector to store all edges, and having each node specify its start and end elements as 4 byte indices into the vector has a constant overhead of 8 bytes--the best case scenario for the per-node EdgesVec approach.

The downside of this approach is that EdgesVecs built up during query execution have to be copied into the vector, whereas before, we could just take ownership over them. However, we mostly make up for this because the single vector representation enables a more efficient implementation of DepGraph::serialize.

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-incr-comp Area: Incremental compilation A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. I-compiletime Issue: Problems and improvements with respect to compile times. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.