perf: cache offsetBufferAddress in CometPlainVector for variable-width vectors#4364

Open
0lai0 wants to merge 2 commits into
apache:mainfrom
0lai0:fix-4280-CacheoffsetBufferAdress

Conversation

@0lai0
Contributor

0lai0 commented May 19, 2026

Which issue does this PR close?

Closes #4280

Rationale for this change

CometPlainVector already caches the value buffer address, but variable-width reads still fetched the offset buffer address on every getUTF8String and getBinary call. Caching the offset buffer address avoids repeated Arrow buffer lookups on hot per-row paths.

What changes are included in this PR?

  • Cache offsetBufferAddress in CometPlainVector for variable-width vectors
  • Use the cached offset buffer address in getUTF8String and getBinary
  • Add targeted JUnit coverage for variable-width string and binary reads
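
The caching pattern described above can be illustrated with a minimal, self-contained sketch. This is not the PR's diff and does not use Arrow or Comet classes: `VarWidthBuffers`, `getOffsetBuffer`, `readUncached`, and `CachedReader` are hypothetical stand-ins, and an on-heap `ByteBuffer` is assumed in place of the off-heap offset buffer address. The counter only simulates the cost of a per-call buffer lookup.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class OffsetCacheSketch {

    /** Hypothetical stand-in for a variable-width vector (not Arrow's API). */
    static final class VarWidthBuffers {
        private final ByteBuffer offsets; // (n + 1) little-endian int32 offsets
        final byte[] data;                // concatenated UTF-8 bytes of all values
        int offsetLookups = 0;            // counts simulated offset-buffer lookups

        VarWidthBuffers(String... values) {
            offsets = ByteBuffer.allocate(4 * (values.length + 1))
                    .order(ByteOrder.LITTLE_ENDIAN);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            offsets.putInt(0); // the first value starts at byte 0
            for (String v : values) {
                byte[] bytes = v.getBytes(StandardCharsets.UTF_8);
                out.write(bytes, 0, bytes.length);
                offsets.putInt(out.size()); // end offset of this value
            }
            data = out.toByteArray();
        }

        /** Simulates the lookup that would otherwise run on every read. */
        ByteBuffer getOffsetBuffer() {
            offsetLookups++;
            return offsets;
        }
    }

    /** Uncached read: resolves the offset buffer on every call. */
    static String readUncached(VarWidthBuffers v, int rowId) {
        return decode(v.getOffsetBuffer(), v.data, rowId);
    }

    /** Cached read: resolves the offset buffer once, at construction. */
    static final class CachedReader {
        private final ByteBuffer offsets;
        private final byte[] data;

        CachedReader(VarWidthBuffers v) {
            this.offsets = v.getOffsetBuffer();
            this.data = v.data;
        }

        String getString(int rowId) {
            return decode(offsets, data, rowId);
        }
    }

    /** Reads offsets[rowId] and offsets[rowId + 1], then slices the data bytes. */
    private static String decode(ByteBuffer offsets, byte[] data, int rowId) {
        int start = offsets.getInt(rowId * 4);
        int end = offsets.getInt((rowId + 1) * 4);
        return new String(data, start, end - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        VarWidthBuffers v = new VarWidthBuffers("a", "bcd", "", "h\u00e9llo");
        for (int i = 0; i < 4; i++) readUncached(v, i);
        System.out.println("lookups after uncached reads: " + v.offsetLookups);

        CachedReader reader = new CachedReader(v);
        for (int i = 0; i < 4; i++) reader.getString(i);
        System.out.println("lookups after cached reads: " + v.offsetLookups);
    }
}
```

In this sketch the lookup counter grows by one per row in the uncached loop, while the cached reader adds a single lookup at construction regardless of how many rows it reads. Both paths return identical values, which is the property the added tests for string and binary reads are meant to confirm.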

How are these changes tested?

./mvnw test-compile surefire:test \
  -pl spark \
  -Dtest=org.apache.comet.vector.TestCometPlainVector,org.apache.comet.parquet.TestColumnReader \
  -DfailIfNoTests=false \
  -Dscalastyle.skip=true

mbutrovich changed the title from "feat: enhance CometPlainVector to support variable width vectors" to "perf: cache offsetBufferAddress in CometPlainVector for variable-width vectors" May 19, 2026
mbutrovich self-requested a review May 19, 2026 13:43

Development

Successfully merging this pull request may close these issues.

CometPlainVector: cache offsetBufferAddress for variable-width vectors