Fix memory leaks for huge tables #89

Merged 1 commit on Dec 24, 2024

Conversation

aykut-bozkurt
Collaborator

We process each row group sequentially during "COPY FROM parquet". Normally, we expect memory consumption not to exceed the row group size by much. However, we also make some allocations in the current Postgres memory context during the copy, and these can become extreme for huge tables (e.g. 100 columns with the default row group size of ~123000). To fix the issue, we introduce a memory context that is used and freed for each row during the copy.
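The effect of a per-row context can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from this PR: `MemoryContext`, `copy_rows`, and the `bytes_per_value` figure are hypothetical names and numbers made up for illustration, and the real change uses Postgres memory contexts rather than a Python class. The model only counts how many bytes of temporary allocations are live at once.

```python
class MemoryContext:
    """Toy stand-in for a memory context: counts bytes currently allocated in it."""

    def __init__(self, name):
        self.name = name
        self.allocated = 0  # bytes currently live in this context
        self.peak = 0       # highest value `allocated` has reached

    def alloc(self, nbytes):
        self.allocated += nbytes
        self.peak = max(self.peak, self.allocated)

    def free_all(self):
        # Releases everything allocated in this context at once.
        self.allocated = 0


def copy_rows(num_rows, num_columns, bytes_per_value, per_row_context):
    """Simulate a copy and return the peak bytes held by temporary allocations.

    bytes_per_value is an assumed, made-up size for one per-column allocation.
    """
    copy_ctx = MemoryContext("copy")    # lives for the whole copy
    row_ctx = MemoryContext("per-row")  # emptied after every row

    for _ in range(num_rows):
        target = row_ctx if per_row_context else copy_ctx
        for _ in range(num_columns):
            target.alloc(bytes_per_value)
        if per_row_context:
            row_ctx.free_all()

    return max(copy_ctx.peak, row_ctx.peak)
```

In this model, `copy_rows(1000, 100, 16, per_row_context=False)` peaks at 1,600,000 bytes because nothing is released until the copy ends, while the same call with `per_row_context=True` peaks at 1,600 bytes, the allocations of a single row.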

Fixes #88.


codecov bot commented Dec 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.11%. Comparing base (fd51bed) to head (1949f08).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #89   +/-   ##
=======================================
  Coverage   92.11%   92.11%           
=======================================
  Files          71       71           
  Lines        9104     9109    +5     
=======================================
+ Hits         8386     8391    +5     
  Misses        718      718           


@aykut-bozkurt aykut-bozkurt merged commit 2c1a62d into main Dec 24, 2024
6 checks passed
@aykut-bozkurt aykut-bozkurt deleted the aykut/fix-copy-from-leak branch December 24, 2024 14:44
Development

Successfully merging this pull request may close these issues.

Large parquet file OOM crash