Skip to content

feat: Add tqdm progress bars for PDF parsing and summary generation#19

Open
blueprintparadise wants to merge 2 commits intoVectifyAI:mainfrom
blueprintparadise:feature/add-progress-bars-new
Open

feat: Add tqdm progress bars for PDF parsing and summary generation#19
blueprintparadise wants to merge 2 commits intoVectifyAI:mainfrom
blueprintparadise:feature/add-progress-bars-new

Conversation

@blueprintparadise
Copy link
Copy Markdown

Hey guys, absolutely loved your project. I have been playing with it for the last few days and it shows a lot of promise.
Here is my first pull request. It just adds a progress bar to the cli output. Please let me know if you face any issue during integration.

Regards,
RSH

@saccharin98
Copy link
Copy Markdown
Collaborator

Hi @mxyhi, thanks for the contribution! The idea of adding progress bars is great for CLI users. However, there are some issues that need to be addressed before this can be merged.

Critical: Based on outdated code (will crash on merge)

The PR modifies a call to asyncio.run(generate_summaries_for_structure(...)) at the top level, but main has since been refactored — this logic now lives inside an async def page_index_builder() function. Calling asyncio.run() inside an already-running event loop will raise RuntimeError: This event loop is already running. Please rebase onto the latest main before continuing.

Design: Avoid duplicating existing logic

Rather than re-implementing the summary generation flow at the call site, please consider adding a progress_callback parameter (or integrating tqdm directly) into the existing generate_summaries_for_structure function in utils.py.

Minor issues

  • Unused imports: argparse and Counter were added but never used — please remove them
  • Progress bars should be disableable when used as a library (e.g. disable=not sys.stdout.isatty()), to avoid polluting the output stream of non-CLI users
  • tqdm(doc, ...) on the pymupdf path is missing total=len(doc), so no percentage will be shown

Happy to see this landed once the rebase and design issues are resolved!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants