Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

UN-3578 [FIX] GDrive folder browser no longer breaks on Google Docs#2093

Merged
kirtimanmishrazipstack merged 5 commits into
mainZipstack/unstract:mainfrom
UN-3578-gdrive-authentication-wf-pipelineZipstack/unstract:UN-3578-gdrive-authentication-wf-pipelineCopy head branch name to clipboard
Jun 20, 2026
Merged

UN-3578 [FIX] GDrive folder browser no longer breaks on Google Docs#2093
kirtimanmishrazipstack merged 5 commits into
mainZipstack/unstract:mainfrom
UN-3578-gdrive-authentication-wf-pipelineZipstack/unstract:UN-3578-gdrive-authentication-wf-pipelineCopy head branch name to clipboard

Conversation

@kirtimanmishrazipstack

@kirtimanmishrazipstack kirtimanmishrazipstack commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

What

  • The Google Drive connector's folder browser ("Folders to process") now loads folders even when they contain Google-native files (Docs / Sheets / Slides). It previously showed "No data".

Why

  • Customers couldn't pick a Drive folder for their ETL pipeline when that folder held a Google Doc or Sheet — the folder list failed to load and blocked pipeline setup (reported on-call).

How

  • get_files() (backend/file_management/file_management_helper.py) wraps the per-file fs.info() size backfill in try/except. pydrive2's info() runs an unguarded int(fileSize)int(None) for Google-native files (no fileSize), which aborted the entire listing; on failure we now log a warning and keep the ls() entry with size 0.

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

  • No — config-time folder-browser path only, and now strictly more resilient. Runtime ETL execution uses a separate, already-guarded path (walk()/ls()).

Database Migrations

  • None

Env Config

  • None

Relevant Docs

Related Issues or PRs

Dependencies Versions

  • None

Notes on Testing

  • Manual: in a GDrive connector, browse a folder containing a Google Doc/Sheet — folders now list instead of showing "No data".

Screenshots

Checklist

I have read and understood the Contribution Guidelines.

…ile lacks a size

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Summary by CodeRabbit

  • Bug Fixes
    • Improved file browsing reliability when some items are missing/unknown size metadata; directory listings no longer fail due to metadata lookup errors and instead proceed safely (with warnings logged).
  • UI Improvements
    • Updated the file system display so file sizes are left blank when size information is unknown or not available, avoiding misleading “0 B” output for certain files.

Walkthrough

When listing files returns entries with missing size data, the backend now wraps the fs.info() call in error handling that logs a warning and defaults the size to 0 rather than breaking the listing. The frontend adjusts display logic to show a blank size for falsy values instead of "0 B", ensuring unknown sizes are not misrepresented to users.

Changes

Unknown File Size Resilience

Layer / File(s) Summary
Guarded fs.info() with warning and size=0 fallback
backend/file_management/file_management_helper.py
When a file entry has no size, fs.info() is now called inside a try/except; failures are logged as warnings and the size defaults to 0 so the listing continues.
Conditional size formatting for unknown values
frontend/src/components/input-output/file-system/FileSystem.jsx
The transformTree function now displays size as blank when node.size is falsy, instead of formatting it as "0 B", clarifying that the size is unknown rather than zero.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the primary fix: resolving a GDrive folder browser failure when Google Docs are present.
Description check ✅ Passed The PR description comprehensively covers all required template sections with clear explanations of the issue, fix, and testing approach.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch UN-3578-gdrive-authentication-wf-pipeline

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps

greptile-apps Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a folder-browser crash in the Google Drive connector that occurred when a listed folder contained Google-native files (Docs, Sheets, Slides). pydrive2's fs.info() called int(None) on the missing fileSize field, aborting the entire listing; the fix wraps that call in try/except and falls back to size=0 with a logged warning. A paired frontend guard prevents the placeholder 0 from rendering as "0 B".

  • Backend (file_management_helper.py): fs.info() is now wrapped in try/except Exception as exc; on failure the ls() entry is kept with size=0 and the exception is logged for diagnosability.
  • Frontend (FileSystem.jsx): transformTree now checks node.size truthiness before calling formatBytes, so Google-native docs show a blank size column rather than "0 B".

Confidence Score: 5/5

Safe to merge — the change is scoped entirely to the config-time folder-browser path and makes it strictly more resilient.

Both changes are narrow and well-contained. The backend wraps a single fs.info() call that was previously unguarded, with proper exception logging; the runtime ETL path is unaffected. The frontend tweak is cosmetic. The previous review concern about missing exception binding was addressed in the latest commit.

No files require special attention.

Important Files Changed

Filename Overview
backend/file_management/file_management_helper.py Wraps fs.info() in try/except for Google-native files that have no fileSize; falls back to size=0 and logs a warning with the bound exception. Change is surgical and scoped to the folder-browser listing path only.
frontend/src/components/input-output/file-system/FileSystem.jsx Adds a truthiness guard on node.size before calling formatBytes, so Google-native docs with size=0 display a blank cell rather than "0 B". Minor side-effect: genuine zero-byte files will also show blank.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[fs.ls path detail=true] --> B{for each file_info}
    B --> C{type==file AND size is None?}
    C -- No --> G[Use ls metadata as-is]
    C -- Yes --> D[fs.info file_name]
    D -- success --> E[Use enriched file_info]
    D -- Exception --> F["Log warning with exc\nSet size = 0"]
    E --> H[FileInformation]
    F --> H
    G --> H
    H --> I[Return file_list]

    subgraph Frontend
        I --> J[transformTree]
        J --> K{isFile AND node.size truthy?}
        K -- Yes --> L[formatBytes node.size]
        K -- No --> M[size = blank]
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[fs.ls path detail=true] --> B{for each file_info}
    B --> C{type==file AND size is None?}
    C -- No --> G[Use ls metadata as-is]
    C -- Yes --> D[fs.info file_name]
    D -- success --> E[Use enriched file_info]
    D -- Exception --> F["Log warning with exc\nSet size = 0"]
    E --> H[FileInformation]
    F --> H
    G --> H
    H --> I[Return file_list]

    subgraph Frontend
        I --> J[transformTree]
        J --> K{isFile AND node.size truthy?}
        K -- Yes --> L[formatBytes node.size]
        K -- No --> M[size = blank]
    end
Loading

Reviews (5): Last reviewed commit: "UN-3578 [FIX] Show blank instead of 0 B ..." | Re-trigger Greptile

Comment thread backend/file_management/file_management_helper.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/file_management/file_management_helper.py`:
- Around line 89-92: The except block is catching too broad an Exception type
without logging the actual error, which can hide unexpected bugs. Replace the
generic Exception catch with more specific exception types related to fsspec or
connector operations (such as FileNotFoundError, PermissionError, or other
fsspec-specific exceptions), and modify the logger.warning call to include the
actual exception object as part of the log message so that the root cause is
captured while still applying the fallback metadata behavior for known issues.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c35df02b-54f9-4f14-bb99-b2efbfe381e1

📥 Commits

Reviewing files that changed from the base of the PR and between 64b06a7 and 01fa47d.

📒 Files selected for processing (1)
  • backend/file_management/file_management_helper.py

Comment thread backend/file_management/file_management_helper.py Outdated
@kirtimanmishrazipstack kirtimanmishrazipstack changed the title UN-3578 [FIX] Keep GDrive folder listing alive when a Google-native f… UN-3578 [FIX] GDrive folder browser no longer breaks on Google Docs Jun 19, 2026
Comment thread backend/file_management/file_management_helper.py

@chandrasekharan-zipstack chandrasekharan-zipstack left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still recommend leaving that size field blank rather than displaying 0 B. You can't distinguish between a failed file and one that is actually 0 B

kirtimanmishrazipstack and others added 2 commits June 20, 2026 13:36
…sting

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ector browser

Google-native docs report no size; the folder browser now leaves the size
field blank rather than asserting a misleading 0 B. Display-only; applies
uniformly across all file connectors.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown
Contributor

Frontend Lint Report (Biome)

All checks passed! No linting or formatting issues found.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
frontend/src/components/input-output/file-system/FileSystem.jsx (1)

207-208: 🏗️ Heavy lift

Distinguish unknown size from real 0 B to avoid masking zero-byte files.

Using a truthy check (node.size) makes real zero-byte files render as blank, same as unknown sizes. Consider using an explicit unknown sentinel (null/undefined) from the backend and checking node.size == null here, so actual 0 still displays as 0 B.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/input-output/file-system/FileSystem.jsx` around lines
207 - 208, The current truthy check for node.size in the size variable
assignment treats both zero-byte files (actual 0 value) and unknown sizes
(null/undefined from backend) identically, displaying both as blank. Fix this by
replacing the truthy check `node.size ?` with an explicit null/undefined check
using `node.size == null` (or `node.size != null` in the ternary), so that
actual zero values will be formatted as "0 B" while truly unknown sizes remain
blank.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@frontend/src/components/input-output/file-system/FileSystem.jsx`:
- Around line 207-208: The current truthy check for node.size in the size
variable assignment treats both zero-byte files (actual 0 value) and unknown
sizes (null/undefined from backend) identically, displaying both as blank. Fix
this by replacing the truthy check `node.size ?` with an explicit null/undefined
check using `node.size == null` (or `node.size != null` in the ternary), so that
actual zero values will be formatted as "0 B" while truly unknown sizes remain
blank.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9795d48f-3ca0-4903-b0b3-56efa2f011fd

📥 Commits

Reviewing files that changed from the base of the PR and between 68b504b and 6b32b5f.

📒 Files selected for processing (1)
  • frontend/src/components/input-output/file-system/FileSystem.jsx

@github-actions

Copy link
Copy Markdown
Contributor

Unstract test results

Per-group results

Status Group Tier Passed Failed Errors Skipped Duration (s)
unit-connectors unit 64 12 0 3 16.8
unit-core unit 0 0 4 0 1.2
unit-platform-service unit 9 0 1 0 1.4
unit-prompt-service unit 15 0 0 0 20.2
unit-rig unit 53 0 0 0 3.4
unit-runner unit 11 0 0 0 3.1
unit-sdk1 unit 390 0 0 0 20.9
unit-tool-registry unit 0 0 1 0 1.3
unit-workers unit 0 0 0 0 17.6
TOTAL 542 12 6 3 85.7

Critical paths

⚠️ Critical paths not yet covered

  • auth-login — User can log in and obtain a session cookie. (entry: POST /api/v1/auth/login; declared coverage: no groups declared)
  • adapter-register-llm — Register and validate an LLM adapter. (entry: POST /api/v1/adapter/; declared coverage: no groups declared)
  • workflow-create-execute — Create a workflow, configure source+destination, execute, poll, fetch result. (entry: POST /api/v1/workflow/{id}/execute/; declared coverage: e2e-workflow)
  • api-deployment-run — Deploy a workflow as an API, POST a document, receive structured JSON. (entry: POST /deployment/api/{org}/{name}/; declared coverage: e2e-api-deployment)
  • prompt-studio-fetch-response — Prompt Studio: create project, add prompt, run single-pass, get response. (entry: POST /api/v1/prompt-studio/prompt-studio-tool/{id}/fetch_response/; declared coverage: e2e-prompt-studio)
  • pipeline-etl-execute — Run an ETL pipeline from source connector to destination. (entry: POST /api/v1/pipeline/{id}/execute/; declared coverage: no groups declared)
  • usage-token-tracking — Per-execution token usage is recorded and retrievable. (entry: GET /api/v1/usage/get_token_usage/; declared coverage: no groups declared)
  • workflow-execution-fan-out — Multi-file workflow execution fans out to file-processing workers and rejoins. (entry: internal: backend → rabbitmq → workers/file_processing; declared coverage: no groups declared)
  • callback-result-delivery — Async results are posted back via the callback worker. (entry: internal: workers/callback → backend /internal endpoints; declared coverage: no groups declared)
✅ Covered critical paths
  • tool-sandbox-exec — covered by unit-runner

@kirtimanmishrazipstack

Copy link
Copy Markdown
Contributor Author

I would still recommend leaving that size field blank rather than displaying 0 B. You can't distinguish between a failed file and one that is actually 0 B

@chandrasekharan-zipstack Agreed, fixed — size now shows blank instead of 0 B for any 0/unknown value (6b32b5f), so we never assert a size we can't determine. One line in the shared browser, applies to all connectors

@kirtimanmishrazipstack kirtimanmishrazipstack merged commit de49f06 into main Jun 20, 2026
10 checks passed
@kirtimanmishrazipstack kirtimanmishrazipstack deleted the UN-3578-gdrive-authentication-wf-pipeline branch June 20, 2026 08:38
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.