Commit b6a85e4
feat(storage): full object checksum: integrate full-object checksum in AsyncMultiRangeDownloader (#17263)
### 1. Overview of the Solution
This solution implements end-to-end full-object checksum validation in
`AsyncMultiRangeDownloader` for the asynchronous Google Cloud Storage
Python client library. As asynchronous multiplexed downloads of
non-contiguous ranges are performed concurrently over a single
bidirectional gRPC connection, this feature automatically and
incrementally calculates a rolling checksum as bytes arrive and
validates it against the server's authoritative object checksum once the
download completes.
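
As a minimal sketch of what "rolling" means here: a CRC32C value can be extended chunk by chunk, and the result equals the checksum of the concatenated bytes. The pure-Python table below is for illustration only and is an assumption of this example; the library itself expects an accelerated CRC32C implementation (see `raise_if_no_fast_crc32c()`), and `crc32c_update` is a hypothetical helper name.

```python
# Illustrative pure-Python CRC32C (Castagnoli); not the library's implementation.
_POLY = 0x82F63B78  # reflected form of the CRC32C polynomial


def _make_table():
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c_update(crc, data):
    """Extend a running CRC32C value with more bytes (start from crc=0)."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def rolling_crc32c(chunks):
    """Fold an iterable of byte chunks into a single checksum."""
    crc = 0
    for chunk in chunks:
        crc = crc32c_update(crc, chunk)
    return crc
```

Feeding the chunks `b"1234"` and `b"56789"` through `rolling_crc32c` yields the same value as checksumming `b"123456789"` in one call, which is why the checksum can be computed as bytes arrive rather than after buffering the whole object.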
The technical approach consists of three coordinated layers:
* **`_AsyncReadObjectStream` (Stream Ingestion)**: Safely extracts the
authoritative server checksum (`full_obj_server_crc32c`) and
finalization status (`is_finalized`) from the object metadata received
in the first data payload response of the stream.
* **`_ReadResumptionStrategy` & `_DownloadState` (Verification Logic)**:
Computes an isolated, persistent rolling checksum in the individual
`_DownloadState` object so that calculations do not bleed across
concurrent multiplexed ranges. Crucially, the rolling hash is updated only
*after* a buffer write succeeds, which prevents state corruption during
retry reconnects. On completion, a `DataCorruption` exception is raised if
the computed checksum does not match the server's.
* **`AsyncMultiRangeDownloader` (Orchestration & Cleanup)**: Detects
candidate full-object ranges (e.g., `(0, 0)` or `(0, persisted_size)`),
propagates checksum settings to the resumption strategy, and guarantees
robust cleanup (closing the stream immediately and unregistering IDs) if
data corruption or write errors occur.
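
The write-then-update ordering in the verification layer can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: `DownloadState`, `process_chunk`, and `finalize` are hypothetical names, `DataCorruption` is a local stand-in for the library's exception, and `zlib.crc32` stands in for CRC32C purely to keep the example stdlib-only.

```python
import io
import zlib
from dataclasses import dataclass
from typing import BinaryIO


class DataCorruption(Exception):
    """Local stand-in for the library's DataCorruption exception."""


@dataclass
class DownloadState:
    """Hypothetical, simplified analogue of one range's `_DownloadState`."""

    buffer: BinaryIO
    rolling_checksum: int = 0
    bytes_written: int = 0


def process_chunk(state, chunk):
    # Write first. If the write raises, the lines below never run, so the
    # checksum still reflects only bytes that actually reached the buffer
    # and a retry can resend the chunk without double-counting it.
    state.buffer.write(chunk)
    # zlib.crc32 is a stand-in for CRC32C (assumption of this sketch).
    state.rolling_checksum = zlib.crc32(chunk, state.rolling_checksum)
    state.bytes_written += len(chunk)


def finalize(state, server_checksum):
    # Compare against the authoritative value once the range is complete.
    if state.rolling_checksum != server_checksum:
        raise DataCorruption(
            f"checksum mismatch: got {state.rolling_checksum:#010x}, "
            f"expected {server_checksum:#010x}"
        )
```

Because each range owns its own state object, two ranges interleaved on the same connection never share a running checksum.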
### 2. What This PR Specifically Does
This PR implements **Step 3: Downloader Orchestration & End-to-End
Integration/System Tests** of the solution:
* Relocates `raise_if_no_fast_crc32c()` validation to the execution
phase (`download_ranges()`) instead of construction time.
* Propagates stream details (`is_finalized`, `full_obj_server_crc32c`)
to the resumption state dictionary.
* Detects implicit full-object downloads (`(0, 0)`) or explicit
full-object downloads (`(0, persisted_size)`) post-`open()`, and flags
them for validation.
* Implements the cleanup guarantee in `download_ranges()`: wraps
execution in a `try...finally` block so that the stream is closed
immediately and multiplexer range IDs are unregistered when a
`DataCorruption` exception is raised.
* Adds integration tests in `test_async_multi_range_downloader.py` and
extensive end-to-end system tests in `test_zonal.py` checking finalized,
unfinalized (appendable), explicit, implicit, and bypassed range
downloads against live GCS buckets.

1 parent 2361ba6 commit b6a85e4
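
The range detection and cleanup steps described in the commit message can be sketched roughly as below. Everything here is hypothetical scaffolding: `is_full_object_range`, `FakeStream`, and `download_with_cleanup` are illustrative names, the range tuple is assumed to be `(offset, length)` with a length of `0` meaning "read to the end", and `DataCorruption` is a local stand-in for the library's exception. The real `download_ranges()` may structure this differently.

```python
import asyncio


class DataCorruption(Exception):
    """Local stand-in for the library's DataCorruption exception."""


def is_full_object_range(offset, length, persisted_size):
    # Implicit full-object read: (0, 0). Explicit: (0, persisted_size).
    return offset == 0 and (length == 0 or length == persisted_size)


class FakeStream:
    """Hypothetical stream exposing only the method this sketch needs."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def download_with_cleanup(stream, registered_ids, range_ids, work):
    # Register the range IDs for the duration of the download.
    registered_ids.update(range_ids)
    try:
        await work()
    except DataCorruption:
        # Corrupt data: close the stream immediately, then propagate.
        await stream.close()
        raise
    finally:
        # Runs on success and on failure, so IDs are never left behind.
        registered_ids.difference_update(range_ids)
```

The `finally` clause is what makes the unregistering unconditional; the `except` clause limits the immediate stream close to the corruption case.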
6 files changed
Lines changed: 257 additions & 11 deletions
packages/google-cloud-storage/google/cloud/storage/_helpers.py
Lines changed: 3 additions & 1 deletion
(Diff body not captured: original line 36 removed; new lines 38-40 added.)
packages/google-cloud-storage/google/cloud/storage/_http.py
Lines changed: 1 addition & 1 deletion
(Diff body not captured: new line 28 added; original line 30 removed.)
packages/google-cloud-storage/google/cloud/storage/asyncio/async_multi_range_downloader.py
Lines changed: 36 additions & 7 deletions
(Diff body not captured: hunks touch new lines 47, 223, 234-235, 331-332, 369-370, 385, 421-423, 434-449, 456-457, and 545-556.)
packages/google-cloud-storage/tests/system/test_zonal.py
Lines changed: 86 additions & 0 deletions
(Diff body not captured: new line 30 and new lines 965-1049 added.)
packages/google-cloud-storage/tests/unit/asyncio/test_async_multi_range_downloader.py
Lines changed: 130 additions & 2 deletions
(Diff body not captured: original lines 311 and 316 replaced, with additions at new lines 311-314, 317, and 320; new lines 586-709 added.)
packages/google-cloud-storage/tests/unit/conftest.py
Lines changed: 1 addition & 0 deletions
(Diff body not captured: new line 17 added.)