[fix](cloud) serialize cache init to avoid unstable cache pick #44429

freemandealer · Nov 21, 2024

The original parallel cache initialization makes the choice of cache base path unstable, because the choice depends on the initialization order, which can differ after each BE reboot. This causes cache misses and duplicate cache blocks across multiple caches (wasted disk space).

This commit serializes the initialization of multiple caches and uses a fixed order, i.e. the order explicitly declared in the BE conf option file_cache_path.
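As a rough illustration of why a fixed order matters, here is a minimal sketch, not the actual Doris code. `CacheRegistry`, `init_serial`, `pick`, and the `init_one` callback are hypothetical names, and the hash-modulo pick is an assumption about how a key might be mapped onto the list of initialized caches. The point is only that when caches are registered strictly in the configured order, the same key maps to the same base path after every restart.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a file cache factory: it records cache base paths
// in the order in which they were initialized.
class CacheRegistry {
public:
    // Initialize the configured paths one after another, in declared order.
    // `init_one` is a placeholder for the real per-directory init work.
    void init_serial(const std::vector<std::string>& configured_paths,
                     const std::function<bool(const std::string&)>& init_one) {
        for (const std::string& path : configured_paths) {
            if (init_one(path)) {
                ready_paths_.push_back(path);
            }
        }
    }

    // Assumed pick rule: index into the registration order by key hash.
    // With a stable registration order, the result is stable across restarts.
    const std::string& pick(std::size_t key_hash) const {
        assert(!ready_paths_.empty());
        return ready_paths_[key_hash % ready_paths_.size()];
    }

    const std::vector<std::string>& ready_paths() const { return ready_paths_; }

private:
    std::vector<std::string> ready_paths_;
};
```

If the paths were instead appended from concurrently running init threads, the contents of `ready_paths_` could be permuted between restarts, and the same `key_hash` could resolve to a different directory, which matches the symptom described above.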

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

The original parallel cache initialization makes the choice of cache base path unstable, because the choice depends on the initialization order, which can differ after each BE reboot. This causes cache misses and duplicate cache blocks across multiple caches (wasted disk space). This commit serializes the initialization of multiple caches and uses a fixed order, i.e. the order explicitly declared in the BE conf option file_cache_path. Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>

doris-robot · Nov 21, 2024

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

freemandealer · Nov 21, 2024

run buildall

github-actions · Nov 21, 2024

clang-tidy review says "All clean, LGTM! 👍"

doris-robot · Nov 21, 2024

TPC-H: Total hot run time: 40232 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 396010b250b74855ae7fd4ece1894cf0ca8c084a, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17577	7484	7317	7317
q2	2040	183	169	169
q3	10593	1071	1210	1071
q4	10569	744	675	675
q5	7605	2776	2727	2727
q6	245	151	147	147
q7	1002	625	599	599
q8	9235	1843	1981	1843
q9	6499	6465	6403	6403
q10	6972	2310	2342	2310
q11	460	262	276	262
q12	431	224	218	218
q13	17777	3061	3062	3061
q14	250	232	214	214
q15	568	534	527	527
q16	640	573	588	573
q17	978	591	599	591
q18	7351	6880	6702	6702
q19	1341	1021	1000	1000
q20	499	186	191	186
q21	4181	3384	3320	3320
q22	391	320	317	317
Total cold run time: 107204 ms
Total hot run time: 40232 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	7829	7304	7253	7253
q2	328	230	232	230
q3	3007	2985	3024	2985
q4	2181	1875	1849	1849
q5	5690	5719	5775	5719
q6	226	146	145	145
q7	2320	1854	1870	1854
q8	3439	3571	3581	3571
q9	8853	8962	8947	8947
q10	3657	3612	3586	3586
q11	618	521	519	519
q12	829	626	606	606
q13	11779	3290	3238	3238
q14	306	273	264	264
q15	575	535	527	527
q16	680	667	652	652
q17	1910	1661	1660	1660
q18	8501	7796	7710	7710
q19	1712	1532	1686	1532
q20	2160	1892	1871	1871
q21	5757	5578	5467	5467
q22	658	607	587	587
Total cold run time: 73015 ms
Total hot run time: 60772 ms

doris-robot · Nov 21, 2024

TeamCity be ut coverage result:
Function Coverage: 38.04% (9903/26033)
Line Coverage: 29.23% (82871/283529)
Region Coverage: 28.35% (42549/150086)
Branch Coverage: 24.91% (21573/86592)
Coverage Report: http://coverage.selectdb-in.cc/coverage/396010b250b74855ae7fd4ece1894cf0ca8c084a_396010b250b74855ae7fd4ece1894cf0ca8c084a/report/index.html

doris-robot · Nov 21, 2024

TPC-DS: Total hot run time: 197177 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 396010b250b74855ae7fd4ece1894cf0ca8c084a, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1244	955	912	912
query2	6249	2090	2073	2073
query3	10811	4040	4012	4012
query4	67800	29116	23582	23582
query5	4886	486	465	465
query6	415	181	173	173
query7	5549	302	289	289
query8	306	222	216	216
query9	8787	2679	2680	2679
query10	429	245	245	245
query11	17160	15124	15993	15124
query12	162	114	105	105
query13	1490	453	442	442
query14	10013	6980	6930	6930
query15	224	190	178	178
query16	7100	471	476	471
query17	1408	573	567	567
query18	1862	314	304	304
query19	202	156	156	156
query20	121	112	124	112
query21	203	107	100	100
query22	4924	4734	4511	4511
query23	34486	34493	34507	34493
query24	5906	2528	2561	2528
query25	481	385	409	385
query26	686	145	146	145
query27	2287	283	287	283
query28	4670	2495	2472	2472
query29	678	438	420	420
query30	226	155	144	144
query31	989	818	849	818
query32	68	55	57	55
query33	434	278	289	278
query34	935	522	524	522
query35	864	731	716	716
query36	1104	966	984	966
query37	118	76	73	73
query38	4620	4510	4409	4409
query39	1511	1476	1449	1449
query40	203	97	103	97
query41	45	45	44	44
query42	108	100	103	100
query43	554	509	505	505
query44	1227	850	857	850
query45	188	170	202	170
query46	1145	694	723	694
query47	2048	1921	1960	1921
query48	431	321	341	321
query49	738	392	387	387
query50	857	386	394	386
query51	7453	7201	7095	7095
query52	102	88	89	88
query53	248	176	189	176
query54	514	410	390	390
query55	79	78	76	76
query56	262	234	251	234
query57	1282	1189	1186	1186
query58	239	235	221	221
query59	3322	3216	3036	3036
query60	284	295	256	256
query61	134	132	131	131
query62	793	679	677	677
query63	219	195	203	195
query64	1487	736	630	630
query65	3281	3231	3254	3231
query66	712	307	309	307
query67	16375	15705	15847	15705
query68	3893	585	588	585
query69	433	252	253	252
query70	1153	1133	1166	1133
query71	359	244	247	244
query72	6476	4114	4011	4011
query73	761	366	364	364
query74	10374	9132	8962	8962
query75	3432	2801	2691	2691
query76	1796	1135	1147	1135
query77	565	294	280	280
query78	10559	9515	9412	9412
query79	1636	595	609	595
query80	924	435	430	430
query81	509	227	272	227
query82	1269	117	118	117
query83	278	149	151	149
query84	279	75	69	69
query85	911	312	305	305
query86	340	309	307	307
query87	4781	4631	4799	4631
query88	3768	2265	2220	2220
query89	421	289	287	287
query90	2039	185	184	184
query91	138	105	103	103
query92	65	47	54	47
query93	1911	542	550	542
query94	861	294	280	280
query95	341	247	247	247
query96	625	278	276	276
query97	2880	2682	2737	2682
query98	220	198	201	198
query99	1615	1310	1301	1301
Total cold run time: 321350 ms
Total hot run time: 197177 ms

doris-robot · Nov 21, 2024

ClickBench: Total hot run time: 32.1 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 396010b250b74855ae7fd4ece1894cf0ca8c084a, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.03	0.02
query2	0.07	0.03	0.04
query3	0.23	0.07	0.06
query4	1.63	0.10	0.10
query5	0.41	0.42	0.42
query6	1.15	0.65	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.02
query9	0.58	0.49	0.51
query10	0.56	0.56	0.53
query11	0.14	0.11	0.10
query12	0.14	0.11	0.11
query13	0.62	0.60	0.61
query14	2.70	2.70	2.77
query15	0.90	0.83	0.83
query16	0.39	0.39	0.39
query17	1.07	1.02	1.05
query18	0.22	0.22	0.20
query19	1.89	1.75	1.88
query20	0.01	0.01	0.01
query21	15.40	0.58	0.59
query22	2.62	2.35	1.76
query23	17.18	0.83	0.92
query24	3.22	0.62	1.47
query25	0.30	0.24	0.04
query26	0.34	0.13	0.13
query27	0.04	0.05	0.05
query28	10.75	1.09	1.07
query29	12.56	3.30	3.25
query30	0.25	0.07	0.06
query31	2.94	0.39	0.38
query32	3.67	0.48	0.47
query33	3.02	3.09	3.17
query34	16.81	4.45	4.50
query35	4.51	4.48	4.50
query36	0.68	0.49	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.03
query40	0.17	0.13	0.13
query41	0.07	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.04	0.03
Total cold run time: 107.57 s
Total hot run time: 32.1 s

github-actions · Nov 25, 2024

PR approved by at least one committer and no changes requested.

github-actions · Nov 25, 2024

PR approved by anyone and no changes requested.

TangSiyang2001

LGTM

The initialization of the file cache involves asynchronous loading and a synchronous directory upgrade. The latter mainly handles the conversion from the version1 to the version2 format, plus some fallback logic for problematic directories; it involves a large number of directory traversals and can be very slow.

Previously, in PR apache#44429, we changed the initialization of multiple cache directories from parallel to serial to avoid the disorder caused by concurrent initialization. That change lengthened cache initialization and slowed down BE startup.

We found that the directory upgrade is only meaningful during upgrades and does not need to run on every restart. Therefore, if we detect that the version file has been successfully written, we consider the cache directory to have completed the upgrade and skip these redundant directory traversals.

Of course, we could further optimize the directory traversal by making it asynchronous so that it does not block BE startup. However, this would result in three concurrent operations on the file system: asynchronous loading, asynchronous updating, and lazy loading on query. This would increase code complexity, the likelihood of errors, and the difficulty of troubleshooting. Considering that old clusters are not very common and that a cluster only needs to go through such an upgrade once in its lifecycle, we judged that this optimization would not be cost-effective and decided not to pursue it.

Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
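The skip-if-already-upgraded idea described above could look roughly like the following. This is a sketch under assumptions, not the code from the follow-up change: the marker file name `version`, its content `2.0`, and the helpers `upgrade_already_done`, `write_version_marker`, and `init_cache_dir` are all hypothetical. `run_upgrade` stands in for the slow version1 to version2 directory traversal.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical marker file name and content; the thread does not show the real ones.
static const char* const kVersionFile = "version";
static const char* const kUpgradedVersion = "2.0";

// True when `cache_dir` already holds a fully written version marker.
bool upgrade_already_done(const std::string& cache_dir) {
    std::ifstream in(cache_dir + "/" + kVersionFile);
    std::string version;
    return in && std::getline(in, version) && version == kUpgradedVersion;
}

// Write the marker under a temporary name, then rename it into place, so a
// crash in the middle of the write never leaves a marker that looks complete.
bool write_version_marker(const std::string& cache_dir) {
    const std::string final_path = cache_dir + "/" + kVersionFile;
    const std::string tmp_path = final_path + ".tmp";
    {
        std::ofstream out(tmp_path, std::ios::trunc);
        out << kUpgradedVersion << '\n';
        out.flush();
        if (!out) {
            return false;
        }
    }
    return std::rename(tmp_path.c_str(), final_path.c_str()) == 0;
}

// Run the upgrade only when no valid marker exists yet.
template <typename UpgradeFn>
bool init_cache_dir(const std::string& cache_dir, UpgradeFn run_upgrade) {
    assert(!cache_dir.empty());
    if (upgrade_already_done(cache_dir)) {
        return true;  // marker present: skip the directory traversal
    }
    if (!run_upgrade(cache_dir)) {
        return false;  // no marker is written, so the next start retries
    }
    return write_version_marker(cache_dir);
}
```

Writing the marker only after a successful upgrade keeps the check cheap on every later restart (one small file read per cache directory) while still retrying the traversal if a previous upgrade was interrupted.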

dataroaring added the dev/3.0.x label Nov 21, 2024

gavinchou approved these changes Nov 25, 2024

github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 25, 2024

github-actions bot added the reviewed label Nov 25, 2024

TangSiyang2001 approved these changes Nov 29, 2024

gavinchou self-requested a review November 29, 2024 02:38

gavinchou approved these changes Nov 29, 2024

gavinchou merged commit bc67fc9 into apache:master Nov 29, 2024
30 of 32 checks passed

github-actions bot added the dev/3.0.x-conflict label Nov 29, 2024

gavinchou mentioned this pull request Jan 1, 2025

[fix](cloud) serialize cache init to avoid unstable cache pick (#44429) #44942

Merged

16 tasks

gavinchou pushed a commit that referenced this pull request Jan 1, 2025

[fix](cloud) serialize cache init to avoid unstable cache pick (#44429)…

c548055

… (#44942)

gavinchou added dev/3.0.4-merged and removed dev/3.0.x dev/3.0.x-conflict labels Jan 1, 2025

BiteTheDDDDt pushed a commit to BiteTheDDDDt/incubator-doris that referenced this pull request Feb 7, 2025

[fix](cloud) serialize cache init to avoid unstable cache pick (apach…

e6b6956

…e#44429) (apache#3674)

gavinchou mentioned this pull request Feb 18, 2025

Release Notes 3.0.4 #48013

Open

freemandealer mentioned this pull request Mar 5, 2025

[fix](cloud) speed up file cache initializtion #48687

Merged

16 tasks
