branch-3.0: [Fix](merge-on-write) Should update pending delete bitmap KVs in MS when no need to calc delete bitmaps in publish phase #46039 #46139

Merged
merged 1 commit into from
Dec 31, 2024

Conversation

bobhan1
Contributor

@bobhan1 bobhan1 commented Dec 30, 2024

pick #46039

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

…hen no need to calc delete bitmaps in publish phase (apache#46039)

Consider the following situation:
1. Txn A acquires the lock, obtains version X to publish, calculates the
delete bitmap, writes the pending delete bitmap KVs to the MS, but fails
for some reason before committing the transaction in the MS.
2. Txn B acquires the lock, obtains version X to publish, **cleans up
the pending delete bitmap KV written by Txn A**, calculates the delete
bitmap, **writes its pending delete bitmap KV to the MS**, but also
fails for some reason before committing the transaction in the MS.
3. Txn A then reacquires the lock, obtains version X to publish, and
notices that neither the version nor the compaction counts have changed.
It skips calculating the delete bitmap and writing the pending delete
bitmap KVs to the MS (see apache#39018) and eventually succeeds in
committing the transaction in the MS.

In this case, Txn A saves the wrong delete bitmaps (generated by Txn
B) in the MS, which causes a correctness problem.
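
The three steps above can be replayed in a small in-memory model. This is only a sketch under stated assumptions: `ToyMetaService`, `ToyTxn`, and their methods are hypothetical names invented for illustration and are not Doris APIs, and the "nothing changed" check is reduced to comparing the version only (compaction counts are left out).

```python
class ToyMetaService:
    """Hypothetical stand-in for the MS: tracks pending and committed delete bitmaps."""

    def __init__(self):
        self.pending = {}    # tablet_id -> delete bitmap staged for the next commit
        self.committed = {}  # (tablet_id, version) -> delete bitmap made visible

    def write_pending(self, tablet_id, bitmap):
        # Replaces whatever an earlier attempt left behind (the cleanup in step 2).
        self.pending[tablet_id] = bitmap

    def commit(self, tablet_id, version):
        # Promotes whatever is currently pending, no matter which txn wrote it.
        self.committed[(tablet_id, version)] = self.pending.pop(tablet_id)


class ToyTxn:
    """Hypothetical load transaction with a fixed, precomputed delete bitmap."""

    def __init__(self, bitmap):
        self.bitmap = bitmap
        self.calculated_for = None  # version the bitmap was last calculated against

    def publish(self, ms, tablet_id, version, fail_before_commit):
        if self.calculated_for != version:
            # First attempt at this version: calculate and stage the bitmap.
            self.calculated_for = version
            ms.write_pending(tablet_id, self.bitmap)
        # Otherwise: skip the calculation AND the write (the problematic shortcut).
        if fail_before_commit:
            return False
        ms.commit(tablet_id, version)
        return True


ms = ToyMetaService()
txn_a = ToyTxn({"rowset-1": [3, 7]})
txn_b = ToyTxn({"rowset-1": [5]})

txn_a.publish(ms, tablet_id=1, version=10, fail_before_commit=True)   # step 1
txn_b.publish(ms, tablet_id=1, version=10, fail_before_commit=True)   # step 2
txn_a.publish(ms, tablet_id=1, version=10, fail_before_commit=False)  # step 3

committed = ms.committed[(1, 10)]
print(committed == txn_b.bitmap)  # Txn A committed, but with Txn B's bitmap
```

In this model the commit made by Txn A promotes the bitmap staged by Txn B, which is the mismatch described above.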

To solve the problem, we should still update the pending delete bitmap
KVs in the MS when the BE skips the delete bitmap calculation in the
publish phase.

Also add a defensive check: record the `lock_id` when writing pending
delete bitmap keys and check that the `lock_id` is correct when
committing the txn in the MS.
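
A minimal sketch of both parts of the fix, again as a toy in-memory model rather than the actual MS code. The names `ToyMetaService`, `update_pending`, `commit`, and `LockIdMismatch` are hypothetical, and `lock_id` is treated as an arbitrary integer that identifies the writer.

```python
class LockIdMismatch(Exception):
    """Raised when the pending KVs were written under a different lock_id."""


class ToyMetaService:
    """Hypothetical stand-in for the MS with lock_id recorded next to pending bitmaps."""

    def __init__(self):
        self.pending = {}    # tablet_id -> (lock_id, delete bitmap)
        self.committed = {}  # (tablet_id, version) -> delete bitmap

    def update_pending(self, tablet_id, lock_id, bitmap):
        # Record who wrote the pending bitmap so commit can verify ownership.
        self.pending[tablet_id] = (lock_id, bitmap)

    def commit(self, tablet_id, version, lock_id):
        owner, bitmap = self.pending[tablet_id]
        if owner != lock_id:
            # Defensive check: refuse to promote another txn's bitmap.
            raise LockIdMismatch(f"pending written by {owner}, commit by {lock_id}")
        del self.pending[tablet_id]
        self.committed[(tablet_id, version)] = bitmap


ms = ToyMetaService()
bitmap_a = {"rowset-1": [3, 7]}
bitmap_b = {"rowset-1": [5]}

ms.update_pending(1, lock_id=100, bitmap=bitmap_a)  # Txn A, fails before commit
ms.update_pending(1, lock_id=200, bitmap=bitmap_b)  # Txn B, fails before commit

# Old retry path: Txn A commits without rewriting; the check now rejects it.
try:
    ms.commit(1, 10, lock_id=100)
    rejected = False
except LockIdMismatch:
    rejected = True

# Fixed retry path: Txn A skips recalculation but still rewrites its cached bitmap.
ms.update_pending(1, lock_id=100, bitmap=bitmap_a)
ms.commit(1, 10, lock_id=100)
committed = ms.committed[(1, 10)]
print(rejected, committed == bitmap_a)
```

The rewrite on the retry path is what restores correctness; the `lock_id` comparison only turns a silent mismatch into an explicit failure.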
@bobhan1 bobhan1 force-pushed the branch-3.0-pick-46039 branch from 84b94a5 to c85695e Compare December 30, 2024 02:21
@bobhan1
Contributor Author

bobhan1 commented Dec 30, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 40894 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c85695e3348706e912bdbe3da7b8e56208b748b0, data reload: false

------ Round 1 ----------------------------------
q1	17655	7400	7254	7254
q2	2068	190	170	170
q3	10791	1097	1167	1097
q4	10557	771	673	673
q5	7763	2933	2897	2897
q6	234	151	151	151
q7	1012	622	602	602
q8	9361	1957	2020	1957
q9	6618	6421	6455	6421
q10	7026	2332	2337	2332
q11	473	268	269	268
q12	425	207	211	207
q13	17798	2965	3000	2965
q14	233	209	209	209
q15	562	523	537	523
q16	726	626	592	592
q17	984	556	571	556
q18	7388	6736	6710	6710
q19	1391	1010	1084	1010
q20	455	206	198	198
q21	4143	3107	3238	3107
q22	1116	995	1007	995
Total cold run time: 108779 ms
Total hot run time: 40894 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7266	7183	7217	7183
q2	322	228	224	224
q3	2921	2942	2966	2942
q4	2077	1800	1808	1800
q5	5716	5707	5756	5707
q6	226	140	144	140
q7	2231	1846	1835	1835
q8	3376	3545	3409	3409
q9	8924	8935	8843	8843
q10	3600	3588	3520	3520
q11	587	507	495	495
q12	847	653	623	623
q13	9923	3167	3180	3167
q14	300	269	268	268
q15	592	531	514	514
q16	709	663	666	663
q17	1872	1601	1608	1601
q18	8344	7629	7693	7629
q19	1681	1639	1496	1496
q20	2114	1865	1871	1865
q21	5594	5286	5544	5286
q22	1143	1068	1069	1068
Total cold run time: 70365 ms
Total hot run time: 60278 ms

@doris-robot

TPC-DS: Total hot run time: 197133 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c85695e3348706e912bdbe3da7b8e56208b748b0, data reload: false

query1	1305	932	933	932
query2	6230	2132	2172	2132
query3	10935	4370	4426	4370
query4	66581	28942	23446	23446
query5	4970	447	450	447
query6	405	180	167	167
query7	5610	312	312	312
query8	312	232	231	231
query9	9421	2697	2705	2697
query10	477	267	252	252
query11	17445	15267	15695	15267
query12	155	101	103	101
query13	1536	443	445	443
query14	10530	6747	6860	6747
query15	211	176	179	176
query16	6853	495	492	492
query17	1299	590	596	590
query18	1747	346	329	329
query19	216	160	158	158
query20	118	121	111	111
query21	63	46	46	46
query22	4847	4383	4607	4383
query23	34784	34166	34388	34166
query24	6352	2893	2953	2893
query25	541	430	421	421
query26	679	170	170	170
query27	1825	313	315	313
query28	4731	2508	2496	2496
query29	721	463	448	448
query30	261	163	166	163
query31	984	817	836	817
query32	67	55	63	55
query33	481	285	286	285
query34	920	504	506	504
query35	852	761	719	719
query36	1103	949	954	949
query37	123	67	74	67
query38	4108	4008	4037	4008
query39	1498	1464	1456	1456
query40	137	78	80	78
query41	48	55	45	45
query42	111	95	92	92
query43	544	506	504	504
query44	1180	835	863	835
query45	184	167	164	164
query46	1155	696	759	696
query47	1992	1874	1928	1874
query48	478	371	384	371
query49	728	384	396	384
query50	853	419	428	419
query51	7221	7330	7174	7174
query52	98	84	84	84
query53	248	192	181	181
query54	557	461	441	441
query55	76	79	76	76
query56	263	236	249	236
query57	1251	1107	1124	1107
query58	202	200	198	198
query59	3430	3019	2943	2943
query60	276	244	255	244
query61	111	106	103	103
query62	791	657	673	657
query63	215	191	188	188
query64	1410	646	621	621
query65	3260	3201	3214	3201
query66	721	295	305	295
query67	15894	15603	15546	15546
query68	4161	581	572	572
query69	428	264	259	259
query70	1175	1122	1135	1122
query71	387	267	258	258
query72	6492	4040	3990	3990
query73	750	361	355	355
query74	10195	9038	9009	9009
query75	3321	2629	2643	2629
query76	2031	1111	1076	1076
query77	534	281	274	274
query78	10482	9581	9622	9581
query79	2032	605	596	596
query80	1239	418	420	418
query81	531	247	233	233
query82	521	119	115	115
query83	167	146	141	141
query84	292	76	77	76
query85	966	305	293	293
query86	400	314	285	285
query87	4361	4255	4308	4255
query88	3742	2381	2344	2344
query89	401	290	285	285
query90	1983	186	187	186
query91	182	170	146	146
query92	62	49	46	46
query93	2623	548	552	548
query94	856	300	293	293
query95	363	251	249	249
query96	608	277	278	277
query97	3335	3209	3237	3209
query98	212	209	202	202
query99	1618	1295	1309	1295
Total cold run time: 321180 ms
Total hot run time: 197133 ms

@bobhan1
Contributor Author

bobhan1 commented Dec 30, 2024

run performance

@doris-robot

TPC-H: Total hot run time: 40755 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c85695e3348706e912bdbe3da7b8e56208b748b0, data reload: false

------ Round 1 ----------------------------------
q1	17911	8157	7283	7283
q2	2059	175	161	161
q3	11030	1051	1194	1051
q4	10543	759	778	759
q5	7766	2810	2760	2760
q6	240	143	144	143
q7	979	602	610	602
q8	9382	1961	1997	1961
q9	6644	6419	6426	6419
q10	6976	2279	2328	2279
q11	474	266	260	260
q12	410	219	213	213
q13	17805	2970	2962	2962
q14	249	218	214	214
q15	564	529	514	514
q16	687	619	630	619
q17	977	580	552	552
q18	7608	6661	6708	6661
q19	1394	986	1046	986
q20	480	210	208	208
q21	4002	3266	3143	3143
q22	1100	1005	1015	1005
Total cold run time: 109280 ms
Total hot run time: 40755 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7278	7196	7222	7196
q2	331	231	226	226
q3	3000	2937	2930	2930
q4	2067	1900	1772	1772
q5	5742	5710	5734	5710
q6	225	139	149	139
q7	2260	1794	1780	1780
q8	3312	3624	3456	3456
q9	8844	8850	8888	8850
q10	3591	3542	3539	3539
q11	598	479	492	479
q12	814	589	636	589
q13	11002	3198	3110	3110
q14	309	287	273	273
q15	577	525	511	511
q16	735	675	679	675
q17	1853	1628	1608	1608
q18	8091	7664	7723	7664
q19	1691	1371	1617	1371
q20	2082	1879	1887	1879
q21	5608	5375	5364	5364
q22	1156	1090	1037	1037
Total cold run time: 71166 ms
Total hot run time: 60158 ms

@doris-robot

TPC-DS: Total hot run time: 198108 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c85695e3348706e912bdbe3da7b8e56208b748b0, data reload: false

query1	1292	940	916	916
query2	6229	2066	2080	2066
query3	10953	4357	4438	4357
query4	66789	28590	23473	23473
query5	4898	459	462	459
query6	395	176	178	176
query7	5436	314	318	314
query8	307	240	244	240
query9	8313	2730	2686	2686
query10	458	271	261	261
query11	17046	15198	15735	15198
query12	154	111	102	102
query13	1426	458	430	430
query14	10026	7850	7138	7138
query15	208	183	188	183
query16	6727	519	515	515
query17	1099	611	612	611
query18	1728	331	317	317
query19	204	162	162	162
query20	123	109	108	108
query21	59	49	47	47
query22	4823	4514	4543	4514
query23	35085	34260	34174	34174
query24	6173	3007	3018	3007
query25	542	425	450	425
query26	664	179	172	172
query27	1795	305	328	305
query28	4346	2488	2482	2482
query29	751	513	457	457
query30	239	164	164	164
query31	1070	855	868	855
query32	67	53	53	53
query33	489	296	306	296
query34	995	514	526	514
query35	865	763	775	763
query36	1098	987	994	987
query37	124	77	72	72
query38	4076	4137	4079	4079
query39	1519	1474	1460	1460
query40	148	87	87	87
query41	51	50	48	48
query42	111	106	98	98
query43	545	512	494	494
query44	1252	847	842	842
query45	190	181	177	177
query46	1220	756	747	747
query47	1991	1909	1915	1909
query48	477	385	395	385
query49	741	404	383	383
query50	908	445	434	434
query51	7400	7387	7252	7252
query52	94	84	86	84
query53	263	179	186	179
query54	570	459	458	458
query55	75	74	76	74
query56	248	241	233	233
query57	1276	1143	1109	1109
query58	204	202	202	202
query59	3187	2880	2939	2880
query60	281	260	255	255
query61	105	104	107	104
query62	776	676	654	654
query63	217	197	192	192
query64	1424	673	652	652
query65	3289	3195	3213	3195
query66	739	316	303	303
query67	15933	15615	15674	15615
query68	4003	596	568	568
query69	434	270	270	270
query70	1183	1175	1062	1062
query71	372	253	266	253
query72	6353	4124	3869	3869
query73	790	350	347	347
query74	10179	8993	9038	8993
query75	3410	2673	2666	2666
query76	1817	1178	1076	1076
query77	475	276	279	276
query78	11029	9616	9575	9575
query79	1324	598	602	598
query80	864	429	441	429
query81	503	241	239	239
query82	1066	117	113	113
query83	175	161	145	145
query84	284	82	82	82
query85	877	315	289	289
query86	338	292	313	292
query87	4517	4300	4306	4300
query88	3767	2514	2364	2364
query89	419	293	299	293
query90	2008	188	189	188
query91	179	144	145	144
query92	65	49	50	49
query93	1652	554	555	554
query94	777	291	304	291
query95	355	258	260	258
query96	673	279	281	279
query97	3392	3237	3162	3162
query98	209	201	199	199
query99	1617	1314	1302	1302
Total cold run time: 317412 ms
Total hot run time: 198108 ms

@doris-robot

ClickBench: Total hot run time: 33.64 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c85695e3348706e912bdbe3da7b8e56208b748b0, data reload: false

query1	0.03	0.03	0.03
query2	0.10	0.05	0.05
query3	0.23	0.06	0.06
query4	1.64	0.07	0.08
query5	0.53	0.52	0.50
query6	1.14	0.74	0.74
query7	0.03	0.01	0.03
query8	0.05	0.05	0.05
query9	0.55	0.49	0.49
query10	0.56	0.54	0.55
query11	0.17	0.12	0.12
query12	0.17	0.13	0.13
query13	0.61	0.60	0.58
query14	2.96	2.97	2.95
query15	0.91	0.84	0.82
query16	0.36	0.36	0.37
query17	1.09	1.07	1.04
query18	0.20	0.19	0.20
query19	1.97	1.87	1.97
query20	0.01	0.02	0.01
query21	15.38	0.68	0.67
query22	4.35	7.22	1.90
query23	18.21	1.42	1.31
query24	2.18	0.24	0.22
query25	0.15	0.08	0.09
query26	0.28	0.18	0.17
query27	0.07	0.08	0.08
query28	13.25	1.19	1.15
query29	12.60	3.32	3.35
query30	0.25	0.06	0.07
query31	2.85	0.40	0.41
query32	3.22	0.48	0.49
query33	2.98	3.01	3.05
query34	17.08	4.53	4.57
query35	4.61	4.58	4.61
query36	0.66	0.49	0.48
query37	0.19	0.16	0.16
query38	0.15	0.16	0.14
query39	0.05	0.04	0.04
query40	0.17	0.13	0.13
query41	0.10	0.04	0.05
query42	0.06	0.04	0.04
query43	0.05	0.04	0.04
Total cold run time: 112.2 s
Total hot run time: 33.64 s

Contributor

@zhannngchen zhannngchen left a comment


LGTM

@zhannngchen zhannngchen merged commit 1ae4d26 into apache:branch-3.0 Dec 31, 2024
20 of 22 checks passed