ThinLTO + lld taking much longer linking Android libqcrilNr.so, comparing against full LTO + lld #49555

Open
@huihzhang

Description

Bugzilla Link: 50211
Version: trunk
OS: Linux
CC: @Arnaud-de-Grandmaison-ARM, @dwblaikie, @efriedma-quic, @pcc, @smithp35, @stephenhines

Extended Description

When linking the 64-bit Android library libqcrilNr.so with an in-house LLVM compiler (code base similar to community release/12.x), I observe that the ThinLTO link takes much longer than the full LTO link.

Linking with full LTO takes around 5 minutes.
Linking with ThinLTO takes around 17 minutes.

I can't figure out why ThinLTO takes so much longer. I'm looking for help from anyone who is more familiar with this issue, or who knows which part of LLVM could be contributing to this slowdown.

I can't share any object files here, but I'm sharing the time reports in the hope that they help.

Time report for full LTO:
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 444.9100 seconds (444.2988 wall clock)

---User Time---   --System Time--   --User+System--   ---Wall Time---   --- Name ---
69.7080 ( 22.6%)  28.3786 ( 20.8%)  98.0866 ( 22.0%)  98.0891 ( 22.1%)  AArch64 Assembly Printer
33.5838 ( 10.9%)  13.1880 (  9.7%)  46.7717 ( 10.5%)  46.6892 ( 10.5%)  AArch64 Instruction Selection
14.8625 (  4.8%)   7.5726 (  5.6%)  22.4351 (  5.0%)  22.4490 (  5.1%)  Machine Module Information
13.1707 (  4.3%)   6.7261 (  4.9%)  19.8967 (  4.5%)  19.9138 (  4.5%)  Dominator Tree Construction #7
12.8369 (  4.2%)   6.6886 (  4.9%)  19.5255 (  4.4%)  19.5272 (  4.4%)  Function Alias Analysis Results #4
12.8099 (  4.2%)   6.6547 (  4.9%)  19.4646 (  4.4%)  19.4668 (  4.4%)  Basic Alias Analysis (stateless AA impl) #4
 6.9480 (  2.3%)   3.2763 (  2.4%)  10.2243 (  2.3%)  10.2027 (  2.3%)  Greedy Register Allocator
 6.2687 (  2.0%)   3.1288 (  2.3%)   9.3975 (  2.1%)   9.3346 (  2.1%)  Prologue/Epilogue Insertion & Frame Finalization
 5.1671 (  1.7%)   2.3538 (  1.7%)   7.5209 (  1.7%)   7.4856 (  1.7%)  Live DEBUG_VALUE analysis
 4.7978 (  1.6%)   2.1008 (  1.5%)   6.8986 (  1.6%)   6.8701 (  1.5%)  Live Variable Analysis
 5.3652 (  1.7%)   0.9511 (  0.7%)   6.3163 (  1.4%)   6.2932 (  1.4%)  Module Verifier #2
 5.1952 (  1.7%)   0.5094 (  0.4%)   5.7046 (  1.3%)   5.6843 (  1.3%)  Module Verifier
 3.6007 (  1.2%)   1.5525 (  1.1%)   5.1533 (  1.2%)   5.1344 (  1.2%)  Live Interval Analysis
 3.4184 (  1.1%)   1.5104 (  1.1%)   4.9288 (  1.1%)   4.8973 (  1.1%)  Simple Register Coalescing
 3.2224 (  1.0%)   1.6417 (  1.2%)   4.8641 (  1.1%)   4.8603 (  1.1%)  Machine Natural Loop Construction #2
 2.8867 (  0.9%)   1.4917 (  1.1%)   4.3784 (  1.0%)   4.3753 (  1.0%)  MachineDominator Tree Construction #5
 3.0209 (  1.0%)   1.0482 (  0.8%)   4.0691 (  0.9%)   4.0510 (  0.9%)  Memory SSA
 2.8420 (  0.9%)   1.1124 (  0.8%)   3.9544 (  0.9%)   3.9477 (  0.9%)  Free MachineFunction
 2.0593 (  0.7%)   1.0078 (  0.7%)   3.0671 (  0.7%)   3.0524 (  0.7%)  Insert stack protectors
 1.8488 (  0.6%)   0.8942 (  0.7%)   2.7430 (  0.6%)   2.7401 (  0.6%)  Slot index numbering #2
 1.5921 (  0.5%)   0.7925 (  0.6%)   2.3846 (  0.5%)   2.3749 (  0.5%)  MachineDominator Tree Construction
 1.4949 (  0.5%)   0.7285 (  0.5%)   2.2234 (  0.5%)   2.2182 (  0.5%)  Machine Block Frequency Analysis #3
 1.5021 (  0.5%)   0.7089 (  0.5%)   2.2110 (  0.5%)   2.2009 (  0.5%)  Branch Probability Analysis #2
 1.3685 (  0.4%)   0.7354 (  0.5%)   2.1039 (  0.5%)   2.0826 (  0.5%)  Scalar Evolution Analysis
 1.3959 (  0.5%)   0.6686 (  0.5%)   2.0645 (  0.5%)   2.0621 (  0.5%)  MachineDominator Tree Construction #2

For ThinLTO, I observe a single CPU running for the first 13 minutes; only the last 4 minutes are multi-threaded.
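As a rough back-of-the-envelope check, the approximate figures quoted in this report (5 minutes for full LTO, 17 minutes for ThinLTO, of which about 13 minutes are single-threaded) can be plugged into a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and the inputs are the rounded minute values from the description, not measurements of their own.

```python
# Approximate figures taken from the description (minutes).
full_lto_min = 5.0        # full LTO link time
thin_total_min = 17.0     # ThinLTO link time
thin_serial_min = 13.0    # ThinLTO time observed with a single busy CPU

# The remainder is the multi-threaded part of the ThinLTO link.
thin_parallel_min = thin_total_min - thin_serial_min

# How much slower ThinLTO is overall.
slowdown = thin_total_min / full_lto_min

# Share of the ThinLTO link spent in the single-threaded part.
serial_fraction = thin_serial_min / thin_total_min

# Lower bound on the ThinLTO link time if the multi-threaded part
# cost nothing at all: the single-threaded part still has to run.
best_case_min = thin_serial_min

print(f"ThinLTO vs full LTO: {slowdown:.1f}x slower")
print(f"single-threaded share of ThinLTO link: {serial_fraction:.0%}")
print(f"multi-threaded part: {thin_parallel_min:.0f} min")
print(f"best case with a free parallel part: {best_case_min:.0f} min "
      f"(full LTO: {full_lto_min:.0f} min)")
```

Under these numbers the single-threaded part alone is already longer than the whole full LTO link, which suggests the slowdown should be looked for in whatever runs before the parallel phase starts rather than in the multi-threaded code generation.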

===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 122903.7639 seconds (7749.4743 wall clock)

---User Time---    --System Time--     --User+System--     ---Wall Time---    --- Name ---
304.7405 (  3.8%)     2.1033 (  0.0%)   306.8439 (  0.2%)  306.8722 (  4.0%)  Lower type metadata
289.0256 (  3.6%)     0.8153 (  0.0%)   289.8409 (  0.2%)  289.8686 (  3.7%)  Branch Probability Basic Block Placement
190.9626 (  2.4%)     0.4888 (  0.0%)   191.4514 (  0.2%)  191.4685 (  2.5%)  Branch relaxation pass
 76.6860 (  0.9%)  1136.7987 (  1.0%)  1213.4848 (  1.0%)   62.8574 (  0.8%)  AArch64 Assembly Printer #116
 42.9683 (  0.5%)     2.5747 (  0.0%)    45.5430 (  0.0%)   45.5567 (  0.6%)  AArch64 Instruction Selection
 42.5507 (  0.5%)     0.1631 (  0.0%)    42.7139 (  0.0%)   42.7174 (  0.6%)  Control Flow Optimizer
 52.7947 (  0.7%)   782.3718 (  0.7%)   835.1665 (  0.7%)   41.8995 (  0.5%)  AArch64 Assembly Printer #7
 47.6952 (  0.6%)   704.5987 (  0.6%)   752.2939 (  0.6%)   37.7206 (  0.5%)  AArch64 Assembly Printer #40
 30.7870 (  0.4%)   504.1012 (  0.4%)   534.8883 (  0.4%)   36.6904 (  0.5%)  AArch64 Assembly Printer #23
 26.1035 (  0.3%)   449.4197 (  0.4%)   475.5232 (  0.4%)   33.7455 (  0.4%)  AArch64 Assembly Printer #137
 32.0621 (  0.4%)   261.0043 (  0.2%)   293.0665 (  0.2%)   33.0863 (  0.4%)  AArch64 Assembly Printer #11
 36.7061 (  0.5%)   575.6889 (  0.5%)   612.3949 (  0.5%)   31.3014 (  0.4%)  AArch64 Assembly Printer #121
 40.1155 (  0.5%)   522.4378 (  0.5%)   562.5532 (  0.5%)   28.2140 (  0.4%)  AArch64 Assembly Printer #114
 25.4224 (  0.3%)   216.7932 (  0.2%)   242.2156 (  0.2%)   27.9494 (  0.4%)  AArch64 Assembly Printer #139
 34.6752 (  0.4%)   490.1866 (  0.4%)   524.8617 (  0.4%)   26.3398 (  0.3%)  Branch Probability Analysis #231
 34.2679 (  0.4%)   441.6632 (  0.4%)   475.9312 (  0.4%)   23.8663 (  0.3%)  AArch64 Assembly Printer #106
 35.0161 (  0.4%)   427.0860 (  0.4%)   462.1021 (  0.4%)   23.2253 (  0.3%)  AArch64 Assembly Printer #107
 22.1234 (  0.3%)   128.0110 (  0.1%)   150.1344 (  0.1%)   22.4227 (  0.3%)  AArch64 Assembly Printer #92
 20.8856 (  0.3%)   138.7600 (  0.1%)   159.6456 (  0.1%)   22.3985 (  0.3%)  AArch64 Assembly Printer #95
 18.4032 (  0.2%)   189.8183 (  0.2%)   208.2215 (  0.2%)   21.9948 (  0.3%)  AArch64 Assembly Printer #100
 16.5243 (  0.2%)   387.3604 (  0.3%)   403.8847 (  0.3%)   20.2050 (  0.3%)  AArch64 Assembly Printer #28
 18.5907 (  0.2%)    90.6694 (  0.1%)   109.2601 (  0.1%)   20.0015 (  0.3%)  AArch64 Assembly Printer #140
 19.3601 (  0.2%)    72.3643 (  0.1%)    91.7244 (  0.1%)   17.7730 (  0.2%)  AArch64 Assembly Printer #141
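Because the ThinLTO report lists the same pass many times with a `#N` suffix, it can be easier to read once rows are grouped by pass name. The following is a minimal sketch of such a grouping, assuming each row has the four `seconds (percent%)` columns followed by the pass name, as in the reports above. The `aggregate` function and the `SAMPLE` text are illustrative (the sample is five rows copied from the ThinLTO report), and the script is not part of LLVM.

```python
import re
from collections import defaultdict

# One report row: four "<seconds> (<percent>%)" columns, then the pass name.
ROW = re.compile(
    r"^\s*([\d.]+)\s*\(\s*[\d.]+%\)"   # user time
    r"\s+([\d.]+)\s*\(\s*[\d.]+%\)"    # system time
    r"\s+([\d.]+)\s*\(\s*[\d.]+%\)"    # user+system time
    r"\s+([\d.]+)\s*\(\s*[\d.]+%\)"    # wall time
    r"\s+(.*\S)\s*$"                   # pass name
)
# Trailing " #N" that distinguishes repeated instances of the same pass.
INSTANCE_SUFFIX = re.compile(r"\s+#\d+$")

def aggregate(report):
    """Sum user+system and wall seconds per pass name, ignoring '#N'."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # cpu, wall, row count
    for line in report.splitlines():
        # Text copied from a web page may contain zero-width spaces.
        line = line.replace("\u200b", "")
        match = ROW.match(line)
        if match is None:
            continue  # header, separator, or blank line
        name = INSTANCE_SUFFIX.sub("", match.group(5))
        entry = totals[name]
        entry[0] += float(match.group(3))
        entry[1] += float(match.group(4))
        entry[2] += 1
    return dict(totals)

# Five rows copied from the ThinLTO report, for illustration only.
SAMPLE = """\
304.7405 ( 3.8%) 2.1033 ( 0.0%) 306.8439 ( 0.2%) 306.8722 ( 4.0%) Lower type metadata
289.0256 ( 3.6%) 0.8153 ( 0.0%) 289.8409 ( 0.2%) 289.8686 ( 3.7%) Branch Probability Basic Block Placement
76.6860 ( 0.9%) 1136.7987 ( 1.0%) 1213.4848 ( 1.0%) 62.8574 ( 0.8%) AArch64 Assembly Printer #116
52.7947 ( 0.7%) 782.3718 ( 0.7%) 835.1665 ( 0.7%) 41.8995 ( 0.5%) AArch64 Assembly Printer #7
47.6952 ( 0.6%) 704.5987 ( 0.6%) 752.2939 ( 0.6%) 37.7206 ( 0.5%) AArch64 Assembly Printer #40
"""

totals = aggregate(SAMPLE)
# Print passes by descending accumulated wall time.
for name, (cpu, wall, count) in sorted(
    totals.items(), key=lambda item: item[1][1], reverse=True
):
    print(f"{wall:10.4f} wall  {cpu:10.4f} cpu  x{count:<3d} {name}")
```

The summed values should be read as accumulated timer values rather than elapsed time: the report's own total wall clock (7749.4743 seconds) is far larger than the roughly 17-minute link, presumably because timers from concurrently running ThinLTO backends are added together.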

Labels: LTO (link time optimization, regular/full LTO or ThinLTO), backend:AArch64, bugzilla (issues migrated from bugzilla)