Use -fspecialize-aggressively to improve performance by 30% on ACME build by seastian · Pull Request #4584 · purescript/purescript

seastian · Oct 4, 2025

Description of the change

Use -fspecialize-aggressively to improve performance by 30% on the ACME build.

Stats for ACME build before change

'purs' 'compile' '--source-globs-file' 'sources.txt' +RTS '-N' '-sstats.txt' 
 448,761,996,296 bytes allocated in the heap
  86,000,620,168 bytes copied during GC
   1,389,766,704 bytes maximum residency (39 sample(s))
      15,732,600 bytes maximum slop
            3891 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     20264 colls, 20264 par   88.627s  26.122s     0.0013s    0.0084s
  Gen  1        39 colls,    38 par   29.499s   3.859s     0.0990s    0.1906s

  Parallel GC work balance: 66.99% (serial 0%, perfect 100%)

  TASKS: 42 (1 bound, 41 peak workers (41 total), using -N10)

  SPARKS: 4556 (4556 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.004s  (  0.004s elapsed)
  MUT     time  171.470s  ( 22.159s elapsed)
  GC      time  118.125s  ( 29.982s elapsed)
  EXIT    time    0.065s  (  0.010s elapsed)
  Total   time  289.665s  ( 52.155s elapsed)

  Alloc rate    2,617,143,209 bytes per MUT second

  Productivity  59.2% of total user, 42.5% of total elapsed

Stats for ACME build after change

'purs' 'compile' '--source-globs-file' 'sources.txt' +RTS '-N' '-sstats.txt' 
 244,192,893,800 bytes allocated in the heap
  75,082,410,768 bytes copied during GC
   1,409,996,352 bytes maximum residency (34 sample(s))
      16,230,848 bytes maximum slop
            3888 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     13248 colls, 13248 par   56.774s  19.087s     0.0014s    0.0089s
  Gen  1        34 colls,    33 par   23.722s   3.168s     0.0932s    0.1958s

  Parallel GC work balance: 63.18% (serial 0%, perfect 100%)

  TASKS: 42 (1 bound, 41 peak workers (41 total), using -N10)

  SPARKS: 4556 (4556 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.005s  (  0.004s elapsed)
  MUT     time  104.922s  ( 14.618s elapsed)
  GC      time   80.497s  ( 22.255s elapsed)
  EXIT    time    0.070s  (  0.010s elapsed)
  Total   time  185.493s  ( 36.887s elapsed)

  Alloc rate    2,327,371,167 bytes per MUT second

  Productivity  56.6% of total user, 39.6% of total elapsed
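
To make the comparison easier to check, here is a minimal sketch that recomputes the headline deltas from the two RTS summaries above. The figures are copied from the stats; the dictionary keys and helper names are made up for illustration.

```python
# Figures copied from the two GHC RTS (-s) summaries above.
# Key names and helpers are illustrative, not part of any tool.
before = {
    "elapsed_s": 52.155,       # Total time (elapsed)
    "cpu_s": 289.665,          # Total time (CPU, summed over cores)
    "gc_elapsed_s": 29.982,    # GC time (elapsed)
    "alloc_bytes": 448_761_996_296,
}
after = {
    "elapsed_s": 36.887,
    "cpu_s": 185.493,
    "gc_elapsed_s": 22.255,
    "alloc_bytes": 244_192_893_800,
}


def reduction_pct(old, new):
    """Percentage by which `new` is smaller than `old`."""
    return 100.0 * (old - new) / old


def gc_share_pct(run):
    """Share of elapsed (wall-clock) time spent in GC."""
    return 100.0 * run["gc_elapsed_s"] / run["elapsed_s"]


summary = {
    "elapsed time reduction": reduction_pct(before["elapsed_s"], after["elapsed_s"]),
    "CPU time reduction": reduction_pct(before["cpu_s"], after["cpu_s"]),
    "allocation reduction": reduction_pct(before["alloc_bytes"], after["alloc_bytes"]),
    "GC share of elapsed (before)": gc_share_pct(before),
    "GC share of elapsed (after)": gc_share_pct(after),
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

On these numbers, wall-clock time drops by roughly 29% and heap allocation by roughly 46%, while GC still accounts for well over half of the elapsed time in both runs.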

Binary size went from 110MB to 130MB. I think this is fine and worth the speed improvements.

Compiling purs itself takes longer, but during development we can always use stack build --fast.

Repo with acme build: https://github.com/seastian/purs-acme

Yes, we spend a lot of time in GC 🫣. Much of that can be reduced with different RTS options (+RTS -A256m -n16m -RTS), at the cost of more RAM.

What do you all think?

Checklist:

Added a file to CHANGELOG.d for this PR (see CHANGELOG.d/README.md)
Added myself to CONTRIBUTORS.md (if this is my first contribution)
Linked any existing issues or proposals that this pull request should close
Updated or added relevant documentation
Added a test for the contribution (if applicable)

f-f

Very happy to merge this patch 😄

Can anyone else have a look?

garyb

Binary size went from 110MB to 130MB. I think this is fine and worth the speed improvements.

Seems like a reasonable tradeoff to me 👍

MonoidMusician · Oct 13, 2025

Wow, that's a great improvement! I can reproduce the 30% speedup for ACME, and on our work codebase it is more like 40%, with both single-threaded and multi-threaded builds.

Could you also add it to purescript.cabal for cabal builds? Just to be consistent.

diff --git a/purescript.cabal b/purescript.cabal
index 7601ec39..70e7dabb 100644
--- a/purescript.cabal
+++ b/purescript.cabal
@@ -402,7 +402,7 @@ executable purs
   import: defaults
   hs-source-dirs: app
   main-is: Main.hs
-  ghc-options: -fno-warn-unused-do-bind -threaded -rtsopts -with-rtsopts=-N -Wno-unused-packages
+  ghc-options: -fno-warn-unused-do-bind -threaded -rtsopts -with-rtsopts=-N -Wno-unused-packages -fspecialize-aggressively -fexpose-all-unfoldings
   build-depends:
     prettyprinter >=1.7.1 && <1.8,
     prettyprinter-ansi-terminal >=1.1.3 && <1.2,

seastian · Oct 14, 2025

@MonoidMusician I am not sure where that should go, purescript.cabal or cabal.project?

If it goes to purescript.cabal then it needs to be added to both library and executable and can be dropped from stack.yaml.

The benefit of adding it to cabal.project is that it mimics what stack is doing, and people who use purescript as a library will not be forced to compile it with optimizations on (rare?). I just checked, and the text package includes an -O2 flag in its text.cabal, so I guess it is fine to have optimization options in cabal files.

What do you think?

MonoidMusician · Oct 14, 2025

If it goes to purescript.cabal then it needs to be added to both library and executable and can be dropped from stack.yaml.

I'm not sure how that follows. Is there a problem with adding it to just executable like I did? I am not a cabal/stack expert but it did seem to build me a binary with optimizations. I guess any solution that adds it to purs-the-binary and does not require it on purescript-the-library is fine with me.

purefunctor · Oct 14, 2025

Both would work for the release-distributed builds, but for Hackage-based installs it would need to be in purescript.cabal since cabal.project is not included in the cabal sdist by default.

purefunctor · Oct 14, 2025

This one is actually quite subtle: with the flag only on the executable, the inlining/specialisation would only apply to the executable's modules. I think adding it to cabal.project would be best, since it optimises both the purescript and purs targets for our release builds. This will not trickle downstream to library consumers, but they can always add ghc-options through their own cabal.project or stack.yaml.
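
For illustration, a minimal sketch of what the cabal.project approach could look like. It assumes a per-package stanza for purescript and reuses the two flags from the purescript.cabal diff above; the exact contents of the merged change are not shown in this thread.

```
-- cabal.project (sketch; this stanza is an assumption, not the merged diff)
-- A `package` stanza applies ghc-options to every component of that
-- package, so both the library and the purs executable are covered.
package purescript
  ghc-options: -fspecialize-aggressively -fexpose-all-unfoldings
```

A downstream project that depends on purescript as a library could opt in by adding the same stanza to its own cabal.project.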

seastian · Oct 15, 2025

Okay, added to cabal.project! This is ready to go in. It would be really nice to make a release; we want to use this on our CI 😊


Use -fspecialize-aggressively to improve performance

51b1bd9

f-f approved these changes Oct 7, 2025


garyb approved these changes Oct 12, 2025


add fspecialize to cabal project

bf22cdf

MonoidMusician approved these changes Oct 15, 2025


purefunctor merged commit 8ac0fb2 into purescript:master Oct 18, 2025
7 checks passed

finnhodgkin mentioned this pull request Nov 4, 2025: Make/build cut-off #4477 (Open)

wclr mentioned this pull request Nov 6, 2025: Use -fspecialize-aggressively to improve performance by 30% wclr/purescript#3 (Merged)

purefunctor mentioned this pull request Dec 1, 2025: Use -fspecialize-aggressively to improve performance by 30% on ACME b… OxfordAbstracts/purescript#16 (Merged)

purefunctor merged 2 commits into purescript:master from seastian:master