Commit 9840715
build: encode non-ASCII Latin1 characters as one byte in JS2C
Previously we had two encodings for JS files:
1. If a file contains only ASCII characters, encode it as a one-byte
string (interpreted as uint8_t array during loading).
2. If a file contains any characters with code point above 127,
encode it as a two-byte string (interpreted as uint16_t array
during loading).
This was done because V8 supports only Latin-1 and UTF-16 as the
underlying representations for strings. To store the JS code as
external strings, which saves encoding cost and memory overhead,
we need to follow the representations supported by V8.
Notice that there was a gap: code points in the non-ASCII Latin-1
range (128-255) were encoded as two-byte, which was an undocumented
TODO for a long time. That was fine previously because every file
that contained code points beyond the 0-127 range also contained
code points >255. Now we have undici, which contains only code
points in the range 0-255 (apart from one replaceable code point
>255). So this patch adds handling for the 128-255 range to avoid
the size overhead of encoding such files as two-byte. This could
reduce the size of the binary by ~500KB and helps future files
that contain this kind of code point.
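
The three-way choice described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `Encoding`, `DecodeUtf8`, `Classify`, `ToOneByte` and `ToTwoByte` are hypothetical names, and the decoder assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical names throughout; these are not identifiers from the patch.
enum class Encoding { kAscii, kLatin1, kTwoByte };

// Decodes UTF-8 into code points. Assumes well-formed input (no validation).
std::vector<char32_t> DecodeUtf8(std::string_view in) {
  std::vector<char32_t> out;
  for (size_t i = 0; i < in.size();) {
    const uint8_t b = static_cast<uint8_t>(in[i]);
    const int len = b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;
    // Keep only the payload bits of the leading byte.
    char32_t cp = len == 1 ? b : b & (0xFF >> (len + 1));
    for (int k = 1; k < len && i + k < in.size(); ++k)
      cp = (cp << 6) | (static_cast<uint8_t>(in[i + k]) & 0x3F);
    out.push_back(cp);
    i += len;
  }
  return out;
}

// Picks the narrowest representation that can hold every code point.
Encoding Classify(const std::vector<char32_t>& cps) {
  char32_t max_cp = 0;
  for (char32_t cp : cps)
    if (cp > max_cp) max_cp = cp;
  if (max_cp < 0x80) return Encoding::kAscii;
  if (max_cp <= 0xFF) return Encoding::kLatin1;  // the 128-255 gap
  return Encoding::kTwoByte;
}

// One byte per code point; only valid when Classify() is not kTwoByte.
std::vector<uint8_t> ToOneByte(const std::vector<char32_t>& cps) {
  std::vector<uint8_t> out;
  out.reserve(cps.size());
  for (char32_t cp : cps) out.push_back(static_cast<uint8_t>(cp));
  return out;
}

// UTF-16 code units, with surrogate pairs for code points above 0xFFFF.
std::vector<uint16_t> ToTwoByte(const std::vector<char32_t>& cps) {
  std::vector<uint16_t> out;
  for (char32_t cp : cps) {
    if (cp <= 0xFFFF) {
      out.push_back(static_cast<uint16_t>(cp));
    } else {
      const char32_t v = cp - 0x10000;
      out.push_back(static_cast<uint16_t>(0xD800 + (v >> 10)));
      out.push_back(static_cast<uint16_t>(0xDC00 + (v & 0x3FF)));
    }
  }
  return out;
}
```

Under this sketch, a file whose highest code point is, say, U+00E9 is classified as `kLatin1` and stored with one byte per character, where the old two-way split would have stored it with two.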
Drive-by: replace `’` with `'` in undici.js to make it a Latin-1
only string. That could be removed if undici updates itself to
replace this character in the comment.
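
For illustration, the drive-by substitution amounts to swapping the UTF-8 bytes of U+2019 (`E2 80 99`) for an ASCII apostrophe. The helper below is a hypothetical sketch of that operation; how the patch itself performs the replacement is not shown here.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: replaces every UTF-8 encoded U+2019 (right single
// quotation mark, bytes E2 80 99) with the ASCII apostrophe.
std::string ReplaceRightSingleQuote(std::string input) {
  constexpr std::string_view kQuote = "\xE2\x80\x99";
  size_t pos = 0;
  while ((pos = input.find(kQuote, pos)) != std::string::npos) {
    input.replace(pos, kQuote.size(), "'");
    pos += 1;  // continue after the inserted apostrophe
  }
  return input;
}
```

After the substitution, a source whose only code point above 255 was U+2019 fits entirely in the 0-255 range.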
PR-URL: #51605
Reviewed-By: Daniel Lemire <daniel@lemire.me>
Reviewed-By: Ethan Arrowood <ethan@arrowood.dev>
1 parent ea08350, commit 9840715
5 files changed: +212 -63 lines changed
(file name not captured)
7 additions, 2 deletions
Hunks (line content not captured):
@@ -73,6 +73,7 @@
@@ -193,6 +194,7 @@
@@ -1214,11 +1216,14 @@
src/embedded_data.cc
33 additions, 0 deletions
Hunks (line content not captured):
@@ -0,0 +1,33 @@
src/embedded_data.h
17 additions, 0 deletions
Hunks (line content not captured):
@@ -0,0 +1,17 @@
(file name not captured)
1 addition, 29 deletions
Hunks (line content not captured):
@@ -8,6 +8,7 @@
@@ -748,35 +749,6 @@
(file name not captured)
154 additions, 32 deletions
Hunks (line content not captured):
@@ -11,6 +11,7 @@
@@ -396,11 +397,14 @@
@@ -424,9 +428,15 @@
@@ -440,11 +450,14 @@
@@ -456,16 +469,56 @@
@@ -476,10 +529,10 @@
@@ -488,19 +541,25 @@
@@ -520,17 +579,80 @@