awk printf long number padding output incorrect

Question

Arch Linux 6.15.7-zen1-1-zen,

$ awk -V
GNU Awk 5.3.2, API 4.0, PMA Avon 8-g1, (GNU MPFR 4.2.2, GNU MP 6.3.0)

Start with y.csv:

4 2016201820192020
5 20162018201920202023
5 20162018201920202024
5 00000000000000002024

then, variants of printf:

$ awk '{print $1,sprintf("%020d",$2)}' y.csv
4 00002016201820192020
5 20162018201920200704
5 20162018201920200704
5 00000000000000002024
$ awk '{$2=sprintf("%020d",$2);print $1,$2}' y.csv
4 00002016201820192020
5 20162018201920200704
5 20162018201920200704
5 00000000000000002024
$ awk '{printf("%020d\n",$2)}' y.csv
00002016201820192020
20162018201920200704
20162018201920200704
00000000000000002024
$ awk '{printf("%020.0f\n",$2)}' y.csv
00002016201820192020
20162018201920200704
20162018201920200704
00000000000000002024

What's going on? The last 4 digits of the 2nd & 3rd lines are always changed, seemingly randomly, to 0704!

Don't interpret them as numbers, e.g. awk '{print $1, sprintf("%20s",$2)}' y.csv - pmf, Commented Jul 26 at 20:21
Your version of gawk has been compiled with the MPFR and MP libraries, so you should be able to use the -M (or --bignum) flag to ensure you get the desired output; see the gnu.org link mentioned in KamilCuk's answer for additional details on support for this feature - markp-fuso, Commented Jul 27 at 2:15

KamilCuk · Accepted Answer · 2025-07-26 21:34:18Z

7

What's going on?

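The number is too big to be handled as an integer, so awk treats it as an IEEE 754 double.

A double cannot represent every integer exactly: above 2^53 (9007199254740992) the value is rounded to the closest representable one. That is why both 20162018201920202023 and 20162018201920202024 come out as 20162018201920200704.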

Consider reading https://www.gnu.org/software/gawk/manual/gawk.html#Other-Stuff-to-Know . Consider -M option. See https://www.binaryconvert.com/result_double.html?decimal=050048049054050048049056050048049057050048050048050048050052 .
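
The rounding is easy to reproduce in isolation. A minimal sketch, assuming any awk that stores numbers as doubles (the default for gawk without -M); the variable name rounded is only for illustration:

```shell
# Feed the two 20-digit values from the question to awk and print them
# back as numbers. Without -M, awk parses each field into a double, so
# both inputs collapse to the same nearest representable value.
rounded=$(printf '%s\n' 20162018201920202023 20162018201920202024 |
  awk '{ printf "%020.0f\n", $1 }')
printf '%s\n' "$rounded"
```

On a typical system both lines should print 20162018201920200704: doubles in this range are 4096 apart, and that is the closest one to either input. With a gawk built with MPFR support, as in the question, adding -M (for example awk -M '{printf("%020d\n",$2)}' y.csv) should keep all 20 digits.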

RARE Kpop Manifesto · Accepted Answer · 2025-07-28 02:59:12Z

Bignum arithmetic is overkill for a formatting issue; pad the field as a string instead:

echo '
4 2016201820192020
5 20162018201920202023
5 20162018201920202024
5 00000000000000002024' | 

awk '$2 = sprintf("%.*d%s", (_ = 20 - length($2)) * (_ >= 1), 0, $2)'

4 00002016201820192020
5 20162018201920202023
5 20162018201920202024
5 00000000000000002024

The extra factor (_ >= 1) guards against the extremely unlikely event that $2 is longer than 20 characters. Without the guard, an extra 0 would be prepended for no reason.
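
If the one-liner is hard to read, the same string-only padding can be spelled out with a loop. This is a sketch, not the answerer's code: zpad is a hypothetical helper name, padded is an illustrative variable, and the third input line is made up to show the longer-than-20 case.

```shell
padded=$(printf '%s\n' \
    '4 2016201820192020' \
    '5 20162018201920202023' \
    '7 123456789012345678901234' |
  awk '
    # zpad (hypothetical helper): left-pad s with zeros up to width w.
    # The value is only ever handled as text, so no digits are lost.
    function zpad(s, w) {
      s = s ""                      # force the string form of the field
      while (length(s) < w) s = "0" s
      return s                      # unchanged if already w or longer
    }
    { $2 = zpad($2, 20); print }
  ')
printf '%s\n' "$padded"
```

The first line is padded to 00002016201820192020, the second keeps all 20 of its digits, and the 24-character value passes through unchanged.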
