Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Commit 0dcf753

Browse filesBrowse files
committed
Improve the numeric width_bucket() computation.
Formerly, the computation of the bucket index involved calling div_var() with a scale determined by select_div_scale(), and then taking the floor of the result. That involved computing anything from 16 to 1000 digits after the decimal point, only for floor_var() to throw them away. In addition, the quotient was computed with rounding in the final digit, which meant that in rare cases the whole result could round up to the wrong bucket, and could exceed count. Thus it was also necessary to clamp the result to the range [1, count], though that didn't prevent the result being in the wrong internal bucket. Instead, compute the quotient using floor division, which guarantees the correct result, as specified by the SQL spec, and doesn't need to be clamped. This is both much simpler and more efficient, since it no longer computes any quotient digits after the decimal point. In addition, it is not necessary to have separate code to handle reversed bounds, since the signs cancel out when dividing. As with b0e9e4d and a2a0c7c, no back-patch. Dean Rasheed, reviewed by Joel Jacobson. Discussion: https://postgr.es/m/CAEZATCVbJH%2BLE9EXW8Rk3AxLe%3DjbOk2yrT_AUJGGh5Rah6zoeg%40mail.gmail.com
1 parent 628c1d1 commit 0dcf753
Copy full SHA for 0dcf753

File tree

Expand file treeCollapse file tree

1 file changed

+18
-31
lines changed
Filter options
Expand file treeCollapse file tree

1 file changed

+18
-31
lines changed

‎src/backend/utils/adt/numeric.c

Copy file name to clipboardExpand all lines: src/backend/utils/adt/numeric.c
+18-31Lines changed: 18 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -608,7 +608,7 @@ static void round_var(NumericVar *var, int rscale);
608608
static void trunc_var(NumericVar *var, int rscale);
609609
static void strip_var(NumericVar *var);
610610
static void compute_bucket(Numeric operand, Numeric bound1, Numeric bound2,
611-
const NumericVar *count_var, bool reversed_bounds,
611+
const NumericVar *count_var,
612612
NumericVar *result_var);
613613

614614
static void accum_sum_add(NumericSumAccum *accum, const NumericVar *val);
@@ -1896,7 +1896,7 @@ width_bucket_numeric(PG_FUNCTION_ARGS)
18961896
else if (cmp_numerics(operand, bound2) >= 0)
18971897
add_var(&count_var, &const_one, &result_var);
18981898
else
1899-
compute_bucket(operand, bound1, bound2, &count_var, false,
1899+
compute_bucket(operand, bound1, bound2, &count_var,
19001900
&result_var);
19011901
break;
19021902

@@ -1907,7 +1907,7 @@ width_bucket_numeric(PG_FUNCTION_ARGS)
19071907
else if (cmp_numerics(operand, bound2) <= 0)
19081908
add_var(&count_var, &const_one, &result_var);
19091909
else
1910-
compute_bucket(operand, bound1, bound2, &count_var, true,
1910+
compute_bucket(operand, bound1, bound2, &count_var,
19111911
&result_var);
19121912
break;
19131913
}
@@ -1926,14 +1926,13 @@ width_bucket_numeric(PG_FUNCTION_ARGS)
19261926

19271927
/*
19281928
* 'operand' is inside the bucket range, so determine the correct
1929-
* bucket for it to go. The calculations performed by this function
1929+
* bucket for it to go in. The calculations performed by this function
19301930
* are derived directly from the SQL2003 spec. Note however that we
19311931
* multiply by count before dividing, to avoid unnecessary roundoff error.
19321932
*/
19331933
static void
19341934
compute_bucket(Numeric operand, Numeric bound1, Numeric bound2,
1935-
const NumericVar *count_var, bool reversed_bounds,
1936-
NumericVar *result_var)
1935+
const NumericVar *count_var, NumericVar *result_var)
19371936
{
19381937
NumericVar bound1_var;
19391938
NumericVar bound2_var;
@@ -1943,34 +1942,22 @@ compute_bucket(Numeric operand, Numeric bound1, Numeric bound2,
19431942
init_var_from_num(bound2, &bound2_var);
19441943
init_var_from_num(operand, &operand_var);
19451944

1946-
if (!reversed_bounds)
1947-
{
1948-
sub_var(&operand_var, &bound1_var, &operand_var);
1949-
sub_var(&bound2_var, &bound1_var, &bound2_var);
1950-
}
1951-
else
1952-
{
1953-
sub_var(&bound1_var, &operand_var, &operand_var);
1954-
sub_var(&bound1_var, &bound2_var, &bound2_var);
1955-
}
1945+
/*
1946+
* Per spec, bound1 is inclusive and bound2 is exclusive, and so we have
1947+
* bound1 <= operand < bound2 or bound1 >= operand > bound2. Either way,
1948+
* the result is ((operand - bound1) * count) / (bound2 - bound1) + 1,
1949+
* where the quotient is computed using floor division (i.e., division to
1950+
* zero decimal places with truncation), which guarantees that the result
1951+
* is in the range [1, count]. Reversing the bounds doesn't affect the
1952+
* computation, because the signs cancel out when dividing.
1953+
*/
1954+
sub_var(&operand_var, &bound1_var, &operand_var);
1955+
sub_var(&bound2_var, &bound1_var, &bound2_var);
19561956

19571957
mul_var(&operand_var, count_var, &operand_var,
19581958
operand_var.dscale + count_var->dscale);
1959-
div_var(&operand_var, &bound2_var, result_var,
1960-
select_div_scale(&operand_var, &bound2_var), true);
1961-
1962-
/*
1963-
* Roundoff in the division could give us a quotient exactly equal to
1964-
* "count", which is too large. Clamp so that we do not emit a result
1965-
* larger than "count".
1966-
*/
1967-
if (cmp_var(result_var, count_var) >= 0)
1968-
set_var_from_var(count_var, result_var);
1969-
else
1970-
{
1971-
add_var(result_var, &const_one, result_var);
1972-
floor_var(result_var, result_var);
1973-
}
1959+
div_var(&operand_var, &bound2_var, result_var, 0, false);
1960+
add_var(result_var, &const_one, result_var);
19741961

19751962
free_var(&bound1_var);
19761963
free_var(&bound2_var);

0 commit comments

Comments
0 (0)
Morty Proxy This is a proxified and sanitized view of the page, visit original site.