[Perf] Bad perf for ragged max #3382

Open
@WilliamTambellini

Description


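Hi,
The performance of ragged max relative to plain max is not as good as it was 2 years ago. In #2786, both took about the same time.
As of today (values are 1000 * af::timeit(...), see the reproducible code below):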

ArrayFire v3.8.2 (CUDA, 64-bit Linux, build a9b6b0e)
Platform: CUDA Runtime 11.4, Driver: 495.29.05
[0] Quadro RTX 3000, 5935 MB, CUDA Compute 7.5

Max vs ragged max
                 M               max            ragmax
                 8        0.00241453        0.00872827
                16         0.0023666        0.00861896
                32        0.00237074        0.00865682
                64        0.00245299         0.0087447
               128        0.00240028        0.00879951
               256        0.00414862        0.00866826
               512        0.00701668        0.0096099
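To make the gap easier to read, the slowdown of ragged max relative to plain max can be computed per row. The snippet below is only a sketch of that arithmetic: the numbers are copied from the table above, while the `Row` struct, `kRows`, and `slowdown` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>

// One row of the table above: matrix size M and the two timings as printed.
struct Row {
    int m;
    double max_time;
    double ragmax_time;
};

// Values copied from the benchmark output in this issue.
static const Row kRows[] = {
    {8,   0.00241453, 0.00872827},
    {16,  0.0023666,  0.00861896},
    {32,  0.00237074, 0.00865682},
    {64,  0.00245299, 0.0087447},
    {128, 0.00240028, 0.00879951},
    {256, 0.00414862, 0.00866826},
    {512, 0.00701668, 0.0096099},
};
static const std::size_t kNumRows = sizeof(kRows) / sizeof(kRows[0]);

// Ratio ragmax / max for one row; values above 1 mean ragged max is slower.
double slowdown(const Row& r) {
    assert(r.max_time > 0.0);
    return r.ragmax_time / r.max_time;
}
```

Calling `slowdown(kRows[i])` for each row gives a factor of roughly 3.6 for M = 8 and roughly 1.4 for M = 512, so the relative gap is largest for the small sizes.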

Description

  • Did you build ArrayFire yourself or did you use the official installers: myself
  • Which backend is experiencing this issue? (CPU, CUDA, OpenCL): CUDA
  • Do you have a workaround? no
  • Can the bug be reproduced reliably on your system? yes

Reproducible Code

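#include <arrayfire.h>
#include <iostream>

// BENCH is the reporter's output macro; its definition is not included in this issue.
af::array *a, *b;
// Plain max along dim 0.
void doMax() {
  af::max(*a, 0).eval();
}
// Ragged max along dim 0: *b holds the per-column length to reduce over.
void doRMax() {
  af::array val;
  af::array idx;
  af::max(val, idx, *a, *b, 0);
}
void raggedMax(const double REPEAT = 20) {  // REPEAT is currently unused
  BENCH("Max vs ragged max\n");
  BENCH("M");
  BENCH("max");
  BENCH("ragmax");
  BENCH(std::endl);
  for (int s = 8; s <= 512; s *= 2) {
    BENCH(s);
    a = new af::array(s, s);  // s x s array, contents uninitialized
    BENCH(1000 * af::timeit(doMax));
    // Ragged length s/2 for each of the s columns.
    af::array seqlen = af::constant((unsigned)s / 2, 1, s, u32);
    b = &seqlen;
    BENCH(1000 * af::timeit(doRMax));
    delete a;
    BENCH(std::endl);
  }
}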
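For readers unfamiliar with the operation being timed, here is a minimal CPU sketch of what a ragged max along dim 0 computes: for each column, the maximum over only the first `len[c]` elements, plus its row index. This is not ArrayFire's implementation. `raggedMaxRef` and `RaggedMaxResult` are hypothetical names, and the handling of ties (first occurrence wins), of lengths larger than the column (clamped), and of zero lengths (rejected by an assertion) are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Result of a ragged max along dim 0: one value and one row index per column.
struct RaggedMaxResult {
    std::vector<float> val;
    std::vector<unsigned> idx;
};

// CPU reference sketch. `data` is a rows x cols matrix stored column-major;
// `len[c]` is the number of leading elements of column c that take part.
RaggedMaxResult raggedMaxRef(const std::vector<float>& data,
                             std::size_t rows, std::size_t cols,
                             const std::vector<unsigned>& len) {
    assert(data.size() == rows * cols);
    assert(len.size() == cols);

    RaggedMaxResult out{std::vector<float>(cols), std::vector<unsigned>(cols)};
    for (std::size_t c = 0; c < cols; ++c) {
        // Assumption: lengths beyond the column height are clamped.
        const std::size_t n = std::min<std::size_t>(len[c], rows);
        assert(n > 0);  // Assumption: an empty range is treated as an error.

        const float* col = data.data() + c * rows;
        std::size_t best = 0;
        for (std::size_t r = 1; r < n; ++r) {
            if (col[r] > col[best]) best = r;  // strict '>' keeps the first maximum
        }
        out.val[c] = col[best];
        out.idx[c] = static_cast<unsigned>(best);
    }
    return out;
}
```

In the reproducer above, `seqlen` plays the role of `len`: a 1 x s array holding s/2 for every column, so each ragged reduction reads only half of the elements that the plain max reads.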

System Information

ArrayFire Version: 3.8.2
Device: RTX
Operating System: ubuntu
Driver version: 495
