Issues: arrayfire/arrayfire
Issues list
[Perf] OpenCL vs CUDA performance for Black Hole Example
#3649 · perf · opened Mar 28, 2025 by christophe-murphy

[Perf] Reduction over rows of a multi dimension array takes a while
#3582 · perf · opened Aug 12, 2024 by HugoPhibbs · 1 task done

[Perf] Parallel execution of af::fftConvolve not possible
#3250 · perf · opened Apr 25, 2022 by Gobutah · 1 task done

Max method seems to be sequential for large data on CPU backend
#3011 · feature, perf · opened Sep 11, 2020 by mehran-kh-z · 1 task done

speed up f16 on CUDA with hmul2, hadd2, etc.
#2752 · CUDA, improvement, perf · opened Feb 9, 2020 by ghost

perf of logsoftmax afcuda vs cudnn
#2609 · CUDA, improvement, perf · opened Aug 15, 2019 by WilliamTambellini

use int indexes for JIT kernels if no overflow possible
#2364 · improvement, perf · opened Nov 30, 2018 by WilliamTambellini

Do not use additional memory when broadcasting weights in mean
#1935 · perf · opened Sep 18, 2017 by pavanky

Optimization request: improve af::convolve for small kernel sizes
#1874 · perf · opened Jul 19, 2017 by pthon

Compare performance of fftConvolve with the following alternative implementation
#938 · perf · opened Aug 7, 2015 by pavanky