Fixed initialization error on gebrd #3422
Merged
test_svd_opencl.exe fails on a Radeon R7, although it passes on a GTX 750 Ti.
Description
Output:
ArrayFire v3.9.0 (OpenCL, 64-bit Windows, build 64586e0)
[0] AMD: Spectre, 6571 MB -- OpenCL 2.0 AMD-APP (3224.5) -- Device driver 3224.5 -- FP64 Support: True
-1- NVIDIA: NVIDIA GeForce GTX 750 Ti, 2047 MB -- OpenCL 3.0 CUDA -- Device driver 531.61 -- FP64 Support:
svd/2.Square, where TypeParam = struct af::af_cfloat
svd.cpp(150): LAPACKE Error (-5)
svd/2.Rect0, where TypeParam = struct af::af_cfloat
svd.cpp(150): LAPACKE Error (-5)
svd/2.Rect1, where TypeParam = struct af::af_cfloat
svd.cpp(150): LAPACKE Error (-5)
svd/2.InPlaceSquare, where TypeParam = struct af::af_cfloat
svd.cpp(150): LAPACKE Error (-5)
svd/2.InPlaceRect0, where TypeParam = struct af::af_cfloat
svd.cpp(150): LAPACKE Error (-5)
Cause:
The MAGMA gebrd function runs in a hybrid mode, keeping host buffers and device buffers in sync.
A new device buffer, dwork, is created for the corresponding host buffer, work.
The host buffer comes from a vector object, which initializes all elements to 0.0.
The corresponding device buffer is not initialized.
On the AMD device, this caused gebrd to produce NaN values, which the subsequent
LAPACKE copy function detects and reports as error -5.
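
The failure mode described above can be illustrated with a small host-only sketch. It is not the code from this PR: `fake_device_alloc`, `accumulate`, and `run` are hypothetical names, and the "device" buffer is modeled as a `std::vector` pre-filled with NaN so that the effect of unspecified contents is reproducible. It only shows why a workspace that is read before being written must be zero-filled to match its host counterpart.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a device allocation. Real device memory has
// unspecified contents; NaN is used here so the problem shows up every time.
std::vector<float> fake_device_alloc(std::size_t n) {
    return std::vector<float>(n, std::numeric_limits<float>::quiet_NaN());
}

// Hypothetical stand-in for a kernel that accumulates into its workspace,
// i.e. it reads the old contents of `work` before writing.
void accumulate(std::vector<float>& work, const std::vector<float>& x) {
    for (std::size_t i = 0; i < work.size(); ++i) {
        work[i] += x[i];
    }
}

// Returns true if any element of v is NaN.
bool has_nan(const std::vector<float>& v) {
    return std::any_of(v.begin(), v.end(),
                       [](float f) { return std::isnan(f); });
}

// Runs the accumulation with or without zero-filling the workspace first
// and reports whether the result contains NaN.
bool run(bool zero_fill) {
    const std::size_t n = 8;
    std::vector<float> x(n, 1.0f);                    // input data
    std::vector<float> dwork = fake_device_alloc(n);  // "device" workspace
    if (zero_fill) {
        // Mirror the host vector, which starts at 0.0 in every element.
        std::fill(dwork.begin(), dwork.end(), 0.0f);
    }
    accumulate(dwork, x);
    return has_nan(dwork);
}
```

Here `run(false)` reports NaN in the result and `run(true)` does not. In an actual OpenCL code path a zero fill could be issued with `clEnqueueFillBuffer` (OpenCL 1.2 and later); whether this PR uses that call or another mechanism is not stated in the description.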
Additional information about the PR answering following questions:
Fixes: #3147
Changes to Users
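Fixed an occasional SVD failure (LAPACKE error -5) on the OpenCL backend, caused by an uninitialized device work buffer in gebrd.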
Checklist