Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Computing a function when only its inverse is known, using Newson-Raphson method for 1D,2D,3D arrays in parallel.

License

Notifications You must be signed in to change notification settings

tugrul512bit/InverseFX

Open more actions menu

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

39 Commits
 
 
 
 
 
 
 
 

Repository files navigation

InverseFX

Computing a function when only its inverse is known, using Newson-Raphson method for 1D,2D,3D arrays in parallel. When user has parallelized version of f(x), it becomes faster. The overall speedup is about 7x for AVX and 15x for AVX512 with x-square function.

Requirements:

  • C++14 compiler
  • Auto-vectorization of compiler
  • Auto-vectorization flags -std=c++14 -O3 -march=native -mavx2 -mprefer-vector-width=256 -ftree-vectorize -fno-math-errno
  • Or AVX512 version of flags and 512 vector width

How it works:

  • user has an f(x) function given to the constructor
  • constructor also takes h(step length) argument for the discrete-derivative calculations such as "two point derivative"/"five point stencil"
  • whenever inverse of f(x) at x=x0 is needed, user can call obj.computeInverse...() method
  • the computeInverse..() method uses Newton-Raphson method and discrete-derivative to approach inverse f(x0)

Usage:

  • parallelized (fast) inversion
InverseFX::ParallelInverse<float> invPar(
        // user's parallel f(x) function 
        // maps one to one from inp to out, for n elements from first element
        [](float * inp, float * out, int n){
		// C++ compiler vectorizes simple loop easily and possibly inlines this lambda for efficient SIMD
		for(int i=0;i<n;i++)
		{
		    out[i]=inp[i]*inp[i]; // square of x
		}
	},
        // h value that is used for computing two-point derivative inside the inversion logic
        0.001f
);

float inp[n],outp[n];

// compute inverse of f(x) at all points (n elements), reading from inp and writing result to outp
invPar.computeInverseLowQuality(inp,outp,n); // square-root is found for n-elements
  • parallelized inversion using scalar f(x) function
InverseFX::ParallelInverse<float> invPar(
      // f(x)=x*x, for answering question of "what is inverse of x*x?"
      // not as efficient as the parallel f(x) but rest of the algorithm is still parallelized
      [](float inp){ return inp*inp;},
      0.001f /* h */
);

float inp[n],outp[n];

// compute inverse of f(x) at all points (n elements), reading from inp and writing result to outp
invPar.computeInverseLowQuality(inp,outp,n); 
  • scalar inversion on scalar f(x) function
InverseFX::ScalarInverse<float> inv([](float inp){
	// black-box function sample
	// f(x)=x*x
	return inp*inp;
},0.001f);

// inverse of x*x is sqrt(x)
float squareRootOfPI = inv.computeInverseLowQuality(3.1415f);

About

Computing a function when only its inverse is known, using Newson-Raphson method for 1D,2D,3D arrays in parallel.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

Morty Proxy This is a proxified and sanitized view of the page, visit original site.