I am testing pyfftw, the FFTW wrapper, on my Mac Pro with Anaconda (64-bit) and numpy 1.9.1. I get very similar results with mklfft.
import numpy as np
import pyfftw
def rfft2(A):
    a = pyfftw.n_byte_align_empty(A.shape, 8, 'float64')
    a[:] = np.copy(A)
    fft_object = pyfftw.builders.rfft2(a, threads=1)
    return fft_object()
def irfft2(Ahat):
    ah = pyfftw.n_byte_align_empty(Ahat.shape, 16, 'complex128')
    ah[:] = np.copy(Ahat)
    fft_object = pyfftw.builders.irfft2(ah, threads=1)
    return fft_object()
A = np.random.randn(64,64)
Now compute the real FFT with numpy and with two different pyfftw calls:
Ahat_np = np.fft.rfft2(A)
Ahat_fftw = pyfftw.interfaces.numpy_fft.rfft2(A, threads=1)
Ahat_fftw_2 = rfft2(A)
Check whether the two pyfftw calls give the same result:
In [9]: np.allclose(Ahat_fftw,Ahat_fftw_2,rtol=1.e-16,atol=1.e-16)
Out[9]: True
But there are some small differences from numpy's rfft2:
rtol,atol = 1.e-14,1.e-14
In [47]: np.allclose(Ahat_np,Ahat_fftw,rtol,atol)
Out[47]: False
The test above passes with a tolerance of 1.e-13.
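Rather than bisecting on the tolerance, the size of the disagreement can be measured directly. The following is a minimal sketch using only numpy. `max_abs_diff` and `max_rel_diff` are hypothetical helper names, and `np.fft.fft2` sliced to the half spectrum stands in for a second backend so that the snippet runs without pyfftw.

```python
import numpy as np

def max_abs_diff(x, y):
    """Largest elementwise absolute difference between two arrays."""
    return np.max(np.abs(x - y))

def max_rel_diff(x, y):
    """Largest absolute difference scaled by the largest magnitude in y."""
    return max_abs_diff(x, y) / np.max(np.abs(y))

rng = np.random.RandomState(0)
A = rng.randn(64, 64)

Ahat_ref = np.fft.rfft2(A)
# Stand-in for a second backend (assumption, not pyfftw):
# full complex FFT, keeping only the non-redundant half spectrum.
Ahat_alt = np.fft.fft2(A)[:, :A.shape[1] // 2 + 1]

eps = np.finfo(np.float64).eps
print("max abs diff:", max_abs_diff(Ahat_alt, Ahat_ref))
print("max rel diff in units of eps:", max_rel_diff(Ahat_alt, Ahat_ref) / eps)
```

For a 64x64 array of standard normal samples, the spectrum entries have a typical magnitude of about 64, so an absolute difference near 1.e-14 is roughly one unit of double-precision roundoff relative to the data.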
Now invert back to physical space and compare with the original array:
Anp = np.fft.irfft2(Ahat_np)
Afftw = pyfftw.interfaces.numpy_fft.irfft2(Ahat_fftw)
Afftw_2 = irfft2(Ahat_fftw_2)
In [40]: rtol,atol = 1.e-14, 1.e-14
In [41]: np.allclose(A,Anp,rtol,atol)
Out[41]: True
In [42]: np.allclose(A,Afftw,rtol,atol)
Out[42]: True
In [43]: np.allclose(A,Afftw_2,rtol,atol)
Out[43]: True
It is somewhat upsetting that the results do not agree to 1.e-15, but so far so good. Now repeat the calculations above on a nonsquare domain:
A = np.random.randn(64,66)
...
In [61]: rtol,atol = 1.e-13, 1.e-13
In [62]: np.allclose(A,Anp,rtol,atol)
Out[62]: True
In [63]: np.allclose(A,Afftw,rtol,atol)
Out[63]: False
In [64]: np.allclose(A,Afftw_2,rtol,atol)
Out[64]: False
In particular, there are "significant" differences between the original array and the one that goes through the pyfftw transforms:
In [65]: A-Afftw
Out[65]:
array([[ -1.27955443e-08, -4.15157833e-09, 2.97806797e-08, ...,
2.01382537e-08, 2.04817235e-08, 1.73261541e-08],
[ -2.57259580e-08, -1.81484963e-08, 4.83268750e-09, ...,
-3.66382132e-09, 1.07922297e-08, 2.94663406e-08],
[ -3.59105948e-09, -1.16771947e-08, 8.73914346e-09, ...,
1.38128254e-08, 3.13628652e-08, 3.43982829e-08],
...,
[ -1.86094601e-08, 4.02695504e-08, -1.77806448e-08, ...,
3.73103608e-08, 3.07046049e-08, -1.44515059e-08],
[ -7.43273176e-09, -2.24983171e-08, -5.21882319e-10, ...,
-1.52015343e-08, 6.26377332e-08, -5.06093842e-09],
[ 3.93612150e-08, -4.75781994e-08, -3.41397135e-08, ...,
1.64235306e-08, 3.56944274e-08, -7.90450114e-08]])
The same thing happens with the byte-aligned wrapper functions (Afftw_2).
Numpy's FFT is still fine:
In [66]: A-Anp
Out[66]:
array([[ -1.94289029e-15, -1.99840144e-15, 2.44249065e-15, ...,
-2.22044605e-16, 0.00000000e+00, 2.55351296e-15],
[ 3.33066907e-16, -4.44089210e-16, 2.66453526e-15, ...,
7.07767178e-16, 1.83186799e-15, -5.55111512e-16],
[ -1.24900090e-16, -1.66533454e-16, 2.66453526e-15, ...,
2.10942375e-15, 1.55431223e-15, 2.22044605e-16],
...,
[ -2.77555756e-15, 2.22044605e-15, -7.77156117e-16, ...,
3.55271368e-15, 1.11022302e-15, -3.44169138e-15],
[ 2.58126853e-15, 2.55351296e-15, -4.51028104e-17, ...,
-2.22044605e-15, -3.10862447e-15, -7.02216063e-15],
[ 6.66133815e-16, -2.22044605e-16, -4.21884749e-15, ...,
0.00000000e+00, 4.66293670e-15, -5.77315973e-15]])
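To make the round-trip comparison above easier to reproduce, the error can be reduced to a single number per backend and per shape. This is a sketch under stated assumptions: `roundtrip_error` is a hypothetical helper, it uses numpy's transforms by default, and it passes the output shape explicitly through the `s` argument of `irfft2`.

```python
import numpy as np

def roundtrip_error(A, forward=np.fft.rfft2, inverse=np.fft.irfft2):
    """Return max |A - inverse(forward(A))| for a real 2-D array A.

    The output shape is passed explicitly so that an odd-length last
    axis is also reconstructed with the right size.
    """
    Ahat = forward(A)
    A_back = inverse(Ahat, s=A.shape)
    return np.max(np.abs(A - A_back))

rng = np.random.RandomState(0)
for shape in [(64, 64), (64, 66), (64, 65)]:
    A = rng.randn(*shape)
    print(shape, roundtrip_error(A))
```

Assuming pyfftw is installed and its `pyfftw.interfaces.numpy_fft` functions accept the same `s` argument as numpy's, the helper could be called as `roundtrip_error(A, pyfftw.interfaces.numpy_fft.rfft2, pyfftw.interfaces.numpy_fft.irfft2)` to compare the two backends shape by shape.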
Any thoughts on this?