I already have an implementation of both the DCT and the IDCT, but they become very slow as the size of the matrix increases, despite proper optimization. Does anyone know a faster implementation of both, or a Java library that provides a faster implementation for the 2-dimensional case? Thanks.
/* Fills c with the factors c(u)*c(v), where c(0) = 1/sqrt(2) and c(k) = 1 otherwise
*/
public final double[][] initCoefficients(double[][] c)
{
    final int N = c.length;
    final double value = 1/Math.sqrt(2.0);
    for (int i=1; i<N; i++)
    {
        for (int j=1; j<N; j++)
        {
            c[i][j]=1;
        }
    }
    for (int i=0; i<N; i++)
    {
        c[i][0] = value;
        c[0][i] = value;
    }
    c[0][0] = 0.5;
    return c;
}
/* Computes the discrete cosine transform
*/
public final double[][] forwardDCT(double[][] input)
{
    final int N = input.length;
    final double mathPI = Math.PI;
    // double, not int: integer division N/2 gives the wrong scale for odd N
    final double halfN = N/2.0;
    final double doubN = 2.0*N;
    double[][] c = new double[N][N];
    c = initCoefficients(c);
    double[][] output = new double[N][N];
    for (int u=0; u<N; u++)
    {
        double temp_u = u*mathPI;
        for (int v=0; v<N; v++)
        {
            double temp_v = v*mathPI;
            double sum = 0.0;
            for (int x=0; x<N; x++)
            {
                int temp_x = 2*x+1;
                for (int y=0; y<N; y++)
                {
                    sum += input[x][y] * Math.cos((temp_x/doubN)*temp_u) * Math.cos(((2*y+1)/doubN)*temp_v);
                }
            }
            sum *= c[u][v]/ halfN;
            output[u][v] = sum;
        }
    }
    return output;
}
/*
* Computes the inverse discrete cosine transform
*/
public final double[][] inverseDCT(double[][] input)
{
    final int N = input.length;
    final double mathPI = Math.PI;
    // double, not int: integer division N/2 gives the wrong scale for odd N
    final double halfN = N/2.0;
    final double doubN = 2.0*N;
    double[][] c = new double[N][N];
    c = initCoefficients(c);
    double[][] output = new double[N][N];
    for (int x=0; x<N; x++)
    {
        int temp_x = 2*x+1;
        for (int y=0; y<N; y++)
        {
            int temp_y = 2*y+1;
            double sum = 0.0;
            for (int u=0; u<N; u++)
            {
                double temp_u = u*mathPI;
                for (int v=0; v<N; v++)
                {
                    sum += c[u][v] * input[u][v] * Math.cos((temp_x/doubN)*temp_u) * Math.cos((temp_y/doubN)*v*mathPI);
                }
            }
            sum /= halfN;
            output[x][y] = sum;
        }
    }
    return output;
}
Right now it's an O(n^4) algorithm: four nested loops, each doing n iterations. Separability gets that down to O(n^3) (or O(n^2 log n) if you're feeling brave enough to try the Fast Cosine Transform). It's actually even simpler than using the 2D formula, because all it is is this: do a 1D DCT on every row, then do a 1D DCT on every column.
Or (optionally), to make both parts exactly the same: do a 1D DCT on every row, transpose, do a 1D DCT on every row again, then transpose back.
The transpose means the second pass is really doing columns, and the two transposes undo each other.
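As a rough illustration of the row/transpose/row/transpose idea, here is a minimal sketch. The class and method names (SeparableDct, dct1d, dctRows, dct2d) are made up for this example and are not from the question. The 1D transform is the naive O(n^2) one, scaled by sqrt(2/n) * c(u), so that two passes give the same (2/n) * c(u) * c(v) scaling as the 2D formula in the question. It assumes a square matrix.

```java
// Sketch only: names are illustrative, not from the original post.
final class SeparableDct {

    /** Naive O(n^2) 1D DCT-II of one row, scaled by sqrt(2/n) * c(u). */
    static double[] dct1d(double[] in) {
        final int n = in.length;
        final double scale = Math.sqrt(2.0 / n);
        double[] out = new double[n];
        for (int u = 0; u < n; u++) {
            double sum = 0.0;
            for (int x = 0; x < n; x++) {
                sum += in[x] * Math.cos((2 * x + 1) * u * Math.PI / (2.0 * n));
            }
            // c(0) = 1/sqrt(2), c(u) = 1 for u > 0
            out[u] = scale * (u == 0 ? 1.0 / Math.sqrt(2.0) : 1.0) * sum;
        }
        return out;
    }

    /** Applies the 1D DCT to every row of m. */
    static double[][] dctRows(double[][] m) {
        double[][] out = new double[m.length][];
        for (int i = 0; i < m.length; i++) {
            out[i] = dct1d(m[i]);
        }
        return out;
    }

    /** Returns the transpose of a square matrix. */
    static double[][] transpose(double[][] m) {
        final int n = m.length;
        double[][] t = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                t[j][i] = m[i][j];
            }
        }
        return t;
    }

    /** Rows, transpose, rows, transpose: O(n^3) instead of O(n^4). */
    static double[][] dct2d(double[][] input) {
        return transpose(dctRows(transpose(dctRows(input))));
    }
}
```

Calling SeparableDct.dct2d(input) should give the same coefficients as forwardDCT(input) for a square input, up to floating-point rounding. The inverse can be built the same way from a 1D inverse DCT.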
So, the cosines. You note that
"precomputing the cosine seems difficult since am computing the cosine of the inner (loop) locals variables"
Those cosines are really just constants written down in formulaic form; that array depends only on n. For example, look at how FFmpeg does it in dctref.c.
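To make the "constants that depend only on n" point concrete, here is a hedged sketch that builds the cosine table once and reuses it for both directions. The names (TableDct, cosineTable, forward, inverse) are hypothetical, and this is not how FFmpeg's dctref.c is written. It only shows the idea: with table T[u][x] = sqrt(2/n) * c(u) * cos((2x+1) * u * pi / (2n)), the forward transform is T * X * T^T and the inverse is T^T * Y * T, so no Math.cos call is left inside the hot loops.

```java
// Sketch only: names are illustrative, not from the original post or FFmpeg.
final class TableDct {

    /** Builds T[u][x] = sqrt(2/n) * c(u) * cos((2x+1) u pi / (2n)); depends only on n. */
    static double[][] cosineTable(int n) {
        double[][] t = new double[n][n];
        final double scale = Math.sqrt(2.0 / n);
        for (int u = 0; u < n; u++) {
            double cu = (u == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
            for (int x = 0; x < n; x++) {
                t[u][x] = scale * cu * Math.cos((2 * x + 1) * u * Math.PI / (2.0 * n));
            }
        }
        return t;
    }

    /** Plain O(n^3) product of two square matrices. */
    static double[][] multiply(double[][] a, double[][] b) {
        final int n = a.length;
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                double aik = a[i][k];
                for (int j = 0; j < n; j++) {
                    out[i][j] += aik * b[k][j];
                }
            }
        }
        return out;
    }

    /** Returns the transpose of a square matrix. */
    static double[][] transpose(double[][] m) {
        final int n = m.length;
        double[][] t = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                t[j][i] = m[i][j];
            }
        }
        return t;
    }

    /** Forward 2D DCT: T * X * T^T. */
    static double[][] forward(double[][] input, double[][] table) {
        return multiply(multiply(table, input), transpose(table));
    }

    /** Inverse 2D DCT: T^T * Y * T (T is orthonormal, so its transpose is its inverse). */
    static double[][] inverse(double[][] coeffs, double[][] table) {
        return multiply(multiply(transpose(table), coeffs), table);
    }
}
```

If many blocks of the same size are transformed, the table from TableDct.cosineTable(n) can be computed once and passed to every forward and inverse call.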
Do you have a max size for the DCT? If working with integers is OK (and this is usually the case for image manipulation), you can find some fast implementations for size 4, 8, 16 and 32 there: https://github.com/flanglet/kanzi/tree/master/java/src/kanzi/transform