BitMiracle.LibJpeg.Classic.Internal.jpeg_forward_dct.jpeg_fdct_ifast C# (CSharp) Method

jpeg_forward_dct Class Documentation. Project: prepare/HTML-Renderer

jpeg_fdct_ifast() private static method

Perform the forward DCT on one block of samples. NOTE: this code only copes with 8x8 DCTs.

This file contains a fast, not so accurate integer implementation of the forward DCT (Discrete Cosine Transform).

A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT on each column. Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code.

This implementation is based on Arai, Agui, and Nakajima's algorithm for scaled DCT. Their original paper (Trans. IEICE E-71(11):1095) is in Japanese, but the algorithm is described in the Pennebaker & Mitchell JPEG textbook (see REFERENCES section in file README). The following code is based directly on figure 4-8 in P&M. While an 8-point DCT cannot be done in fewer than 11 multiplies, it is possible to arrange the computation so that many of the multiplies are simple scalings of the final outputs. These multiplies can then be folded into the multiplications or divisions by the JPEG quantization table entries. The AA&N method leaves only 5 multiplies and 29 adds to be done in the DCT itself.

The primary disadvantage of this method is that with fixed-point math, accuracy is lost due to imprecise representation of the scaled quantization values. The smaller the quantization table entry, the less precise the scaled value, so this implementation does worse with high-quality-setting files than with low-quality ones.

Scaling decisions are generally the same as in the LL&M algorithm; see jpeg_fdct_islow for more details. However, we choose to descale (right shift) multiplication products as soon as they are formed, rather than carrying additional fractional bits into subsequent additions. This compromises accuracy slightly, but it lets us save a few shifts. More importantly, 16-bit arithmetic is then adequate (for 8-bit samples) everywhere except in the multiplications proper; this saves a good deal of work on 16-bit-int machines.

Again to save a few shifts, the intermediate results between pass 1 and pass 2 are not upscaled, but are represented only to integral precision.

A final compromise is to represent the multiplicative constants to only 8 fractional bits, rather than 13. This saves some shifting work on some machines, and may also reduce the cost of multiplication (since there are fewer one-bits in the constants).
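
The listing below calls FAST_INTEGER_MULTIPLY and uses FAST_INTEGER_FIX_* constants whose definitions are not shown on this page. The following is a minimal sketch of what such helpers could look like, assuming 8 fractional bits and a plain arithmetic right shift with no rounding, as the description suggests. The names FastFixedPointSketch, Fix and Multiply are hypothetical and are not part of the library.

```csharp
using System;

public static class FastFixedPointSketch
{
    // Assumption: 8 fractional bits, as stated in the description above.
    public const int ConstBits = 8;

    // Convert a real constant to fixed point with the given number of
    // fractional bits, rounding to the nearest integer.
    public static int Fix(double value, int fractionalBits)
    {
        return (int)Math.Round(value * (1 << fractionalBits));
    }

    // Multiply by a fixed-point constant and descale immediately.
    // The right shift on int is arithmetic in C#, so negative products
    // round toward negative infinity rather than toward zero.
    public static int Multiply(int value, int fixedConstant)
    {
        return (value * fixedConstant) >> ConstBits;
    }
}
```

Under these assumptions, Fix(0.707106781, 8) gives 181, while the same constant at 13 fractional bits is 5793, so the 8-bit form carries noticeably less precision. Multiply(100, 181) yields 70 where the exact product is about 70.71, which shows the small truncation error that the description accepts in exchange for fewer shifts.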

private static jpeg_fdct_ifast ( int[] data, byte[][] sample_data, int start_row, int start_col ) : void
data	int[]
sample_data	byte[][]
start_row	int
start_col	int
return	void
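
To make the roles of the four parameters concrete, here is a small sketch of how one 8x8 block is addressed: sample_data is a jagged array of image rows, start_row and start_col select the block's top-left sample, and data is a flat, row-major workspace of 64 values. The class BlockLayoutSketch, the method LoadBlock and the constant values 8 and 128 are assumptions made for illustration. The method in the listing does not copy samples this way; it reads them directly and folds the unsigned-to-signed shift into the first output of each row.

```csharp
using System;

public static class BlockLayoutSketch
{
    public const int DctSize = 8;        // assumed value of JpegConstants.DCTSIZE
    public const int CenterSample = 128; // assumed value of JpegConstants.CENTERJSAMPLE (8-bit samples)

    // Copy one 8x8 block into a flat row-major array, shifting each
    // unsigned sample into the signed range [-128, 127].
    public static int[] LoadBlock(byte[][] sampleData, int startRow, int startCol)
    {
        int[] data = new int[DctSize * DctSize];
        for (int row = 0; row < DctSize; row++)
        {
            byte[] line = sampleData[startRow + row];
            for (int col = 0; col < DctSize; col++)
            {
                data[row * DctSize + col] = line[startCol + col] - CenterSample;
            }
        }
        return data;
    }
}
```

Shifting every sample by 128 changes only the sum of the eight samples in a row, because every other term in the row pass is a difference in which the offset cancels. That is why subtracting 8 * CENTERJSAMPLE from the first coefficient of each row is equivalent to the per-sample shift sketched here.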

        private static void jpeg_fdct_ifast(int[] data, byte[][] sample_data, int start_row, int start_col)
        {
            /* Pass 1: process rows. */
            int dataIndex = 0;
            for (int ctr = 0; ctr < JpegConstants.DCTSIZE; ctr++)
            {
                byte[] elem = sample_data[start_row + ctr];
                int elemIndex = start_col;

                /* Load data into workspace */
                int tmp0 = elem[elemIndex + 0] + elem[elemIndex + 7];
                int tmp7 = elem[elemIndex + 0] - elem[elemIndex + 7];
                int tmp1 = elem[elemIndex + 1] + elem[elemIndex + 6];
                int tmp6 = elem[elemIndex + 1] - elem[elemIndex + 6];
                int tmp2 = elem[elemIndex + 2] + elem[elemIndex + 5];
                int tmp5 = elem[elemIndex + 2] - elem[elemIndex + 5];
                int tmp3 = elem[elemIndex + 3] + elem[elemIndex + 4];
                int tmp4 = elem[elemIndex + 3] - elem[elemIndex + 4];

                /* Even part */

                int tmp10 = tmp0 + tmp3;    /* phase 2 */
                int tmp13 = tmp0 - tmp3;
                int tmp11 = tmp1 + tmp2;
                int tmp12 = tmp1 - tmp2;

                /* Apply unsigned->signed conversion. */
                data[dataIndex + 0] = tmp10 + tmp11 - 8 * JpegConstants.CENTERJSAMPLE; /* phase 3 */
                data[dataIndex + 4] = tmp10 - tmp11;

                int z1 = FAST_INTEGER_MULTIPLY(tmp12 + tmp13, FAST_INTEGER_FIX_0_707106781); /* c4 */
                data[dataIndex + 2] = tmp13 + z1;    /* phase 5 */
                data[dataIndex + 6] = tmp13 - z1;

                /* Odd part */

                tmp10 = tmp4 + tmp5;    /* phase 2 */
                tmp11 = tmp5 + tmp6;
                tmp12 = tmp6 + tmp7;

                /* The rotator is modified from fig 4-8 to avoid extra negations. */
                int z5 = FAST_INTEGER_MULTIPLY(tmp10 - tmp12, FAST_INTEGER_FIX_0_382683433); /* c6 */
                int z2 = FAST_INTEGER_MULTIPLY(tmp10, FAST_INTEGER_FIX_0_541196100) + z5; /* c2-c6 */
                int z4 = FAST_INTEGER_MULTIPLY(tmp12, FAST_INTEGER_FIX_1_306562965) + z5; /* c2+c6 */
                int z3 = FAST_INTEGER_MULTIPLY(tmp11, FAST_INTEGER_FIX_0_707106781); /* c4 */

                int z11 = tmp7 + z3;        /* phase 5 */
                int z13 = tmp7 - z3;

                data[dataIndex + 5] = z13 + z2;  /* phase 6 */
                data[dataIndex + 3] = z13 - z2;
                data[dataIndex + 1] = z11 + z4;
                data[dataIndex + 7] = z11 - z4;

                dataIndex += JpegConstants.DCTSIZE;     /* advance pointer to next row */
            }

            /* Pass 2: process columns. */

            dataIndex = 0;
            for (int ctr = JpegConstants.DCTSIZE - 1; ctr >= 0; ctr--)
            {
                int tmp0 = data[dataIndex + JpegConstants.DCTSIZE * 0] + data[dataIndex + JpegConstants.DCTSIZE * 7];
                int tmp7 = data[dataIndex + JpegConstants.DCTSIZE * 0] - data[dataIndex + JpegConstants.DCTSIZE * 7];
                int tmp1 = data[dataIndex + JpegConstants.DCTSIZE * 1] + data[dataIndex + JpegConstants.DCTSIZE * 6];
                int tmp6 = data[dataIndex + JpegConstants.DCTSIZE * 1] - data[dataIndex + JpegConstants.DCTSIZE * 6];
                int tmp2 = data[dataIndex + JpegConstants.DCTSIZE * 2] + data[dataIndex + JpegConstants.DCTSIZE * 5];
                int tmp5 = data[dataIndex + JpegConstants.DCTSIZE * 2] - data[dataIndex + JpegConstants.DCTSIZE * 5];
                int tmp3 = data[dataIndex + JpegConstants.DCTSIZE * 3] + data[dataIndex + JpegConstants.DCTSIZE * 4];
                int tmp4 = data[dataIndex + JpegConstants.DCTSIZE * 3] - data[dataIndex + JpegConstants.DCTSIZE * 4];

                /* Even part */

                int tmp10 = tmp0 + tmp3;    /* phase 2 */
                int tmp13 = tmp0 - tmp3;
                int tmp11 = tmp1 + tmp2;
                int tmp12 = tmp1 - tmp2;

                data[dataIndex + JpegConstants.DCTSIZE * 0] = tmp10 + tmp11; /* phase 3 */
                data[dataIndex + JpegConstants.DCTSIZE * 4] = tmp10 - tmp11;

                int z1 = FAST_INTEGER_MULTIPLY(tmp12 + tmp13, FAST_INTEGER_FIX_0_707106781); /* c4 */
                data[dataIndex + JpegConstants.DCTSIZE * 2] = tmp13 + z1; /* phase 5 */
                data[dataIndex + JpegConstants.DCTSIZE * 6] = tmp13 - z1;

                /* Odd part */

                tmp10 = tmp4 + tmp5;    /* phase 2 */
                tmp11 = tmp5 + tmp6;
                tmp12 = tmp6 + tmp7;

                /* The rotator is modified from fig 4-8 to avoid extra negations. */
                int z5 = FAST_INTEGER_MULTIPLY(tmp10 - tmp12, FAST_INTEGER_FIX_0_382683433); /* c6 */
                int z2 = FAST_INTEGER_MULTIPLY(tmp10, FAST_INTEGER_FIX_0_541196100) + z5; /* c2-c6 */
                int z4 = FAST_INTEGER_MULTIPLY(tmp12, FAST_INTEGER_FIX_1_306562965) + z5; /* c2+c6 */
                int z3 = FAST_INTEGER_MULTIPLY(tmp11, FAST_INTEGER_FIX_0_707106781); /* c4 */

                int z11 = tmp7 + z3;        /* phase 5 */
                int z13 = tmp7 - z3;

                data[dataIndex + JpegConstants.DCTSIZE * 5] = z13 + z2; /* phase 6 */
                data[dataIndex + JpegConstants.DCTSIZE * 3] = z13 - z2;
                data[dataIndex + JpegConstants.DCTSIZE * 1] = z11 + z4;
                data[dataIndex + JpegConstants.DCTSIZE * 7] = z11 - z4;

                dataIndex++;          /* advance pointer to next column */
            }
        }
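
The values left in data by this method are scaled, not true DCT coefficients: each one is larger than the standard coefficient by the AA&N factors for its row and column and by a further factor of 8. The sketch below shows, in floating point, how a caller could fold that scaling into the quantization divisor, as the description explains. The class and method names are hypothetical, and a real encoder would more likely use precomputed integer tables; this is an illustration under those assumptions rather than the library's code.

```csharp
using System;

public static class AanScalingSketch
{
    // 1-D AA&N scale factor for output index k (0..7):
    // 1 for k = 0, otherwise sqrt(2) * cos(k * pi / 16).
    public static double ScaleFactor(int k)
    {
        return k == 0 ? 1.0 : Math.Sqrt(2.0) * Math.Cos(k * Math.PI / 16.0);
    }

    // Divisor that removes the DCT output scaling and applies the
    // quantization step in a single division.
    public static double Divisor(int quantValue, int row, int col)
    {
        return quantValue * ScaleFactor(row) * ScaleFactor(col) * 8.0;
    }

    // Quantize one scaled coefficient, rounding to the nearest integer.
    public static int Quantize(int scaledCoefficient, int quantValue, int row, int col)
    {
        double quotient = scaledCoefficient / Divisor(quantValue, row, col);
        return (int)Math.Round(quotient, MidpointRounding.AwayFromZero);
    }
}
```

For the DC term with a quantization entry of 16, the divisor is 16 * 1 * 1 * 8 = 128, so a scaled coefficient of 1024 quantizes to 8. When divisors like these are stored as fixed-point integers, small quantization entries leave few significant bits, which is the accuracy loss at high quality settings that the description mentions.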
