Comments (12)

cruppstahl commented on June 17, 2024

maxbits(datain) only checks 128 integers, but your datain array holds 9999 integers.
Use maxbits_length instead, or simdmaxbitsd1_length (with initvalue set to 0).
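
To illustrate why the length matters, here is a minimal scalar sketch. It is not simdcomp code: bits_needed and scalar_maxbits_length are hypothetical names, and the loop is only a stand-in for what a length-aware bit-width scan has to do. Scanning just the first 128 values of an array holding 0..9998 sees a maximum of 127, while scanning the whole array sees 9998.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bits needed to represent v (0 when v == 0). */
static uint32_t bits_needed(uint32_t v) {
    uint32_t bits = 0;
    while (v != 0) {
        ++bits;
        v >>= 1;
    }
    return bits;
}

/* Hypothetical scalar stand-in: OR all values together, then measure
   the width of the result. Only the first `length` values are seen. */
static uint32_t scalar_maxbits_length(const uint32_t *in, size_t length) {
    uint32_t accumulator = 0;
    size_t k;
    for (k = 0; k < length; ++k) {
        accumulator |= in[k];
    }
    return bits_needed(accumulator);
}
```

With datain[k] = k, scalar_maxbits_length(datain, 128) gives 7, whereas scalar_maxbits_length(datain, 9999) gives 14, so packing all 9999 values with the width taken from the first block alone would drop the high bits.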

from simdcomp.

akhld commented on June 17, 2024

Like this? const uint32_t b = maxbits_length(datain, N); It doesn't work for me.

cruppstahl commented on June 17, 2024

simdpackwithoutmask() and simdunpack() also pack/unpack only 128 integers at a time. Here is code that works for your case:

  size_t k, N = 9999;
  uint32_t * datain = malloc(N * sizeof(uint32_t));
  uint8_t * buffer = malloc(N * sizeof(uint32_t) + N / SIMDBlockSize);
  uint32_t * backbuffer = malloc(N * sizeof(uint32_t));
  uint32_t b;

  for (k = 0; k < N; ++k){        /* start with k=0, not k=1! */
      datain[k] = k;
  }

  b = maxbits_length(datain, N);
  simdpack_length(datain, N, (__m128i *)buffer, b);
  simdunpack_length((const __m128i *)buffer, N, backbuffer, b);

  for (k = 0; k < N; ++k){         /* start with k=0, not k=1! */
      printf ("%u\n", backbuffer[k]);   /* values are uint32_t, so print as unsigned */
  }

akhld commented on June 17, 2024

Awesome. Works like a charm. Thanks a ton @cruppstahl

cruppstahl commented on June 17, 2024

great :-)

lemire commented on June 17, 2024

Added to README:

68d62c5#diff-04c6e90faac2675aa89e2176d2eec7d8R50

akhld commented on June 17, 2024

Just a follow-up question: I read the simdpack_length function definition, and its comments describe it as "slower" than simdpackwithoutmask. I tried changing the compress function in the example as follows:

size_t compress2(uint32_t * datain, size_t length, uint8_t * buffer) {
    uint32_t offset;
    uint8_t * initout;
    size_t k;
    if(length/SIMDBlockSize*SIMDBlockSize != length) {
        printf("Data length should be a multiple of %i \n",SIMDBlockSize);
    }
    offset = 0;
    initout = buffer;
    for(k = 0; k < length / SIMDBlockSize; ++k) {
        uint32_t b = maxbits(datain);
        *buffer++ = b;
        simdpackwithoutmask(datain, buffer, b);
        offset = datain[k * SIMDBlockSize + SIMDBlockSize - 1];
        buffer += b * sizeof(__m128i);
    }
    return buffer - initout;
}

And called it as follows:

size_t nn = 128 * 2;
uint32_t * datainn = malloc(nn * sizeof(uint32_t));
uint8_t * buffern = malloc(nn * sizeof(uint32_t) + nn / SIMDBlockSize);
uint32_t * backbuffern = malloc(nn * sizeof(uint32_t));
size_t k;

for(k=0;k<nn;++k){               
    datainn[k] = rand() % (k + 1);
}

size_t compsize = compress2(datainn,nn,buffern);

And tried to uncompress as follows, but it does not retrieve the values properly.

uint8_t * decbuffern = buffern;
for (k = 0; k * SIMDBlockSize < nn; ++k) {
  uint32_t b = maxbits(datainn);      
  simdunpack(buffern, backbuffern, b);           
  decbuffern += b * sizeof(__m128i);      
}

for (k = 0; k < nn; ++k){       
    printf ("%d\n", backbuffern[k]);
}

Could you also add an example for this case? I benchmarked the previous version against a Java-based implementation and I'm seeing ~20x higher performance.

cruppstahl commented on June 17, 2024

I did not compile your code, but noticed a few things:

In compress2() you store the "maxbit" value with *buffer++ = b, but when uncompressing you don't reuse the stored value. Also, compress2() does not advance the input pointer correctly. It should look like this:

for(k = 0; k < length / SIMDBlockSize; ++k) {
    uint32_t b = maxbits(datain);
    *buffer++ = b;
    simdpackwithoutmask(datain, buffer, b);
    datain += SIMDBlockSize;
    buffer += b * sizeof(__m128i);
}

and uncompress:

uint8_t * decbuffern = buffern;
for (k = 0; k * SIMDBlockSize < nn; ++k) {
  uint32_t b = maxbits(*buffern);      
  buffern++;
  simdunpack(buffern, backbuffern, b);           
  buffern += b * sizeof(__m128i);      
  backbuffern += SIMDBlockSize;
}

(I haven't tested this code.)

lemire commented on June 17, 2024

@akhld

The *_length functions were indeed slow for long arrays, but I fixed that with code that looks like what @cruppstahl proposed...

e26e44f#diff-8e134dc3d7779826cc25abb6684670a7R14149

On the plus side, my code handles the case where the input data contains a number of integers that is not divisible by 128.

Of course, these functions may not be exactly what you are looking for but, hopefully, it should be "easy" to modify them to suit your needs.
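
As a rough illustration of the case where the length is not divisible by 128 (a sketch only, not simdcomp's implementation; BLOCK_SIZE, block_split and split_into_blocks are hypothetical names), the input can be viewed as a run of full 128-integer blocks followed by a shorter tail that needs separate handling:

```c
#include <stddef.h>

#define BLOCK_SIZE 128  /* integers per block, mirroring SIMDBlockSize in this thread */

/* How an input of `length` integers splits into full blocks plus a tail. */
typedef struct {
    size_t full_blocks;  /* blocks of exactly BLOCK_SIZE integers */
    size_t tail;         /* leftover integers, always < BLOCK_SIZE */
} block_split;

static block_split split_into_blocks(size_t length) {
    block_split s;
    s.full_blocks = length / BLOCK_SIZE;
    s.tail = length % BLOCK_SIZE;
    return s;
}
```

For the 9999-integer array from earlier in the thread this gives 78 full blocks and a tail of 15 integers; a block-only loop such as compress2() would silently skip that tail.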

akhld commented on June 17, 2024

Thanks @lemire, nice work.

@cruppstahl unfortunately, the changes to the uncompress loop end in "Segmentation fault (core dumped)". I suspect it's an index out of bounds?

cruppstahl commented on June 17, 2024

The code here works:

    size_t compress2(uint32_t * datain, size_t length, uint8_t * buffer) {
        uint8_t * initout;
        size_t k;
        if(length/SIMDBlockSize*SIMDBlockSize != length) {
            printf("Data length should be a multiple of %i \n",SIMDBlockSize);
        }
        initout = buffer;
        for(k = 0; k < length / SIMDBlockSize; ++k) {
            uint32_t b = maxbits(datain);
            *buffer++ = b;
            simdpackwithoutmask(datain, (__m128i *)buffer, b);
            datain += SIMDBlockSize;
            buffer += b * sizeof(__m128i);
        }
        return buffer - initout;
    }

    int main() {
      size_t nn = 128 * 2;
      uint32_t * datainn = malloc(nn * sizeof(uint32_t));
      uint8_t * buffern = malloc(nn * sizeof(uint32_t) + nn / SIMDBlockSize);
      uint32_t * backbuffern = malloc(nn * sizeof(uint32_t));
      size_t k, compsize;

      for(k=0;k<nn;++k){               
        datainn[k] = rand() % (k + 1);
      }

      compsize = compress2(datainn,nn,buffern);
      printf("encoded size: %u (original size: %u)\n", (unsigned)compsize,
                    (unsigned)(nn * sizeof(uint32_t)));

      for (k = 0; k * SIMDBlockSize < nn; ++k) {
        uint32_t b = *buffern;
        buffern++;
        simdunpack((const __m128i *)buffern, backbuffern + k * SIMDBlockSize, b);
        buffern += b * sizeof(__m128i);      
      }

      for (k = 0; k < nn; ++k){       
          printf ("%u\n", backbuffern[k]);   /* values are uint32_t, so print as unsigned */
      }

      return 0;
    }

akhld commented on June 17, 2024

Yep, it works. The issue was with the way we were moving the pointer in the unpack, I guess.
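
The pointer movement can be checked without any SIMD calls. The sketch below is an illustration under stated assumptions, not part of simdcomp: walk_blocks, VECTOR_BYTES and payload_offsets are hypothetical names, and VECTOR_BYTES stands in for sizeof(__m128i). It walks the layout that compress2() writes (one header byte holding b, followed by b * 16 payload bytes per block) and records where each payload starts.

```c
#include <stddef.h>
#include <stdint.h>

#define VECTOR_BYTES 16  /* stand-in for sizeof(__m128i) */

/* Walks `blocks` framed blocks and returns the total bytes consumed.
   If payload_offsets is non-NULL, stores the offset of each payload. */
static size_t walk_blocks(const uint8_t *buffer, size_t blocks,
                          size_t *payload_offsets) {
    const uint8_t *cursor = buffer;
    size_t k;
    for (k = 0; k < blocks; ++k) {
        uint32_t b = *cursor;  /* read the stored bit width, do not recompute it */
        ++cursor;              /* step over the one-byte header */
        if (payload_offsets != NULL) {
            payload_offsets[k] = (size_t)(cursor - buffer);
        }
        cursor += (size_t)b * VECTOR_BYTES;  /* step over b packed 128-bit words */
    }
    return (size_t)(cursor - buffer);
}
```

For two blocks with stored widths 3 and 5, the payloads start at offsets 1 and 50 and the walk ends at byte 130, which should equal the value compress2() returns. Since b is at most 32, one block takes at most 1 + 32 * 16 = 513 bytes, which is why the buffer in the example is sized as nn * sizeof(uint32_t) + nn / SIMDBlockSize.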
