By: Michael S (already5chosen.delete@this.yahoo.com), May 13, 2013 2:50 am
Room: Moderated Discussions
Ricardo B (ricardo.b.delete@this.xxxxxx.xx) on May 13, 2013 2:40 am wrote:
> EduardoS (no.delete@this.spam.com) on May 12, 2013 7:21 pm wrote:
> > Ricardo B (ricardo.b.delete@this.xxxxx.xx) on May 12, 2013 5:48 pm wrote:
> > > Lossless compression (WinRAR, etc) in general sees big improvements.
> > > They're trivial to parallelize effectively to any number of threads, and they're very low on ILP.
> >
> > Your definition of "trivial" may be a bit too loose...
>
> No, it's not.
> I may be wrong about the true nature of some of these applications, but I don't think so either.
> AFAIK, all of these algorithms are block based: data to be compressed is divided into blocks,
> with sizes in the hundreds of kB to a few MB, and these blocks are compressed independently.
>
That's true for old compressors (e.g. bzip2) or for non-aggressive settings in newer ones. At high-effort settings, newer compressors use blocks that run up to GBs, and in practice that often makes a big difference in compression ratio.
BTW, I just looked at the 7zip installed on my work computer (v9.20). It supports multiple threads when compressing to .zip with various methods, including LZMA. But when compressing to its native 7z format, which gives the best compression ratios, it supports only 2 threads.
> As long as you have enough memory, the most efficient way to multi-thread them is also
> the simplest: have different threads work on different blocks and make sure each block
> is big enough that the thread-synchronization overhead is negligible.
>
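For what it's worth, here is a minimal sketch of that block-parallel scheme in Python, using zlib as a stand-in codec. This is my own illustration of the idea, not how WinRAR, 7zip, or bzip2 are actually implemented; the block size and thread count are arbitrary.

    import zlib
    from concurrent.futures import ThreadPoolExecutor

    BLOCK_SIZE = 4 * 1024 * 1024  # a few MB per block, as described above

    def compress_blocks(data, workers=4):
        # Split the input into fixed-size blocks; each block is compressed
        # independently, with no shared state between threads.
        blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # zlib.compress releases the GIL, so the threads really do run in
            # parallel; the only synchronization is collecting results in order.
            return list(pool.map(lambda b: zlib.compress(b, 9), blocks))

Note the cost this sketch makes visible: matches can never cross a block boundary. That is exactly why the GB-sized blocks of newer high-effort compressors help ratio while making this easy kind of parallelism harder to apply.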