From a51bcc6ac6d97bf4de29abf023ffa66750d92d10 Mon Sep 17 00:00:00 2001
From: powturbo
Date: Sat, 19 Mar 2016 14:45:21 +0100
Subject: [PATCH] =?UTF-8?q?:new:=20Java+64=20bits=20lists=20for=20BitPacki?=
 =?UTF-8?q?ng,=20VSimple,=20VByte,=20Elias=20Fano,=E2=80=A6?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 27eb0ce..ebcee29 100644
--- a/README.md
+++ b/README.md
@@ -73,6 +73,29 @@ MI/s: 1.000.000 integers/second. 1000 MI/s = 4 GB/s
**#BOLD** = pareto frontier. FPF=FastPFor
 TurboPForDA,TurboForDA: Direct Access is normally used when accessing individual values.
+CPU: Skylake
+
+|Size| Ratio % |Bits/Integer |C Time MI/s |D Time MI/s |Function |
+|--------:|-----:|----:|-------:|-------:|---------|
+| 63392801| 15.85| 5.07|**413.76**|**1482.82**|**TurboPFor**|
+| 63392801| 15.85| 5.07| 387.30| 243.62|**TurboPForDA**|
+| 65359916| 16.34| 5.23| 7.58| 609.12|[OptPFD](#OptPFD)|
+| 73477088| 18.37| 5.88| 101.68| 621.37|[Simple16](#Simple16)|
+| 78514276| 19.63| 6.28|**256.83**|**676.45**|**VSimple**|
+| 95915096| 23.98| 7.67| 211.79|**954.62**|[Simple-8b](#Simple-8b)|
+| 98546814| 24.64| 7.88| 70.85|**2349.71**|[QMX](#QMX)|
+| 99910930| 24.98| 7.99|**3537.57**|**3081.79**|**TurboPackV**|
+| 99910930| 24.98| 7.99| 3099.52|3071.77|[SIMDPack FPF](#SIMDPack FPF)|
+| 99910930| 24.98| 7.99| 2050.47|2402.54|**TurboPack**|
+| 99910930| 24.98| 7.99| 2049.85|2364.52|**TurboFor**|
+| 99910930| 24.98| 7.99| 2049.70|1124.12|**TurboForDA**|
+|102074663| 25.52| 8.17| 1354.42|1745.69|[MaskedVByte](#MaskedVByte)|
+|102074663| 25.52| 8.17| 1660.76|1626.67|**TurboVbyte**|
+|102074663| 25.52| 8.17| 1249.77|1051.85|[Vbyte FPF](#Vbyte FPF)|
+|112500000| 28.12| 9.00| 466.94|3003.70|[VarintG8IU](#VarintG8IU)|
+|128125000| 32.03| 10.25| 1109.67|1271.32|[StreamVbyte FPF](#StreamVbyte)|
+|400000000| 100.00| 32.00| 2240.24|2237.05|Copy|
+
 ##### - Data files:
  - gov2.sorted from [DocId data set](#DocId data set) Block size=128 (lz4+blosc+VSimple w/ 64Ki)
@@ -262,11 +285,12 @@ header files to use with documentation:
 ### References:
- + [FastPFor](https://github.com/lemire/FastPFor) + [Simdcomp](https://github.com/lemire/simdcomp): SIMDPack FPF, Vbyte FPF, VarintG8IU
+ + [FastPFor](https://github.com/lemire/FastPFor) + [Simdcomp](https://github.com/lemire/simdcomp): SIMDPack FPF, Vbyte FPF, VarintG8IU
  + [Optimized Pfor-delta compression code](http://jinruhe.com): OptPFD/OptP4, Simple16 (limited to 28 bits integers)
  + [MaskedVByte](http://maskedvbyte.org/). See also: [Vectorized VByte Decoding](http://engineering.indeed.com/blog/2015/03/vectorized-vbyte-decoding-high-performance-vector-instructions/)
  + [Index Compression Using 64-Bit Words](http://people.eng.unimelb.edu.au/ammoffat/abstracts/am10spe.html): Simple-8b (speed optimized version tested)
  + [libfor](https://github.com/cruppstahl/for)
+ + [QMX:Compression, SIMD, and Postings Lists](http://www.cs.otago.ac.nz/homepages/andrew/papers/)
  + [lz4](https://github.com/Cyan4973/lz4). included w. block size 64K as indication. Tested after preprocessing w. delta+transpose
  + [blosc](https://github.com/Blosc/c-blosc). blosc is like transpose/shuffle+lz77. Tested blosc+lz4 and blosclz incl. vectorized shuffle.
see also [benchmarks from the author of blosc](https://github.com/powturbo/TurboPFor/issues/2) single+multithreading
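
The derived columns of the Skylake table added by this patch can be sanity-checked from the compressed size alone. The following is a minimal sketch, assuming 32-bit input integers (implied by "1000 MI/s = 4 GB/s" and by the 32.00 bits/integer of the Copy row) and the 400000000-byte uncompressed size from the Copy row. The function names are illustrative and are not part of TurboPFor.

```python
# Sketch: recompute "Ratio %" and "Bits/Integer" from the "Size" column.
# Assumptions: 32-bit input integers, uncompressed size taken from the Copy row.

RAW_BYTES = 400_000_000                   # "Size" of the Copy row (uncompressed)
BYTES_PER_INT = 4                         # assumption: 32-bit integers
NUM_INTS = RAW_BYTES // BYTES_PER_INT     # 100,000,000 integers


def ratio_percent(compressed_bytes: int) -> float:
    """Compressed size as a percentage of the uncompressed size."""
    return 100.0 * compressed_bytes / RAW_BYTES


def bits_per_integer(compressed_bytes: int) -> float:
    """Average number of compressed bits spent per input integer."""
    return 8.0 * compressed_bytes / NUM_INTS


def mi_per_s_to_gb_per_s(mi_per_s: float) -> float:
    """Convert millions of integers per second to GB/s (1 GB = 10**9 bytes)."""
    return mi_per_s * 1e6 * BYTES_PER_INT / 1e9
```

For the TurboPFor row (63392801 bytes), `ratio_percent` rounds to 15.85 and `bits_per_integer` rounds to 5.07, matching the table; `mi_per_s_to_gb_per_s(1000)` gives 4.0, matching the legend.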