🆕 Java+64 bits lists for BitPacking, VSimple, VByte, Elias Fano,…
14
README.md
@@ -71,7 +71,7 @@ MI/s: 1.000.000 integers/second. 1000 MI/s = 4 GB/s<br>
  TurboPForDA,TurboForDA: Direct Access is normally used when accessing individual values.

  ##### - Data files:
- - gov2.sorted from [DocId data set](#DocId data set) Block size=128 (lz4+blosc+VSimple w/ 64k)
+ - gov2.sorted from [DocId data set](#DocId data set) Block size=128 (lz4+blosc+VSimple w/ 64K)


  ./icbench -c1 gov2.sorted
@@ -90,17 +90,18 @@ TurboPForDA,TurboForDA: Direct Access is normally used when accessing individual
  | 4.953.768.342| 20.71| 6.63|**1766.05**|**1943.87**|**TurboPackV**|
  | 4.953.768.342| 20.71| 6.63|1419.35|1512.86|**TurboPack**|
  | 5.203.353.057| 21.75| 6.96|1560.34|1806.60|SIMDPackD1 FPF|
- | 6.074.995.117| 25.40| 8.13| 494.70| 729.97|[blosc](#blosc) lz4 64K|
+ | 6.074.995.117| 25.40| 8.13| 494.70| 729.97|[blosc_lz4](#blosc) 64K|
  | 6.221.886.390| 26.01| 8.32|1666.76|1737.72|**TurboFor**|
  | 6.221.886.390| 26.01| 8.32|1660.52| 565.25|**TurboForDA**|
  | 6.699.519.000| 28.01| 8.96| 472.01| 495.12|Vbyte FPF|
  | 6.700.989.563| 28.02| 8.96| 728.72| 991.57|MaskedVByte|
  | 7.622.896.878| 31.87|10.20| 208.73|1197.74|VarintG8IU|
  | 8.594.342.216| 35.93|11.50|1307.22|1593.07|libfor|
+ | 8.773.150.644| 36.68|11.74| 637.83|1301.05|blosc_lz 64K|
  |23.918.861.764|100.00|32.00|1456.17|1480.78|Copy|

- "lz4+DT 64K" after preprocessing Delta+Transpose similar to delta + [blosc](#blosc)<br>
- "blosc lz4" tested w/ lz4 compressor+vectorized shuffle
+ "lz4+DT 64K" = Delta+Transpose from TurboPFor + lz4<br>
+ "blosc_lz4" tested w/ lz4 compressor+vectorized shuffle

  ##### - Compressed Inverted Index Intersections with GOV2<br />
  GOV2: 426GB, 25 Millions documents, average doc. size=18k.
@@ -263,7 +264,8 @@ header files to use with documentation:<br />
  + <a name="Simple-8b"></a>[Index Compression Using 64-Bit Words](http://people.eng.unimelb.edu.au/ammoffat/abstracts/am10spe.html): Simple-8b (speed optimized version tested)
  + <a name="libfor"></a>[libfor](https://github.com/cruppstahl/for)
  + <a name="lz4"></a>[lz4](https://github.com/Cyan4973/lz4). included w. block size 64K as indication. Tested after preprocessing w. delta+transpose
- + <a name="blosc"></a>[blosc](https://github.com/Blosc/c-blosc). blosc is like transpose/shuffle+lz77. Tested blosc+lz4 incl. vectorizeed shuffle
+ + <a name="blosc"></a>[blosc](https://github.com/Blosc/c-blosc). blosc is like transpose/shuffle+lz77. Tested blosc+lz4 and blosclz incl. vectorizeed shuffle.<br>
+ see also [benchmarks from the author of blosc](https://github.com/powturbo/TurboPFor/issues/2) single+multithreading
  + <a name="DocId data set"></a>[Document identifier data set](http://lemire.me/data/integercompression2014.html)
  + **Publications:**
  - [SIMD Compression and the Intersection of Sorted Integers](http://arxiv.org/abs/1401.6399)
@@ -271,4 +273,4 @@ header files to use with documentation:<br />
  - [On Inverted Index Compression for Search Engine Efficiency](http://www.dcs.gla.ac.uk/~craigm/publications/catena14compression.pdf)
  - [Google's Group Varint Encoding](http://static.googleusercontent.com/media/research.google.com/de//people/jeff/WSDM09-keynote.pdf)

- Last update: 22 JUN 2015
+ Last update: 23 JUN 2015
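
The numeric columns in the benchmark rows touched by this commit can be cross-checked with a little arithmetic. The sketch below assumes the columns are compressed size in bytes, ratio in percent, bits per integer, and compression/decompression speed in MI/s; the header row is outside the diff hunk, so that reading is inferred from the values and from the "1000 MI/s = 4 GB/s" note. The function names are hypothetical and not part of icbench.

```python
def table_columns(compressed_bytes, raw_bytes, bits_per_raw_int=32):
    """Derive the ratio (%) and bits/integer columns from two sizes.

    Assumes the input integers are 32 bits wide, which matches the
    32.00 bits/integer reported for the "Copy" row.
    """
    fraction = compressed_bytes / raw_bytes
    ratio_percent = round(100.0 * fraction, 2)
    bits_per_int = round(bits_per_raw_int * fraction, 2)
    return ratio_percent, bits_per_int


def mis_to_gbs(mi_per_s, bytes_per_int=4):
    """Convert millions of integers per second to GB/s (1 GB = 10**9 bytes)."""
    return mi_per_s * 1_000_000 * bytes_per_int / 1e9


RAW_SIZE = 23_918_861_764        # "Copy" row: uncompressed size
TURBOPACK_SIZE = 4_953_768_342   # "TurboPack" / "TurboPackV" rows

print(table_columns(TURBOPACK_SIZE, RAW_SIZE))  # (20.71, 6.63)
print(mis_to_gbs(1000))                         # 4.0
```

Under these assumptions the results agree with the table: the TurboPack rows come out at 20.71% and 6.63 bits per integer, and 1000 MI/s corresponds to 4 GB/s.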
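
The "lz4+DT 64K" note describes Delta+Transpose preprocessing before lz4, and the blosc entry describes a similar transpose/shuffle step. As a rough illustration of that idea only, here is a minimal pure-Python sketch. It is not the TurboPFor or blosc implementation; the function names, the 32-bit little-endian layout, and the sample values are assumptions made for the example.

```python
import struct


def delta_encode(values):
    """Replace each value by its difference from the previous one."""
    prev = 0
    deltas = []
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    total = 0
    values = []
    for d in deltas:
        total += d
        values.append(total)
    return values


def transpose_bytes(values):
    """Byte-transpose 32-bit little-endian integers.

    Output holds every integer's byte 0, then every byte 1, and so on,
    so the mostly-zero high bytes of small deltas end up in long runs.
    """
    raw = struct.pack("<%dI" % len(values), *values)
    return b"".join(raw[plane::4] for plane in range(4))


def untranspose_bytes(data):
    """Invert transpose_bytes."""
    n = len(data) // 4
    raw = bytes(data[plane * n + i] for i in range(n) for plane in range(4))
    return list(struct.unpack("<%dI" % n, raw))


doc_ids = [3, 7, 8, 20, 300]             # hypothetical sorted document ids
deltas = delta_encode(doc_ids)           # [3, 4, 1, 12, 280]
shuffled = transpose_bytes(deltas)       # low bytes first, zero runs after
restored = delta_decode(untranspose_bytes(shuffled))
print(deltas, restored == doc_ids)
```

After both steps the buffer is dominated by runs of zero bytes, which is the kind of input a general-purpose LZ compressor such as lz4 handles well. A real pipeline would feed `shuffled` to the compressor block by block.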