From a73275a7cc9c8e06a6228dfa510505cc4e826057 Mon Sep 17 00:00:00 2001
From: powturbo
Date: Sun, 21 Jun 2015 19:06:14 +0200
Subject: [PATCH] =?UTF-8?q?:new:=20Java+64=20bits=20lists=20for=20BitPacki?=
 =?UTF-8?q?ng,=20VSimple,=20VByte,=20Elias=20Fano,=E2=80=A6?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 381a39c..28a8fd5 100644
--- a/README.md
+++ b/README.md
@@ -71,7 +71,7 @@ MI/s: 1.000.000 integers/second. 1000 MI/s = 4 GB/s
 TurboPForDA,TurboForDA: Direct Access is normally used when accessing individual values.

 ##### - Data files:
- - gov2.sorted from [DocId data set](#DocId data set) Block size=128 (lz4+VSimple 64k)
+ - gov2.sorted from [DocId data set](#DocId data set) Block size=128 (lz4+blosc+VSimple w/ 64k)

 ./icbench -c1 gov2.sorted

@@ -90,7 +90,7 @@ TurboPForDA,TurboForDA: Direct Access is normally used when accessing individual
 | 4.953.768.342| 20.71| 6.63|**1766.05**|**1943.87**|**TurboPackV**|
 | 4.953.768.342| 20.71| 6.63|1419.35|1512.86|**TurboPack**|
 | 5.203.353.057| 21.75| 6.96|1560.34|1806.60|SIMDPackD1 FPF|
-| 6.074.995.117| 25.40| 8.13| 494.70| 729.97| blosc lz4 64K|
+| 6.074.995.117| 25.40| 8.13| 494.70| 729.97|[blosc](#blosc) lz4 64K|
 | 6.221.886.390| 26.01| 8.32|1666.76|1737.72|**TurboFor**|
 | 6.221.886.390| 26.01| 8.32|1660.52| 565.25|**TurboForDA**|
 | 6.699.519.000| 28.01| 8.96| 472.01| 495.12|Vbyte FPF|
@@ -99,7 +99,8 @@ TurboPForDA,TurboForDA: Direct Access is normally used when accessing individual
 | 8.594.342.216| 35.93|11.50|1307.22|1593.07|libfor|
 |23.918.861.764|100.00|32.00|1456.17|1480.78|Copy|

-lz4 w/ delta+transpose similar to delta + [blosc](#blosc)
+lz4 after preprocessing delta+transpose similar to delta + [blosc](#blosc)
+blosc tested w/ lz4 compressor

 ##### - Compressed Inverted Index Intersections with GOV2
 GOV2: 426GB, 25 Millions documents, average doc. size=18k.
@@ -262,8 +263,8 @@ header files to use with documentation:
  + [Index Compression Using 64-Bit Words](http://people.eng.unimelb.edu.au/ammoffat/abstracts/am10spe.html): Simple-8b (speed optimized version tested)
  + [libfor](https://github.com/cruppstahl/for)
  + [lz4](https://github.com/Cyan4973/lz4). included w. block size 64K as indication. Tested after preprocessing w. delta+transpose
+ + [blosc](https://github.com/Blosc/c-blosc). blosc is like transpose/shuffle+lz77. Tested blosc+lz4
  + [Document identifier data set](http://lemire.me/data/integercompression2014.html)
- + [blosc](https://github.com/Blosc/c-blosc). blosc is like transpose/shuffle+lz77
  + **Publications:**
  - [SIMD Compression and the Intersection of Sorted Integers](http://arxiv.org/abs/1401.6399)
  - [Partitioned Elias-Fano Indexes](http://www.di.unipi.it/~ottavian/files/elias_fano_sigir14.pdf)