diff --git a/README.md b/README.md
index 27eb0ce..ebcee29 100644
--- a/README.md
+++ b/README.md
@@ -73,6 +73,29 @@ MI/s: 1.000.000 integers/second. 1000 MI/s = 4 GB/s
**#BOLD** = pareto frontier. FPF=FastPFor
TurboPForDA,TurboForDA: Direct Access is normally used when accessing individual values.
+CPU: Skylake
+
+|Size| Ratio % |Bits/Integer |C Time MI/s |D Time MI/s |Function |
+|--------:|-----:|----:|-------:|-------:|---------|
+| 63392801| 15.85| 5.07|**413.76**|**1482.82**|**TurboPFor**|
+| 63392801| 15.85| 5.07| 387.30| 243.62|**TurboPForDA**|
+| 65359916| 16.34| 5.23| 7.58| 609.12|[OptPFD](#OptPFD)|
+| 73477088| 18.37| 5.88| 101.68| 621.37|[Simple16](#Simple16)|
+| 78514276| 19.63| 6.28|**256.83**|**676.45**|**VSimple**|
+| 95915096| 23.98| 7.67| 211.79|**954.62**|[Simple-8b](#Simple-8b)|
+| 98546814| 24.64| 7.88| 70.85|**2349.71**|[QMX](#QMX)|
+| 99910930| 24.98| 7.99|**3537.57**|**3081.79**|**TurboPackV**|
+| 99910930| 24.98| 7.99| 3099.52|3071.77|[SIMDPack FPF](#SIMDPack FPF)|
+| 99910930| 24.98| 7.99| 2050.47|2402.54|**TurboPack**|
+| 99910930| 24.98| 7.99| 2049.85|2364.52|**TurboFor**|
+| 99910930| 24.98| 7.99| 2049.70|1124.12|**TurboForDA**|
+|102074663| 25.52| 8.17| 1354.42|1745.69|[MaskedVByte](#MaskedVByte)|
+|102074663| 25.52| 8.17| 1660.76|1626.67|**TurboVbyte**|
+|102074663| 25.52| 8.17| 1249.77|1051.85|[Vbyte FPF](#Vbyte FPF)|
+|112500000| 28.12| 9.00| 466.94|3003.70|[VarintG8IU](#VarintG8IU)|
+|128125000| 32.03| 10.25| 1109.67|1271.32|[StreamVbyte FPF](#StreamVbyte)|
+|400000000| 100.00| 32.00| 2240.24|2237.05|Copy|
+
##### - Data files:
- gov2.sorted from [DocId data set](#DocId data set) Block size=128 (lz4+blosc+VSimple w/ 64Ki)
@@ -262,11 +285,12 @@ header files to use with documentation:
### References:
- + [FastPFor](https://github.com/lemire/FastPFor) + [Simdcomp](https://github.com/lemire/simdcomp): SIMDPack FPF, Vbyte FPF, VarintG8IU
+ + [FastPFor](https://github.com/lemire/FastPFor) + [Simdcomp](https://github.com/lemire/simdcomp): SIMDPack FPF, Vbyte FPF, VarintG8IU
+ [Optimized Pfor-delta compression code](http://jinruhe.com): OptPFD/OptP4, Simple16 (limited to 28 bits integers)
+ [MaskedVByte](http://maskedvbyte.org/). See also: [Vectorized VByte Decoding](http://engineering.indeed.com/blog/2015/03/vectorized-vbyte-decoding-high-performance-vector-instructions/)
+ [Index Compression Using 64-Bit Words](http://people.eng.unimelb.edu.au/ammoffat/abstracts/am10spe.html): Simple-8b (speed optimized version tested)
+ [libfor](https://github.com/cruppstahl/for)
+ + [QMX: Compression, SIMD, and Postings Lists](http://www.cs.otago.ac.nz/homepages/andrew/papers/)
+ [lz4](https://github.com/Cyan4973/lz4). included w. block size 64K as indication. Tested after preprocessing w. delta+transpose
+ [blosc](https://github.com/Blosc/c-blosc). blosc is like transpose/shuffle+lz77. Tested blosc+lz4 and blosclz incl. vectorizeed shuffle.
see also [benchmarks from the author of blosc](https://github.com/powturbo/TurboPFor/issues/2) single+multithreading