The per thread cache is implemented so that there is one single
thread cache per thread.
This uses shared_ptr, which is available in different places
depending on which GCC version is default on the target platform.
So, before this is merged to develop, that needs to be sorted out.