Use a SIMD string searcher and SIMD memcmp to optimize the split_by_string and substring_index functions.
split_by_string improves by 32%~540%.
substring_index improves by 22%~46%.
The gain depends on the needle size and on whether the needle is a constant parameter. The longer the needle, the larger the improvement.
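
For readers unfamiliar with the technique, below is a minimal sketch of how a SIMD string searcher can drive a split. It is not the code in this PR. The names `find_first_sketch` and `split_by_string_sketch` are hypothetical. The sketch assumes SSE2 (with a scalar fallback), uses plain `std::memcmp` where the PR uses a SIMD memcmp, and returns the whole input for an empty delimiter, which may differ from the real function. The idea: compare the first and last needle bytes against 16 candidate positions at once, and run the full comparison only on positions where both match.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: index of the first occurrence of `needle` in `hay`
// at or after `from`, or std::string_view::npos if there is none.
inline std::size_t find_first_sketch(std::string_view hay, std::string_view needle,
                                     std::size_t from = 0) {
    const std::size_t n = hay.size();
    const std::size_t m = needle.size();
    if (m == 0) return from <= n ? from : std::string_view::npos;
    if (n < m || from > n - m) return std::string_view::npos;
    std::size_t i = from;
#if defined(__SSE2__)
    const __m128i first = _mm_set1_epi8(needle.front());
    const __m128i last = _mm_set1_epi8(needle.back());
    // Each iteration filters 16 candidate start positions i..i+15.
    // The second load reads up to hay[i + m - 1 + 15], hence the bound.
    while (i + m - 1 + 16 <= n) {
        const __m128i a = _mm_loadu_si128(
                reinterpret_cast<const __m128i*>(hay.data() + i));
        const __m128i b = _mm_loadu_si128(
                reinterpret_cast<const __m128i*>(hay.data() + i + m - 1));
        unsigned mask = static_cast<unsigned>(_mm_movemask_epi8(
                _mm_and_si128(_mm_cmpeq_epi8(a, first), _mm_cmpeq_epi8(b, last))));
        // Verify only the positions whose first and last bytes both matched.
        for (unsigned bit = 0; mask != 0; ++bit, mask >>= 1) {
            if ((mask & 1u) == 0) continue;
            assert(i + bit + m <= n);
            if (std::memcmp(hay.data() + i + bit, needle.data(), m) == 0) {
                return i + bit;
            }
        }
        i += 16;
    }
#endif
    // Scalar tail (and the whole search when SSE2 is unavailable).
    for (; i + m <= n; ++i) {
        if (hay[i] == needle[0] &&
            std::memcmp(hay.data() + i, needle.data(), m) == 0) {
            return i;
        }
    }
    return std::string_view::npos;
}

// Hypothetical split built on the searcher above.
// Assumption of this sketch: an empty delimiter yields the input unchanged.
inline std::vector<std::string> split_by_string_sketch(std::string_view s,
                                                       std::string_view delim) {
    std::vector<std::string> out;
    if (delim.empty()) {
        out.emplace_back(s);
        return out;
    }
    std::size_t pos = 0;
    while (true) {
        const std::size_t hit = find_first_sketch(s, delim, pos);
        if (hit == std::string_view::npos) {
            out.emplace_back(s.substr(pos));  // trailing piece, possibly empty
            break;
        }
        out.emplace_back(s.substr(pos, hit - pos));
        pos = hit + delim.size();
    }
    return out;
}
```

In this sketch, a longer needle makes the first-and-last-byte filter more selective, so fewer candidates reach the full comparison. That is one plausible reason the measured gain grows with needle length.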