Use ax_gcc_archflag.m4 to guess the best -march value for the target architecture, unless cross-compiling or the value is set explicitly with --with-gcc-arch. Also apply AArch64/ThunderX-specific updates to ax_gcc_archflag.m4. In sb_concurrency_kit.m4, add --enable-lse to the CK build flags if the target supports LSE instructions.
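
For illustration only, the -march selection described above can be sketched as a small shell function. This is not the code in ax_gcc_archflag.m4: the function name choose_march_source and its two arguments are hypothetical, and it only reports which path would be taken rather than probing the CPU.

```shell
#!/bin/sh
# Hypothetical helper, not part of ax_gcc_archflag.m4: mirrors the decision
# described in this commit message and prints which path would be taken.
#   $1: "yes" when cross-compiling, anything else otherwise
#   $2: value passed via --with-gcc-arch, or empty when the option is absent
choose_march_source() {
    cross_compiling=$1
    with_gcc_arch=$2

    if [ -n "$with_gcc_arch" ]; then
        # An explicitly requested architecture takes precedence over guessing.
        echo "explicit:$with_gcc_arch"
    elif [ "$cross_compiling" = "yes" ]; then
        # The build host says nothing about the target CPU, so do not guess.
        echo "skip"
    else
        # Native build: leave it to the macro to guess from the host CPU.
        echo "guess"
    fi
}
```

For example, `choose_march_source no ""` reports the guessing path, while `choose_march_source yes armv8.1-a` reports the explicit value (armv8.1-a is only an example architecture name).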