Fix the --mutex-loops description to say that it is the number of empty loops performed outside the mutex lock. Also make the loops truly empty by using a compiler barrier instead of incrementing a local variable.
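
For illustration, here is a minimal sketch of an empty loop kept alive by a compiler barrier. It assumes GCC/Clang inline-asm syntax. The names COMPILER_BARRIER and spin_empty_loops are hypothetical and are not taken from the patch itself.

```c
#include <stddef.h>

/*
 * Compiler barrier (assumes GCC/Clang inline-asm syntax).
 * It emits no machine instructions, but the compiler may neither delete it
 * nor move memory accesses across it, so a loop containing it is not
 * optimized away even though its body does no work.
 */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical helper: run `loops` iterations with an empty body.
 * Returns the number of iterations executed so callers can check it.
 */
static size_t spin_empty_loops(size_t loops)
{
    size_t i;

    for (i = 0; i < loops; i++)
        COMPILER_BARRIER();

    return i;
}
```

Compared with incrementing a local variable, the barrier adds no arithmetic to the loop body, and it does not rely on the optimizer leaving an otherwise dead store in place.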