Previously the code was very race condition prone leading to
odd failures in production
It was re-written in raw SQL to avoid conditions where rows
conflict on inserts
There is no clean way in ActiveRecord to do:
Insert, on conflict do nothing and return existing id.
This also increases test coverage, we were previously not testing
the code responsible for crawling external sites directly