Improvements to mbox importer

* store time it took to index message in DB (to find performance issues)
* ignore listserv specific files
* better examples for split_regex
* first email in mbox shouldn't contain the split string
* always lock the DB in exclusive mode
* save email within transaction
* messages can be grouped by subject and use original order (for Listserv)
* adds option to index emails without running the import
This commit is contained in:
Gerhard Schlager
2018-01-17 12:03:57 +01:00
parent 5d7a33cd6d
commit bb54eb1192
7 changed files with 134 additions and 49 deletions

@@ -31,6 +31,7 @@ class ImportScripts::Base
     @site_settings_during_import = {}
     @old_site_settings = {}
     @start_times = { import: Time.now }
+    @skip_updates = false
   end

   def preload_i18n
@@ -46,14 +47,16 @@ class ImportScripts::Base
     puts ""

-    update_bumped_at
-    update_last_posted_at
-    update_last_seen_at
-    update_user_stats
-    update_feature_topic_users
-    update_category_featured_topics
-    update_topic_count_replies
-    reset_topic_counters
+    unless @skip_updates
+      update_bumped_at
+      update_last_posted_at
+      update_last_seen_at
+      update_user_stats
+      update_feature_topic_users
+      update_category_featured_topics
+      update_topic_count_replies
+      reset_topic_counters
+    end

     elapsed = Time.now - @start_times[:import]
     puts '', '', 'Done (%02dh %02dmin %02dsec)' % [elapsed / 3600, elapsed / 60 % 60, elapsed % 60]
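
The effect of the new flag can be illustrated with a minimal sketch. `SketchImporter` is a hypothetical stand-in, not the real `ImportScripts::Base`: its update steps only record their names, and the `skip_updates:` keyword argument is an assumption made for illustration (the hunks above only show the flag defaulting to `false`). The elapsed-time format string is copied from the diff.

```ruby
# Hypothetical stand-in for ImportScripts::Base; the update steps only
# record that they ran instead of touching a database.
class SketchImporter
  attr_reader :steps

  def initialize(skip_updates: false)
    # Assumption: the caller decides whether to skip, e.g. for an index-only run.
    @skip_updates = skip_updates
    @steps = []
  end

  def perform
    @steps << :execute
    # Same guard as in the diff: post-import updates run unless the flag is set.
    unless @skip_updates
      @steps << :update_bumped_at
      @steps << :update_user_stats
    end
    @steps
  end

  # Same format string as in the diff; `elapsed` is a duration in seconds.
  def self.format_elapsed(elapsed)
    'Done (%02dh %02dmin %02dsec)' % [elapsed / 3600, elapsed / 60 % 60, elapsed % 60]
  end
end

puts SketchImporter.new.perform.inspect
puts SketchImporter.new(skip_updates: true).perform.inspect
puts SketchImporter.format_elapsed(3725) # 3725 s = 1 h 2 min 5 s
```

With the flag set, only the `execute` step is recorded, which mirrors how an index-only run would avoid the costly counter and statistics updates.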