51 Commits

Author SHA1 Message Date
Sam
e0c952290b FIX: increase inventory lag for s3 to 2 days (#11606)
Inventory on S3 has always lagged, and over the past few weeks we have noticed that
1 day of lag is not enough.

We are increasing this to 2, to ensure that we do not get false positive
reports.
2020-12-30 16:05:42 +11:00
9f6c4ad71a FIX: inconsistency in S3 inventory config (#11112)
Ensures it matches S3 inventory config generation in our hosting.
2020-11-05 08:39:40 -05:00
80268357e7 DEV: Change upload verified column to be integer (#10643)
Per review https://review.discourse.org/t/dev-add-verified-to-uploads-and-fill-in-s3-inventory-10406/14180

Change the verified column for Upload to a verified_status integer column, to avoid having NULL as a weird implicit status.
2020-09-17 13:35:29 +10:00
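
The move from a nullable boolean to an explicit integer status, described in the commit above, can be sketched roughly as follows. This is only an illustration: the constant name, the status names, and the integer values are hypothetical, since the commit message does not list them.

```ruby
# Hypothetical status values; the commit message does not state the real ones.
VERIFICATION_STATUSES = { unchecked: 1, verified: 2, missing: 3 }.freeze

# Maps the old three-state boolean (true / false / nil) onto an explicit
# integer, so "not checked yet" is no longer an implicit NULL.
def verification_status(old_verified)
  case old_verified
  when nil then VERIFICATION_STATUSES[:unchecked]
  when true then VERIFICATION_STATUSES[:verified]
  else VERIFICATION_STATUSES[:missing]
  end
end

p [nil, true, false].map { |v| verification_status(v) }
```

With an explicit status, a query for uploads that have not been checked yet can match on a concrete value instead of relying on NULL semantics.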
06b4ca5dc7 FIX: Mark only uploads as verified/unverified in S3 inventory 2020-09-14 10:21:34 -04:00
bd0a7553c4 DEV: Detect when s3 inventory failure is caused by etag difference (#10427) 2020-08-13 09:30:28 +10:00
b950b3fb3f DEV: Add verified to uploads and fill in S3 inventory (#10406)
When we run the S3 inventory, mark uploads that exist as verified true, those that don't as verified false, and uploads not included in the check / not yet checked as verified nil.
2020-08-11 14:43:51 +10:00
16c65a94f7 PERF: Preload S3 inventory data for multisite clusters 2020-07-29 10:31:55 +01:00
ec4024fe6d FIX: Keep by_users check in S3 inventory
Partial revert of 8515d8fa - the by_users check is ensuring we don't raise errors for fixtures
2020-07-21 17:19:56 +01:00
8515d8fae5 FIX: Improve S3 inventory logic
Previously we considered 'upload rows without etags' to be exempt from the check. This is bad, because older/migrated sites might not have etags on all their uploads. We should consider rows without etags to be broken, since we can't check them against the inventory.

This also removes the `by_users` scope. We need all uploads to be working, even ones created by the system user.
2020-07-21 15:55:53 +01:00
3d65678a13 DEV: Add timestamp columns to optimized_images table (#10199)
This allows us to filter by created/updated date when comparing to an S3 inventory.
2020-07-14 11:50:33 +01:00
7f2b5a446a PERF: Remove post_upload recovery in daily EnsureS3UploadsExistence job (#10173)
This is a very expensive process, and it should only be required in exceptional circumstances. It is possible to run a similar recovery using `rake uploads:recover` (5284d41a8e/lib/upload_recovery.rb (L135-L184))
2020-07-06 16:26:40 +01:00
38a30a6e96 DEV: correct regression and correct tests
etag change in 31976ecf was incorrect, revert it

Also correct regression in test suite.
2020-07-06 10:56:19 +10:00
31976ecfeb PERF: only update etag when it changes
Previously when synchronizing upload etags we would update every single one
regardless of change.
2020-07-06 10:40:04 +10:00
73b04976e5 FIX: Use updated_at in the S3 inventory job (#8823)
When we change upload's sha1 (e.g. when resizing images) it won't match the data in the most recent S3 inventory index. With this change the uploads that have been updated since the inventory has been generated are ignored.
2020-01-31 11:02:44 +01:00
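
A minimal sketch of the filtering idea in the commit above, assuming plain Ruby objects rather than the real models. `UploadRow` and `rows_to_check` are hypothetical names used only for illustration; the actual job works against the database.

```ruby
# Hypothetical stand-in for an uploads row; the real model has more columns.
UploadRow = Struct.new(:id, :etag, :updated_at)

# Returns the rows that can safely be compared against an inventory generated
# at `inventory_time`. Rows updated at or after that time are skipped, because
# the inventory cannot reflect their current contents.
def rows_to_check(rows, inventory_time)
  rows.select { |row| row.updated_at < inventory_time }
end

inventory_time = Time.utc(2020, 1, 30)
rows = [
  UploadRow.new(1, "aaa", Time.utc(2020, 1, 29)), # unchanged since inventory
  UploadRow.new(2, "bbb", Time.utc(2020, 1, 31)), # resized after inventory
]

p rows_to_check(rows, inventory_time).map(&:id) # prints [1]
```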
3b7f5db5ba FIX: parallel spec system needs a dedicated upload folder for each worker. (#8547) 2019-12-18 11:21:57 +05:30
68708db721 DEV: S3Inventory#unsorted_files should always return an array (#8034) 2019-08-23 17:59:31 +10:00
e53a171916 FIX: hold s3 related distributed locks longer
These operations are pretty expensive and can take multiple minutes due to
networking.

Hold distributed mutex for much longer.
2019-08-15 11:48:44 +10:00
9919ee1900 FIX: remove the tmp inventory files after the s3 uploads check. 2019-08-13 11:52:57 +05:30
8a64b0c8e8 Revert "DEV: Remove unused kwarg and properly check for local missing uploads."
This reverts commit 97769f3d0226642905cf0605aeb0bc69d7295ca1.

The code is confusing but this change is quite risky. Defer for now
until we can look at it properly.
2019-07-29 14:35:34 +08:00
97769f3d02 DEV: Remove unused kwarg and properly check for local missing uploads. 2019-07-29 14:21:06 +08:00
47deb8b3da FIX: use same id for both original & optimized inventories in multisite setup. 2019-07-25 14:16:47 +05:30
ad04ce9f43 FIX: remove post upload record creation inside 'find_missing_uploads' method. 2019-07-19 01:44:08 +05:30
35d6fff69e PERF: use url instead of file key in temporary inventory table. 2019-06-13 22:03:58 +05:30
ed21128ee6 FIX: Do not change directory when decompressing S3 inventory
In sidekiq, jobs are run in multiple threads within the same process. `cd` affects the entire process, so can cause unexpected issues in other running jobs.
2019-06-13 17:13:50 +01:00
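
One way to avoid a process-wide `cd`, sketched here with Ruby's standard `zlib` library, is to work with absolute paths only. The helper name `decompress_without_chdir` and the file names are hypothetical, and the actual commit may have solved the problem differently.

```ruby
require "zlib"
require "tmpdir"

# Hypothetical helper: decompress a .gz file next to itself using absolute
# paths only, so the process-wide working directory is never changed.
def decompress_without_chdir(gz_path)
  gz_path = File.expand_path(gz_path)
  out_path = gz_path.sub(/\.gz\z/, "")

  Zlib::GzipReader.open(gz_path) do |gz|
    File.open(out_path, "wb") { |out| IO.copy_stream(gz, out) }
  end

  out_path
end

# Usage: build a small archive in a temp dir and decompress it.
Dir.mktmpdir do |dir|
  gz_path = File.join(dir, "inventory.csv.gz")
  Zlib::GzipWriter.open(gz_path) { |gz| gz.write("bucket,key,etag\n") }

  cwd_before = Dir.pwd
  csv_path = decompress_without_chdir(gz_path)

  puts File.read(csv_path)
  puts Dir.pwd == cwd_before # the working directory is untouched
end
```

Because no thread ever calls `Dir.chdir`, other jobs running in the same process keep resolving relative paths against the directory they expect.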
d74ee9dbce DEV: skip S3 inventory records without correct multisite prefix. 2019-06-08 18:36:06 +05:30
2941c77abc FIX: skip upload recovery if file not found in s3 2019-05-21 00:06:36 +05:30
2a7065c505 FIX: skip uploads without etag in s3 inventory check. 2019-05-20 00:09:52 +05:30
3172172b52 remove unused local variable
ec84c87ddbae307405951fe6606f348c26bc2b07
2019-05-16 15:39:13 +05:30
ec84c87ddb FIX: skip validation while recovering uploads from s3
TODO: add tests
2019-05-16 15:37:11 +05:30
40328f055e FIX: retrieve original filename from s3 object's content disposition header 2019-05-16 09:47:22 +05:30
dd49be27d3 DEV: Fix undefined variable.
Follow up to e8fafbc123170dd1f7d2a8adea4e7810585d3e76.
2019-05-16 11:28:48 +08:00
f5a217be92 Fix typo in condition value. 2019-05-07 17:09:08 +05:30
e8fafbc123 List and restore missing post uploads from S3 inventory. 2019-05-04 01:16:20 +05:30
73418aaf73 DEV: Add bucket folder path to inventory id 2019-05-02 04:35:35 +05:30
a8f410a9c5 FEATURE: Create new helper method 'Discourse.stats' (#7388) 2019-04-17 12:45:04 +05:30
35431a8ddb FIX: set missing count in redis even if zero 2019-04-04 20:05:57 +05:30
df6ef856e6 DEV: save missing s3 uploads count in redis 2019-04-04 19:05:57 +05:30
243fb8d9ad Fix the build. 2019-03-13 17:39:07 +08:00
da1ff2da2c FIX: Create and consume temp table inside a transaction (#7030)
This prevents access issues with pgbouncer, which runs in transaction pooling mode.
2019-02-20 13:52:40 +11:00
563b953224 DEV: Add 'backfill_etags_' to the method name since it also backfills the etags 2019-02-19 21:54:35 +05:30
0472bd4adc FIX: Remove 'backfill_etags' keyword argument from 'uploads:missing' rake task
The etag backfilling code is also optimized.
2019-02-15 00:34:35 +05:30
b5fbd7385f FIX: run the rake task only for uploads created at least a day before the inventory date 2019-02-14 17:53:08 +05:30
a9a8855739 DEV: Get only matching records to backfill etags 2019-02-14 06:27:18 +05:30
e2f7db5549 Fix typo 2019-02-14 05:56:30 +05:30
7b5931013a Update rake task to backfill etags from s3 inventory 2019-02-14 05:18:06 +05:30
b8d2549922 FIX: OptimizedImage model doesn't have 'created_at' date column 2019-02-14 03:46:00 +05:30
426bd810f1 FIX: S3 inventory can have duplicate etags 2019-02-14 03:44:14 +05:30
1045bbc35b FIX: S3 inventory data can be split into multiple csv files 2019-02-14 03:41:52 +05:30
ba9cc83d4c FIX: Destination prefix in S3 inventory configuration is incorrect 2019-02-06 20:51:28 +05:30
ff12c4b2d4 FIX: Bucket name is missing in S3 inventory data path 2019-02-06 19:16:08 +05:30