FIX: Increase uploads.origin column length to 2000 to accommodate longer S3 pre-signed URLs for user uploads. (#31803)

The current limit is too small for the way Discourse currently stores a
pre-signed S3 URL for each upload in the form of:

```
https://{bucketname}.s3.dualstack.{region}.amazonaws.com/original/1X/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.jpeg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=XXXXXXXXXXXXXXXXXXXX%2FYYYYMMDD%2F{region}%2Fs3%2Faws4_request&X-Amz-Date=YYYYMMDDTHHMMSSZ&X-Amz-Expires=xxxxxx&X-Amz-SignedHeaders=host&X-Amz-Security-Token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
```

The problem here is that this URL, without the S3 bucket name and AWS
region portion of the URL, is already nearly 1000 chars long.

If you have an even slightly longer bucket name or region, it will
easily go over 1000.

---

The more proper fix would be not to store these "template" S3 pre-signed
URLs in the `origin` column to begin with.

S3 pre-signed URLs aren't meant to be persisted (they're by nature
"one-time" use, scoped to a short window), and they have to be
re-generated for each user request anyway, because they have a maximum
validity of 7d (in practice Discourse generates them with a lifetime of
300s), so the value stored in the `origin` column is more of a
"template" that gets discarded and a real pre-signed URL generated
on-the-fly each time a user comes anyway. So the value in the `origin`
column isn't even doing anything.

The proper way would be to store the S3 bucket name, region, and object
key, which is compact, and the three pieces of info from which it is
sufficient to generate pre-signed URLs each time a user requests.
This commit is contained in:
Kevin Hwang
2025-03-19 18:05:35 -07:00
committed by GitHub
parent bdc30ca3a0
commit 0cdbd40866
3 changed files with 12 additions and 2 deletions

View File

@ -681,7 +681,7 @@ end
# created_at :datetime not null
# updated_at :datetime not null
# sha1 :string(40)
# origin :string(1000)
# origin :string(2000)
# retain_hours :integer
# extension :string(10)
# thumbnail_width :integer

View File

@ -0,0 +1,10 @@
# frozen_string_literal: true
class IncreaseUploadsOriginColumnLength < ActiveRecord::Migration[7.2]
def up
change_column :uploads, :origin, :string, limit: 2000
end
def down
change_column :uploads, :origin, :string, limit: 1000
end
end

View File

@ -175,7 +175,7 @@ class UploadCreator
)
@upload.original_sha1 = SiteSetting.secure_uploads? ? sha1 : nil
@upload.url = ""
@upload.origin = @opts[:origin][0...1000] if @opts[:origin]
@upload.origin = @opts[:origin][0...2000] if @opts[:origin]
@upload.extension = image_type || File.extname(@filename)[1..10]
if is_image && !external_upload_too_big