13 Commits

Author SHA1 Message Date
Gary Pendergast
8615fc6cbb
DEV: Add a user agent to all HTTP requests that Discourse makes. (#31555)
This change standardises the `User-Agent` header that Discourse will send when talking to other sites.

`Discourse.user_agent` is now the authority on what the user agent value should be. For Onebox requests, this changes the user agent from their existing value to match the new value (unless overridden).

For all other requests, `Net::HTTPHeader` is monkey-patched to add a default `User-Agent` header when one hasn't been provided.
2025-03-03 16:32:25 +11:00
Loïc Guitaut
5055a071b8 FIX: Allow to follow non-ASCII canonical links for oneboxes 2025-02-04 15:40:23 +01:00
Vinoth Kannan
d681decf01
FEATURE: use new site setting for onebox custom user agent. (#28045)
Previously, we couldn't change the user agent name dynamically for onebox requests. In this commit, a new hidden site setting `onebox_user_agent` is created to override the default user agent value specified in the [initializer](c333e9d6e6/config/initializers/100-onebox_options.rb (L15)).

Co-authored-by: Régis Hanol <regis@hanol.fr>
2024-07-24 04:45:30 +05:30
Alan Guo Xiang Tan
c8da2a33e8
FIX: Attempt to onebox even if response body exceeds max_download_kb (#26929)
In 95a82d608d6377faf68a0e2c5d9640b043557852, we lowered the default for
`Onebox.options.max_download_kb` from 10mb to 2mb for security hardening
purposes. However, this resulted in multiple bug reports where seemingly
nomral URLs stopped being oneboxed. It turns out that lowering
`Onebox.options.max_download_kb` resulted in `Onebox::Helpers::DownloadTooLarge` being raised
more often for more URLs  in `Onebox::Helpers.fetch_response` which
`Onebox::Helpers.fetch_html_doc` relies on. When
`Onebox::Helpers::DownloadTooLarge` is raised in
`Onebox::Helpers.fetch_response`, we throw away whatever response body
which we have already downloaded at that point. This is not ideal
because Nokogiri can parse incomplete HTML documents and there is a
really high chance that the incomplete HTML document still contains the
information which we need for oneboxing.

Therefore, this commit updates `Onebox::Helpers.fetch_html_doc` to not
throw away the response body when the size of the response body exceeds
`Onebox.options.max_download_size`. Instead, we just take whatever
response which we have and get Nokogiri to parse it.
2024-05-09 07:00:34 +08:00
Ted Johansson
60e624e768
DEV: Replace custom Onebox blank implementation with ActiveSupport (#23827)
We have a custom implementation of #blank? in our Onebox helpers. This is likely a legacy from when Onebox was a standalone gem. This change replaces all usages with respective incarnations of #blank?, #present?, and #presence from ActiveSupport. It changes a bunch of "unless blank" to "if present" as well.
2023-10-07 19:54:26 +02:00
David Taylor
cb932d6ee1
DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
David Taylor
68b4fe4cf8
SECURITY: Expand and improve SSRF Protections (#18815)
See https://github.com/discourse/discourse/security/advisories/GHSA-rcc5-28r3-23rr

Co-authored-by: OsamaSayegh <asooomaasoooma90@gmail.com>
Co-authored-by: Daniel Waterworth <me@danielwaterworth.com>
2022-11-01 16:33:17 +00:00
Loïc Guitaut
3eaac56797 DEV: Use proper wording for contexts in specs 2022-08-04 11:05:02 +02:00
Bianca Nenciu
e7f04a8674
FIX: Use URI#merge to merge base and relative URLs (#17454)
The old implementation did not handle all cases, such as the case when
`src` is a relative URL that starts with `..`.
2022-07-18 14:17:54 +03:00
David Taylor
c9dab6fd08
DEV: Automatically require 'rails_helper' in all specs (#16077)
It's very easy to forget to add `require 'rails_helper'` at the top of every core/plugin spec file, and omissions can cause some very confusing/sporadic errors.

By setting this flag in `.rspec`, we can remove the need for `require 'rails_helper'` entirely.
2022-03-01 17:50:50 +00:00
Arpit Jalan
05bdbd9f97
SECURITY: Onebox canonical links bypassing FinalDestination checks (#13605) 2021-07-01 20:09:29 +05:30
Arpit Jalan
b63c9febe8
FIX: ignore canonical link to localhost (#13577) 2021-06-30 13:55:17 +05:30
Arpit Jalan
283b08d45f
DEV: Absorb onebox gem into core (#12979)
* Move onebox gem in core library

* Update template file path

* Remove warning for onebox gem caching

* Remove onebox version file

* Remove onebox gem

* Add sanitize gem

* Require onebox library in lazy-yt plugin

* Remove onebox web specific code

This code was used in standalone onebox Sinatra application

* Merge Discourse specific AllowlistedGenericOnebox engine in core

* Fix onebox engine filenames to match class name casing

* Move onebox specs from gem into core

* DEV: Rename `response` helper to `onebox_response`

Fixes a naming collision.

* Require rails_helper

* Don't use `before/after(:all)`

* Whitespace

* Remove fakeweb

* Remove poor unit tests

* DEV: Re-add fakeweb, plugins are using it

* Move onebox helpers

* Stub Instagram API

* FIX: Follow additional redirect status codes (#476)

Don’t throw errors if we encounter 303, 307 or 308 HTTP status codes in responses

* Remove an empty file

* DEV: Update the license file

Using the copy from https://choosealicense.com/licenses/gpl-2.0/#

Hopefully this will enable GitHub to show the license UI?

* DEV: Update embedded copyrights

* DEV: Add Onebox copyright notice

* DEV: Add MIT license, convert COPYRIGHT.txt to md

* DEV: Remove an incorrect copyright claim

Co-authored-by: Jarek Radosz <jradosz@gmail.com>
Co-authored-by: jbrw <jamie@goatforce5.org>
2021-05-26 15:11:35 +05:30