This change standardises the `User-Agent` header that Discourse will send when talking to other sites.
`Discourse.user_agent` is now the authority on what the user agent value should be. For Onebox requests, this changes the user agent from its existing value to the new value (unless overridden).
For all other requests, `Net::HTTPHeader` is monkey-patched to add a default `User-Agent` header when one hasn't been provided.
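A rough sketch of how such a default could be injected, not the exact patch from this commit. `DEFAULT_USER_AGENT` is a stand-in for `Discourse.user_agent` and the module name is hypothetical. The sketch prepends to `Net::HTTPGenericRequest` (which includes `Net::HTTPHeader`) rather than to `Net::HTTPHeader` itself, so that it behaves the same across Ruby versions.

```ruby
require "net/http"

# Hypothetical stand-in for Discourse.user_agent.
DEFAULT_USER_AGENT = "Discourse Forum Software"

# Hypothetical module name. Adds a User-Agent header only when the caller
# has not supplied one; header names are matched case-insensitively.
module DefaultUserAgentHeader
  def initialize_http_header(initheader)
    # Copy the hash so the caller's headers are not mutated.
    headers = (initheader || {}).dup
    key = headers.keys.find { |k| k.to_s.downcase == "user-agent" } || "User-Agent"
    headers[key] ||= DEFAULT_USER_AGENT
    super(headers)
  end
end

Net::HTTPGenericRequest.prepend(DefaultUserAgentHeader)
```

With this in place, `Net::HTTP::Get.new("/")["User-Agent"]` returns the default, while a request built with an explicit `"user-agent"` header keeps the caller's value.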
Previously, the user agent for Onebox requests could not be changed dynamically. This commit adds a new hidden site setting, `onebox_user_agent`, which overrides the default user agent value specified in the [initializer](c333e9d6e6/config/initializers/100-onebox_options.rb (L15)).
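The override could be resolved along these lines. This is only an illustration: `effective_onebox_user_agent` is a hypothetical helper, and its two arguments stand in for the `onebox_user_agent` site setting and the default value.

```ruby
# Hypothetical helper: prefer the site-setting override when it is non-blank,
# otherwise fall back to the default user agent.
def effective_onebox_user_agent(override, default)
  override.to_s.strip.empty? ? default : override
end
```

An empty or unset setting therefore leaves the default in effect.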
Co-authored-by: Régis Hanol <regis@hanol.fr>
In 95a82d608d6377faf68a0e2c5d9640b043557852, we lowered the default for
`Onebox.options.max_download_kb` from 10 MB to 2 MB for security
hardening. However, this led to multiple bug reports where seemingly
normal URLs stopped being oneboxed. Lowering
`Onebox.options.max_download_kb` caused
`Onebox::Helpers::DownloadTooLarge` to be raised for more URLs in
`Onebox::Helpers.fetch_response`, which `Onebox::Helpers.fetch_html_doc`
relies on. When `Onebox::Helpers::DownloadTooLarge` is raised in
`Onebox::Helpers.fetch_response`, we throw away whatever part of the
response body has already been downloaded. This is not ideal because
Nokogiri can parse incomplete HTML documents, and there is a high chance
that the incomplete document still contains the information we need for
oneboxing.
Therefore, this commit updates `Onebox::Helpers.fetch_html_doc` to keep
the response body when its size exceeds
`Onebox.options.max_download_kb`. Instead of discarding it, we take
whatever has been downloaded so far and have Nokogiri parse it.
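The idea of keeping a partial body can be sketched as follows. This is not the actual `fetch_html_doc` code: `read_partial_body` is a hypothetical helper, and the chunks are plain strings rather than a real HTTP stream.

```ruby
# Hypothetical helper: accumulate chunks until the size limit is reached and
# return what has been read so far instead of raising and discarding it.
# The result may exceed max_bytes by at most one chunk, which avoids
# splitting a multibyte character mid-chunk.
def read_partial_body(chunks, max_bytes)
  body = +""
  chunks.each do |chunk|
    body << chunk
    break if body.bytesize >= max_bytes
  end
  body
end

chunks = ["<html><head><title>Hi</title>", "</head><body>", "x" * 100]
partial = read_partial_body(chunks, 30)
# `partial` still contains the <title> element, so an HTML parser that
# tolerates truncated documents (such as Nokogiri) can extract it.
```

Since the metadata used for oneboxing usually sits in the document's `<head>`, a truncated body is often enough.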
We have a custom implementation of #blank? in our Onebox helpers. This is likely a legacy from when Onebox was a standalone gem. This change replaces all usages with respective incarnations of #blank?, #present?, and #presence from ActiveSupport. It changes a bunch of "unless blank" to "if present" as well.
It's very easy to forget to add `require 'rails_helper'` at the top of every core/plugin spec file, and omissions can cause some very confusing/sporadic errors.
By adding the `--require rails_helper` flag to `.rspec`, we can remove the need for `require 'rails_helper'` entirely.
* Move onebox gem into core library
* Update template file path
* Remove warning for onebox gem caching
* Remove onebox version file
* Remove onebox gem
* Add sanitize gem
* Require onebox library in lazy-yt plugin
* Remove onebox web specific code
  This code was used in the standalone onebox Sinatra application.
* Merge Discourse specific AllowlistedGenericOnebox engine in core
* Fix onebox engine filenames to match class name casing
* Move onebox specs from gem into core
* DEV: Rename `response` helper to `onebox_response`
  Fixes a naming collision.
* Require rails_helper
* Don't use `before/after(:all)`
* Whitespace
* Remove fakeweb
* Remove poor unit tests
* DEV: Re-add fakeweb, plugins are using it
* Move onebox helpers
* Stub Instagram API
* FIX: Follow additional redirect status codes (#476)
  Don’t throw errors if we encounter 303, 307 or 308 HTTP status codes in responses.
* Remove an empty file
* DEV: Update the license file
  Using the copy from https://choosealicense.com/licenses/gpl-2.0/#
  Hopefully this will enable GitHub to show the license UI?
* DEV: Update embedded copyrights
* DEV: Add Onebox copyright notice
* DEV: Add MIT license, convert COPYRIGHT.txt to md
* DEV: Remove an incorrect copyright claim
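
The redirect fix listed above (#476) could look roughly like this. It is a sketch under assumptions: the constant and method names are hypothetical, and it assumes 301 and 302 were already being followed before the fix.

```ruby
# Assumption: 301 and 302 were already followed; 303, 307 and 308 are the
# additional codes named in the fix.
REDIRECT_STATUS_CODES = [301, 302, 303, 307, 308].freeze

# Hypothetical helper. Accepts integers or the string codes that
# Net::HTTPResponse#code returns.
def redirect_status?(code)
  REDIRECT_STATUS_CODES.include?(code.to_i)
end
```

A fetch loop would then follow the `Location` header whenever `redirect_status?(response.code)` is true, instead of raising.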
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
Co-authored-by: jbrw <jamie@goatforce5.org>