Commit Graph

143 Commits

Author SHA1 Message Date
fc30669db2 FIX: Support new layout on Amazon product pages (#16091)
Some product pages on Amazon are using a new HTML structure, meaning the previous Onebox engine was unable to gather the price and/or description. This change should allow these pages to be Oneboxed.
2022-03-04 18:31:53 -05:00
2fc70c5572 DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
7afe768d60 DEV: Add tests for wistia onebox. (#15860)
Follow-up to 4ef56b0ca4d997e5d0b331c488c83b7f574721df
2022-02-08 13:04:32 +08:00
4ef56b0ca4 FIX: Explicitly set allowfullscreen on Wistia Oneboxes (#15828) 2022-02-08 13:02:32 +11:00
5b5cbbfe5c FEATURE: Onebox for news.ycombinator.com (#15781) 2022-02-03 13:39:21 -03:00
aac9f43038 Only block domains at the final destination (#15689)
In an earlier PR, we decided that we only want to block a domain if 
the blocked domain in the SiteSetting is the final destination (/t/59305). That 
PR used `FinalDestination#get`. `resolve` however is used several places
 but blocks domains along the redirect chain when certain options are provided.

This commit changes the default options for `resolve` to not do that. Existing
users of `FinalDestination#resolve` are
- `Oneboxer#external_onebox`
- our onebox helper `fetch_html_doc`, which is used in amazon, standard embed 
and youtube
  - these folks already go through `Oneboxer#external_onebox` which already
  blocks correctly
2022-01-31 15:35:12 +08:00
847c77de65 FIX: Add another method to check binary file (#15648)
This method looks for a NULL byte that is not usually contained in text
files. Follow up to 376799b1a49306be500be3419a327af8b03819ec.
2022-01-20 23:47:18 +02:00
376799b1a4 FIX: Hide excerpt of binary files in GitHub onebox (#15639)
Oneboxer did not know if a file is binary or not and always tried to
show an excerpt of the file.
2022-01-19 14:45:36 +02:00
2909b8b820 FIX: origins_to_regexes should always return an array (#15589)
If the SiteSetting `allowed_onebox_iframes` contains a value of `*`, it will use the values of `all_iframe_origins` during the Oneboxing process. If `all_iframe_origins` itself contains a value of `*`, `origins_to_regexes` will try to return a "catch-all" regex.

Other code assumes `origins_to_regexes`will return an array, so this change ensures the `*` case will return an array containing only the catch-all regex.
2022-01-17 12:48:41 -05:00
31b27b3712 FIX: Broken GitHub folder onebox logic (#15612)
1. `html_doc.css('.Box.md')` always returns a truthy value (e.g. `[]`) so the second branch of the if-elsif never ran
2. `node&.css('text()')` was invalid code that would raise an error
3. Matching on h3 elements is no longer correct with the current html structure returned by GitHub
2022-01-17 18:32:07 +01:00
f56eff2303 FIX: limits pre-line impact to tweet text (#15583) 2022-01-14 10:44:21 +01:00
6e925fee6f FIX: Use basic meta description if other description tags are missing (#15356)
When attempting to Onebox a page if there is no `meta property="og:description"` tag but there is a  `meta name="description"` tag, Onebox should try to use that value.
2021-12-17 19:36:54 -05:00
4c46c7e334 DEV: Remove xlink hrefs (#15059) 2021-11-25 15:22:43 +11:00
aec125b617 FIX: Display Instagram Oneboxes in an iframe (#14789)
We are no longer able to display the image returned by Instagram directly within a Discourse site (either in the composer, or within a cooked post within a topic), so:

- Display an image placeholder in the composer preview
- A cooked post should use an iframe to display the Instagram 'embed' content
2021-11-02 14:34:51 -04:00
69f0f48dc0 DEV: Fix rubocop issues (#14715) 2021-10-27 11:39:28 +03:00
3fbfec06fc Update replit onebox to accept .com 2021-10-19 16:37:33 -04:00
ba81d1853b FIX: Disable previews if diffhtml is enabled (#14537)
diffhtml should not rerender video and audio elements so there is no
point in having these.
2021-10-08 15:57:08 +03:00
fbe9cd49b6 FIX: Vimeo private video oneboxes were broken (#14510) 2021-10-05 15:46:58 +05:30
02a6b991fe FIX: Correct the play icon position (#14295) 2021-09-09 15:10:32 +02:00
11a07b37e1 FIX: ignore canonical link for medium.com oneboxes (#14278)
https://meta.discourse.org/t/bug-in-onebox-link-being-rendered-as-a-gist-when-it-isnt/202463
2021-09-08 20:19:57 +05:30
d27d7c8cca FIX: Unescapes hash section with present to account for url-encoded chars
Sections with unreserverd characters will appear url-encoded and need to
be unescaped before using it.

Wikipedia generates 2 different spans in this case in the same page, one
with an id resulting of replacing the % symbols with . and the other with
the decoded version of the string. For example, for /wiki/foo#A%C3%A1A it
will generate:

<span id="A.C3.A1A"></span>
<span id="AáA">AáA</span>

Unescaping the `m_url_hash_name` should work in all cases to target the
proper section span.
2021-08-12 10:43:50 -04:00
bb2c48b065 FIX: update iframe url for simplecast onebox (#13957)
https://meta.discourse.org/t/onebox-regression-simplecast-com/187911
2021-08-05 18:29:04 +05:30
a341dba5d9 FIX: update oEmbed URL for simplecast onebox (#13956) 2021-08-05 17:42:38 +05:30
2f28ba318c FEATURE: Onebox can match engines based on the content_type (#13876)
* FEATURE: Onebox can match engines based on the content_type

`FinalDestination` now returns the `content_type` of a resolved URL.

`Oneboxer` passes this value to `Onebox` itself. Onebox engines can now specify a `matches_content_type` regex of content_types that the engine can handle, regardless of the URL.

`ImageOnebox` will match URLs with a content type of `image/png`, `jpg`, `gif`, `bmp`, `tif`, etc.

This will allow images that exist at a URL without a file type extension to be correctly rendered, assuming a valid `content_type` is returned.
2021-07-30 13:36:30 -04:00
02b84dbff2 DEV: Fix flaky instagram onebox spec by not mutating constant. 2021-07-27 13:54:14 +08:00
8b89787426 SECURITY: Sanitize YouTube Onebox data (#13748)
CVE-2021-32764
2021-07-15 19:31:50 +01:00
a64aea38b7 FIX: Don’t use user_generated images as avatar images in Oneboxed Twitter content (#13712)
By default, Twitter will return the URL for the avatar image of the tweet poster as the `og:image` value.

However, if the `user_generated` attribute is true, we should not use this as the avatar URL as this will be an URL of an image in the tweet itself (e.g., an image belonging to a tweeted news story).
2021-07-13 14:54:28 -04:00
05bdbd9f97 SECURITY: Onebox canonical links bypassing FinalDestination checks (#13605) 2021-07-01 20:09:29 +05:30
b63c9febe8 FIX: ignore canonical link to localhost (#13577) 2021-06-30 13:55:17 +05:30
04baca593b UX: Tweak the timestamp line in Twitter onebox (#13551)
Fixed alignment and made the color less intrusive to make the actual content pop out more.
2021-06-28 15:04:33 +02:00
fa4e5e8dad FEATURE: Render emojis on GitHub labels when oneboxing an issue. (#13531) 2021-06-25 14:48:36 -03:00
e50b7e9111 SECURITY: ensures timeouts are correctly used on connect (#13455) 2021-06-21 17:34:01 +02:00
09bc95d46b FIX: Quoting Oneboxed content should exclude formatting (#13296)
* FIX: Quoting Oneboxed content should exclude formatting

When a post is quoted that includes Oneboxed content, we should not include the formatting generated by the Onebox. Rather, we should attempt to collapse the link referenced by the Onebox to a single line text link.

* DEV: fix tests
2021-06-07 13:03:53 -04:00
2e4f07678e FIX: IMDb links were being oneboxed as posters (#13310)
IMDb movie links were being rendered as posters. This was because
IMDb was sending `og:type` as `image` randomly in some cases. To
fix this we'll now default all IMDb links as article type. This will
ensure that the IMDb onebox link includes all the information instead
of showing just a poster without any context.
2021-06-07 18:45:59 +05:30
461a2c334b FIX: return an empty result if response from Amazon is missing expected attributes (#13173)
* FIX: return an empty result if response from Amazon is missing attributes

Check we have the basic attributes requires to construct a Onebox for Amazon.

This is an attempt to handle scenarios where we receive a valid 200-status response from an Amazon request that does not include the data we’re expecting.

* Update lib/onebox/engine/amazon_onebox.rb

Co-authored-by: Régis Hanol <regis@hanol.fr>

Co-authored-by: Régis Hanol <regis@hanol.fr>
2021-06-01 16:23:18 -04:00
06e1af2b1d FIX: Giphy oneboxing when the response is an image (#13199) 2021-05-28 15:10:32 -04:00
47e09700fe FIX: Support pausing GIFs for giphy/tenor oneboxes (#13194) 2021-05-28 08:40:30 -04:00
723d7de18c Various GitHub Onebox improvements (#13163)
* FIX: Improve GitHub folder regexp in Onebox

It used to match any GitHub URL that was not matched by the other GitHub
Oneboxes and it did not do a good job at handling those. With this
change, the generic Onebox will handle the remaining URLs.

* FEATURE: Add Onebox for GitHub Actions

* FEATURE: Add Onebox for PR check runs

* FIX: Remove image from GitHub folder Oneboxes

It is a generic, auto-generated image which does not provide any value.

* DEV: Add tests

* FIX: Strip HTML comments from PR body
2021-05-27 12:38:42 +03:00
1270c7ad15 UX: Twitter onebox layout adjustments (#13181) 2021-05-27 15:35:32 +10:00
283b08d45f DEV: Absorb onebox gem into core (#12979)
* Move onebox gem in core library

* Update template file path

* Remove warning for onebox gem caching

* Remove onebox version file

* Remove onebox gem

* Add sanitize gem

* Require onebox library in lazy-yt plugin

* Remove onebox web specific code

This code was used in standalone onebox Sinatra application

* Merge Discourse specific AllowlistedGenericOnebox engine in core

* Fix onebox engine filenames to match class name casing

* Move onebox specs from gem into core

* DEV: Rename `response` helper to `onebox_response`

Fixes a naming collision.

* Require rails_helper

* Don't use `before/after(:all)`

* Whitespace

* Remove fakeweb

* Remove poor unit tests

* DEV: Re-add fakeweb, plugins are using it

* Move onebox helpers

* Stub Instagram API

* FIX: Follow additional redirect status codes (#476)

Don’t throw errors if we encounter 303, 307 or 308 HTTP status codes in responses

* Remove an empty file

* DEV: Update the license file

Using the copy from https://choosealicense.com/licenses/gpl-2.0/#

Hopefully this will enable GitHub to show the license UI?

* DEV: Update embedded copyrights

* DEV: Add Onebox copyright notice

* DEV: Add MIT license, convert COPYRIGHT.txt to md

* DEV: Remove an incorrect copyright claim

Co-authored-by: Jarek Radosz <jradosz@gmail.com>
Co-authored-by: jbrw <jamie@goatforce5.org>
2021-05-26 15:11:35 +05:30
8fd46c04ea Drop flash video onebox (#12261)
Flash was discontinued by Adobe at the end of 2020. There is no need to continue OneBox support for it
2021-03-02 17:11:14 +00:00
13c2a4886f FEATURE: Add disable_onebox_media_download_controls hidden site setting (#12208)
Uses discourse/onebox@ff9ec90

Adds a hidden site setting called disable_onebox_media_download_controls which will add controlslist="nodownload" to video and audio oneboxes, and also to the local video and audio oneboxes within Discourse.
2021-02-25 12:39:15 +10:00
0116897ac9 UI: Category Onebox styling changes (#11448)
This commit adjusts the category one box styling to be more in line with the discourse categories UI.
2020-12-09 11:36:05 -06:00
51f9a56137 FEATURE: Onebox local categories (#11311)
* FEATURE: onebox for local categories

This commit adjusts the category onebox to look more like the category boxes do on the category page.

Co-authored-by: Jordan Vidrine <jordan@jordanvidrine.com>
2020-11-25 10:53:05 +11:00
331236d6d7 Onebox improved error handling and support for Instagram Access Tokens (#11253)
* FEATURE: display error if Oneboxing fails due to HTTP error

- display warning if onebox URL is unresolvable
- display warning if attributes are missing

* FEATURE: Use new Instagram oEmbed endpoint if access token is configured

Instagram requires an Access Token to access their oEmbed endpoint. The requirements (from https://developers.facebook.com/docs/instagram/oembed/) are as follows:

- a Facebook Developer account, which you can create at developers.facebook.com
- a registered Facebook app
- the oEmbed Product added to the app
- an Access Token
- The Facebook app must be in Live Mode

The generated Access Token, once added to SiteSetting.facebook_app_access_token, will be passed to onebox. Onebox can then use this token to access the oEmbed endpoint to generate a onebox for Instagram.

* DEV: update user agent string

* DEV: don’t do HEAD requests against news.yahoo.com

* DEV: Bump onebox version from 2.1.5 to 2.1.6

* DEV: Avoid re-reading templates

* DEV: Tweaks to onebox mustache templates

* DEV: simplified error message for missing onebox data

* Apply suggestions from code review
Co-authored-by: Gerhard Schlager <mail@gerhard-schlager.at>
2020-11-18 12:55:16 -05:00
a3577435f7 FEATURE: Additional control of iframes in oneboxes (#10523)
This commit adds a new site setting "allowed_onebox_iframes". By default, all onebox iframes are allowed. When the list of domains is restricted, Onebox will automatically skip engines which require those domains, and use a fallback engine.
2020-08-27 20:12:13 +01:00
e0d9232259 FIX: use allowlist and blocklist terminology (#10209)
This is a PR of the renaming whitelist to allowlist and blacklist to the blocklist.
2020-07-27 10:23:54 +10:00
9bff0882c3 FEATURE: Nokogumbo (#9577)
* FEATURE: Nokogumbo

Use Nokogumbo HTML parser.
2020-05-05 13:46:57 +10:00
fd4ce6ab8f DEV: hbs extensions are misleading in this case (#9170)
This would also prevent any linting tool to attempt to lint this incorrectly.
2020-03-11 14:42:14 +01:00
f795c1b8e8 Revert "DEV: enforces ember-template-lint: no-triple-curlies (#9150)"
This reverts commit d436b600fba4cea846a5e96c235c507441965e05.

Triple curlies are still necessary for some raw templates.
2020-03-10 15:00:12 -03:00