FIX: Support new layout on Amazon product pages (#16091)

Some product pages on Amazon are using a new HTML structure, meaning the previous Onebox engine was unable to gather the price and/or description. This change should allow these pages to be Oneboxed.
jbrw
2022-03-04 18:31:53 -05:00
committed by GitHub
parent d760fd4074
commit fc30669db2
3 changed files with 10537 additions and 3 deletions

@@ -124,7 +124,12 @@ module Onebox
   elsif !raw.css("#priceblock_ourprice").inner_text.empty?
     raw.css("#priceblock_ourprice").inner_text
   else
-    raw.css(".mediaMatrixListItem.a-active .a-color-price").inner_text
+    result = raw.css('#corePrice_feature_div .a-price .a-offscreen').inner_text
+    if result.blank?
+      result = raw.css(".mediaMatrixListItem.a-active .a-color-price").inner_text
+    end
+    result
   end
 end
@@ -215,8 +220,10 @@ module Onebox
   summary = raw.at("#productDescription")
-  description = og.description || summary&.inner_text
-  description ||= raw.css("meta[name=description]").first&.[]("content")
+  description = og.description || summary&.inner_text&.strip
+  if description.blank?
+    description = raw.css("meta[name=description]").first&.[]("content")
+  end
   result[:description] = CGI.unescapeHTML(Onebox::Helpers.truncate(description, 250)) if description
 end
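
The description hunk swaps `||=` for an explicit blank check. A minimal plain-Ruby sketch of why that may matter, assuming the new layout can yield an empty or whitespace-only `#productDescription` node. The helper `blank_value?` and both method names are hypothetical stand-ins for illustration (the real code relies on ActiveSupport's `blank?`), and the sample strings are made up:

```ruby
# Hypothetical stand-in for ActiveSupport's `blank?`, so the sketch runs on plain Ruby.
def blank_value?(value)
  value.nil? || value.to_s.strip.empty?
end

# Old behaviour: `||=` only falls back when the left side is nil or false.
# A whitespace-only string is truthy in Ruby, so the meta tag is never consulted.
def description_with_or_assign(og_description, summary_text, meta_content)
  description = og_description || summary_text
  description ||= meta_content
  description
end

# New behaviour: strip the summary, then fall back whenever the result is blank.
def description_with_blank_check(og_description, summary_text, meta_content)
  description = og_description || summary_text&.strip
  description = meta_content if blank_value?(description)
  description
end

# Assumed inputs: no Open Graph description, a whitespace-only summary node,
# and a populated meta description tag.
OLD_RESULT = description_with_or_assign(nil, "  \n ", "A sample product description")
NEW_RESULT = description_with_blank_check(nil, "  \n ", "A sample product description")
```

Under these assumptions the old path keeps the whitespace-only string, while the new path falls through to the meta description. The price hunk follows the same "try the new selector, fall back if blank" idea.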