DEV: Update nokogiri to 1.18.1 (#30554)

Nokogiri/libxml is now stricter about the params it receives.

It uses kwargs instead of an options object (I fixed an issue there in #30545), doesn't accept nil/blank html (fixed here), and, most importantly, handles encoding differently. It seems to require explicitly specifying UTF-8.
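
As an illustration of the nil/blank and encoding points, here is a minimal sketch of normalizing input before it reaches the parser. The helper name `html_for_parser` is hypothetical and not part of this commit; it only assumes the parser wants a non-nil UTF-8 string.

```ruby
# Hypothetical helper, not part of this commit: normalize input before
# passing it to a parser that rejects nil and expects UTF-8.
def html_for_parser(html)
  # nil becomes "", so the parser never receives nil.
  text = html.to_s

  # Transcode to UTF-8 if the string is tagged with another encoding,
  # replacing bytes that cannot be converted.
  unless text.encoding == Encoding::UTF_8
    text = text.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
  end

  # Replace any invalid byte sequences left in a UTF-8 tagged string.
  text.scrub
end

# A Latin-1 tagged string is transcoded rather than passed through as-is.
latin1 = "caf\xE9".b.force_encoding(Encoding::ISO_8859_1)
wrapped = "<html><body>#{html_for_parser(latin1)}</body></html>"
```

The resulting string could then be handed to a parser constructed with an explicit encoding, as this commit does with `Nokogiri::HTML4::SAX::Parser.new(me, Encoding::UTF_8)`.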

* Build(deps): Bump nokogiri from 1.16.8 to 1.18.1

Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.16.8 to 1.18.1.
- [Release notes](https://github.com/sparklemotion/nokogiri/releases)
- [Changelog](https://github.com/sparklemotion/nokogiri/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.16.8...v1.18.1)

---
updated-dependencies:
- dependency-name: nokogiri
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Jarek Radosz
2025-01-07 12:05:39 +01:00
committed by GitHub
parent c1a46995a7
commit affe26f0dd
7 changed files with 14 additions and 11 deletions

@@ -277,7 +277,7 @@ class DiscourseDiff
   def self.tokenize(html)
     me = new
-    parser = Nokogiri::HTML::SAX::Parser.new(me)
+    parser = Nokogiri::HTML4::SAX::Parser.new(me, Encoding::UTF_8)
     parser.parse("<html><body>#{html}</body></html>")
     me.tokens
   end