loongoffice

Files

Miklos Vajna b38730ae0a sw html import: fix handling of CDATA

If the HTML contained markup like <![CDATA[...]]>, we simply ignored
it during import, even though e.g. the ODT import handles it
correctly.

The reason for this is that the svtools/ HTMLParser had code to parse
<!-- ... --> style comments, but none for CDATA.

Fix the problem by introducing a new HtmlTokenId::CDATA, producing
matching token content in HTMLParser::GetNextToken_(), and finally
mapping it to normal text on the Writer side.
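
The three steps above can be illustrated with a small standalone sketch.
This is not the svtools/ or Writer code: TokenKind, Token, scanCData()
and toText() are hypothetical names, and only the CDATA token name comes
from the description above. It assumes the tokenizer sees the input as a
plain character buffer.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical token kinds; only "CData" mirrors a name from the commit text.
enum class TokenKind { Text, CData };

struct Token
{
    TokenKind kind;
    std::string content;
};

// If `input` has a CDATA section starting at `pos`, consume it and return a
// token holding the section body. Otherwise leave `pos` untouched and return
// std::nullopt so the caller can fall back to its other parsing paths.
std::optional<Token> scanCData(std::string_view input, std::size_t& pos)
{
    constexpr std::string_view open = "<![CDATA[";
    constexpr std::string_view close = "]]>";

    if (pos > input.size() || input.substr(pos, open.size()) != open)
        return std::nullopt;

    const std::size_t bodyStart = pos + open.size();
    const std::size_t bodyEnd = input.find(close, bodyStart);
    if (bodyEnd == std::string_view::npos)
        return std::nullopt; // unterminated section: not handled in this sketch

    pos = bodyEnd + close.size();
    return Token{ TokenKind::CData,
                  std::string(input.substr(bodyStart, bodyEnd - bodyStart)) };
}

// Consumer side of the sketch: a CDATA token is treated like ordinary text.
std::string toText(const Token& token) { return token.content; }
```

Calling scanCData() on "<p><![CDATA[a < b]]></p>" with pos set to 3
yields a token whose content is "a < b" and moves pos past the closing
"]]>".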

Note that HtmlTokenId doesn't allow non-on-off tokens past ONOFF_START,
nor does it allow inserting a single token before ONOFF_START (that
breaks getOnToken()), so for now just add a second, dummy token to
avoid breakage.
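
Why a single extra token is a problem can be sketched as follows. The
sketch assumes that on/off tokens are paired by parity (on token even,
off token odd) and that getOnToken() clears the lowest bit; the text
above only states that a single insertion breaks getOnToken(), so the
parity scheme, the enum layout and all names other than CDATA,
ONOFF_START and getOnToken() are assumptions made for illustration.

```cpp
// Hypothetical, heavily shortened token enum. Two tokens (CDATA and DUMMY)
// are added before ONOFF_START so that ONOFF_START stays even.
enum class TokenId : int
{
    NONE = 0,
    TEXT,                  // 1
    CDATA,                 // 2: the new token
    DUMMY,                 // 3: padding, keeps the parity of what follows
    ONOFF_START,           // 4
    BODY_ON = ONOFF_START, // 4
    BODY_OFF,              // 5
    PARA_ON,               // 6
    PARA_OFF               // 7
};

// Assumed pairing rule: past ONOFF_START, odd values are "off" tokens.
constexpr bool isOffToken(TokenId t)
{
    return t >= TokenId::ONOFF_START && (static_cast<int>(t) & 1) != 0;
}

// Assumed mapping from an off token to its on token: clear the lowest bit.
constexpr TokenId getOnToken(TokenId t)
{
    return static_cast<TokenId>(static_cast<int>(t) & ~1);
}

// With the padding in place, every off token maps back to its on token.
static_assert(getOnToken(TokenId::BODY_OFF) == TokenId::BODY_ON);
static_assert(getOnToken(TokenId::PARA_OFF) == TokenId::PARA_ON);

// Without DUMMY, BODY_ON would be 3 and BODY_OFF would be 4. Clearing the
// lowest bit of 4 gives 4 again, so the off token would no longer map to 3.
static_assert((4 & ~1) != 3);
```

Under these assumptions, adding tokens in pairs before ONOFF_START keeps
every on token even, which is why a second, otherwise unused token is
enough to avoid the breakage.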

Change-Id: I605c3c21dc11986fda5d93d36148788a638e97b4
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/141813
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins

2022-10-25 18:15:47 +02:00

unit

…

unoapi

…