i.e. without child elements, and also drop the xml-lang attribute, which
was previously changed to implied in the DTD and defaults to en-US
unless specified.
This change doesn't affect translations, since only the content of the
paragraphs/headings is extracted.
Done with:
perl -CSD -pi -e 'BEGIN {$base = qr/role="heading"|level="(?<level>\d+)"|(?<id>id="[^"]+")/;} s#<paragraph(((\s+($base)){3})|(\s+($base|xml-lang="en-US")){4})>(?<body>[^<]+)</paragraph>#<h$+{level} $+{id}>$+{body}</h$+{level}>#g'
(This matches all permutations of the attribute order, with
xml-lang="en-US" being optional/implied.)
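As an illustration only, the effect of the substitution on a single line
can be sketched in Python. This is not the command that was run: Python's
re module does not accept the repeated named groups the Perl pattern
relies on, so the sketch parses the attributes into a dict instead and
is slightly stricter (each attribute must appear exactly once). The
function name and the sample ids in the usage note are hypothetical.

```python
import re

# A <paragraph ...> element whose body contains no child elements.
PARAGRAPH = re.compile(r'<paragraph((?:\s+[\w-]+="[^"]*")+)>([^<]+)</paragraph>')
ATTRIBUTE = re.compile(r'([\w-]+)="([^"]*)"')


def convert_headings(line: str) -> str:
    """Rewrite heading paragraphs as <hN id="..."> elements."""

    def replace(match: re.Match) -> str:
        attrs = dict(ATTRIBUTE.findall(match.group(1)))
        # xml-lang="en-US" is optional/implied, so drop it before checking.
        if attrs.get("xml-lang") == "en-US":
            del attrs["xml-lang"]
        is_heading = (
            set(attrs) == {"role", "level", "id"}
            and attrs["role"] == "heading"
            and attrs["level"].isdigit()
            and attrs["id"] != ""
        )
        if not is_heading:
            return match.group(0)  # leave every other paragraph untouched
        level = attrs["level"]
        return f'<h{level} id="{attrs["id"]}">{match.group(2)}</h{level}>'

    return PARAGRAPH.sub(replace, line)
```

Calling convert_headings() on a line such as
<paragraph role="heading" id="hd_id1" xml-lang="en-US" level="2">Title</paragraph>
rewrites it to an h2 element carrying the same id, while paragraphs with
other roles or with child elements are returned unchanged.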
Change-Id: I365a2bb983a3969af9390753fce7b7f3597c7b8b
Reviewed-on: https://gerrit.libreoffice.org/c/help/+/148795
Tested-by: Jenkins
Reviewed-by: Olivier Hallot <olivier.hallot@libreoffice.org>
Replacement done with:
find . -name \*.xhp -print0 |xargs -0 -P 0 perl -CS -pi -e \
's#(<link[^>]*?) +name *="[^"]*" *( [^>]+|) *>#$1$2>#g'
(Note that the files are inconsistent: some have a space between name
and =, and some have an empty value. The expression is also somewhat
more complicated so that it cleans up double spaces before/after the
attribute.)
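To make the pattern easier to follow, here is a minimal Python sketch
that applies the same regular expression to single strings. It is only
an illustration of what the Perl substitution matches, not the command
that was run; the function name and the sample links in the usage note
are hypothetical.

```python
import re

# Same pattern as the Perl one-liner: group 1 is everything before the
# name attribute, group 2 is any attributes that follow it.
LINK_NAME = re.compile(r'(<link[^>]*?) +name *="[^"]*" *( [^>]+|) *>')


def strip_link_name(text: str) -> str:
    """Remove the name attribute from every <link> start tag in text."""
    return LINK_NAME.sub(r"\1\2>", text)
```

For example, strip_link_name('<link href="a.xhp"  name="x" >T</link>')
drops the attribute together with the surrounding extra spaces, and
strip_link_name('<link name ="" href="a.xhp">T</link>') handles both the
space before = and the empty value.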
Translation files will be prepped with:
find */helpcontent2 -name \*.po -print0 |xargs -0 -P 0 perl -CS -pi -e \
$'s#(<link[^>]*?) +name=(?:\\\\"[^"]*\\\\"|\'[^\']*\') *( [^>]+|) *(/?>)#$1$2$3#g unless /^#/'
(Note that not all languages use " as the quote character for the
attributes; single quotes also appear in the po files. Hence the use of
the shell $'string' syntax, which allows quoting ' as \'. That syntax
also requires quoting the backslash, so it needs to be escaped once for
the shell and then again for perl. Obsolete strings are skipped as well
(those are prefixed with #~ in the po files).
Also note that <link..></link> gets turned into <link ../> during
translation extraction (along with removal of the space between the
attribute name and the value), so the pattern needs to be slightly
different here.)
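The po variant of the pattern can be sketched in Python as well, which
sidesteps the shell layer of escaping and shows what the expression
matches once it reaches the regex engine. Again this is an illustration
under assumptions, not the command that was run; the function name and
the sample msgstr lines in the usage note are hypothetical.

```python
import re

# In a raw string, \\" is the regex for a literal backslash followed by
# a double quote, which is how quotes appear inside po strings.
PO_LINK_NAME = re.compile(
    r'''(<link[^>]*?) +name=(?:\\"[^"]*\\"|'[^']*') *( [^>]+|) *(/?>)'''
)


def strip_link_name_po(line: str) -> str:
    """Remove name attributes from <link> tags on one line of a po file."""
    # Lines starting with # (including obsolete "#~" entries) are skipped.
    if line.startswith("#"):
        return line
    return PO_LINK_NAME.sub(r"\1\2\3", line)
```

Given a line such as msgstr "<link href=\"a.xhp\" name=\"Neu\"/>", the
function returns the same line without the name attribute and keeps the
self-closing />. A line beginning with #~ is returned unchanged.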
Change-Id: I95e53a08e6b0095cd894109ea0de154cc4859d8f
Reviewed-on: https://gerrit.libreoffice.org/c/help/+/143713
Tested-by: Jenkins
Reviewed-by: Christian Lohmaier <lohmaier+LibreOffice@googlemail.com>