Commit Graph

68 Commits

Author SHA1 Message Date
6e225b41f1 Resolves: fdo#55707 Word count incorrect if language is set to Finnish
Change-Id: I283dddaa4bd8baf05b90ce5f81d43b785021a3c4
2014-05-12 17:08:24 +01:00
69a74afb07 Avoid expensive dlopen thrash for break iterators.
Change-Id: I770c1b3e5164cb486b5a5c2b1259f713914a1bae
2014-05-12 10:56:33 +01:00
70cc2b191b First batch of adding SAL_OVERRIDE to overriding function declarations
...mostly done with a rewriting Clang plugin, with just some manual tweaking
necessary to fix poor macro usage.

Change-Id: I71fa20213e86be10de332ece0aa273239df7b61a
2014-03-26 16:39:26 +01:00
d8fd158759 fdo#72219: Fix for corruption of symbols in docx
Issue:
OUString uses UTF-16, so a character outside the BMP is stored as a
surrogate pair of 2 values, not just 1.
This caused an assertion failure in the "rtl_uString_iterateCodePoints" method.

erAck: Underlying cause was that the dictionary breakiterator misused UTF-16 positions as Unicode code point positions.

Change-Id: I923485f56c2d879b63687adaea2b489a3479991c
Reviewed-on: https://gerrit.libreoffice.org/6955
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Eike Rathke <erack@redhat.com>
2014-01-08 19:38:45 +00:00
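
The surrogate-pair issue described in d8fd158759 can be illustrated with a minimal C++ sketch. It uses std::u16string rather than OUString, and countCodePoints is a hypothetical helper, not LibreOffice code. It only shows why a UTF-16 position and a code point position diverge once a character outside the BMP appears.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not LibreOffice API): counts Unicode code points
// in a UTF-16 string by treating a valid high+low surrogate pair as one.
std::size_t countCodePoints(const std::u16string& text)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const char16_t unit = text[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        // Skip the low surrogate that completes the pair, if present.
        if (isHigh && i + 1 < text.size()
            && text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF)
        {
            ++i;
        }
        ++count;
    }
    return count;
}
```

For the string u"a\U0001D11Eb" this returns 3 while size() reports 4 code units, so an index computed in one unit cannot be reused as an index in the other.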
7d3999f2f2 Remove UTF-8 comment.
This breaks windows build with localized versions of MSVC.

Change-Id: I23c46830f96ae661eced88352476e7ae61fbcc2a
Reviewed-on: https://gerrit.libreoffice.org/6847
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Eike Rathke <erack@redhat.com>
2013-11-28 23:30:54 +01:00
1730df0127 remove unnecessary RTL_CONSTASCII_STRINGPARAM in OString::append
Convert code like:
   aOStringBuf.append( RTL_CONSTASCII_STRINGPARAM( " is missing )") );
to:
   aOStringBuf.append( " is missing )" );
which compiles down to the same code.

Change-Id: I3d8ed0cbf96a881686524a167412d5f303c06b71
2013-11-20 10:07:32 +02:00
610b2b94b3 remove unnecessary use of OUString constructor when assigning
change code like
   aStr = OUString("xxxx");
to
   aStr = "xxxx";

Change-Id: Ib981a5cc735677ec5dba76ef9279a107d22e99d4
2013-11-19 10:29:31 +02:00
1ba111343e bugs.freedesktop.org -> bugs.libreoffice.org
Change-Id: I56c1190c93333636981acf2dd271515170a8a904
2013-11-17 08:33:01 +01:00
f24fa8efad Add Lao breakiterator support for selecting and counting Lao words.
Change-Id: I6da721dc25394dfee12e3028aefbf0546d1be984
Reviewed-on: https://gerrit.libreoffice.org/6669
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
2013-11-13 16:25:58 +00:00
e52779d2f8 remove unnecessary use of OUString constructor
Change-Id: Ifb220af71857ddacd64e8204fb6d3e4aad8eef71
2013-11-11 11:21:26 +02:00
0e6a2601b3 Convert code that calls OUString::getStr()[] to use the [] operator
This also means that this code now gets bounds checked in debug builds.

Change-Id: Id777f85eaee6a737bbcb84625e6e110abe0e0f27
2013-11-04 08:06:10 +02:00
76e735b26a enable building against RHEL-6 system icu
Change-Id: I56f08d58d8d8a0e397412580451c90f9605bcb46
2013-08-30 17:30:01 +01:00
03993b47c5 targeted clean of redundant header piece from 62badf3828
Change-Id: Ic1240114d667fb7797afae4847427cc889f3cb48
2013-07-26 14:18:52 +01:00
addc791623 Fix icu version checks.
commit 30c303 "Make charmap.cxx compile with icu >= 4.4." was incomplete
and had wrong version checks. After ICU 4.8 (4.8.1.1) the next version
of ICU was 49 (49.1) so U_ICU_VERSION_MAJOR_NUM contains two digits (49),
earlier than that it was just one digit (4). The correct header to include to
do version checks is unicode/uversion.h. USCRIPT_MANDAEAN is the old
alias of USCRIPT_MANDAIC (same numeric value). U_JG_FARSI_YEH is only
available since ICU 4.4. Note that on older icu versions (4.2.1) the
200B (ZWSP) Zero Width Space breakiterator testcase fails (others
succeed).

Change-Id: If73c1402239a28546077437e9382f0bd38642bad
Reviewed-on: https://gerrit.libreoffice.org/4139
Reviewed-by: Luboš Luňák <l.lunak@suse.cz>
Tested-by: Luboš Luňák <l.lunak@suse.cz>
2013-06-03 11:16:37 +00:00
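
A version check that stays correct across the jump from 4.8 to 49 could look like the sketch below. icuAtLeast is a hypothetical helper written for illustration and is not taken from the commit. The point is that major and minor are compared as a pair, so both "4.8" and "49.1" compare as newer than "4.4".

```cpp
// Hypothetical helper: true if version (major, minor) is at least
// (reqMajor, reqMinor). Comparing the pair avoids assuming that the
// major number is always a single digit.
constexpr bool icuAtLeast(int major, int minor, int reqMajor, int reqMinor)
{
    return major > reqMajor || (major == reqMajor && minor >= reqMinor);
}

/* With the ICU macros from unicode/uversion.h, the equivalent
   preprocessor test for "ICU >= 4.4" would be:

   #if U_ICU_VERSION_MAJOR_NUM > 4 || (U_ICU_VERSION_MAJOR_NUM == 4 && U_ICU_VERSION_MINOR_NUM >= 4)
*/
```

Under this comparison 4.2 fails the 4.4 requirement, while 4.8 and 49.1 both pass.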
e2e2cc6114 remove usage of RTL_CONSTASCII_USTRINGPARAM
Mechanical removal of usage together with OUString ctor, done
by compiler plugin.

Change-Id: I554227f76df0dac620b1b46fca32516f78b462c5
2013-05-06 16:51:45 +02:00
62badf3828 Move to MPLv2 license headers, with ESC decision and author's permission.
2013-04-22 09:37:38 +01:00
1946794ae0 mass removal of rtl:: prefixes for O(U)String*
Modules sal, salhelper, cppu, cppuhelper, codemaker (selectively) and odk
have kept them, in order not to break external API (the automatic using declaration
is LO-internal).

Change-Id: I588fc9e0c45b914f824f91c0376980621d730f09
2013-04-07 14:23:11 +02:00
39d45390f4 removal of RTL_CONSTASCII_USTRINGPARAM for quoted OUStrings declarations
s/(OUString\s+[a-zA-Z_][A-Za-z0-9_]*\s*)\(\s*RTL_CONSTASCII_USTRINGPARAM\s*\((\s*"[^")]*?"\s*)\)\s*\)/$1\($2\)/gms

Change-Id: Iad20f242c80c4bdc69df17e2d7a69d58ea53654b
Reviewed-on: https://gerrit.libreoffice.org/2835
Reviewed-by: Thomas Arnhold <thomas@arnhold.org>
Tested-by: Thomas Arnhold <thomas@arnhold.org>
2013-03-19 10:48:30 +00:00
8b27d78b4a automated removal of RTL_CONSTASCII_USTRINGPARAM for quoted OUStrings
Done with a perl regex:

s/OUString\s*\(\s*RTL_CONSTASCII_USTRINGPARAM\s*\((\s*"[^")]*?"\s*)\)\s*\)/OUString\($1\)/gms

Change-Id: Idf28320817cdcbea6d0f7ec06a9bf51bd2c3b3ec
Reviewed-on: https://gerrit.libreoffice.org/2832
Reviewed-by: Thomas Arnhold <thomas@arnhold.org>
Tested-by: Thomas Arnhold <thomas@arnhold.org>
2013-03-19 09:00:26 +00:00
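
The effect of the substitution in 8b27d78b4a can be reproduced with std::regex as a rough sketch. stripUStringParam is a hypothetical name, and the original change was made with Perl, not C++. The pattern is copied from the commit message. The Perl /m and /s flags have no effect here because the pattern contains no anchors and no dot.

```cpp
#include <regex>
#include <string>

// Hypothetical helper mirroring the Perl substitution from the commit:
// OUString(RTL_CONSTASCII_USTRINGPARAM("...")) becomes OUString("...").
std::string stripUStringParam(const std::string& source)
{
    static const std::regex pattern(
        R"re(OUString\s*\(\s*RTL_CONSTASCII_USTRINGPARAM\s*\((\s*"[^")]*?"\s*)\)\s*\))re");
    // regex_replace rewrites every match, like the /g flag in Perl.
    return std::regex_replace(source, pattern, "OUString($1)");
}
```

Because the character class excludes both the quote and the closing parenthesis, a literal that contains ")" is left untouched by this pattern.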
9be25f14bd avoid Wundef for various FIXME, FEATURE_NOT_DONE_YET and what not
Change-Id: I8e409ba63d32dca9a1c7f09d143165d1d702d642
2013-03-18 17:15:56 +01:00
616c6924f1 s/the the/the/
Change-Id: Iadacffaad832c6ff06757e8567e24f929f24a4c3
2013-02-22 09:58:19 +02:00
79a3c9b186 partly revert 92a9b7780c6e13a4da3b12794342edbc4c09ef51 for ICU < 49
Re-enable build with ICU 4.6 and 4.8
ICU versions prior to 49 don't know Conditional_Japanese_Starter and
Hebrew_Letter

Also, the change in i18npool/CustomTarget_breakiterator.mk

- -e "s#\[:LineBreak = Close_Punctuation:\]#\[& \[:LineBreak = Close_Parenthesis:\]\]#" \

with i18npool/source/breakiterator/data/line.txt

-$CL = [:LineBreak = Close_Punctuation:] ;
+$CL = [:LineBreak = Close_Parenthesis:];

did not produce equivalent results. Instead use

$CP = [:LineBreak =  Close_Parenthesis:];
$CL = [[:LineBreak =  Close_Punctuation:] $CP];

Change-Id: I14fc14319ea34f23393264560452a79bb49fc3a7
2013-01-02 19:23:03 +01:00
92a9b7780c follow logical consequences of a minimum icu version of 4.6
since commit f20ed8959bc0a984177377a734d34f767653625b

Change-Id: I4f2fc5d9eb7a581b9ed707a3c3f96be817141846
2012-12-29 18:02:25 +00:00
5584761358 Related: #112623# add regression test for Japanese word break rules
Change-Id: I05baf163cc00d3770b9a8b25b099ffcbd9623a2f
2012-09-19 14:38:02 +01:00
346cf4ee5d Related: #i50172# add regression test for Tamil cursoring
Change-Id: I8f6c3814aa3630f5f640f611fb51ce72641715c6
2012-09-07 12:39:58 +01:00
9b69085c5d Related: #i80412# add regression test for Indic language cursoring
Change-Id: Ia1cc6ade8d2122abf5469ec521b2883961121a04
2012-09-07 12:21:15 +01:00
08de755c07 Related: #i107843# add regression test for em-dash/en-dash spell checking
Change-Id: I8d9aad9ac648aefdd1f31e09fe2ea84a698c0013
2012-09-07 10:04:05 +01:00
ca00d27e33 Related: fdo#54486 add some regression tests for ordinal suffixes
Change-Id: Iea51d777c3cc1fdc58fa7fccfe01e4e8394e79b2
2012-09-06 11:14:50 +01:00
edfa6cd3e5 Related: #i103552# add regression test for シャットダウン
Change-Id: I30117fdf70036f6df8dc494fe33a8a56d5544635
2012-08-31 10:30:35 +01:00
2fa8271155 Related: #i113785# add regression test for ligatures
Change-Id: I46fca6dc8e77571afda5ceb230dc6c93f730703d
2012-08-29 11:43:50 +01:00
8c205bfccd Related: #i58513# add regression test for Finnish break iterator rules
Change-Id: I5b8c1190db08f781143fd8d12b007dc05a4d6046
2012-08-28 13:26:11 +01:00
60cfa64345 Bin no longer used iOS cppunit stuff that even breaks the build
Change-Id: I459f7fd097a81ef5977974f52b0cc2c2f155a810
2012-08-02 19:16:53 +03:00
fb66fd63d4 stray fprintf
Change-Id: Icd10968e886be1d534e8037db6225e83712384ee
2012-07-27 16:09:45 +01:00
db8853d7a5 add regression test for #i19716#
Change-Id: I11440667bdf73ed09ebc83771acf33e2d3e61f6c
2012-07-27 15:45:27 +01:00
2cf7896039 add regression test for #i21290#
Change-Id: Ic60f440f8dc8fcfa76a023557e76fcf8e3c52476
2012-07-27 15:45:27 +01:00
071a0dc02c Related: #i85411# Catalan word breaking rules out of sync with ZWNJ
I can see no reason to have specific Catalan rules; old examples
work fine with default rules

Change-Id: Ifacb7b46204d8aed543ab0c77fe80d1d5c5de738
2012-07-25 11:37:58 +01:00
5a6279df55 Related: #i17155# regression test for slash part of word for line break
Change-Id: I5b457531fb94f66dd7f5fdcc4636c5a202a036f1
2012-07-25 10:02:15 +01:00
b8fa8841c0 Related: #i13451# regression test for Catalan dictionary word breakiterator
Change-Id: I7785746b2cf4e5e054ced5b728dc69e6b1a966f2
2012-07-25 10:02:15 +01:00
dcb28419b0 Related: #i29548# Thai word breakiterator regression test
Change-Id: Ie47dfe6ab5e308c0353d557fe7530a983622db96
2012-07-25 10:02:14 +01:00
43b75d8af0 Disable testWordBoundaries test with old icu.
Change-Id: I8d75eeb2eee43e1338a6f54c4b8ed633631bac0d
2012-07-24 15:42:51 +02:00
a1cb33edbb Related: #i13494# regression tests for word iterator
Change-Id: Ifad0a8ae01386db80a5eca9dfba8ee6841980139
2012-07-23 15:42:41 +01:00
b0f170d7df 0xFFF9 is a better choice for CH_TXTATR_INWORD than 0x0002.
a) the default properties for the code point make it not split a word it
appears in into two different words in any break mode we have. Which is what we
want from a CH_TXTATR_INWORD

b) unicode TR#20 gives for the interlinear annotation anchor: "What to do if
detected: In a proxy context or browser context, remove U+FFF9", so when we
need to strip it from text to run that text through e.g. the spellchecker or
word counting then there's a solid precedent for stripping it

In addition I *do* want the footnote placeholder to break the word it appears
in, that gives the desired wordcount and cursor travelling behaviour

The BREAKWORD and other *random* selection of CH_TXTATR are still odd choices,
and there's way too many of them.

Change-Id: I930ff8ff806af448829bc1a1ae6cb92053e9a284
2012-07-18 17:27:09 +01:00
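
The stripping step mentioned in point b) of b0f170d7df can be sketched as follows. This is an illustration on std::u16string. The names kInWordPlaceholder and stripInWordPlaceholder are hypothetical, and only the value 0xFFF9 comes from the commit message.

```cpp
#include <algorithm>
#include <string>

// Value taken from the commit message; the constant name is hypothetical.
constexpr char16_t kInWordPlaceholder = 0xFFF9;

// Hypothetical helper: returns a copy of the text with every U+FFF9
// removed, e.g. before handing it to a spellchecker or word counter.
std::u16string stripInWordPlaceholder(std::u16string text)
{
    text.erase(std::remove(text.begin(), text.end(), kInWordPlaceholder),
               text.end());
    return text;
}
```

Since U+FFF9 does not split the word it appears in, removing it leaves the surrounding word intact for the downstream consumer.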
1a0d0762ea beef up the join and break tests
Change-Id: Ia34c2f18cfa84447578604ff27a9145d17bf354a
2012-07-18 17:27:09 +01:00
b4f077af54 move test to right category
Change-Id: If2cb8da2a24331cc01fed85750747fff3d2fc8e0
2012-07-18 17:27:08 +01:00
f8f05d43de Resolves: fdo#49629 GotoEndOfWord fails with footnote at word end
a) remove special handling of 0x0002 in our custom icu rules.
   Which brings us a step closer to getting rid of at least
   some of them in favour of the defaults
b) expand the 0x02 in SwTxtNode::BuildConversionMap like we
   do for fields so

Good side effect is our word count and character count now take into account
the actual footnote indicator text, as does our cursor travelling. Both of
which are more word-alike.

Change-Id: I3b0024ac4b10934bee7a9e83b0fce08a18556c7b
2012-07-17 14:11:05 +01:00
413554a3bd Related: fdo#49629 add test case for #i14904#
Change-Id: I2bed0272eade44ab988f2cd9becb1f8ef0f232a9
2012-07-13 13:28:40 +01:00
2cf6778842 Related: fdo#49629 add test case for #i21907#
Change-Id: Ie1dd9091e4d8ee09c9a75eecf28fd6cd06ea1839
2012-07-13 13:01:35 +01:00
52280c2988 Related: fdo#49629 add test case for #i11993#
Change-Id: I4466b57514352620fd26072544ec6e50bf08708c
2012-07-13 12:49:28 +01:00
b32fcb79af skip Khmer test on older 'broken' icu versions
Change-Id: Iab813f5288af1f0e054c022c4e4a99b92c7ce1ce
2012-07-13 11:11:04 +01:00
8ad1d4443e Resolves: fdo#52020 ICU breakiterator not used for Khmer
Change-Id: I4c99129cabe70f17aa223cf8ec0ae1529188b6b7
2012-07-13 09:49:02 +01:00