From Muhammad Haggag <mhaggag@gmail.com>:
Muhammad Haggag has uploaded a new change for review.
Change subject: fdo#53399 Word count is inconsistent and wrong with non-breaking space
......................................................................
fdo#53399 Word count is inconsistent and wrong with non-breaking space
This change replaces lcl_IsSkippableWhitespace with a call to ICU's u_isspace, which covers all 
Unicode separators. It also updates and fixes one of the SwScanner unit tests.
Bug details:
SwScanner::NextWord skips whitespace before calling into ICU's BreakIterator. The function used to 
identify whitespace (lcl_IsSkippableWhitespace) doesn't cover the full Unicode space separator 
category (Zs, 18 code points in total; see 
http://www.fileformat.info/info/unicode/category/Zs/index.htm).
Since 0xA0 (no-break space) is not identified as whitespace, it is not skipped, so we end up 
calling ICU with the start position on the 0xA0 character and asking it for the boundary of the 
next word forward. ICU sees that it was called at the end of a word, reverses the query direction 
to backward, and returns the word before. NextWord then concludes that it has hit the end of the 
string and stops, so the words in the rest of the line are never counted.
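The skipping step described above can be illustrated with a small standalone sketch. It is not the code from txtedt.cxx: is_narrow_space is a hypothetical stand-in for a check that misses no-break space (the real body of lcl_IsSkippableWhitespace is not shown in this message), is_unicode_space is a hand-written approximation of what a Unicode-aware check such as ICU's u_isspace accepts, and skip_spaces is an assumed helper name. It only shows where the scan position ends up before the break iterator would be called.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical narrow check: ASCII space, tab and line ends only.
// Assumption for illustration; not the real lcl_IsSkippableWhitespace.
bool is_narrow_space(char32_t c)
{
    return c == U' ' || c == U'\t' || c == U'\n' || c == U'\r';
}

// Hypothetical wider check: the narrow set plus space separators (Zs).
// A hand-written approximation of a Unicode-aware test, not ICU itself.
bool is_unicode_space(char32_t c)
{
    return is_narrow_space(c)
        || c == 0x00A0                      // NO-BREAK SPACE
        || c == 0x1680                      // OGHAM SPACE MARK
        || (c >= 0x2000 && c <= 0x200A)     // EN QUAD .. HAIR SPACE
        || c == 0x202F                      // NARROW NO-BREAK SPACE
        || c == 0x205F                      // MEDIUM MATHEMATICAL SPACE
        || c == 0x3000;                     // IDEOGRAPHIC SPACE
}

// Advance pos past leading whitespace, as judged by the given predicate,
// and return the position where word scanning would start.
std::size_t skip_spaces(const std::u32string& text, std::size_t pos,
                        bool (*is_space)(char32_t))
{
    assert(pos <= text.size());
    while (pos < text.size() && is_space(text[pos]))
        ++pos;
    return pos;
}
```

For the text "one", U+00A0, "two" and a start position of 3 (just past "one"), the narrow predicate leaves the position on the no-break space, which is the situation the bug details describe, while the wider predicate moves it to the start of "two".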
Change-Id: I29c89ddb0b26e88da822501253898856b28e3fa5
---
M sw/qa/core/swdoc-test.cxx
M sw/source/core/txtnode/txtedt.cxx
2 files changed, 11 insertions(+), 12 deletions(-)
  git pull ssh://gerrit.libreoffice.org:29418/core refs/changes/53/453/1
--
To view, visit https://gerrit.libreoffice.org/453
To unsubscribe, visit https://gerrit.libreoffice.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I29c89ddb0b26e88da822501253898856b28e3fa5
Gerrit-PatchSet: 1
Gerrit-Project: core
Gerrit-Branch: master
Gerrit-Owner: Muhammad Haggag <mhaggag@gmail.com>