Hello,
Using cppcheck-htmlreport to convert raw cppcheck error reports to
HTML fails for some files because of their encoding.
Here's an example message:
cppcheck/htmlreport/cppcheck-htmlreport", line 287, in <module>
content = input_file.read()
File "/usr/lib/python2.7/codecs.py", line 296, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfc in position 3546:
invalid start byte
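To illustrate the error itself, here is a minimal Python sketch (not the code of cppcheck-htmlreport; the sample bytes and function names are made up for the example). Byte 0xfc is "ü" in ISO-8859-1 but can never start a UTF-8 sequence, so a strict UTF-8 decode fails, while a tolerant decode substitutes a replacement character.

```python
# Hypothetical sample: "Bündel" encoded as ISO-8859-1, so "ü" is the single byte 0xfc.
SAMPLE = b"B\xfcndel"


def decode_strict(data):
    """Decode as UTF-8; return None when the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_tolerant(data):
    """Decode as UTF-8, replacing each undecodable byte with U+FFFD."""
    return data.decode("utf-8", errors="replace")
```

A tolerant decode lets a report be generated, but the replaced characters are lost in the output, so this only works around the failure and does not fix the files.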
Here's the list of files which give this problem:
./hwpfilter/source/hcode.cxx
./hwpfilter/source/hwpread.cxx
./hwpfilter/source/hbox.h
./hwpfilter/source/formula.cxx
./hwpfilter/source/hwpfile.cxx
./hwpfilter/source/hwpeq.cxx
./chart2/source/view/charttypes/Splines.cxx (it contained 2 "ü" characters and was
detected as iso-8859-1 rather than utf8 by "file -i"); now converted (see
http://cgit.freedesktop.org/libreoffice/core/commit/?id=42d494e7925249c36f62206e7268d849437e219d)
./hwpfilter/source/hbox.cxx
./hwpfilter/source/hinfo.cxx
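A list like the one above could also be obtained independently of cppcheck. The following is a rough sketch, assuming that "problem file" simply means "not decodable as strict UTF-8"; the function name and the extension filter are hypothetical choices for the example.

```python
import os


def find_non_utf8_files(root, extensions=(".cxx", ".hxx", ".h")):
    """Return the sorted paths under root whose content is not valid UTF-8."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Read raw bytes so that no implicit decoding happens on open().
            with open(path, "rb") as handle:
                data = handle.read()
            try:
                data.decode("utf-8")
            except UnicodeDecodeError:
                bad.append(path)
    return sorted(bad)
```

Calling find_non_utf8_files("hwpfilter/source") from the top of the source tree would list the candidates. Note that it only says a file is not UTF-8, not which encoding it actually uses.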
I tried ./hwpfilter/source/hinfo.cxx.
Initial view in vi (Debian testing x86-64, French):
56 /**
57 * ¹®¼Á¤º¸¸¦ ÀоîµéÀÌ´Â ÇÔ¼ö ( 128 bytes )
58 * ¹®¼Á¤º¸´Â ÆÄÀÏÀνÄÁ¤º¸( 30 bytes ) ´ÙÀ½¿¡ À§Ä¡ÇÑ Á¤º¸ÀÌ´Ù.
59 */
60 bool HWPInfo::Read(HWPFile & hwpf)
Since the README from hwpfilter mentions "Hangul Word Processor" and "Korea", I
tried "iconv -f EUC-KR -t utf8 hwpfilter/source/hinfo.cxx >
stdout.txt" and got this:
56 /**
57 * 문서정보를 읽어들이는 함수 ( 128 bytes )
58 * 문서정보는 파일인식정보( 30 bytes ) 다음에 위치한 정보이다.
59 */
60 bool HWPInfo::Read(HWPFile & hwpf)
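For reference, the same conversion can be sketched in Python, assuming the file really is EUC-KR (that is a guess based on the README, not something confirmed). The function name is hypothetical; "euc_kr" is a standard Python codec name.

```python
def euc_kr_to_utf8(data):
    """Re-encode EUC-KR bytes as UTF-8.

    Raises UnicodeDecodeError if the input is not valid EUC-KR,
    rather than silently dropping or replacing anything.
    """
    return data.decode("euc_kr").encode("utf-8")
```

ASCII bytes are identical in EUC-KR and UTF-8, so plain ASCII text such as "( 128 bytes )" comes out of this conversion unchanged; only the Korean characters are re-encoded.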
I then tried Google Translate, which detected the language as Korean
(fortunately! :-)) and translated it as:
"Function to read the document information"
which seems OK given the name of the function.
Remark: I don't know what "( 128 bytes )" or "( 30 bytes )" means; is it a
problem in the conversion?
Anyway, would this conversion be OK for these files, or might we lose some
information?
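One way to probe the "do we lose information" question is a round-trip check: decode strictly, encode back to the original encoding, and compare with the original bytes. This is only a sketch under the assumption that EUC-KR is the right source encoding; the function name is hypothetical, and if the files turned out to use another Korean encoding the check would have to be run with that codec name instead.

```python
def is_lossless(data, encoding="euc_kr"):
    """Return True if decoding and re-encoding reproduces data exactly."""
    try:
        text = data.decode(encoding)  # strict: fails on invalid sequences
    except UnicodeDecodeError:
        return False
    return text.encode(encoding) == data
```

If this returns True for the whole file content, the UTF-8 version can be converted back to the exact original bytes, so nothing is lost at the byte level. It does not prove that EUC-KR is the encoding the original author intended.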
Of course, I would rather have cppcheck fail the HTML conversion of some reports
than lose important information in these files.
It may also be a cppcheck bug or a Python bug which should be fixed.
Any ideas?
Julien
--
View this message in context:
http://nabble.documentfoundation.org/What-encoding-is-used-tp4105106.html
Sent from the Dev mailing list archive at Nabble.com.