Thanks Noel!

I’ve poked through https://github.com/LibreOffice/core/tree/master/sw/source/filter/html and couldn’t find anything hinting at expanding HTML entities. Looking at the parser you indicated, looks like this code would cover HTML entities:

https://github.com/LibreOffice/core/blob/master/svtools/source/svhtml/parhtml.cxx#L394-L622

which are listed here:

https://dev.w3.org/html5/html-author/charref

I’ll take a closer look…

Cheers,
Jens

On Thu, Sep 27, 2018 at 11:24:40AM +0200, Noel Grandin wrote:
> On 2018/09/27 11:10 AM, Jens Tröger wrote:
>> I’ve been poking through the HTML reader (rather superficially, I admit) in search for the code that expands HTML entities to Unicode. I did that to
>
> Probably HTMLParser::ScanText at svtools/source/svhtml/parhtml.cxx:394

-- 
Jens Tröger
http://savage.light-speed.de/