

Hi all,

I am working on a toolkit that performs text analysis on text documents. The toolkit was originally meant to work with XML files. But since the input files in real applications are mostly *.doc, *.docx, *.odt, *.ppt, *.pptx, *.odp, *.pdf, etc., I need a library that reads these file formats and converts their contents into the desired XML format. While looking for such a library, I learned that LibreOffice has this functionality.
I searched the LibreOffice source for the code responsible for reading files in these different formats, but couldn't find it.
Could you please point me to the relevant part of the LibreOffice code?


Thank you in advance,
Amin
