

On Sat, Sep 19, 2015 at 09:30:55AM -0700, julien2412 wrote:
> I'm not sure and, perhaps it's not related, but it seems docx attachments
> have mimetype "application/zip" instead of
> "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
> (eg: https://bugs.documentfoundation.org/show_bug.cgi?id=94359). Even if
> confirmed, not a blocker of course ! :-)

Because OOXML and ODF both use ZIP as a container for a bunch of XML,
I wonder whether the type-detection mechanisms simply aren't nuanced
enough to tell a plain ZIP apart from ZIP+(DOCX guts).
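
To make the ZIP vs. ZIP+(guts) distinction concrete, here is a minimal
Python sketch of what a more nuanced check could look like. It is not how
Bugzilla or 'file' actually decide; the function name sniff_container and
the table of part names are made up for illustration. It assumes the
conventional main-part names (word/document.xml and friends), which a
producer is not strictly required to use.

```python
import zipfile

# Conventional OOXML main parts mapped to their media types.
# Assumption: the producer uses these default part names.
OOXML_MAIN_PARTS = {
    "word/document.xml":
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "ppt/presentation.xml":
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "xl/workbook.xml":
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def sniff_container(source):
    """Return a best-guess media type for a path or a seekable file object."""
    if not zipfile.is_zipfile(source):
        return "application/octet-stream"
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        # ODF packages store their media type in a member called "mimetype".
        if "mimetype" in names:
            return archive.read("mimetype").decode("ascii", "replace").strip()
        # OOXML packages have "[Content_Types].xml" plus a main part.
        if "[Content_Types].xml" in names:
            for part, media_type in OOXML_MAIN_PARTS.items():
                if part in names:
                    return media_type
    # A ZIP with neither marker is treated as a plain archive.
    return "application/zip"
```

Calling sniff_container("some.docx") on a well-formed file should give
the wordprocessingml type rather than "application/zip", because the
check looks inside the archive instead of stopping at the ZIP signature.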

Here's a quick test using the 'file' command (v5.14):

qubit@loopbackoffice:~/libreoffice/bugs$ find -type f -print0 | xargs -0 file
./90736/TableCellBorderLineStyle.pptx:                 Microsoft PowerPoint 2007+
./93865/fubar.odt:                                     OpenDocument Text
./93853/checkbox bug_pdf-test-export.pdf:              PDF document, version 1.4
./93853/.~lock.checkbox bug.odt#:                      ASCII text, with no line terminators
./93853/checkbox bug.odt:                              OpenDocument Text
./93356/testsort.ods:                                  OpenDocument Spreadsheet
./92703/Character.Controls.rtf:                        Rich Text Format data, version 1, ANSI
./91293/impress-file-saved-as-pptx.pptx:               Zip archive data, at least v2.0 to extract
./91293/.~lock.impress-file-saved-as-pptx_try-2.pptx#: ASCII text, with no line terminators
./91293/impress-file-saved-as-pptx_try-2.pptx:         Zip archive data, at least v2.0 to extract
./93868/Lecture 1 African Historical Background.pptx:  Microsoft PowerPoint 2007+
./93851/Untitled 1.ods:                                OpenDocument Spreadsheet

Looks like ODF files are recognized quite well on this system, but
it's struggling with OOXML: two of the four .pptx files above come back
as plain Zip archives. Is there a correlation between authoring
software and recognition?
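
One way to look for such a correlation would be to compare how archives
from different producers are laid out. Signature-based tools like 'file'
read from the start of a file, so the order of ZIP members might matter.
That is an assumption on my part, not something confirmed in this thread.
A small Python sketch (the helper name leading_members is made up) that
prints the first few members in on-disk order:

```python
import zipfile


def leading_members(source, count=3):
    """Return the names of the first `count` members in on-disk order."""
    with zipfile.ZipFile(source) as archive:
        # header_offset is the position of each member's local header,
        # so sorting by it gives the physical order within the file.
        ordered = sorted(archive.infolist(), key=lambda info: info.header_offset)
        return [info.filename for info in ordered[:count]]
```

Running this over a recognized and an unrecognized .pptx and comparing
the two lists would show whether the layouts differ.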

On Sat, Sep 19, 2015 at 1:27 PM, David Tardon <dtardon@redhat.com> wrote:
> That's nothing new. The attachment format auto-detection has never really
> worked.

We did have much worse file-type detection when we were hosted at FDO,
but I believe that things have gotten significantly better.

David: Is there something we could do to help upstream utilities more
easily recognize OOXML files?  (both files that LibreOffice creates,
as well as ones from other sources)


Thanks,
--R

-- 
Robinson Tryon
QA Engineer - The Document Foundation
LibreOffice Community Outreach Herald
qubit@libreoffice.org
802-379-9482 | IRC: colonelqubit on Freenode
