Comments (4)

keestux commented on September 28, 2024

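The example program:

#!/usr/bin/env python3

from odfdo import Document

# Load the input document.
mydoc = Document('BugOdfdo.odt')
# Fetch the content part before saving, as in the original report.
content = mydoc.get_part('content')
# Save under a new name with pretty-printing enabled.
mydoc.save(target='BugOdfdo2.odt', pretty=True)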

from odfdo.

keestux commented on September 28, 2024

Here is the example input ODT
BugOdfdo.odt

keestux commented on September 28, 2024

Maybe it is a bug in LibreOffice.
When I look at content.xml there is nothing different, except for the white space (pretty print).

jdum commented on September 28, 2024

Hi, thanks for this interesting bug. Actually, I don't remember for sure what the correct interpretation of the standard is. But a few first thoughts:

  • as the main ODF implementation, let's consider LibreOffice as "always right"...
  • I'm surprised that the pretty option breaks things; however, I mainly use it to analyse and debug the XML result, not in actual production code,
  • the ODF standard says at http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html, in "3.18 White Space Processing and EOL Handling", that "ODF processing of whitespace characters is in conformance with the provisions of XML, ..." but this can be quite complex (ODF uses the "tail" part of XML tags, i.e. the text after the tag itself...), and ODF has a special feature for multiple whitespace characters. And if I remember correctly, XML distinguishes "significant" and "insignificant" whitespace (thus allowing the pretty output?). So the status of whitespace at the "tail" position of a Span inside a paragraph, between Span tags, is unclear to me. It should be insignificant as XML, but maybe it is still significant in ODF?
  • As a result, at the very least, I should add a note that pretty printing may break the rendering in major ODF software.
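
To make the "tail" question above concrete, here is a minimal sketch using only Python's standard library. It is not odfdo's own pretty-printing code, and the `<p>`/`<span>` fragment is a hypothetical stand-in for `text:p`/`text:span` with namespaces omitted. The `visible_text` helper is also an assumption: it only approximates the ODF rule by collapsing each whitespace run to a single space.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a paragraph with two adjacent spans and no
# whitespace between them (namespaces omitted for brevity).
COMPACT = "<p><span>Hello</span><span>World</span></p>"


def visible_text(root):
    """Join all text nodes and collapse whitespace runs to one space.

    This is only an approximation of how an ODF consumer treats
    whitespace inside a paragraph.
    """
    raw = "".join(root.itertext())
    return " ".join(raw.split())


compact_root = ET.fromstring(COMPACT)
pretty_root = ET.fromstring(COMPACT)

# ET.indent (Python 3.9+) pretty-prints in place by writing newlines and
# indentation into the .text and .tail attributes of the elements.
ET.indent(pretty_root)

compact_text = visible_text(compact_root)
pretty_text = visible_text(pretty_root)

print(repr(compact_text))
print(repr(pretty_text))
print(repr(pretty_root[0].tail))  # whitespace now sits in the first span's tail
```

Under these assumptions the indentation ends up in the tail of the first span, so a consumer that treats tail whitespace as significant would see a space between the two spans that the compact form does not contain.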
