
Comments (8)

Conal-Tuohy commented on June 16, 2024

Probably best to deal with the paragraphs that use tabs for indentation as a separate issue, and come back to this more complex step once those tabs have been converted into paragraph indent formatting and tidied away.

from vmcp-upconversion.

LucasHorseshoeBend commented on June 16, 2024

Is the features facet "tab alignment" supposed to capture these?
In some cases I think that is what is being shown, and in others I can't see it.

Conal-Tuohy commented on June 16, 2024

Yes, the "tab alignment" facet value is supposed to identify letters that contain a sequence of two or more paragraphs with tab characters somewhere other than at the start of the paragraph. Where a tab occurs only at the start of a paragraph, I've assumed it is not there to align text into columns, but is just a paragraph indentation.
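
A minimal sketch of that test, written in Python for illustration only (the converter itself is not written in Python, and the function names here are hypothetical). It assumes "a sequence" means consecutive paragraphs, and that only leading tab characters count as indentation:

```python
def has_interior_tab(paragraph: str) -> bool:
    """True if a tab occurs anywhere after the leading run of tabs."""
    # Leading tabs are treated as paragraph indentation and ignored.
    return "\t" in paragraph.lstrip("\t")


def has_tab_alignment(paragraphs: list[str], min_run: int = 2) -> bool:
    """True if at least `min_run` consecutive paragraphs have an interior tab."""
    run = 0
    for paragraph in paragraphs:
        # Extend the current run, or reset it when a paragraph has no interior tab.
        run = run + 1 if has_interior_tab(paragraph) else 0
        if run >= min_run:
            return True
    return False
```

Under these assumptions, a single paragraph with an interior tab, or two such paragraphs separated by an ordinary one, would not put a letter in the facet.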

Maybe that's not a foolproof test, but it's the best I could come up with. Any suggestion for improvement?

Also, if you can point to an example of a table which is constructed with tab characters (rather than a Word table), but which doesn't belong to that facet, please post a link. Cheers!

LucasHorseshoeBend commented on June 16, 2024

This still causes problems in XProc, e.g. see 85-08-15a.
I had thought the solution was editorial: creating tables, which would work in the one above but is problematic in cases like 54-09-00. I know an editorial solution to this too; I think I just need to add one or two spaces before the tab. Only 41 are in a final state, so it is not insurmountable, and it's probably better for me to spend time on that than for you to try to find a tweak that will discriminate between the cases. Views?

Conal-Tuohy commented on June 16, 2024

Here are the documents referred to, both of which have a sequence of paragraphs containing tabs, which aren't converted to tables:
https://vmcp.rbg.vic.gov.au/id/85-08-15a
https://vmcp.rbg.vic.gov.au/id/54-09-00
NB actually these two cases might more appropriately be converted to lists rather than tables, though it's not a big deal if they are treated as tables.

LucasHorseshoeBend commented on June 16, 2024

Unfortunately, Word's list capability appears limited to strictly defined presets (lots of them), and the only options they offer are for styling fonts and so on.
I've tried and can't adapt any to reflect the way most of these documents are written.
I will find the "best" way for each case, if necessary by iteration.

Conal-Tuohy commented on June 16, 2024

@LucasHorseshoeBend yes, I agree that Word's "lists" aren't adequate for capturing these lists. That comment about converting them to lists was more of a note to myself; I meant that the Word-to-TEI converter could convert them to a TEI list instead of a TEI table. But the difference between a TEI list and a TEI table with just two columns is not huge. I'd rather just fix this bug and get them converted to a table, and put off converting them to lists until later on, or never.
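
To illustrate how close the two encodings are, here is a rough Python sketch (not the converter's actual code) that turns the same tab-separated paragraphs into either a TEI table or a TEI list. The function names and sample strings are hypothetical; only the element names (`table`, `row`, `cell`, `list`, `label`, `item`) and the namespace are taken from TEI.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def _el(parent, name, text=None):
    """Append a TEI-namespaced child element with optional text."""
    child = ET.SubElement(parent, f"{{{TEI_NS}}}{name}")
    child.text = text
    return child


def to_tei_table(paragraphs):
    """One row per paragraph, one cell per tab-separated chunk."""
    table = ET.Element(f"{{{TEI_NS}}}table")
    for paragraph in paragraphs:
        row = _el(table, "row")
        for chunk in paragraph.split("\t"):
            _el(row, "cell", chunk.strip())
    return table


def to_tei_list(paragraphs):
    """Text before the first tab becomes a label, the rest the item."""
    tei_list = ET.Element(f"{{{TEI_NS}}}list")
    for paragraph in paragraphs:
        label, _, item = paragraph.partition("\t")
        _el(tei_list, "label", label.strip())
        _el(tei_list, "item", item.strip())
    return tei_list


if __name__ == "__main__":
    sample = ["1\tsome text", "2\tmore text"]  # made-up sample paragraphs
    print(ET.tostring(to_tei_table(sample), encoding="unicode"))
    print(ET.tostring(to_tei_list(sample), encoding="unicode"))
```

For two-column input the two results carry the same text in the same order; the only difference is whether each pair is wrapped in a row of two cells or expressed as a label followed by an item.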

LucasHorseshoeBend commented on June 16, 2024

I did amend "most of" the cases (I missed one block in one of the letters), so that where the XProc display said, e.g., "dodo", meaning two dittos under a previous entry, these have now been separated as "do do", and where a number was followed by a tab and then text, it now reads, e.g., "1 some text" instead of "1sometext".

Many of the cases would not work as tables because, say, the line above was set out with spaces rather than tabs, so a table would give a more misleading representation than text that is separated but not vertically aligned. Making a meaningful table in those cases would require editorial intervention anyway, so it's not worth getting rid of the bug; fixing it would risk not picking up the resulting problem cases.
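
One way such problem cases might be picked up, sketched in Python under the assumption that a usable table needs every paragraph in the block to split into the same number of tab-separated columns (the function names and the rule itself are illustrative, not part of the converter):

```python
def column_counts(paragraphs: list[str]) -> list[int]:
    """Number of tab-separated chunks in each paragraph."""
    return [len(paragraph.split("\t")) for paragraph in paragraphs]


def is_consistent_table(paragraphs: list[str]) -> bool:
    """True if there are 2+ paragraphs, each with the same column count (2+)."""
    counts = column_counts(paragraphs)
    return len(counts) >= 2 and min(counts) >= 2 and len(set(counts)) == 1
```

A block in which one line was aligned with spaces would have a column count of 1 for that line and so fail the check, flagging it for editorial attention rather than automatic conversion.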
