Comments (8)
Probably best to deal with the paragraphs that use tabs for indent as a separate issue, and come back to this more complex step when those tabs have been converted into paragraph indent formatting, and tidied away.
from vmcp-upconversion.
Is the features facet "tab alignment" supposed to capture these?
In some cases I think that is what is being shown, and in others I can't see it.
from vmcp-upconversion.
Yes, the "tab alignment" facet value is supposed to identify letters containing a sequence of two or more paragraphs which contain tab characters, where the tabs are not at the start of the paragraph. Where a tab occurs only at the start of the paragraph, I've assumed that's not for aligning into columns, but rather just a paragraph indentation.
Maybe that's not a foolproof test, but it's the best I could come up with. Any suggestion for improvement?
Also, if you can point to an example of a table which is constructed with tab characters (rather than a Word table), but which doesn't belong to that facet, please post a link. Cheers!
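For anyone wanting to experiment with the test described above, here is a minimal sketch in Python. It is not the converter's actual code: it assumes each paragraph is available as a plain string, the function names `has_alignment_tab` and `has_tab_alignment` are hypothetical, and it treats only leading tab characters as indentation.

```python
def has_alignment_tab(paragraph: str) -> bool:
    """Return True if the paragraph contains a tab that is not at its start."""
    # Assumption: a run of tabs at the very start is paragraph indentation,
    # not column alignment, so it is ignored.
    body = paragraph.lstrip("\t")
    return "\t" in body


def has_tab_alignment(paragraphs: list[str], min_run: int = 2) -> bool:
    """Return True if at least `min_run` consecutive paragraphs contain
    a non-leading tab (the "tab alignment" test described above)."""
    run = 0
    for paragraph in paragraphs:
        if has_alignment_tab(paragraph):
            run += 1
            if run >= min_run:
                return True
        else:
            # A paragraph without alignment tabs breaks the sequence.
            run = 0
    return False
```

With this sketch, two consecutive paragraphs such as `"1\tsome text"` and `"2\tmore text"` would match, while paragraphs that only begin with a tab would not.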
from vmcp-upconversion.
This still causes problems in XProc; e.g. see 85-08-15a.
I had thought the solution was editorial: creating tables, which would work in the one above but is problematic in cases like 54-09-00. I know an editorial solution to this too; I just need to add one or two spaces before the tab, I think. Only 41 are in a final state, so it's not insurmountable, and it's probably better for me to spend time on that than for you to try to find a tweak that will discriminate between cases. Views?
from vmcp-upconversion.
Here are the documents referred to, both of which have a sequence of paragraphs containing tabs, which aren't converted to tables:
https://vmcp.rbg.vic.gov.au/id/85-08-15a
https://vmcp.rbg.vic.gov.au/id/54-09-00
NB actually these two cases might more appropriately be converted to lists rather than tables, though it's not a big deal if they are treated as tables.
from vmcp-upconversion.
Unfortunately, Word's list capability appears limited to strictly defined presets (lots of them), with the only options being styling for fonts and so on.
I've tried and can't adapt any to reflect the way most of these documents are written.
I will find the "best" way for each case, if necessary by iteration.
from vmcp-upconversion.
@LucasHorseshoeBend yes, I agree that Word's "lists" aren't adequate for capturing these lists, and that comment about converting them to lists was more of a note to myself; I meant that the Word-to-TEI converter could convert them to a TEI list instead of a TEI table. But the difference between a TEI list and a TEI table with just two columns is not huge. I'd rather just fix this bug and get them converted to a table, and put off converting them to lists until later on, or never.
from vmcp-upconversion.
I did amend "most of" the cases (I missed one block in one of the letters), so that where the XProc display said, e.g., "dodo", meaning two dittos under a previous entry, these have now been separated as "do do", and where a number is followed by a tab and then text, it now reads, e.g., "1 some text" instead of "1sometext".
Many of the cases would not work as tables because the line above, say, was set out with spaces and not tabs, so a table would produce a more misleading representation than characters that are separated but not aligned vertically. To make a meaningful table in those cases would require editorial intervention anyway, so it's not worth getting rid of the bug, which would risk not picking up the resulting problem cases.
from vmcp-upconversion.