
Comments (20)

mwilliamson commented on May 29, 2024

Could you provide a full stack trace for the error?

from dotnet-mammoth.

kramaswamy7 commented on May 29, 2024

Absolutely. For the first error, the exception itself provides very little stack trace, just:

" at System.IO.MemoryStream.Seek(Int64 offset, SeekOrigin loc)"

Taking a look at my call stack though, here's what it spits out (edited for clarity):

mscorlib.dll!System.IO.MemoryStream.Seek(long offset, System.IO.SeekOrigin loc)
System.IO.Compression.dll!System.IO.Compression.ZipArchive.ReadEndOfCentralDirectory()
System.IO.Compression.dll!System.IO.Compression.ZipArchive.Init(System.IO.Stream stream, System.IO.Compression.ZipArchiveMode mode, bool leaveOpen)
System.IO.Compression.dll!System.IO.Compression.ZipArchive.ZipArchive(System.IO.Stream stream, System.IO.Compression.ZipArchiveMode mode, bool leaveOpen, System.Text.Encoding entryNameEncoding)
System.IO.Compression.dll!System.IO.Compression.ZipArchive.ZipArchive(System.IO.Stream stream)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.archives.ZippedArchive.ZippedArchive(System.IO.Stream stream)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.archives.InMemoryArchive.fromStream(Mammoth.Couscous.java.io.InputStream stream)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter.withDocxFile<Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<string>>(Mammoth.Couscous.java.io.InputStream stream, Mammoth.Couscous.java.util.function.Function<Mammoth.Couscous.org.zwobble.mammoth.internal.archives.Archive, Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<string>> function)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter__Anonymous_1.get()
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.util.PassThroughException.unwrap<Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<string>>(Mammoth.Couscous.org.zwobble.mammoth.internal.util.SupplierWithException<Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<string>, Mammoth.Couscous.java.io.IOException> supplier)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter.convertToHtml(Mammoth.Couscous.java.io.InputStream stream)
Mammoth.dll!Mammoth.DocumentConverter.ConvertToHtml(System.IO.Stream stream)

For the error when I add stream.Position = 0, I get:

at Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator`1.skip(T tokenType, String tokenValue)

Call stack:

Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator<Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenType>.skip(Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenType tokenType, string tokenValue)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.HtmlPathParser__Anonymous_1.run()
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator<Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenType>.tryParse(Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator__Action action)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.HtmlPathParser.parseSeparator(Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator<Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenType> tokens)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.HtmlPathParser.parseElement(Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator<Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenType> tokens)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.HtmlPathParser.parseHtmlPathElements(Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator<Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenType> tokens)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.HtmlPathParser.parse(Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator<Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenType> tokens)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.StyleMapParser.parseHtmlPath(Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenIterator<Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.TokenType> tokens)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.StyleMapParser.parseStyleMapping(string line)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.StyleMapParser.handleLine(Mammoth.Couscous.org.zwobble.mammoth.internal.styles.StyleMapBuilder styleMap, string line)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.parsing.StyleMapParser.parseStyleMappings(Mammoth.Couscous.java.util.List<string> lines)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.styles.DefaultStyles.DefaultStyles()
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.conversion.DocumentToHtmlOptions.styleMap()
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.conversion.DocumentToHtml.DocumentToHtml(Mammoth.Couscous.org.zwobble.mammoth.internal.conversion.DocumentToHtmlOptions options, Mammoth.Couscous.java.util.List<Mammoth.Couscous.org.zwobble.mammoth.internal.documents.Comment> comments)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.conversion.DocumentToHtml.convertToHtml(Mammoth.Couscous.org.zwobble.mammoth.internal.documents.Document document, Mammoth.Couscous.org.zwobble.mammoth.internal.conversion.DocumentToHtmlOptions options)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter__Anonymous_6.apply(Mammoth.Couscous.org.zwobble.mammoth.internal.documents.Document nodes)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<Mammoth.Couscous.org.zwobble.mammoth.internal.documents.Document>.flatMap<Mammoth.Couscous.java.util.List<Mammoth.Couscous.org.zwobble.mammoth.internal.html.HtmlNode>>(Mammoth.Couscous.java.util.function.Function<Mammoth.Couscous.org.zwobble.mammoth.internal.documents.Document, Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<Mammoth.Couscous.java.util.List<Mammoth.Couscous.org.zwobble.mammoth.internal.html.HtmlNode>>> function)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter.convertToHtml(Mammoth.Couscous.java.util.Optional<Mammoth.Couscous.java.nio.file.Path> path, Mammoth.Couscous.org.zwobble.mammoth.internal.archives.Archive zipFile)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter__Anonymous_0.apply(Mammoth.Couscous.org.zwobble.mammoth.internal.archives.Archive zipFile)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter.withDocxFile<Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<string>>(Mammoth.Couscous.java.io.InputStream stream, Mammoth.Couscous.java.util.function.Function<Mammoth.Couscous.org.zwobble.mammoth.internal.archives.Archive, Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<string>> function)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter__Anonymous_1.get()
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.util.PassThroughException.unwrap<Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<string>>(Mammoth.Couscous.org.zwobble.mammoth.internal.util.SupplierWithException<Mammoth.Couscous.org.zwobble.mammoth.internal.results.InternalResult<string>, Mammoth.Couscous.java.io.IOException> supplier)
Mammoth.dll!Mammoth.Couscous.org.zwobble.mammoth.internal.InternalDocumentConverter.convertToHtml(Mammoth.Couscous.java.io.InputStream stream)
Mammoth.dll!Mammoth.DocumentConverter.ConvertToHtml(System.IO.Stream stream)


mwilliamson commented on May 29, 2024

I think your fix for the first issue is correct. The second issue suggests that there's a syntax error in the default style map, which is rather unexpected. What platform are you running on?
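
A minimal sketch of why rewinding the stream helps, using only the base class library. It assumes (based on the `InMemoryArchive.fromStream` frame in the first call stack, not on Mammoth's source) that the input stream is copied from its current position, so a stream left at its end produces an empty copy. `StreamRewindSketch` and `CopyRemaining` are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;

public static class StreamRewindSketch
{
    // Copies whatever lies between the stream's current position and its end.
    public static byte[] CopyRemaining(Stream source)
    {
        using (var copy = new MemoryStream())
        {
            source.CopyTo(copy);
            return copy.ToArray();
        }
    }

    public static void Main()
    {
        var stream = new MemoryStream();
        stream.Write(new byte[] { 1, 2, 3, 4 }, 0, 4); // position is now at the end

        Console.WriteLine(CopyRemaining(stream).Length); // 0: nothing left to read

        stream.Position = 0; // rewind before handing the stream to a reader
        Console.WriteLine(CopyRemaining(stream).Length); // 4: the full content
    }
}
```

An empty copy has no zip end-of-central-directory record to seek to, which would be consistent with the failure inside `ZipArchive.ReadEndOfCentralDirectory()`.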


kramaswamy7 commented on May 29, 2024

Windows 10 Pro, build 1703, 64-bit

It's a Lenovo laptop with an Intel i7-4720HQ 2.6 GHz CPU and 16 GB RAM.

I'm debugging in Visual Studio Community 2017, version 15.4.3. The .NET Framework version is 4.7, though I think the code is running on .NET Framework 4.6.

Any other information I can provide?


mwilliamson commented on May 29, 2024

If you don't mind, could you try cloning the project and then running the tests?


kramaswamy7 commented on May 29, 2024

Hey - I cloned the project and ran the tests; they all passed.

I'm not sure if it will help, but here's the XML from the OpenXML Document.Body.InnerXml for my document:

<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00C44CFC" w:rsidP="00ED4BCD" w:rsidRDefault="00ED4BCD"><w:pPr><w:pStyle w:val="Title" /><w:jc w:val="center" /></w:pPr><w:r><w:t xml:space="preserve">CUPFA </w:t></w:r><w:r w:rsidR="002D3BE5"><w:t>Meeting Agenda</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidRPr="005545E2" w:rsidR="005545E2" w:rsidP="005545E2" w:rsidRDefault="005545E2"><w:pPr><w:pStyle w:val="Heading1" /></w:pPr><w:r w:rsidRPr="005545E2"><w:t>Submitted By: Test, Test</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00E0484C" w:rsidP="00E0484C" w:rsidRDefault="00E0484C"><w:pPr><w:pStyle w:val="Heading1" /><w:spacing w:line="240" w:lineRule="auto" /></w:pPr><w:r w:rsidRPr="005545E2"><w:t xml:space="preserve">Submitted </w:t></w:r><w:r><w:t>From</w:t></w:r><w:r w:rsidRPr="005545E2"><w:t>: </w:t></w:r><w:r><w:t xml:space="preserve"> </w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="003026D3" w:rsidP="005545E2" w:rsidRDefault="00366BFD"><w:pPr><w:pStyle w:val="Heading1" /><w:spacing w:line="360" w:lineRule="auto" /></w:pPr><w:r><w:t>[Section:GeneralInformation</w:t></w:r><w:r w:rsidR="005D35A9"><w:t>]</w:t></w:r><w:bookmarkStart w:name="_GoBack" w:id="0" /><w:bookmarkEnd w:id="0" /></w:p><w:tbl xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"><w:tblPr><w:tblStyle w:val="TableGrid" /><w:tblW w:w="0" w:type="auto" /><w:tblLayout w:type="fixed" /><w:tblLook w:val="04A0" /></w:tblPr><w:tblGrid><w:gridCol w:w="2263" /><w:gridCol w:w="1758" /><w:gridCol w:w="2073" /><w:gridCol w:w="1981" /><w:gridCol w:w="1995" /></w:tblGrid><w:tr w:rsidR="002D3BE5" w:rsidTr="000322FF"><w:tc><w:tcPr><w:tcW w:w="2263" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="00E82EF8" w:rsidRDefault="005545E2"><w:pPr><w:pStyle w:val="Heading2" /><w:outlineLvl 
w:val="1" /></w:pPr><w:r><w:t>Name</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1758" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="00E82EF8" w:rsidRDefault="002D3BE5"><w:pPr><w:pStyle w:val="Heading2" /><w:outlineLvl w:val="1" /></w:pPr><w:r><w:t>Meeting Date</w:t></w:r><w:r><w:tab /></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2073" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="00E82EF8" w:rsidRDefault="002D3BE5"><w:pPr><w:pStyle w:val="Heading2" /><w:outlineLvl w:val="1" /></w:pPr><w:r><w:t>[Field:StartTime]</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1981" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="00E82EF8" w:rsidRDefault="002D3BE5"><w:pPr><w:pStyle w:val="Heading2" /><w:outlineLvl w:val="1" /></w:pPr><w:r><w:t>[Field:EndTime]</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1995" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="002D3BE5" w:rsidRDefault="002D3BE5"><w:pPr><w:pStyle w:val="Heading2" /><w:outlineLvl w:val="1" /></w:pPr><w:r><w:t>[Field:Important]</w:t></w:r></w:p></w:tc></w:tr><w:tr w:rsidR="002D3BE5" w:rsidTr="000322FF"><w:tc><w:tcPr><w:tcW w:w="2263" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="000322FF" w:rsidRDefault="000322FF"><w:r><w:t>Meeting Agenda for Test</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1758" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="00E82EF8" w:rsidRDefault="002D3BE5"><w:r><w:t>2017-12-21</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="2073" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="00E82EF8" w:rsidRDefault="002D3BE5"><w:r><w:t>[Value:StartTime]</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1981" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="00E82EF8" w:rsidRDefault="002D3BE5"><w:r><w:t>[Value:EndTime]</w:t></w:r></w:p></w:tc><w:tc><w:tcPr><w:tcW w:w="1995" w:type="dxa" /></w:tcPr><w:p w:rsidR="002D3BE5" w:rsidP="002D3BE5" 
w:rsidRDefault="002D3BE5"><w:r><w:t>[Value:Important]</w:t></w:r></w:p></w:tc></w:tr></w:tbl><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="006D65E3" w:rsidP="00FC5070" w:rsidRDefault="00E82EF8"><w:pPr><w:pStyle w:val="Heading1" /></w:pPr><w:r><w:t>[</w:t></w:r><w:r w:rsidR="00E100C0"><w:t>Section:</w:t></w:r><w:r w:rsidR="00203C44"><w:t>MeetingNotes</w:t></w:r><w:r><w:t>]</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00A877DD" w:rsidP="00FC5070" w:rsidRDefault="006D65E3"><w:pPr><w:pStyle w:val="Heading2" /></w:pPr><w:r><w:t>[</w:t></w:r><w:r w:rsidR="00E100C0"><w:t>Field:</w:t></w:r><w:r w:rsidR="00203C44"><w:t>MeetingNotes</w:t></w:r><w:r><w:t>]</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00313DF8" w:rsidP="00366BFD" w:rsidRDefault="00E82EF8"><w:r><w:t>[</w:t></w:r><w:r w:rsidR="00E100C0"><w:t>Value:</w:t></w:r><w:r w:rsidR="00203C44"><w:t>MeetingNotes</w:t></w:r><w:r w:rsidR="00970479"><w:t>]</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00366BFD" w:rsidP="00366BFD" w:rsidRDefault="00366BFD"><w:pPr><w:pStyle w:val="Heading1" /></w:pPr><w:r><w:t>[Section:</w:t></w:r><w:r w:rsidRPr="00366BFD"><w:t>SubmittedGroup</w:t></w:r><w:r><w:t>]</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00366BFD" w:rsidP="00366BFD" w:rsidRDefault="00366BFD"><w:pPr><w:pStyle w:val="Heading2" /></w:pPr><w:r><w:t>[Field:</w:t></w:r><w:r w:rsidRPr="00366BFD"><w:t>SubmittedGroup</w:t></w:r><w:r><w:t>]</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00366BFD" w:rsidP="00366BFD" w:rsidRDefault="00366BFD"><w:r><w:t>[Value:</w:t></w:r><w:r w:rsidRPr="00366BFD"><w:t>SubmittedGroup</w:t></w:r><w:r><w:t>]</w:t></w:r></w:p><w:p 
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00366BFD" w:rsidP="00366BFD" w:rsidRDefault="00366BFD"><w:pPr><w:pStyle w:val="Heading1" /></w:pPr><w:r><w:t>[Section:</w:t></w:r><w:r w:rsidRPr="00366BFD"><w:t>CUPFANotes</w:t></w:r><w:r><w:t>]</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00366BFD" w:rsidP="00366BFD" w:rsidRDefault="00366BFD"><w:pPr><w:pStyle w:val="Heading2" /></w:pPr><w:r><w:t>[Field:</w:t></w:r><w:r w:rsidRPr="00366BFD"><w:t>CUPFANotes</w:t></w:r><w:r><w:t>]</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00366BFD" w:rsidP="00366BFD" w:rsidRDefault="00366BFD"><w:r><w:t>[Value:</w:t></w:r><w:r w:rsidRPr="00366BFD"><w:t>CUPFANotes</w:t></w:r><w:r><w:t>]</w:t></w:r></w:p><w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00366BFD" w:rsidP="00366BFD" w:rsidRDefault="00366BFD" /><w:sectPr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:rsidR="00366BFD" w:rsidSect="00A877DD"><w:pgSz w:w="12240" w:h="15840" /><w:pgMar w:top="1440" w:right="1080" w:bottom="1440" w:left="1080" w:header="708" w:footer="708" w:gutter="0" /><w:cols w:space="708" /><w:docGrid w:linePitch="360" /></w:sectPr>


kramaswamy7 commented on May 29, 2024

If it helps at all, I've stripped out all the document manipulation I was doing and simply read the byte stream from my database, which contains the .DOCX file I had stored using a basic file upload procedure, and ran the converter on it: the same error occurs. I even tried pointing the converter straight at the .DOCX file on the computer, with the same result. I've attached the document for reference.

Standard Report Template.docx


mwilliamson commented on May 29, 2024

Based on the stack trace, the issue seems to be in loading the default style map, which has nothing to do with your document. What's odd is that it loads successfully when running the tests. Are you running the tests in the same environment (version of .NET, etc.) as your code?


kramaswamy7 commented on May 29, 2024

I double-checked, and noticed your code was running .NET 4.5, whereas mine was .NET 4.6.2. But, after updating your code to .NET 4.6.2, the test still passed successfully.


kramaswamy7 commented on May 29, 2024

A bit more info that may or may not help - running .ExtractRawText instead of .ConvertToHtml works - no errors or anything. Though I suspect you already knew that, as I'd expect the style map isn't applied when you're doing raw text.


mwilliamson commented on May 29, 2024

What happens if you compile Mammoth yourself, and try using that DLL?


kramaswamy7 commented on May 29, 2024

So, digging into the debugger while running your project, it seems that the error is happening here:

        public static string parseSeparator(Mammoth.Couscous.org.zwobble.mammoth.@internal.styles.parsing.TokenIterator<Mammoth.Couscous.org.zwobble.mammoth.@internal.styles.parsing.TokenType> tokens) {
            bool isSeparator = tokens.tryParse(new Mammoth.Couscous.org.zwobble.mammoth.@internal.styles.parsing.HtmlPathParser__Anonymous_1(tokens));

I'm not exactly sure what's happening, but specifically, it's reaching this function:

        public void skip(T tokenType, string tokenValue) {
            Mammoth.Couscous.org.zwobble.mammoth.@internal.styles.parsing.Token<T> token = this.getToken(this._index);
            if (!(token.getTokenType()).equals(tokenType)) {
                throw this.unexpectedTokenType(tokenType, token);
            }
            string actualValue = token.getValue();
            if (!(actualValue.Equals(tokenValue))) {
                throw Mammoth.Couscous.org.zwobble.mammoth.@internal.styles.parsing.LineParseException.lineParseException<T>(token, (((("expected " + tokenType) + " token with value ") + tokenValue) + " but value was ") + actualValue);
            }
            this._index = this._index + 1;
        }

When it executes the line "if (!(token.getTokenType()).equals(tokenType))", the check fails and an exception is thrown: the expected tokenType has the value _SYMBOL, but the token at the current index is EOF, with a CHARINDEX of 36.

Does this help at all? If not, I'll continue digging through the debugger.


mwilliamson commented on May 29, 2024

Thanks, I think that confirms what the stack trace was suggesting. If you go back up the stack, you should be able to find out which string is being parsed. Specifically, it'll be the argument to parseStyleMapping.


kramaswamy7 commented on May 29, 2024

It seems to be happening for every element of the lines list. Specifically, it reaches this point:

string separator = Mammoth.Couscous.org.zwobble.mammoth.@internal.styles.parsing.HtmlPathParser.parseSeparator(tokens);

The code seems to be referencing a list of tokens, which contains 9 elements. It calls the following function:

        public void run() {
            (this._tokens).skip(Mammoth.Couscous.org.zwobble.mammoth.@internal.styles.parsing.TokenType._SYMBOL, ":");
            (this._tokens).skip(Mammoth.Couscous.org.zwobble.mammoth.@internal.styles.parsing.TokenType._IDENTIFIER, "separator");
        }

However, when it reaches the function "skip", the iterator has already reached the index of 9, and therefore when getToken(9) is called, it goes to the else case, and returns EOF. It was, however, expecting to get back _SYMBOL, which I assume is referencing the symbol associated with a separator.

The 9 elements in the token list are:

Index: 0, CHARINDEX: 0, TokenType: _IDENTIFIER, Value: "p"
Index: 1, CHARINDEX: 1, TokenType: _SYMBOL, Value: "."
Index: 2, CHARINDEX: 2, TokenType: _IDENTIFIER, Value: "Heading1"
Index: 3, CHARINDEX: 10, TokenType: _WHITESPACE, Value: " "
Index: 4, CHARINDEX: 11, TokenType: _SYMBOL, Value: "=>"
Index: 5, CHARINDEX: 13, TokenType: _WHITESPACE, Value: " "
Index: 6, CHARINDEX: 14, TokenType: _IDENTIFIER, Value: "h1"
Index: 7, CHARINDEX: 16, TokenType: _SYMBOL, Value: ":"
Index: 8, CHARINDEX: 17, TokenType: _IDENTIFIER, Value: "fresh"


mwilliamson commented on May 29, 2024

> However, when it reaches the function "skip", the iterator has already reached the index of 9, and therefore when getToken(9) is called, it goes to the else case, and returns EOF. It was, however, expecting to get back _SYMBOL, which I assume is referencing the symbol associated with a separator.

I think this is expected: what should happen is that the exception is thrown and caught in tryParse. Could you try stepping through that and see whether the exception is being caught or not?
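
A minimal sketch of the throw-and-catch backtracking pattern described above. This is a hypothetical, simplified stand-in and not Mammoth's actual implementation: `TokenCursor`, `Skip`, and `TryParse` are illustrative names, and tokens are reduced to plain strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for a backtracking token iterator.
public class TokenCursor
{
    private readonly IList<string> _tokens;
    public int Index { get; private set; }

    public TokenCursor(IList<string> tokens, int startIndex)
    {
        _tokens = tokens;
        Index = startIndex;
    }

    // Consumes the next token, or throws if it is missing or different.
    public void Skip(string expected)
    {
        string actual = Index < _tokens.Count ? _tokens[Index] : "EOF";
        if (actual != expected)
        {
            throw new FormatException("expected " + expected + " but got " + actual);
        }
        Index++;
    }

    // Runs an optional rule; on failure, restores the position and returns false.
    public bool TryParse(Action rule)
    {
        int saved = Index;
        try
        {
            rule();
            return true;
        }
        catch (FormatException)
        {
            Index = saved;
            return false;
        }
    }
}

public static class TryParseSketch
{
    public static void Main()
    {
        var tokens = new List<string> { "h1", ":", "fresh" };
        var cursor = new TokenCursor(tokens, 3); // already past the last token

        bool hasSeparator = cursor.TryParse(() =>
        {
            cursor.Skip(":");
            cursor.Skip("separator");
        });

        Console.WriteLine(hasSeparator); // False: no separator, but not an error
        Console.WriteLine(cursor.Index); // 3: position restored
    }
}
```

With a debugger configured to break on every thrown exception, each such backtrack pauses execution even though the exception is handled a few frames up.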


kramaswamy7 commented on May 29, 2024

Bah - I think I've just wasted your time. You're absolutely right: the exception is caught. However, my debugger was set up to break on caught exceptions anyway, and I didn't bother continuing past that line because it kept throwing the same exception. After trying to skip past it a dozen times, I gave up and assumed it wasn't working.

Turns out that it just throws that exception for every single record in the "lines" list, but it handles the exception and continues anyway.

After selecting to ignore the exception, it properly processes the document and converts it into an HTML file.

All that being said - it is a bit odd that it's being so liberal with throwing the exception. Perhaps in a future version you could find a more graceful way of dealing with that scenario, if it's expected to be happening so regularly?


kramaswamy7 commented on May 29, 2024

Actually I think maybe I spoke too soon.

After running it with the exceptions allowed, it did indeed spit out the HTML - but with absolutely no styling. The HTML tags are properly applied, but they're all using the default HTML styles.

Here's the HTML that is returned when I ignore the exceptions:

<p>CUPFA Meeting Agenda</p><h1>Submitted By: Test, Test</h1><h1>Submitted From: </h1><h1>[Section:GeneralInformation]</h1><table><tr><td><h2>Name</h2></td><td><h2>Meeting Date </h2></td><td><h2>[Field:StartTime]</h2></td><td><h2>[Field:EndTime]</h2></td><td><h2>[Field:Important]</h2></td></tr><tr><td><p>Meeting Agenda for Test</p></td><td><p>2017-12-21</p></td><td><p>[Value:StartTime]</p></td><td><p>[Value:EndTime]</p></td><td><p>[Value:Important]</p></td></tr></table><h1>[Section:MeetingNotes]</h1><h2>[Field:MeetingNotes]</h2><p>[Value:MeetingNotes]</p><h1>[Section:SubmittedGroup]</h1><h2>[Field:SubmittedGroup]</h2><p>[Value:SubmittedGroup]</p><h1>[Section:CUPFANotes]</h1><h2>[Field:CUPFANotes]</h2><p>[Value:CUPFANotes]</p>


mwilliamson commented on May 29, 2024

> All that being said - it is a bit odd that it's being so liberal with throwing the exception. Perhaps in a future version you could find a more graceful way of dealing with that scenario, if it's expected to be happening so regularly?

I don't normally like exceptions for control flow handling, but given it works I'm not sure it's worth putting in the effort to change it at the moment.

> After running it with the exceptions allowed, it did indeed spit out the HTML - but with absolutely no styling. The HTML tags are properly applied, but they're all using the default HTML styles.

It depends on what your source document looks like, but Mammoth intentionally ignores directly applied formatting, such as changing the font size. To apply styling, you'll need to use Word styles instead, and map them to the appropriate HTML if they're not default Word styles that are supported in the default style map.
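
A hedged sketch of mapping a Word style to HTML with a custom style map, assuming the Mammoth NuGet package is referenced. The style name "Title", the target `h1.title`, and the file path are assumptions for illustration; check the style names actually used in your document.

```csharp
using System;
using Mammoth;

public static class StyleMapSketch
{
    public static void Main()
    {
        // Assumption: the paragraph style is named "Title" in the document.
        var converter = new DocumentConverter()
            .AddStyleMap("p[style-name='Title'] => h1.title:fresh");

        // Placeholder path; substitute your own .docx file.
        var result = converter.ConvertToHtml("document.docx");

        Console.WriteLine(result.Value);

        // Warnings typically include styles that had no mapping.
        foreach (var warning in result.Warnings)
        {
            Console.WriteLine(warning);
        }
    }
}
```

The generated HTML carries only element names and classes, so the visual styling itself still has to come from your own CSS.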


kramaswamy7 commented on May 29, 2024

Righto - well, it looks like I need to do a bit more research and coding to get the styling to stick then :) In the meantime, shall I close this issue? Seems like we've resolved it - though I stand by my claim that you should probably not have so many exceptions being thrown for a relatively common occurrence like that :P


mwilliamson commented on May 29, 2024

Yes, I think this can be closed.

