runassudo / gfx2gfx-pdftext

A fork of SWFTools' gfx2gfx which preserves text, rather than converting to shapes.

License: GNU General Public License v2.0

Makefile 0.50% Shell 2.21% Roff 0.27% C++ 4.89% C 79.46% Lex 0.53% Yacc 1.32% ActionScript 1.03% Python 8.99% Perl 0.06% Ruby 0.28% M4 0.25% Scala 0.18% HTML 0.04%

gfx2gfx-pdftext's Issues

Text not recognised as contiguous

Because the letters are positioned manually, PDF readers do not recognise contiguous blocks of text as such. As a result, copy-pasting does not work well, and neither does in-text search.

This may be related to the issue that programs like qpdfview and evince interpret the text back-to-front, which is possibly related to the topdown setting of PDFlib.
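
As a rough illustration of the symptom (not the project's code, and not how any particular viewer is implemented), the sketch below reassembles text from individually placed glyphs. The glyph tuple layout, the function name group_into_runs and both tolerances are assumptions made for this example. It shows that glyphs emitted left to right can be merged into runs, while the same glyphs emitted back-to-front fall apart into single characters.

```python
from typing import List, Optional, Tuple

# Hypothetical glyph record: (character, x, baseline_y, advance_width).
Glyph = Tuple[str, float, float, float]


def group_into_runs(glyphs: List[Glyph], max_gap: float = 1.0,
                    baseline_tol: float = 0.5) -> List[str]:
    """Join consecutive glyphs into a run when each starts where the last ended."""
    runs: List[str] = []
    prev: Optional[Glyph] = None  # previous glyph, or None at the start
    for ch, x, y, width in glyphs:
        contiguous = False
        if prev is not None:
            _, px, py, pw = prev
            gap = x - (px + pw)  # distance from the end of the previous glyph
            contiguous = abs(y - py) <= baseline_tol and abs(gap) <= max_gap
        if contiguous:
            runs[-1] += ch
        else:
            runs.append(ch)
        prev = (ch, x, y, width)
    return runs


forward = [("a", 0.0, 0.0, 5.0), ("b", 5.0, 0.0, 5.0), ("c", 30.0, 0.0, 5.0)]
print(group_into_runs(forward))        # ['ab', 'c']
print(group_into_runs(forward[::-1]))  # ['c', 'b', 'a']
```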

gfx2gfx: errors in converting swf to pdf

Hi, I've successfully compiled the gfx2gfx-pdftext v0.9.2 (build 8d5a70b) code under Mac OS X 10.11.6 (the same is true for SWFTools v0.9.2), but I first had to fix the jpeg.c file because of compile errors, as I described in the thread https://github.com/matthiaskramm/swftools/issues/37.

In particular, gfx2gfx compiled without errors, but if I run it with the command:
gfx2gfx test.swf -o test.pdf
I get the following errors (SWF file zipped and attached as test.swf.zip):
Error: ID 142 unknown
Error: ID 145 unknown
Error: ID 148 unknown
Error: ID 151 unknown
Error: ID 154 unknown
Error: ID 157 unknown
Error: ID 160 unknown
Error: ID 163 unknown
Error: ID 166 unknown
Error: ID 169 unknown
Error: ID 172 unknown
Error: ID 175 unknown
Error: ID 178 unknown
Error: ID 181 unknown
Error: ID 184 unknown
Error: ID 187 unknown
Error: ID 190 unknown
Error: ID 193 unknown
Error: ID 196 unknown
Error: ID 199 unknown
Error: ID 202 unknown
Error: ID 205 unknown
Error: ID 208 unknown
Error: ID 211 unknown
Error: ID 214 unknown
Error: ID 217 unknown
Error: ID 220 unknown
Error: ID 223 unknown
Error: ID 226 unknown
Error: ID 229 unknown
Error: ID 232 unknown
Error: ID 235 unknown
Error: ID 238 unknown
Error: ID 241 unknown
Error: ID 244 unknown
Error: ID 247 unknown
Error: ID 250 unknown
Error: ID 253 unknown
Error: ID 256 unknown
Error: ID 259 unknown
Error: ID 262 unknown
Error: ID 265 unknown
Error: ID 268 unknown
Error: ID 271 unknown
Error: ID 274 unknown
Error: ID 277 unknown
Error: ID 280 unknown
Error: ID 283 unknown
Error: ID 286 unknown
Error: ID 289 unknown
Error: ID 292 unknown
Error: ID 295 unknown
Error: ID 298 unknown

Do you have any idea what this is? How can I fix it?
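
The errors in the log above are regular: 53 IDs running from 142 to 298 in steps of 3. A small helper like the following can summarise such a log before reporting it. This is only a sketch; the function names unknown_ids and summarise are made up for this example, and it says nothing about why gfx2gfx reports the IDs as unknown.

```python
import re
from typing import Dict, List, Optional

# Matches lines of the form printed above, e.g. "Error: ID 142 unknown".
ERROR_RE = re.compile(r"^Error: ID (\d+) unknown$")


def unknown_ids(log_text: str) -> List[int]:
    """Collect the numeric IDs from every 'Error: ID N unknown' line."""
    ids = []
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:
            ids.append(int(match.group(1)))
    return ids


def summarise(ids: List[int]) -> Optional[Dict[str, Optional[int]]]:
    """Return count, first, last and the common step (None if irregular)."""
    if not ids:
        return None
    steps = {b - a for a, b in zip(ids, ids[1:])}
    step = steps.pop() if len(steps) == 1 else None
    return {"count": len(ids), "first": ids[0], "last": ids[-1], "step": step}


# Rebuild the log shown in this report: IDs 142, 145, ..., 298.
log = "\n".join(f"Error: ID {n} unknown" for n in range(142, 299, 3))
print(summarise(unknown_ids(log)))
```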

Exclude Specific Text

I would like to be able to exclude specific strings from the generated PDF.

Is there a file I could modify to do so?
I can't find the responsible function in text.c or gfx2gfx.c.

Thanks for your help.

Miscellaneous issues with fonts

  • Unusual glyph positioning (kerning??) not respected

e.g. U+239f, which is erroneously placed at the left of the bounding box rather than the right; italic f, which is placed too close to the next letter.

  • Some glyphs don't render

e.g. U+239b, U+23aa

Strange graphics errors in pdf output

gfx2gfx-pdf2text - part of swftools 0.9.2 (build )

missing build is 8d5a70b

Compiled under Ubuntu 17.04; conversion of SWF and GAU files is typically flawless. This particular page (originally 694.gau) converted with no errors under the -r0 option, but caused Acrobat 8.1 to crash right here when combining pages 654-714. The output has garbled graphics, but all 693 prior pages are fine. Single-page conversions with -r0 and -r300 are attached.

swf2pdf.zip

OOM with large SWF files

Using gfx2gfx compiled on Windows with MinGW; it is unknown whether this is related. The issue also occurs with the upstream non-pdftext version.

An example of an offending SWF is on file with @RunasSudo.

Conversion is not totally right

Thanks for your fancy project. I'm not sure whether you can read Chinese or not (I ran into problems while using this tool to convert some SWF files in Chinese). I came across 2 problems:

version: gfx2gfx-pdf2text - part of swftools 0.9.2 (build 8d5a70b)

The 2nd problem is easy to fix (I don't know whether this fix is right or not): I just added 2 lines of code after

for(t=0;t<num;t++) {

        if(gt7bits>=128)
            gt7bits=0;

but the 1st problem still exists: some characters are missing after conversion.

Text does not display in some viewers

This seems to be a compatibility issue between viewers. In some viewers, including Adobe Reader, MuPDF, xpdf and Firefox's pdf.js, text does not display. qpdf, evince, okular and Google Drive view the PDFs correctly.

A temporary fix appears to be to post-process the PDF with Ghostscript or Poppler:

gs -o output.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress input.pdf

or

pdftocairo -pdf input.pdf output.pdf

This produces a PDF which appears to be readable in all the above applications.
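
For converting many files, the two commands above can be wrapped in a short script. The following is a minimal sketch, assuming gs or pdftocairo is on the PATH; the function names build_fix_command and fix_directory and the dry_run switch are invented for this example, while the command-line arguments are exactly the ones quoted above.

```python
import subprocess
from pathlib import Path
from typing import List


def build_fix_command(src, dst, tool: str = "gs") -> List[str]:
    """Return the argv list that rewrites src to dst with the chosen tool."""
    src, dst = str(src), str(dst)
    if tool == "gs":
        return ["gs", "-o", dst, "-sDEVICE=pdfwrite",
                "-dPDFSETTINGS=/prepress", src]
    if tool == "pdftocairo":
        return ["pdftocairo", "-pdf", src, dst]
    raise ValueError(f"unknown tool: {tool!r}")


def fix_directory(in_dir, out_dir, tool: str = "gs",
                  dry_run: bool = True) -> List[List[str]]:
    """Post-process every *.pdf in in_dir, writing the results to out_dir."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    commands = []
    for src in sorted(in_dir.glob("*.pdf")):
        cmd = build_fix_command(src, out_dir / src.name, tool)
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            # Raises CalledProcessError if the tool exits with an error.
            subprocess.run(cmd, check=True)
    return commands


print(build_fix_command("input.pdf", "output.pdf"))
```

With dry_run left at True, fix_directory only returns the commands it would run, which makes it easy to inspect them before touching any files.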

libgfxpdf.a No such File [Compile error]

cd src;make all
fatal: Not a git repository (or any parent up to mount point /media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
make[1]: Entering directory `/media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8/kiran/gfx2gfx-pdftext-master/src'
gcc -c -DHAVE_CONFIG_H -DGIT_VERSION=""  -I/media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8/kiran/PDFlib-Lite-7.0.5p3/libs/pdflib/ -Ilame -Ilib/lame -fPIC -Wimplicit -Wreturn-type -Wno-write-strings -Wformat -O -fomit-frame-pointer  -g -O0 gfx2gfx.c -o gfx2gfx.o
cd ../lib;make libgfxpdf.a;cd -
make[2]: Entering directory `/media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8/kiran/gfx2gfx-pdftext-master/lib'
cd pdf;make libgfxpdf
make[3]: Entering directory `/media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8/kiran/gfx2gfx-pdftext-master/lib/pdf'
make[3]: Nothing to be done for `libgfxpdf'.
make[3]: Leaving directory `/media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8/kiran/gfx2gfx-pdftext-master/lib/pdf'
make[2]: Leaving directory `/media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8/kiran/gfx2gfx-pdftext-master/lib'
/media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8/kiran/gfx2gfx-pdftext-master/src
fatal: Not a git repository (or any parent up to mount point /media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
g++ -DHAVE_CONFIG_H -DGIT_VERSION="" gfx2gfx.o -o gfx2gfx ../lib/libgfxswf.a ../lib/librfxswf.a ../lib/libgfxpdf.a ../lib/libgfx.a ../lib/libbase.a -L/usr/local/lib -ljpeg -lz -lm  -lstdc++
g++: error: ../lib/libgfxpdf.a: No such file or directory
make[1]: *** [gfx2gfx] Error 1
make[1]: Leaving directory `/media/stark/afebc556-6185-4b48-81b3-bc81f3987dd8/kiran/gfx2gfx-pdftext-master/src'
make: *** [all] Error 2
➜  gfx2gfx-pdftext-master 

Copying text from generated PDFs is buggy

After generating some PDFs, I ran into an error with text copying. Before the invisible-text fix, text was still copyable and would paste properly. Currently, trying to highlight text in the generated PDFs doesn't work properly.
[image: git]
I only highlighted the word 'dream' but, as the picture shows, more words are highlighted. Pasting this results in an incoherent mess:
[image: git2]

Additional details:
I used parameter -r 300
Tested in Adobe Reader and Google Drive
