Comments (13)
Hi,
I have another file that is not working either. It begins with:
%PDF-1.5
3 0 obj
<< /Length 4 0 R
/Filter /FlateDecode
stream
[compressed binary stream data omitted]
...
from pdftotext.
Hi,
I tried to send a PDF file to the contact email address, but it bounced back.
Where can I send it?
Regards,
/Fredrik.
I'm happy to take a look at it if you can send it over.
Faktura utan giro_627_Adapac AB.PDF
Thanks, the problem is the CID fonts with Identity-H encoding.
Using only the unicode map on the font object, you get around a third of the text out, but the rest isn't mapped to characters properly.
I'm working on a change that will read the CID fonts' CMap, which will hopefully make reading international PDFs much better.
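To illustrate what "using the unicode map" involves, here is a minimal sketch, not the library's actual code. It assumes the font's ToUnicode CMap has already been decompressed into a string, handles only `bfchar` sections (not `bfrange`), and assumes the mbstring extension is available. The function names `parseBfChar` and `decodeIdentityH` are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Parse the bfchar sections of a ToUnicode CMap into [code => UTF-8 string].
 * Sketch only: bfrange sections are ignored.
 */
function parseBfChar(string $cmap): array
{
    $map = [];
    if (!preg_match_all('/beginbfchar(.*?)endbfchar/s', $cmap, $sections)) {
        return $map;
    }
    foreach ($sections[1] as $section) {
        // Each entry looks like: <source code> <UTF-16BE destination>
        preg_match_all(
            '/<([0-9A-Fa-f]+)>\s*<((?:[0-9A-Fa-f]{4})+)>/',
            $section,
            $pairs,
            PREG_SET_ORDER
        );
        foreach ($pairs as $pair) {
            $code = hexdec($pair[1]);
            $map[$code] = mb_convert_encoding(hex2bin($pair[2]), 'UTF-8', 'UTF-16BE');
        }
    }
    return $map;
}

/**
 * Decode a hex string of two-byte character codes (as used with Identity-H).
 * Codes missing from the map become U+FFFD, which is why unmapped text
 * shows up as garbage or not at all.
 */
function decodeIdentityH(string $hex, array $map): string
{
    $out = '';
    foreach (str_split($hex, 4) as $chunk) {
        $out .= $map[hexdec($chunk)] ?? "\u{FFFD}";
    }
    return $out;
}
```

With a map that only covers part of the font, `decodeIdentityH` returns readable text for the mapped codes and replacement characters for the rest, which matches the "around a third of the text" symptom described above.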
Hi,
Faktura-1587.pdf
This is not working either.
When do you think the change will be done?
Regards,
/Fredrik.
Hi,
How is it going?
When do you think a solution can be available?
This file cannot be read at all.
Faktura20541.pdf
I use the code below.
The first log entry is printed, but if I add another printout after the function call,
it never shows.
// Check the PDF file for information
$uri = $dir . '/' . $fileName;
$pdf = new Pdf();
// Log the inputs before the call
error_log(print_r(array(
    'uri' => $uri,
    'pdf' => $pdf,
), true));
$pdfdata = $pdf->getPdfInfo($uri); // nothing after this line is printed
Otherwise the tool is great.
Regards,
/Fredrik
Hi,
Please, I need this urgently.
Can I at least get an estimate of when the change is expected?
It would be much appreciated :).
Regards,
/Fredrik.
Hi,
Maybe I am not using the complete files?
I am using:
class/PdfToText.phpclass
class/Maps/adobe-charsets.map
class/Maps/unicode-to-ansi.map
Do I also need the CIDTables directory, i.e. class/CIDTables/?
By the way, I tried adding these directories to the class library, without any effect:
class/CIDTables
class/contributions
class/FontMetrics
class/FormTemplates
Hey Fredrik,
We are still working out the best way to resolve the issues with CID fonts.
We've made a few changes to the fork on our GitHub. If you check that out, you should get some information out of the PDF from the unicode map we process, even with CID fonts.
There is no time frame currently, as this is very much a side project.
Hi, and thanks a lot,
It almost suits my purpose.
Can this be adjusted a little more?
I seem to get part of the invoice, but not the part that I want.
Great otherwise.
I actually only need two values from the PDF files.
One is the number of pages, and the other is whether the text in the PDF contains
Invoice (Faktura) or Credit invoice (Kreditfaktura).
Can this be supported somehow?
Yes, this is solving my problems for now.
Thank you very much :).
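The two values described above could be derived roughly as follows. This is a sketch under stated assumptions, not part of PdfToText: `classifyInvoiceText` and `countPagesHeuristic` are hypothetical helper names, the first works on whatever text was extracted, and the second counts `/Type /Page` markers in the raw file bytes. That page count is only a heuristic and will undercount when page objects are stored in compressed object streams, as PDF 1.5 and later allow.

```php
<?php
declare(strict_types=1);

/**
 * Classify extracted invoice text.
 * Returns 'credit', 'invoice', or null when neither keyword is found.
 */
function classifyInvoiceText(string $text): ?string
{
    // Check the credit keywords first: "Kreditfaktura" contains "faktura".
    foreach (['kreditfaktura', 'creditinvoice', 'credit invoice'] as $keyword) {
        if (stripos($text, $keyword) !== false) {
            return 'credit';
        }
    }
    foreach (['faktura', 'invoice'] as $keyword) {
        if (stripos($text, $keyword) !== false) {
            return 'invoice';
        }
    }
    return null;
}

/**
 * Heuristic page count: count "/Type /Page" entries in the raw PDF bytes.
 * The \b keeps "/Type /Pages" (the page tree node) from being counted.
 */
function countPagesHeuristic(string $rawPdf): int
{
    return (int) preg_match_all('/\/Type\s*\/Page\b/', $rawPdf);
}
```

Checking the credit keywords before the plain ones matters, because any text containing "Kreditfaktura" also contains "faktura" and would otherwise be classified as an ordinary invoice.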
Hi again,
I am having a problem with this type of invoice. Is it because of the QR code?
It doesn't load anything at all.
This line of code does not run correctly:
$pdf = new PdfToText($uri);
The dropzone responds with:
Server responded with 0 code.
Can this be fixed?
Here is the invoice.
Faktura20541.pdf