Discussion:
extracting text from pdf files
r***@fastmail.fm
2006-10-31 22:38:14 UTC
Can anyone help me with how to extract text from pdf files using PHP or
ColdFusion? Thanks for any help.
p***@gmail.com
2006-11-01 00:52:11 UTC
Hi,

Try the Xpdf project. Run the pdftotext command in the shell to produce
the text.

http://www.foolabs.com/xpdf/download.html

There are more tips at php.net/pdf.
Post by r***@fastmail.fm
Can anyone help me with how to extract text from pdf files using PHP or
ColdFusion? Thanks for any help.
runner7
2006-11-01 04:18:44 UTC
Post by p***@gmail.com
Hi,
Try the Xpdf project. Run the pdftotext command in the shell to produce
the text.
http://www.foolabs.com/xpdf/download.html
There's more tips at php.net/pdf.
Post by r***@fastmail.fm
Can anyone help me with how to extract text from pdf files using PHP or
ColdFusion? Thanks for any help.
I really appreciate this lead, thanks, but can I do this all
programmatically without having to manually use a command line? I need
to process hundreds of pdf files to text and then extract what I need
from them.
Toby Inkster
2006-11-01 11:47:12 UTC
Post by runner7
I really appreciate this lead, thanks, but can I do this all
programmatically without having to manually use a command line? I need
to process hundreds of pdf files to text and then extract what I need
from them.
Use PHP's system() function to run pdftotext from your script.
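
For a batch of files, the idea might look roughly like the sketch below. It assumes PHP 7 or later, a Unix-like shell, and that the pdftotext binary from Xpdf is on the PATH. The function names buildPdftotextCommand and convertDirectory are hypothetical, made up for illustration. It uses exec() rather than system(), because exec() hands back the exit code without echoing the command's output into the page.

```php
<?php
// Hypothetical helper: build the shell command for one file.
// escapeshellarg() protects against spaces and shell metacharacters
// in file names.
function buildPdftotextCommand(string $pdfPath, string $txtPath): string
{
    return 'pdftotext ' . escapeshellarg($pdfPath) . ' ' . escapeshellarg($txtPath);
}

// Hypothetical helper: convert every *.pdf in $inputDir and write the
// text files into $outputDir. Returns a map of PDF path => text path
// for the files where pdftotext exited with status 0.
function convertDirectory(string $inputDir, string $outputDir): array
{
    $converted = [];
    // glob() can return false on error, so fall back to an empty list.
    $pdfFiles = glob(rtrim($inputDir, '/') . '/*.pdf') ?: [];

    foreach ($pdfFiles as $pdfPath) {
        $txtPath = rtrim($outputDir, '/') . '/' . basename($pdfPath, '.pdf') . '.txt';

        // Assumption: pdftotext is installed and on the PATH.
        exec(buildPdftotextCommand($pdfPath, $txtPath), $outputLines, $exitCode);

        if ($exitCode === 0) {
            $converted[$pdfPath] = $txtPath;
        }
    }

    return $converted;
}
```

Calling convertDirectory() with your own input and output directories returns the list of converted files. Each text file can then be read with file_get_contents() and searched with the usual string or regex functions to pull out what you need.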
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Thomas Merz
2006-11-01 22:55:02 UTC
Post by r***@fastmail.fm
Can anyone help me with how to extract text from pdf files using PHP or
ColdFusion? Thanks for any help.
Our TET product extracts the text from PDF. It contains a programming
interface for PHP (and other languages); you can directly
fetch the text (and coordinates, font, etc.) from your PHP
script. Free evaluation version on our Web site.

Thomas

_______________________________________________________________
Thomas Merz ***@pdflib.com http://www.pdflib.com
PDFlib 7: Create PDF/A for archiving, format tables, and more!
_______PDFlib - a library for generating PDF on the fly________
