How to convert PDF to HTML or TXT?
Extracting text from PDF in a few lines of code
In this example we use the REST7 API to convert a PDF document to HTML format:
<?php
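// Address of the PDF document to convert
$url = 'http://your_server.com/document.pdf';
// URL-encode the address so characters such as & or ? survive as a query value
$data = json_decode(file_get_contents('http://api.rest7.com/v1/pdf_to_html.php?url=' . urlencode($url)));
// Stop if the request failed or the API did not report success
if (@$data->success !== 1)
{
    die('Failed');
}
// Download the converted file and save it locally
$doc = file_get_contents($data->file);
file_put_contents('rendered_html.htm', $doc);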
Converting PDF to TXT is very similar:
<?php
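// Address of the PDF document to convert
$url = 'http://your_server.com/document.pdf';
// Same request pattern as before, this time against the pdf_to_text endpoint
$data = json_decode(file_get_contents('http://api.rest7.com/v1/pdf_to_text.php?layout=0&url=' . urlencode($url)));
// Stop if the request failed or the API did not report success
if (@$data->success !== 1)
{
    die('Failed');
}
// Download the extracted text and save it locally
$doc = file_get_contents($data->file);
file_put_contents('converted.txt', $doc);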
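Since both snippets differ only in the endpoint name and the extra query parameters, they can be folded into one small helper. The sketch below is only an illustration: the function names rest7_request_url and rest7_convert are made up for this example and are not part of REST7, and the code assumes the API keeps responding with the success and file fields used above. It builds the query string with http_build_query, so the document address is always URL-encoded, and it returns null instead of calling die() when something goes wrong.

```php
<?php
// Hypothetical helper (not part of REST7): builds the request URL for one of
// the endpoints used above, with every query parameter URL-encoded.
function rest7_request_url(string $endpoint, string $pdfUrl, array $extra = []): string
{
    // Extra parameters (e.g. layout) come first, then the document address
    $params = $extra + ['url' => $pdfUrl];
    // http_build_query() percent-encodes keys and values; '&' is passed
    // explicitly so the result does not depend on php.ini settings
    return 'http://api.rest7.com/v1/' . $endpoint . '.php?' . http_build_query($params, '', '&');
}

// Hypothetical wrapper: performs the request, checks the response the same
// way as the examples above, and returns the converted document as a string,
// or null on any failure.
function rest7_convert(string $endpoint, string $pdfUrl, array $extra = []): ?string
{
    $response = @file_get_contents(rest7_request_url($endpoint, $pdfUrl, $extra));
    if ($response === false) {
        return null; // request could not be completed
    }

    $data = json_decode($response);
    if (!is_object($data) || !isset($data->success, $data->file) || $data->success !== 1) {
        return null; // invalid JSON or the API did not report success
    }

    $doc = @file_get_contents($data->file);
    return $doc === false ? null : $doc;
}
```

With these helpers, the TXT conversion could be written as $text = rest7_convert('pdf_to_text', 'http://your_server.com/document.pdf', ['layout' => 0]); followed by a check for null before calling file_put_contents.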