    AccessPDF A PDF Forum for Users and Programmers    
 Welcome to AccessPDF
 Sunday, May 11 2008 @ 04:25 PM PDT

Assemble PDF Pages after Double-Sided Scanning

   
Pdftk

Here is an email I received today that describes a common PDF problem. I sketched out a solution, and some kind folks have offered scripts. Feel free to contribute.

The old HP 6350 scanner (with sheet feeder) came with the HP Precision Scan program, which lets the user scan double-sided paper in two passes.

Specifically, after scanning one side of every page, you are asked to turn over the whole pile and scan the back sides, i.e.:

First pass: pages 1 3 5 7
Second pass: pages 8 6 4 2

Combined output: pages 1 2 3 4 5 6 7 8

I wonder if your masterpiece "pdftk" can be used for this purpose, i.e.

taking the first page from PDF A, then the last page from PDF B; then the 2nd page from PDF A, then page (n-1) from PDF B...



Hello-

Thank you for your email. Pdftk could definitely handle this task. My first thought is to use a shell script to assemble the necessary command line. It would also be possible to make this a built-in pdftk feature.

Given two PDFs, one.pdf and two.pdf, each four pages long, the command line would look like this:

pdftk A=one.pdf B=two.pdf cat A1 B4 A2 B3 A3 B2 A4 B1 output mydoc.pdf

You could use a tool like pdfinfo (part of xpdf) to discover how many pages each input PDF has, before assembling the command line. Pdftk can also tell you how many pages a PDF has (using the dump_data operation), but pdfinfo is faster.
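
As a rough sketch of the page-count step, the helper below pulls the count out of either tool's text output. The function name page_count is made up for this example. It assumes pdfinfo prints a line of the form "Pages: 4" and that pdftk's dump_data prints "NumberOfPages: 4".

```shell
#!/bin/sh
# Hypothetical helper (not part of pdftk or xpdf): read the text output of
# pdfinfo or "pdftk ... dump_data" on stdin and print the page count.
# Assumes a line whose first field is "Pages:" or "NumberOfPages:".
page_count() {
    awk '$1 == "Pages:" || $1 == "NumberOfPages:" { print $2; exit }'
}
```

It would be used as "pdfinfo one.pdf | page_count" or "pdftk one.pdf dump_data | page_count".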


Here is a concrete solution by Bill Segraves:

For Sid's example, the following bash script worked fine here for a four-page document, i.e. two pages in each input PDF (tested with MSYS):

eval pdftk A=one.pdf B=two.pdf cat `X=1;Y=2;while [ $X -le 2 ];
do echo A$X B$Y; X=$((X+1));
Y=$((Y-1));
done` output mydoc.pdf

For a larger document, e.g., fifty pages in each half of the document, substitute 50 for both occurrences of 2 in the above code (the starting value of Y and the loop limit). Based on the large environment space suggested by Sid's tests with MSYS, it appears the above script could be used for documents with a rather large number of pages before environment space becomes a limit.
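
The same loop can be wrapped in a function that takes the page count as an argument, so nothing needs editing by hand. This is only a sketch: the name reverse_interleave is invented here, and it assumes both inputs have the same number of pages.

```shell
#!/bin/sh
# Hypothetical helper: print the pdftk page list for a two-pass duplex scan.
# $1 = number of pages in each half.
# A (front sides) counts up from 1; B (back sides) counts down from $1.
reverse_interleave() {
    n=$1
    x=1
    list=""
    while [ "$x" -le "$n" ]; do
        list="$list A$x B$((n - x + 1))"
        x=$((x + 1))
    done
    # Drop the leading space before printing.
    echo "${list# }"
}
```

Called as "pdftk A=one.pdf B=two.pdf cat $(reverse_interleave 4) output mydoc.pdf", this should expand to the page list A1 B4 A2 B3 A3 B2 A4 B1.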

Here is a Windows batch script by Ross Presser (cribbed from comp.text.pdf). Note that he is solving a different problem: both input PDFs are in forward order, so the pages are simply interleaved. The resulting command string looks like this:

pdftk A=one.pdf B=two.pdf cat A1 B1 A2 B2 A3 B3 A4 B4 output mydoc.pdf

Here's a Windows 2000/XP batch language solution to create the string:

---- cut here --- mix.cmd ----
set PAGECOUNT=25
set cmd=cat
for /L %%i in (1,1,%PAGECOUNT%) do SET cmd=!cmd! A%%i B%%i
pdftk A=1.pdf B=2.pdf %cmd% output result2.pdf
---- cut here --- mix.cmd ----

This batch file *must* be run with "delayed variable expansion" enabled (see the help for CMD.EXE for an explanation of DVE); without it, !cmd! is not expanded inside the loop. Run it like this at the command line:

  CMD.exe /V /C mix.cmd
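
For comparison, here is a shell sketch of the same forward-order interleave. The function name forward_interleave is hypothetical; it mirrors the cmd string the batch file builds and assumes both inputs have the same number of pages.

```shell
#!/bin/sh
# Hypothetical helper: print "cat A1 B1 A2 B2 ..." for $1 pages per input,
# the same string that mix.cmd accumulates in its cmd variable.
forward_interleave() {
    n=$1
    i=1
    cmd="cat"
    while [ "$i" -le "$n" ]; do
        cmd="$cmd A$i B$i"
        i=$((i + 1))
    done
    echo "$cmd"
}
```

It would be used as "pdftk A=1.pdf B=2.pdf $(forward_interleave 25) output result2.pdf".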
 

 Copyright © 2008 AccessPDF
 All trademarks and copyrights on this page are owned by their respective owners.