Tuesday, June 21 2005 @ 10:27 AM PDT
Contributed by: Admin
Some pdftk users process hundreds of files. Doing this work on a Windows machine can yield unexpected results. The problem comes from the Windows command-prompt shell, not pdftk: for every long filename, Windows also creates a short, DOS-compatible (8.3) filename. This short filename can match a wildcard expression even when the long filename does not. With pdftk, the result is more input files than you intended.
This article offers a couple of workarounds and then describes the case where this problem arose.
The Workarounds
One workaround is to use a wildcard expression that cannot possibly match a short, DOS-style filename. DOS-style filenames have a base name of at most eight characters and an optional extension of at most three characters. They look something like this: 343990~1.PDF. In the case described below, using the wildcard expression 343990_* solved the problem.
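To see why the underscore matters, here is a small Python sketch that tests two wildcard expressions against the short name quoted above. It only approximates Windows matching with Python's fnmatch module, and the looser pattern 343990* is a hypothetical example for contrast, not the expression from the original case.

```python
from fnmatch import fnmatchcase

# Short, DOS-style name of the shape Windows generates (taken from the text above).
SHORT_NAME = "343990~1.PDF"


def matches(pattern: str, name: str) -> bool:
    """Case-insensitive wildcard match; a rough stand-in for Windows matching."""
    return fnmatchcase(name.upper(), pattern.upper())


# Hypothetical loose pattern: the '*' absorbs "~1.PDF", so the short name matches.
loose = matches("343990*", SHORT_NAME)

# Pattern from the article: the seventh character must be '_', but it is '~'.
strict = matches("343990_*", SHORT_NAME)

print(loose, strict)  # prints "True False"
```

Because a generated short name has a tilde where the long name has an underscore, requiring the underscore in the expression keeps the short name out of the match.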
Another workaround is to use a shell other than the Windows command-prompt. I use bash as packaged by MSYS.
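In the same spirit, the wildcard can be expanded outside the command prompt by a small script, which then hands pdftk an explicit list of files. The following is a minimal sketch, not part of pdftk: build_pdftk_command and its parameters are hypothetical names, and it relies on os.listdir reporting long filenames only.

```python
import os
from fnmatch import fnmatchcase


def build_pdftk_command(directory, pattern, output):
    """Return a pdftk argument list whose inputs matched `pattern` by long name.

    Hypothetical helper: the matching is done here, against the names that
    os.listdir returns, so generated 8.3 short names never enter the list.
    """
    names = sorted(
        name
        for name in os.listdir(directory)
        if fnmatchcase(name.lower(), pattern.lower())  # case-insensitive match
    )
    inputs = [os.path.join(directory, name) for name in names]
    # pdftk <inputs...> cat output <file> concatenates the inputs in order.
    return ["pdftk", *inputs, "cat", "output", output]
```

The returned list can be passed to subprocess.run(cmd, check=True). Since no shell is involved, no further wildcard expansion takes place.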
The Case
This problem arose in a case where a directory of input files contained 448 PDFs. Their numerical names had incrementing prefixes and suffixes, such as: