Merging Documents With Similar File Names

Introduction

It is often necessary to merge PDF files with similar file names into a single PDF document. For example, multiple statements belonging to the same customer/account can be combined into a single document using this simple approach. The AutoSplit plug-in provides functionality for automatic merging of multiple PDF documents into multiple output documents based on file name similarity. The following tutorial provides step-by-step instructions on how to accomplish this task.

Input Document Description

The input folder we are going to use in this tutorial consists of the following 13 PDF files:

1. INV.TASK.0000001.0000001.pdf

2. INV.TASK.0000001.ACTION.0000654.pdf

3. INV.TASK.0000001.ACTION.0006543.pdf

4. INV.TASK.0000002.0000001.pdf

5. INV.TASK.0000002.ACTION.0000089.pdf

6. INV.TASK.0000002.ACTION.0000123.pdf

7. INV.TASK.0000002.ACTION.0022001.pdf

8. INV.TASK.0000003.0000001.pdf

9. INV.TASK.0000003.ACTION.0003034.pdf

10. INV.TASK.0000003.ACTION.0009781.pdf

11. INV.TASK.0000004.0000001.pdf

12. INV.TASK.0000004.ACTION.0000009.pdf

13. INV.TASK.0000004.ACTION.0004541.pdf

IMPORTANT: Please note that input files should not be password-protected or have restricted user-access rights.

The goal is to merge all files from the input folder into the output folder by taking one input document and finding all other documents whose file names share at least the first 16 characters (this value can be set by the user). Each output file has the same name as the first PDF file in its group of similarly named files.

The following 4 output files should be created:

1. INV.TASK.0000001.0000001.pdf

2. INV.TASK.0000002.0000001.pdf

3. INV.TASK.0000003.0000001.pdf

4. INV.TASK.0000004.0000001.pdf
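The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this tutorial, not the plug-in's actual implementation. The function name group_by_prefix and the parameter min_common are hypothetical, and the sketch assumes that names are compared including the extension and sorted in plain character order.

```python
def group_by_prefix(names, min_common=16):
    """Group file names that share at least `min_common` leading characters.

    Hypothetical helper: names are sorted first, and the first name in
    each group is used as the output file name for that group.
    """
    groups = {}    # output file name -> list of input file names
    leader = None  # first file name of the current group
    for name in sorted(names):
        same_group = (
            leader is not None
            and len(name) >= min_common
            and name[:min_common] == leader[:min_common]
        )
        if not same_group:
            # Start a new group named after this file.
            leader = name
            groups[leader] = []
        groups[leader].append(name)
    return groups


# The 13 input files used in this tutorial.
input_files = [
    "INV.TASK.0000001.0000001.pdf",
    "INV.TASK.0000001.ACTION.0000654.pdf",
    "INV.TASK.0000001.ACTION.0006543.pdf",
    "INV.TASK.0000002.0000001.pdf",
    "INV.TASK.0000002.ACTION.0000089.pdf",
    "INV.TASK.0000002.ACTION.0000123.pdf",
    "INV.TASK.0000002.ACTION.0022001.pdf",
    "INV.TASK.0000003.0000001.pdf",
    "INV.TASK.0000003.ACTION.0003034.pdf",
    "INV.TASK.0000003.ACTION.0009781.pdf",
    "INV.TASK.0000004.0000001.pdf",
    "INV.TASK.0000004.ACTION.0000009.pdf",
    "INV.TASK.0000004.ACTION.0004541.pdf",
]

groups = group_by_prefix(input_files, min_common=16)
for output_name, members in groups.items():
    print(output_name, "<-", len(members), "files")
```

With a threshold of 16, the sketch produces the same four output names listed above, because the first 16 characters (for example "INV.TASK.0000001") identify the task number.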

Merging Approach

Unlike regular document merging, this operation produces multiple output files. Each output file is created by taking one input document and finding all other documents that have at least X common first characters in the file name. The similarity between two file names is controlled by a user-specified number of common characters. All merged documents are placed in the user-selected output folder. Each merged file has the same name as the first file in the group used for merging.
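To see how the number of common characters affects similarity, the short Python sketch below counts the leading characters shared by two file names. It is an illustration only: the helper names common_prefix_length and are_similar are hypothetical, and the comparison is assumed to be case-sensitive and to include the file extension.

```python
import os.path


def common_prefix_length(name_a, name_b):
    """Return how many leading characters the two file names share."""
    # os.path.commonprefix compares strings character by character.
    return len(os.path.commonprefix([name_a, name_b]))


def are_similar(name_a, name_b, min_common):
    """Hypothetical similarity test: at least `min_common` shared leading characters."""
    return common_prefix_length(name_a, name_b) >= min_common


base = "INV.TASK.0000001.0000001.pdf"
action = "INV.TASK.0000001.ACTION.0000654.pdf"
other_task = "INV.TASK.0000002.0000001.pdf"

print(common_prefix_length(base, action))      # shared prefix: "INV.TASK.0000001."
print(common_prefix_length(base, other_task))  # shared prefix: "INV.TASK.000000"
print(are_similar(base, action, 16), are_similar(base, other_task, 16))
```

In this sketch the two files for the same task share 17 leading characters, while files for different tasks share only 15. A threshold of 16 therefore keeps tasks apart, whereas a threshold of 15 or lower would treat all of these names as similar.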

Files are sorted alphabetically prior to merging. The order of the files in each merged document depends on the document names. The name of each output document is determined by which file name appears first in the sorted list of input files for that group.

[Figure: example of merging 8 files into 2 output PDF documents]

Output Results

The “AutoSplit Results” dialog appears on screen once processing is completed. It shows a list of output documents.

Click “Open Output Folder” to inspect the results.

The AutoSplit plug-in has automatically created 4 output files after the merging process:

1. INV.TASK.0000001.0000001.pdf that includes 3 files: INV.TASK.0000001.0000001.pdf, INV.TASK.0000001.ACTION.0000654.pdf, INV.TASK.0000001.ACTION.0006543.pdf;

2. INV.TASK.0000002.0000001.pdf that includes 4 files: INV.TASK.0000002.0000001.pdf, INV.TASK.0000002.ACTION.0000089.pdf, INV.TASK.0000002.ACTION.0000123.pdf, INV.TASK.0000002.ACTION.0022001.pdf;

3. INV.TASK.0000003.0000001.pdf that includes 3 files: INV.TASK.0000003.0000001.pdf, INV.TASK.0000003.ACTION.0003034.pdf, INV.TASK.0000003.ACTION.0009781.pdf;

4. INV.TASK.0000004.0000001.pdf that includes 3 files: INV.TASK.0000004.0000001.pdf, INV.TASK.0000004.ACTION.0000009.pdf, INV.TASK.0000004.ACTION.0004541.pdf.