How to Merge PDF Files Using Python Script


This Python script can merge multiple PDF files without any limitations. It is easy to use and combines files quickly.

If you have many PDF files and want to merge them into one, there are plenty of websites and software tools that can do it.

But these tools usually come with limitations: for example, you can merge only 5 times, or only a limited number of pages, and you have to pay for anything more.

Here is a very simple Python script that you can use to merge PDF files without any limitations and without paying anything.

Importantly, Python must be installed on your system; you can download the latest version from the Python website. It is open source and free to use. The script also imports the third-party pyPdf package, so that package must be installed as well.

Below is a step-by-step guide on how to merge PDF files using Python.

Suppose you have two PDF files named “first.pdf” and “second.pdf” in the “C:\Source Folder” location, and you want to merge them into one file named “mergefile.pdf”. You can do it easily with a Python script by following the steps below:

Step 1: Open Notepad on your system.

Step 2: Copy the code below into Notepad:

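from argparse import ArgumentParser
from glob import glob
# Note: pyPdf supports Python 2 only. Its maintained successor for
# Python 3 is pypdf, which uses different class and method names.
from pyPdf import PdfFileReader, PdfFileWriter
import os


def merge(path, output_filename):
    output = PdfFileWriter()
    # Source files stay open until the merged file has been written,
    # because the page content is read from them at write time.
    sources = []

    # sorted() makes the merge order predictable (alphabetical by name)
    for pdffile in sorted(glob(os.path.join(path, '*.pdf'))):
        # Compare absolute paths so an earlier merged file is really skipped
        if os.path.abspath(pdffile) == os.path.abspath(output_filename):
            continue
        print("Parse '%s'" % pdffile)
        source = open(pdffile, 'rb')
        sources.append(source)
        document = PdfFileReader(source)
        for i in range(document.getNumPages()):
            output.addPage(document.getPage(i))

    print("Start writing '%s'" % output_filename)
    with open(output_filename, "wb") as f:
        output.write(f)

    for source in sources:
        source.close()


if __name__ == "__main__":
    parser = ArgumentParser()

    parser.add_argument("-o", "--output",
                        dest="output_filename",
                        default="mergefile.pdf",
                        help="write merged PDF to FILE",
                        metavar="FILE")
    parser.add_argument("-p", "--path",
                        dest="path",
                        default=".",
                        help="path of source PDF files")

    args = parser.parse_args()
    merge(args.path, args.output_filename)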


Below is the same code with explanatory comments:

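# Import the modules needed to find and process the PDF files
from argparse import ArgumentParser
from glob import glob
from pyPdf import PdfFileReader, PdfFileWriter
import os


# Define the merge function and create the PDF writer object
def merge(path, output_filename):
    output = PdfFileWriter()
    # Source files stay open until the merged file has been written
    sources = []

    # The loop goes through every .pdf file in the folder, sorted by name
    for pdffile in sorted(glob(os.path.join(path, '*.pdf'))):
        # Skip the output file itself if it is already in the folder
        if os.path.abspath(pdffile) == os.path.abspath(output_filename):
            continue
        print("Parse '%s'" % pdffile)
        source = open(pdffile, 'rb')
        sources.append(source)
        document = PdfFileReader(source)
        # Copy every page of the current file into the writer
        for i in range(document.getNumPages()):
            output.addPage(document.getPage(i))

    # Write the merged pages to the output file, opened in binary
    # write mode ("wb")
    print("Start writing '%s'" % output_filename)
    with open(output_filename, "wb") as f:
        output.write(f)

    # Close the source files now that the output has been written
    for source in sources:
        source.close()


if __name__ == "__main__":
    parser = ArgumentParser()

    # Give any name which you want for the new file, like mergepdf.pdf
    parser.add_argument("-o", "--output",
                        dest="output_filename",
                        default="mergefile.pdf",
                        help="write merged PDF to FILE",
                        metavar="FILE")
    # Folder that contains the PDF files to merge (default: current folder)
    parser.add_argument("-p", "--path",
                        dest="path",
                        default=".",
                        help="path of source PDF files")

    args = parser.parse_args()
    merge(args.path, args.output_filename)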
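
The listing above imports pyPdf, which dates from the Python 2 era. If that import fails on a current Python version, the sketch below shows one way the same merge could be done with pypdf, the maintained successor library. It is only a sketch: it assumes pypdf has been installed (for example with pip install pypdf), and the function names list_source_pdfs and merge_with_pypdf are made up for illustration and are not part of the original script.

```python
import os
from glob import glob


def list_source_pdfs(path, output_filename):
    """Return the PDFs in `path`, sorted by name, excluding the output file."""
    output_abs = os.path.abspath(output_filename)
    return [
        pdf for pdf in sorted(glob(os.path.join(path, "*.pdf")))
        if os.path.abspath(pdf) != output_abs
    ]


def merge_with_pypdf(path, output_filename):
    """Merge every PDF found in `path` into `output_filename`."""
    # Assumption: pypdf is installed. It is imported here so that
    # list_source_pdfs() above stays usable without it.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for pdffile in list_source_pdfs(path, output_filename):
        print("Parse '%s'" % pdffile)
        # Copy every page of the current file into the writer
        for page in PdfReader(pdffile).pages:
            writer.add_page(page)

    print("Start writing '%s'" % output_filename)
    with open(output_filename, "wb") as f:
        writer.write(f)
```

Calling merge_with_pypdf(".", "mergefile.pdf") would mirror the defaults of the script: it reads the current folder and writes mergefile.pdf there.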

Step 3: Save the Notepad file in the “Source Folder” under any name with a .py extension, such as “merger.py”.

Now run the script, and it will merge both files into mergefile.pdf within seconds in the same folder, i.e. “Source Folder”.

You can also merge all the PDF files in a folder: just put the script file in that folder and run it.

Note: If you want to edit the Python script, right-click it and then click “Edit with IDLE”, or open it in Notepad, and save it after editing.
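
The script takes two optional settings, -o for the output file name and -p for the source folder. To show how they behave, the sketch below rebuilds the same parser and feeds it sample argument lists instead of a real command line. The helper name build_parser and the file name combined.pdf are made up for this illustration.

```python
from argparse import ArgumentParser


def build_parser():
    """Recreate the two options that the merge script defines."""
    parser = ArgumentParser()
    parser.add_argument("-o", "--output", dest="output_filename",
                        default="mergefile.pdf",
                        help="write merged PDF to FILE", metavar="FILE")
    parser.add_argument("-p", "--path", dest="path", default=".",
                        help="path of source PDF files")
    return parser


# No arguments: the defaults from the script apply.
defaults = build_parser().parse_args([])
print(defaults.path, defaults.output_filename)

# Same as passing: -p "C:\Source Folder" -o combined.pdf
# ("combined.pdf" is a made-up output name for illustration.)
custom = build_parser().parse_args(
    ["-p", r"C:\Source Folder", "-o", "combined.pdf"])
print(custom.path, custom.output_filename)
```

With no options, the script works on the folder it is run from and writes mergefile.pdf, which is why saving it inside the “Source Folder” is enough.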
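
When a whole folder is merged, the pages end up in the order in which the files are listed. If the file names contain numbers, plain alphabetical sorting puts part10.pdf before part2.pdf. The sketch below shows one possible way around that with a number-aware sort key; the helper natural_key and the sample file names are hypothetical and not part of the original script.

```python
import re


def natural_key(filename):
    """Split a name into text and number chunks so 'part2' sorts before 'part10'."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", filename)]


names = ["part10.pdf", "part2.pdf", "part1.pdf"]
plain_order = sorted(names)                     # plain string order
natural_order = sorted(names, key=natural_key)  # number-aware order
print(plain_order)
print(natural_order)
```

A key like this could be passed to sorted() wherever the list of PDF files is built, if numbered files need to stay in numeric order.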

