Read and write PDF files in Python






PyPDF2 library





How to install PyPDF2?










pip install pypdf2


Picture showing installation of the PyPDF2 library



How to read a PDF file?




import PyPDF2

# Open the file in binary mode ('rb'); the with block closes it automatically
with open('c:\\temp\\Includes method in JavaScript.pdf', 'rb') as myfile:
    # PdfReader replaces PdfFileReader, which was removed in PyPDF2 3.0
    pdf_reader = PyPDF2.PdfReader(myfile)

    # Loop through all the pages in the PDF and print their text
    for page in pdf_reader.pages:
        print(page.extract_text())
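
Printing every page is fine for a quick look, but it is often handier to keep the text together with its page number. The sketch below is one possible way to do that, assuming a recent PyPDF2 release in which the reader exposes `pages` and each page has `extract_text()`. The names `collect_page_text` and `read_pdf_text` are hypothetical helpers made up for this illustration, not part of PyPDF2.

```python
def collect_page_text(reader):
    """Return a list of (page_number, text) pairs, skipping empty pages.

    `reader` is any object with a `pages` sequence whose items provide
    `extract_text()`, for example a PyPDF2 PdfReader (assumed API).
    """
    results = []
    for number, page in enumerate(reader.pages, start=1):
        # Scanned or image-only pages may yield no text at all
        text = (page.extract_text() or "").strip()
        if text:
            results.append((number, text))
    return results


def read_pdf_text(path):
    """Open the PDF at `path` and collect the text of each page."""
    # Imported here so collect_page_text stays usable without PyPDF2
    from PyPDF2 import PdfReader

    with open(path, "rb") as pdf_file:
        return collect_page_text(PdfReader(pdf_file))
```

Calling `read_pdf_text` with the path of a PDF returns the pairs in page order, so they can be printed, searched, or saved to a text file as needed.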




Picture showing the text read from PDF file



How to write into a PDF file?




import PyPDF2

# Open the source file in binary read mode ('rb')
with open('c:\\temp\\Includes method in JavaScript.pdf', 'rb') as myfile:
    # PdfReader replaces PdfFileReader, which was removed in PyPDF2 3.0
    pdf_reader = PyPDF2.PdfReader(myfile)

    # Read the first page (pages are indexed from 0)
    first_page = pdf_reader.pages[0]

    # Add the page to a writer and save it to a new file
    pdf_writer = PyPDF2.PdfWriter()
    pdf_writer.add_page(first_page)
    with open("new file.pdf", "wb") as pdf_output:
        pdf_writer.write(pdf_output)
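
The same idea extends from the first page to any selection of pages. The following is a minimal sketch rather than a definitive implementation. It assumes a recent PyPDF2 release that provides `PdfReader`, `PdfWriter`, `pages` and `add_page`. The helper names `copy_pages` and `save_pages`, and the choice of 1-based page numbers, are assumptions made for this example.

```python
def copy_pages(reader, writer, page_numbers):
    """Add the given 1-based page numbers from reader to writer, in order."""
    total = len(reader.pages)
    for number in page_numbers:
        if not 1 <= number <= total:
            raise ValueError(f"page {number} is outside 1..{total}")
        # Convert the 1-based page number to a 0-based index
        writer.add_page(reader.pages[number - 1])
    return writer


def save_pages(src_path, dst_path, page_numbers):
    """Write the selected pages of src_path into a new PDF at dst_path."""
    # Imported here so copy_pages stays usable without PyPDF2
    from PyPDF2 import PdfReader, PdfWriter

    with open(src_path, "rb") as src:
        writer = copy_pages(PdfReader(src), PdfWriter(), page_numbers)
        # Write before the source closes, since page data may be read lazily
        with open(dst_path, "wb") as dst:
            writer.write(dst)
```

For instance, `save_pages('c:\\temp\\Includes method in JavaScript.pdf', 'new file.pdf', [1])` would produce the same single-page file as the script above, and a list such as `[3, 1]` would copy the third page followed by the first.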




Picture showing the new PDF file creation



Posted By  -  Karan Gupta
 
Posted On  -  Thursday, July 2, 2020
