
Read And Write PDF Files In Python






PyPDF2 Library





How To Install PyPDF2?




  1. Open the command prompt.
  2. Go to python_path\Scripts, where python_path is the folder in which Python is installed on your machine.
  3. Run the following command:
     pip install PyPDF2
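
To confirm that the installation worked, one option is to ask Python whether it can find the module. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is made up for this illustration and is not part of PyPDF2.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# 'PyPDF2' is the import name used in the examples of this article
if is_installed("PyPDF2"):
    print("PyPDF2 is available")
else:
    print("PyPDF2 was not found; try running the pip command again")
```

If the second message is printed, pip may have installed the package into a different Python installation than the one running the script.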


    Picture showing installation of pypdf2 library



How To Read A PDF File?




import PyPDF2

# Open the file in binary read mode ('rb'); the with block closes it automatically
with open('c:\\temp\\Includes method in JavaScript.pdf', 'rb') as myfile:
    pdf_reader = PyPDF2.PdfReader(myfile)

    # Loop through all the pages in the PDF and print their text
    for page in pdf_reader.pages:
        print(page.extract_text())
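
The file path in the example doubles every backslash because a single backslash starts an escape sequence in a normal Python string. As a small sketch of the alternatives, the same path can also be written as a raw string or built with pathlib; the variable names below are only for illustration.

```python
from pathlib import PureWindowsPath

# Three ways to spell the same Windows path
escaped = 'c:\\temp\\Includes method in JavaScript.pdf'   # doubled backslashes
raw = r'c:\temp\Includes method in JavaScript.pdf'         # raw string, no doubling
built = PureWindowsPath('c:/temp') / 'Includes method in JavaScript.pdf'

print(escaped == raw)       # both strings hold identical characters
print(str(built) == raw)    # pathlib renders the path with backslashes
```

Any of the three forms can be passed to open(); which one to use is mostly a matter of readability.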



Output


Picture showing the text read from PDF file
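
Instead of printing each page directly, the text can be collected into one string so that it can be saved or searched later. The following is a rough sketch, not the only way to do it: collect_text and read_pdf_text are hypothetical helper names, and the "--- page N ---" separator is an arbitrary choice. It also assumes that a page without extractable text (a scanned image, for instance) may give back an empty result, which is replaced by an empty string here.

```python
def collect_text(reader):
    """Join the text of every page of a PdfReader-like object into one string."""
    parts = []
    for number, page in enumerate(reader.pages, start=1):
        # Fall back to '' when a page yields no text
        text = page.extract_text() or ""
        parts.append(f"--- page {number} ---\n{text}")
    return "\n".join(parts)


def read_pdf_text(path):
    """Open the PDF at 'path' and return the text of all its pages."""
    import PyPDF2  # imported here so collect_text works without the library

    with open(path, "rb") as pdf_file:
        return collect_text(PyPDF2.PdfReader(pdf_file))
```

With PyPDF2 installed, calling read_pdf_text('c:\\temp\\Includes method in JavaScript.pdf') should return the same text that the loop above prints, with a separator line before each page.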



Write Into The PDF File




import PyPDF2

# Open the source file in binary read mode ('rb')
with open('c:\\temp\\Includes method in JavaScript.pdf', 'rb') as myfile:
    pdf_reader = PyPDF2.PdfReader(myfile)

    # Read the first page
    first_page = pdf_reader.pages[0]

    # Write that page into a new file while the source file is still open
    pdf_writer = PyPDF2.PdfWriter()
    pdf_writer.add_page(first_page)
    with open("new file.pdf", "wb") as pdf_output:
        pdf_writer.write(pdf_output)




Picture showing the new PDF file creation
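
The same approach extends from the first page to any selection of pages. The sketch below is one possible way to write it: copy_pages and extract_pages are hypothetical helper names, page numbers are assumed to be zero-based as in pdf_reader.pages[0], and an out-of-range number raises an IndexError before anything is written.

```python
def copy_pages(reader, writer, page_numbers):
    """Add the selected zero-based pages of 'reader' to 'writer'."""
    total = len(reader.pages)
    for number in page_numbers:
        if not 0 <= number < total:
            raise IndexError(f"page {number} is outside 0..{total - 1}")
        writer.add_page(reader.pages[number])
    return writer


def extract_pages(source_path, target_path, page_numbers):
    """Write the selected pages of one PDF file into a new PDF file."""
    import PyPDF2  # imported here so copy_pages works without the library

    # Keep the source open until the new file has been written
    with open(source_path, "rb") as source:
        writer = copy_pages(PyPDF2.PdfReader(source), PyPDF2.PdfWriter(), page_numbers)
        with open(target_path, "wb") as target:
            writer.write(target)
```

For example, extract_pages('c:\\temp\\Includes method in JavaScript.pdf', 'new file.pdf', [0]) would be expected to produce the same one-page file as the code above.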



Posted By  -  Karan Gupta
 
Posted On  -  Thursday, July 2, 2020
