Convert Any PDF eBook to an Audiobook with Python
I came across an amazing Python code snippet that converts PDF e-books into audiobooks with minimal code.
The code snippet uses two Python packages:
- PyPDF2: a free and open-source pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files, and it can retrieve text and metadata from PDFs.
- pyttsx3: a text-to-speech conversion library for Python. Unlike alternative libraries, it works offline and is compatible with both Python 2 and 3.
The code is pretty straightforward, and it demonstrates how simple and cool Python is.
First, install the required packages:
pip install PyPDF2
pip install pyttsx3
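Before going further, it can help to confirm that both installs actually landed in the Python environment you plan to run the script from. Here is a minimal sketch using only the standard library; `missing_packages` is a helper name I made up for illustration, not part of either package.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the module names from `names` that Python cannot find for import."""
    return [name for name in names if find_spec(name) is None]


# The two module names the script imports (same spelling as in the pip commands).
missing = missing_packages(["PyPDF2", "pyttsx3"])
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("Both packages are installed.")
```

If a name shows up as missing, the usual cause is that `pip` installed into a different Python environment than the one running the script.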
Now create your Python script file, and add:
import PyPDF2
import pyttsx3

# Read the PDF by specifying its path on your computer
# (PdfReader, .pages, and extract_text() replace the old PdfFileReader,
# numPages, getPage(), and extractText() names, which PyPDF2 3.x removed)
pdf_reader = PyPDF2.PdfReader('clcoding.pdf')
# Get the handle to the speaker
speaker = pyttsx3.init()
# Go through the pages and read them aloud one by one
pages = []
for page in pdf_reader.pages:
    text = page.extract_text()
    pages.append(text)
    speaker.say(text)  # clcoding.com
    speaker.runAndWait()
# Stop the speaker after completion
speaker.stop()
# Save the whole audiobook at the specified path
# (raw string, so '\a' is not read as an escape sequence)
speaker.save_to_file('\n'.join(pages), r'E:\audio.mp3')
speaker.runAndWait()
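Text extracted from a PDF often keeps the hard line breaks of the page layout, and sometimes words split by a hyphen at the end of a line, which can make the speech sound choppy. The snippet itself does not handle this, so the following is only a sketch of one possible cleanup step; `clean_page_text` is a hypothetical helper of mine, and the two regular expressions are assumptions about what typical extracted text looks like.

```python
import re


def clean_page_text(text):
    """Tidy the text extracted from one PDF page before it is spoken."""
    # Rejoin words split by a hyphen at a line end: "audio-\nbook" -> "audiobook"
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse line breaks and runs of whitespace into single spaces
    text = re.sub(r"\s+", " ", text)
    return text.strip()


sample = "Convert any PDF into an audio-\nbook with\n  Python."
print(clean_page_text(sample))
```

In the script you would pass each page's text through the helper before speaking it, for example `speaker.say(clean_page_text(text))`. One known limitation: a genuinely hyphenated word that happens to break across lines loses its hyphen too.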
I also found a very similar tutorial from 2020 by Aman Kharwal that explains the approach in more detail.