Adding subtitles to video with Python

I have a video of people talking, along with a transcript. I chunked the words into sentences so that I can display one sentence at a time on the screen, like normal subtitles in a movie. To do so, I created a csv with one row per frame, where each row contains the full sentence being spoken during that frame. This way I can loop over all frames and draw the sentence text on every frame that falls within that sentence's time chunk. I do the drawing in OpenCV.

Sample transcript csv:

frame     sentence
0           hello
1           hello
2           how are you
3           how are you
4           how are you
5           how are you
6           how are you
7           how are you 
8           fine
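
For illustration, a per-frame table like the one above could be built roughly as follows. This is only a sketch under assumptions: the (start, end, text) timings, the fps value of 1, and the helper names sentences_to_frames and write_frame_csv are made up for this example and are not taken from my actual pipeline. It uses only the standard library csv module.

```python
import csv

def sentences_to_frames(chunks, fps, num_frames):
    """Expand (start_sec, end_sec, text) chunks into one sentence per frame."""
    sentences = [""] * num_frames  # frames without speech keep an empty string
    for start_sec, end_sec, text in chunks:
        first = max(0, round(start_sec * fps))
        last = min(num_frames, round(end_sec * fps))  # end frame is exclusive
        for frame in range(first, last):
            sentences[frame] = text
    return sentences

def write_frame_csv(path, sentences):
    """Write a 'frame,sentence' csv with one row per frame."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frame", "sentence"])
        writer.writerows(enumerate(sentences))

# Hypothetical timings chosen so that the result matches the sample above
chunks = [(0.0, 2.0, "hello"), (2.0, 8.0, "how are you"), (8.0, 9.0, "fine")]
rows = sentences_to_frames(chunks, fps=1, num_frames=9)
```

Calling write_frame_csv('data.csv', rows) would then produce the file that the drawing loop reads.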

The csv is the same length as the number of frames in the video. To draw subtitles, I do this:

import cv2
import pandas as pd

df = pd.read_csv('data.csv')
video = cv2.VideoCapture('vid.mp4')
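# CAP_PROP_FRAME_COUNT is returned as a float, so cast it before using range()
num_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))

assert len(df) == num_frames

for i in range(num_frames):
    ret, frame = video.read()
    if not ret:
        break  # stop if a frame could not be decoded
    # draw only the sentence that belongs to the current frame
    cv2.putText(frame, str(df.sentence[i]), (0, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 0), 3, cv2.LINE_AA)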

    # additional standard cv2 code below...

This works, but the resulting video has no audio. I understand that OpenCV does not handle audio at all, but are there any workarounds? This approach fits well in my pipeline, so I’d like to keep writing these frames to a new video while preserving the original audio, using as few additional libraries as possible.

