
Turn audio into shareable video with Python and ffmpeg

2022-01-30 11:56:28 Stone sword

In this tutorial, we will learn how to use Python and FFmpeg to build an application that turns voice recordings into cool videos that can be easily shared on social media.

By the end of this tutorial, we will have turned a voice recording into a video that looks similar to the following.

Project demo

Tutorial requirements

To follow this tutorial, you will need the following components.

  • One or more voice recordings that you want to convert to video. Programmable Voice recordings stored in your Twilio account work well for this tutorial.
  • Python 3.6 or later installed.
  • FFmpeg version 4.3.1 or later installed.
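Before going further, it can help to confirm that the FFmpeg programs are reachable from Python. The following is a minimal sketch, not part of the original project: the helper name missing_tools is hypothetical, and it only checks that the executables are on the PATH, not that their version is 4.3.1 or later.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available")
```

If the list is not empty, install FFmpeg or fix your PATH before running the rest of the tutorial.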

Create the project structure

In this section, we will create our project directory and, inside it, subdirectories to store the recordings, images, fonts, and videos used in this tutorial. Finally, we will create the Python file that will contain the code that lets us use FFmpeg to create and edit videos.

Open a terminal window and enter the following commands to create the project directory and move into it.

mkdir twilio-turn-recording-to-video
cd twilio-turn-recording-to-video

Use the following commands to create four subdirectories.

mkdir images
mkdir fonts
mkdir videos
mkdir recordings

The images directory is where we will store the background image of the video. Download this image and store it in the images directory under the name bg.png. This image was originally downloaded from Freepik.com.

In the fonts directory, we will store the font file used to write text in our video. Download this font and store it in the fonts directory under the name LeagueGothic-CondensedRegular.otf. This font was originally downloaded from fontsquirrel.com.

The videos directory will contain the videos and animations that will be added on top of the background image. Download this video of a spinning record with the Twilio logo and save it in the videos directory under the name spinningRecord.mp4. The source image used in this video was downloaded from flaticon.com.

The recordings directory is where we will store the voice recordings that will be turned into videos. Add one or more of your own voice recordings to this directory.
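If you prefer to set up the layout from Python rather than from the terminal, the directory structure described above could be sketched as follows. The helper names create_layout and missing_assets are hypothetical; the directory and file names are the ones used in this tutorial.

```python
from pathlib import Path

SUBDIRECTORIES = ("images", "fonts", "videos", "recordings")
EXPECTED_ASSETS = (
    "images/bg.png",
    "fonts/LeagueGothic-CondensedRegular.otf",
    "videos/spinningRecord.mp4",
)


def create_layout(project_root):
    """Create the four subdirectories (no error if they already exist)."""
    root = Path(project_root)
    for name in SUBDIRECTORIES:
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


def missing_assets(project_root):
    """Return the expected asset paths that are not present yet."""
    root = Path(project_root)
    return [asset for asset in EXPECTED_ASSETS if not (root / asset).is_file()]
```

Calling missing_assets(".") from the project directory lists any downloads that still need to be saved under the expected names.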

Now that we have created all the required directories, open your favorite code editor and create a file named main.py in the top-level directory of the project. This file will contain the code responsible for turning our recording into a video.

If you don't want to follow every step of the tutorial, you can get the complete project source code here.

Turn an audio file into a video

In this section, we will add the code that turns a recording into a video showing the recording's sound waves.

We will use FFmpeg to generate a video from an audio file. To call FFmpeg and its related programs from Python, we will use Python's subprocess module.

Run a command

Add the following code to the main.py file.

import subprocess


def run_command(command):
    p = subprocess.run(
        command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    print('Done!!!')
    print('stdout:\n{}'.format(p.stdout.decode()))

    return p.stdout.decode().strip()

In the code block above, we imported the subprocess module and created a run_command() function. As the name suggests, this function runs the command passed in as its argument. When the command completes, we print its output and return it to the caller.
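The run_command() function above passes a single string to the shell, so file names containing spaces or shell metacharacters need careful quoting. As an alternative, here is a minimal sketch that takes the command as a list of arguments and reports failures. The name run_command_args is hypothetical and not part of the original project, and raising on a non-zero exit status is a design choice of this sketch.

```python
import subprocess
import sys


def run_command_args(args):
    """Run a command given as a list of arguments, without a shell.

    Returns the combined stdout/stderr text and raises RuntimeError
    if the command exits with a non-zero status.
    """
    p = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = p.stdout.decode().strip()
    if p.returncode != 0:
        raise RuntimeError("command failed ({}): {}".format(p.returncode, output))
    return output


# Example that does not need FFmpeg: run the current Python interpreter.
print(run_command_args([sys.executable, "-c", "print('hello')"]))
```

With this variant, each FFmpeg option and value becomes its own list element, for example ["ffprobe", "-i", rec_path, ...], and no shell quoting is needed.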

Get the duration of a recording

Add the following code below the run_command() function.

def get_rec_duration(rec_name):
    rec_path = "./recordings/{}".format(rec_name)

    # Ask ffprobe for the duration only, printed as a bare number
    command = (
        "ffprobe -i {rec_path} -show_entries format=duration -v quiet "
        "-of csv=\"p=0\""
    ).format(rec_path=rec_path)

    rec_duration = run_command(command)
    print("rec duration", rec_duration)

    return rec_duration

Here, we created a function named get_rec_duration(), which retrieves the duration of a recording. It receives a recording name (rec_name) as its argument, prefixes it with the path of the recordings directory, and stores the result in the rec_path local variable.

The ffprobe program, which is part of FFmpeg, is used in a command string that gets the duration of the recording. We call the run_command() function with this command and store the returned value in rec_duration.

Finally, we print the recording duration we obtained and return it.

The recording duration is needed so that the video generated from the recording can be given the same duration.
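Note that get_rec_duration() returns the duration as text, exactly as ffprobe printed it. If the recording path is wrong, that text may be empty, and the problem only surfaces later in the FFmpeg command. A minimal sketch of a validation step is shown below; parse_duration is a hypothetical helper, not part of the original project.

```python
import math


def parse_duration(ffprobe_output):
    """Convert ffprobe's textual duration (e.g. '12.345000') to a float."""
    text = ffprobe_output.strip()
    if not text:
        raise ValueError("ffprobe returned no duration; check the recording path")
    try:
        duration = float(text)
    except ValueError:
        raise ValueError("unexpected ffprobe output: {!r}".format(text))
    if not math.isfinite(duration) or duration <= 0:
        raise ValueError("duration must be a positive number, got {!r}".format(text))
    return duration
```

For example, parse_duration(get_rec_duration("rec_1.mp3")) would fail early with a readable message instead of passing an empty value to the -t option.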

Convert audio to video

Add the following code below the get_rec_duration() function.

def turn_audio_to_video(rec_name, rec_duration):
    rec_path = "./recordings/{}".format(rec_name)
    bg_image_path = "./images/bg.png"
    video_name = "video_with_sound_waves.mp4"

    # Adjacent string literals are joined into a single command line
    command = (
        'ffmpeg -y -i {rec_path} -loop 1 -i {bg_image_path} -t {rec_duration} '
        '-filter_complex "[0:a]showwaves=s=1280x150:mode=cline:colors=00e5ff[fg]; '
        'drawbox=x=0:y=285:w=1280:h=150:color=black@0.8:t=fill[bg]; '
        '[bg][fg]overlay=format=auto:x=(W-w)/2:y=(H-h)/2" '
        '-map 0:a -c:v libx264 -preset fast -crf 18 -c:a aac '
        '-shortest ./videos/{video_name}'
    ).format(
        rec_path=rec_path,
        bg_image_path=bg_image_path,
        rec_duration=rec_duration,
        video_name=video_name,
    )

    print(video_name)
    run_command(command)
    return video_name

The turn_audio_to_video() function turns the recording into a video that shows the recording's sound waves. It takes the recording name (rec_name) and the recording duration (rec_duration) as arguments.

The FFmpeg command that generates the video from the audio uses the recording path (rec_path), the path of the background image (bg_image_path), and the output file name of the video (video_name).

Let's take a closer look at the FFmpeg command.

ffmpeg -y -i {rec_path} -loop 1 -i {bg_image_path} -t {rec_duration} \
-filter_complex "[0:a]showwaves=s=1280x150:mode=cline:colors=00e5ff[fg]; \
drawbox=x=0:y=285:w=1280:h=150:color=black@0.8:t=fill[bg]; \
[bg][fg]overlay=format=auto:x=(W-w)/2:y=(H-h)/2" \
-map 0:a -c:v libx264 -preset fast -crf 18 -c:a aac -shortest ./videos/{video_name}

The [-y](https://ffmpeg.org/ffmpeg-all.html#Main-options) option tells ffmpeg to overwrite the output file if it already exists on disk.

The [-i](https://ffmpeg.org/ffmpeg-all.html#Main-options) option specifies an input. In this case we have 2 input files: the recording file, whose path is in rec_path, and the image we use as a background, whose path is in bg_image_path.

The [-loop](https://ffmpeg.org/ffmpeg-all.html#loop) option repeats (loops) an input image to generate a video. Here, we loop the image passed in bg_image_path. The default value is 0 (no loop), so we set it to 1 (loop) to repeat this image in every video frame.

The [-t](https://ffmpeg.org/ffmpeg-all.html#Main-options) option specifies a duration, either in seconds or using the "hh:mm:ss[.xxx]" syntax. Here we use the recording duration (rec_duration) to set the duration of our output video.

The [-filter_complex](https://ffmpeg.org/ffmpeg.html#filter_005fcomplex_005foption) option lets us define a complex filtergraph, that is, one with an arbitrary number of inputs and/or outputs. This is the most involved option in the command and it takes several parameters, so let's go through them.

First, we use the showwaves filter to convert the voice recording, referenced as [0:a], into video output. The s parameter specifies the size of the output video, which we set to 1280x150. The mode parameter defines how the audio wave is drawn. The available values are point, line, p2p, and cline. The colors parameter specifies the color of the waveform. The waveform output is given the label [fg].

Next, we use the drawbox filter to draw a colored box on our background image to help the waveform stand out. The x and y parameters specify the coordinates of the top-left corner of the box, and w and h set its width and height. The color parameter sets the box color to black with 80% opacity. The t parameter sets the thickness of the box border. By setting it to fill, we get a solid box. The result is given the label [bg].

To complete the filter definition, we use overlay to place the waveform on top of the black box. The overlay filter is configured with format, which automatically selects the pixel format, and with x and y, which specify the coordinates of the overlay within the video frame. We use a little math to place it at the center of our video: in the overlay expressions, W and H are the width and height of the background, and w and h are those of the overlaid waveform.
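To make the relationship between the three filters easier to see, the filtergraph can be assembled from its parameters. The sketch below is only an illustration: the helper names are hypothetical, and it assumes a 1280x720 frame, which is where the drawbox value y=285 comes from, since (720 - 150) / 2 = 285 centers the 150-pixel-tall box vertically.

```python
def centered_offset(outer, inner):
    """Offset that centers `inner` inside `outer` (integer pixels)."""
    return (outer - inner) // 2


def build_waveform_filter(width=1280, height=720, wave_height=150,
                          mode="cline", wave_color="00e5ff",
                          box_color="black@0.8"):
    """Compose the showwaves / drawbox / overlay filtergraph string."""
    # Assumption: the background frame is `width` x `height` pixels.
    box_y = centered_offset(height, wave_height)
    parts = [
        "[0:a]showwaves=s={w}x{wh}:mode={mode}:colors={wc}[fg]".format(
            w=width, wh=wave_height, mode=mode, wc=wave_color),
        "drawbox=x=0:y={y}:w={w}:h={wh}:color={bc}:t=fill[bg]".format(
            y=box_y, w=width, wh=wave_height, bc=box_color),
        "[bg][fg]overlay=format=auto:x=(W-w)/2:y=(H-h)/2",
    ]
    return ";".join(parts)
```

With the default arguments, the returned string matches the filtergraph used in this tutorial, apart from the whitespace after each semicolon.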

The [-map](https://ffmpeg.org/ffmpeg.html#Stream-selection) option selects which input streams are included in or excluded from the output. Here, 0:a adds the audio stream of our first input (the recording) to the output video, while the video stream comes from the output of the filtergraph.

The [-c:v](https://ffmpeg.org/ffmpeg.html#Stream-specifiers-1) option selects the codec used to encode the video stream. We tell FFmpeg to use the libx264 encoder.

The [-preset](https://trac.ffmpeg.org/wiki/Encode/H.264) option selects one of a range of presets that trade encoding speed against compression ratio. We use fast here, but you are free to change the preset to a slower one (better compression) or a faster one (worse compression).

The [-crf](https://trac.ffmpeg.org/wiki/Encode/H.264) option stands for constant rate factor. Rate control decides how many bits are used for each frame, which determines the file size and the quality of the output video. A value of 18 is recommended for visually lossless quality.

The [-c:a](https://ffmpeg.org/ffmpeg.html#Stream-specifiers-1) option selects the codec used to encode the audio stream. We encode the audio with the AAC codec.

The [-shortest](https://ffmpeg.org/ffmpeg.html#toc-Advanced-options) option tells FFmpeg to stop writing the output when the shortest input stream ends.

The ./videos/{video_name} argument at the end of the command specifies the path of our output file.

If you're curious, here is what each of the FFmpeg waveform modes mentioned above does and what it looks like.

Point draws a point for each sample.

Point waveform mode

Line draws a vertical line for each sample.

Line waveform mode

P2p draws a point for each sample and a line between them.

P2p waveform mode

Cline draws a centered vertical line for each sample. This is the mode we use in this tutorial.

Cline waveform mode
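To compare the four modes on your own recording, one option is to generate one showwaves filter per mode and render a short preview for each. The following sketch only builds the filter strings and output names; the helper names and the "waves_" file name prefix are assumptions made for illustration.

```python
WAVEFORM_MODES = ("point", "line", "p2p", "cline")


def showwaves_filter(mode, size="1280x150", color="00e5ff"):
    """Build the showwaves part of the filtergraph for one drawing mode."""
    if mode not in WAVEFORM_MODES:
        raise ValueError("unknown showwaves mode: {!r}".format(mode))
    return "[0:a]showwaves=s={}:mode={}:colors={}[fg]".format(size, mode, color)


def preview_names(prefix="waves"):
    """Map each mode to a hypothetical output file name for its preview."""
    return {mode: "{}_{}.mp4".format(prefix, mode) for mode in WAVEFORM_MODES}
```

Each filter string can replace the showwaves part of the command in turn_audio_to_video(), with the matching name from preview_names() used as the output file.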

Add the following code below the turn_audio_to_video() function.

def main():
    rec_name = "rec_1.mp3"
    rec_duration = get_rec_duration(rec_name)
    turn_audio_to_video(rec_name, rec_duration)


main()

In this newly added code, we have a function named main(). In it, we store the recording name in a variable named rec_name. You should update this line to contain the name of your own recording file.

Next, we call the get_rec_duration() function to get the recording duration.

Then, we call the turn_audio_to_video() function with the recording name and duration to generate the video, which is saved as video_with_sound_waves.mp4.

Finally, we call the main() function to run the whole process. Remember to replace the value of the rec_name variable with the name of the recording you want to process.
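Instead of editing rec_name by hand, the recordings directory could be scanned for audio files. This is a minimal sketch under stated assumptions: list_recordings is a hypothetical helper, and the set of extensions is a guess at common audio formats rather than something the tutorial specifies.

```python
from pathlib import Path

# Assumption: these are the audio formats you keep in ./recordings
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg"}


def list_recordings(recordings_dir="./recordings"):
    """Return the sorted file names of the audio files in the directory."""
    directory = Path(recordings_dir)
    return sorted(
        p.name for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
```

Keep in mind that the functions in this tutorial write to fixed output file names, so processing several recordings in a loop would overwrite the previous video unless the output names are also changed.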

Go back to your terminal and run the following command to generate the video.

python main.py

Look in the videos directory for a file named video_with_sound_waves.mp4 and open it. You should see something similar to the following.

Waveform rendering

Add a video above the background

In this section, we will add a video of a spinning record to the bottom-left corner of the generated video. The video we want to add is stored in the videos directory in the file named spinningRecord.mp4.

Spinning record animation

Go back to your code editor, open the main.py file, and add the following code below the turn_audio_to_video() function.

def add_spinning_record(video_name, rec_duration):
    video_path = "./videos/{}".format(video_name)
    spinning_record_video_path = "./videos/spinningRecord.mp4"
    new_video_name = "video_with_spinning_record.mp4"

    # Adjacent string literals are joined into a single command line
    command = (
        'ffmpeg -y -i {video_path} -stream_loop -1 -i {spinning_record_video_path} '
        '-t {rec_duration} -filter_complex "[1:v]scale=w=200:h=200[fg]; '
        '[0:v] scale=w=1280:h=720[bg], [bg][fg]overlay=x=25:y=(H-225)" '
        '-c:v libx264 -preset fast -crf 18 -c:a copy '
        './videos/{new_video_name}'
    ).format(
        video_path=video_path,
        spinning_record_video_path=spinning_record_video_path,
        rec_duration=rec_duration,
        new_video_name=new_video_name,
    )

    print(new_video_name)
    run_command(command)
    return new_video_name

Here, we created a new function named add_spinning_record(). This function adds the spinningRecord.mp4 video to the video that shows the sound waves. Its arguments are the name of the previously generated video (video_name) and the recording duration (rec_duration).

This function also runs FFmpeg. Here are the details of the command.

ffmpeg -y -i {video_path} -stream_loop -1 -i {spinning_record_video_path} \
-t {rec_duration} -filter_complex "[1:v]scale=w=200:h=200[fg]; \
[0:v] scale=w=1280:h=720[bg], [bg][fg]overlay=x=25:y=(H-225)" \
-c:v libx264 -preset fast -crf 18 -c:a copy ./videos/{new_video_name}

The command above uses the following options.

The -y, -t, -c:v, -preset, and -crf options work the same way as in the FFmpeg command that generated the sound waves.

The -i option was also used before, but in this case we have 2 videos as input files: the video file generated in the previous step and the spinning record video file.

The [-stream_loop](https://ffmpeg.org/ffmpeg.html#Stream-copy) option sets the number of times an input stream is looped. A value of 0 disables looping, and -1 means loop infinitely. We set the spinning record video to loop infinitely. On its own, this would make FFmpeg encode the output video without end, but since we also specify the duration of the output video, FFmpeg stops encoding when the video reaches that duration.

The -filter_complex option also works as before, but here we have 2 videos as input: the video generated in the previous section, [0:v], and the spinning record video, [1:v].

The filtergraph first uses [scale](https://ffmpeg.org/ffmpeg-filters.html#scale) to resize the spinning record video to 200x200 and assigns it the label [fg]. Then we use the scale filter again to set the video created in the previous section to 1280x720 and assign it the label [bg]. Finally, we use the [overlay](https://ffmpeg.org/ffmpeg-filters.html#overlay) filter to place the spinning record video on top of the video created in the previous section at the coordinates x=25 and y=H-225 (H is the height of the background video).
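The value 225 in y=H-225 is the height of the spinning record (200) plus a 25-pixel margin, which matches the 25-pixel margin used for x. The arithmetic can be written as a small helper, shown here as an illustrative sketch with a hypothetical name.

```python
def bottom_left_position(frame_height, overlay_height, margin):
    """Top-left (x, y) that leaves `margin` pixels at the left and bottom."""
    x = margin
    y = frame_height - overlay_height - margin
    return x, y


# For the 1280x720 video and the 200x200 spinning record:
print(bottom_left_position(720, 200, 25))  # (25, 495), and 720 - 225 = 495
```

Changing the margin or the scaled size of the spinning record then only requires recomputing these two values.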

The -c:a option was also introduced in the previous section, but in this case we use the special value copy, which tells ffmpeg to copy the audio stream from the source video without re-encoding it.

The last part of the command, ./videos/{new_video_name}, sets the path of our output file.

Replace the code in the main() function with the following version, which adds a call to the add_spinning_record() function.

def main():
    rec_name = "rec_1.mp3"
    rec_duration = get_rec_duration(rec_name)
    video_with_sound_waves = turn_audio_to_video(rec_name, rec_duration)
    add_spinning_record(video_with_sound_waves, rec_duration)

Run the following command in your terminal to generate the video.

python main.py

Look in the videos directory for a file named video_with_spinning_record.mp4 and open it. You should see something similar to the following.

Waveform rendering with spinning record

Add text to video

In this section, we will add a title at the top of the video. Along the way, we will learn how to use FFmpeg to draw text and change its color, size, font, and position.

Go back to your code editor, open the main.py file, and add the following code below the add_spinning_record() function.

def add_text_to_video(video_name):
    video_path = "./videos/{}".format(video_name)
    new_video_name = "video_with_text.mp4"
    font_path = "./fonts/LeagueGothic-CondensedRegular.otf"

    # Adjacent string literals are joined into a single command line
    command = (
        'ffmpeg -y -i {video_path} -vf "drawtext=fontfile={font_path}:'
        "text='Turning your Twilio voice recordings into videos':fontcolor=black:"
        'fontsize=90:box=1:boxcolor=white@0.5:'
        'boxborderw=5:x=((W/2)-(tw/2)):y=100" '
        '-c:a copy ./videos/{new_video_name}'
    ).format(
        video_path=video_path,
        font_path=font_path,
        new_video_name=new_video_name
    )

    print(new_video_name)
    run_command(command)
    return new_video_name

In this code, we created a function named add_text_to_video(), which runs a new FFmpeg command to draw the text. Let's take a closer look at this FFmpeg command.

ffmpeg -y -i {video_path} -vf "drawtext=fontfile={font_path}: \
text='Turning your Twilio voice recordings into videos':fontcolor=black: \
fontsize=90:box=1:boxcolor=white@0.5:boxborderw=5:x=((W/2)-(tw/2)):y=100" \
-c:a copy ./videos/{new_video_name}

The -y and -c:a options are used exactly as before.

The -i option defines the input. This time there is only one input file: the video file generated in the previous section.

The [-vf](https://ffmpeg.org/ffmpeg.html#Stream-copy) option lets us create a simple filtergraph and use it to filter the stream. Here we use the [drawtext](https://ffmpeg.org/ffmpeg-filters.html#drawtext-1) filter to draw text at the top of the video with several parameters: fontfile is the font file used to draw the text; text defines the text to draw (feel free to change it to your liking); fontcolor sets the text color to black; fontsize sets the text size; box enables a box around the text; boxcolor sets the color of that box to white with 50% opacity; boxborderw sets the width of the box border; and x and y set the position of the text within the video. We use a little math to center the text horizontally (tw is the width of the rendered text).
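Since the title is the part you are most likely to change, the drawtext filter can be built from its parameters. The sketch below is an illustration with a hypothetical helper name. It writes the horizontal position as (W-tw)/2, which is algebraically the same as (W/2)-(tw/2), and it deliberately rejects characters that would need extra escaping instead of trying to escape them.

```python
def build_drawtext_filter(font_path, text, font_size=90, y=100,
                          font_color="black", box_color="white@0.5",
                          box_border=5):
    """Compose a drawtext filter string that centers `text` horizontally."""
    # Conservative assumption: refuse characters that would need escaping
    # in the filter string or the shell, rather than escaping them here.
    for ch in ("'", '"', ":", "\\", "%"):
        if ch in text:
            raise ValueError("unsupported character in title: {!r}".format(ch))
    return (
        "drawtext=fontfile={font}:text='{text}':fontcolor={fc}:"
        "fontsize={fs}:box=1:boxcolor={bc}:boxborderw={bw}:"
        "x=(W-tw)/2:y={y}"
    ).format(font=font_path, text=text, fc=font_color, fs=font_size,
             bc=box_color, bw=box_border, y=y)
```

The returned string is what goes inside the double quotes after -vf. A font path containing a colon, as on Windows drive letters, would also need escaping that this sketch does not handle.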

The final ./videos/{new_video_name} argument sets the output file, just as in the previous FFmpeg commands.

Replace the code in the main() function with the following version, which adds the title step.

def main():
    rec_name = "rec_1.mp3"
    rec_duration = get_rec_duration(rec_name)
    video_with_sound_waves = turn_audio_to_video(rec_name, rec_duration)
    video_with_spinning_record = add_spinning_record(video_with_sound_waves, rec_duration)
    video_with_text = add_text_to_video(video_with_spinning_record)

Go back to your terminal and run the following command to generate a video with a title.

python main.py

Look in the videos directory for a file named video_with_text.mp4 and open it. You should see something similar to the following.

Waveform rendering with spinning record and title

Conclusion

In this tutorial, we learned how to use some of the advanced options in FFmpeg to turn a voice recording into a video that can be shared on social media. I hope this encourages you to learn more about FFmpeg.

copyright notice
Author: Stone sword. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201301156211107.html
