Create a TikTok “Bagaikan Langit” Video with Python Scripts

4 min read · Jun 1, 2020

In the name of Allah, the Most Gracious, the Most Compassionate

Assalamualaikum Warohmatullahi Wabarokatuh…

Today, I will walk you through creating a TikTok “Bagaikan Langit” video using only Python and its packages. Here are the steps covered in this article:

Preparing and Installation
Creating Face Detector Scripts
Creating Zoom Function
Compile it with music

Preparing and Installation

I wrote these scripts on Windows, but I don’t expect any significant differences if you run them on another operating system.

I assume that you have already installed Python and created a virtual environment. First, install some Python packages in your environment using the following commands:

pip install opencv-contrib-python
pip install pygame
pip install numpy

I use opencv-contrib-python version 4.1.1, but I don’t expect this tutorial to have issues with newer versions. OpenCV is used to load the video streamed from the webcam and to detect faces using the OpenCV DNN module.
Pygame is used to play the music that creates the TikTok vibes.

Once you have installed them, let’s proceed to the next step.

Creating Face Detector Scripts

OpenCV already has a built-in face detector, packaged as a Caffe model for its DNN module.

You can download deploy-prototxt.txt and res10_300x300_ssd_iter_140000.caffemodel from here and place them in a folder named detector. Later, they will be used to load the face detector model with OpenCV.

We will create two scripts: the first, main.py, plays the TikTok video, and the second, function.py, contains our self-defined functions. Your file structure will look like this:

.
├── detector
│   ├── deploy-prototxt.txt
│   └── res10_300x300_ssd_iter_140000.caffemodel
├── function.py
└── main.py

Now, inside function.py, create a class with a method to detect faces. Here is the code:

Face Detection Code

Take a look at line 10 (the get_facebox function). This function processes an image into detection_result, which contains face-boxes and confidences. A face-box is an imaginary rectangle that bounds a face in the image, and the confidence measures how sure the model is that the predicted face-box really contains a face.

Going line by line: on line 15 we transform the input into a blob that matches the model’s input format, and on line 16 we run the prediction on the transformed input using our predefined self.face_detector. On line 18 we loop over the detections, because there may be more than one face. On line 20, if the prediction’s confidence exceeds the threshold, we extract the face-box’s corner points from the prediction.

The function returns a list of face-boxes along with their confidences.
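Since the embedded code is not shown here, the following is a minimal sketch of what such a detector class could look like, not the author's exact gist. Only get_facebox and self.face_detector come from the text above; the class name FaceDetector, the helper boxes_from_detections, the default threshold of 0.5 and the mean values passed to blobFromImage (values commonly used with this model) are assumptions.

```python
def boxes_from_detections(rows, width, height, threshold=0.5):
    """Keep detections above the threshold and scale their boxes to pixels.

    Each row is assumed to hold 7 values:
    [image_id, class_id, confidence, x1, y1, x2, y2],
    with the corner coordinates normalized to the 0..1 range.
    """
    results = []
    for row in rows:
        confidence = float(row[2])
        if confidence > threshold:
            box = (int(row[3] * width), int(row[4] * height),
                   int(row[5] * width), int(row[6] * height))
            results.append((box, confidence))
    return results


class FaceDetector:
    """Hypothetical wrapper around the OpenCV DNN face detector."""

    def __init__(self,
                 prototxt="detector/deploy-prototxt.txt",
                 model="detector/res10_300x300_ssd_iter_140000.caffemodel"):
        import cv2  # imported lazily so the helper above works without OpenCV
        self.face_detector = cv2.dnn.readNetFromCaffe(prototxt, model)

    def get_facebox(self, image, threshold=0.5):
        import cv2
        height, width = image.shape[:2]
        # Turn the frame into a 300x300 blob, the input size of this model.
        # The mean values below are an assumption, not taken from the article.
        blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,
                                     (300, 300), (104.0, 177.0, 123.0))
        self.face_detector.setInput(blob)
        detections = self.face_detector.forward()  # shape (1, 1, N, 7)
        return boxes_from_detections(detections[0, 0], width, height, threshold)
```

Keeping the filtering step in its own function makes it easy to check the threshold logic without a webcam or the model files.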

Creating Zoom Function

Alongside the face detection function, create another function to zoom in on the face located in the image. Add some lines to function.py:

Zoom Function

There are two functions in the script above. The first is PinZoom. The idea of PinZoom is to compare the position of the center of the face-box (which we call the facebox-centroid) with the center of the image (the image-centroid), then crop the image so that the facebox-centroid moves as close to the center as possible. After that, we resize the cropped image so its size stays constant.

The algorithm works like this:

A. Check the position of the facebox-centroid (red dot) and the image-centroid (green dot).

B. If the facebox-centroid is to the left of the image-centroid, crop x% off the right side of the image (x depends on the crop parameter passed to the function). Otherwise, if the facebox-centroid is to the right of the image-centroid, crop the left side of the image. Do the same for the top and bottom of the image.

C. After the image is cropped, resize it, along with the face-box, so that the cropped image has the same size as the image before cropping.

The PinZoom function returns the resized cropped image and the new face-box points.
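Steps A to C can be sketched as plain geometry. This is only an illustration under a few assumptions: the function name pin_zoom_geometry, the (x1, y1, x2, y2) box layout and the default crop of 0.1 are hypothetical, and a centroid that sits exactly on the image center is treated like the "right" or "bottom" case. It computes the crop window and the rescaled face-box; the image itself would then be sliced and resized as noted in the final comment.

```python
def pin_zoom_geometry(frame_size, box, crop=0.1):
    """Return the crop window and the face-box rescaled to the original size.

    frame_size is (width, height); box is (x1, y1, x2, y2) in pixels;
    crop is the fraction of each dimension that gets cut away.
    """
    width, height = frame_size
    x1, y1, x2, y2 = box
    dx = int(width * crop)
    dy = int(height * crop)

    # Steps A and B: compare the centroids and crop the opposite side.
    if (x1 + x2) / 2 < width / 2:
        left, right = 0, width - dx      # face on the left, crop the right
    else:
        left, right = dx, width          # face on the right, crop the left
    if (y1 + y2) / 2 < height / 2:
        top, bottom = 0, height - dy     # face in the upper half, crop the bottom
    else:
        top, bottom = dy, height         # face in the lower half, crop the top

    # Step C: shift the box into the window, then scale it back up.
    new_box = (round((x1 - left) * width / (right - left)),
               round((y1 - top) * height / (bottom - top)),
               round((x2 - left) * width / (right - left)),
               round((y2 - top) * height / (bottom - top)))
    return (left, top, right, bottom), new_box

# With OpenCV, the frame could then be cropped and resized like this:
#   cropped = frame[top:bottom, left:right]
#   zoomed = cv2.resize(cropped, (width, height))
```

For a 200x100 frame with a face in the upper-left corner and crop=0.25, the window keeps the left 150 and top 75 pixels, which pushes the face toward the center once the crop is resized.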

The second function is GradZoom. Here we only add a new parameter t, which controls how many times PinZoom is called in one loop. The value of t increases while a face is detected and resets to 1 when we lose the face.
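The role of t can be illustrated with a small sketch. It assumes a hypothetical zoom_step callable that takes a frame and a face-box and returns the zoomed frame and the new face-box (which is what PinZoom is described as doing); the name grad_zoom and its signature are illustrative, not the author's code.

```python
def grad_zoom(frame, box, t, zoom_step):
    """Apply zoom_step t times, feeding each result into the next call."""
    for _ in range(t):
        frame, box = zoom_step(frame, box)
    return frame, box


# Stand-in for PinZoom, used only to show the chaining: it counts the
# calls in place of a real frame and doubles the box coordinates.
def fake_step(frame, box):
    return frame + 1, tuple(2 * v for v in box)


frame, box = grad_zoom(0, (1, 2, 3, 4), t=3, zoom_step=fake_step)
print(frame, box)
```

Because every frame from the webcam starts unzoomed, a larger t means a deeper zoom for that frame, which is what produces the gradual zoom-in effect as t grows.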

Compile it with music

After all that struggle, here is the satisfying part: the finale.

I will only explain some lines; you can read the comment lines in the gist above for more details.

The background music starts on line 4, where we import the package, and continues in lines 30–33. Note that you must have a file named bagaikan_langit.wav.
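A minimal sketch of the music part, assuming Pygame's mixer module is what plays the file. The function name play_background_music is hypothetical, and the up-front check for the file is an addition meant to make the missing-file case obvious.

```python
import os


def play_background_music(path="bagaikan_langit.wav"):
    """Start playing the audio file; return False if the file is missing."""
    if not os.path.isfile(path):
        print(f"Music file not found: {path}")
        return False
    import pygame  # imported lazily, only once the file is known to exist
    pygame.mixer.init()
    pygame.mixer.music.load(path)
    pygame.mixer.music.play()  # playback continues in the background
    return True
```

Calling play_background_music() once before the webcam loop starts would let the song run while the frames are processed.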

Inside the while loop, we read the camera image. The zoom functions come into play in lines 45–60. Here we use the GradZoom function and check the size of the face-box: if it is still small enough (we use 60% of the image as the threshold), we zoom in again in the next iteration. If we don’t find any faces in the image, we reset the t parameter to 1.
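The update rule for t described above could be sketched as follows. The function name update_t is hypothetical, and so is the interpretation of the threshold: here the face-box counts as "small enough" when both its width and height are below 60% of the frame's width and height.

```python
def update_t(t, box, frame_size, max_ratio=0.6):
    """Return the value of t to use in the next loop iteration.

    box is (x1, y1, x2, y2) for the detected face, or None if no face
    was found; frame_size is (width, height).
    """
    if box is None:
        return 1                      # face lost: start zooming from scratch
    x1, y1, x2, y2 = box
    width, height = frame_size
    if (x2 - x1) < max_ratio * width and (y2 - y1) < max_ratio * height:
        return t + 1                  # face still small: zoom further next time
    return t                          # face large enough: hold the zoom level
```

In the main loop this would be called once per frame, after the zoomed face-box is known.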

That’s all, and voilà! You can now run the scripts and get the following result (please turn on the sound):

Example Results

End of Article

If you have any questions, feel free to ask me. :)

You can access the full code here

You can get to know me through my LinkedIn account.

Don’t forget to share this article and give it a clap. Thank you!