Building a Snapchat Lens Effect in Python

Snapchat, Instagram, and now Apple have all gotten in on real-time face effects.

In today’s post, we’ll build out a method to track and distort our face in real time, just like these apps do.

We’ll end up with something like this:

For those who’d like a video walkthrough, this entire post is also available on YouTube; you’ll find the videos embedded at the end of this page.

The Tools of Face Detection in Python

We’ll use two of the biggest, most exciting image processing libraries available for Python 3: Dlib and OpenCV.

Installing Dlib is easy enough, thanks to wheels being available for most platforms. Just a simple pip install dlib should be enough to get you up and running.

For OpenCV, however, installation is a bit more complicated. If you’re running on macOS, you can try this post to get OpenCV set up. Otherwise, you’ll need to figure out installation on your own platform.

Something like this might work for Ubuntu.
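$ pip3 install opencv-python

(This pulls the unofficial opencv-python wheel from PyPI; on recent Ubuntu releases, sudo apt-get install python3-opencv may also work, depending on your release.)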

For Windows users, you may want to try your luck with this unofficial wheel.

Once you’ve gotten OpenCV installed, you should be set for the rest of this lesson.

Architecture of Lens Effects

We’ll use OpenCV to get a raw video stream from the webcam. We’ll then resize this raw stream, using the imutils resize function, so we get a decent frame rate for face detection.

Once we’ve got a decent frame rate, we’ll convert our webcam frame to grayscale, then pass it to Dlib for face detection.

Dlib’s get_frontal_face_detector returns a set of bounding rectangles, one for each face detected in an image. With those, we can then use a model (in this case, the shape_predictor_68_face_landmarks model on Github) to get back a set of 68 points describing our face’s orientation.
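As a rough sketch of that detection step (the full program below does the same thing inside its main loop; the file names here are just placeholders), it looks something like this:

import cv2
import dlib
from imutils import face_utils

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

frame = cv2.imread("face.jpg")  # any BGR image with a face in it
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

for rect in detector(gray, 0):
    shape = predictor(gray, rect)           # 68 landmark points
    shape = face_utils.shape_to_np(shape)   # as a (68, 2) numpy array
    leftEye = shape[36:42]                  # left eye landmarks
    rightEye = shape[42:48]                 # right eye landmarks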

From the points that match the eyes, we can create a polygon matching their shape in a new channel.

With this, we can do a bitwise_and, and copy just our eyes from the frame.

We then create an object to track the last n positions our eyes have been in. OpenCV’s boundingRect function gives us a base x and y coordinate to draw from.
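A minimal sketch of that masking step, reusing frame, leftEye, and rightEye from the snippet above:

import numpy as np

eyemask = np.zeros(frame.shape[:2], dtype='uint8')        # single channel, all black
cv2.fillPoly(eyemask, [leftEye], 255)                     # white polygons where the eyes are
cv2.fillPoly(eyemask, [rightEye], 255)

eyelayer = cv2.bitwise_and(frame, frame, mask=eyemask)    # just the eyes, black everywhere else
x, y, w, h = cv2.boundingRect(eyemask)                    # base coordinate for pasting later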

Finally, we create a mask to build up all the previous places our eyes have been, and then, once more, use bitwise_and to copy our previous eye image into the frame before showing it.
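The paste itself is just two operations, sketched here with placeholder names (eye_img being a translated copy of the eye layer and eye_mask its single channel mask):

frame = cv2.bitwise_and(frame, frame, mask=255 - eye_mask)  # black out where the eyes will land
frame += eye_img                                            # add the eye pixels into the hole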

Writing the Code

With our concepts laid out, writing our actual eye detection and manipulation is straightforward.

import argparse

import cv2
from imutils.video import VideoStream
from imutils import face_utils, translate, resize

import time
import dlib

import numpy as np

parser = argparse.ArgumentParser()
parser.add_argument("-predictor", required=True, help="path to predictor")
args = parser.parse_args()

print("starting program.")
print("'s' starts drawing eyes.")
print("'r' to toggle recording image, and 'q' to quit")

vs = VideoStream().start()
time.sleep(1.5)

# this detects our face
detector = dlib.get_frontal_face_detector()
# and this predicts our face's orientation
predictor = dlib.shape_predictor(args.predictor)

recording = False
counter = 0

class EyeList(object):
    def __init__(self, length):
        self.length = length
        self.eyes = []

    def push(self, newcoords):
        if len(self.eyes) < self.length:
            self.eyes.append(newcoords)
        else:
            self.eyes.pop(0)
            self.eyes.append(newcoords)
    
    def clear(self):
        self.eyes = []

# start with 10 previous eye positions
eyelist = EyeList(10)
eyeSnake = False

# get our first frame outside of the loop, so we can see how our
# webcam resized itself, and its resolution with np.shape
frame = vs.read()
frame = resize(frame, width=800)

eyelayer = np.zeros(frame.shape, dtype='uint8')
eyemask = eyelayer.copy()
eyemask = cv2.cvtColor(eyemask, cv2.COLOR_BGR2GRAY)
translated = np.zeros(frame.shape, dtype='uint8')
translated_mask = eyemask.copy()

while True:
    # read a frame from webcam, resize to be smaller
    frame = vs.read()
    frame = resize(frame, width=800)

    # fill our masks and frames with 0 (black) on every draw loop
    eyelayer.fill(0)
    eyemask.fill(0)
    translated.fill(0)
    translated_mask.fill(0)

    # the detector and predictor expect a grayscale image
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 0)

    # if we're running the eyesnake loop (press 's' while running to enable)
    if eyeSnake:
        for rect in rects:
            # the predictor is our 68 point model we loaded
            shape = predictor(gray, rect)
            shape = face_utils.shape_to_np(shape)

            # our dlib model returns 68 points that make up a face.
            # points 36 through 41 make up the left eye, and points
            # 42 through 47 the right eye (zero-indexed).
            leftEye = shape[36:42]
            rightEye = shape[42:48]

            # fill our mask in the shape of our eyes
            cv2.fillPoly(eyemask, [leftEye], 255)
            cv2.fillPoly(eyemask, [rightEye], 255)

            # copy the image from the frame onto the eyelayer using that mask
            eyelayer = cv2.bitwise_and(frame, frame, mask=eyemask)

            # we use this to get an x and y coordinate for the pasting of eyes
            x, y, w, h = cv2.boundingRect(eyemask)

            # push this onto our list
            eyelist.push([x, y])

            # finally, draw our eyes, in reverse order
            for i in reversed(eyelist.eyes):
                # first, translate the eyelayer with just the eyes
                translated1 = translate(eyelayer, i[0] - x, i[1] - y)
                # next, translate its mask
                translated1_mask = translate(eyemask, i[0] - x, i[1] - y)
                # add it to the existing translated eyes mask (not actual add because of
                # risk of overflow)
                translated_mask = np.maximum(translated_mask, translated1_mask)
                # cut out the new translated mask
                translated = cv2.bitwise_and(translated, translated, mask=255 - translated1_mask)
                # paste in the newly translated eye position
                translated += translated1
        # again, cut out the translated mask
        frame = cv2.bitwise_and(frame, frame, mask=255 - translated_mask)
        # and paste in the translated eye image
        frame += translated

    # display the current frame, and check to see if user pressed a key
    cv2.imshow("eye glitch", frame)
    key = cv2.waitKey(1) & 0xFF

    if recording:
        # save frames into an existing directory called "image_seq", so we can
        # later turn the image sequence into a gif or movie with ffmpeg
        cv2.imwrite("image_seq/%05d.png" % counter, frame)
        counter += 1

    if key == ord("q"):
        break

    if key == ord("s"):
        eyeSnake = not eyeSnake
        eyelist.clear()

    if key == ord("r"):
        recording = not recording

cv2.destroyAllWindows()
vs.stop()

Running the Code

To run this code, we’ll need to download the dlib 68 point predictor and extract it into the directory where we’ve got our Python program saved.
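If you have wget and bzip2 handy, something like this should fetch and extract it (the file lives alongside the rest of dlib’s model downloads):

$ wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
$ bunzip2 shape_predictor_68_face_landmarks.dat.bz2

From there we can just do a: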

$ python3 eye-glitch.py -predictor shape_predictor_68_face_landmarks.dat

And we should get our frame running. From there, pressing ‘s’ in our frame toggles our eye-snake effect, and ‘r’ allows us to record the frames to disk, for saving as a movie later. If you want to do that, you’ll need to first create a directory called image_seq in the same directory as your Python program.
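Once you’ve recorded some frames, a command along these lines (assuming ffmpeg is installed; the frame rate and output name are just examples) will stitch the numbered images into a movie:

$ ffmpeg -framerate 30 -i image_seq/%05d.png -pix_fmt yuv420p eye-glitch.mp4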

Video Walkthrough / Github Code

As usual, the code is available on Github.

You can also view a walkthrough of building the code, step by step, in the following videos:

And Part 2:

Where to Go From Here

If you enjoyed this post, and would like to see more creative programming posts, I recommend subscribing to my newsletter. I’d also appreciate you sharing this post on your social media.

Finally, if you’re interested in learning software development, or you know somebody who is, I’ve written a book called Make Art with Python, and it will be available for purchase here soon.

For now, you can sign up as a user on this site, and get access to the first three chapters, along with a video walkthrough for each chapter, just like on this page.
