How to write your own face recognition system using Python and facenet

ferumflex (27)in #technology • 6 years ago (edited)

The first time I saw how Apple's Face ID worked, I thought it must be hard to implement. In general, yes: if you write everything from scratch, it is a difficult problem. Nowadays, though, we have plenty of tools that let us build such a system much faster. Just use good libraries.

One such library is facenet. There are others, but we will not discuss them in this text. If you are interested, just google them or PM me.

To build this program we will use Python 3.6, tensorflow, opencv, facenet and a little bit of magic. So first of all you need Python 3.6 installed on your computer (it should work with Python 3.7, but I did not test it). If you do not have Python 3.6 installed, please follow the instructions on the official Python site.

1. Here we go. First of all you need to create a virtual environment for Python and activate it (assuming you've created a folder for your new project and you are already in it):

python3 -m venv env

source env/bin/activate

2. After that install all requirements with this command:

pip3 install facenet opencv-python

That's all we need.

3. Next we need to download one of the pre-trained models from facenet (section "Pre-trained models"). I cannot say which model is better; both work fine, so you can use whichever you want.

Let's assume you selected model 20180402-114759. Just download the zip file and unzip it into your project folder.
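Before moving on, it can help to confirm that the archive landed where the script expects it. The check below is a minimal sketch and not part of facenet: `find_model_files` is a hypothetical helper, and it assumes the unzipped folder holds either a frozen graph (`.pb`) or a checkpoint description (`.meta`), which is what I understand the model loader looks for.

```python
import os


def find_model_files(model_dir):
    """Return the .pb and .meta files found directly inside model_dir.

    Hypothetical helper for a quick sanity check; the two extensions are
    an assumption about how the pre-trained archive is laid out.
    """
    if not os.path.isdir(model_dir):
        raise FileNotFoundError('Model folder not found: ' + model_dir)
    names = sorted(os.listdir(model_dir))
    return [name for name in names if name.endswith(('.pb', '.meta'))]


if __name__ == '__main__':
    try:
        print(find_model_files('20180402-114759'))
    except FileNotFoundError as error:
        print(error)
```

If the list comes back empty, the zip was probably extracted one level too deep or into a different folder.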

4. Take a photo of your face and place it in a folder named images. The extension should be .jpg.
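The loading code further down derives each person's label from the photo's file name, so name the photos after the people in them. A small sketch of that mapping (`label_from_path` and the sample paths are made up for illustration):

```python
import os


def label_from_path(path):
    """Turn 'images/alice.jpg' into the label 'alice'."""
    name, _extension = os.path.splitext(os.path.basename(path))
    return name


# Sample paths are illustrative only.
for path in ['images/alice.jpg', 'images/bob.JPG']:
    print(path, '->', label_from_path(path))
```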

5. In this step we are going to start writing the script. I will explain in general what each block of code does, and after the last block we will combine it all together.

Import things that we need, and init some constants:

import os
import fnmatch
import re

import numpy as np
import cv2
from facenet.src.align import detect_face
from facenet.src import facenet
import tensorflow as tf

MINSIZE = 20
THRESHOLD = [0.6, 0.7, 0.7]
FACTOR = 0.709
MARGIN = 44
SCALE = 0.25
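A word on these constants: MINSIZE and FACTOR control the image pyramid that the face detector searches. MINSIZE is the smallest face, in pixels, it will look for, and FACTOR is how much the image shrinks between pyramid levels. THRESHOLD holds one score cut-off for each of the detector's three stages. The sketch below reproduces the scale computation as I understand it from the MTCNN approach; `pyramid_scales` is a hypothetical helper, not a facenet function, and the 12-pixel window is an assumption taken from the detector's first-stage network.

```python
def pyramid_scales(height, width, minsize=20, factor=0.709, window=12):
    """List the scales at which the detector would resample an image.

    Hypothetical helper: mirrors the usual MTCNN pyramid logic, where the
    first-stage network looks at window x window pixel patches.
    """
    scale = window / minsize  # maps a minsize-pixel face onto the window
    shortest = min(height, width) * scale
    scales = []
    while shortest >= window:
        scales.append(scale)
        scale *= factor
        shortest *= factor
    return scales


# Example only: a 160x120 frame, e.g. a 640x480 image after SCALE = 0.25.
for s in pyramid_scales(120, 160):
    print(round(s, 3))
```

A smaller MINSIZE adds pyramid levels, so the detector finds smaller faces but does more work per frame.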

Init facenet model:

# init model
sess = tf.Session()
with sess.as_default():
    facenet.load_model('20180402-114759')

    images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
    embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
    phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
    embedding_size = embeddings.get_shape()
    input_image_size = images_placeholder.get_shape()[1]
pnet, rnet, onet = detect_face.create_mtcnn(sess, None)


def get_embedding(resized):
    reshaped = resized.reshape(-1, input_image_size, input_image_size, 3)
    feed_dict = {
        images_placeholder: reshaped,
        phase_train_placeholder: False,
    }
    embedding = sess.run(embeddings, feed_dict=feed_dict)
    return embedding


def prewhiten(x):
    mean = np.mean(x)
    std = np.std(x)
    std_adj = np.maximum(std, 1.0/np.sqrt(x.size))
    y = np.multiply(np.subtract(x, mean), 1/std_adj)
    return y

Here 20180402-114759 is the name of the folder where we extracted the pre-trained model.
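A quick check of what prewhiten does: it shifts the pixel values to zero mean and scales them to unit standard deviation before they go into the network. The sketch below repeats the function so it runs on its own, and feeds it random data; the random 160x160 array is a stand-in, not a real face crop, and the size is just an example.

```python
import numpy as np


def prewhiten(x):
    """Normalize an image to zero mean and unit standard deviation."""
    mean = np.mean(x)
    std = np.std(x)
    # Lower bound on std avoids dividing by ~0 on a flat image.
    std_adj = np.maximum(std, 1.0 / np.sqrt(x.size))
    return np.multiply(np.subtract(x, mean), 1 / std_adj)


rng = np.random.RandomState(0)
fake_face = rng.randint(0, 256, size=(160, 160, 3)).astype(np.float64)
whitened = prewhiten(fake_face)
print('mean:', whitened.mean(), 'std:', whitened.std())
```

The mean should print as a value very close to 0 and the standard deviation very close to 1, whatever the brightness of the input.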

Load the images from the folder, find the face in each one (there should be only one face per image) and create an embedding for each face:

def findfiles(which, where='.'):
    rule = re.compile(fnmatch.translate(which), re.IGNORECASE)
    return [os.path.join(where, name) for name in os.listdir(where) if rule.match(name)]


# load images
EMBEDDINGS = {}

for filename in findfiles('*.jpg', 'images'):
    img = cv2.imread(filename)
    # OpenCV loads BGR; convert to RGB so these embeddings match the video loop
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    bounding_boxes, _ = detect_face.detect_face(img, MINSIZE, pnet, rnet, onet, THRESHOLD, FACTOR)
    if bounding_boxes.any():
        assert bounding_boxes.shape[0] == 1, 'Found too many faces in the image'
        box = bounding_boxes[0]
        x, y, x2, y2, accuracy = box
        if accuracy > 0.7:
            cropped = img[int(y):int(y2), int(x):int(x2), :]
            resized = cv2.resize(cropped, (input_image_size, input_image_size), interpolation=cv2.INTER_CUBIC)
            prewhitened = prewhiten(resized)
            # the file name without extension becomes the person's name
            name, _ = os.path.splitext(os.path.basename(filename))
            EMBEDDINGS[name] = get_embedding(prewhitened)
    else:
        raise Exception('Cannot find a face in the image')
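The loop above crops the image with the detector's raw coordinates. Detectors can return boxes that poke slightly outside the frame, and a negative index would silently slice from the wrong end of the array. The sketch below pads the box by a margin (the MARGIN constant is defined earlier but not used in these blocks) and clamps it to the image. `expand_and_clamp` is a hypothetical helper of mine, not a facenet function, and the sample box is made up.

```python
def expand_and_clamp(box, margin, width, height):
    """Pad a box by margin/2 on each side, then clamp it to the image.

    box is (x1, y1, x2, y2) in pixels; the returned integers are safe to
    use as img[top:bottom, left:right].
    """
    x1, y1, x2, y2 = box
    half = margin / 2
    left = int(max(x1 - half, 0))
    top = int(max(y1 - half, 0))
    right = int(min(x2 + half, width))
    bottom = int(min(y2 + half, height))
    return left, top, right, bottom


# Illustrative values: a box partly outside a 640x480 image.
print(expand_and_clamp((-5.2, 100.0, 120.7, 260.3), 44, 640, 480))
```

With a helper like this, the crop would become `img[top:bottom, left:right, :]` using the returned values instead of the raw `x`, `y`, `x2`, `y2`.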

Init opencv, detect faces using facenet and search in our database:

# init video
cap = cv2.VideoCapture(0)
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        # stop if the camera did not return a frame
        break
    # Resize frame of video to 1/4 size for faster face recognition processing
    img = cv2.resize(frame, (0, 0), fx=SCALE, fy=SCALE)
    # Convert the image from BGR color (which OpenCV uses) to RGB color (which the model expects)
    img = img[:, :, ::-1]

    # detect bounding boxes
    bounding_boxes, list_points = detect_face.detect_face(img, MINSIZE, pnet, rnet, onet, THRESHOLD, FACTOR)
    if bounding_boxes.any():
        for index, face in enumerate(bounding_boxes):
            x, y, x2, y2, accuracy = face
            if accuracy > 0.5:
                cropped = img[int(y):int(y2), int(x):int(x2), :]
                resized = cv2.resize(cropped, (input_image_size, input_image_size), interpolation=cv2.INTER_CUBIC)

                prewhitened = prewhiten(resized)
                guest_embedding = get_embedding(prewhitened)

                # try to find guest face in our database
                min_distance = None
                min_name = None
                for name, embedding in EMBEDDINGS.items():
                    distance = facenet.distance(guest_embedding, embedding, 0)
                    if min_distance is None or min_distance > distance:
                        min_name = name
                        min_distance = distance

                # if we found face in database and distance is not too big
                if min_distance is not None and min_distance < 1.1:
                    font = cv2.FONT_HERSHEY_SIMPLEX
                    # map coordinates from the downscaled image back to the full frame
                    x_text = x / SCALE
                    y_text = y / SCALE - 10
                    point = (int(x_text), int(y_text))
                    cv2.putText(frame, min_name, point, font, 1, (255, 255, 255), 2, cv2.LINE_AA)

                # show rectangle for face at image
                point = (int(x/SCALE), int(y/SCALE))
                point2 = (int(x2/SCALE), int(y2/SCALE))
                cv2.rectangle(frame, point, point2, (0, 255, 0), 2)

    # Display the resulting frame
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
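The matching step inside the loop is a nearest-neighbour search over the stored embeddings with a distance cut-off. Here it is pulled out as a standalone sketch, with toy 3-dimensional vectors instead of real embeddings. `find_closest` is a hypothetical helper; the squared Euclidean distance is my understanding of what `facenet.distance(..., 0)` computes, and 1.1 is the cut-off used above.

```python
import numpy as np


def find_closest(guest, database, max_distance=1.1):
    """Return (name, distance) of the nearest stored embedding.

    Returns (None, None) when the database is empty or the best match is
    not closer than max_distance.
    """
    best_name, best_distance = None, None
    for name, embedding in database.items():
        # Squared Euclidean distance between the two vectors.
        distance = float(np.sum(np.square(guest - embedding)))
        if best_distance is None or distance < best_distance:
            best_name, best_distance = name, distance
    if best_distance is None or best_distance >= max_distance:
        return None, None
    return best_name, best_distance


# Toy unit-length vectors standing in for real embeddings.
database = {
    'alice': np.array([1.0, 0.0, 0.0]),
    'bob': np.array([0.0, 1.0, 0.0]),
}
print(find_closest(np.array([0.9, 0.1, 0.0]), database))
print(find_closest(np.array([0.0, 0.0, -1.0]), database))
```

Keeping the cut-off as a parameter makes it easy to tune: lower values give fewer false matches but leave more faces unlabeled.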