Two main modes for face recognition:
- Face Verification (or authentication): a one-to-one mapping of a given face against a known identity.
- Face Identification (or recognition): a one-to-many mapping for a given face against a database of known faces.
Applications:
- Restrict access to a resource to one person, called face authentication.
- Confirm that the person matches their ID, called face verification.
- Assign a name to a face, called face identification.
Traditional face recognition involves the four steps below; a system may also combine some or all of them into a single process:
- Face Detection: locate one or more faces in the image and mark with a bounding box.
- Face Alignment: normalize the face's geometry and photometrics so that it is consistent with the database.
- Feature Extraction: extract features from the face that can be used for the recognition task.
- Face Recognition: perform matching of the face against one or more known faces in a prepared database.
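As an illustration of how the detection step hands off to the later steps, here is a minimal sketch of cropping a detected face out of an image. The function name `crop_face`, the `(x, y, w, h)` box layout, and the default margin are assumptions made for this example; a real detector may report boxes in a different format.

```python
import numpy as np

def crop_face(image, box, margin=10):
    """Crop a bounding box (x, y, w, h) from an H x W x C image.

    The box is padded by margin // 2 pixels on every side and clamped
    to the image bounds so the slice never wraps around.
    """
    x, y, w, h = box
    half = margin // 2
    height, width = image.shape[:2]
    top, left = max(y - half, 0), max(x - half, 0)
    bottom, right = min(y + h + half, height), min(x + w + half, width)
    return image[top:bottom, left:right]

# Usage with a blank 100 x 100 RGB image and a box near the top-left corner.
image = np.zeros((100, 100, 3))
face = crop_face(image, (2, 3, 20, 30), margin=10)
print(face.shape)
```

Clamping matters because a negative start index in a NumPy slice counts from the end of the array, which would silently return the wrong region.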
Experiment with FaceNet
FaceNet directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity:
faces of the same person have small distances and faces of distinct people have large distances.
FaceNet combines some of these steps into a single process.
Once the embedding has been produced (FaceNet embeddings serve as feature vectors), face verification reduces to thresholding the distance between two embeddings, and recognition becomes a k-NN classification problem.
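A minimal sketch of both tasks on top of precomputed embeddings, using only NumPy. The function names, the threshold of 1.0, and k=3 are assumptions for illustration; a usable threshold has to be tuned on validation data for the specific model.

```python
import numpy as np

def euclidean_distance(a, b):
    """L2 distance between two embedding vectors."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def verify(emb_a, emb_b, threshold=1.0):
    """One-to-one check: treat the faces as the same person when the
    distance falls below the threshold (placeholder value)."""
    return euclidean_distance(emb_a, emb_b) < threshold

def identify(query_emb, gallery_embs, gallery_labels, k=3):
    """One-to-many lookup: majority vote among the k nearest gallery embeddings."""
    gallery_embs = np.asarray(gallery_embs)
    dists = np.linalg.norm(gallery_embs - np.asarray(query_emb), axis=1)
    nearest = np.argsort(dists)[:k]
    votes = [gallery_labels[i] for i in nearest]
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]

# Toy 2-D "embeddings" standing in for real 128-D vectors.
gallery = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels = ["a", "a", "a", "b", "b", "b"]
print(verify(gallery[0], gallery[1]))
print(identify([4.9, 5.0], gallery, labels))
```

The demo further down uses a linear SVM instead of k-NN for the identification step; both operate on the same embeddings.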
FaceNet directly trains its output to be a compact 128-D embedding (128 bytes per face) using a triplet-based loss function. Each triplet consists of two matching face thumbnails and one non-matching face thumbnail, and the loss aims to separate the positive pair from the negative by a distance margin. The thumbnails are tight crops of the face area; no 2D or 3D alignment is applied, only scaling and translation.
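The triplet idea can be sketched in a few lines of NumPy. This is an illustrative version, not FaceNet's training code: it assumes squared Euclidean distances, averages over the batch, and uses a margin of 0.2 as a placeholder default.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Batch mean of max(||a - p||^2 - ||a - n||^2 + margin, 0).

    anchor and positive are embeddings of the same person,
    negative is an embedding of a different person.
    """
    anchor, positive, negative = (np.asarray(v, dtype=float)
                                  for v in (anchor, positive, negative))
    pos_dist = np.sum((anchor - positive) ** 2, axis=-1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=-1)
    return float(np.mean(np.maximum(pos_dist - neg_dist + margin, 0.0)))

# A well-separated triplet contributes no loss; a collapsed one costs the margin.
print(triplet_loss([[0.0, 0.0]], [[0.0, 0.0]], [[1.0, 0.0]]))
print(triplet_loss([[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 0.0]]))
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only triplets that violate the margin drive training.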
Demo
import os

import cv2
import numpy as np
from keras.models import load_model
from mtcnn import MTCNN
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

image_dir_basepath = '5-celebrity-faces-dataset/'
names = ['ben_afflek', 'elton_john', 'jerry_seinfeld', 'madonna', 'mindy_kaling']
FACENET_SIZE = (160, 160)

model_path = 'facenet_keras.h5'
model = load_model(model_path)
detector = MTCNN()


def l2_normalize(x, axis=-1, epsilon=1e-10):
    # Scale each embedding to unit length; epsilon guards against division by zero.
    output = x / np.sqrt(np.maximum(np.sum(np.square(x), axis=axis, keepdims=True), epsilon))
    return output


def load_and_align_images(filepaths, margin):
    aligned_images = []
    for filepath in filepaths:
        # OpenCV loads BGR; MTCNN expects RGB.
        img = cv2.imread(filepath)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        # Use the first face the detector reports.
        faces = detector.detect_faces(img)
        x, y, w, h = faces[0]['box']
        x, y = abs(x), abs(y)
        # Clamp the top-left corner so the margin cannot produce a negative index.
        top, left = max(y - margin // 2, 0), max(x - margin // 2, 0)
        face = img[top : y + h + margin // 2, left : x + w + margin // 2, :]
        aligned = cv2.resize(face, FACENET_SIZE) / 255.0
        aligned_images.append(aligned)
    return np.array(aligned_images)


def calc_embs(filepaths, margin=10, batch_size=1):
    # Crop the faces, run them through FaceNet in batches, and normalize the embeddings.
    aligned_images = load_and_align_images(filepaths, margin)
    batches = []
    for start in range(0, len(aligned_images), batch_size):
        batches.append(model.predict_on_batch(aligned_images[start:start + batch_size]))
    embs = l2_normalize(np.concatenate(batches))
    return embs


def train(dir_basepath, names, max_num_img=10):
    # Embed up to max_num_img images per person and fit a linear SVM on the embeddings.
    labels = []
    embs = []
    for name in names:
        dirpath = os.path.abspath(dir_basepath + name)
        filepaths = [os.path.join(dirpath, f) for f in os.listdir(dirpath)][:max_num_img]
        embs_ = calc_embs(filepaths)
        labels.extend([name] * len(embs_))
        embs.append(embs_)
    embs = np.concatenate(embs)
    le = LabelEncoder().fit(labels)
    y = le.transform(labels)
    clf = SVC(kernel='linear', probability=True).fit(embs, y)
    return le, clf


def infer(le, clf, filepaths):
    # Predict a name for each image by classifying its embedding.
    embs = calc_embs(filepaths)
    pred = le.inverse_transform(clf.predict(embs))
    return pred


le, clf = train(image_dir_basepath + 'train/', names)

test_dirpath = image_dir_basepath + 'val/'
test_filepaths = []
for name in names:
    for f in os.listdir(test_dirpath + name):
        test_filepaths.append(test_dirpath + name + '/' + f)

pred = infer(le, clf, test_filepaths)
print(test_filepaths)
for i in range(len(pred)):
    print(test_filepaths[i])
    print(pred[i])
    print('---------')