AI face detection & recognition with Datamaton products

Wouldn’t it be nice if you could quickly find photos and videos of friends & family scattered across 10+ years of local backups, email attachments & cloud storage and buried in ZIP, PST or MBOX files? Or create virtual folders to view and back up precious baby photos & videos from everywhere? Now you can, with privacy-preserving face detection and recognition from Datamaton Knit.

Datamaton Knit supports Artificial Intelligence (AI) based face detection and recognition in photos and videos. The high level objective is to allow you to search, organize and manage your photos & videos based on the faces they contain. When you associate a face with that person’s name, Knit will automatically add that name tag to all photos and videos that have this person’s face.

Knit’s face detection capability is similar to what cloud photo storage providers like Google Photos provide, but with a few crucial differences:

a) It is 100% private: Knit does all the work on your computer and saves face information to encrypted files only on your computer. Only you have the password, and nothing about your photos or videos ever touches our servers. Knit does not ship off your photos or videos to any cloud-based face detection provider.
b) It is universal: Knit’s solution works for photos & videos across your local hard disks, web based email, calendar and cloud storage accounts too. This is a huge benefit when you have hundreds of gigabytes of locally stored photos & videos, as it is expensive and time-consuming to upload so much to the cloud for face detection.
c) It finds photos & videos others can’t: Knit’s solution works with deeply nested photos & videos embedded inside ZIP, PST, MBOX etc. files and as email attachments. Knit will detect faces in a photo file inside a ZIP file which is an email attachment for an email that’s inside an MBOX file that’s inside a ZIP file on your hard drive that you downloaded from Google Takeout.
d) You can organize and back up based on faces too, not just search: Knit’s solution allows you to create Virtual Folders based on face names. This provides a unified view of photos & videos that contain specific faces scattered over different storage locations. It also lets you back up such a virtual folder, or just specific files gathered from different sources based on the faces they contain.
e) The search is more sophisticated: This is not just because Knit’s unified search can search all your storage locations at the same time. It is also because Knit allows you to specify a search that includes person A and B but not C or D.
Unified photo & video search based on faces it contains.
Plus, you can combine face search parameters with other parameters like date ranges, camera used etc. (using the “More Conditions” button in the image above).
f) It supports all major formats and raw images: Knit natively supports all major photo formats like JPG, PNG, BMP, TIF, SVG etc. It also supports raw image formats from major camera manufacturers like Sony, Canon, Nikon, Fuji, Kodak, Olympus etc. and vendor specific photo formats like Apple HEIC/PIC, Adobe Illustrator EPS/CR2 etc. Similarly, it natively supports all major video file formats like WMV, MPG, MP4, AVI, MOV, Apple QuickTime, Matroska MKV, Macromedia Shockwave, 3GP, ASF, WEBM etc.
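To make the search described in point e) concrete, here is a minimal sketch of an "A and B but not C or D" filter combined with an optional date range. It is not Knit's implementation: the MediaItem class, the matches function, the sample paths and the names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MediaItem:
    """Hypothetical index entry: one photo or video plus its face name tags."""
    path: str
    taken: date
    faces: set[str] = field(default_factory=set)


def matches(item, include, exclude, start=None, end=None):
    """Return True if the item has every name in `include`, no name in
    `exclude`, and (optionally) a date inside [start, end]."""
    if not include <= item.faces:      # every required person must be present
        return False
    if exclude & item.faces:           # no excluded person may be present
        return False
    if start is not None and item.taken < start:
        return False
    if end is not None and item.taken > end:
        return False
    return True


# Illustrative entries from three different kinds of storage location.
library = [
    MediaItem("C:/Photos/beach.jpg", date(2019, 7, 4), {"Ana", "Ben"}),
    MediaItem("takeout.zip/mail.mbox/party.zip/cake.png", date(2021, 3, 9),
              {"Ana", "Ben", "Cy"}),
    MediaItem("cloud/hike.mp4", date(2022, 5, 1), {"Ana"}),
]

# "Ana and Ben, but not Cy or Dee"
hits = [m.path for m in library if matches(m, {"Ana", "Ben"}, {"Cy", "Dee"})]
print(hits)
```

Only the beach photo matches here: the cake photo contains an excluded person and the hike video is missing one of the required people.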

How It Works

Knit uses two separate AI Convolutional Neural Network models – one to detect faces and another to recognize them. These AI models are copied to your own computer during Knit installation, and Knit uses your own computer to do this work. It uses background tasks that run at low priority to make sure your computer remains usable while it is doing CPU and memory intensive face analysis. At a high level, face detection & recognition includes the following steps:

1) Detecting faces: Knit attempts to detect faces each time it indexes a photo or video that’s enabled for this. So if Knit sees that an email it is indexing has a photo or video attachment, it will download the attachment to your local computer to perform face analysis (and delete it immediately afterward). Knit handles videos by extracting still images from them at periodic intervals and performing face detection on the extracted images. It then attempts to remove duplicate faces of the same person present in different frames in the video, so that each face is reported only once in a video.
2) Recognizing faces: Knit uses a different AI model to create a unique representation for each detected face. This representation is used to recognize if two faces in different photos or videos are the same person or not. These face representations are saved in index files stored only on your computer and encrypted with a password only you know. Each detected face currently takes ~6-10 kilobytes of disk space. So if Knit detects 100,000 faces across all your Data Sources, it will need at least ~600 MB of free disk space on your computer to store face information.
3) Grouping faces: In this step, Knit attempts to put all the faces of a single person present in different photo & video files in the same face-group. It does so by comparing the face representations from step 2) above to detect if two faces belong to the same person. Face grouping is done periodically using background tasks.
4) Assigning a name to a face: In this step, you use the Knit User Interface (UI) to assign names to faces. There are two menu options to do this:
a) Use the “View Or Assign Names To Faces in” menu option to do this for a specific Data Source.
Menu option to view or assign names to faces in a Data Source
b) Select a specific photo or video and right click on it to bring up the file specific menu. Then use the “Assign Face Names” menu option.
Menu option to assign face names in a specific photo or video
In both instances, Knit will display a screen with the detected faces. You can type in a name in the space just below the face (red boxes in the image below).
Assign name to displayed faces
5) Propagating a face name to all photos & videos where it appears: When you assign a name to a face in step 4), Knit will locate the face-group to which it belongs. It will then locate all the photo and video files which contain this person’s face, and automatically add the face name tag you’ve assigned to each such file.
6) Periodically merging different face-groups that represent the same person: Sometimes, Knit can end up creating multiple face-groups for the same person. To rectify this, Knit will periodically attempt to find and merge them. If face-groups are merged, all the faces in the merged group automatically inherit the face name tag you assigned to any member of that face-group.
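Steps 3), 5) and 6) above can be sketched in a few lines. This is an illustration under stated assumptions, not Knit's actual code: it assumes each face representation is a numeric vector, that two faces count as the same person when their cosine distance is below a made-up THRESHOLD, and that face-groups are tracked with a union-find structure. The three-number vectors and file names are invented (real face representations are much larger).

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; small values mean the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


class FaceGroups:
    """Union-find over face indices; each root identifies one face-group."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri != rj:
            self.parent[rj] = ri


# (file, face representation) pairs: toy values for illustration only.
faces = [
    ("beach.jpg", [0.90, 0.10, 0.00]),
    ("party.zip/cake.png", [0.88, 0.12, 0.02]),
    ("hike.mp4", [0.00, 0.20, 0.95]),
    ("old_scan.tif", [0.10, 0.10, 0.90]),
]
THRESHOLD = 0.05  # assumed cut-off, not a Knit setting

# Step 3: compare every face with every other face and group the close ones.
groups = FaceGroups(len(faces))
for i in range(len(faces)):
    for j in range(i + 1, len(faces)):
        if cosine_distance(faces[i][1], faces[j][1]) < THRESHOLD:
            groups.union(i, j)


def propagate_name(named_face, name):
    """Step 5: tag every file whose face is in the same group as the named face."""
    root = groups.find(named_face)
    return {faces[i][0]: name for i in range(len(faces)) if groups.find(i) == root}


tags = propagate_name(0, "Ana")
print(tags)

# Step 6: if two groups turn out to be the same person, merging them lets
# the name reach the files of both groups.
groups.union(0, 2)
merged = propagate_name(0, "Ana")
print(sorted(merged))
```

The first call tags only the two files in the named face's group; after the merge, all four files carry the name.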

Hardware Requirements

We constantly hear about the incredibly powerful chips needed to train AI models. Fortunately, actually using the AI models to detect and recognize faces is less intensive and can be done on your home computer (in fact, AI face detection is now built into much less powerful cameras). Still, detecting and recognizing tens or hundreds of thousands of faces from as many photos or videos is fairly compute and memory intensive. This is especially true for videos (as Knit may have to extract hundreds of frames from each one) and raw images (as these are very large).

We recommend you use a “reasonably modern” computer – from year 2020 or later – with a CPU that has multiple cores or threads, at least 8GB of memory and a solid-state (SSD) hard disk. Knit’s implementation is highly optimized towards running on home computers (not servers). It will run its background tasks at a low priority such that you can continue to use your computer for other work. Also, it will seamlessly scale to using available resources, so it will work much faster on a more capable computer.

What Affects The Results

AI-based face detection and recognition models have made tremendous progress in the last few years. However, their effectiveness can still be affected by a number of factors like:

Light conditions: a photo or video taken in very bright or low light conditions may end up not matching a photo of the same face taken in different light conditions.
Partially obstructed faces: If a face has a shadow on it or is partially obstructed by sunglasses, hats, scarves or other objects in front, it may not match other instances of the same person’s face in other photos or videos.
Side faces: these are notoriously hard to match, and Knit actually skips them when grouping faces due to the high rate of matching errors.
Age tracking: Another aspect that often doesn’t work is “tracking” a person’s face through widely different age ranges. So if you have photos of the same person as a baby, child, adolescent and adult, it is likely that the current AI models will not match them all as the same person.
Twins, triplets, close siblings: AI models have gotten better at telling apart twins, triplets and siblings who are very close in age, but they can still get tripped up.

Knit will strive to stay up to date with the state of the art as AI models evolve to newer, better and faster versions.

Frequently Asked Questions

Do you use your own AI face detection and recognition models?

No, we use industry standard AI models that were created by full-time researchers. We may change the specific models we use over time, but will always download them to your own computer to do the actual work.

How much time does face detection & recognition take?

It takes less than 1 second to detect faces in small photos (~3 megabytes or less) and ~1-5 seconds for medium sized photos (less than 10 megabytes). Large photos (20+ megabytes) – especially in raw format – can take up to a minute per photo. Videos are more time consuming – it can take Knit several minutes to extract several hundred still images from a long (~5+ minute) video before these images are processed. However, the most time consuming part is grouping together the same face in different photo & video files. This essentially requires Knit to compare each detected face with every other one to see if they are the same! Grouping 100,000 faces for the first time can take ~30+ minutes! Incrementally adding newly detected faces after this initial grouping is much less expensive though. Lastly, once you assign a name to a face, Knit propagates it to all the photo & video files that contain that face – this can take several minutes to complete.

Please note that these numbers are very rough approximations! Since Knit uses your own computer to do the work, these numbers are heavily influenced by your computer’s capabilities and what else is running on it (since Knit plays nice by running its face detection background tasks at low priority). These representative numbers are for “reasonably modern” (2022+) computers with 4+ cores, 8GB+ memory and an SSD hard disk.
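Two of the rough figures quoted in this article can be checked with simple arithmetic: the disk space needed for 100,000 faces at ~6-10 kilobytes each, and the number of face-to-face comparisons if every face really were compared with every other one. The sketch below assumes decimal units (1 MB = 1,000 KB) and a naive all-pairs comparison. The function names are made up for illustration, and Knit may well reduce the comparison count in ways not described here.

```python
def index_size_mb(num_faces, kb_per_face):
    """Disk space for the face index in decimal megabytes (assumes 1 MB = 1,000 KB)."""
    return num_faces * kb_per_face / 1000


def pairwise_comparisons(num_faces):
    """Number of distinct pairs when every face is compared with every other face."""
    return num_faces * (num_faces - 1) // 2


n = 100_000
low, high = index_size_mb(n, 6), index_size_mb(n, 10)
print(f"Index size: {low:.0f}-{high:.0f} MB")         # Index size: 600-1000 MB
print(f"Comparisons: {pairwise_comparisons(n):,}")     # Comparisons: 4,999,950,000
```

Roughly five billion comparisons for the first grouping pass helps explain why that step dominates the running time. Adding one new face afterwards needs at most one comparison per existing face, which is why incremental updates are much cheaper.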

Which photo and video formats do you support?

All the major ones including RAW files – see some of the example formats listed above.

Why are some detected faces fuzzy or blurred?

This can happen for faces detected in videos. When Knit extracts still images from a video, it can sometimes capture an image at the exact time instant when a person was moving in the video. This can lead to blurred faces – which then end up not matching other instances of that face in other photo & video files.

Where is the face information stored? Can Datamaton Inc. leak, share or sell it?

Face information is stored encrypted and only on your own computer. The original photo/video or the detected face information is never sent anywhere for any reason. Datamaton cannot leak, share or sell something it never had in the first place.

Can I delete saved face information?

Yes, you can tell Knit to delete all information about detected faces for a specific Data Source or for all of them. You can also opt out of face detection on a per-Data-Source basis. When you do that, Knit will delete saved face information for that Data Source and disable future face detection for it too.

Can I explicitly tell Knit that two faces are the same?

Yes. If Knit shows you two separate prompts to name the same face found in 2 different places, just give them both the exact same name. Knit periodically merges face groups anyway, and you assigning the exact same name to different faces is a strong signal for Knit to merge them.

Does Knit edit or update my photos/videos (e.g. to add name tag)?

No, Knit only reads your photos and videos and will not change them in any way. Any names you associate with faces in a photo or video are added as tags to Knit’s index files only. Knit will not attempt to add the face name tags to the EXIF, IPTC, XMP etc. metadata inside the corresponding file itself. This is by design – many photos and videos have complex internal structures that can be corrupted by any attempt to add tags. This would also affect the filesystem date of the photo/video file. Plus, Knit detects faces from files embedded inside ZIP files, as email attachments etc. too – it would not be possible to update such files with the name tag.

Does Knit detect children’s faces? Can I control this?

Yes, Knit detects children’s faces too. The ability to search, organize and back up our baby/toddler photos is one of the biggest usage models for face detection! There is no separate control to enable or disable this.

Download the free, trial or paid version of Knit.
