What is ML Kit?
ML Kit is a mobile machine-learning platform that offers pre-built models to recognize text in images, detect faces, and more.
ML Kit is a machine-learning platform developed by Google and delivered as a powerful, easy-to-use package. Its solutions are optimized to run on-device, which helps make your iOS and Android apps more engaging, personalized, and helpful. Developers can also train their own custom models using Google's AutoML or TensorFlow.
Ready-to-use APIs
Text Recognition
Have you ever used an app that translates text from one language to another through your phone's camera, or one that scans text from a photo? That is text recognition at work. It lets developers recognize and extract text from images and documents using machine learning algorithms.
How It Works
It works by using a technology called optical character recognition (OCR), which picks out text patterns in an image and translates them into machine-readable text. The model has been trained on such a wide variety of fonts, styles, and characters that it can recognize even handwriting in some cases, which makes it possible to recognize text in a wide range of scenarios.
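To make the idea of turning pixel patterns into machine-readable text concrete, here is a deliberately tiny sketch in Python. It is not how ML Kit works internally: ML Kit relies on trained models, while this toy only compares small black-and-white bitmaps against a hand-made glyph table and picks the closest match. Every name in it (`GLYPHS`, `recognize_glyph`, `recognize_line`) is made up for illustration.

```python
# Toy template-matching OCR. An illustration of "pixels in, characters out",
# NOT ML Kit's actual algorithm. Each glyph is a 5x3 bitmap of '0'/'1' pixels.
GLYPHS = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
    "O": ("111", "101", "101", "101", "111"),
}


def hamming(a, b):
    """Count the pixels that differ between two bitmaps of the same size."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(a, b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognize_glyph(bitmap):
    """Return the known character whose template is closest to the bitmap."""
    return min(GLYPHS, key=lambda ch: hamming(GLYPHS[ch], bitmap))


def recognize_line(bitmaps):
    """Recognize a sequence of character bitmaps and join them into a string."""
    return "".join(recognize_glyph(b) for b in bitmaps)


# A "T" with one flipped pixel, standing in for an imperfect photo.
noisy_t = ("111", "010", "010", "110", "010")

print(recognize_line([GLYPHS["L"], GLYPHS["O"], noisy_t]))  # prints: LOT
```

Even with one wrong pixel, the noisy character is still closest to the "T" template. A trained model generalizes this idea far beyond fixed templates, which is why it can cope with many fonts and styles.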
Use Cases
- Document Scanning: scan and digitize documents, such as receipts, business cards, and invoices.
- Language Translation: recognize text in one language and translate it into another, as seen in the Google Lens app.
- Augmented Reality: recognize text on real-world objects.
- Accessibility: assist people with visual impairments by converting text in images into audio.
- Security: identify a person by scanning their ID card or passport. This is most commonly used for KYC (know your customer) checks in fintech apps.
Face Detection
Has a friend's Facebook post ever tagged you automatically in an unflattering photo? Blame face detection, the first step in that process. It works by analyzing patterns in the pixels of the photo and applying pre-trained models to identify facial features such as the eyes, nose, and mouth.
How It Works
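ML Kit's face detection API can detect faces in a photo or video, locate facial landmarks and contours, and classify expressions such as whether a person is smiling or has their eyes open. It can also track a face across video frames. Note that it detects faces but does not recognize whose face it is, and it does not estimate attributes such as age or gender.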
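As a rough illustration of what an app might do with detection results, the Python sketch below decides whether a frame is a "good shot" in which everyone is smiling with their eyes open. The `DetectedFace` class is a hypothetical stand-in written for this article; its fields only loosely mirror the kind of per-face data a detector reports, and the 0.8 threshold is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DetectedFace:
    """Hypothetical container for one detected face (not an ML Kit class)."""
    bounding_box: Tuple[int, int, int, int]       # (left, top, right, bottom)
    smiling_probability: Optional[float]           # None if not computed
    left_eye_open_probability: Optional[float]
    right_eye_open_probability: Optional[float]


def is_good_shot(faces: List[DetectedFace], threshold: float = 0.8) -> bool:
    """True only if at least one face was found and every face is smiling
    with both eyes open, according to the assumed probability threshold."""
    if not faces:
        return False
    for face in faces:
        scores = (
            face.smiling_probability,
            face.left_eye_open_probability,
            face.right_eye_open_probability,
        )
        # A missing score counts as a failure: we cannot confirm the expression.
        if any(score is None or score < threshold for score in scores):
            return False
    return True


happy = DetectedFace((10, 10, 110, 130), 0.95, 0.90, 0.92)
blinking = DetectedFace((200, 20, 300, 140), 0.97, 0.10, 0.15)

print(is_good_shot([happy]))            # prints: True
print(is_good_shot([happy, blinking]))  # prints: False
```

A camera app could run a check like this on each frame and only save the picture once it passes.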
Use Cases
- Authentication and security: locate the face as the first step of face-based access control to devices, before a separate recognition system verifies the person's identity.
- Social media and entertainment: find faces in photos and videos so that social media platforms can tag people automatically (with a separate face recognition step) and create personalized content recommendations for users.
Face Mesh Detection
I'm sure you've heard about Snapchat, one of the apps known for its face filters. Filters like these rely on face mesh detection. It works by detecting and tracking facial landmarks and creating a 3D mesh of the user's face in real time. This allows developers to create augmented reality experiences, apply special effects, and create personalized avatars.
How It Works
Face Mesh Detection delivers a much more detailed representation of the face because it locates 468 distinct 3D points on it. Using this data, the API then builds a 3D mesh that accurately tracks the user's facial movements.
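To show how an app might use those 468 points, the Python sketch below works out where to place a virtual mask and how much to scale it. The mesh here is synthetic (points laid out on a simple grid), and `overlay_anchor` and `mask_scale` are hypothetical helpers invented for this example rather than part of any SDK.

```python
NUM_MESH_POINTS = 468  # number of 3D points in one face mesh


def overlay_anchor(points):
    """Return ((center_x, center_y), face_width) for a list of (x, y, z) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    center = (sum(xs) / len(xs), sum(ys) / len(ys))
    face_width = max(xs) - min(xs)
    return center, face_width


def mask_scale(face_width, mask_width):
    """Scale factor that makes a mask image as wide as the detected face."""
    return face_width / mask_width


# Synthetic stand-in for a detected mesh: 26 columns x 18 rows = 468 points.
mesh = [
    (100 + (i % 26) * 4.0, 200 + (i // 26) * 5.0, 0.0)
    for i in range(NUM_MESH_POINTS)
]

center, width = overlay_anchor(mesh)
print(len(mesh))               # prints: 468
print(center, width)           # prints: (150.0, 242.5) 100.0
print(mask_scale(width, 50))   # prints: 2.0
```

Recomputing the anchor on every video frame is what makes a filter appear to stick to the face as it moves.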
Use Cases
- Augmented reality (AR): tracks facial expressions and movements and applies virtual masks, filters, or special effects.
- Gaming: reflect a user's facial expressions and movements in real time through avatars.
- Photo and video editing: add virtual makeup, face distortion, and background replacement to photos and videos.
Apps that use ML Kit
Adidas
ML Kit detects objects of interest in real time, which in the case of Adidas are sports apparel and footwear. Based on these detections, the Adidas app prefetches results from a cloud-based image recognition service before the user explicitly requests them for the object of interest. This greatly reduces the time users wait for results and leads to a seamless and pleasant visual search experience.
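The prefetching pattern described here can be sketched in a few lines of Python. This is only a guess at the general shape of such a design, not Adidas's implementation: `fake_cloud_search` is a hypothetical stand-in for the cloud service, and the `Prefetcher` class and its method names are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_cloud_search(label):
    """Hypothetical stand-in for a call to a cloud image recognition service."""
    return f"products matching '{label}'"


class Prefetcher:
    """Starts the cloud lookup as soon as an object is detected on-device,
    so the result is usually ready by the time the user asks for it."""

    def __init__(self, search):
        self._search = search
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._pending = {}  # tracking id -> Future holding the search result

    def on_object_detected(self, tracking_id, label):
        # Only one request per tracked object, however many frames it appears in.
        if tracking_id not in self._pending:
            self._pending[tracking_id] = self._pool.submit(self._search, label)

    def on_user_request(self, tracking_id):
        # Returns the prefetched result, waiting only if it has not finished yet.
        future = self._pending.get(tracking_id)
        return future.result() if future is not None else None


prefetcher = Prefetcher(fake_cloud_search)
prefetcher.on_object_detected(tracking_id=1, label="running shoe")
print(prefetcher.on_user_request(1))  # prints: products matching 'running shoe'
```

The key design choice is that the slow network call starts at detection time rather than at tap time, so the user mostly sees the result without waiting.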
VSCO
VSCO's challenge was to overcome decision fatigue by providing trusted guidance and encouraging discovery, while still leaving space for users to be creative in how they edit their images. The solution was to suggest presets for images. The team developed the "For This Photo" feature, which uses on-device ML to identify the type of photo being edited and suggest relevant presets from a curated list. The feature has been widely adopted by users and has become the second most popular preset category, behind only the category that displays all available presets.