Posted by Paul Lewis, Developer Relations Engineer
Released in May, MediaPipe Solutions (developers.googleblog.com/2023/05/introducing-mediapipe-solutions-for-on-device-machine-learning.html) is a set of tools for no-code and low-code solutions to common on-device machine learning tasks for Android, web, and Python. Today, we're excited to announce the availability of an initial version of the iOS SDK, plus an update to the Python SDK to support the Raspberry Pi. These include support for audio classification, face landmark detection, and various natural language processing tasks. Let's take a look at how you can use these tools on the new platforms.
Object detection on Raspberry Pi
Aside from setting up the Raspberry Pi hardware with a camera, you can start by installing the MediaPipe dependency, along with OpenCV and NumPy if you don't have them already:
python -m pip install mediapipe
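If OpenCV and NumPy aren't present yet, they can be installed the same way (assuming the standard opencv-python and numpy packages from PyPI):

python -m pip install opencv-python numpy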
From there, you can create a new Python file and add your imports at the top:
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision
Next, make sure you have an object detection model stored locally on your Raspberry Pi. For your convenience, a default model, EfficientDet-Lite0, is provided, which you can retrieve with the following command:
wget -q -O efficientdet.tflite https://storage.googleapis.com/mediapipe-models/object_detector/efficientdet_lite0/int8/1/efficientdet_lite0.tflite
Once you have downloaded the model, you can start creating your new object detector. This includes some customizations, such as the maximum number of results you want to receive and the confidence threshold that must be exceeded before a result can be returned:
base_options = python.BaseOptions(model_asset_path='efficientdet.tflite')
options = vision.ObjectDetectorOptions(
    base_options=base_options,
    running_mode=vision.RunningMode.LIVE_STREAM,
    max_results=max_results,
    score_threshold=score_threshold,
    result_callback=save_result)
detector = vision.ObjectDetector.create_from_options(options)
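The options above name a save_result callback, which the detector invokes with each result in LIVE_STREAM mode, so it must be defined before the options are built. Here is a minimal sketch of that listener, assuming results are collected in a list that the display loop reads (the names detection_result_list and save_result are illustrative, not mandated by the API):

# Holds the most recent detection results; the display loop reads from it.
detection_result_list = []

def save_result(result: vision.ObjectDetectorResult,
                unused_output_image: mp.Image,
                timestamp_ms: int):
    # Invoked asynchronously by the detector for every processed frame.
    detection_result_list.append(result)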
After creating the object detector, you will need to open the Raspberry Pi camera to read continuous frames. A few preprocessing steps are omitted here, but they are available in the sample on GitHub.
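As a rough sketch, and assuming the camera is available at index 0, that loop looks something like this (the full version in the sample adds error handling and FPS bookkeeping):

import time
import cv2

cap = cv2.VideoCapture(0)
while cap.isOpened():
    success, image = cap.read()
    if not success:
        break
    # Mirror the frame and convert OpenCV's BGR channel order to the
    # RGB order that MediaPipe expects.
    image = cv2.flip(image, 1)
    rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)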
Within that loop, you can convert the processed camera image into a new MediaPipe.Image, then run detection on it before displaying the results received in the associated listener.
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb_image)
detector.detect_async(mp_image, time.time_ns() // 1_000_000)
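One way to render the output, sketched here with OpenCV and the results stored by the save_result listener above (the field names come from the MediaPipe Tasks Detection container, while the drawing style is illustrative):

if detection_result_list:
    for detection in detection_result_list[-1].detections:
        bbox = detection.bounding_box
        start_point = (bbox.origin_x, bbox.origin_y)
        end_point = (bbox.origin_x + bbox.width, bbox.origin_y + bbox.height)
        cv2.rectangle(image, start_point, end_point, (0, 255, 0), 2)
        category = detection.categories[0]
        label = f'{category.category_name} ({category.score:.2f})'
        cv2.putText(image, label, (bbox.origin_x, bbox.origin_y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
cv2.imshow('object_detection', image)
if cv2.waitKey(1) == 27:  # Press ESC to exit.
    break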
With those results and their bounding boxes drawn, you should see the detections rendered over the live camera feed.
You can find the complete Raspberry Pi example shown above on GitHub, or see the official documentation at developers.google.com/mediapipe.
Text classification on iOS
Text classification is one of the more direct examples, but the core ideas still apply to the rest of the available iOS Tasks. Similar to the Raspberry Pi, you'll start by creating a new MediaPipe Tasks object, which in this case is a TextClassifier.
var textClassifier: TextClassifier?

// Initialize with a local model file; modelPath here is assumed to point at a
// bundled .tflite model, and init(modelPath:) can throw, so try? keeps this optional.
textClassifier = try? TextClassifier(modelPath: modelPath)

Once you have your TextClassifier, you can just pass a String to it to get a TextClassifierResult:
func classify(text: String) -> TextClassifierResult? {
    guard let textClassifier = textClassifier else {
        return nil
    }
    return try? textClassifier.classify(text: text)
}
This can be called from elsewhere in the app, such as a ViewController, on a DispatchQueue before displaying the results:
let result = self?.textClassifier.classify(text: inputText)
You can find the rest of the code for this project on GitHub, and see complete documentation at developers.google.com/mediapipe.
Getting started
To learn more, check out the I/O 2023 sessions: Easier On-Device ML with MediaPipe, Supercharging Web Apps with Machine Learning and MediaPipe, and What's New in Machine Learning. Also check out the official documentation at developers.google.com/mediapipe.
We look forward to seeing all the exciting things you make, so be sure to share them with @googledevs and the rest of the developer community!