MediaPipe KNIFT is a template-based feature matching solution using KNIFT (Keypoint Neural Invariant Feature Transform).
*Fig 1. Matching a real Stop Sign with a Stop Sign template using KNIFT.*
In many computer vision applications, a crucial building block is to establish reliable correspondences between different views of an object or scene, forming the foundation for approaches like template matching, image retrieval and structure from motion. Correspondences are usually computed by extracting distinctive view-invariant features such as SIFT or ORB from images. The ability to reliably establish such correspondences enables applications like image stitching to create panoramas or template matching for object recognition in videos.
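To make the role of correspondences concrete, the sketch below recovers the affine transform that maps template points onto scene points from a handful of matched pairs, using plain least squares. This is an illustrative numpy-only toy, not part of MediaPipe; all names and values are made up:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine transform A with dst ≈ src @ A[:, :2].T + A[:, 2]."""
    n = len(src)
    X = np.hstack([src, np.ones((n, 1))])        # (n, 3) homogeneous source points
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)  # solves X @ A ≈ dst; A is (3, 2)
    return A.T                                   # (2, 3)

# Four matched template/scene point pairs (made-up data).
src = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
# Ground-truth transform: rotate 90 degrees, scale by 2, translate by (3, 4).
M = np.array([[0., -2., 3.],
              [2.,  0., 4.]])
dst = src @ M[:, :2].T + M[:, 2]

print(np.round(fit_affine(src, dst), 3))
```

In practice a robust estimator such as RANSAC is wrapped around a fit like this to reject the outlier matches that any real matcher produces.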
KNIFT is a general-purpose local feature descriptor similar to SIFT or ORB. Like them, KNIFT is a compact vector representation of a local image patch that is invariant to uniform scaling, orientation, and illumination changes. However, unlike SIFT or ORB, which were engineered with heuristics, KNIFT is an embedding learned directly from a large number of corresponding local patches extracted from nearby video frames. This data-driven approach implicitly encodes complex, real-world spatial transformations and lighting changes in the embedding. As a result, the KNIFT feature descriptor appears to be more robust, not only to affine distortions, but to some degree of perspective distortion as well.
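Matching such descriptors typically boils down to nearest-neighbor search in descriptor space plus a ratio test that discards ambiguous matches, which is what OpenCV's knnMatch is used for later on this page. The following minimal numpy sketch uses random unit vectors as stand-ins for real descriptors; it is illustrative only, not MediaPipe's implementation:

```python
import numpy as np

def match_descriptors(query, train, ratio=0.8):
    """Brute-force kNN matching with a Lowe-style ratio test.

    query, train: (N, D) arrays of L2-normalized descriptors.
    Returns a list of (query_idx, train_idx) pairs that pass the test.
    """
    # Pairwise Euclidean distances between every query/train descriptor.
    d = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=-1)
    matches = []
    for i, row in enumerate(d):
        j1, j2 = np.argsort(row)[:2]   # two nearest neighbors
        if row[j1] < ratio * row[j2]:  # reject matches that are nearly ties
            matches.append((i, int(j1)))
    return matches

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 40))                   # stand-in "index" descriptors
train /= np.linalg.norm(train, axis=1, keepdims=True)
query = train[:5] + 0.01 * rng.normal(size=(5, 40))  # noisy copies of 5 of them
query /= np.linalg.norm(query, axis=1, keepdims=True)

print(match_descriptors(query, train))
```

The 0.8 ratio threshold is the classic Lowe-style value; the right setting depends on the descriptor and the scene.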
For more information, please see *MediaPipe KNIFT: Template-based feature matching* on the Google Developers Blog.
*Fig 2. Matching US dollar bills using KNIFT.*
In MediaPipe, we’ve already provided an index file pre-computed from the 3 template images (of US dollar bills) shown below. If you’d like to use your own template images, see Matching Your Own Template Images.
Please first see general instructions for Android on how to build MediaPipe examples.
- Android target: (or download prebuilt ARM64 APK)
Note: MediaPipe uses OpenCV 3 by default. However, because of issues between NDK 17+ and OpenCV 3 when using knnMatch, for this example app please use the following commands to temporarily switch to OpenCV 4, and switch back to OpenCV 3 afterwards.
```bash
# Switch to OpenCV 4
sed -i -e 's:3.4.3/opencv-3.4.3:4.0.1/opencv-4.0.1:g' WORKSPACE
sed -i -e 's:libopencv_java3:libopencv_java4:g' third_party/opencv_android.BUILD

# Build and install the app
bazel build -c opt --config=android_arm64 mediapipe/examples/android/src/java/com/google/mediapipe/apps/templatematchingcpu
adb install -r bazel-bin/mediapipe/examples/android/src/java/com/google/mediapipe/apps/templatematchingcpu/templatematchingcpu.apk

# Switch back to OpenCV 3
sed -i -e 's:4.0.1/opencv-4.0.1:3.4.3/opencv-3.4.3:g' WORKSPACE
sed -i -e 's:libopencv_java4:libopencv_java3:g' third_party/opencv_android.BUILD
```
Step 1: Put all template images in a single directory.
Step 2: To build the index file for all templates in the directory, run
```bash
bazel build -c opt --define MEDIAPIPE_DISABLE_GPU=1 \
  mediapipe/examples/desktop/template_matching:template_matching_tflite

bazel-bin/mediapipe/examples/desktop/template_matching/template_matching_tflite \
  --calculator_graph_config_file=mediapipe/graphs/template_matching/index_building.pbtxt \
  --input_side_packets="file_directory=<template image directory>,file_suffix=png,output_index_filename=<output index filename>"
```
The output index file includes the extracted KNIFT features.
Step 3: Build and run the app using the same instructions in Matching US Dollar Bills.
- Google Developers Blog: MediaPipe KNIFT: Template-based feature matching
- TFLite model for up to 200 keypoints
- TFLite model for up to 400 keypoints
- TFLite model for up to 1000 keypoints
- Model card