I am using the ML Vision API to create embeddings from a FaceNet model, and then comparing the cosine distance between two embeddings. The output of the Android version differs a lot from the Python version, and the Python version performs much better. What could be the issue? I am using the FaceNet model in both.
I am using ML Kit for inference: https://firebase.google.com/docs/ml-kit/android/use-custom-models
I think it may be caused by the way Java reads images, because the image array built on Android differs from the array built from the same image in Python.
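For reference, the comparison step described above can be sketched as follows. This is only a minimal sketch: `cosineDistance` is a hypothetical helper name, and it assumes each embedding is a plain `FloatArray` of the same length, with distance defined as 1 minus cosine similarity.

```kotlin
import kotlin.math.sqrt

// Hypothetical helper (not part of ML Kit): cosine distance between two
// embeddings, defined here as 1 - cosine similarity.
// 0.0 means same direction, 1.0 orthogonal, 2.0 opposite.
fun cosineDistance(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "Embeddings must have the same length" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        val x = a[i].toDouble()
        val y = b[i].toDouble()
        dot += x * y
        normA += x * x
        normB += y * y
    }
    require(normA > 0.0 && normB > 0.0) { "Embeddings must be non-zero" }
    return (1.0 - dot / (sqrt(normA) * sqrt(normB))).toFloat()
}
```

Running the same function on embeddings exported from both platforms makes it easier to tell whether the gap comes from the embeddings themselves or from the distance computation.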
I was stuck on this problem because I was following the Google documentation for ML Vision, where the image is converted to a float array before being fed to the classifier. It looks like this:
val bitmap = Bitmap.createScaledBitmap(yourInputImage, 224, 224, true)
val batchNum = 0
val input = Array(1) { Array(224) { Array(224) { FloatArray(3) } } }
for (x in 0..223) {
    for (y in 0..223) {
        val pixel = bitmap.getPixel(x, y)
        // Normalize channel values to [-1.0, 1.0]. This requirement varies by
        // model. For example, some models might require values to be normalized
        // to the range [0.0, 1.0] instead.
        input[batchNum][x][y][0] = (Color.red(pixel) - 127) / 255.0f
        input[batchNum][x][y][1] = (Color.green(pixel) - 127) / 255.0f
        input[batchNum][x][y][2] = (Color.blue(pixel) - 127) / 255.0f
    }
}
Then I analysed every step one by one and found that the way the pixels are fetched is wrong and completely different from the way Python does it.
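One visible difference in the snippet above is that the loop fills the array as [x][y], while Python image libraries such as PIL with NumPy yield arrays indexed [row][column], that is [y][x]. Below is a minimal sketch of the row-major layout, assuming the pixels arrive as packed ARGB ints in row order (which is how `Bitmap.getPixels` returns them). The names `toRowMajorTensor`, `mean`, and `std` are hypothetical and the right normalization values depend on the model.

```kotlin
// Hypothetical helper: unpack packed ARGB pixels (row-major order) into a
// [height][width][3] float tensor, matching the (H, W, C) layout that
// NumPy-based Python pipelines typically use.
fun toRowMajorTensor(
    pixels: IntArray,
    width: Int,
    height: Int,
    mean: Float, // assumed model-specific normalization offset
    std: Float   // assumed model-specific normalization scale
): Array<Array<FloatArray>> {
    require(pixels.size == width * height) { "Pixel count must equal width * height" }
    return Array(height) { y ->
        Array(width) { x ->
            val p = pixels[y * width + x] // row y, column x
            floatArrayOf(
                (((p shr 16) and 0xFF) - mean) / std, // red
                (((p shr 8) and 0xFF) - mean) / std,  // green
                ((p and 0xFF) - mean) / std           // blue
            )
        }
    }
}
```

With this layout the first index is the row, so `tensor[y][x]` on Android should correspond to `array[y, x]` in Python for the same image.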
Then I found this way of doing it from this source, and replaced the conversion code above with the following function:
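import android.graphics.Bitmap
import java.nio.ByteBuffer
import java.nio.ByteOrder

// INPUT_SIZE, PIXEL_SIZE, IMAGE_MEAN and IMAGE_STD are constants defined elsewhere
// (IMAGE_MEAN and IMAGE_STD are assumed to be Float); their values depend on the model.
private fun convertBitmapToByteBuffer(bitmap: Bitmap): ByteBuffer {
    // The bitmap must already be scaled to INPUT_SIZE x INPUT_SIZE.
    require(bitmap.width == INPUT_SIZE && bitmap.height == INPUT_SIZE)
    // 4 bytes per float, PIXEL_SIZE channels per pixel.
    val imgData = ByteBuffer.allocateDirect(4 * INPUT_SIZE * INPUT_SIZE * PIXEL_SIZE)
    imgData.order(ByteOrder.nativeOrder())
    val intValues = IntArray(INPUT_SIZE * INPUT_SIZE)
    imgData.rewind()
    // Read all pixels at once, row by row, as packed ARGB ints.
    bitmap.getPixels(intValues, 0, bitmap.width, 0, 0, bitmap.width, bitmap.height)
    // Convert the image to floating point.
    var pixel = 0
    for (i in 0 until INPUT_SIZE) {
        for (j in 0 until INPUT_SIZE) {
            val value = intValues[pixel++]
            imgData.putFloat((((value shr 16) and 0xFF) - IMAGE_MEAN) / IMAGE_STD) // red
            imgData.putFloat((((value shr 8) and 0xFF) - IMAGE_MEAN) / IMAGE_STD)  // green
            imgData.putFloat(((value and 0xFF) - IMAGE_MEAN) / IMAGE_STD)          // blue
        }
    }
    return imgData
}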
And it worked!
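To check this kind of fix, it can help to print the first few input values on Android and compare them with the first few values of the Python input array. The sketch below is only an assumption about how one might do that: `firstFloats` is a hypothetical helper that reads floats from the start of a `ByteBuffer` without disturbing the original buffer's position.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical debugging helper: read up to `count` floats from the start of
// `buffer`. A duplicate is used so the original position and limit are untouched.
fun firstFloats(buffer: ByteBuffer, count: Int): FloatArray {
    // duplicate() does not carry over the byte order, so copy it explicitly.
    val view = buffer.duplicate().order(buffer.order())
    view.rewind()
    val n = minOf(count, view.remaining() / 4) // 4 bytes per float
    return FloatArray(n) { view.getFloat() }
}
```

For example, `firstFloats(convertBitmapToByteBuffer(bitmap), 6)` would give the normalized RGB values of the first two pixels of the top row.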