
How to parse the heatmap output for the pose estimation tflite model?


I am starting with the pose estimation tflite model for getting keypoints on humans.

https://www.tensorflow.org/lite/models/pose_estimation/overview

I have started by feeding in a single image of a person and invoking the model:

    import cv2 as cv
    import numpy as np
    import tensorflow as tf

    img = cv.imread('photos/standing/3.jpg')
    img = tf.reshape(tf.image.resize(img, [257, 257]), [1, 257, 257, 3])

    model = tf.lite.Interpreter(
        model_path='models/posenet_mobilenet_v1_100_257x257_multi_kpt_stripped.tflite')
    model.allocate_tensors()
    input_details = model.get_input_details()
    output_details = model.get_output_details()

    # The floating-point model expects inputs normalised to [-1, 1]
    floating_model = input_details[0]['dtype'] == np.float32
    if floating_model:
        img = (np.float32(img) - 127.5) / 127.5

    model.set_tensor(input_details[0]['index'], img)
    model.invoke()

    output_data = model.get_tensor(output_details[0]['index'])   # heatmaps
    offset_data = model.get_tensor(output_details[1]['index'])   # offsets
    results = np.squeeze(output_data)
    offsets_results = np.squeeze(offset_data)
    print("output shape: {}".format(output_data.shape))
    np.savez('sample3.npz', results, offsets_results)
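For what it's worth, printing the interpreter's output details confirms what the two tensors are. On this stripped PoseNet model there should be four outputs, of which the first two matter for single-pose decoding: the keypoint heatmaps of shape (1, 9, 9, 17) and the offset vectors of shape (1, 9, 9, 34); the remaining displacement maps are only used for multi-pose decoding. A quick check, using nothing beyond the standard tf.lite API:

    for d in output_details:
        print(d['index'], d['shape'])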

But I am struggling to parse the output correctly to get the coordinates and confidences of each body part. Does anyone have a Python example for interpreting this model's results? (For example: using them to map keypoints back onto the original image.)
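For reference, the single-pose decoding in the official example apps amounts to roughly the following: take the argmax over each keypoint's 9x9 heatmap, turn the raw score into a confidence with a sigmoid, and refine the coarse grid position using the offset tensor, whose first 17 channels hold y-offsets and last 17 hold x-offsets. Here is a minimal sketch of that logic (sigmoid and decode_pose are illustrative names; the shapes assume the squeezed results and offsets_results from above):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def decode_pose(heatmaps, offsets, image_size=257):
        # heatmaps: (9, 9, 17) raw scores; offsets: (9, 9, 34) pixel offsets
        h, w, num_kp = heatmaps.shape
        keypoints = []
        for k in range(num_kp):
            # grid cell with the highest raw score for this keypoint
            y, x = np.unravel_index(heatmaps[:, :, k].argmax(), (h, w))
            # scale the grid index by the heatmap resolution (h - 1 = 8),
            # then add the offset: channels 0..16 are y, 17..33 are x
            img_y = y / (h - 1) * image_size + offsets[y, x, k]
            img_x = x / (w - 1) * image_size + offsets[y, x, k + num_kp]
            keypoints.append((img_x, img_y, sigmoid(heatmaps[y, x, k])))
        return keypoints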

My code (a snippet from a class which essentially takes the np array directly from the model output):

    def get_keypoints(self, data):
        height, width, num_keypoints = data.shape
        keypoints = []
        for keypoint in range(0, num_keypoints):
            maxval = data[0][0][keypoint]
            maxrow = 0
            maxcol = 0
            for row in range(0, width):
                for col in range(0, height):
                    if data[row][col][keypoint] > maxval:
                        maxrow = row
                        maxcol = col
                        maxval = data[row][col][keypoint]
            keypoints.append(KeyPoint(keypoint, maxrow, maxcol, maxval))
        return keypoints

    def get_image_coordinates_from_keypoints(self, offsets):
        height, width, depth = (257, 257, 3)
        # [(x, y, confidence)]
        coords = [{'point': k.body_part,
                   'location': (k.x / (width - 1) * width + offsets[k.y][k.x][k.index],
                                k.y / (height - 1) * height + offsets[k.y][k.x][k.index]),
                   'confidence': k.confidence}
                  for k in self.keypoints]
        return coords

After matching the indexes to the body parts, my output is: (screenshot of the printed keypoint coordinates omitted)

Some of the coordinates here are negative, which can't be correct. Where is my mistake?
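The negative values most likely come from the remapping divisor: in get_image_coordinates_from_keypoints the grid index is divided by (width - 1) with width = 257, so k.x / 256 * 257 is roughly just k.x (a value in 0..8), and any negative offset then drags the whole coordinate below zero. Dividing by the heatmap resolution instead, as in the decode_pose sketch above, scales the grid index up to image space; note also that the x and y offsets live in separate channel halves rather than sharing one index. To then map the keypoints back onto the original (non-resized) image, scaling by the original dimensions should suffice. A usage sketch building on decode_pose (the 0.5 confidence threshold is an arbitrary choice):

    import cv2 as cv

    orig = cv.imread('photos/standing/3.jpg')
    orig_h, orig_w = orig.shape[:2]
    for x, y, confidence in decode_pose(results, offsets_results):
        if confidence > 0.5:
            center = (int(x / 257 * orig_w), int(y / 257 * orig_h))
            cv.circle(orig, center, 5, (0, 255, 0), -1)
    cv.imwrite('keypoints.jpg', orig)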

