Channel: How to parse the heatmap output for the pose estimation tflite model? - Stack Overflow

Answer by Josh Sharkey for How to parse the heatmap output for the pose estimation tflite model?

For a pose estimation model which outputs heatmaps and offsets, the desired keypoints can be obtained by:

  1. Performing a sigmoid operation on the heatmap:

    scores = sigmoid(heatmaps)

  2. Each keypoint of the pose is represented by a 2-D matrix; the position of the maximum value in that matrix is where the model thinks the keypoint is located in the input image. Use a 2-D argmax (np.unravel_index over np.argmax) to obtain the row (y) and column (x) of that value in each matrix; the value itself is the confidence score:

    y, x = np.unravel_index(np.argmax(scores[:, :, keypointindex]), scores[:, :, keypointindex].shape)
    confidence = scores[y, x, keypointindex]

  3. That (y, x) index is used to look up the corresponding offset vector for calculating the final location of the keypoint:

    offset_vector = (offsets[y, x, keypointindex], offsets[y, x, num_keypoints + keypointindex])

  4. After you have obtained the heatmap coordinates and offsets, you can calculate the final position of each keypoint by scaling the heatmap position by the output stride and adding the offset:

    image_positions = np.add(np.array(heatmap_positions) * output_stride, offset_vectors)
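The four steps above can be sketched end-to-end on a tiny synthetic example. The heatmap size, peak location, and offset values below are made up purely for illustration, and sigmoid is defined by hand since NumPy has no builtin for it:

```python
import numpy as np

def sigmoid(z):
    # elementwise logistic function (not a NumPy builtin)
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic single-keypoint example: a 9x9 heatmap with one strong peak
# at row 3, col 5, and matching offset channels (values are made up).
num_keypoints = 1
heatmaps = np.full((9, 9, num_keypoints), -5.0)
heatmaps[3, 5, 0] = 5.0                      # pre-sigmoid logit of the peak
offsets = np.zeros((9, 9, 2 * num_keypoints))
offsets[3, 5, 0] = 4.0                       # y offset for keypoint 0
offsets[3, 5, 1] = -2.0                      # x offset for keypoint 0

# Step 1: sigmoid turns logits into scores
scores = sigmoid(heatmaps)

# Step 2: 2-D argmax -> (row, col) of the peak; the value is the confidence
y, x = np.unravel_index(np.argmax(scores[:, :, 0]), scores[:, :, 0].shape)
confidence = scores[y, x, 0]

# Step 3: matching offset vector (first num_keypoints channels are y offsets)
offset_vector = (offsets[y, x, 0], offsets[y, x, num_keypoints + 0])

# Step 4: scale to image coordinates with the output stride
output_stride = 32
image_y = y * output_stride + offset_vector[0]   # 3*32 + 4 = 100.0
image_x = x * output_stride + offset_vector[1]   # 5*32 - 2 = 158.0
```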

If you don't already know the output stride, see the model's documentation for how to determine it. The tflite pose estimation model has an output stride of 32.
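If only the input and heatmap resolutions are known, the stride can be recovered from the usual PoseNet convention heatmap_size = (input_size - 1) / output_stride + 1. The helper below is hypothetical and assumes that convention holds for your model:

```python
def infer_output_stride(input_size, heatmap_size):
    # Assumes the PoseNet-style relation:
    #   heatmap_size = (input_size - 1) / output_stride + 1
    return (input_size - 1) // (heatmap_size - 1)

# The 257x257 tflite PoseNet model produces 9x9 heatmaps:
stride = infer_output_stride(257, 9)  # -> 32
```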

Here is a function which takes the output from that pose estimation model and returns keypoints (the KeyPoint class is shown separately below):

    import numpy as np

    def sigmoid(z):
        # elementwise logistic function (not a NumPy builtin)
        return 1.0 / (1.0 + np.exp(-z))

    def get_keypoints(self, heatmaps, offsets, output_stride=32):
        scores = sigmoid(heatmaps)
        num_keypoints = scores.shape[2]
        heatmap_positions = []
        offset_vectors = []
        confidences = []
        for ki in range(num_keypoints):
            # np.unravel_index returns (row, col), i.e. (y, x)
            y, x = np.unravel_index(np.argmax(scores[:, :, ki]), scores[:, :, ki].shape)
            confidences.append(scores[y, x, ki])
            # the first num_keypoints offset channels are y offsets, the rest x offsets
            offset_vector = (offsets[y, x, ki], offsets[y, x, num_keypoints + ki])
            heatmap_positions.append((y, x))
            offset_vectors.append(offset_vector)
        image_positions = np.add(np.array(heatmap_positions) * output_stride, offset_vectors)
        keypoints = [KeyPoint(i, pos, confidences[i]) for i, pos in enumerate(image_positions)]
        return keypoints

Keypoint class:

    PARTS = {
        0: 'NOSE',
        1: 'LEFT_EYE',
        2: 'RIGHT_EYE',
        3: 'LEFT_EAR',
        4: 'RIGHT_EAR',
        5: 'LEFT_SHOULDER',
        6: 'RIGHT_SHOULDER',
        7: 'LEFT_ELBOW',
        8: 'RIGHT_ELBOW',
        9: 'LEFT_WRIST',
        10: 'RIGHT_WRIST',
        11: 'LEFT_HIP',
        12: 'RIGHT_HIP',
        13: 'LEFT_KNEE',
        14: 'RIGHT_KNEE',
        15: 'LEFT_ANKLE',
        16: 'RIGHT_ANKLE'
    }

    class KeyPoint():
        def __init__(self, index, pos, v):
            # pos is a (y, x) image position
            y, x = pos
            self.x = x
            self.y = y
            self.index = index
            self.body_part = PARTS.get(index)
            self.confidence = v

        def point(self):
            # integer (row, col) pixel position
            return int(self.y), int(self.x)

        def to_string(self):
            return 'part: {} location: {} confidence: {}'.format(
                self.body_part, (self.x, self.y), self.confidence)
