import numpy as np
For a pose estimation model which outputs a heatmap and offsets. The desired points can be obtained by:
Performing a sigmoid operation on the heatmap:
scores = sigmoid(heatmaps)
Each keypoint of those pose is usually represented by a 2-D matrix, the maximum value in that matrix is related to where the model thinks that point is located in the input image. Use argmax2D to obtain the x and y index of that value in each matrix, the value itself represents the confidence value:
x,y = np.unravel_index(np.argmax(scores[:,:,keypointindex]), scores[:,:,keypointindex].shape)
confidences = scores[x,y,keypointindex]
That x,y is used to find the corresponding offset vector for calculating the final location of the keypoint:
offset_vector = (offsets[y,x,keypointindex], offsets[y,x,num_keypoints+keypointindex])
After you have obtained your keypoint coords and offsets you can calculate the final position of the keypoint by using ():
image_positions = np.add(np.array(heatmap_positions) * output_stride, offset_vectors)
See this for determining how to get the output stride, if you don't already have it. The tflite pose estimation has an output stride of 32.
A function which takes output from that Pose Estimation model and outputs keypoints. Not including KeyPoint
class
def get_keypoints(self, heatmaps, offsets, output_stride=32): scores = sigmoid(heatmaps) num_keypoints = scores.shape[2] heatmap_positions = [] offset_vectors = [] confidences = [] for ki in range(0, num_keypoints ): x,y = np.unravel_index(np.argmax(scores[:,:,ki]), scores[:,:,ki].shape) confidences.append(scores[x,y,ki]) offset_vector = (offsets[y,x,ki], offsets[y,x,num_keypoints+ki]) heatmap_positions.append((x,y)) offset_vectors.append(offset_vector) image_positions = np.add(np.array(heatmap_positions) * output_stride, offset_vectors) keypoints = [KeyPoint(i, pos, confidences[i]) for i, pos in enumerate(image_positions)] return keypoints
Keypoint class:
PARTS = { 0: 'NOSE', 1: 'LEFT_EYE', 2: 'RIGHT_EYE', 3: 'LEFT_EAR', 4: 'RIGHT_EAR', 5: 'LEFT_SHOULDER', 6: 'RIGHT_SHOULDER', 7: 'LEFT_ELBOW', 8: 'RIGHT_ELBOW', 9: 'LEFT_WRIST', 10: 'RIGHT_WRIST', 11: 'LEFT_HIP', 12: 'RIGHT_HIP', 13: 'LEFT_KNEE', 14: 'RIGHT_KNEE', 15: 'LEFT_ANKLE', 16: 'RIGHT_ANKLE'}class KeyPoint(): def __init__(self, index, pos, v): x, y = pos self.x = x self.y = y self.index = index self.body_part = PARTS.get(index) self.confidence = v def point(self): return int(self.y), int(self.x) def to_string(self): return 'part: {} location: {} confidence: {}'.format( self.body_part, (self.x, self.y), self.confidence)