---
library_name: transformers.js
tags:
- pose-estimation
license: agpl-3.0
---

YOLOv8x-pose-p6 with ONNX weights to be compatible with Transformers.js.

## Usage (Transformers.js)

If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
```bash
npm i @xenova/transformers
```

**Example:** Perform pose estimation with `Xenova/yolov8x-pose-p6`.

```js
import { AutoModel, AutoProcessor, RawImage } from '@xenova/transformers';

// Load model and processor
const model_id = 'Xenova/yolov8x-pose-p6';
const model = await AutoModel.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);

// Read image and run processor
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg';
const image = await RawImage.read(url);
const { pixel_values } = await processor(image);

// Set thresholds
const threshold = 0.3; // Remove detections with low confidence
const iouThreshold = 0.5; // Used to remove duplicates
const pointThreshold = 0.3; // Hide uncertain points

// Predict bounding boxes and keypoints
const { output0 } = await model({ images: pixel_values });

// Post-process:
const permuted = output0[0].transpose(1, 0);
// `permuted` is a Tensor of shape [ N, 56 ]:
// - N potential detections (N depends on the model's input resolution)
// - 56 parameters for each box:
//   - 4 for the bounding box dimensions (x-center, y-center, width, height)
//   - 1 for the confidence score
//   - 17 * 3 = 51 for the pose keypoints: 17 labels, each with (x, y, visibility)

// Example code to format it nicely:
const results = [];
const [scaledHeight, scaledWidth] = pixel_values.dims.slice(-2);
for (const [xc, yc, w, h, score, ...keypoints] of permuted.tolist()) {
    if (score < threshold) continue;

    // Convert the box to corner coordinates, rescaled to the original image size
    const x1 = (xc - w / 2) / scaledWidth * image.width;
    const y1 = (yc - h / 2) / scaledHeight * image.height;
    const x2 = (xc + w / 2) / scaledWidth * image.width;
    const y2 = (yc + h / 2) / scaledHeight * image.height;
    results.push({ x1, x2, y1, y2, score, keypoints });
}


// Define helper functions
function removeDuplicates(detections, iouThreshold) {
    const filteredDetections = [];

    for (const detection of detections) {
        let isDuplicate = false;
        let duplicateIndex = -1;
        let maxIoU = 0;

        for (let i = 0; i < filteredDetections.length; ++i) {
            const filteredDetection = filteredDetections[i];
            const iou = calculateIoU(detection, filteredDetection);
            if (iou > iouThreshold) {
                isDuplicate = true;
                if (iou > maxIoU) {
                    maxIoU = iou;
                    duplicateIndex = i;
                }
            }
        }

        if (!isDuplicate) {
            filteredDetections.push(detection);
        } else if (duplicateIndex !== -1 && detection.score > filteredDetections[duplicateIndex].score) {
            filteredDetections[duplicateIndex] = detection;
        }
    }

    return filteredDetections;
}

function calculateIoU(detection1, detection2) {
    const xOverlap = Math.max(0, Math.min(detection1.x2, detection2.x2) - Math.max(detection1.x1, detection2.x1));
    const yOverlap = Math.max(0, Math.min(detection1.y2, detection2.y2) - Math.max(detection1.y1, detection2.y1));
    const overlapArea = xOverlap * yOverlap;

    const area1 = (detection1.x2 - detection1.x1) * (detection1.y2 - detection1.y1);
    const area2 = (detection2.x2 - detection2.x1) * (detection2.y2 - detection2.y1);
    const unionArea = area1 + area2 - overlapArea;

    // Guard against zero-area boxes, which would otherwise yield NaN
    return unionArea > 0 ? overlapArea / unionArea : 0;
}

const filteredResults = removeDuplicates(results, iouThreshold);

// Display results
for (const { x1, x2, y1, y2, score, keypoints } of filteredResults) {
    console.log(`Found person at [${x1}, ${y1}, ${x2}, ${y2}] with score ${score.toFixed(3)}`);
    for (let i = 0; i < keypoints.length; i += 3) {
        const label = model.config.id2label[Math.floor(i / 3)];
        const [x, y, point_score] = keypoints.slice(i, i + 3);
        if (point_score < pointThreshold) continue;
        console.log(`  - ${label}: (${x.toFixed(2)}, ${y.toFixed(2)}) with score ${point_score.toFixed(3)}`);
    }
}
```
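
Note that the loop above rescales only the bounding boxes; the keypoint coordinates stay in the model's input space (`scaledWidth` x `scaledHeight`). Below is a minimal sketch of mapping them back to the original image, assuming the same independent per-axis scaling that the example applies to the boxes. The helper name `rescaleKeypoints` and its argument shapes are illustrative and not part of Transformers.js.

```js
// Hypothetical helper (not part of Transformers.js): converts the flat
// [x0, y0, s0, x1, y1, s1, ...] keypoint array into labelled points in
// original-image coordinates, using the same per-axis scaling as the boxes.
function rescaleKeypoints(keypoints, scaled, original, id2label = {}) {
    const points = [];
    for (let i = 0; i < keypoints.length; i += 3) {
        const [x, y, score] = keypoints.slice(i, i + 3);
        const index = i / 3;
        points.push({
            label: id2label[index] ?? String(index), // fall back to the index
            x: x / scaled.width * original.width,
            y: y / scaled.height * original.height,
            score,
        });
    }
    return points;
}

// Usage with made-up numbers: a 1280x1280 model input and a 640x320 image.
const points = rescaleKeypoints(
    [640, 640, 0.9, 320, 960, 0.2],
    { width: 1280, height: 1280 },
    { width: 640, height: 320 },
    { 0: 'nose', 1: 'left_eye' },
);
console.log(points);
```

With the example above, this could be called as `rescaleKeypoints(keypoints, { width: scaledWidth, height: scaledHeight }, image, model.config.id2label)`.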

<details>

<summary>See example output</summary>

```
Found person at [535.95703125, 43.12074284553528, 644.3259429931641, 337.3436294078827] with score 0.760
  - nose: (885.58, 179.72) with score 0.975
  - left_eye: (897.09, 165.24) with score 0.976
  - right_eye: (874.85, 164.54) with score 0.851
  - left_ear: (914.39, 169.48) with score 0.806
  - left_shoulder: (947.49, 252.34) with score 0.996
  - right_shoulder: (840.67, 244.42) with score 0.665
  - left_elbow: (1001.36, 351.66) with score 0.983
  - left_wrist: (1011.84, 472.31) with score 0.954
  - left_hip: (931.52, 446.28) with score 0.986
  - right_hip: (860.66, 442.87) with score 0.828
  - left_knee: (930.67, 625.64) with score 0.979
  - right_knee: (872.17, 620.36) with score 0.735
  - left_ankle: (929.01, 772.34) with score 0.880
  - right_ankle: (882.23, 778.68) with score 0.454
Found person at [0.4024791717529297, 59.50179467201233, 156.87244415283203, 370.64377751350406] with score 0.853
  - nose: (115.39, 198.06) with score 0.918
  - left_eye: (120.26, 177.71) with score 0.830
  - right_eye: (105.47, 179.69) with score 0.757
  - left_ear: (144.87, 185.18) with score 0.711
  - right_ear: (97.69, 188.45) with score 0.468
  - left_shoulder: (178.03, 268.88) with score 0.975
  - right_shoulder: (80.69, 273.99) with score 0.954
  - left_elbow: (203.06, 383.33) with score 0.923
  - right_elbow: (43.32, 376.35) with score 0.856
  - left_wrist: (215.74, 504.02) with score 0.888
  - right_wrist: (6.77, 462.65) with score 0.812
  - left_hip: (165.70, 473.24) with score 0.990
  - right_hip: (97.84, 471.69) with score 0.986
  - left_knee: (183.26, 646.61) with score 0.991
  - right_knee: (104.04, 651.17) with score 0.989
  - left_ankle: (199.88, 823.24) with score 0.966
  - right_ankle: (104.66, 827.66) with score 0.963
Found person at [107.49130249023438, 12.557352638244629, 501.3542175292969, 527.4827188491821] with score 0.872
  - nose: (246.06, 180.81) with score 0.722
  - left_eye: (236.99, 148.85) with score 0.523
  - left_ear: (289.26, 152.23) with score 0.770
  - left_shoulder: (391.63, 256.55) with score 0.992
  - right_shoulder: (363.28, 294.56) with score 0.979
  - left_elbow: (514.37, 404.61) with score 0.990
  - right_elbow: (353.58, 523.61) with score 0.957
  - left_wrist: (607.64, 530.43) with score 0.985
  - right_wrist: (246.78, 536.33) with score 0.950
  - left_hip: (563.45, 577.89) with score 0.998
  - right_hip: (544.08, 613.29) with score 0.997
  - left_knee: (466.57, 862.51) with score 0.996
  - right_knee: (518.49, 977.99) with score 0.996
  - left_ankle: (691.56, 844.49) with score 0.960
  - right_ankle: (671.32, 1100.90) with score 0.953
Found person at [424.73594665527344, 68.82870757579803, 640.3419494628906, 492.8904126405716] with score 0.887
  - nose: (840.26, 289.19) with score 0.991
  - left_eye: (851.23, 259.92) with score 0.956
  - right_eye: (823.10, 256.35) with score 0.955
  - left_ear: (889.52, 278.10) with score 0.668
  - right_ear: (799.80, 264.64) with score 0.771
  - left_shoulder: (903.87, 398.65) with score 0.997
  - right_shoulder: (743.88, 403.37) with score 0.988
  - left_elbow: (921.63, 589.83) with score 0.989
  - right_elbow: (699.56, 527.09) with score 0.934
  - left_wrist: (959.21, 728.84) with score 0.984
  - right_wrist: (790.88, 519.34) with score 0.945
  - left_hip: (873.51, 720.07) with score 0.996
  - right_hip: (762.29, 760.91) with score 0.990
  - left_knee: (945.33, 841.65) with score 0.987
  - right_knee: (813.06, 1072.57) with score 0.964
  - left_ankle: (918.48, 1129.20) with score 0.871
  - right_ankle: (886.91, 1053.95) with score 0.716
```
</details>
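
The `removeDuplicates` helper in the example visits detections in model output order. A common alternative is greedy non-maximum suppression, which sorts by score first so that the highest-scoring box in each overlapping group is always the one kept. The following is a minimal sketch of that variant, not the method used above; `greedyNms` and `iou` are illustrative names, and the boxes are assumed to use the same `{ x1, y1, x2, y2, score }` shape as `results`.

```js
// Intersection-over-union of two axis-aligned boxes.
function iou(a, b) {
    const xOverlap = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
    const yOverlap = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
    const overlap = xOverlap * yOverlap;
    const union = (a.x2 - a.x1) * (a.y2 - a.y1)
        + (b.x2 - b.x1) * (b.y2 - b.y1) - overlap;
    return union > 0 ? overlap / union : 0;
}

// Greedy NMS: keep a detection only if it does not overlap a
// higher-scoring detection that has already been kept.
function greedyNms(detections, iouThreshold) {
    const sorted = [...detections].sort((a, b) => b.score - a.score);
    const kept = [];
    for (const candidate of sorted) {
        if (kept.every((k) => iou(candidate, k) <= iouThreshold)) {
            kept.push(candidate);
        }
    }
    return kept;
}

// Made-up boxes: the first two overlap heavily, the third is separate.
const demo = greedyNms([
    { x1: 0, y1: 0, x2: 10, y2: 10, score: 0.6 },
    { x1: 1, y1: 1, x2: 11, y2: 11, score: 0.9 },
    { x1: 50, y1: 50, x2: 60, y2: 60, score: 0.7 },
], 0.5);
console.log(demo.map((d) => d.score));
```

Sorting first makes the result independent of the order in which the model emits detections, at the cost of an extra sort over the thresholded candidates.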