New Keypoint Heads and Losses (#40)

Co-authored-by: klemen1999 <[email protected]>
Co-authored-by: Martin Kozlovsky <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>
luxonis · Oct 9, 2024 · fccdc96
1 parent e184075

Showing 25 changed files with 963 additions and 136 deletions.
diff --git a/README.md b/README.md
@@ -50,6 +50,12 @@ For instructions on how to create a dataset in the LDF, follow the
 [examples](https://github.com/luxonis/luxonis-ml/tree/main/examples) in
 the [luxonis-ml](https://github.com/luxonis/luxonis-ml) repository.
 
+To inspect dataset images by split (train, val, test), use the command:
+
+```bash
+luxonis_train data inspect --config <config.yaml> --view <train/val/test>
+```
+
 ## Training
 
 Once you've created your `config.yaml` file you can train the model using this command:
@@ -66,6 +72,14 @@ luxonis_train train --config config.yaml trainer.batch_size 8 trainer.epochs 10
 
 where key and value are space separated and sub-keys are dot (`.`) separated. If the configuration field is a list, then key/sub-key should be a number (e.g. `trainer.preprocessing.augmentations.0.name RotateCustom`).
 
+## Evaluating
+
+To evaluate the model on a specific dataset split (train, test, or val), use the following command:
+
+```bash
+luxonis_train eval --config <config.yaml> --view <train/test/val>
+```
+
 ## Tuning
 
 To improve training performance you can use `Tuner` for hyperparameter optimization.

diff --git a/configs/coco_model.yaml b/configs/coco_model.yaml
@@ -46,7 +46,7 @@ model:
     - name: ImplicitKeypointBBoxLoss
       attached_to: ImplicitKeypointBBoxHead
       params:
-        keypoint_distance_loss_weight: 0.5
+        keypoint_regression_loss_weight: 0.5
         keypoint_visibility_loss_weight: 0.7
         bbox_loss_weight: 0.05
         objectness_loss_weight: 0.2
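
Because this hunk renames `keypoint_distance_loss_weight` to `keypoint_regression_loss_weight`, configs written against the old name would need the same change. A minimal sketch of such a migration follows, assuming the `losses` section has already been loaded into a list of dictionaries; `migrate_loss_params` is a hypothetical helper and is not part of `luxonis_train`.

```python
import copy

OLD_KEY = "keypoint_distance_loss_weight"
NEW_KEY = "keypoint_regression_loss_weight"


def migrate_loss_params(losses: list[dict]) -> list[dict]:
    """Return a copy of `losses` with OLD_KEY renamed to NEW_KEY.

    Only entries named ImplicitKeypointBBoxLoss are touched, since that
    is the loss whose parameter is renamed in this commit.
    """
    migrated = copy.deepcopy(losses)
    for loss in migrated:
        if loss.get("name") != "ImplicitKeypointBBoxLoss":
            continue
        params = loss.get("params") or {}
        if OLD_KEY in params:
            # Keep an explicit new-style value if both keys are present.
            params.setdefault(NEW_KEY, params[OLD_KEY])
            del params[OLD_KEY]
    return migrated


# Mirrors the pre-commit state of configs/coco_model.yaml shown above.
old_losses = [
    {
        "name": "ImplicitKeypointBBoxLoss",
        "attached_to": "ImplicitKeypointBBoxHead",
        "params": {
            "keypoint_distance_loss_weight": 0.5,
            "keypoint_visibility_loss_weight": 0.7,
            "bbox_loss_weight": 0.05,
            "objectness_loss_weight": 0.2,
        },
    }
]

new_losses = migrate_loss_params(old_losses)
```

The input list is left unmodified, so the old and new forms can be compared before anything is written back to disk.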

diff --git a/luxonis_train/attached_modules/losses/README.md b/luxonis_train/attached_modules/losses/README.md
@@ -11,6 +11,7 @@ List of all the available loss functions.
 - [SoftmaxFocalLoss](#softmaxfocalloss)
 - [AdaptiveDetectionLoss](#adaptivedetectionloss)
 - [ImplicitKeypointBBoxLoss](#implicitkeypointbboxloss)
+- [EfficientKeypointBBoxLoss](#efficientkeypointbboxloss)
 
 ## CrossEntropyLoss
 
@@ -97,10 +98,25 @@ Keypoint Similarity Loss](https://arxiv.org/ftp/arxiv/papers/2204/2204.06806.pdf
 | label_smoothing                 | float         | 0.0               | Smoothing for [SmoothBCEWithLogitsLoss](#smoothbcewithlogitsloss) for classification loss. |
 | min_objectness_iou              | float         | 0.0               | Minimum objectness IoU.                                                                    |
 | bbox_loss_weight                | float         | 0.05              | Weight for bbox detection sub-loss.                                                        |
-| keypoint_distance_loss_weight   | float         | 0.10              | Weight for keypoint distance sub-loss.                                                     |
+| keypoint_regression_loss_weight | float         | 0.5               | Weight for OKS sub-loss.                                                                   |
 | keypoint_visibility_loss_weight | float         | 0.6               | Weight for keypoint visibility sub-loss.                                                   |
 | class_loss_weight               | float         | 0.6               | Weight for classification sub-loss.                                                        |
 | objectness_loss_weight          | float         | 0.7               | Weight for objectness sub-loss.                                                            |
 | anchor_threshold                | float         | 4.0               | Threshold for matching anchors to targets.                                                 |
 | bias                            | float         | 0.5               | Bias for matching anchors to targets.                                                      |
 | balance                         | list\[float\] | \[4.0, 1.0, 0.4\] | Balance for objectness loss.                                                               |
+
+## EfficientKeypointBBoxLoss
+
+Adapted from [YOLO-Pose: Enhancing YOLO for Multi Person Pose Estimation Using Object
+Keypoint Similarity Loss](https://arxiv.org/ftp/arxiv/papers/2204/2204.06806.pdf).
+
+| Key                   | Type                                              | Default value | Description                                                                         |
+| --------------------- | ------------------------------------------------- | ------------- | ----------------------------------------------------------------------------------- |
+| viz_pw                | float                                             | 1.0           | Power for [BCEWithLogitsLoss](#bcewithlogitsloss) for keypoint visibility.          |
+| n_warmup_epochs       | int                                               | 4             | Number of epochs where ATSS assigner is used, after that we switch to TAL assigner. |
+| iou_type              | Literal\["none", "giou", "diou", "ciou", "siou"\] | "giou"        | IoU type used for bbox regression sub-loss.                                         |
+| class_loss_weight     | float                                             | 1.0           | Weight used for the classification sub-loss.                                        |
+| iou_loss_weight       | float                                             | 2.5           | Weight used for the IoU sub-loss.                                                   |
+| regr_kpts_loss_weight | float                                             | 1.5           | Weight used for the OKS sub-loss.                                                   |
+| vis_kpts_loss_weight  | float                                             | 1.0           | Weight used for the keypoint visibility sub-loss.                                   |
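
To make the roles of the parameters in the table above concrete, here is a rough sketch using plain floats. It assumes that the four sub-losses are combined as a weighted sum and that epochs are counted from zero with the switch to TAL happening once `n_warmup_epochs` epochs have completed; neither detail is confirmed by this README, and `LossWeights`, `select_assigner`, and `combine` are hypothetical names rather than parts of `EfficientKeypointBBoxLoss`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Defaults copied from the EfficientKeypointBBoxLoss table."""

    class_loss_weight: float = 1.0
    iou_loss_weight: float = 2.5
    regr_kpts_loss_weight: float = 1.5
    vis_kpts_loss_weight: float = 1.0
    n_warmup_epochs: int = 4


def select_assigner(epoch: int, n_warmup_epochs: int) -> str:
    """ATSS during warmup, TAL afterwards (assumes 0-based epochs)."""
    return "ATSS" if epoch < n_warmup_epochs else "TAL"


def combine(
    weights: LossWeights,
    class_loss: float,
    iou_loss: float,
    regr_kpts_loss: float,
    vis_kpts_loss: float,
) -> float:
    """Weighted sum of the sub-losses (assumed combination rule)."""
    return (
        weights.class_loss_weight * class_loss
        + weights.iou_loss_weight * iou_loss
        + weights.regr_kpts_loss_weight * regr_kpts_loss
        + weights.vis_kpts_loss_weight * vis_kpts_loss
    )


weights = LossWeights()
total = combine(
    weights,
    class_loss=1.0,
    iou_loss=1.0,
    regr_kpts_loss=1.0,
    vis_kpts_loss=1.0,
)
assigner_early = select_assigner(epoch=3, n_warmup_epochs=weights.n_warmup_epochs)
assigner_late = select_assigner(epoch=4, n_warmup_epochs=weights.n_warmup_epochs)
```

With every sub-loss set to 1.0 the total is simply the sum of the default weights, which shows that under these defaults the IoU term carries the largest weight.
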
diff --git a/luxonis_train/attached_modules/losses/__init__.py b/luxonis_train/attached_modules/losses/__init__.py
@@ -2,6 +2,7 @@
 from .base_loss import BaseLoss
 from .bce_with_logits import BCEWithLogitsLoss
 from .cross_entropy import CrossEntropyLoss
+from .efficient_keypoint_bbox_loss import EfficientKeypointBBoxLoss
 from .implicit_keypoint_bbox_loss import ImplicitKeypointBBoxLoss
 from .keypoint_loss import KeypointLoss
 from .sigmoid_focal_loss import SigmoidFocalLoss
@@ -12,6 +13,7 @@
     "AdaptiveDetectionLoss",
     "BCEWithLogitsLoss",
     "CrossEntropyLoss",
+    "EfficientKeypointBBoxLoss",
     "ImplicitKeypointBBoxLoss",
     "KeypointLoss",
     "BaseLoss",

diff --git a/luxonis_train/attached_modules/losses/adaptive_detection_loss.py b/luxonis_train/attached_modules/losses/adaptive_detection_loss.py
@@ -100,7 +100,6 @@ def prepare(
         feats = outputs["features"]
         pred_scores = outputs["class_scores"][0]
         pred_distri = outputs["distributions"][0]
-
         batch_size = pred_scores.shape[0]
         device = pred_scores.device
 
@@ -142,6 +141,7 @@ def prepare(
                 assigned_bboxes,
                 assigned_scores,
                 mask_positive,
+                _,
             ) = self.atts_assigner(
                 anchors,
                 n_anchors_list,
@@ -157,7 +157,8 @@ def prepare(
                 assigned_bboxes,
                 assigned_scores,
                 mask_positive,
-            ) = self.tal_assigner.forward(
+                _,
+            ) = self.tal_assigner(
                 pred_scores.detach(),
                 pred_bboxes.detach() * stride_tensor,
                 anchor_points,