The social behavior tracking module of SBeA. With SBeA_tracker, multi-animal 3D poses with identities can be acquired.
Tested PC:
CPU: i9-12900K
RAM: 128 GB
Storage: 10 TB
Operating system: Windows 11
Software: Anaconda and Visual Studio
(1) In Anaconda Prompt, switch to the directory containing environment.yml and run
conda env create -f environment.yml
to create the virtual environment. You can change the environment name in environment.yml.
(2) Run
conda activate [your environment]
(3) Run
python setup.py install
to install the DCN module.
(4) Switch to the .\gui directory and run
python main.py
to launch the GUI of SBeA_tracker.
Here is the start interface of SBeA_tracker:
Typical time: ~1 hour
Case data:
fig2_data\pose tracking
fig3_data\identification\oft_50cm_id
SBeA_tracker is managed in a project folder. The first step is to create or load a project.
Input:
Create a new 'configfile.yaml': select a path and define a name to save your project (workspace). Alternatively, load an existing 'configfile.yaml' through the first textbox.
Output:
The workspace of your project (its layout is sketched after the notes below).
Demo results:
Notes:
datasets folder: saves datasets
evals folder: saves model evaluation results
models folder: saves deep learning models
configfile.yaml: the configuration file of SBeA_tracker
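For reference, a minimal sketch that reproduces the workspace layout listed in the notes. SBeA_tracker creates this structure itself, so running the sketch is not required, and the project path in the usage comment is hypothetical:

```python
from pathlib import Path

def sketch_workspace(root):
    """Illustrative only: recreate the workspace layout built by SBeA_tracker."""
    root = Path(root)
    for sub in ["datasets", "evals", "models"]:
        (root / sub).mkdir(parents=True, exist_ok=True)
    # configfile.yaml holds the project configuration; its keys are managed by the GUI.
    (root / "configfile.yaml").touch(exist_ok=True)
    return root

# Hypothetical usage:
# sketch_workspace(r"D:\sbea_projects\oft_demo")
```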
Typical time: ~1 min
The second step is to set the configurations for training.
Input:
The paths of your social and ID data.
Output:
Changed configfile.yaml.
Your data need to be organized as follows (a filename-parsing sketch follows the ID data notes):
Social data:
Notes:
*-caliParas.mat: the calibration file
*-camera-#.avi: the multi-view video from camera number #
Fields F1-F2-F3:
F1: the recording serial number
F2: the animal name such as A1 (animal one) and A2 (animal two)
F3: the date
ID data:
Notes:
*-caliParas.mat: the calibration file
*-camera-#.avi: the multi-view video from camera number #
Fields F1-F2-F3:
F1: the recording serial number
F2: the single animal name such as A1 (animal one)
F3: the date
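A minimal sketch of how the filename fields could be parsed, assuming the pattern F1-F2-F3-camera-#.avi implied by the notes above; the example filename in the comment is hypothetical, not taken from the case data:

```python
import re

# Assumed pattern: F1-F2-F3-camera-#.avi, where F1 is the recording serial number,
# F2 the animal name(s), F3 the date, and # the camera index (see the notes above).
VIDEO_PATTERN = re.compile(
    r"^(?P<F1>[^-]+)-(?P<F2>[^-]+)-(?P<F3>[^-]+)-camera-(?P<cam>\d+)\.avi$"
)

def parse_video_name(filename):
    """Split a multi-view video filename into its F1/F2/F3 fields and camera index."""
    match = VIDEO_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected filename format: {filename}")
    fields = match.groupdict()
    fields["cam"] = int(fields["cam"])
    return fields

# Hypothetical example filename:
# parse_video_name("00001-A1A2-20230101-camera-1.avi")
# -> {'F1': '00001', 'F2': 'A1A2', 'F3': '20230101', 'cam': 1}
```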
Tips:
Configuration changes made in the user interface are written to configfile.yaml automatically. If configfile.yaml is edited manually, the program loads the manual changes with first priority (see the sketch below).
Typical time: ~30 min
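For manual edits, configfile.yaml can be modified with any YAML-aware tool; the PyYAML sketch below is one way to do it. The key name and project path are hypothetical, so check your own configfile.yaml for the actual keys written by the GUI:

```python
import yaml  # PyYAML

CONFIG = r"D:\sbea_projects\oft_demo\configfile.yaml"  # hypothetical project path

with open(CONFIG, "r", encoding="utf-8") as f:
    config = yaml.safe_load(f) or {}

# Hypothetical key name, used only for illustration.
config["social_data_path"] = r"D:\data\fig2_data\pose tracking"

with open(CONFIG, "w", encoding="utf-8") as f:
    yaml.safe_dump(config, f, default_flow_style=False)
```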
The third step is to load data, label data, and train models according to configfile.yaml. To reduce waiting time, the program is designed to label data and train models in parallel: model training does not block data loading and labeling. A minimal sketch of this non-blocking pattern follows.
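This is only an illustration of the non-blocking design idea with placeholder function names, not SBeA_tracker's actual implementation:

```python
import threading

def train_models():
    """Placeholder for model training (e.g., YOLACT++/VisTR/DeepLabCut training)."""
    ...

def load_and_label_data():
    """Placeholder for data loading and manual labeling in the GUI."""
    ...

# Run training in a background thread so labeling is not blocked.
trainer = threading.Thread(target=train_models, daemon=True)
trainer.start()
load_and_label_data()   # the user keeps loading and labeling while training runs
trainer.join()          # wait for training to finish before moving on
```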
Input:
The previous configurations.
Output:
The preprocessed data.
Demo results:
Mask box:
Load data:
The raw frames in .\datasets\raw_video_images
The backgrounds in .\datasets\video_backgrounds (a background-estimation sketch follows this list)
The trajectories in .\datasets\video_trajectories
The manual label frames in .\datasets\manual_labels
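The backgrounds could, for example, be estimated as the per-pixel temporal median over sampled frames. The OpenCV/NumPy sketch below illustrates that idea; it is not necessarily the procedure SBeA_tracker itself uses, and the file path in the usage comment is hypothetical:

```python
import cv2
import numpy as np

def estimate_background(video_path, n_samples=50):
    """Estimate a static background as the per-pixel temporal median of sampled frames."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in np.linspace(0, max(total - 1, 0), n_samples).astype(int):
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    if not frames:
        raise RuntimeError(f"No frames could be read from {video_path}")
    return np.median(np.stack(frames), axis=0).astype(np.uint8)

# Hypothetical usage:
# bg = estimate_background(r"D:\data\00001-A1A2-20230101-camera-1.avi")
# cv2.imwrite("background.png", bg)
```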
Label data:
Label mask frames by calling labelme:
Train model:
The well-trained data generation model in .\models\yolact
The well-trained video instance segmentation model in .\models\vistr
Training data generation based on YOLACT++:
Well-trained video instance segmentation model based on VisTR.
Pose box:
Load pose videos, label pose frames, and train pose estimation models by calling DeepLabCut:
ID box:
Load data:
Cascaded identity images in .\datasets\id_images
Train model:
The well-trained model in .\models\reid
Well-trained animal identification model based on EfficientNet.
Typical time: 2 days in parallel
The fourth step is to evaluate the models trained in step 3. This step includes options to evaluate VIS (video instance segmentation) models and ID models.
Input:
The well-trained video instance segmentation model in .\models\vistr
The well-trained ID model in .\models\reid
Optional:
The ground-truth video instance segmentation data, for evaluating the model against GT.
The config.yaml file of pose estimation, for evaluating the identification features.
Output:
The evaluation results of VIS model in .\evals\checkpoint*
The evaluation results of ID model in .\evals\reidmodel*
Demo results:
The evaluation of VIS model:
Notes:
result.json: raw results of video instance segmentation for evaluation data
corrected_result.json: results of video instance segmentation corrected by interframe continuity (a simplified smoothing sketch follows these notes)
*.avi: visualization of each evaluation video
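As a simplified illustration of "correction by interframe continuity" only (SBeA's actual correction procedure may differ), the sketch below suppresses isolated identity swaps in a per-frame prediction sequence with a sliding-window majority vote:

```python
from collections import Counter

def smooth_identities(ids, window=15):
    """Replace each frame's predicted identity with the majority vote in a
    sliding window, suppressing isolated identity swaps."""
    half = window // 2
    smoothed = []
    for i in range(len(ids)):
        neighborhood = ids[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighborhood).most_common(1)[0][0])
    return smoothed

# Hypothetical example: a brief swap at frame 3 is corrected.
# smooth_identities(["A1", "A1", "A1", "A2", "A1", "A1", "A1"], window=5)
# -> ['A1', 'A1', 'A1', 'A1', 'A1', 'A1', 'A1']
```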
The evaluation of VIS model with ground truth:
Notes:
result.json: raw results of video instance segmentation for evaluation data
eval.json: the performance of the VIS model, including IST (identity swap time), IST_P (identity swap time percentage), IOU_NID (Intersection over Union without considering ID), IOU_ID (Intersection over Union considering ID), and AP_NID/ID_50/70/90 (mean average precision without/with considering ID at IoU thresholds of 50/70/90); a minimal IoU sketch follows these notes
*.avi: visualization of each evaluation video
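IOU_NID and IOU_ID are both Intersection-over-Union scores between predicted and ground-truth masks; only the matching rule (ignoring or respecting identity) differs. A minimal NumPy sketch of the underlying IoU computation on binary masks:

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection over Union of two binary masks of the same shape (values 0/1)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, gt).sum() / union

# Toy example: the two masks share 2 of 6 foreground pixels -> IoU = 1/3.
a = np.zeros((4, 4), dtype=int); a[0, :4] = 1
b = np.zeros((4, 4), dtype=int); b[0, 2:] = 1; b[1, 2:] = 1
print(mask_iou(a, b))  # 0.333...
```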
The evaluation of ID model:
Notes:
cam folder: saves LayerCAM results
confusion_matrix.jpg: confusion matrix of identities
pca_representations.jpg: PCA visualization of the feature representations of the ID model
tsne_representations.jpg: t-SNE visualization of the feature representations of the ID model (a projection sketch follows these notes)
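The PCA and t-SNE figures project the ID model's feature vectors into two dimensions, one point per identity image. A minimal scikit-learn sketch of such a projection, in which the feature array, its dimensionality, and the labels are placeholders rather than SBeA outputs:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Hypothetical inputs: one feature vector per identity image and its animal label.
features = np.random.rand(200, 1280)          # e.g., embeddings from the ID network
labels = np.repeat(["A1", "A2"], 100)

for name, reducer in [("PCA", PCA(n_components=2)),
                      ("t-SNE", TSNE(n_components=2, init="pca", perplexity=30))]:
    emb = reducer.fit_transform(features)
    plt.figure()
    for animal in np.unique(labels):
        sel = labels == animal
        plt.scatter(emb[sel, 0], emb[sel, 1], label=animal, s=8)
    plt.title(f"{name} of ID features")
    plt.legend()
    plt.savefig(f"{name.lower().replace('-', '')}_representations.jpg", dpi=200)
```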
The evaluation of ID features using LayerCAM:
Typical time: the same as your video time length
The fifth step is to predict 3D poses with identities for new videos (a loading sketch for the 3D output files follows the notes below).
Input:
New videos and calibration files, organized in the same format as the data in step 2.
Output:
Notes:
*-rawresult.json: raw results of video instance segmentation
*-correctedresult.json: corrected results of video instance segmentation by interframe continuity
*-predid.csv: raw file of identities predicted by SBeA
*-corrpredid.csv: corrected ID file by continuity
*-raw3d.mat: raw 3D skeletons of two animals without identities
*-rot3d.mat: 3D skeletons rotated to ground (world coordinate system) without identities
*-id3d.mat: 3D skeletons rotated to ground with identities
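The .mat outputs can be inspected in Python with scipy.io.loadmat. The file name and the variable key in the sketch below are hypothetical; list the keys of the loaded dictionary to find the actual variable names SBeA saves:

```python
from scipy.io import loadmat

# Hypothetical file name; replace with one of your *-id3d.mat results.
data = loadmat(r"D:\results\00001-A1A2-20230101-id3d.mat")

# Print the variable names stored in the file (skipping MATLAB metadata keys).
print([k for k in data.keys() if not k.startswith("__")])

# skeletons = data["coords3d"]  # hypothetical key, e.g., frames x keypoints x 3 per animal
```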
Typical time: the same as your video time length
The result visualization can be found in README.md
Note: there is a file synchronization bug with GitHub that we have not yet found a way to fix, so we have prepared an offline version of SBeA. The link is "https://drive.google.com/file/d/1B7BWCUgwUnZdWeP4_rv_2byKJ22qZ4tY/view?usp=sharing".