TODO

YOLOv8

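YOLO (You Only Look Once) is a popular object detection and image segmentation model developed by Joseph Redmon and Ali Farhadi at the University of Washington. Released in 2015, YOLO quickly gained popularity for its high speed and accuracy.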

Ultralytics

https://github.com/ultralytics/ultralytics

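Ultralytics YOLOv8 builds on recent advances in deep learning and computer vision and is designed for high speed and accuracy. Its streamlined design suits a wide range of applications and adapts easily to different hardware platforms, from edge devices to cloud APIs.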

Installing ultralytics

# Install the ultralytics package from PyPI
pip install ultralytics

Verifying the installation

import ultralytics
ultralytics.checks()
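
If `import ultralytics` fails, it can help to confirm first whether the package is visible to the current interpreter at all. The sketch below uses only the Python standard library; the helper names `package_version` and `is_importable` are hypothetical and are not part of ultralytics.

```python
from importlib import metadata, util
from typing import Optional


def package_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def is_importable(module: str) -> bool:
    """Return True if the interpreter can locate the top-level module."""
    return util.find_spec(module) is not None


if __name__ == "__main__":
    # Prints None / False when the package is not installed in this environment.
    print("ultralytics version:", package_version("ultralytics"))
    print("ultralytics importable:", is_importable("ultralytics"))
```

If the version prints as None, the package was most likely installed into a different Python environment than the one running the script.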

Example (training)

https://docs.ultralytics.com/tasks/detect/#train

Loading a pretrained model

 
from ultralytics import YOLO
# Load the yolov8n pretrained model; if it is not found locally, it is downloaded from https://github.com/ultralytics/assets/..
model = YOLO("yolov8n.pt")
# A .yaml file is also accepted
# model = YOLO("yolov8n.yaml")  # build a new model from YAML

YOLOv8 pretrained models

https://docs.ultralytics.com/models/yolov8/#supported-tasks-and-modes

| Model | File names | Task |
| --- | --- | --- |
| YOLOv8 | yolov8n.pt yolov8s.pt yolov8m.pt yolov8l.pt yolov8x.pt | Detection |
| YOLOv8-seg | yolov8n-seg.pt yolov8s-seg.pt yolov8m-seg.pt yolov8l-seg.pt yolov8x-seg.pt | Instance Segmentation |
| YOLOv8-pose | yolov8n-pose.pt yolov8s-pose.pt yolov8m-pose.pt yolov8l-pose.pt yolov8x-pose.pt yolov8x-pose-p6.pt | Pose/Keypoints |
| YOLOv8-obb | yolov8n-obb.pt yolov8s-obb.pt yolov8m-obb.pt yolov8l-obb.pt yolov8x-obb.pt | Oriented Detection |
| YOLOv8-cls | yolov8n-cls.pt yolov8s-cls.pt yolov8m-cls.pt yolov8l-cls.pt yolov8x-cls.pt | Classification |

The linked page also reports, per model family, support for the Inference, Validation, Training, and Export modes.
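
The file names in the table follow a regular pattern: `yolov8`, a size letter, an optional task suffix, then `.pt`. A small sketch of that pattern is shown here. The function `weight_file` and the task keys are hypothetical names chosen for illustration, and the extra `yolov8x-pose-p6.pt` entry falls outside the pattern and is not covered.

```python
SIZES = ("n", "s", "m", "l", "x")

# Task keys are illustrative labels; the suffixes come from the table above.
TASK_SUFFIX = {
    "detect": "",
    "segment": "-seg",
    "pose": "-pose",
    "obb": "-obb",
    "classify": "-cls",
}


def weight_file(size: str, task: str = "detect") -> str:
    """Build a YOLOv8 weight file name such as 'yolov8n-seg.pt'."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}, expected one of {SIZES}")
    if task not in TASK_SUFFIX:
        raise ValueError(f"unknown task {task!r}, expected one of {tuple(TASK_SUFFIX)}")
    return f"yolov8{size}{TASK_SUFFIX[task]}.pt"


if __name__ == "__main__":
    for task in TASK_SUFFIX:
        print(task, [weight_file(size, task) for size in SIZES])
```

The resulting string can be passed to `YOLO(...)` in the same way as `"yolov8n.pt"` in the loading example.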

YOLOv9 pretrained models

Detection (COCO)

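| Model | size (pixels) | mAPval 50-95 | mAPval 50 | params (M) | FLOPs (B) |
| --- | --- | --- | --- | --- | --- |
| YOLOv9t | 640 | 38.3 | 53.1 | 2.0 | 7.7 |
| YOLOv9s | 640 | 46.8 | 63.4 | 7.2 | 26.7 |
| YOLOv9m | 640 | 51.4 | 68.1 | 20.1 | 76.8 |
| YOLOv9c | 640 | 53.0 | 70.2 | 25.5 | 102.8 |
| YOLOv9e | 640 | 55.6 | 72.8 | 58.1 | 192.5 |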
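
One way to read the detection table is as a trade-off between accuracy and model size. The sketch below copies the mAP 50-95, params, and FLOPs figures from the table and picks the most accurate model under a parameter budget. The `Yolov9Row` type and `best_under_budget` function are hypothetical helpers, not part of any library.

```python
from typing import List, NamedTuple, Optional


class Yolov9Row(NamedTuple):
    name: str
    map50_95: float  # mAPval 50-95
    params_m: float  # parameters, in millions
    flops_b: float   # FLOPs, in billions


# Values copied from the detection (COCO) table above.
DETECTION_ROWS: List[Yolov9Row] = [
    Yolov9Row("YOLOv9t", 38.3, 2.0, 7.7),
    Yolov9Row("YOLOv9s", 46.8, 7.2, 26.7),
    Yolov9Row("YOLOv9m", 51.4, 20.1, 76.8),
    Yolov9Row("YOLOv9c", 53.0, 25.5, 102.8),
    Yolov9Row("YOLOv9e", 55.6, 58.1, 192.5),
]


def best_under_budget(rows: List[Yolov9Row], max_params_m: float) -> Optional[Yolov9Row]:
    """Return the row with the highest mAP 50-95 whose parameter count fits the budget."""
    candidates = [row for row in rows if row.params_m <= max_params_m]
    return max(candidates, key=lambda row: row.map50_95, default=None)


if __name__ == "__main__":
    print(best_under_budget(DETECTION_ROWS, max_params_m=10.0))
```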

Segmentation (COCO)

| Model | size (pixels) | mAPbox 50-95 | mAPmask 50-95 | params (M) | FLOPs (B) |
| --- | --- | --- | --- | --- | --- |
| YOLOv9c-seg | 640 | 52.4 | 42.2 | 27.9 | 159.4 |
| YOLOv9e-seg | 640 | 55.1 | 44.3 | 60.5 | 248.4 |

Training on the coco8 dataset

 
from ultralytics import YOLO
# Load a pretrained model
model = YOLO("yolov8n.pt")
# Train the model
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
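
Training arguments beyond `data`, `epochs`, and `imgsz` can be passed to `model.train` as keyword arguments in the same way. The sketch below keeps them in one dictionary so that a run is easy to reproduce. `TRAIN_ARGS`, `merged_args`, and `run_training` are hypothetical names, and the values for `patience` and `device` are illustrative choices rather than recommendations.

```python
from typing import Any, Dict

# Argument names are taken from the Ultralytics train settings; values are examples.
TRAIN_ARGS: Dict[str, Any] = {
    "data": "coco8.yaml",
    "epochs": 100,
    "imgsz": 640,
    "patience": 20,   # illustrative; the documented default is 100
    "batch": 16,
    "device": "cpu",  # illustrative; use 0 for the first GPU
    "seed": 0,
}


def merged_args(**overrides: Any) -> Dict[str, Any]:
    """Return TRAIN_ARGS with individual settings replaced by keyword overrides."""
    return {**TRAIN_ARGS, **overrides}


def run_training(weights: str = "yolov8n.pt", **overrides: Any):
    """Load the weights and train with the merged settings."""
    # Imported lazily so the settings can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**merged_args(**overrides))
```

For a quick smoke test, something like `run_training(epochs=3)` would train for only three epochs while leaving the other settings unchanged.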

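Setting data="coco8.yaml" makes Ultralytics download the COCO8 dataset automatically, using the download URL in the configuration file, if it is not already present locally. For a description of what the dataset contains, see: https://docs.ultralytics.com/datasets/detect/coco8/#introduction

The dataset configuration file is https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8.yaml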

# Ultralytics YOLO 🚀, AGPL-3.0 license
# COCO8 dataset (first 8 images from COCO train2017) by Ultralytics
# Documentation: https://docs.ultralytics.com/datasets/detect/coco8/
# Example usage: yolo train data=coco8.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco8  ← downloads here (1 MB)
 
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8 # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)
 
# Classes
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  5: bus
  6: train
  7: truck
  8: boat
  9: traffic light
  10: fire hydrant
  11: stop sign
  12: parking meter
  13: bench
  14: bird
  15: cat
  16: dog
  17: horse
  18: sheep
  19: cow
  20: elephant
  21: bear
  22: zebra
  23: giraffe
  24: backpack
  25: umbrella
  26: handbag
  27: tie
  28: suitcase
  29: frisbee
  30: skis
  31: snowboard
  32: sports ball
  33: kite
  34: baseball bat
  35: baseball glove
  36: skateboard
  37: surfboard
  38: tennis racket
  39: bottle
  40: wine glass
  41: cup
  42: fork
  43: knife
  44: spoon
  45: bowl
  46: banana
  47: apple
  48: sandwich
  49: orange
  50: broccoli
  51: carrot
  52: hot dog
  53: pizza
  54: donut
  55: cake
  56: chair
  57: couch
  58: potted plant
  59: bed
  60: dining table
  61: toilet
  62: tv
  63: laptop
  64: mouse
  65: remote
  66: keyboard
  67: cell phone
  68: microwave
  69: oven
  70: toaster
  71: sink
  72: refrigerator
  73: book
  74: clock
  75: vase
  76: scissors
  77: teddy bear
  78: hair drier
  79: toothbrush
 
# Download script/URL (optional)
download: https://ultralytics.com/assets/coco8.zip
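
The comments in the configuration say that `train` and `val` are relative to `path`. The sketch below illustrates how those fields combine, using a plain dictionary in place of the parsed YAML. `resolve_split_dirs` is a hypothetical helper; Ultralytics performs its own resolution internally (including where the dataset root itself is placed), which this simplification does not reproduce.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional

# The path-related fields of coco8.yaml; the empty `test:` entry parses as None.
COCO8: Dict[str, Optional[str]] = {
    "path": "../datasets/coco8",
    "train": "images/train",
    "val": "images/val",
    "test": None,
}


def resolve_split_dirs(cfg: Dict[str, Optional[str]]) -> Dict[str, PurePosixPath]:
    """Join each non-empty split entry onto the dataset root directory."""
    root = PurePosixPath(cfg["path"])
    return {
        split: root / cfg[split]
        for split in ("train", "val", "test")
        if cfg.get(split)  # skip splits that are missing or left empty
    }


if __name__ == "__main__":
    for split, directory in resolve_split_dirs(COCO8).items():
        print(split, "->", directory)
```

A custom dataset configuration would follow the same layout: a root `path`, split directories relative to it, and a `names` mapping from class index to class name.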

Parameter reference

https://docs.ultralytics.com/usage/cfg/#train-settings

Train Settings

| Argument | Default | Description |
| --- | --- | --- |
| model | None | Specifies the model file for training. Accepts a path to either a .pt pretrained model or a .yaml configuration file. Essential for defining the model structure or initializing weights. |
| data | None | Path to the dataset configuration file (e.g., coco8.yaml). This file contains dataset-specific parameters, including paths to training and validation data, class names, and number of classes. |
| epochs | 100 | Total number of training epochs. Each epoch represents a full pass over the entire dataset. Adjusting this value can affect training duration and model performance. |
| time | None | Maximum training time in hours. If set, this overrides the epochs argument, allowing training to automatically stop after the specified duration. Useful for time-constrained training scenarios. |
| patience | 100 | Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus. |
| batch | 16 | Batch size, with three modes: set as an integer (e.g., batch=16), auto mode for 60% GPU memory utilization (batch=-1), or auto mode with specified utilization fraction (batch=0.70). |
| imgsz | 640 | Target image size for training. All images are resized to this dimension before being fed into the model. Affects model accuracy and computational complexity. |
| save | True | Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment. |
| save_period | -1 | Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions. |
| cache | False | Enables caching of dataset images in memory (True/ram), on disk (disk), or disables it (False). Improves training speed by reducing disk I/O at the cost of increased memory usage. |
| device | None | Specifies the computational device(s) for training: a single GPU (device=0), multiple GPUs (device=0,1), CPU (device=cpu), or MPS for Apple silicon (device=mps). |
| workers | 8 | Number of worker threads for data loading (per RANK if Multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups. |
| project | None | Name of the project directory where training outputs are saved. Allows for organized storage of different experiments. |
| name | None | Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored. |
| exist_ok | False | If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs. |
| pretrained | True | Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance. |
| optimizer | 'auto' | Choice of optimizer for training. Options include SGD, Adam, AdamW, NAdam, RAdam, RMSProp etc., or auto for automatic selection based on model configuration. Affects convergence speed and stability. |
| verbose | False | Enables verbose output during training, providing detailed logs and progress updates. Useful for debugging and closely monitoring the training process. |
| seed | 0 | Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations. |
| deterministic | True | Forces deterministic algorithm use, ensuring reproducibility but may affect performance and speed due to the restriction on non-deterministic algorithms. |
| single_cls | False | Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification. |
| rect | False | Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy. |
| cos_lr | False | Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence. |
| close_mosaic | 10 | Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature. |
| resume | False | Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly. |
| amp | True | Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy. |
| fraction | 1.0 | Specifies the fraction of the dataset to use for training. Allows for training on a subset of the full dataset, useful for experiments or when resources are limited. |
| profile | False | Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment. |
| freeze | None | Freezes the first N layers of the model or specified layers by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning. |
| lr0 | 0.01 | Initial learning rate (i.e. SGD=1E-2, Adam=1E-3). Adjusting this value is crucial for the optimization process, influencing how rapidly model weights are updated. |
| lrf | 0.01 | Final learning rate as a fraction of the initial rate = (lr0 * lrf), used in conjunction with schedulers to adjust the learning rate over time. |
| momentum | 0.937 | Momentum factor for SGD or beta1 for Adam optimizers, influencing the incorporation of past gradients in the current update. |
| weight_decay | 0.0005 | L2 regularization term, penalizing large weights to prevent overfitting. |
| warmup_epochs | 3.0 | Number of epochs for learning rate warmup, gradually increasing the learning rate from a low value to the initial learning rate to stabilize training early on. |
| warmup_momentum | 0.8 | Initial momentum for warmup phase, gradually adjusting to the set momentum over the warmup period. |
| warmup_bias_lr | 0.1 | Learning rate for bias parameters during the warmup phase, helping stabilize model training in the initial epochs. |
| box | 7.5 | Weight of the box loss component in the loss function, influencing how much emphasis is placed on accurately predicting bounding box coordinates. |
| cls | 0.5 | Weight of the classification loss in the total loss function, affecting the importance of correct class prediction relative to other components. |
| dfl | 1.5 | Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification. |
| pose | 12.0 | Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints. |
| kobj | 2.0 | Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy. |
| label_smoothing | 0.0 | Applies label smoothing, softening hard labels to a mix of the target label and a uniform distribution over labels, can improve generalization. |
| nbs | 64 | Nominal batch size for normalization of loss. |
| overlap_mask | True | Determines whether segmentation masks should overlap during training, applicable in instance segmentation tasks. |
| mask_ratio | 4 | Downsample ratio for segmentation masks, affecting the resolution of masks used during training. |
| dropout | 0.0 | Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training. |
| val | True | Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset. |
| plots | False | Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insights into model performance and learning progression. |
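
The relationship between `lr0`, `lrf`, and `cos_lr` can be made concrete with a small calculation. The sketch below assumes a linear decay from `lr0` to `lr0 * lrf` by default and a cosine-shaped decay between the same endpoints when `cos_lr` is set; it ignores the warmup phase entirely. The function `lr_at_epoch` is a hypothetical illustration, not the library's scheduler code.

```python
import math


def lr_at_epoch(
    epoch: int,
    epochs: int,
    lr0: float = 0.01,
    lrf: float = 0.01,
    cos_lr: bool = False,
) -> float:
    """Learning rate after `epoch` of `epochs`, decaying from lr0 to lr0 * lrf (no warmup)."""
    progress = epoch / epochs
    if cos_lr:
        # Cosine curve: factor goes from 1.0 at progress 0 to lrf at progress 1.
        factor = lrf + (1.0 - lrf) * (1.0 + math.cos(math.pi * progress)) / 2.0
    else:
        # Straight line between the same two endpoints.
        factor = max(1.0 - progress, 0.0) * (1.0 - lrf) + lrf
    return lr0 * factor


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        linear = lr_at_epoch(epoch, 100)
        cosine = lr_at_epoch(epoch, 100, cos_lr=True)
        print(f"epoch {epoch:3d}: linear={linear:.6f} cosine={cosine:.6f}")
```

With the defaults from the table (lr0=0.01, lrf=0.01), both curves start at 0.01 and end at 0.0001; they differ only in how quickly they get there.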

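Output dimension conversion

This step addresses model output whose dimension order is not the one you need. When the data of interest sits in the second dimension or deeper, reading it requires a large number of strided index accesses. For example, with output of shape [1, 860000, 5], collecting the data along the third dimension means iterating 860000 times.

To change the output dimensions, use the v8trans.py script from the shouxieai/infer repository (GitHub mirror: https://hub.nuaa.cf/shouxieai/infer):

python v8trans.py best.onnx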
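
The effect of reordering dimensions can be illustrated on a tiny nested list. This is only a sketch of a transpose of the last two axes in plain Python: it does not reproduce v8trans.py, the shape [1, 5, 3] and all values are made up, and `transpose_last_two` is a hypothetical helper. In practice an array library would normally do this.

```python
from typing import List

Matrix = List[List[float]]


def transpose_last_two(batch: List[Matrix]) -> List[Matrix]:
    """Swap the last two axes of a [B, R, C] nested list, giving [B, C, R]."""
    return [[list(column) for column in zip(*matrix)] for matrix in batch]


# Made-up output of shape [1, 5, 3]: five attributes, each holding values for three boxes.
attributes_first: List[Matrix] = [[
    [10.0, 20.0, 30.0],  # x of each box
    [11.0, 21.0, 31.0],  # y of each box
    [2.0, 3.0, 4.0],     # width of each box
    [5.0, 6.0, 7.0],     # height of each box
    [0.9, 0.8, 0.7],     # score of each box
]]

# After the transpose the shape is [1, 3, 5]: one row per box.
boxes_first = transpose_last_two(attributes_first)
first_box = boxes_first[0][0]

if __name__ == "__main__":
    print(first_box)
```

Before the transpose, gathering one box means picking one element out of each of the five attribute rows; afterwards, each box is a single contiguous row.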

Model format conversion

from ultralytics import YOLO
 
# Load a model
model = YOLO("yolov8n.pt")  # load a pretrained model (recommended for training)
results = model.export(format='onnx')  # export the model to ONNX format

Annotation tools

X-AnyLabeling (automatic annotation)

https://github.com/CVHub520/X-AnyLabeling