AFOD Open Source: Major Small-Object Detection Upgrade for SpireCV

by AMOVLAB, 14 Mar 2025

Background

On September 30, 2024, the Ultralytics team officially released YOLOv11, another major upgrade to the YOLO series of real-time object detectors and a sign of how quickly object detection is developing.

Small-object detection has long been one of the hard problems in object detection, because small targets have weak visual features and high noise. This is especially true in UAV applications. Because UAVs fly high, scenes often contain many small targets from which few features can be extracted. Large fluctuations in flight altitude also change the apparent scale of objects drastically, which makes detection much harder. Moreover, real flight footage contains many complex scenes, and densely packed small targets are easily occluded by one another or by the background.

Principle

AFOD (AutoFocusObjectDetector) is SpireCV's new open-source algorithm for small-object detection from the UAV perspective. Its Chinese name translates to "attention target detection". The example below shows AFOD detecting distant vehicles with the GX40 pod without zoom; the targets are far smaller than 32×32 pixels.

The main advantage of attention target detection is that it balances small-target detection accuracy against frame rate. It runs in two stages, in order:

    1. Global target search, generally at 1280×1280 resolution
    2. Once a target has been found, sub-region detection, usually at 640×640 resolution
  • The details are shown in the figure below:

This detector takes two general-purpose object detectors as input: one for full-image search and one for sub-region search. The target categories are defined by the dataset the detectors were trained on, and the output for each target is its category and pixel position (a bounding rectangle).
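Because the second detector only sees a crop, its boxes have to be translated back into full-image pixels before they are reported. The following is a minimal sketch of that bookkeeping, not SpireCV's implementation: the `Detection` type, `clamp_region`, `to_full_image`, and the assumption that the crop is resized to a square `net_size` input before inference are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector output: a category name plus an axis-aligned box in pixels."""
    category: str
    x: float  # left edge
    y: float  # top edge
    w: float  # box width
    h: float  # box height


def clamp_region(cx, cy, side, img_w, img_h):
    """Return (x0, y0, side) of a square crop centred on (cx, cy), kept inside the image."""
    side = min(side, img_w, img_h)
    x0 = min(max(cx - side / 2, 0), img_w - side)
    y0 = min(max(cy - side / 2, 0), img_h - side)
    return x0, y0, side


def to_full_image(det, x0, y0, side, net_size=640):
    """Map a box from sub-region detector coordinates back to full-image pixels.

    Assumes the crop was resized to net_size x net_size before inference.
    """
    s = side / net_size
    return Detection(det.category, x0 + det.x * s, y0 + det.y * s, det.w * s, det.h * s)
```

For example, a 320-pixel crop requested near the right edge of a 1920×1080 frame is shifted left so that it stays inside the image, and a box found in the 640×640 network input is scaled by 0.5 and offset by the crop origin.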

The relevant configuration parameters are detailed as follows:

  1. lock_thres: Number of consecutive frames in which the same target must be detected before entering sub-region detection. The default is 5 frames.

  2. unlock_thres: Number of consecutive frames in which the target must be lost in the sub-region before returning to global detection. The default is 5 frames.

  3. lock_scale_init: Controls the initial sub-region size, expressed as a multiple of the target's pixel width. The default is 12.

  4. lock_scale: Controls the sub-region size once sub-region tracking is stable, also as a multiple of the target's pixel width. The default is 8.

  5. categories_filter: Names of the target categories to keep, for example ["person", "car"]. If it is empty, no filtering is performed.

  6. keep_unlocked: Whether to output targets that have not been locked onto. The default is false (they are not output).

  7. use_square_region: Whether to use a square region during the initial (global) detection. If true, the margins on both sides of a non-square input image are not searched. The default is false.
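How the first five parameters interact can be sketched in a few lines. This is an illustration under stated assumptions rather than SpireCV's code: `AFODConfig`, `FocusState`, `region_side`, and `passes_filter` are hypothetical names, the defaults are the values quoted in the list above, and "stable" is passed in by the caller because the article does not define when tracking counts as stable.

```python
from dataclasses import dataclass, field


@dataclass
class AFODConfig:
    # Defaults are the values quoted in the parameter list above.
    lock_thres: int = 5
    unlock_thres: int = 5
    lock_scale_init: float = 12.0
    lock_scale: float = 8.0
    categories_filter: list = field(default_factory=list)
    keep_unlocked: bool = False
    use_square_region: bool = False


class FocusState:
    """Hypothetical two-mode switch: global search <-> sub-region detection."""

    def __init__(self, cfg):
        self.cfg = cfg
        self.locked = False
        self.hits = 0    # consecutive frames with the target while searching
        self.misses = 0  # consecutive frames without it while locked

    def update(self, target_found):
        """Feed one frame's result; return True if the next frame uses the sub-region."""
        if not self.locked:
            self.hits = self.hits + 1 if target_found else 0
            if self.hits >= self.cfg.lock_thres:
                self.locked, self.hits, self.misses = True, 0, 0
        else:
            self.misses = 0 if target_found else self.misses + 1
            if self.misses >= self.cfg.unlock_thres:
                self.locked, self.hits, self.misses = False, 0, 0
        return self.locked


def region_side(cfg, target_width_px, stable):
    """Sub-region side length as a multiple of the target's pixel width."""
    scale = cfg.lock_scale if stable else cfg.lock_scale_init
    return scale * target_width_px


def passes_filter(cfg, category):
    """An empty categories_filter keeps every category."""
    return not cfg.categories_filter or category in cfg.categories_filter
```

With the defaults, a target 20 pixels wide would get a 240-pixel sub-region when first locked and a 160-pixel sub-region once tracking is stable. A single frame in which the target reappears resets the miss counter, so the detector only falls back to global search after five misses in a row.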

General-purpose object detectors:
The two general-purpose detectors used by AFOD in this release are yolov11s and yolov11s6 models (640×640 and 1280×1280, respectively) trained on the VisDrone2019-DET dataset. The following shows the detection results of yolov11s6 combined with the 10x optical zoom of the GX40 pod while a P600 UAV hovers at an altitude of 40 meters. The detector effectively identifies vehicles within 1,600 meters and pedestrians within 1,400 meters.

[Video: included in the original Chinese article]

Usage
