Swarm Search, Encirclement, and Air-Ground Cooperative Target Grasping System
An intelligent air-ground integrated solution combining UAV swarms, unmanned ground vehicles, and robotic arms.
In complex environments, quickly finding a target, tracking it stably, and grasping it safely are shared challenges in public safety, emergency response, industrial inspection, and many other scenarios. Based on controllable software and hardware platforms, AMOVLAB developed a three-UAV swarm search and encirclement system with air-ground cooperative target grasping, forming an integrated closed loop from aerial search to intelligent encirclement and ground-side grasping.
1. Overall System Architecture
The system consists of three layers: the aerial swarm layer, the ground execution layer, and the command-and-collaboration layer. The layers are tightly coupled through wireless communication and a unified coordinate system, so that missions run as a closed loop from detection to grasping.
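The unified coordinate system mentioned above can be illustrated with a minimal sketch. The source does not say how the frame is defined, so this assumes that every platform reports RTK latitude and longitude and that positions are expressed as east/north offsets from a shared origin, using an equirectangular approximation that is adequate over a few kilometers. The function name `geodetic_to_local_en` and the origin are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)


def geodetic_to_local_en(lat_deg, lon_deg, origin_lat_deg, origin_lon_deg):
    """Return (east, north) in metres relative to a shared origin.

    Assumption: small operating area, so an equirectangular
    approximation is used instead of a full geodetic conversion.
    """
    d_lat = math.radians(lat_deg - origin_lat_deg)
    d_lon = math.radians(lon_deg - origin_lon_deg)
    north = EARTH_RADIUS_M * d_lat
    east = EARTH_RADIUS_M * d_lon * math.cos(math.radians(origin_lat_deg))
    return east, north


# Hypothetical origin and one UAV fix slightly north-east of it.
east, north = geodetic_to_local_en(30.001, 120.001, 30.0, 120.0)
```

With all UAVs and the UGV converting to the same origin, a target position estimated by one platform can be used directly by the others.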
Aerial Swarm Layer: Multi-UAV Cooperative Flight and Target Search
The aerial side uses three quadrotors based on the JCV-600 platform as the basic swarm units.
- 600 mm wheelbase medium-sized frame with a stable, extensible platform design
- Aircraft weight with battery: about 2.4 kg; maximum takeoff weight: 4 kg; maximum payload: 1.5 kg
- Endurance in ideal conditions: about 70 minutes unloaded, about 35 minutes with 1.5 kg payload
- Maximum speed: about 45 km/h in Position mode and 80 km/h in Sport mode
- Hover accuracy: about ±0.5 m vertical and ±1.5 m horizontal
- Wind resistance: up to about 15 m/s
Each UAV can carry video/data communication modules, RTK positioning modules, onboard edge computers, and zoom gimbal cameras. Together, these devices support fast coverage of the target area, continuous observation, and local intelligent processing.

Ground Execution Layer: UGV and Six-Axis Robotic Arm
The ground execution layer consists of the RANGER MINI 2.0 unmanned ground vehicle chassis and an RM65-B six-axis robotic arm. It handles ground mobility, target approach, and final grasping.


The RANGER MINI 2.0 uses four wheel-hub servo motors and supports in-place rotation, snake-like movement, and Ackermann steering. Its maximum speed is about 2.6 m/s, with a body weight of about 135 kg and payload capacity of about 150 kg. It can cross obstacles up to about 100 mm, climb slopes of about 10 degrees, and typically operate for about seven hours.
The RM65-B robotic arm weighs about 7.2 kg and has a 5 kg payload, a maximum working radius of about 610 mm, repeatability of ±0.05 mm, and an IP54 protection rating. It supports graphical programming, offline simulation, and integration with ROS / MoveIt.
Command and Collaboration Layer: Ground Station and Swarm Management
The ground station acts as the mission hub. It monitors the position, attitude, battery level, and working status of the UAVs and the ground vehicle; displays multi-UAV tracks, operation areas, and key mission points; supports single-UAV and swarm-level control; and provides anomaly alerts and safety-strategy management.
2. Key Modules and Capabilities
Communication and Positioning
To support air-ground integration and multi-UAV collaboration, the system uses dedicated communication and positioning modules.
The LQ video/data link operates in the 2.4 GHz band and supports IEEE 802.11n WLAN. It provides up to about 40 Mbps of bandwidth, a minimum link latency below 3 ms, and a transmission distance of up to about 3 km under ideal conditions.

The M15 RTK positioning module supports multi-constellation RTK high-precision positioning with BDS, GPS, GLONASS, and QZSS. Horizontal accuracy can reach 10 mm + 1 ppm and vertical accuracy 15 mm + 1 ppm, with a data update rate of up to 20 Hz.
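As a worked example of the "constant + ppm" accuracy figures, the ppm term grows with the baseline, the distance between the base station and the rover: 1 ppm of 1 km is 1 mm. The sketch below evaluates the quoted figures for a hypothetical 3 km baseline; the baseline and the function name `rtk_error_mm` are illustrative and do not come from the module's datasheet.

```python
def rtk_error_mm(constant_mm, ppm, baseline_km):
    """Nominal error of a 'constant + ppm' accuracy spec, in millimetres.

    1 ppm of a 1 km baseline is 1 mm, so the ppm term is ppm * baseline_km.
    """
    return constant_mm + ppm * baseline_km


# Hypothetical 3 km baseline between RTK base station and UAV.
horizontal_mm = rtk_error_mm(10.0, 1.0, 3.0)  # 13.0 mm
vertical_mm = rtk_error_mm(15.0, 1.0, 3.0)    # 18.0 mm
```

Even at several kilometers the ppm term adds only a few millimeters, so the constant term dominates over the distances this system covers.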
Multimodal Visual Perception
The system combines a zoom gimbal camera and a stereo depth camera, enabling layered perception from long-distance discovery to close-range 3D localization.

The GX40 three-axis zoom gimbal camera offers strong disturbance rejection and high-precision attitude control. It has an 8.29 MP camera with 10x optical zoom and up to about 40x combined zoom, supports network video output at up to 4K@30 FPS, and includes laser illumination for clear images in dark environments.
Edge AI Computing
Both aerial and ground platforms can carry the AllSpark2-Orin NX edge computing unit as the front-end AI computing core. It integrates an NVIDIA Jetson Orin NX module with up to 100 TOPS AI performance, a 1024-core Ampere GPU, an 8-core Cortex-A78AE CPU, M.2 NVMe SSD support, and rich interfaces including CSI camera, Gigabit Ethernet, USB, CAN, UART, and I2C.

3. Mission Flow: From Search to Grasping
Swarm Area Search
Before the mission begins, the operator selects a rectangular target area on the map through the ground station. The ground station sends the area information and related parameters to the UAV swarm. Each UAV plans and executes search routes based on area division and its own status. During search, onboard vision and AI modules continuously detect and recognize objects, while the swarm shares status and local information through the communication link.
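The area division and route planning described above could be sketched as follows. The source does not specify the division scheme, so this assumes the simplest option: the rectangle is split into equal-width strips, one per UAV, and each UAV flies a back-and-forth (lawnmower) sweep inside its strip. The function names, the local metric coordinates, and the lane spacing are all hypothetical.

```python
def split_rectangle(x_min, y_min, x_max, y_max, n_uavs):
    """Split a rectangle into n equal-width vertical strips, one per UAV."""
    width = (x_max - x_min) / n_uavs
    return [
        (x_min + i * width, y_min, x_min + (i + 1) * width, y_max)
        for i in range(n_uavs)
    ]


def lawnmower_waypoints(strip, spacing):
    """Back-and-forth sweep over one strip; spacing is the lane gap in metres."""
    x_min, y_min, x_max, y_max = strip
    waypoints = []
    x = x_min + spacing / 2.0  # centre the first lane inside the strip
    going_up = True
    while x < x_max:
        start, end = (y_min, y_max) if going_up else (y_max, y_min)
        waypoints.append((x, start))
        waypoints.append((x, end))
        going_up = not going_up  # reverse direction for the next lane
        x += spacing
    return waypoints


# Hypothetical 300 m x 200 m search area shared by three UAVs.
strips = split_rectangle(0.0, 0.0, 300.0, 200.0, 3)
routes = [lawnmower_waypoints(s, spacing=20.0) for s in strips]
```

In practice the lane spacing would be chosen from the camera footprint at the search altitude, and the split could be weighted by each UAV's remaining battery rather than being equal.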
Target Recognition and Encirclement
The system provides a complete target-recognition workflow from data labeling and model training to recognition testing. During recognition, it can output image size, frame rate, camera field of view, target pixel-center coordinates, target category, line-of-sight angles, and estimated 3D target position.
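To show how line-of-sight angles relate to the pixel-center coordinates and camera field of view listed above, here is a minimal sketch. It assumes an ideal pinhole camera with square pixels, no lens distortion, and the principal point at the image center; the system's actual computation is not described in the source, and `pixel_to_los_angles` is a hypothetical name.

```python
import math


def pixel_to_los_angles(u, v, image_width, image_height, hfov_deg):
    """Return (azimuth, elevation) in degrees relative to the optical axis.

    Azimuth is positive to the right; elevation is positive upward
    (image v grows downward, hence the sign flip).
    """
    # Focal length in pixels from the horizontal field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    dx = u - image_width / 2.0
    dy = v - image_height / 2.0
    azimuth = math.degrees(math.atan2(dx, focal_px))
    elevation = math.degrees(math.atan2(-dy, math.hypot(dx, focal_px)))
    return azimuth, elevation


# Hypothetical 3840 x 2160 frame with a 60 degree horizontal field of view.
az, el = pixel_to_los_angles(2400, 900, 3840, 2160, 60.0)
```

Combining these camera-frame angles with the gimbal attitude and the UAV's RTK position is what allows a 3D target position to be estimated.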
After a UAV detects the target, it continuously locks onto it and records its trajectory. Once enough trajectory data has accumulated, the system fits the track, predicts the target’s likely future path, and broadcasts the prediction to the other UAVs. Those UAVs then plan their own flight paths and encirclement positions accordingly.
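The fit, predict, and encircle steps can be sketched under simple assumptions. The source does not name the fitting model or the encirclement geometry, so this example swaps in a constant-velocity least-squares line fit per axis and places the UAVs evenly on a circle around the predicted point. All function names, the sample track, and the 8 m radius are hypothetical.

```python
import math


def fit_line(times, values):
    """Least-squares fit of value = a + b * t; returns (a, b)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_tv = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = s_tv / s_tt
    return v_mean - slope * t_mean, slope


def predict_position(track, t_future):
    """track is a list of (t, x, y) samples; returns the predicted (x, y)."""
    ts = [p[0] for p in track]
    ax, bx = fit_line(ts, [p[1] for p in track])
    ay, by = fit_line(ts, [p[2] for p in track])
    return ax + bx * t_future, ay + by * t_future


def encirclement_slots(center, radius, n_uavs):
    """Evenly spaced positions on a circle around the predicted point."""
    cx, cy = center
    step = 2.0 * math.pi / n_uavs
    return [
        (cx + radius * math.cos(k * step), cy + radius * math.sin(k * step))
        for k in range(n_uavs)
    ]


# Hypothetical track: target moving at 2 m/s in x and 1 m/s in y.
track = [(0.0, 0.0, 0.0), (1.0, 2.0, 1.0), (2.0, 4.0, 2.0), (3.0, 6.0, 3.0)]
predicted = predict_position(track, t_future=5.0)
slots = encirclement_slots(predicted, radius=8.0, n_uavs=3)
```

A real target rarely moves in a straight line, so a higher-order fit or a filter-based tracker may be more appropriate; the structure of predicting first and then assigning slots stays the same.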
Air-Ground Cooperative Target Grasping
After the aerial swarm discovers and encircles the target, the target’s position and status are sent to the ground vehicle. The UGV navigates to the target, uses LiDAR and vision sensors for environment perception, and performs path planning and obstacle avoidance. Near the target, the depth camera and end-effector vision module identify and localize the target in 3D. The robotic arm then calculates the grasping pose and performs the grasp.
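Before a grasp pose can be computed, the target position measured by the depth camera has to be expressed in the arm's base frame. The sketch below applies a fixed rigid transform and then performs a coarse reach check against the RM65-B working radius of about 610 mm quoted earlier. The camera mounting (rotation and offset), the sample measurement, and the function names are hypothetical, and treating the workspace as a sphere around the base is a simplification.

```python
import math

MAX_REACH_M = 0.610  # RM65-B maximum working radius from the spec above

# Assumed mounting: camera axes (x right, y down, z forward) mapped to
# base axes (x forward, y left, z up), camera 0.10 m ahead of and
# 0.05 m above the arm base.
R_BASE_CAM = [[0.0, 0.0, 1.0],
              [-1.0, 0.0, 0.0],
              [0.0, -1.0, 0.0]]
T_BASE_CAM = [0.10, 0.0, 0.05]


def transform_point(rotation, translation, point):
    """Apply p_base = R * p_cam + t using plain nested lists."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def is_reachable(point_base, max_reach=MAX_REACH_M):
    """Coarse check: is the point within the arm's working radius?"""
    return math.sqrt(sum(c * c for c in point_base)) <= max_reach


# Hypothetical depth-camera measurement of the target, in metres.
target_cam = (0.05, 0.02, 0.40)
target_base = transform_point(R_BASE_CAM, T_BASE_CAM, target_cam)
can_grasp = is_reachable(target_base)
```

If the check fails, the UGV would need to move closer before the arm plans the grasp; the actual pose planning would typically be delegated to a motion planner such as MoveIt, which the arm is stated to support.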
4. Application Value
Based on mature UAV, UGV, and robotic-arm platforms, combined with high-bandwidth communication, centimeter-level RTK, high-performance edge computing, and multimodal visual perception, this system provides an integrated platform for swarm control, air-ground collaboration, intelligent perception, and precise operation.
- Complete architecture: covers the full mission chain from aerial search and trajectory prediction to ground grasping.
- Clear modules: UAVs, UGVs, robotic arms, communication links, RTK positioning, and computing units can all be replaced or extended.
- Algorithm-friendly: suitable for research and verification in swarm cooperative control, path planning, object recognition, tracking, and air-ground collaboration.
- Strong scenario adaptability: supports multiple levels of applications, from teaching experiments to complex scenario verification.
With this system, users can carry out integrated R&D and teaching that spans low-level perception, communication, and control through high-level mission planning and collaborative decision-making, laying a foundation for deploying air-ground unmanned systems in more industry scenarios.
