Autonomous UAV flight in cluttered, obstacle-dense environments has long been one of the core challenges in aerial robotics. Traditional solutions typically rely on mapping followed by multi-stage planning, which introduces high latency and limits robustness. Junjie Lu and colleagues from the School of Electrical Automation and Information Engineering at Tianjin University published the paper "You Only Plan Once: A Learning-Based One-Stage Planner With Guidance Learning" in IEEE Robotics and Automation Letters, proposing the end-to-end planning algorithm YOPO. The method requires no real-time mapping: it integrates perception, search and optimization, and generates a safe trajectory in a single forward pass. Combined with the new Guidance Learning training strategy, YOPO achieves millisecond-level response in both simulation and real-world tests.


Video source: https://www.youtube.com/watch?v=m7u1MYIuIn4

01 Research background

Aerial navigation usually requires three major modules: perception and mapping, path search, and trajectory optimization. Although this traditional three-stage pipeline has a clear structure, in real deployments it tends to create bottlenecks such as accumulated latency, error amplification between stages, and a limited re-planning frequency, especially in high-speed flight through obstacle-dense environments. To address these issues, YOPO encapsulates depth perception, motion primitive search and trajectory optimization in a single network, eliminating online mapping and serial module calls. Safe trajectories are generated in parallel with a single forward pass, enabling autonomous flight with millisecond-level planning.

02 System introduction

The system integrates the perception-planning-control pipeline into a single framework:
Training phase: the network takes the depth map, body velocity and acceleration, and target direction as input. The ground-truth ESDF and pose are used only to compute the trajectory cost (smoothness, safety, target) and its numerical gradients, which are backpropagated to update the network weights through Guidance Learning. Predefined motion primitives, evenly distributed across the field of view, serve as anchors; the network predicts their endpoint offsets and terminal derivatives, yielding multiple candidate trajectories whose costs are then evaluated.

Inference phase: no online mapping is needed. The depth map, velocity and acceleration are fed into the model, which outputs the offsets of all predefined motion primitives; the coefficients of a fifth-order (quintic) polynomial are then solved from these offsets. Trackable trajectories are generated in real time and executed by the controller, achieving millisecond-level, map-free autonomous flight.
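
The step of solving a fifth-order polynomial from predicted end states can be illustrated with a minimal sketch. It assumes a single axis, a fixed duration T, and boundary conditions on position, velocity and acceleration at both ends; the function names and the numbers in the demo are hypothetical and are not taken from the YOPO code base.

```python
def quintic_coefficients(p0, v0, a0, pf, vf, af, T):
    """Coefficients c0..c5 of p(t) = sum(c_i * t**i) on [0, T] that match
    position, velocity and acceleration at both ends (one axis)."""
    dp = pf - p0
    c0 = p0
    c1 = v0
    c2 = a0 / 2.0
    c3 = (20.0 * dp - (8.0 * vf + 12.0 * v0) * T - (3.0 * a0 - af) * T**2) / (2.0 * T**3)
    c4 = (-30.0 * dp + (14.0 * vf + 16.0 * v0) * T + (3.0 * a0 - 2.0 * af) * T**2) / (2.0 * T**4)
    c5 = (12.0 * dp - 6.0 * (vf + v0) * T - (a0 - af) * T**2) / (2.0 * T**5)
    return [c0, c1, c2, c3, c4, c5]


def evaluate(coeffs, t, order=0):
    """Evaluate the polynomial (order=0) or its derivative of the given order at time t."""
    total = 0.0
    for i, c in enumerate(coeffs):
        if i < order:
            continue  # this term vanishes after differentiating `order` times
        factor = 1.0
        for k in range(order):
            factor *= (i - k)
        total += c * factor * t ** (i - order)
    return total


# Hypothetical one-axis example: the anchor endpoint plus a predicted offset
# gives the terminal position; terminal velocity and acceleration stand in
# for the predicted terminal derivatives.
anchor_x, offset_x = 4.0, 0.5
coeffs_x = quintic_coefficients(p0=0.0, v0=3.0, a0=0.0,
                                pf=anchor_x + offset_x, vf=2.0, af=0.0, T=1.5)
end_position = evaluate(coeffs_x, 1.5)      # equals anchor_x + offset_x up to rounding
end_velocity = evaluate(coeffs_x, 1.5, 1)   # equals the requested terminal velocity
```

Solving each axis independently in closed form like this avoids an iterative optimizer at run time, which fits the article's emphasis on millisecond-level planning.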

Image source: Junjie Lu et al., “You Only Plan Once: A Learning-Based One-Stage Planner With Guidance Learning,” IEEE Robotics and Automation Letters, 2024.

03 Technical Highlights

Integrated end-to-end planning framework

The three traditional modules (perception, path search and trajectory optimization) are merged into a single neural network, which significantly reduces overall latency.
The model takes the depth map, current state and target direction as input and outputs multiple sets of candidate trajectory parameters (offset, terminal derivatives and score) in a single forward pass, enabling rapid decision-making.

Motion primitive + offset prediction mechanism

Borrowing the anchor idea from YOLO, fixed motion primitives are used as anchors, and for each one the network outputs an offset and a score to correct the trajectory.
Each primitive covers an angular region of the depth map. The offsets, derivatives and scores of all primitives are predicted in parallel, which efficiently produces diverse local trajectories and explores the solution space thoroughly.
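
The anchor-plus-offset mechanism described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the grid size, field of view, primitive radius and the convention that a lower score means a lower predicted cost are all hypothetical, and the offsets and scores are hand-written stand-ins for network outputs.

```python
import math


def primitive_anchors(rows, cols, h_fov_deg, v_fov_deg, radius):
    """Endpoints of rows*cols motion primitives spread evenly over the field of view.
    Body frame assumed here: x forward, y left, z up."""
    anchors = []
    for r in range(rows):
        for c in range(cols):
            # Direction through the centre of each angular cell
            yaw = math.radians(((c + 0.5) / cols - 0.5) * h_fov_deg)
            pitch = math.radians(((r + 0.5) / rows - 0.5) * v_fov_deg)
            x = radius * math.cos(pitch) * math.cos(yaw)
            y = radius * math.cos(pitch) * math.sin(yaw)
            z = radius * math.sin(pitch)
            anchors.append((x, y, z))
    return anchors


def select_trajectory(anchors, offsets, scores):
    """Shift every anchor by its predicted offset and return the index and
    endpoint of the candidate with the lowest score (assumed: lower is better)."""
    endpoints = [tuple(a + d for a, d in zip(anchor, offset))
                 for anchor, offset in zip(anchors, offsets)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, endpoints[best]


# Hypothetical 3 x 5 grid over a 90 x 60 degree field of view, 4 m primitives
anchors = primitive_anchors(rows=3, cols=5, h_fov_deg=90.0, v_fov_deg=60.0, radius=4.0)

# Stand-ins for the network outputs: one offset and one score per primitive
offsets = [(0.0, 0.0, 0.0)] * len(anchors)
scores = [1.0] * len(anchors)
offsets[7] = (0.5, -0.2, 0.0)   # the central primitive gets a small correction
scores[7] = 0.1                 # and the lowest predicted cost

best_index, best_endpoint = select_trajectory(anchors, offsets, scores)
```

Because every primitive is refined independently, all candidates can be produced in one batch and the choice between them reduces to a single comparison over the scores.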

Image source: Junjie Lu et al., “You Only Plan Once: A Learning-Based One-Stage Planner With Guidance Learning,” IEEE Robotics and Automation Letters, 2024.

Guidance Learning

Ground-truth information about the environment (such as ESDF maps) is used to compute numerical gradients of the trajectory cost, and these gradients train the network parameters directly, avoiding any reliance on expert demonstrations.
Compared with imitation learning, this strategy is grounded more directly in the real environment; compared with reinforcement learning, it is more stable and efficient. It is an unsupervised training method driven by real feedback.
It supports data augmentation and initialization with multiple targets during training, which improves generalization without requiring labels to be re-annotated.
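
To make the idea of a numerical gradient taken from ground-truth distance information more concrete, here is a toy 2D sketch. Everything in it is an assumption made for illustration: the single circular obstacle, the cost terms and weights, the finite-difference step, and the fact that the gradient is applied directly to a trajectory endpoint. In the method described above, the gradient would instead be propagated back into the network weights.

```python
import math

OBSTACLE_CENTER = (2.0, 0.0)   # hypothetical circular obstacle
OBSTACLE_RADIUS = 0.5
GOAL = (4.0, 0.0)              # hypothetical goal position


def esdf(p):
    """Signed distance from point p to the obstacle surface (a toy stand-in for an ESDF map)."""
    return math.hypot(p[0] - OBSTACLE_CENTER[0], p[1] - OBSTACLE_CENTER[1]) - OBSTACLE_RADIUS


def cost(p, w_safety=5.0, w_goal=1.0, d_safe=1.0):
    """Toy trajectory cost: quadratic penalty inside the safety margin plus distance to goal."""
    violation = max(0.0, d_safe - esdf(p))
    goal_distance = math.hypot(p[0] - GOAL[0], p[1] - GOAL[1])
    return w_safety * violation ** 2 + w_goal * goal_distance


def numerical_gradient(f, p, eps=1e-4):
    """Central finite-difference gradient of f at p."""
    grad = []
    for i in range(len(p)):
        upper, lower = list(p), list(p)
        upper[i] += eps
        lower[i] -= eps
        grad.append((f(upper) - f(lower)) / (2.0 * eps))
    return grad


endpoint = [2.0, 0.8]          # stand-in for a predicted endpoint that is too close to the obstacle
initial_cost = cost(endpoint)
initial_gradient = numerical_gradient(cost, endpoint)

for _ in range(50):
    gradient = numerical_gradient(cost, endpoint)
    # Plain gradient descent on the endpoint; a real training loop would
    # pass this gradient to the optimizer of the network instead.
    endpoint = [x - 0.05 * g for x, g in zip(endpoint, gradient)]

final_cost = cost(endpoint)
```

No expert trajectory appears anywhere in this loop: the only supervision is the cost evaluated against the ground-truth distance field, which is the property the text attributes to Guidance Learning.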

Image source: Junjie Lu et al., “You Only Plan Once: A Learning-Based One-Stage Planner With Guidance Learning,” IEEE Robotics and Automation Letters, 2024.

Privileged Learning

Privileged information (ground-truth maps and complete states) is introduced during training to obtain more accurate gradient feedback, while inference relies only on noisy depth maps and low-level state information.
This improves the model's robustness to perception noise and yields strong performance in map-free tasks with demanding real-time requirements.

Image source: Junjie Lu et al., “You Only Plan Once: A Learning-Based One-Stage Planner With Guidance Learning,” IEEE Robotics and Automation Letters, 2024.

04 Experimental testing

Comparative test

To verify the effectiveness of the proposed Guidance Learning training method, the research team compared it with a classic gradient-based optimization method. The results show that Guidance Learning not only achieves a lower average planning cost, but also generates multiple feasible trajectories in parallel in a shorter time (1.6 ms), with stronger global awareness and robustness. In addition, in a simulated dense forest environment, YOPO delivers the best overall performance compared with TopoTraj, MPPI and Agile Autonomy on metrics such as latency, safety and success rate.

Table source: Junjie Lu et al., “You Only Plan Once: A Learning-Based One-Stage Planner With Guidance Learning,” IEEE Robotics and Automation Letters, 2024.

Real-world experiment

Platform configuration: a quadcopter with a 250 mm wheelbase. The core computing unit is an NVIDIA Xavier NX, paired with a RealSense D455 depth camera, and the system uses VINS-Fusion for state estimation.

Image source: Junjie Lu et al., “You Only Plan Once: A Learning-Based One-Stage Planner With Guidance Learning,” IEEE Robotics and Automation Letters, 2024.

Flight scene: the tree density is approximately 0.1 trees per square meter, and the real environment was not used for training.

Test results:

The maximum flight speed reaches 5.52 m/s;
The planner re-plans quickly when unexpected obstacles appear;
No explicit map is built at any point, demonstrating excellent real-time performance and environmental adaptability.

Image source: Junjie Lu et al., “You Only Plan Once: A Learning-Based One-Stage Planner With Guidance Learning,” IEEE Robotics and Automation Letters, 2024.

05 SU17 reproduction

AMOVLAB has reproduced the YOPO algorithm on the SU17 research UAV, based on the paper and the open-source code. Preliminary tests showed excellent response speed and strong generalization ability. We plan to release a complete SU17-based reproduction tutorial covering training environment setup, model deployment and controller integration. Stay tuned!


Resources

Open-source code:
https://github.com/TJU-Aerial-Robotics/YOPO

Paper link:
https://ieeexplore.ieee.org/document/10528860

DOI: 10.1109/LRA.2024.3399589

The content of the article is only for academic exchange and technology sharing. The copyright of the graphic and text materials belongs to the original author and the journal. If there is any infringement, please contact us to delete it.
