Picture the scene: whether in a dense forest or in a cluttered indoor space, a handful of small UAVs fly through continuously, with no mapping and no communication. This is not a concept video but a real result published in Nature Machine Intelligence.

In June 2025, the team of Professors Lin Weiwei and Zou Danping of Shanghai Jiao Tong University published the paper "Learning vision-based agile flight via differentiable physics". For the first time, an end-to-end policy trained with differentiable physics was deployed on real UAVs, achieving single-UAV obstacle avoidance at speeds of up to 20 m/s in a forest and complex cooperative traversal by six UAVs under zero-communication conditions.

Video: https://www.youtube.com/watch?v=LKg9hJqc2cc

Research background

Achieving high-speed autonomous flight in complex, dynamic and unknown environments remains a hard problem for aerial robots. The traditional cascaded pipeline splits the task into localization, mapping, planning and control stages. As speed increases, system latency and accumulated error are amplified, while localization and mapping themselves become less stable at high speed and carry a high computational cost. Together, these factors limit the agility and robustness of such systems in real scenarios.
In recent years, end-to-end vision-based obstacle avoidance policies have performed well, but reinforcement learning often suffers from low sample efficiency and relies heavily on large-scale parallel environments, while imitation learning is constrained by the coverage of expert demonstrations and has limited generalization ability.
The research team proposes combining deep learning with first-principles physical modeling through differentiable simulation: the loss gradient is backpropagated through the closed simulation loop, so the control policy is optimized end to end.

Research methods

Differentiable physics modeling and closed-loop simulation training

A simplified point-mass dynamics model describes the UAV's translational motion (rather than a full rigid-body model) and explicitly includes thrust delay, air drag, control smoothing and other factors;
The dynamics model is differentiable and is used for both state updates and gradient backpropagation in simulation;
The closed simulation loop includes depth-map rendering: at each step the policy network receives the sensed input and outputs an action, and the loss is backpropagated to the policy parameters through the chain rule.

Image source: "Nature Machine Intelligence" (2025), paper "Learning vision-based agile flight via differentiable physics"
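
As a rough illustration of the closed-loop training idea in this section, the Python sketch below rolls out a 1D point mass with a first-order thrust lag and linear drag, then propagates a speed-tracking loss back to a single policy gain by hand using the chain rule. The 1D setting, the linear policy u = k * (v_target - v), and all constants (DT, TAU, DRAG) are illustrative assumptions, not the paper's model or values; the paper trains a neural policy, for which a framework with automatic differentiation would normally replace the hand-written backward pass.

```python
# Minimal sketch under stated assumptions: 1D point mass, linear policy,
# made-up constants. Not the authors' implementation.

DT = 0.05    # simulation step in seconds (assumed)
TAU = 0.2    # thrust lag time constant in seconds (assumed)
DRAG = 0.1   # linear drag coefficient in 1/s (assumed)


def loss_and_grad(k, v_target, steps):
    """Return the speed-tracking loss and its gradient w.r.t. the gain k."""
    # Forward pass: simulate the closed loop and store the velocities.
    v, a = 0.0, 0.0                      # velocity, lagged thrust acceleration
    vs = [v]
    for _ in range(steps):
        u = k * (v_target - v)           # hypothetical linear policy
        a = a + DT / TAU * (u - a)       # first-order thrust delay
        v = v + DT * (a - DRAG * v)      # point mass with linear drag
        vs.append(v)
    loss = sum((x - v_target) ** 2 for x in vs[1:]) / steps

    # Backward pass: chain rule applied step by step, from the last step back.
    c = DT / TAU
    grad_k = 0.0
    gv, ga = 0.0, 0.0                    # dL/dv and dL/da arriving from later steps
    for t in reversed(range(steps)):
        gv += 2.0 * (vs[t + 1] - v_target) / steps   # direct loss on v_{t+1}
        ga += DT * gv                                # v_{t+1} depends on a_{t+1}
        gu = c * ga                                  # a_{t+1} depends on u_t
        grad_k += gu * (v_target - vs[t])            # u_t depends on k
        gv = gv * (1.0 - DT * DRAG) - k * gu         # paths from v_{t+1}, u_t to v_t
        ga = ga * (1.0 - c)                          # path from a_{t+1} to a_t
    return loss, grad_k


loss, grad_k = loss_and_grad(k=1.5, v_target=5.0, steps=40)
```

A gradient step on k (for example k -= learning_rate * grad_k) would then play the role that the optimizer update on the network weights plays in the full system.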

Loss function based on physical structure

The loss function consists of three parts:
① Velocity tracking term (follows the reference target velocity);
② Obstacle avoidance term (based on the distance to the nearest obstacle and the relative velocity);
③ Control smoothness term (penalizes large acceleration and jerk);
All losses are differentiable continuous functions, allowing for efficient training.
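
To make the three-part structure concrete, here is a hedged Python sketch of a per-step loss. The article does not give the exact formulas, so the functional forms (squared error, a softplus barrier, squared acceleration and jerk), the weights and the safe_dist parameter are all assumptions chosen only because they are smooth and differentiable.

```python
import math


def softplus(x):
    """Numerically stable smooth approximation of max(0, x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def step_loss(v, v_target, dist, approach_speed, acc, jerk,
              safe_dist=0.5, w_track=1.0, w_obs=2.0, w_smooth=0.01):
    """Per-step loss; v, v_target, acc and jerk are 3D vectors (lists of floats).

    dist: distance to the nearest obstacle in metres.
    approach_speed: speed component toward that obstacle (positive = closing).
    All weights, safe_dist and functional forms are assumptions.
    """
    # 1) Velocity tracking: squared error to the reference velocity.
    track = sum((a - b) ** 2 for a, b in zip(v, v_target))
    # 2) Obstacle avoidance: smooth barrier that grows as the gap shrinks
    #    below safe_dist, scaled up when closing in and small when moving away.
    obs = softplus(safe_dist - dist) * softplus(approach_speed)
    # 3) Smoothness: penalize large acceleration and jerk.
    smooth = sum(x * x for x in acc) + sum(x * x for x in jerk)
    return w_track * track + w_obs * obs + w_smooth * smooth


zero = [0.0, 0.0, 0.0]
near = step_loss([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 0.2, 1.0, zero, zero)
far = step_loss([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 5.0, 1.0, zero, zero)
```

With identical velocities, the call with the smaller obstacle distance (near) yields the larger loss, which is the behavior the obstacle term is meant to produce.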

Lightweight end-to-end control

A single-channel depth map, inverted and pooled to 16×12, serves as the input. Three convolutional layers followed by one GRU layer model the temporal sequence, and the network outputs the desired thrust acceleration together with an estimate of the current velocity; yaw is aligned automatically with the target direction;
The network is lightweight enough to run in real time on a GPU-less single-board computer (Mango Pi) that costs only US$21;
The only inputs are the depth map, the target velocity and the attitude estimate; no localization system such as VIO, GPS or VICON is needed.

Image source: "Nature Machine Intelligence" (2025), paper "Learning vision-based agile flight via differentiable physics"
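
The input preprocessing mentioned in this section (invert the depth map, then pool it to 16×12) could look roughly like the Python sketch below. The 64×48 input size used in the example, the clipping range d_min/d_max, and the choice of max-pooling over inverse depth (so the nearest obstacle in each cell dominates) are assumptions; the article only states that the map is inverted and pooled.

```python
def preprocess_depth(depth, out_w=16, out_h=12, d_min=0.3, d_max=24.0):
    """Invert a depth image (list of rows, in metres) and max-pool it.

    Returns an out_h x out_w grid of inverse depths. Max-pooling the inverse
    depth keeps the nearest obstacle in each cell. The clipping range and the
    pooling type are assumptions.
    """
    in_h, in_w = len(depth), len(depth[0])
    if in_h % out_h or in_w % out_w:
        raise ValueError("input size must be a multiple of the output size")
    kh, kw = in_h // out_h, in_w // out_w   # pooling window size
    pooled = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Clip each depth to [d_min, d_max], invert it, keep the maximum.
            cell = (
                1.0 / min(max(depth[y][x], d_min), d_max)
                for y in range(r * kh, (r + 1) * kh)
                for x in range(c * kw, (c + 1) * kw)
            )
            row.append(max(cell))
        pooled.append(row)
    return pooled


# Example: a 64x48 image at 10 m everywhere, with one pixel at 1 m.
image = [[10.0] * 64 for _ in range(48)]
image[0][0] = 1.0
small = preprocess_depth(image)
```

The single near pixel survives the pooling in its cell, which is why a coarse 16×12 map can still carry the obstacle information the policy needs.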

Temporal gradient decay mechanism

To counter the gradient explosion that backpropagation through long sequences can cause, the study introduces an exponential decay coefficient that weights the gradient passed between time steps;
This mechanism makes the policy focus on the near future that it can actually perceive, which effectively improves training stability and the policy's generalization ability.

Image source: "Nature Machine Intelligence" (2025), paper "Learning vision-based agile flight via differentiable physics"
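
One plausible reading of this mechanism, sketched in Python for a scalar state: each time the gradient crosses a step boundary on its way backward, it is multiplied by a decay factor alpha. The function name, the scalar setting and the numbers in the example are assumptions for illustration; the paper's exact formulation may differ.

```python
def backward_with_decay(jacobians, loss_grads, alpha=1.0):
    """Scalar backpropagation through time with per-step gradient decay.

    jacobians[t]  = d x_{t+1} / d x_t   for t = 0 .. T-1
    loss_grads[t] = d loss_t / d x_t    for t = 0 .. T
    Returns grads[t], the accumulated gradient at x_t, where every crossing
    of a step boundary multiplies the gradient by alpha.
    alpha = 1.0 recovers ordinary backpropagation through time.
    """
    steps = len(jacobians)
    grads = [0.0] * (steps + 1)
    grads[steps] = loss_grads[steps]
    for t in reversed(range(steps)):
        # Direct loss at step t plus the attenuated gradient from step t+1.
        grads[t] = loss_grads[t] + alpha * jacobians[t] * grads[t + 1]
    return grads


T = 50
J = [1.2] * T            # mildly unstable dynamics (assumed)
g = [1.0] * (T + 1)      # unit loss gradient at every step (assumed)
plain = backward_with_decay(J, g)                # grows roughly like 1.2**T
decayed = backward_with_decay(J, g, alpha=0.8)   # 0.8 * 1.2 < 1, stays bounded
```

A loss k steps in the future is weighted by alpha**k, so nearby steps dominate the update, which matches the "near future" emphasis described above.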

Experimental testing

Real-world flights in multiple scenes

High-speed obstacle avoidance flights were carried out in unknown environments (dense forests, city parks, indoor corridors) to verify the generalization ability of the algorithm.

Speeds reach up to 20 m/s in the forest and 7 m/s in urban and indoor scenes.
The system maintains a success rate of up to 90% when flying in unknown, complex environments and can adapt to dynamic obstacles.

Multi-UAV zero-communication position-swap test

Six UAVs completed a position swap through a doorway without any communication: they start on both sides of the door and exchange positions by passing through it. In the indoor tests, motion capture provides each UAV with its own velocity estimate, but no information is exchanged between UAVs.

The tests showed that, without inter-UAV communication or centralized planning, the six UAVs exhibited self-organized behaviors such as waiting, yielding and following, and completed the swap.

Comparative experiments and ablation analysis

Comparison with methods such as PPO and Agile [1]: the approach achieves equivalent or better performance with fewer samples and shows stronger robustness in real-world obstacle avoidance.
Key mechanism verification: temporal gradient decay and low-resolution input both help convergence and generalization.

Image source: "Nature Machine Intelligence" (2025), paper "Learning vision-based agile flight via differentiable physics"

[1] Loquercio, A. et al. Learning high-speed flight in the wild. Sci. Robot. 6, eabg5810 (2021).

Technical Highlights

Adapts to dynamic, unknown and complex environments, with a maximum speed of 20 m/s
Multi-UAV cooperative obstacle avoidance with zero communication
High training efficiency and low sample requirements
Low-resolution input improves sim-to-real generalization
Real-world deployment on low-cost, low-power compute

Paper information:

First authors: Zhang Yuang, Hu Yu, Song Yunlong; corresponding authors: Lin Weiwei, Zou Danping

Published in Nature Machine Intelligence (five-year impact factor 31.8)

Resources

Paper link:

https://www.nature.com/articles/s42256-025-01048-0

Open-source code:

https://github.com/HenryHuYu/DiffPhysDrone
