Nature Portfolio Journal: Communication-Free Multi-UAV Coordination and Vision-Based High-Speed Flight via Differentiable Physics
Picture the scene up close: whether in a dense forest or indoors among densely packed obstacles, a few small UAVs weave through continuously, with no mapping and no communication. This is not a concept video but a real result published in Nature Machine Intelligence.
In June 2025, the team of Professors Lin Weiwei and Zou Danping at Shanghai Jiao Tong University published the paper "Learning vision-based agile flight via differentiable physics". For the first time, an end-to-end policy trained with differentiable physics was deployed on real UAVs, achieving single-UAV high-speed obstacle avoidance at 20 m/s in the forest and complex cooperative traversal by six UAVs under zero-communication conditions.
Source: https://www.youtube.com/watch?v=LKg9hJqc2cc
Research background
- Achieving high-speed autonomous flight in complex, dynamic, and unknown environments remains a hard problem for aerial robots. The traditional cascaded pipeline splits the task into separate stages of localization, mapping, planning, and control. As speed increases, system latency and accumulated error are amplified. Localization and mapping themselves become less stable at high speed and carry a high computational cost. Together, these factors limit the agility and robustness of the system in real scenarios.
- In recent years, end-to-end visual obstacle avoidance policies have performed well, but reinforcement learning often suffers from low sample efficiency and a heavy reliance on large-scale parallel environments. Imitation learning is constrained by the coverage of expert demonstrations and therefore generalizes poorly.
- The research team proposed combining deep learning with first-principles physical modeling through differentiable simulation: the loss gradient is backpropagated through the closed simulation loop to optimize the control policy end to end.
Research methods
Differentiable physics modeling and closed-loop training in simulation
- A simplified point-mass dynamics model describes the UAV's translational motion (the vehicle is not modeled as a full rigid body), and it explicitly includes thrust delay, air drag, control smoothing, and other factors;
- The dynamics model is differentiable and is used both for state updates and for gradient backpropagation in simulation;
- Together with depth-map rendering, the simulation forms a closed loop: at each step the policy network perceives its input and outputs an action, and the loss is backpropagated to the policy parameters through the chain rule.
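The point-mass update described above can be sketched in a few lines. This is a minimal illustration in plain NumPy, not the authors' implementation: the function name `point_mass_step`, the first-order thrust lag, the linear drag term, and the values of `dt`, `tau`, and `drag` are all assumptions chosen for readability. Every operation is smooth, so the same update written in an automatic-differentiation framework would let gradients flow from the loss back to the commanded thrust.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # m/s^2, world frame with z pointing up


def point_mass_step(pos, vel, thrust_acc, thrust_cmd,
                    dt=1.0 / 15.0, tau=0.1, drag=0.1):
    """Advance a point-mass UAV by one simulation step (illustrative only).

    pos, vel   : position (m) and velocity (m/s), shape (3,)
    thrust_acc : thrust acceleration currently realised by the vehicle
    thrust_cmd : thrust acceleration commanded by the policy
    tau        : assumed time constant of the first-order thrust lag (s)
    drag       : assumed linear drag coefficient (1/s)
    """
    # Thrust delay: the realised thrust relaxes toward the command.
    lag = np.exp(-dt / tau)
    thrust_acc = lag * thrust_acc + (1.0 - lag) * thrust_cmd
    # Net acceleration: thrust plus gravity minus linear air drag.
    acc = thrust_acc + GRAVITY - drag * vel
    # Semi-implicit Euler integration of the translational state.
    vel_next = vel + acc * dt
    pos_next = pos + vel_next * dt
    return pos_next, vel_next, thrust_acc
```

Commanding a thrust equal to `-GRAVITY` from rest keeps the vehicle hovering in place, which is a quick sanity check on the signs.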

Image source: "Nature Machine Intelligence" (2025), paper "Learning vision-based agile flight via differentiable physics"
Loss function based on physical structure
- The loss function consists of three parts:
① Velocity tracking term (relative to the reference target velocity);
② Obstacle avoidance term (based on the distance to the nearest obstacle and the relative velocity);
③ Control smoothness term (a penalty on large acceleration and jerk).
- All loss terms are continuous, differentiable functions, which allows efficient training.
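To make the three-part structure concrete, here is a hedged sketch of how such a loss might be assembled. The exact functional forms, the weights, and the names `flight_loss`, `safe_dist`, and `closing_speed` are assumptions for illustration; the article only states which quantities each term depends on. A softplus is used as a smooth substitute for a hard threshold so that every term stays differentiable.

```python
import numpy as np


def softplus(x):
    # Smooth, differentiable stand-in for max(0, x).
    return np.logaddexp(0.0, x)


def flight_loss(vel, target_vel, obstacle_dist, closing_speed, acc, prev_acc,
                dt=1.0 / 15.0, safe_dist=0.5,
                w_track=1.0, w_avoid=2.0, w_smooth=0.1):
    """Illustrative three-term loss; all weights and shapes are assumptions."""
    # (1) Velocity tracking: squared error against the target velocity.
    track = np.sum((vel - target_vel) ** 2)
    # (2) Obstacle avoidance: grows as the nearest obstacle gets closer than
    #     safe_dist, scaled by how fast the UAV is approaching it.
    avoid = softplus(safe_dist - obstacle_dist) * softplus(closing_speed)
    # (3) Control smoothness: penalise large acceleration and jerk.
    jerk = (acc - prev_acc) / dt
    smooth = np.sum(acc ** 2) + np.sum(jerk ** 2)
    return w_track * track + w_avoid * avoid + w_smooth * smooth
```

With all other inputs equal, the loss is larger when the nearest obstacle is at 0.2 m than when it is at 5 m, and larger when the velocity deviates from the target.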
Lightweight end-to-end control
- The input is a single-channel depth map, inverted and pooled to 16×12. Three convolutional layers followed by one GRU layer handle temporal modeling, and the network outputs the desired thrust acceleration together with an estimate of the current velocity; yaw is aligned automatically with the target direction;
- The network is lightweight and runs in real time on a single-board computer (Mango Pi) that has no GPU and costs only US$21;
- The only inputs required are the depth map, the target velocity, and the attitude estimate; no external localization system such as VIO, GPS, or VICON is needed.
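The "invert and pool to 16×12" preprocessing can be sketched as follows. This is an assumption-laden illustration: the article does not say which pooling operator or depth clipping range is used, so max pooling over the inverted depth (which keeps the nearest obstacle in each cell) and the `min_depth` and `max_depth` bounds are choices made here, and `preprocess_depth` is a hypothetical name.

```python
import numpy as np


def preprocess_depth(depth_m, out_hw=(12, 16), min_depth=0.3, max_depth=20.0):
    """Invert a metric depth image and max-pool it to out_hw = (rows, cols).

    The clipping range and the use of max pooling are assumptions.
    """
    depth = np.clip(np.asarray(depth_m, dtype=np.float64), min_depth, max_depth)
    inv = 1.0 / depth  # near obstacles map to large values
    out_h, out_w = out_hw
    h, w = inv.shape
    if h % out_h or w % out_w:
        raise ValueError("image size must be a multiple of the output size")
    # Split the image into (out_h x out_w) blocks and keep each block's maximum,
    # i.e. the closest obstacle inside that block.
    blocks = inv.reshape(out_h, h // out_h, out_w, w // out_w)
    return blocks.max(axis=(1, 3))


# Example: a 48x64 depth image with one near pixel in the top-left corner.
depth = np.full((48, 64), 10.0)
depth[0, 0] = 1.0
small = preprocess_depth(depth)  # shape (12, 16)
```

A 16×12 input has only 192 values per frame, which helps explain why a small convolution-plus-GRU network can run on a GPU-free board.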

Image source: "Nature Machine Intelligence" (2025), paper "Learning vision-based agile flight via differentiable physics"
Temporal gradient decay mechanism
- To counter the exploding gradients that backpropagation through long sequences can cause, the study introduces an exponential decay coefficient that weights the gradient passed between time steps;
- This mechanism makes the policy focus on the "near future" that it can actually perceive, which improves both training stability and policy generalization.
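The effect of the decay coefficient can be illustrated on a toy scalar system. The sketch below is not the paper's training code: the linear dynamics x_{t+1} = a·x_t + b·u_t, the quadratic loss, and the name `decayed_control_grads` are assumptions. It runs backpropagation through time by hand and multiplies the gradient that flows from one state to the previous one by `alpha`, so the influence of a control on a state k steps later is weighted by roughly alpha^k.

```python
def decayed_control_grads(x0, controls, a=1.1, b=0.1, alpha=0.9):
    """Gradients of L = sum_t 0.5 * x_t**2 with respect to each control u_t
    for the toy system x_{t+1} = a * x_t + b * u_t.

    alpha scales the gradient passed between consecutive states;
    alpha = 1.0 recovers exact backpropagation through time.
    """
    # Forward rollout, storing every state.
    xs = [x0]
    for u in controls:
        xs.append(a * xs[-1] + b * u)

    # Backward pass: adj holds the (decayed) gradient dL/dx_{t+1}.
    grads = [0.0] * len(controls)
    adj = 0.0
    for t in reversed(range(len(controls))):
        adj = xs[t + 1] + alpha * a * adj  # local term + decayed future term
        grads[t] = b * adj
    return grads


exact = decayed_control_grads(1.0, [0.5] * 20, alpha=1.0)
decayed = decayed_control_grads(1.0, [0.5] * 20, alpha=0.8)
```

Because `a > 1` here, the exact gradient for early controls grows with the horizon, while the decayed version stays bounded; the gradient for the final control is identical in both cases since no future states depend on it.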

Image source: "Nature Machine Intelligence" (2025), paper "Learning vision-based agile flight via differentiable physics"
Experimental testing
Real-world flights in multiple scenes
High-speed obstacle avoidance flights were carried out in unknown environments (dense forests, city parks, indoor corridors) to verify how well the algorithm generalizes.
- Speeds reach up to 20 m/s in the forest and 7 m/s in urban and indoor scenes.
- The success rate stays as high as 90% when flying in unknown, complex environments, and the policy can adapt to dynamic obstacles.

Multi-UAV zero-communication cooperative position-swap test
Six UAVs completed a position swap through a doorway without any communication: they start on both sides of the door and exchange positions by passing through it. In the indoor tests, motion capture provides each UAV with its own velocity estimate, but no information is exchanged between UAVs.
The tests showed that, with no mutual communication and no centralized planning, the six UAVs exhibited self-organized behaviors such as waiting, yielding, and following, and successfully completed the position swap.


Comparative experiments and ablation analysis
- Compared with methods such as PPO and Agile [1]: it achieves equal or better performance with fewer samples, and it is more robust in real-world obstacle avoidance.
- Key mechanism verification: temporal gradient decay and low-resolution input both help convergence and generalization.

Image source: "Nature Machine Intelligence" (2025), paper "Learning vision-based agile flight via differentiable physics"
[1] Loquercio, A. et al. Learning high-speed flight in the wild. Sci. Robot. 6, eabg5810 (2021).
Technical Highlights
- Adapts to dynamic, unknown, and complex environments, with a maximum speed of 20 m/s
- Multi-UAV cooperative obstacle avoidance with "zero communication"
- High training efficiency and low sample requirements
- Low-resolution input improves generalization to the real world
- Deployment on real UAVs with low-cost, low-power compute
Paper information:
First authors: Zhang Yuang, Hu Yu, Song Yunlong; corresponding authors: Lin Weiwei, Zou Danping
Published in "Nature Machine Intelligence" (five-year impact factor 31.8)
Resources
Paper link:
https://www.nature.com/articles/s42256-025-01048-0
Open-source code:
