In March 2025, at the WorldMinds conference at the Kaufleuten Theater in Zurich, a live audience of more than 500 watched two vision-only autonomous racing UAVs shuttle across the stage. A large background screen visualized each UAV's AI observations and decisions in real time. The entire system used no map, no inertial measurement unit (IMU), and no traditional SLAM, yet still flew stably under dim lighting and strict safety restrictions.


This performance brought together a series of key technical achievements from years of AI UAV research by the Robotics and Perception Group (RPG) at the University of Zurich, and it marks a major milestone for the group in purely visual autonomous flight. This article traces these research threads, analyzes the key technologies, and looks ahead to future applications.

01

Swift: AI beats world champion

When an algorithm computes every dive and roll faster and more accurately than a human, even a champion pilot can only look at the "machine" and sigh. In 2023, the RPG team at the University of Zurich, together with a team from Intel, designed Swift, an autonomous UAV system that defeated three world-champion-level human UAV racing pilots in official races, with an overall record of 15 wins and 10 losses, and also set the fastest recorded race time. The result was published as a cover article in Nature.


Technical Highlights

Onboard vision and IMU only: Swift needs no external localization or SLAM, so its sensing conditions are comparable to those of human pilots.
Only 50 seconds of real flight data: Gaussian processes and k-nearest-neighbor (KNN) regression quickly compensate for perception and dynamics errors, enabling zero-shot transfer.
Two stages, perception and control: the control policy is trained with end-to-end reinforcement learning.
A reward that accounts for both speed and field of view: "keeping the camera aimed at the gate" is written into the reward to ensure stable gate passes at high speed.
High-fidelity parallel simulation: 1e8 training steps complete in 50 minutes, with the PID controller, ESC, and battery models accurately replicated, which greatly shortens the iteration cycle.

The picture comes from the paper "Champion-level Drone Racing Using Deep Reinforcement Learning", Elia Kaufmann et al., Nature 2023
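
The reward idea in the highlights above (fly fast while keeping the gate in view) can be illustrated with a minimal sketch. It assumes a progress term equal to the reduction in distance to the next gate center and a perception term that decays with the angle between the camera's optical axis and the direction to the gate. The function names, the coefficients `k_prog`, `k_perc`, and `sharpness`, and the exact shape of the perception term are illustrative assumptions, not the values used by Swift.

```python
import numpy as np

def progress_reward(pos_prev, pos_now, gate_center, k_prog=1.0):
    """Reward the reduction in distance to the next gate center over one step."""
    d_prev = np.linalg.norm(gate_center - pos_prev)
    d_now = np.linalg.norm(gate_center - pos_now)
    return k_prog * (d_prev - d_now)

def perception_reward(pos_now, cam_axis, gate_center, k_perc=0.02, sharpness=10.0):
    """Reward keeping the camera's optical axis pointed at the next gate.

    The coefficients are hypothetical; the term is largest when the angle
    between the optical axis and the direction to the gate is zero.
    """
    to_gate = gate_center - pos_now
    to_gate = to_gate / np.linalg.norm(to_gate)
    axis = cam_axis / np.linalg.norm(cam_axis)
    angle = np.arccos(np.clip(np.dot(axis, to_gate), -1.0, 1.0))
    return k_perc * np.exp(-sharpness * angle**4)

def step_reward(pos_prev, pos_now, cam_axis, gate_center):
    """Sum of the speed-related (progress) and field-of-view (perception) terms."""
    return (progress_reward(pos_prev, pos_now, gate_center)
            + perception_reward(pos_now, cam_axis, gate_center))

gate = np.array([10.0, 0.0, 0.0])
p0, p1 = np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])
aligned = step_reward(p0, p1, np.array([1.0, 0.0, 0.0]), gate)     # camera looks at the gate
misaligned = step_reward(p0, p1, np.array([0.0, 1.0, 0.0]), gate)  # camera looks sideways
```

In this sketch both calls make the same progress toward the gate, but the aligned camera earns a slightly higher reward, which is the kind of pressure that would keep the gate in view during aggressive maneuvers.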

02

Reinforcement learning outperforms optimal control

In a study published in Science Robotics in September 2023, the RPG team of the University of Zurich compared the performance of reinforcement learning and optimal control in UAV racing. The results show that reinforcement learning can directly optimize task-level goals and demonstrate stronger adaptability and performance when facing complex environments and model uncertainties.

Comparison of three control schemes

Trajectory Tracking: Compute the time-optimal trajectory offline, then track it online with MPC.
Contouring Control: Online MPC simultaneously maximizes progress along the path and minimizes deviation from it.
Gate-Progress RL: End-to-end RL directly maximizes progress toward the center of the next gate, without any reference trajectory.
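
To make the difference between the objectives concrete, the following sketch evaluates a contouring-style objective and a gate-progress objective for one step of motion. It assumes the reference path is a polyline, and the weights `w_progress` and `w_contour` are hypothetical. It scores a single step only; it is not an MPC solver or an RL training loop.

```python
import numpy as np

def project_onto_path(path, pos):
    """Return (arc_length, lateral_error) of pos relative to a polyline path."""
    best_s, best_err = 0.0, np.inf
    s_start = 0.0
    for a, b in zip(path[:-1], path[1:]):
        seg = b - a
        seg_len = np.linalg.norm(seg)
        # Parameter of the closest point on this segment, clamped to [0, 1].
        t = np.clip(np.dot(pos - a, seg) / seg_len**2, 0.0, 1.0)
        err = np.linalg.norm(pos - (a + t * seg))
        if err < best_err:
            best_s, best_err = s_start + t * seg_len, err
        s_start += seg_len
    return best_s, best_err

def contouring_objective(path, pos_prev, pos_now, w_progress=1.0, w_contour=0.5):
    """Progress along the reference path minus a penalty on deviation from it."""
    s_prev, _ = project_onto_path(path, pos_prev)
    s_now, err = project_onto_path(path, pos_now)
    return w_progress * (s_now - s_prev) - w_contour * err**2

def gate_progress_objective(gate_center, pos_prev, pos_now):
    """Progress toward the next gate center; no reference path is needed."""
    return (np.linalg.norm(gate_center - pos_prev)
            - np.linalg.norm(gate_center - pos_now))

path = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
p0, p1 = np.array([1.0, 0.0, 0.0]), np.array([3.0, 1.0, 0.0])
contour_value = contouring_objective(path, p0, p1)
gate_value = gate_progress_objective(path[-1], p0, p1)
```

The contouring objective penalizes the one-meter sideways offset from the reference path, while the gate-progress objective only asks whether the vehicle got closer to the gate, so it leaves the policy free to find its own line.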

Main conclusions

The optimization method is not the decisive factor; the optimization objective is.
In highly dynamic robotic tasks, choosing an appropriate optimization objective allows reinforcement learning (RL) to surpass optimal control.
The next step is to remove the dependence on an external Vicon motion-capture system and to improve robustness to environmental changes and recovery after collisions.

03

End-to-end visual flight controller

In professional UAV racing, human pilots fly through gates at high speed using only a first-person-view (FPV) video stream, whereas even the most "agile" autonomous quadcopters in academia still rely on explicit state estimation from VIO/SLAM. In 2024, the University of Zurich RPG team demonstrated in a real environment, for the first time, a system that does not rely on an IMU or state estimation at all and completes three laps of racing at 40 km/h and 2 g acceleration using vision alone.

The picture comes from the paper "Demonstrating Agile Flight from Pixels without State Estimation", Ismail Geles et al., Robotics: Science and Systems 2024

Technical Highlights

Visual-feature-driven control: For the first time, thrust and angular-velocity commands are generated directly from visual features in the camera image rather than from state estimates, eliminating the dependence on IMU and SLAM.
Abstraction of the gate's inner edges: The inner edges of the gate frame are used as the visual input, which makes simulated training efficient and speeds up reinforcement learning.
Asymmetric actor-critic architecture: During training, privileged information is given to the critic to improve the sample efficiency and stability of learning complex control policies from visual input.
Swin Transformer perception module: A highly robust gate detector based on Swin Transformer V2 copes with challenges such as lighting changes and blur in real environments.
Real-world deployment with zero state estimation: High-speed flights were completed in the real world with a 100% success rate, verifying direct transfer from simulation to reality.
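
The asymmetric actor-critic idea and the gate-edge abstraction above can be sketched as two different observation vectors. The sketch assumes a pinhole camera looking along +z, represents the gate's inner edges by their four corners, and uses a nine-element privileged state (position, velocity, and a few attitude terms). All of these layouts and names are illustrative assumptions, not the paper's exact inputs.

```python
import numpy as np

def project_gate_corners(corners_world, cam_pos, R_world_to_cam, focal=1.0):
    """Pinhole-project the gate's four inner corners into image coordinates.

    Returns a flat vector of 8 values (u, v for each corner), or zeros when
    any corner lies behind the camera (treated here as "no detection").
    """
    pts_cam = (R_world_to_cam @ (corners_world - cam_pos).T).T  # shape (4, 3)
    if np.any(pts_cam[:, 2] <= 0):
        return np.zeros(8)
    uv = focal * pts_cam[:, :2] / pts_cam[:, 2:3]
    return uv.reshape(-1)

def actor_observation(corner_pixels, last_action):
    """The actor sees only what is available on the real vehicle: image features."""
    return np.concatenate([corner_pixels, last_action])

def critic_observation(corner_pixels, last_action, privileged_state):
    """The critic additionally sees simulator-only state during training."""
    return np.concatenate([corner_pixels, last_action, privileged_state])

corners = np.array([[-0.5, -0.5, 2.0], [0.5, -0.5, 2.0],
                    [0.5, 0.5, 2.0], [-0.5, 0.5, 2.0]])
pixels = project_gate_corners(corners, np.zeros(3), np.eye(3))
last_action = np.zeros(4)   # hypothetical: collective thrust + three body rates
privileged = np.zeros(9)    # hypothetical: position, velocity, attitude terms
actor_obs = actor_observation(pixels, last_action)
critic_obs = critic_observation(pixels, last_action, privileged)
```

Because only the critic consumes the privileged state, it can be discarded after training, and the deployed actor would need nothing beyond the detected gate features and its own previous command.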

04

Let the environment also "think"

In the past few years, reinforcement learning (RL) has achieved great success in robot control, from dexterous manipulation and quadruped running to highly dynamic UAV racing. However, a key problem remains unresolved: once an RL agent is moved to a new environment (for example, a changed track layout), it can hardly adapt and must be retrained, which greatly limits its practical use. This year at the ICRA conference, the University of Zurich RPG team presented an approach that introduces an environment policy (Environment Policy), which dynamically adjusts the track layout during training. The result is a general-purpose UAV racing policy that can fly on unseen tracks without being retrained each time.
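
As a rough illustration of letting the environment adapt to the agent, the sketch below picks track perturbation levels according to how well the racing agent currently performs on each level. It is a hand-written heuristic standing in for the idea, not the paper's learned environment policy: the scoring rule `p * (1 - p)`, the discrete perturbation magnitudes, and all function names are assumptions made for this example.

```python
import numpy as np

def environment_policy_probs(success_rates, temperature=0.1):
    """Favor track levels where the agent succeeds only some of the time.

    Levels the agent always passes or always fails score near zero, so the
    sampling distribution concentrates on layouts it can still learn from.
    """
    p = np.asarray(success_rates, dtype=float)
    score = p * (1.0 - p)
    logits = score / temperature
    logits -= logits.max()            # numerical stability for the softmax
    weights = np.exp(logits)
    return weights / weights.sum()

def sample_track(rng, base_gates, magnitudes, probs):
    """Draw a perturbation level, then shift every gate by a bounded random offset."""
    level = rng.choice(len(magnitudes), p=probs)
    offsets = rng.uniform(-1.0, 1.0, size=base_gates.shape) * magnitudes[level]
    return level, base_gates + offsets

rng = np.random.default_rng(0)
base_gates = np.zeros((3, 3))                 # three gates at hypothetical positions
magnitudes = np.array([0.0, 0.5, 1.0])        # hypothetical perturbation sizes in meters
probs = environment_policy_probs([1.0, 0.5, 0.0])
level, gates = sample_track(rng, base_gates, magnitudes, probs)
```

In a full training loop, the success rates would be re-estimated as the racing policy improves, so the distribution over layouts would keep shifting toward tracks the agent has not yet mastered.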

Image source: Paper "Environment as Policy: Learning to Race in Unseen Tracks", Hongze Wang et al., ICRA 2025