In our previous post, we shared the progress we made in training and testing our foundation model for mobility using a CARLA-based simulator. While we had already overcome the challenge of visual sim-to-real transfer, the next major hurdle was scaling up data diversity. It quickly became clear, however, that CARLA’s flexibility had limits. We explored several commercially available simulators, but most were highly specialized for narrow domains and lacked the adaptability we needed. Specifically, we were looking for a simulator that could easily generate diverse scenarios focused on bike lane navigation and, more broadly, driving along the edges of roads—areas often underrepresented in traditional simulations. That’s why, in collaboration with Tempo Simulation, we decided to build our own custom simulator: VayuSim.
Here's a video showing a camera flyover of a town in VayuSim.
VayuSim is an Unreal Engine 5-based simulation environment, built from the ground up for training Vayu’s mobility foundation model. It has two major components: procedural world generation and simulated agents with realistic behaviors.
The result is a fully-featured, modern simulator that is now at the heart of our data-generation and testing workflows. Here is a preview of training data generated from VayuSim. The video is sped up 4x.
World Generation
Real-world testing will always be constrained by the size of our testing fleet and the number of hours in a day. With simulation, we can design virtual worlds where we can train and test our bot against a much larger and more diverse set of scenes and scenarios. But hand-built virtual worlds can’t scale to the training and testing volume we need. That’s why we invested in building procedural generation capabilities into VayuSim, automating the entire process of generating a virtual world and randomizing thousands of parameters along the way. And once the world is built, we can immediately fill it with simulated agents and begin collecting data.
To build our procedural world generation system, we started with CityBLD, an Unreal plugin for procedural city generation. We added a Python interface that allowed us to create road networks and randomize the density and types of buildings and landscaping in the resulting towns, all without any manual human intervention. We can now build whole towns entirely in Unreal, a significant upgrade from our previous workflow, which involved several external tools.
We extended CityBLD to add procedural road lanes, including bike lanes, with more than 100 configurable parameters. Everything from the number and arrangement of lanes to the color and tiling of lane lines can be varied, achieving a level of diversity we wouldn’t be able to reach using real-world data alone. This level of diversity turned out to be crucial for sim-to-real generalization.
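To give a flavor of what this parameterization looks like, here is a minimal sketch in Python. The parameter names and ranges are invented for illustration (the real system exposes over 100 parameters inside Unreal); the key idea is that a seeded random source makes every sampled road configuration reproducible.

```python
import random
from dataclasses import dataclass

# Hypothetical illustration: a handful of randomized lane parameters.
# These names and ranges are invented; VayuSim's real configuration is
# far larger and lives inside Unreal.
@dataclass
class LaneConfig:
    num_vehicle_lanes: int
    has_bike_lane: bool
    bike_lane_width_m: float
    line_color: str
    line_tiling: float  # dash-length scale for lane lines

def sample_lane_config(rng: random.Random) -> LaneConfig:
    """Draw one randomized road cross-section, reproducible via the seed."""
    return LaneConfig(
        num_vehicle_lanes=rng.randint(1, 4),
        has_bike_lane=rng.random() < 0.6,
        bike_lane_width_m=round(rng.uniform(1.2, 2.2), 2),
        line_color=rng.choice(["white", "yellow", "faded_white"]),
        line_tiling=round(rng.uniform(0.5, 2.0), 2),
    )

rng = random.Random(42)
configs = [sample_lane_config(rng) for _ in range(1000)]
```

Because the sampler is seeded, any interesting world can be regenerated exactly for debugging or regression testing.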
We have even made significant progress in recreating virtual versions of real neighborhoods in our target domain.
At this point we had highly varied and realistic-looking roads and towns. To truly bring our towns to life, however, we had to extract the lane-level data into a form simulated vehicle agents could use to navigate and drive. To solve this, we developed a tool to extract a lane graph from the CityBLD roads using Unreal’s ZoneGraph plugin, including a custom ruleset to intelligently connect lanes at intersections.
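The intersection ruleset can be sketched in miniature. This toy version (our invention, not the actual ZoneGraph rules) connects incoming and outgoing lanes by comparing headings and lane types; for example, bike lanes only receive straight-through connections:

```python
# Toy sketch of intersection lane connection (invented ruleset, not the
# actual ZoneGraph rules): incoming and outgoing lanes carry a heading in
# degrees and a type; we connect pairs whose turn is legal for that type.
def turn_angle(inbound_deg: float, outbound_deg: float) -> float:
    """Signed turn angle in (-180, 180]; positive = left turn."""
    a = (outbound_deg - inbound_deg + 180.0) % 360.0 - 180.0
    return 180.0 if a == -180.0 else a

def connect_lanes(incoming, outgoing):
    """Return (in, out) pairs allowed by a simple ruleset: vehicles may go
    straight or turn; bike lanes only continue straight through."""
    edges = []
    for i_name, i_head, i_type in incoming:
        for o_name, o_head, o_type in outgoing:
            if i_type != o_type:
                continue  # cars connect to car lanes, bikes to bike lanes
            t = turn_angle(i_head, o_head)
            if i_type == "bike" and abs(t) > 30:
                continue  # bikes: straight-through connections only
            edges.append((i_name, o_name))
    return edges

incoming = [("car_in", 0, "car"), ("bike_in", 0, "bike")]
outgoing = [("car_straight", 0, "car"), ("car_right", -90, "car"),
            ("bike_straight", 0, "bike"), ("bike_right", -90, "bike")]
graph = connect_lanes(incoming, outgoing)
```

The real ruleset is richer, but the same shape applies: a pure function from lane geometry and type to a set of legal connections, run automatically at every generated intersection.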

Agents and their Behaviors
For our simulated agent behaviors we looked to Epic’s CitySample project, which contained a city environment full of vehicles and pedestrians.

We also added a large number of bicyclist and scooter assets from the Unreal Marketplace.
However, after digging deeper we discovered that CitySample made a number of simplifications to limit the complexity of agent interactions. One of these simplifications is that all intersection crossings are fully “protected”, meaning pedestrians and vehicles can always proceed without regard for potential conflicts with each other. In reality, conflicting lanes (those that merge or cross and are open simultaneously) and the nuanced rules that govern them present some of the most difficult challenges in urban driving. Canonical examples from our domain are a vehicle turning right across the bike lane, a bike crossing the through road at a 2-way stop sign intersection, or our bot turning right across a crosswalk. We expanded the CitySample behaviors to add support for yields, merges, and unprotected turns.
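The core of a conflicting-lane check is geometric: two open lane paths conflict if they cross. A minimal sketch (our illustration, not CitySample's code), using the standard orientation test for segment intersection:

```python
# Minimal sketch of detecting a "conflicting lane" pair: two simultaneously
# open lane segments that cross, e.g. a right-turning vehicle path crossing
# a straight bike-lane path. Coordinates and paths are illustrative.
def _orient(p, q, r):
    """Sign of the cross product (q-p) x (r-p)."""
    v = (q[0]-p[0])*(r[1]-p[1]) - (q[1]-p[1])*(r[0]-p[0])
    return (v > 0) - (v < 0)

def segments_cross(a1, a2, b1, b2) -> bool:
    """True if segment a1-a2 strictly crosses segment b1-b2."""
    return (_orient(a1, a2, b1) != _orient(a1, a2, b2) and
            _orient(b1, b2, a1) != _orient(b1, b2, a2))

# Vehicle turning right across a bike lane: the paths cross, so the lanes
# conflict and one agent must yield to the other.
vehicle_path = ((0, 0), (10, -10))   # right turn, approximated as straight
bike_path    = ((5, -20), (5, 20))   # straight through the intersection
must_yield = segments_cross(*vehicle_path, *bike_path)
```

Once a conflict is detected, the nuanced part begins: deciding which agent yields, which is where the expanded CitySample behaviors come in.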
These vehicle and pedestrian agents provide a good baseline for typical traffic behavior. To build more robust autonomy, however, our robot must also learn to interact with agents that behave unpredictably. By combining machine learning-based techniques for generating realistic deviations with VayuSim’s ability to randomize every aspect of a scenario, we can create unexpected events—such as a scooter rider abruptly cutting in front of the robot—in a controlled and repeatable manner. This allows us to systematically vary conditions like traffic density and lighting, transforming a single challenging scenario into many variations for comprehensive closed-loop validation. Unlike validation against real-world data, which tends to be anecdotal, this approach enables rigorous and scalable testing. Achieving this level of control and reproducibility remains a significant challenge in neural simulators today.
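The variation machinery can be sketched as a pure function from a base scenario and a seed to a deterministic set of variants. Everything here (the scenario name, the varied fields) is illustrative rather than VayuSim's actual schema:

```python
import random

# Hedged sketch of turning one challenging scenario into many controlled
# variants. The event name and parameter fields are invented for
# illustration; the point is deterministic, seed-driven variation.
BASE_SCENARIO = {"event": "scooter_cut_in", "cut_in_gap_m": 3.0}

def make_variants(base: dict, n: int, seed: int) -> list:
    """Deterministically vary traffic density and lighting around one event."""
    rng = random.Random(seed)
    variants = []
    for i in range(n):
        v = dict(base)
        v["variant_id"] = i
        v["traffic_density"] = rng.choice(["sparse", "medium", "dense"])
        v["time_of_day_h"] = round(rng.uniform(5.0, 22.0), 1)
        v["cut_in_gap_m"] = round(base["cut_in_gap_m"] * rng.uniform(0.6, 1.4), 2)
        variants.append(v)
    return variants

variants = make_variants(BASE_SCENARIO, 50, seed=7)
```

Because the same seed always yields the same variants, a regression seen in one variant can be replayed exactly, which is what makes the closed-loop validation repeatable.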
Performance
Execution speed is critical when generating large training data sets or performing closed-loop validation. Not only does it directly impact compute time and expense, but running simulations faster also speeds up our iteration cycle, enabling us to explore more ideas.
In order to find rare interactions we needed to be able to run large simulations with many agents. To achieve this scale without sacrificing performance we developed the TempoAgents plugin, which leverages Unreal’s Mass system and its data-oriented computation patterns to simulate thousands of agents simultaneously on a single developer machine. As simulated agents move closer to our robot they seamlessly transition to higher levels of fidelity, saving precious computation.
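The fidelity transition is essentially a distance-based tiering policy. A small sketch (thresholds and tier names invented; the real plugin works inside Unreal's Mass system):

```python
# Illustrative sketch of distance-based fidelity tiers: far agents run a
# cheap update; agents near the robot get full fidelity. Thresholds and
# tier names are invented for illustration.
FIDELITY_TIERS = [          # (max distance in meters, tier name)
    (50.0, "high"),         # full physics and animation
    (200.0, "medium"),      # simplified physics
    (float("inf"), "low"),  # lightweight kinematic update
]

def fidelity_for(distance_m: float) -> str:
    """Pick the fidelity tier for an agent at the given distance."""
    for max_dist, tier in FIDELITY_TIERS:
        if distance_m <= max_dist:
            return tier
    return "low"

tiers = [fidelity_for(d) for d in (10.0, 120.0, 800.0)]
```

Running this policy over thousands of agents keeps the per-frame cost dominated by the few agents that actually matter to the robot's decisions.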
To train our models we needed to generate color, depth, and pixel-wise semantic label images. Other open-source Unreal plugins require two or three render passes to extract these three images, at a substantial performance cost. The TempoSensors plugin can extract all three from a single render pass, thanks to a custom pixel buffer format that packs the data as tightly as possible, resulting in much improved execution speed.
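The packing idea can be illustrated with a toy per-pixel layout. This exact format is our invention, not TempoSensors' actual buffer, but it shows how color, a semantic label, and depth can share one tightly packed record so that a single pass produces all three images:

```python
import struct

# Toy illustration of the packed-buffer idea (this exact layout is
# invented): each pixel carries color (3x8 bits), a semantic label
# (8 bits), and depth (float32) in one 8-byte record, so one render
# pass yields color, label, and depth images together.
def pack_pixel(r: int, g: int, b: int, label: int, depth: float) -> bytes:
    return struct.pack("<BBBBf", r, g, b, label, depth)

def unpack_pixel(buf: bytes):
    r, g, b, label, depth = struct.unpack("<BBBBf", buf)
    return (r, g, b), label, depth

pixel = pack_pixel(120, 80, 200, 7, 12.5)
color, label, depth = unpack_pixel(pixel)
```

Packing everything into one buffer avoids both the extra render passes and the extra GPU-to-CPU copies that separate outputs would require.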

Familiar Interface
We chose to build VayuSim with Unreal because of its state-of-the-art real-time rendering features and battle-tested core engine. However, we wanted an interface that would feel familiar to a robotics engineer, even one who had never used a game engine. We developed an RPC system for Unreal in the TempoCore plugin, using gRPC and Protobuf, which allows us to control the simulation and script entire workflows from Python. For example, we can step time, change the time of day and location on Earth, or stream camera images, each with just a few lines of client code:
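As a hypothetical stand-in for that client code (class, method, and parameter names are invented for illustration; the real gRPC-generated TempoCore API will differ), the shape of such a workflow looks like this:

```python
# Hypothetical stand-in for a TempoCore-style Python client. All names
# here are invented for illustration; the real API is generated from
# Protobuf definitions and talks to the simulator over gRPC.
class SimClient:
    """Minimal fake client tracking the state an RPC client would control."""
    def __init__(self):
        self.sim_time_s = 0.0
        self.time_of_day_h = 12.0
        self.frames = []

    def step(self, dt_s: float = 0.1):
        """Advance simulated time by one fixed step."""
        self.sim_time_s += dt_s

    def set_time_of_day(self, hour: float):
        self.time_of_day_h = hour

    def stream_camera(self, n_frames: int):
        """Step the sim and collect a placeholder image per step."""
        for i in range(n_frames):
            self.step()
            self.frames.append(f"frame_{i}_t={self.sim_time_s:.1f}")

client = SimClient()
client.set_time_of_day(18.5)   # golden hour
client.stream_camera(3)        # step the sim and collect three images
```

The real client is just as terse: a few calls script an entire deterministic data-collection run.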

Using the Python API it’s easy to spawn an obstacle in our robot’s path, control a simulated bicyclist, or tell a simulated pedestrian to walk.
TempoWorld includes a reflection-based property API, allowing users to read or write any simulation property by name, while editing a scene or at runtime, without writing any custom API code. With such a flexible interface, seemingly complex tasks become trivially simple. For example, see how we can vary the parameters of procedurally generated leaves on our roads.
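In plain Python, the idea behind a reflection-based property API looks like this (the property paths and object layout are invented; TempoWorld's actual names differ):

```python
# Sketch of a reflection-style property API: read or write any property
# by its dotted path, with no per-property API. Object layout and
# property names are invented for illustration.
class Node:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

world = Node(roads=Node(leaves=Node(density=0.2, scale=1.0, hue=0.3)))

def get_property(root, path: str):
    """Resolve a dotted path like 'roads.leaves.density' to its value."""
    obj = root
    for part in path.split("."):
        obj = getattr(obj, part)
    return obj

def set_property(root, path: str, value):
    """Write a value at a dotted path, reflection-style."""
    parts = path.split(".")
    obj = root
    for part in parts[:-1]:
        obj = getattr(obj, part)
    setattr(obj, parts[-1], value)

set_property(world, "roads.leaves.density", 0.8)  # more leaves on the road
```

Because every property is addressable by name, sweeping a parameter across a batch of simulations is a one-line loop rather than a new API endpoint.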
When we need to extend the API, that too is straightforward: we define our message and RPC types in a .proto file, implement their handlers in C++, and Tempo automatically generates the Python API.
We use ROS as our onboard robot middleware, so integration with it was essential for VayuSim. We developed the TempoROS plugin, which natively integrates rclcpp, ROS’s C++ client library, with Unreal, enabling these two powerful systems to coexist in a single application. Unlike other ROS plugins for Unreal, TempoROS does not require a “bridge” process to translate and forward messages, improving performance and maintainability.
Looking Ahead
Today, VayuSim represents a step change in our simulation abilities. Looking forward, the effort we put into the generic Tempo platform demonstrates our commitment to simulation as a foundational technology as we expand our focus beyond the road margin. Although we developed VayuSim as an internal tool, we feel that the challenges that led us to build it are not unique. So we are excited to have the opportunity to give back to the robotics community by releasing a large portion of VayuSim as free open-source software, in the form of the Tempo Unreal plugins. We’re happy to say that Tempo is already being used by other companies and university researchers, and we encourage anyone searching for a custom simulation solution to take advantage of it.