Untold Engine Update: Gaussian Splats, Scripting Support, and macOS Build System

It has been a while since my last update, but I’ve been quietly working behind the scenes on several major features. Today, I want to share three big milestones for the Untold Engine.

Gaussian Splats

The first major update is that the Untold Engine now supports Gaussian Splat Rendering. This is a feature I’ve wanted to implement since I first learned about Gaussian Splatting last year. Other priorities kept delaying it, but a few weeks ago I finally had enough time to focus on it—and I got it working.

Gaussian Splats now run inside the Untold Editor, on iOS, in AR, and directly on a Vision Pro device. This means both 3D models and splats render natively on visionOS hardware, not just in the simulator.

There are a few current limitations:

  • I’m using Bitonic Sort instead of Radix Sort for depth sorting. Bitonic Sort works, but it is slower for large splat counts. Radix Sort is the long-term goal.
  • Spherical Harmonics support is not implemented yet. I’m hoping to add this before the end of the year.
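
To make the sorting trade-off concrete, here is a minimal CPU-side sketch of a bitonic sorting network. It is written in Python for brevity and is not the engine's GPU kernel; the function name and the padding with infinity are assumptions made for illustration. A bitonic network performs on the order of n log² n compare-and-swap steps, while a radix sort does work proportional to n for each digit pass, which is why radix tends to win as splat counts grow.

```python
import math

def bitonic_sort_indices(depths):
    """Return item indices ordered by ascending depth using a bitonic network."""
    n = len(depths)
    # A bitonic network needs a power-of-two length, so pad with +inf keys.
    size = 1
    while size < n:
        size *= 2
    keys = list(depths) + [math.inf] * (size - n)
    indices = list(range(size))

    k = 2
    while k <= size:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:         # distance between compared elements
            for i in range(size):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if ascending:
                        out_of_order = keys[i] > keys[partner]
                    else:
                        out_of_order = keys[i] < keys[partner]
                    if out_of_order:
                        keys[i], keys[partner] = keys[partner], keys[i]
                        indices[i], indices[partner] = indices[partner], indices[i]
            j //= 2
        k *= 2

    # Drop the padding entries, which carry indices >= n.
    return [i for i in indices if i < n]

order = bitonic_sort_indices([5.0, 1.0, 3.0, 2.0, 4.0])
print(order)  # [1, 3, 2, 4, 0]
```

On a GPU, the inner loop over i is what runs in parallel, one thread per element, which is the main appeal of the network despite the extra comparisons.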

Even with these limitations, this is a major step forward for the engine’s rendering capabilities.

Scripting Support

The engine now supports runtime scripting directly through the Untold Editor.

You can write game logic in Xcode, attach scripts to entities, and instantly see the results while the engine is running. All scripts appear in the Asset Browser, and you can add, link, or reload them with a click. This makes the development workflow smoother and faster.

This feature was long overdue, but I approached it carefully because of past experience integrating scripting languages. Instead of Lua or Python, I built a lightweight DSL (Domain-Specific Language) specifically for the Untold Engine. I used ChatGPT heavily at the beginning to help with the initial structure, and once everything made sense, I took over and customized the system for the engine.

If you see anything that can be improved, please let me know—this is a brand-new system and will evolve with feedback.

Feel free to check out the Scripting Section in the Docs to learn more.

Build System

Another major milestone is the new Build System.

After you set up your scene in the editor and attach scripts, the Untold Engine can now generate a macOS build of your game. The Build System packages your scenes, assets, and scripts and produces an app ready for the App Store. All you need to do is provide a project name and a destination path.

At the moment, the Build System supports macOS only, but iOS and visionOS support is planned.

What’s Next?

I’m currently working on a packaged app bundle for the Untold Engine. The idea is to let developers download a .dmg file and start building games immediately—without cloning the repo or setting up dependencies. Developers who want full control can still clone and build from source.

This bundled version will be called Untold Engine Studio, and it will include everything required: the Untold Engine, the Untold Editor, and all dependencies. My goal is to make the development experience as smooth and accessible as possible.

More updates coming soon. Thanks for following the journey!

Untold Engine Progress Update – New Editor and visionOS Support!

These past couple of months have been amazing for the Untold Engine, from getting its first contributor and sponsorship to adding visionOS support.

Let me tell you all about it.

Engine & Editor

You may recall that I had both the core and the editor integrated tightly in the engine. It worked nicely, but the coupling was going to give us headaches in the future.

Thanks to the efforts of our first contributor, miogds, the core of the engine and the editor are now decoupled.

So, this is the new architecture of the engine:

  • Core: Handles the runtime — rendering, physics, ECS, and all engine systems.
  • Editor: A dedicated app for scene creation, entity manipulation, and asset management.

This separation makes development cleaner and more modular, and it sets the stage for headless or custom integration workflows.

Additionally, the core engine stays in its original repository, UntoldEngine, while the editor now lives in a new, dedicated repository, UntoldEditor.

Unit Tests & Workflows

I've also been working on making the Untold Engine repository more professional.
This includes adding unit tests, GitHub Actions workflows, and automatic formatting and linting.

My hope is that these improvements will make contributing to the project much easier and more reliable.

Website & Documentation

Another area of progress has been the new website and documentation.
The documentation site covers how to install the engine, explore the APIs, and contribute to development.

You can check it out here: Untold Engine

Each engine release will include its own version of the docs for consistent developer onboarding.

visionOS Support

Lastly, the engine now compiles and runs on the visionOS simulator, the first step toward supporting Apple’s Vision Pro platform.

However, this is still early support — the engine has not yet been tested on an actual Vision Pro device.
We’ve already received an issue report related to Vision Pro hardware, so if you happen to have one and would like to help debug, you’re more than welcome to contribute!

Thanks for reading.

Debugging a Flickering Issue Caused by Asynchronous Culling

After implementing frustum culling in the Untold Engine, performance improved, but right away I noticed flickering. It didn’t happen every frame, but it was noticeable whenever most of the models were in view.

So, I opened up Instruments to profile the issue. I noticed warnings that the engine was holding on to the drawable too long. I tried restructuring things to hold on to the drawable for as short a time as possible, but nothing helped.

According to Instruments, the engine was not CPU-bound or GPU-bound. There was no clear indication of the root cause of the flickering.


Digging Deeper

At that point, I decided to record a short video of the issue. I slowed it down and went frame by frame. What I saw wasn’t the usual kind of flickering—it was different.

  • Frame 1: a certain set of models was visible.
  • Frame 2: a completely different set was visible.
  • Frame 3: some disappeared, others suddenly appeared.

Models were popping in and out, almost as if something was out of sync.

This was a huge hint: it looked like a data race.


The Culprit

Looking at the code confirmed it.

In the frustum culling command buffer completion handler, I was updating the visibleEntityId array. This array held all the entities that passed the culling test.

The problem was that Metal invokes this completion handler asynchronously, once the GPU has finished executing the command buffer, while the CPU was already using that same array during the rendering passes (shadow and geometry).

In other words, the CPU was iterating over visibleEntityId at the same time the completion handler might be rewriting it with the GPU's latest culling results.

Classic data race.
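
The hazard can be reproduced deterministically with a small sketch. It is written in Python rather than the engine's Swift, and the names (`visible_entity_ids`, `culling_completed`, `render_frame`) are hypothetical stand-ins, not the engine's API. The completion handler is fired by hand between the two passes to mimic an unlucky interleaving.

```python
# Shared list that both the renderer and the completion handler touch.
visible_entity_ids = [1, 2, 3]

def culling_completed(new_ids):
    """Stand-in for the completion handler: overwrites the shared list in place."""
    visible_entity_ids[:] = new_ids

def render_frame(handler_fires_mid_frame):
    """Run a shadow pass and a geometry pass over the shared list."""
    shadow_pass = list(visible_entity_ids)
    if handler_fires_mid_frame:
        # In the real engine this would arrive from another thread at any time.
        culling_completed([7, 8])
    geometry_pass = list(visible_entity_ids)
    return shadow_pass, geometry_pass

shadow, geometry = render_frame(handler_fires_mid_frame=True)
print(shadow, geometry)  # [1, 2, 3] [7, 8]
```

Within a single frame, the two passes disagree about which entities are visible, which matches the popping seen in the slowed-down video.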


The Fix: Triple Buffering

The solution was to add a triple-buffered visible entity list.

During culling, the GPU writes results into buffer n+1.

During rendering, the CPU continues to read from buffer n.

When the frame finishes and the render command buffer’s completion handler triggers, I update the index so the CPU reads from the freshly written buffer n+1 on the next frame.

This guarantees that the CPU never reads data being modified by the GPU. The renderer always sees a stable snapshot of the visible entities.
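
The scheme can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the engine's Swift code: the class and method names are hypothetical, and a real implementation would also need to make the index swap safe across threads.

```python
class VisibleEntityBuffers:
    """Ring of visible-entity lists: one being read, the next being written."""

    def __init__(self, slot_count=3):
        self.slots = [[] for _ in range(slot_count)]
        self.read_index = 0

    def _write_index(self):
        # Culling results always go to the slot after the one being read.
        return (self.read_index + 1) % len(self.slots)

    def store_culling_results(self, entity_ids):
        """Called when culling finishes: fill buffer n+1, never buffer n."""
        self.slots[self._write_index()] = list(entity_ids)

    def snapshot(self):
        """Called by the render passes: the stable list for this frame."""
        return self.slots[self.read_index]

    def frame_completed(self):
        """Called when the render command buffer completes: advance to n+1."""
        self.read_index = self._write_index()

buffers = VisibleEntityBuffers()
buffers.store_culling_results([1, 2, 3])
during_frame = list(buffers.snapshot())   # still the old, empty list
buffers.frame_completed()
next_frame = list(buffers.snapshot())     # now the freshly written list
print(during_frame, next_frame)  # [] [1, 2, 3]
```

The key property is that `store_culling_results` and `snapshot` never touch the same slot during a frame, so the shadow and geometry passes always agree with each other.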


The Result

With triple buffering in place, the flickering disappeared instantly. Models no longer popped in and out between frames.

This bug was a good reminder: sometimes what looks like a rendering artifact isn’t a math error at all, but a synchronization issue between CPU and GPU.


Lesson Learned

Whenever the GPU produces results asynchronously, the CPU should never iterate over those results directly. Always work with a snapshot. Triple buffering (or even double buffering) is a small architectural change that guarantees stability and avoids subtle bugs that can masquerade as rendering issues.

This experience reinforced for me how crucial synchronization and data ownership are when building GPU-driven systems—sometimes the hardest-looking bugs aren’t about shaders or math, but about who’s allowed to touch the data, and when.

Deferred Entity Destruction in ECS: A Mark-and-Sweep Approach

I found a bug in the Untold Engine in the weirdest way possible. After merging several branches into my develop branch, I decided to run a Swift formatter on the engine. Three files were changed. I ran the unit tests, they all passed, and then I figured I’d do a final performance check before pushing the branch to my repo.

So, I launched the engine, loaded a scene, and then deleted the scene.

The moment I did that, the console log started flooding with messages like:

  • Entity is missing or does not exist.
  • Does not have a Render Component.

This was the first time I had ever seen the engine behave like this when removing all entities from a scene. My first reaction was: the formatter broke something.

But the formatter’s changes were only cosmetic. There was no reason for this kind of bug.

At that point I was lost. So, I asked ChatGPT for some guidance, and it mentioned something interesting: maybe the formatter’s modifications had affected timing. That hint got me thinking.

After tinkering a bit, I realized the truth: this bug was always there. The formatter just exposed it earlier.

The Real Problem

My engine’s editor runs asynchronously from the engine’s core functions. When I clicked the button to remove all entities, the editor tried to clear the scene immediately — even if those entities were still being processed by a kernel or the render graph.

In other words, the engine was destroying entities while they were still in use. That’s why systems started complaining about missing entities and missing components.

The Solution: A Mini Garbage Collector

What I needed was a safe way to destroy entities. The fix was to implement a simple “garbage collector” for my ECS, with two phases:

  • Mark Phase – Instead of destroying entities right away, I mark them as pendingDestroy.
  • Sweep Phase – Once I know the command buffer has completed, I set a flag. In the next update() call, that flag triggers the sweep, where I finally destroy all entities that were marked.

This way, entity destruction only happens at a safe point in the loop, when nothing else is iterating over them.
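
The two phases can be sketched like this. It is a minimal Python illustration, not the engine's Swift implementation; the class and method names (`EntityRegistry`, `command_buffer_completed`, and so on) are hypothetical, and a real engine would need to synchronize the flag if it is set from another thread.

```python
class EntityRegistry:
    """Toy ECS registry with deferred (mark-and-sweep) destruction."""

    def __init__(self):
        self.alive = set()
        self.pending_destroy = set()
        self.sweep_requested = False
        self._next_id = 0

    def create_entity(self):
        entity_id = self._next_id
        self._next_id += 1
        self.alive.add(entity_id)
        return entity_id

    def destroy_entity(self, entity_id):
        """Mark phase: record the request, but leave the entity in place."""
        if entity_id in self.alive:
            self.pending_destroy.add(entity_id)

    def command_buffer_completed(self):
        """Stand-in for the completion callback: GPU work is done, sweep is safe."""
        self.sweep_requested = True

    def update(self):
        """Sweep phase: runs at a safe point at the start of the next update."""
        if not self.sweep_requested:
            return
        self.alive -= self.pending_destroy
        self.pending_destroy.clear()
        self.sweep_requested = False

registry = EntityRegistry()
ids = [registry.create_entity() for _ in range(3)]
registry.destroy_entity(ids[1])
registry.update()                       # no sweep yet: entity 1 still exists
still_there = ids[1] in registry.alive
registry.command_buffer_completed()
registry.update()                       # sweep runs now
gone = ids[1] not in registry.alive
print(still_there, gone)  # True True
```

Systems that are mid-frame keep seeing a consistent set of entities, and the actual removal happens in exactly one place.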

Conclusion

What looked like a weird formatter bug turned out to be a timing bug in my engine. Immediate destruction was unsafe — the real fix was to defer destruction until the right time.

By adding a simple mark-and-sweep system, I now have a mini garbage collector for entities. It keeps the engine stable, avoids “entity does not exist” spam, and gives me confidence that clearing a scene won’t blow everything up mid-frame.

Thanks for reading.

From 26.7 ms to 16.7 ms: How a Simple Optimization Boosted Performance

In my previous article, I talked about my attempts to improve the performance of the Untold Engine. Even after adding GPU frustum culling to reduce the CPU workload, the engine was still CPU-bound — stuck at around 26.7 ms per frame.

Profiling with Xcode Instruments pointed the finger at Metal’s encoder preparation, which appeared to take ~15 ms. Based on that, my next move seemed obvious: switch to bindless rendering.

What does that mean? Instead of rebinding textures and material properties for every draw call, I would move everything into a single argument buffer. Each draw would reference materials by index. In theory, this should drastically cut CPU overhead and pair nicely with GPU-driven culling.
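
As a toy model of that idea, the sketch below simply counts bind operations for the two approaches. It is plain Python with hypothetical names, it does not call Metal, and it assumes one bind per draw in the traditional path, which is a simplification.

```python
def binds_with_per_draw_binding(draw_material_indices):
    """Traditional path: every draw rebinds its material before drawing."""
    bind_calls = 0
    for _ in draw_material_indices:
        bind_calls += 1  # bind textures and properties for this draw
    return bind_calls

def binds_with_material_table(draw_material_indices, material_table):
    """Bindless-style path: bind the whole table once, then index into it."""
    bind_calls = 1  # the single table (argument buffer) bind
    for material_index in draw_material_indices:
        _material = material_table[material_index]  # lookup only, no bind
    return bind_calls

material_table = ["stone", "wood", "metal"]
draws = [i % len(material_table) for i in range(500)]
print(binds_with_per_draw_binding(draws))                # 500
print(binds_with_material_table(draws, material_table))  # 1
```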

But reality didn’t match theory. After spending days moving to a bindless model, I ran the engine with 500 models — and the performance needle didn’t budge. In fact, things got worse: encoder prep time increased from ~15 ms to ~17 ms.

You can imagine my disappointment. But I kept digging. And then I found the real bottleneck. Instruments showed the CPU was spending almost 9.5 ms just preparing data for GPU frustum culling.

So the encoder wasn’t the problem after all. As I dug into the code, I discovered the true culprit: a single function that queries all entities with specific component IDs.

Here’s what was happening:

👉 My component mask was stored as an array of 64 booleans. Every time I checked an entity, the code looped through all 64 slots, read from two arrays, and branched on each one. With 500 entities, that is 500 × 64 = 32,000 slot checks per query, every single frame. No wonder the CPU was choking.

The fix? Replace the boolean array with a single 64-bit integer and use a bitwise AND. That collapses the entire check into just two instructions.
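
As a rough illustration of the difference, here is a minimal sketch of both checks in Python. It is not the engine's actual Swift function; the component IDs and function names are hypothetical.

```python
# Hypothetical component IDs, each a bit position in the 64-bit mask.
TRANSFORM, RENDER, PHYSICS, AUDIO = 0, 1, 5, 7

def matches_bool_array(entity_flags, required_flags):
    """Old approach: walk all 64 slots, reading two arrays and branching each time."""
    matches = True
    for slot in range(64):
        if required_flags[slot] and not entity_flags[slot]:
            matches = False
    return matches

def to_bits(component_ids):
    """Pack a collection of component IDs into one integer mask."""
    bits = 0
    for component_id in component_ids:
        bits |= 1 << component_id
    return bits

def matches_bitmask(entity_bits, required_bits):
    """New approach: one AND and one compare, regardless of component count."""
    return (entity_bits & required_bits) == required_bits

entity = to_bits([TRANSFORM, RENDER, PHYSICS])
print(matches_bitmask(entity, to_bits([TRANSFORM, RENDER])))  # True
print(matches_bitmask(entity, to_bits([TRANSFORM, AUDIO])))   # False
```

Both functions answer the same question, whether the entity has every required component, but the second does it without a loop or per-slot branches.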

That one change dropped the CPU frame time from 26.7 ms down to 16.7 ms. The GPU frame time sits at 9.3 ms.

In other words, the engine now runs at a solid 60 fps.
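
The 60 fps figure follows directly from the frame time. The short calculation below assumes the frame rate is simply the reciprocal of the CPU frame time, which was the larger of the two timings quoted above; the function name is just for illustration.

```python
def frames_per_second(frame_time_ms):
    """Convert a per-frame time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms

before = frames_per_second(26.7)   # roughly 37.5 fps
after = frames_per_second(16.7)    # roughly 59.9 fps
budget_ms = 1000.0 / 60.0          # about 16.67 ms available per frame at 60 fps
print(round(before, 1), round(after, 1), round(budget_ms, 2))  # 37.5 59.9 16.67
```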

I’m happy with the results: the engine is no longer CPU-bound or GPU-bound.

But I’m not done yet. The next step is implementing occlusion culling — and I’m excited to see how far I can push performance.

Thanks for reading.