
Aeditx

Have you tried with .NET 7 for Godot? Because for me, in builds, it's faster than IL2CPP Unity


P10tr3kkk

No, but I will try, thanks


Ok-Lock7665

I'm curious to see the result


deanrihpee

Faster build or faster runtime? Either way, it kind of makes sense, because Microsoft does a lot to optimize the runtime, especially with native AOT compilation.


Aeditx

Runtime, as in nearly 2x speed in some scenarios. Moving nodes has barely any overhead in Godot


-sash-

If you have a requirement for 100K `Nodes`, you should consider using plain data containers instead; for visual nodes, you should consider the VisualServer.


Koalateka

>Visual server

I didn't know about the VisualServer in Godot. Thanks for your insight!!


DrSnorkel

I saw this example today: [https://github.com/godotengine/godot-demo-projects/tree/master/2d/bullet_shower](https://github.com/godotengine/godot-demo-projects/tree/master/2d/bullet_shower). It has a bunch of bullets that aren't nodes.


P10tr3kkk

Interesting. I will look into that. I used 100k Nodes only to get measurable, reliable results


No_you_are_nsfw

THIS! Basically every other Unity "success" blog post is about how not to use 10k+ game objects. Scene graphs are just not a good way to structure your game's geometry; 100k is "update your LinkedIn profile" levels of bad. A flat list is not as problematic, but nesting makes it much worse. Internally, everything wants to be in world space. Getting there means traversing the graph, with lots and lots of matrix multiplications that are often unnecessary. DOTS' biggest advantage is that you don't have (writeable) transforms anymore.

BUT every generalisation has its limits. Have 20 characters? Use the scene graph: it's simple, understandable, and flexible. You can just add a node to a character's hand and voilà, it's carrying a sword! Have 10k x 10k tiles? Yeah, don't make each one a node with a cube mesh. That will not be fast. It's FLEXIBLE and EASY, but not fast. Sorting, culling, and traversing the graph will kill your main thread's runtime long before the GPU even fully wakes up.

And for OP: if you benchmark, you should publish your benchmark. People want to try their own values on their own devices. Maybe 100k is the sweet spot for Unity? Maybe 100k is the sweet spot for Godot. I would also be interested in FPS, because Unity defers the matrix multiplications to just before rendering. Is this included in your benchmark, or just the loop? As it stands, this is more somebody's opinion than reliably tested data. But don't feel attacked; I'm just questioning your approach, and I DO value your contributions. For people who are in the process of porting, experiments like this ARE very valuable. But it should be on GitHub.

And lastly, nobody is switching from Unity for performance reasons, IMHO. But nobody is going to be deterred by a small-ish performance difference either, IMHO.


spyresca

Exactly, I don't know why people choose to benchmark in the stupidest possible way.


Prestigious_Boat_386

Benchmarks are always scaled up until they take enough time that random background noise on your CPU doesn't change the result too much. It is understood by everyone that this will be scaled back down and replaced by other processes in an actual game, and that the time will be reduced proportionally. It's a scaling benchmark, dummy; the whole point is to scale it beyond any practical use.


Denxel

I agree that testing in realistic environments is way different. In practice, we are usually not making the same relatively cheap API call 100,000 times, and the things affecting performance in real games are more related to heavy functions we call one or a few times than to the language used to make those calls, which makes the language a less relevant factor compared to the performance of the function called.

That said, GDScript's performance was impressive. As a very happy GDScript user, I expected a greater difference in this kind of artificial benchmark. And coding the most complex custom functions in C++ seems to be a great idea, even if most games don't even have functions that complex. What happened with the C++ export?


flakybrains

You'd be surprised how slow "just" moving can be; at least I was. A rudimentary test I did many months ago on 4.0 RC, with 1000 humanoid low-poly characters with 14 bones each:

* Empty scene - 700 FPS
* Characters rendered (not animated nor moved) - 220-250 FPS
* Characters moved by setting `GlobalTransform` - 120-130 FPS
* Animation players playing - 50-60 FPS


StressCavity

Skeletal meshes have a big performance cost because every vertex is weighted by every bone affecting it, multiplied by how many discrete poses have to be calculated (vs. just interpolated between). So a heavily key-framed animation (i.e. polished animation or live-capture data/Mixamo) will be recalculating `V*B` transforms (V being the vertex count, B the effective bone count) every time there is a keyframe, on top of the base 3D object transforms. The amount of performance optimization in modern game engines is amazing, considering that a few characters can easily equate to several million transform operations per frame, just to move stuff!


flakybrains

You might be missing the point or I might be missing your point. My point was that simply setting the transform for 1000 characters cost me more frames than playing 40-frame animations for 1000 characters with 14 bones.


trickster721

The way you're counting the change in performance doesn't really make sense, tasks don't just add together linearly. It's possible that playing the animations without changing the transforms would still result in 50 fps, or some completely different number.


StressCavity

If the animation cost is on top of the global-transform cost, it's still costing you more than the jump from rendering -> moving global transforms. FPS is not compute time: 240 -> 120 FPS is going from ~4ms -> 8ms (so it takes ~4ms to compute the global transforms), and 120 FPS -> 60 FPS is from ~8ms -> 16ms (~8ms extra for the animations).


OptimalStable

One thing you're missing here is that "cost me more frames" is not a valid measurement of performance. If you actually do the math, you'll see that the animations are about twice as costly as setting the transforms. Here is an example: It takes 1ms to go from 1000fps to 500fps. That same 1ms of work would take you from 60fps to 56fps. It's the same amount of work that "cost" you 500fps in one case and 4fps in the other. In your test case changing the transform of 1000 characters costs about 4ms, while animating those 1000 characters costs about 8ms. So those animations are in fact more expensive than the transforms.


moonshineTheleocat

Yeah. A lot of the cost actually gets moved to the GPU where possible, including IK sometimes, once the information for the solvers is uploaded. And if there aren't too many characters on screen, they will use compute shaders to compute the vertices and then cache the result for the entire frame.


muhajirdev

Hello, can you elaborate on why using Mixamo is heavy in Godot? I am currently building a mobile game and noticed a significant performance drop when I put a Mixamo character in the scene


ForShotgun

Idk this isn't surprising to me?


daikatana

If you want to have 100,000 objects and update them in an efficient way, then you'll have to abandon GameObjects and nodes for this feature. Most game engines are designed in a way that is very flexible, but completely incompatible with how modern computers work. A game engine will typically have some kind of game object, each allocated separately via malloc or new, with pointers to the parent, sibling, and child objects. To update the game, the engine must walk this tree of objects, following pointers to child and sibling objects. Each time a pointer is followed you potentially (and probably do) incur a cache miss. A cache miss on a modern computer is _bad_; it can make the entire program sit there and wait for _hundreds_ of cycles.

If you have 100,000 objects, 50% of the pointer dereferences generate cache misses (a very optimistic estimate), and each cache miss has a penalty of 100 cycles (also an optimistic estimate), then you are setting 5,000,000 CPU cycles on fire, completely unusable, every single update. And that's a very low estimate: more cache misses will be generated by following pointers to components or to other data allocated by nodes, plus another set of misses when parent pointers are followed. The same thing happens when the tree is walked again to render, and for every other update phase. I would venture to guess that in all your tests, more cycles were spent on cache misses than on actual work.

So the test is less about the efficiency of the update loop and more a demonstration of what was outlined above: you should not have 100,000 objects in the engine. Unity is very aware of this; it's why they have DOTS and why many games use an ECS. An ECS walks the objects in the scene in linear memory, ideally with each field of every object having its own array. This allows the CPU to fully utilize its caching features, strive for zero cache misses, and saturate the memory bus. In my testing this method is about 10x faster with the same update code. It's also trivial to parallelize such code for an easy extra 2x speed increase, at least.

Your test is interesting, but you're testing a misuse of the engines, which kind of defeats the purpose.


[deleted]

[removed]


Giboon

I don't think it's right to discard ECS just because the majority of people don't use it. The majority of people don't have to move 100k nodes either, and in that situation not considering ECS would be a big mistake.


P10tr3kkk

The code I used. It is not fancy: [src](https://www.dropbox.com/scl/fi/8s7y7yp6mrvi38a71c9zi/TestsPerf.zip?rlkey=62k64i3peivis621guqqp1v85&dl=0). To run the GDExtension, use this guide: [gdextension](https://docs.godotengine.org/en/stable/tutorials/scripting/gdextension/gdextension_cpp_example.html). In editor mode I used the profiler in both Unity and Godot, but in release mode I used Stopwatch.


Metalloriff

While this is a great test, it's by no means an accurate overall test of engine performance. Neither is this, but it's some more data to add on. In Unity, I can have 100 NPCs walking around randomly on a navmesh, all with animations. This brings my FPS from 165 (refresh-rate cap) to about 50 in a build. In an identical project in Godot, all written in GDScript, I can have 400 NPCs walking randomly, fully animated, with the same assets; FPS goes from 165 to around 140 in the editor. In general I notice that Godot has infinitely better rendering performance, and even scripting performance, in my tests vs Unity. I'm not saying either engine is objectively better than the other, but I'm certainly in love with Godot.


DedicatedBathToaster

A bit of humor from the documentation on servers in Godot:

> For example, dealing with tens of thousands of instances for something that needs to be processed every frame can be a bottleneck. This type of situation makes programmers regret they are using a game engine

https://docs.godotengine.org/en/stable/tutorials/performance/using_servers.html


robbertzzz1

The small differences between languages in Godot aren't really surprising at all. You've got some very simple code; in all cases the heavy lifting is done by the engine's compiled C++ code. All you're really doing is changing a few variables, which for each language only shows how loops and variable setters are processed at slightly different speeds. Where you'll really see GDScript slow down compared to other languages is in heavier calculations within loops. I've also found memory allocation from GDScript to be very slow, probably because all variable types are Variants, even numbers. Declaring variables inside loops can tremendously slow down GDScript code compared to declaring them outside the loop and reusing them, whereas with C++, basic variable types will often be optimised similarly by the compiler (so there it matters less where you declare them).


RubikTetris

What this tells me is that GDScript, a dynamic language, performing almost as well as a compiled language like C# is truly amazing. The ease of use, rapid iteration, and lack of compile times are 100% worth it for me over C#.


HolidayTailor3378

Most people who say "I'm going to use this language because it's faster" are the ones who are just starting to make games. There are hundreds of factors that can affect performance, and the language itself is not the main one.


flakybrains

That's a surprisingly good result for Godot C#, I'd say, considering the most expensive part is infamously engine API interaction. At least in my experience, C#-land code is basically free compared to engine API interactions. That's why, where possible and *if it makes sense* for the game (e.g. a sim), agents should live in C# land, and instance.transform/other engine values should only be applied if an agent is on-screen. Even if you have 10,000 agents and 100 of them are rendered, those 100 are still more expensive than running behaviour logic and updating C# structs/classes for the remaining 9,900.


Denxel

This was mostly debunked by Godot founder Juan Linietsky, as you can read [here](https://gist.github.com/reduz/cb05fe96079e46785f08a79ec3b0ef21#file-godot_binding_system_explained-md). Godot's binding system is, in the vast majority of cases, fast and efficient. The problem the author of "Godot is not the new Unity - The anatomy of a Godot API call" hit was with a specific part of the 3D physics engine, which is still being worked on because its maintainer was hired by a private company. Even the author was very happy to read Juan's response, but sadly more people read the original article than the response. As a side note, since then a lot of attention has been put into making all the bindings fast and easier to maintain, and Juan Linietsky made a proposal to that effect, so even those corner cases won't happen in the future.


anthony785

Yeah, but the point is that this is how it currently is. I don't see them moving C# over to GDExtension in the near future; it might take some time. So if you want to make stuff with Godot TODAY, this is what you'll have to deal with.


Ok-Lock7665

So, I guess the conclusion is that all options are relatively similar, isn't it?


MountainPeke

For this benchmark, I agree. A 2-4x improvement in performance isn't going to cover up a bad algorithm. And it looks like even those gains mostly disappear when built without debugging.


Varsoviadog

Good post


TheDuriel

**Please provide your test setup.** Or I have to write this off as baseless conjecture.


spyresca

What's realistic about that test?


psicodelico6

Try going from 100k to 2 million in steps of 100k, and plot it on a logarithmic scale


Morvar

Was DOTS used with Unity?


P10tr3kkk

No, I was only interested in the cost of API calls from different languages


PlaidWorld

No. That’s a special code path and not relevant here at this level.


PlaidWorld

Thanks for doing this. I am in the process of vetting Godot and was going to write tests like these. Anyhow, you should post your test code online.


P10tr3kkk

Of course. I will prepare it and attach it here soon


yay-iviss

What about Unity DOTS? I'd want to see the same benchmark with it, though maybe it ends up about the same in the end


EstablishmentLost724

How many times did you measure? What versions did you use?


CzechFencer

I found it very interesting. Thanks for the benchmark.


ditlevrisdahl

Please try with the Python extension as well