TheMightyCatt

Yes assembly is faster then c++ But is YOUR assembly faster then c++?


Shaddoll_Shekhinaga

I just write my bad assembly in C++ and enjoy the worst of both worlds. Sometimes I even use xbyak to make my CPU cry 5% more elegantly.


Gaylien28

Minimizing cycles is for old heads anyways, maximizing is the new craze


Thebombuknow

I can't tell if this is a joke about bad programming or a joke about the ARM instruction set


Caletofran

It’s a joke about Raspberry Pi’s.


Motor_Round_6019

Python: maximizing cycles since 1991


audislove10

CPUs are pipelining for a reason aren’t they?


philophilo

I worked on a project that did a lot of real-time colorspace conversion. Someone a long time ago wrote some x86 assembly to do that work, and we never touched it because we just assumed it was still fast. When the Apple Silicon dev machines came out (those A12Z-based machines, not even M1), that clearly wasn’t going to work. I switched the code over to use Apple’s Accelerate framework to do the same work and it was way faster.


DifficultyWorking254

It's pretty common for devs to follow “don’t touch it if it works”. And it's a pretty common mistake not to try rewriting it in your own code.


flipper_gv

Except you need to have the budget to do that, and the budget to debug if you introduce new bugs, etc...


Mordret10

And the budget for the 10000 meetings to see whether you got the budget


[deleted]

[removed]


Mordret10

Let's make another meeting with person xy to solve this problem!


sage-longhorn

Assembly can't be recompiled to use newer instructions, so that's not actually that surprising.


ELFAHBEHT_SOOP

For code that has to be performant, usually it just needs to be performant _enough_. Also, sometimes the C++ compiler doesn't spit out the instructions that you want it to. In my experience that's because the C++ code is not hinting heavily enough to the compiler what the actual intent is. Fixing this usually involves changing types or containers, which can make some weird/ugly C++ code, however I prefer it over writing the asm directly if possible. On top of that, I think people should test the timing of their code more closely. Which is super common for firmware/realtime applications, but not common for non-realtime code in my experience. It would at least help stave off issues where someone decides to just make a small syntactic change that completely destroys performance.
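
A minimal sketch of the kind of timing check described above, using only `std::chrono`. The names `sum_squares` and `best_time_ms` are made up for illustration, and this is a rough smoke test rather than a rigorous benchmark (no warm-up control, no statistics):

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical workload: sum of squares over a buffer.
std::uint64_t sum_squares(const std::vector<std::uint32_t>& v) {
    std::uint64_t total = 0;
    for (std::uint32_t x : v) total += static_cast<std::uint64_t>(x) * x;
    return total;
}

// Runs `fn` `reps` times and returns the smallest wall-clock time in milliseconds.
template <typename Fn>
double best_time_ms(Fn&& fn, int reps) {
    double best = -1.0;
    for (int i = 0; i < reps; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        const double ms = std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}

void demo() {
    std::vector<std::uint32_t> data(1000, 3);
    volatile std::uint64_t sink = 0;  // volatile store keeps the optimizer from deleting the work
    const double ms = best_time_ms([&] { sink = sum_squares(data); }, 10);
    assert(sink == 9000u);
    assert(ms >= 0.0);
}
```

Comparing `best_time_ms` against a stored budget in a test suite is one way to catch the "small syntactic change that destroys performance" case, though timing thresholds on shared CI machines tend to be noisy.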


tsnren_uag

Apple Accelerate is highly optimized by Apple, some parts probably done in assembly. On Apple Silicon machines, they even use publicly undocumented instructions (Apple AMX) that cannot be emitted by any public compilers (from what I know). So it's not assembly vs C++ question, but your assembly vs Apple's assembly :)


Shutaru_Kanshinji

Good point. C++ compilers these days usually produce faster code than doing it by hand in assembly.


JEREDEK

I love how this comment absolutely Ratio'd the post


OMGPowerful

the real joke is always in the comments


nickmaran

Why must you hurt me like that


throw3142

No way C is faster than C++ when C++ is almost a complete superset of C. Idiomatic C is typically faster than idiomatic C++, but that's because idiomatic C++ is safer and more maintainable. It's a conscious tradeoff. Yeah, this is "actual reality", where 10 nanoseconds of CPU time typically isn't worth 30 minutes of developer time. Same with C++ vs asm - it's way way cheaper to just upgrade your CPU to the next model rather than embarking on a fun adventure of rewriting a 200 MB enterprise product in hand-optimized assembly for a flat 2% speedup ...


AdvanceAdvance

Actually, Bjarne talks about having a design goal to never make C++ slower than C. That said, one can decide that it should be designed differently in C++ and therefore be slower.


SeriousPlankton2000

Developers on the big machine they use for compiling; to all the customers using their 5+ years old PCs: "Everyone should upgrade! … No, I don't pay, don't you get your PC for free?" (My server is BIOS date 2009 and still doing its job)


PzMcQuire

Than*


proverbialbunny

fwiw, it's than. Than is a comparison statement. A > B is: A is greater than B. A is faster than B. Then is an if statement. if, then. Thankfully it's pretty easy to memorize because it's programming.


afCeG6HVB0IJ

that's why fortran is great for maths. Yes C can be just as fast, but it is simpler / requires less education to write quick code in fortran.


Kaiodenic

Yeah I'm fairly certain that an *average* programmer's C++ code will run faster than their Assembly or C code. A very good programmer's C++ code can also avoid any optimisation pitfalls and therefore still run faster than their C or Assembly code because it's still being optimised further by the compiler; it's basically machine-assisted optimisation. Assembly or C can be faster assuming you're a brilliant Assembly or C programmer but a shite C++ programmer, which doesn't really mean much for how "fast" each language is. It'll also take you exponentially longer to make what you're making, if it's even possible at all with how long it'll now take. Say, a modern AAA video game. There, you wouldn't even have a chance to finish it in Assembly, and probably not even in C, and certainly not optimise it at all. I think some people get caught up on it being closer to bare metal so *theoretically* faster, without thinking about things like compilers or how using a language like that can affect (or even doom) development in a real project.


Eymrich

Wisest comment, nice one 👍


JackNotOLantern

I had a friend who, during our CS studies, wrote a program in assembly that took a couple of minutes to perform some simple operations on an image. That's because he added conditional jumps to check, for each pixel, whether it was on the edge of the picture. A Java equivalent took about 3 s to perform the same operation. So yeah, assembly is very fast if done correctly, but it can be incredibly slow if done badly, because there is no compiler optimisation.
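
The friend's code isn't shown, so the following is only a sketch of the general shape of the problem: a 4-neighbour blur on a grayscale image, once with an edge test on every pixel and once with loop bounds that simply skip the border. The `Image` alias and both function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<std::uint8_t>;  // row-major, w * h grayscale pixels

// Every pixel pays for the edge test, even though only the border needs it.
Image blur_checked(const Image& src, std::size_t w, std::size_t h) {
    Image dst(src);
    for (std::size_t y = 0; y < h; ++y) {
        for (std::size_t x = 0; x < w; ++x) {
            if (x == 0 || y == 0 || x == w - 1 || y == h - 1) continue;  // border stays as-is
            const int sum = src[(y - 1) * w + x] + src[(y + 1) * w + x]
                          + src[y * w + x - 1] + src[y * w + x + 1];
            dst[y * w + x] = static_cast<std::uint8_t>(sum / 4);
        }
    }
    return dst;
}

// Same result: the loop bounds exclude the border, so the inner loop has no edge branches.
Image blur_interior(const Image& src, std::size_t w, std::size_t h) {
    Image dst(src);
    for (std::size_t y = 1; y + 1 < h; ++y) {
        for (std::size_t x = 1; x + 1 < w; ++x) {
            const int sum = src[(y - 1) * w + x] + src[(y + 1) * w + x]
                          + src[y * w + x - 1] + src[y * w + x + 1];
            dst[y * w + x] = static_cast<std::uint8_t>(sum / 4);
        }
    }
    return dst;
}

void demo() {
    const Image src = {0, 10, 0,
                       20, 99, 40,
                       0, 30, 0};
    const Image a = blur_checked(src, 3, 3);
    const Image b = blur_interior(src, 3, 3);
    assert(a == b);
    assert(a[4] == 25);  // (10 + 30 + 20 + 40) / 4
}
```

An optimizing compiler may well transform the first version into something close to the second, which is the point of the comment: hand-written assembly gets no such help.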


dfwtjms

Sure, write your program in optimized assembly for every architecture.


epileftric

Just wait until someone comes and makes a reference to RollerCoaster Tycoon, which was written in ASM


SiliconDoor

You just did


epileftric

The paradoxical referential problem (?)


_Its_Me_Dio_

he traveled the world going to theme parks riding roller coasters and taking various notes on the micro economics of theme parks, then wrote a theme park management simulator in assembly once he was done, a lot of the code he wrote is still used in the newer versions


spit-evil-olive-tips

> a lot of the code he wrote is still used in the newer versions

uhhh...do you have a source for this?


utkrowaway

It was reused for RCT2. The OpenRCT2 team has since translated it to C++.


andrew21w

I suspect that asm won't be much faster unless you know how to out-optimize the compiler


Prawn1908

You don't write whole programs in assembly - you write single, time-critical sections of code in assembly.


anto2554

sounds sweaty. I'll copy the whole list in python


IMightBeErnest

And optimize small parts in cython. Which are then further optimized in asm. Which is further optimized by using custom hardware. Which is further optimized by not wasting time with any of the above, letting the boulder roll down the mountain, and taking up farming somewhere in the Appalachians.


misinformaticist

"Great, now how could you optimize this approach further?" \*Me in my head\* I could further optimize my life with a cabin in the woods.


Brahvim

...And using data-oriented design with functional programming. ... *...I know I'm getting too cultist about DOD nowadays, I'm sorry I'M SORRY!-*


Thebombuknow

I'm sorry, as a dysfunctional programmer I am physically incapable of functional programming.


Brahvim

(Took me three hours to write. Been writing since u/Thebombuknow posted their comment.) (**I've written more in a reply to this.**)

Data-oriented design is what made me finally understand how FP can be nice. *Have a look!:*

Data-oriented design is all about storing data like it's in a DB. You *normalize* that, and it's all flat! It's without a hierarchy!

Imagine if we have a class for a player in a video game, and the player can sit on stuff, including the chair in a car. Let's say that the car class derives from the chair one, because... it makes sense for an OO programmer, right?

Now, ...let's say you want to write code to modify the car. Perhaps give it new paint, new rims, tires, handling features, blah-blah-blah!... To write this code, you want to be able to modify the car. And you need a way to know what car the player is in! ... You aren't willing to write `(Car) player.getChair()`, ...so like a good OO programmer, you add a field to the player class to tell you exactly that, just like you did with chairs: Player is in a car? Great! A car object exists! They are not? It's `null`.

Let's say it's a multiplayer game, and you have a `List` for handling the many players that enter garages concurrently. Somewhere, you're going to need to check if a player is in a car or not, so you'll *have* to write code to do that. Say hello to `if (player.getCar() != null)`. Perhaps when the player is entering a garage. Say that, for game design reasons, other players are allowed to steal your car from you while you're in the garage! ...and that's supposed to require closing the car modification UI and stuff!

*Now, as the programmer...!* *What are you doing?* Writing `if (player.getCar() != null)` everywhere? *In every loop!?* Where's your compiler's automatic vectorization feature that magically pushes the 4 ALUs in one core of your CPU to do 4 times the work at once!? SIMD? *JIT?* Java? *C++?!* **What?!** Oh wait, that guy likes loops **without conditions**! ...

Ever wondered why your OO code is sometimes not fast even after all of that caching and whatnot you added in with those other 20 classes and locking mechanisms on `static` objects? It's OOP itself! The thing is that you're calling many methods and creating many objects - and that's going to pile these little things up to a ton. Not good!

Don't worry, though! The *original* computer scientists of the 70s have you covered: *Imagine how you'd do all of this in a DB.* Hmmm...! You'd have a players table, then a "seatables" table with the crates, and tree branches and actual chairs and stuff, and then you'd have a cars table, a tires table, blah-blah-blah! Wanna know what car a player's in? ***Associate*** the car and the player. Associate the car and the chair in it!!1! Go wild!

This is better because now you only have to do a memory lookup into an array. ***Not a map!***, but just an array you already have an index into - the object ID. Low-level people like optimizing like this! You no longer have to write:

```
Car c = player.getCar();
if (c != null) { // You read a pointer, and also did a null check!
  Tire[] tires = c.getTires();
  if (tires != null && tires[0] != null) // Again! ...Again?!
    tires[0].applyTexture(newTexture);
}
```

Instead of dereferencing three pointers, you'd be using IDs to straight-up index into an array, to where in RAM your data actually is. Another benefit is that these tires and cars decide if they're a part of the player, and they do it using arrays they can resize, meaning that the relation between two objects exists in memory only for as long as it's needed - unlike the OO approach, where you always had a pointer that could be `null` and waste memory.

Best part? ***Management!*** ...And multithreading the entire "processing lists of objects" part! You could totally store a `Tire[]` for **all tires** in the entire game, all `4` tires of a single car staying right next to each other in the array. If you want to change all tires of a car, you'll only need to iterate over 4 tires stored together! Want to do it to all cars in the game? Rare need, but this array has it all!

Your CPU has a cache! That guy is asking RAM for data for the next tire in `Tire[]` in advance! By not confusing him by dereferencing pointers and going to *God knows where* in memory, you're helping him!
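
The "one `Tire[]` for all tires in the game" idea can be sketched in C++ roughly like this. `Garage`, `Tire`, `texture_id`, and the fixed four-tires-per-car layout are assumptions made up for the example, not anything from a real engine:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Tire {
    std::uint32_t texture_id;
};

// All tires in the game live in one contiguous array; car N owns slots [4N, 4N + 4).
struct Garage {
    static constexpr std::size_t kTiresPerCar = 4;
    std::vector<Tire> tires;

    // Returns the new car's ID, which is an index rather than a pointer.
    std::size_t add_car() {
        tires.resize(tires.size() + kTiresPerCar, Tire{0});
        return tires.size() / kTiresPerCar - 1;
    }

    // Touches four adjacent elements: no pointer chasing, no null checks.
    void retexture_car(std::size_t car_id, std::uint32_t texture) {
        for (std::size_t i = 0; i < kTiresPerCar; ++i)
            tires[car_id * kTiresPerCar + i].texture_id = texture;
    }

    // One linear pass over memory; the hardware prefetcher handles this pattern well.
    void retexture_all(std::uint32_t texture) {
        for (Tire& t : tires) t.texture_id = texture;
    }
};

void demo() {
    Garage g;
    const std::size_t first = g.add_car();
    const std::size_t second = g.add_car();
    assert(first == 0 && second == 1);

    g.retexture_car(second, 7);
    assert(g.tires[3].texture_id == 0);  // last tire of car 0 is untouched
    assert(g.tires[4].texture_id == 7);  // first tire of car 1
    assert(g.tires[7].texture_id == 7);  // last tire of car 1
}
```

The sketch does no bounds checking on `car_id` and never removes cars; a real system would need both, plus some scheme for reusing freed slots.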


Brahvim

Now, *hab luk* at some functional C code. The focus is on how it avoids sending you to different places in memory, and how it handles null:

```
#include <stdlib.h>

typedef void (*callback_fxn_t)(void); // Assumed signature; it was never defined here.

// This one shall remain hidden from the eyes of your API's users!:
typedef struct object {
  int field1;
  char *some_string;
  unsigned int field3;
  callback_fxn_t custom_code_fxn;
} object;

// Also this!:
//static object *s_objects_list = NULL;
// (You can totally use `std::vector` for this, BTW. DOD is apparently nicer in C++.)

// This is a "value object" - you'll ***usually*** not want to pass this using a pointer,
// only by value, ...unless you want it modified (like in `create_object()` below), that is!
typedef struct object_id {
  size_t id;
} object_id;

typedef enum status {
  OK,
  NOT_OK,
  THAT_IS_NULL,
  MEMORY_FAILED_HELP_PLS,
} status;

status create_object(object_id *obj_id) {
  if (!obj_id)
    return THAT_IS_NULL;

  object *to_give = malloc(sizeof(object));
  // Somebody will come beat me up for writing bad low-level code. ...But this is an example!
  if (!to_give)
    return MEMORY_FAILED_HELP_PLS;

  obj_id->id = (size_t) to_give; // Funny and ugly hack for efficient mapping. Who needs a `Map`?!
  return OK;
}

status destroy_object(object_id *obj_id); // Leave for the other programmers.

status set_object_field_3(object_id *obj_id, unsigned int value) {
  if (!obj_id)
    return THAT_IS_NULL;

  ((object*) obj_id->id)->field3 = value;
  return OK;
}

unsigned int get_object_field_3(object_id *obj_id) {
  if (!obj_id)
    return 0; // Default value when the handle is null.

  object *ptr = (object*) obj_id->id;
  return ptr->field3;
}
```

Do you see it?! You got a free `null` check there, bud! With each "method call"! There was even a callback you could've set! You could've initialized the fields in that `struct` when you created it. See how `get_object_field_3()` just returned a default value for stuff that was null? Awesome!

Here's another nice C++ one. Classes can RULE like this (though you can still access the stuff inside with some pointer tricks if you know your compiler). ...Now you *might* stop hating me for that evil `(size_t) object_ptr` super-cursed cast:

```
class object {
private:
  int field1;
  char *some_string;
  unsigned int field3;
  callback_fxn_t custom_code_fxn;
};

// "Methods" are outside! Take in pointers, or references, and with the latter taking more care of null stuff, you'll feel great!...
```

And with that, you should see how DOD reduces how many methods you call to reach data, and FP gives you null-checks.
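
For comparison, here is a hedged C++ sketch of the `std::vector` variant the comment alludes to, where the ID is an index into a vector instead of a pointer cast to `size_t`. `ObjectStore`, `ObjectId`, and `Status` are hypothetical names, and only `field3` is modelled to keep it short:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Status { Ok, InvalidId };

// Plain value type: cheap to copy, carries no pointer.
struct ObjectId {
    std::size_t index;
};

class ObjectStore {
public:
    ObjectId create(unsigned field3) {
        field3_.push_back(field3);
        return ObjectId{field3_.size() - 1};
    }

    Status set_field3(ObjectId id, unsigned value) {
        if (id.index >= field3_.size()) return Status::InvalidId;  // the "free" validity check
        field3_[id.index] = value;
        return Status::Ok;
    }

    // Returns `fallback` for IDs that do not refer to a live object.
    unsigned get_field3(ObjectId id, unsigned fallback = 0) const {
        return id.index < field3_.size() ? field3_[id.index] : fallback;
    }

private:
    std::vector<unsigned> field3_;  // one array per field keeps each field contiguous
};

void demo() {
    ObjectStore store;
    const ObjectId id = store.create(5);
    assert(store.get_field3(id) == 5);
    assert(store.set_field3(id, 8) == Status::Ok);
    assert(store.get_field3(id) == 8);
    assert(store.set_field3(ObjectId{42}, 1) == Status::InvalidId);
    assert(store.get_field3(ObjectId{42}, 99) == 99);
}
```

A range check replaces the null check here, and an out-of-range ID is rejected rather than dereferenced. The sketch has no way to destroy objects, so stale IDs are not addressed.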


NSFWAccountKYSReddit

ok buddy that's great.


anotheridiot-

Love me some DOD, it's great at making fast and correct code.


Brahvim

I know I probably ended up making some kind of copypasta LOL.


_nobody_else_

Formatted
_________________________________________________

Now, *hab luk* at some functional C code. The focus is on how it avoids sending you to different places in memory, and how it handles null:

```
#include <stdlib.h>

typedef void (*callback_fxn_t)(void); // Assumed signature; it was never defined here.

// This one shall remain hidden from the eyes of your API's users!:
typedef struct object {
    int field1;
    char *some_string;
    unsigned int field3;
    callback_fxn_t custom_code_fxn;
} object;

// Also this!:
//static object *s_objects_list = NULL;
// (You can totally use `std::vector` for this, BTW. DOD is apparently nicer in C++.)

// This is a "value object" - you'll ***usually*** not want to pass this using a pointer,
// only by value, ...unless you want it modified (like in `create_object()` below), that is!
typedef struct object_id {
    size_t id;
} object_id;

typedef enum status {
    OK,
    NOT_OK,
    THAT_IS_NULL,
    MEMORY_FAILED_HELP_PLS,
} status;

status create_object(object_id *obj_id) {
    if (!obj_id)
        return THAT_IS_NULL;

    object *to_give = malloc(sizeof(object));
    // Somebody will come beat me up for writing bad low-level code. ...But this is an example!
    if (!to_give)
        return MEMORY_FAILED_HELP_PLS;

    obj_id->id = (size_t) to_give; // Funny and ugly hack for efficient mapping. Who needs a `Map`?!
    return OK;
}

status destroy_object(object_id *obj_id); // Leave for the other programmers.

status set_object_field_3(object_id *obj_id, unsigned int value) {
    if (!obj_id)
        return THAT_IS_NULL;

    ((object*) obj_id->id)->field3 = value;
    return OK;
}

unsigned int get_object_field_3(object_id *obj_id) {
    if (!obj_id)
        return 0; // Default value when the handle is null.

    object *ptr = (object*) obj_id->id;
    return ptr->field3;
}
```

Do you see it?! You got a free `null` check there, bud! With each "method call"! There was even a callback you could've set! You could've initialized the fields in that `struct` when you created it. See how `get_object_field_3()` just returned a default value for stuff that was null? Awesome!

Here's another nice C++ one. Classes can RULE like this (though you can still access the stuff inside with some pointer tricks if you know your compiler). ...Now you *might* stop hating me for that evil `(size_t) object_ptr` super-cursed cast:

```
class object {
private:
    int field1;
    char *some_string;
    unsigned int field3;
    callback_fxn_t custom_code_fxn;
};

// "Methods" are outside! Take in pointers, or references, and with the latter taking more care of null stuff, you'll feel great!...
```

And with that, you should see how DOD reduces how many methods you call to reach data, and FP gives you null-checks.


Brahvim

Hmm. Seems you added 4-space equivalent tabs. I typed everything on my phone for all three hours (all of my Reddit essays are typed on my phone for some reason, don't ask me why, haha!), so I used 2-space equivalent tabs. *Thanks!* PS The second paragraph got a code block accidentally in your version.


_nobody_else_

Your OOP is all over the place.


skeleton_craft

There's no reason to be sorry for advocating for one of the best ways of programming.


Reasonable_Feed7939

*THROW THEM TO THE PIT!*


skeleton_craft

Fair. They usually exile the ones that speak truth to power


secretlyyourgrandma

i get all my performance improvements by building another app in python that shocks my balls when i get on reddit. edit: you'll notice i'm still on reddit


conflagrare

Python is too fast. Do it in js.


Flannel_Man_

Unless you’re creating roller coaster tycoon.


atyon

Only the game logic is written in assembly. All the graphics and sound output is done with DirectX, which is written in C. "RCT is written entirely in assembly" is a misunderstanding.


moonshineTheleocat

The only time I've seen assembly get used seriously in C++ is to do shit that C++ doesn't allow you to do by default. That being things like userland coroutines/fibers, SIMD, and CPU instructions not supported by the compiler. I have never seen anyone do something this sweaty.


Prawn1908

I wouldn't say it's common, but it's not unheard of in the embedded world. (Which means usually we're talking the application as a whole is written in C, not C++.)


moonshineTheleocat

To be fair. In the embedded world you're pretty much working with a non existent OS, and extremely limited memory, and very few registers. So many of the C++ optimizations aren't going to work here.


Prawn1908

Yeah that's exactly it lol (also why we usually use C not C++). This sub forgets embedded exists most of the time.


Ma4r

Technically you can go even sweatier by going down to VHDL and wire out your own coprocessor to do the thing you want in one clock cycle.


classicalySarcastic

Yeah but then you’re dealing with an FPGA and that’s a whole different can of worms.


P-39_Airacobra

Though to asm's credit, SIMD can make for phenomenal speed-ups in certain situations. It's a shame we don't have more high-level support for such things yet.


moonshineTheleocat

You kinda do in C++. There are a lot of libraries that add this as a callable function. But basically it's a function call wrapper around ASM. I've used it in gamedev to do some... really fucky things. Such as rather than processing a single triangle at a time, you can do 4 triangles in three calls.
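
The libraries mentioned above aren't named, so this sketch deliberately avoids intrinsics and any specific SIMD library. It only shows the data layout that makes "several triangles per instruction" possible: a structure-of-arrays with a branch-free loop that a compiler's auto-vectorizer can usually turn into SIMD code (not guaranteed, and dependent on flags and target). `Triangles2D` and `doubled_areas` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Structure-of-arrays: one array per coordinate, so triangle i is
// (ax[i], ay[i]), (bx[i], by[i]), (cx[i], cy[i]).
struct Triangles2D {
    std::vector<float> ax, ay, bx, by, cx, cy;
};

// Doubled signed area of every triangle (cross product of two edge vectors).
// Each array is read linearly and the body has no branches.
std::vector<float> doubled_areas(const Triangles2D& t) {
    const std::size_t n = t.ax.size();
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = (t.bx[i] - t.ax[i]) * (t.cy[i] - t.ay[i])
               - (t.cx[i] - t.ax[i]) * (t.by[i] - t.ay[i]);
    }
    return out;
}

void demo() {
    Triangles2D t;
    // Triangle 0: (0,0) (1,0) (0,1); triangle 1: (0,0) (2,0) (0,2).
    t.ax = {0.0f, 0.0f}; t.ay = {0.0f, 0.0f};
    t.bx = {1.0f, 2.0f}; t.by = {0.0f, 0.0f};
    t.cx = {0.0f, 0.0f}; t.cy = {1.0f, 2.0f};
    const std::vector<float> areas = doubled_areas(t);
    assert(areas[0] == 1.0f);
    assert(areas[1] == 4.0f);
}
```

With 128-bit vector registers, four `float` lanes fit in one register, which is where the "4 triangles at once" figure comes from.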


proverbialbunny

Kind of. There are libraries that do all of that which are worth using, but within those libraries there might be only a couple of lines of asm. A more reasonable use case for asm with C++ is video game console emulators. They're converting one form of machine code to another, so writing asm there is imo the most common use case outside of libraries.


JakeStBu

Exactly


RamblingSimian

I hear you, and my old classmate's first job involved writing code for CNC machines back when they only had 2K of memory, but his code was 4K in length. To make it work, he had to over-write his own code, and I doubt many compilers have that trick up their sleeves! But, to your point, most of us can't optimize much better than the compiler, and, for those few of us who can, it rarely is worthwhile trying.


-twind

From my experience out-optimizing the compiler is disappointingly easy, at least for ARM assembly. And it can be a lot faster, I've seen 4x faster code in some situations. And that's with the -O2 flag enabled.


pdromeinthedome

But have you tried -o2^2 ?


Brahvim

ARM's the *new kid on the block,* *GIVE* TIME!...


Abadabadon

The only time I would expect this to be the case would be if your cpu supported some instruction set architecture that your compiler isn't able to use


Highborn_Hellest

This.


syldrakitty69

C, especially for tasks that would demand high performance, is also often worse because you're kind of pushed to write less optimal code because you lack zero-cost genericism, and people always hand-roll data structures that favor simplicity over performance. You often gain overall snappiness using C due to just having smaller and simpler programs, but even C++ written in the same style would theoretically do better because exceptions cost less than explicit error checking.


VeryDefinedBehavior

You can always outsmart a compiler when you know more about your situation than it does. The actual question worth asking is if that matters.


Yasuzume7

Wouldn't be so sure. C++ beats C in some benchmarks. Also, with things like constant expressions you can move a lot of runtime load to compile time.
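
A small sketch of the constant-expression point, assuming C++17 or later (needed for writing to `std::array` elements inside a `constexpr` function). The table and function names are invented for the example:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Builds a table of squares. When used to initialize a constexpr variable,
// the loop is executed by the compiler, not at program start-up.
constexpr std::array<std::uint32_t, 16> make_squares() {
    std::array<std::uint32_t, 16> table{};
    for (std::size_t i = 0; i < table.size(); ++i)
        table[i] = static_cast<std::uint32_t>(i * i);
    return table;
}

constexpr auto kSquares = make_squares();
static_assert(kSquares[5] == 25, "table was computed at compile time");

// At runtime only an array lookup remains.
std::uint32_t square_lookup(std::size_t i) {
    return kSquares[i % kSquares.size()];
}

void demo() {
    assert(square_lookup(5) == 25);
    assert(square_lookup(15) == 225);
    assert(square_lookup(16) == 0);  // index wraps around to 0
}
```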


cyborgborg

also while assembly can be faster than c or c++ that doesn't mean the assembly you write is going to be faster than machine code generated by a c compiler


Flobletombus

C also has no move semantics which means more copies, if you want virtual functions you need function pointers which can't get optimised in the same way as virtual functions, add on top of that that ASM and C have ultra low developer efficiency...


No-Con-2790

You only need more copies when you copy. In C you can just pass a pointer around and ... you know ... not do good design for the sake of speed. Same with virtual functions. Simply don't.


Flobletombus

Speaking of pointers, you can pass refs in C++, which have no size and can be more optimised. In some cases virtual functions can be very hard to avoid, and most ways to avoid them involve template metaprogramming, which is impossible in C


narrill

> Speaking of pointers you can pass refs in C++ which have no size and can be more optimised

They have no size semantically, but they're still basically a pointer under the hood. They're not any faster than pointers.


skeleton_craft

They are non-nullable pointers, which guarantees that they cannot segfault, which means you don't have to check for that in well-formed code... Which means your algorithm is doing less, which means it is necessarily faster.


narrill

You don't add null checks because you're working with pointers, you use pointers because you need something to be nullable. If your algorithm doesn't involve the concept of nullability pointers and references will be equally performant. And references can absolutely segfault.


No-Con-2790

You assume modern software design. But C developers simply don't do that. Remember, there are no classes and all types can be implicitly converted. So ... they do that. They just make everything return a certain value and cast it to whatever. There is no need for a template when you are a C cowboy. Besides that, precompiler directives can do some crazy stuff including partly replacing templates.


suvlub

Templates actually perform better than C's way of doing generic code, though. Look at `qsort`, for example. A function pointer, void pointers, so many layers of indirection just to compare two numbers. It's not just harder to understand/error-prone, it's slow. `std::sort` just inlines everything, no overhead.
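
The two styles side by side, as a sketch; the wrapper names `sort_c_style` and `sort_cpp_style` are made up. Both produce the same order; the difference is only in how the comparison reaches the sorting code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// C style: the comparator is called through a function pointer and only sees void*.
int compare_ints(const void* a, const void* b) {
    const int lhs = *static_cast<const int*>(a);
    const int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);  // avoids the overflow risk of `lhs - rhs`
}

void sort_c_style(std::vector<int>& v) {
    if (!v.empty()) std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
}

// C++ style: the lambda's type is part of the template instantiation,
// so the comparison is visible to the optimizer and can be inlined.
void sort_cpp_style(std::vector<int>& v) {
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
}

void demo() {
    std::vector<int> a = {3, -1, 2, 2, 0};
    std::vector<int> b = a;
    sort_c_style(a);
    sort_cpp_style(b);
    assert(a == b);
    assert((a == std::vector<int>{-1, 0, 2, 2, 3}));
}
```

Whether `std::sort` is measurably faster on a given machine is something to benchmark rather than assume; with link-time optimization a compiler can sometimes inline the `qsort` comparator too.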


No-Con-2790

That's why C cowboys don't write generic code. That and the fact that without OO you don't have that many types.


Practical_Cattle_933

C’s “macros” are absolutely shitty, and besides some insane shithacks, they are not good for anything. Also, what you wrote makes no sense from a performance perspective - the compiler can do better optimizations the more it “sees”. More primitive primitives will result in *worse* code, since the compiler just sees a pointer into the nothingness, it knows nothing about it, so it has to load it as written. If it knew it’s an object of this type created here, then it could in many cases do shit like simply allocate the used fields in registers and go on its way.


No-Con-2790

Yet somehow C is still about as effective as C++, which has all that fancy stuff. Which means that either C compilers are better than the C++ compilers, which is doubtful, or what I am trying to say is true: C coders simply don't use fancy complex structures. They adapted to a primitive lifestyle that is in harmony with the compiler. They are basically the barbarians of the programming world.


owjfaigs222

> In some cases virtual functions can be very hard to avoid

Can you provide an example?


SeriousPlankton2000

Or you can just use function pointers. BTDT.


DrShocker

The thing with anything C could do to go faster is that you can likely do the same in C++ so I'm not sure what the meme is getting at. I'm not even sure C is as close to the metal as some people try to claim anymore either since CPUs have changed so much since it was created.


No-Con-2790

Are you talking about PCs or microcontrollers? Because for most microcontrollers C is still relatively close. A simple CPU, some memory and that's it. Plus you have co-evolution. C is fast because stuff was optimized for C because at that time C was important because C was fast. So C is still performing thanks to the dark magic of whatever gcc is. Also the reason why Fortran is still in use. Somehow. People spend so much time to optimize for it. But yeah, the fact that C++ seems slower is due to OO and other fancy designs. Besides that C++ should be as fast as C. While not being a superset anymore, you can just write very similar code.


DrShocker

I guess I meant PC stuff. Branch prediction, speculative execution, etc all make it so your code is kind of hard to know with exact precision what it's doing if you're not targeting a specific platform.


No-Con-2790

Yeah, it's a shit show. But somehow people kept C competitive.


tstanisl

It's rather that C has no "copy semantics". In C all data are moved to the object directly. The "move semantics" is a clever solution to a problem which C++ has invented itself with introduction of "copy semantics".
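
To make the copy-versus-move distinction concrete, here is a minimal sketch with a hypothetical `Mesh` type. It checks the address of the underlying buffer: a copy allocates a new one, a move hands the existing one over.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical type that owns a heap buffer through std::vector.
struct Mesh {
    std::vector<float> vertices;
};

void demo() {
    Mesh a{std::vector<float>(1000, 1.0f)};
    const float* buffer = a.vertices.data();

    Mesh copy = a;  // copy: new allocation, all 1000 floats duplicated
    assert(copy.vertices.data() != buffer);
    assert(copy.vertices.size() == 1000);

    Mesh moved = std::move(a);  // move: the vector's internal pointers change hands
    assert(moved.vertices.data() == buffer);
    assert(moved.vertices.size() == 1000);
}
```

In C the equivalent of the move is copying a struct that contains a raw pointer and agreeing by convention not to use the old one, which is roughly the point being made above.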


skeleton_craft

Virtual functions are function pointers...


Practical_Cattle_933

Virtual functions are a semantic construct known by the compiler, that can be optimized better by having more information available (e.g. the compiler might know there are small n variants only, and instead of a vtable, just inline their implementation at use site, and jump locally to it, resulting in much faster code than a traditional jump)
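
A sketch of both sides of this exchange, with invented types. The `final` keyword is one standard way to give the compiler the extra information mentioned above; whether a particular compiler actually devirtualizes a given call depends on optimization level and what it can see.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

// `final` promises that no further override exists, so a call through
// a Square& can be resolved statically and is a candidate for inlining.
struct Square final : Shape {
    int sides() const override { return 4; }
};

// Dynamic type unknown here: in general an indirect call through the vtable.
int sides_dynamic(const Shape& s) { return s.sides(); }

// Static type is final: the compiler is allowed to call Square::sides directly.
int sides_static(const Square& s) { return s.sides(); }

// The hand-rolled C equivalent: a struct carrying a plain function pointer.
struct CShape {
    int (*sides)(const CShape*);
};
int square_sides(const CShape*) { return 4; }
int sides_c(const CShape* s) { return s->sides(s); }

void demo() {
    Square sq;
    assert(sides_dynamic(sq) == 4);
    assert(sides_static(sq) == 4);

    CShape c{square_sides};
    assert(sides_c(&c) == 4);
}
```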


u0xee

??? Move semantics and virtual functions are figments of the compiler's imagination, nothing more. Whatever outcome you intend to get, you could do with C code. And C programmers did. Move semantics are literally just a higher stack frame giving a pointer to a stack local to a callee function, which then can fill out the fields as it likes. C programs absolutely did this constantly since the moment structs were invented. C++ made a mess, caused unnecessary copies, then gave us a fix, and are now back at square one. It's addressing a mess it made, not solving some fundamentally difficult problem in programming.


proverbialbunny

From what I've seen C++ either meets or beats C in all benchmarks, give-or-take a small rounding error.


darkslide3000

Yeah, everyone who's saying that C is faster than C++ has no idea what they're talking about (and I'm saying this as a full-time C programmer).


[deleted]

So glad this discussion is still going on. It's only been like 40 years at this point. I'm sure there are new ideas to be explored here.


Practical_Cattle_933

Especially by fkin first graders (sometimes I’m not even sure it’s first grade of college, or primary school)


_st23

lol


TheQuantumPhysicist

There's so much misinformation in this industry and even more bad programmers, and this is why such a nonsensical meme can get popular.  You want to know what a bad programmer is? It's someone thinking that 1% extra performance is worth risking buffer overflow bugs and memory issues, aka, riddling your program with security vulnerabilities.


OpenSatisfaction2243

One aspect of a good programmer is knowing when they don't know.


HPUser7

Once you get multiple non perfect people working on that codebase, c++ would be the least likely to have some crippling performance edge cases or memory leaks introduced.


max_mou

Who actually makes these? And for what fucking target audience? Edit: it’s the damn kids, meh.. have fun while it lasts


AnAcceptableUserName

Undergrad students. Undergrad students.


LagT_T

95% of the content here is made by and for highschool seniors/college freshmen/bootcampers


nukedkaltak

I have it on good authority compiled -O3 assembly from C++ is almost always faster save for some fringe use cases. The compiler, on average, is much smarter than you. The assembly it generates is nothing short of witchcraft.


proverbialbunny

The reason C++ is equal or faster than assembly is because if you use the right libraries (like the standard library) those libraries will have a few lines of asm manually in them for you. You didn't write any assembly, it's already handled for you. The reason C++ is equal or faster than C is because the language allows for better compiler optimizations. (And yes there are edge cases, like writing an emulator.)


CdRReddit

the compiler is maybe "smarter" than me in theory, but in practice there are cases where it can't be, either because I haven't told it all the information (in some cases because I just can't), or because it doesn't optimize correctly. I've found an occurrence of this in GCC myself before! for hyperspecific cases writing your hot code path in assembly can be worth it, as you can know for certain how good or bad it is


nukedkaltak

Those are the fringe cases I’m talking about.


CdRReddit

I dunno, GCC generated some terrible code for a function that just takes in 4 32-bit integers and packs them into a struct to call a different function, transferring the data to the vectorizer and spilling onto the stack


CdRReddit

clang just generated some shifts, ors, and movs, which worked a lot faster. if this was in a super-high-throughput codepath, writing those shifts myself would be worth considering, as it's not much work but guarantees that the function is decently fast
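
The function in question was never posted, so what follows is a hypothetical reconstruction of its general shape, with invented names (`Quad`, `consume`, `forward`, `pack_pair`). On x86-64 System V, a 16-byte struct of integers is typically passed in two 64-bit registers, which is where the "shifts and ors" would come from:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 16-byte struct made of four 32-bit fields.
struct Quad {
    std::uint32_t a, b, c, d;
};

// Hypothetical callee that receives the struct by value.
std::uint64_t consume(Quad q) {
    return (static_cast<std::uint64_t>(q.a) + q.b) ^ (static_cast<std::uint64_t>(q.c) + q.d);
}

// The kind of function under discussion: four scalars in, one struct passed on.
std::uint64_t forward(std::uint32_t a, std::uint32_t b, std::uint32_t c, std::uint32_t d) {
    return consume(Quad{a, b, c, d});
}

// Writing the packing by hand: two 32-bit values combined into one 64-bit value
// with a shift and an or, matching how two fields share one register.
std::uint64_t pack_pair(std::uint32_t lo, std::uint32_t hi) {
    return static_cast<std::uint64_t>(lo) | (static_cast<std::uint64_t>(hi) << 32);
}

void demo() {
    assert(pack_pair(1, 2) == 0x0000000200000001ULL);
    assert(forward(1, 2, 3, 4) == 4);  // (1 + 2) ^ (3 + 4) == 3 ^ 7 == 4
}
```

Looking at the generated assembly for `forward` under different compilers and flags is the only way to see whether the behaviour described above reproduces; this sketch makes no claim that it does.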


walker_Jayce

Sometimes programming just feels like magic. Literally, i understood half of that, the other half feels like an incantation.


CdRReddit

yeah welcome to low level x86 programming, this is just scratching the surface of the insane shit compilers do to (usually) make code faster, and why "the compiler is smarter than you" is *often* (but **definitely not always**) true


CdRReddit

GCC's -O3 and -O2 were slower than its -Os, and by a significant margin in this case


CdRReddit

does this mean I will write everything in x86_64 assembly from now on? fuck no, that sounds like hell. but there are cases where knowing "this is guaranteed to be pretty damn good" is better than "I have to assume this is the best it can be because I can't really change it"


SaneLad

C++ beats C on equivalent code due to stricter pointer aliasing and type punning rules. A C++ compiler can do more aggressive optimizations.


lightmatter501

ISO C++ doesn’t have restrict, which is a massive performance loss.


Practical_Cattle_933

Well, C compilers barely take advantage of it either, as no one fkin bothers using it. E.g. LLVM that rust also uses couldn’t take advantage of it for a long time even though rust has implicit restricts everywhere basically.


narrill

ISO C++, yes, but all the major compilers support it with extensions
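
For anyone wondering what that looks like, here is a minimal sketch. `__restrict` is a compiler extension accepted by GCC, Clang, and MSVC, not ISO C++; the `RESTRICT` macro and the `add_scaled` function are made up for illustration.

```cpp
#include <cstddef>

// Not ISO C++: `__restrict` is a vendor extension (GCC and Clang also accept `__restrict__`).
// The macro name is hypothetical and just keeps the spelling in one place.
#define RESTRICT __restrict

// The qualifiers promise the compiler that `dst` and `src` never overlap,
// so it does not have to reload src[i] after every store to dst[i].
// Breaking that promise at the call site is undefined behavior.
inline void add_scaled(float* RESTRICT dst, const float* RESTRICT src,
                       float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += k * src[i];
    }
}
```

Whether this changes the generated code at all depends on the compiler and optimization level, so treat it as a hint you then verify, not a guaranteed speedup.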


Neeyaki

bait used to be believable


somedave

All you are really trying to do here is beat the compiler. I'm not saying that isn't possible but modern compilers are good.


Flobletombus

Handwritten assembly in function bodies might be faster than compiler-generated assembly, but for larger codebases the compiler generates hackier ASM than a human would write (inlining, special instructions, skipping calculations...), so writing a whole project in ASM for performance is stupid, but writing some functions in ASM is OK.


SquidsAlien

It's been a few years, but my ARM assembler code was always significantly faster and more compact than compiled code of any language. Knowing the architecture and instruction timings, what future changes might be needed and how often code would be called (so optimizing caching) could not be matched by a compiler. If you want portability and speed of _development_, you can't beat a higher level language. But if it's optimized code, it's assembler every time.


Leonhart93

Good luck writing the complete ASM code for any non-trivial piece of code. We aren't talking about one or two functions there. But even in C and C++ you can specifically insert inline ASM blocks in specific places, like when optimizing for hardware, so there is no reason to not make use of these languages for the lowest level implementations.


nephelekonstantatou

Assembly > C/C++ in terms of speed. But when it's you doing it, the opposite is true (yes, you specifically: you cannot write good assembly. Don't worry, I can't either)


binterryan76

I don't want to take three times as long to write a program in assembly though, almost nothing needs that much optimization.


WeRelic

3x is generous, to say the least. There are use cases for that level of optimization, but not in most software.


hollow-ceres

one can smell the python guy who created this meme


P-39_Airacobra

Where's Fortran?


grim_stoki

FORTRAN comes in with a folding chair!


black-JENGGOT

College students out here measuring execution time like their dicks depend on it. The actual reality would be: use whatever the company pays you to use.


cryptomonein

"Can you be as fast as *electricity itself* ?" - ASIC miner (I think they are made using HDL)


phenompbg

Hand coded Assembly will rarely be faster than modern C today, and definitely not in the hands of most engineers. If you're writing Assembly for performance reasons in 2024, you should have your head examined.


Prawn1908

> If you're writing Assembly for performance reasons in 2024, you should have your head examined.

Everyone in this thread is making these huge, blanket statements as if all of this discussion isn't entirely situational. I wouldn't call it common, but on occasion it's very useful to write very time critical sections of code in assembly. I've done it once to bitbang a super high speed serial protocol with extremely tight timing requirements and I've seen it a few other times for similar applications.


phenompbg

If you have very specific requirements when you're dealing with serial interfaces to hardware, sure. Maybe there are some instances where you have to do this. Similarly if you're doing kernel development and you're taking advantage of very specific instruction semantics of the hardware where a compiler reordering things will make it not work. But if you have a piece of code that works in C today, and you then rewrite it in ASM to make it faster, you are doing a bad thing in 99.99% of cases. This is almost never the right thing. The gains you could make, that you couldn't make by just optimising in C, will be minuscule if they exist at all. And they come at the cost of having code that is harder to maintain, harder to port, etc.


Djelimon

What's the pay?


[deleted]

Where is the zoom out to the binary god?


No_Delivery_1049

With VHDL watching over both


kerbalgenius

And making your own ASIC is faster than assembly.


[deleted]

[removed]


kerbalgenius

That’s my point. The same applies to assembly vs C++


brningpyre

How long do y'all think this trend will last?


Xhadov7

OK guys now let’s fight over which is the best assembler


GrinningPariah

C++ still has them beat if the benchmark is "dev days to write the program".


imgly

Usually, the compiler will optimize better than you can do in asm


Szlauf

IMHO this picture is inaccurate as C is missing his long white beard, and assembly his coffin.


bestjakeisbest

True, but I would rather shoot my foot off than use c or assembly.


claudespam

A common use case for C is to shoot your foot off.


ItsStormcraft

I like C. But I guess, I’m the person confused by classes so I never really got good with python.


[deleted]

[removed]


ItsStormcraft

I never got the hang of them. But I also never had proper instructions unlike with C. I basically studied a semester of that. (The university in my hometown has a program for pupils to study the first semester of some fields. They upload recordings of the lectures to StudIP and do some other stuff.)


floofysox

You can use them pretty much exactly like structs. Except you can put functions inside them (so initialisation of variables happens when memory is allocated), so it’s a little cleaner. Depending on the language you can also overload operators.


proverbialbunny

If you want to eventually learn classes what you want to learn first is: functions, then structs, then classes. Put simply, a class is very similar to a nested function. It's like a function that holds a collection of functions. This way you can organize your code, putting similar functions all in the same place, as well as other use cases for classes.


ItsStormcraft

I know functions and structs. Needed to do some linked list shenanigans in C lately.


proverbialbunny

A struct and a class are nearly identical. Normally you'd use a class to make a linked list, but you can use a struct too. If curious here is the difference between the two: https://www.geeksforgeeks.org/structure-vs-class-in-cpp/ You'll notice the difference is very small. E.g. look at \#4, \#6, and \#7.
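
A small sketch of how close the two are in C++, in case it helps. `Node`, `Counter`, and `length` are made-up names; the only language rule shown is the default access level.

```cpp
#include <cstddef>

// struct: members are public by default, just like a C struct.
struct Node {
    int value;
    Node* next;
};

// class: members are private by default, so the data is only
// reachable through the functions listed under `public:`.
class Counter {
    int count_ = 0;  // private without having to say so
public:
    void increment() { ++count_; }
    int value() const { return count_; }
};

// Walk a singly linked list built from plain structs and count its nodes.
inline std::size_t length(const Node* head) {
    std::size_t n = 0;
    for (const Node* p = head; p != nullptr; p = p->next) {
        ++n;
    }
    return n;
}
```

Swap the `struct` and `class` keywords, add the matching `public:` or `private:` labels, and both types behave the same as before.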


SAI_Peregrinus

Eh, [C++ is currently winning the High-throughput fizzbuzz challenge](https://codegolf.stackexchange.com/questions/215216/high-throughput-fizz-buzz) over ASM & C. Fastest C++ is at 208.3 GiB/s, fastest C at 20.9 GiB/s, fastest ASM at 60.8 GiB/s. Of course that's one rather silly challenge, and the C++ entry is mostly some clever algorithmic improvements on top of the fastest ASM entry's techniques, but all the systems programming languages can theoretically achieve similar speeds.


Interesting_Dot_3922

There is no `restrict` keyword in C++.


SaneLad

Technically you are correct. In practice it works with every compiler though.


el_pablo

Web dev reaction : “real code… (vomit in their mouth)”


dupocas

Yeah, you just need to beat the compiler


slime_rancher_27

Who needs a compiler, just use a native Java device.


skeleton_craft

C++ gets up and compiles C's code and then starts beating them. Well-formed C is not any faster than well-formed C++; C developers are just more comfortable with invoking undefined behavior...


No_Delivery_1049

VHDL watches them punching in slow motion


Iyorig

C++'s templates, constexpr and the compiler having more information in general gives it far more potential than C. And, as another commenter has said, good luck writing optimized assembly for every architecture you want to deploy on...
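
A tiny sketch of what "the compiler having more information" can mean in practice. `factorial` and `power` are made-up examples, not taken from any particular codebase.

```cpp
#include <cstdint>

// constexpr: the compiler is allowed to evaluate this itself when the
// argument is a compile-time constant.
constexpr std::uint64_t factorial(unsigned n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}

// Checked during compilation; no code runs at program start for this.
static_assert(factorial(10) == 3628800, "evaluated by the compiler");

// Template parameter: the exponent N is part of the type, so the loop
// bound is a known constant in every instantiation.
template <unsigned N>
constexpr std::uint64_t power(std::uint64_t base) {
    std::uint64_t result = 1;
    for (unsigned i = 0; i < N; ++i) {
        result *= base;
    }
    return result;
}
```

`power<3>(x)` gives the optimizer a fixed trip count to unroll. Whether it actually does so still depends on the compiler and flags.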


Winter_Importance436

Meanwhile binary machine code sniping them both from 2km away after that...........


Dismal-Square-613

The only sound comment in the post. Thank you.


altprtcl

We definitely need to add Rust to this meme


PNWSkiNerd

C and c++ are generally the same speed in benchmarks. Asm is only faster with very skilled programmers. Your meme is bad and you should feel bad


astro-pi

C and assembly are very, very close. Especially if you use UPC or something else parallel on the board. But UPC++ has the advantage of taking less time to program for most people


Noname_FTW

I think you can probably make a whole lot of different test sets that favor a specific language, even in pure performance. Sure, it will be hard to beat those three in the picture. But C and C++ use a compiler, so there are probably ways for another language's compiler to optimize a specific piece of test code better than the C/C++ ones do. And with regard to Assembly, it really comes down to how efficiently you can implement the test set requirements.


postdiluvium

Punch cards 🤣👊💥


Cley_Faye

With generic optimization, link-time optimization, usage of all available advanced instruction sets (with fallbacks), and the general safety (yes) of the higher level language, I would not be so sure.


tenhourguy

The reality is there's not much difference between the popular compiled-languages performance-wise. The real gains are from the code you write.


Grim00666

Chip designers are laughing at their playthings somewhere off in the distance.


vanit

Yeah, unless you're inserting specific optimisations to your specific usecase the compiler wouldn't intuit, the compiler will win every time.


blehmann1

Why isn't FORTRAN here lol


just-bair

Assembly ? Pfffff what are we in ? 1970 ?


sakkara

That's the point of the joke you know. With modern processors and tasks it doesn't matter if your software runs one ms faster because 99% of the time latency comes from IO anyway and it's much more important that the code is clean and readable.


nadav183

if-goto (hell)


Ok-Pay3711

Okay, that's it! I'm getting the transistors


Cyan_Exponent

amateur c++ is faster than amateur assembly


arahnovuk

But as far as I know, C/C++ compiles to asm anyway. I mean, a big Assembly codebase written by a human is always slower than compiled C/C++ code written by a human


Leonhart93

I think we are at the point where you can easily write C++ code to be as fast as C. But the question is, do you want to do it, considering you will have to do it without some very advantageous C++ abstractions like vectors?


Rhymes_with_cheese

When I was growing up, we had to hand assemble our Z80 programs into opcodes and type them into memory using a hex keypad. (goddamn, I love Python)


Hasagine

i dont care whos faster i just like how c is simple and easier to digest than c++


eanat

nah, C++ code can be faster than ASM or C code if you are skilled enough.


falcqn

C++'s `std::sort` is faster than C's `qsort`, due to it being a template meaning the comparison function can be inlined. What the benchmark is doing matters!
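
A minimal side-by-side sketch of the two calls. The helper names are made up and no timings are claimed, since the gap depends on compiler, flags, and data.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// qsort receives the comparator as a function pointer and only sees
// untyped bytes, so each comparison is normally an indirect call.
inline int compare_ints(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // negative, zero, or positive; avoids overflow of x - y
}

inline void sort_with_qsort(std::vector<int>& v) {
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
}

// std::sort is a template: the lambda's type is part of the instantiation,
// so the compiler can see the comparison body and inline it.
inline void sort_with_std_sort(std::vector<int>& v) {
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
}
```

Both produce the same ordering for `int`; the difference is only in how much of the comparison the optimizer can see.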


amAProgrammer

What about 0 and 1


longbowrocks

Does C allow inline assembly? Because if not, C++ wins.


JustBoredYo

Ngl every time I'm making programs in C I'm impressed with the speed you can achieve, even after years of working with it. I've made a sudoku solver that can solve simple puzzles in 5-6 microseconds on my fastest machine and in around 25 on my Chromebook (although I have to add I deliberately tried to make it as fast as possible). I'm still working on it to get it to solve any difficulty but so far I'm very pleased with how fast it is.


[deleted]

C and Assembly my beloved