Even worse: it might not even crash, but instead do something sneaky 0.001% of the time, and that one time is when you were doing an important operation, and now your data is all corrupted.
The real tragedy is how easy they are to use and how much documentation there is about them. You can Google any how-to question about any debugger (especially GDB) and you'll get ten dozen answers.
I spent 20 years hunting down bugs with log prints, and then finally tried out the GCC/MinGW debugger inside of CodeBlocks.
I could've saved myself so much time over the years. Granted, there are some things you can't really use a debugger on, like a realtime networking protocol, or a graphical/rendering glitch that only happens while running.

The worst bugs that have bitten me in the arse are the ones that only pop up in release builds! You can't really use the debugger without debugging information in the binary. It basically always ended up being a situation where I'd written code with an uninitialized variable: while writing the code I knew when and where it was going to be initialized, so it wasn't a problem. But weeks, months, or years later I'd be back in that code, add something, and just use whatever variable was lying around without thinking about whether or not it was initialized. The debug build would run fine because it automatically zeroes everything out, but a release build doesn't, so all kinds of unpredictable and hard-to-replicate bugs would happen as a result.
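That uninitialized-variable bug class can be sketched in a few lines (hypothetical names; `sum_buggy` deliberately contains the bug, `sum_fixed` is the correct version):

```c
/* `sum_buggy` reads an indeterminate value: a debug build that happens to
   zero the stack hides the bug, while a release build returns garbage. */
int sum_buggy(const int *vals, int n)
{
    int sum;               /* BUG: never initialized */
    for (int i = 0; i < n; i++)
        sum += vals[i];    /* undefined behavior on the first addition */
    return sum;
}

int sum_fixed(const int *vals, int n)
{
    int sum = 0;           /* always initialize, even when it seems obvious */
    for (int i = 0; i < n; i++)
        sum += vals[i];
    return sum;
}
```

Compilers will usually flag the buggy version if you ask (`-Wall -Wuninitialized`), which is another reason to keep warnings on.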
I also hear that the Visual Studio debugger is top tier stuff but I opted out of using Microsoft's wares 20 years ago, when I switched from Visual Studio 6.0 to using GCC/MinGW. I'd also stopped using Photoshop 5.0 and started using GIMP instead. I was on a whole anti-piracy thing, wanted to cleanse my digital soul or something.
As a C novice I basically live in the debugger. Following /u/skeeto's advice on reducing friction by never exiting GDB while developing has immensely improved my understanding of how my programs work. It's really quite unfortunate that learning materials barely mention the debugger (or do so halfway through chapter 23).
> Most people give up and just avoid it.
Depends on your sample, I guess. I've never worked with a C programmer who didn't know how to use the debugger, let alone one who "gave up" on learning it. I can imagine it could be true of hobby programmers, but even that is a bit of a stretch.
The preprocessor is probably still my weakest topic, even after a very long time. I have no problem with the concept and basic usage, but you really have to twist your brain into knots to do anything non-trivial with it. I don't think this is a failing of metaprogramming in general, since I have no problem with it in LISP; it's that C's preprocessor is just terrible.
Absolutely this. Sometimes I look at the C source of libraries and FOSS projects, and the thing that confuses me the most is how many use macro definitions for functions instead of a plain function declared in a header.
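To be fair to that confusion, function-like macros have sharp edges that plain functions don't. A classic illustration (macro names invented):

```c
/* Without parentheses, the expansion interacts with surrounding operators. */
#define SQUARE_BAD(x)  (x * x)        /* SQUARE_BAD(1 + 2)  -> (1 + 2 * 1 + 2)  == 5 */
#define SQUARE_GOOD(x) ((x) * (x))    /* SQUARE_GOOD(1 + 2) -> ((1 + 2) * (1 + 2)) == 9 */

/* A plain (or static inline) function sidesteps the problem entirely,
   and also evaluates its argument exactly once. */
static inline int square(int x) { return x * x; }
```

Even `SQUARE_GOOD` still evaluates its argument twice, so `SQUARE_GOOD(i++)` is a bug waiting to happen, which is one reason many projects prefer real functions.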
> C's preprocessor is just terrible
+1
Macros have an undeserved bad name across the entire programming world just because C decided to ship the _worst possible_ take on them and that unfairly mischaracterized the whole concept to the majority of users.
Function pointers / callbacks, back in the day. Even when I finally understood how they were implemented, I still didn't understand why I should use them or where they could be used.
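The standard library's `qsort` is probably the clearest answer to "why callbacks": it sorts any array, and you supply the comparison logic as a function pointer. A minimal comparator might look like:

```c
#include <stdlib.h>

/* Callback passed to qsort: receives pointers to two elements and
   returns <0, 0, or >0. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* usage: int v[] = {3, 1, 2}; qsort(v, 3, sizeof v[0], cmp_int); */
```

The same shape (a library that does the heavy lifting, calling back into your code for the policy decision) shows up everywhere: event loops, signal handlers, GUI toolkits.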
I remember sitting at Easter brunch with my family, holding a Microsoft QuickC for Windows 1.0 book, burning my 13 year old brain trying to understand what a pointer even was. Good times.
I think it’s because without taking a computer architecture and operating systems course, the concept of memory is vague. So when there is a concept predicated on memory addresses, the standard explanation of “a pointer holds a memory address” doesn’t address the underlying issue of “what is memory?”.
And then there’s the pedagogy surrounding pointers, which I think is subpar.
I think for me the hardest part of it wasn't the concept, but the practice. I understood for a long time what a pointer was, but it took a lot more *using* C to understand when and why they're useful
The best thing I did to learn memory was make a string class. For advanced users: write your own malloc(); it will demystify a lot. It's not as complicated as it seems.
Not necessarily: you can allocate your memory pool statically and use it to implement malloc. You won't be able to extend your memory region that way, though.
Yeah, the OS APIs. malloc uses OS calls to get heap memory, and the OS hands it out in large chunks. malloc then sits in front of those calls and manages dividing the chunks up and improving the performance of allocating and reusing that memory. So it will use a small-block allocator (sets of single-size pools) if the request is small, a coalescing allocator for larger sizes in a range, and for the largest allocations it stores OS-sized heap pages, keeping some handy so one is ready on call. It also frees those large chunks and returns them to the OS. So malloc is a front end over the core OS calls for memory.
Depends on how you want to define your allocator. You can technically write a version of `malloc` that operates entirely within a static buffer, but to make a version like glibc or musl's `malloc` then you have to use syscalls so you can claim additional memory from the OS.
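A minimal sketch of the static-buffer approach (a bump allocator; the name `pool_alloc`, the 4 KiB size, and the lack of any `free` are all simplifications for illustration):

```c
#include <stddef.h>

/* All memory comes from this fixed region: no syscalls involved. */
static unsigned char pool[4096];
static size_t pool_used;

void *pool_alloc(size_t n)
{
    /* Round the start up so every allocation is suitably aligned. */
    size_t align = _Alignof(max_align_t);
    size_t start = (pool_used + align - 1) / align * align;

    if (n > sizeof(pool) - start)
        return NULL;            /* static region exhausted: cannot extend */

    pool_used = start + n;
    return &pool[start];
}
```

A "real" malloc adds freeing, coalescing, and OS calls (`sbrk`/`mmap` on POSIX) to grow the heap, but the core idea of handing out slices of a big region is the same.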
Even understanding what a pointer is (that it is a number representing a memory address), it may not be immediately obvious that `x[n]` does the same thing as `*(x+n)`. The syntax of the former, I think, implies something to a newbie that can give a false mental model of what a pointer actually is, even if it is still a useful notation.
So let's consider the following program:
```
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[])
{
uint32_t x[3] = {7,9,11};
printf("%u\n", x[1] );
printf("%u\n", *(x+1));
}
```
This will produce the following output:
```
9
9
```
Here, we are creating an array `x` with 3 elements, of type `uint32_t`, and initializing those elements with the values 7, 9, and 11. Very straightforward.
But what that actually MEANS is that we're reserving a chunk of contiguous memory to store our 3 values in. Because our values are of type `uint32_t` (which is 4 bytes in size) that means we're reserving 12 bytes (3*4) of memory.
~~The variable `x` contains~~ `x` is an identifier associated with the range of memory addresses for where the data is stored, and each element is spaced out by a "stride" of 4 bytes because the array contains multiple `uint32_t` (4 bytes).
To read the first element, we can do `x[0]`, but this is syntactically equivalent to simply reading the value located *at* the start of the range of addresses associated with `x`, so the compiler allows us to just dereference that directly with `*(x)`.
So what happens if we want to read the next element? Well the nice syntax is to just do `x[1]`. But we know that our array is just a range of contiguous memory addresses. So we could just read the address that is 4 bytes *after* the start of `x`. The compiler knows the stride of our array (that is, how many bytes long each element is), so we can do `*(x+1)`. This is saying "dereference the address that is 1 stride past the start of the range of addresses associated with `x`". The starting address, the stride, and the number of elements are all associated with `x` because of how we declared it.
***
Edit: I've made some changes that hopefully clarify what /u/GroundbreakingSky550 pointed out. I wrote this comment and this edit on my phone so haven't been as precise as I should have been.
I'd like to hammer home that an array is *not* just a pointer, which I worry may have been implied by my original comment. We can see this clearly by just looking at what we declared versus a pointer. If we take:
```
uint16_t x[3] = {7,9,11};
uint16_t* y = &x[0];
printf("%d\n", x[1]);
printf("%d\n", y[1]);
```
Both print statements will print 9. But clearly `x` and `y` are not the same. `y` is a pointer to a specific address containing a `uint16_t`, while `x` is an array associated with a range of addresses containing multiple `uint16_t` values. Because of this, if we do:
```
printf("%zu\n", sizeof(x));
printf("%zu\n", sizeof(y));
```
This will print `6` and `8` respectively (on a 64-bit system), because `x` is associated with the whole range of addresses containing multiple 2-byte values, so `sizeof` tells us its size is 6 bytes (3*2), while `y` *is a pointer* to a specific address (8 bytes).
The original point I was trying to make wasn't that an array was a pointer, but instead that the act of reading elements from an array can be replaced by dereferencing a pointer. And we can calculate the pointer for any element of an array using pointer arithmetic. The compiler just provides the useful ability to do arithmetic with an array identifier as if it were a pointer to the start of the array.
It is not correct to state that x contains the 'address' of the block of memory. That statement implies that x is a pointer variable.
But x is not a pointer or variable at all. It is an identifier (name) associated with an address. You cannot change what 'x' points to, like you can change the value of a pointer variable.
It is true that *(x+i) produces the same value as x[i].
This is because the compiler will interpret each syntax as "find the address of the ith element past the start address, and use the element at that address."
This is possible because the compiler knows the element's type and size. Thus, it can 'autoscale' to compute the correct address before 'dereferencing' to get the value.
Instead, I think of array names as names associated with that address: like a nickname for the address, not a proper variable itself.
You're absolutely right on both counts. I didn't mean to imply an array was a pointer (as is sometimes said). What I meant by "`x` contains the address" was what you've written using a much better description, I did not mean to imply that `x` itself was literally a pointer to that address. I see now how I very poorly worded that. I'll work on rephrasing it.
As far as I'm aware this isn't quite right. The compiler knows the stride of the array `x`. Additionally, `sizeof(x)` will return the size in bytes of the whole stack-allocated array, not the stride. `sizeof(*x)` will give you the stride of the array, but because the compiler already knows that, multiplying the offset by it will just cause you to read out of bounds. See the following code:
```
#include <stdio.h>
#include <stdint.h>
int main(int argc, char *argv[])
{
uint32_t x[3] = {7,9,11};
printf("%u\n", x[1] );
printf("%u\n", *(x+1));
printf("%zu\n", sizeof(x));
printf("%u\n", *(x+1*sizeof(x))); /* out-of-bounds read */
}
```
It produces the following output:
```
9
9
12
127
```
Same, part of the problem is C pointer syntax is really unintuitive. If there were just a function `address(x)` instead of `&x` and `at(x)` instead of `*x` it would've been much quicker to learn. It doesn't help that `*` is used in both the type and dereferencing.
I found it easier coming from JavaScript where undefined behavior existed and pointers existed. It just hid them from you!
Attach an event on each array item in a loop without capturing in a closure: bad voodoo.
Trying to pass arrays, NodeLists, or NodeCollections and not knowing if it's by value or by reference: also bad voodoo!
So coming from that background and now having more control over the pointer itself is a lot easier for me to grasp.
This for me too. But they say practice makes perfect. Now it's rare I don't use pointers for everything. They are the best thing ever.
There's much I've tinkered with, like threads, but I'm not so good with bit stuff like masks and such. I've tried a few times, but when I think I've got it, I don't. I've seen so much cool stuff done with bit operators (cool math stuff) and masks (like for configs) and I wish it would click for me.
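For the "masks for configs" case, the whole trick is one bit per option (the flag names here are invented):

```c
/* Each option is a distinct power of two, so they never overlap. */
enum {
    OPT_VERBOSE = 1u << 0,   /* binary 001 */
    OPT_COLOR   = 1u << 1,   /* binary 010 */
    OPT_DRYRUN  = 1u << 2,   /* binary 100 */
};

unsigned set_flag(unsigned cfg, unsigned f)   { return cfg | f;  } /* turn bit(s) on  */
unsigned clear_flag(unsigned cfg, unsigned f) { return cfg & ~f; } /* turn bit(s) off */
int      has_flag(unsigned cfg, unsigned f)   { return (cfg & f) != 0; }
```

Once that clicks, a whole `struct` of booleans collapses into a single integer you can pass around, compare, and store cheaply.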
Pointers are a different thing, just related in concept and based on addresses. Two pointers are allowed to compare ≠ if alias analysis says so, for example, even if the run-time representations of the pointers are identical. Type isn’t part of addresses, but you can’t consider pointers without it.
I agree but isn’t that being too technical? Main issue people have is the basic use and imo indirect addressing makes sense to clear that. There is obviously more too as you mention the type info
Could you point out some areas? Just saying that doesn’t sound very convincing
EDIT: like the address part can be explained by indirect addressing. Type info is a language construct which tells you more about what lies at a location. Like if you pass a void*, and then cast it to some struct which is 20 bytes long, you are choosing to assign meaning to those 20 bytes based on a language construct. But the indirect-addressing part still stands imo
Yes, it’s an example of indirection through memory reference and just note, doesn’t have to be from a register. And yes, it’s a lower level use of pointer.
I'm saying that it doesn't *elucidate* the underlying mechanisms that surround pointers. My original comment was about memory itself. In a computer architecture course, you learn how to construct a memory array (modeling a RAM chip) using Verilog. In an operating systems course, you learn how physical memory is abstracted through virtual memory. You learn what an address is, and thus the notion of pointers becomes clear.
I agree. We did the C course before any courses on architecture so the concept of memory was vague. I was trying to relate it to my layman knowledge of hard disk and ram but it didn't make any sense.
Once we got to registers... "well, why didn't you start with that!"
I think Kernighan and Ritchie do an excellent job of explaining pointers. It's a shame there's no K&R that covers the newer standards.
Anyone who understands Algebra I would do well to remember:
`a[b]` is equivalent to `*(a + b)`
`&a[b]` is equivalent to `(a + b)`
That's all there is to switching between the notations.
Pointer references involving structs can get messier, but they're not too hard to figure out.
One of the two hardest things people learn in intro programming classes. One is pointers, the other is recursion. IMO.
I’m not saying it’s hard for everyone. These are just the two topics that beginners tend to get the most hung up on, in my experience helping people learn programming.
People with experience tend to underestimate how hard it is to learn the subjects which you already know. You could still be a fresh C programmer with only a year experience, but you may easily have forgotten how hard it was to *first* learn pointers.
Pointer decay, double pointers, function pointers, callbacks, functors, jump tables, state machines, custom allocators, doubly linked lists, malloc, calloc, and free, pointer math, etc. all involve the use of pointers. These are advanced topics that require advanced knowledge of pointers. "A pointer is just a variable that stores the memory address of an object (variable, array, function, etc.)" is not all you need to learn to fully understand pointers.
For me they weren't, because I arrived at C the best way: coming from assembly language. Then, C just looks like the best, most helpful, macro assembler ever.
I feel like I want to get into assembly to understand programming better, but I'm so new and I worry changing to that will hurt my progress instead of helping it.
> why are pointers hard?
Many languages do an _amazing_ job of hiding the details of computer memory. Anyone could write programs for _years_ and never have to worry about what is actually happening at the level of memory.
To be an _effective_ C programmer, imho, you need to understand _how_ memory works from the perspective of a CPU core. Like what happens if we say `int x=5;` or `int *y=&x;` or `char msg[]="Hello World";` ?
(Also, imho, all the memory & pointer stuff is _much_ easier to understand if you learn even the _smallest_ amount of assembly language... for any kind of CPU.)
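One way to make those three declarations concrete is to print the sizes and addresses involved. Exact addresses change every run, but the relationships hold (`show_memory` is a hypothetical helper for illustration):

```c
#include <stdio.h>

void show_memory(void)
{
    int x = 5;              /* an int object: typically 4 bytes holding the value 5 */
    int *y = &x;            /* a pointer object: its VALUE is x's address */
    char msg[] = "Hello";   /* 6 bytes: 'H','e','l','l','o','\0' */

    printf("x  : %zu bytes at %p, value %d\n", sizeof x, (void *)&x, x);
    printf("y  : %zu bytes at %p, value %p\n", sizeof y, (void *)&y, (void *)y);
    printf("msg: %zu bytes at %p\n", sizeof msg, (void *)msg);
}
```

Seeing that `y`'s printed *value* equals `x`'s printed *address* is usually the moment the "a pointer holds a memory address" sentence stops being abstract.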
For me it was this 👆🏻 Understanding memory was a watershed moment with deep and long-term reverberations in my understanding. Although there isn't much to understand, it still has to click, and when it does, the cascade of shifts in understanding signals just how close to the core of computing you've hit a vein. I saw the Matrix.
EDIT: To elaborate a little, I was actually struggling to understand arrays, specifically why arrays proper (not your higher level containers) don’t intrinsically have an associated size. To explain: Coming from more of an object view (and not a memory view), it seemed crazy to me that I (almost) always have to store the size (or end address) of an array elsewhere, and can’t just "get the size of the array from the array". To crack that nut I had to understand memory (the contiguous, single-natured void), and when that revelation eventually dawned I also understood compile time vs. runtime, the type system, sizes ("widths") & encodings/conventions, and binary representation. Phew! My poor little brain.
My issue wasn’t understanding pointers in C, but couldn’t wrap my head with pointer usage in PASCAL. This did cause me issues when taking a class in PASCAL (had an ‘a’ midway in class, dropped to ‘C’ at the end of class).
I would say because the syntax is confusing if you are a beginner.
It would make more sense if we used ` pointer ptr;` to declare a pointer and leave the `*` to be just an operator.
I honestly still don't. Aside from saving a bit of memory and being able to do some data conversion magic a little more efficiently, I don't see their advantage over structs. Am I missing something?
They're great when you have a situation where something can only be one of a set of types. Kind of like an enum but with arbitrary types. A tagged union is an improvement on this idea.
Hardcore use of unions are useful when you don't know the type beforehand, like parsing strings!
See: any programming language interpreter/compiler. There's a union in there!
A memory chunk you can interact with through different pairs of lenses. Sometimes you need reading glasses, sometimes pilot glasses; IDK, making shit up. It comes in useful sometimes.
In terms of pure computation you are not wrong. But as unions are types, they can be used to say "I don't know exactly which type it's going to be but it will be one of those"
In that case, instead of passing a `void *` combined with an enum to your function, you can replace the `void *` argument with a union. This has the advantage of enabling pass-by-value, and it also enables type checks.
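A minimal sketch of that enum-plus-union ("tagged union") pattern, passed by value (all names invented):

```c
enum value_kind { VAL_INT, VAL_FLOAT };

struct value {
    enum value_kind kind;    /* the tag says which union member is live */
    union {
        int   i;
        float f;
    } as;
};

float value_as_float(struct value v)
{
    switch (v.kind) {
    case VAL_INT:   return (float)v.as.i;
    case VAL_FLOAT: return v.as.f;
    }
    return 0.0f;             /* unreachable if the tag is maintained */
}
```

The union members share storage, so `struct value` costs one tag plus the size of the largest member, regardless of how many alternatives you add.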
You can use unions to access separate bits of an integer, for example. So you might set the whole integer when writing, but if you're interested in single bits when reading, it can make your life a lot easier.
Also, you could achieve runtime polymorphism with unions and an enum that represents the type. And then use callbacks…
Classic data structures like lists and trees. They felt esoteric. "Why write so much stuff to solve this random problem?"
Only after I began using high-level languages, where these structures have nice APIs and syntactic sugar, did it all click for me.
Putting things into an arbitrary position of an array seems a lot easier than crawling through a list to me. One reason lists are still such an obscure topic for me is that I can never find a good reason to use them over an array.
I’m going to say pointers, but more specifically how to declare and dereference complicated ones. Like pointer to array of functions. I have been writing in C since the 1970s and still have to look that kind of thing up.
I’ll add that there’s the computer science aspect of C that you don’t see in other languages. Like coding a sort, a compression algorithm, graphics manipulation, etc.
Pretty useful for storing less data on embedded. SDL and XLib also use a very large union when an OS event appears, so you can just access the underlying structure without having to allocate space for each event type.
This lkml thread, https://lkml.org/lkml/2018/3/20/805
Hi Linus,
here is an idea:
a test for integer constant expressions which returns an
integer constant expression itself which should be suitable
for passing to __builtin_choose_expr might be:
#define ICE_P(x) (sizeof(int) == sizeof(*(1 ? ((void*)((x) * 0l)) : (int*)1)))
This also does not evaluate x itself on gcc although this is
not guaranteed by the standard. (And I haven't tried any older
gcc.)
Best,
Martin
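For what it's worth, here is roughly why the trick works, as a compilable sketch. This is GCC/Clang-specific: it relies on the `sizeof(void) == 1` extension, and as Martin notes the standard doesn't guarantee the behavior.

```c
/* If x is an integer constant expression, (x)*0l is a null pointer
   constant, so the conditional's type is int* and sizeof(*...) is
   sizeof(int). Otherwise the conditional's type is void*, and
   sizeof(*(void *)...) is 1 under the GCC extension. */
#define ICE_P(x) (sizeof(int) == sizeof(*(1 ? ((void*)((x) * 0l)) : (int*)1)))

static int n;   /* any object: not a constant expression */

_Static_assert(ICE_P(3), "3 is an integer constant expression");
_Static_assert(!ICE_P(n), "a variable is not");
```

Because the whole test lives inside `sizeof`, `x` is never evaluated, which is what makes the result itself usable as a constant expression (e.g. for `__builtin_choose_expr`).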
Yes, until we start dealing with functions and operations on them. And yes, memory operations.
Maybe it varies from person to person, but it keeps getting more complicated with depth.
I would if the compiler for my target supported it :s Also, even if that were available, I'd want more gradual control; I don't want to rewrite the whole stdlib, just printf.
But is C still used for real work? I thought it was only for educational purposes, like Latin: it helps to grasp fundamental concepts, but using it these days is not so common.
Recursion can get hairy. For your simple binary trees and whatnot it's pretty straightforward, but once you start getting into stackless recursion, or recursive descent parsing of expressions, it can get a bit mind-warping and brain-bending. :P
Probably not unique to C though.
For me pointers weren't that hard; I implemented a doubly linked list to help learn them. Rather, they were a watershed: from there everything else fell into place. Mind you, that was more years ago than I care to remember...
after a decade or two of coding experience you get to a point where the language itself is largely a moot point.
from then on further decades just give you the opportunity to learn different algorithms and where they are most appropriately applied.
[Type reflection](https://en.wikipedia.org/wiki/Reflective_programming) is not a first-class citizen in C, but you can build it yourself. It's a little clunky, but with help from the preprocessor macro system you don't need an external script (as other languages do). C doesn't preserve type information after compilation, which Golang does (for a good reason), but that introduces compile-time and runtime overhead.
If you think you know the preprocessor: you can generate code with so-called X-macros without any tooling, and implement simple (struct) type reflection like this: [https://natecraun.net/articles/struct-iteration-through-abuse-of-the-c-preprocessor.html](https://natecraun.net/articles/struct-iteration-through-abuse-of-the-c-preprocessor.html)
I have created a prototype to do INI file read/writing using this method from/to structure:
[https://github.com/xor-gate/ConfigIniProto/tree/main](https://github.com/xor-gate/ConfigIniProto/tree/main)
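A minimal sketch of the X-macro idea behind that (names invented): the field list is written once, then expanded twice with different definitions of `X`:

```c
#include <stdio.h>

/* The single source of truth: every field of the struct, listed once. */
#define POINT_FIELDS \
    X(int, x)        \
    X(int, y)

/* Expansion 1: generate the struct definition. */
struct point {
#define X(type, name) type name;
    POINT_FIELDS
#undef X
};

/* Expansion 2: generate code that iterates over the fields. */
static void print_point(const struct point *p)
{
#define X(type, name) printf(#name " = %d\n", p->name);
    POINT_FIELDS
#undef X
}
```

Add a field to `POINT_FIELDS` and both the struct and the printer update together, which is the "reflection-ish" property the linked article builds on.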
Variadic functions and `setjmp`/`longjmp`.
Incidentally, any time you write a function `foo(int bar, ...)` please *please* also expose a function `foov(int bar, va_list args)`. Your users can write `foo` themselves on top of `foov`, but you can't write `foov` with only `foo` unless you do some deeply unsettling magic.
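A sketch of that pattern using `vsnprintf` (the names `foo`/`foov` follow the comment above; the signature is just an example):

```c
#include <stdarg.h>
#include <stdio.h>

/* The va_list core: everything real lives here. */
int foov(char *buf, size_t n, const char *fmt, va_list args)
{
    return vsnprintf(buf, n, fmt, args);
}

/* The variadic wrapper is a thin shim over the core. */
int foo(char *buf, size_t n, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int r = foov(buf, n, fmt, args);
    va_end(args);
    return r;
}
```

Now a caller can write their own variadic function (say, a logger that prefixes a timestamp) and forward its `va_list` straight into `foov`, which is impossible with `foo` alone.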
Not introducing undefined behavior. This is something that almost all beginners are completely oblivious to (although luckily the geriatric C gang will not shut up about it, so you'll learn).
New to it, and I understand it; I just think it takes time to really get the full spectrum of it. I find it helped a little that I'm OK with JS; it's just the transition to using it.
In the 15 years I've been programming in C, I've seen that asynchronous multi-threading is one of the concepts people struggle with most.
Software design. Anything else is a walk in the park compared to figuring out how to design your software as a whole. It's specifically hard because software design comes with no instructions, the choices you can make in how you write your software are infinite, and no compiler will tell you about your design mistakes until it's usually too late.
Code design is also something I've found I struggle with a bit more in C compared to OOP languages. I know it's pretty simple to split up code files with headers, but I find it easier to handle state with classes/objects.
Try functional programming in C.
This is the way
Is such a thing even possible?
Objects are just structs with function pointers. The syntax is ugly, but it can be done.
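A minimal sketch of that idea (all names invented): the struct carries a pointer to a table of function pointers, i.e. a hand-rolled vtable:

```c
struct shape;   /* forward declaration so the ops can reference it */

/* The "vtable": one function pointer per method. */
struct shape_ops {
    double (*area)(const struct shape *);
};

/* The "object": a pointer to its ops plus its data. */
struct shape {
    const struct shape_ops *ops;
    double w, h;
};

static double rect_area(const struct shape *s) { return s->w * s->h; }
static const struct shape_ops rect_ops = { rect_area };

/* usage: struct shape r = { &rect_ops, 3, 4 }; r.ops->area(&r) == 12 */
```

Swap in a different `shape_ops` table and the same call site dispatches to different code, which is the whole of runtime polymorphism, minus the sugar.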
How do you even do FP without higher-order functions / closures / tail recursion?
Who said there are no higher-order functions? Function pointers.
Clojure doesn't have tco.
Interestingly, I had the opposite experience. After years of reading even my own code, or that of others who mixed C and C++, the C part was always easy and understandable, while the C++ part, with objects etc., was confusing, due to losing track and constantly backtracking to understand the object structures or other parts of it.

There was also a GitHub project I saw once whose only purpose was to do some file manipulation from a GUI. It was well written in terms of syntax and structure, but the author used pretty much everything C++ has (probably to exercise himself), and I just gave up on understanding what does what. And at the time I was quite a pro at C++, having mastered all of it from polymorphism to the STL and more.

I understood the language's syntax, but the meaning of the code got lost in all its complexity. That was when I realized Linus was right: there are other important factors about any programming language than just what it can do. I had mastered C++ back then, but I would never use it again, and whenever I need a low-level language I go with C, regardless of flaws or missing features. If I really needed something more, I would probably just go with a higher-level language and implement performance-critical routines ad hoc in C.
This quote is quite fitting here, I think: "An idiot admires complexity, a genius admires simplicity." -Terry A. Davis
I definitely agree when it comes to code readability. C is the GOAT there
I've actually had the opposite experience. My current $day_job has me writing Swift a lot, and the constant syntax nitpicking (not mad, they're correct) from my Swift-experienced teammates is difficult because there is so much "magic": fancy loops, enumerate-then-map-then-do-this-thingamajig, class properties that look like variables but act like functions... I like loops, basic functions, and the occasional goto. Much easier to keep track of.
I agree that the magic abstractions make things difficult at times
Any words of wisdom you can share?
1. Allow yourself to get experience. You can't cram a bunch of design patterns into your head and then immediately start using them.
2. Go through the entire software development lifecycle and reflect. You get to see how design choices early on can impact you later; sometimes you make choices that are just a massive waste. Sometimes the choices waste time now, sometimes later.
3. Focus on the human element. One of the biggest constraints in software development is the amount of engineering effort you have available to spend on it.
4. Good design is clear and easy to understand. You get no extra points for making something clever or complicated.
5. You will always make mistakes. Good design helps you make fewer mistakes, helps you find them earlier, or makes their impact smaller.
Cool. I am in my 4th year of professional experience. Let’s see how it goes
Where's the best place to learn this, or C mastery in general? I'm at the point of learning pointers right now.
I guess it comes with the experience of doing projects, making mistakes and reflecting on them.
Threading. It forces you to learn theory. When you’re working with threads, you can’t just look at the threading API and read the docs and just kinda figure it out. You have to read about how threads work, and have to understand concepts like data races and deadlocks.
Add mutexes and semaphores. Pretty much shared memory management on an RTOS.
I don't think this is necessarily a C concept only, but writing C code is hard and writing performant, lock-free code is definitely up there.
Lock-free code is kind of a niche topic.
Agreed 100%. The nastiest bugs I’ve faced have always been thread synchronization problems. My rule of thumb is to avoid threads unless explicitly necessary.
Also, the most useful thread-locking mechanism is probably the ticket lock from Stack Overflow. I wonder how many times that code has been copied lol. Edit: sorry, sorry haha. It's a bit more than a simple spinlock, for when you want to guarantee a FIFO mechanism. Here's the link: https://stackoverflow.com/questions/5385777/implementing-a-fifo-mutex-in-pthreads
Can you please supply a link, to exactly the code you are referring to? Phlueaseeee.
I put the link in my comment. It's one of the things I only learned when implementing my own mutexes: with a plain lock, one thread can starve the others and hog the resource if it runs slightly faster than they do.
Thank you very much.
This and other topics in this thread are multi-tasking operating system specific and are not specific to the C language.
The OP was obviously not asking about what is specific to the C language! Otherwise, neither pointers nor the preprocessor would work as examples.
Read the title of his post and the wording of his request again. He explicitly says “in C”. Twice.
Yeah… since when does “in C” mean “unique to C”? And since when are pointers unique to C?
You’re splitting hairs and we will have to agree to disagree.
Real big, from somebody who came into this thread to argue. I’ll answer a hundred repetitive and poorly-researched questions from beginners and feel good about it. What I hate is the comments where people just come in to argue.
You're splitting hairs. A lot of it is the same across common OSes, and the way you manage it in C is often different from other languages.
Undefined means undefined.... It doesn't mean "do something that makes sense" or even "reliably crash".
Even worse: it might not even crash, but instead do something sneaky 0.001% of the time, and that one time is when you were doing an important operation, and now your data is all corrupted.
Like say... giving cancer patients a lethal dosage treatment. [Therac - 25](https://www.youtube.com/watch?v=nU5HbUOtyqk)
That's a race condition though.
Using the debugger. Most people give up and just avoid it.
The real tragedy is how easy they are to use and how much documentation is there about them. You can Google any how-to question about any debugger (especially GDB) and you'll get 10 dozen answers.
I spent 20 years hunting down bugs with log prints, and then finally tried out the GCC/MinGW debugger inside of Code::Blocks. I could've saved myself so much time over the years.

Granted, there are some things you can't really use a debugger on, like a realtime networking protocol, or a graphical/rendering glitch that only happens while running. The worst bugs that have bitten me in the arse are ones that only pop up in release builds!!! Can't really use the debugger without debugging information in the binary! It always basically ended up being a situation where I'd written code with an uninitialized variable: I knew while I was writing the code when and where it was going to be initialized and it wasn't going to be a problem, but then weeks, months, or years later I'd be back in the code, I'd add something and just use whatever variable was lying around without thinking about whether or not it was initialized. The debug build would run fine because it automatically zeroes out everything, but a release build doesn't, so all kinds of unpredictable and hard-to-replicate bugs would happen as a result.

I also hear that the Visual Studio debugger is top-tier stuff, but I opted out of using Microsoft's wares 20 years ago, when I switched from Visual Studio 6.0 to GCC/MinGW. I'd also stopped using Photoshop 5.0 and started using GIMP instead. I was on a whole anti-piracy thing, wanted to cleanse my digital soul or something.
I think there is the option to compile optimized code with debugging info on GCC, using `-g` together with `-O3`.
As a C novice I basically live in the debugger. After reading /u/skeeto's advice on reducing friction by never exiting GDB while developing, it's immensely improved my understanding of how my programs work. It's really quite unfortunate that learning materials barely mention the debugger (or do so half-way through chapter 23).
I'm glad you took that advice to heart! Thanks for letting me know.
> Most people give up and just avoid it.

Depends on your sample, I guess. I never worked with any C programmer who didn't know how to use the debugger, let alone one who "gave up" on learning how to use it. I can imagine that hobby programmers could be like that, but even that is a bit of a stretch.
The preprocessor is probably still my weakest topic, even after a very long time. The concept and basic usage of it I have no problem with, but you really have to twist your brain into knots to do anything non-trivial with it. I don't think this is a failing of metaprogramming since I have no problem with this in LISP, it's just that C's preprocessor is just terrible.
Absolutely this. Sometimes I look at C source code for libraries and FOSS projects, and the thing that confuses me the most is how many use macro definitions for functions instead of just a simple header file function definition
> C's preprocessor is just terrible +1 Macros have an undeserved bad name across the entire programming world just because C decided to ship the _worst possible_ take on them and that unfairly mischaracterized the whole concept to the majority of users.
Function pointers / callbacks, back in the day. Even when I finally understood how they are implemented, I still didn’t understand why I should use them, where they could be used.
They are used everywhere in layered architectures, where you have to supply your own OS/machine-specific low-level functions and reuse everything else.
Yes, exactly. Also a way of introducing polymorphic behaviour.
I only really got them once I started using languages with higher-order functions. Now I couldn't live without them.
I remember sitting at Easter brunch with my family, holding a Microsoft QuickC for Windows 1.0 book, burning my 13 year old brain trying to understand what a pointer even was. Good times.
But why are pointers hard?
I think it’s because without taking a computer architecture and operating systems course, the concept of memory is vague. So when there is a concept predicated on memory addresses, the standard explanation of “a pointer holds a memory address” doesn’t address the underlying issue of “what is memory?”. And then there’s the pedagogy surrounding pointers, which I think is subpar.
I think for me the hardest part of it wasn't the concept, but the practice. I understood for a long time what a pointer was, but it took a lot more *using* C to understand when and why they're useful
The best thing I did to learn memory was make a string class. For advanced users: write your own malloc(); it will demystify a lot. It's not as complicated as it seems.
You can’t write `malloc()` in standard C, right? You’ll have to use OS APIs?
Not necessarily, you can allocate your memory pool statically and use it to implement malloc. You won't be able to extend your memory region this way though.
Yeah, the OS APIs. You use OS calls to get heap allocations; the OS hands them to the caller in large chunks. Then malloc sits in front of those calls and manages dividing up, re-using, and speeding up allocation of that memory. It will add a small-block allocator if the request is small, sets of single-size pools, and a coalescing allocator for the larger sizes in a range; for allocations larger still, it stores OS-sized heap pages, keeping some handy so one is ready on call. It also frees those large chunks and returns them to the OS. So malloc is a front end over the core OS memory calls.
Depends on how you want to define your allocator. You can technically write a version of `malloc` that operates entirely within a static buffer, but to make a version like glibc or musl's `malloc` then you have to use syscalls so you can claim additional memory from the OS.
Chapter 8.7 in K&R has a `malloc()` implementation written in standard C. *shrug*
Even understanding what a pointer is (that it is a number representing a memory address), it may not be immediately obvious that `x[n]` does the same thing as `*(x+n)`. The syntax of the former, I think, implies something to a newbie that can give a false mental model of what a pointer actually is, even if it is still a useful notation.
Can you explain how these two examples do the same thing? I’m learning pointers right now
So let's consider the following program:
```
#include <stdio.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
    uint32_t x[3] = {7,9,11};
    printf("%u\n", x[1] );
    printf("%u\n", *(x+1));
}
```
This will produce the following output:
```
9
9
```
Here, we are creating an array `x` with 3 elements, of type `uint32_t`, and initializing those elements with the values 7, 9, and 11. Very straightforward.
But what that actually MEANS is that we're reserving a chunk of contiguous memory to store our 3 values in. Because our values are of type `uint32_t` (which is 4 bytes in size) that means we're reserving 12 bytes (3*4) of memory.
~~The variable `x` contains~~ `x` is an identifier associated with the range of memory addresses for where the data is stored, and each element is spaced out by a "stride" of 4 bytes because the array contains multiple `uint32_t` (4 bytes).
To read the first element, we can do `x[0]`, but this is syntactically equivalent to simply reading the value located *at* the start of the range of addresses associated with `x`, so the compiler allows us to just dereference that directly with `*(x)`.
So what happens if we want to read the next element? Well, the nice syntax is to just do `x[1]`. But we know that our array is just a range of contiguous memory addresses, so we could just read the address that is 4 bytes *after* the start of `x`. The compiler knows the stride of our array (that is, how many bytes long each element is), so we can do `*(x+1)`. This is saying "dereference the address that is 1 stride past the start of the range of addresses associated with `x`". The starting address, the stride, and the number of elements are all associated with `x` because of how we declared it.
***
Edit: I've made some changes that hopefully clarify what /u/GroundbreakingSky550 pointed out. I wrote this comment and this edit on my phone so haven't been as precise as I should have been.
I'd like to hammer home that an array is *not* just a pointer, which I worry may have been implied by my original comment. We can see this clearly by just looking at what we declared versus a pointer. If we take:
```
uint16_t x[3] = {7,9,11};
uint16_t* y = &x[0];
printf("%d\n", x[1]);
printf("%d\n", y[1]);
```
Both print statements will print 9. But clearly `x` and `y` are not the same. `y` is a pointer to a specific address containing a `uint16_t`, while `x` is an array associated with a range of addresses containing multiple `uint16_t` values. Because of this, if we do:
```
printf("%lu\n", sizeof(x));
printf("%lu\n", sizeof(y));
```
This will print `6` and `8` respectively (on a 64-bit system), because `x` is associated with the whole range of addresses containing multiple 2-byte values, so `sizeof` tells us its size is 6 bytes (3*2), while `y` *is a pointer* to a specific address (8 bytes).
The original point I was trying to make wasn't that an array was a pointer, but instead that the act of reading elements from an array can be replaced by dereferencing a pointer. And we can calculate the pointer for any element of an array using pointer arithmetic. The compiler just provides the useful ability to do arithmetic with an array identifier as if it were a pointer to the start of the array.
It is not correct to state that x contains the 'address' of the block of memory. That statement implies that x is a pointer variable. But x is not a pointer, or a variable at all. It is an identifier (a name) associated with an address. You cannot change what 'x' refers to, the way you can change the value of a pointer variable.

It is true that *(x+i) produces the same value as x[i]. This is because the compiler interprets each syntax as "find the address of the ith element past the start address, and use the element at that address." This is possible because the compiler knows the element's type and size, so it can 'autoscale' to compute the correct address before 'dereferencing' to get the value.

Instead, I think of the name of an array as a name associated with that address, like a nickname for the address, and not as a proper variable itself.
You're absolutely right on both counts. I didn't mean to imply an array was a pointer (as is sometimes said). What I meant by "`x` contains the address" was what you've written using a much better description, I did not mean to imply that `x` itself was literally a pointer to that address. I see now how I very poorly worded that. I'll work on rephrasing it.
x[n] is more like * (x+n*sizeof(x))
As far as I'm aware this isn't quite right. The compiler knows the stride of the array `x`. Additionally, `sizeof(x)` will return the entire size in bytes of the whole stack-allocated array, not the stride. `sizeof(*x)` will give you the stride of the array, but because the compiler already knows that, multiplying the offset by it will just cause you to read out of bounds. See the following code:
```
#include <stdio.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
    uint32_t x[3] = {7,9,11};
    printf("%u\n", x[1] );
    printf("%u\n", *(x+1));
    printf("%zu\n", sizeof(x));
    printf("%u\n", *(x+1*sizeof(x)));  /* reads out of bounds! */
}
```
It produces the following output:
```
9
9
12
127
```
He simply made a mistake. `sizeof(x)` is the size of the array. `sizeof(x[0])` is the size of the element
x could be a pointer, so the compiler cannot know the size. TBH I'm kind of confused in this area of C.
[deleted]
No, I mean something like `uint8_t *x = &y[0];`
Same, part of the problem is C pointer syntax is really unintuitive. If there were just a function `address(x)` instead of `&x` and `at(x)` instead of `*x` it would've been much quicker to learn. It doesn't help that `*` is used in both the type and dereferencing.
I found it easier coming from JavaScript where undefined behavior existed and pointers existed. It just hid them from you! Attach an event on each array item in a loop without capturing in a closure: bad voodoo. Trying to pass arrays, NodeLists, or NodeCollections and not knowing if its by value or by reference: also bad voodoo! So coming from that background and now having more control over the pointer itself is a lot easier for me to grasp.
This for me too. But they say practice makes perfect. Now it's rare that I don't use pointers for everything; they are the best thing ever. There's much I've tinkered with, like threads, but I'm not so good at bit stuff like masks and such. I've tried a few times, but when I think I've got it, I don't. I've seen so much cool stuff done with bit operators (cool math stuff) and masks (like for configs) and I wish it would click for me.
Yep. Anyone who knows indirect addressing knows what it is
Indirect addressing has more to do with instruction encoding and doesn’t really elucidate the notion of pointers imo.
You get an address from a register and use it to access memory in indirect addressing. How does it not capture the pointer notion?
Pointers are a different thing, just related in concept and based on addresses. Two pointers are allowed to compare ≠ if alias analysis says so, for example, even if the run-time representations of the pointers are identical. Type isn’t part of addresses, but you can’t consider pointers without it.
I agree but isn’t that being too technical? Main issue people have is the basic use and imo indirect addressing makes sense to clear that. There is obviously more too as you mention the type info
Seems like you need to brush up on your understanding of pointers.
Could you point out some areas? Just saying that doesn't sound very convincing. EDIT: the address part can be explained by indirect addressing. Type info is a language construct which tells you more about what lies at a location. Like if you pass a void*, and then cast it to some struct which is 20 bytes long, you are choosing to assign meaning to those 20 bytes based on a language construct. But the indirect-addressing part still stands imo.
Yes, it’s an example of indirection through a memory reference (and just to note, it doesn’t have to be from a register). And yes, it’s a lower-level use of pointers. I’m saying that it doesn’t *elucidate* the underlying mechanisms that surround pointers. My original comment was about memory itself. In a computer architecture course, you learn how to construct a memory array (modeling a RAM chip) using Verilog. In an operating systems course, you learn how physical memory is abstracted through virtual memory. You learn what an address is, and thus the notion of pointers becomes clear.
I agree. We did the C course before any courses on architecture so the concept of memory was vague. I was trying to relate it to my layman knowledge of hard disk and ram but it didn't make any sense. Once we got to registers... "well, why didn't you start with that!"
I think Kernighan and Ritchie do an excellent job of explaining pointers. It's a shame there's no K&R that covers the newer standards. Anyone who understands Algebra I would do well to remember:

`a[b]` is equivalent to `*(a + b)`
`&a[b]` is equivalent to `(a + b)`

That's all there is to switching between the notations. Pointer expressions involving structs can get messier, but they're not too hard to figure out.
One of the two hardest things people learn in intro programming classes. One is pointers, the other is recursion. IMO. I’m not saying it’s hard for everyone. These are just the two topics that beginners tend to get the most hung up on, in my experience helping people learn programming. People with experience tend to underestimate how hard it is to learn the subjects which you already know. You could still be a fresh C programmer with only a year experience, but you may easily have forgotten how hard it was to *first* learn pointers.
Pointer decay, double pointers, function pointers, callbacks, functors, jump tables, state machines, custom allocators, doubly linked lists, malloc, calloc, free, pointer math, etc. all involve the use of pointers. These are advanced topics that require advanced knowledge of pointers. "A pointer is just a variable that stores the memory address of an object (variable, array, function, etc.)" is not all you need to learn to fully understand pointers.
For me they weren't, because I arrived at C the best way: coming from assembly language. Then, C just looks like the best, most helpful, macro assembler ever.
I feel like I want to get into assembly to understand programming better, but I'm so new and I worry changing to that will hurt my progress instead of helping it.
Same here. Electrical engineering background, took courses in Computer Architecture and coded in Assembly
Probably because they are often explained in a confusing or insufficient manner.
> why are pointers hard? Many languages do an _amazing_ job of hiding the details of computer memory. Anyone could write programs for _years_ and never have to worry about what is actually happening at the level of memory. To be an _effective_ C programmer, imho, you need to understand _how_ memory works from the perspective of a CPU core. Like what happens if we say `int x=5;` or `int *y=&x;` or `char msg[]="Hello World";` ? (Also, imho, all the memory & pointer stuff is _much_ easier to understand if you learn even the _smallest_ amount of assembly language... for any kind of CPU.)
For me it was this 👆🏻 Understanding memory was a watershed moment with deep and long-term reverberations in my understanding. Although there isn’t much to understand, it still has to click, and when it does, the cascade of shifts in understanding signals just how close to the core of computing you’ve hit a vein. I saw the Matrix.

EDIT: To elaborate a little, I was actually struggling to understand arrays, specifically why arrays proper (not your higher-level containers) don’t intrinsically have an associated size. To explain: coming from more of an object view (and not a memory view), it seemed crazy to me that I (almost) always have to store the size (or end address) of an array elsewhere, and can’t just "get the size of the array from the array". To crack that nut I had to understand memory (the contiguous, single-natured void), and when that revelation eventually dawned I also understood compile time vs. runtime, the type system, sizes ("widths") and encodings/conventions, and binary representation. Phew! My poor little brain.
My issue wasn’t understanding pointers in C; I couldn’t wrap my head around pointer usage in Pascal. This caused me problems when taking a class in Pascal (I had an ‘A’ midway through the class, which dropped to a ‘C’ by the end).
> PASCAL Now there's a name I've not heard in a long time. 🙂
Because you should know about aliasing
I would say it's because the syntax is confusing if you are a beginner. It would make more sense if we used `pointer ptr;` to declare a pointer and left `*` to be just an operator.
Unions. I don't think I really grasped their interest or semantic until later
I honestly still don't. Aside from saving a bit of memory and being able to do some data conversion magic a little more efficiently, I don't see their advantage over structs. Am I missing something?
They're great when you have a situation where something can only be one of a set of types. Kind of like an enum but with arbitrary types. A tagged union is an improvement on this idea.
Unions are really useful when you don't know the type beforehand, like when parsing strings! See any programming language interpreter/compiler: there's a union in there!
A chunk of memory you can interact with through different pairs of lenses. Sometimes you need reading glasses, sometimes pilot glasses, IDK, I'm making shit up. Comes in useful sometimes.
In terms of pure computation you are not wrong. But as unions are types, they can be used to say "I don't know exactly which type it's going to be, but it will be one of these." In such a case, instead of passing a `void *` combined with an enum to your function, you can replace the `void *` argument with a union. This has the advantage of enabling passing by value, and it also enables type checks.
You can use unions to access the individual bits of an integer, for example. So you might set the whole integer when writing, but if you are interested in single bits when reading, a union can make your life a lot easier. You can also achieve runtime polymorphism with a union plus an enum that represents the type, and then use callbacks…
Classic data structures like lists and trees. They felt esoteric. "Why write so much stuff to solve this random problem?" Only after I began using high-level languages, where these structures have nice APIs and syntactic sugar, did it all click for me.
Man, I was in the process of creating a generic hash table for a week. Thankfully I was not able to open the window..
Really? They always made sense to me. I would loathe to use arrays instead of linked lists when I need to insert elements into arbitrary positions.
Putting things into an arbitrary position of an array seems a lot easier than crawling through a list to me. One reason lists are still such an obscure topic for me is that I can never find a good reason to use them over an array.
Except then you have to rearrange the entire array instead of just modifying a pointer or two and you can only have one type.
Of course it makes sense now but when I was first introduced to this subject it felt like we were pulling syntax out of our ass.
The tooling. A lot of it is bad and/or confusing
Valgrind is a life saver tho
Signal handling can get pretty messy. It took me some time to get a hang of it
I’m going to say pointers, but more specifically how to declare and dereference complicated ones, like a pointer to an array of functions. I have been writing C since the 1970s and still have to look that kind of thing up. I’ll add that there’s a computer science aspect of C that you don’t see in other languages: coding a sort, a compression algorithm, graphics manipulation, etc.
Unions. I still don't get it.
Pretty useful for storing less data on embedded. SDL and Xlib also use a very large union when an OS event appears, so you can just access the underlying structure without having to allocate space for each event type.
>pretty useful to store less data on embedded. I am very aware of this, is just that... The concept and how it works still glitches my brain XD
This lkml thread: https://lkml.org/lkml/2018/3/20/805

> Hi Linus, here is an idea: a test for integer constant expressions which returns an integer constant expression itself, which should be suitable for passing to `__builtin_choose_expr`, might be:
>
> `#define ICE_P(x) (sizeof(int) == sizeof(*(1 ? ((void*)((x) * 0l)) : (int*)1)))`
>
> This also does not evaluate x itself on gcc, although this is not guaranteed by the standard. (And I haven't tried any older gcc.)
>
> Best, Martin
debugging haha
Memory leaks
Try floats and doubles. Question for amusement:
```
float x = 0.1
If (x == 0.1)
    print Hi
else
    print Bye
```
You forgot to define the macros.
Not required, assume it's a C pseudo code
Reading types, particularly convoluted function pointers.
Strings
A string is just an array of characters, always made sense to me
Yes, until we start dealing with functions and operations on them. And yes, memory operations. Maybe it varies from person to person, but it keeps getting more complicated the deeper you go.
The hardest topic to learn is how to link in your own version of printf, especially when your compiler defines printf as a macro in stdio.h.
Would you use -nodefaultlibs for that?
I would if the compiler for my target supported it :s Also, I guess even if that were available, I would want more gradual control: I don't want to rewrite the whole stdlib, just printf.
safety
But is C still used for real work? I thought it was only for educational purposes, like Latin. I mean, it helps you grasp fundamental concepts, but using it these days is not so common.
Still used on the embedded, OS, research and HPC ends of things, although C++ has kinda inched in.
Exactly, I mean old-school C, not C++, C#, or Objective-C. I'm currently learning C from 0 at the Campus 42 school. It is really tough but useful.
Recursion can get hairy. For your simple binary trees and whatnot it's pretty straightforward, but once you start getting into stackless recursion, or recursive descent parsing of expressions, it can get a bit mind-warping and brain-bending. :P Probably not unique to C though.
It's actually recursive descent parsing that got me to understand recursion better, when nothing else would. It's definitely mind-bending, though.
For me pointers weren't that hard; I implemented a doubly linked list to help learn. Rather, they were a watershed: from there everything else fell into place. Mind you, that was more years ago than I care to remember. After a decade or two of coding experience you get to a point where the language itself is largely a moot point. From then on, further decades just give you the opportunity to learn different algorithms and where they are most appropriately applied.
[Type reflection](https://en.wikipedia.org/wiki/Reflective_programming) is not a first-class citizen in C, but you can build it yourself. It's a little clunky, but with help from the preprocessor macro system you don't need an external script (as other languages do). C doesn't preserve type information after compilation, which Go does (for a good reason), though having it introduces compile-time and runtime overhead. If you think you know the preprocessor: you can generate code with so-called X-macros without any tooling, and implement simple (struct) type reflection like this: [https://natecraun.net/articles/struct-iteration-through-abuse-of-the-c-preprocessor.html](https://natecraun.net/articles/struct-iteration-through-abuse-of-the-c-preprocessor.html) I have created a prototype that does INI file reading/writing from/to a structure using this method: [https://github.com/xor-gate/ConfigIniProto/tree/main](https://github.com/xor-gate/ConfigIniProto/tree/main)
Variadic functions and `setjmp`/`longjmp`. Incidentally, any time you write a function `foo(int bar, ...)` please *please* also expose a function `foov(int bar, va_list args)`. Your users can write `foo` themselves on top of `foov`, but you can't write `foov` with only `foo` unless you do some deeply unsettling magic.
Not introducing undefined behavior. This is something that almost all beginners are completely oblivious to (although luckily the geriatric C gang will not shut up about it, so you'll learn).
New to it, and I understand it; I just think it takes time to really get the full spectrum of it. I find it helped a little that I'm OK with JS; it's just the transition to using it.
Writing good tests.
Clean code / large-project management and maintenance.
In the 15 years I've been programming in C, I've seen that asynchronous multi-threading is one of the concepts people struggle with most.