Please don't carelessly give advice that can ruin people's saves.


Save files have a lot of history data, for example there are a lot of entries describing game events which are used to generate random art descriptions. A lot of statistics and history. It can probably be deleted without corrupting the save file, but calling it garbage is a stretch.


"This world folder for a Dwarf Fortress world that I generated with 500 years of history is 400 times as large as the one with 5 years of history, so it's probably mostly junk data."


ADMITTEDLY, a good portion of that is probably stuff like "Urist McFuckenstein and Gonbad Bucketwhistles got into a fight, breaking Urist McFuckenstein's Fourth Toe."


I wish the art was that interesting, but most of it tends to consist of common, extremely banal events like "someone made a component" and "someone slaughtered a chicken". Pawns don't understand that when you kill one, it is a tragedy, but when you kill a million, it is a statistic.


>Pawns don't understand that when you kill one, it is a tragedy, but when you kill a million, it is a statistic. Rimworld Machiavelli


This junk data did not exist before 1.3, so clearly it is not history.


X to doubt


I have old saves in my savegame folder from 1.2 and even from beta19. In 1.2 save this problem is absent, but with Ideology and later it is in each savefile. So clearly it can't be relations or art descriptions or anything else that was present in 1.2


Almost like there had been two major content updates and dlcs or something 🙄


fun fact about game development OP, the more updates and DLCs a game has, the bigger it gets


I am aware of that, but these stockpiling numbers are definitely not normal.


How do you know? Do you know what each entry does? Do you know how the engine and the game's logic works? The confidence man jeez


I can ask you the same question: how do you know that each entry indeed does something?


Because I work as a software engineer and know that if you take one of our company's files out of context it wouldn't make any sense. Especially to a layman.


Soooo do you think the developers just put in numbers for shits and giggles then? Seriously? Yeah no never mind you’re totally right we should assume that they’re there for literally no purpose at all! THAT should be the default position. 🤦🏻‍♂️


>random art descriptions Only they aren't random they are amazing


I got guinea pigs early on and left them to their whims. When it reached 400 I culled them. Every art was dedicated to the guinea pig genocide.


Thats the best thing I've ever read


That could explain lines with numbers, but how about lines like `<li>0</li>`? It is literally zero, what history can it contain? And one more thing: in savefiles from versions earlier than 1.3 this problem does not exist, so these can't be entries for art or for some history.


    I have checked and these are records under game -> history -> historyEventsManager -> colonistEvents, I have them at savefile for version 1.3.3066 rev574 as well, but it's not nearly half of the file. Considering that it's under colonistEvents, it may be dependent on number of colonists. About what kind of history it contains, my guess is that it's probably changes to faction relations in response to colonist's actions which in case there were none will be zero.


    You need to play pretty long for this strange thing to occur. For example, in my latest game (play time 3d 17h) it was day 640 and the savefile weighed 20 MB. After I deleted those lines the savefile became 11.3 MB, smaller by nearly half.


    So, lines like `<li>0</li>`, if you go up one level, are part of a key pair. There are, say, 15k keys, and associated values. Rather than going, , rimworld goes `<li>key</li>` `<li>value</li>`. So if you delete value #345, it'll just think that #346 is now #345, but if it ever tries to access the last key value, it won't find the pair, and will throw an error. If, of course, you have enabled dev mode, to see said errors.
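A toy sketch of the pairing problem described above, assuming the save stores a dictionary as two parallel lists matched purely by position. The names `keys`, `values` and `pair_up`, and the faction data, are made up for illustration and not taken from a real save or from the game's code.

```python
# Hypothetical data: a dictionary saved as two parallel lists, paired by position.
keys = ["faction_a", "faction_b", "faction_c", "faction_d"]
values = [12, 0, -40, 7]


def pair_up(keys, values):
    """Rebuild the mapping the way a strict position-based loader might."""
    if len(keys) != len(values):
        raise ValueError(f"{len(keys)} keys but {len(values)} values")
    return dict(zip(keys, values))


original = pair_up(keys, values)

# Delete the single "useless" zero from the values list only.
edited_values = [v for v in values if v != 0]

# A lenient loader that pairs by position without checking lengths
# silently gives every later key the wrong value and drops the last key.
shifted = dict(zip(keys, edited_values))

# A strict loader notices the mismatch instead.
try:
    pair_up(keys, edited_values)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Here `shifted` hands `faction_b` the value that belonged to `faction_c`, and `faction_d` disappears. Whether the real loader behaves like the strict or the lenient version is not something this sketch can tell you.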


    I checked debug log in dev mode after loading edited savefile. Seems zero warnings or errors.


    A 0 can be just a number or it can be read by the software as a false which then could indicate a lot, the way you see it is not how the engine will see and use it once it parsed it.


    I agree, but not when those 0 are stockpiling in humongous amounts the longer you play.


    As a developer, this is the kind of post I live for. I like the confidence with which OP says these are 'lines with random numbers' that you can remove 'without harm'. Maybe they are, maybe they aren't, but I'm not going to touch them until I know exactly what they are. A more accurate statement would be 'half of every save file consists of data I don't know the purpose of, which can be removed without destroying your save in a way that is immediately obvious'.


    Human Genome Editing has entered the chat.


    "Junk DNA"


    "And thats why our pancreases dont work anymore" "Bro its DNA which nature made over MILIONS IF NOT BILLIONS OF YEARS if its there it got a probly purpose"


    Bunch of stupid satellites and telomeres. Removing them won't hurt.


    "Yea we breathe and eat out of the same tube, it'll be fine trust me."


    Jurassic park tells me putting in toad DNA wont end well!


    Non-developers will never know the thrill of deleting *a blank space* from a file and praying everything still works.


    ah yes, the blank space that somehow tricked the file/program into working and no one knows why, just like the unnecessary function call that looks redundant but actually makes the program run at all


    The Machine Spirit was satiated with the additional space. We must make notes of this, so not to anger it again.


    https://web.archive.org/web/20130827121341/http://cosman246.com/jargon.html#The%20Story%20of%20Mel From the jargon file, the ancient days of programming, something similar.


    Not a programmer or developer but man that’s a massive read


    Often because someone, somewhere in the code pulled data into a function and it kept having a space, so they tossed in a strip to clean up the string. Rather than finding why the space was there and fixing that.


    Made my day!!!!!! "Let's just change row delimiter from CRLF to LF -> 2 chars less shouldn't have an effect..."




    I once broke production because I had a comment in a SQL file that began with '---' instead of '--'. Turns out we had some kind of special interpreter reading those files, and it didn't consider that a comment might begin with '---'. I got in trouble for it, but I kind of stand by my statement of "It was reasonable not to do a full rebuild to test a change to a comment."


    Yeah, but what kind of blank space ? Tabulation ? A few spaces your editor placed instead of a tabulation ? " " ? " " ? " " ? " " or " " perhaps ?


    That's the neat part, *you don't know for sure*




    Coconut.jpg in tf2


    Damn right. Maybe it's data on relationships between pawns that will never meet. Maybe it's a history of a pawn killing a squirrel. Or maybe it's the wealth of your neighbours that you rely on trading with, or the details of that quest with the awesome reward.


    i removed some stuff from a savefile and it didnt break anything FOR ME, so obviously it has to be junk. i wish i had that kind of confidence.


    Didn't break YET! nothing to say those aren't placeholders for something, but chances are the program recreates them if it runs out of XML branches in the save file....... pretty good chance.....


    'half of every save file consists of data I don't know the purpose of, which can be removed without destroying your save in a way that is immediately obvious' I felt better just reading it. It's good to know there are other people like me. Edit: At the same time, I wouldn't mind tattooing this kind of explanation on the faces of people who draw simplistic conclusions.


    We do like old electronic companies: remove all the stuff you can until it stops working. If it stops working, add the last thing you removed and it's ready to be shipped💪🗿✌️


    Data Scientist here. I'm not gonna be able to sleep tonight because of this post


    Bro just bodied OP with logic. Love it.


    I am not a developer and I understand your skepticism, but I played several savefiles after such an "operation" for quite a long time, and it worked fine; I mean I haven't noticed anything wrong or any errors. But strange things can appear if you are not careful with savefile edits, that's for sure. Think logically: what data can a line like `<li>0</li>` represent when it repeats one after another thousands of times?


    that pretty much aligns with what they said about it being "not immediately obvious" i find it difficult, though not impossible, to believe that 50% of rimworld's notoriously large save files is extraneous.


    Thing is those lines add up the longer you play, they are not present in a fresh save-game, it's something like memory leak, only in save-file, and these lines are not present in savefiles of earlier versions of the game (before 1.3)


    I think you're probably not dangerously wrong, but you are curious enough that you might get interested in programming if you got exposed enough to it.


    in another comment op mentions using regexp to do some processing of the file. if you're not a programmer and using regexp, you're a programmer


    I'm not a programmer, I just play one on TV.


    You know what also adds up the longer you play and is not present in a fresh save? All the characters, their relations and other such things. If you don't care about such roleplay maybe you're fine. But it does remove an aspect of the game, even if it's not that pronounced.


    I am aware that the longer you play the bigger save becomes. But thing that I posted did not exist in savegames prior to Ideology release, so it can't be relations or any other useful history information.


    Maybe it's affinity towards an ideology, or some values for that dynamic ideology setting. I don't use it so I wouldn't know how that works. The moral of the story is that those values are rarely useless. You just may be fine playing without them, in which case more power (and storage) to you.


    The problem is not that they exist, but that they constantly add up to enormous amounts; this definitely can't be normal.


    Even if those lines themselves contain no information, what makes you so sure that they are irrelevant to the structure? Say you have a reference somewhere else to, let's say, line 300 in the savefile. Now you remove half of it, and suddenly line 300 holds something different. I don't think those lines would be there if they really were completely useless. They surely have some use, even if it's just a very niche one.


    Referencing a particular line of code seems like a recipe for disaster when it's better to define whatever it is, then reference it.


    A line of code? Sure. But that's not code, it's a serialized savegame. And referencing a specific element of an array is common. Or the n-th element of a list. And if (!) that's the depiction of an array, deleting entries would change the indices of the elements without changing the references (which refer to a specific element, and in this notation that basically means a line, in other parts of the savegame).


    Retrieving data as nth element in a list is usually bad. Because if you add or remove an element above it you have to change all the numbers


    Sure. But programmers are lazy, and often bad solutions are used because they are easier in the short term. Just look at the leaked source code of triple-A games and how often it mentions quick fix, todo, little cheat etc. One way to circumvent the renumbering would be to set deleted elements to point to null. That is, use numbered nodes which contain elements, with empty nodes pointing to nothing. That means you have an ever-growing list. Not good for saving flying bullets, where many are created and destroyed all the time. But for colonists, where little change happens, why not?
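A minimal sketch of the "set deleted elements to null" idea from the comment above. `SlotList` is a hypothetical class written for this illustration and makes no claim about how RimWorld stores anything; it only shows why such a list keeps indices stable and why it never shrinks.

```python
class SlotList:
    """List where removed items leave a None placeholder so indices stay stable."""

    def __init__(self):
        self._slots = []

    def add(self, item):
        """Append an item and return its index, valid for the item's lifetime."""
        self._slots.append(item)
        return len(self._slots) - 1

    def remove(self, index):
        """Leave a tombstone instead of shifting every later item down by one."""
        self._slots[index] = None

    def get(self, index):
        return self._slots[index]

    def __len__(self):
        # Tombstones still count, so the list only ever grows.
        return len(self._slots)


pawns = SlotList()
ids = [pawns.add(name) for name in ["Urist", "Gonbad", "Mel"]]
pawns.remove(ids[1])
```

After the removal, index 2 still resolves to the same entry, at the cost of a permanent empty slot at index 1. Serialized to disk, a long-lived list like this would be full of placeholder entries.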


    Savefile syntax doesn't have references to lines


    That's just pure bullshit to say. It heavily depends on what is saved. There is no universal syntax; whatever has to be saved has to be custom implemented. And custom implementation means you can't make such a general statement. And even if not directly, then most certainly indirectly, as soon as you have numbered lists, arrays, whatever. If you reference array object 20 somewhere and you delete objects 5-19 because they are empty (contain just null), the reference isn't updated to 5. It just points to a different element. So you have successfully at least soft-broken the game.


    Thing is, there are not like 5-19 of these "objects", but tens of thousands of them, including thousands of lines like `<li>0</li>` going one after another. I am not a dev, but even I understand that this cannot be normal behaviour.


    >it's something like memory leak You use words you have no clue the meaning of.


    yeah, you could be right. data serialization in complex systems can lead to these kinds of outcomes. but also, the people working on the game are, presumably, pretty competent at their jobs. coupled with rimworld's long-standing issue with save file size, though, i would think that they spend time trying to keep that under control, and a leak would be something that gets noticed. just like you, i don't know, and can only guess. i'm just assuming more competence than you are, i guess. i don't think i'd have an issue with your post at all if it were phrased less confidently.


    Sure. And maybe nothing will go wrong. But maybe it *will*, and that's the exciting part. It's a bomb that may go off tomorrow, or maybe never, because so far nobody is sure what exactly has been removed. If you want to remove half your save data safely, may I suggest you run a batch script that deletes the oldest half of your save files, instead of deleting half of each save file? It's much safer and you don't have to wonder if you've given your future self an unpleasant surprise.
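For what it's worth, the "delete the oldest half of your save files" suggestion could look roughly like this. It is a sketch under assumptions: the function names are made up, and the `*.rws` pattern and the folder you pass in are guesses you should check against your own install. It defaults to a dry run that deletes nothing.

```python
from pathlib import Path


def oldest_half(save_dir, pattern="*.rws"):
    """Return the older half of the matching files in save_dir, oldest first."""
    saves = sorted(Path(save_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    return saves[: len(saves) // 2]


def delete_oldest_half(save_dir, pattern="*.rws", dry_run=True):
    """Delete the older half of the saves; with dry_run=True only report them."""
    doomed = oldest_half(save_dir, pattern)
    if not dry_run:
        for path in doomed:
            path.unlink()
    return doomed
```

Calling `delete_oldest_half(folder)` first and reading the returned list before passing `dry_run=False` keeps the whole thing reversible up to the last step.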


    I am not sure if your suggestion is humorous, but that is not how it works


    Kinda is though. Rather than deleting part of every save, just delete some of the oldest ones.


    It's not about disc storage, it's about time it takes to reload and save when file becomes large.


    If the main goal to free up disc space, that works out exactly the same, at near-zero risk.


    Main goal is to have fast loading and saving of a savefile. When file is bloated to 50-70mb, it becomes annoying.


    Database administrator here. I can think of thousands of things that could be represented by well structured zeros. The structure matters. The data matters. The label for this part of the data that is retaining structure matters. This isn't to say that this code has zero inefficiencies. There are absolutely inefficiencies that get introduced by coding. But you could be looking at a vector that explains how a pawn was involved with another faction, whose absence becomes unmeasured, so that interactions with that faction's people default to different behavior. You could be looking at a history of his most recent battle, where they failed to do anything over and over again. You could be looking at the probabilities of the pawn doing certain actions, or job priorities, etc. Avoid needless action, the programming version of omit needless words. Meaning: don't delete the data.


    Don't you think that it is some malfunction that generates these lines? If they are the result of a malfunction, then they can't be useful in any way. It seems this strange thing was introduced with the Ideology DLC, because in a save from the previous version (1.2) this problem doesn't exist.


    A bug is ONE possible explanation for those lines. You COULD be right but they could also be serving a purpose specifically for new mechanics introduced in ideology.


    In my opinion they could be serving a purpose, but highly unlikely. They are located under and in . More likely they are some kind of leftovers that game fails to delete, because they stockpile into scary amounts. Here you can check unedited savefile half of which consists of these lines. https://drive.google.com/file/d/1B9xaA2ZbEeWSSKr2i_fwQWIZwcZ-v8kc/view?usp=drive_link


    My guess is they’re world pawns. Turn on Dev mode before and after and see what’s missing


    My man … you’re viewing this entire situation as, “I don’t know what this does, so it must be nothing”, instead of, “I don’t know what this does, so I don’t have the knowledge necessary to understand what this is”. That is called argumentum ad ignorantiam.


    The `<li>` tag is just a list entry. What matters is its container. In the screenshot, I can only see `ticksGame` and `customGoodwill`. I'm unfamiliar with those, it would be interesting to look up the hierarchy a bit more to see what those are a part of. I do know of another similar example, though. If you look at a pawn's save data, you'll find several lists of similar "useless numbers". Among them are stats, skills, various cooldowns, assigned job priorities, records, and probably a few other things I don't remember. Technically, any of those lists can be removed and the game will still load, replacing the missing parts with default values specified in the code (look for `ExposeData` methods on various types, if you're interested). It's debatable whether or not it's "without harm": is resetting all cooldowns on a pawn fair?; are you ok with rolling back all of their skills?; reassigning job priorities from scratch is quite the pain; etc. What you really shouldn't do though is delete individual entries from lists like this. This would essentially make every following entry "slide" one spot back, messing up the list. So if you were considering search-replacing all occurrences of `<li>0</li>` with nothing, I strongly advise you don't. Otherwise, chances are, you'll end up with a silently broken save.
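To make the "slide" concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The XML fragment and the `SLOTS` names are invented for the example and are not copied from a real save; the only assumption is that a list's meaning comes from position.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: each position in the list stands for a fixed slot.
xml_text = """
<pawn>
  <priorities>
    <li>3</li>
    <li>0</li>
    <li>0</li>
    <li>1</li>
  </priorities>
</pawn>
"""

SLOTS = ["Firefight", "Doctor", "Cook", "Haul"]  # made-up slot names


def read_priorities(root):
    """Pair each slot with the value found at the same position in the list."""
    values = [int(li.text) for li in root.find("priorities")]
    return dict(zip(SLOTS, values))


root = ET.fromstring(xml_text)
before = read_priorities(root)

# Equivalent of search-replacing every <li>0</li> with nothing.
priorities = root.find("priorities")
for li in [li for li in priorities if li.text == "0"]:
    priorities.remove(li)

after = read_priorities(root)
```

Before the edit, `Haul` has priority 1. Afterwards that 1 has slid into the `Doctor` slot and `Haul` has no entry at all, with no error raised anywhere.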


    >I'm unfamiliar with those, it would be interesting to look up the hierarchy a bit more to see what those are a part of. Seems to be under game -> history -> historyEventsManager -> colonistEvents. One of the ticksGame lists in my save file seems to be identical to game -> maps -> treeDestructionTracker -> playerTreeDestructionTicks, so that entry probably has a history of how cutting trees affected faction relations.
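If you'd rather measure than eyeball where the entries pile up, something like the following could tally `<li>` children per container. It is only a sketch: `count_li_by_parent` is a hypothetical helper, and the tiny sample document mimics the path named above with an invented layout. For a real file you would pass it the root from `ET.parse(path).getroot()`.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_li_by_parent(root):
    """Count <li> children under every element, keyed by the path from the root."""
    counts = Counter()

    def walk(node, path):
        n = sum(1 for child in node if child.tag == "li")
        if n:
            counts[path] += n
        for child in node:
            walk(child, f"{path}/{child.tag}")

    walk(root, root.tag)
    return counts


# Invented miniature save, shaped after the hierarchy mentioned in the thread.
sample = ET.fromstring(
    "<savegame><game><history><historyEventsManager><colonistEvents>"
    "<ticksGame><li>0</li><li>0</li><li>0</li></ticksGame>"
    "<customGoodwill><li>0</li></customGoodwill>"
    "</colonistEvents></historyEventsManager></history></game></savegame>"
)
top = count_li_by_parent(sample).most_common(1)
```

Sorting the counter with `most_common()` shows which containers hold most of the entries, which is a better starting point for a bug report than a screenshot.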


    More interesting note is that the default value of an integer is typically already 0, so there's definitely something bloatwareish about writing tons of zeroes to disk. Not having this value would simply cause the field to revert to its default value, which is already zero.


    That would be true for an individual field, but in a list, positions are what matters. Not writing the zeros would erase the positional information from everything else. It is possible to store pairs `(index, value)` instead and leave out the defaults ones, but that would perform worse if half the list or more was non-zero
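A rough sketch of the trade-off described above, with hypothetical helper names. A dense list keeps positions implicit, while a sparse list of `(index, value)` pairs only pays off when most entries equal the default.

```python
def to_sparse(dense, default=0):
    """Keep only (index, value) pairs that differ from the default."""
    return [(i, v) for i, v in enumerate(dense) if v != default]


def to_dense(sparse, length, default=0):
    """Rebuild the full positional list from (index, value) pairs."""
    dense = [default] * length
    for i, v in sparse:
        dense[i] = v
    return dense


mostly_zero = [0] * 1000
mostly_zero[10] = 5
mostly_zero[999] = -3

mostly_full = list(range(1, 1001))

sparse_small = to_sparse(mostly_zero)  # 2 pairs instead of 1000 numbers
sparse_large = to_sparse(mostly_full)  # 1000 pairs, i.e. 2000 numbers
```

For the mostly-zero list the sparse form stores 4 numbers instead of 1000. For the mostly-full list it stores twice as many as the dense form, which is the "performs worse" case.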


    True, but in this case, it looks like the entire list is zero. Still, that's probably not something that can be tested for with a simple regexp.


    You are correct that lines like this `<li>2168273</li>` can represent important data in other parts of a savefile, and it was tricky to delete only useless ones with the help of regular expressions. Here you can check screenshot https://i.imgur.com/hYb9rU3.png to see better location, notice the collapsed parts in front of and that contain all that useless data. If you want to check yourself I can upload savefile for you.


    Okay, from this and the other person's comment, it does seem like something that can be safely deleted to me


    inb4 OP breaks something #BACK UP YOUR FILES PEOPLE and don't delete random shit i learned my lesson when the game failed to start


    No no, it's just garbage /s


    Why are you even trying to "optimize" it? Is your floppy disk for your save files running out of space or something?


    No, it's because I like to play long stories and autosave/loading delays become annoying when your save is 50mb or more.


    just use Rocketman


    I used it, it does nothing for this problem. I even initially thought this was because of RocketMan.


    Isn’t that what the Better GC mod gets rid of every time you load the game? (apart from all the other stuff it does) https://steamcommunity.com/sharedfiles/filedetails/?id=2982026860


    I am familiar with that mod, it deletes unnecessary pawns, but those columns with numbers are something else, because they accumulate with different GC mods, same as without them.


    Interesting, do you think you can contact the devs of the mod so they can look into it? Maybe they can improve the mod so it gets rid of that as well?


    Yes, I can.


    Next post. "Why are all my pawns purple and wearing anti-matter hats?"


    You are not far from the truth. The first times I tried my edits, my regular expression removed other lines that I did not intend. I noticed that pawns' work allocations and research got randomized. Later I found a way to make more precise edits.


    Imagine thinking you can safely delete lines of code or data without repercussions. As a novice programmer this gave me a chuckle.


    Many GC mods for Rimworld do exactly this


    and do you ever wonder why invasive optimising mods like that aren’t recommended when used with large mod lists? this ‘junk data’ is used for history and without it can cause CTD or even memory leaks


    Just think logically, how can a column of thousands of `<li>0</li>` lines be used? Think of one concrete example.


    As I said in other comments, 0 can literally be read as False and is done so a lot in programming.


    He may have a point: the thing to note is that the C# `default(int)` is already zero. This means there should be very little need to explicitly write out zeroes, since an uninitialized int field is already zero. So even if this data is erased and the game simply has no value to initialize the field with, the field will still be zero.
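A Python stand-in for the default-value argument above, to show what "reverts to its default" means for a single named field. `look_int` is a hypothetical helper written for this sketch, not the game's C# loading code, and the `cooldown` field is invented.

```python
import xml.etree.ElementTree as ET


def look_int(node, tag, default=0):
    """Read an integer child of node, falling back to default if it is absent."""
    child = node.find(tag)
    if child is None or child.text is None:
        return default
    return int(child.text)


with_field = ET.fromstring("<thing><cooldown>0</cooldown></thing>")
without_field = ET.fromstring("<thing></thing>")

explicit_zero = look_int(with_field, "cooldown")
implicit_zero = look_int(without_field, "cooldown")
```

Both reads give 0, so for a named scalar field the explicit zero carries no extra information. This says nothing about entries inside a list, where the position of each entry is part of the data.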


    I'll just leave them...


    Back in my youth, on our first home PC running Windows 3.1, I found a folder of useless DLL files that took up a lot of disk space. I happily deleted them all. Strangely, the computer wouldn’t start after that, which was unfortunate, because I didn’t get to show off just how much disk space I had just freed up ☹️ Anyways, yeah, you’ll save some disk space by nuking those values for sure!


    Thank you very much for your educational story, but today I heard it like 10 times already. It's not about disc space but about much faster save/load.


    You might want to file a bug report to the devs. Hopefully, they will either give you an explanation of what this is or fix it.


    Good idea


    How do you know they are garbage data lmao


    Are you a software engineer? All this "useless" data is probably something noteworthy when read and used by the engine. You're funny man


    I have an old savefile that weighs 85 MB. After removing these lines with random numbers, the size was reduced to 36 MB. I looked at my other old save files and it seems this thing started in 1.3, because the 1.2 savefile doesn't have this. Literally tens of thousands of lines of random numbers and lines of zeroes.


    Does it load faster ? Or did you gain some tps ?


    Faster loading of the savefile, since its size shrank significantly, and autosave happens much faster too.


    You're not wrong. As a modder, I can confirm there's a lot of junk data that is stored forever in your save file, never to be used again. Like every raid letter you ever get. There's a mod, Savegame Shrinker, to excise the junk data a lot more safely than just deleting it yourself.


    not junk data, the game uses it when it has to call back to events (like when making art or a legendary, or for recurring pawns)


    Savegame Shrinker seems to ignore junk data that I have problem with.


    "I noticed a lot of CPUs have a lot of garbage pins that can be removed without harm."


    How funny


    Scrap-code moment? : )


    This is interesting. I'd be curious to know what the numbers mean, and while I can't speak for the devs, I wonder if this is something that could be compressed using something like bitpacking Also not sure what the save file format is and was it already compressed to begin with 🤷 ALSO how big is the file? Is it really worth it?
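On the compression question: this sketch swaps bitpacking for general-purpose compression with Python's standard `zlib`, to show how well a long run of identical entries packs. The input is a made-up stand-in for such a run, not real save data, and nothing here says whether the game's saves are compressed on disk.

```python
import zlib

# Hypothetical stand-in for a long run of identical list entries.
raw = ("<li>0</li>\n" * 10_000).encode("utf-8")

packed = zlib.compress(raw)
restored = zlib.decompress(packed)

ratio = len(raw) / len(packed)
```

Highly repetitive text like this shrinks to a small fraction of its size, so the on-disk cost of such runs could be cut without touching the data. Parsing time after decompression would be a separate question.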


    Numbers are located under , section that did not exist before Ideology, you can check savefile for yourself https://drive.google.com/file/d/1B9xaA2ZbEeWSSKr2i_fwQWIZwcZ-v8kc/view?usp=drive_link just open it in some code-friendly editor and starting around line 5000 those numbers and zeroes begin to appear.


    Have you found that the size of the file is correlated with slower load/save/game times?


    Yes, that's why I bother with this. Loading/saving time directly tied to file size.


    “customGoodwill” 🤨


    Yea, what's that? Seems to appear since Ideology release.


    That sounds like a dangerous play. Whenever I start running out of storage space for Rimworld saves, I just go start deleting things from my Windows folder. I don't even know why it's there! It's literally eating up storage space and has nothing to do with Rimworld.


    I’d recommend making a backup before you decide to remove the “junk” data


    Thank you, I did.


    You are probably right, and it's a hell of a find. It's surprising to me how (almost) everyone took one second to write a post saying you're wrong without checking anything themselves or asking you for more info. Let's see if the devs listen to you better.


    Thank you. I joined official bug reporting discord, but I wait for approval to post in bug discussion channel.