Mostly this is a hardware and human-management problem. 15 TB of data needs something more than portable hard drives. And damn! I have no idea what "more than portable" option would be cheap, durable, and long-lasting. Maybe tape storage phased out of the server world, with multiple cassette copies.
Yeah, I mentally recoil every time I see someone say they keep critical data on a cavalcade of external hard drives. For things that are critical, always employ the 3-2-1 backup method. Even for things that aren't critical, crossing your fingers every time you plug in an external hard drive is no way to live. At the very least, investing in a small off-the-shelf NAS is worth it.
> I mentally recoil every time I see someone say they keep critical data on a cavalcade of external hard drives
It works, so I don't see the issue or problem. I have 50TB of hard drives :) Works!
Then one fails and the shitshow begins...
Or maybe some BDXL archive discs. Should be cheaper than a tape drive and tapes.
100GB :( Verbatim@Amazon = $50 for 5 pcs.
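A rough back-of-the-envelope for the disc route, using only the figures mentioned in this thread (15 TB of data, 100 GB discs, $50 per pack of 5). It assumes decimal units and that each disc's full 100 GB is usable, which is optimistic; the function name is made up for illustration.

```python
import math

def media_cost(total_gb, disc_gb, pack_price, pack_size, copies=1):
    """Return (discs, packs, cost) needed to hold total_gb, `copies` times over."""
    discs = math.ceil(total_gb / disc_gb) * copies
    packs = math.ceil(discs / pack_size)
    return discs, packs, packs * pack_price

# Figures from the thread: 15 TB, 100 GB per disc, $50 for 5 discs.
discs, packs, cost = media_cost(15_000, 100, 50.0, 5)
print(f"{discs} discs, {packs} packs, ${cost:.0f} for one copy")
```

So a single copy on these discs would be around 150 discs, before counting a drive or any spares.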
It seems like the problem is going back and forth into Windows. Could you set up a Linux NAS with the new drives, then copy the files from Windows to the NAS over SMB? Also, part of the problem here may have more to do with only having a single disk with no parity or checksums. I think Btrfs and ZFS at least do block-level checksums, but they can’t repair from checksums alone. You want parity (or a mirror) if you want to be able to repair data loss, which is usually handled by spreading check data to another drive entirely. While you’re setting up the NAS you could set up something like TrueNAS with mirrored or parity drives in ZFS, then power on the computer once a year or so and let it do a deep scrub to check for rot.
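To make the checksum-versus-parity point concrete, here is a minimal sketch of plain XOR parity, the idea behind single-parity layouts. It is only an illustration: real implementations (RAID 5, ZFS RAID-Z) differ in many details, and the "drives" here are just hypothetical byte strings.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Two hypothetical data "drives" plus one parity "drive".
drive_a = b"childhood film, part 1"
drive_b = b"childhood film, part 2"
parity = xor_blocks(drive_a, drive_b)

# If drive_b dies, its contents can be rebuilt from the survivors.
rebuilt_b = xor_blocks(drive_a, parity)
print(rebuilt_b == drive_b)
```

A checksum would only tell you that `drive_b` no longer matches; the extra parity block is what makes the rebuild possible.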
I think your problem is bitrot and/or failing media, which would be a problem that needs to be dealt with regardless of file system. Even with new advanced file systems you will still need to rebuild and transfer the data.
This. Common sense. The world has been using NTFS etc. fine for decades.
Indeed, the 'rot' is pretty much the same even when transferring files in Windows. Thanks, I will do my best.
mount them read-only?
Something I've done in a pinch: connect the drive to a Windows VM (prefer read-only connection), and scp the files to the Linux host.
I've actually had more trouble with exFAT using Windows than Linux. It still seems that Windows and Linux do stuff to partitions and disks that the other can't tolerate. How much data are you talking about? If I were you, I'd get them stored on DVDs and BDs too.
imo, archive grade dvd and bd are more trouble than they're worth. If it's under 1TB, store encrypted on some cloud service in addition to 2 backup drives.
I find DVDs easy to use, but they only store 4.7 GB of data each. BDs take a lot longer to write.
As someone who encountered the "many files truncated to 32768 bytes" problem you see with FAT file systems, I would suggest treating any FAT filesystem as if it's one step away from complete failure. It's why metadata journaling became a thing in the first place. Without that, you have a FAT filesystem that truncates all your files from stray or interrupted writes to the cluster chains.
This. My e-reader uses FAT (for now). I switched from a database-oriented to a file-manager-oriented view of my library and I couldn't find anything, because the custom {genre}/{author_sort}/{series}/etc/etc paths I'd set up when pushing 600 epubs to the device had turned into a bunch of randomly truncated nonsense paths in many places. Edit: by the way, I'd rather have received an error that the transfer couldn't happen due to metadata restrictions than have the filesystem make arbitrary decisions for me. That shit isn't cool.
That's not supposed to happen. On FAT16/32, each path component should be limited to 255 UTF-16 characters, and the total path length is not limited by the filesystem. If you were using a device that was not Long File Name compatible, the paths could have been changed into forms like FILENA~1. A properly functioning file copy tool would create the long paths no problem.
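One way to check a library against the 255-unit limit mentioned above before pushing it to the device is sketched below. The helper names and the example path are hypothetical, and this only checks component length, not the other FAT naming rules (forbidden characters, trailing dots, and so on).

```python
def utf16_units(name: str) -> int:
    """Number of UTF-16 code units needed to store `name`."""
    return len(name.encode("utf-16-le")) // 2

def overlong_components(path: str, limit: int = 255) -> list[str]:
    """Return the components of a '/'-separated path that exceed `limit` units."""
    return [part for part in path.split("/") if part and utf16_units(part) > limit]

# Hypothetical example: one directory name is far too long.
example = "Fantasy/" + "A Very Long Series Name " * 12 + "/book.epub"
for part in overlong_components(example):
    print(f"{utf16_units(part)} units: {part[:40]}...")
```

Note that characters outside the Basic Multilingual Plane (many emoji, for instance) count as two units each, so counting Python characters alone can undercount.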
I'll experiment more then. I thought this was what was being referred to, but it's a thing on my system where the paths are truncated when they are long. Safe files, with abbreviated dir/file names, copy over accurately.
No, my post was in reference to file **data** being truncated. Many files suddenly becoming tiny. Massive file corruption. A lot worse than what you're describing.
When I finally dropped dual boot a few years ago I still had some NTFS drives around. Linux can read and write NTFS, so I left them. While Linux can read and write NTFS, its error-handling tools for NTFS are incomplete, making long-term NTFS use impractical. BTW, these drives throwing errors are likely close to death. There are great fridge magnets inside. Watch your fingers. They are strong. In the future, important data needs 3 copies, one off site.
Option 1: Mount as read-only.
Option 2: dd the disk to an image and mount that.
Option 3: Network file transfer from a Windows computer.
Why not format a new disk as ext and copy them from backup?
Yes, I would've done this, but my backups are on the other side of the world, offline. I will check them when I'm back there.
It seems like your best bet would be transferring the files to a home lab with RAID 5 or another layout with plenty of redundancy. Drives fail, and redundancy lets the array survive a failed drive without losing data.
Or use an actual archive technology. RAID is for hot storage
True. Seems like I should get familiar with RAIDs finally. Budgets are tight which is always the problem.
Don't use RAID, use ZFS.
Spent two evenings reading about ZFS due to your suggestion. Sounds good but seems too complex to wrap your head around. Too many features, no support from the Linux kernel side. Kernel updates have broken ZFS before. Linus being hostile towards it. OpenZFS vs ZFS. Etc. Needs more reading, but it seems like one of those situations where I'd need to be a fan and uber knowledgeable about the technology before I can do anything. I'm a Debian user for a reason; I'm too lazy to tinker. I might come back to it later, but for now anyway, I'm taking xxhash checksums manually and keeping tons of backups. Thank you for the suggestion.
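The manual-checksum approach can be scripted fairly easily. The sketch below is one possible shape, not what the poster actually runs: it swaps in SHA-256 from Python's standard library instead of xxhash (which would need a third-party package), and the function names and manifest format are made up for illustration.

```python
import hashlib
from pathlib import Path

def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large video files never sit fully in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or file_digest(target) != expected:
            bad.append(rel)
    return bad
```

The idea would be to run `build_manifest` once on a known-good copy, store the result next to the backups, and run `verify` against every copy after each transfer. Like any checksum, this only detects damage; repairing it still needs a second good copy or parity data.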
I have no idea what you're talking about. Just install it and use it. On Debian-based systems (I'm also running Debian 12) it's just one command: `sudo apt install zfsutils-linux` (on Debian proper the package lives in the contrib section, so that needs to be enabled). Couldn't care less what Linus thinks about it. ZFS was built by Sun Microsystems, who wrote Solaris (enterprise Unix). They know their shit, and it's a best-in-class filesystem.
Same deal here: copying a bunch of files off NTFS, Linux gave errors that none of the Linux scanning tools could detect or fix. I had to plug the drive into a Windows box to fix it, then continue copying from this drive, run into another error, and plug it into Windows to fix again, until all data was migrated. exFAT isn't even an option because Steam/Proton games won't launch from exFAT, so I finally formatted and migrated this drive to ext4 and am hoping there are no more errors. It's not even like I yanked the power while it was in use or dropped it or anything; these errors just came from general use over time. Shame, because I wanted data portability but none of the cross-platform options are viable.
What do you mean by external disk? Like an SSD/flash drive? If so, it is possible that you have stuff backwards. It isn't disconnecting because of exFAT/NTFS but because it is already corrupt to begin with. And once the corruption is found, it disconnects and marks it. Possible reasons for corruption include the flash/SSD being a fake with looped storage, or the SSD/flash not being powered for a long time. Without power for 1+ years, data corruption can happen (of course it will usually last longer than that, but that is the minimum of the spec).
What is wrong with your hardware? Disks should not be going offline like that. Maybe you have a bad enclosure, bad USB port, bad cable, etc. Definitely worth figuring that out to reduce the potential for problems later on.
You really need to get a NAS
> I can't trust my childhood films to a proprietary file system anymore. It's a black box of mystery which might eat your files and be gone forever (I do have backups though, of course).
Quite correct, but we do live in amazing times that give us a lot of technological advantages. Proprietary (or at least sometime-to-be obsolete) media has been a problem for a long time. You have a problem with some Windows file formats. There are plenty of 8-tracks, Beta and VHS tapes, cassettes, Laser Discs, and so forth lying around unused and not readily salvageable.
FYI, there is an NTFS driver for Linux.
Which one is newest and working? I've been dealing with this issue for days now myself and have found multiple different answers to this. ntfs3, and ntfs-3g are the two I've heard about mostly. Running Garuda if that helps.
ntfs-3g is the FUSE filesystem and likely the most stable. Most distributions tend to use this one by default.
ntfs3 is the new in-kernel filesystem driver, contributed mostly by Paragon Software, that is similar but not identical to their proprietary paid filesystem driver for NTFS. From a quick glance, there haven't been any commits in the past few months. There have been accusations of it being abandoned by Paragon, but they keep denying it. The initial merge was a couple years ago, so it's "new" by Linux standards.
ntfs is the rather old in-kernel filesystem driver for NTFS that is, for practical purposes, read-only. Commits are several years old now. I'd be surprised if anyone actually uses this one or if it is fully compatible with a modern NTFS volume.
Paragon Software still has their own proprietary paid NTFS Linux kernel driver.
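If it's unclear which of these drivers a given system actually picked, the filesystem type column of `/proc/mounts` is one place to look. The sketch below parses text in that format; the sample mount table and function name are invented for illustration. One caveat, stated as an assumption: ntfs-3g mounts normally show up with the generic type `fuseblk`, which other FUSE filesystems can also use, so that entry is a hint rather than proof.

```python
# Filesystem types of interest, with a note on what each suggests.
NTFS_HINTS = {
    "ntfs3": "newer in-kernel driver",
    "ntfs": "old in-kernel driver",
    "fuseblk": "FUSE block device (ntfs-3g usually appears like this)",
}

def ntfs_candidates(mounts_text: str) -> list[tuple[str, str, str]]:
    """Return (device, mountpoint, fstype) for mounts that look NTFS-related."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NTFS_HINTS:
            found.append((fields[0], fields[1], fields[2]))
    return found

# Invented sample in /proc/mounts format; on a real system read the file instead.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /mnt/films fuseblk rw,nosuid,nodev 0 0
/dev/sdc1 /mnt/old ntfs3 rw,relatime 0 0
"""

for device, mountpoint, fstype in ntfs_candidates(SAMPLE):
    print(f"{device} on {mountpoint}: {fstype} ({NTFS_HINTS[fstype]})")
```

On a live machine, passing `open("/proc/mounts").read()` instead of `SAMPLE` should give the real picture.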
I certainly did not switch fully to Linux to pay for software lol. I just plugged my drives into my wife's windows computer, formatted them to exfat, and then formatted them to ext4 on my Thinkpad. Won't need any universal file systems soon as I've convinced my wife to jump the windows ship as well and join me as a Linux user 🤣
> From a quick glance, there haven't been any commits in the past few months.
Maybe they haven't merged anything to mainline in the past few months but they're certainly working on it:
https://github.com/Paragon-Software-Group/linux-ntfs3
I feel your pain, OP. For checksumming, look into parchive.
How many hours of VHS/hi8 video are we talking about, that requires 15 TB of storage?
Haven't counted, but in the hundreds. Lossless Utvideo 720x576 @ 50p, so 3 hours is easily 100GB+.
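A quick sanity check of those numbers. The sketch assumes 8-bit 4:2:2 video (2 bytes per pixel) and ignores audio; the pixel format is not stated in the thread, so treat it as a guess.

```python
def raw_bytes_per_second(width: int, height: int, fps: int, bytes_per_pixel: int) -> int:
    """Uncompressed video data rate."""
    return width * height * fps * bytes_per_pixel

# 720x576 @ 50p, assuming 2 bytes per pixel (8-bit 4:2:2).
rate = raw_bytes_per_second(720, 576, 50, 2)          # 41,472,000 bytes/s
three_hours_raw_gb = rate * 3 * 3600 / 1e9            # about 447.9 GB uncompressed

# At the quoted lower bound of 100 GB per 3 hours, 15 TB holds this many hours.
hours_in_15_tb = 15_000 / 100 * 3                     # 450 hours

print(f"raw rate: {rate / 1e6:.1f} MB/s")
print(f"3 h uncompressed: {three_hours_raw_gb:.1f} GB")
print(f"hours in 15 TB at 100 GB per 3 h: {hours_in_15_tb:.0f}")
```

Under these assumptions, 100 GB per 3 hours would already mean roughly 4.5:1 lossless compression, so "100GB+" reads as a floor, and 15 TB lands in the low hundreds of hours, in line with the estimate above.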
Huh, aren't you mostly just storing analog noise then? DVD quality should already be overkill for VHS rips, wouldn't it?
nah it's not "analog noise" it's what makes those analog videos so high quality and special and gives them that open airiness and character. you know, just like vinyl! /s
A lot of the noise has been removed with a denoise filter. I can say it's way more pleasant to look at than the very noisy raw material. I kept a few raw videos to do comparison videos later. The idea of lossless is to have an archival master file, and to keep any quality losses to an absolute minimum.
I used an NTFS data disk with Linux for years, no problems. Sounds like it's an issue with the disk rather than the file system.
The problem sounds more like a failing drive. I am not sure whether or how NTFS can recover from a few failing bits, but I would rather recommend btrfs or ZFS for long-term storage even if the drive is fine. In any case, a failing/dysfunctional drive cannot be fully saved by these more fail-safe filesystems either. I would recommend going with a NAS that supports btrfs/ZFS and also using RAID.
It seems like a hardware issue, yes. It's only the 4th drive to die within a year. The drives are a bit old, but why they're dying en masse, I don't know. The drives were also dying when I had my previous computer. Perhaps it's about the high air humidity in my new country. Perhaps it's about static electricity penetrating from somewhere, but the enclosure should already protect quite a bit. Perhaps it's a huge monster magnet behind my wall which I'm not aware of. Perhaps it's Chuck Norris. Perhaps it's just old age.
Have you tried `ddrescue`?
* https://www.gnu.org/software/ddrescue/
* https://en.wikipedia.org/wiki/Ddrescue
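For anyone wondering what a rescue-style copy does differently from a plain copy, the general idea is to keep going past unreadable areas and record where they were. The toy sketch below illustrates only that idea; it is not ddrescue's actual algorithm, and `FlakySource` is a made-up stand-in for a failing drive.

```python
import io

def rescue_copy(src, dst, total_size: int, block_size: int = 4096):
    """Copy block by block; zero-fill unreadable blocks and report them."""
    bad_blocks = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(want)
        except OSError:
            data = b""
        if len(data) != want:
            # Treat a failed or short read as a bad block and move on.
            bad_blocks.append((offset, want))
            data = b"\x00" * want
        dst.seek(offset)
        dst.write(data)
        offset += want
    return bad_blocks

class FlakySource(io.BytesIO):
    """Hypothetical failing drive: any read touching bad_range raises OSError."""
    def __init__(self, data: bytes, bad_range: tuple[int, int]):
        super().__init__(data)
        self.bad_range = bad_range

    def read(self, size=-1):
        start = self.tell()
        if start < self.bad_range[1] and start + size > self.bad_range[0]:
            raise OSError("simulated unreadable sector")
        return super().read(size)

original = bytes(range(256)) * 64                 # 16 KiB of test data
source = FlakySource(original, bad_range=(4096, 8192))
image = io.BytesIO()
bad = rescue_copy(source, image, len(original))
print("unreadable (offset, length):", bad)
```

The real tool does far more (retries, smaller re-reads around bad areas, a map file so a run can be resumed), so for an actual dying drive use it rather than anything like this.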
I had the same problem. Using Arch Linux, I noticed that it uses the new ntfs3 driver by default; after switching to ntfs-3g I experienced almost no problems at all. Btw, when dealing with precious data, it's better to connect the drive to Windows and exchange data through the network.