• 1 Post
  • 9 Comments
Joined 2 years ago
Cake day: June 12th, 2023

  • Sure thing - one thing I’ll often do for stuff like this is spin up a VM. You can throw 4x1GiB virtual drives in it and play around with creating and managing a RAID using whatever you like. You can try out md, ZFS, and BTRFS without any risk - even Unraid.

    Another variable to consider: different RAID systems differ in how flexibly they let you reshape the RAID, for example if you want to add a disk later or swap out old drives for bigger ones to increase space. It’s yet another rabbit hole to go down, but something to keep in mind. Once you’re talking about tens of terabytes of data, you run out of places to temporarily put it all if you need to recreate the RAID to change its layout. :-)
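
    If you’d rather not build a whole VM, a rough alternative (my substitution, not the VM approach above) is to use sparse image files as stand-in disks. This is only a sketch: `make_disks` is a made-up helper name, the loop device names in the comments are assumptions (`losetup --find --show` prints whichever device it actually picked), and the root-only steps are left as comments so nothing touches real disks.

    ```shell
    #!/bin/sh
    # make_disks DIR COUNT SIZE
    # Create COUNT sparse image files in DIR, each SIZE big (e.g. 1G = 1 GiB).
    make_disks() {
        dir=$1; count=$2; size=$3
        mkdir -p "$dir" || return 1
        i=0
        while [ "$i" -lt "$count" ]; do
            truncate -s "$size" "$dir/disk$i.img" || return 1
            i=$((i + 1))
        done
    }

    # The remaining steps need root, so they are shown as comments only.
    # Attach each image to a loop device (prints the device it chose):
    #   losetup --find --show DIR/disk0.img
    # Build a 4-device RAID5 out of the loop devices (names assumed):
    #   mdadm --create /dev/md0 --level=5 --raid-devices=4 \
    #       /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
    #   cat /proc/mdstat
    # Reshape practice: add a fifth device, then grow the array onto it.
    #   mdadm --add /dev/md0 /dev/loop4
    #   mdadm --grow /dev/md0 --raid-devices=5
    ```

    Something like `make_disks ~/raidlab 5 1G` gives you four "drives" plus a spare to practice adding one later. The files are sparse, so they take almost no real space until you write to them.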


  • Yeah - that’s fair. I may have oversimplified a tad… The concepts behind RAID, the theory, implementations, etc. are pretty complicated. And there are many tools that do “raid-like-things” with many options about raid types… So the landscape has a lot of options.

    But once you’ve made a choice, the actual “setting it up” is usually pretty simple, and there’s no real ongoing support or management you need to do beyond basic health monitoring, which you’d want to do even without a RAID (e.g. smartd). Any Linux system can create and use a RAID - you don’t need anything special like Unraid. My old early-to-mid-2010s Debian box manages a RAID with NFS just fine.

    If you decide you want a RAID you first decide which “level” you want before talking about any specific implementations. This drives all of your future decisions including which software you use. This basically focuses on 2 questions - how much budget do you have and what is your fault tolerance?

    e.g. I have a RAID5 because I’m cheap and wanted the biggest bang-for-the-buck with some failure resiliency. RAID5 lets me lose one drive and recover, and I get the storage space of N-1 drives (one drive’s worth of space goes to redundancy). Minimum size for a RAID5 is 3 drives. Wikipedia lists the standard RAID levels, which are “basically” standardized even though implementations vary.

    I could have gone with RAID6 (minimum 4 disks), which can survive a 2-drive outage. I have off-site backups, so I’ve decided that the low probability of a 2-drive failure means this option isn’t necessary for me. If I’m that unlucky I’ll restore from BackBlaze. In 10+ years of managing my own fileserver I’ve never had more than 1 drive fail at a time. I’ve definitely had drives fail though (replaced one 2 weeks ago - was basically a non-issue to fix).

    Some folks are paranoid and go with RAID1 and friends (RAID1, RAID10, etc.) which involves basically full duplication of drives. Very safe, very expensive for the same amount of usable storage. But RAID1 can work with a minimum of 2 drives. It just mirrors them so you get half the storage.
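
    To make the budget-vs-fault-tolerance trade-off concrete, here’s a small sketch that prints how many drives’ worth of usable space you get for a given level and drive count. `usable_drives` is a made-up helper, it assumes all drives are the same size, and for RAID10 it assumes an even number of drives in plain mirrored pairs. Real implementations can differ in the details.

    ```shell
    #!/bin/sh
    # usable_drives LEVEL N
    # Print how many drives' worth of space is usable with N equal-size drives.
    usable_drives() {
        level=$1; n=$2
        case "$level" in
            1)  min=2; usable=1 ;;            # every drive is a full mirror
            5)  min=3; usable=$((n - 1)) ;;   # one drive's worth of parity
            6)  min=4; usable=$((n - 2)) ;;   # two drives' worth of parity
            10) min=4; usable=$((n / 2)) ;;   # mirrored pairs, then striped
            *)  echo "unsupported level: $level" >&2; return 2 ;;
        esac
        if [ "$n" -lt "$min" ]; then
            echo "RAID$level needs at least $min drives" >&2
            return 1
        fi
        echo "$usable"
    }

    usable_drives 5 4    # prints 3
    usable_drives 6 4    # prints 2
    ```

    So with four 4TB drives, RAID5 gives roughly 12TB usable and RAID6 roughly 8TB, before filesystem overhead.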

    Next the question becomes - what RAID software to use? There are lots of options here, and this is where things can get confusing. Many people have become oddly tribal about it as well. There’s the traditional Linux “md” RAID, which I use, that operates underneath the filesystems. It basically takes my 4 disks and creates a new block device (/dev/md0) where I create my filesystems. It’s “just a disk” as far as the OS is concerned, so you can put anything you want on it - I do LVM + ext4. You could put BTRFS on it, ZFS, etc.
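
    For a feel of how the md -> LVM -> ext4 layering stacks up, here’s a sketch that only prints the commands rather than running them (they need root and will wipe whatever is on the disks). The device names, the volume group `vg0`, and the logical volume `data` are placeholders I made up, not my actual layout.

    ```shell
    #!/bin/sh
    # print_md_stack DEV...
    # Print (not run) the commands for an md RAID5 with LVM and ext4 on top.
    print_md_stack() {
        # 1. md combines the given disks into one block device, /dev/md0.
        echo "mdadm --create /dev/md0 --level=5 --raid-devices=$# $*"
        # 2. LVM treats /dev/md0 as "just a disk" and carves volumes from it.
        echo "pvcreate /dev/md0"
        echo "vgcreate vg0 /dev/md0"
        echo "lvcreate -n data -l 100%FREE vg0"
        # 3. The filesystem goes on the logical volume.
        echo "mkfs.ext4 /dev/vg0/data"
    }

    print_md_stack /dev/sdb /dev/sdc /dev/sdd /dev/sde
    ```

    Each layer only sees the block device below it, which is why you could swap ext4 for another filesystem without touching the RAID.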

    These days the trend is to let the filesystem handle your disk pooling rather than using a separate layer. BTRFS can create a RAID itself (though it cautions against its RAID5 mode), and so can ZFS. These filesystems basically build the functionality I get from md and LVM into the filesystem itself.

    But there are also tools like Unraid that will provide a nice GUI and handle the details for you. I don’t know much about it though.








  • A fairly common setup is something like this:

    Internet -> nginx -> backend services.

    nginx is the HTTPS endpoint and has all the certs. You can manage the certs with Let’s Encrypt on that system. This box now handles all HTTPS traffic to and within your network.

    The more paranoid will have parts of this setup all over the world, connected through VPNs so that “your IP is safe”. But it’s not necessary and costs more. Limit your exposure, ensure your services are up-to-date, and monitor logs.

    fail2ban can give some peace-of-mind for SSH scanning and the like. If you’re using keys (or certs) to authenticate rather than passwords, though, you’ll be okay either way.

    Update your servers daily. Automate it so you don’t need to remember. Even a simple “doupdates” script that just does “apt-get update && apt-get -y upgrade && reboot” will be fine (the -y keeps it from waiting on a prompt, and you can make it smarter about when it needs to reboot). Have its output mailed to you so that you see if there are failures.
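
    A slightly smarter version of that script might look like the sketch below. It assumes a Debian/Ubuntu box where the `/var/run/reboot-required` flag file gets created after upgrades that need a reboot (that depends on which packages you have installed), and the install path and mail address in the cron comment are placeholders.

    ```shell
    #!/bin/sh
    # needs_reboot [FLAG_FILE]
    # Succeed if the reboot flag file exists (default path is an assumption).
    needs_reboot() {
        [ -f "${1:-/var/run/reboot-required}" ]
    }

    # do_updates: refresh package lists, upgrade, reboot only when flagged.
    do_updates() {
        # Keep apt from stopping to ask questions when run unattended.
        export DEBIAN_FRONTEND=noninteractive
        apt-get update && apt-get -y upgrade || return 1
        if needs_reboot; then
            reboot
        fi
    }

    # Example root crontab entry; cron mails any output to MAILTO:
    #   MAILTO=you@example.com
    #   0 4 * * * /usr/local/sbin/doupdates
    ```

    To use it, add a final line that calls `do_updates` and install the file wherever the crontab entry points. Cron only sends mail when the job prints something, so a quiet run means nothing to read.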

    You can register a cheap domain pretty easily, and then you can sub-domain the different services. nginx can point “x.example.com” to backend service X and “y.example.com” to backend service Y based on the hostname requested.
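
    As a rough illustration of the hostname-based routing, here’s a sketch that prints one nginx server block per subdomain. `vhost` is a made-up helper, the backend addresses and ports are placeholders, and the cert paths assume a single Let’s Encrypt cert for example.com that covers both subdomains. Adjust all of that to your own setup.

    ```shell
    #!/bin/sh
    # vhost HOSTNAME BACKEND_URL
    # Print an nginx server block that proxies HOSTNAME to BACKEND_URL.
    vhost() {
        printf 'server {\n'
        printf '    listen 443 ssl;\n'
        printf '    server_name %s;\n' "$1"
        printf '    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;\n'
        printf '    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;\n'
        printf '    location / {\n'
        printf '        proxy_pass %s;\n' "$2"
        # Pass the requested hostname through to the backend.
        printf '        proxy_set_header Host $host;\n'
        printf '    }\n'
        printf '}\n'
    }

    vhost x.example.com http://127.0.0.1:8081
    vhost y.example.com http://127.0.0.1:8082
    ```

    nginx picks the server block whose `server_name` matches the requested hostname, so one box and one IP can front as many services as you like.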