Kebe Says - Dan McDonald's Blog

Home Data Center 3.0 -- Part 2: HDC's many uses

In the prior post, I mentioned a need for four active ethernet ports. These four ports are physical links to four distinct Ethernet networks. Joyent's SmartOS and Triton characterize these with NIC Tags. I just view them as distinct networks. They are all driven by the illumos igb(7d) driver (hmm, that man page needs updating) on HDC 3.0, and I'll specify them now:

  • igb0 - My home network.
  • igb1 - The external network. This port is directly attached to my FiOS Optical Network Terminal's Gigabit Ethernet port.
  • igb2 - My work network. Used for my workstation, and "external" NIC Tag for my work-at-home Triton deployment, Kebecloud.
  • igb3 - Mostly unused for now, but connected to Kebecloud's "admin" NIC Tag.
The zones abstraction in illumos allows not just containment, but a full TCP/IP stack to be assigned to each zone. This makes a zone feel more like a proper virtual machine in most cases. Many illumos distros are able to run a full VMM as the only process in a zone, which ends up delivering a proper virtual machine. As of this post's publication, however, I'm only running illumos zones, not full VM ones. Here's their list:
(0)# zoneadm list -cv
  ID NAME             STATUS     PATH                           BRAND    IP    
   0 global           running    /                              ipkg     shared
   1 webserver        running    /zones/webserver               lipkg    excl  
   2 work             running    /zones/work                    lipkg    excl  
   3 router           running    /zones/router                  lipkg    excl  
   4 calendar         running    /zones/calendar                lipkg    excl  
   5 dns              running    /zones/dns                     lipkg    excl  
(0)# 
Their zone names correspond to their jobs:
  • global - The illumos global zone is what exists even in the absence of other zones. Some illumos distros, like SmartOS, encourage minimizing what a global zone has for services. HDC's global zone serves NFS and SMB/CIFS to my home network. The global zone has the primary link into the home network. HDC's global zone has no default route, so if any operations that need out-of-the-house networking either go through another zone (e.g. DNS lookups), or a defaut route must be temporarily added (e.g. NTP chimes, `pkg update`).
  • webserver - Just like the name says, this zone hosts the web server for kebe.com. For this zone, it uses lofs(7FS), the loopback virtual file system to inherit subdirectories from the global zone. I edit blog entries (like this one) for this zone via NFS from my laptop. The global zone serves NFS, but the files I'm editing are not only available in the global zone, but are also lofs-mounted into the webserver zone as well. The webserver zone has a vnic (see here for details about a vnic, the virtual network interface controller) link to the home network, but has a default route, and the router zone's NAT (more later) forwards ports 80 and 443 to this zone. Additionally, the home network DHCP server lives here, for no other reason than, "it's not the global zone."
  • work - The work zone is new in the past six years, and as of recently, eschews lofs(7FS) for delegated ZFS datasets. A delegated ZFS dataset, a proper filesystem in this case, is assigned entirely to the zone. This zone also has the primary (and only) link to the work network, a physical connection (for now unused) to my work Triton's admin network, and an etherstub vnic (see here for details about an etherstub) link to the router zone. The work zone itself is a router for work network machines (as well as serves DNS for the work network), but since I only have one public IP address, I use the etherstub to link it to the router zone. The zone, as of recent illumos builds, can further serve its own NFS. This allows even less global-zone participation with work data, and it means work machines do not need backchannel paths to the global zone for NFS service. The work zone has a full illumos development environment on it, and performs builds of illumos rather quickly. It also has its own Unbound (see the DNS zone below) for the work network.
  • router - The router zone does what the name says. It has a vnic link to the home network and the physical link to the external network. It runs ipnat to NAT etherstub work traffic or home network traffic to the Internet, and redirects well-known ports to their respective zones. It does not use a proper firewall, but has IPsec policy in place to drop anything that isn't matched by ipnat, because in a no-policy situation, ipnat lets unmatched packets arrive on the local zone. The router zone also runs the (alas still closed source) IKEv1 daemon to allow me remote access to this server while I'm remote. It uses an old test tool from the pre-Oracle Sun days a few of you half-dozen readers will know by name. We have a larval IKEv2 out in the community, and I'll gladly switch to that once it's available.
  • calendar - Blogged about when first deployed, this zone's sole purpose is to serve our calendar both internally and externally. It uses the Radicale server. Many of my complaints from the prior post have been alleviated by subsequent updates. I wish the authors understood interface stability a bit better (jumping from 2.x to 3.0 was far more annoying than it needed to be), but it gets the job done. It has a vnic link to the home network, a default route, and gets calendaring packets shuffled to it by the router zone so my family can access the calendar wherever we are.
  • dns - A recent switch to OmniOSce-supported NSD and Unbound encouraged me to bring up a dedicated zone for DNS. I run both daemons here, and have the router zone redirect public kebe.com requests here to NSD. The Unbound server services all networks that can reach HDC. It has a vnic link to the home network, and a default route.

The first picture shows HDC as a single entity, and its physical networks. The second picture shows the zones of HDC as Virtual Network Machines, which should give some insight into why I call my home server a Home Data Center.

HDC, physically HDC, logically

Home Data Center 2.0 - dogfooding again!

Over six years ago, I put together my first home data center (HDC), which I assembled around a free CPU that was given to me.

A lot has happened in those six years. I've moved house, been through three different employers (and yes, I count Oracle as a different employer, for reasons you can see here), and most relevant to this blog post - technology has improved.

My old home server was an energy pig, loud, and hitting certain limits. The Opteron Model 185 has a TDP of 110 watts, and worse, the original power supply in the original HDC broke, and I replaced it with a LOUD one from a Sun w2100z workstation. I also replaced other parts over the years as things evolved. What I ended up with at the start of 2014 was:
  • AMD Opteron Model 185 - No changes here.
  • Tyan S2866 - Same here, too.
  • 4GB of ECC RAM - Up from 2GB of ECC, to the motherboard's maximum. I tried at first with two additional GB of non-ECC, but one nightly build of illumos-gate where I saw a single-bit error in one built binary was enough to convince me about ECC's fundamental goodness.
  • Two Intel S3500 80GB SATA SSDs - I use these as mirrored root, and mirrored slog, leaving alone ~20GB slices (16 + 4) each. I'm under the assumption that the Intel disk controller will do proper wear-leveling, and what-not. (Any corrections are most appreciated!) These replace two different, lesser-brand 64GB SSDs that crapped out on me.
  • Two Seagate ST2000DL003 2TB SATA drives. - I bought these on clearance a month before the big Thailand flood that disrupted the disk-drive market. At $30/TB, I still haven't found as good of a deal, and the batch on sale were of sufficient quality to not fail me or my mirrored data (so says ZFS, anyway).
  • Lian Li case - I still like the overall mechanical design of this brother-in-law recommended case. I already mentioned the power supply, so I'll skip that here.
  • A cheap nVidia 8400 card - It runs twm on a 1920x1200 display, good enough!
  • OpenIndiana - After moving OpenSolaris from SVR4 to IPS, I used OpenSolaris until Oracle happened. OI was a natural stepping stone off of OpenSolaris.
I gave a talk on how I use my HDC. I'll update that later in this post, but suffice to say, between the energy consumption and the desire for me and my family to enable more services, I figured it was time to upgrade the hardware. With my new job at OmniTI, I also wanted to start dogfooding something I was working with. I couldn't use NexentaStor with my HDC, because of the non-storage functions of Illumos I was using. OmniOS, on the other hand, was going to be a near-ideal candidate to replace OpenIndiana, especially given its server focus.

As before, I started with a CPU for the system. The Socket 1150 Xeon E3 chips, which we had on one server at Nexenta (to help with the Illumos bring up of Intel's I210 and I217 ethernet chip, alongside Joyent and Pluribus), seemed an ideal candidate. Some models had low power draws, and they had all of the features needed to exploit more advanced Illumos features like KVM, if I ever needed it. I also considered the Socket 2011 Xeon E5 chips, but decided that I really didn't need more than 32GB of RAM for the forseeable future. So with that in mind, I asked OmniTI's Supermicro sales rep to put together a box for me. Here's what I got:
  • Intel Xeon E3 1265L v3 - This CPU has a TDP of 45 watts, that's 40% of the TDP of the old CPU. It clocks slightly slower, but otherwise is quite the upgrade with 4 cores, hyperthreading (looking like 8 CPUs to Illumos), and all of the modern bells and whistles like VT-x with EPT and AES-NI. It also is being used in at least one shipping illumos-driven product, which is nice to know.
  • Supermicro X10SLM-LN4F motherboard - This motherboard has four Intel I210 Gigabit ethernet ports on it. I only need two for now, thanks to Crossbow, but I have plans that my paranoia about separate physical LANs may require one or both of those last two. I'm using all four of its 6Gbit SATA ports, and it has two more 3Gbit ones for later. (I'll probably move the SSDs to the 3Gbit ones, because of latency vs. throughput, if I go to a 4-spinning-rust storage setup.) I've disabled USB3 for now, but if/when illumos supports it, I'll be able to test it here.
  • 32 GB of ECC RAM - Maxxed out now. So far, this hasn't been a concern.
  • Same drives as the old one - I moved them right over from the old setup. Installed OmniOS (see below), but basically did "zpool split", "zpool export" from the old server, and "zpool import" on the new one. ZFS again for the win!
  • Supermicro SC732D4 - The case, while not QUITE as cabling-friendly as the old Lian Li, has plastic disk trays that are an improvement over just screwing them in place on the Lian Li. The case comes standard with a four-disk 3.5" cage, and I added a four-disk 2.5" cage to mine. The 500W power supply seems to be an energy improvement, and is DEFINITELY quieter.
  • OmniOS r151010 - For my home server use, I'm going to be using the stable OmniOS release, which as of very recently became r151010. Every six months, therefore, I'll be getting a new OmniOS to use on this server. I haven't tried installing X or twm just yet, but that, and possibly printer support for my USB color printer, are the only things lacking over my old OI install.
I've had this hardware running for about two weeks now. It does everything the old server did, and a few new things.
  • File Service - NFS, and as of very recently, CIFS as well. The latter is entirely to enable scan-to-network-disk scanning. This happens in the global zone, on the "internal network" NIC.
  • Router - This is a dedicated zone which serves as the default router and NAT box. It also redirects external web and Minecraft requests (see below) to their respective zones. It also serves as an IPsec-protected remote access point. Ex-Sun people will know exactly what I'm talking about. It uses an internal vNIC, and a dedicated external NIC.
  • Webserver - As advertised. Right now it just serves static content on port 80 (www.kebe.com), but I may expand this, if I don't put HTTPS service in another zone later. This sits on an internal vNIC, and its inbound traffic is directed by the NAT/Router.
  • Minecraft - My children discovered Minecraft in the past year or so. Turns out, Illumos does a good job of serving Minecraft. With this new server, and running the processes as 32-bit ones (implicit 4Gig limit), I can host two Minecraft servers easily now. This sits on an internal vNIC as well.
  • Work - For now, this is just a place for me to store files for my job and build things. Soon, I plan on using another IPsec tunnel in the Router zone, an etherstub, and making this a part of my office, sitting in my house. Once that happens, I'll be using a dedicated NIC (for separation) to plug my work-issued laptop into.
  • Remote printing - I have a USB color printer that the global zone can share (via lpd). To be honest, I don't have this working on OmniOS just yet, but I'll get that back.
  • DHCP and DNS - Some people assume these are part of a router, but that's not necessarily the case. In this new instantiation, they'll live in the same zone as the webserver (which has a default route installed but is NOT the router). For this new OmniOS install, I'm switching to the ISC DHCP daemon. I hope to upstream it to omnios-build after some operational experience.
Not quite two weeks now, and so far, so good. My kids haven't noticed any lags in Minecraft, and I've built illumos-gate from scratch, both DEBUG and non-DEBUG, in less than 90 minutes. We'll see how DHCP holds up when Homeschool Book Club shows up with Moms carrying smartphones, tablets, and laptops, plus even a kid or two bringing a Minecraft-playing laptop as well for after the discussion.

Delegated ZFS, cloning, and SCM

Well THAT was a long break from blogging...

One of the things that's happened in the illumos community is a subtle shift of the main illumos source repository from being primarily Mercurial to being primarily Git. This means I've had to learn Git. At first, I wasn't sure why people were so rabidly pro-Git. I found one of the big reasons:


everywhere(~/ws)[0]% /bin/time git clone git-illumos git-illumos.copy
Cloning into git-illumos.copy...
done.

real 11.8
user 4.7
sys 3.2
everywhere(~/ws)[0]% /bin/time hg clone illumos-clone illumos-clone.copy
updating working directory
44332 files updated, 0 files merged, 0 files removed, 0 files unresolved

real 1:52.6
user 28.9
sys 25.4
everywhere(~/ws)[0]%

Wow! Yeah, I can see why this would appeal to people. I'm still using Mercurial in a fair amount of places, both for my illumos work and for Nexenta as well. I should show one other thing that both SCM cloning operations do: take up disk space.


everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 198G 100G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]% /bin/time git clone git-illumos git-illumos.copy

*** SNIP! ***

everywhere(~/ws)[0]% sync
everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 198G 99.6G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]% /bin/time hg clone illumos-clone illumos-clone.copy

*** SNIP! ***

everywhere(~/ws)[0]% sync
everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 199G 98.7G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]%

I believe Git will also take up less disk space, but still, that's approximately half a gig or more for an illumos workspace. If it's populated, say with a preinstalled proto area and compiled objects, that'll be even larger.

Consider one of the great strengths of ZFS: its copy-on-write architecture. Take a local, on-disk master repo, say one you're pulling directly from the source, and make it its own filesystem. Child/downstream workspaces from your on-disk master now can be created using low-latency ZFS operations. Only two problems need to be solved: non-privileged usage, and SCM correction to properly designate the parent/child or upstream/downstream relationship.

Another useful ZFS feature is administrative delegation. Put simply, an administrator can allow an ordinary user to perform selected ZFS primitives on a given filesystem, and its descendants in the ZFS filesystem tree. For example:


everywhere(~)[0]% zfs allow rpool/export/home/danmcd
everywhere(~)[0]% zfs allow rpool/export/home/danmcd/ws
---- Permissions on rpool/export/home/danmcd/ws ----------------------
Local+Descendent permissions:
user danmcd clone,create,destroy,mount,promote,snapshot
everywhere(~)[0]%

I (as root) delegated several permissions for a subdirectory of $HOME to me (as danmcd). From here, I can create new filesystems in ~/ws, as well as destroy them, clone them, mount, snapshot, and promote them. All of these are useful operations. The syntax for delegation is mostly straightforward: zfs allow -ld clone,create,destroy,mount,promote,snapshot rpool/export/home/danmcd/ws. The -ld flags enable local and descendant permission propagation.

First thing I did was zfs create rpool/export/home/danmcd/ws/illumos-clone, followed by hg clone ssh://anonhg@hg.illumos.org/illumos-gate illumos-clone. This populates my local Mercurial illumos repo. I can perform a similar operation with git. Per my above timing examples, I did so with git-illumos.

I wrote a script to clone, promote, and reparent Git and Mercurial workspaces using ZFS operations. It's called zclone and it's here for download. It's still a work in progress, and I'd like to maybe have it end up in usr/src/tools in illumos-gate someday. (I'll try and update this particular post as things evolve.)

Check out the times, and the disk space (not) used:


everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 198G 100G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]% /bin/time zclone git-illumos git-illumos.zc
Created rpool/export/home/danmcd/ws/git-illumos.zc,
a zfs clone of rpool/export/home/danmcd/ws/git-illumos

real 1.0
user 0.0
sys 0.0
everywhere(~/ws)[0]% /bin/time zclone illumos-clone illumos-clone.zc
Created rpool/export/home/danmcd/ws/illumos-clone.zc,
a zfs clone of rpool/export/home/danmcd/ws/illumos-clone

real 1.0
user 0.0
sys 0.0
everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 198G 100G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]%

These are constant-time operations, folks. And like I said earlier, I suppose its possible to have the local master repos populated with pre-compiled objects, header files in proto areas (an illumos build trick), and other disk-intensive operations pre-performed.

A quick search didn't yield me any results in this area: using ZFS to help make source trees take up less space. I'm surprised nobody's blogged about this or documented it, but I may have missed something. Either way, it doesn't hurt to mention it again.

More ZFS Love - Rapid Recovery

I recently scragged my laptop's primary root partition such that I needed to install-from-scratch again. I had a bootable secondary root, but since it was running an experimental BFUed build, this partition could not be upgraded.
Let's quickly look at how I configure my 100-decimal-GB laptop disk (&*%$% disk vendors):
  • c0d0s0 --> Primary root, approx 8GB (and I mean GB the way software geeks mean it, 8 * 1024^3).
  • c0d0s1 --> Secondary root, same size.
  • c0d0s3 --> swap, 3GB, same as main memory size (useful for system dumps).
  • c0d0s7 --> ZFS pool "tank", with 5 ZFS filesystems (tank, CSW, spro, local, and danmcd).
Before I shut it down for upgrade, I simply uttered this: zpool export tank That's it!

Then I plugged in my laptop to a local netinstall network, and PXE-booted to a Nevada build 73 install (which includes detangled NAT-Traversal) and started it up. I used the old Solaris installer because I know how to tell it to preserve disk slices. I told it to preserver the secondary root and the zpool.

One install later, I get root, and to recover my miscellaneous backups, CSW software, compilers, local binaires, and home directory, I just did: zpool import tankAnd again, that's it! All of my filesystems got mounted properly, no tables to edit, NOTHING.
I'll be at the University of Michigan Engineering Career Fair this coming Tuesday, and will be wandering campus on Monday. If you're one of the four people who read this blog and are there, drop by the Sun table - and see the very laptop I'm talking about. :)

Dan's blog is powered by blahgd