Kebe Says - Dan McDonald's Blog

All Your Base Are Belong to 20-Somethings, and Solaris 9

Two Decades Ago…

Someone pointed out recently that the famous Internet meme “All your base are belong to us” turned 20 this week. Boy do I feel old. I was still in California, but Wendy and I were plotting our move to Massachusetts.

In AD 2001, S9 Was Beginning

OF COURSE I watched the video back then. The original Shockwave/Flash version on a site that no longer exists. I used my then-prototype Sun Blade 1000 to watch it, on Netscape, on in-development Solaris 9.

I found a bug in the audio driver by watching it. Luckily for me, portions of the Sun bug database were archived and available for your browsing pleasure. Behold bug 4451857. I reported it, and all of the text there is younger me.

The analysis and solution are not in this version of the bug report, which is a shame, because the maintainer (one Brian Botton) was quite responsive, and appreciated the MDB output. He fixed the bug by moving around a not-shown-there am_exit_task() call.

Another thing missing from the bug report is my “Public Summary” which I thought would tie things up nicely. I now present it here:

In A.D. 2001
S9 was beginning.
Brian: What Happen?
Dan: Someone set up us the livelock
Dan: We get signal
Brian: What!
Dan: MDB screen turn on.
Brian: It’s YOU!
4451857: How are you gentleman?
4451857: All your cv_wait() are belong to us.
4451857: You are on the way to livelock.
Brian: What you say?
4451857: You have no chance to kill -9 make your time.
4451857: HA HA HA HA…
Brian: Take off every am_exit_task().
Dan: You know what you doing
Brian: Move am_exit_task().
Brian: For great bugfix!

Goodbye 2020

Pardon my latency

Well, at least I’m staying on track for single-digit blog posts in a year. :)

Okay, seriously, 2020’s pandemic-and-other-chaos tends to distract. Also, I did actually have a few things worth my attention.

RFD 176

The second half of 2020 at work has been primarily about RFD 176 – weaning SmartOS and Triton off of the requirement for a USB key. Phases I (standalone SmartOS) and II (Triton Compute Node) are finished. Phase III (Triton Head Node) is coming along nicely, thanks to real-world testing on Equinix Metal (nee Packet), and I hope to have a dedicated blog post about our work in this space coming in the first quarter 2021.

Follow our progress in the rfd176 branches of smartos-live and sdc-headnode.

Twins & College

My twins are US High School seniors, meaning they’re off to college/university next fall, modulo pandemic-and-other-chaos. This means applications, a little stress, and generally folding in pandemic-and-other-chaos issues into the normal flow of things as well. Out of respect for their privacy and autonomy, I’ll stop here to avoid details each of them can spill on their own terms.

On 2021

Both “distractions” mentioned above will continue into 2021, so I apologize in advance for any lack of content here for my half-dozen readers. You can follow me on any of the socials mentioned on the right, because I’ll post there if the spirit moves me (especially on issues of the moment).

A Request to Security Researchers from illumos

A Gentle Reminder About illumos

A very bad security vulnerability in Solaris was patched-and-announced by Oracle earlier this week. Turns out, we in open-source-descendant illumos had something in the same neighborhood. We can’t confirm it’s the same bug because reverse-engineering Oracle Solaris is off the table.

In general if a vulnerability is an old one in Solaris, there’s a good chance it’s also in illumos. Alex Wilson said it best in this recent tweet:

If you want to see the full history, the first 11 minutes of my talk from 2016’s FOSDEM contains WHY a sufficiently old vulnerability in Solaris 10 and even Solaris 11 may also be in illumos.

Remember folks, Solaris is closed-source under Oracle, even though it used to be open-source during the last years of Sun’s existence. illumos is open-source, related, but NOT the same as Solaris anymore. Another suggested talk covers this rather well, especially if you start at the right part.

The Actual Request

Because of this history and shared heritage, if you’re a security researcher, PLEASE make sure you find one of many illumos distributions, install it, and try your proof-of-concept on that as well. If you find the same vulnerability in illumos, please report it to us via the security@illumos.org mailing alias. We have a PGP key too!

Thank you, and please test your Solaris exploits on illumos too (and vice-versa).

Now you can boot SmartOS off of a ZFS pool

Booting from a zpool

The most recent published biweekly release of SmartOS has a new feature I authored: the ability to manage and boot SmartOS-bootable ZFS pools.

A few people read about this feature, and jumped to the conclusion that the SmartOS boot philosophy, enumerated here:

  • The "/" filesystem is on a ramdisk
  • The "/usr" filesystem is read-only
  • All of the useful state is stored on the zones ZFS pool.

were suddenly thrown out the window. Nope.

This change is the first phase in a plan to not depend on ISO images or USB sticks for SmartOS, or Triton, to boot.

The primary thrust of this specific SmartOS change was to allow installation-time enabling of a bootable zones pool. The SmartOS installer now allows one to specify a bootable pool, either one created during the "create my special pools" shell escape, or just by specifying zones.

A secondary thrust of this change was to allow running SmartOS deployments to upgrade their zones pools to be BIOS bootable (if the pool structure allows booting), OR to create a new pool with new devices (and use zpool create -B) to be dedicated to boot. For example:

smartos# zpool create -B c3t0d0 standalone
smartos# piadm bootable -e standalone
smartos# piadm bootable
standalone                     ==> BIOS and UEFI
zones                          ==> non-bootable
smartos# 

Under the covers

Most of what’s above can be gleaned from the manual page. This section will discuss what the layout of a bootable pool actually looks like, and how the piadm(1M) command sets things up, and expects things to BE set up.

Bootable pool basics

The piadm bootable command will indicate if a pool is bootable at all via the setting of the bootfs property on the pool. That gets you the BIOS bootability check, which admittedly is an assumption. The UEFI check happens by finding the disks s0 slice, and seeing if it’s formatted as pcfs, and if the proper EFI System Partition boot file is present.

bootfs layout

For standalone SmartOS booting, bootfs is supposed to be mounted on "/" with the pathname equal to the bootfs name. By convention, we prefer POOL/boot. Let’s take a look:

smartos# piadm bootable zones ==> BIOS and UEFI smartos# piadm list PI STAMP BOOTABLE FILESYSTEM BOOT BITS? NOW NEXT 20200810T185749Z zones/boot none yes yes 20200813T030805Z zones/boot next no no smartos# cd /zones/boot smartos# ls -lt total 9 lrwxrwxrwx 1 root root 27 Aug 25 15:58 platform -> ./platform-20200810T185749Z lrwxrwxrwx 1 root root 23 Aug 25 15:58 boot -> ./boot-20200813T030805Z drwxr-xr-x 3 root root 3 Aug 14 16:10 etc drwxr-xr-x 4 root root 15 Aug 13 06:07 boot-20200813T030805Z drwxr-xr-x 4 root root 5 Aug 13 06:07 platform-20200813T030805Z drwxr-xr-x 4 1345 staff 5 Aug 10 20:30 platform-20200810T185749Z smartos#

Notice that the Platform Image stamp 20200810T185749Z is currently booted, and will be booted the next time. Notice, however, that there are no “BOOT BITS”, also known as the Boot Image, for 20200810T185749Z, and instead the 20200813T030805Z boot bits are employed? This allows a SmartOS bootable pool to update just the Platform Image (ala Triton) without altering loader. If one utters piadm activate 20200813T030805Z, then things will change:

smartos# piadm activate 20200813T030805Z
smartos# piadm list
PI STAMP           BOOTABLE FILESYSTEM            BOOT BITS?   NOW   NEXT  
20200810T185749Z   zones/boot                     none         yes   no   
20200813T030805Z   zones/boot                     next         no    yes  
smartos# ls -lt
total 9
lrwxrwxrwx   1 root     root          27 Sep  2 00:25 platform -> ./platform-20200813T030805Z
lrwxrwxrwx   1 root     root          23 Sep  2 00:25 boot -> ./boot-20200813T030805Z
drwxr-xr-x   3 root     root           3 Aug 14 16:10 etc
drwxr-xr-x   4 root     root          15 Aug 13 06:07 boot-20200813T030805Z
drwxr-xr-x   4 root     root           5 Aug 13 06:07 platform-20200813T030805Z
drwxr-xr-x   4 1345     staff          5 Aug 10 20:30 platform-20200810T185749Z
smartos# 

piadm(1M) manipulates symbolic links in the boot filesystem to set versions of both the Boot Image (i.e. loader) and the Platform Image.

Home Data Center 3.0 -- Part 2: HDC's many uses

In the prior post, I mentioned a need for four active ethernet ports. These four ports are physical links to four distinct Ethernet networks. Joyent's SmartOS and Triton characterize these with NIC Tags. I just view them as distinct networks. They are all driven by the illumos igb(7d) driver (hmm, that man page needs updating) on HDC 3.0, and I'll specify them now:

  • igb0 - My home network.
  • igb1 - The external network. This port is directly attached to my FiOS Optical Network Terminal's Gigabit Ethernet port.
  • igb2 - My work network. Used for my workstation, and "external" NIC Tag for my work-at-home Triton deployment, Kebecloud.
  • igb3 - Mostly unused for now, but connected to Kebecloud's "admin" NIC Tag.
The zones abstraction in illumos allows not just containment, but a full TCP/IP stack to be assigned to each zone. This makes a zone feel more like a proper virtual machine in most cases. Many illumos distros are able to run a full VMM as the only process in a zone, which ends up delivering a proper virtual machine. As of this post's publication, however, I'm only running illumos zones, not full VM ones. Here's their list:
(0)# zoneadm list -cv
  ID NAME             STATUS     PATH                           BRAND    IP    
   0 global           running    /                              ipkg     shared
   1 webserver        running    /zones/webserver               lipkg    excl  
   2 work             running    /zones/work                    lipkg    excl  
   3 router           running    /zones/router                  lipkg    excl  
   4 calendar         running    /zones/calendar                lipkg    excl  
   5 dns              running    /zones/dns                     lipkg    excl  
(0)# 
Their zone names correspond to their jobs:
  • global - The illumos global zone is what exists even in the absence of other zones. Some illumos distros, like SmartOS, encourage minimizing what a global zone has for services. HDC's global zone serves NFS and SMB/CIFS to my home network. The global zone has the primary link into the home network. HDC's global zone has no default route, so if any operations that need out-of-the-house networking either go through another zone (e.g. DNS lookups), or a defaut route must be temporarily added (e.g. NTP chimes, `pkg update`).
  • webserver - Just like the name says, this zone hosts the web server for kebe.com. For this zone, it uses lofs(7FS), the loopback virtual file system to inherit subdirectories from the global zone. I edit blog entries (like this one) for this zone via NFS from my laptop. The global zone serves NFS, but the files I'm editing are not only available in the global zone, but are also lofs-mounted into the webserver zone as well. The webserver zone has a vnic (see here for details about a vnic, the virtual network interface controller) link to the home network, but has a default route, and the router zone's NAT (more later) forwards ports 80 and 443 to this zone. Additionally, the home network DHCP server lives here, for no other reason than, "it's not the global zone."
  • work - The work zone is new in the past six years, and as of recently, eschews lofs(7FS) for delegated ZFS datasets. A delegated ZFS dataset, a proper filesystem in this case, is assigned entirely to the zone. This zone also has the primary (and only) link to the work network, a physical connection (for now unused) to my work Triton's admin network, and an etherstub vnic (see here for details about an etherstub) link to the router zone. The work zone itself is a router for work network machines (as well as serves DNS for the work network), but since I only have one public IP address, I use the etherstub to link it to the router zone. The zone, as of recent illumos builds, can further serve its own NFS. This allows even less global-zone participation with work data, and it means work machines do not need backchannel paths to the global zone for NFS service. The work zone has a full illumos development environment on it, and performs builds of illumos rather quickly. It also has its own Unbound (see the DNS zone below) for the work network.
  • router - The router zone does what the name says. It has a vnic link to the home network and the physical link to the external network. It runs ipnat to NAT etherstub work traffic or home network traffic to the Internet, and redirects well-known ports to their respective zones. It does not use a proper firewall, but has IPsec policy in place to drop anything that isn't matched by ipnat, because in a no-policy situation, ipnat lets unmatched packets arrive on the local zone. The router zone also runs the (alas still closed source) IKEv1 daemon to allow me remote access to this server while I'm remote. It uses an old test tool from the pre-Oracle Sun days a few of you half-dozen readers will know by name. We have a larval IKEv2 out in the community, and I'll gladly switch to that once it's available.
  • calendar - Blogged about when first deployed, this zone's sole purpose is to serve our calendar both internally and externally. It uses the Radicale server. Many of my complaints from the prior post have been alleviated by subsequent updates. I wish the authors understood interface stability a bit better (jumping from 2.x to 3.0 was far more annoying than it needed to be), but it gets the job done. It has a vnic link to the home network, a default route, and gets calendaring packets shuffled to it by the router zone so my family can access the calendar wherever we are.
  • dns - A recent switch to OmniOSce-supported NSD and Unbound encouraged me to bring up a dedicated zone for DNS. I run both daemons here, and have the router zone redirect public kebe.com requests here to NSD. The Unbound server services all networks that can reach HDC. It has a vnic link to the home network, and a default route.

The first picture shows HDC as a single entity, and its physical networks. The second picture shows the zones of HDC as Virtual Network Machines, which should give some insight into why I call my home server a Home Data Center.

HDC, physically HDC, logically

Home Data Center 3.0 -- Part 1: Back to AMD

Twelve years ago I built my first Home Data Center (HDC). Six years ago I had OmniTI's Supermicro rep put together the second one.

Unlike last time, I'm not going to recap the entirety of HDC 2.0. I will mention briefly that since its 2014 inception, I've only upgraded its mirrored spinning-rust disk drives twice: once from 2TB to 6TB, and last year from 6TB to 14TB. I'll detail the current drives in the parts list.

Like last time, and the time before it, I started with a CPU in mind. AMD has been on a tear with Ryzen and EPYC. I still wanted low-ish power, but since I use some of HDC's resources for work or the illumos community, I figured a core-count bump would be worth the cost of some watts. Lucky me, the AMD Ryzen 7 3700x fit the bill nicely: Double the cores & threads with a 20W TDP increase.

Unlike last time, but like the time before it, I built this one from parts myself. It took a little digging, and I made one small mistake in parts selection, but otherwise it all came together nicely.

Parts

  • AMD Ryzen 7 3700x - It supports up to 128GB of ECC RAM, it's double the CPU of the old HDC for only 50% more TDP wattage. It's another good upgrade.
  • Noctua NH-U12S (AM4 edition) CPU cooler - I was afraid the stock cooler would cover the RAM slots on the motherboard. Research suggested the NH-U12S would prevent this problem, and the research panned out. Also Noctua's support email, in spite of COVID, has been quite responsive.
  • ASRock Rack X470D4U - While only having two Gigabit Ethernet (GigE) ports, this motherboard was the only purpose-built Socket AM4 server motherboard. It has IPMI/BMC on its own Ethernet port (but you'll have to double check it doesn't "failover" to your first GigE port). It has four DIMM slots, and with the current BIOS (mine shipped with it), supports 128GB of RAM. There are variants with Two 10 Gigabit Ethernet (10GigE) ports, but I opted for the less expensive GigE one. If I'd wanted to wait, there's a new, not yet available, X570 version, whose more expensive version has both two 10GigE AND two GigE ports, which would saved me from needing...
  • Intel I350 dual-port Gigabit Ethernet card - This old reliable is well supported and tested. It brings me up to the four ethernet ports I need.
  • Nemix RAM - 4x32GB PC3200 ECC Unbuffered DIMMS - Yep, like HDC 2.0, I maxxed out my RAM immediately. 6 years ago I'd said 32GB would be enough, and for the most part that's still true, except I sometimes wish to perform multiple concurrent builds, or memory-map large kernel dumps for debugging. The vendor is new-to-me, and did not have a lot of reviews on Newegg. I ran 2.5 passes of memtest86 against the memory, and it held up under those tests. Nightly builds aren't introducing bitflips, which I saw on HDC 1.0 when it ran mixed ECC/non-ECC RAM.
  • A pair of 500GB Samsung 860 EVO SATA SSDs - These are slightly used, but they are mirrored, and partitioned as follows:
    • s0 -- 256MB, EFI System partiontion (ESP)
    • s1 -- 100GB, rpool for OmniOSce
    • s2 -- 64GB, ZFS intent log device (slog)
    • s3 -- 64GB, unused, possible future L2ARC
    • s4 -- 2GB, unused
    • The remaining 200-something GB is unassigned, and fodder for the wear-levellers. The motherboard HAS a pair of M.2 connectors for NVMe or SATA SSDs in that form-factor, but these were hand-me-downs, so free.
  • A pair of Western Digital Ultrastar (nee HGST Ultrastar) HC530 14TB Hard Drives - These are beasts, and according to Backblaze stats released less than a week ago, its 12TB siblings hold up very well with respect to failure rates.
  • Fractal Design Meshify C case - I'd mentioned a small mistake, and this case was it. NOT because the case is bad... the case is quite good, but because I bought the case thinking I needed to optimize for the microATX form factor, and I really didn't need to. The price I paid for this was the inability to ever expand to four 3.5" drives if I so desire. In 12 years of HDC, though, I've never exceeded that. That's why this is only a small mistake. The airflow on this case is amazing, and there's room for more fans if I ever need them.
  • Seasonic Focus GX-550 power supply - In HDC 1.0, I had to burn through two power supplies. This one has a 10 year warranty, so I don't think I'll have to stress about it.
  • OmniOSce stable releases - Starting with HDC 2.0, I've been running OmniOS, and its community-driven successor, OmniOSce. The every-six-month stable releases strike a good balance between refreshes and stability.

I've given two talks on how I use HDC. Since the last of those was six years ago, I'm going to stop now, and dedicate the next post to how I use HDC 3.0.

Now self-hosted at kebe.com

Let's throw out the first pitch.

I've moved my blog over from blogspot to here at kebe.com. I've recently upgraded the hardware for my Home Data Center (the subject of a future post), and while running the Blahg software doesn't require ANY sort of hardware upgrade, I figured since I had the patient open I'd make the change now.

Yes it's been almost five years since last I blogged. Let's see, since the release of OmniOS r151016, I've:

  • Cut r151018, r151020, and r151022.
  • Got RIFfed from OmniTI.
  • Watched OmniOS become OmniOSce with great success.
  • Got hired at Joyent and made more contributions to illumos via SmartOS.
  • Tons more I either wouldn't blog about, or just plain forgot to mention.
So I'm here now, and maybe I'll pick up again? The most prolific year I had blogging was 2007 with 11 posts, with 2011 being 2nd place with 10. Not even sure if I *HAVE* a half-dozen readers anymore, but now I have far more control over the platform (and the truly wonderful software I'm using).

While Blahg supports comments, I've disabled them for now. I might re-enabled them down the road, but for now, you can find me on one of the two socials on the right and comment there.

Goodbye blogspot

First off, long time no blog!

This is the last post I'm putting on the Blogspot site. In the spirit of eating my own dogfood, I've now set up a self-hosted blog on my HDC. I'm sure it won't be hard for all half-dozen of you readers to move over. I'll have new content over there, at the very least the Hello, World post, a catchup post, and a HDC 3.0 post to match the ones for 1.0 and 2.0.

Coming Soon (updated)

This is STILL only a test, but with an update. The big question remains: How quickly I can bring over my old Google owned blog entries?

(BTW Jeff, I could get used to this LaTeX-like syntax…)

You’ll start to see stuff trickle in. I’m using the hopefully-pushed-back-soon enhancements to Blahg that allows simple raw HTML for entries. I’ve done some crazy thing to extract, and maybe the hello-world post here will explain that.

From 0-to-illumos on OmniOS r151016

Today we updated OmniOS to its next stable release: r151016. You can click the link to see its release notes, and you may notice a brief mention the illumos-tools package.

I want to see more people working on illumos. A way to help that is to get people started on actually BUILDING illumos more quickly. To that end, r151016 contains everything to bring up an illumos development environment. You can develop small on it, but this post is going to discuss how we make building all of illumos-gate from scratch easier. (I plan on updating the older post on small/focused compilation after ws(1) and bldenv(1) effectively merge into one tool.)

The first thing you want to do is install OmniOS. The latest release media can be found here, on the Installation page.

After installation, your system is a blank slate. You'll need to set a root password, create a non-root user, and finally add networking parameters. The OmniOS wiki's General Administration Guide covers how to do this.

I've added a new building illumos page to the OmniOS wiki that should detail how straightforward the process is. You should be able to kick off a full nightly(1ONBLD) build quickly enough. If you don't want to edit one of the omnios-illumos-* samples in /opt/onbld/env, just make sure you have a $USER/ws directory, clone one of illumos-gate or illumos-omnios into $USER/ws/testws and use one of the template /opt/onbld/env/omnios-illumos-* files corresponding to illumos-gate or illumos-omnios. For example:


omnios(~)[0]% mkdir ws
omnios(~)[0]% cd ws
omnios(~/ws)[0]% git clone https://github.com/illumos/illumos-gate/ testws

omnios(~/ws)[0]% /bin/time /opt/onbld/bin/nightly /opt/onbld/env/omnios-illumos-gate
You can then look in testws/log/log-date&time/mail_msg to see how your build went.

Dan's blog is powered by blahgd