Kebe Says - Dan McDonald's Blog

All Your Base Are Belong to 20-Somethings, and Solaris 9

Two Decades Ago…

Someone pointed out recently that the famous Internet meme “All your base are belong to us” turned 20 this week. Boy do I feel old. I was still in California, but Wendy and I were plotting our move to Massachusetts.

In AD 2001, S9 Was Beginning

OF COURSE I watched the video back then. The original Shockwave/Flash version on a site that no longer exists. I used my then-prototype Sun Blade 1000 to watch it, on Netscape, on in-development Solaris 9.

I found a bug in the audio driver by watching it. Luckily for me, portions of the Sun bug database were archived and available for your browsing pleasure. Behold bug 4451857. I reported it, and all of the text there is younger me.

The analysis and solution are not in this version of the bug report, which is a shame, because the maintainer (one Brian Botton) was quite responsive, and appreciated the MDB output. He fixed the bug by moving around a not-shown-there am_exit_task() call.

Another thing missing from the bug report is my “Public Summary”, which I thought would tie things up nicely. I now present it here:

In A.D. 2001
S9 was beginning.
Brian: What Happen?
Dan: Someone set up us the livelock
Dan: We get signal
Brian: What!
Dan: MDB screen turn on.
Brian: It’s YOU!
4451857: How are you gentleman?
4451857: All your cv_wait() are belong to us.
4451857: You are on the way to livelock.
Brian: What you say?
4451857: You have no chance to kill -9 make your time.
4451857: HA HA HA HA…
Brian: Take off every am_exit_task().
Dan: You know what you doing
Brian: Move am_exit_task().
Brian: For great bugfix!

A Request to Security Researchers from illumos

A Gentle Reminder About illumos

A very bad security vulnerability in Solaris was patched-and-announced by Oracle earlier this week. Turns out, we in open-source-descendant illumos had something in the same neighborhood. We can’t confirm it’s the same bug because reverse-engineering Oracle Solaris is off the table.

In general if a vulnerability is an old one in Solaris, there’s a good chance it’s also in illumos. Alex Wilson said it best in this recent tweet:

If you want to see the full history, the first 11 minutes of my talk from 2016’s FOSDEM explain WHY a sufficiently old vulnerability in Solaris 10 and even Solaris 11 may also be in illumos.

Remember folks, Solaris is closed-source under Oracle, even though it used to be open-source during the last years of Sun’s existence. illumos is open-source, related, but NOT the same as Solaris anymore. Another suggested talk covers this rather well, especially if you start at the right part.

The Actual Request

Because of this history and shared heritage, if you’re a security researcher, PLEASE make sure you find one of many illumos distributions, install it, and try your proof-of-concept on that as well. If you find the same vulnerability in illumos, please report it to us via the security@illumos.org mailing alias. We have a PGP key too!

Thank you, and please test your Solaris exploits on illumos too (and vice-versa).

I Have No Whistle to Blow, But I Must Scream

I'm sure all twelve of you readers out there know what's been going on with respect to recent revelations about NSA activity. Among other things is the unnerving discovery that NSA has been actively attempting to dumb down security for the Internet.

In the second linked article, Bruce Schneier calls upon people to blow the whistle on, "how the NSA and other agencies are subverting routers, switches, the internet backbone, encryption technologies and cloud systems." Here's the deal:

I have never been asked to introduce back-doors or weaken security in Solaris, OpenSolaris, Oracle Solaris 11 (for the four months I worked on it post-barn-door-closing), or Illumos. If there are weaknesses there, they are not because of any deliberate effort on my part.

You can view the kernel IPsec protocol sources (AH & ESP) here, by looking at ipsec*.c, sadb.c, spd.c, spdsock.c, keysock.c and header files in the directory above it. You can see the IPsec management utilities here. According to at least one well-known security researcher, the Illumos (nee OpenSolaris) IPsec code isn't bollocks.

There is no open-source for IKE, because the libike.so.1 library was mostly OEM code, from a vendor whose technical lead let me co-write an RFC with him. You can use the various observability and debugging tools in Illumos to see how things work, however, if you wish.

If you want to write your own, better, key management application for Illumos (or even Oracle Solaris), you can use PF_KEY to control the IPsec SADB. I detail the subsequent additions to RFC 2367 on my day-one-of-OpenSolaris blog post. If you want to work on IPsec in totally-open-source Illumos, you have my blessing, and I'll definitely be reviewing (and maybe integrating if you pass code reviews) your code.

Broad-Spectrum Dogfooding, or Why I Miss Jurassic.

NOTE: I imported this from blogspot and the embedded tweet was nicely there. Not sure if other self-hosted entries will be that cool.

I think most of you dozen readers know what I mean, when I refer to dogfooding. Some people think of Microsoft when they hear the term, but I first heard it from the same person via his being a Sun customer, AND via my old roommate, who worked for him.

I saw this Tweet last week:

I then checked out the blog post. It dealt with how an iSCSI LAN can be a failure point, partially due to the weakness of the ones-complement TCP/IP checksum.

Reading this reminded me of an old bug we found at Sun with either NFS or an ethernet device driver, and the only way we caught it was by using IPsec (AH particularly) and seeing packets fail the authentication check. The corrupt NFS packets had 16 bits' worth of 1s (0xffff) where they should have had 16 bits' worth of 0s (0x0000). Using the standard TCP/IP checksum, there's no difference between those two values, no matter where they fall in the packet, because both are representations of zero in ones-complement arithmetic. Using IPsec, however, even with HMAC-MD5, showed the corruption clearly when the packet authentication check failed. This bug wouldn't have been discovered were it not for the Solaris Team's big honking server, jurassic, and how its multiple concurrent uses interacted with each other.
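For the curious, here is a minimal Python sketch of why the checksum missed it. It is not the Solaris code or the actual corrupt packet: the packet bytes, the key, and the function names are all made up for illustration. It computes an RFC 1071-style ones-complement checksum and an HMAC-MD5 tag (standing in for the AH check) over two buffers that differ only in a 0x0000 versus 0xffff pair of bytes. The collision holds as long as the rest of the packet contains at least one nonzero 16-bit word, which any real packet does.

```python
import hashlib
import hmac


def internet_checksum(data: bytes) -> int:
    """Ones-complement checksum over 16-bit big-endian words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def auth_tag(key: bytes, data: bytes) -> bytes:
    """HMAC-MD5 over the packet bytes, standing in for an AH integrity check."""
    return hmac.new(key, data, hashlib.md5).digest()


# Hypothetical packet contents; only the two middle bytes differ.
HEADER = b"made-up NFS header"  # 18 bytes, so the next word is 16-bit aligned
TRAILER = b"made-up payload"
good = HEADER + b"\x00\x00" + TRAILER
bad = HEADER + b"\xff\xff" + TRAILER

# The same corruption at an odd offset, straddling two checksum words.
good_odd = b"X" + good
bad_odd = b"X" + bad

KEY = b"hypothetical shared key"

# The checksum cannot tell the buffers apart, aligned or not...
print(internet_checksum(good) == internet_checksum(bad))
print(internet_checksum(good_odd) == internet_checksum(bad_odd))
# ...while the keyed MAC over the same bytes can.
print(auth_tag(KEY, good) == auth_tag(KEY, bad))
```

Adding 0xffff to a nonzero ones-complement sum wraps around to the same sum, which is why the position and alignment of the flipped bytes do not matter.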

Even before there was OpenSolaris, people knew about jurassic. Solaris people's (not any old Sun people... Solaris people) posts on IETF mailing lists often showed user@jurassic. Jurassic served as the NFS source of home directories, and until the early 2000s e-mail inboxes as well. Every two weeks the in-development Solaris build would be placed upon jurassic. As a Solaris developer, if your changes broke jurassic, you fixed those changes immediately, or risked getting your changes yanked out. Not breaking jurassic was a great motivator for code quality. Also, if you had a new feature, you wanted it used on jurassic, even if not by everyone.

Once the basic IPsec protocols - AH & ESP - went into Solaris 8, I convinced the jurassic maintainers to protect all traffic between jurassic and a couple of workstations. One was mine, naturally. I encrypted all of my traffic to jurassic. Since we only had 100Mbit in our building at that time, the performance hit wasn't too bad, relatively speaking. Another belonged to an NFS developer, who I'd somehow convinced to run AH, because I was already running ESP (and AH used fewer cycles for protection). It was this NFS developer, surprised he wasn't getting data corruption while others were, who helped suss out the bug in question.

At this point, I'd like to have a moment of silence for all of the made-public Solaris information that Oracle has since put back in its box. I could've had a bug id here, folks, A REAL BUG ID!!!

So for a few of us, jurassic also served as an IPsec testbed. It also was helpful in determining that nobody else's cleartext performance dropped while a few of us were running with IPsec-protected network traffic (put more succinctly, connection policy latching worked). Other services would run on jurassic as well: DNS, IMAP, and others I'm sure I'm forgetting. Jurassic core dumps eventually would be used to test out the then-new mdb (oh, those early ::findleaks results...), and I'm sure more than a few DTrace scripts helped diagnose some jurassic-discovered bugs.

At Nexenta, we make a dedicated storage appliance. Naturally, we use them inside where appropriate. We Nexentians (especially the ones in Lowell) use Illumos from other distributions for even greater effect. My Illumos Home Data Center talk touches upon these at about 10:43 in. We use Illumos to host VMs (Thank you Joyent), we use it for site-to-site VPNs, we will be using it for public services at some point, and everything I mentioned runs on Illumos. It's not quite the magnifying glass Jurassic was, but we do what we can.

I believe Oracle still has jurassic around; I know it did prior to my 2011 departure. I suspect it's helping Oracle Solaris even today. I suspect, however, that a less dense, but more widely instantiated broad-spectrum dogfooding continues on in Illumos today.

I'm leaving Oracle, and switching gears

15 years ago I was finishing up last-minute changes at NRL while getting ready to move coasts. While I'm not moving coasts, I'm at the point where I'm finishing up last-minute changes again.

I'm leaving Oracle this week, and will be trying something a bit different after that. I've been doing IPsec or at least TCP/IP related work for the entirety of my time at Sun. I expect to be back in TCP/IP-land relatively soon, but I will be learning some new-to-me technologies in the immediate future.

I've met and worked with some extraordinary people during my time at Sun. I hope to keep in touch with them after I depart. If any of you half-dozen readers wish to keep up, I'd suggest following my Twitter feed until I decide whether or not I find a new home for this blog. I'm also findable on Facebook and LinkedIn for those so inclined.

MAC-then-encrypt - also harmful, also hard to do in Solaris

Hello again!

Kenny Paterson's once again turning the theoretical into practical. This time he's pointed out that if one configures IPsec to MAC-then-encrypt (do packet authentication first, THEN encrypt the packet), one is open to cryptographic attack. Here's a citation for his ACM CCS paper.

The good news is that we cannot configure the IPsec SPD to perform MAC-then-encrypt at all. One could configure transport mode to just MAC, then have the packet transit a tunnel that just encrypts, but then you'll see warnings about the encryption-only tunnel configuration. This has been true for a LONG time (starting with S9, maybe even S8).

So basically, we don't make it easy for you to shoot yourself in the foot this way. You really have to try, and as I pointed out earlier, the encryption-only part will warn you.
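To make the ordering concrete, here is a toy Python sketch of the two compositions. Big assumptions apply: the "cipher" is a made-up SHA-256 counter-mode keystream (used only because Python's standard library has no block cipher), every function name is hypothetical, and none of this reproduces ESP or AH packet formats or the attack in the paper. It only shows who gets to check what, and when.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA-256 output size in bytes


def toy_keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher for illustration only. NOT real cryptography."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def encrypt_then_mac(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    ciphertext = toy_keystream_xor(enc_key, plaintext)
    return ciphertext + mac(mac_key, ciphertext)  # tag covers the ciphertext


def open_encrypt_then_mac(enc_key: bytes, mac_key: bytes, packet: bytes) -> bytes:
    ciphertext, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    if not hmac.compare_digest(tag, mac(mac_key, ciphertext)):
        raise ValueError("authentication failed; nothing was decrypted")
    return toy_keystream_xor(enc_key, ciphertext)


def mac_then_encrypt(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    # The tag covers the plaintext, and is then hidden inside the encryption.
    return toy_keystream_xor(enc_key, plaintext + mac(mac_key, plaintext))


def open_mac_then_encrypt(enc_key: bytes, mac_key: bytes, packet: bytes) -> bytes:
    # The receiver must decrypt unauthenticated bytes before it can check anything.
    decrypted = toy_keystream_xor(enc_key, packet)
    plaintext, tag = decrypted[:-TAG_LEN], decrypted[-TAG_LEN:]
    if not hmac.compare_digest(tag, mac(mac_key, plaintext)):
        raise ValueError("authentication failed after decryption")
    return plaintext
```

In the encrypt-then-MAC path a tampered packet is rejected before the decryption code ever touches it. In the MAC-then-encrypt path the receiver has to process attacker-controlled ciphertext first, which is the general reason that ordering gets treated as the risky one.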

New IPsec goodies in S10u7

Hello again. Pardon any latency. This whole Oracle thing has been a bit distracting. Never mind figuring out the hard way what limitations there are on racoon2 and what to do about them.

Anyway, Solaris 10 Update 7 (aka. 5/09) is now out. It contains a few new IPsec features that have been in OpenSolaris for a bit. They include:
  • HMAC-SHA-2 support per RFC 4868 in all three sizes (SHA-256, SHA-384, and SHA-512) for IPsec and IKE.
  • 2048-bit (group 14), 3072-bit (group 15), and 4096-bit (group 16) Diffie-Hellman groups for IKE. (NOTE: Be careful running 3072 or 4096 bit on Niagara 1 hardware, see here for why. Niagara 2 works better, but not optimally, with those two groups.)
  • IKE Dead Peer Detection
  • SMF Management of IPsec. Four new services split out from network/initial:
    • svc:/network/ipsec/ipsecalgs:default -- Sets up IPsec kernel algorithm mappings.
    • svc:/network/ipsec/policy:default -- Sets up the IPsec SPD (reads /etc/inet/ipsecinit.conf).
    • svc:/network/ipsec/manual-key:default -- Reads any manually-added SAs (reads /etc/inet/secret/ipseckeys).
    • svc:/network/ipsec/ike:default -- Controls the IKE daemon.
  • The UDP_NAT_T_ENDPOINT socket option from OpenSolaris, so you can develop your own NAT-Traversing IPsec key management apps without relying on in.iked.
We've even more goodies in OpenSolaris, BTW.

Go Blue! Recruiting at Michigan (day 2)

Oh my am I exhausted! I hoped to have most of the text of this completed before my flight got back to Manchester last night, but that didn't happen.

I keep telling people I know that Michigan is a hardware school (in spite of having some great software people - see my post from Monday). We Solaris developers at the Sun table were brutally reminded of this yesterday. Lots of EEs with Verilog and/or VHDL experience. Many of them asked about architecture and/or verification, but a surprising number had never heard of SPARC or the UltraSPARC T1 (aka. Niagara), or didn't know that they can see the entire source for the Niagara with OpenSPARC. Almost every business card of mine I handed out to folks had the word "OpenSPARC" on the back so they could Google it later.

We also tried to make sure everyone had OpenSolaris disks. There are four binary distributions of OpenSolaris on that set of disks: Solaris Express Community Edition (see the previous link) - Sun's current OpenSolaris vehicle; Nexenta - which is probably going to be one of the more comfortable ones for Ubuntu Linux users to land in; Belenix - which is optimized for Live CD use; and Schillix, which was the first non-Sun distribution of OpenSolaris, by Joerg Schilling of "cdrecord" fame. I hope some of the students went home and had success playing with OpenSolaris. You all should visit opensolaris.org and engage in the community discussions with your feedback and questions.

I mentioned Monday how much like a geezer I felt. I had more of that yesterday, not only saying "Class of '91" a few times, but also when Professor Quentin Stout visited our table. The only graduate-level class I took at U. of M. was his Parallel Algorithms class in the fall of 1990 (during Football/Marching Band season). Back in the day it was all theory - we discussed how to partition problems using the abstract PRAM (Parallel Random Access Machine). It was the ONLY parallel ANYTHING class offered when I had an available slot. This was when shared-memory multiprocessors were experiments or startups (anyone remember the BBN Butterfly, the Sequent Balance, or the Encore Multimax?). I mentioned to Prof. Stout I took his class back then. He then proceeded to tell me how the class is far more practical now. He told me all about stuff like OpenMP, and other high-level constructs that as a systems programmer I just don't get to use all that much. I still, however, felt pretty smart for seeing the future back in 1990. I hope I have as good luck 17 years later.

Anyway, I had a great time in Ann Arbor, and I hope to get back there sooner rather than later. If anyone who visited our table is reading this, leave a comment, and don't be afraid to be honest. :)

Go Blue! Recruiting at Michigan (day 1)

I mentioned I was going to be at the University of Michigan's Engineering career fair, and here I am!

I got in yesterday (Sunday) afternoon, and did some things to re-orient myself. I visited my fraternity house first, and quickly, because rush began that night. In some ways things hadn't changed a bit - the house is still there and the rooms have the same names (my old room with a skylight window is still called Lighthouse). In other ways, they had - the TV is bigger and flatter, half of 'em had laptops, and the basement was being seriously renovated. The guys were pretty mellow, probably because of all of the post-beating-of-Penn-State celebrations. I then wandered around campus, eating dinner at Krazy Jim's Blimpyburger, where they give you burgers made of small, ground-that-day, patties. Yum!

When I flew in, the woman next to me on the plane explained the phenomenon she experienced when taking one of her kids to her alma mater. It all felt intimately familiar to her, even modulo some new buildings, but then she suddenly realized she was an old fart wandering campus. My kids aren't old enough to be shopping colleges yet, but I definitely felt the combination of familiarity and age. I saw buildings with new names, old names on new buildings, and just plain new buildings (esp. at North Campus). 20 years ago I was a freshman, now I'm literally old enough to be a father to a student in the incoming class of 2011.

This morning, I tagged along with Kais Belgaied as he visited some Computer Science faculty and grad students here. Our first visit was with Professor Z. Morley Mao, who's a new professor here. She has a lot of great ideas on how to exploit the Crossbow project for aiding intrusion detection (and mitigation), among other interesting ideas. We then talked to two other professors, Atul Prakash and Thomas Wenisch, and a few students as well. I remember Prof. Prakash from my time at Michigan (1987-1991), but the other two are new Assistant Professors. I'm confident from what I saw that U. of M.'s CSE division of EECS is going to be strong for a continuing number of years.

[Edit from Wednesday] Shoot! I forgot I also visited my old theory professor, Kevin Compton. He's a very good teacher, and helps even the most clueless undergrads (hem hem). He told me he's teaching a very popular undergraduate cryptography class, which is just too-cool, IMHO.

This evening several of us (Kais, Eric Kustarz, Bill and Sherry Moore, and I) gave a breezy tech talk about various goodies in OpenSolaris that we work on. We also had very yummy Pizza House pizza. Pizza House was "established 1986", which means it wasn't all that old when I was there, but it was good enough to have our host recommend it.

I'm now back in my hotel, squeezing packets over a flaky, but free, wifi. Tomorrow we will be spending the whole day at the table, taking resumes and answering questions. If one of you four readers of this blog is a U. of M. student, you don't have to wear a suit when visiting us. :)

IPsec Tunnel Reform, IP Instances, and other new-in-S10 goodies

Solaris 10 Update 4 (or as marketing calls it, Solaris 10 08/07) contains some backported goodies we've had in Nevada/OpenSolaris for a while.

IPsec Tunnel Reform was one of the first big pieces of code to be dropped into the S10u4 codebase. It shores up our interoperability story, so you can now start constructing VPNs that tell IKE to negotiate Tunnel-Mode (as opposed to IP-in-IP transport mode). Tunnels themselves are still network interfaces, but their IPsec configuration is now wholly in the purview of ipsecconf(1M). Modulo IKE (which we still OEM part of), we developed Tunnel Reform in the open with OpenSolaris.

Also new for S10u4 is IP Instances. Before u4, you could create non-global zones, but their network management (e.g. ifconfig(1M)) had to be done from the global zone. With u4, one can create a unique instance zone which gives the zone its own complete TCP/IP stack. The global zone needs only to assign a GLDv3-compatible interface to the zone (e.g. bge, nge, e1000g) to give it a unique IP Instance. You could have a single box be your router/firewall/NAT, your web server, and who knows what else, all while keeping those functions out of the fully-privileged global zone. It makes me think about upgrading to business-class Internet service at home, building my own box like Bart did and getting a few extra Ethernet ports.

Oh, and if you want to do it all with fewer Ethernet ports, check out OpenSolaris's Crossbow and its VNIC abstraction!

Have fun moving your network bits in new and interesting ways!
