Kebe Says - Dan McDonald's Blog

Not using "ncp" on Niagara considered harmful

One of our IPsec remote-access servers here is on a Niagara-powered T2000 server. It's really overkill for the job, but we get to see how IKE and the Niagara crypto accelerator (known as "ncp" by its driver name) interact.

The nice thing about running your own stuff is you find things out before others do. Consider bug 6339802. We saw AWFUL IKE performance on Niagara boxes before we fixed this. Admittedly, IKE is single-threaded (for reasons beyond the scope of this blog entry), but it was taking seconds to complete an IKE Phase I with 2048-bit RSA and 1536-bit Diffie-Hellman.

More recently, we've been enabling bigger Diffie-Hellman MODP groups in our IKE. The Niagara driver has a limit of 2048-bit operations, so we limited the Phase I DH to 2048 bits.

Here's a DTrace script we like to use to measure responder-side Phase I times (in.iked and libike are closed-source, but trust me on this one):
#!/usr/sbin/dtrace -s

/*
 * Responder-side Phase I setup.
 */
pid$1::ssh_policy_new_connection:entry
{
        self->negstart[arg0] = timestamp;
        printf("Initial packet received, pm_info = %p", arg0);
}

pid$1::ssh_policy_negotiation_done_isakmp:entry
/self->negstart[arg0]/
{
        /* Use 16384 value for "CONNECTED" from isakmp_doi.h */
        printf("return %d - %s ", arg1,
            (arg1 == 16384) ? "Success" : "Error case.");

        printf("pm_info %p finished, took %d ns", arg0,
            timestamp - self->negstart[arg0]);

        /* Free the saved start time so stale entries don't accumulate. */
        self->negstart[arg0] = 0;
}
With the fix for 6339802 in place, we can get pretty good Phase I times:

dtrace: script '/space/responder-phase1.d' matched 4 probes
CPU     ID                    FUNCTION:NAME
  4  48040  ssh_policy_new_connection:entry Initial packet received, pm_info = bed6c8
  4  48042 ssh_policy_negotiation_done_isakmp:entry return 16384 - Success pm_info bed6c8 finished, took 165512764 ns
That's about 165 ms. Some of that time is packet round-trips, but let's ignore that for this exercise.

Now let's do something drastic: cryptoadm disable provider=ncp/0 all. Suddenly those seconds come back...


  4  48040  ssh_policy_new_connection:entry Initial packet received, pm_info = bed6c8
  4  48042 ssh_policy_negotiation_done_isakmp:entry return 16384 - Success pm_info bed6c8 finished, took 7732419300 ns


WOW! Like Darren said, that's blog-worthy, and that's why I'm here.
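To put the two measurements side by side, here is a minimal sketch of the arithmetic in C. The constants are the two nanosecond figures from the DTrace runs above; the helper names are mine, purely for illustration.

```c
#include <stdint.h>

/* Phase I durations in nanoseconds, copied from the two DTrace runs above. */
#define	PHASE1_NS_WITH_NCP	165512764ULL
#define	PHASE1_NS_WITHOUT_NCP	7732419300ULL

/* Convert nanoseconds to whole milliseconds (truncating). */
static uint64_t
ns_to_ms(uint64_t ns)
{
	return (ns / 1000000ULL);
}

/* Slowdown factor of the slow run relative to the fast run. */
static double
slowdown(uint64_t slow_ns, uint64_t fast_ns)
{
	return ((double)slow_ns / (double)fast_ns);
}
```

With these numbers, ns_to_ms() gives 165 ms with ncp and 7732 ms without, and slowdown() lands between 46 and 47. Both runs include packet round-trip time, so the slowdown of the bignum operations themselves is probably somewhat larger.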

So why is Niagara so slow without its crypto accelerator?


That's a good question. Keep in mind that four big-number operations occur in an IKE Phase I exchange like I measured above: one RSA Signature, one RSA Verification, one Diffie-Hellman generate, and one Diffie-Hellman agree. I'm not 100% sure, but I believe the default software implementation of big-number operations on SPARC uses floating-point tricks to help out. Using floating-point on a Niagara kicks in a software emulation, which would definitely increase the time taken for each bignum operation.

So the moral of the story is to make sure you're exploiting all of the hardware that's available to you!

"Adaptation Issues", and a Neuromancer movie?

Ahh, what to do while test runs finish? Sometimes I get additional real work done, but when it gets close to a holiday, or just the end of a day, I visit movie websites.

Coming Soon recently quoted an article from Variety about one of my favorite novels - William Gibson's Neuromancer getting a producer for a (long-overdue, IMHO) movie.

Now granted, some of the book hasn't aged very well (Gibson's introduction to a recent re-issue asks about the lack of cell phones), but some of it would still translate VERY nicely to the big screen. Of course, the last major-release attempt to bring one of his works to the big screen was a dismal failure, story-wise, even though (again IMHO) the look-and-feel was close.

Another movie site I visit introduced me to a phrase - "adaptation issues". Basically, if one has enjoyed a story in one form, one may have problems when that story is retold in another form. Ask die-hard (insert-book or comic here) fans about how (book or comic)'s movie version just messed things up SO BADLY. Often, some of their criticisms turn out to be well-founded; other times, it's just acute fan{boy,girl} nitpicking. One's adaptation issues seem to be tied to how beloved the original source material is by someone. My only real experience with adaptation issues was the reaction I had to Johnny Mnemonic, where I felt like there was a wonderful opportunity that had been squandered. I really hope Neuromancer's movie adaptation doesn't leave me feeling the same way.

To that end, I'm going to resurrect a conversation I had with a former movie-site writer about who should be cast in a Neuromancer movie. Hey Widge -- care to revisit that cast again?

Unnumbered interfaces confuse Quagga

The whole reason I was reading e-mail on a Sunday was not to look for telnetd exploits.

I was logged in because Team IPsec runs its punchin IPsec remote-access server (sometimes called a VPN server, but I hate that term because it's pushed by too many middlebox vendors) which was having routing problems.

As stated before, Solaris implements tunnels as point-to-point interfaces. For a remote-access server like we have in punchin, this means every external IP address gets a tunnel interface. (Until we had Tunnel Reform, this meant only one client per external IP address, which messed up NATs for multiple clients.) A tunnel interface has two addresses - a local one and a remote one. The local one can be shared with other tunnels or even with a different local interface (like the local ethernet). Such interfaces are called unnumbered interfaces.

A remote access server does forward packets, and is therefore by definition a router. One of our servers just swapped out Zebra (from an older OpenSolaris/Nevada build) for Quagga. We use Quagga's OSPF to learn the topology of the Sun internal network (the SWAN).

As clients "punch out", their tunnel gets destroyed. Now each of these tunnels shares the same local IP address with our ethernet to the SWAN. Unfortunately, these "interface down" events confuse Quagga, and suddenly all of my punchin clients can't move bits to the internal network anymore.

There is a workaround, and that's to assign a different local IP address than the one that is directly connected to the SWAN for use with all of the client tunnels. It's not that painful, as I only lose one out of 256 possible client addresses (our engineering ones only have a /24 from which to allocate client addresses). Still, as an esteemed colleague said, "I hope that's not the *whole* solution."

It isn't, and I would like to ask the Quagga community (as I've already asked our local routing folks, Paul Jakma and Alan Maguire) to make sure that Quagga and its routing protocols play nicely with unnumbered interfaces. It'll allow me to plumb tunnels until I'm all out of address space! :)


How OpenSolaris did its job during this telnet mess

I don't have a tag for general Security because dammit, I'm still a networking person who works on security! (UPDATE: I'm wiser now in 2020 and have added one.)

Anyway, you've seen elsewhere about how Alan H. turned around the S10 fix as quickly as he could. I'm going to tell you how Alan already found this:


D 1.67 07/02/11 19:46:41 danmcd 90 89 00009/00010/04896
6523815 LARGE vulnerability in telnetd


when he went to file a bug that'd already been putback into Nevada/OpenSolaris.

The best place to see what happened is to visit the OpenSolaris discussions, especially this thread.

I was reading e-mail on a Sunday because of an operations problem I was having with one of our punchin IPsec remote access servers. (I'll discuss the problem, a routing one, in a followup entry later today.) I found the initial note and read the PDF file to which "skunsul" so graciously provided a link. MAN I was embarrassed. After trying it on some lab machines and my laptop, I brought up the in.telnetd source (at the line number provided by Kingcope). My first approach was to verify the content of the $USER environment variable fed to in.telnetd. I compiled-and-ran the fix, which seemed to work. Great! Time to find some code reviewers.

My only regret about this was not putting the review on security-discuss@opensolaris.org or networking-discuss@opensolaris.org. I'll try to do better next time, especially for something that was announced on an opensolaris list initially. Anyway, two reviewers (OpenSolaris board member and well-known Sun Good Guy Casper Dik, and crypto framework expert Krishna Yenduri) suggested that login(1) is already getopt-compliant, and that I should just pass "--" between the rest of the arguments and the contents of $USER, no matter how *&^$-ed up it is. Because it was a Sunday, I didn't get rapid turnaround on e-mail replies. This is why the putback didn't happen until six hours after I'd read the note from skunsul. Krishna also recommended (in the spirit of open development) that I place the diffs on the very thread, and I did just that.

Anyone I know here who happened to have seen the initial note would've jumped on this in the same way - please don't think I did something others wouldn't do. My point is - this is the first security exploit reported to us via OpenSolaris, and I think the "Open" part of OpenSolaris helped out the code, as well as Sun's customers.


Tunnel Reform Code Review starts now.

Hey everyone!

The IPsec Tunnel Reform project's code review is now underway. Take a look and see what it took to bring up IPsec Tunnel-Mode processing in a world where tunnels are not actions from a policy, but rather a first-class network interface (or at least after Clearview it will be).

Highlights for administrators include:

  • Augmentations to ipsecconf(1m) to specify a tunnel interface's policy, whether it's S9-style IP-in-IP transport mode, or RFC 2401-compliant Tunnel Mode.

  • No changes to IKE configuration.

  • You can configure tunnel security without ifconfig(1m) using just ipsecconf(1m). We put all IPsec policy in ipsecconf(1m) and let ifconfig manage interfaces (and route(1m) manage routing).

  • Additions to ipseckey(1m) for manual tunnel-mode SA configuration, or monitoring of kernel interactions with Key Management.

  • Better interoperability with everyone else's Tunnel Mode IPsec.



Highlights for OpenSolaris-hackers include:

  • New per-tunnel policy structure: ipsec_tun_pol_t, which instantiates the existing policy-head per tunnel.

  • Getting rid of IRE_DB_REQ messages for SA addition/updates. This improves SA-adding performance and reduces the complexity of the ESP and AH modules.

  • New PF_KEY and PF_POLICY messages to reflect Tunnel Mode.

  • Shifting of tunnel IPsec policy enforcement from the lower-instance of IP to "tun" itself. (NOTE: This will change again when we merge with Clearview.)



Share your comments on tref-discuss, and let us know what you think!


Tunnel Reform now open for your perusal

NOTE: Links here point to docs that no longer exist. Maybe the Internet Archive might have 'em?

IPsec in Solaris has one missing piece, and we're about to put it in place.

The IPsec Tunnel Reform project aims to give Solaris and OpenSolaris an RFC 2401-compliant tunnel-mode implementation.

There are a lot of changes in the source base, some of which aren't open sourced (IKE), but most of which are in existing OpenSolaris code. The project page has a webrev showing the changes thus far. We're trying to be more open in our development processes here in the Solaris group, and showing you Tunnel Reform before we've finished it, AND before we've started major test efforts, is Team IPsec's own way of contributing to this openness.

Think of the source snapshot as a "Code Preview" instead of a "Code Review". There's a newly-rewhacked design document there too, and we'd like you to look at it and discuss it on the OpenSolaris communities or the tref-discuss@opensolaris.org mailing list.

And once we're done with this, we can think about RFC 4301 (2401's replacement) and friends, more zones support, SMF-izing things, giving TX labelled SA support... :)




ESP without authentication considered harmful

Hopefully you will read this and go "That's obvious". I'm writing this entry, however, for those who don't.

When IPsec was being specified over 10 years ago, attacks against cipher-block-chaining (CBC) encryption were understood. ESP has an authentication algorithm because AH had a vocal-enough opposition to merit having packet integrity in ESP also (there are also performance arguments for ESP-auth).

Now there are actual attacks with actual results. Kenny Paterson and Arnold Yau have published a paper with attacks against no-authentication ESP Tunnel Mode. I believe some of the techniques can be employed against Transport Mode as well, but again, only with no authentication present.

The simple solution, of course, is to employ your choice of ESP Authentication (encr_auth_algs in ipsecconf(1m) or ifconfig(1m)) or AH (auth_algs in ipsecconf(1m) or ifconfig(1m)) with your IPsec deployment. We warn users about such configurations with ifconfig(1m) today. There is an RFE to eliminate or make very difficult encryption-only configurations in Solaris. Maybe someone in the OpenSolaris community would like to take a stab at it?

Also in Solaris 10 01/06 (aka. Update 1)

Solaris 10 Update 1 has shipped. There are nifty things like new-boot in there, but here's a small, subtle, but perhaps blog-worthy entry.

RFC 3947 NAT-Traversal is now part of Solaris. The RFCs were published literally days after the Solaris 10 development gate closed its doors for anything but critical bug fixes. We had used the draft-09 version of NAT-T for Solaris 10, but now we have the RFC-compliant one, which should ensure maximum interoperability.

Not a big deal, but -savvy folks out there ought to know.

It's time to enjoy my Christmas/New-Year's break. Enjoy your own end-of-year activities (whatever they are) and catch you in 2006.

Put IPsec to work in YOUR application

Hello coders!

Most people know that you can use ipsecconf(1m) to apply IPsec policy enforcement to an existing application. For example, if you wish to only allow inbound telnet traffic that's under IPsec protection, you'd put something like this into /etc/inet/ipsecinit.conf or other ipsecconf(1m) input:
# Inbound telnet traffic should be IPsec protected
{ lport 23 } ipsec { encr_algs any(128..) encr_auth_algs md5 sa shared}
    or ipsec { encr_algs any(128..) encr_auth_algs sha1 sa shared}


Combine that with appropriate IKE configuration or manual IPsec keys, and you can secure your telnet traffic against eavesdropping, connection hijacking, etc.

For existing services, using ipsecconf(1m) is the most expedient way to bring IPsec protection to bear on packets.

For new services, or services that are being modified anyway, consider using per-socket policy as an alternative. Some advantages to per-socket policy are:

  • Per-socket policy is stored internally in network session state (the conn_t structure in OpenSolaris). Entries from ipsecconf(1m) are stored in the global Security Policy Database (SPD). No global SPD entries means lower latency for fresh flow creation, and less lock acquisition.

  • Per-socket bypass means fewer bypass entries in global SPD. If I bypass remote-port 80 using ipsecconf(1m), I can, in theory, enter the system with a remote TCP packet with port=80. There's an RFE (6219908) to work around this, but per-socket is still quicker. I'd love a web proxy with the ability to set per-socket bypass.



The newly SMF-ized inetd(1m) would be a prime candidate for per-socket policy. See RFE 6226853, and this might be something someone in the OpenSolaris community would like to tackle!

Let's look at the ipsec_req_t structure that's been around since Solaris 8 in /usr/include/netinet/in.h:
/*
 * Different preferences that can be requested from IPSEC protocols.
 */

#define IP_SEC_OPT 0x22 /* Used to set IPSEC options */
#define IPSEC_PREF_NEVER 0x01
#define IPSEC_PREF_REQUIRED 0x02
#define IPSEC_PREF_UNIQUE 0x04
/*
 * This can be used with the setsockopt() call to set per socket security
 * options. When the application uses per-socket API, we will reflect
 * the request on both outbound and inbound packets.
 */

typedef struct ipsec_req {
	uint_t ipsr_ah_req; /* AH request */
	uint_t ipsr_esp_req; /* ESP request */
	uint_t ipsr_self_encap_req; /* Self-Encap request */
	uint8_t ipsr_auth_alg; /* Auth algs for AH */
	uint8_t ipsr_esp_alg; /* Encr algs for ESP */
	uint8_t ipsr_esp_auth_alg; /* Auth algs for ESP */
} ipsec_req_t;
The ipsec_req_t is a subset of what one can specify with ipsecconf(1m) in Solaris 9 or later, but it matched what one could do with Solaris 8's version. Algorithm values are derived from PF_KEY (see /usr/include/net/pfkeyv2.h for values), as below. One could also use getipsecalgbyname(3nsl). If I wish to set a socket to use ESP with AES and MD5, I'd set it up as follows:
	int s; /* Socket file descriptor... */

	ipsec_req_t ipsr;

 .....

	/* NOTE: Do this BEFORE calling connect() or accept() for TCP sockets. */
	ipsr.ipsr_ah_req = 0;
	ipsr.ipsr_esp_req = IPSEC_PREF_REQUIRED;

	ipsr.ipsr_self_encap_req = 0;
	ipsr.ipsr_auth_alg = 0;

	ipsr.ipsr_esp_alg = SADB_EALG_AES;
	ipsr.ipsr_esp_auth_alg = SADB_AALG_MD5HMAC;
	if (setsockopt(s, IPPROTO_IP, IP_SEC_OPT, &ipsr,
	    sizeof (ipsr)) == -1) {
		perror("setsockopt");
		bail(); /* Ugggh, we failed. */
	}
	/* You now have per-socket policy set. */
Notice I mentioned setting the socket option BEFORE calling connect() or accept()? This is because of a phenomenon we implement called connection latching. Basically, connection latching means that once an endpoint is connect()-ed, the IPsec policy (whether set per-socket or inherited from the state of the global SPD at the time) latches in place. We made this decision to avoid keeping policy-per-datagram state for things like TCP retransmits.
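If latching sounds abstract, here is a toy model of the idea in C. None of this is the Solaris kernel's code: every type and function name below is invented for illustration, the "policy" is a two-value enum rather than a real policy structure, and the model simply assumes that a per-socket policy, when present, wins over the global one.

```c
/* Toy policy values; real policy structures are far richer. */
typedef enum { POL_NONE, POL_ESP_REQUIRED } toy_policy_t;

/* Stand-in for what the global SPD would answer for a new flow. */
static toy_policy_t toy_global_policy = POL_NONE;

typedef struct toy_conn {
	int has_socket_policy;		/* nonzero if per-socket policy set */
	toy_policy_t socket_policy;	/* the per-socket request, if any */
	int latched;			/* nonzero once connected */
	toy_policy_t latched_policy;	/* decision frozen at connect time */
} toy_conn_t;

/* Policy that applies right now, ignoring any latch. */
static toy_policy_t
toy_current_policy(const toy_conn_t *c)
{
	return (c->has_socket_policy ? c->socket_policy : toy_global_policy);
}

/* "connect()": freeze whatever policy applies at this moment. */
static void
toy_connect(toy_conn_t *c)
{
	c->latched_policy = toy_current_policy(c);
	c->latched = 1;
}

/* Policy used for each packet: the latched one once connected. */
static toy_policy_t
toy_policy_for_packet(const toy_conn_t *c)
{
	return (c->latched ? c->latched_policy : toy_current_policy(c));
}
```

In this model, if the global policy is POL_ESP_REQUIRED when toy_connect() runs and is later relaxed to POL_NONE, toy_policy_for_packet() keeps returning POL_ESP_REQUIRED for that connection. Setting socket_policy after toy_connect() has no effect either, which mirrors the ordering note above.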

One thing per-socket policy does not address is the case of unconnected datagram services. In a perfect world, we could have IPsec policy information percolate all the way to the socket layer, where an application can make fully-informed per-datagram decisions on whether a particular packet was secured. It's a hard problem, requiring XNET sockets (to use sendmsg() and recvmsg() with ancillary data).

BTW, if you want to bypass whatever global entries are in the SPD, you can zero out the structure, and set all three (ah, esp, self_encap) action indicators to IPSEC_PREF_NEVER. You need to be privileged (root or "sys_net_config") to use per-socket bypass, however.
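Here is a minimal sketch of filling in the structure for bypass. The helper name fill_bypass_req() is mine, not part of any Solaris API. On Solaris the real definitions come from <netinet/in.h>; on anything else the sketch falls back to stand-ins copied from the header excerpt earlier in this entry (with uint_t spelled as unsigned int) so that it at least compiles.

```c
#include <stdint.h>
#include <string.h>

#ifdef __sun
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>	/* real ipsec_req_t and IPSEC_PREF_NEVER */
#else
/* Stand-ins mirroring the header excerpt, for non-Solaris builds only. */
#define	IPSEC_PREF_NEVER	0x01
typedef struct ipsec_req {
	unsigned int ipsr_ah_req;		/* AH request */
	unsigned int ipsr_esp_req;		/* ESP request */
	unsigned int ipsr_self_encap_req;	/* Self-Encap request */
	uint8_t ipsr_auth_alg;			/* Auth algs for AH */
	uint8_t ipsr_esp_alg;			/* Encr algs for ESP */
	uint8_t ipsr_esp_auth_alg;		/* Auth algs for ESP */
} ipsec_req_t;
#endif

/* Hypothetical helper: build a request that bypasses global SPD entries. */
static void
fill_bypass_req(ipsec_req_t *req)
{
	/* Zero everything, including the algorithm fields... */
	memset(req, 0, sizeof (*req));
	/* ...then mark all three action indicators as "never". */
	req->ipsr_ah_req = IPSEC_PREF_NEVER;
	req->ipsr_esp_req = IPSEC_PREF_NEVER;
	req->ipsr_self_encap_req = IPSEC_PREF_NEVER;
}
```

The filled-in structure then goes to setsockopt() with IPPROTO_IP and IP_SEC_OPT, exactly as in the ESP example earlier in this entry, and with the same ordering rule: do it before connect() or accept(). Without the required privilege, expect the setsockopt() call to fail.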

So modulo the keying problem (setting up IKE or having both ends agree on IPsec manual keys), you can put IPsec to work right in your application. In fact, if you use IKE, you can let IKE sort out permissions and access control (by using PKI-issued certificates, self-signed certificates, or preshared keys) and have policy merely determine the details of the protection required.




Sooner than I thought!

Well, it looks like our friends on the OpenSolaris source team have been working more quickly than I thought. If you look at certain files (like the ESP source), they've been mysteriously fleshed-out. While crypto hasn't been sorted out, some other IPsec-specific files are now present in the tree!

Since crypto hasn't shown up yet, I guess we can't fully celebrate, but once it does, maybe Bill or I will have some more to say on the IPsec source.
