OS-5330 zoneadm mounting an lx or joyent branded zone fails
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Approved by: Jerry Jelinek <jerry.jelinek@joyent.com>
(NOTE: Manual port, because of divergence from SmartOS.)
OS-3831 lxbrand /proc/cmdline should reflect zone boot arguments
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Approved by: Jerry Jelinek <jerry.jelinek@joyent.com>
Remove most KEBE comments and accompanying unused code or variables/fields.
Merge cleanup from previous six commits
OS-200 need a better mechanism for storing persistent zone_did
OS-2564 zone boot failed: could not start zoneadmd
OS-1763 mount of /etc/svc/volatile failed: Device busy
OS-511 make zonecfg device resource extensible, like the net resource
OS-224 add more zonecfg net properties
Reduce lint
Add zfd.c to zoneadmd's Makefile, a bit more not-yet ifdef-out.
zoneadmd mismerge (we don't support debug yet)
OS-4932 zoneadm boot args not passed to lx init
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
OS-4781 would like to be able to add CT_PR_EV_EXIT to fatal event set of current contract
OS-4253 lxbrand ubuntu 15.04 won't boot because /sbin/init is a symlink
OS-3524 in order to support interaction with docker containers, need to be able to connect to stdio for init from GZ
OS-3525 in order to support 'docker logs' need to be able to get stdio from zone to log file
OS-3429 Expose zone's init exit status
OS-3342 dlmgmtd needs to be mindful of lock ordering
OS-2608 dlmgmtd needs to record zone identifiers
OS-3492 zone_free asserts to its destruction when dlmgmtd has fallen
OS-3494 zoneadmd tears down networking too soon when boot fails
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
OS-3077 restarted zoneadmd uses invalid zlogp
OS-3075 zone long boot args aren't passed through
OS-11 rcapd behaves poorly when under extreme load
--- old/usr/src/cmd/zoneadmd/zoneadmd.c
+++ new/usr/src/cmd/zoneadmd/zoneadmd.c
1 1 /*
2 2 * CDDL HEADER START
3 3 *
4 4 * The contents of this file are subject to the terms of the
5 5 * Common Development and Distribution License (the "License").
6 6 * You may not use this file except in compliance with the License.
7 7 *
8 8 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
9 9 * or http://www.opensolaris.org/os/licensing.
10 10 * See the License for the specific language governing permissions
11 11 * and limitations under the License.
12 12 *
13 13 * When distributing Covered Code, include this CDDL HEADER in each
14 14 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
15 15 * If applicable, add the following below this CDDL HEADER, with the
16 16 * fields enclosed by brackets "[]" replaced with your own identifying
17 17 * information: Portions Copyright [yyyy] [name of copyright owner]
18 18 *
19 19 * CDDL HEADER END
20 20 */
21 21
22 22 /*
23 23 * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved.
24 24 * Copyright 2014 Nexenta Systems, Inc. All rights reserved.
25 + * Copyright 2016 Joyent, Inc.
25 26 */
26 27
27 28 /*
28 29 * zoneadmd manages zones; one zoneadmd process is launched for each
29 30 * non-global zone on the system. This daemon juggles four jobs:
30 31 *
31 32 * - Implement setup and teardown of the zone "virtual platform": mount and
32 33 * unmount filesystems; create and destroy network interfaces; communicate
33 34 * with devfsadmd to lay out devices for the zone; instantiate the zone
34 35 * console device; configure process runtime attributes such as resource
35 36 * controls, pool bindings, fine-grained privileges.
36 37 *
37 38 * - Launch the zone's init(1M) process.
38 39 *
39 40 * - Implement a door server; clients (like zoneadm) connect to the door
40 41 * server and request zone state changes. The kernel is also a client of
41 42 * this door server. A request to halt or reboot the zone which originates
42 43 * *inside* the zone results in a door upcall from the kernel into zoneadmd.
43 44 *
44 45 * One minor problem is that messages emitted by zoneadmd need to be passed
45 46 * back to the zoneadm process making the request. These messages need to
46 47 * be rendered in the client's locale; so, this is passed in as part of the
47 48 * request. The exception is the kernel upcall to zoneadmd, in which case
48 49 * messages are syslog'd.
49 50 *
50 51 * To make all of this work, the Makefile adds -a to xgettext to extract *all*
51 52 * strings, and an exclusion file (zoneadmd.xcl) is used to exclude those
52 53 * strings which do not need to be translated.
53 54 *
54 55 * - Act as a console server for zlogin -C processes; see comments in zcons.c
55 56 * for more information about the zone console architecture.
56 57 *
57 58 * DESIGN NOTES
58 59 *
59 60 * Restart:
60 61 * A chief design constraint of zoneadmd is that it should be restartable in
61 62 * the case that the administrator kills it off, or it suffers a fatal error,
62 63 * without the running zone being impacted; this is akin to being able to
63 64 * reboot the service processor of a server without affecting the OS instance.
64 65 */
65 66
66 67 #include <sys/param.h>
67 68 #include <sys/mman.h>
68 69 #include <sys/types.h>
69 70 #include <sys/stat.h>
70 71 #include <sys/sysmacros.h>
72 +#include <sys/time.h>
71 73
72 74 #include <bsm/adt.h>
73 75 #include <bsm/adt_event.h>
74 76
75 77 #include <alloca.h>
76 78 #include <assert.h>
77 79 #include <errno.h>
78 80 #include <door.h>
79 81 #include <fcntl.h>
80 82 #include <locale.h>
81 83 #include <signal.h>
82 84 #include <stdarg.h>
83 85 #include <stdio.h>
84 86 #include <stdlib.h>
85 87 #include <string.h>
86 88 #include <strings.h>
87 89 #include <synch.h>
88 90 #include <syslog.h>
89 91 #include <thread.h>
90 92 #include <unistd.h>
91 93 #include <wait.h>
92 94 #include <limits.h>
93 95 #include <zone.h>
94 96 #include <libbrand.h>
95 97 #include <sys/brand.h>
96 98 #include <libcontract.h>
97 99 #include <libcontract_priv.h>
98 100 #include <sys/brand.h>
99 101 #include <sys/contract/process.h>
100 102 #include <sys/ctfs.h>
101 103 #include <libdladm.h>
102 104 #include <sys/dls_mgmt.h>
103 105 #include <libscf.h>
104 106
105 107 #include <libzonecfg.h>
106 108 #include <zonestat_impl.h>
107 109 #include "zoneadmd.h"
108 110
109 111 static char *progname;
110 112 char *zone_name; /* zone which we are managing */
113 +zone_dochandle_t snap_hndl; /* handle for snapshot created when ready */
114 +char zonepath[MAXNAMELEN];
111 115 char pool_name[MAXNAMELEN];
112 116 char default_brand[MAXNAMELEN];
113 117 char brand_name[MAXNAMELEN];
114 118 boolean_t zone_isnative;
115 119 boolean_t zone_iscluster;
116 120 boolean_t zone_islabeled;
117 121 boolean_t shutdown_in_progress;
118 122 static zoneid_t zone_id;
119 123 dladm_handle_t dld_handle = NULL;
120 124
121 125 static char pre_statechg_hook[2 * MAXPATHLEN];
122 126 static char post_statechg_hook[2 * MAXPATHLEN];
123 127 char query_hook[2 * MAXPATHLEN];
124 128
125 129 zlog_t logsys;
126 130
127 131 mutex_t lock = DEFAULTMUTEX; /* to serialize stuff */
128 132 mutex_t msglock = DEFAULTMUTEX; /* for calling setlocale() */
129 133
130 134 static sema_t scratch_sem; /* for scratch zones */
131 135
132 136 static char zone_door_path[MAXPATHLEN];
133 137 static int zone_door = -1;
134 138
135 139 boolean_t in_death_throes = B_FALSE; /* daemon is dying */
136 140 boolean_t bringup_failure_recovery = B_FALSE; /* ignore certain failures */
137 141
138 142 #if !defined(TEXT_DOMAIN) /* should be defined by cc -D */
139 143 #define TEXT_DOMAIN "SYS_TEST" /* Use this only if it wasn't */
140 144 #endif
141 145
142 146 #define DEFAULT_LOCALE "C"
143 147
148 +#define RSRC_NET "net"
149 +#define RSRC_DEV "device"
150 +
144 151 static const char *
145 152 z_cmd_name(zone_cmd_t zcmd)
146 153 {
147 154 /* This list needs to match the enum in sys/zone.h */
148 155 static const char *zcmdstr[] = {
149 156 "ready", "boot", "forceboot", "reboot", "halt",
150 157 "note_uninstalling", "mount", "forcemount", "unmount",
151 158 "shutdown"
152 159 };
153 160
154 161 if (zcmd >= sizeof (zcmdstr) / sizeof (*zcmdstr))
155 162 return ("unknown");
156 163 else
157 164 return (zcmdstr[(int)zcmd]);
158 165 }
159 166
160 167 static char *
161 168 get_execbasename(char *execfullname)
162 169 {
163 170 char *last_slash, *execbasename;
164 171
165 172 /* guard against '/' at end of command invocation */
166 173 for (;;) {
167 174 last_slash = strrchr(execfullname, '/');
168 175 if (last_slash == NULL) {
169 176 execbasename = execfullname;
170 177 break;
171 178 } else {
172 179 execbasename = last_slash + 1;
173 180 if (*execbasename == '\0') {
174 181 *last_slash = '\0';
175 182 continue;
176 183 }
177 184 break;
178 185 }
179 186 }
180 187 return (execbasename);
181 188 }
182 189
183 190 static void
184 191 usage(void)
185 192 {
186 193 (void) fprintf(stderr, gettext("Usage: %s -z zonename\n"), progname);
187 194 (void) fprintf(stderr,
188 195 gettext("\tNote: %s should not be run directly.\n"), progname);
189 196 exit(2);
190 197 }
191 198
192 199 /* ARGSUSED */
193 200 static void
194 201 sigchld(int sig)
195 202 {
196 203 }
197 204
198 205 char *
199 206 localize_msg(char *locale, const char *msg)
200 207 {
201 208 char *out;
202 209
203 210 (void) mutex_lock(&msglock);
204 211 (void) setlocale(LC_MESSAGES, locale);
205 212 out = gettext(msg);
206 213 (void) setlocale(LC_MESSAGES, DEFAULT_LOCALE);
207 214 (void) mutex_unlock(&msglock);
208 215 return (out);
209 216 }
210 217
211 218 /* PRINTFLIKE3 */
212 219 void
213 220 zerror(zlog_t *zlogp, boolean_t use_strerror, const char *fmt, ...)
214 221 {
215 222 va_list alist;
216 223 char buf[MAXPATHLEN * 2]; /* enough space for err msg with a path */
217 224 char *bp;
218 225 int saved_errno = errno;
219 226
220 227 if (zlogp == NULL)
221 228 return;
222 229 if (zlogp == &logsys)
223 230 (void) snprintf(buf, sizeof (buf), "[zone '%s'] ",
224 231 zone_name);
225 232 else
226 233 buf[0] = '\0';
227 234 bp = &(buf[strlen(buf)]);
228 235
229 236 /*
230 237 * In theory, the locale pointer should be set to either "C" or a
231 238 * char array, so it should never be NULL
232 239 */
233 240 assert(zlogp->locale != NULL);
234 241 /* Locale is per process, but we are multi-threaded... */
235 242 fmt = localize_msg(zlogp->locale, fmt);
236 243
237 244 va_start(alist, fmt);
238 245 (void) vsnprintf(bp, sizeof (buf) - (bp - buf), fmt, alist);
239 246 va_end(alist);
240 247 bp = &(buf[strlen(buf)]);
241 248 if (use_strerror)
242 249 (void) snprintf(bp, sizeof (buf) - (bp - buf), ": %s",
243 250 strerror(saved_errno));
244 251 if (zlogp == &logsys) {
245 252 (void) syslog(LOG_ERR, "%s", buf);
246 253 } else if (zlogp->logfile != NULL) {
247 254 (void) fprintf(zlogp->logfile, "%s\n", buf);
248 255 } else {
249 256 size_t buflen;
250 257 size_t copylen;
251 258
252 259 buflen = snprintf(zlogp->log, zlogp->loglen, "%s\n", buf);
253 260 copylen = MIN(buflen, zlogp->loglen);
254 261 zlogp->log += copylen;
255 262 zlogp->loglen -= copylen;
256 263 }
257 264 }
258 265
259 266 /*
260 - * Emit a warning for any boot arguments which are unrecognized. Since
261 - * Solaris boot arguments are getopt(3c) compatible (see kernel(1m)), we
267 + * Append src to dest, modifying dest in the process. Prefix src with
268 + * a space character if dest is a non-empty string.
269 + */
270 +static void
271 +strnappend(char *dest, size_t n, const char *src)
272 +{
273 + (void) snprintf(dest, n, "%s%s%s", dest,
274 + dest[0] == '\0' ? "" : " ", src);
275 +}
276 +
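Note on strnappend() above: it passes dest to snprintf() as both the output buffer and a "%s" source, and C leaves the behavior of snprintf() undefined when the copied objects overlap. The following is a minimal sketch of an overlap-free variant, not code from this change; the name strnappend_safe is hypothetical, and it assumes dest is already NUL-terminated within n bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical alternative to strnappend(): append src to dest (capacity n),
 * inserting one space first when dest is non-empty. Output starts at the
 * current end of dest, so the source and destination never overlap.
 */
static void
strnappend_safe(char *dest, size_t n, const char *src)
{
	size_t used;

	assert(dest != NULL && src != NULL && n > 0);
	used = strlen(dest);
	if (used >= n - 1)
		return;		/* no room left; dest stays as it is */
	/* snprintf truncates and NUL-terminates within the remaining space */
	(void) snprintf(dest + used, n - used, "%s%s",
	    used == 0 ? "" : " ", src);
}
```

As in the original helper, an argument that does not fit is silently truncated; a caller that cares would need to compare the snprintf() return value against the remaining space.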
277 +/*
278 + * Since illumos boot arguments are getopt(3c) compatible (see kernel(1m)), we
262 279 * put the arguments into an argv style array, use getopt to process them,
263 - * and put the resultant argument string back into outargs.
280 + * and put the resultant argument string back into outargs. Non-native brands
281 + * may support alternate forms of boot arguments so we must handle that as well.
264 282 *
265 283 * During the filtering, we pull out any arguments which are truly "boot"
266 284 * arguments, leaving only those which are to be passed intact to the
267 285 * progenitor process. The one we support at the moment is -i, which
268 286 * indicates to the kernel which program should be launched as 'init'.
269 287 *
270 - * A return of Z_INVAL indicates specifically that the arguments are
271 - * not valid; this is a non-fatal error. Except for Z_OK, all other return
272 - * values are treated as fatal.
288 + * Except for Z_OK, all other return values are treated as fatal.
273 289 */
274 290 static int
275 291 filter_bootargs(zlog_t *zlogp, const char *inargs, char *outargs,
276 - char *init_file, char *badarg)
292 + char *init_file)
277 293 {
278 294 int argc = 0, argc_save;
279 295 int i;
280 296 int err;
281 297 char *arg, *lasts, **argv = NULL, **argv_save;
282 298 char zonecfg_args[BOOTARGS_MAX];
283 299 char scratchargs[BOOTARGS_MAX], *sargs;
300 + char scratchopt[3];
284 301 char c;
285 302
286 303 bzero(outargs, BOOTARGS_MAX);
287 - bzero(badarg, BOOTARGS_MAX);
288 304
289 305 /*
290 306 * If the user didn't specify transient boot arguments, check
291 307 * to see if there were any specified in the zone configuration,
292 308 * and use them if applicable.
293 309 */
294 310 if (inargs == NULL || inargs[0] == '\0') {
295 311 zone_dochandle_t handle;
296 312 if ((handle = zonecfg_init_handle()) == NULL) {
297 313 zerror(zlogp, B_TRUE,
298 314 "getting zone configuration handle");
299 315 return (Z_BAD_HANDLE);
300 316 }
301 317 err = zonecfg_get_snapshot_handle(zone_name, handle);
302 318 if (err != Z_OK) {
303 319 zerror(zlogp, B_FALSE,
304 320 "invalid configuration snapshot");
305 321 zonecfg_fini_handle(handle);
306 322 return (Z_BAD_HANDLE);
307 323 }
308 324
309 325 bzero(zonecfg_args, sizeof (zonecfg_args));
310 326 (void) zonecfg_get_bootargs(handle, zonecfg_args,
311 327 sizeof (zonecfg_args));
312 328 inargs = zonecfg_args;
313 329 zonecfg_fini_handle(handle);
314 330 }
315 331
316 332 if (strlen(inargs) >= BOOTARGS_MAX) {
317 333 zerror(zlogp, B_FALSE, "boot argument string too long");
318 334 return (Z_INVAL);
319 335 }
320 336
321 337 (void) strlcpy(scratchargs, inargs, sizeof (scratchargs));
322 338 sargs = scratchargs;
323 339 while ((arg = strtok_r(sargs, " \t", &lasts)) != NULL) {
324 340 sargs = NULL;
325 341 argc++;
326 342 }
327 343
328 344 if ((argv = calloc(argc + 1, sizeof (char *))) == NULL) {
329 345 zerror(zlogp, B_FALSE, "memory allocation failed");
330 346 return (Z_NOMEM);
331 347 }
332 348
333 349 argv_save = argv;
334 350 argc_save = argc;
335 351
336 352 (void) strlcpy(scratchargs, inargs, sizeof (scratchargs));
337 353 sargs = scratchargs;
338 354 i = 0;
339 355 while ((arg = strtok_r(sargs, " \t", &lasts)) != NULL) {
340 356 sargs = NULL;
341 357 if ((argv[i] = strdup(arg)) == NULL) {
342 358 err = Z_NOMEM;
343 359 zerror(zlogp, B_FALSE, "memory allocation failed");
344 360 goto done;
345 361 }
346 362 i++;
347 363 }
348 364
349 365 /*
350 - * We preserve compatibility with the Solaris system boot behavior,
366 + * We preserve compatibility with the illumos system boot behavior,
351 367 * which allows:
352 368 *
353 369 * # reboot kernel/unix -s -m verbose
354 370 *
355 - * In this example, kernel/unix tells the booter what file to
356 - * boot. We don't want reboot in a zone to be gratuitously different,
357 - * so we silently ignore the boot file, if necessary.
371 + * In this example, kernel/unix tells the booter what file to boot. The
372 + * original intent of this was that we didn't want reboot in a zone to
373 + * be gratuitously different, so we would silently ignore the boot
374 + * file, if necessary. However, this usage is archaic and has never
375 + * been common, since it is impossible to boot a zone onto a different
376 + * kernel. Ignoring the first argument breaks for non-native brands
377 + * which pass boot arguments in a different style. e.g.
378 + * systemd.log_level=debug
379 + * Thus, for backward compatibility we only ignore the first argument
380 + * if it appears to be in the illumos form and attempting to specify a
381 + * kernel.
358 382 */
359 383 if (argv[0] == NULL)
360 384 goto done;
361 385
362 386 assert(argv[0][0] != ' ');
363 387 assert(argv[0][0] != '\t');
364 388
365 - if (argv[0][0] != '-' && argv[0][0] != '\0') {
389 + if (strncmp(argv[0], "kernel/", 7) == 0) {
366 390 argv = &argv[1];
367 391 argc--;
368 392 }
369 393
370 394 optind = 0;
371 395 opterr = 0;
372 396 err = Z_OK;
373 397 while ((c = getopt(argc, argv, "fi:m:s")) != -1) {
374 398 switch (c) {
375 399 case 'i':
376 400 /*
377 401 * -i is handled by the runtime and is not passed
378 402 * along to userland
379 403 */
380 404 (void) strlcpy(init_file, optarg, MAXPATHLEN);
381 405 break;
382 406 case 'f':
383 407 /* This has already been processed by zoneadm */
384 408 break;
385 409 case 'm':
386 410 case 's':
387 411 /* These pass through unmolested */
388 - (void) snprintf(outargs, BOOTARGS_MAX,
389 - "%s -%c %s ", outargs, c, optarg ? optarg : "");
412 + (void) snprintf(scratchopt, sizeof (scratchopt),
413 + "-%c", c);
414 + strnappend(outargs, BOOTARGS_MAX, scratchopt);
415 + if (optarg != NULL)
416 + strnappend(outargs, BOOTARGS_MAX, optarg);
390 417 break;
391 418 case '?':
392 419 /*
393 - * We warn about unknown arguments but pass them
394 - * along anyway-- if someone wants to develop their
395 - * own init replacement, they can pass it whatever
396 - * args they want.
420 + * If a brand has its own init, we need to pass along
421 + * whatever the user provides. We use the entire
422 + * unknown string here so that we correctly handle
423 + * unknown long options (e.g. --debug).
397 424 */
398 - err = Z_INVAL;
399 - (void) snprintf(outargs, BOOTARGS_MAX,
400 - "%s -%c", outargs, optopt);
401 - (void) snprintf(badarg, BOOTARGS_MAX,
402 - "%s -%c", badarg, optopt);
425 + strnappend(outargs, BOOTARGS_MAX, argv[optind - 1]);
403 426 break;
404 427 }
405 428 }
406 429
407 430 /*
408 - * For Solaris Zones we warn about and discard non-option arguments.
409 - * Hence 'boot foo bar baz gub' --> 'boot'. However, to be similar
410 - * to the kernel, we concat up all the other remaining boot args.
411 - * and warn on them as a group.
431 + * We need to pass along everything else since we don't know what
432 + * the brand's init is expecting. For example, an argument list like:
433 + * --confdir /foo --debug
434 + * will cause the getopt parsing to stop at '/foo' but we need to pass
435 + * that on, along with the '--debug'. This does mean that we require
436 + * any of our known options (-ifms) to precede the brand-specific ones.
412 437 */
413 - if (optind < argc) {
414 - err = Z_INVAL;
415 - while (optind < argc) {
416 - (void) snprintf(badarg, BOOTARGS_MAX, "%s%s%s",
417 - badarg, strlen(badarg) > 0 ? " " : "",
418 - argv[optind]);
419 - optind++;
420 - }
421 - zerror(zlogp, B_FALSE, "WARNING: Unused or invalid boot "
422 - "arguments `%s'.", badarg);
438 + while (optind < argc) {
439 + strnappend(outargs, BOOTARGS_MAX, argv[optind]);
440 + optind++;
423 441 }
424 442
425 443 done:
426 444 for (i = 0; i < argc_save; i++) {
427 445 if (argv_save[i] != NULL)
428 446 free(argv_save[i]);
429 447 }
430 448 free(argv_save);
431 449 return (err);
432 450 }
433 451
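The change to the first-argument test in filter_bootargs() can be illustrated in isolation. This sketch pulls the old and new conditions out into two predicates so their behavior can be compared on sample arguments; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Old test: any first argument that is not an option was dropped. */
static bool
old_skips_first_arg(const char *arg)
{
	assert(arg != NULL);
	return (arg[0] != '-' && arg[0] != '\0');
}

/* New test: only an illumos-style "kernel/..." boot file is dropped. */
static bool
new_skips_first_arg(const char *arg)
{
	assert(arg != NULL);
	return (strncmp(arg, "kernel/", 7) == 0);
}
```

With these predicates, "kernel/unix" is skipped by both tests, while "systemd.log_level=debug" is skipped only by the old one, which is the case the commit comment describes for non-native brands.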
434 452
435 453 static int
436 454 mkzonedir(zlog_t *zlogp)
437 455 {
438 456 struct stat st;
439 457 /*
440 458 * We must create and lock everyone but root out of ZONES_TMPDIR
441 459 * since anyone can open any UNIX domain socket, regardless of
442 460 * its file system permissions. Sigh...
443 461 */
444 462 if (mkdir(ZONES_TMPDIR, S_IRWXU) < 0 && errno != EEXIST) {
445 463 zerror(zlogp, B_TRUE, "could not mkdir '%s'", ZONES_TMPDIR);
446 464 return (-1);
447 465 }
448 466 /* paranoia */
449 467 if ((stat(ZONES_TMPDIR, &st) < 0) || !S_ISDIR(st.st_mode)) {
450 468 zerror(zlogp, B_TRUE, "'%s' is not a directory", ZONES_TMPDIR);
451 469 return (-1);
452 470 }
453 471 (void) chmod(ZONES_TMPDIR, S_IRWXU);
454 472 return (0);
455 473 }
456 474
457 475 /*
458 476 * Run the brand's pre-state change callback, if it exists.
459 477 */
460 478 static int
461 479 brand_prestatechg(zlog_t *zlogp, int state, int cmd)
462 480 {
463 481 char cmdbuf[2 * MAXPATHLEN];
464 482 const char *altroot;
465 483
466 484 if (pre_statechg_hook[0] == '\0')
467 485 return (0);
468 486
469 487 altroot = zonecfg_get_root();
470 488 if (snprintf(cmdbuf, sizeof (cmdbuf), "%s %d %d %s", pre_statechg_hook,
471 489 state, cmd, altroot) > sizeof (cmdbuf))
472 490 return (-1);
473 491
474 492 if (do_subproc(zlogp, cmdbuf, NULL) != 0)
475 493 return (-1);
476 494
477 495 return (0);
478 496 }
479 497
480 498 /*
481 499 * Run the brand's post-state change callback, if it exists.
482 500 */
483 501 static int
484 502 brand_poststatechg(zlog_t *zlogp, int state, int cmd)
485 503 {
486 504 char cmdbuf[2 * MAXPATHLEN];
487 505 const char *altroot;
488 506
489 507 if (post_statechg_hook[0] == '\0')
490 508 return (0);
491 509
492 510 altroot = zonecfg_get_root();
493 511 if (snprintf(cmdbuf, sizeof (cmdbuf), "%s %d %d %s", post_statechg_hook,
494 512 state, cmd, altroot) > sizeof (cmdbuf))
495 513 return (-1);
496 514
497 515 if (do_subproc(zlogp, cmdbuf, NULL) != 0)
498 516 return (-1);
499 517
500 518 return (0);
501 519 }
502 520
503 521 /*
504 522 * Notify zonestatd of the new zone. If zonestatd is not running, this
505 523 * will do nothing.
506 524 */
507 525 static void
508 526 notify_zonestatd(zoneid_t zoneid)
509 527 {
510 528 int cmd[2];
511 529 int fd;
512 530 door_arg_t params;
513 531
514 532 fd = open(ZS_DOOR_PATH, O_RDONLY);
515 533 if (fd < 0)
516 534 return;
517 535
518 536 cmd[0] = ZSD_CMD_NEW_ZONE;
519 537 cmd[1] = zoneid;
520 538 params.data_ptr = (char *)&cmd;
521 539 params.data_size = sizeof (cmd);
522 540 params.desc_ptr = NULL;
523 541 params.desc_num = 0;
524 542 params.rbuf = NULL;
525 543 params.rsize = NULL;
526 544 (void) door_call(fd, &params);
527 545 (void) close(fd);
528 546 }
529 547
530 548 /*
531 549 * Bring a zone up to the pre-boot "ready" stage. The mount_cmd argument is
532 550 * 'true' if this is being invoked as part of the processing for the "mount"
533 551 * subcommand.
534 552 */
535 553 static int
536 554 zone_ready(zlog_t *zlogp, zone_mnt_t mount_cmd, int zstate)
537 555 {
538 556 int err;
539 557
540 - if (brand_prestatechg(zlogp, zstate, Z_READY) != 0)
558 + if (!ALT_MOUNT(mount_cmd) &&
559 + brand_prestatechg(zlogp, zstate, Z_READY) != 0)
541 560 return (-1);
542 561
543 562 if ((err = zonecfg_create_snapshot(zone_name)) != Z_OK) {
544 563 zerror(zlogp, B_FALSE, "unable to create snapshot: %s",
545 564 zonecfg_strerror(err));
546 565 goto bad;
547 566 }
548 567
549 568 if ((zone_id = vplat_create(zlogp, mount_cmd)) == -1) {
550 569 if ((err = zonecfg_destroy_snapshot(zone_name)) != Z_OK)
551 570 zerror(zlogp, B_FALSE, "destroying snapshot: %s",
552 571 zonecfg_strerror(err));
553 572 goto bad;
554 573 }
555 574 if (vplat_bringup(zlogp, mount_cmd, zone_id) != 0) {
556 575 bringup_failure_recovery = B_TRUE;
557 576 (void) vplat_teardown(NULL, (mount_cmd != Z_MNT_BOOT), B_FALSE);
558 577 if ((err = zonecfg_destroy_snapshot(zone_name)) != Z_OK)
559 578 zerror(zlogp, B_FALSE, "destroying snapshot: %s",
560 579 zonecfg_strerror(err));
561 580 goto bad;
562 581 }
563 582
564 - if (brand_poststatechg(zlogp, zstate, Z_READY) != 0)
583 + if (!ALT_MOUNT(mount_cmd) &&
584 + brand_poststatechg(zlogp, zstate, Z_READY) != 0)
565 585 goto bad;
566 586
567 587 return (0);
568 588
569 589 bad:
570 590 /*
571 591 * If something goes wrong, we up the zones's state to the target
572 592 * state, READY, and then invoke the hook as if we're halting.
573 593 */
574 - (void) brand_poststatechg(zlogp, ZONE_STATE_READY, Z_HALT);
594 + if (!ALT_MOUNT(mount_cmd))
595 + (void) brand_poststatechg(zlogp, ZONE_STATE_READY, Z_HALT);
575 596 return (-1);
576 597 }
577 598
578 599 int
579 600 init_template(void)
580 601 {
581 602 int fd;
582 603 int err = 0;
583 604
584 605 fd = open64(CTFS_ROOT "/process/template", O_RDWR);
585 606 if (fd == -1)
586 607 return (-1);
587 608
588 609 /*
589 610 * For now, zoneadmd doesn't do anything with the contract.
590 611 * Deliver no events, don't inherit, and allow it to be orphaned.
591 612 */
592 613 err |= ct_tmpl_set_critical(fd, 0);
593 614 err |= ct_tmpl_set_informative(fd, 0);
594 615 err |= ct_pr_tmpl_set_fatal(fd, CT_PR_EV_HWERR);
595 616 err |= ct_pr_tmpl_set_param(fd, CT_PR_PGRPONLY | CT_PR_REGENT);
596 617 if (err || ct_tmpl_activate(fd)) {
597 618 (void) close(fd);
598 619 return (-1);
599 620 }
600 621
601 622 return (fd);
602 623 }
603 624
604 625 typedef struct fs_callback {
605 626 zlog_t *zlogp;
606 627 zoneid_t zoneid;
607 628 boolean_t mount_cmd;
608 629 } fs_callback_t;
609 630
610 631 static int
611 632 mount_early_fs(void *data, const char *spec, const char *dir,
612 633 const char *fstype, const char *opt)
613 634 {
614 635 zlog_t *zlogp = ((fs_callback_t *)data)->zlogp;
615 636 zoneid_t zoneid = ((fs_callback_t *)data)->zoneid;
616 637 boolean_t mount_cmd = ((fs_callback_t *)data)->mount_cmd;
617 638 char rootpath[MAXPATHLEN];
618 639 pid_t child;
619 640 int child_status;
620 641 int tmpl_fd;
621 642 int rv;
622 643 ctid_t ct;
623 644
624 645 /* determine the zone rootpath */
625 646 if (mount_cmd) {
626 - char zonepath[MAXPATHLEN];
627 647 char luroot[MAXPATHLEN];
628 648
629 - if (zone_get_zonepath(zone_name,
630 - zonepath, sizeof (zonepath)) != Z_OK) {
631 - zerror(zlogp, B_FALSE, "unable to determine zone path");
632 - return (-1);
633 - }
634 -
635 649 (void) snprintf(luroot, sizeof (luroot), "%s/lu", zonepath);
636 650 resolve_lofs(zlogp, luroot, sizeof (luroot));
637 651 (void) strlcpy(rootpath, luroot, sizeof (rootpath));
638 652 } else {
639 653 if (zone_get_rootpath(zone_name,
640 654 rootpath, sizeof (rootpath)) != Z_OK) {
641 655 zerror(zlogp, B_FALSE, "unable to determine zone root");
642 656 return (-1);
643 657 }
644 658 }
645 659
646 660 if ((rv = valid_mount_path(zlogp, rootpath, spec, dir, fstype)) < 0) {
647 661 zerror(zlogp, B_FALSE, "%s%s is not a valid mount point",
648 662 rootpath, dir);
649 663 return (-1);
650 664 } else if (rv > 0) {
651 665 /* The mount point path doesn't exist, create it now. */
652 666 if (make_one_dir(zlogp, rootpath, dir,
653 667 DEFAULT_DIR_MODE, DEFAULT_DIR_USER,
654 668 DEFAULT_DIR_GROUP) != 0) {
655 669 zerror(zlogp, B_FALSE, "failed to create mount point");
656 670 return (-1);
657 671 }
658 672
659 673 /*
660 674 * Now this might seem weird, but we need to invoke
661 675 * valid_mount_path() again. Why? Because it checks
662 676 * to make sure that the mount point path is canonical,
663 677 * which it can only do if the path exists, so now that
664 678 * we've created the path we have to verify it again.
665 679 */
666 680 if ((rv = valid_mount_path(zlogp, rootpath, spec, dir,
667 681 fstype)) < 0) {
668 682 zerror(zlogp, B_FALSE,
669 683 "%s%s is not a valid mount point", rootpath, dir);
670 684 return (-1);
671 685 }
672 686 }
673 687
674 688 if ((tmpl_fd = init_template()) == -1) {
675 689 zerror(zlogp, B_TRUE, "failed to create contract");
676 690 return (-1);
677 691 }
678 692
679 693 if ((child = fork()) == -1) {
680 694 (void) ct_tmpl_clear(tmpl_fd);
681 695 (void) close(tmpl_fd);
682 696 zerror(zlogp, B_TRUE, "failed to fork");
683 697 return (-1);
684 698
685 699 } else if (child == 0) { /* child */
686 700 char opt_buf[MAX_MNTOPT_STR];
687 701 int optlen = 0;
688 702 int mflag = MS_DATA;
703 + int i;
704 + int ret;
689 705
690 706 (void) ct_tmpl_clear(tmpl_fd);
691 707 /*
692 708 * Even though there are no procs running in the zone, we
693 709 * do this for paranoia's sake.
694 710 */
695 711 (void) closefrom(0);
696 712
697 713 if (zone_enter(zoneid) == -1) {
698 714 _exit(errno);
699 715 }
700 716 if (opt != NULL) {
701 717 /*
702 718 * The mount() system call is incredibly annoying.
703 719 * If options are specified, we need to copy them
704 720 * into a temporary buffer since the mount() system
705 721 * call will overwrite the options string. It will
706 722 * also fail if the new option string it wants to
707 723 * write is bigger than the one we passed in, so
708 724 * you must pass in a buffer of the maximum possible
709 725 * option string length. sigh.
710 726 */
711 727 (void) strlcpy(opt_buf, opt, sizeof (opt_buf));
712 728 opt = opt_buf;
713 729 optlen = MAX_MNTOPT_STR;
714 730 mflag = MS_OPTIONSTR;
715 731 }
716 - if (mount(spec, dir, mflag, fstype, NULL, 0, opt, optlen) != 0)
717 - _exit(errno);
718 - _exit(0);
732 +
733 + /*
734 + * There is an obscure race condition which can cause mount
735 + * to return EBUSY. This happens for example on the mount
736 + * of the zone's /etc/svc/volatile file system if there is
737 + * a GZ process running svcs -Z, which will touch the
738 + * mountpoint, just as we're trying to do the mount. To cope
739 + * with this, we retry up to 3 times to let this transient
740 + * process get out of the way.
741 + */
742 + for (i = 0; i < 3; i++) {
743 + ret = 0;
744 + if (mount(spec, dir, mflag, fstype, NULL, 0, opt,
745 + optlen) != 0)
746 + ret = errno;
747 + if (ret != EBUSY)
748 + break;
749 + (void) sleep(1);
750 + }
751 + _exit(ret);
719 752 }
720 753
721 754 /* parent */
722 755 if (contract_latest(&ct) == -1)
723 756 ct = -1;
724 757 (void) ct_tmpl_clear(tmpl_fd);
725 758 (void) close(tmpl_fd);
726 759 if (waitpid(child, &child_status, 0) != child) {
727 760 /* unexpected: we must have been signalled */
728 761 (void) contract_abandon_id(ct);
729 762 return (-1);
730 763 }
731 764 (void) contract_abandon_id(ct);
732 765 if (WEXITSTATUS(child_status) != 0) {
733 766 errno = WEXITSTATUS(child_status);
734 767 zerror(zlogp, B_TRUE, "mount of %s failed", dir);
735 768 return (-1);
736 769 }
737 770
738 771 return (0);
739 772 }
740 773
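The EBUSY handling added to the child in mount_early_fs() follows a bounded-retry pattern: try, stop on success or on any error other than EBUSY, otherwise sleep and try again. A minimal generic sketch of that pattern is below. The helper retry_on_ebusy, the callback type try_fn_t, and the test double busy_n_times are all hypothetical names, and the delay is a parameter here (the diff hard-codes three attempts and a one-second sleep).

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success or an errno value. */
typedef int (*try_fn_t)(void *);

/*
 * Call fn up to 'attempts' times, sleeping 'delay' seconds after each EBUSY
 * result. Any other result (success or a different error) ends the loop.
 * Returns the result of the final call.
 */
static int
retry_on_ebusy(try_fn_t fn, void *arg, int attempts, unsigned int delay)
{
	int i, ret = 0;

	assert(fn != NULL && attempts > 0);
	for (i = 0; i < attempts; i++) {
		ret = fn(arg);
		if (ret != EBUSY)
			break;
		if (delay > 0)
			(void) sleep(delay);
	}
	return (ret);
}

/* Stand-in operation: reports EBUSY until the counter in arg reaches zero. */
static int
busy_n_times(void *arg)
{
	int *remaining = (int *)arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (EBUSY);
	}
	return (0);
}
```

If every attempt reports EBUSY, the caller still sees EBUSY, which matches the diff: the child exits with the last errno and the parent logs the mount failure.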
741 774 /*
775 + * env variable name format
776 + * _ZONECFG_{resource name}_{identifying attr. name}_{property name}
777 + */
778 +static void
779 +set_zonecfg_env(char *rsrc, char *attr, char *name, char *val)
780 +{
781 + char *p;
782 + /* Enough for maximal name, rsrc + attr, & slop for ZONECFG & _'s */
783 + char nm[2 * MAXNAMELEN + 32];
784 +
785 + if (attr == NULL)
786 + (void) snprintf(nm, sizeof (nm), "_ZONECFG_%s_%s", rsrc,
787 + name);
788 + else
789 + (void) snprintf(nm, sizeof (nm), "_ZONECFG_%s_%s_%s", rsrc,
790 + attr, name);
791 +
792 + p = nm;
793 + while ((p = strchr(p, '-')) != NULL)
794 + *p++ = '_';
795 +
796 + (void) setenv(nm, val, 1);
797 +}
798 +
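To make the resulting variable names concrete, here is a small sketch of the naming step performed by set_zonecfg_env(), separated from the setenv() call so the name can be inspected. The helper zonecfg_env_name is a hypothetical name for this illustration; the format strings and the '-' to '_' substitution are taken from the function above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the environment variable name into nm (capacity len). attr may be
 * NULL for resources that have no identifying attribute. Every '-' in the
 * result becomes '_' so the name is usable from a shell hook.
 */
static void
zonecfg_env_name(char *nm, size_t len, const char *rsrc, const char *attr,
    const char *name)
{
	char *p;

	assert(nm != NULL && len > 0);
	if (attr == NULL)
		(void) snprintf(nm, len, "_ZONECFG_%s_%s", rsrc, name);
	else
		(void) snprintf(nm, len, "_ZONECFG_%s_%s_%s", rsrc, attr,
		    name);

	for (p = nm; (p = strchr(p, '-')) != NULL; )
		*p++ = '_';
}
```

For example, the net resource "net0" with property "allowed-address" yields _ZONECFG_net_net0_allowed_address. The substitution applies to the whole name, so a '-' in the resource's identifying attribute is rewritten as well.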
799 +/*
800 + * Export zonecfg network and device properties into environment for the boot
801 + * and state change hooks.
802 + * If debug is true, export the brand hook debug env. variable as well.
803 + *
804 + * We could export more of the config in the future, as necessary.
805 + */
806 +static int
807 +setup_subproc_env()
808 +{
809 + int res;
810 + zone_dochandle_t handle;
811 + struct zone_nwiftab ntab;
812 + struct zone_devtab dtab;
813 + char net_resources[MAXNAMELEN * 2];
814 + char dev_resources[MAXNAMELEN * 2];
815 +
816 + if ((handle = zonecfg_init_handle()) == NULL)
817 + exit(Z_NOMEM);
818 +
819 + if ((res = zonecfg_get_handle(zone_name, handle)) != Z_OK)
820 + goto done;
821 +
822 + if ((res = zonecfg_setnwifent(handle)) != Z_OK)
823 + goto done;
824 +
825 + while (zonecfg_getnwifent(handle, &ntab) == Z_OK) {
826 + struct zone_res_attrtab *rap;
827 + char *phys;
828 +
829 + phys = ntab.zone_nwif_physical;
830 +
831 + (void) strlcat(net_resources, phys, sizeof (net_resources));
832 + (void) strlcat(net_resources, " ", sizeof (net_resources));
833 +
834 + set_zonecfg_env(RSRC_NET, phys, "physical", phys);
835 +
836 + set_zonecfg_env(RSRC_NET, phys, "address",
837 + ntab.zone_nwif_address);
838 + set_zonecfg_env(RSRC_NET, phys, "allowed-address",
839 + ntab.zone_nwif_allowed_address);
840 + set_zonecfg_env(RSRC_NET, phys, "defrouter",
841 + ntab.zone_nwif_defrouter);
842 + set_zonecfg_env(RSRC_NET, phys, "global-nic",
843 + ntab.zone_nwif_gnic);
844 + set_zonecfg_env(RSRC_NET, phys, "mac-addr", ntab.zone_nwif_mac);
845 + set_zonecfg_env(RSRC_NET, phys, "vlan-id",
846 + ntab.zone_nwif_vlan_id);
847 +
848 + for (rap = ntab.zone_nwif_attrp; rap != NULL;
849 + rap = rap->zone_res_attr_next)
850 + set_zonecfg_env(RSRC_NET, phys, rap->zone_res_attr_name,
851 + rap->zone_res_attr_value);
852 + }
853 +
854 + (void) zonecfg_endnwifent(handle);
855 +
856 + if ((res = zonecfg_setdevent(handle)) != Z_OK)
857 + goto done;
858 +
859 + while (zonecfg_getdevent(handle, &dtab) == Z_OK) {
860 + struct zone_res_attrtab *rap;
861 + char *match;
862 +
863 + match = dtab.zone_dev_match;
864 +
865 + (void) strlcat(dev_resources, match, sizeof (dev_resources));
866 + (void) strlcat(dev_resources, " ", sizeof (dev_resources));
867 +
868 + for (rap = dtab.zone_dev_attrp; rap != NULL;
869 + rap = rap->zone_res_attr_next)
870 + set_zonecfg_env(RSRC_DEV, match,
871 + rap->zone_res_attr_name, rap->zone_res_attr_value);
872 + }
873 +
874 + (void) zonecfg_enddevent(handle);
875 +
876 + res = Z_OK;
877 +
878 +done:
879 + zonecfg_fini_handle(handle);
880 + return (res);
881 +}
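
The net_resources and dev_resources buffers above are built up by repeated concatenation. strlcat() appends to an existing NUL-terminated string, so an accumulator of this kind has to start out as the empty string. Below is a minimal sketch of the same accumulation in standard C. It uses snprintf() instead of strlcat(), which is not available in every libc, and append_word() is a hypothetical helper, not something from this change.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append "word " to the space-separated list in dst.
 * dst must already hold a NUL-terminated string (possibly empty).
 * Returns 0 on success, -1 if the result was truncated.
 */
int
append_word(char *dst, size_t dstlen, const char *word)
{
	size_t used;
	int n;

	assert(dst != NULL && word != NULL);

	used = strlen(dst);
	if (used >= dstlen)
		return (-1);

	/* snprintf() never writes past dstlen - used bytes, NUL included. */
	n = snprintf(dst + used, dstlen - used, "%s ", word);
	if (n < 0 || (size_t)n >= dstlen - used)
		return (-1);
	return (0);
}
```

A caller would declare the list as `char list[N] = "";` (or set `list[0] = '\0'`) before the first call, then append one physical or match name per loop iteration.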
882 +
883 +/*
742 884 * If retstr is not NULL, the output of the subproc is returned in the str,
743 885 * otherwise it is output using zerror(). Any memory allocated for retstr
744 886 * should be freed by the caller.
745 887 */
746 888 int
747 889 do_subproc(zlog_t *zlogp, char *cmdbuf, char **retstr)
748 890 {
749 891 char buf[1024]; /* arbitrary large amount */
750 892 char *inbuf;
751 893 FILE *file;
752 894 int status;
753 895 int rd_cnt;
754 896
755 897 if (retstr != NULL) {
756 898 if ((*retstr = malloc(1024)) == NULL) {
757 899 zerror(zlogp, B_FALSE, "out of memory");
758 900 return (-1);
759 901 }
760 902 inbuf = *retstr;
761 903 rd_cnt = 0;
762 904 } else {
763 905 inbuf = buf;
764 906 }
765 907
908 + if (setup_subproc_env() != Z_OK) {
909 + zerror(zlogp, B_FALSE, "failed to setup environment");
910 + return (-1);
911 + }
912 +
766 913 file = popen(cmdbuf, "r");
767 914 if (file == NULL) {
768 915 zerror(zlogp, B_TRUE, "could not launch: %s", cmdbuf);
769 916 return (-1);
770 917 }
771 918
772 919 while (fgets(inbuf, 1024, file) != NULL) {
773 920 if (retstr == NULL) {
774 921 if (zlogp != &logsys)
775 922 zerror(zlogp, B_FALSE, "%s", inbuf);
776 923 } else {
777 924 char *p;
778 925
779 926 rd_cnt += 1024 - 1;
780 927 if ((p = realloc(*retstr, rd_cnt + 1024)) == NULL) {
781 928 zerror(zlogp, B_FALSE, "out of memory");
782 929 (void) pclose(file);
783 930 return (-1);
784 931 }
785 932
786 933 *retstr = p;
787 934 inbuf = *retstr + rd_cnt;
788 935 }
789 936 }
790 937 status = pclose(file);
791 938
792 939 if (WIFSIGNALED(status)) {
793 940 zerror(zlogp, B_FALSE, "%s unexpectedly terminated due to "
794 941 "signal %d", cmdbuf, WTERMSIG(status));
795 942 return (-1);
796 943 }
797 944 assert(WIFEXITED(status));
798 945 if (WEXITSTATUS(status) == ZEXIT_EXEC) {
799 946 zerror(zlogp, B_FALSE, "failed to exec %s", cmdbuf);
800 947 return (-1);
801 948 }
802 949 return (WEXITSTATUS(status));
803 950 }
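
do_subproc() relies on two POSIX behaviours: a command started with popen() inherits the caller's environment, so variables set just before the call are visible to the hook, and pclose() returns a wait status that has to be decoded with the W* macros. The sketch below shows both in isolation. run_hook() is a hypothetical stand-in, not the function above: it keeps only the first line of output and treats any abnormal termination as -1.

```c
#define	_POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: run cmd via the shell, copy the first line of its
 * standard output into out, and return its exit status (-1 on failure to
 * launch or if the command did not exit normally).
 */
int
run_hook(const char *cmd, char *out, size_t outlen)
{
	FILE *fp;
	size_t n;
	int status;
	char scratch[256];

	assert(cmd != NULL && out != NULL && outlen > 0);

	if ((fp = popen(cmd, "r")) == NULL)
		return (-1);

	n = fread(out, 1, outlen - 1, fp);
	out[n] = '\0';
	out[strcspn(out, "\n")] = '\0';	/* keep the first line only */

	/* Drain the rest so the child is not left blocked on a full pipe. */
	while (fread(scratch, 1, sizeof (scratch), fp) > 0)
		;

	status = pclose(fp);
	if (status == -1 || !WIFEXITED(status))
		return (-1);
	return (WEXITSTATUS(status));
}
```

For example, after `setenv("_ZONECFG_net_net0_physical", "net0", 1)`, a hook command that prints `$_ZONECFG_net_net0_physical` sees the value without any further plumbing.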
804 951
952 +/*
953 + * Get the app-svc-dependent flag for this zone's init process. This is a
954 + * zone-specific attr which controls the type of contract we create for the
955 + * zone's init. When true, the contract will include CT_PR_EV_EXIT in the fatal
956 + * set, so that when any service which is in the same contract exits, the init
957 + * application will be terminated.
958 + *
959 + * We use the global "snap_hndl", so no parameters get passed here.
960 + */
961 +static boolean_t
962 +is_app_svc_dep(void)
963 +{
964 + struct zone_attrtab a;
965 +
966 + bzero(&a, sizeof (a));
967 + (void) strlcpy(a.zone_attr_name, "app-svc-dependent",
968 + sizeof (a.zone_attr_name));
969 +
970 + if (zonecfg_lookup_attr(snap_hndl, &a) == Z_OK &&
971 + strcmp(a.zone_attr_value, "true") == 0) {
972 + return (B_TRUE);
973 + }
974 +
975 + return (B_FALSE);
976 +}
977 +
805 978 static int
806 979 zone_bootup(zlog_t *zlogp, const char *bootargs, int zstate)
807 980 {
808 981 zoneid_t zoneid;
809 982 struct stat st;
810 - char zpath[MAXPATHLEN], initpath[MAXPATHLEN], init_file[MAXPATHLEN];
983 + char rpath[MAXPATHLEN], initpath[MAXPATHLEN], init_file[MAXPATHLEN];
811 984 char nbootargs[BOOTARGS_MAX];
812 985 char cmdbuf[MAXPATHLEN];
813 986 fs_callback_t cb;
814 987 brand_handle_t bh;
815 988 zone_iptype_t iptype;
816 - boolean_t links_loaded = B_FALSE;
817 989 dladm_status_t status;
818 990 char errmsg[DLADM_STRSIZE];
819 991 int err;
820 992 boolean_t restart_init;
993 + boolean_t app_svc_dep;
821 994
822 995 if (brand_prestatechg(zlogp, zstate, Z_BOOT) != 0)
823 996 return (-1);
824 997
825 998 if ((zoneid = getzoneidbyname(zone_name)) == -1) {
826 999 zerror(zlogp, B_TRUE, "unable to get zoneid");
827 1000 goto bad;
828 1001 }
829 1002
830 1003 cb.zlogp = zlogp;
831 1004 cb.zoneid = zoneid;
832 1005 cb.mount_cmd = B_FALSE;
833 1006
834 1007 /* Get a handle to the brand info for this zone */
835 1008 if ((bh = brand_open(brand_name)) == NULL) {
836 1009 zerror(zlogp, B_FALSE, "unable to determine zone brand");
837 1010 goto bad;
838 1011 }
839 1012
840 1013 /*
841 1014 * Get the list of filesystems to mount from the brand
842 1015 * configuration. These mounts are done via a thread that will
843 1016 * enter the zone, so they are done from within the context of the
844 1017 * zone.
845 1018 */
846 1019 if (brand_platform_iter_mounts(bh, mount_early_fs, &cb) != 0) {
847 1020 zerror(zlogp, B_FALSE, "unable to mount filesystems");
848 1021 brand_close(bh);
849 1022 goto bad;
850 1023 }
851 1024
852 1025 /*
853 1026 * Get the brand's boot callback if it exists.
854 1027 */
855 - if (zone_get_zonepath(zone_name, zpath, sizeof (zpath)) != Z_OK) {
856 - zerror(zlogp, B_FALSE, "unable to determine zone path");
857 - brand_close(bh);
858 - goto bad;
859 - }
860 1028 (void) strcpy(cmdbuf, EXEC_PREFIX);
861 - if (brand_get_boot(bh, zone_name, zpath, cmdbuf + EXEC_LEN,
1029 + if (brand_get_boot(bh, zone_name, zonepath, cmdbuf + EXEC_LEN,
862 1030 sizeof (cmdbuf) - EXEC_LEN) != 0) {
863 1031 zerror(zlogp, B_FALSE,
864 1032 "unable to determine branded zone's boot callback");
865 1033 brand_close(bh);
866 1034 goto bad;
867 1035 }
868 1036
869 1037 /* Get the path for this zone's init(1M) (or equivalent) process. */
870 1038 if (brand_get_initname(bh, init_file, MAXPATHLEN) != 0) {
871 1039 zerror(zlogp, B_FALSE,
872 1040 "unable to determine zone's init(1M) location");
873 1041 brand_close(bh);
874 1042 goto bad;
875 1043 }
876 1044
877 1045 /* See if this zone's brand should restart init if it dies. */
878 1046 restart_init = brand_restartinit(bh);
879 1047
1048 + /*
1049 + * See if we need to set up contract dependencies between the zone's
1050 + * primary application and any of its services.
1051 + */
1052 + app_svc_dep = is_app_svc_dep();
1053 +
880 1054 brand_close(bh);
881 1055
882 - err = filter_bootargs(zlogp, bootargs, nbootargs, init_file,
883 - bad_boot_arg);
884 - if (err == Z_INVAL)
885 - eventstream_write(Z_EVT_ZONE_BADARGS);
886 - else if (err != Z_OK)
1056 + err = filter_bootargs(zlogp, bootargs, nbootargs, init_file);
1057 + if (err != Z_OK)
887 1058 goto bad;
888 1059
889 1060 assert(init_file[0] != '\0');
890 1061
891 - /* Try to anticipate possible problems: Make sure init is executable. */
892 - if (zone_get_rootpath(zone_name, zpath, sizeof (zpath)) != Z_OK) {
1062 + /*
1063 + * Try to anticipate possible problems: If possible, make sure init is
1064 + * executable.
1065 + */
1066 + if (zone_get_rootpath(zone_name, rpath, sizeof (rpath)) != Z_OK) {
893 1067 zerror(zlogp, B_FALSE, "unable to determine zone root");
894 1068 goto bad;
895 1069 }
896 1070
897 - (void) snprintf(initpath, sizeof (initpath), "%s%s", zpath, init_file);
1071 + (void) snprintf(initpath, sizeof (initpath), "%s%s", rpath, init_file);
898 1072
899 - if (stat(initpath, &st) == -1) {
1073 + if (lstat(initpath, &st) == -1) {
900 1074 zerror(zlogp, B_TRUE, "could not stat %s", initpath);
901 1075 goto bad;
902 1076 }
903 1077
904 - if ((st.st_mode & S_IXUSR) == 0) {
1078 + /*
1079 + * If a symlink, we'll have to wait and resolve when we boot,
1080 + * otherwise check the executable bits now.
1081 + */
1082 + if ((st.st_mode & S_IFMT) != S_IFLNK && (st.st_mode & S_IXUSR) == 0) {
905 1083 zerror(zlogp, B_FALSE, "%s is not executable", initpath);
906 1084 goto bad;
907 1085 }
908 1086
909 1087 /*
910 1088 * Exclusive stack zones interact with the dlmgmtd running in the
911 1089 * global zone. dladm_zone_boot() tells dlmgmtd that this zone is
912 1090 * booting, and loads its datalinks from the zone's datalink
913 1091 * configuration file.
914 1092 */
915 1093 if (vplat_get_iptype(zlogp, &iptype) == 0 && iptype == ZS_EXCLUSIVE) {
916 1094 status = dladm_zone_boot(dld_handle, zoneid);
917 1095 if (status != DLADM_STATUS_OK) {
918 1096 zerror(zlogp, B_FALSE, "unable to load zone datalinks: "
919 1097 " %s", dladm_status2str(status, errmsg));
920 1098 goto bad;
921 1099 }
922 - links_loaded = B_TRUE;
923 1100 }
924 1101
925 1102 /*
926 1103 * If there is a brand 'boot' callback, execute it now to give the
927 1104 * brand one last chance to do any additional setup before the zone
928 1105 * is booted.
929 1106 */
930 1107 if ((strlen(cmdbuf) > EXEC_LEN) &&
931 1108 (do_subproc(zlogp, cmdbuf, NULL) != Z_OK)) {
932 1109 zerror(zlogp, B_FALSE, "%s failed", cmdbuf);
933 1110 goto bad;
934 1111 }
935 1112
936 1113 if (zone_setattr(zoneid, ZONE_ATTR_INITNAME, init_file, 0) == -1) {
937 1114 zerror(zlogp, B_TRUE, "could not set zone boot file");
938 1115 goto bad;
939 1116 }
940 1117
941 1118 if (zone_setattr(zoneid, ZONE_ATTR_BOOTARGS, nbootargs, 0) == -1) {
942 1119 zerror(zlogp, B_TRUE, "could not set zone boot arguments");
943 1120 goto bad;
944 1121 }
945 1122
946 1123 if (!restart_init && zone_setattr(zoneid, ZONE_ATTR_INITNORESTART,
947 1124 NULL, 0) == -1) {
948 1125 zerror(zlogp, B_TRUE, "could not set zone init-no-restart");
949 1126 goto bad;
950 1127 }
951 1128
1129 + if (app_svc_dep && zone_setattr(zoneid, ZONE_ATTR_APP_SVC_CT,
1130 + (void *)B_TRUE, sizeof (boolean_t)) == -1) {
1131 + zerror(zlogp, B_TRUE, "could not set zone app-die");
1132 + goto bad;
1133 + }
1134 +
952 1135 /*
953 1136 * Inform zonestatd of a new zone so that it can install a door for
954 1137 * the zone to contact it.
955 1138 */
956 1139 notify_zonestatd(zone_id);
957 1140
958 1141 if (zone_boot(zoneid) == -1) {
959 1142 zerror(zlogp, B_TRUE, "unable to boot zone");
960 1143 goto bad;
961 1144 }
962 1145
963 1146 if (brand_poststatechg(zlogp, zstate, Z_BOOT) != 0)
964 1147 goto bad;
965 1148
1149 + /* Start up a thread to perform zfd logging/tty svc for the zone. */
1150 + create_log_thread(zlogp, zone_id);
1151 +
1152 + /* Start up a thread to perform memory capping for the zone. */
1153 + create_mcap_thread(zlogp, zone_id);
1154 +
966 1155 return (0);
967 1156
968 1157 bad:
969 1158 /*
970 1159 * If something goes wrong, we up the zone's state to the target
971 1160 * state, RUNNING, and then invoke the hook as if we're halting.
972 1161 */
973 1162 (void) brand_poststatechg(zlogp, ZONE_STATE_RUNNING, Z_HALT);
974 - if (links_loaded)
975 - (void) dladm_zone_halt(dld_handle, zoneid);
1163 +
976 1164 return (-1);
977 1165 }
978 1166
979 1167 static int
980 1168 zone_halt(zlog_t *zlogp, boolean_t unmount_cmd, boolean_t rebooting, int zstate)
981 1169 {
982 1170 int err;
983 1171
984 - if (brand_prestatechg(zlogp, zstate, Z_HALT) != 0)
1172 + if (unmount_cmd == B_FALSE &&
1173 + brand_prestatechg(zlogp, zstate, Z_HALT) != 0)
985 1174 return (-1);
986 1175
1176 + /* Shutting down, stop the memcap thread */
1177 + destroy_mcap_thread();
1178 +
987 1179 if (vplat_teardown(zlogp, unmount_cmd, rebooting) != 0) {
988 1180 if (!bringup_failure_recovery)
989 1181 zerror(zlogp, B_FALSE, "unable to destroy zone");
1182 + destroy_log_thread();
990 1183 return (-1);
991 1184 }
992 1185
1186 + /* Shut down is done, stop the log thread */
1187 + destroy_log_thread();
1188 +
1189 + if (unmount_cmd == B_FALSE &&
1190 + brand_poststatechg(zlogp, zstate, Z_HALT) != 0)
1191 + return (-1);
1192 +
993 1193 if ((err = zonecfg_destroy_snapshot(zone_name)) != Z_OK)
994 1194 zerror(zlogp, B_FALSE, "destroying snapshot: %s",
995 1195 zonecfg_strerror(err));
996 1196
997 - if (brand_poststatechg(zlogp, zstate, Z_HALT) != 0)
998 - return (-1);
999 -
1000 1197 return (0);
1001 1198 }
1002 1199
1003 1200 static int
1004 1201 zone_graceful_shutdown(zlog_t *zlogp)
1005 1202 {
1006 1203 zoneid_t zoneid;
1007 1204 pid_t child;
1008 1205 char cmdbuf[MAXPATHLEN];
1009 1206 brand_handle_t bh = NULL;
1010 - char zpath[MAXPATHLEN];
1011 1207 ctid_t ct;
1012 1208 int tmpl_fd;
1013 1209 int child_status;
1014 1210
1015 1211 if (shutdown_in_progress) {
1016 1212 zerror(zlogp, B_FALSE, "shutdown already in progress");
1017 1213 return (-1);
1018 1214 }
1019 1215
1020 1216 if ((zoneid = getzoneidbyname(zone_name)) == -1) {
1021 1217 zerror(zlogp, B_TRUE, "unable to get zoneid");
1022 1218 return (-1);
1023 1219 }
1024 1220
1025 1221 /* Get a handle to the brand info for this zone */
1026 1222 if ((bh = brand_open(brand_name)) == NULL) {
1027 1223 zerror(zlogp, B_FALSE, "unable to determine zone brand");
1028 1224 return (-1);
1029 1225 }
1030 1226
1031 - if (zone_get_zonepath(zone_name, zpath, sizeof (zpath)) != Z_OK) {
1032 - zerror(zlogp, B_FALSE, "unable to determine zone path");
1033 - brand_close(bh);
1034 - return (-1);
1035 - }
1036 -
1037 1227 /*
1038 1228 * If there is a brand 'shutdown' callback, execute it now to give the
1039 1229 * brand a chance to cleanup any custom configuration.
1040 1230 */
1041 1231 (void) strcpy(cmdbuf, EXEC_PREFIX);
1042 - if (brand_get_shutdown(bh, zone_name, zpath, cmdbuf + EXEC_LEN,
1232 + if (brand_get_shutdown(bh, zone_name, zonepath, cmdbuf + EXEC_LEN,
1043 1233 sizeof (cmdbuf) - EXEC_LEN) != 0 || strlen(cmdbuf) <= EXEC_LEN) {
1044 1234 (void) strcat(cmdbuf, SHUTDOWN_DEFAULT);
1045 1235 }
1046 1236 brand_close(bh);
1047 1237
1048 1238 if ((tmpl_fd = init_template()) == -1) {
1049 1239 zerror(zlogp, B_TRUE, "failed to create contract");
1050 1240 return (-1);
1051 1241 }
1052 1242
1053 1243 if ((child = fork()) == -1) {
1054 1244 (void) ct_tmpl_clear(tmpl_fd);
1055 1245 (void) close(tmpl_fd);
1056 1246 zerror(zlogp, B_TRUE, "failed to fork");
1057 1247 return (-1);
1058 1248 } else if (child == 0) {
1059 1249 (void) ct_tmpl_clear(tmpl_fd);
1060 1250 if (zone_enter(zoneid) == -1) {
1061 1251 _exit(errno);
1062 1252 }
1063 1253 _exit(execl("/bin/sh", "sh", "-c", cmdbuf, (char *)NULL));
1064 1254 }
1065 1255
1066 1256 if (contract_latest(&ct) == -1)
1067 1257 ct = -1;
1068 1258 (void) ct_tmpl_clear(tmpl_fd);
1069 1259 (void) close(tmpl_fd);
1070 1260
1071 1261 if (waitpid(child, &child_status, 0) != child) {
1072 1262 /* unexpected: we must have been signalled */
1073 1263 (void) contract_abandon_id(ct);
1074 1264 return (-1);
1075 1265 }
1076 1266
1077 1267 (void) contract_abandon_id(ct);
1078 1268 if (WEXITSTATUS(child_status) != 0) {
1079 1269 errno = WEXITSTATUS(child_status);
1080 1270 zerror(zlogp, B_FALSE, "unable to shutdown zone");
1081 1271 return (-1);
1082 1272 }
1083 1273
1084 1274 shutdown_in_progress = B_TRUE;
1085 1275
1086 1276 return (0);
1087 1277 }
1088 1278
1089 1279 static int
1090 1280 zone_wait_shutdown(zlog_t *zlogp)
1091 1281 {
1092 1282 zone_state_t zstate;
1093 1283 uint64_t *tm = NULL;
1094 1284 scf_simple_prop_t *prop = NULL;
1095 1285 int timeout;
1096 1286 int tries;
1097 1287 int rc = -1;
1098 1288
1099 1289 /* Get default stop timeout from SMF framework */
1100 1290 timeout = SHUTDOWN_WAIT;
1101 1291 if ((prop = scf_simple_prop_get(NULL, SHUTDOWN_FMRI, "stop",
1102 1292 SCF_PROPERTY_TIMEOUT)) != NULL) {
1103 1293 if ((tm = scf_simple_prop_next_count(prop)) != NULL) {
1104 1294 if (tm != 0)
1105 1295 timeout = *tm;
1106 1296 }
1107 1297 scf_simple_prop_free(prop);
1108 1298 }
1109 1299
1110 1300 /* allow time for zone to shutdown cleanly */
1111 1301 for (tries = 0; tries < timeout; tries ++) {
1112 1302 (void) sleep(1);
1113 1303 if (zone_get_state(zone_name, &zstate) == Z_OK &&
1114 1304 zstate == ZONE_STATE_INSTALLED) {
1115 1305 rc = 0;
1116 1306 break;
1117 1307 }
1118 1308 }
1119 1309
1120 1310 if (rc != 0)
1121 1311 zerror(zlogp, B_FALSE, "unable to shutdown zone");
1122 1312
1123 1313 shutdown_in_progress = B_FALSE;
1124 1314
1125 1315 return (rc);
1126 1316 }
1127 1317
1128 1318
1129 1319
1130 1320 /*
1131 1321 * Generate AUE_zone_state for a command that boots a zone.
1132 1322 */
1133 1323 static void
1134 1324 audit_put_record(zlog_t *zlogp, ucred_t *uc, int return_val,
1135 1325 char *new_state)
1136 1326 {
1137 1327 adt_session_data_t *ah;
1138 1328 adt_event_data_t *event;
1139 1329 int pass_fail, fail_reason;
1140 1330
1141 1331 if (!adt_audit_enabled())
1142 1332 return;
1143 1333
1144 1334 if (return_val == 0) {
1145 1335 pass_fail = ADT_SUCCESS;
1146 1336 fail_reason = ADT_SUCCESS;
1147 1337 } else {
1148 1338 pass_fail = ADT_FAILURE;
1149 1339 fail_reason = ADT_FAIL_VALUE_PROGRAM;
1150 1340 }
1151 1341
1152 1342 if (adt_start_session(&ah, NULL, 0)) {
1153 1343 zerror(zlogp, B_TRUE, gettext("audit failure."));
1154 1344 return;
1155 1345 }
1156 1346 if (adt_set_from_ucred(ah, uc, ADT_NEW)) {
1157 1347 zerror(zlogp, B_TRUE, gettext("audit failure."));
1158 1348 (void) adt_end_session(ah);
1159 1349 return;
1160 1350 }
1161 1351
1162 1352 event = adt_alloc_event(ah, ADT_zone_state);
1163 1353 if (event == NULL) {
1164 1354 zerror(zlogp, B_TRUE, gettext("audit failure."));
1165 1355 (void) adt_end_session(ah);
1166 1356 return;
1167 1357 }
1168 1358 event->adt_zone_state.zonename = zone_name;
1169 1359 event->adt_zone_state.new_state = new_state;
1170 1360
1171 1361 if (adt_put_event(event, pass_fail, fail_reason))
1172 1362 zerror(zlogp, B_TRUE, gettext("audit failure."));
1173 1363
1174 1364 adt_free_event(event);
1175 1365
1176 1366 (void) adt_end_session(ah);
1177 1367 }
1178 1368
1179 1369 /*
1370 + * Log the exit time and status of the zone's init process into
1371 + * {zonepath}/lastexited. If the zone shut down normally, the exit status will
1372 + * be -1, otherwise it will be the exit status as described in wait(3C).
1373 + * If the zone is configured to restart init, then nothing will be logged if
1374 + * init exits unexpectedly (the kernel will never upcall in this case).
1375 + */
1376 +static void
1377 +log_init_exit(int status)
1378 +{
1379 + char p[MAXPATHLEN];
1380 + char buf[128];
1381 + struct timeval t;
1382 + int fd;
1383 +
1384 + if (snprintf(p, sizeof (p), "%s/lastexited", zonepath) > sizeof (p))
1385 + return;
1386 + if (gettimeofday(&t, NULL) != 0)
1387 + return;
1388 + if (snprintf(buf, sizeof (buf), "%ld.%ld %d\n", t.tv_sec, t.tv_usec,
1389 + status) > sizeof (buf))
1390 + return;
1391 + if ((fd = open(p, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0)
1392 + return;
1393 +
1394 + (void) write(fd, buf, strlen(buf));
1395 +
1396 + (void) close(fd);
1397 +}
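
The record written to lastexited is a single line of the form "seconds.microseconds status". Both time fields are printed as plain integers, so the microseconds are not zero-padded. snprintf() returns the length it would have written, excluding the terminating NUL, so a return value equal to the buffer size already indicates truncation. The sketch below formats the same record with a `>=` check. format_exit_record() is a hypothetical helper for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one lastexited record into buf.
 * Returns the number of bytes formatted (excluding the NUL), or -1 if the
 * record did not fit.
 */
int
format_exit_record(char *buf, size_t len, long sec, long usec, int status)
{
	int n;

	assert(buf != NULL && len > 0);

	n = snprintf(buf, len, "%ld.%ld %d\n", sec, usec, status);
	if (n < 0 || (size_t)n >= len)
		return (-1);	/* n == len means the last byte was dropped */
	return (n);
}
```

A status of -1 corresponds to the administrative halt path, where no exit status is available from the kernel.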
1398 +
1399 +/*
1180 1400 * The main routine for the door server that deals with zone state transitions.
1181 1401 */
1182 1402 /* ARGSUSED */
1183 1403 static void
1184 1404 server(void *cookie, char *args, size_t alen, door_desc_t *dp,
1185 1405 uint_t n_desc)
1186 1406 {
1187 1407 ucred_t *uc = NULL;
1188 1408 const priv_set_t *eset;
1189 1409
1190 1410 zone_state_t zstate;
1191 1411 zone_cmd_t cmd;
1412 + int init_status;
1192 1413 zone_cmd_arg_t *zargp;
1193 1414
1194 1415 boolean_t kernelcall;
1195 1416
1196 1417 int rval = -1;
1197 1418 uint64_t uniqid;
1198 1419 zoneid_t zoneid = -1;
1199 1420 zlog_t zlog;
1200 1421 zlog_t *zlogp;
1201 1422 zone_cmd_rval_t *rvalp;
1202 1423 size_t rlen = getpagesize(); /* conservative */
1203 1424 fs_callback_t cb;
1204 1425 brand_handle_t bh;
1205 1426 boolean_t wait_shut = B_FALSE;
1206 1427
1207 1428 /* LINTED E_BAD_PTR_CAST_ALIGN */
1208 1429 zargp = (zone_cmd_arg_t *)args;
1209 1430
1210 1431 /*
1211 1432 * When we get the door unref message, we've fdetach'd the door, and
1212 1433 * it is time for us to shut down zoneadmd.
1213 1434 */
1214 1435 if (zargp == DOOR_UNREF_DATA) {
1215 1436 /*
1216 1437 * See comment at end of main() for info on the last rites.
1217 1438 */
1218 1439 exit(0);
1219 1440 }
1220 1441
1221 1442 if (zargp == NULL) {
1222 1443 (void) door_return(NULL, 0, 0, 0);
1223 1444 }
1224 1445
1225 1446 rvalp = alloca(rlen);
1226 1447 bzero(rvalp, rlen);
1227 1448 zlog.logfile = NULL;
1228 1449 zlog.buflen = zlog.loglen = rlen - sizeof (zone_cmd_rval_t) + 1;
1229 1450 zlog.buf = rvalp->errbuf;
1230 1451 zlog.log = zlog.buf;
1231 1452 /* defer initialization of zlog.locale until after credential check */
1232 1453 zlogp = &zlog;
1233 1454
1234 1455 if (alen != sizeof (zone_cmd_arg_t)) {
1235 1456 /*
1236 1457 * This really shouldn't be happening.
1237 1458 */
1238 1459 zerror(&logsys, B_FALSE, "argument size (%d bytes) "
1239 1460 "unexpected (expected %d bytes)", alen,
1240 1461 sizeof (zone_cmd_arg_t));
1241 1462 goto out;
1242 1463 }
1243 1464 cmd = zargp->cmd;
1465 + init_status = zargp->status;
1244 1466
1245 1467 if (door_ucred(&uc) != 0) {
1246 1468 zerror(&logsys, B_TRUE, "door_ucred");
1247 1469 goto out;
1248 1470 }
1249 1471 eset = ucred_getprivset(uc, PRIV_EFFECTIVE);
1250 1472 if (ucred_getzoneid(uc) != GLOBAL_ZONEID ||
1251 1473 (eset != NULL ? !priv_ismember(eset, PRIV_SYS_CONFIG) :
1252 1474 ucred_geteuid(uc) != 0)) {
1253 1475 zerror(&logsys, B_FALSE, "insufficient privileges");
1254 1476 goto out;
1255 1477 }
1256 1478
1257 1479 kernelcall = ucred_getpid(uc) == 0;
1258 1480
1259 1481 /*
1260 1482 * This is safe because we only use a zlog_t throughout the
1261 1483 * duration of a door call; i.e., by the time the pointer
1262 1484 * might become invalid, the door call would be over.
1263 1485 */
1264 1486 zlog.locale = kernelcall ? DEFAULT_LOCALE : zargp->locale;
1265 1487
1266 1488 (void) mutex_lock(&lock);
1267 1489
1268 1490 /*
1269 1491 * Once we start to really die off, we don't want more connections.
1270 1492 */
1271 1493 if (in_death_throes) {
1272 1494 (void) mutex_unlock(&lock);
1273 1495 ucred_free(uc);
1274 1496 (void) door_return(NULL, 0, 0, 0);
1275 1497 thr_exit(NULL);
1276 1498 }
1277 1499
1278 1500 /*
1279 1501 * Check for validity of command.
1280 1502 */
1281 1503 if (cmd != Z_READY && cmd != Z_BOOT && cmd != Z_FORCEBOOT &&
1282 1504 cmd != Z_REBOOT && cmd != Z_SHUTDOWN && cmd != Z_HALT &&
1283 1505 cmd != Z_NOTE_UNINSTALLING && cmd != Z_MOUNT &&
1284 1506 cmd != Z_FORCEMOUNT && cmd != Z_UNMOUNT) {
1285 1507 zerror(&logsys, B_FALSE, "invalid command %d", (int)cmd);
1286 1508 goto out;
1287 1509 }
1288 1510
1289 1511 if (kernelcall && (cmd != Z_HALT && cmd != Z_REBOOT)) {
1290 1512 /*
1291 1513 * Can't happen
1292 1514 */
1293 1515 zerror(&logsys, B_FALSE, "received unexpected kernel upcall %d",
1294 1516 cmd);
1295 1517 goto out;
1296 1518 }
1297 1519 /*
1298 1520 * We ignore the possibility of someone calling zone_create(2)
1299 1521 * explicitly; all requests must come through zoneadmd.
1300 1522 */
1301 1523 if (zone_get_state(zone_name, &zstate) != Z_OK) {
1302 1524 /*
1303 1525 * Something terribly wrong happened
1304 1526 */
1305 1527 zerror(&logsys, B_FALSE, "unable to determine state of zone");
1306 1528 goto out;
1307 1529 }
1308 1530
1309 1531 if (kernelcall) {
1310 1532 /*
1311 1533 * Kernel-initiated requests may lose their validity if the
1312 1534 * zone_t the kernel was referring to has gone away.
1313 1535 */
1314 1536 if ((zoneid = getzoneidbyname(zone_name)) == -1 ||
1315 1537 zone_getattr(zoneid, ZONE_ATTR_UNIQID, &uniqid,
1316 1538 sizeof (uniqid)) == -1 || uniqid != zargp->uniqid) {
1317 1539 /*
1318 1540 * We're not talking about the same zone. The request
1319 1541 * must have arrived too late. Return error.
1320 1542 */
1321 1543 rval = -1;
1322 1544 goto out;
1323 1545 }
1324 1546 zlogp = &logsys; /* Log errors to syslog */
1325 1547 }
1326 1548
1327 1549 /*
1328 1550 * If we are being asked to forcibly mount or boot a zone, we
1329 1551 * pretend that an INCOMPLETE zone is actually INSTALLED.
1330 1552 */
1331 1553 if (zstate == ZONE_STATE_INCOMPLETE &&
1332 1554 (cmd == Z_FORCEBOOT || cmd == Z_FORCEMOUNT))
1333 1555 zstate = ZONE_STATE_INSTALLED;
1334 1556
1335 1557 switch (zstate) {
1336 1558 case ZONE_STATE_CONFIGURED:
1337 1559 case ZONE_STATE_INCOMPLETE:
1338 1560 /*
1339 1561 * Not our area of expertise; we just print a nice message
1340 1562 * and die off.
1341 1563 */
1342 1564 zerror(zlogp, B_FALSE,
1343 1565 "%s operation is invalid for zones in state '%s'",
1344 1566 z_cmd_name(cmd), zone_state_str(zstate));
1345 1567 break;
1346 1568
1347 1569 case ZONE_STATE_INSTALLED:
1348 1570 switch (cmd) {
1349 1571 case Z_READY:
1350 1572 rval = zone_ready(zlogp, Z_MNT_BOOT, zstate);
1351 1573 if (rval == 0)
1352 1574 eventstream_write(Z_EVT_ZONE_READIED);
1575 + zcons_statechanged();
1353 1576 break;
1354 1577 case Z_BOOT:
1355 1578 case Z_FORCEBOOT:
1356 1579 eventstream_write(Z_EVT_ZONE_BOOTING);
1357 1580 if ((rval = zone_ready(zlogp, Z_MNT_BOOT, zstate))
1358 1581 == 0) {
1359 1582 rval = zone_bootup(zlogp, zargp->bootbuf,
1360 1583 zstate);
1361 1584 }
1362 1585 audit_put_record(zlogp, uc, rval, "boot");
1586 + zcons_statechanged();
1363 1587 if (rval != 0) {
1364 1588 bringup_failure_recovery = B_TRUE;
1365 1589 (void) zone_halt(zlogp, B_FALSE, B_FALSE,
1366 1590 zstate);
1367 1591 eventstream_write(Z_EVT_ZONE_BOOTFAILED);
1368 1592 }
1369 1593 break;
1370 1594 case Z_SHUTDOWN:
1371 1595 case Z_HALT:
1372 1596 if (kernelcall) /* Invalid; can't happen */
1373 1597 abort();
1374 1598 /*
1375 1599 * We could have two clients racing to halt this
1376 1600 * zone; the second client loses, but his request
1377 1601 * doesn't fail, since the zone is now in the desired
1378 1602 * state.
1379 1603 */
1380 1604 zerror(zlogp, B_FALSE, "zone is already halted");
1381 1605 rval = 0;
1382 1606 break;
1383 1607 case Z_REBOOT:
1384 1608 if (kernelcall) /* Invalid; can't happen */
1385 1609 abort();
1386 1610 zerror(zlogp, B_FALSE, "%s operation is invalid "
1387 1611 "for zones in state '%s'", z_cmd_name(cmd),
1388 1612 zone_state_str(zstate));
1389 1613 rval = -1;
1390 1614 break;
1391 1615 case Z_NOTE_UNINSTALLING:
1392 1616 if (kernelcall) /* Invalid; can't happen */
1393 1617 abort();
1394 1618 /*
1395 1619 * Tell the console to print out a message about this.
1396 1620 * Once it does, we will be in_death_throes.
1397 1621 */
1398 1622 eventstream_write(Z_EVT_ZONE_UNINSTALLING);
1399 1623 break;
1400 1624 case Z_MOUNT:
1401 1625 case Z_FORCEMOUNT:
1402 1626 if (kernelcall) /* Invalid; can't happen */
1403 1627 abort();
1404 1628 if (!zone_isnative && !zone_iscluster &&
1405 1629 !zone_islabeled) {
1406 1630 /*
1407 1631 * -U mounts the zone without lofs mounting
1408 1632 * zone file systems back into the scratch
1409 1633 * zone. This is required when mounting
1410 1634 * non-native branded zones.
1411 1635 */
1412 1636 (void) strlcpy(zargp->bootbuf, "-U",
1413 1637 BOOTARGS_MAX);
1414 1638 }
1415 1639
1416 1640 rval = zone_ready(zlogp,
1417 1641 strcmp(zargp->bootbuf, "-U") == 0 ?
1418 1642 Z_MNT_UPDATE : Z_MNT_SCRATCH, zstate);
1419 1643 if (rval != 0)
1420 1644 break;
1421 1645
1422 1646 eventstream_write(Z_EVT_ZONE_READIED);
1423 1647
1424 1648 /*
1425 1649 * Get a handle to the default brand info.
1426 1650 * We must always use the default brand file system
1427 1651 * list when mounting the zone.
1428 1652 */
1429 1653 if ((bh = brand_open(default_brand)) == NULL) {
1430 1654 rval = -1;
1431 1655 break;
1432 1656 }
1433 1657
1434 1658 /*
1435 1659 * Get the list of filesystems to mount from
1436 1660 * the brand configuration. These mounts are done
1437 1661 * via a thread that will enter the zone, so they
1438 1662 * are done from within the context of the zone.
1439 1663 */
1440 1664 cb.zlogp = zlogp;
1441 1665 cb.zoneid = zone_id;
1442 1666 cb.mount_cmd = B_TRUE;
1443 1667 rval = brand_platform_iter_mounts(bh,
1444 1668 mount_early_fs, &cb);
1445 1669
1446 1670 brand_close(bh);
1447 1671
1448 1672 /*
1449 1673 * Ordinarily, /dev/fd would be mounted inside the zone
1450 1674 * by svc:/system/filesystem/usr:default, but since
1451 1675 * we're not booting the zone, we need to do this
1452 1676 * manually.
1453 1677 */
1454 1678 if (rval == 0)
1455 1679 rval = mount_early_fs(&cb,
1456 1680 "fd", "/dev/fd", "fd", NULL);
1457 1681 break;
1458 1682 case Z_UNMOUNT:
1459 1683 if (kernelcall) /* Invalid; can't happen */
1460 1684 abort();
1461 1685 zerror(zlogp, B_FALSE, "zone is already unmounted");
1462 1686 rval = 0;
1463 1687 break;
1464 1688 }
1465 1689 break;
1466 1690
1467 1691 case ZONE_STATE_READY:
1468 1692 switch (cmd) {
1469 1693 case Z_READY:
1470 1694 /*
1471 1695 * We could have two clients racing to ready this
1472 1696 * zone; the second client loses, but his request
1473 1697 * doesn't fail, since the zone is now in the desired
1474 1698 * state.
1475 1699 */
1476 1700 zerror(zlogp, B_FALSE, "zone is already ready");
1477 1701 rval = 0;
1478 1702 break;
1479 1703 case Z_BOOT:
1480 1704 (void) strlcpy(boot_args, zargp->bootbuf,
1481 1705 sizeof (boot_args));
1482 1706 eventstream_write(Z_EVT_ZONE_BOOTING);
1483 1707 rval = zone_bootup(zlogp, zargp->bootbuf, zstate);
1484 1708 audit_put_record(zlogp, uc, rval, "boot");
1709 + zcons_statechanged();
1485 1710 if (rval != 0) {
1486 1711 bringup_failure_recovery = B_TRUE;
1487 1712 (void) zone_halt(zlogp, B_FALSE, B_TRUE,
1488 1713 zstate);
1489 1714 eventstream_write(Z_EVT_ZONE_BOOTFAILED);
1490 1715 }
1491 1716 boot_args[0] = '\0';
1492 1717 break;
1493 1718 case Z_HALT:
1494 1719 if (kernelcall) /* Invalid; can't happen */
1495 1720 abort();
1496 1721 if ((rval = zone_halt(zlogp, B_FALSE, B_FALSE, zstate))
1497 1722 != 0)
1498 1723 break;
1724 + zcons_statechanged();
1499 1725 eventstream_write(Z_EVT_ZONE_HALTED);
1500 1726 break;
1501 1727 case Z_SHUTDOWN:
1502 1728 case Z_REBOOT:
1503 1729 case Z_NOTE_UNINSTALLING:
1504 1730 case Z_MOUNT:
1505 1731 case Z_UNMOUNT:
1506 1732 if (kernelcall) /* Invalid; can't happen */
1507 1733 abort();
1508 1734 zerror(zlogp, B_FALSE, "%s operation is invalid "
1509 1735 "for zones in state '%s'", z_cmd_name(cmd),
1510 1736 zone_state_str(zstate));
1511 1737 rval = -1;
1512 1738 break;
1513 1739 }
1514 1740 break;
1515 1741
1516 1742 case ZONE_STATE_MOUNTED:
1517 1743 switch (cmd) {
1518 1744 case Z_UNMOUNT:
1519 1745 if (kernelcall) /* Invalid; can't happen */
1520 1746 abort();
1521 1747 rval = zone_halt(zlogp, B_TRUE, B_FALSE, zstate);
1522 1748 if (rval == 0) {
1523 1749 eventstream_write(Z_EVT_ZONE_HALTED);
1524 1750 (void) sema_post(&scratch_sem);
1525 1751 }
1526 1752 break;
1527 1753 default:
1528 1754 if (kernelcall) /* Invalid; can't happen */
1529 1755 abort();
1530 1756 zerror(zlogp, B_FALSE, "%s operation is invalid "
1531 1757 "for zones in state '%s'", z_cmd_name(cmd),
1532 1758 zone_state_str(zstate));
1533 1759 rval = -1;
1534 1760 break;
1535 1761 }
1536 1762 break;
1537 1763
1538 1764 case ZONE_STATE_RUNNING:
1539 1765 case ZONE_STATE_SHUTTING_DOWN:
1540 1766 case ZONE_STATE_DOWN:
1541 1767 switch (cmd) {
1542 1768 case Z_READY:
1543 1769 if ((rval = zone_halt(zlogp, B_FALSE, B_TRUE, zstate))
1544 1770 != 0)
1545 1771 break;
1772 + zcons_statechanged();
1546 1773 if ((rval = zone_ready(zlogp, Z_MNT_BOOT, zstate)) == 0)
1547 1774 eventstream_write(Z_EVT_ZONE_READIED);
1548 1775 else
1549 1776 eventstream_write(Z_EVT_ZONE_HALTED);
1550 1777 break;
1551 1778 case Z_BOOT:
1552 1779 /*
1553 1780 * We could have two clients racing to boot this
1554 1781 * zone; the second client loses, but his request
1555 1782 * doesn't fail, since the zone is now in the desired
1556 1783 * state.
1557 1784 */
1558 1785 zerror(zlogp, B_FALSE, "zone is already booted");
1559 1786 rval = 0;
1560 1787 break;
1561 1788 case Z_HALT:
1789 + if (kernelcall) {
1790 + log_init_exit(init_status);
1791 + } else {
1792 + log_init_exit(-1);
1793 + }
1562 1794 if ((rval = zone_halt(zlogp, B_FALSE, B_FALSE, zstate))
1563 1795 != 0)
1564 1796 break;
1565 1797 eventstream_write(Z_EVT_ZONE_HALTED);
1798 + zcons_statechanged();
1566 1799 break;
1567 1800 case Z_REBOOT:
1568 1801 (void) strlcpy(boot_args, zargp->bootbuf,
1569 1802 sizeof (boot_args));
1570 1803 eventstream_write(Z_EVT_ZONE_REBOOTING);
1571 1804 if ((rval = zone_halt(zlogp, B_FALSE, B_TRUE, zstate))
1572 1805 != 0) {
1573 1806 eventstream_write(Z_EVT_ZONE_BOOTFAILED);
1574 1807 boot_args[0] = '\0';
1575 1808 break;
1576 1809 }
1577 - if ((rval = zone_ready(zlogp, Z_MNT_BOOT, zstate))
1578 - != 0) {
1810 + zcons_statechanged();
1811 + if ((rval = zone_ready(zlogp, Z_MNT_BOOT, zstate)) !=
1812 + 0) {
1579 1813 eventstream_write(Z_EVT_ZONE_BOOTFAILED);
1580 1814 boot_args[0] = '\0';
1581 1815 break;
1582 1816 }
1583 1817 rval = zone_bootup(zlogp, zargp->bootbuf, zstate);
1584 1818 audit_put_record(zlogp, uc, rval, "reboot");
1585 1819 if (rval != 0) {
1586 1820 (void) zone_halt(zlogp, B_FALSE, B_TRUE,
1587 1821 zstate);
1588 1822 eventstream_write(Z_EVT_ZONE_BOOTFAILED);
1589 1823 }
1590 1824 boot_args[0] = '\0';
1591 1825 break;
1592 1826 case Z_SHUTDOWN:
1593 1827 if ((rval = zone_graceful_shutdown(zlogp)) == 0) {
1594 1828 wait_shut = B_TRUE;
1595 1829 }
1596 1830 break;
1597 1831 case Z_NOTE_UNINSTALLING:
1598 1832 case Z_MOUNT:
1599 1833 case Z_UNMOUNT:
1600 1834 zerror(zlogp, B_FALSE, "%s operation is invalid "
1601 1835 "for zones in state '%s'", z_cmd_name(cmd),
1602 1836 zone_state_str(zstate));
1603 1837 rval = -1;
1604 1838 break;
1605 1839 }
1606 1840 break;
1607 1841 default:
1608 1842 abort();
1609 1843 }
1610 1844
1611 1845 /*
1612 1846 * Because the state of the zone may have changed, we make sure
1613 1847 * to wake the console poller, which is in charge of initiating
1614 1848 * the shutdown procedure as necessary.
1615 1849 */
1616 1850 eventstream_write(Z_EVT_NULL);
1617 1851
1618 1852 out:
1619 1853 (void) mutex_unlock(&lock);
1620 1854
1621 1855 /* Wait for the Z_SHUTDOWN commands to complete */
1622 1856 if (wait_shut)
1623 1857 rval = zone_wait_shutdown(zlogp);
1624 1858
1625 1859 if (kernelcall) {
1626 1860 rvalp = NULL;
1627 1861 rlen = 0;
1628 1862 } else {
1629 1863 rvalp->rval = rval;
1630 1864 }
1631 1865 if (uc != NULL)
1632 1866 ucred_free(uc);
1633 1867 (void) door_return((char *)rvalp, rlen, NULL, 0);
1634 1868 thr_exit(NULL);
1635 1869 }
1636 1870
1637 1871 static int
1638 1872 setup_door(zlog_t *zlogp)
1639 1873 {
1640 1874 if ((zone_door = door_create(server, NULL,
1641 1875 DOOR_UNREF | DOOR_REFUSE_DESC | DOOR_NO_CANCEL)) < 0) {
1642 1876 zerror(zlogp, B_TRUE, "%s failed", "door_create");
1643 1877 return (-1);
1644 1878 }
1645 1879 (void) fdetach(zone_door_path);
1646 1880
1647 1881 if (fattach(zone_door, zone_door_path) != 0) {
1648 1882 zerror(zlogp, B_TRUE, "fattach to %s failed", zone_door_path);
1649 1883 (void) door_revoke(zone_door);
1650 1884 (void) fdetach(zone_door_path);
1651 1885 zone_door = -1;
1652 1886 return (-1);
1653 1887 }
1654 1888 return (0);
1655 1889 }
1656 1890
1657 1891 /*
1658 1892 * zoneadm(1m) will start zoneadmd if it thinks it isn't running; this
1659 1893 * is where zoneadmd itself will check to see that another instance of
1660 1894 * zoneadmd isn't already controlling this zone.
1661 1895 *
1662 1896 * The idea here is that we want to open the path to which we will
1663 1897 * attach our door, lock it, and then make sure that no-one has beat us
1664 1898 * to fattach(3c)ing onto it.
1665 1899 *
1666 1900 * fattach(3c) is really a mount, so there are actually two possible
1667 1901 * vnodes we could be dealing with. Our strategy is as follows:
1668 1902 *
1669 1903 * - If the file we opened is a regular file (common case):
1670 1904 * There is no fattach(3c)ed door, so we have a chance of becoming
1671 1905 * the managing zoneadmd. We attempt to lock the file: if it is
1672 1906 * already locked, that means someone else raced us here, so we
1673 1907 * lose and give up. zoneadm(1m) will try to contact the zoneadmd
1674 1908 * that beat us to it.
1675 1909 *
1676 1910 * - If the file we opened is a namefs file:
1677 1911 * This means there is already an established door fattach(3c)'ed
1678 1912 * to the rendezvous path. We've lost the race, so we give up.
1679 1913 * Note that in this case we also try to grab the file lock, and
1680 1914 * will succeed in acquiring it since the vnode locked by the
1681 1915 * "winning" zoneadmd was a regular one, and the one we locked was
1682 1916 * the fattach(3c)'ed door node. At any rate, no harm is done, and
1683 1917 * we just return to zoneadm(1m) which knows to retry.
1684 1918 */
1685 1919 static int
1686 1920 make_daemon_exclusive(zlog_t *zlogp)
1687 1921 {
1688 1922 int doorfd = -1;
1689 1923 int err, ret = -1;
1690 1924 struct stat st;
1691 1925 struct flock flock;
1692 1926 zone_state_t zstate;
1693 1927
1694 1928 top:
1695 1929 if ((err = zone_get_state(zone_name, &zstate)) != Z_OK) {
1696 1930 zerror(zlogp, B_FALSE, "failed to get zone state: %s",
1697 1931 zonecfg_strerror(err));
1698 1932 goto out;
1699 1933 }
1700 1934 if ((doorfd = open(zone_door_path, O_CREAT|O_RDWR,
1701 1935 S_IREAD|S_IWRITE)) < 0) {
1702 1936 zerror(zlogp, B_TRUE, "failed to open %s", zone_door_path);
1703 1937 goto out;
1704 1938 }
1705 1939 if (fstat(doorfd, &st) < 0) {
1706 1940 zerror(zlogp, B_TRUE, "failed to stat %s", zone_door_path);
1707 1941 goto out;
1708 1942 }
1709 1943 /*
1710 1944 * Lock the file to synchronize with other zoneadmd
1711 1945 */
1712 1946 flock.l_type = F_WRLCK;
1713 1947 flock.l_whence = SEEK_SET;
1714 1948 flock.l_start = (off_t)0;
1715 1949 flock.l_len = (off_t)0;
1716 1950 if (fcntl(doorfd, F_SETLK, &flock) < 0) {
1717 1951 /*
1718 1952 * Someone else raced us here and grabbed the lock file
1719 1953 * first. A warning here is inappropriate since nothing
1720 1954 * went wrong.
1721 1955 */
1722 1956 goto out;
1723 1957 }
1724 1958
1725 1959 if (strcmp(st.st_fstype, "namefs") == 0) {
1726 1960 struct door_info info;
1727 1961
1728 1962 /*
1729 1963 * There is already something fattach()'ed to this file.
1730 1964 * Let's see what the door is up to.
1731 1965 */
1732 1966 if (door_info(doorfd, &info) == 0 && info.di_target != -1) {
1733 1967 /*
1734 1968 * Another zoneadmd process seems to be in
1735 1969 * control of the situation and we don't need to
1736 1970 * be here. A warning here is inappropriate
1737 1971 * since nothing went wrong.
1738 1972 *
1739 1973 * If the door has been revoked, the zoneadmd
1740 1974 * process currently managing the zone is going
1741 1975 * away. We'll return control to zoneadm(1m)
1742 1976 * which will try again (by which time zoneadmd
1743 1977 * will hopefully have exited).
1744 1978 */
1745 1979 goto out;
1746 1980 }
1747 1981
1748 1982 /*
1749 1983 * If we got this far, there's a fattach(3c)'ed door
1750 1984 * that belongs to a process that has exited, which can
1751 1985 * happen if the previous zoneadmd died unexpectedly.
1752 1986 *
1753 1987 * Let user know that something is amiss, but that we can
1754 1988 * recover; if the zone is in the installed state, then don't
1755 1989 * message, since having a running zoneadmd isn't really
1756 1990 * expected/needed. We want to keep occurrences of this message
1757 1991 * limited to times when zoneadmd is picking back up from a
1758 1992 * zoneadmd that died while the zone was in some non-trivial
1759 1993 * state.
1760 1994 */
1761 1995 if (zstate > ZONE_STATE_INSTALLED) {
1996 + static zoneid_t zid;
1997 +
1762 1998 zerror(zlogp, B_FALSE,
1763 1999 "zone '%s': WARNING: zone is in state '%s', but "
1764 2000 "zoneadmd does not appear to be available; "
1765 2001 "restarted zoneadmd to recover.",
1766 2002 zone_name, zone_state_str(zstate));
2003 +
2004 + /*
2005 + * Startup a thread to perform the zfd logging/tty svc
2006 + * and a thread to perform memory capping for the
2007 + * zone. zlogp won't be valid for much longer so use
2008 + * logsys.
2009 + */
2010 + if ((zid = getzoneidbyname(zone_name)) != -1) {
2011 + create_log_thread(&logsys, zid);
2012 + create_mcap_thread(&logsys, zid);
2013 + }
2014 +
2015 + /* recover the global configuration snapshot */
2016 + if (snap_hndl == NULL) {
2017 + if ((snap_hndl = zonecfg_init_handle())
2018 + == NULL ||
2019 + zonecfg_create_snapshot(zone_name)
2020 + != Z_OK ||
2021 + zonecfg_get_snapshot_handle(zone_name,
2022 + snap_hndl) != Z_OK) {
2023 + zerror(zlogp, B_FALSE, "recovering "
2024 + "zone configuration handle");
2025 + goto out;
2026 + }
2027 + }
1767 2028 }
1768 2029
1769 2030 (void) fdetach(zone_door_path);
1770 2031 (void) close(doorfd);
1771 2032 goto top;
1772 2033 }
1773 2034 ret = 0;
1774 2035 out:
1775 2036 (void) close(doorfd);
1776 2037 return (ret);
1777 2038 }
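
Reviewer note: the locking step in make_daemon_exclusive() can be shown in isolation. The following is a minimal sketch, not code from this change. The helper name try_exclusive_lock() and any path passed to it are hypothetical, and the namefs/door checks that zoneadmd performs after taking the lock are deliberately left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of zoneadmd): open `path`, creating it if
 * needed, and try to take a non-blocking write lock on the whole file.
 * Returns the locked descriptor, or -1 if the open or the lock failed.
 * A failed lock is the "someone else raced us here" case.
 */
static int
try_exclusive_lock(const char *path)
{
	struct flock fl;
	int fd;

	assert(path != NULL);

	if ((fd = open(path, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR)) < 0)
		return (-1);

	(void) memset(&fl, 0, sizeof (fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* zero length covers the whole file */

	/* F_SETLK does not block; it fails if another process holds a lock */
	if (fcntl(fd, F_SETLK, &fl) < 0) {
		(void) close(fd);
		return (-1);
	}
	return (fd);
}
```

Because these record locks are owned by the process, a second call from the same process does not conflict with the first; the exclusion only applies between separate processes, which is the case the comment above make_daemon_exclusive() describes.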
1778 2039
1779 2040 /*
1780 2041 * Setup the brand's pre and post state change callbacks, as well as the
1781 2042 * query callback, if any of these exist.
1782 2043 */
1783 2044 static int
1784 2045 brand_callback_init(brand_handle_t bh, char *zone_name)
1785 2046 {
1786 - char zpath[MAXPATHLEN];
1787 -
1788 - if (zone_get_zonepath(zone_name, zpath, sizeof (zpath)) != Z_OK)
1789 - return (-1);
1790 -
1791 2047 (void) strlcpy(pre_statechg_hook, EXEC_PREFIX,
1792 2048 sizeof (pre_statechg_hook));
1793 2049
1794 - if (brand_get_prestatechange(bh, zone_name, zpath,
2050 + if (brand_get_prestatechange(bh, zone_name, zonepath,
1795 2051 pre_statechg_hook + EXEC_LEN,
1796 2052 sizeof (pre_statechg_hook) - EXEC_LEN) != 0)
1797 2053 return (-1);
1798 2054
1799 2055 if (strlen(pre_statechg_hook) <= EXEC_LEN)
1800 2056 pre_statechg_hook[0] = '\0';
1801 2057
1802 2058 (void) strlcpy(post_statechg_hook, EXEC_PREFIX,
1803 2059 sizeof (post_statechg_hook));
1804 2060
1805 - if (brand_get_poststatechange(bh, zone_name, zpath,
2061 + if (brand_get_poststatechange(bh, zone_name, zonepath,
1806 2062 post_statechg_hook + EXEC_LEN,
1807 2063 sizeof (post_statechg_hook) - EXEC_LEN) != 0)
1808 2064 return (-1);
1809 2065
1810 2066 if (strlen(post_statechg_hook) <= EXEC_LEN)
1811 2067 post_statechg_hook[0] = '\0';
1812 2068
1813 2069 (void) strlcpy(query_hook, EXEC_PREFIX,
1814 2070 sizeof (query_hook));
1815 2071
1816 - if (brand_get_query(bh, zone_name, zpath, query_hook + EXEC_LEN,
2072 + if (brand_get_query(bh, zone_name, zonepath, query_hook + EXEC_LEN,
1817 2073 sizeof (query_hook) - EXEC_LEN) != 0)
1818 2074 return (-1);
1819 2075
1820 2076 if (strlen(query_hook) <= EXEC_LEN)
1821 2077 query_hook[0] = '\0';
1822 2078
1823 2079 return (0);
1824 2080 }
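
Reviewer note: each hook in brand_callback_init() follows the same pattern: write a fixed prefix, append the brand-supplied command after it, and clear the buffer if nothing was appended. A minimal sketch of that pattern follows. The names SKETCH_PREFIX and build_hook() are hypothetical, the prefix value "exec " is an assumption about EXEC_PREFIX, and snprintf() stands in for the strlcpy() plus brand_get_*() calls used in the real code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-ins for EXEC_PREFIX / EXEC_LEN; values are illustrative. */
#define	SKETCH_PREFIX		"exec "
#define	SKETCH_PREFIX_LEN	(sizeof (SKETCH_PREFIX) - 1)

/*
 * Hypothetical helper: build "<prefix><cmd>" in buf. If cmd is empty, the
 * result is just the prefix, so the buffer is reset to the empty string to
 * mean "no hook". Returns 0 on success, -1 if the result would not fit.
 */
static int
build_hook(char *buf, size_t buflen, const char *cmd)
{
	int n;

	assert(buf != NULL && cmd != NULL && buflen > 0);

	n = snprintf(buf, buflen, "%s%s", SKETCH_PREFIX, cmd);
	if (n < 0 || (size_t)n >= buflen) {
		buf[0] = '\0';
		return (-1);
	}

	/* Only the prefix was written: treat the hook as absent. */
	if (strlen(buf) <= SKETCH_PREFIX_LEN)
		buf[0] = '\0';

	return (0);
}
```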
1825 2081
1826 2082 int
1827 2083 main(int argc, char *argv[])
1828 2084 {
1829 2085 int opt;
1830 2086 zoneid_t zid;
1831 2087 priv_set_t *privset;
1832 2088 zone_state_t zstate;
1833 2089 char parents_locale[MAXPATHLEN];
1834 2090 brand_handle_t bh;
1835 2091 int err;
1836 2092
1837 2093 pid_t pid;
1838 2094 sigset_t blockset;
1839 2095 sigset_t block_cld;
1840 2096
1841 2097 struct {
1842 2098 sema_t sem;
1843 2099 int status;
1844 2100 zlog_t log;
1845 2101 } *shstate;
1846 2102 size_t shstatelen = getpagesize();
1847 2103
1848 2104 zlog_t errlog;
1849 2105 zlog_t *zlogp;
1850 2106
1851 2107 int ctfd;
1852 2108
1853 2109 progname = get_execbasename(argv[0]);
1854 2110
1855 2111 /*
1856 2112 * Make sure stderr is unbuffered
1857 2113 */
1858 2114 (void) setbuffer(stderr, NULL, 0);
1859 2115
1860 2116 /*
1861 2117 * Get out of the way of mounted filesystems, since we will daemonize
1862 2118 * soon.
1863 2119 */
1864 2120 (void) chdir("/");
1865 2121
1866 2122 /*
1867 2123 * Use the default system umask per PSARC 1998/110 rather than
1868 2124 * anything that may have been set by the caller.
1869 2125 */
1870 2126 (void) umask(CMASK);
1871 2127
1872 2128 /*
1873 2129 * Initially we want to use our parent's locale.
1874 2130 */
1875 2131 (void) setlocale(LC_ALL, "");
1876 2132 (void) textdomain(TEXT_DOMAIN);
1877 2133 (void) strlcpy(parents_locale, setlocale(LC_MESSAGES, NULL),
1878 2134 sizeof (parents_locale));
1879 2135
1880 2136 /*
1881 2137 * This zlog_t is used for writing to stderr
1882 2138 */
1883 2139 errlog.logfile = stderr;
1884 2140 errlog.buflen = errlog.loglen = 0;
1885 2141 errlog.buf = errlog.log = NULL;
1886 2142 errlog.locale = parents_locale;
1887 2143
1888 2144 /*
1889 2145 * We start off writing to stderr until we're ready to daemonize.
1890 2146 */
1891 2147 zlogp = &errlog;
1892 2148
1893 2149 /*
1894 2150 * Process options.
1895 2151 */
1896 2152 while ((opt = getopt(argc, argv, "R:z:")) != EOF) {
1897 2153 switch (opt) {
1898 2154 case 'R':
1899 2155 zonecfg_set_root(optarg);
1900 2156 break;
1901 2157 case 'z':
1902 2158 zone_name = optarg;
1903 2159 break;
1904 2160 default:
1905 2161 usage();
1906 2162 }
1907 2163 }
1908 2164
1909 2165 if (zone_name == NULL)
1910 2166 usage();
1911 2167
1912 2168 /*
1913 2169 * Because usage() prints directly to stderr, it has gettext()
1914 2170 * wrapping, which depends on the locale. But since zerror() calls
1915 2171 * localize() which tweaks the locale, it is not safe to call zerror()
1916 2172 * until after the last call to usage(). Fortunately, the last call
1917 2173 * to usage() is just above and the first call to zerror() is just
1918 2174 * below. Don't mess this up.
1919 2175 */
1920 2176 if (strcmp(zone_name, GLOBAL_ZONENAME) == 0) {
1921 2177 zerror(zlogp, B_FALSE, "cannot manage the %s zone",
1922 2178 GLOBAL_ZONENAME);
1923 2179 return (1);
1924 2180 }
1925 2181
1926 2182 if (zone_get_id(zone_name, &zid) != 0) {
1927 2183 zerror(zlogp, B_FALSE, "could not manage %s: %s", zone_name,
1928 2184 zonecfg_strerror(Z_NO_ZONE));
1929 2185 return (1);
1930 2186 }
1931 2187
1932 2188 if ((err = zone_get_state(zone_name, &zstate)) != Z_OK) {
1933 2189 zerror(zlogp, B_FALSE, "failed to get zone state: %s",
1934 2190 zonecfg_strerror(err));
1935 2191 return (1);
1936 2192 }
1937 2193 if (zstate < ZONE_STATE_INCOMPLETE) {
1938 2194 zerror(zlogp, B_FALSE,
1939 2195 "cannot manage a zone which is in state '%s'",
1940 2196 zone_state_str(zstate));
1941 2197 return (1);
1942 2198 }
1943 2199
2200 + if (zone_get_zonepath(zone_name, zonepath, sizeof (zonepath)) != Z_OK) {
2201 + zerror(zlogp, B_FALSE, "unable to determine zone path");
2202 + return (-1);
2203 + }
2204 +
1944 2205 if (zonecfg_default_brand(default_brand,
1945 2206 sizeof (default_brand)) != Z_OK) {
1946 2207 zerror(zlogp, B_FALSE, "unable to determine default brand");
1947 2208 return (1);
1948 2209 }
1949 2210
1950 2211 /* Get a handle to the brand info for this zone */
1951 2212 if (zone_get_brand(zone_name, brand_name, sizeof (brand_name))
1952 2213 != Z_OK) {
1953 2214 zerror(zlogp, B_FALSE, "unable to determine zone brand");
1954 2215 return (1);
1955 2216 }
1956 2217 zone_isnative = (strcmp(brand_name, NATIVE_BRAND_NAME) == 0);
1957 2218 zone_islabeled = (strcmp(brand_name, LABELED_BRAND_NAME) == 0);
1958 2219
1959 2220 /*
1960 2221 * In the alternate root environment, the only supported
1961 2222 * operations are mount and unmount. In this case, just treat
1962 2223 * the zone as native if it is cluster. Cluster zones can be
1963 2224 * native for the purpose of LU or upgrade, and the cluster
1964 2225 * brand may not exist in the miniroot (such as in net install
1965 2226 * upgrade).
1966 2227 */
1967 2228 if (strcmp(brand_name, CLUSTER_BRAND_NAME) == 0) {
1968 2229 zone_iscluster = B_TRUE;
1969 2230 if (zonecfg_in_alt_root()) {
1970 2231 (void) strlcpy(brand_name, default_brand,
1971 2232 sizeof (brand_name));
1972 2233 }
1973 2234 } else {
1974 2235 zone_iscluster = B_FALSE;
1975 2236 }
1976 2237
1977 2238 if ((bh = brand_open(brand_name)) == NULL) {
1978 2239 zerror(zlogp, B_FALSE, "unable to open zone brand");
1979 2240 return (1);
1980 2241 }
1981 2242
1982 2243 /* Get state change brand hooks. */
1983 2244 if (brand_callback_init(bh, zone_name) == -1) {
1984 2245 zerror(zlogp, B_TRUE,
1985 2246 "failed to initialize brand state change hooks");
1986 2247 brand_close(bh);
1987 2248 return (1);
1988 2249 }
1989 2250
1990 2251 brand_close(bh);
1991 2252
1992 2253 /*
1993 2254 * Check that we have all privileges. It would be nice to pare
1994 2255 * this down, but this is at least a first cut.
1995 2256 */
1996 2257 if ((privset = priv_allocset()) == NULL) {
1997 2258 zerror(zlogp, B_TRUE, "%s failed", "priv_allocset");
1998 2259 return (1);
1999 2260 }
2000 2261
2001 2262 if (getppriv(PRIV_EFFECTIVE, privset) != 0) {
2002 2263 zerror(zlogp, B_TRUE, "%s failed", "getppriv");
2003 2264 priv_freeset(privset);
2004 2265 return (1);
2005 2266 }
2006 2267
2007 2268 if (priv_isfullset(privset) == B_FALSE) {
2008 2269 zerror(zlogp, B_FALSE, "You lack sufficient privilege to "
2009 2270 "run this command (all privs required)");
2010 2271 priv_freeset(privset);
2011 2272 return (1);
2012 2273 }
2013 2274 priv_freeset(privset);
2014 2275
2015 2276 if (mkzonedir(zlogp) != 0)
2016 2277 return (1);
2017 2278
2018 2279 /*
2019 2280 * Pre-fork: setup shared state
2020 2281 */
2021 2282 if ((shstate = (void *)mmap(NULL, shstatelen,
2022 2283 PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANON, -1, (off_t)0)) ==
2023 2284 MAP_FAILED) {
2024 2285 zerror(zlogp, B_TRUE, "%s failed", "mmap");
2025 2286 return (1);
2026 2287 }
2027 2288 if (sema_init(&shstate->sem, 0, USYNC_PROCESS, NULL) != 0) {
2028 2289 zerror(zlogp, B_TRUE, "%s failed", "sema_init()");
2029 2290 (void) munmap((char *)shstate, shstatelen);
2030 2291 return (1);
2031 2292 }
2032 2293 shstate->log.logfile = NULL;
2033 2294 shstate->log.buflen = shstatelen - sizeof (*shstate);
2034 2295 shstate->log.loglen = shstate->log.buflen;
2035 2296 shstate->log.buf = (char *)shstate + sizeof (*shstate);
2036 2297 shstate->log.log = shstate->log.buf;
2037 2298 shstate->log.locale = parents_locale;
2038 2299 shstate->status = -1;
2039 2300
2040 2301 /*
2041 2302 * We need a SIGCHLD handler so the sema_wait() below will wake
2042 2303 * up if the child dies without doing a sema_post().
2043 2304 */
2044 2305 (void) sigset(SIGCHLD, sigchld);
2045 2306 /*
2046 2307 * We must mask SIGCHLD until after we've coped with the fork
2047 2308 * sufficiently to deal with it; otherwise we can race and
2048 2309 * receive the signal before pid has been initialized
2049 2310 * (yes, this really happens).
2050 2311 */
2051 2312 (void) sigemptyset(&block_cld);
2052 2313 (void) sigaddset(&block_cld, SIGCHLD);
2053 2314 (void) sigprocmask(SIG_BLOCK, &block_cld, NULL);
2054 2315
2055 2316 /*
2056 2317 * The parent only needs stderr after the fork, so close other fd's
2057 2318 * that we inherited from zoneadm so that the parent doesn't have those
2058 2319 * open while waiting. The child will close the rest after the fork.
2059 2320 */
2060 2321 closefrom(3);
2061 2322
2062 2323 if ((ctfd = init_template()) == -1) {
2063 2324 zerror(zlogp, B_TRUE, "failed to create contract");
2064 2325 return (1);
2065 2326 }
2066 2327
2067 2328 /*
2068 2329 * Do not let another thread localize a message while we are forking.
2069 2330 */
2070 2331 (void) mutex_lock(&msglock);
2071 2332 pid = fork();
2072 2333 (void) mutex_unlock(&msglock);
2073 2334
2074 2335 /*
2075 2336 * In all cases (parent, child, and in the event of an error) we
2076 2337 * don't want to cause creation of contracts on subsequent fork()s.
2077 2338 */
2078 2339 (void) ct_tmpl_clear(ctfd);
2079 2340 (void) close(ctfd);
2080 2341
2081 2342 if (pid == -1) {
2082 2343 zerror(zlogp, B_TRUE, "could not fork");
2083 2344 return (1);
2084 2345
2085 2346 } else if (pid > 0) { /* parent */
2086 2347 (void) sigprocmask(SIG_UNBLOCK, &block_cld, NULL);
2087 2348 /*
2088 2349 * This marks a window of vulnerability in which we receive
2089 2350 * the SIGCLD before falling into sema_wait (normally we would
2090 2351 * get woken up from sema_wait with EINTR upon receipt of
2091 2352 * SIGCLD). So we may need to use some other scheme like
2092 2353 * sema_posting in the sigcld handler.
2093 2354 * blech
2094 2355 */
2095 2356 (void) sema_wait(&shstate->sem);
2096 2357 (void) sema_destroy(&shstate->sem);
2097 2358 if (shstate->status != 0)
2098 2359 (void) waitpid(pid, NULL, WNOHANG);
2099 2360 /*
2100 2361 * It's ok if we die with SIGPIPE. It's not like we could have
2101 2362 * done anything about it.
2102 2363 */
2103 2364 (void) fprintf(stderr, "%s", shstate->log.buf);
2104 2365 _exit(shstate->status == 0 ? 0 : 1);
2105 2366 }
2106 2367
2107 2368 /*
2108 2369 * The child charges on.
2109 2370 */
2110 2371 (void) sigset(SIGCHLD, SIG_DFL);
2111 2372 (void) sigprocmask(SIG_UNBLOCK, &block_cld, NULL);
2112 2373
2113 2374 /*
2114 2375 * SIGPIPE can be delivered if we write to a socket for which the
2115 2376 * peer endpoint is gone. That can lead to too-early termination
2116 2377 * of zoneadmd, and that's not good eats.
2117 2378 */
2118 2379 (void) sigset(SIGPIPE, SIG_IGN);
2119 2380 /*
2120 2381 * Stop using stderr
2121 2382 */
2122 2383 zlogp = &shstate->log;
2123 2384
2124 2385 /*
2125 2386 * We don't need stdout/stderr from now on.
2126 2387 */
2127 2388 closefrom(0);
2128 2389
2129 2390 /*
2130 2391 * Initialize the syslog zlog_t. This needs to be done after
2131 2392 * the call to closefrom().
2132 2393 */
2133 2394 logsys.buf = logsys.log = NULL;
2134 2395 logsys.buflen = logsys.loglen = 0;
2135 2396 logsys.logfile = NULL;
2136 2397 logsys.locale = DEFAULT_LOCALE;
2137 2398
2138 2399 openlog("zoneadmd", LOG_PID, LOG_DAEMON);
2139 2400
2140 2401 /*
2141 2402 * The eventstream is used to publish state changes in the zone
2142 2403 * from the door threads to the console I/O poller.
2143 2404 */
2144 2405 if (eventstream_init() == -1) {
2145 2406 zerror(zlogp, B_TRUE, "unable to create eventstream");
2146 2407 goto child_out;
2147 2408 }
2148 2409
2149 2410 (void) snprintf(zone_door_path, sizeof (zone_door_path),
2150 2411 "%s" ZONE_DOOR_PATH, zonecfg_get_root(), zone_name);
2151 2412
2152 2413 /*
2153 2414 * See if another zoneadmd is running for this zone. If not, then we
2154 2415 * can now modify system state.
2155 2416 */
2156 2417 if (make_daemon_exclusive(zlogp) == -1)
2157 2418 goto child_out;
2158 2419
2159 2420
2160 2421 /*
2161 2422 * Create/join a new session; we need to be careful of what we do with
2162 2423 * the console from now on so we don't end up being the session leader
2163 2424 * for the terminal we're going to be handing out.
2164 2425 */
2165 2426 (void) setsid();
2166 2427
2167 2428 /*
2168 2429 * This thread shouldn't be receiving any signals; in particular,
2169 2430 * SIGCHLD should be received by the thread doing the fork().
2170 2431 */
2171 2432 (void) sigfillset(&blockset);
2172 2433 (void) thr_sigsetmask(SIG_BLOCK, &blockset, NULL);
2173 2434
2174 2435 /*
2175 2436 * Setup the console device and get ready to serve the console;
2176 2437 * once this has completed, we're ready to let console clients
2177 2438 * make an attempt to connect (they will block until
2178 2439 * serve_console_sock() below gets called, and any pending
2179 2440 * connection is accept()ed).
2180 2441 */
2181 2442 if (!zonecfg_in_alt_root() && init_console(zlogp) < 0)
2182 2443 goto child_out;
2183 2444
2184 2445 /*
2185 2446 * Take the lock now, so that when the door server gets going, we
2186 2447 * are guaranteed that it won't take a request until we are sure
2187 2448 * that everything is completely set up. See the child_out: label
2188 2449 * below to see why this matters.
2189 2450 */
2190 2451 (void) mutex_lock(&lock);
2191 2452
2192 2453 /* Init semaphore for scratch zones. */
2193 2454 if (sema_init(&scratch_sem, 0, USYNC_THREAD, NULL) == -1) {
2194 2455 zerror(zlogp, B_TRUE,
2195 2456 "failed to initialize semaphore for scratch zone");
2196 2457 goto child_out;
2197 2458 }
2198 2459
2199 2460 /* open the dladm handle */
2200 2461 if (dladm_open(&dld_handle) != DLADM_STATUS_OK) {
2201 2462 zerror(zlogp, B_FALSE, "failed to open dladm handle");
2202 2463 goto child_out;
2203 2464 }
2204 2465
2205 2466 /*
2206 2467 * Note: door setup must occur *after* the console is setup.
2207 2468 * This is so that as zlogin tests the door to see if zoneadmd
2208 2469 * is ready yet, we know that the console will get serviced
2209 2470 * once door_info() indicates that the door is "up".
2210 2471 */
2211 2472 if (setup_door(zlogp) == -1)
2212 2473 goto child_out;
2213 2474
2214 2475 /*
2215 2476 * Things seem OK so far; tell the parent process that we're done
2216 2477 * with setup tasks. This will cause the parent to exit, signalling
2217 2478 * to zoneadm, zlogin, or whatever forked it that we are ready to
2218 2479 * service requests.
2219 2480 */
2220 2481 shstate->status = 0;
2221 2482 (void) sema_post(&shstate->sem);
2222 2483 (void) munmap((char *)shstate, shstatelen);
2223 2484 shstate = NULL;
2224 2485
2225 2486 (void) mutex_unlock(&lock);
2226 2487
2227 2488 /*
2228 2489 * zlogp is now invalid, so reset it to the syslog logger.
2229 2490 */
2230 2491 zlogp = &logsys;
2231 2492
2232 2493 /*
2233 2494 * Now that we are free of any parents, switch to the default locale.
2234 2495 */
2235 2496 (void) setlocale(LC_ALL, DEFAULT_LOCALE);
2236 2497
2237 2498 /*
2238 2499 * At this point the setup portion of main() is basically done, so
2239 2500 * we reuse this thread to manage the zone console. When
2240 2501 * serve_console() has returned, we are past the point of no return
2241 2502 * in the life of this zoneadmd.
2242 2503 */
2243 2504 if (zonecfg_in_alt_root()) {
2244 2505 /*
2245 2506 * This is just awful, but mounted scratch zones don't (and
2246 2507 * can't) have consoles. We just wait for unmount instead.
2247 2508 */
2248 2509 while (sema_wait(&scratch_sem) == EINTR)
2249 2510 ;
2250 2511 } else {
2251 2512 serve_console(zlogp);
2252 2513 assert(in_death_throes);
2253 2514 }
2254 2515
2255 2516 /*
2256 2517 * This is the next-to-last part of the exit interlock. Upon calling
2257 2518 * fdetach(), the door will go unreferenced; once any
2258 2519 * outstanding requests (like the door thread doing Z_HALT) are
2259 2520 * done, the door will get an UNREF notification; when it handles
2260 2521 * the UNREF, the door server will cause the exit. It's possible
2261 2522 * that fdetach() can fail because the file is in use, in which
2262 2523 * case we'll retry the operation.
2263 2524 */
2264 2525 assert(!MUTEX_HELD(&lock));
2265 2526 for (;;) {
2266 2527 if ((fdetach(zone_door_path) == 0) || (errno != EBUSY))
2267 2528 break;
2268 2529 yield();
2269 2530 }
2270 2531
2271 2532 for (;;)
2272 2533 (void) pause();
2273 2534
2274 2535 child_out:
2275 2536 assert(pid == 0);
2276 2537 if (shstate != NULL) {
2277 2538 shstate->status = -1;
2278 2539 (void) sema_post(&shstate->sem);
2279 2540 (void) munmap((char *)shstate, shstatelen);
2280 2541 }
2281 2542
2282 2543 /*
2283 2544 * This might trigger an unref notification, but if so,
2284 2545 * we are still holding the lock, so our call to exit will
2285 2546 * ultimately win the race and will publish the right exit
2286 2547 * code.
2287 2548 */
2288 2549 if (zone_door != -1) {
2289 2550 assert(MUTEX_HELD(&lock));
2290 2551 (void) door_revoke(zone_door);
2291 2552 (void) fdetach(zone_door_path);
2292 2553 }
2293 2554
2294 2555 if (dld_handle != NULL)
2295 2556 dladm_close(dld_handle);
2296 2557
2297 2558 return (1); /* return from main() forcibly exits an MT process */
2298 2559 }