OS-5223 removed shm segment is no longer available
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
--- old/usr/src/uts/common/os/ipc.c
+++ new/usr/src/uts/common/os/ipc.c
1 1 /*
2 2 * CDDL HEADER START
3 3 *
4 4 * The contents of this file are subject to the terms of the
5 5 * Common Development and Distribution License (the "License").
6 6 * You may not use this file except in compliance with the License.
7 7 *
8 8 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
9 9 * or http://www.opensolaris.org/os/licensing.
10 10 * See the License for the specific language governing permissions
11 11 * and limitations under the License.
12 12 *
13 13 * When distributing Covered Code, include this CDDL HEADER in each
14 14 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
15 15 * If applicable, add the following below this CDDL HEADER, with the
16 16 * fields enclosed by brackets "[]" replaced with your own identifying
17 17 * information: Portions Copyright [yyyy] [name of copyright owner]
18 18 *
19 19 * CDDL HEADER END
20 20 */
21 21 /*
22 22 * Copyright (c) 1988, 2010, Oracle and/or its affiliates. All rights reserved.
23 + * Copyright 2016 Joyent, Inc.
23 24 */
24 25
25 26 /* Copyright (c) 1984, 1986, 1987, 1988, 1989 AT&T */
26 27 /* All Rights Reserved */
27 28
28 29
29 30 /*
30 31 * Common Inter-Process Communication routines.
31 32 *
32 33 * Overview
33 34 * --------
34 35 *
35 36 * The System V inter-process communication (IPC) facilities provide
36 37 * three services, message queues, semaphore arrays, and shared memory
37 38 * segments, which are managed using filesystem-like namespaces.
38 39 * Unlike a filesystem, these namespaces aren't mounted and accessible
39 40 * via a path -- a special API is used to interact with the different
40 41 * facilities (nothing precludes a VFS-based interface, but the
41 42 * standards require the special APIs). Furthermore, these special
42 43 * APIs don't use file descriptors, nor do they have an equivalent.
43 44 * This means that every operation which acts on an object needs to
44 45 * perform the equivalent of a lookup, which in turn means that every
45 46 * operation can fail if the specified object doesn't exist in the
46 47 * facility's namespace.
47 48 *
48 49 * Objects
49 50 * -------
50 51 *
51 52 * Each object in a namespace has a unique ID, which is assigned by the
52 53 * system and is used to identify the object when performing operations
53 54 * on it. An object can also have a key, which is selected by the user
54 55 * at allocation time and is used as a primitive rendezvous mechanism.
55 56 * An object without a key is said to have a "private" key.
56 57 *
57 58 * To perform an operation on an object given its key, one must first
58 59 * perform a lookup and obtain its ID. The ID is then used to identify
59 60 * the object when performing the operation. If the object has a
60 61 * private key, the ID must be known or obtained by other means.
61 62 *
62 63 * Each object in the namespace has a creator uid and gid, as well as
63 64 * an owner uid and gid. Both are initialized with the ruid and rgid
64 65 * of the process which created the object. The creator or current
65 66 * owner has the ability to change the owner of the object.
66 67 *
67 68 * Each object in the namespace has a set of file-like permissions,
68 69 * which, in conjunction with the creator and owner uid and gid,
69 70 * control read and write access to the object (execute is ignored).
70 71 *
71 72 * Each object also has a creator project and zone, which are used to
72 73 * account for its resource usage.
73 74 *
74 75 * Operations
75 76 * ----------
76 77 *
77 78 * There are five operations which all three facilities have in
78 79 * common: GET, SET, STAT, RMID, and IDS.
79 80 *
80 81 * GET, like open, is used to allocate a new object or obtain an
81 82 * existing one (using its key). It takes a key, a set of flags and
82 83 * mode bits, and optionally facility-specific arguments. If the key
83 84 * is IPC_PRIVATE, a new object with the requested mode bits and
84 85 * facility-specific attributes is created. If the key isn't
85 86 * IPC_PRIVATE, the GET will attempt to look up the specified key and
86 87 * either return that object or create a new one depending on the state of the
87 88 * IPC_CREAT and IPC_EXCL flags, much like open. If GET needs to
88 89 * allocate an object, it can fail if there is insufficient space in
89 90 * the namespace (the maximum number of ids for the facility has been
90 91 * exceeded) or if the facility-specific initialization fails. If GET
91 92 * finds an object it can return, it can still fail if that object's
92 93 * permissions or facility-specific attributes are less than those
93 94 * requested.
94 95 *
95 96 * SET is used to adjust facility-specific parameters of an object, in
96 97 * addition to the owner uid and gid, and mode bits. It can fail if
97 98 * the caller isn't the creator or owner.
98 99 *
99 100 * STAT is used to obtain information about an object including the
100 101 * general attributes described above as well as facility-specific
101 102 * information. It can fail if the caller doesn't have read
102 103 * permission.
103 104 *
104 105 * RMID removes an object from the namespace. Subsequent operations
105 106 * using the object's ID or key will fail (until another object is
106 107 * created with the same key or ID). Since an RMID may be performed
107 108 * asynchronously with other operations, it is possible that other
108 109 * threads and/or processes will have references to the object. While
109 110 * a facility may have actions which need to be performed at RMID time,
110 111 * only when all references are dropped can the object be destroyed.
111 112 * RMID will fail if the caller isn't the creator or owner.
112 113 *
113 114 * IDS obtains a list of all IDs in a facility's namespace. There are
114 115 * no facility-specific behaviors of IDS.
115 116 *
116 117 * Design
117 118 * ------
118 119 *
119 120 * Because some IPC facilities provide services whose operations must
120 121 * scale, a mechanism which allows fast, concurrent access to
121 122 * individual objects is needed. Of primary importance is object
122 123 * lookup based on ID (SET, STAT, others). Allocation (GET),
123 124 * deallocation (RMID), ID enumeration (IDS), and key lookups (GET) are
124 125 * lesser concerns, but should be implemented in such a way that ID
125 126 * lookup isn't affected (at least not in the common case).
126 127 *
127 128 * Starting from the bottom up, each object is represented by a
128 129 * structure, the first member of which must be a kipc_perm_t. The
129 130 * kipc_perm_t contains the information described above in "Objects", a
130 131 * reference count (since the object may continue to exist after it has
131 132 * been removed from the namespace), as well as some additional
132 133 * metadata used to manage data structure membership. These objects
133 134 * are dynamically allocated.
134 135 *
135 136 * Above the objects is a power-of-two sized table of ID slots. Each
136 137 * slot contains a pointer to an object, a sequence number, and a
137 138 * lock. An object's ID is a function of its slot's index in the table
138 139 * and its slot's sequence number. Every time a slot is released (via
139 140 * RMID) its sequence number is increased. Strictly speaking, the
140 141 * sequence number is unnecessary. However, checking the sequence
141 142 * number after a lookup provides a certain degree of robustness
142 143 * against the use of stale IDs (useful since nothing else does). When
143 144 * the table fills up, it is resized (see Locking, below).
144 145 *
145 146 * Of an ID's 31 bits (an ID is, as defined by the standards, a signed
146 147 * int) the top IPC_SEQ_BITS are used for the sequence number with the
147 148 * remainder holding the index into the table. The size of the table
148 149 * is therefore bounded at 2 ^ (31 - IPC_SEQ_BITS) slots.
149 150 *
150 151 * Managing this table is the ipc_service structure. It contains a
151 152 * pointer to the dynamically allocated ID table, a namespace-global
152 153 * lock, an id_space for managing the free space in the table, and
153 154 * sundry other metadata necessary for the maintenance of the
154 155 * namespace. An AVL tree of all keyed objects in the table (sorted by
155 156 * key) is used for key lookups. An unordered doubly linked list of
156 157 * all objects in the namespace (keyed or not) is maintained to
157 158 * facilitate ID enumeration.
158 159 *
159 160 * To help visualize these relationships, here's a picture of a
160 161 * namespace with a table of size 8 containing three objects
161 162 * (IPC_SEQ_BITS = 28):
162 163 *
163 164 *
164 165 * +-ipc_service_t--+
165 166 * | table *---\
166 167 * | keys *---+----------------------\
167 168 * | all ids *--\| |
168 169 * | | || |
169 170 * +----------------+ || |
170 171 * || |
171 172 * /-------------------/| |
172 173 * | /---------------/ |
173 174 * | | |
174 175 * | v |
175 176 * | +-0------+-1------+-2------+-3------+-4--+---+-5------+-6------+-7------+
176 177 * | | Seq=3 | | | Seq=1 | : | | | Seq=6 |
177 178 * | | | | | | : | | | |
178 179 * | +-*------+--------+--------+-*------+----+---+--------+--------+-*------+
179 180 * | | | | |
180 181 * | | /---/ | /----------------/
181 182 * | | | | |
182 183 * | v v | v
183 184 * | +-kipc_perm_t-+ +-kipc_perm_t-+ | +-kipc_perm_t-+
184 185 * | | id=0x30 | | id=0x13 | | | id=0x67 |
185 186 * | | key=0xfeed | | key=0xbeef | | | key=0xcafe |
186 187 * \->| [list] |<------>| [list] |<------>| [list] |
187 188 * /->| [avl left] x /--->| [avl left] x \--->| [avl left] *---\
188 189 * | | [avl right] x | | [avl right] x | [avl right] *---+-\
189 190 * | | | | | | | | | |
190 191 * | +-------------+ | +-------------+ +-------------+ | |
191 192 * | \---------------------------------------------/ |
192 193 * \--------------------------------------------------------------------/
193 194 *
194 195 * Locking
195 196 * -------
196 197 *
197 198 * There are three locks (or sets of locks) which are used to ensure
198 199 * correctness: the slot locks, the namespace lock, and p_lock (needed
199 200 * when checking resource controls). Their ordering is
200 201 *
201 202 * namespace lock -> slot lock 0 -> ... -> slot lock t -> p_lock
202 203 *
203 204 * Generally speaking, the namespace lock is used to protect allocation
204 205 * and removal from the namespace, ID enumeration, and resizing the ID
205 206 * table. Specifically:
206 207 *
207 208 * - write access to all fields of the ipc_service structure
208 209 * - read access to all variable fields of ipc_service except
209 210 * ipcs_tabsz (table size) and ipcs_table (the table pointer)
210 211 * - read/write access to ipc_avl, ipc_list in visible objects'
211 212 * kipc_perm structures (i.e. objects which have been removed from
212 213 * the namespace don't have this restriction)
213 214 * - write access to ipct_seq and ipct_data in the table entries
214 215 *
215 216 * A slot lock by itself is meaningless (except when resizing). Of
216 217 * greater interest conceptually is the notion of an ID lock -- a
217 218 * "virtual lock" which refers to whichever slot lock an object's ID
218 219 * currently hashes to.
219 220 *
220 221 * An ID lock protects all objects with that ID. Normally there will
221 222 * only be one such object: the one pointed to by the locked slot.
222 223 * However, if an object is removed from the namespace but retains
223 224 * references (e.g. an attached shared memory segment which has been
224 225 * RMIDed), it continues to use the lock associated with its original
225 226 * ID. While this can result in increased contention, operations which
226 227 * require taking the ID lock of removed objects are infrequent.
227 228 *
228 229 * Specifically, an ID lock protects the contents of an object's
229 230 * structure, including the contents of the embedded kipc_perm
230 231 * structure (but excluding those fields protected by the namespace
231 232 * lock). It also protects the ipct_seq and ipct_data fields in its
232 233 * slot (it is really a slot lock, after all).
233 234 *
234 235 * Recall that the table is resizable. To avoid requiring every ID
235 236 * lookup to take a global lock, a scheme much like that employed for
236 237 * file descriptors (see the comment above UF_ENTER in user.h) is
237 238 * used. Note that the sequence number and data pointer are protected
238 239 * by both the namespace lock and their slot lock. When the table is
239 240 * resized, the following operations take place:
240 241 *
241 242 * 1) A new table is allocated.
242 243 * 2) The global lock is taken.
243 244 * 3) All old slots are locked, in order.
244 245 * 4) The first half of the new slots are locked.
245 246 * 5) All table entries are copied to the new table, and cleared from
246 247 * the old table.
247 248 * 6) The ipc_service structure is updated to point to the new table.
248 249 * 7) The ipc_service structure is updated with the new table size.
249 250 * 8) All slot locks (old and new) are dropped.
250 251 *
251 252 * Because the slot locks are embedded in the table, ID lookups and
252 253 * other operations which require taking a slot lock need to verify
253 254 * that the lock taken wasn't part of a stale table. This is
254 255 * accomplished by checking the table size before and after
255 256 * dereferencing the table pointer and taking the lock: if the size
256 257 * changes, the lock must be dropped and reacquired. It is this
257 258 * additional work which distinguishes an ID lock from a slot lock.
258 259 *
259 260 * Because we can't guarantee that threads aren't accessing the old
260 261 * tables' locks, they are never deallocated. To prevent spurious
261 262 * reports of memory leaks, a pointer to the discarded table is stored
262 263 * in the new one in step 5. (Theoretically ipcs_destroy will delete
263 264 * the discarded tables, but it is only ever called from a failed _init
264 265 * invocation; i.e. when there aren't any.)
265 266 *
266 267 * Interfaces
267 268 * ----------
268 269 *
269 270 * The following interfaces are provided by the ipc module for use by
270 271 * the individual IPC facilities:
271 272 *
272 273 * ipcperm_access
273 274 *
274 275 * Given an object and a cred structure, determines if the requested
275 276 * access type is allowed.
276 277 *
277 278 * ipcperm_set, ipcperm_stat,
278 279 * ipcperm_set64, ipcperm_stat64
279 280 *
280 281 * Performs the common portion of a STAT or SET operation. All
281 282 * (except stat and stat64) can fail, so they should be called before
282 283 * any facility-specific non-reversible changes are made to an
283 284 * object. Similarly, the set operations have side effects, so they
284 285 * should only be called once the possibility of a facility-specific
285 286 * failure is eliminated.
286 287 *
287 288 * ipcs_create
288 289 *
289 290 * Creates an IPC namespace for use by an IPC facility.
290 291 *
291 292 * ipcs_destroy
292 293 *
293 294 * Destroys an IPC namespace.
294 295 *
295 296 * ipcs_lock, ipcs_unlock
296 297 *
297 298 * Takes the namespace lock. Ideally such access wouldn't be
298 299 * necessary, but there may be facility-specific data protected by
299 300 * this lock (e.g. project-wide resource consumption).
300 301 *
301 302 * ipc_lock
302 303 *
303 304 * Takes the lock associated with an ID. Can't fail.
304 305 *
305 306 * ipc_relock
306 307 *
307 308 * Like ipc_lock, but takes a pointer to a held lock. Drops the lock
308 309 * unless it is the one that would have been returned by ipc_lock.
309 310 * Used after calls to cv_wait.
310 311 *
311 312 * ipc_lookup
312 313 *
313 314 * Performs an ID lookup, returns with the ID lock held. Fails if
314 315 * the ID doesn't exist in the namespace.
315 316 *
316 317 * ipc_hold
317 318 *
318 319 * Takes a reference on an object.
319 320 *
320 321 * ipc_rele
321 322 *
322 323 * Releases a reference on an object, and drops the object's lock.
323 324 * Calls the object's destructor if last reference is being
324 325 * released.
325 326 *
326 327 * ipc_rele_locked
327 328 *
328 329 * Releases a reference on an object. Doesn't drop lock, and may
329 330 * only be called when there is more than one reference to the
330 331 * object.
331 332 *
332 333 * ipc_get, ipc_commit_begin, ipc_commit_end, ipc_cleanup
333 334 *
334 335 * Components of a GET operation. ipc_get performs a key lookup,
335 336 * allocating an object if the key isn't found (returning with the
336 337 * namespace lock and p_lock held), and returning the existing object
337 338 * if it is (with the object lock held). ipc_get doesn't modify the
338 339 * namespace.
339 340 *
340 341 * ipc_commit_begin begins the process of inserting an object
341 342 * allocated by ipc_get into the namespace, and can fail. If
342 343 * successful, it returns with the namespace lock and p_lock held.
343 344 * ipc_commit_end completes the process of inserting an object into
344 345 * the namespace and can't fail. The facility can call ipc_cleanup
345 346 * at any time following a successful ipc_get and before
346 347 * ipc_commit_end or a failed ipc_commit_begin to fail the
347 348 * allocation. Pseudocode for the suggested GET implementation:
348 349 *
349 350 * top:
350 351 *
351 352 * ipc_get
352 353 *
353 354 * if failure
354 355 * return
355 356 *
356 357 * if found {
357 358 *
358 359 * if object meets criteria
359 360 * unlock object and return success
360 361 * else
361 362 * unlock object and return failure
362 363 *
363 364 * } else {
364 365 *
365 366 * perform resource control tests
366 367 * drop namespace lock, p_lock
367 368 * if failure
368 369 * ipc_cleanup
369 370 *
370 371 * perform facility-specific initialization
371 372 * if failure {
372 373 * facility-specific cleanup
373 374 * ipc_cleanup
374 375 * }
375 376 *
376 377 * ( At this point the object should be destructible using the
377 378 * destructor given to ipcs_create )
378 379 *
379 380 * ipc_commit_begin
380 381 * if retry
381 382 * goto top
382 383 * else if failure
383 384 * return
384 385 *
385 386 * perform facility-specific resource control tests/allocations
386 387 * if failure
387 388 * ipc_cleanup
388 389 *
389 390 * ipc_commit_end
390 391 * perform any infallible post-creation actions, unlock, and return
391 392 *
392 393 * }
393 394 *
394 395 * ipc_rmid
395 396 *
396 397 * Performs the common portion of an RMID operation -- looks up an ID,
397 398 * removes it, and calls a facility-specific function to do
398 399 * RMID-time cleanup on the private portions of the object.
399 400 *
400 401 * ipc_ids
401 402 *
402 403 * Performs the common portion of an IDS operation.
403 404 *
404 405 */
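The ID layout described in the Design section of the block comment above (sequence number in the top IPC_SEQ_BITS of the 31 usable bits, table index in the remainder) can be sketched in userland C as follows. This is only an illustration: every `EX_`/`ex_` name is hypothetical, the 7-bit sequence width is an arbitrary assumption rather than the value in the real header, and wrapping the sequence number on overflow is likewise assumed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical constants for illustration only; the real values live in
 * the IPC implementation header and may differ.
 */
#define	EX_ID_BITS	31
#define	EX_SEQ_BITS	7	/* assumed width of the sequence field */
#define	EX_SEQ_SHIFT	(EX_ID_BITS - EX_SEQ_BITS)
#define	EX_SEQ_MASK	((1u << EX_SEQ_BITS) - 1)
#define	EX_INDEX_MASK	((1u << EX_SEQ_SHIFT) - 1)
#define	EX_MAX_SLOTS	(1u << EX_SEQ_SHIFT)	/* 2 ^ (31 - EX_SEQ_BITS) */

/* Build a non-negative ID from a slot's sequence number and table index. */
static int
ex_make_id(uint32_t seq, uint32_t index)
{
	assert(index <= EX_INDEX_MASK);
	/* Assumption: the sequence number wraps within its field. */
	return ((int)(((seq & EX_SEQ_MASK) << EX_SEQ_SHIFT) | index));
}

/* Recover the table index (low bits) from an ID. */
static uint32_t
ex_id_index(int id)
{
	return ((uint32_t)id & EX_INDEX_MASK);
}

/* Recover the sequence number (high bits) from an ID. */
static uint32_t
ex_id_seq(int id)
{
	return ((uint32_t)id >> EX_SEQ_SHIFT);
}
```

Under these assumptions a stale ID differs from the slot's current ID in its high bits, which is what lets a lookup reject it by comparing the stored ID against the requested one.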
405 406
406 407 #include <sys/types.h>
407 408 #include <sys/param.h>
408 409 #include <sys/cred.h>
409 410 #include <sys/policy.h>
410 411 #include <sys/proc.h>
411 412 #include <sys/user.h>
412 413 #include <sys/ipc.h>
413 414 #include <sys/ipc_impl.h>
414 415 #include <sys/errno.h>
415 416 #include <sys/systm.h>
416 417 #include <sys/list.h>
417 418 #include <sys/atomic.h>
418 419 #include <sys/zone.h>
419 420 #include <sys/task.h>
420 421 #include <sys/modctl.h>
421 422
422 423 #include <c2/audit.h>
423 424
424 425 static struct modlmisc modlmisc = {
425 426 &mod_miscops,
426 427 "common ipc code",
427 428 };
428 429
429 430 static struct modlinkage modlinkage = {
430 431 MODREV_1, (void *)&modlmisc, NULL
431 432 };
432 433
433 434
434 435 int
435 436 _init(void)
436 437 {
437 438 return (mod_install(&modlinkage));
438 439 }
439 440
440 441 int
441 442 _fini(void)
442 443 {
443 444 return (mod_remove(&modlinkage));
444 445 }
445 446
446 447 int
447 448 _info(struct modinfo *modinfop)
448 449 {
449 450 return (mod_info(&modlinkage, modinfop));
450 451 }
451 452
452 453
453 454 /*
454 455 * Check message, semaphore, or shared memory access permissions.
455 456 *
456 457 * This routine verifies the requested access permission for the current
457 458 * process. The zone ids are compared, and the appropriate bits are
458 459 * checked corresponding to owner, group (including the list of
459 460 * supplementary groups), or everyone. Zero is returned on success.
460 461 * On failure, the security policy is asked whether to override the
461 462 * permissions check; the policy will either return 0 for access granted
462 463 * or EACCES.
463 464 *
464 465 * Access to objects in other zones requires that the caller be in the
465 466 * global zone and have the appropriate IPC_DAC_* privilege, regardless
466 467 * of whether the uid or gid match those of the object. Note that
467 468 * cross-zone accesses will normally never get here since they'll
468 469 * fail in ipc_lookup or ipc_get.
469 470 *
470 471 * The arguments must be set up as follows:
471 472 * p - Pointer to permission structure to verify
472 473 * mode - Desired access permissions
473 474 */
474 475 int
475 476 ipcperm_access(kipc_perm_t *p, int mode, cred_t *cr)
476 477 {
477 478 int shifts = 0;
478 479 uid_t uid = crgetuid(cr);
479 480 zoneid_t zoneid = getzoneid();
480 481
481 482 if (p->ipc_zoneid == zoneid) {
482 483 if (uid != p->ipc_uid && uid != p->ipc_cuid) {
483 484 shifts += 3;
484 485 if (!groupmember(p->ipc_gid, cr) &&
485 486 !groupmember(p->ipc_cgid, cr))
486 487 shifts += 3;
487 488 }
488 489
489 490 mode &= ~(p->ipc_mode << shifts);
490 491
491 492 if (mode == 0)
492 493 return (0);
493 494 } else if (zoneid != GLOBAL_ZONEID)
494 495 return (EACCES);
495 496
496 497 return (secpolicy_ipc_access(cr, p, mode));
497 498 }
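The shift logic in ipcperm_access can be hard to read at a glance: the requested mode is expressed in the owner bit positions, and the object's mode is shifted left by 3 or 6 so that its group or "other" bits line up with the request. A minimal userland model of just that step is sketched below; `ex_denied_bits` and its boolean parameters are hypothetical stand-ins for the uid and groupmember checks, and the zone and security-policy handling is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the mode check in ipcperm_access.  Returns the
 * requested bits that the object's mode does NOT grant; zero means the
 * ordinary permission check succeeds without consulting policy.
 */
static int
ex_denied_bits(int obj_mode, int want, bool is_owner, bool in_group)
{
	int shifts = 0;

	if (!is_owner) {
		shifts += 3;		/* compare against the group bits */
		if (!in_group)
			shifts += 3;	/* compare against the "other" bits */
	}

	/* Clear every requested bit that the selected class grants. */
	return (want & ~(obj_mode << shifts));
}
```

For an object with mode 0640, a group member asking for read (0400) gets zero back, while the same caller asking for write (0200) gets 0200 back, which in the real function would be handed to the security policy.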
498 499
499 500 /*
500 501 * There are two versions of the ipcperm_set/stat functions:
501 502 * ipcperm_??? - for use with IPC_SET/STAT
502 503 * ipcperm_???_64 - for use with IPC_SET64/STAT64
503 504 *
504 505 * These functions encapsulate the common portions (copying, permission
505 506 * checks, and auditing) of the set/stat operations. All, except for
506 507 * stat and stat_64 which are void, return 0 on success or a non-zero
507 508 * errno value on error.
508 509 */
509 510
510 511 int
511 512 ipcperm_set(ipc_service_t *service, struct cred *cr,
512 513 kipc_perm_t *kperm, struct ipc_perm *perm, model_t model)
513 514 {
514 515 STRUCT_HANDLE(ipc_perm, lperm);
515 516 uid_t uid;
516 517 gid_t gid;
517 518 mode_t mode;
518 519 zone_t *zone;
519 520
520 521 ASSERT(IPC_LOCKED(service, kperm));
521 522
522 523 STRUCT_SET_HANDLE(lperm, model, perm);
523 524 uid = STRUCT_FGET(lperm, uid);
524 525 gid = STRUCT_FGET(lperm, gid);
525 526 mode = STRUCT_FGET(lperm, mode);
526 527
527 528 if (secpolicy_ipc_owner(cr, kperm) != 0)
528 529 return (EPERM);
529 530
530 531 zone = crgetzone(cr);
531 532 if (!VALID_UID(uid, zone) || !VALID_GID(gid, zone))
532 533 return (EINVAL);
533 534
534 535 kperm->ipc_uid = uid;
535 536 kperm->ipc_gid = gid;
536 537 kperm->ipc_mode = (mode & 0777) | (kperm->ipc_mode & ~0777);
537 538
538 539 if (AU_AUDITING())
539 540 audit_ipcget(service->ipcs_atype, kperm);
540 541
541 542 return (0);
542 543 }
543 544
544 545 void
545 546 ipcperm_stat(struct ipc_perm *perm, kipc_perm_t *kperm, model_t model)
546 547 {
547 548 STRUCT_HANDLE(ipc_perm, lperm);
548 549
549 550 STRUCT_SET_HANDLE(lperm, model, perm);
550 551 STRUCT_FSET(lperm, uid, kperm->ipc_uid);
551 552 STRUCT_FSET(lperm, gid, kperm->ipc_gid);
552 553 STRUCT_FSET(lperm, cuid, kperm->ipc_cuid);
553 554 STRUCT_FSET(lperm, cgid, kperm->ipc_cgid);
554 555 STRUCT_FSET(lperm, mode, kperm->ipc_mode);
555 556 STRUCT_FSET(lperm, seq, 0);
556 557 STRUCT_FSET(lperm, key, kperm->ipc_key);
557 558 }
558 559
559 560 int
560 561 ipcperm_set64(ipc_service_t *service, struct cred *cr,
561 562 kipc_perm_t *kperm, ipc_perm64_t *perm64)
562 563 {
563 564 zone_t *zone;
564 565
565 566 ASSERT(IPC_LOCKED(service, kperm));
566 567
567 568 if (secpolicy_ipc_owner(cr, kperm) != 0)
568 569 return (EPERM);
569 570
570 571 zone = crgetzone(cr);
571 572 if (!VALID_UID(perm64->ipcx_uid, zone) ||
572 573 !VALID_GID(perm64->ipcx_gid, zone))
573 574 return (EINVAL);
574 575
575 576 kperm->ipc_uid = perm64->ipcx_uid;
576 577 kperm->ipc_gid = perm64->ipcx_gid;
577 578 kperm->ipc_mode = (perm64->ipcx_mode & 0777) |
578 579 (kperm->ipc_mode & ~0777);
579 580
580 581 if (AU_AUDITING())
581 582 audit_ipcget(service->ipcs_atype, kperm);
582 583
583 584 return (0);
584 585 }
585 586
586 587 void
587 588 ipcperm_stat64(ipc_perm64_t *perm64, kipc_perm_t *kperm)
588 589 {
589 590 perm64->ipcx_uid = kperm->ipc_uid;
590 591 perm64->ipcx_gid = kperm->ipc_gid;
591 592 perm64->ipcx_cuid = kperm->ipc_cuid;
592 593 perm64->ipcx_cgid = kperm->ipc_cgid;
593 594 perm64->ipcx_mode = kperm->ipc_mode;
594 595 perm64->ipcx_key = kperm->ipc_key;
595 596 perm64->ipcx_projid = kperm->ipc_proj->kpj_id;
596 597 perm64->ipcx_zoneid = kperm->ipc_zoneid;
597 598 }
598 599
599 600
600 601 /*
601 602 * ipc key comparator.
602 603 */
603 604 static int
604 605 ipc_key_compar(const void *a, const void *b)
605 606 {
606 607 kipc_perm_t *aperm = (kipc_perm_t *)a;
607 608 kipc_perm_t *bperm = (kipc_perm_t *)b;
608 609 int ak = aperm->ipc_key;
609 610 int bk = bperm->ipc_key;
610 611 zoneid_t az;
611 612 zoneid_t bz;
612 613
613 614 ASSERT(ak != IPC_PRIVATE);
614 615 ASSERT(bk != IPC_PRIVATE);
615 616
616 617 /*
617 618 * Compare key first, then zoneid. This optimizes performance for
618 619 * systems with only one zone, since the zone checks will only be
619 620 * made when the keys match.
620 621 */
621 622 if (ak < bk)
622 623 return (-1);
623 624 if (ak > bk)
624 625 return (1);
625 626
626 627 /* keys match */
627 628 az = aperm->ipc_zoneid;
628 629 bz = bperm->ipc_zoneid;
629 630 if (az < bz)
630 631 return (-1);
631 632 if (az > bz)
632 633 return (1);
633 634 return (0);
634 635 }
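To illustrate the ordering that ipc_key_compar imposes (key first, zone ID as the tie-breaker), here is a small userland sketch. The `ex_perm_t` type and `ex_key_compar` function are hypothetical reductions of kipc_perm_t and the comparator above, keeping only the two fields that take part in the comparison.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two kipc_perm_t fields being compared. */
typedef struct ex_perm {
	int	key;
	int	zoneid;
} ex_perm_t;

/* Order by key, then by zone ID, mirroring the comparator above. */
static int
ex_key_compar(const void *a, const void *b)
{
	const ex_perm_t *ap = a;
	const ex_perm_t *bp = b;

	if (ap->key != bp->key)
		return (ap->key < bp->key ? -1 : 1);
	if (ap->zoneid != bp->zoneid)
		return (ap->zoneid < bp->zoneid ? -1 : 1);
	return (0);
}
```

Two objects with the same key in different zones compare as unequal, so each zone can hold its own object under a given key; the function can also be passed to qsort to see the resulting order.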
635 636
636 637 /*
637 638 * Create an ipc service.
638 639 */
639 640 ipc_service_t *
640 641 ipcs_create(const char *name, rctl_hndl_t proj_rctl, rctl_hndl_t zone_rctl,
641 642 size_t size, ipc_func_t *dtor, ipc_func_t *rmid, int audit_type,
642 643 size_t rctl_offset)
643 644 {
644 645 ipc_service_t *result;
645 646
646 647 result = kmem_alloc(sizeof (ipc_service_t), KM_SLEEP);
647 648
648 649 mutex_init(&result->ipcs_lock, NULL, MUTEX_ADAPTIVE, NULL);
649 650 result->ipcs_count = 0;
650 651 avl_create(&result->ipcs_keys, ipc_key_compar, size, 0);
651 652 result->ipcs_tabsz = IPC_IDS_MIN;
652 653 result->ipcs_table =
653 654 kmem_zalloc(IPC_IDS_MIN * sizeof (ipc_slot_t), KM_SLEEP);
654 655 result->ipcs_ssize = size;
655 656 result->ipcs_ids = id_space_create(name, 0, IPC_IDS_MIN);
656 657 result->ipcs_dtor = dtor;
657 658 result->ipcs_rmid = rmid;
658 659 result->ipcs_proj_rctl = proj_rctl;
659 660 result->ipcs_zone_rctl = zone_rctl;
660 661 result->ipcs_atype = audit_type;
661 662 ASSERT(rctl_offset < sizeof (ipc_rqty_t));
662 663 result->ipcs_rctlofs = rctl_offset;
663 664 list_create(&result->ipcs_usedids, sizeof (kipc_perm_t),
664 665 offsetof(kipc_perm_t, ipc_list));
665 666
666 667 return (result);
667 668 }
668 669
669 670 /*
670 671 * Destroy an ipc service.
671 672 */
672 673 void
673 674 ipcs_destroy(ipc_service_t *service)
674 675 {
675 676 ipc_slot_t *slot, *next;
676 677
677 678 mutex_enter(&service->ipcs_lock);
678 679
679 680 ASSERT(service->ipcs_count == 0);
680 681 avl_destroy(&service->ipcs_keys);
681 682 list_destroy(&service->ipcs_usedids);
682 683 id_space_destroy(service->ipcs_ids);
683 684
684 685 for (slot = service->ipcs_table; slot; slot = next) {
685 686 next = slot[0].ipct_chain;
686 687 kmem_free(slot, service->ipcs_tabsz * sizeof (ipc_slot_t));
687 688 service->ipcs_tabsz >>= 1;
688 689 }
689 690
690 691 mutex_destroy(&service->ipcs_lock);
691 692 kmem_free(service, sizeof (ipc_service_t));
692 693 }
693 694
694 695 /*
695 696 * Takes the service lock.
696 697 */
697 698 void
698 699 ipcs_lock(ipc_service_t *service)
699 700 {
700 701 mutex_enter(&service->ipcs_lock);
701 702 }
702 703
703 704 /*
704 705 * Releases the service lock.
705 706 */
706 707 void
707 708 ipcs_unlock(ipc_service_t *service)
708 709 {
709 710 mutex_exit(&service->ipcs_lock);
710 711 }
711 712
712 713
713 714 /*
714 715 * Locks the specified ID. Returns the ID's ID table index.
715 716 */
716 717 static int
717 718 ipc_lock_internal(ipc_service_t *service, uint_t id)
718 719 {
719 720 uint_t tabsz;
720 721 uint_t index;
721 722 kmutex_t *mutex;
722 723
723 724 for (;;) {
724 725 tabsz = service->ipcs_tabsz;
725 726 membar_consumer();
726 727 index = id & (tabsz - 1);
727 728 mutex = &service->ipcs_table[index].ipct_lock;
728 729 mutex_enter(mutex);
729 730 if (tabsz == service->ipcs_tabsz)
730 731 break;
731 732 mutex_exit(mutex);
732 733 }
733 734
734 735 return (index);
735 736 }
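Because the table size is always a power of two, the mask used in ipc_lock_internal selects the same slot for a given ID before and after the table doubles, which is what allows ipc_grow to copy old[i] straight into new[i]. The sketch below models only that arithmetic; `ex_slot_for` is a hypothetical helper, the 24-bit index width is an assumption, and none of the locking or memory-barrier behavior is modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the low 24 bits of an ID hold the table index. */
#define	EX_SEQ_SHIFT	24

/* Select a slot the way ipc_lock_internal does: mask by (tabsz - 1). */
static uint32_t
ex_slot_for(uint32_t id, uint32_t tabsz)
{
	/* The mask only works when the table size is a power of two. */
	assert(tabsz != 0 && (tabsz & (tabsz - 1)) == 0);
	return (id & (tabsz - 1));
}
```

For any ID whose index is below the old table size, `ex_slot_for(id, 8)` and `ex_slot_for(id, 16)` return the same value, so an object that was removed from the namespace but still has references keeps hashing to the same slot lock after a resize.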
736 737
737 738 /*
738 739 * Locks the specified ID. Returns a pointer to the ID's lock.
739 740 */
740 741 kmutex_t *
741 742 ipc_lock(ipc_service_t *service, int id)
742 743 {
743 744 uint_t index;
744 745
745 746 /*
746 747 * These assertions don't reflect requirements of the code
747 748 * which follows, but they should never fail nonetheless.
748 749 */
749 750 ASSERT(id >= 0);
750 751 ASSERT(IPC_INDEX(id) < service->ipcs_tabsz);
751 752 index = ipc_lock_internal(service, id);
752 753
753 754 return (&service->ipcs_table[index].ipct_lock);
754 755 }
755 756
756 757 /*
757 758 * Checks to see if the held lock provided is the current lock for the
758 759 * specified id. If so, we return it instead of dropping it and
759 760 * returning the result of ipc_lock. This is intended to speed up cv
760 761 * wakeups where we are left holding a lock which could be stale, but
761 762 * probably isn't.
762 763 */
763 764 kmutex_t *
764 765 ipc_relock(ipc_service_t *service, int id, kmutex_t *lock)
765 766 {
766 767 ASSERT(id >= 0);
767 768 ASSERT(IPC_INDEX(id) < service->ipcs_tabsz);
768 769 ASSERT(MUTEX_HELD(lock));
769 770
770 771 if (&service->ipcs_table[IPC_INDEX(id)].ipct_lock == lock)
771 772 return (lock);
772 773
773 774 mutex_exit(lock);
774 775 return (ipc_lock(service, id));
775 776 }
776 777
777 778 /*
778 779 * Performs an ID lookup. If the ID doesn't exist or has been removed,
779 780 * or isn't visible to the caller (because of zones), NULL is returned.
780 781 * Otherwise, a pointer to the ID's perm structure and held ID lock are
781 782 * returned.
782 783 */
783 784 kmutex_t *
784 785 ipc_lookup(ipc_service_t *service, int id, kipc_perm_t **perm)
785 786 {
786 787 kipc_perm_t *result;
787 788 uint_t index;
788 789
789 790 /*
790 791 * There is no need to check to see if id is in-range (i.e.
791 792 * positive and fits into the table). If it is out-of-range,
792 793 * the id simply won't match the object's.
793 794 */
794 795
795 796 index = ipc_lock_internal(service, id);
796 797 result = service->ipcs_table[index].ipct_data;
797 798 if (result == NULL || result->ipc_id != (uint_t)id ||
798 799 !HASZONEACCESS(curproc, result->ipc_zoneid)) {
799 800 mutex_exit(&service->ipcs_table[index].ipct_lock);
800 801 return (NULL);
801 802 }
802 803
803 804 ASSERT(IPC_SEQ(id) == service->ipcs_table[index].ipct_seq);
804 805
805 806 *perm = result;
806 807 if (AU_AUDITING())
807 808 audit_ipc(service->ipcs_atype, id, result);
808 809
809 810 return (&service->ipcs_table[index].ipct_lock);
810 811 }
811 812
812 813 /*
813 814 * Increase the reference count on an ID.
814 815 */
815 816 /*ARGSUSED*/
816 817 void
817 818 ipc_hold(ipc_service_t *s, kipc_perm_t *perm)
818 819 {
819 820 ASSERT(IPC_INDEX(perm->ipc_id) < s->ipcs_tabsz);
820 821 ASSERT(IPC_LOCKED(s, perm));
821 822 perm->ipc_ref++;
822 823 }
823 824
824 825 /*
825 826 * Decreases the reference count on an ID and drops the ID's lock.
826 827 * Destroys the ID if the new reference count is zero.
827 828 */
828 829 void
829 830 ipc_rele(ipc_service_t *s, kipc_perm_t *perm)
830 831 {
831 832 int nref;
832 833
833 834 ASSERT(IPC_INDEX(perm->ipc_id) < s->ipcs_tabsz);
834 835 ASSERT(IPC_LOCKED(s, perm));
835 836 ASSERT(perm->ipc_ref > 0);
836 837
837 838 nref = --perm->ipc_ref;
838 839 mutex_exit(&s->ipcs_table[IPC_INDEX(perm->ipc_id)].ipct_lock);
839 840
840 841 if (nref == 0) {
841 842 ASSERT(IPC_FREE(perm)); /* ipc_rmid clears IPC_ALLOC */
842 843 s->ipcs_dtor(perm);
843 844 project_rele(perm->ipc_proj);
844 845 zone_rele_ref(&perm->ipc_zone_ref, ZONE_REF_IPC);
845 846 kmem_free(perm, s->ipcs_ssize);
846 847 }
847 848 }
848 849
849 850 /*
850 851 * Decrease the reference count on an ID, but don't drop the ID lock.
851 852 * Used in cases where one thread needs to remove many references (on
852 853 * behalf of other parties).
853 854 */
854 855 void
855 856 ipc_rele_locked(ipc_service_t *s, kipc_perm_t *perm)
856 857 {
857 858 ASSERT(perm->ipc_ref > 1);
858 859 ASSERT(IPC_INDEX(perm->ipc_id) < s->ipcs_tabsz);
859 860 ASSERT(IPC_LOCKED(s, perm));
860 861
861 862 perm->ipc_ref--;
862 863 }
863 864
864 865
865 866 /*
866 867 * Internal function to grow the service ID table.
867 868 */
868 869 static int
869 870 ipc_grow(ipc_service_t *service)
870 871 {
871 872 ipc_slot_t *new, *old;
872 873 int i, oldsize, newsize;
873 874
874 875 ASSERT(MUTEX_HELD(&service->ipcs_lock));
875 876 ASSERT(MUTEX_NOT_HELD(&curproc->p_lock));
876 877
877 878 if (service->ipcs_tabsz == IPC_IDS_MAX)
878 879 return (ENOSPC);
879 880
880 881 oldsize = service->ipcs_tabsz;
881 882 newsize = oldsize << 1;
882 883 new = kmem_zalloc(newsize * sizeof (ipc_slot_t), KM_NOSLEEP);
883 884 if (new == NULL)
884 885 return (ENOSPC);
885 886
886 887 old = service->ipcs_table;
887 888 for (i = 0; i < oldsize; i++) {
888 889 mutex_enter(&old[i].ipct_lock);
889 890 mutex_enter(&new[i].ipct_lock);
890 891
891 892 new[i].ipct_seq = old[i].ipct_seq;
892 893 new[i].ipct_data = old[i].ipct_data;
893 894 old[i].ipct_data = NULL;
894 895 }
895 896
896 897 new[0].ipct_chain = old;
897 898 service->ipcs_table = new;
898 899 membar_producer();
899 900 service->ipcs_tabsz = newsize;
900 901
901 902 for (i = 0; i < oldsize; i++) {
902 903 mutex_exit(&old[i].ipct_lock);
903 904 mutex_exit(&new[i].ipct_lock);
904 905 }
905 906
906 907 id_space_extend(service->ipcs_ids, oldsize, service->ipcs_tabsz);
907 908
908 909 return (0);
909 910 }
910 911
911 912
912 913 static int
913 914 ipc_keylookup(ipc_service_t *service, key_t key, int flag, kipc_perm_t **permp)
914 915 {
915 916 kipc_perm_t *perm = NULL;
916 917 avl_index_t where;
917 918 kipc_perm_t template;
918 919
919 920 ASSERT(MUTEX_HELD(&service->ipcs_lock));
920 921
921 922 template.ipc_key = key;
922 923 template.ipc_zoneid = getzoneid();
923 924 if (perm = avl_find(&service->ipcs_keys, &template, &where)) {
924 925 ASSERT(!IPC_FREE(perm));
925 926 if ((flag & (IPC_CREAT | IPC_EXCL)) == (IPC_CREAT | IPC_EXCL))
926 927 return (EEXIST);
927 928 if ((flag & 0777) & ~perm->ipc_mode) {
928 929 if (AU_AUDITING())
929 930 audit_ipcget(NULL, (void *)perm);
930 931 return (EACCES);
931 932 }
932 933 *permp = perm;
933 934 return (0);
934 935 } else if (flag & IPC_CREAT) {
935 936 *permp = NULL;
936 937 return (0);
937 938 }
938 939 return (ENOENT);
939 940 }
940 941
941 942 static int
942 943 ipc_alloc_test(ipc_service_t *service, proc_t *pp)
943 944 {
944 945 ASSERT(MUTEX_HELD(&service->ipcs_lock));
945 946
946 947 /*
947 948 * Resizing the table first would result in a cleaner code
948 949 * path, but would also allow a user to (permanently) double
949 950 * the id table size in cases where the allocation would be
950 951 * denied. Hence we test the rctl first.
951 952 */
952 953 retry:
953 954 mutex_enter(&pp->p_lock);
954 955 if ((rctl_test(service->ipcs_proj_rctl, pp->p_task->tk_proj->kpj_rctls,
955 956 pp, 1, RCA_SAFE) & RCT_DENY) ||
956 957 (rctl_test(service->ipcs_zone_rctl, pp->p_zone->zone_rctls,
957 958 pp, 1, RCA_SAFE) & RCT_DENY)) {
958 959 mutex_exit(&pp->p_lock);
959 960 return (ENOSPC);
960 961 }
961 962
962 963 if (service->ipcs_count == service->ipcs_tabsz) {
963 964 int error;
964 965
965 966 mutex_exit(&pp->p_lock);
966 967 if (error = ipc_grow(service))
967 968 return (error);
968 969 goto retry;
969 970 }
970 971
971 972 return (0);
972 973 }
973 974
974 975 /*
975 976 * Given a key, search for or create the associated identifier.
976 977 *
977 978 * If IPC_CREAT is specified and the key isn't found, or if the key is
978 979 * equal to IPC_PRIVATE, we return 0 and place a pointer to a newly
979 980 * allocated object structure in permp. A pointer to the held service
980 981 * lock is placed in lockp. ipc_mode's IPC_ALLOC bit is clear.
981 982 *
982 983 * If the key is found and no error conditions arise, we return 0 and
983 984 * place a pointer to the existing object structure in permp. A
984 985 * pointer to the held ID lock is placed in lockp. ipc_mode's
985 986 * IPC_ALLOC bit is set.
986 987 *
987 988 * Otherwise, a non-zero errno value is returned.
988 989 */
989 990 int
990 991 ipc_get(ipc_service_t *service, key_t key, int flag, kipc_perm_t **permp,
991 992 kmutex_t **lockp)
992 993 {
993 994 kipc_perm_t *perm = NULL;
994 995 proc_t *pp = curproc;
995 996 int error, index;
996 997 cred_t *cr = CRED();
997 998
998 999 if (key != IPC_PRIVATE) {
999 1000
1000 1001 mutex_enter(&service->ipcs_lock);
1001 1002 error = ipc_keylookup(service, key, flag, &perm);
1002 1003 if (perm != NULL)
1003 1004 index = ipc_lock_internal(service, perm->ipc_id);
1004 1005 mutex_exit(&service->ipcs_lock);
1005 1006
1006 1007 if (error) {
1007 1008 ASSERT(perm == NULL);
1008 1009 return (error);
1009 1010 }
1010 1011
1011 1012 if (perm) {
1012 1013 ASSERT(!IPC_FREE(perm));
1013 1014 *permp = perm;
1014 1015 *lockp = &service->ipcs_table[index].ipct_lock;
1015 1016 return (0);
1016 1017 }
1017 1018
1018 1019 /* Key not found; fall through */
1019 1020 }
1020 1021
1021 1022 perm = kmem_zalloc(service->ipcs_ssize, KM_SLEEP);
1022 1023
1023 1024 mutex_enter(&service->ipcs_lock);
1024 1025 if (error = ipc_alloc_test(service, pp)) {
1025 1026 mutex_exit(&service->ipcs_lock);
1026 1027 kmem_free(perm, service->ipcs_ssize);
1027 1028 return (error);
1028 1029 }
1029 1030
1030 1031 perm->ipc_cuid = perm->ipc_uid = crgetuid(cr);
1031 1032 perm->ipc_cgid = perm->ipc_gid = crgetgid(cr);
1032 1033 perm->ipc_zoneid = getzoneid();
1033 1034 perm->ipc_mode = flag & 0777;
1034 1035 perm->ipc_key = key;
1035 1036 perm->ipc_ref = 1;
1036 1037 perm->ipc_id = IPC_ID_INVAL;
1037 1038 *permp = perm;
1038 1039 *lockp = &service->ipcs_lock;
1039 1040
1040 1041 return (0);
1041 1042 }
1042 1043
1043 1044 /*
1044 1045 * Attempts to add a newly created ID to the global namespace. If
1045 1046 * creating it would cause an error, we return the error. If there is
1046 1047 * the possibility that we could obtain the existing ID and return it
1047 1048 * to the user, we return EAGAIN. Otherwise, we return 0 with p_lock
1048 1049 * and the service lock held.
1049 1050 *
1050 1051 * Since this should be only called after all initialization has been
1051 1052 * completed, on failure we automatically invoke the destructor for the
1052 1053 * object and deallocate the memory associated with it.
1053 1054 */
1054 1055 int
1055 1056 ipc_commit_begin(ipc_service_t *service, key_t key, int flag,
1056 1057 kipc_perm_t *newperm)
1057 1058 {
1058 1059 kipc_perm_t *perm;
1059 1060 int error;
1060 1061 proc_t *pp = curproc;
1061 1062
1062 1063 ASSERT(newperm->ipc_ref == 1);
1063 1064 ASSERT(IPC_FREE(newperm));
1064 1065
1065 1066 /*
1066 1067 * Set ipc_proj and ipc_zone_ref so that future calls to ipc_cleanup()
1067 1068 * clean up the necessary state. This must be done before the
1068 1069 * potential call to ipcs_dtor() below.
1069 1070 */
1070 1071 newperm->ipc_proj = pp->p_task->tk_proj;
1071 1072 zone_init_ref(&newperm->ipc_zone_ref);
1072 1073 zone_hold_ref(pp->p_zone, &newperm->ipc_zone_ref, ZONE_REF_IPC);
1073 1074
1074 1075 mutex_enter(&service->ipcs_lock);
1075 1076 /*
1076 1077 * Ensure that no-one has raced with us and created the key.
1077 1078 */
1078 1079 if ((key != IPC_PRIVATE) &&
1079 1080 (((error = ipc_keylookup(service, key, flag, &perm)) != 0) ||
1080 1081 (perm != NULL))) {
1081 1082 error = error ? error : EAGAIN;
1082 1083 goto errout;
1083 1084 }
1084 1085
1085 1086 /*
1086 1087 * Ensure that no-one has raced with us and used the last of
1087 1088 * the permissible ids, or the last of the free spaces in the
1088 1089 * id table.
1089 1090 */
1090 1091 if (error = ipc_alloc_test(service, pp))
1091 1092 goto errout;
1092 1093
1093 1094 ASSERT(MUTEX_HELD(&service->ipcs_lock));
1094 1095 ASSERT(MUTEX_HELD(&pp->p_lock));
1095 1096
1096 1097 return (0);
1097 1098 errout:
1098 1099 mutex_exit(&service->ipcs_lock);
1099 1100 service->ipcs_dtor(newperm);
1100 1101 zone_rele_ref(&newperm->ipc_zone_ref, ZONE_REF_IPC);
1101 1102 kmem_free(newperm, service->ipcs_ssize);
1102 1103 return (error);
1103 1104 }
1104 1105
1105 1106 /*
1106 1107 * Commit the ID allocation transaction. Called with p_lock and the
1107 1108 * service lock held, both of which are dropped. Returns the held ID
1108 1109 * lock so the caller can extract the ID and perform ipcget auditing.
1109 1110 */
1110 1111 kmutex_t *
1111 1112 ipc_commit_end(ipc_service_t *service, kipc_perm_t *perm)
1112 1113 {
1113 1114 ipc_slot_t *slot;
1114 1115 avl_index_t where;
1115 1116 int index;
1116 1117 void *loc;
1117 1118
1118 1119 ASSERT(MUTEX_HELD(&service->ipcs_lock));
1119 1120 ASSERT(MUTEX_HELD(&curproc->p_lock));
1120 1121
1121 1122 (void) project_hold(perm->ipc_proj);
1122 1123 mutex_exit(&curproc->p_lock);
1123 1124
1124 1125 /*
1125 1126 * Pick out our slot.
1126 1127 */
1127 1128 service->ipcs_count++;
1128 1129 index = id_alloc(service->ipcs_ids);
1129 1130 ASSERT(index < service->ipcs_tabsz);
1130 1131 slot = &service->ipcs_table[index];
1131 1132 mutex_enter(&slot->ipct_lock);
1132 1133 ASSERT(slot->ipct_data == NULL);
1133 1134
1134 1135 /*
1135 1136 * Update the perm structure.
1136 1137 */
1137 1138 perm->ipc_mode |= IPC_ALLOC;
1138 1139 perm->ipc_id = (slot->ipct_seq << IPC_SEQ_SHIFT) | index;
1139 1140
1140 1141 /*
1141 1142 * Push into global visibility.
1142 1143 */
1143 1144 slot->ipct_data = perm;
1144 1145 if (perm->ipc_key != IPC_PRIVATE) {
1145 1146 loc = avl_find(&service->ipcs_keys, perm, &where);
1146 1147 ASSERT(loc == NULL);
1147 1148 avl_insert(&service->ipcs_keys, perm, where);
1148 1149 }
1149 1150 list_insert_head(&service->ipcs_usedids, perm);
1150 1151
1151 1152 /*
1152 1153 * Update resource consumption.
1153 1154 */
1154 1155 IPC_PROJ_USAGE(perm, service) += 1;
1155 1156 IPC_ZONE_USAGE(perm, service) += 1;
1156 1157
1157 1158 mutex_exit(&service->ipcs_lock);
1158 1159 return (&slot->ipct_lock);
1159 1160 }
1160 1161
1161 1162 /*
1162 1163 * Clean up function, in case the allocation fails. If called between
1163 1164 * ipc_get and ipc_commit_begin, perm->ipc_proj will be NULL and we
1164 1165 * merely free the perm structure. If called after ipc_commit_begin,
1165 1166 * we also drop locks and call the ID's destructor.
1166 1167 */
1167 1168 void
1168 1169 ipc_cleanup(ipc_service_t *service, kipc_perm_t *perm)
1169 1170 {
1170 1171 ASSERT(IPC_FREE(perm));
1171 1172 if (perm->ipc_proj) {
1172 1173 mutex_exit(&curproc->p_lock);
1173 1174 mutex_exit(&service->ipcs_lock);
1174 1175 service->ipcs_dtor(perm);
1175 1176 }
1176 1177 if (perm->ipc_zone_ref.zref_zone != NULL)
1177 1178 zone_rele_ref(&perm->ipc_zone_ref, ZONE_REF_IPC);
1178 1179 kmem_free(perm, service->ipcs_ssize);
1179 1180 }
1180 1181
1181 1182
1182 1183 /*
1183 1184 * Common code to remove an IPC object. This should be called after
1184 1185 * all permissions checks have been performed, and with the service
1185 1186 * and ID locked. Note that this does not remove the object from
1186 1187 * the ipcs_usedids list (this needs to be done by the caller before
1187 1188 * dropping the service lock).
1188 1189 */
1189 1190 static void
1190 1191 ipc_remove(ipc_service_t *service, kipc_perm_t *perm)
1191 1192 {
1192 1193 int id = perm->ipc_id;
1193 1194 int index;
1194 1195
1195 1196 ASSERT(MUTEX_HELD(&service->ipcs_lock));
1196 1197 ASSERT(IPC_LOCKED(service, perm));
1197 1198
1198 1199 index = IPC_INDEX(id);
1199 1200
1200 1201 service->ipcs_table[index].ipct_data = NULL;
1201 1202
1202 1203 if (perm->ipc_key != IPC_PRIVATE)
1203 1204 avl_remove(&service->ipcs_keys, perm);
1204 1205 list_remove(&service->ipcs_usedids, perm);
1205 1206 perm->ipc_mode &= ~IPC_ALLOC;
1206 1207
1207 1208 id_free(service->ipcs_ids, index);
1208 1209
1209 1210 if (service->ipcs_table[index].ipct_seq++ == IPC_SEQ_MASK)
1210 1211 service->ipcs_table[index].ipct_seq = 0;
1211 1212 service->ipcs_count--;
1212 1213 ASSERT(IPC_PROJ_USAGE(perm, service) > 0);
1213 1214 ASSERT(IPC_ZONE_USAGE(perm, service) > 0);
1214 1215 IPC_PROJ_USAGE(perm, service) -= 1;
1215 1216 IPC_ZONE_USAGE(perm, service) -= 1;
1216 1217 ASSERT(service->ipcs_count || ((IPC_PROJ_USAGE(perm, service) == 0) &&
1217 1218 (IPC_ZONE_USAGE(perm, service) == 0)));
1218 1219 }
1219 1220
1221 +/*
1222 + * Perform actual IPC_RMID, either via ipc_rmid or due to a delayed *_RMID.
1223 + */
1224 +void
1225 +ipc_rmsvc(ipc_service_t *service, kipc_perm_t *perm)
1226 +{
1227 + ASSERT(service->ipcs_count > 0);
1228 + ASSERT(MUTEX_HELD(&service->ipcs_lock));
1220 1229
1230 + ipc_remove(service, perm);
1231 + mutex_exit(&service->ipcs_lock);
1232 +
1233 + /* perform any per-service removal actions */
1234 + service->ipcs_rmid(perm);
1235 +
1236 + ipc_rele(service, perm);
1237 +}
1238 +
1221 1239 /*
1222 1240 * Common code to perform an IPC_RMID. Returns an errno value on
1223 1241 * failure, 0 on success.
1224 1242 */
1225 1243 int
1226 1244 ipc_rmid(ipc_service_t *service, int id, cred_t *cr)
1227 1245 {
1228 1246 kipc_perm_t *perm;
1229 1247 kmutex_t *lock;
1230 1248
1231 1249 mutex_enter(&service->ipcs_lock);
1232 1250
1233 1251 lock = ipc_lookup(service, id, &perm);
1234 1252 if (lock == NULL) {
1235 1253 mutex_exit(&service->ipcs_lock);
1236 1254 return (EINVAL);
1237 1255 }
1238 1256
1239 1257 ASSERT(service->ipcs_count > 0);
1240 1258
1241 1259 if (secpolicy_ipc_owner(cr, perm) != 0) {
1242 1260 mutex_exit(lock);
1243 1261 mutex_exit(&service->ipcs_lock);
1244 1262 return (EPERM);
1245 1263 }
1246 1264
1247 1265 /*
1248 1266 * Nothing can fail from this point on.
1249 1267 */
1250 - ipc_remove(service, perm);
1251 - mutex_exit(&service->ipcs_lock);
1268 + ipc_rmsvc(service, perm);
1252 1269
1253 - /* perform any per-service removal actions */
1254 - service->ipcs_rmid(perm);
1255 -
1256 - ipc_rele(service, perm);
1257 -
1258 1270 return (0);
1259 1271 }
1260 1272
1261 1273 /*
1262 1274 * Implementation for shmids, semids, and msgids. buf is the address
1263 1275 * of the user buffer, nids is the size, and pnids is a pointer to
1264 1276 * where we write the actual number of ids that [would] have been
1265 1277 * copied out.
1266 1278 */
1267 1279 int
1268 1280 ipc_ids(ipc_service_t *service, int *buf, uint_t nids, uint_t *pnids)
1269 1281 {
1270 1282 kipc_perm_t *perm;
1271 1283 size_t idsize = 0;
1272 1284 int error = 0;
1273 1285 int idcount;
1274 1286 int *ids;
1275 1287 int numids = 0;
1276 1288 zoneid_t zoneid = getzoneid();
1277 1289 int global = INGLOBALZONE(curproc);
1278 1290
1279 1291 if (buf == NULL)
1280 1292 nids = 0;
1281 1293
1282 1294 /*
1283 1295 * Get an accurate count of the total number of ids, and allocate a
1284 1296 * staging buffer. Since ipcs_count is always sane, we don't have
1285 1297 * to take ipcs_lock for our first guess. If there are no ids, or
1286 1298 * we're in the global zone and the number of ids is greater than
1287 1299 * the size of the specified buffer, we shunt to the end. Otherwise,
1288 1300 * we go through the id list looking for (and counting) what is
1289 1301 * visible in the specified zone.
1290 1302 */
1291 1303 idcount = service->ipcs_count;
1292 1304 for (;;) {
1293 1305 if ((global && idcount > nids) || idcount == 0) {
1294 1306 numids = idcount;
1295 1307 nids = 0;
1296 1308 goto out;
1297 1309 }
1298 1310
1299 1311 idsize = idcount * sizeof (int);
1300 1312 ids = kmem_alloc(idsize, KM_SLEEP);
1301 1313
1302 1314 mutex_enter(&service->ipcs_lock);
1303 1315 if (idcount >= service->ipcs_count)
1304 1316 break;
1305 1317 idcount = service->ipcs_count;
1306 1318 mutex_exit(&service->ipcs_lock);
1307 1319
1308 1320 if (idsize != 0) {
1309 1321 kmem_free(ids, idsize);
1310 1322 idsize = 0;
1311 1323 }
1312 1324 }
1313 1325
1314 1326 for (perm = list_head(&service->ipcs_usedids); perm != NULL;
1315 1327 perm = list_next(&service->ipcs_usedids, perm)) {
1316 1328 ASSERT(!IPC_FREE(perm));
1317 1329 if (global || perm->ipc_zoneid == zoneid)
1318 1330 ids[numids++] = perm->ipc_id;
1319 1331 }
1320 1332 mutex_exit(&service->ipcs_lock);
1321 1333
1322 1334 /*
1323 1335 * If there isn't enough space to hold all of the ids, just
1324 1336 * return the number of ids without copying out any of them.
1325 1337 */
1326 1338 if (nids < numids)
1327 1339 nids = 0;
1328 1340
1329 1341 out:
1330 1342 if (suword32(pnids, (uint32_t)numids) ||
1331 1343 (nids != 0 && copyout(ids, buf, numids * sizeof (int))))
1332 1344 error = EFAULT;
1333 1345 if (idsize != 0)
1334 1346 kmem_free(ids, idsize);
1335 1347 return (error);
1336 1348 }
1337 1349
1338 1350 /*
1339 1351 * Destroy IPC objects from the given service that are associated with
1340 1352 * the given zone.
1341 1353 *
1342 1354 * We can't hold on to the service lock when freeing objects, so we
1343 1355 * first search the service and move all the objects to a private
1344 1356 * list, then walk through and free them after dropping the lock.
1345 1357 */
1346 1358 void
1347 1359 ipc_remove_zone(ipc_service_t *service, zoneid_t zoneid)
1348 1360 {
1349 1361 kipc_perm_t *perm, *next;
1350 1362 list_t rmlist;
1351 1363 kmutex_t *lock;
1352 1364
1353 1365 list_create(&rmlist, sizeof (kipc_perm_t),
1354 1366 offsetof(kipc_perm_t, ipc_list));
1355 1367
1356 1368 mutex_enter(&service->ipcs_lock);
1357 1369 for (perm = list_head(&service->ipcs_usedids); perm != NULL;
1358 1370 perm = next) {
1359 1371 next = list_next(&service->ipcs_usedids, perm);
1360 1372 if (perm->ipc_zoneid != zoneid)
1361 1373 continue;
1362 1374
1363 1375 /*
1364 1376 * Remove the object from the service, then put it on
1365 1377 * the removal list so we can defer the call to
1366 1378 * ipc_rele (which will actually free the structure).
1367 1379 * We need to do this since the destructor may grab
1368 1380 * the service lock.
1369 1381 */
1370 1382 ASSERT(!IPC_FREE(perm));
1371 1383 lock = ipc_lock(service, perm->ipc_id);
1372 1384 ipc_remove(service, perm);
1373 1385 mutex_exit(lock);
1374 1386 list_insert_tail(&rmlist, perm);
1375 1387 }
1376 1388 mutex_exit(&service->ipcs_lock);
1377 1389
1378 1390 /*
1379 1391 * Now that we've dropped the service lock, loop through the
1380 1392 * private list freeing removed objects.
1381 1393 */
1382 1394 for (perm = list_head(&rmlist); perm != NULL; perm = next) {
1383 1395 next = list_next(&rmlist, perm);
1384 1396 list_remove(&rmlist, perm);
1385 1397
1386 1398 (void) ipc_lock(service, perm->ipc_id);
1387 1399
1388 1400 /* perform any per-service removal actions */
1389 1401 service->ipcs_rmid(perm);
1390 1402
1391 1403 /* release reference */
1392 1404 ipc_rele(service, perm);
1393 1405 }
1394 1406
1395 1407 list_destroy(&rmlist);
1396 1408 }