5513 KM_NORMALPRI should be documented in kmem_alloc(9f) and kmem_cache_create(9f) man pages
14465 Present KM_NOSLEEP_LAZY as documented interface
Change-Id: I002ec28ddf390650f1fcba1ca94f6abfdb241439
--- old/usr/src/man/man9f/kmem_cache_create.9f.man.txt
+++ new/usr/src/man/man9f/kmem_cache_create.9f.man.txt
1 1 KMEM_CACHE_CREATE(9F) Kernel Functions for Drivers KMEM_CACHE_CREATE(9F)
2 2
3 3
4 4
5 5 NAME
6 6 kmem_cache_create, kmem_cache_alloc, kmem_cache_free,
7 7 kmem_cache_destroy, kmem_cache_set_move - kernel memory cache allocator
8 8 operations
9 9
10 10 SYNOPSIS
11 11 #include <sys/types.h>
12 12 #include <sys/kmem.h>
13 13
14 14 kmem_cache_t *kmem_cache_create(char *name, size_t bufsize,
15 15 size_t align, int (*constructor)(void *, void *, int),
16 16 void (*destructor)(void *, void *), void (*reclaim)(void *),
17 17 void *private, void *vmp, int cflags);
18 18
19 19
20 20 void kmem_cache_destroy(kmem_cache_t *cp);
21 21
22 22
23 23 void *kmem_cache_alloc(kmem_cache_t *cp, int kmflag);
24 24
25 25
26 26 void kmem_cache_free(kmem_cache_t *cp, void *obj);
27 27
28 28
29 29 void kmem_cache_set_move(kmem_cache_t *cp, kmem_cbrc_t (*move)(void *,
30 30 void *, size_t, void *));
31 31
32 32
33 33 [Synopsis for callback functions:]
34 34
35 35
36 36 int (*constructor)(void *buf, void *user_arg, int kmflags);
37 37
38 38
39 39 void (*destructor)(void *buf, void *user_arg);
40 40
41 41
42 42 kmem_cbrc_t (*move)(void *old, void *new, size_t bufsize,
43 43 void *user_arg);
44 44
45 45
46 46 INTERFACE LEVEL
47 47 illumos DDI specific (illumos DDI)
48 48
49 49 PARAMETERS
50 50 The parameters for the kmem_cache_* functions are as follows:
51 51
52 52 name
53 53 Descriptive name of a kstat(9S) structure of class
54 54 kmem_cache. Names longer than 31 characters are
55 55 truncated.
56 56
57 57
58 58 bufsize
59 59 Size of the objects it manages.
60 60
61 61
62 62 align
63 63 Required object alignment.
64 64
65 65
66 66 constructor
67 67 Pointer to an object constructor function. Parameters
68 68 are defined below.
69 69
70 70
71 71 destructor
72 72 Pointer to an object destructor function. Parameters are
73 73 defined below.
74 74
75 75
76 76 reclaim
77 77 Drivers should pass NULL.
78 78
79 79
80 80 private
81 81 Pass-through argument for constructor/destructor.
82 82
83 83
84 84 vmp
85 85 Drivers should pass NULL.
86 86
87 87
88 88 cflags
89 89 Drivers must pass 0.
90 90
91 91
92 92 kmflag
93 93 Possible flags are:
94 94
95 95 KM_SLEEP
96 96 Allow sleeping (blocking) until memory is
97 97 available.
98 98
99 99
100 100 KM_NOSLEEP
101 101 Return NULL immediately if memory is not
102 - available.
102 + available, but after an aggressive
103 + reclaiming attempt. Any mention of
104 + KM_NOSLEEP without mentioning
105 + KM_NOSLEEP_LAZY (see below) applies to
106 + both values.
103 107
104 108
109 + KM_NOSLEEP_LAZY
110 + Return NULL immediately if memory is not
111 + available, without the aggressive
112 + reclaiming attempt. This is actually two
113 + flags combined: (KM_NOSLEEP |
114 + KM_NORMALPRI), the latter flag indicating
115 + not to attempt reclamation before giving
116 + up and returning NULL.
117 +
118 +
105 119 KM_PUSHPAGE
106 120 Allow the allocation to use reserved
107 121 memory.
108 122
109 123
110 124
111 125 obj
112 126 Pointer to the object allocated by kmem_cache_alloc().
113 127
114 128
115 129 move
116 130 Pointer to an object relocation function. Parameters are
117 131 defined below.
118 132
119 133
120 134
121 135 The parameters for the callback constructor function are as follows:
122 136
123 137 void *buf
124 138 Pointer to the object to be constructed.
125 139
126 140
127 141 void *user_arg
128 142 The private parameter from the call to
129 143 kmem_cache_create(); it is typically a pointer to the
130 144 soft-state structure.
131 145
132 146
133 147 int kmflags
134 148 Propagated kmflag values.
135 149
136 150
137 151
138 152 The parameters for the callback destructor function are as follows:
139 153
140 154 void *buf
141 155 Pointer to the object to be deconstructed.
142 156
143 157
144 158 void *user_arg
145 159 The private parameter from the call to
146 160 kmem_cache_create(); it is typically a pointer to the
147 161 soft-state structure.
148 162
149 163
150 164
151 165 The parameters for the callback move() function are as follows:
152 166
153 167 void *old
154 168 Pointer to the object to be moved.
155 169
156 170
157 171 void *new
158 172 Pointer to the object that serves as the copy
159 173 destination for the contents of the old parameter.
160 174
161 175
162 176 size_t bufsize
163 177 Size of the object to be moved.
164 178
165 179
166 180 void *user_arg
167 181 The private parameter from the call to
168 182 kmem_cache_create(); it is typically a pointer to the
169 183 soft-state structure.
170 184
171 185
172 186 DESCRIPTION
173 187 In many cases, the cost of initializing and destroying an object
174 188 exceeds the cost of allocating and freeing memory for it. The functions
175 189 described here address this condition.
176 190
177 191
178 192 Object caching is a technique for dealing with objects that are:
179 193
180 194 o frequently allocated and freed, and
181 195
182 196 o have setup and initialization costs.
183 197
184 198
185 199 The idea is to allow the allocator and its clients to cooperate to
186 200 preserve the invariant portion of an object's initial state, or
187 201 constructed state, between uses, so it does not have to be destroyed
188 202 and re-created every time the object is used. For example, an object
189 203 containing a mutex only needs to have mutex_init() applied once, the
190 204 first time the object is allocated. The object can then be freed and
191 205 reallocated many times without incurring the expense of mutex_destroy()
192 206 and mutex_init() each time. An object's embedded locks, condition
193 207 variables, reference counts, lists of other objects, and read-only data
194 208 all generally qualify as constructed state. The essential requirement
195 209 is that the client must free the object (using kmem_cache_free()) in
196 210 its constructed state. The allocator cannot enforce this, so
197 211 programming errors will lead to hard-to-find bugs.
198 212
199 213
200 214 A driver should call kmem_cache_create() at the time of _init(9E) or
201 215 attach(9E), and call the corresponding kmem_cache_destroy() at the time
202 216 of _fini(9E) or detach(9E).
203 217
204 218
205 219 kmem_cache_create() creates a cache of objects, each of size bufsize
206 220 bytes, aligned on an align boundary. Drivers not requiring a specific
207 221 alignment can pass 0. name identifies the cache for statistics and
208 222 debugging. constructor and destructor convert plain memory into objects
209 223 and back again; constructor can fail if it needs to allocate memory but
210 224 cannot. private is a parameter passed to the constructor and destructor
211 225 callbacks to support parameterized caches (for example, a pointer to an
212 226 instance of the driver's soft-state structure). To facilitate
213 227 debugging, kmem_cache_create() creates a kstat(9S) structure of class
214 228 kmem_cache and name name. It returns an opaque pointer to the object
215 229 cache.
216 230
217 231
218 232 kmem_cache_alloc() gets an object from the cache. The object will be in
219 233 its constructed state. kmflag has either KM_SLEEP or KM_NOSLEEP set,
220 234 indicating whether it is acceptable to wait for memory if none is
221 235 currently available.
222 236
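As a rough illustration of how the kmflag values relate to one another, the following userland sketch models the bits described under kmflag. The numeric values and both helper functions are assumptions made for this sketch only; a driver takes the real definitions from <sys/kmem.h> and has no need to test the bits itself.

```c
#include <assert.h>

/* Stand-in values for illustration; the real ones live in <sys/kmem.h>. */
#define	KM_SLEEP	0x0000
#define	KM_NOSLEEP	0x0001
#define	KM_NORMALPRI	0x0008
#define	KM_NOSLEEP_LAZY	(KM_NOSLEEP | KM_NORMALPRI)

/* KM_NOSLEEP_LAZY is a KM_NOSLEEP request with one extra bit set. */
static_assert((KM_NOSLEEP_LAZY & KM_NOSLEEP) != 0, "lazy implies nosleep");

/* Hypothetical helper: may this request block waiting for memory? */
int
kmflag_may_sleep(int kmflag)
{
	return ((kmflag & KM_NOSLEEP) == 0);
}

/* Hypothetical helper: is reclamation attempted before returning NULL? */
int
kmflag_reclaims_first(int kmflag)
{
	return ((kmflag & KM_NOSLEEP) != 0 && (kmflag & KM_NORMALPRI) == 0);
}
```

Because KM_NOSLEEP_LAZY contains the KM_NOSLEEP bit, code that checks only for KM_NOSLEEP treats both values alike, which matches the statement that any mention of KM_NOSLEEP applies to both.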
223 237
224 238 A small pool of reserved memory is available to allow the system to
225 239 progress toward the goal of freeing additional memory while in a low
226 240 memory situation. The KM_PUSHPAGE flag enables use of this reserved
227 241 memory pool on an allocation. This flag can be used by drivers that
228 242 implement strategy(9E) on memory allocations associated with a single
229 243 I/O operation. The driver guarantees that the I/O operation will
230 244 complete (or timeout) and, on completion, that the memory will be
231 245 returned. The KM_PUSHPAGE flag should be used only in
232 246 kmem_cache_alloc() calls. All allocations from a given cache should be
233 247 consistent in their use of the flag. A driver that adheres to these
234 248 restrictions can guarantee progress in a low memory situation without
235 249 resorting to complex private allocation and queuing schemes. If
236 250 KM_PUSHPAGE is specified, KM_SLEEP can also be used without causing
237 251 deadlock.
238 252
239 253
240 254 kmem_cache_free() returns an object to the cache. The object must be in
241 255 its constructed state.
242 256
243 257
244 258 kmem_cache_destroy() destroys the cache and releases all associated
245 259 resources. All allocated objects must have been previously freed.
246 260
247 261
248 262 kmem_cache_set_move() registers a function that the allocator may call
249 263 to move objects from sparsely allocated pages of memory so that the
250 264 system can reclaim pages that are tied up by the client. Since caching
251 265 objects of the same size and type already makes severe memory
252 266 fragmentation unlikely, there is generally no need to register such a
253 267 function. The idea is to make it possible to limit worst-case
254 268 fragmentation in caches that exhibit a tendency to become highly
255 269 fragmented. Only clients that allocate a mix of long- and short-lived
256 270 objects from the same cache are prone to exhibit this tendency, making
257 271 them candidates for a move() callback.
258 272
259 273
260 274 The move() callback supplies the client with two addresses: the
261 275 allocated object that the allocator wants to move and a buffer selected
262 276 by the allocator for the client to use as the copy destination. The new
263 277 parameter is an allocated, constructed object ready to receive the
264 278 contents of the old parameter. The bufsize parameter supplies the size
265 279 of the object, in case a single move function handles multiple caches
266 280 whose objects differ only in size. Finally, the private parameter
267 281 passed to the constructor and destructor is also passed to the move()
268 282 callback.
269 283
270 284
271 285 Only the client knows about its own data and when it is a good time to
272 286 move it. The client cooperates with the allocator to return unused
273 287 memory to the system, and the allocator accepts this help at the
274 288 client's convenience. When asked to move an object, the client can
275 289 respond with any of the following:
276 290
277 291 typedef enum kmem_cbrc {
278 292 KMEM_CBRC_YES,
279 293 KMEM_CBRC_NO,
280 294 KMEM_CBRC_LATER,
281 295 KMEM_CBRC_DONT_NEED,
282 296 KMEM_CBRC_DONT_KNOW
283 297 } kmem_cbrc_t;
284 298
285 299
286 300
287 301
288 302 The client must not explicitly free either of the objects passed to the
289 303 move() callback, since the allocator wants to free them directly to the
290 304 slab layer (bypassing the per-CPU magazine layer). The response tells
291 305 the allocator which of the two object parameters to free:
292 306
293 307 KMEM_CBRC_YES
294 308 The client moved the object; the allocator frees
295 309 the old parameter.
296 310
297 311
298 312 KMEM_CBRC_NO
299 313 The client refused to move the object; the
300 314 allocator frees the new parameter (the unused
301 315 copy destination).
302 316
303 317
304 318 KMEM_CBRC_LATER
305 319 The client is using the object and cannot move
306 320 it now; the allocator frees the new parameter
307 321 (the unused copy destination). The client should
308 322 use KMEM_CBRC_LATER instead of KMEM_CBRC_NO if
309 323 the object is likely to become movable soon.
310 324
311 325
312 326 KMEM_CBRC_DONT_NEED
313 327 The client no longer needs the object; the
314 328 allocator frees both the old and new parameters.
315 329 This response is the client's opportunity to be
316 330 a model citizen and give back as much as it can.
317 331
318 332
319 333 KMEM_CBRC_DONT_KNOW
320 334 The client does not know about the object
321 335 because:
322 336
323 337 a)
324 338 the client has just allocated the object
325 339 and has not yet put it wherever it expects
326 340 to find known objects
327 341
328 342
329 343 b)
330 344 the client has removed the object from
331 345 wherever it expects to find known objects
332 346 and is about to free the object
333 347
334 348
335 349 c)
336 350 the client has freed the object
337 351
338 352 In all of these cases above, the allocator frees
339 353 the new parameter (the unused copy destination)
340 354 and searches for the old parameter in the
341 355 magazine layer. If the object is found, it is
342 356 removed from the magazine layer and freed to the
343 357 slab layer so that it will no longer tie up an
344 358 entire page of memory.
345 359
346 360
347 361
348 362 Any object passed to the move() callback is guaranteed to have been
349 363 touched only by the allocator or by the client. Because memory patterns
350 364 applied by the allocator always set at least one of the two lowest
351 365 order bits, the bottom two bits of any pointer member (other than char
352 366 * or short *, which may not be 8-byte aligned on all platforms) are
353 367 available to the client for marking cached objects that the client is
354 368 about to free. This way, the client can recognize known objects in the
355 369 move() callback by the unmarked (valid) pointer value.
356 370
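The pointer-marking convention above can be sketched in userland as follows. The object layout, the OBJECT_MARK value, and the choice to answer KMEM_CBRC_LATER while a reference is held are all assumptions made for illustration. A real callback would also take the client's locks and repoint every reference from the old object to the new one before answering KMEM_CBRC_YES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Response codes as listed in this page; drivers get them from <sys/kmem.h>. */
typedef enum kmem_cbrc {
	KMEM_CBRC_YES,
	KMEM_CBRC_NO,
	KMEM_CBRC_LATER,
	KMEM_CBRC_DONT_NEED,
	KMEM_CBRC_DONT_KNOW
} kmem_cbrc_t;

/* Hypothetical client object; o_owner doubles as the "known" marker. */
typedef struct object {
	void	*o_owner;	/* low two bits clear while the object is known */
	int	o_refcnt;
	int	o_payload;
} object_t;

#define	OBJECT_MARK	((uintptr_t)0x1)	/* set just before freeing */

/* An object is known only if its owner pointer is set and unmarked. */
static int
object_is_known(const object_t *op)
{
	return (op->o_owner != NULL && ((uintptr_t)op->o_owner & 0x3) == 0);
}

kmem_cbrc_t
object_move(void *old, void *new, size_t bufsize, void *user_arg)
{
	object_t *op = old;

	(void) user_arg;
	if (!object_is_known(op))
		return (KMEM_CBRC_DONT_KNOW);
	if (op->o_refcnt != 0)
		return (KMEM_CBRC_LATER);	/* busy now, likely movable soon */
	(void) memcpy(new, old, bufsize);	/* allocator frees old afterward */
	return (KMEM_CBRC_YES);
}
```

Neither branch frees old or new; as described above, the allocator decides which buffer to free from the response code.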
357 371
358 372 If the client refuses to move an object with either KMEM_CBRC_NO or
359 373 KMEM_CBRC_LATER, and that object later becomes movable, the client can
360 374 notify the allocator by calling kmem_cache_move_notify().
361 375 Alternatively, the client can simply wait for the allocator to call
362 376 back again with the same object address. Responding KMEM_CBRC_NO even
363 377 once or responding KMEM_CBRC_LATER too many times for the same object
364 378 makes the allocator less likely to call back again for that object.
365 379
366 380 [Synopsis for notification function:]
367 381
368 382
369 383 void kmem_cache_move_notify(kmem_cache_t *cp, void *obj);
370 384
371 385
372 386
373 387 The parameters for the notification function are as follows:
374 388
375 389 cp
376 390 Pointer to the object cache.
377 391
378 392
379 393 obj
380 394 Pointer to the object that has become movable since an earlier
381 395 refusal to move it.
382 396
383 397
384 398 CONTEXT
385 399 Constructors can be invoked during any call to kmem_cache_alloc(), and
386 400 will run in that context. Similarly, destructors can be invoked during
387 401 any call to kmem_cache_free(), and can also be invoked during
388 402 kmem_cache_destroy(). Therefore, the functions that a constructor or
389 403 destructor invokes must be appropriate in that context. Furthermore,
390 404 the allocator may also call the constructor and destructor on objects
391 405 still under its control without client involvement.
392 406
393 407
394 408 kmem_cache_create() and kmem_cache_destroy() must not be called from
395 409 interrupt context. kmem_cache_create() can also block for available
396 410 memory.
397 411
398 412
399 413 kmem_cache_alloc() can be called from interrupt context only if the
400 414 KM_NOSLEEP flag is set. It can be called from user or kernel context
401 415 with any valid flag.
402 416
403 417
404 418 kmem_cache_free() can be called from user, kernel, or interrupt
405 419 context.
406 420
407 421
408 422 kmem_cache_set_move() is called from the same context as
409 423 kmem_cache_create(), immediately after kmem_cache_create() and before
410 424 allocating any objects from the cache.
411 425
412 426
413 427 The registered move() callback is always invoked in the same global
414 428 callback thread dedicated for move requests, guaranteeing that no
415 429 matter how many clients register a move() function, the allocator never
416 430 tries to move more than one object at a time. Neither the allocator nor
417 431 the client can be assumed to know the object's whereabouts at the time
418 432 of the callback.
419 433
420 434 EXAMPLES
421 435 Example 1 Object Caching
422 436
423 437
424 438 Consider the following data structure:
425 439
426 440
427 441 struct foo {
428 442 kmutex_t foo_lock;
429 443 kcondvar_t foo_cv;
430 444 struct bar *foo_barlist;
431 445 int foo_refcnt;
432 446 };
433 447
434 448
435 449
436 450 Assume that a foo structure cannot be freed until there are no
437 451 outstanding references to it (foo_refcnt == 0) and all of its pending
438 452 bar events (whatever they are) have completed (foo_barlist == NULL).
439 453 The life cycle of a dynamically allocated foo would be something like
440 454 this:
441 455
442 456
443 457 foo = kmem_alloc(sizeof (struct foo), KM_SLEEP);
444 458 mutex_init(&foo->foo_lock, ...);
445 459 cv_init(&foo->foo_cv, ...);
446 460 foo->foo_refcnt = 0;
447 461 foo->foo_barlist = NULL;
448 462 use foo;
449 463 ASSERT(foo->foo_barlist == NULL);
450 464 ASSERT(foo->foo_refcnt == 0);
451 465 cv_destroy(&foo->foo_cv);
452 466 mutex_destroy(&foo->foo_lock);
453 467 kmem_free(foo, sizeof (struct foo));
454 468
455 469
456 470
457 471 Notice that between each use of a foo object we perform a sequence of
458 472 operations that constitutes nothing but expensive overhead. All of this
459 473 overhead (that is, everything other than use foo above) can be
460 474 eliminated by object caching.
461 475
462 476
463 477 int
464 478 foo_constructor(void *buf, void *arg, int kmflags)
465 479 {
466 480 struct foo *foo = buf;
467 481 mutex_init(&foo->foo_lock, ...);
468 482 cv_init(&foo->foo_cv, ...);
469 483 foo->foo_refcnt = 0;
470 484 foo->foo_barlist = NULL;
471 485 return (0);
472 486 }
473 487
474 488 void
475 489 foo_destructor(void *buf, void *arg)
476 490 {
477 491 struct foo *foo = buf;
478 492 ASSERT(foo->foo_barlist == NULL);
479 493 ASSERT(foo->foo_refcnt == 0);
480 494 cv_destroy(&foo->foo_cv);
481 495 mutex_destroy(&foo->foo_lock);
482 496 }
483 497
484 498 user_arg = ddi_get_soft_state(foo_softc, instance);
485 499 (void) snprintf(buf, KSTAT_STRLEN, "foo%d_cache",
486 500 ddi_get_instance(dip));
487 501 foo_cache = kmem_cache_create(buf,
488 502 sizeof (struct foo), 0,
489 503 foo_constructor, foo_destructor,
490 504 NULL, user_arg, NULL, 0);
491 505
492 506
493 507
494 508 To allocate, use, and free a foo object:
495 509
496 510
497 511 foo = kmem_cache_alloc(foo_cache, KM_SLEEP);
498 512 use foo;
499 513 kmem_cache_free(foo_cache, foo);
500 514
501 515
502 516
503 517 This makes foo allocation fast, because the allocator will usually do
504 518 nothing more than fetch an already-constructed foo from the cache.
505 519 foo_constructor and foo_destructor will be invoked only to populate and
506 520 drain the cache, respectively.
507 521
508 522
509 523 Example 2 Registering a Move Callback
510 524
511 525
512 526 To register a move() callback:
513 527
514 528
515 529 object_cache = kmem_cache_create(...);
516 530 kmem_cache_set_move(object_cache, object_move);
517 531
518 532
519 533 RETURN VALUES
520 - If successful, the constructor function must return 0. If KM_NOSLEEP is
521 - set and memory cannot be allocated without sleeping, the constructor
522 - must return -1.
534 + If successful, the constructor function must return 0. If KM_NOSLEEP or
535 + KM_NOSLEEP_LAZY is set and memory cannot be allocated without sleeping,
536 + the constructor must return -1. If the constructor takes extraordinary
537 + steps during a KM_NOSLEEP construction, it may not take those for a
538 + KM_NOSLEEP_LAZY construction.
523 539
524 540
525 541 kmem_cache_create() returns a pointer to the allocated cache.
526 542
527 543
528 544 If successful, kmem_cache_alloc() returns a pointer to the allocated
529 545 object. If KM_NOSLEEP is set and memory cannot be allocated without
530 546 sleeping, kmem_cache_alloc() returns NULL.
531 547
532 548 ATTRIBUTES
533 549 See attributes(5) for descriptions of the following attributes:
534 550
535 551
536 552
537 553
538 554 +--------------------+-----------------+
539 555 | ATTRIBUTE TYPE | ATTRIBUTE VALUE |
540 556 +--------------------+-----------------+
541 557 |Interface Stability | Committed |
542 558 +--------------------+-----------------+
543 559
544 560 SEE ALSO
545 561 condvar(9F), kmem_alloc(9F), mutex(9F), kstat(9S)
546 562
547 563
548 564 Writing Device Drivers
549 565
550 566
551 567 The Slab Allocator: An Object-Caching Kernel Memory Allocator, Bonwick,
552 568 J.; USENIX Summer 1994 Technical Conference (1994).
553 569
554 570
555 571 Magazines and vmem: Extending the Slab Allocator to Many CPUs and
556 572 Arbitrary Resources, Bonwick, J. and Adams, J.; USENIX 2001 Technical
557 573 Conference (2001).
558 574
559 575 NOTES
560 576 The constructor must be immediately reversible by the destructor, since
561 577 the allocator may call the constructor and destructor on objects still
562 578 under its control at any time without client involvement.
563 579
564 580
565 581 The constructor must respect the kmflags argument by forwarding it to
566 582 allocations made inside the constructor, and must not ASSERT anything
567 583 about the given flags.
568 584
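A minimal userland sketch of a constructor that follows this rule is shown below. It is not driver code: mock_kmem_alloc() and the mock_shortage knob are hypothetical stand-ins for kmem_alloc(9F) and a low-memory condition, the flag values are placeholders for those in <sys/kmem.h>, and this struct foo with its FOO_BUFSZ buffer is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in flag values for illustration; the real ones live in <sys/kmem.h>. */
#define	KM_SLEEP	0x0000
#define	KM_NOSLEEP	0x0001
#define	KM_NORMALPRI	0x0008
#define	KM_NOSLEEP_LAZY	(KM_NOSLEEP | KM_NORMALPRI)

#define	FOO_BUFSZ	64	/* hypothetical size of the auxiliary buffer */

int mock_shortage;	/* test knob: nonzero simulates a memory shortage */

/* Userland mock of kmem_alloc(9F): fails only for non-sleeping requests. */
static void *
mock_kmem_alloc(size_t size, int kmflag)
{
	if (mock_shortage && (kmflag & KM_NOSLEEP) != 0)
		return (NULL);
	return (malloc(size));
}

struct foo {
	char	*foo_buf;
	int	foo_refcnt;
};

int
foo_constructor(void *buf, void *user_arg, int kmflags)
{
	struct foo *foo = buf;

	(void) user_arg;
	/* Forward kmflags unchanged; do not assert anything about them. */
	foo->foo_buf = mock_kmem_alloc(FOO_BUFSZ, kmflags);
	if (foo->foo_buf == NULL)
		return (-1);
	foo->foo_refcnt = 0;
	return (0);
}

void
foo_destructor(void *buf, void *user_arg)
{
	struct foo *foo = buf;

	(void) user_arg;
	assert(foo->foo_refcnt == 0);
	free(foo->foo_buf);	/* a driver would call kmem_free(9F) here */
	foo->foo_buf = NULL;
}
```

Since the flags are passed through untouched, the same constructor serves KM_SLEEP, KM_NOSLEEP, and KM_NOSLEEP_LAZY callers and returns -1 in both non-sleeping cases when the inner allocation fails.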
569 585
570 586 The user argument forwarded to the constructor must be fully
571 587 operational before it is passed to kmem_cache_create().
572 588
573 589
574 590
575 591 February 18, 2015 KMEM_CACHE_CREATE(9F)