1 KMEM_CACHE_CREATE(9F) Kernel Functions for Drivers KMEM_CACHE_CREATE(9F)
2
3
4
5 NAME
6 kmem_cache_create, kmem_cache_alloc, kmem_cache_free,
7 kmem_cache_destroy, kmem_cache_set_move - kernel memory cache allocator
8 operations
9
10 SYNOPSIS
11 #include <sys/types.h>
12 #include <sys/kmem.h>
13
14 kmem_cache_t *kmem_cache_create(char *name, size_t bufsize,
15 size_t align, int (*constructor)(void *, void *, int),
16 void (*destructor)(void *, void *), void (*reclaim)(void *),
17 void *private, void *vmp, int cflags);
18
19
20 void kmem_cache_destroy(kmem_cache_t *cp);
21
22
23 void *kmem_cache_alloc(kmem_cache_t *cp, int kmflag);
24
25
26 void kmem_cache_free(kmem_cache_t *cp, void *obj);
27
28
29 void kmem_cache_set_move(kmem_cache_t *cp, kmem_cbrc_t (*move)(void *,
30 void *, size_t, void *));
31
32
33 [Synopsis for callback functions:]
34
35
36 int (*constructor)(void *buf, void *user_arg, int kmflags);
37
38
39 void (*destructor)(void *buf, void *user_arg);
40
41
42 kmem_cbrc_t (*move)(void *old, void *new, size_t bufsize,
43 void *user_arg);
44
45
46 INTERFACE LEVEL
47 illumos DDI specific (illumos DDI)
48
49 PARAMETERS
50 The parameters for the kmem_cache_* functions are as follows:
51
52 name
53 Descriptive name of a kstat(9S) structure of class
54 kmem_cache. Names longer than 31 characters are
55 truncated.
56
57
58 bufsize
59 Size, in bytes, of the objects the cache manages.
60
61
62 align
63 Required object alignment.
64
65
66 constructor
67 Pointer to an object constructor function. Parameters
68 are defined below.
69
70
71 destructor
72 Pointer to an object destructor function. Parameters are
73 defined below.
74
75
76 reclaim
77 Drivers should pass NULL.
78
79
80 private
81 Pass-through argument for the constructor, destructor, and move() callbacks.
82
83
84 vmp
85 Drivers should pass NULL.
86
87
88 cflags
89 Drivers must pass 0.
90
91
92 kmflag
93 Possible flags are:
94
95 KM_SLEEP
96 Allow sleeping (blocking) until memory is
97 available.
98
99
100 KM_NOSLEEP
101 Return NULL immediately if memory is not
102 available, but after an aggressive
103 reclaiming attempt. Any mention of
104 KM_NOSLEEP without mentioning
105 KM_NOSLEEP_LAZY (see below) applies to
106 both values.
107
108
109 KM_NOSLEEP_LAZY
110 Return NULL immediately if memory is not
111 available, without the aggressive
112 reclaiming attempt. This is actually two
113 flags combined: (KM_NOSLEEP |
114 KM_NORMALPRI), the latter flag indicating
115 not to attempt reclamation before giving
116 up and returning NULL.
117
118
119 KM_PUSHPAGE
120 Allow the allocation to use reserved
121 memory.
122
123
124
125 obj
126 Pointer to the object allocated by kmem_cache_alloc().
127
128
129 move
130 Pointer to an object relocation function. Parameters are
131 defined below.
132
133
134
135 The parameters for the callback constructor function are as follows:
136
137 void *buf
138 Pointer to the object to be constructed.
139
140
141 void *user_arg
142 The private parameter from the call to
143 kmem_cache_create(); it is typically a pointer to the
144 soft-state structure.
145
146
147 int kmflags
148 Propagated kmflag values.
149
150
151
152 The parameters for the callback destructor function are as follows:
153
154 void *buf
155 Pointer to the object to be destroyed.
156
157
158 void *user_arg
159 The private parameter from the call to
160 kmem_cache_create(); it is typically a pointer to the
161 soft-state structure.
162
163
164
165 The parameters for the callback move() function are as follows:
166
167 void *old
168 Pointer to the object to be moved.
169
170
171 void *new
172 Pointer to the object that serves as the copy
173 destination for the contents of the old parameter.
174
175
176 size_t bufsize
177 Size of the object to be moved.
178
179
180 void *user_arg
181 The private parameter from the call to
182 kmem_cache_create(); it is typically a pointer to the
183 soft-state structure.
184
185
186 DESCRIPTION
187 In many cases, the cost of initializing and destroying an object
188 exceeds the cost of allocating and freeing memory for it. The functions
189 described here address this condition.
190
191
192 Object caching is a technique for dealing with objects that:
193
194 o are frequently allocated and freed, and
195
196 o have setup and initialization costs.
197
198
199 The idea is to allow the allocator and its clients to cooperate to
200 preserve the invariant portion of an object's initial state, or
201 constructed state, between uses, so it does not have to be destroyed
202 and re-created every time the object is used. For example, an object
203 containing a mutex only needs to have mutex_init() applied once, the
204 first time the object is allocated. The object can then be freed and
205 reallocated many times without incurring the expense of mutex_destroy()
206 and mutex_init() each time. An object's embedded locks, condition
207 variables, reference counts, lists of other objects, and read-only data
208 all generally qualify as constructed state. The essential requirement
209 is that the client must free the object (using kmem_cache_free()) in
210 its constructed state. The allocator cannot enforce this, so
211 programming errors will lead to hard-to-find bugs.
212
213
214 A driver should call kmem_cache_create() at the time of _init(9E) or
215 attach(9E), and call the corresponding kmem_cache_destroy() at the time
216 of _fini(9E) or detach(9E).
217
218
219 kmem_cache_create() creates a cache of objects, each of size bufsize
220 bytes, aligned on an align boundary. Drivers not requiring a specific
221 alignment can pass 0. name identifies the cache for statistics and
222 debugging. constructor and destructor convert plain memory into objects
223 and back again; constructor can fail if it needs to allocate memory but
224 cannot. private is a parameter passed to the constructor and destructor
225 callbacks to support parameterized caches (for example, a pointer to an
226 instance of the driver's soft-state structure). To facilitate
227 debugging, kmem_cache_create() creates a kstat(9S) structure of class
228 kmem_cache and name name. It returns an opaque pointer to the object
229 cache.
230
231
232 kmem_cache_alloc() gets an object from the cache. The object will be in
233 its constructed state. kmflag has either KM_SLEEP or KM_NOSLEEP set,
234 indicating whether it is acceptable to wait for memory if none is
235 currently available.
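
As a minimal sketch of how the kmflag values described under PARAMETERS relate to one another, the helpers below classify a flag word. The numeric values are placeholders assumed for illustration only; a driver must take the real definitions from <sys/kmem.h>. The helper names kmflag_may_block() and kmflag_is_lazy() are hypothetical. Only the relationship KM_NOSLEEP_LAZY == (KM_NOSLEEP | KM_NORMALPRI) comes from this page.

```c
#include <assert.h>

/*
 * Placeholder flag values, assumed for this sketch only. In a driver,
 * include <sys/kmem.h> instead of defining these.
 */
#define	KM_SLEEP	0x0000
#define	KM_NOSLEEP	0x0001
#define	KM_NORMALPRI	0x0008
#define	KM_NOSLEEP_LAZY	(KM_NOSLEEP | KM_NORMALPRI)

/* A test for KM_NOSLEEP also matches KM_NOSLEEP_LAZY. */
static_assert((KM_NOSLEEP_LAZY & KM_NOSLEEP) != 0,
    "KM_NOSLEEP_LAZY must include KM_NOSLEEP");

/* Nonzero if the allocation is allowed to block (no KM_NOSLEEP bit). */
int
kmflag_may_block(int kmflag)
{
	return ((kmflag & KM_NOSLEEP) == 0);
}

/* Nonzero if the caller asked to skip the aggressive reclaim attempt. */
int
kmflag_is_lazy(int kmflag)
{
	return ((kmflag & KM_NOSLEEP_LAZY) == KM_NOSLEEP_LAZY);
}
```

A constructor could use a test like kmflag_is_lazy() to decide whether to skip any extraordinary steps it would otherwise take for a plain KM_NOSLEEP construction.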
236
237
238 A small pool of reserved memory is available to allow the system to
239 progress toward the goal of freeing additional memory while in a low
240 memory situation. The KM_PUSHPAGE flag enables use of this reserved
241 memory pool on an allocation. This flag can be used by drivers that
242 implement strategy(9E) on memory allocations associated with a single
243 I/O operation. The driver guarantees that the I/O operation will
244 complete (or time out) and, on completion, that the memory will be
245 returned. The KM_PUSHPAGE flag should be used only in
246 kmem_cache_alloc() calls. All allocations from a given cache should be
247 consistent in their use of the flag. A driver that adheres to these
248 restrictions can guarantee progress in a low memory situation without
249 resorting to complex private allocation and queuing schemes. If
250 KM_PUSHPAGE is specified, KM_SLEEP can also be used without causing
251 deadlock.
252
253
254 kmem_cache_free() returns an object to the cache. The object must be in
255 its constructed state.
256
257
258 kmem_cache_destroy() destroys the cache and releases all associated
259 resources. All allocated objects must have been previously freed.
260
261
262 kmem_cache_set_move() registers a function that the allocator may call
263 to move objects from sparsely allocated pages of memory so that the
264 system can reclaim pages that are tied up by the client. Since caching
265 objects of the same size and type already makes severe memory
266 fragmentation unlikely, there is generally no need to register such a
267 function. The idea is to make it possible to limit worst-case
268 fragmentation in caches that exhibit a tendency to become highly
269 fragmented. Only clients that allocate a mix of long- and short-lived
270 objects from the same cache are prone to exhibit this tendency, making
271 them candidates for a move() callback.
272
273
274 The move() callback supplies the client with two addresses: the
275 allocated object that the allocator wants to move and a buffer selected
276 by the allocator for the client to use as the copy destination. The new
277 parameter is an allocated, constructed object ready to receive the
278 contents of the old parameter. The bufsize parameter supplies the size
279 of the object, in case a single move function handles multiple caches
280 whose objects differ only in size. Finally, the private parameter
281 passed to the constructor and destructor is also passed to the move()
282 callback.
283
284
285 Only the client knows about its own data and when it is a good time to
286 move it. The client cooperates with the allocator to return unused
287 memory to the system, and the allocator accepts this help at the
288 client's convenience. When asked to move an object, the client can
289 respond with any of the following:
290
291 typedef enum kmem_cbrc {
292 KMEM_CBRC_YES,
293 KMEM_CBRC_NO,
294 KMEM_CBRC_LATER,
295 KMEM_CBRC_DONT_NEED,
296 KMEM_CBRC_DONT_KNOW
297 } kmem_cbrc_t;
298
299
300
301
302 The client must not explicitly free either of the objects passed to the
303 move() callback, since the allocator wants to free them directly to the
304 slab layer (bypassing the per-CPU magazine layer). The response tells
305 the allocator which of the two object parameters to free:
306
307 KMEM_CBRC_YES
308 The client moved the object; the allocator frees
309 the old parameter.
310
311
312 KMEM_CBRC_NO
313 The client refused to move the object; the
314 allocator frees the new parameter (the unused
315 copy destination).
316
317
318 KMEM_CBRC_LATER
319 The client is using the object and cannot move
320 it now; the allocator frees the new parameter
321 (the unused copy destination). The client should
322 use KMEM_CBRC_LATER instead of KMEM_CBRC_NO if
323 the object is likely to become movable soon.
324
325
326 KMEM_CBRC_DONT_NEED
327 The client no longer needs the object; the
328 allocator frees both the old and new parameters.
329 This response is the client's opportunity to be
330 a model citizen and give back as much as it can.
331
332
333 KMEM_CBRC_DONT_KNOW
334 The client does not know about the object
335 because:
336
337 a)
338 the client has just allocated the object
339 and has not yet put it wherever it expects
340 to find known objects
341
342
343 b)
344 the client has removed the object from
345 wherever it expects to find known objects
346 and is about to free the object
347
348
349 c)
350 the client has freed the object
351
352 In all of the cases above, the allocator frees
353 the new parameter (the unused copy destination)
354 and searches for the old parameter in the
355 magazine layer. If the object is found, it is
356 removed from the magazine layer and freed to the
357 slab layer so that it will no longer tie up an
358 entire page of memory.
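
The responses listed above can be sketched as a move() callback. This is an illustration under stated assumptions, not a definitive implementation: the object_t type and its o_known, o_pinned, o_busy, and o_refcnt fields are hypothetical, and the kmem_cbrc_t definition is reproduced from this page so that the sketch compiles outside the kernel. A real callback must also hold whatever locks protect the object and must update every reference that still points at the old address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reproduced from the text above; a driver gets this from <sys/kmem.h>. */
typedef enum kmem_cbrc {
	KMEM_CBRC_YES,
	KMEM_CBRC_NO,
	KMEM_CBRC_LATER,
	KMEM_CBRC_DONT_NEED,
	KMEM_CBRC_DONT_KNOW
} kmem_cbrc_t;

/* Hypothetical client object; every field name here is illustrative. */
typedef struct object {
	int	o_known;	/* nonzero once the client has published it */
	int	o_pinned;	/* nonzero if it can never be relocated */
	int	o_busy;		/* nonzero while temporarily in use */
	int	o_refcnt;	/* zero means the client no longer needs it */
	char	o_data[32];	/* client state that is not constructed state */
} object_t;

kmem_cbrc_t
object_move(void *old, void *new, size_t bufsize, void *user_arg)
{
	object_t *op = old;
	object_t *np = new;

	(void) user_arg;
	assert(bufsize == sizeof (object_t));

	if (!op->o_known)
		return (KMEM_CBRC_DONT_KNOW);	/* allocator frees new */
	if (op->o_refcnt == 0)
		return (KMEM_CBRC_DONT_NEED);	/* allocator frees both */
	if (op->o_pinned)
		return (KMEM_CBRC_NO);		/* allocator frees new */
	if (op->o_busy)
		return (KMEM_CBRC_LATER);	/* allocator frees new */

	/* Copy only client state; new is already in its constructed state. */
	np->o_refcnt = op->o_refcnt;
	(void) memcpy(np->o_data, op->o_data, sizeof (np->o_data));
	np->o_known = 1;
	op->o_known = 0;
	return (KMEM_CBRC_YES);			/* allocator frees old */
}
```

Note that the callback never frees old or new itself; in every branch the return value alone tells the allocator which buffer to free.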
359
360
361
362 Any object passed to the move() callback is guaranteed to have been
363 touched only by the allocator or by the client. Because memory patterns
364 applied by the allocator always set at least one of the two lowest
365 order bits, the bottom two bits of any pointer member (other than char
366 * or short *, which may not be 8-byte aligned on all platforms) are
367 available to the client for marking cached objects that the client is
368 about to free. This way, the client can recognize known objects in the
369 move() callback by the unmarked (valid) pointer value.
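
The marking technique described above might look like the following sketch. The node_t and struct owner types, the NODE_DYING bit, and the helper names are all hypothetical. The sketch assumes that the marked member points to a type with at least 4-byte alignment, and that a NULL pointer, if the client ever stores one, is handled separately.

```c
#include <assert.h>
#include <stdint.h>

#define	NODE_DYING	((uintptr_t)0x1)	/* hypothetical mark bit */
#define	NODE_LOWBITS	((uintptr_t)0x3)	/* the two lowest-order bits */

/* Hypothetical pointer target; long gives it at least 4-byte alignment. */
struct owner {
	long	o_id;
};

/* Hypothetical cached object with one pointer member used for marking. */
typedef struct node {
	struct owner	*n_owner;
} node_t;

/* The client marks the object just before freeing it. */
void
node_mark_dying(node_t *np)
{
	/* A valid, unmarked pointer has both low bits clear. */
	assert(((uintptr_t)np->n_owner & NODE_LOWBITS) == 0);
	np->n_owner = (struct owner *)((uintptr_t)np->n_owner | NODE_DYING);
}

/* In move(): only an unmarked pointer identifies a known object. */
int
node_is_known(const node_t *np)
{
	return (((uintptr_t)np->n_owner & NODE_LOWBITS) == 0);
}

/* Recover the real pointer whether or not the mark is set. */
struct owner *
node_owner(const node_t *np)
{
	return ((struct owner *)((uintptr_t)np->n_owner & ~NODE_LOWBITS));
}
```

A move() callback built on these helpers would return KMEM_CBRC_DONT_KNOW whenever node_is_known() is false, since the buffer then either carries an allocator pattern or has been marked by the client as about to be freed.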
370
371
372 If the client refuses to move an object with either KMEM_CBRC_NO or
373 KMEM_CBRC_LATER, and that object later becomes movable, the client can
374 notify the allocator by calling kmem_cache_move_notify().
375 Alternatively, the client can simply wait for the allocator to call
376 back again with the same object address. Responding KMEM_CBRC_NO even
377 once or responding KMEM_CBRC_LATER too many times for the same object
378 makes the allocator less likely to call back again for that object.
379
380 [Synopsis for notification function:]
381
382
383 void kmem_cache_move_notify(kmem_cache_t *cp, void *obj);
384
385
386
387 The parameters for the notification function are as follows:
388
389 cp
390 Pointer to the object cache.
391
392
393 obj
394 Pointer to the object that has become movable since an earlier
395 refusal to move it.
396
397
398 CONTEXT
399 Constructors can be invoked during any call to kmem_cache_alloc(), and
400 will run in that context. Similarly, destructors can be invoked during
401 any call to kmem_cache_free(), and can also be invoked during
402 kmem_cache_destroy(). Therefore, the functions that a constructor or
403 destructor invokes must be appropriate in that context. Furthermore,
404 the allocator may also call the constructor and destructor on objects
405 still under its control without client involvement.
406
407
408 kmem_cache_create() and kmem_cache_destroy() must not be called from
409 interrupt context. kmem_cache_create() can also block for available
410 memory.
411
412
413 kmem_cache_alloc() can be called from interrupt context only if the
414 KM_NOSLEEP flag is set. It can be called from user or kernel context
415 with any valid flag.
416
417
418 kmem_cache_free() can be called from user, kernel, or interrupt
419 context.
420
421
422 kmem_cache_set_move() must be called from the same context as
423 kmem_cache_create(), immediately after kmem_cache_create() and before
424 allocating any objects from the cache.
425
426
427 The registered move() callback is always invoked in the same global
428 callback thread dedicated to move requests, guaranteeing that no
429 matter how many clients register a move() function, the allocator never
430 tries to move more than one object at a time. Neither the allocator nor
431 the client can be assumed to know the object's whereabouts at the time
432 of the callback.
433
434 EXAMPLES
435 Example 1 Object Caching
436
437
438 Consider the following data structure:
439
440
441 struct foo {
442 kmutex_t foo_lock;
443 kcondvar_t foo_cv;
444 struct bar *foo_barlist;
445 int foo_refcnt;
446 };
447
448
449
450 Assume that a foo structure cannot be freed until there are no
451 outstanding references to it (foo_refcnt == 0) and all of its pending
452 bar events (whatever they are) have completed (foo_barlist == NULL).
453 The life cycle of a dynamically allocated foo would be something like
454 this:
455
456
457 foo = kmem_alloc(sizeof (struct foo), KM_SLEEP);
458 mutex_init(&foo->foo_lock, ...);
459 cv_init(&foo->foo_cv, ...);
460 foo->foo_refcnt = 0;
461 foo->foo_barlist = NULL;
462 use foo;
463 ASSERT(foo->foo_barlist == NULL);
464 ASSERT(foo->foo_refcnt == 0);
465 cv_destroy(&foo->foo_cv);
466 mutex_destroy(&foo->foo_lock);
467 kmem_free(foo, sizeof (struct foo));
468
469
470
471 Notice that between each use of a foo object we perform a sequence of
472 operations that constitutes nothing but expensive overhead. All of this
473 overhead (that is, everything other than use foo above) can be
474 eliminated by object caching.
475
476
477 int
478 foo_constructor(void *buf, void *arg, int kmflags)
479 {
480 struct foo *foo = buf;
481 mutex_init(&foo->foo_lock, ...);
482 cv_init(&foo->foo_cv, ...);
483 foo->foo_refcnt = 0;
484 foo->foo_barlist = NULL;
485 return (0);
486 }
487
488 void
489 foo_destructor(void *buf, void *arg)
490 {
491 struct foo *foo = buf;
492 ASSERT(foo->foo_barlist == NULL);
493 ASSERT(foo->foo_refcnt == 0);
494 cv_destroy(&foo->foo_cv);
495 mutex_destroy(&foo->foo_lock);
496 }
497
498 user_arg = ddi_get_soft_state(foo_softc, instance);
499 (void) snprintf(buf, KSTAT_STRLEN, "foo%d_cache",
500 ddi_get_instance(dip));
501 foo_cache = kmem_cache_create(buf,
502 sizeof (struct foo), 0,
503 foo_constructor, foo_destructor,
504 NULL, user_arg, NULL, 0);
505
506
507
508 To allocate, use, and free a foo object:
509
510
511 foo = kmem_cache_alloc(foo_cache, KM_SLEEP);
512 use foo;
513 kmem_cache_free(foo_cache, foo);
514
515
516
517 This makes foo allocation fast, because the allocator will usually do
518 nothing more than fetch an already-constructed foo from the cache.
519 foo_constructor and foo_destructor will be invoked only to populate and
520 drain the cache, respectively.
521
522
523 Example 2 Registering a Move Callback
524
525
526 To register a move() callback:
527
528
529 object_cache = kmem_cache_create(...);
530 kmem_cache_set_move(object_cache, object_move);
531
532
533 RETURN VALUES
534 If successful, the constructor function must return 0. If KM_NOSLEEP or
535 KM_NOSLEEP_LAZY is set and memory cannot be allocated without sleeping,
536 the constructor must return -1. If the constructor takes extraordinary
537 steps during a KM_NOSLEEP construction, it may not take those for a
538 KM_NOSLEEP_LAZY construction.
539
540
541 kmem_cache_create() returns a pointer to the allocated cache.
542
543
544 If successful, kmem_cache_alloc() returns a pointer to the allocated
545 object. If KM_NOSLEEP is set and memory cannot be allocated without
546 sleeping, kmem_cache_alloc() returns NULL.
547
548 ATTRIBUTES
549 See attributes(5) for descriptions of the following attributes:
550
551
552
553
554 +--------------------+-----------------+
555 | ATTRIBUTE TYPE | ATTRIBUTE VALUE |
556 +--------------------+-----------------+
557 |Interface Stability | Committed |
558 +--------------------+-----------------+
559
560 SEE ALSO
561 condvar(9F), kmem_alloc(9F), mutex(9F), kstat(9S)
562
563
564 Writing Device Drivers
565
566
567 The Slab Allocator: An Object-Caching Kernel Memory Allocator, Bonwick,
568 J.; USENIX Summer 1994 Technical Conference (1994).
569
570
571 Magazines and vmem: Extending the Slab Allocator to Many CPUs and
572 Arbitrary Resources, Bonwick, J. and Adams, J.; USENIX 2001 Technical
573 Conference (2001).
574
575 NOTES
576 The constructor must be immediately reversible by the destructor, since
577 the allocator may call the constructor and destructor on objects still
578 under its control at any time without client involvement.
579
580
581 The constructor must respect the kmflags argument by forwarding it to
582 allocations made inside the constructor, and must not ASSERT anything
583 about the given flags.
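
The two notes above can be illustrated with a constructor that allocates an auxiliary buffer. This is a sketch under stated assumptions: the foo_scratch member, FOO_SCRATCH_SIZE, and the sim_out_of_memory knob are hypothetical, and kmem_alloc(), kmem_free(), and the flag values are userland stand-ins so that the sketch compiles outside the kernel. A driver would include <sys/kmem.h> instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for <sys/kmem.h>; the flag values are placeholders. */
#define	KM_SLEEP	0x0000
#define	KM_NOSLEEP	0x0001

int sim_out_of_memory;		/* hypothetical knob to force a failure */

void *
kmem_alloc(size_t size, int kmflag)
{
	if (sim_out_of_memory && (kmflag & KM_NOSLEEP))
		return (NULL);
	return (malloc(size));
}

void
kmem_free(void *buf, size_t size)
{
	(void) size;
	free(buf);
}

#define	FOO_SCRATCH_SIZE	128	/* hypothetical */

struct foo {
	void	*foo_scratch;	/* hypothetical auxiliary buffer */
	int	foo_refcnt;
};

/* Forward kmflags unchanged and report failure with -1. */
int
foo_constructor(void *buf, void *user_arg, int kmflags)
{
	struct foo *foo = buf;

	(void) user_arg;
	foo->foo_scratch = kmem_alloc(FOO_SCRATCH_SIZE, kmflags);
	if (foo->foo_scratch == NULL)
		return (-1);
	foo->foo_refcnt = 0;
	return (0);
}

/* Exactly reverse the constructor. */
void
foo_destructor(void *buf, void *user_arg)
{
	struct foo *foo = buf;

	(void) user_arg;
	assert(foo->foo_refcnt == 0);
	kmem_free(foo->foo_scratch, FOO_SCRATCH_SIZE);
	foo->foo_scratch = NULL;
}
```

Because the constructor never inspects kmflags beyond passing it on, it behaves correctly for KM_SLEEP, KM_NOSLEEP, and KM_NOSLEEP_LAZY alike, and the destructor can undo it at any time.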
584
585
586 The user argument forwarded to the constructor must be fully
587 operational before it is passed to kmem_cache_create().
588
589
590
591 February 18, 2015 KMEM_CACHE_CREATE(9F)