1 KMEM_CACHE_CREATE(9F)    Kernel Functions for Drivers    KMEM_CACHE_CREATE(9F)
   2 
   3 
   4 
   5 NAME
   6        kmem_cache_create, kmem_cache_alloc, kmem_cache_free,
   7        kmem_cache_destroy, kmem_cache_set_move - kernel memory cache allocator
   8        operations
   9 
  10 SYNOPSIS
  11        #include <sys/types.h>
  12        #include <sys/kmem.h>
  13 
  14        kmem_cache_t *kmem_cache_create(char *name, size_t bufsize,
  15             size_t align, int (*constructor)(void *, void *, int),
  16             void (*destructor)(void *, void *), void (*reclaim)(void *),
  17             void *private, void *vmp, int cflags);
  18 
  19 
  20        void kmem_cache_destroy(kmem_cache_t *cp);
  21 
  22 
  23        void *kmem_cache_alloc(kmem_cache_t *cp, int kmflag);
  24 
  25 
  26        void kmem_cache_free(kmem_cache_t *cp, void *obj);
  27 
  28 
  29        void kmem_cache_set_move(kmem_cache_t *cp, kmem_cbrc_t (*move)(void *,
  30             void *, size_t, void *));
  31 
  32 
  33         [Synopsis for callback functions:]
  34 
  35 
  36        int (*constructor)(void *buf, void *user_arg, int kmflags);
  37 
  38 
  39        void (*destructor)(void *buf, void *user_arg);
  40 
  41 
  42        kmem_cbrc_t (*move)(void *old, void *new, size_t bufsize,
  43             void *user_arg);
  44 
  45 
  46 INTERFACE LEVEL
  47        illumos DDI specific (illumos DDI)
  48 
  49 PARAMETERS
  50        The parameters for the kmem_cache_* functions are as follows:
  51 
  52        name
  53                       Descriptive name of a kstat(9S) structure of class
  54                       kmem_cache.  Names longer than 31 characters are
  55                       truncated.
  56 
  57 
  58        bufsize
  59                       Size, in bytes, of each object that the cache manages.
  60 
  61 
  62        align
  63                       Required object alignment.
  64 
  65 
  66        constructor
  67                       Pointer to an object constructor function. Parameters
  68                       are defined below.
  69 
  70 
  71        destructor
  72                       Pointer to an object destructor function. Parameters are
  73                       defined below.
  74 
  75 
  76        reclaim
  77                       Drivers should pass NULL.
  78 
  79 
  80        private
  81                       Pass-through argument for constructor/destructor.
  82 
  83 
  84        vmp
  85                       Drivers should pass NULL.
  86 
  87 
  88        cflags
  89                       Drivers must pass 0.
  90 
  91 
  92        kmflag
  93                       Possible flags are:
  94 
  95                       KM_SLEEP
  96                                      Allow sleeping (blocking) until memory is
  97                                      available.
  98 
  99 
 100                       KM_NOSLEEP
 101                                      Return NULL immediately if memory is not
 102                                      available, but after an aggressive
 103                                      reclaiming attempt.  Any mention of
 104                                      KM_NOSLEEP without mentioning
 105                                      KM_NOSLEEP_LAZY (see below) applies to
 106                                      both values.
 107 
 108 
 109                       KM_NOSLEEP_LAZY
 110                                      Return NULL immediately if memory is not
 111                                      available, without the aggressive
 112                                      reclaiming attempt.  This is actually two
 113                                      flags combined: (KM_NOSLEEP |
 114                                      KM_NORMALPRI), the latter flag indicating
 115                                      not to attempt reclamation before giving
 116                                      up and returning NULL.
 117 
 118 
 119                       KM_PUSHPAGE
 120                                      Allow the allocation to use reserved
 121                                      memory.
 122 
 123 
 124 
 125        obj
 126                       Pointer to the object allocated by kmem_cache_alloc().
 127 
 128 
 129        move
 130                       Pointer to an object relocation function. Parameters are
 131                       defined below.
 132 
 133 
 134 
 135        The parameters for the callback constructor function are as follows:
 136 
 137        void *buf
 138                          Pointer to the object to be constructed.
 139 
 140 
 141        void *user_arg
 142                          The private parameter from the call to
 143                          kmem_cache_create(); it is typically a pointer to the
 144                          soft-state structure.
 145 
 146 
 147        int kmflags
 148                          Propagated kmflag values.
 149 
 150 
 151 
 152        The parameters for the callback destructor function are as follows:
 153 
 154        void *buf
 155                          Pointer to the object to be destroyed.
 156 
 157 
 158        void *user_arg
 159                          The private parameter from the call to
 160                          kmem_cache_create(); it is typically a pointer to the
 161                          soft-state structure.
 162 
 163 
 164 
 165        The parameters for the callback move() function are as follows:
 166 
 167        void *old
 168                          Pointer to the object to be moved.
 169 
 170 
 171        void *new
 172                          Pointer to the object that serves as the copy
 173                          destination for the contents of the old parameter.
 174 
 175 
 176        size_t bufsize
 177                          Size of the object to be moved.
 178 
 179 
 180        void *user_arg
 181                          The private parameter from the call to
 182                          kmem_cache_create(); it is typically a pointer to the
 183                          soft-state structure.
 184 
 185 
 186 DESCRIPTION
 187        In many cases, the cost of initializing and destroying an object
 188        exceeds the cost of allocating and freeing memory for it. The functions
 189        described here address this condition.
 190 
 191 
 192        Object caching is a technique for dealing with objects that:
 193 
 194            o      are frequently allocated and freed, and
 195 
 196            o      have setup and initialization costs.
 197 
 198 
 199        The idea is to allow the allocator and its clients to cooperate to
 200        preserve the invariant portion of an object's initial state, or
 201        constructed state, between uses, so it does not have to be destroyed
 202        and re-created every time the object is used. For example, an object
 203        containing a mutex only needs to have mutex_init() applied once, the
 204        first time the object is allocated. The object can then be freed and
 205        reallocated many times without incurring the expense of mutex_destroy()
 206        and mutex_init() each time. An object's embedded locks, condition
 207        variables, reference counts, lists of other objects, and read-only data
 208        all generally qualify as constructed state. The essential requirement
 209        is that the client must free the object (using kmem_cache_free()) in
 210        its constructed state. The allocator cannot enforce this, so
 211        programming errors will lead to hard-to-find bugs.
 212 
 213 
 214        A driver should call kmem_cache_create() at the time of _init(9E) or
 215        attach(9E), and call the corresponding kmem_cache_destroy() at the time
 216        of _fini(9E) or detach(9E).
 217 
 218 
 219        kmem_cache_create() creates a cache of objects, each of size bufsize
 220        bytes, aligned on an align boundary. Drivers not requiring a specific
 221        alignment can pass 0. name identifies the cache for statistics and
 222        debugging. constructor and destructor convert plain memory into objects
 223        and back again; constructor can fail if it needs to allocate memory but
 224        cannot. private is a parameter passed to the constructor and destructor
 225        callbacks to support parameterized caches (for example, a pointer to an
 226        instance of the driver's soft-state structure). To facilitate
 227        debugging, kmem_cache_create() creates a kstat(9S) structure of class
 228        kmem_cache and name name. It returns an opaque pointer to the object
 229        cache.
 230 
 231 
 232        kmem_cache_alloc() gets an object from the cache. The object will be in
 233        its constructed state. kmflag has either KM_SLEEP or KM_NOSLEEP set,
 234        indicating whether it is acceptable to wait for memory if none is
 235        currently available.
 236 
 237 
 238        A small pool of reserved memory is available to allow the system to
 239        progress toward the goal of freeing additional memory while in a low
 240        memory situation.  The KM_PUSHPAGE flag enables use of this reserved
 241        memory pool on an allocation. This flag can be used by drivers that
 242        implement strategy(9E) on memory allocations associated with a single
 243        I/O operation. The driver guarantees that the I/O operation will
 244        complete (or time out) and, on completion, that the memory will be
 245        returned. The KM_PUSHPAGE flag should be used only in
 246        kmem_cache_alloc() calls. All allocations from a given cache should be
 247        consistent in their use of the flag. A driver that adheres to these
 248        restrictions can guarantee progress in a low memory situation without
 249        resorting to complex private allocation and queuing schemes. If
 250        KM_PUSHPAGE is specified, KM_SLEEP can also be used without causing
 251        deadlock.
 252 
 253 
 254        kmem_cache_free() returns an object to the cache. The object must be in
 255        its constructed state.
 256 
 257 
 258        kmem_cache_destroy() destroys the cache and releases all associated
 259        resources. All allocated objects must have been previously freed.
 260 
 261 
 262        kmem_cache_set_move() registers a function that the allocator may call
 263        to move objects from sparsely allocated pages of memory so that the
 264        system can reclaim pages that are tied up by the client. Since caching
 265        objects of the same size and type already makes severe memory
 266        fragmentation unlikely, there is generally no need to register such a
 267        function. The idea is to make it possible to limit worst-case
 268        fragmentation in caches that exhibit a tendency to become highly
 269        fragmented. Only clients that allocate a mix of long- and short-lived
 270        objects from the same cache are prone to exhibit this tendency, making
 271        them candidates for a move() callback.
 272 
 273 
 274        The move() callback supplies the client with two addresses: the
 275        allocated object that the allocator wants to move and a buffer selected
 276        by the allocator for the client to use as the copy destination. The new
 277        parameter is an allocated, constructed object ready to receive the
 278        contents of the old parameter. The bufsize parameter supplies the size
 279        of the object, in case a single move function handles multiple caches
 280        whose objects differ only in size. Finally, the private parameter
 281        passed to the constructor and destructor is also passed to the move()
 282        callback.
 283 
 284 
 285        Only the client knows about its own data and when it is a good time to
 286        move it.  The client cooperates with the allocator to return unused
 287        memory to the system, and the allocator accepts this help at the
 288        client's convenience. When asked to move an object, the client can
 289        respond with any of the following:
 290 
 291          typedef enum kmem_cbrc {
 292                       KMEM_CBRC_YES,
 293                       KMEM_CBRC_NO,
 294                       KMEM_CBRC_LATER,
 295                       KMEM_CBRC_DONT_NEED,
 296                       KMEM_CBRC_DONT_KNOW
 297          } kmem_cbrc_t;
 298 
 299 
 300 
 301 
 302        The client must not explicitly free either of the objects passed to the
 303        move() callback, since the allocator wants to free them directly to the
 304        slab layer (bypassing the per-CPU magazine layer). The response tells
 305        the allocator which of the two object parameters to free:
 306 
 307        KMEM_CBRC_YES
 308                               The client moved the object; the allocator frees
 309                               the old parameter.
 310 
 311 
 312        KMEM_CBRC_NO
 313                               The client refused to move the object; the
 314                               allocator frees the new parameter (the unused
 315                               copy destination).
 316 
 317 
 318        KMEM_CBRC_LATER
 319                               The client is using the object and cannot move
 320                               it now; the allocator frees the new parameter
 321                               (the unused copy destination). The client should
 322                               use KMEM_CBRC_LATER instead of KMEM_CBRC_NO if
 323                               the object is likely to become movable soon.
 324 
 325 
 326        KMEM_CBRC_DONT_NEED
 327                               The client no longer needs the object; the
 328                               allocator frees both the old and new parameters.
 329                               This response is the client's opportunity to be
 330                               a model citizen and give back as much as it can.
 331 
 332 
 333        KMEM_CBRC_DONT_KNOW
 334                               The client does not know about the object
 335                               because:
 336 
 337                               a)
 338                                     the client has just allocated the object
 339                                     and has not yet put it wherever it expects
 340                                     to find known objects
 341 
 342 
 343                               b)
 344                                     the client has removed the object from
 345                                     wherever it expects to find known objects
 346                                     and is about to free the object
 347 
 348 
 349                               c)
 350                                     the client has freed the object
 351 
 352                               In all of the cases above, the allocator frees
 353                               the new parameter (the unused copy destination)
 354                               and searches for the old parameter in the
 355                               magazine layer. If the object is found, it is
 356                               removed from the magazine layer and freed to the
 357                               slab layer so that it will no longer tie up an
 358                               entire page of memory.
 359 
 360 
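            The responses above can be sketched as a move() callback. The
            following is a userland simulation under stated assumptions, not
            kernel code: kmem_cbrc_t is redeclared locally as a stand-in for
            the definition in <sys/kmem.h>, and struct obj, its fields, the
            known[] registry, obj_lookup(), and obj_move() are hypothetical
            names. Locking is omitted; a real callback would need to hold
            whatever locks protect the object and the registry.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel definition in <sys/kmem.h>. */
typedef enum kmem_cbrc {
	KMEM_CBRC_YES,
	KMEM_CBRC_NO,
	KMEM_CBRC_LATER,
	KMEM_CBRC_DONT_NEED,
	KMEM_CBRC_DONT_KNOW
} kmem_cbrc_t;

/* Hypothetical client object. */
struct obj {
	int o_refcnt;	/* references currently held on the object */
	int o_pinned;	/* address handed out long-term; not movable */
	int o_stale;	/* contents no longer needed by the client */
	int o_data;
};

/* Hypothetical registry: where the client expects to find known objects. */
#define	NKNOWN	4
struct obj *known[NKNOWN];

/* Return the registry slot holding op, or -1 if the client does not know it. */
int
obj_lookup(const struct obj *op)
{
	for (int i = 0; i < NKNOWN; i++) {
		if (known[i] == op)
			return (i);
	}
	return (-1);
}

kmem_cbrc_t
obj_move(void *old, void *new, size_t bufsize, void *user_arg)
{
	struct obj *op = old;
	int slot = obj_lookup(op);

	(void) user_arg;
	if (slot == -1)
		return (KMEM_CBRC_DONT_KNOW);
	if (op->o_stale) {
		known[slot] = NULL;	/* allocator frees both old and new */
		return (KMEM_CBRC_DONT_NEED);
	}
	if (op->o_pinned)
		return (KMEM_CBRC_NO);
	if (op->o_refcnt != 0)
		return (KMEM_CBRC_LATER);

	memcpy(new, old, bufsize);	/* copy contents to the destination */
	known[slot] = new;		/* repoint the client's reference */
	return (KMEM_CBRC_YES);
}
```

            The sketch looks the object up in the client's own registry before
            reading any of its fields, because an object that is not yet, or no
            longer, published may hold contents the client cannot interpret.
            Note that the callback never frees old or new itself.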
 361 
 362        Any object passed to the move() callback is guaranteed to have been
 363        touched only by the allocator or by the client. Because memory patterns
 364        applied by the allocator always set at least one of the two lowest
 365        order bits, the bottom two bits of any pointer member (other than char
 366        * or short *, which may not be 8-byte aligned on all platforms) are
 367        available to the client for marking cached objects that the client is
 368        about to free. This way, the client can recognize known objects in the
 369        move() callback by the unmarked (valid) pointer value.
 370 
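            The marking scheme described above can be sketched in portable C.
            This is a userland illustration only: struct node and the helper
            names are hypothetical, and the sketch assumes the pointer member
            refers to data aligned to at least four bytes, so that the two
            low-order bits of a valid pointer are zero.

```c
#include <stddef.h>
#include <stdint.h>

#define	PTR_MARK_MASK	((uintptr_t)0x3)	/* the two low-order bits */
#define	PTR_MARK_DYING	((uintptr_t)0x1)	/* hypothetical "about to free" mark */

/* Hypothetical cached object with a pointer member usable for marking. */
struct node {
	struct node *n_next;
	int n_value;
};

/* Flag an object that the client is about to free. */
void
node_mark_dying(struct node *np)
{
	np->n_next = (struct node *)((uintptr_t)np->n_next | PTR_MARK_DYING);
}

/* A known object carries an unmarked (valid) pointer value. */
int
node_is_known(const struct node *np)
{
	return (((uintptr_t)np->n_next & PTR_MARK_MASK) == 0);
}

/* Recover the real pointer regardless of marking. */
struct node *
node_next(const struct node *np)
{
	return ((struct node *)((uintptr_t)np->n_next & ~PTR_MARK_MASK));
}
```

            A move() callback built on this idea would test node_is_known()
            first and answer KMEM_CBRC_DONT_KNOW for any object whose pointer
            member has a low-order bit set.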
 371 
 372        If the client refuses to move an object with either KMEM_CBRC_NO or
 373        KMEM_CBRC_LATER, and that object later becomes movable, the client can
 374        notify the allocator by calling kmem_cache_move_notify().
 375        Alternatively, the client can simply wait for the allocator to call
 376        back again with the same object address. Responding KMEM_CBRC_NO even
 377        once or responding KMEM_CBRC_LATER too many times for the same object
 378        makes the allocator less likely to call back again for that object.
 379 
 380        [Synopsis for notification function:]
 381 
 382 
 383        void kmem_cache_move_notify(kmem_cache_t *cp, void *obj);
 384 
 385 
 386 
 387        The parameters for the notification function are as follows:
 388 
 389        cp
 390               Pointer to the object cache.
 391 
 392 
 393        obj
 394               Pointer to the object that has become movable since an earlier
 395               refusal to move it.
 396 
 397 
 398 CONTEXT
 399        Constructors can be invoked during any call to kmem_cache_alloc(), and
 400        will run in that context. Similarly, destructors can be invoked during
 401        any call to kmem_cache_free(), and can also be invoked during
 402        kmem_cache_destroy(). Therefore, the functions that a constructor or
 403        destructor invokes must be appropriate in that context. Furthermore,
 404        the allocator may also call the constructor and destructor on objects
 405        still under its control without client involvement.
 406 
 407 
 408        kmem_cache_create() and kmem_cache_destroy() must not be called from
 409        interrupt context. kmem_cache_create() can also block for available
 410        memory.
 411 
 412 
 413        kmem_cache_alloc() can be called from interrupt context only if the
 414        KM_NOSLEEP flag is set. It can be called from user or kernel context
 415        with any valid flag.
 416 
 417 
 418        kmem_cache_free() can be called from user, kernel, or interrupt
 419        context.
 420 
 421 
 422        kmem_cache_set_move() is called from the same context as
 423        kmem_cache_create(), immediately after kmem_cache_create() and before
 424        allocating any objects from the cache.
 425 
 426 
 427        The registered move() callback is always invoked in the same global
 428        callback thread dedicated for move requests, guaranteeing that no
 429        matter how many clients register a move() function, the allocator never
 430        tries to move more than one object at a time. Neither the allocator nor
 431        the client can be assumed to know the object's whereabouts at the time
 432        of the callback.
 433 
 434 EXAMPLES
 435        Example 1 Object Caching
 436 
 437 
 438        Consider the following data structure:
 439 
 440 
 441          struct foo {
 442              kmutex_t foo_lock;
 443              kcondvar_t foo_cv;
 444              struct bar *foo_barlist;
 445              int foo_refcnt;
 446          };
 447 
 448 
 449 
 450        Assume that a foo structure cannot be freed until there are no
 451        outstanding references to it (foo_refcnt == 0) and all of its pending
 452        bar events (whatever they are) have completed (foo_barlist == NULL).
 453        The life cycle of a dynamically allocated foo would be something like
 454        this:
 455 
 456 
 457          foo = kmem_alloc(sizeof (struct foo), KM_SLEEP);
 458          mutex_init(&foo->foo_lock, ...);
 459          cv_init(&foo->foo_cv, ...);
 460          foo->foo_refcnt = 0;
 461          foo->foo_barlist = NULL;
 462              use foo;
 463          ASSERT(foo->foo_barlist == NULL);
 464          ASSERT(foo->foo_refcnt == 0);
 465          cv_destroy(&foo->foo_cv);
 466          mutex_destroy(&foo->foo_lock);
 467          kmem_free(foo, sizeof (struct foo));
 468 
 469 
 470 
 471        Notice that between each use of a foo object we perform a sequence of
 472        operations that constitutes nothing but expensive overhead. All of this
 473        overhead (that is, everything other than use foo above) can be
 474        eliminated by object caching.
 475 
 476 
 477          int
 478          foo_constructor(void *buf, void *arg, int kmflags)
 479          {
 480              struct foo *foo = buf;
 481              mutex_init(&foo->foo_lock, ...);
 482              cv_init(&foo->foo_cv, ...);
 483              foo->foo_refcnt = 0;
 484              foo->foo_barlist = NULL;
 485              return (0);
 486          }
 487 
 488          void
 489          foo_destructor(void *buf, void *arg)
 490          {
 491              struct foo *foo = buf;
 492              ASSERT(foo->foo_barlist == NULL);
 493              ASSERT(foo->foo_refcnt == 0);
 494              cv_destroy(&foo->foo_cv);
 495              mutex_destroy(&foo->foo_lock);
 496          }
 497 
 498          user_arg = ddi_get_soft_state(foo_softc, instance);
 499          (void) snprintf(buf, KSTAT_STRLEN, "foo%d_cache",
 500                  ddi_get_instance(dip));
 501          foo_cache = kmem_cache_create(buf,
 502                  sizeof (struct foo), 0,
 503                  foo_constructor, foo_destructor,
 504                  NULL, user_arg, NULL, 0);
 505 
 506 
 507 
 508        To allocate, use, and free a foo object:
 509 
 510 
 511          foo = kmem_cache_alloc(foo_cache, KM_SLEEP);
 512              use foo;
 513          kmem_cache_free(foo_cache, foo);
 514 
 515 
 516 
 517        This makes foo allocation fast, because the allocator will usually do
 518        nothing more than fetch an already-constructed foo from the cache.
 519        foo_constructor and foo_destructor will be invoked only to populate and
 520        drain the cache, respectively.
 521 
 522 
 523        Example 2 Registering a Move Callback
 524 
 525 
 526        To register a move() callback:
 527 
 528 
 529          object_cache = kmem_cache_create(...);
 530          kmem_cache_set_move(object_cache, object_move);
 531 
 532 
 533 RETURN VALUES
 534        If successful, the constructor function must return 0. If KM_NOSLEEP or
 535        KM_NOSLEEP_LAZY is set and memory cannot be allocated without sleeping,
 536        the constructor must return -1.  If the constructor takes extraordinary
 537        steps during a KM_NOSLEEP construction, it need not take those steps for
 538        a KM_NOSLEEP_LAZY construction.
 539 
 540 
 541        kmem_cache_create() returns a pointer to the allocated cache.
 542 
 543 
 544        If successful, kmem_cache_alloc() returns a pointer to the allocated
 545        object. If KM_NOSLEEP is set and memory cannot be allocated without
 546        sleeping, kmem_cache_alloc() returns NULL.
 547 
 548 ATTRIBUTES
 549        See attributes(5) for descriptions of the following attributes:
 550 
 551 
 552 
 553 
 554        +--------------------+-----------------+
 555        |  ATTRIBUTE TYPE    | ATTRIBUTE VALUE |
 556        +--------------------+-----------------+
 557        |Interface Stability | Committed       |
 558        +--------------------+-----------------+
 559 
 560 SEE ALSO
 561        condvar(9F), kmem_alloc(9F), mutex(9F), kstat(9S)
 562 
 563 
 564        Writing Device Drivers
 565 
 566 
 567        The Slab Allocator: An Object-Caching Kernel Memory Allocator, Bonwick,
 568        J.; USENIX Summer 1994 Technical Conference (1994).
 569 
 570 
 571        Magazines and vmem: Extending the Slab Allocator to Many CPUs and
 572        Arbitrary Resources, Bonwick, J. and Adams, J.; USENIX 2001 Technical
 573        Conference (2001).
 574 
 575 NOTES
 576        The constructor must be immediately reversible by the destructor, since
 577        the allocator may call the constructor and destructor on objects still
 578        under its control at any time without client involvement.
 579 
 580 
 581        The constructor must respect the kmflags argument by forwarding it to
 582        allocations made inside the constructor, and must not ASSERT anything
 583        about the given flags.
 584 
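            The two notes above can be illustrated with a constructor that makes
            its own allocation. This is a userland simulation, not kernel code:
            the KM_* values are stand-ins defined locally for the sketch, and
            sim_kmem_alloc(), sim_kmem_free(), sim_memory_low, and the
            foo_scratch member are hypothetical names standing in for
            kmem_alloc(9F), kmem_free(9F), and real memory pressure.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the flags in <sys/kmem.h>; values assumed for this sketch. */
#define	KM_SLEEP	0x0000
#define	KM_NOSLEEP	0x0001

int sim_memory_low;	/* hypothetical switch that simulates memory pressure */

/* Hypothetical stand-in for kmem_alloc(9F). */
void *
sim_kmem_alloc(size_t size, int kmflag)
{
	if ((kmflag & KM_NOSLEEP) && sim_memory_low)
		return (NULL);	/* a non-sleeping request may fail */
	return (malloc(size));
}

/* Hypothetical stand-in for kmem_free(9F). */
void
sim_kmem_free(void *buf, size_t size)
{
	(void) size;
	free(buf);
}

#define	FOO_SCRATCH_SIZE	64

struct foo {
	void *foo_scratch;	/* part of the constructed state */
};

int
foo_constructor(void *buf, void *user_arg, int kmflags)
{
	struct foo *foo = buf;

	(void) user_arg;
	/* Forward kmflags unchanged; never substitute KM_SLEEP. */
	foo->foo_scratch = sim_kmem_alloc(FOO_SCRATCH_SIZE, kmflags);
	if (foo->foo_scratch == NULL)
		return (-1);	/* report failure to the allocator */
	return (0);
}

void
foo_destructor(void *buf, void *user_arg)
{
	struct foo *foo = buf;

	(void) user_arg;
	/* Exactly reverse the constructor. */
	sim_kmem_free(foo->foo_scratch, FOO_SCRATCH_SIZE);
	foo->foo_scratch = NULL;
}
```

            The constructor makes no assertion about kmflags; it only passes the
            value through and reports failure with -1, and the destructor undoes
            exactly what the constructor did.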
 585 
 586        The user argument forwarded to the constructor must be fully
 587        operational before it is passed to kmem_cache_create().
 588 
 589 
 590 
 591                                February 18, 2015         KMEM_CACHE_CREATE(9F)