    
5513 KM_NORMALPRI should be documented in kmem_alloc(9f) and kmem_cache_create(9f) man pages
14465 Present KM_NOSLEEP_LAZY as documented interface
Change-Id: I002ec28ddf390650f1fcba1ca94f6abfdb241439
    
      
    
    
          --- old/usr/src/man/man9f/kmem_cache_create.9f.man.txt
          +++ new/usr/src/man/man9f/kmem_cache_create.9f.man.txt
   1    1  KMEM_CACHE_CREATE(9F)    Kernel Functions for Drivers    KMEM_CACHE_CREATE(9F)
   2    2  
   3    3  
   4    4  
   5    5  NAME
   6    6         kmem_cache_create, kmem_cache_alloc, kmem_cache_free,
   7    7         kmem_cache_destroy, kmem_cache_set_move - kernel memory cache allocator
   8    8         operations
   9    9  
  10   10  SYNOPSIS
  11   11         #include <sys/types.h>
  12   12         #include <sys/kmem.h>
  13   13  
  14   14         kmem_cache_t *kmem_cache_create(char *name, size_t bufsize,
  15   15              size_t align, int (*constructor)(void *, void *, int),
  16   16              void (*destructor)(void *, void *), void (*reclaim)(void *),
  17   17              void *private, void *vmp, int cflags);
  18   18  
  19   19  
  20   20         void kmem_cache_destroy(kmem_cache_t *cp);
  21   21  
  22   22  
  23   23         void *kmem_cache_alloc(kmem_cache_t *cp, int kmflag);
  24   24  
  25   25  
  26   26         void kmem_cache_free(kmem_cache_t *cp, void *obj);
  27   27  
  28   28  
  29   29         void kmem_cache_set_move(kmem_cache_t *cp, kmem_cbrc_t (*move)(void *,
  30   30              void *, size_t, void *));
  31   31  
  32   32  
  33   33          [Synopsis for callback functions:]
  34   34  
  35   35  
  36   36         int (*constructor)(void *buf, void *user_arg, int kmflags);
  37   37  
  38   38  
  39   39         void (*destructor)(void *buf, void *user_arg);
  40   40  
  41   41  
  42   42         kmem_cbrc_t (*move)(void *old, void *new, size_t bufsize,
  43   43              void *user_arg);
  44   44  
  45   45  
  46   46  INTERFACE LEVEL
  47   47         illumos DDI specific (illumos DDI)
  48   48  
  49   49  PARAMETERS
  50   50         The parameters for the kmem_cache_* functions are as follows:
  51   51  
  52   52         name
  53   53                        Descriptive name of a kstat(9S) structure of class
  54   54                        kmem_cache.  Names longer than 31 characters are
  55   55                        truncated.
  56   56  
  57   57  
  58   58         bufsize
  59   59                        Size of the objects it manages.
  60   60  
  61   61  
  62   62         align
  63   63                        Required object alignment.
  64   64  
  65   65  
  66   66         constructor
  67   67                        Pointer to an object constructor function. Parameters
  68   68                        are defined below.
  69   69  
  70   70  
  71   71         destructor
  72   72                        Pointer to an object destructor function. Parameters are
  73   73                        defined below.
  74   74  
  75   75  
  76   76         reclaim
  77   77                        Drivers should pass NULL.
  78   78  
  79   79  
  80   80         private
  81   81                        Pass-through argument for constructor/destructor.
  82   82  
  83   83  
  84   84         vmp
  85   85                        Drivers should pass NULL.
  86   86  
  87   87  
  88   88         cflags
  89   89                        Drivers must pass 0.
  90   90  
  91   91  
  
  
  92   92         kmflag
  93   93                        Possible flags are:
  94   94  
  95   95                        KM_SLEEP
  96   96                                       Allow sleeping (blocking) until memory is
  97   97                                       available.
  98   98  
  99   99  
 100  100                        KM_NOSLEEP
 101  101                                       Return NULL immediately if memory is not
 102      -                                     available.
      102 +                                     available, but after an aggressive
      103 +                                     reclaiming attempt.  Any mention of
      104 +                                     KM_NOSLEEP without mentioning
      105 +                                     KM_NOSLEEP_LAZY (see below) applies to
      106 +                                     both values.
 103  107  
 104  108  
      109 +                      KM_NOSLEEP_LAZY
      110 +                                     Return NULL immediately if memory is not
      111 +                                     available, without the aggressive
      112 +                                     reclaiming attempt.  This is actually two
      113 +                                     flags combined: (KM_NOSLEEP |
      114 +                                     KM_NORMALPRI), the latter flag indicating
      115 +                                     not to attempt reclamation before giving
      116 +                                     up and returning NULL.
      117 +
      118 +
 105  119                        KM_PUSHPAGE
 106  120                                       Allow the allocation to use reserved
 107  121                                       memory.
 108  122  
 109  123  
 110  124  
 111  125         obj
 112  126                        Pointer to the object allocated by kmem_cache_alloc().
 113  127  
 114  128  
 115  129         move
 116  130                        Pointer to an object relocation function. Parameters are
 117  131                        defined below.
 118  132  
 119  133  
 120  134  
 121  135         The parameters for the callback constructor function are as follows:
 122  136  
 123  137         void *buf
 124  138                           Pointer to the object to be constructed.
 125  139  
 126  140  
 127  141         void *user_arg
 128  142                           The private parameter from the call to
 129  143                           kmem_cache_create(); it is typically a pointer to the
 130  144                           soft-state structure.
 131  145  
 132  146  
 133  147         int kmflags
 134  148                           Propagated kmflag values.
 135  149  
 136  150  
 137  151  
 138  152         The parameters for the callback destructor function are as follows:
 139  153  
 140  154         void *buf
 141  155                           Pointer to the object to be deconstructed.
 142  156  
 143  157  
 144  158         void *user_arg
 145  159                           The private parameter from the call to
 146  160                           kmem_cache_create(); it is typically a pointer to the
 147  161                           soft-state structure.
 148  162  
 149  163  
 150  164  
 151  165         The parameters for the callback move() function are as follows:
 152  166  
 153  167         void *old
 154  168                           Pointer to the object to be moved.
 155  169  
 156  170  
 157  171         void *new
 158  172                           Pointer to the object that serves as the copy
 159  173                           destination for the contents of the old parameter.
 160  174  
 161  175  
 162  176         size_t bufsize
 163  177                           Size of the object to be moved.
 164  178  
 165  179  
 166  180         void *user_arg
 167  181                           The private parameter from the call to
 168  182                           kmem_cache_create(); it is typically a pointer to the
 169  183                           soft-state structure.
 170  184  
 171  185  
 172  186  DESCRIPTION
 173  187         In many cases, the cost of initializing and destroying an object
 174  188         exceeds the cost of allocating and freeing memory for it. The functions
 175  189         described here address this condition.
 176  190  
 177  191  
 178  192         Object caching is a technique for dealing with objects that are:
 179  193  
 180  194             o      frequently allocated and freed, and
 181  195  
 182  196             o      have setup and initialization costs.
 183  197  
 184  198  
 185  199         The idea is to allow the allocator and its clients to cooperate to
 186  200         preserve the invariant portion of an object's initial state, or
 187  201         constructed state, between uses, so it does not have to be destroyed
 188  202         and re-created every time the object is used. For example, an object
 189  203         containing a mutex only needs to have mutex_init() applied once, the
 190  204         first time the object is allocated. The object can then be freed and
 191  205         reallocated many times without incurring the expense of mutex_destroy()
 192  206         and mutex_init() each time. An object's embedded locks, condition
 193  207         variables, reference counts, lists of other objects, and read-only data
 194  208         all generally qualify as constructed state. The essential requirement
 195  209         is that the client must free the object (using kmem_cache_free()) in
 196  210         its constructed state. The allocator cannot enforce this, so
 197  211         programming errors will lead to hard-to-find bugs.
 198  212  
 199  213  
 200  214         A driver should call kmem_cache_create() at the time of _init(9E) or
 201  215         attach(9E), and call the corresponding kmem_cache_destroy() at the time
 202  216         of _fini(9E) or detach(9E).
 203  217  
 204  218  
 205  219         kmem_cache_create() creates a cache of objects, each of size bufsize
 206  220         bytes, aligned on an align boundary. Drivers not requiring a specific
 207  221         alignment can pass 0. name identifies the cache for statistics and
 208  222         debugging. constructor and destructor convert plain memory into objects
 209  223         and back again; constructor can fail if it needs to allocate memory but
 210  224         cannot. private is a parameter passed to the constructor and destructor
 211  225         callbacks to support parameterized caches (for example, a pointer to an
 212  226         instance of the driver's soft-state structure). To facilitate
 213  227         debugging, kmem_cache_create() creates a kstat(9S) structure of class
 214  228         kmem_cache and name name. It returns an opaque pointer to the object
 215  229         cache.
 216  230  
 217  231  
 218  232         kmem_cache_alloc() gets an object from the cache. The object will be in
 219  233         its constructed state. kmflag has either KM_SLEEP or KM_NOSLEEP set,
 220  234         indicating whether it is acceptable to wait for memory if none is
 221  235         currently available.
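
The relationship between the flags described under kmflag can be sketched in userland as follows. This is only an illustration: the EX_-prefixed names and their numeric values are assumptions made for the sketch, and a driver would take the real definitions from <sys/kmem.h>. The one detail taken from this page is that KM_NOSLEEP_LAZY is the combination (KM_NOSLEEP | KM_NORMALPRI).

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the kmflag bits. The values are assumed for
 * illustration only; real code must use the definitions in <sys/kmem.h>.
 */
#define	EX_KM_SLEEP		0x0000
#define	EX_KM_NOSLEEP		0x0001
#define	EX_KM_NORMALPRI		0x0008
#define	EX_KM_NOSLEEP_LAZY	(EX_KM_NOSLEEP | EX_KM_NORMALPRI)

/* True if the caller allows the allocation to block for memory. */
static bool
ex_may_sleep(int kmflag)
{
	return ((kmflag & EX_KM_NOSLEEP) == 0);
}

/* True if the caller asked to skip the aggressive reclaim attempt. */
static bool
ex_skips_reclaim(int kmflag)
{
	return ((kmflag & EX_KM_NORMALPRI) != 0);
}
```

Because the lazy variant still has the no-sleep bit set, any check written against KM_NOSLEEP alone also covers KM_NOSLEEP_LAZY, which matches the statement that a mention of KM_NOSLEEP applies to both values.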
 222  236  
 223  237  
 224  238         A small pool of reserved memory is available to allow the system to
 225  239         progress toward the goal of freeing additional memory while in a low
 226  240         memory situation.  The KM_PUSHPAGE flag enables use of this reserved
 227  241         memory pool on an allocation. This flag can be used by drivers that
 228  242         implement strategy(9E) on memory allocations associated with a single
 229  243         I/O operation. The driver guarantees that the I/O operation will
 230  244         complete (or time out) and, on completion, that the memory will be
 231  245         returned. The KM_PUSHPAGE flag should be used only in
 232  246         kmem_cache_alloc() calls. All allocations from a given cache should be
 233  247         consistent in their use of the flag. A driver that adheres to these
 234  248         restrictions can guarantee progress in a low memory situation without
 235  249         resorting to complex private allocation and queuing schemes. If
 236  250         KM_PUSHPAGE is specified, KM_SLEEP can also be used without causing
 237  251         deadlock.
 238  252  
 239  253  
 240  254         kmem_cache_free() returns an object to the cache. The object must be in
 241  255         its constructed state.
 242  256  
 243  257  
 244  258         kmem_cache_destroy() destroys the cache and releases all associated
 245  259         resources. All allocated objects must have been previously freed.
 246  260  
 247  261  
 248  262         kmem_cache_set_move() registers a function that the allocator may call
 249  263         to move objects from sparsely allocated pages of memory so that the
 250  264         system can reclaim pages that are tied up by the client. Since caching
 251  265         objects of the same size and type already makes severe memory
 252  266         fragmentation unlikely, there is generally no need to register such a
 253  267         function. The idea is to make it possible to limit worst-case
 254  268         fragmentation in caches that exhibit a tendency to become highly
 255  269         fragmented. Only clients that allocate a mix of long- and short-lived
 256  270         objects from the same cache are prone to exhibit this tendency, making
 257  271         them candidates for a move() callback.
 258  272  
 259  273  
 260  274         The move() callback supplies the client with two addresses: the
 261  275         allocated object that the allocator wants to move and a buffer selected
 262  276         by the allocator for the client to use as the copy destination. The new
 263  277         parameter is an allocated, constructed object ready to receive the
 264  278         contents of the old parameter. The bufsize parameter supplies the size
 265  279         of the object, in case a single move function handles multiple caches
 266  280         whose objects differ only in size. Finally, the private parameter
 267  281         passed to the constructor and destructor is also passed to the move()
 268  282         callback.
 269  283  
 270  284  
 271  285         Only the client knows about its own data and when it is a good time to
 272  286         move it.  The client cooperates with the allocator to return unused
 273  287         memory to the system, and the allocator accepts this help at the
 274  288         client's convenience. When asked to move an object, the client can
 275  289         respond with any of the following:
 276  290  
 277  291           typedef enum kmem_cbrc {
 278  292                        KMEM_CBRC_YES,
 279  293                        KMEM_CBRC_NO,
 280  294                        KMEM_CBRC_LATER,
 281  295                        KMEM_CBRC_DONT_NEED,
 282  296                        KMEM_CBRC_DONT_KNOW
 283  297           } kmem_cbrc_t;
 284  298  
 285  299  
 286  300  
 287  301  
 288  302         The client must not explicitly free either of the objects passed to the
 289  303         move() callback, since the allocator wants to free them directly to the
 290  304         slab layer (bypassing the per-CPU magazine layer). The response tells
 291  305         the allocator which of the two object parameters to free:
 292  306  
 293  307         KMEM_CBRC_YES
 294  308                                The client moved the object; the allocator frees
 295  309                                the old parameter.
 296  310  
 297  311  
 298  312         KMEM_CBRC_NO
 299  313                                The client refused to move the object; the
 300  314                                allocator frees the new parameter (the unused
 301  315                                copy destination).
 302  316  
 303  317  
 304  318         KMEM_CBRC_LATER
 305  319                                The client is using the object and cannot move
 306  320                                it now; the allocator frees the new parameter
 307  321                                (the unused copy destination). The client should
 308  322                                use KMEM_CBRC_LATER instead of KMEM_CBRC_NO if
 309  323                                the object is likely to become movable soon.
 310  324  
 311  325  
 312  326         KMEM_CBRC_DONT_NEED
 313  327                                The client no longer needs the object; the
 314  328                                allocator frees both the old and new parameters.
 315  329                                This response is the client's opportunity to be
 316  330                                a model citizen and give back as much as it can.
 317  331  
 318  332  
 319  333         KMEM_CBRC_DONT_KNOW
 320  334                                The client does not know about the object
 321  335                                because:
 322  336  
 323  337                                a)
 324  338                                      the client has just allocated the object
 325  339                                      and has not yet put it wherever it expects
 326  340                                      to find known objects
 327  341  
 328  342  
 329  343                                b)
 330  344                                      the client has removed the object from
 331  345                                      wherever it expects to find known objects
 332  346                                      and is about to free the object
 333  347  
 334  348  
 335  349                                c)
 336  350                                      the client has freed the object
 337  351  
 338  352                                In all of the cases above, the allocator frees
 339  353                                the new parameter (the unused copy destination)
 340  354                                and searches for the old parameter in the
 341  355                                magazine layer. If the object is found, it is
 342  356                                removed from the magazine layer and freed to the
 343  357                                slab layer so that it will no longer tie up an
 344  358                                entire page of memory.
 345  359  
 346  360  
 347  361  
 348  362         Any object passed to the move() callback is guaranteed to have been
 349  363         touched only by the allocator or by the client. Because memory patterns
 350  364         applied by the allocator always set at least one of the two lowest
 351  365         order bits, the bottom two bits of any pointer member (other than char
 352  366         * or short *, which may not be 8-byte aligned on all platforms) are
 353  367         available to the client for marking cached objects that the client is
 354  368         about to free. This way, the client can recognize known objects in the
 355  369         move() callback by the unmarked (valid) pointer value.
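
The pointer-marking technique above might look like the following userland sketch. Everything prefixed with ex_ or EX_ is hypothetical, the enum is a local copy of the one shown earlier so that the sketch builds outside the kernel, and the locking a real client would need around these checks is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the response enum shown above, for a userland build. */
typedef enum kmem_cbrc {
	KMEM_CBRC_YES,
	KMEM_CBRC_NO,
	KMEM_CBRC_LATER,
	KMEM_CBRC_DONT_NEED,
	KMEM_CBRC_DONT_KNOW
} kmem_cbrc_t;

/* Hypothetical client types; eo_owner is the pointer member that gets marked. */
struct ex_owner { long eo_id; };
struct ex_obj {
	struct ex_owner	*eo_owner;
	int		eo_refcnt;
	int		eo_value;
};

#define	EX_LOW_BITS	((uintptr_t)0x3)	/* bottom two pointer bits */
#define	EX_DYING	((uintptr_t)0x1)	/* client's "about to free" mark */

/* Mark an object just before the client frees it. */
static void
ex_mark_dying(struct ex_obj *obj)
{
	obj->eo_owner = (struct ex_owner *)((uintptr_t)obj->eo_owner | EX_DYING);
}

/* A known object has a non-NULL owner pointer with both low bits clear. */
static bool
ex_is_known(const struct ex_obj *obj)
{
	uintptr_t p = (uintptr_t)obj->eo_owner;

	return (p != 0 && (p & EX_LOW_BITS) == 0);
}

/* Move callback sketch: refuse unknown or busy objects, otherwise copy. */
static kmem_cbrc_t
ex_move(void *oldbuf, void *newbuf, size_t bufsize, void *user_arg)
{
	struct ex_obj *src = oldbuf;
	struct ex_obj *dst = newbuf;

	(void) bufsize;
	(void) user_arg;

	if (!ex_is_known(src))
		return (KMEM_CBRC_DONT_KNOW);
	if (src->eo_refcnt != 0)
		return (KMEM_CBRC_LATER);

	dst->eo_owner = src->eo_owner;
	dst->eo_value = src->eo_value;
	return (KMEM_CBRC_YES);
}
```

The sketch never frees either buffer itself; as described above, the return value tells the allocator which of the two to free.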
 356  370  
 357  371  
 358  372         If the client refuses to move an object with either KMEM_CBRC_NO or
 359  373         KMEM_CBRC_LATER, and that object later becomes movable, the client can
 360  374         notify the allocator by calling kmem_cache_move_notify().
 361  375         Alternatively, the client can simply wait for the allocator to call
 362  376         back again with the same object address. Responding KMEM_CBRC_NO even
 363  377         once or responding KMEM_CBRC_LATER too many times for the same object
 364  378         makes the allocator less likely to call back again for that object.
 365  379  
 366  380         [Synopsis for notification function:]
 367  381  
 368  382  
 369  383         void kmem_cache_move_notify(kmem_cache_t *cp, void *obj);
 370  384  
 371  385  
 372  386  
 373  387         The parameters for the notification function are as follows:
 374  388  
 375  389         cp
 376  390                Pointer to the object cache.
 377  391  
 378  392  
 379  393         obj
 380  394                Pointer to the object that has become movable since an earlier
 381  395                refusal to move it.
 382  396  
 383  397  
 384  398  CONTEXT
 385  399         Constructors can be invoked during any call to kmem_cache_alloc(), and
 386  400         will run in that context. Similarly, destructors can be invoked during
 387  401         any call to kmem_cache_free(), and can also be invoked during
 388  402         kmem_cache_destroy(). Therefore, the functions that a constructor or
 389  403         destructor invokes must be appropriate in that context. Furthermore,
 390  404         the allocator may also call the constructor and destructor on objects
 391  405         still under its control without client involvement.
 392  406  
 393  407  
 394  408         kmem_cache_create() and kmem_cache_destroy() must not be called from
 395  409         interrupt context. kmem_cache_create() can also block for available
 396  410         memory.
 397  411  
 398  412  
 399  413         kmem_cache_alloc() can be called from interrupt context only if the
 400  414         KM_NOSLEEP flag is set. It can be called from user or kernel context
 401  415         with any valid flag.
 402  416  
 403  417  
 404  418         kmem_cache_free() can be called from user, kernel, or interrupt
 405  419         context.
 406  420  
 407  421  
 408  422         kmem_cache_set_move() is called from the same context as
 409  423         kmem_cache_create(), immediately after kmem_cache_create() and before
 410  424         allocating any objects from the cache.
 411  425  
 412  426  
 413  427         The registered move() callback is always invoked in the same global
 414  428         callback thread dedicated for move requests, guaranteeing that no
 415  429         matter how many clients register a move() function, the allocator never
 416  430         tries to move more than one object at a time. Neither the allocator nor
 417  431         the client can be assumed to know the object's whereabouts at the time
 418  432         of the callback.
 419  433  
 420  434  EXAMPLES
 421  435         Example 1 Object Caching
 422  436  
 423  437  
 424  438         Consider the following data structure:
 425  439  
 426  440  
 427  441           struct foo {
 428  442               kmutex_t foo_lock;
 429  443               kcondvar_t foo_cv;
 430  444               struct bar *foo_barlist;
 431  445               int foo_refcnt;
 432  446           };
 433  447  
 434  448  
 435  449  
 436  450         Assume that a foo structure cannot be freed until there are no
 437  451         outstanding references to it (foo_refcnt == 0) and all of its pending
 438  452         bar events (whatever they are) have completed (foo_barlist == NULL).
 439  453         The life cycle of a dynamically allocated foo would be something like
 440  454         this:
 441  455  
 442  456  
 443  457           foo = kmem_alloc(sizeof (struct foo), KM_SLEEP);
 444  458           mutex_init(&foo->foo_lock, ...);
 445  459           cv_init(&foo->foo_cv, ...);
 446  460           foo->foo_refcnt = 0;
 447  461           foo->foo_barlist = NULL;
 448  462               use foo;
 449  463           ASSERT(foo->foo_barlist == NULL);
 450  464           ASSERT(foo->foo_refcnt == 0);
 451  465           cv_destroy(&foo->foo_cv);
 452  466           mutex_destroy(&foo->foo_lock);
 453  467           kmem_free(foo, sizeof (struct foo));
 454  468  
 455  469  
 456  470  
 457  471         Notice that between each use of a foo object we perform a sequence of
 458  472         operations that constitutes nothing but expensive overhead. All of this
 459  473         overhead (that is, everything other than use foo above) can be
 460  474         eliminated by object caching.
 461  475  
 462  476  
 463  477           int
 464  478           foo_constructor(void *buf, void *arg, int tags)
 465  479           {
 466  480               struct foo *foo = buf;
 467  481               mutex_init(&foo->foo_lock, ...);
 468  482               cv_init(&foo->foo_cv, ...);
 469  483               foo->foo_refcnt = 0;
 470  484               foo->foo_barlist = NULL;
 471  485               return (0);
 472  486           }
 473  487  
 474  488           void
 475  489           foo_destructor(void *buf, void *arg)
 476  490           {
 477  491               struct foo *foo = buf;
 478  492               ASSERT(foo->foo_barlist == NULL);
 479  493               ASSERT(foo->foo_refcnt == 0);
 480  494               cv_destroy(&foo->foo_cv);
 481  495               mutex_destroy(&foo->foo_lock);
 482  496           }
 483  497  
 484  498           user_arg = ddi_get_soft_state(foo_softc, instance);
 485  499           (void) snprintf(buf, KSTAT_STRLEN, "foo%d_cache",
 486  500                   ddi_get_instance(dip));
 487  501           foo_cache = kmem_cache_create(buf,
 488  502                   sizeof (struct foo), 0,
 489  503                   foo_constructor, foo_destructor,
 490  504                   NULL, user_arg, NULL, 0);
 491  505  
 492  506  
 493  507  
 494  508         To allocate, use, and free a foo object:
 495  509  
 496  510  
 497  511           foo = kmem_cache_alloc(foo_cache, KM_SLEEP);
 498  512               use foo;
 499  513           kmem_cache_free(foo_cache, foo);
 500  514  
 501  515  
 502  516  
 503  517         This makes foo allocation fast, because the allocator will usually do
 504  518         nothing more than fetch an already-constructed foo from the cache.
 505  519         foo_constructor and foo_destructor will be invoked only to populate and
 506  520         drain the cache, respectively.
 507  521  
 508  522  
 509  523         Example 2 Registering a Move Callback
  
  
 510  524  
 511  525  
 512  526         To register a move() callback:
 513  527  
 514  528  
 515  529           object_cache = kmem_cache_create(...);
 516  530           kmem_cache_set_move(object_cache, object_move);
 517  531  
 518  532  
 519  533  RETURN VALUES
 520      -       If successful, the constructor function must return 0. If KM_NOSLEEP is
 521      -       set and memory cannot be allocated without sleeping, the constructor
 522      -       must return -1.
      534 +       If successful, the constructor function must return 0. If KM_NOSLEEP or
      535 +       KM_NOSLEEP_LAZY is set and memory cannot be allocated without sleeping,
      536 +       the constructor must return -1.  If the constructor takes extraordinary
      537 +       steps during a KM_NOSLEEP construction, it may not take those for a
      538 +       KM_NOSLEEP_LAZY construction.
 523  539  
 524  540  
 525  541         kmem_cache_create() returns a pointer to the allocated cache.
 526  542  
 527  543  
 528  544         If successful, kmem_cache_alloc() returns a pointer to the allocated
 529  545         object. If KM_NOSLEEP is set and memory cannot be allocated without
 530  546         sleeping, kmem_cache_alloc() returns NULL.
 531  547  
 532  548  ATTRIBUTES
 533  549         See attributes(5) for descriptions of the following attributes:
 534  550  
 535  551  
 536  552  
 537  553  
 538  554         +--------------------+-----------------+
 539  555         |  ATTRIBUTE TYPE    | ATTRIBUTE VALUE |
 540  556         +--------------------+-----------------+
 541  557         |Interface Stability | Committed       |
 542  558         +--------------------+-----------------+
 543  559  
 544  560  SEE ALSO
 545  561         condvar(9F), kmem_alloc(9F), mutex(9F), kstat(9S)
 546  562  
 547  563  
 548  564         Writing Device Drivers
 549  565  
 550  566  
 551  567         The Slab Allocator: An Object-Caching Kernel Memory Allocator, Bonwick,
 552  568         J.; USENIX Summer 1994 Technical Conference (1994).
 553  569  
 554  570  
 555  571         Magazines and vmem: Extending the Slab Allocator to Many CPUs and
 556  572         Arbitrary Resources, Bonwick, J. and Adams, J.; USENIX 2001 Technical
 557  573         Conference (2001).
 558  574  
 559  575  NOTES
 560  576         The constructor must be immediately reversible by the destructor, since
 561  577         the allocator may call the constructor and destructor on objects still
 562  578         under its control at any time without client involvement.
 563  579  
 564  580  
 565  581         The constructor must respect the kmflags argument by forwarding it to
 566  582         allocations made inside the constructor, and must not ASSERT anything
 567  583         about the given flags.
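
A constructor that follows this rule could be shaped like the userland sketch below. ex_kmem_alloc() is a hypothetical stub standing in for a kernel allocation routine, the EX_ flag values are assumed for illustration, and struct ex_foo is not part of any real driver.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed flag values for the sketch; real code uses <sys/kmem.h>. */
#define	EX_KM_SLEEP	0x0000
#define	EX_KM_NOSLEEP	0x0001

static int ex_last_flags = -1;	/* flags seen by the most recent stub call */
static int ex_fail_nosleep = 0;	/* when set, no-sleep requests fail */

/* Hypothetical stub allocator that records the flags it was given. */
static void *
ex_kmem_alloc(size_t size, int kmflag)
{
	ex_last_flags = kmflag;
	if ((kmflag & EX_KM_NOSLEEP) != 0 && ex_fail_nosleep)
		return (NULL);
	return (malloc(size));
}

struct ex_foo {
	char	*ef_scratch;	/* secondary buffer owned by the object */
};

/*
 * Constructor sketch: forwards kmflags unchanged to the nested allocation,
 * makes no assertions about the flags, and returns -1 on failure.
 */
static int
ex_foo_constructor(void *buf, void *user_arg, int kmflags)
{
	struct ex_foo *foo = buf;

	(void) user_arg;
	foo->ef_scratch = ex_kmem_alloc(64, kmflags);
	if (foo->ef_scratch == NULL)
		return (-1);
	return (0);
}
```

Passing kmflags through untouched means a KM_NOSLEEP_LAZY request reaches the nested allocation with both of its bits intact, without the constructor having to know about the lazy variant.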
 568  584  
 569  585  
 570  586         The user argument forwarded to the constructor must be fully
 571  587         operational before it is passed to kmem_cache_create().
 572  588  
 573  589  
 574  590  
 575  591                                 February 18, 2015         KMEM_CACHE_CREATE(9F)
  
  