NEX-15740 NFS deadlock in rfs4_compound with hundreds of threads waiting for lock owned by rfs4_op_rename (lint fix)
NEX-15740 NFS deadlock in rfs4_compound with hundreds of threads waiting for lock owned by rfs4_op_rename
Reviewed by: Evan Layton <evan.layton@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-16917 Need to reduce the impact of NFS per-share kstats on failover
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
NEX-16835 Kernel panic during BDD tests at rfs4_compound func
Reviewed by: Evan Layton <evan.layton@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-15924 Getting panic: BAD TRAP: type=d (#gp General protection) rp=ffffff0021464690 addr=12
Reviewed by: Evan Layton <evan.layton@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
NEX-16812 Timing window where dtrace probe could try to access share info after unshared
Reviewed by: Evan Layton <evan.layton@nexenta.com>
Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
NEX-16452 NFS server in a zone state database needs to be per zone
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-15279 support NFS server in zone
NEX-15520 online NFS shares cause zoneadm halt to hang in nfs_export_zone_fini
Portions contributed by: Dan Kruchinin <dan.kruchinin@nexenta.com>
Portions contributed by: Stepan Zastupov <stepan.zastupov@gmail.com>
Reviewed by: Joyce McIntosh <joyce.mcintosh@nexenta.com>
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
NEX-9275 Got "bad mutex" panic when run IO to nfs share from clients
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
NEX-7366 Getting panic in "module "nfssrv" due to a NULL pointer dereference" when updating NFS shares on a pool
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
NEX-6778 NFS kstats leak and cause system to hang
Revert "NEX-4261 Per-client NFS server IOPS, bandwidth, and latency kstats"
This reverts commit 586c3ab1927647487f01c337ddc011c642575a52.
Revert "NEX-5354 Aggregated IOPS, bandwidth, and latency kstats for NFS server"
This reverts commit c91d7614da8618ef48018102b077f60ecbbac8c2.
Revert "NEX-5667 nfssrv_stats_flags does not work for aggregated kstats"
This reverts commit 3dcf42618be7dd5f408c327f429c81e07ca08e74.
Revert "NEX-5750 Time values for aggregated NFS server kstats should be normalized"
This reverts commit 1f4d4f901153b0191027969fa4a8064f9d3b9ee1.
Revert "NEX-5942 Panic in rfs4_minorvers_mismatch() with NFSv4.1 client"
This reverts commit 40766417094a162f5e4cc8786c0fa0a7e5871cd9.
Revert "NEX-5752 NFS server: namespace collision in kstats"
This reverts commit ae81e668db86050da8e483264acb0cce0444a132.
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-6109 NFS client panics in nfssrv when running nfsv4-test basic_ops STC tests
Reviewed by: Gordon Ross <gwr@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
NEX-4261 Per-client NFS server IOPS, bandwidth, and latency kstats
Reviewed by: Kevin Crowe <kevin.crowe@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
NEX-5134 Deadlock between rfs4_do_lock() and rfs4_op_read()
Reviewed by: Dan Fields <dan.fields@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
NEX-3311 NFSv4: setlock() can spin forever
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
NEX-3097 IOPS, bandwidth, and latency kstats for NFS server
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
NEX-1128 NFS server: Generic uid and gid remapping for AUTH_SYS
Reviewed by: Jan Kryl <jan.kryl@nexenta.com>
OS-72 NULL pointer dereference in rfs4_op_setclientid()
Reviewed by: Dan McDonald <danmcd@nexenta.com>
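
The diff below adds an `op_type` field to `struct rfsv4disp` and tags each dispatch-table entry with `NFS4_OP_NOFH`, `NFS4_OP_CFH`, `NFS4_OP_SFH`, or `NFS4_OP_POSTCFH`, so that an operation can be associated with an exportinfo for the per-share kstats. The code that consumes `op_type` is not part of this portion of the diff, so the following is only a minimal userland sketch of the classification that the comment describes. The `sketch_export` and `sketch_cs` types and both helper functions are hypothetical stand-ins, not the kernel's `struct exportinfo`, `struct compound_state`, or any function in `nfs4_srv.c`.

```c
#include <assert.h>
#include <stddef.h>

/* Operation types, with the same values the diff defines. */
#define NFS4_OP_NOFH    0
#define NFS4_OP_CFH     1
#define NFS4_OP_SFH     2
#define NFS4_OP_POSTCFH 3

/* Hypothetical stand-in for an export (share). */
struct sketch_export {
    const char *path;
};

/* Hypothetical stand-in for the per-compound state. */
struct sketch_cs {
    struct sketch_export *cur_exi;   /* export of the current filehandle */
    struct sketch_export *saved_exi; /* export of the saved filehandle */
};

/*
 * Export that can be charged BEFORE the operation runs.
 * Returns NULL when no export applies (NOFH) or when it is
 * not known until the operation has completed (POSTCFH).
 */
static struct sketch_export *
sketch_exi_before_op(int op_type, const struct sketch_cs *cs)
{
    assert(cs != NULL);
    switch (op_type) {
    case NFS4_OP_CFH:
        return (cs->cur_exi);
    case NFS4_OP_SFH:
        return (cs->saved_exi);
    default:
        return (NULL);
    }
}

/*
 * Export that only becomes known AFTER the operation has set a new
 * current filehandle. Only POSTCFH operations fall into this case.
 */
static struct sketch_export *
sketch_exi_after_op(int op_type, const struct sketch_cs *cs)
{
    assert(cs != NULL);
    return (op_type == NFS4_OP_POSTCFH ? cs->cur_exi : NULL);
}
```

Splitting the lookup into a "before" and an "after" helper mirrors the point made in the `NFS4_OP_POSTCFH` comment: because the export is unknown until the operation finishes, statistics that must be started before the operation (such as those tied to kstat_queue(9F)) cannot easily be maintained for those operations.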

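In `rfs4_zone_init()` the diff replaces the global `Write4verf` with a per-zone `nsrv4->write4verf`, still built from the host ID and the current time as the in-code comment explains. The sketch below shows that packing step in portable userland C, under stated assumptions: `sketch_timespec32` is a hypothetical stand-in for `timespec32_t`, the host ID and time are passed in as parameters rather than read from `zone_get_hostid()` and `gethrestime()`, and `memcpy()` is used in place of the kernel code's pointer cast.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for timespec32_t: two 32-bit fields. */
struct sketch_timespec32 {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/*
 * Build a 64-bit verifier from a host ID and a timestamp.
 * With a nonzero host ID, the host ID and the time in seconds are
 * combined; otherwise the full seconds/nanoseconds timestamp is used.
 */
static uint64_t
sketch_write_verf(uint32_t hostid, int64_t now_sec, int32_t now_nsec)
{
    struct sketch_timespec32 verf;
    uint64_t out;

    if (hostid != 0) {
        verf.tv_sec = (int32_t)hostid;
        verf.tv_nsec = (int32_t)now_sec;
    } else {
        verf.tv_sec = (int32_t)now_sec;
        verf.tv_nsec = now_nsec;
    }

    /* Copy the two 32-bit fields into one 64-bit value. */
    assert(sizeof (verf) == sizeof (out));
    memcpy(&out, &verf, sizeof (out));
    return (out);
}
```

Two calls with the same host ID but different times yield different verifiers, which is the property the comment in the diff asks for across reboots; different host IDs at the same time likewise yield different values.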
          --- old/usr/src/uts/common/fs/nfs/nfs4_srv.c
          +++ new/usr/src/uts/common/fs/nfs/nfs4_srv.c
[12 lines elided]
  13   13   * When distributing Covered Code, include this CDDL HEADER in each
  14   14   * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  15   15   * If applicable, add the following below this CDDL HEADER, with the
  16   16   * fields enclosed by brackets "[]" replaced with your own identifying
  17   17   * information: Portions Copyright [yyyy] [name of copyright owner]
  18   18   *
  19   19   * CDDL HEADER END
  20   20   */
  21   21  
  22   22  /*
  23      - * Copyright 2016 Nexenta Systems, Inc.  All rights reserved.
  24   23   * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved.
  25      - * Copyright (c) 2012, 2016 by Delphix. All rights reserved.
  26   24   */
  27   25  
  28   26  /*
  29   27   *      Copyright (c) 1983,1984,1985,1986,1987,1988,1989  AT&T.
  30   28   *      All Rights Reserved
  31   29   */
  32   30  
       31 +/*
       32 + * Copyright 2019 Nexenta Systems, Inc.
       33 + * Copyright (c) 2012, 2016 by Delphix. All rights reserved.
       34 + */
       35 +
  33   36  #include <sys/param.h>
  34   37  #include <sys/types.h>
  35   38  #include <sys/systm.h>
  36   39  #include <sys/cred.h>
  37   40  #include <sys/buf.h>
  38   41  #include <sys/vfs.h>
  39   42  #include <sys/vfs_opreg.h>
  40   43  #include <sys/vnode.h>
  41   44  #include <sys/uio.h>
  42   45  #include <sys/errno.h>
[7 lines elided]
  50   53  #include <sys/flock.h>
  51   54  #include <sys/pathname.h>
  52   55  #include <sys/nbmlock.h>
  53   56  #include <sys/share.h>
  54   57  #include <sys/atomic.h>
  55   58  #include <sys/policy.h>
  56   59  #include <sys/fem.h>
  57   60  #include <sys/sdt.h>
  58   61  #include <sys/ddi.h>
  59   62  #include <sys/zone.h>
       63 +#include <sys/kstat.h>
  60   64  
  61   65  #include <fs/fs_reparse.h>
  62   66  
  63   67  #include <rpc/types.h>
  64   68  #include <rpc/auth.h>
  65   69  #include <rpc/rpcsec_gss.h>
  66   70  #include <rpc/svc.h>
  67   71  
  68   72  #include <nfs/nfs.h>
       73 +#include <nfs/nfssys.h>
  69   74  #include <nfs/export.h>
  70   75  #include <nfs/nfs_cmd.h>
  71   76  #include <nfs/lm.h>
  72   77  #include <nfs/nfs4.h>
       78 +#include <nfs/nfs4_drc.h>
  73   79  
  74   80  #include <sys/strsubr.h>
  75   81  #include <sys/strsun.h>
  76   82  
  77   83  #include <inet/common.h>
  78   84  #include <inet/ip.h>
  79   85  #include <inet/ip6.h>
  80   86  
  81   87  #include <sys/tsol/label.h>
  82   88  #include <sys/tsol/tndb.h>
[57 lines elided]
 140  146   *
 141  147   * dirent64: named padded to provide 8 byte struct alignment
 142  148   *      d_ino(8) + d_off(8) + d_reclen(2) + d_name(namelen + null(1) + pad)
 143  149   *
 144  150   * cookie: uint64_t   +  utf8namelen: uint_t  +   utf8name padded to 8 bytes
 145  151   *
 146  152   */
 147  153  #define DIRENT64_TO_DIRCOUNT(dp) \
 148  154          (3 * BYTES_PER_XDR_UNIT + DIRENT64_NAMELEN((dp)->d_reclen))
 149  155  
 150      -time_t rfs4_start_time;                 /* Initialized in rfs4_srvrinit */
      156 +zone_key_t      rfs4_zone_key;
 151  157  
 152      -static sysid_t lockt_sysid;             /* dummy sysid for all LOCKT calls */
      158 +static sysid_t          lockt_sysid;    /* dummy sysid for all LOCKT calls */
 153  159  
 154  160  u_longlong_t    nfs4_srv_caller_id;
 155  161  uint_t          nfs4_srv_vkey = 0;
 156  162  
 157      -verifier4       Write4verf;
 158      -verifier4       Readdir4verf;
 159      -
 160  163  void    rfs4_init_compound_state(struct compound_state *);
 161  164  
 162  165  static void     nullfree(caddr_t);
 163  166  static void     rfs4_op_inval(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 164      -                        struct compound_state *);
      167 +                    struct compound_state *);
 165  168  static void     rfs4_op_access(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 166      -                        struct compound_state *);
      169 +                    struct compound_state *);
 167  170  static void     rfs4_op_close(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 168      -                        struct compound_state *);
      171 +                    struct compound_state *);
 169  172  static void     rfs4_op_commit(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 170      -                        struct compound_state *);
      173 +                    struct compound_state *);
 171  174  static void     rfs4_op_create(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 172      -                        struct compound_state *);
      175 +                    struct compound_state *);
 173  176  static void     rfs4_op_create_free(nfs_resop4 *resop);
 174  177  static void     rfs4_op_delegreturn(nfs_argop4 *, nfs_resop4 *,
 175      -                        struct svc_req *, struct compound_state *);
      178 +                    struct svc_req *, struct compound_state *);
 176  179  static void     rfs4_op_delegpurge(nfs_argop4 *, nfs_resop4 *,
 177      -                        struct svc_req *, struct compound_state *);
      180 +                    struct svc_req *, struct compound_state *);
 178  181  static void     rfs4_op_getattr(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 179      -                        struct compound_state *);
      182 +                    struct compound_state *);
 180  183  static void     rfs4_op_getattr_free(nfs_resop4 *);
 181  184  static void     rfs4_op_getfh(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 182      -                        struct compound_state *);
      185 +                    struct compound_state *);
 183  186  static void     rfs4_op_getfh_free(nfs_resop4 *);
 184  187  static void     rfs4_op_illegal(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 185      -                        struct compound_state *);
      188 +                    struct compound_state *);
 186  189  static void     rfs4_op_link(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 187      -                        struct compound_state *);
      190 +                    struct compound_state *);
 188  191  static void     rfs4_op_lock(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 189      -                        struct compound_state *);
      192 +                    struct compound_state *);
 190  193  static void     lock_denied_free(nfs_resop4 *);
 191  194  static void     rfs4_op_locku(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 192      -                        struct compound_state *);
      195 +                    struct compound_state *);
 193  196  static void     rfs4_op_lockt(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 194      -                        struct compound_state *);
      197 +                    struct compound_state *);
 195  198  static void     rfs4_op_lookup(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 196      -                        struct compound_state *);
      199 +                    struct compound_state *);
 197  200  static void     rfs4_op_lookupp(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 198      -                        struct compound_state *);
      201 +                    struct compound_state *);
 199  202  static void     rfs4_op_openattr(nfs_argop4 *argop, nfs_resop4 *resop,
 200      -                                struct svc_req *req, struct compound_state *cs);
      203 +                    struct svc_req *req, struct compound_state *cs);
 201  204  static void     rfs4_op_nverify(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 202      -                        struct compound_state *);
      205 +                    struct compound_state *);
 203  206  static void     rfs4_op_open(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 204      -                        struct compound_state *);
      207 +                    struct compound_state *);
 205  208  static void     rfs4_op_open_confirm(nfs_argop4 *, nfs_resop4 *,
 206      -                        struct svc_req *, struct compound_state *);
      209 +                    struct svc_req *, struct compound_state *);
 207  210  static void     rfs4_op_open_downgrade(nfs_argop4 *, nfs_resop4 *,
 208      -                        struct svc_req *, struct compound_state *);
      211 +                    struct svc_req *, struct compound_state *);
 209  212  static void     rfs4_op_putfh(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 210      -                        struct compound_state *);
      213 +                    struct compound_state *);
 211  214  static void     rfs4_op_putpubfh(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 212      -                        struct compound_state *);
      215 +                    struct compound_state *);
 213  216  static void     rfs4_op_putrootfh(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 214      -                        struct compound_state *);
      217 +                    struct compound_state *);
 215  218  static void     rfs4_op_read(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 216      -                        struct compound_state *);
      219 +                    struct compound_state *);
 217  220  static void     rfs4_op_read_free(nfs_resop4 *);
 218  221  static void     rfs4_op_readdir_free(nfs_resop4 *resop);
 219  222  static void     rfs4_op_readlink(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 220      -                        struct compound_state *);
      223 +                    struct compound_state *);
 221  224  static void     rfs4_op_readlink_free(nfs_resop4 *);
 222  225  static void     rfs4_op_release_lockowner(nfs_argop4 *, nfs_resop4 *,
 223      -                        struct svc_req *, struct compound_state *);
      226 +                    struct svc_req *, struct compound_state *);
 224  227  static void     rfs4_op_remove(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 225      -                        struct compound_state *);
      228 +                    struct compound_state *);
 226  229  static void     rfs4_op_rename(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 227      -                        struct compound_state *);
      230 +                    struct compound_state *);
 228  231  static void     rfs4_op_renew(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 229      -                        struct compound_state *);
      232 +                    struct compound_state *);
 230  233  static void     rfs4_op_restorefh(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 231      -                        struct compound_state *);
      234 +                    struct compound_state *);
 232  235  static void     rfs4_op_savefh(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 233      -                        struct compound_state *);
      236 +                    struct compound_state *);
 234  237  static void     rfs4_op_setattr(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 235      -                        struct compound_state *);
      238 +                    struct compound_state *);
 236  239  static void     rfs4_op_verify(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 237      -                        struct compound_state *);
      240 +                    struct compound_state *);
 238  241  static void     rfs4_op_write(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 239      -                        struct compound_state *);
      242 +                    struct compound_state *);
 240  243  static void     rfs4_op_setclientid(nfs_argop4 *, nfs_resop4 *,
 241      -                        struct svc_req *, struct compound_state *);
      244 +                    struct svc_req *, struct compound_state *);
 242  245  static void     rfs4_op_setclientid_confirm(nfs_argop4 *, nfs_resop4 *,
 243      -                        struct svc_req *req, struct compound_state *);
      246 +                    struct svc_req *req, struct compound_state *);
 244  247  static void     rfs4_op_secinfo(nfs_argop4 *, nfs_resop4 *, struct svc_req *,
 245      -                        struct compound_state *);
      248 +                    struct compound_state *);
 246  249  static void     rfs4_op_secinfo_free(nfs_resop4 *);
 247  250  
 248      -static nfsstat4 check_open_access(uint32_t,
 249      -                                struct compound_state *, struct svc_req *);
 250      -nfsstat4 rfs4_client_sysid(rfs4_client_t *, sysid_t *);
 251      -void rfs4_ss_clid(rfs4_client_t *);
      251 +static nfsstat4 check_open_access(uint32_t, struct compound_state *,
      252 +                    struct svc_req *);
      253 +nfsstat4        rfs4_client_sysid(rfs4_client_t *, sysid_t *);
      254 +void            rfs4_ss_clid(nfs4_srv_t *, rfs4_client_t *);
 252  255  
      256 +
 253  257  /*
 254  258   * translation table for attrs
 255  259   */
 256  260  struct nfs4_ntov_table {
 257  261          union nfs4_attr_u *na;
 258  262          uint8_t amap[NFS4_MAXNUM_ATTRS];
 259  263          int attrcnt;
 260  264          bool_t vfsstat;
 261  265  };
 262  266  
 263  267  static void     nfs4_ntov_table_init(struct nfs4_ntov_table *ntovp);
 264  268  static void     nfs4_ntov_table_free(struct nfs4_ntov_table *ntovp,
 265      -                                    struct nfs4_svgetit_arg *sargp);
      269 +                    struct nfs4_svgetit_arg *sargp);
 266  270  
 267  271  static nfsstat4 do_rfs4_set_attrs(bitmap4 *resp, fattr4 *fattrp,
 268  272                      struct compound_state *cs, struct nfs4_svgetit_arg *sargp,
 269  273                      struct nfs4_ntov_table *ntovp, nfs4_attr_cmd_t cmd);
 270  274  
      275 +static void     hanfsv4_failover(nfs4_srv_t *);
      276 +
 271  277  fem_t           *deleg_rdops;
 272  278  fem_t           *deleg_wrops;
 273  279  
 274      -rfs4_servinst_t *rfs4_cur_servinst = NULL;      /* current server instance */
 275      -kmutex_t        rfs4_servinst_lock;     /* protects linked list */
 276      -int             rfs4_seen_first_compound;       /* set first time we see one */
 277      -
 278  280  /*
 279  281   * NFS4 op dispatch table
 280  282   */
 281  283  
 282  284  struct rfsv4disp {
 283  285          void    (*dis_proc)();          /* proc to call */
 284  286          void    (*dis_resfree)();       /* frees space allocated by proc */
 285  287          int     dis_flags;              /* RPC_IDEMPOTENT, etc... */
      288 +        int     op_type;                /* operation type, see below */
 286  289  };
 287  290  
      291 +/*
      292 + * operation types; used primarily for the per-exportinfo kstat implementation
      293 + */
      294 +#define NFS4_OP_NOFH    0       /* The operation does not operate with any */
      295 +                                /* particular filehandle; we cannot associate */
      296 +                                /* it with any exportinfo. */
      297 +
      298 +#define NFS4_OP_CFH     1       /* The operation works with the current */
      299 +                                /* filehandle; we associate the operation */
      300 +                                /* with the exportinfo related to the current */
      301 +                                /* filehandle (as set before the operation is */
      302 +                                /* executed). */
      303 +
      304 +#define NFS4_OP_SFH     2       /* The operation works with the saved */
      305 +                                /* filehandle; we associate the operation */
      306 +                                /* with the exportinfo related to the saved */
      307 +                                /* filehandle (as set before the operation is */
      308 +                                /* executed). */
      309 +
      310 +#define NFS4_OP_POSTCFH 3       /* The operation ignores the current */
      311 +                                /* filehandle, but sets the new current */
      312 +                                /* filehandle instead; we associate the */
      313 +                                /* operation with the exportinfo related to */
      314 +                                /* the current filehandle as set after the */
      315 +                                /* operation is successfully executed.  Since */
      316 +                                /* we do not know the particular exportinfo */
      317 +                                /* (and thus the kstat) before the operation */
      318 +                                /* is done, there is no simple way to */
      319 +                                /* update some I/O kstat statistics related */
      320 +                                /* to kstat_queue(9F). */
      321 +
 288  322  static struct rfsv4disp rfsv4disptab[] = {
 289  323          /*
 290  324           * NFS VERSION 4
 291  325           */
 292  326  
 293  327          /* RFS_NULL = 0 */
 294      -        {rfs4_op_illegal, nullfree, 0},
      328 +        {rfs4_op_illegal, nullfree, 0, NFS4_OP_NOFH},
 295  329  
 296  330          /* UNUSED = 1 */
 297      -        {rfs4_op_illegal, nullfree, 0},
      331 +        {rfs4_op_illegal, nullfree, 0, NFS4_OP_NOFH},
 298  332  
 299  333          /* UNUSED = 2 */
 300      -        {rfs4_op_illegal, nullfree, 0},
      334 +        {rfs4_op_illegal, nullfree, 0, NFS4_OP_NOFH},
 301  335  
 302  336          /* OP_ACCESS = 3 */
 303      -        {rfs4_op_access, nullfree, RPC_IDEMPOTENT},
      337 +        {rfs4_op_access, nullfree, RPC_IDEMPOTENT, NFS4_OP_CFH},
 304  338  
 305  339          /* OP_CLOSE = 4 */
 306      -        {rfs4_op_close, nullfree, 0},
      340 +        {rfs4_op_close, nullfree, 0, NFS4_OP_CFH},
 307  341  
 308  342          /* OP_COMMIT = 5 */
 309      -        {rfs4_op_commit, nullfree, RPC_IDEMPOTENT},
      343 +        {rfs4_op_commit, nullfree, RPC_IDEMPOTENT, NFS4_OP_CFH},
 310  344  
 311  345          /* OP_CREATE = 6 */
 312      -        {rfs4_op_create, nullfree, 0},
      346 +        {rfs4_op_create, nullfree, 0, NFS4_OP_CFH},
 313  347  
 314  348          /* OP_DELEGPURGE = 7 */
 315      -        {rfs4_op_delegpurge, nullfree, 0},
      349 +        {rfs4_op_delegpurge, nullfree, 0, NFS4_OP_NOFH},
 316  350  
 317  351          /* OP_DELEGRETURN = 8 */
 318      -        {rfs4_op_delegreturn, nullfree, 0},
      352 +        {rfs4_op_delegreturn, nullfree, 0, NFS4_OP_CFH},
 319  353  
 320  354          /* OP_GETATTR = 9 */
 321      -        {rfs4_op_getattr, rfs4_op_getattr_free, RPC_IDEMPOTENT},
      355 +        {rfs4_op_getattr, rfs4_op_getattr_free, RPC_IDEMPOTENT, NFS4_OP_CFH},
 322  356  
 323  357          /* OP_GETFH = 10 */
 324      -        {rfs4_op_getfh, rfs4_op_getfh_free, RPC_ALL},
      358 +        {rfs4_op_getfh, rfs4_op_getfh_free, RPC_ALL, NFS4_OP_CFH},
 325  359  
 326  360          /* OP_LINK = 11 */
 327      -        {rfs4_op_link, nullfree, 0},
      361 +        {rfs4_op_link, nullfree, 0, NFS4_OP_CFH},
 328  362  
 329  363          /* OP_LOCK = 12 */
 330      -        {rfs4_op_lock, lock_denied_free, 0},
      364 +        {rfs4_op_lock, lock_denied_free, 0, NFS4_OP_CFH},
 331  365  
 332  366          /* OP_LOCKT = 13 */
 333      -        {rfs4_op_lockt, lock_denied_free, 0},
      367 +        {rfs4_op_lockt, lock_denied_free, 0, NFS4_OP_CFH},
 334  368  
 335  369          /* OP_LOCKU = 14 */
 336      -        {rfs4_op_locku, nullfree, 0},
      370 +        {rfs4_op_locku, nullfree, 0, NFS4_OP_CFH},
 337  371  
 338  372          /* OP_LOOKUP = 15 */
 339      -        {rfs4_op_lookup, nullfree, (RPC_IDEMPOTENT | RPC_PUBLICFH_OK)},
      373 +        {rfs4_op_lookup, nullfree, (RPC_IDEMPOTENT | RPC_PUBLICFH_OK),
      374 +            NFS4_OP_CFH},
 340  375  
 341  376          /* OP_LOOKUPP = 16 */
 342      -        {rfs4_op_lookupp, nullfree, (RPC_IDEMPOTENT | RPC_PUBLICFH_OK)},
      377 +        {rfs4_op_lookupp, nullfree, (RPC_IDEMPOTENT | RPC_PUBLICFH_OK),
      378 +            NFS4_OP_CFH},
 343  379  
 344  380          /* OP_NVERIFY = 17 */
 345      -        {rfs4_op_nverify, nullfree, RPC_IDEMPOTENT},
      381 +        {rfs4_op_nverify, nullfree, RPC_IDEMPOTENT, NFS4_OP_CFH},
 346  382  
 347  383          /* OP_OPEN = 18 */
 348      -        {rfs4_op_open, rfs4_free_reply, 0},
      384 +        {rfs4_op_open, rfs4_free_reply, 0, NFS4_OP_CFH},
 349  385  
 350  386          /* OP_OPENATTR = 19 */
 351      -        {rfs4_op_openattr, nullfree, 0},
      387 +        {rfs4_op_openattr, nullfree, 0, NFS4_OP_CFH},
 352  388  
 353  389          /* OP_OPEN_CONFIRM = 20 */
 354      -        {rfs4_op_open_confirm, nullfree, 0},
      390 +        {rfs4_op_open_confirm, nullfree, 0, NFS4_OP_CFH},
 355  391  
 356  392          /* OP_OPEN_DOWNGRADE = 21 */
 357      -        {rfs4_op_open_downgrade, nullfree, 0},
      393 +        {rfs4_op_open_downgrade, nullfree, 0, NFS4_OP_CFH},
 358  394  
 359  395          /* OP_OPEN_PUTFH = 22 */
 360      -        {rfs4_op_putfh, nullfree, RPC_ALL},
      396 +        {rfs4_op_putfh, nullfree, RPC_ALL, NFS4_OP_POSTCFH},
 361  397  
 362  398          /* OP_PUTPUBFH = 23 */
 363      -        {rfs4_op_putpubfh, nullfree, RPC_ALL},
      399 +        {rfs4_op_putpubfh, nullfree, RPC_ALL, NFS4_OP_POSTCFH},
 364  400  
 365  401          /* OP_PUTROOTFH = 24 */
 366      -        {rfs4_op_putrootfh, nullfree, RPC_ALL},
      402 +        {rfs4_op_putrootfh, nullfree, RPC_ALL, NFS4_OP_POSTCFH},
 367  403  
 368  404          /* OP_READ = 25 */
 369      -        {rfs4_op_read, rfs4_op_read_free, RPC_IDEMPOTENT},
      405 +        {rfs4_op_read, rfs4_op_read_free, RPC_IDEMPOTENT, NFS4_OP_CFH},
 370  406  
 371  407          /* OP_READDIR = 26 */
 372      -        {rfs4_op_readdir, rfs4_op_readdir_free, RPC_IDEMPOTENT},
      408 +        {rfs4_op_readdir, rfs4_op_readdir_free, RPC_IDEMPOTENT, NFS4_OP_CFH},
 373  409  
 374  410          /* OP_READLINK = 27 */
 375      -        {rfs4_op_readlink, rfs4_op_readlink_free, RPC_IDEMPOTENT},
      411 +        {rfs4_op_readlink, rfs4_op_readlink_free, RPC_IDEMPOTENT, NFS4_OP_CFH},
 376  412  
 377  413          /* OP_REMOVE = 28 */
 378      -        {rfs4_op_remove, nullfree, 0},
      414 +        {rfs4_op_remove, nullfree, 0, NFS4_OP_CFH},
 379  415  
 380  416          /* OP_RENAME = 29 */
 381      -        {rfs4_op_rename, nullfree, 0},
      417 +        {rfs4_op_rename, nullfree, 0, NFS4_OP_CFH},
 382  418  
 383  419          /* OP_RENEW = 30 */
 384      -        {rfs4_op_renew, nullfree, 0},
      420 +        {rfs4_op_renew, nullfree, 0, NFS4_OP_NOFH},
 385  421  
 386  422          /* OP_RESTOREFH = 31 */
 387      -        {rfs4_op_restorefh, nullfree, RPC_ALL},
      423 +        {rfs4_op_restorefh, nullfree, RPC_ALL, NFS4_OP_SFH},
 388  424  
 389  425          /* OP_SAVEFH = 32 */
 390      -        {rfs4_op_savefh, nullfree, RPC_ALL},
      426 +        {rfs4_op_savefh, nullfree, RPC_ALL, NFS4_OP_CFH},
 391  427  
 392  428          /* OP_SECINFO = 33 */
 393      -        {rfs4_op_secinfo, rfs4_op_secinfo_free, 0},
      429 +        {rfs4_op_secinfo, rfs4_op_secinfo_free, 0, NFS4_OP_CFH},
 394  430  
 395  431          /* OP_SETATTR = 34 */
 396      -        {rfs4_op_setattr, nullfree, 0},
      432 +        {rfs4_op_setattr, nullfree, 0, NFS4_OP_CFH},
 397  433  
 398  434          /* OP_SETCLIENTID = 35 */
 399      -        {rfs4_op_setclientid, nullfree, 0},
      435 +        {rfs4_op_setclientid, nullfree, 0, NFS4_OP_NOFH},
 400  436  
 401  437          /* OP_SETCLIENTID_CONFIRM = 36 */
 402      -        {rfs4_op_setclientid_confirm, nullfree, 0},
      438 +        {rfs4_op_setclientid_confirm, nullfree, 0, NFS4_OP_NOFH},
 403  439  
 404  440          /* OP_VERIFY = 37 */
 405      -        {rfs4_op_verify, nullfree, RPC_IDEMPOTENT},
      441 +        {rfs4_op_verify, nullfree, RPC_IDEMPOTENT, NFS4_OP_CFH},
 406  442  
 407  443          /* OP_WRITE = 38 */
 408      -        {rfs4_op_write, nullfree, 0},
      444 +        {rfs4_op_write, nullfree, 0, NFS4_OP_CFH},
 409  445  
 410  446          /* OP_RELEASE_LOCKOWNER = 39 */
 411      -        {rfs4_op_release_lockowner, nullfree, 0},
      447 +        {rfs4_op_release_lockowner, nullfree, 0, NFS4_OP_NOFH},
 412  448  };
 413  449  
 414  450  static uint_t rfsv4disp_cnt = sizeof (rfsv4disptab) / sizeof (rfsv4disptab[0]);
 415  451  
 416  452  #define OP_ILLEGAL_IDX (rfsv4disp_cnt)
 417  453  
 418  454  #ifdef DEBUG
 419  455  
 420  456  int             rfs4_fillone_debug = 0;
 421  457  int             rfs4_no_stub_access = 1;
[37 lines elided]
 459  495          "rfs4_op_setattr",
 460  496          "rfs4_op_setclientid",
 461  497          "rfs4_op_setclient_confirm",
 462  498          "rfs4_op_verify",
 463  499          "rfs4_op_write",
 464  500          "rfs4_op_release_lockowner",
 465  501          "rfs4_op_illegal"
 466  502  };
 467  503  #endif
 468  504  
 469      -void    rfs4_ss_chkclid(rfs4_client_t *);
      505 +void    rfs4_ss_chkclid(nfs4_srv_t *, rfs4_client_t *);
 470  506  
 471  507  extern size_t   strlcpy(char *dst, const char *src, size_t dstsize);
 472  508  
 473  509  extern void     rfs4_free_fs_locations4(fs_locations4 *);
 474  510  
 475  511  #ifdef  nextdp
 476  512  #undef nextdp
 477  513  #endif
 478  514  #define nextdp(dp)      ((struct dirent64 *)((char *)(dp) + (dp)->d_reclen))
 479  515  
[12 lines elided]
 492  528          VOPNAME_READ,           { .femop_read = deleg_wr_read },
 493  529          VOPNAME_WRITE,          { .femop_write = deleg_wr_write },
 494  530          VOPNAME_SETATTR,        { .femop_setattr = deleg_wr_setattr },
 495  531          VOPNAME_RWLOCK,         { .femop_rwlock = deleg_wr_rwlock },
 496  532          VOPNAME_SPACE,          { .femop_space = deleg_wr_space },
 497  533          VOPNAME_SETSECATTR,     { .femop_setsecattr = deleg_wr_setsecattr },
 498  534          VOPNAME_VNEVENT,        { .femop_vnevent = deleg_wr_vnevent },
 499  535          NULL,                   NULL
 500  536  };
 501  537  
 502      -int
 503      -rfs4_srvrinit(void)
      538 +/* ARGSUSED */
      539 +static void *
      540 +rfs4_zone_init(zoneid_t zoneid)
 504  541  {
      542 +        nfs4_srv_t *nsrv4;
 505  543          timespec32_t verf;
 506      -        int error;
 507      -        extern void rfs4_attr_init();
 508      -        extern krwlock_t rfs4_deleg_policy_lock;
 509  544  
      545 +        nsrv4 = kmem_zalloc(sizeof (*nsrv4), KM_SLEEP);
      546 +
 510  547          /*
 511  548           * The following algorithm attempts to find a unique verifier
 512  549           * to be used as the write verifier returned from the server
 513  550           * to the client.  It is important that this verifier change
 514  551           * whenever the server reboots.  Of secondary importance, it
 515  552           * is important for the verifier to be unique between two
 516  553           * different servers.
 517  554           *
 518  555           * Thus, an attempt is made to use the system hostid and the
 519  556           * current time in seconds when the nfssrv kernel module is
[8 lines elided]
 528  565          verf.tv_sec = (time_t)zone_get_hostid(NULL);
 529  566          if (verf.tv_sec != 0) {
 530  567                  verf.tv_nsec = gethrestime_sec();
 531  568          } else {
 532  569                  timespec_t tverf;
 533  570  
 534  571                  gethrestime(&tverf);
 535  572                  verf.tv_sec = (time_t)tverf.tv_sec;
 536  573                  verf.tv_nsec = tverf.tv_nsec;
 537  574          }
      575 +        nsrv4->write4verf = *(uint64_t *)&verf;
 538  576  
 539      -        Write4verf = *(uint64_t *)&verf;
      577 +        /* Used to manage create/destroy of server state */
      578 +        nsrv4->nfs4_server_state = NULL;
      579 +        nsrv4->nfs4_cur_servinst = NULL;
      580 +        nsrv4->nfs4_deleg_policy = SRV_NEVER_DELEGATE;
      581 +        mutex_init(&nsrv4->deleg_lock, NULL, MUTEX_DEFAULT, NULL);
      582 +        mutex_init(&nsrv4->state_lock, NULL, MUTEX_DEFAULT, NULL);
      583 +        mutex_init(&nsrv4->servinst_lock, NULL, MUTEX_DEFAULT, NULL);
      584 +        rw_init(&nsrv4->deleg_policy_lock, NULL, RW_DEFAULT, NULL);
 540  585  
 541      -        rfs4_attr_init();
 542      -        mutex_init(&rfs4_deleg_lock, NULL, MUTEX_DEFAULT, NULL);
      586 +        return (nsrv4);
      587 +}
 543  588  
 544      -        /* Used to manage create/destroy of server state */
 545      -        mutex_init(&rfs4_state_lock, NULL, MUTEX_DEFAULT, NULL);
      589 +/* ARGSUSED */
      590 +static void
      591 +rfs4_zone_fini(zoneid_t zoneid, void *data)
      592 +{
      593 +        nfs4_srv_t *nsrv4 = data;
 546  594  
 547      -        /* Used to manage access to server instance linked list */
 548      -        mutex_init(&rfs4_servinst_lock, NULL, MUTEX_DEFAULT, NULL);
      595 +        mutex_destroy(&nsrv4->deleg_lock);
      596 +        mutex_destroy(&nsrv4->state_lock);
      597 +        mutex_destroy(&nsrv4->servinst_lock);
      598 +        rw_destroy(&nsrv4->deleg_policy_lock);
 549  599  
 550      -        /* Used to manage access to rfs4_deleg_policy */
 551      -        rw_init(&rfs4_deleg_policy_lock, NULL, RW_DEFAULT, NULL);
      600 +        kmem_free(nsrv4, sizeof (*nsrv4));
      601 +}
 552  602  
 553      -        error = fem_create("deleg_rdops", nfs4_rd_deleg_tmpl, &deleg_rdops);
 554      -        if (error != 0) {
      603 +void
      604 +rfs4_srvrinit(void)
      605 +{
      606 +        extern void rfs4_attr_init();
      607 +
      608 +        zone_key_create(&rfs4_zone_key, rfs4_zone_init, NULL, rfs4_zone_fini);
      609 +
      610 +        rfs4_attr_init();
      611 +
      612 +
      613 +        if (fem_create("deleg_rdops", nfs4_rd_deleg_tmpl, &deleg_rdops) != 0) {
 555  614                  rfs4_disable_delegation();
 556      -        } else {
 557      -                error = fem_create("deleg_wrops", nfs4_wr_deleg_tmpl,
 558      -                    &deleg_wrops);
 559      -                if (error != 0) {
 560      -                        rfs4_disable_delegation();
 561      -                        fem_free(deleg_rdops);
 562      -                }
      615 +        } else if (fem_create("deleg_wrops", nfs4_wr_deleg_tmpl,
      616 +            &deleg_wrops) != 0) {
      617 +                rfs4_disable_delegation();
      618 +                fem_free(deleg_rdops);
 563  619          }
 564  620  
 565  621          nfs4_srv_caller_id = fs_new_caller_id();
 566      -
 567  622          lockt_sysid = lm_alloc_sysidt();
 568      -
 569  623          vsd_create(&nfs4_srv_vkey, NULL);
 570      -
 571      -        return (0);
      624 +        rfs4_state_g_init();
 572  625  }
 573  626  
 574  627  void
 575  628  rfs4_srvrfini(void)
 576  629  {
 577      -        extern krwlock_t rfs4_deleg_policy_lock;
 578      -
 579  630          if (lockt_sysid != LM_NOSYSID) {
 580  631                  lm_free_sysidt(lockt_sysid);
 581  632                  lockt_sysid = LM_NOSYSID;
 582  633          }
 583  634  
 584      -        mutex_destroy(&rfs4_deleg_lock);
 585      -        mutex_destroy(&rfs4_state_lock);
 586      -        rw_destroy(&rfs4_deleg_policy_lock);
      635 +        rfs4_state_g_fini();
 587  636  
 588  637          fem_free(deleg_rdops);
 589  638          fem_free(deleg_wrops);
      639 +
      640 +        (void) zone_key_delete(rfs4_zone_key);
 590  641  }
 591  642  
 592  643  void
      644 +rfs4_do_server_start(int server_upordown,
      645 +    int srv_delegation, int cluster_booted)
      646 +{
      647 +        nfs4_srv_t *nsrv4 = zone_getspecific(rfs4_zone_key, curzone);
      648 +
      649 +        /* Is this a warm start? */
      650 +        if (server_upordown == NFS_SERVER_QUIESCED) {
      651 +                cmn_err(CE_NOTE, "nfs4_srv: "
      652 +                    "server was previously quiesced; "
      653 +                    "existing NFSv4 state will be re-used");
      654 +
      655 +                /*
      656 +                 * HA-NFSv4: this is also the signal
      657 +                 * that a Resource Group failover has
      658 +                 * occurred.
      659 +                 */
      660 +                if (cluster_booted)
      661 +                        hanfsv4_failover(nsrv4);
      662 +        } else {
      663 +                /* Cold start */
      664 +                nsrv4->rfs4_start_time = 0;
      665 +                rfs4_state_zone_init(nsrv4);
      666 +                nsrv4->nfs4_drc = rfs4_init_drc(nfs4_drc_max,
      667 +                    nfs4_drc_hash);
      668 +        }
      669 +
      670 +        /* Check if delegation is to be enabled */
      671 +        if (srv_delegation != FALSE)
      672 +                rfs4_set_deleg_policy(nsrv4, SRV_NORMAL_DELEGATE);
      673 +}
      674 +
      675 +void
 593  676  rfs4_init_compound_state(struct compound_state *cs)
 594  677  {
 595  678          bzero(cs, sizeof (*cs));
 596  679          cs->cont = TRUE;
 597  680          cs->access = CS_ACCESS_DENIED;
 598  681          cs->deleg = FALSE;
 599  682          cs->mandlock = FALSE;
 600  683          cs->fh.nfs_fh4_val = cs->fhbuf;
      684 +        cs->statusp = NULL;
 601  685  }
 602  686  
 603  687  void
 604  688  rfs4_grace_start(rfs4_servinst_t *sip)
 605  689  {
 606  690          rw_enter(&sip->rwlock, RW_WRITER);
 607  691          sip->start_time = (time_t)TICK_TO_SEC(ddi_get_lbolt());
 608  692          sip->grace_period = rfs4_grace_period;
 609  693          rw_exit(&sip->rwlock);
 610  694  }
34 lines elided
 645  729  {
 646  730          ASSERT(rfs4_dbe_refcnt(cp->rc_dbe) > 0);
 647  731  
 648  732          return (rfs4_servinst_in_grace(cp->rc_server_instance));
 649  733  }
 650  734  
 651  735  /*
 652  736   * reset all currently active grace periods
 653  737   */
 654  738  void
 655      -rfs4_grace_reset_all(void)
      739 +rfs4_grace_reset_all(nfs4_srv_t *nsrv4)
 656  740  {
 657  741          rfs4_servinst_t *sip;
 658  742  
 659      -        mutex_enter(&rfs4_servinst_lock);
 660      -        for (sip = rfs4_cur_servinst; sip != NULL; sip = sip->prev)
      743 +        mutex_enter(&nsrv4->servinst_lock);
      744 +        for (sip = nsrv4->nfs4_cur_servinst; sip != NULL; sip = sip->prev)
 661  745                  if (rfs4_servinst_in_grace(sip))
 662  746                          rfs4_grace_start(sip);
 663      -        mutex_exit(&rfs4_servinst_lock);
      747 +        mutex_exit(&nsrv4->servinst_lock);
 664  748  }
 665  749  
 666  750  /*
 667  751   * start any new instances' grace periods
 668  752   */
 669  753  void
 670      -rfs4_grace_start_new(void)
      754 +rfs4_grace_start_new(nfs4_srv_t *nsrv4)
 671  755  {
 672  756          rfs4_servinst_t *sip;
 673  757  
 674      -        mutex_enter(&rfs4_servinst_lock);
 675      -        for (sip = rfs4_cur_servinst; sip != NULL; sip = sip->prev)
      758 +        mutex_enter(&nsrv4->servinst_lock);
      759 +        for (sip = nsrv4->nfs4_cur_servinst; sip != NULL; sip = sip->prev)
 676  760                  if (rfs4_servinst_grace_new(sip))
 677  761                          rfs4_grace_start(sip);
 678      -        mutex_exit(&rfs4_servinst_lock);
      762 +        mutex_exit(&nsrv4->servinst_lock);
 679  763  }
 680  764  
 681  765  static rfs4_dss_path_t *
 682      -rfs4_dss_newpath(rfs4_servinst_t *sip, char *path, unsigned index)
      766 +rfs4_dss_newpath(nfs4_srv_t *nsrv4, rfs4_servinst_t *sip,
      767 +    char *path, unsigned index)
 683  768  {
 684  769          size_t len;
 685  770          rfs4_dss_path_t *dss_path;
 686  771  
 687  772          dss_path = kmem_alloc(sizeof (rfs4_dss_path_t), KM_SLEEP);
 688  773  
 689  774          /*
 690  775           * Take a copy of the string, since the original may be overwritten.
 691  776           * Sadly, no strdup() in the kernel.
 692  777           */
3 lines elided
 696  781          (void) strlcpy(dss_path->path, path, len);
 697  782  
 698  783          /* associate with servinst */
 699  784          dss_path->sip = sip;
 700  785          dss_path->index = index;
 701  786  
 702  787          /*
 703  788           * Add to list of served paths.
 704  789           * No locking required, as we're only ever called at startup.
 705  790           */
 706      -        if (rfs4_dss_pathlist == NULL) {
      791 +        if (nsrv4->dss_pathlist == NULL) {
 707  792                  /* this is the first dss_path_t */
 708  793  
 709  794                  /* needed for insque/remque */
 710  795                  dss_path->next = dss_path->prev = dss_path;
 711  796  
 712      -                rfs4_dss_pathlist = dss_path;
      797 +                nsrv4->dss_pathlist = dss_path;
 713  798          } else {
 714      -                insque(dss_path, rfs4_dss_pathlist);
      799 +                insque(dss_path, nsrv4->dss_pathlist);
 715  800          }
 716  801  
 717  802          return (dss_path);
 718  803  }
 719  804  
 720  805  /*
 721  806   * Create a new server instance, and make it the currently active instance.
 722  807   * Note that starting the grace period too early will reduce the clients'
 723  808   * recovery window.
 724  809   */
 725  810  void
 726      -rfs4_servinst_create(int start_grace, int dss_npaths, char **dss_paths)
      811 +rfs4_servinst_create(nfs4_srv_t *nsrv4, int start_grace,
      812 +    int dss_npaths, char **dss_paths)
 727  813  {
 728  814          unsigned i;
 729  815          rfs4_servinst_t *sip;
 730  816          rfs4_oldstate_t *oldstate;
 731  817  
 732  818          sip = kmem_alloc(sizeof (rfs4_servinst_t), KM_SLEEP);
 733  819          rw_init(&sip->rwlock, NULL, RW_DEFAULT, NULL);
 734  820  
 735  821          sip->start_time = (time_t)0;
 736  822          sip->grace_period = (time_t)0;
10 lines elided
 747  833          oldstate->next = oldstate;
 748  834          oldstate->prev = oldstate;
 749  835          sip->oldstate = oldstate;
 750  836  
 751  837  
 752  838          sip->dss_npaths = dss_npaths;
 753  839          sip->dss_paths = kmem_alloc(dss_npaths *
 754  840              sizeof (rfs4_dss_path_t *), KM_SLEEP);
 755  841  
 756  842          for (i = 0; i < dss_npaths; i++) {
 757      -                sip->dss_paths[i] = rfs4_dss_newpath(sip, dss_paths[i], i);
      843 +                /* CSTYLED */
      844 +                sip->dss_paths[i] = rfs4_dss_newpath(nsrv4, sip, dss_paths[i], i);
 758  845          }
 759  846  
 760      -        mutex_enter(&rfs4_servinst_lock);
 761      -        if (rfs4_cur_servinst != NULL) {
      847 +        mutex_enter(&nsrv4->servinst_lock);
      848 +        if (nsrv4->nfs4_cur_servinst != NULL) {
 762  849                  /* add to linked list */
 763      -                sip->prev = rfs4_cur_servinst;
 764      -                rfs4_cur_servinst->next = sip;
      850 +                sip->prev = nsrv4->nfs4_cur_servinst;
      851 +                nsrv4->nfs4_cur_servinst->next = sip;
 765  852          }
 766  853          if (start_grace)
 767  854                  rfs4_grace_start(sip);
 768  855          /* make the new instance "current" */
 769      -        rfs4_cur_servinst = sip;
      856 +        nsrv4->nfs4_cur_servinst = sip;
 770  857  
 771      -        mutex_exit(&rfs4_servinst_lock);
      858 +        mutex_exit(&nsrv4->servinst_lock);
 772  859  }
 773  860  
 774  861  /*
 775  862   * In future, we might add a rfs4_servinst_destroy(sip) but, for now, destroy
 776  863   * all instances directly.
 777  864   */
 778  865  void
 779      -rfs4_servinst_destroy_all(void)
      866 +rfs4_servinst_destroy_all(nfs4_srv_t *nsrv4)
 780  867  {
 781  868          rfs4_servinst_t *sip, *prev, *current;
 782  869  #ifdef DEBUG
 783  870          int n = 0;
 784  871  #endif
 785  872  
 786      -        mutex_enter(&rfs4_servinst_lock);
 787      -        ASSERT(rfs4_cur_servinst != NULL);
 788      -        current = rfs4_cur_servinst;
 789      -        rfs4_cur_servinst = NULL;
      873 +        mutex_enter(&nsrv4->servinst_lock);
      874 +        ASSERT(nsrv4->nfs4_cur_servinst != NULL);
      875 +        current = nsrv4->nfs4_cur_servinst;
      876 +        nsrv4->nfs4_cur_servinst = NULL;
 790  877          for (sip = current; sip != NULL; sip = prev) {
 791  878                  prev = sip->prev;
 792  879                  rw_destroy(&sip->rwlock);
 793  880                  if (sip->oldstate)
 794  881                          kmem_free(sip->oldstate, sizeof (rfs4_oldstate_t));
 795  882                  if (sip->dss_paths)
 796  883                          kmem_free(sip->dss_paths,
 797  884                              sip->dss_npaths * sizeof (rfs4_dss_path_t *));
 798  885                  kmem_free(sip, sizeof (rfs4_servinst_t));
 799  886  #ifdef DEBUG
 800  887                  n++;
 801  888  #endif
 802  889          }
 803      -        mutex_exit(&rfs4_servinst_lock);
      890 +        mutex_exit(&nsrv4->servinst_lock);
 804  891  }
 805  892  
 806  893  /*
 807  894   * Assign the current server instance to a client_t.
 808  895   * Should be called with cp->rc_dbe held.
 809  896   */
 810  897  void
 811      -rfs4_servinst_assign(rfs4_client_t *cp, rfs4_servinst_t *sip)
      898 +rfs4_servinst_assign(nfs4_srv_t *nsrv4, rfs4_client_t *cp,
      899 +    rfs4_servinst_t *sip)
 812  900  {
 813  901          ASSERT(rfs4_dbe_refcnt(cp->rc_dbe) > 0);
 814  902  
 815  903          /*
 816  904           * The lock ensures that if the current instance is in the process
 817  905           * of changing, we will see the new one.
 818  906           */
 819      -        mutex_enter(&rfs4_servinst_lock);
      907 +        mutex_enter(&nsrv4->servinst_lock);
 820  908          cp->rc_server_instance = sip;
 821      -        mutex_exit(&rfs4_servinst_lock);
      909 +        mutex_exit(&nsrv4->servinst_lock);
 822  910  }
 823  911  
 824  912  rfs4_servinst_t *
 825  913  rfs4_servinst(rfs4_client_t *cp)
 826  914  {
 827  915          ASSERT(rfs4_dbe_refcnt(cp->rc_dbe) > 0);
 828  916  
 829  917          return (cp->rc_server_instance);
 830  918  }
 831  919  
40 lines elided
 872  960          int error, different_export = 0;
 873  961          vnode_t *dvp, *vp;
 874  962          struct exportinfo *exi = NULL;
 875  963          fid_t fid;
 876  964          uint_t count, i;
 877  965          secinfo4 *resok_val;
 878  966          struct secinfo *secp;
 879  967          seconfig_t *si;
 880  968          bool_t did_traverse = FALSE;
 881  969          int dotdot, walk;
      970 +        nfs_export_t *ne = nfs_get_export();
 882  971  
 883  972          dvp = cs->vp;
 884  973          dotdot = (nm[0] == '.' && nm[1] == '.' && nm[2] == '\0');
 885  974  
 886  975          /*
 887  976           * If dotdotting, then need to check whether it's above the
 888  977           * root of a filesystem, or above an export point.
 889  978           */
 890  979          if (dotdot) {
 891  980  
1 line elided
 893  982                   * If dotdotting at the root of a filesystem, then
 894  983                   * need to traverse back to the mounted-on filesystem
 895  984                   * and do the dotdot lookup there.
 896  985                   */
 897  986                  if (cs->vp->v_flag & VROOT) {
 898  987  
 899  988                          /*
 900  989                           * If at the system root, then can
 901  990                           * go up no further.
 902  991                           */
 903      -                        if (VN_CMP(dvp, rootdir))
      992 +                        if (VN_CMP(dvp, ZONE_ROOTVP()))
 904  993                                  return (puterrno4(ENOENT));
 905  994  
 906  995                          /*
 907  996                           * Traverse back to the mounted-on filesystem
 908  997                           */
 909  998                          dvp = untraverse(cs->vp);
 910  999  
 911 1000                          /*
 912 1001                           * Set the different_export flag so we remember
 913 1002                           * to pick up a new exportinfo entry for
96 lines elided
1010 1099  
1011 1100  
1012 1101          /*
1013 1102           * Create the secinfo result based on the security information
1014 1103           * from the exportinfo structure (exi).
1015 1104           *
1016 1105           * Return all flavors for a pseudo node.
1017 1106           * For a real export node, return the flavor that the client
1018 1107           * has access with.
1019 1108           */
1020      -        ASSERT(RW_LOCK_HELD(&exported_lock));
     1109 +        ASSERT(RW_LOCK_HELD(&ne->exported_lock));
1021 1110          if (PSEUDO(exi)) {
1022 1111                  count = exi->exi_export.ex_seccnt; /* total sec count */
1023 1112                  resok_val = kmem_alloc(count * sizeof (secinfo4), KM_SLEEP);
1024 1113                  secp = exi->exi_export.ex_secinfo;
1025 1114  
1026 1115                  for (i = 0; i < count; i++) {
1027 1116                          si = &secp[i].s_secinfo;
1028 1117                          resok_val[i].flavor = si->sc_rpcnum;
1029 1118                          if (resok_val[i].flavor == RPCSEC_GSS) {
1030 1119                                  rpcsec_gss_info *info;
342 lines elided
1373 1462  static void
1374 1463  rfs4_op_commit(nfs_argop4 *argop, nfs_resop4 *resop, struct svc_req *req,
1375 1464      struct compound_state *cs)
1376 1465  {
1377 1466          COMMIT4args *args = &argop->nfs_argop4_u.opcommit;
1378 1467          COMMIT4res *resp = &resop->nfs_resop4_u.opcommit;
1379 1468          int error;
1380 1469          vnode_t *vp = cs->vp;
1381 1470          cred_t *cr = cs->cr;
1382 1471          vattr_t va;
     1472 +        nfs4_srv_t *nsrv4;
1383 1473  
1384 1474          DTRACE_NFSV4_2(op__commit__start, struct compound_state *, cs,
1385 1475              COMMIT4args *, args);
1386 1476  
1387 1477          if (vp == NULL) {
1388 1478                  *cs->statusp = resp->status = NFS4ERR_NOFILEHANDLE;
1389 1479                  goto out;
1390 1480          }
1391 1481          if (cs->access == CS_ACCESS_DENIED) {
1392 1482                  *cs->statusp = resp->status = NFS4ERR_ACCESS;
36 lines elided
1429 1519                  goto out;
1430 1520          }
1431 1521  
1432 1522          error = VOP_FSYNC(vp, FSYNC, cr, NULL);
1433 1523  
1434 1524          if (error) {
1435 1525                  *cs->statusp = resp->status = puterrno4(error);
1436 1526                  goto out;
1437 1527          }
1438 1528  
     1529 +        nsrv4 = zone_getspecific(rfs4_zone_key, curzone);
1439 1530          *cs->statusp = resp->status = NFS4_OK;
1440      -        resp->writeverf = Write4verf;
     1531 +        resp->writeverf = nsrv4->write4verf;
1441 1532  out:
1442 1533          DTRACE_NFSV4_2(op__commit__done, struct compound_state *, cs,
1443 1534              COMMIT4res *, resp);
1444 1535  }
1445 1536  
1446 1537  /*
1447 1538   * rfs4_op_mknod is called from rfs4_op_create after all initial verification
1448 1539   * was completed. It does the nfsv4 create for special files.
1449 1540   */
1450 1541  /* ARGSUSED */
1187 lines elided
2638 2729                   * If dotdotting at the root of a filesystem, then
2639 2730                   * need to traverse back to the mounted-on filesystem
2640 2731                   * and do the dotdot lookup there.
2641 2732                   */
2642 2733                  if (cs->vp->v_flag & VROOT) {
2643 2734  
2644 2735                          /*
2645 2736                           * If at the system root, then can
2646 2737                           * go up no further.
2647 2738                           */
2648      -                        if (VN_CMP(cs->vp, rootdir))
     2739 +                        if (VN_CMP(cs->vp, ZONE_ROOTVP()))
2649 2740                                  return (puterrno4(ENOENT));
2650 2741  
2651 2742                          /*
2652 2743                           * Traverse back to the mounted-on filesystem
2653 2744                           */
2654 2745                          cs->vp = untraverse(cs->vp);
2655 2746  
2656 2747                          /*
2657 2748                           * Set the different_export flag so we remember
2658 2749                           * to pick up a new exportinfo entry for
743 lines elided
3402 3493  /* ARGSUSED */
3403 3494  static void
3404 3495  rfs4_op_putpubfh(nfs_argop4 *args, nfs_resop4 *resop, struct svc_req *req,
3405 3496      struct compound_state *cs)
3406 3497  {
3407 3498          PUTPUBFH4res    *resp = &resop->nfs_resop4_u.opputpubfh;
3408 3499          int             error;
3409 3500          vnode_t         *vp;
3410 3501          struct exportinfo *exi, *sav_exi;
3411 3502          nfs_fh4_fmt_t   *fh_fmtp;
     3503 +        nfs_export_t *ne = nfs_get_export();
3412 3504  
3413 3505          DTRACE_NFSV4_1(op__putpubfh__start, struct compound_state *, cs);
3414 3506  
3415 3507          if (cs->vp) {
3416 3508                  VN_RELE(cs->vp);
3417 3509                  cs->vp = NULL;
3418 3510          }
3419 3511  
3420 3512          if (cs->cr)
3421 3513                  crfree(cs->cr);
3422 3514  
3423 3515          cs->cr = crdup(cs->basecr);
3424 3516  
3425      -        vp = exi_public->exi_vp;
     3517 +        vp = ne->exi_public->exi_vp;
3426 3518          if (vp == NULL) {
3427 3519                  *cs->statusp = resp->status = NFS4ERR_SERVERFAULT;
3428 3520                  goto out;
3429 3521          }
3430 3522  
3431      -        error = makefh4(&cs->fh, vp, exi_public);
     3523 +        error = makefh4(&cs->fh, vp, ne->exi_public);
3432 3524          if (error != 0) {
3433 3525                  *cs->statusp = resp->status = puterrno4(error);
3434 3526                  goto out;
3435 3527          }
3436 3528          sav_exi = cs->exi;
3437      -        if (exi_public == exi_root) {
     3529 +        if (ne->exi_public == ne->exi_root) {
3438 3530                  /*
3439 3531                   * No filesystem is actually shared public, so we default
3440 3532                   * to exi_root. In this case, we must check whether root
3441 3533                   * is exported.
3442 3534                   */
3443 3535                  fh_fmtp = (nfs_fh4_fmt_t *)cs->fh.nfs_fh4_val;
3444 3536  
3445 3537                  /*
3446 3538                   * if root filesystem is exported, the exportinfo struct that we
3447 3539                   * should use is what checkexport4 returns, because root_exi is
3448 3540                   * actually a mostly empty struct.
3449 3541                   */
3450 3542                  exi = checkexport4(&fh_fmtp->fh4_fsid,
3451 3543                      (fid_t *)&fh_fmtp->fh4_xlen, NULL);
3452      -                cs->exi = ((exi != NULL) ? exi : exi_public);
     3544 +                cs->exi = ((exi != NULL) ? exi : ne->exi_public);
3453 3545          } else {
3454 3546                  /*
3455 3547                   * it's a properly shared filesystem
3456 3548                   */
3457      -                cs->exi = exi_public;
     3549 +                cs->exi = ne->exi_public;
3458 3550          }
3459 3551  
3460 3552          if (is_system_labeled()) {
3461 3553                  bslabel_t *clabel;
3462 3554  
3463 3555                  ASSERT(req->rq_label != NULL);
3464 3556                  clabel = req->rq_label;
3465 3557                  DTRACE_PROBE2(tx__rfs4__log__info__opputpubfh__clabel, char *,
3466 3558                      "got client label from request(1)",
3467 3559                      struct svc_req *, req);
54 lines elided
3522 3614          if (cs->vp) {
3523 3615                  VN_RELE(cs->vp);
3524 3616                  cs->vp = NULL;
3525 3617          }
3526 3618  
3527 3619          if (cs->cr) {
3528 3620                  crfree(cs->cr);
3529 3621                  cs->cr = NULL;
3530 3622          }
3531 3623  
3532      -
3533 3624          if (args->object.nfs_fh4_len < NFS_FH4_LEN) {
3534 3625                  *cs->statusp = resp->status = NFS4ERR_BADHANDLE;
3535 3626                  goto out;
3536 3627          }
3537 3628  
3538 3629          fh_fmtp = (nfs_fh4_fmt_t *)args->object.nfs_fh4_val;
3539 3630          cs->exi = checkexport4(&fh_fmtp->fh4_fsid, (fid_t *)&fh_fmtp->fh4_xlen,
3540 3631              NULL);
3541 3632  
3542 3633          if (cs->exi == NULL) {
46 lines elided
3589 3680                  crfree(cs->cr);
3590 3681  
3591 3682          cs->cr = crdup(cs->basecr);
3592 3683  
3593 3684          /*
3594 3685           * Using rootdir, the system root vnode,
3595 3686           * get its fid.
3596 3687           */
3597 3688          bzero(&fid, sizeof (fid));
3598 3689          fid.fid_len = MAXFIDSZ;
3599      -        error = vop_fid_pseudo(rootdir, &fid);
     3690 +        error = vop_fid_pseudo(ZONE_ROOTVP(), &fid);
3600 3691          if (error != 0) {
3601 3692                  *cs->statusp = resp->status = puterrno4(error);
3602 3693                  goto out;
3603 3694          }
3604 3695  
3605 3696          /*
 3606 3697           * Then use the root fsid & fid to find out if it's exported
3607 3698           *
3608 3699           * If the server root isn't exported directly, then
3609 3700           * it should at least be a pseudo export based on
3610 3701           * one or more exports further down in the server's
3611 3702           * file tree.
3612 3703           */
3613      -        exi = checkexport4(&rootdir->v_vfsp->vfs_fsid, &fid, NULL);
     3704 +        exi = checkexport4(&ZONE_ROOTVP()->v_vfsp->vfs_fsid, &fid, NULL);
3614 3705          if (exi == NULL || exi->exi_export.ex_flags & EX_PUBLIC) {
3615 3706                  NFS4_DEBUG(rfs4_debug,
3616 3707                      (CE_WARN, "rfs4_op_putrootfh: export check failure"));
3617 3708                  *cs->statusp = resp->status = NFS4ERR_SERVERFAULT;
3618 3709                  goto out;
3619 3710          }
3620 3711  
3621 3712          /*
3622 3713           * Now make a filehandle based on the root
3623 3714           * export and root vnode.
3624 3715           */
3625      -        error = makefh4(&cs->fh, rootdir, exi);
     3716 +        error = makefh4(&cs->fh, ZONE_ROOTVP(), exi);
3626 3717          if (error != 0) {
3627 3718                  *cs->statusp = resp->status = puterrno4(error);
3628 3719                  goto out;
3629 3720          }
3630 3721  
3631 3722          sav_exi = cs->exi;
3632 3723          cs->exi = exi;
3633 3724  
3634      -        VN_HOLD(rootdir);
3635      -        cs->vp = rootdir;
     3725 +        VN_HOLD(ZONE_ROOTVP());
     3726 +        cs->vp = ZONE_ROOTVP();
3636 3727  
3637 3728          if ((resp->status = call_checkauth4(cs, req)) != NFS4_OK) {
3638      -                VN_RELE(rootdir);
     3729 +                VN_RELE(cs->vp);
3639 3730                  cs->vp = NULL;
3640 3731                  cs->exi = sav_exi;
3641 3732                  goto out;
3642 3733          }
3643 3734  
3644 3735          *cs->statusp = resp->status = NFS4_OK;
3645 3736          cs->deleg = FALSE;
3646 3737  out:
3647 3738          DTRACE_NFSV4_2(op__putrootfh__done, struct compound_state *, cs,
3648 3739              PUTROOTFH4res *, resp);
590 lines elided
4239 4330                  if (vn_ismntpt(vp)) {
4240 4331                          error = EACCES;
4241 4332                  } else {
4242 4333                          /*
4243 4334                           * System V defines rmdir to return EEXIST,
4244 4335                           * not ENOTEMPTY, if the directory is not
4245 4336                           * empty.  A System V NFS server needs to map
4246 4337                           * NFS4ERR_EXIST to NFS4ERR_NOTEMPTY to
4247 4338                           * transmit over the wire.
4248 4339                           */
4249      -                        if ((error = VOP_RMDIR(dvp, name, rootdir, cs->cr,
     4340 +                        if ((error = VOP_RMDIR(dvp, name, ZONE_ROOTVP(), cs->cr,
4250 4341                              NULL, 0)) == EEXIST)
4251 4342                                  error = ENOTEMPTY;
4252 4343                  }
4253 4344          } else {
4254 4345                  if ((error = VOP_REMOVE(dvp, name, cs->cr, NULL, 0)) == 0 &&
4255 4346                      fp != NULL) {
4256 4347                          struct vattr va;
4257 4348                          vnode_t *tvp;
4258 4349  
4259 4350                          rfs4_dbe_lock(fp->rf_dbe);
91 lines elided
4351 4442  /* ARGSUSED */
4352 4443  static void
4353 4444  rfs4_op_rename(nfs_argop4 *argop, nfs_resop4 *resop, struct svc_req *req,
4354 4445      struct compound_state *cs)
4355 4446  {
4356 4447          RENAME4args *args = &argop->nfs_argop4_u.oprename;
4357 4448          RENAME4res *resp = &resop->nfs_resop4_u.oprename;
4358 4449          int error;
4359 4450          vnode_t *odvp;
4360 4451          vnode_t *ndvp;
4361      -        vnode_t *srcvp, *targvp;
     4452 +        vnode_t *srcvp, *targvp, *tvp;
4362 4453          struct vattr obdva, oidva, oadva;
4363 4454          struct vattr nbdva, nidva, nadva;
4364 4455          char *onm, *nnm;
4365 4456          uint_t olen, nlen;
4366 4457          rfs4_file_t *fp, *sfp;
4367 4458          int in_crit_src, in_crit_targ;
4368 4459          int fp_rele_grant_hold, sfp_rele_grant_hold;
     4460 +        int unlinked;
4369 4461          bslabel_t *clabel;
4370 4462          struct sockaddr *ca;
4371 4463          char *converted_onm = NULL;
4372 4464          char *converted_nnm = NULL;
4373 4465          nfsstat4 status;
4374 4466  
4375 4467          DTRACE_NFSV4_2(op__rename__start, struct compound_state *, cs,
4376 4468              RENAME4args *, args);
4377 4469  
4378 4470          fp = sfp = NULL;
4379      -        srcvp = targvp = NULL;
     4471 +        srcvp = targvp = tvp = NULL;
4380 4472          in_crit_src = in_crit_targ = 0;
4381 4473          fp_rele_grant_hold = sfp_rele_grant_hold = 0;
     4474 +        unlinked = 0;
4382 4475  
4383 4476          /* CURRENT_FH: target directory */
4384 4477          ndvp = cs->vp;
4385 4478          if (ndvp == NULL) {
4386 4479                  *cs->statusp = resp->status = NFS4ERR_NOFILEHANDLE;
4387 4480                  goto out;
4388 4481          }
4389 4482  
4390 4483          /* SAVED_FH: from directory */
4391 4484          odvp = cs->saved_vp;
152 lines elided
4544 4637          if (fp = rfs4_lookup_and_findfile(ndvp, converted_nnm, &targvp,
4545 4638              NULL, cs->cr)) {
4546 4639                  if (rfs4_check_delegated_byfp(FWRITE, fp, TRUE, TRUE, TRUE,
4547 4640                      NULL)) {
4548 4641                          *cs->statusp = resp->status = NFS4ERR_DELAY;
4549 4642                          goto err_out;
4550 4643                  }
4551 4644          }
4552 4645          fp_rele_grant_hold = 1;
4553 4646  
4554      -
4555 4647          /* Check for NBMAND lock on both source and target */
4556 4648          if (nbl_need_check(srcvp)) {
4557 4649                  nbl_start_crit(srcvp, RW_READER);
4558 4650                  in_crit_src = 1;
4559 4651                  if (nbl_conflict(srcvp, NBL_RENAME, 0, 0, 0, NULL)) {
4560 4652                          *cs->statusp = resp->status = NFS4ERR_FILE_OPEN;
4561 4653                          goto err_out;
4562 4654                  }
4563 4655          }
4564 4656  
↓ open down ↓ 14 lines elided ↑ open up ↑
4579 4671                  error = VOP_GETATTR(ndvp, &nbdva, 0, cs->cr, NULL);
4580 4672          }
4581 4673          if (error) {
4582 4674                  *cs->statusp = resp->status = puterrno4(error);
4583 4675                  goto err_out;
4584 4676          }
4585 4677  
4586 4678          NFS4_SET_FATTR4_CHANGE(resp->source_cinfo.before, obdva.va_ctime)
4587 4679          NFS4_SET_FATTR4_CHANGE(resp->target_cinfo.before, nbdva.va_ctime)
4588 4680  
4589      -        if ((error = VOP_RENAME(odvp, converted_onm, ndvp, converted_nnm,
4590      -            cs->cr, NULL, 0)) == 0 && fp != NULL) {
4591      -                struct vattr va;
4592      -                vnode_t *tvp;
     4681 +        error = VOP_RENAME(odvp, converted_onm, ndvp, converted_nnm, cs->cr,
     4682 +            NULL, 0);
4593 4683  
     4684 +        /*
     4685 +         * If the target existed and was unlinked by VOP_RENAME, its state
     4686 +         * must be closed. To avoid deadlock, rfs4_close_all_state() is called
     4687 +         * only after any necessary nbl_end_crit() on srcvp and targvp.
     4688 +         */
     4689 +        if (error == 0 && fp != NULL) {
4594 4690                  rfs4_dbe_lock(fp->rf_dbe);
4595 4691                  tvp = fp->rf_vp;
4596 4692                  if (tvp)
4597 4693                          VN_HOLD(tvp);
4598 4694                  rfs4_dbe_unlock(fp->rf_dbe);
4599 4695  
4600 4696                  if (tvp) {
     4697 +                        struct vattr va;
4601 4698                          va.va_mask = AT_NLINK;
     4699 +
4602 4700                          if (!VOP_GETATTR(tvp, &va, 0, cs->cr, NULL) &&
4603 4701                              va.va_nlink == 0) {
4604      -                                /* The file is gone and so should the state */
4605      -                                if (in_crit_targ) {
4606      -                                        nbl_end_crit(targvp);
4607      -                                        in_crit_targ = 0;
     4702 +                                unlinked = 1;
     4703 +
     4704 +                                /* DEBUG data */
     4705 +                                if ((srcvp == targvp) || (tvp != targvp)) {
     4706 +                                        cmn_err(CE_WARN, "rfs4_op_rename: "
     4707 +                                            "srcvp %p, targvp: %p, tvp: %p",
     4708 +                                            (void *)srcvp, (void *)targvp,
     4709 +                                            (void *)tvp);
4608 4710                                  }
4609      -                                rfs4_close_all_state(fp);
     4711 +                        } else {
     4712 +                                VN_RELE(tvp);
4610 4713                          }
4611      -                        VN_RELE(tvp);
4612 4714                  }
4613 4715          }
4614 4716          if (error == 0)
4615 4717                  vn_renamepath(ndvp, srcvp, nnm, nlen - 1);
4616 4718  
4617 4719          if (in_crit_src)
4618 4720                  nbl_end_crit(srcvp);
4619 4721          if (srcvp)
4620 4722                  VN_RELE(srcvp);
4621 4723          if (in_crit_targ)
4622 4724                  nbl_end_crit(targvp);
4623 4725          if (targvp)
4624 4726                  VN_RELE(targvp);
4625 4727  
     4728 +        if (unlinked) {
     4729 +                ASSERT(fp != NULL);
     4730 +                ASSERT(tvp != NULL);
     4731 +
     4732 +                /* DEBUG data */
     4733 +                if (RW_READ_HELD(&tvp->v_nbllock)) {
     4734 +                        cmn_err(CE_WARN, "rfs4_op_rename: "
     4735 +                            "RW_READ_HELD(%p)", (void *)tvp);
     4736 +                }
     4737 +
     4738 +                /* The file is gone and so should the state */
     4739 +                rfs4_close_all_state(fp);
     4740 +                VN_RELE(tvp);
     4741 +        }
     4742 +
4626 4743          if (sfp) {
4627 4744                  rfs4_clear_dont_grant(sfp);
4628 4745                  rfs4_file_rele(sfp);
4629 4746          }
4630 4747          if (fp) {
4631 4748                  rfs4_clear_dont_grant(fp);
4632 4749                  rfs4_file_rele(fp);
4633 4750          }
4634 4751  
4635 4752          if (converted_onm != onm)
↓ open down ↓ 916 lines elided ↑ open up ↑
5552 5669          struct uio uio;
5553 5670          struct iovec iov[MAX_IOVECS];
5554 5671          struct iovec *iovp;
5555 5672          int iovcnt;
5556 5673          int ioflag;
5557 5674          cred_t *savecred, *cr;
5558 5675          bool_t *deleg = &cs->deleg;
5559 5676          nfsstat4 stat;
5560 5677          int in_crit = 0;
5561 5678          caller_context_t ct;
     5679 +        nfs4_srv_t *nsrv4;
5562 5680  
5563 5681          DTRACE_NFSV4_2(op__write__start, struct compound_state *, cs,
5564 5682              WRITE4args *, args);
5565 5683  
5566 5684          vp = cs->vp;
5567 5685          if (vp == NULL) {
5568 5686                  *cs->statusp = resp->status = NFS4ERR_NOFILEHANDLE;
5569 5687                  goto out;
5570 5688          }
5571 5689          if (cs->access == CS_ACCESS_DENIED) {
↓ open down ↓ 50 lines elided ↑ open up ↑
5622 5740              (error = VOP_ACCESS(vp, VWRITE, 0, cr, &ct))) {
5623 5741                  *cs->statusp = resp->status = puterrno4(error);
5624 5742                  goto out;
5625 5743          }
5626 5744  
5627 5745          if (MANDLOCK(vp, bva.va_mode)) {
5628 5746                  *cs->statusp = resp->status = NFS4ERR_ACCESS;
5629 5747                  goto out;
5630 5748          }
5631 5749  
     5750 +        nsrv4 = zone_getspecific(rfs4_zone_key, curzone);
5632 5751          if (args->data_len == 0) {
5633 5752                  *cs->statusp = resp->status = NFS4_OK;
5634 5753                  resp->count = 0;
5635 5754                  resp->committed = args->stable;
5636      -                resp->writeverf = Write4verf;
     5755 +                resp->writeverf = nsrv4->write4verf;
5637 5756                  goto out;
5638 5757          }
5639 5758  
5640 5759          if (args->mblk != NULL) {
5641 5760                  mblk_t *m;
5642 5761                  uint_t bytes, round_len;
5643 5762  
5644 5763                  iovcnt = 0;
5645 5764                  bytes = 0;
5646 5765                  round_len = roundup(args->data_len, BYTES_PER_XDR_UNIT);
↓ open down ↓ 75 lines elided ↑ open up ↑
5722 5841          }
5723 5842  
5724 5843          *cs->statusp = resp->status = NFS4_OK;
5725 5844          resp->count = args->data_len - uio.uio_resid;
5726 5845  
5727 5846          if (ioflag == 0)
5728 5847                  resp->committed = UNSTABLE4;
5729 5848          else
5730 5849                  resp->committed = FILE_SYNC4;
5731 5850  
5732      -        resp->writeverf = Write4verf;
     5851 +        resp->writeverf = nsrv4->write4verf;
5733 5852  
5734 5853  out:
5735 5854          if (in_crit)
5736 5855                  nbl_end_crit(vp);
5737 5856  
5738 5857          DTRACE_NFSV4_2(op__write__done, struct compound_state *, cs,
5739 5858              WRITE4res *, resp);
5740 5859  }
5741 5860  
5742 5861  
5743 5862  /* XXX put in a header file */
5744 5863  extern int      sec_svc_getcred(struct svc_req *, cred_t *,  caddr_t *, int *);
5745 5864  
5746 5865  void
5747 5866  rfs4_compound(COMPOUND4args *args, COMPOUND4res *resp, struct exportinfo *exi,
5748 5867      struct svc_req *req, cred_t *cr, int *rv)
5749 5868  {
5750 5869          uint_t i;
5751 5870          struct compound_state cs;
     5871 +        nfs4_srv_t *nsrv4;
     5872 +        nfs_export_t *ne = nfs_get_export();
5752 5873  
5753 5874          if (rv != NULL)
5754 5875                  *rv = 0;
5755 5876          rfs4_init_compound_state(&cs);
5756 5877          /*
5757 5878           * Form a reply tag by copying over the request tag.
5758 5879           */
5759 5880          resp->tag.utf8string_val =
5760 5881              kmem_alloc(args->tag.utf8string_len, KM_SLEEP);
5761 5882          resp->tag.utf8string_len = args->tag.utf8string_len;
↓ open down ↓ 37 lines elided ↑ open up ↑
5799 5920                  svcerr_badcred(req->rq_xprt);
5800 5921                  if (rv != NULL)
5801 5922                          *rv = 1;
5802 5923                  return;
5803 5924          }
5804 5925          resp->array_len = args->array_len;
5805 5926          resp->array = kmem_zalloc(args->array_len * sizeof (nfs_resop4),
5806 5927              KM_SLEEP);
5807 5928  
5808 5929          cs.basecr = cr;
     5930 +        nsrv4 = zone_getspecific(rfs4_zone_key, curzone);
5809 5931  
5810 5932          DTRACE_NFSV4_2(compound__start, struct compound_state *, &cs,
5811 5933              COMPOUND4args *, args);
5812 5934  
5813 5935          /*
5814 5936           * For now, NFS4 compound processing must be protected by
5815 5937           * exported_lock because it can access more than one exportinfo
5816 5938           * per compound and share/unshare can now change multiple
5817 5939           * exinfo structs.  The NFS2/3 code only refs 1 exportinfo
5818 5940           * per proc (excluding public exinfo), and exi_count design
5819 5941           * is sufficient to protect concurrent execution of NFS2/3
5820 5942           * ops along with unexport.  This lock will be removed as
5821 5943           * part of the NFSv4 phase 2 namespace redesign work.
5822 5944           */
5823      -        rw_enter(&exported_lock, RW_READER);
     5945 +        rw_enter(&ne->exported_lock, RW_READER);
5824 5946  
5825 5947          /*
5826 5948           * If this is the first compound we've seen, we need to start all
5827 5949           * new instances' grace periods.
5828 5950           */
5829      -        if (rfs4_seen_first_compound == 0) {
5830      -                rfs4_grace_start_new();
     5951 +        if (nsrv4->seen_first_compound == 0) {
     5952 +                rfs4_grace_start_new(nsrv4);
5831 5953                  /*
5832 5954                   * This must be set after rfs4_grace_start_new(), otherwise
5833 5955                   * another thread could proceed past here before the former
5834 5956                   * is finished.
5835 5957                   */
5836      -                rfs4_seen_first_compound = 1;
     5958 +                nsrv4->seen_first_compound = 1;
5837 5959          }
5838 5960  
5839 5961          for (i = 0; i < args->array_len && cs.cont; i++) {
5840 5962                  nfs_argop4 *argop;
5841 5963                  nfs_resop4 *resop;
5842 5964                  uint_t op;
5843 5965  
5844 5966                  argop = &args->array[i];
5845 5967                  resop = &resp->array[i];
5846 5968                  resop->resop = argop->argop;
5847 5969                  op = (uint_t)resop->resop;
5848 5970  
5849 5971                  if (op < rfsv4disp_cnt) {
     5972 +                        kstat_t *ksp = rfsprocio_v4_ptr[op];
     5973 +                        kstat_t *exi_ksp = NULL;
     5974 +
5850 5975                          /*
5851 5976                           * Count the individual ops here; NULL and COMPOUND
5852 5977                           * are counted in common_dispatch()
5853 5978                           */
5854 5979                          rfsproccnt_v4_ptr[op].value.ui64++;
5855 5980  
     5981 +                        if (ksp != NULL) {
     5982 +                                mutex_enter(ksp->ks_lock);
     5983 +                                kstat_runq_enter(KSTAT_IO_PTR(ksp));
     5984 +                                mutex_exit(ksp->ks_lock);
     5985 +                        }
     5986 +
     5987 +                        switch (rfsv4disptab[op].op_type) {
     5988 +                        case NFS4_OP_CFH:
     5989 +                                resop->exi = cs.exi;
     5990 +                                break;
     5991 +                        case NFS4_OP_SFH:
     5992 +                                resop->exi = cs.saved_exi;
     5993 +                                break;
     5994 +                        default:
     5995 +                                ASSERT(resop->exi == NULL);
     5996 +                                break;
     5997 +                        }
     5998 +
     5999 +                        if (resop->exi != NULL) {
     6000 +                                exi_ksp = NULL;
     6001 +                                if (resop->exi->exi_kstats != NULL) {
     6002 +                                        exi_ksp = exp_kstats_v4(
     6003 +                                            resop->exi->exi_kstats, op);
     6004 +                                }
     6005 +                                if (exi_ksp != NULL) {
     6006 +                                        mutex_enter(exi_ksp->ks_lock);
     6007 +                                        kstat_runq_enter(KSTAT_IO_PTR(exi_ksp));
     6008 +                                        mutex_exit(exi_ksp->ks_lock);
     6009 +                                }
     6010 +                        }
     6011 +
5856 6012                          NFS4_DEBUG(rfs4_debug > 1,
5857 6013                              (CE_NOTE, "Executing %s", rfs4_op_string[op]));
5858 6014                          (*rfsv4disptab[op].dis_proc)(argop, resop, req, &cs);
5859 6015                          NFS4_DEBUG(rfs4_debug > 1, (CE_NOTE, "%s returned %d",
5860 6016                              rfs4_op_string[op], *cs.statusp));
5861 6017                          if (*cs.statusp != NFS4_OK)
5862 6018                                  cs.cont = FALSE;
     6019 +
     6020 +                        if (rfsv4disptab[op].op_type == NFS4_OP_POSTCFH &&
     6021 +                            *cs.statusp == NFS4_OK &&
     6022 +                            (resop->exi = cs.exi) != NULL) {
     6023 +                                exi_ksp = NULL;
     6024 +                                if (resop->exi->exi_kstats != NULL) {
     6025 +                                        exi_ksp = exp_kstats_v4(
     6026 +                                            resop->exi->exi_kstats, op);
     6027 +                                }
     6028 +                        }
     6029 +
     6030 +                        if (exi_ksp != NULL) {
     6031 +                                mutex_enter(exi_ksp->ks_lock);
     6032 +                                KSTAT_IO_PTR(exi_ksp)->nwritten +=
     6033 +                                    argop->opsize;
     6034 +                                KSTAT_IO_PTR(exi_ksp)->writes++;
     6035 +                                if (rfsv4disptab[op].op_type != NFS4_OP_POSTCFH)
     6036 +                                        kstat_runq_exit(KSTAT_IO_PTR(exi_ksp));
     6037 +                                mutex_exit(exi_ksp->ks_lock);
     6038 +                        } else {
     6039 +                                resop->exi = NULL;
     6040 +                        }
     6041 +
     6042 +                        if (ksp != NULL) {
     6043 +                                mutex_enter(ksp->ks_lock);
     6044 +                                kstat_runq_exit(KSTAT_IO_PTR(ksp));
     6045 +                                mutex_exit(ksp->ks_lock);
     6046 +                        }
5863 6047                  } else {
5864 6048                          /*
5865 6049                           * This is effectively dead code since XDR code
5866 6050                           * will have already returned BADXDR if op doesn't
5867 6051                           * decode to a legal value.  This is only done for a
5868 6052                           * day when XDR code doesn't verify v4 opcodes.
5869 6053                           */
5870 6054                          op = OP_ILLEGAL;
5871 6055                          rfsproccnt_v4_ptr[OP_ILLEGAL_IDX].value.ui64++;
5872 6056  
5873 6057                          rfs4_op_illegal(argop, resop, req, &cs);
5874 6058                          cs.cont = FALSE;
5875 6059                  }
5876 6060  
5877 6061                  /*
     6062 +                 * The exi is saved in the resop to be used for kstats update
     6063 +                 * once the opsize is calculated during XDR response encoding.
     6064 +                 * Put a hold on resop->exi so that it can't be destroyed.
     6065 +                 */
     6066 +                if (resop->exi != NULL)
     6067 +                        exi_hold(resop->exi);
     6068 +
     6069 +                /*
5878 6070                   * If not at last op, and if we are to stop, then
5879 6071                   * compact the results array.
5880 6072                   */
5881 6073                  if ((i + 1) < args->array_len && !cs.cont) {
5882 6074                          nfs_resop4 *new_res = kmem_alloc(
5883      -                            (i+1) * sizeof (nfs_resop4), KM_SLEEP);
     6075 +                            (i + 1) * sizeof (nfs_resop4), KM_SLEEP);
5884 6076                          bcopy(resp->array,
5885      -                            new_res, (i+1) * sizeof (nfs_resop4));
     6077 +                            new_res, (i + 1) * sizeof (nfs_resop4));
5886 6078                          kmem_free(resp->array,
5887 6079                              args->array_len * sizeof (nfs_resop4));
5888 6080  
5889      -                        resp->array_len =  i + 1;
     6081 +                        resp->array_len = i + 1;
5890 6082                          resp->array = new_res;
5891 6083                  }
5892 6084          }
5893 6085  
5894      -        rw_exit(&exported_lock);
     6086 +        rw_exit(&ne->exported_lock);
5895 6087  
5896      -        DTRACE_NFSV4_2(compound__done, struct compound_state *, &cs,
5897      -            COMPOUND4res *, resp);
5898      -
     6088 +        /*
     6089 +         * Clear exportinfo and vnode fields from compound_state before dtrace
     6090 +         * probe, to avoid tracing residual values for path and share path.
     6091 +         */
5899 6092          if (cs.vp)
5900 6093                  VN_RELE(cs.vp);
5901 6094          if (cs.saved_vp)
5902 6095                  VN_RELE(cs.saved_vp);
     6096 +        cs.exi = cs.saved_exi = NULL;
     6097 +        cs.vp = cs.saved_vp = NULL;
     6098 +
     6099 +        DTRACE_NFSV4_2(compound__done, struct compound_state *, &cs,
     6100 +            COMPOUND4res *, resp);
     6101 +
5903 6102          if (cs.saved_fh.nfs_fh4_val)
5904 6103                  kmem_free(cs.saved_fh.nfs_fh4_val, NFS4_FHSIZE);
5905 6104  
5906 6105          if (cs.basecr)
5907 6106                  crfree(cs.basecr);
5908 6107          if (cs.cr)
5909 6108                  crfree(cs.cr);
5910 6109          /*
5911 6110           * done with this compound request, free the label
5912 6111           */
↓ open down ↓ 49 lines elided ↑ open up ↑
5962 6161                  op = (uint_t)args->array[i].argop;
5963 6162  
5964 6163                  if (op < rfsv4disp_cnt)
5965 6164                          flag &= rfsv4disptab[op].dis_flags;
5966 6165                  else
5967 6166                          flag = 0;
5968 6167          }
5969 6168          *flagp = flag;
5970 6169  }
5971 6170  
     6171 +/*
     6172 + * Update the kstats for the received requests.
     6173 + * Note: writes/nwritten are used to hold count and nbytes of requests received.
     6174 + *
     6175 + * Per export request statistics need to be updated during the compound request
     6176 + * processing (rfs4_compound()) as that is where it is known which exportinfo to
     6177 + * associate the kstats with.
     6178 + */
     6179 +void
     6180 +rfs4_compound_kstat_args(COMPOUND4args *args)
     6181 +{
     6182 +        int i;
     6183 +
     6184 +        for (i = 0; i < args->array_len; i++) {
     6185 +                uint_t op = (uint_t)args->array[i].argop;
     6186 +
     6187 +                if (op < rfsv4disp_cnt) {
     6188 +                        kstat_t *ksp = rfsprocio_v4_ptr[op];
     6189 +
     6190 +                        if (ksp != NULL) {
     6191 +                                mutex_enter(ksp->ks_lock);
     6192 +                                KSTAT_IO_PTR(ksp)->nwritten +=
     6193 +                                    args->array[i].opsize;
     6194 +                                KSTAT_IO_PTR(ksp)->writes++;
     6195 +                                mutex_exit(ksp->ks_lock);
     6196 +                        }
     6197 +                }
     6198 +        }
     6199 +}
     6200 +
     6201 +/*
     6202 + * Update the kstats for the sent responses.
     6203 + * Note: reads/nread are used to hold count and nbytes of responses sent.
     6204 + *
     6205 + * Per export response statistics cannot be updated until here, after the
     6206 + * response send has generated the opsize (bytes sent) in the XDR encoding.
     6207 + * The exportinfo with which the kstats should be associated is thus saved
     6208 + * in the response structure (by rfs4_compound()) for use here. A hold is
     6209 + * placed on the exi to ensure it cannot be deleted before use. This hold
     6210 + * is released, and the exi set to NULL, here.
     6211 + */
     6212 +void
     6213 +rfs4_compound_kstat_res(COMPOUND4res *res)
     6214 +{
     6215 +        int i;
     6216 +        nfs_export_t *ne = nfs_get_export();
     6217 +
     6218 +        for (i = 0; i < res->array_len; i++) {
     6219 +                uint_t op = (uint_t)res->array[i].resop;
     6220 +
     6221 +                if (op < rfsv4disp_cnt) {
     6222 +                        kstat_t *ksp = rfsprocio_v4_ptr[op];
     6223 +                        struct exportinfo *exi = res->array[i].exi;
     6224 +
     6225 +                        if (ksp != NULL) {
     6226 +                                mutex_enter(ksp->ks_lock);
     6227 +                                KSTAT_IO_PTR(ksp)->nread +=
     6228 +                                    res->array[i].opsize;
     6229 +                                KSTAT_IO_PTR(ksp)->reads++;
     6230 +                                mutex_exit(ksp->ks_lock);
     6231 +                        }
     6232 +
     6233 +                        if (exi != NULL) {
     6234 +                                kstat_t *exi_ksp = NULL;
     6235 +
     6236 +                                rw_enter(&ne->exported_lock, RW_READER);
     6237 +
     6238 +                                if (exi->exi_kstats != NULL) {
     6239 +                                        /*CSTYLED*/
     6240 +                                        exi_ksp = exp_kstats_v4(exi->exi_kstats, op);
     6241 +                                }
     6242 +                                if (exi_ksp != NULL) {
     6243 +                                        mutex_enter(exi_ksp->ks_lock);
     6244 +                                        KSTAT_IO_PTR(exi_ksp)->nread +=
     6245 +                                            res->array[i].opsize;
     6246 +                                        KSTAT_IO_PTR(exi_ksp)->reads++;
     6247 +                                        mutex_exit(exi_ksp->ks_lock);
     6248 +                                }
     6249 +
     6250 +                                exi_rele(&exi);
     6251 +                                res->array[i].exi = NULL;
     6252 +                                rw_exit(&ne->exported_lock);
     6253 +                        }
     6254 +                }
     6255 +        }
     6256 +}
     6257 +
5972 6258  nfsstat4
5973 6259  rfs4_client_sysid(rfs4_client_t *cp, sysid_t *sp)
5974 6260  {
5975 6261          nfsstat4 e;
5976 6262  
5977 6263          rfs4_dbe_lock(cp->rc_dbe);
5978 6264  
5979 6265          if (cp->rc_sysidt != LM_NOSYSID) {
5980 6266                  *sp = cp->rc_sysidt;
5981 6267                  e = NFS4_OK;
↓ open down ↓ 614 lines elided ↑ open up ↑
6596 6882                  cs->mandlock = MANDLOCK(cs->vp, cva.va_mode);
6597 6883  
6598 6884                  /*
6599 6885                   * Truncate the file if necessary; this would be
6600 6886                   * the case for create over an existing file.
6601 6887                   */
6602 6888  
6603 6889                  if (trunc) {
6604 6890                          int in_crit = 0;
6605 6891                          rfs4_file_t *fp;
     6892 +                        nfs4_srv_t *nsrv4;
6606 6893                          bool_t create = FALSE;
6607 6894  
6608 6895                          /*
6609 6896                           * We are writing over an existing file.
6610 6897                           * Check to see if we need to recall a delegation.
6611 6898                           */
6612      -                        rfs4_hold_deleg_policy();
     6899 +                        nsrv4 = zone_getspecific(rfs4_zone_key, curzone);
     6900 +                        rfs4_hold_deleg_policy(nsrv4);
6613 6901                          if ((fp = rfs4_findfile(vp, NULL, &create)) != NULL) {
6614 6902                                  if (rfs4_check_delegated_byfp(FWRITE, fp,
6615 6903                                      (reqsize == 0), FALSE, FALSE, &clientid)) {
6616 6904                                          rfs4_file_rele(fp);
6617      -                                        rfs4_rele_deleg_policy();
     6905 +                                        rfs4_rele_deleg_policy(nsrv4);
6618 6906                                          VN_RELE(vp);
6619 6907                                          *attrset = 0;
6620 6908                                          return (NFS4ERR_DELAY);
6621 6909                                  }
6622 6910                                  rfs4_file_rele(fp);
6623 6911                          }
6624      -                        rfs4_rele_deleg_policy();
     6912 +                        rfs4_rele_deleg_policy(nsrv4);
6625 6913  
6626 6914                          if (nbl_need_check(vp)) {
6627 6915                                  in_crit = 1;
6628 6916  
6629 6917                                  ASSERT(reqsize == 0);
6630 6918  
6631 6919                                  nbl_start_crit(vp, RW_READER);
6632 6920                                  if (nbl_conflict(vp, NBL_WRITE, 0,
6633 6921                                      cva.va_size, 0, NULL)) {
6634 6922                                          in_crit = 0;
↓ open down ↓ 1537 lines elided ↑ open up ↑
8172 8460  /*ARGSUSED*/
8173 8461  void
8174 8462  rfs4_op_setclientid_confirm(nfs_argop4 *argop, nfs_resop4 *resop,
8175 8463      struct svc_req *req, struct compound_state *cs)
8176 8464  {
8177 8465          SETCLIENTID_CONFIRM4args *args =
8178 8466              &argop->nfs_argop4_u.opsetclientid_confirm;
8179 8467          SETCLIENTID_CONFIRM4res *res =
8180 8468              &resop->nfs_resop4_u.opsetclientid_confirm;
8181 8469          rfs4_client_t *cp, *cptoclose = NULL;
     8470 +        nfs4_srv_t *nsrv4;
8182 8471  
8183 8472          DTRACE_NFSV4_2(op__setclientid__confirm__start,
8184 8473              struct compound_state *, cs,
8185 8474              SETCLIENTID_CONFIRM4args *, args);
8186 8475  
     8476 +        nsrv4 = zone_getspecific(rfs4_zone_key, curzone);
8187 8477          *cs->statusp = res->status = NFS4_OK;
8188 8478  
8189 8479          cp = rfs4_findclient_by_id(args->clientid, TRUE);
8190 8480  
8191 8481          if (cp == NULL) {
8192 8482                  *cs->statusp = res->status =
8193 8483                      rfs4_check_clientid(&args->clientid, 1);
8194 8484                  goto out;
8195 8485          }
8196 8486  
↓ open down ↓ 15 lines elided ↑ open up ↑
8212 8502          if (cp->rc_cp_confirmed) {
8213 8503                  cptoclose = cp->rc_cp_confirmed;
8214 8504                  cptoclose->rc_ss_remove = 1;
8215 8505                  cp->rc_cp_confirmed = NULL;
8216 8506          }
8217 8507  
8218 8508          /*
8219 8509           * Update the client's associated server instance, if it's changed
8220 8510           * since the client was created.
8221 8511           */
8222      -        if (rfs4_servinst(cp) != rfs4_cur_servinst)
8223      -                rfs4_servinst_assign(cp, rfs4_cur_servinst);
     8512 +        if (rfs4_servinst(cp) != nsrv4->nfs4_cur_servinst)
     8513 +                rfs4_servinst_assign(nsrv4, cp, nsrv4->nfs4_cur_servinst);
8224 8514  
8225 8515          /*
8226 8516           * Record clientid in stable storage.
8227 8517           * Must be done after server instance has been assigned.
8228 8518           */
8229      -        rfs4_ss_clid(cp);
     8519 +        rfs4_ss_clid(nsrv4, cp);
8230 8520  
8231 8521          rfs4_dbe_unlock(cp->rc_dbe);
8232 8522  
8233 8523          if (cptoclose)
8234 8524                  /* don't need to rele, client_close does it */
8235 8525                  rfs4_client_close(cptoclose);
8236 8526  
8237 8527          /* If needed, initiate CB_NULL call for callback path */
8238 8528          rfs4_deleg_cb_check(cp);
8239 8529          rfs4_update_lease(cp);
8240 8530  
8241 8531          /*
8242 8532           * Check to see if client can perform reclaims
8243 8533           */
8244      -        rfs4_ss_chkclid(cp);
     8534 +        rfs4_ss_chkclid(nsrv4, cp);
8245 8535  
8246 8536          rfs4_client_rele(cp);
8247 8537  
8248 8538  out:
8249 8539          DTRACE_NFSV4_2(op__setclientid__confirm__done,
8250 8540              struct compound_state *, cs,
8251 8541              SETCLIENTID_CONFIRM4 *, res);
8252 8542  }
8253 8543  
8254 8544  
↓ open down ↓ 1623 lines elided ↑ open up ↑
9878 10168          int is_downrev;
9879 10169  
9880 10170          ca = (struct sockaddr *)svc_getrpccaller(req->rq_xprt)->buf;
9881 10171          ASSERT(ca);
9882 10172          ci = rfs4_find_clntip(ca, &create);
9883 10173          if (ci == NULL)
9884 10174                  return (0);
9885 10175          is_downrev = ci->ri_no_referrals;
9886 10176          rfs4_dbe_rele(ci->ri_dbe);
9887 10177          return (is_downrev);
     10178 +}
     10179 +
     10180 +/*
     10181 + * Do the main work of handling HA-NFSv4 Resource Group failover on
     10182 + * Sun Cluster.
     10183 + * We need to detect whether any RG admin paths have been added or removed,
     10184 + * and adjust resources accordingly.
     10185 + * Currently we're using a very inefficient algorithm, ~ 2 * O(n**2). In
     10186 + * order to scale, the list and array of paths need to be held in more
     10187 + * suitable data structures.
     10188 + */
     10189 +static void
     10190 +hanfsv4_failover(nfs4_srv_t *nsrv4)
     10191 +{
     10192 +        int i, start_grace, numadded_paths = 0;
     10193 +        char **added_paths = NULL;
     10194 +        rfs4_dss_path_t *dss_path;
     10195 +
     10196 +        /*
     10197 +         * Note: currently, dss_pathlist cannot be NULL, since
     10198 +         * it will always include an entry for NFS4_DSS_VAR_DIR. If we
     10199 +         * make the latter dynamically specified too, the following will
     10200 +         * need to be adjusted.
     10201 +         */
     10202 +
     10203 +        /*
     10204 +         * First, look for removed paths: RGs that have been failed over
     10205 +         * away from this node.
     10206 +         * Walk the "currently-serving" dss_pathlist and, for each
     10207 +         * path, check if it is on the "passed-in" rfs4_dss_newpaths array
     10208 +         * from nfsd. If not, that RG path has been removed.
     10209 +         *
     10210 +         * Note that nfsd has sorted rfs4_dss_newpaths for us, and removed
     10211 +         * any duplicates.
     10212 +         */
     10213 +        dss_path = nsrv4->dss_pathlist;
     10214 +        do {
     10215 +                int found = 0;
     10216 +                char *path = dss_path->path;
     10217 +
     10218 +                /* used only for non-HA, so it must never be removed */
     10219 +                if (strcmp(path, NFS4_DSS_VAR_DIR) == 0) {
     10220 +                        dss_path = dss_path->next;
     10221 +                        continue;
     10222 +                }
     10223 +
     10224 +                for (i = 0; i < rfs4_dss_numnewpaths; i++) {
     10225 +                        int cmpret;
     10226 +                        char *newpath = rfs4_dss_newpaths[i];
     10227 +
     10228 +                        /*
     10229 +                         * Since nfsd has sorted rfs4_dss_newpaths for us,
     10230 +                         * once the return from strcmp is negative we know
     10231 +                         * we've passed the point where "path" should be,
     10232 +                         * and can stop searching: "path" has been removed.
     10233 +                         */
     10234 +                        cmpret = strcmp(path, newpath);
     10235 +                        if (cmpret < 0)
     10236 +                                break;
     10237 +                        if (cmpret == 0) {
     10238 +                                found = 1;
     10239 +                                break;
     10240 +                        }
     10241 +                }
     10242 +
     10243 +                if (found == 0) {
     10244 +                        unsigned index = dss_path->index;
     10245 +                        rfs4_servinst_t *sip = dss_path->sip;
     10246 +                        rfs4_dss_path_t *path_next = dss_path->next;
     10247 +
     10248 +                        /*
     10249 +                         * This path has been removed.
     10250 +                         * We must clear out the servinst reference to
     10251 +                         * it, since it's now owned by another
     10252 +                         * node: we should not attempt to touch it.
     10253 +                         */
     10254 +                        ASSERT(dss_path == sip->dss_paths[index]);
     10255 +                        sip->dss_paths[index] = NULL;
     10256 +
     10257 +                        /* remove from "currently-serving" list, and destroy */
     10258 +                        remque(dss_path);
     10259 +                        /* allow for NUL */
     10260 +                        kmem_free(dss_path->path, strlen(dss_path->path) + 1);
     10261 +                        kmem_free(dss_path, sizeof (rfs4_dss_path_t));
     10262 +
     10263 +                        dss_path = path_next;
     10264 +                } else {
     10265 +                        /* path was found; not removed */
     10266 +                        dss_path = dss_path->next;
     10267 +                }
     10268 +        } while (dss_path != nsrv4->dss_pathlist);
     10269 +
     10270 +        /*
     10271 +         * Now, look for added paths: RGs that have been failed over
     10272 +         * to this node.
     10273 +         * Walk the "passed-in" rfs4_dss_newpaths array from nfsd and,
     10274 +         * for each path, check if it is on the "currently-serving"
     10275 +         * dss_pathlist. If not, that RG path has been added.
     10276 +         *
     10277 +         * Note: we don't do duplicate detection here; nfsd does that for us.
     10278 +         *
     10279 +         * Note: numadded_paths <= rfs4_dss_numnewpaths, which gives us
     10280 +         * an upper bound for the size needed for added_paths[numadded_paths].
     10281 +         */
     10282 +
     10283 +        /* probably more space than we need, but guaranteed to be enough */
     10284 +        if (rfs4_dss_numnewpaths > 0) {
     10285 +                size_t sz = rfs4_dss_numnewpaths * sizeof (char *);
     10286 +                added_paths = kmem_zalloc(sz, KM_SLEEP);
     10287 +        }
     10288 +
     10289 +        /* walk the "passed-in" rfs4_dss_newpaths array from nfsd */
     10290 +        for (i = 0; i < rfs4_dss_numnewpaths; i++) {
     10291 +                int found = 0;
     10292 +                char *newpath = rfs4_dss_newpaths[i];
     10293 +
     10294 +                dss_path = nsrv4->dss_pathlist;
     10295 +                do {
     10296 +                        char *path = dss_path->path;
     10297 +
     10298 +                        /* used only for non-HA */
     10299 +                        if (strcmp(path, NFS4_DSS_VAR_DIR) == 0) {
     10300 +                                dss_path = dss_path->next;
     10301 +                                continue;
     10302 +                        }
     10303 +
     10304 +                        if (strncmp(path, newpath, strlen(path)) == 0) {
     10305 +                                found = 1;
     10306 +                                break;
     10307 +                        }
     10308 +
     10309 +                        dss_path = dss_path->next;
     10310 +                } while (dss_path != nsrv4->dss_pathlist);
     10311 +
     10312 +                if (found == 0) {
     10313 +                        added_paths[numadded_paths] = newpath;
     10314 +                        numadded_paths++;
     10315 +                }
     10316 +        }
     10317 +
     10318 +        /* did we find any added paths? */
     10319 +        if (numadded_paths > 0) {
     10320 +
     10321 +                /* create a new server instance, and start its grace period */
     10322 +                start_grace = 1;
     10323 +                /* CSTYLED */
     10324 +                rfs4_servinst_create(nsrv4, start_grace, numadded_paths, added_paths);
     10325 +
     10326 +                /* read in the stable storage state from these paths */
     10327 +                rfs4_dss_readstate(nsrv4, numadded_paths, added_paths);
     10328 +
     10329 +                /*
     10330 +                 * Multiple failovers during a grace period will cause
     10331 +                 * clients of the same resource group to be partitioned
     10332 +                 * into different server instances, with different
     10333 +                 * grace periods.  Since clients of the same resource
     10334 +                 * group must be subject to the same grace period,
     10335 +                 * we need to reset all currently active grace periods.
     10336 +                 */
     10337 +                rfs4_grace_reset_all(nsrv4);
     10338 +        }
     10339 +
     10340 +        if (rfs4_dss_numnewpaths > 0)
     10341 +                kmem_free(added_paths, rfs4_dss_numnewpaths * sizeof (char *));
9888 10342  }
    