NEX-9200 Improve the scalability of attribute locking in zfs_zget
Reviewed by: Joyce McIntosh <joyce.mcintosh@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-9552 zfs_scan_idle throttling harms performance and needs to be removed
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
NEX-13140 DVA-throttle support for special-class
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-13937 Improve kstat performance
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
NEX-6088 ZFS scrub/resilver take excessively long due to issuing lots of random IO
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-8711 backport illumos 7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information
7115 6922 generates ESC_ZFS_VDEV_REMOVE_AUX a bit too often
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Robert Mustacchi <rm@joyent.com>
NEX-6884 KRRP: replication deadlock due to unavailable resources
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-5856 ddt_capped isn't reset when deduped dataset is destroyed
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
NEX-5553 ZFS auto-trim, manual-trim and scrub can race and deadlock
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-5795 Rename 'wrc' as 'wbc' in the source and in the tech docs
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5367 special vdev: sync-write options (NEW)
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5318 Cleanup specialclass property (obsolete, not used) and fix related meta-to-special case
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5255 speed-up migration of the write-cached data
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5064 On-demand trim should store operation start and stop time
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5219 WBC: Add capability to delay migration
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5186 smf-tests contains built files and it shouldn't
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
NEX-5168 cleanup and productize non-default latency based writecache load-balancer
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4940 Special Vdev operation in presence (or absence) of IO Errors
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-4934 Add capability to remove special vdev
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4807 writecache load-balancing statistics: several distinct problems, must be revisited and revised
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4876 On-demand TRIM shouldn't use system_taskq and should queue jobs
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4794 Write Back Cache sync and async writes: adjust routing according to watermark limits
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4620 ZFS autotrim triggering is unreliable
NEX-4622 On-demand TRIM code illogically enumerates metaslabs via mg_ms_tree
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
NEX-4619 Want kstats to monitor TRIM and UNMAP operation
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R (fix studio build)
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Garrett D'Amore <garrett@damore.org>
5818 zfs {ref}compressratio is incorrect with 4k sector size
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Reviewed by: Don Brady <dev.fs.zfs@gmail.com>
Approved by: Albert Lee <trisk@omniti.com>
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Revert "NEX-4476 WRC: Allow to use write back cache per tree of datasets"
This reverts commit fe97b74444278a6f36fec93179133641296312da.
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-3502 dedup ceiling should set a pool prop when cap is in effect
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-3984 On-demand TRIM
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Conflicts:
        usr/src/common/zfs/zpool_prop.c
        usr/src/uts/common/sys/fs/zfs.h
NEX-3558 KRRP Integration
NEX-3508 CLONE - Port NEX-2946 Add UNMAP/TRIM functionality to ZFS and illumos
Reviewed by: Josef Sipek <josef.sipek@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Conflicts:
    usr/src/uts/common/io/scsi/targets/sd.c
    usr/src/uts/common/sys/scsi/targets/sddef.h
NEX-3165 need some dedup improvements
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Garrett D'Amore <garrett@damore.org>
SUP-577 deadlock between zpool detach and syseventd
OS-80 support for vdev and CoS properties for the new I/O scheduler
OS-95 lint warning introduced by OS-61
Issue #27: Auto best-effort dedup enable/disable - settable per pool
Issue #7: Reconcile L2ARC and "special" use by datasets
Issue #2: optimize DDE lookup in DDT objects
Added an option to control the number of classes of DDEs in the DDT.
The new default is one, i.e. all DDEs are stored together
regardless of refcount.
Issue #3: Add support for parametrized number of copies for DDTs
Issue #25: Add a pool-level property that controls the number of copies of DDTs in the pool.
re #12643 rb4064 ZFS meta refactoring - vdev utilization tracking, auto-dedup
re #12585 rb4049 ZFS++ work port - refactoring to improve separation of open/closed code, bug fixes, performance improvements - open code
Bug 11205: add missing libzfs_closed_stubs.c to fix opensource-only build.
ZFS plus work: special vdevs, cos, cos/vdev properties


   3  *
   4  * The contents of this file are subject to the terms of the
   5  * Common Development and Distribution License (the "License").
   6  * You may not use this file except in compliance with the License.
   7  *
   8  * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
   9  * or http://www.opensolaris.org/os/licensing.
  10  * See the License for the specific language governing permissions
  11  * and limitations under the License.
  12  *
  13  * When distributing Covered Code, include this CDDL HEADER in each
  14  * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  15  * If applicable, add the following below this CDDL HEADER, with the
  16  * fields enclosed by brackets "[]" replaced with your own identifying
  17  * information: Portions Copyright [yyyy] [name of copyright owner]
  18  *
  19  * CDDL HEADER END
  20  */
  21 /*
  22  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
  23  * Copyright (c) 2011, 2018 by Delphix. All rights reserved.
  24  * Copyright 2011 Nexenta Systems, Inc.  All rights reserved.
  25  * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved.
  26  * Copyright 2013 Saso Kiselkov. All rights reserved.
  27  * Copyright (c) 2017 Datto Inc.
  28  */
  29 
  30 #ifndef _SYS_SPA_IMPL_H
  31 #define _SYS_SPA_IMPL_H
  32 
  33 #include <sys/spa.h>
  34 #include <sys/vdev.h>
  35 #include <sys/vdev_removal.h>
  36 #include <sys/metaslab.h>
  37 #include <sys/dmu.h>
  38 #include <sys/dsl_pool.h>
  39 #include <sys/uberblock_impl.h>
  40 #include <sys/zfs_context.h>
  41 #include <sys/avl.h>
  42 #include <sys/refcount.h>
  43 #include <sys/bplist.h>
  44 #include <sys/bpobj.h>
  45 #include <sys/zfeature.h>
  46 #include <sys/zthr.h>
  47 #include <zfeature_common.h>
  48 
  49 #ifdef  __cplusplus
  50 extern "C" {
  51 #endif
  52 
  53 typedef struct spa_error_entry {
  54         zbookmark_phys_t        se_bookmark;
  55         char                    *se_name;
  56         avl_node_t              se_avl;
  57 } spa_error_entry_t;
  58 
  59 typedef struct spa_history_phys {
  60         uint64_t sh_pool_create_len;    /* ending offset of zpool create */
  61         uint64_t sh_phys_max_off;       /* physical EOF */
  62         uint64_t sh_bof;                /* logical BOF */
  63         uint64_t sh_eof;                /* logical EOF */
  64         uint64_t sh_records_lost;       /* num of records overwritten */
  65 } spa_history_phys_t;
  66 
  67 /*
  68  * All members must be uint64_t, for byteswap purposes.
  69  */
  70 typedef struct spa_removing_phys {
  71         uint64_t sr_state; /* dsl_scan_state_t */
  72 
  73         /*
  74          * The vdev ID that we most recently attempted to remove,
  75          * or -1 if no removal has been attempted.
  76          */
  77         uint64_t sr_removing_vdev;
  78 
  79         /*
  80          * The vdev ID that we most recently successfully removed,
  81          * or -1 if no devices have been removed.
  82          */
  83         uint64_t sr_prev_indirect_vdev;
  84 
  85         uint64_t sr_start_time;
  86         uint64_t sr_end_time;
  87 
  88         /*
  89          * Note that we can not use the space map's or indirect mapping's
  90          * accounting as a substitute for these values, because we need to
  91          * count frees of not-yet-copied data as though it did the copy.
  92          * Otherwise, we could get into a situation where copied > to_copy,
  93          * or we complete before copied == to_copy.
  94          */
  95         uint64_t sr_to_copy; /* bytes that need to be copied */
  96         uint64_t sr_copied; /* bytes that have been copied or freed */
  97 } spa_removing_phys_t;
  98 
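/*
 * A minimal sketch of the accounting rule described above (not part of the
 * original header): when data in the not-yet-copied region of the removing
 * vdev is freed, it is charged to sr_copied as if it had been copied, so
 * sr_copied can still reach sr_to_copy.  The helper name is hypothetical.
 */
static void
spa_removal_account_free(spa_removing_phys_t *srp, uint64_t freed_bytes,
    boolean_t not_yet_copied)
{
	if (not_yet_copied)
		srp->sr_copied += freed_bytes;	/* count the free as a copy */
}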
  99 /*
 100  * This struct is stored as an entry in the DMU_POOL_DIRECTORY_OBJECT
 101  * (with key DMU_POOL_CONDENSING_INDIRECT).  It is present if a condense
 102  * of an indirect vdev's mapping object is in progress.
 103  */
 104 typedef struct spa_condensing_indirect_phys {
 105         /*
 106          * The vdev ID of the indirect vdev whose indirect mapping is
 107          * being condensed.
 108          */
 109         uint64_t        scip_vdev;
 110 
 111         /*
 112          * The vdev's old obsolete spacemap.  This spacemap's contents are
 113          * being integrated into the new mapping.
 114          */
 115         uint64_t        scip_prev_obsolete_sm_object;
 116 
 117         /*
 118          * The new mapping object that is being created.
 119          */
 120         uint64_t        scip_next_mapping_object;
 121 } spa_condensing_indirect_phys_t;
 122 
 123 struct spa_aux_vdev {
 124         uint64_t        sav_object;             /* MOS object for device list */
 125         nvlist_t        *sav_config;            /* cached device config */
 126         vdev_t          **sav_vdevs;            /* devices */
 127         int             sav_count;              /* number devices */
 128         boolean_t       sav_sync;               /* sync the device list */
 129         nvlist_t        **sav_pending;          /* pending device additions */
 130         uint_t          sav_npending;           /* # pending devices */
 131 };
 132 
 133 typedef struct spa_config_lock {
 134         kmutex_t        scl_lock;
 135         kthread_t       *scl_writer;
 136         int             scl_write_wanted;
 137         kcondvar_t      scl_cv;
 138         refcount_t      scl_count;
 139 } spa_config_lock_t;
 140 
 141 typedef struct spa_config_dirent {
 142         list_node_t     scd_link;


 165 typedef enum spa_proc_state {
 166         SPA_PROC_NONE,          /* spa_proc = &p0, no process created */
 167         SPA_PROC_CREATED,       /* spa_activate() has proc, is waiting */
 168         SPA_PROC_ACTIVE,        /* taskqs created, spa_proc set */
 169         SPA_PROC_DEACTIVATE,    /* spa_deactivate() requests process exit */
 170         SPA_PROC_GONE           /* spa_thread() is exiting, spa_proc = &p0 */
 171 } spa_proc_state_t;
 172 
 173 typedef struct spa_taskqs {
 174         uint_t stqs_count;
 175         taskq_t **stqs_taskq;
 176 } spa_taskqs_t;
 177 
 178 typedef enum spa_all_vdev_zap_action {
 179         AVZ_ACTION_NONE = 0,
 180         AVZ_ACTION_DESTROY,     /* Destroy all per-vdev ZAPs and the AVZ. */
 181         AVZ_ACTION_REBUILD,     /* Populate the new AVZ, see spa_avz_rebuild */
 182         AVZ_ACTION_INITIALIZE
 183 } spa_avz_action_t;
 184 
 185 typedef enum spa_config_source {
 186         SPA_CONFIG_SRC_NONE = 0,
 187         SPA_CONFIG_SRC_SCAN,            /* scan of path (default: /dev/dsk) */
 188         SPA_CONFIG_SRC_CACHEFILE,       /* any cachefile */
 189         SPA_CONFIG_SRC_TRYIMPORT,       /* returned from call to tryimport */
 190         SPA_CONFIG_SRC_SPLIT,           /* new pool in a pool split */
 191         SPA_CONFIG_SRC_MOS              /* MOS, but not always from right txg */
 192 } spa_config_source_t;
 193 
 194 struct spa {
 195         /*
 196          * Fields protected by spa_namespace_lock.
 197          */
 198         char            spa_name[ZFS_MAX_DATASET_NAME_LEN];     /* pool name */
 199         char            *spa_comment;           /* comment */
 200         avl_node_t      spa_avl;                /* node in spa_namespace_avl */
 201         nvlist_t        *spa_config;            /* last synced config */
 202         nvlist_t        *spa_config_syncing;    /* currently syncing config */
 203         nvlist_t        *spa_config_splitting;  /* config for splitting */
 204         nvlist_t        *spa_load_info;         /* info and errors from load */
 205         uint64_t        spa_config_txg;         /* txg of last config change */
 206         int             spa_sync_pass;          /* iterate-to-convergence */
 207         pool_state_t    spa_state;              /* pool state */
 208         int             spa_inject_ref;         /* injection references */
 209         uint8_t         spa_sync_on;            /* sync threads are running */
 210         spa_load_state_t spa_load_state;        /* current load operation */
 211         boolean_t       spa_indirect_vdevs_loaded; /* mappings loaded? */
 212         boolean_t       spa_trust_config;       /* do we trust vdev tree? */
 213         spa_config_source_t spa_config_source;  /* where config comes from? */
 214         uint64_t        spa_import_flags;       /* import specific flags */
 215         spa_taskqs_t    spa_zio_taskq[ZIO_TYPES][ZIO_TASKQ_TYPES];
 216         dsl_pool_t      *spa_dsl_pool;
 217         boolean_t       spa_is_initializing;    /* true while opening pool */
 218         metaslab_class_t *spa_normal_class;     /* normal data class */
 219         metaslab_class_t *spa_log_class;        /* intent log data class */
 220         uint64_t        spa_first_txg;          /* first txg after spa_open() */
 221         uint64_t        spa_final_txg;          /* txg of export/destroy */
 222         uint64_t        spa_freeze_txg;         /* freeze pool at this txg */
 223         uint64_t        spa_load_max_txg;       /* best initial ub_txg */
 224         uint64_t        spa_claim_max_txg;      /* highest claimed birth txg */
 225         timespec_t      spa_loaded_ts;          /* 1st successful open time */
 226         objset_t        *spa_meta_objset;       /* copy of dp->dp_meta_objset */
 227         kmutex_t        spa_evicting_os_lock;   /* Evicting objset list lock */
 228         list_t          spa_evicting_os_list;   /* Objsets being evicted. */
 229         kcondvar_t      spa_evicting_os_cv;     /* Objset Eviction Completion */
 230         txg_list_t      spa_vdev_txg_list;      /* per-txg dirty vdev list */
 231         vdev_t          *spa_root_vdev;         /* top-level vdev container */
 232         int             spa_min_ashift;         /* of vdevs in normal class */
 233         int             spa_max_ashift;         /* of vdevs in normal class */
 234         uint64_t        spa_config_guid;        /* config pool guid */
 235         uint64_t        spa_load_guid;          /* spa_load initialized guid */
 236         uint64_t        spa_last_synced_guid;   /* last synced guid */
 237         list_t          spa_config_dirty_list;  /* vdevs with dirty config */
 238         list_t          spa_state_dirty_list;   /* vdevs with dirty state */
 239         kmutex_t        spa_alloc_lock;
 240         avl_tree_t      spa_alloc_tree;
 241         spa_aux_vdev_t  spa_spares;             /* hot spares */
 242         spa_aux_vdev_t  spa_l2cache;            /* L2ARC cache devices */
 243         nvlist_t        *spa_label_features;    /* Features for reading MOS */
 244         uint64_t        spa_config_object;      /* MOS object for pool config */
 245         uint64_t        spa_config_generation;  /* config generation number */
 246         uint64_t        spa_syncing_txg;        /* txg currently syncing */
 247         bpobj_t         spa_deferred_bpobj;     /* deferred-free bplist */
 248         bplist_t        spa_free_bplist[TXG_SIZE]; /* bplist of stuff to free */
 249         zio_cksum_salt_t spa_cksum_salt;        /* secret salt for cksum */
 250         /* checksum context templates */
 251         kmutex_t        spa_cksum_tmpls_lock;
 252         void            *spa_cksum_tmpls[ZIO_CHECKSUM_FUNCTIONS];
 253         uberblock_t     spa_ubsync;             /* last synced uberblock */
 254         uberblock_t     spa_uberblock;          /* current uberblock */
 255         boolean_t       spa_extreme_rewind;     /* rewind past deferred frees */
 256         uint64_t        spa_last_io;            /* lbolt of last non-scan I/O */
 257         kmutex_t        spa_scrub_lock;         /* resilver/scrub lock */
 258         uint64_t        spa_scrub_inflight;     /* in-flight scrub I/Os */
 259         kcondvar_t      spa_scrub_io_cv;        /* scrub I/O completion */
 260         uint8_t         spa_scrub_active;       /* active or suspended? */
 261         uint8_t         spa_scrub_type;         /* type of scrub we're doing */
 262         uint8_t         spa_scrub_finished;     /* indicator to rotate logs */
 263         uint8_t         spa_scrub_started;      /* started since last boot */
 264         uint8_t         spa_scrub_reopen;       /* scrub doing vdev_reopen */
 265         uint64_t        spa_scan_pass_start;    /* start time per pass/reboot */
 266         uint64_t        spa_scan_pass_scrub_pause; /* scrub pause time */
 267         uint64_t        spa_scan_pass_scrub_spent_paused; /* total paused */
 268         uint64_t        spa_scan_pass_exam;     /* examined bytes per pass */
 269         kmutex_t        spa_async_lock;         /* protect async state */
 270         kthread_t       *spa_async_thread;      /* thread doing async task */
 271         int             spa_async_suspended;    /* async tasks suspended */
 272         kcondvar_t      spa_async_cv;           /* wait for thread_exit() */
 273         uint16_t        spa_async_tasks;        /* async task mask */
 274         uint64_t        spa_missing_tvds;       /* unopenable tvds on load */
 275         uint64_t        spa_missing_tvds_allowed; /* allow loading spa? */
 276 
 277         spa_removing_phys_t spa_removing_phys;
 278         spa_vdev_removal_t *spa_vdev_removal;
 279 
 280         spa_condensing_indirect_phys_t  spa_condensing_indirect_phys;
 281         spa_condensing_indirect_t       *spa_condensing_indirect;
 282         zthr_t          *spa_condense_zthr;     /* zthr doing condense. */
 283 
 284         char            *spa_root;              /* alternate root directory */
 285         uint64_t        spa_ena;                /* spa-wide ereport ENA */
 286         int             spa_last_open_failed;   /* error if last open failed */
 287         uint64_t        spa_last_ubsync_txg;    /* "best" uberblock txg */
 288         uint64_t        spa_last_ubsync_txg_ts; /* timestamp from that ub */
 289         uint64_t        spa_load_txg;           /* ub txg that loaded */
 290         uint64_t        spa_load_txg_ts;        /* timestamp from that ub */
 291         uint64_t        spa_load_meta_errors;   /* verify metadata err count */
 292         uint64_t        spa_load_data_errors;   /* verify data err count */
 293         uint64_t        spa_verify_min_txg;     /* start txg of verify scrub */
 294         kmutex_t        spa_errlog_lock;        /* error log lock */
 295         uint64_t        spa_errlog_last;        /* last error log object */
 296         uint64_t        spa_errlog_scrub;       /* scrub error log object */
 297         kmutex_t        spa_errlist_lock;       /* error list/ereport lock */
 298         avl_tree_t      spa_errlist_last;       /* last error list */
 299         avl_tree_t      spa_errlist_scrub;      /* scrub error list */
 300         uint64_t        spa_deflate;            /* should we deflate? */
 301         uint64_t        spa_history;            /* history object */
 302         kmutex_t        spa_history_lock;       /* history lock */
 303         vdev_t          *spa_pending_vdev;      /* pending vdev additions */
 304         kmutex_t        spa_props_lock;         /* property lock */
 305         uint64_t        spa_pool_props_object;  /* object for properties */
 306         uint64_t        spa_bootfs;             /* default boot filesystem */
 307         uint64_t        spa_failmode;           /* failure mode for the pool */
 308         uint64_t        spa_delegation;         /* delegation on/off */
 309         list_t          spa_config_list;        /* previous cache file(s) */
 310         /* per-CPU array of root of async I/O: */
 311         zio_t           **spa_async_zio_root;
 312         zio_t           *spa_suspend_zio_root;  /* root of all suspended I/O */
 313         zio_t           *spa_txg_zio[TXG_SIZE]; /* spa_sync() waits for this */
 314         kmutex_t        spa_suspend_lock;       /* protects suspend_zio_root */
 315         kcondvar_t      spa_suspend_cv;         /* notification of resume */
 316         uint8_t         spa_suspended;          /* pool is suspended */
 317         uint8_t         spa_claiming;           /* pool is doing zil_claim() */
 318         boolean_t       spa_debug;              /* debug enabled? */
 319         boolean_t       spa_is_root;            /* pool is root */
 320         int             spa_minref;             /* num refs when first opened */
 321         int             spa_mode;               /* FREAD | FWRITE */
 322         spa_log_state_t spa_log_state;          /* log state */
 323         uint64_t        spa_autoexpand;         /* lun expansion on/off */
 324         uint64_t        spa_bootsize;           /* efi system partition size */
 325         ddt_t           *spa_ddt[ZIO_CHECKSUM_FUNCTIONS]; /* in-core DDTs */
 326         uint64_t        spa_ddt_stat_object;    /* DDT statistics */
 327         uint64_t        spa_dedup_ditto;        /* dedup ditto threshold */
 328         uint64_t        spa_dedup_checksum;     /* default dedup checksum */
 329         uint64_t        spa_dspace;             /* dspace in normal class */
 330         kmutex_t        spa_vdev_top_lock;      /* dueling offline/remove */
 331         kmutex_t        spa_proc_lock;          /* protects spa_proc* */
 332         kcondvar_t      spa_proc_cv;            /* spa_proc_state transitions */
 333         spa_proc_state_t spa_proc_state;        /* see definition */
 334         struct proc     *spa_proc;              /* "zpool-poolname" process */
 335         uint64_t        spa_did;                /* if procp != p0, did of t1 */
 336         boolean_t       spa_autoreplace;        /* autoreplace set in open */
 337         int             spa_vdev_locks;         /* locks grabbed */
 338         uint64_t        spa_creation_version;   /* version at pool creation */
 339         uint64_t        spa_prev_software_version; /* See ub_software_version */
 340         uint64_t        spa_feat_for_write_obj; /* required to write to pool */
 341         uint64_t        spa_feat_for_read_obj;  /* required to read from pool */
 342         uint64_t        spa_feat_desc_obj;      /* Feature descriptions */
 343         uint64_t        spa_feat_enabled_txg_obj; /* Feature enabled txg */
 344         /* cache feature refcounts */
 345         uint64_t        spa_feat_refcount_cache[SPA_FEATURES];
 346         cyclic_id_t     spa_deadman_cycid;      /* cyclic id */
 347         uint64_t        spa_deadman_calls;      /* number of deadman calls */
  348         hrtime_t        spa_sync_starttime;     /* starting time of spa_sync */
 349         uint64_t        spa_deadman_synctime;   /* deadman expiration timer */
 350         uint64_t        spa_all_vdev_zaps;      /* ZAP of per-vd ZAP obj #s */
 351         spa_avz_action_t        spa_avz_action; /* destroy/rebuild AVZ? */
 352 
 353         /*
 354          * spa_iokstat_lock protects spa_iokstat and
 355          * spa_queue_stats[].
 356          */
 357         kmutex_t        spa_iokstat_lock;
 358         struct kstat    *spa_iokstat;           /* kstat of io to this pool */
 359         struct {
 360                 int spa_active;
 361                 int spa_queued;
 362         } spa_queue_stats[ZIO_PRIORITY_NUM_QUEUEABLE];
 363 
 364         hrtime_t        spa_ccw_fail_time;      /* Conf cache write fail time */
 365 
 366         /*
 367          * spa_refcount & spa_config_lock must be the last elements
 368          * because refcount_t changes size based on compilation options.
 369          * In order for the MDB module to function correctly, the other
 370          * fields must remain in the same location.
 371          */
 372         spa_config_lock_t spa_config_lock[SCL_LOCKS]; /* config changes */
 373         refcount_t      spa_refcount;           /* number of opens */
 374 };
 375 
 376 extern const char *spa_config_path;
 377 
 378 extern void spa_taskq_dispatch_ent(spa_t *spa, zio_type_t t, zio_taskq_type_t q,
 379     task_func_t *func, void *arg, uint_t flags, taskq_ent_t *ent);
 380 extern void spa_load_spares(spa_t *spa);
 381 extern void spa_load_l2cache(spa_t *spa);
 382 
 383 #ifdef  __cplusplus
 384 }
 385 #endif
 386 
 387 #endif  /* _SYS_SPA_IMPL_H */


   3  *
   4  * The contents of this file are subject to the terms of the
   5  * Common Development and Distribution License (the "License").
   6  * You may not use this file except in compliance with the License.
   7  *
   8  * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
   9  * or http://www.opensolaris.org/os/licensing.
  10  * See the License for the specific language governing permissions
  11  * and limitations under the License.
  12  *
  13  * When distributing Covered Code, include this CDDL HEADER in each
  14  * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  15  * If applicable, add the following below this CDDL HEADER, with the
  16  * fields enclosed by brackets "[]" replaced with your own identifying
  17  * information: Portions Copyright [yyyy] [name of copyright owner]
  18  *
  19  * CDDL HEADER END
  20  */
  21 /*
  22  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
  23  * Copyright (c) 2011, 2015 by Delphix. All rights reserved.
  24  * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved.
  25  * Copyright 2017 Nexenta Systems, Inc.  All rights reserved.
  26  * Copyright 2013 Saso Kiselkov. All rights reserved.
  27  * Copyright (c) 2017 Datto Inc.
  28  */
  29 
  30 #ifndef _SYS_SPA_IMPL_H
  31 #define _SYS_SPA_IMPL_H
  32 
  33 #include <sys/spa.h>
  34 #include <sys/vdev.h>
  35 #include <sys/vdev_impl.h>
  36 #include <sys/metaslab.h>
  37 #include <sys/dmu.h>
  38 #include <sys/dsl_pool.h>
  39 #include <sys/uberblock_impl.h>
  40 #include <sys/zfs_context.h>
  41 #include <sys/avl.h>
  42 #include <sys/refcount.h>
  43 #include <sys/bplist.h>
  44 #include <sys/bpobj.h>
  45 #include <sys/special_impl.h>
  46 #include <sys/wbc.h>
  47 #include <sys/zfeature.h>
  48 #include <zfeature_common.h>
  49 #include <sys/autosnap.h>
  50 
  51 #ifdef  __cplusplus
  52 extern "C" {
  53 #endif
  54 
  55 /*
  56  * This (illegal) pool name is used when temporarily importing a spa_t in order
  57  * to get the vdev stats associated with the imported devices.
  58  */
  59 #define TRYIMPORT_NAME  "$import"
  60 
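/*
 * A minimal usage sketch, assuming the standard spa_name() accessor: code
 * that must skip work during a trial import can recognize such a spa_t by
 * comparing its name with TRYIMPORT_NAME.  The helper name is hypothetical.
 */
static boolean_t
spa_is_tryimport(spa_t *spa)
{
	return (strcmp(spa_name(spa), TRYIMPORT_NAME) == 0 ? B_TRUE : B_FALSE);
}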
  61 typedef struct spa_error_entry {
  62         zbookmark_phys_t        se_bookmark;
  63         char                    *se_name;
  64         avl_node_t              se_avl;
  65 } spa_error_entry_t;
  66 
  67 typedef struct spa_history_phys {
  68         uint64_t sh_pool_create_len;    /* ending offset of zpool create */
  69         uint64_t sh_phys_max_off;       /* physical EOF */
  70         uint64_t sh_bof;                /* logical BOF */
  71         uint64_t sh_eof;                /* logical EOF */
  72         uint64_t sh_records_lost;       /* num of records overwritten */
  73 } spa_history_phys_t;
  74 
  75 struct spa_aux_vdev {
  76         uint64_t        sav_object;             /* MOS object for device list */
  77         nvlist_t        *sav_config;            /* cached device config */
  78         vdev_t          **sav_vdevs;            /* devices */
  79         int             sav_count;              /* number devices */
  80         boolean_t       sav_sync;               /* sync the device list */
  81         nvlist_t        **sav_pending;          /* pending device additions */
  82         uint_t          sav_npending;           /* # pending devices */
  83 };
  84 
  85 typedef struct spa_config_lock {
  86         kmutex_t        scl_lock;
  87         kthread_t       *scl_writer;
  88         int             scl_write_wanted;
  89         kcondvar_t      scl_cv;
  90         refcount_t      scl_count;
  91 } spa_config_lock_t;
  92 
  93 typedef struct spa_config_dirent {
  94         list_node_t     scd_link;


 117 typedef enum spa_proc_state {
 118         SPA_PROC_NONE,          /* spa_proc = &p0, no process created */
 119         SPA_PROC_CREATED,       /* spa_activate() has proc, is waiting */
 120         SPA_PROC_ACTIVE,        /* taskqs created, spa_proc set */
 121         SPA_PROC_DEACTIVATE,    /* spa_deactivate() requests process exit */
 122         SPA_PROC_GONE           /* spa_thread() is exiting, spa_proc = &p0 */
 123 } spa_proc_state_t;
 124 
 125 typedef struct spa_taskqs {
 126         uint_t stqs_count;
 127         taskq_t **stqs_taskq;
 128 } spa_taskqs_t;
 129 
 130 typedef enum spa_all_vdev_zap_action {
 131         AVZ_ACTION_NONE = 0,
 132         AVZ_ACTION_DESTROY,     /* Destroy all per-vdev ZAPs and the AVZ. */
 133         AVZ_ACTION_REBUILD,     /* Populate the new AVZ, see spa_avz_rebuild */
 134         AVZ_ACTION_INITIALIZE
 135 } spa_avz_action_t;
 136 
 137 typedef enum spa_watermark {
 138         SPA_WM_NONE,
 139         SPA_WM_LOW,
 140         SPA_WM_HIGH
 141 } spa_watermark_t;
 142 
 143 /*
 144  * average utilization, latency and throughput
 145  * for spa and special/normal classes
 146  */
 147 typedef struct spa_avg_stat {
 148         uint64_t spa_utilization;
 149         uint64_t special_utilization;
 150         uint64_t normal_utilization;
 151         uint64_t special_latency;
 152         uint64_t normal_latency;
 153         uint64_t special_throughput;
 154         uint64_t normal_throughput;
 155 } spa_avg_stat_t;
 156 
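/*
 * A minimal sketch of how such running averages can be maintained, assuming
 * a cumulative mean keyed off a sample counter (struct spa below pairs these
 * stats with spa_avg_stat_rotor).  The helper and its arguments are
 * hypothetical; the real update is performed by the load-balancing code.
 */
static void
spa_avg_stat_update(spa_avg_stat_t *as, uint64_t *rotor,
    uint64_t special_lat, uint64_t normal_lat)
{
	uint64_t n = ++(*rotor);

	/* cumulative mean: new_avg = (old_avg * (n - 1) + sample) / n */
	as->special_latency = (as->special_latency * (n - 1) + special_lat) / n;
	as->normal_latency = (as->normal_latency * (n - 1) + normal_lat) / n;
}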
 157 typedef struct spa_perfmon_data {
 158         kthread_t               *perfmon_thread;
 159         boolean_t               perfmon_thr_exit;
 160         kmutex_t                perfmon_lock;
 161         kcondvar_t              perfmon_cv;
 162 } spa_perfmon_data_t;
 163 
 164 /*
 165  * Metaplacement controls three types of metadata
 166  * (see spa_refine_meta_placement() in special.c):
 167  * - DDT-Meta (pool level property) (see DMU_OT_IS_DDT_META())
 168  * - ZPL-Meta (dataset level property) (see DMU_OT_IS_ZPL_META())
 169  * - ZFS-Meta (pool level property): all other metadata except
 170  *   DDT-Meta and ZPL-Meta
 171  *
 172  * spa_enable_meta_placement_selection is the global on/off switch
 173  *
 174  * spa_small_data_to_special holds the maximum size of a data block
 175  * that may be placed on the special class
 176  *
 177  * spa_sync_to_special routes slog (synchronous) transactions to the special device
 178  */
 179 typedef struct spa_meta_placement {
 180         uint64_t spa_enable_meta_placement_selection;
 181         uint64_t spa_ddt_meta_to_special;
 182         uint64_t spa_zfs_meta_to_special;
 183         uint64_t spa_small_data_to_special;
 184         uint64_t spa_sync_to_special;
 185 } spa_meta_placement_t;
 186 
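/*
 * A minimal decision sketch for this policy, under the assumption that the
 * real logic lives in spa_refine_meta_placement() in special.c (not shown
 * here).  The helper name and exact conditions are hypothetical.
 */
static boolean_t
spa_policy_allows_special(const spa_meta_placement_t *mp,
    boolean_t is_ddt_meta, boolean_t is_zfs_meta, uint64_t blk_size)
{
	if (!mp->spa_enable_meta_placement_selection)
		return (B_FALSE);
	if (is_ddt_meta)
		return (mp->spa_ddt_meta_to_special != 0 ? B_TRUE : B_FALSE);
	if (is_zfs_meta)
		return (mp->spa_zfs_meta_to_special != 0 ? B_TRUE : B_FALSE);
	/* small user-data blocks may also be routed to the special class */
	return (blk_size <= mp->spa_small_data_to_special ? B_TRUE : B_FALSE);
}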
 187 typedef struct spa_trimstats spa_trimstats_t;
 188 
 189 struct spa {
 190         /*
 191          * Fields protected by spa_namespace_lock.
 192          */
 193         char            spa_name[ZFS_MAX_DATASET_NAME_LEN];     /* pool name */
 194         char            *spa_comment;           /* comment */
 195         avl_node_t      spa_avl;                /* node in spa_namespace_avl */
 196         nvlist_t        *spa_config;            /* last synced config */
 197         nvlist_t        *spa_config_syncing;    /* currently syncing config */
 198         nvlist_t        *spa_config_splitting;  /* config for splitting */
 199         nvlist_t        *spa_load_info;         /* info and errors from load */
 200         uint64_t        spa_config_txg;         /* txg of last config change */
 201         int             spa_sync_pass;          /* iterate-to-convergence */
 202         pool_state_t    spa_state;              /* pool state */
 203         int             spa_inject_ref;         /* injection references */
 204         uint8_t         spa_sync_on;            /* sync threads are running */
 205         spa_load_state_t spa_load_state;        /* current load operation */
 206         uint64_t        spa_import_flags;       /* import specific flags */
 207         spa_taskqs_t    spa_zio_taskq[ZIO_TYPES][ZIO_TASKQ_TYPES];
 208         dsl_pool_t      *spa_dsl_pool;
 209         boolean_t       spa_is_initializing;    /* true while opening pool */
 210         metaslab_class_t *spa_normal_class;     /* normal data class */
 211         metaslab_class_t *spa_log_class;        /* intent log data class */
 212         metaslab_class_t *spa_special_class;    /* special usage class */
 213         uint64_t        spa_first_txg;          /* first txg after spa_open() */
 214         uint64_t        spa_final_txg;          /* txg of export/destroy */
 215         uint64_t        spa_freeze_txg;         /* freeze pool at this txg */
 216         uint64_t        spa_load_max_txg;       /* best initial ub_txg */
 217         uint64_t        spa_claim_max_txg;      /* highest claimed birth txg */
 218         timespec_t      spa_loaded_ts;          /* 1st successful open time */
 219         objset_t        *spa_meta_objset;       /* copy of dp->dp_meta_objset */
 220         kmutex_t        spa_evicting_os_lock;   /* Evicting objset list lock */
 221         list_t          spa_evicting_os_list;   /* Objsets being evicted. */
 222         kcondvar_t      spa_evicting_os_cv;     /* Objset Eviction Completion */
 223         txg_list_t      spa_vdev_txg_list;      /* per-txg dirty vdev list */
 224         vdev_t          *spa_root_vdev;         /* top-level vdev container */
 225         int             spa_min_ashift;         /* of vdevs in normal class */
 226         int             spa_max_ashift;         /* of vdevs in normal class */
 227         uint64_t        spa_config_guid;        /* config pool guid */
 228         uint64_t        spa_load_guid;          /* spa_load initialized guid */
 229         uint64_t        spa_last_synced_guid;   /* last synced guid */
 230         list_t          spa_config_dirty_list;  /* vdevs with dirty config */
 231         list_t          spa_state_dirty_list;   /* vdevs with dirty state */
 232         spa_aux_vdev_t  spa_spares;             /* hot spares */
 233         spa_aux_vdev_t  spa_l2cache;            /* L2ARC cache devices */
 234         nvlist_t        *spa_label_features;    /* Features for reading MOS */
 235         uint64_t        spa_config_object;      /* MOS object for pool config */
 236         uint64_t        spa_config_generation;  /* config generation number */
 237         uint64_t        spa_syncing_txg;        /* txg currently syncing */
 238         bpobj_t         spa_deferred_bpobj;     /* deferred-free bplist */
 239         bplist_t        spa_free_bplist[TXG_SIZE]; /* bplist of stuff to free */
 240         zio_cksum_salt_t spa_cksum_salt;        /* secret salt for cksum */
 241         /* checksum context templates */
 242         kmutex_t        spa_cksum_tmpls_lock;
 243         void            *spa_cksum_tmpls[ZIO_CHECKSUM_FUNCTIONS];
 244         uberblock_t     spa_ubsync;             /* last synced uberblock */
 245         uberblock_t     spa_uberblock;          /* current uberblock */
 246         boolean_t       spa_extreme_rewind;     /* rewind past deferred frees */
 247         kmutex_t        spa_scrub_lock;         /* resilver/scrub lock */
 248         uint64_t        spa_scrub_inflight;     /* in-flight scrub I/Os */
 249         kcondvar_t      spa_scrub_io_cv;        /* scrub I/O completion */
 250         uint8_t         spa_scrub_active;       /* active or suspended? */
 251         uint8_t         spa_scrub_type;         /* type of scrub we're doing */
 252         uint8_t         spa_scrub_finished;     /* indicator to rotate logs */
 253         uint8_t         spa_scrub_started;      /* started since last boot */
 254         uint8_t         spa_scrub_reopen;       /* scrub doing vdev_reopen */
 255         uint64_t        spa_scan_pass_start;    /* start time per pass/reboot */
 256         uint64_t        spa_scan_pass_scrub_pause; /* scrub pause time */
 257         uint64_t        spa_scan_pass_scrub_spent_paused; /* total paused */
 258         uint64_t        spa_scan_pass_exam;     /* examined bytes per pass */
 259         uint64_t        spa_scan_pass_work;     /* actually processed bytes */
 260         kmutex_t        spa_async_lock;         /* protect async state */
 261         kthread_t       *spa_async_thread;      /* thread doing async task */
 262         int             spa_async_suspended;    /* async tasks suspended */
 263         kcondvar_t      spa_async_cv;           /* wait for thread_exit() */
 264         uint16_t        spa_async_tasks;        /* async task mask */
 265         char            *spa_root;              /* alternate root directory */
 266         uint64_t        spa_ena;                /* spa-wide ereport ENA */
 267         int             spa_last_open_failed;   /* error if last open failed */
 268         uint64_t        spa_last_ubsync_txg;    /* "best" uberblock txg */
 269         uint64_t        spa_last_ubsync_txg_ts; /* timestamp from that ub */
 270         uint64_t        spa_load_txg;           /* ub txg that loaded */
 271         uint64_t        spa_load_txg_ts;        /* timestamp from that ub */
 272         uint64_t        spa_load_meta_errors;   /* verify metadata err count */
 273         uint64_t        spa_load_data_errors;   /* verify data err count */
 274         uint64_t        spa_verify_min_txg;     /* start txg of verify scrub */
 275         kmutex_t        spa_errlog_lock;        /* error log lock */
 276         uint64_t        spa_errlog_last;        /* last error log object */
 277         uint64_t        spa_errlog_scrub;       /* scrub error log object */
 278         kmutex_t        spa_errlist_lock;       /* error list/ereport lock */
 279         avl_tree_t      spa_errlist_last;       /* last error list */
 280         avl_tree_t      spa_errlist_scrub;      /* scrub error list */
 281         uint64_t        spa_deflate;            /* should we deflate? */
 282         uint64_t        spa_history;            /* history object */
 283         kmutex_t        spa_history_lock;       /* history lock */
 284         vdev_t          *spa_pending_vdev;      /* pending vdev additions */
 285         kmutex_t        spa_props_lock;         /* property lock */
 286         uint64_t        spa_pool_props_object;  /* object for properties */
 287         kmutex_t        spa_cos_props_lock;     /* property lock */
 288         uint64_t        spa_cos_props_object;   /* object for cos properties */
 289         kmutex_t        spa_vdev_props_lock;    /* property lock */
 290         uint64_t        spa_vdev_props_object;  /* object for vdev properties */
 291         uint64_t        spa_bootfs;             /* default boot filesystem */
 292         uint64_t        spa_failmode;           /* failure mode for the pool */
 293         uint64_t        spa_delegation;         /* delegation on/off */
 294         list_t          spa_config_list;        /* previous cache file(s) */
 295         /* per-CPU array of root of async I/O: */
 296         zio_t           **spa_async_zio_root;
 297         zio_t           *spa_suspend_zio_root;  /* root of all suspended I/O */
 298         kmutex_t        spa_suspend_lock;       /* protects suspend_zio_root */
 299         kcondvar_t      spa_suspend_cv;         /* notification of resume */
 300         uint8_t         spa_suspended;          /* pool is suspended */
 301         uint8_t         spa_claiming;           /* pool is doing zil_claim() */
 302         boolean_t       spa_debug;              /* debug enabled? */
 303         boolean_t       spa_is_root;            /* pool is root */
 304         int             spa_minref;             /* num refs when first opened */
 305         int             spa_mode;               /* FREAD | FWRITE */
 306         spa_log_state_t spa_log_state;          /* log state */
 307         uint64_t        spa_autoexpand;         /* lun expansion on/off */
 308         uint64_t        spa_bootsize;           /* efi system partition size */
 309         ddt_t           *spa_ddt[ZIO_CHECKSUM_FUNCTIONS]; /* in-core DDTs */
 310         uint64_t        spa_ddt_stat_object;    /* DDT statistics */
 311         uint64_t        spa_dedup_ditto;        /* dedup ditto threshold */
 312         uint64_t        spa_dedup_checksum;     /* default dedup checksum */
 313         uint64_t        spa_ddt_msize;          /* ddt size in core, from ddo */
 314         uint64_t        spa_ddt_dsize;          /* ddt size on disk, from ddo */
 315         uint64_t        spa_dspace;             /* dspace in normal class */
 316         kmutex_t        spa_vdev_top_lock;      /* dueling offline/remove */
 317         kmutex_t        spa_proc_lock;          /* protects spa_proc* */
 318         kcondvar_t      spa_proc_cv;            /* spa_proc_state transitions */
 319         spa_proc_state_t spa_proc_state;        /* see definition */
 320         struct proc     *spa_proc;              /* "zpool-poolname" process */
 321         uint64_t        spa_did;                /* if procp != p0, did of t1 */
 322         boolean_t       spa_autoreplace;        /* autoreplace set in open */
 323         int             spa_vdev_locks;         /* locks grabbed */
 324         uint64_t        spa_creation_version;   /* version at pool creation */
 325         uint64_t        spa_prev_software_version; /* See ub_software_version */
 326         uint64_t        spa_feat_for_write_obj; /* required to write to pool */
 327         uint64_t        spa_feat_for_read_obj;  /* required to read from pool */
 328         uint64_t        spa_feat_desc_obj;      /* Feature descriptions */
 329         uint64_t        spa_feat_enabled_txg_obj; /* Feature enabled txg */
 330         /* cache feature refcounts */
 331         uint64_t        spa_feat_refcount_cache[SPA_FEATURES];
 332         cyclic_id_t     spa_deadman_cycid;      /* cyclic id */
 333         uint64_t        spa_deadman_calls;      /* number of deadman calls */
 334         hrtime_t        spa_sync_starttime;     /* starting time of spa_sync */
 335         uint64_t        spa_deadman_synctime;   /* deadman expiration timer */
 336         uint64_t        spa_all_vdev_zaps;      /* ZAP of per-vd ZAP obj #s */
 337         spa_avz_action_t        spa_avz_action; /* destroy/rebuild AVZ? */
 338 
 339         /* TRIM */
 340         uint64_t        spa_force_trim;         /* force sending trim? */
 341         uint64_t        spa_auto_trim;          /* see spa_auto_trim_t */
 342 
 343         kmutex_t        spa_auto_trim_lock;
 344         kcondvar_t      spa_auto_trim_done_cv;  /* all autotrim thrd's exited */
 345         uint64_t        spa_num_auto_trimming;  /* # of autotrim threads */
 346         taskq_t         *spa_auto_trim_taskq;
 347 
 348         kmutex_t        spa_man_trim_lock;
 349         uint64_t        spa_man_trim_rate;      /* rate of trim in bytes/sec */
 350         uint64_t        spa_num_man_trimming;   /* # of manual trim threads */
 351         boolean_t       spa_man_trim_stop;      /* requested manual trim stop */
 352         kcondvar_t      spa_man_trim_update_cv; /* updates to TRIM settings */
 353         kcondvar_t      spa_man_trim_done_cv;   /* manual trim has completed */
 354         /* For details on trim start/stop times see spa_get_trim_prog. */
 355         uint64_t        spa_man_trim_start_time;
 356         uint64_t        spa_man_trim_stop_time;
 357         taskq_t         *spa_man_trim_taskq;
 358 
 359         /*
 360          * spa_iokstat_lock protects spa_iokstat and
 361          * spa_queue_stats[].
 362          */
 363         kmutex_t        spa_iokstat_lock;
 364         struct kstat    *spa_iokstat;           /* kstat of io to this pool */
 365         struct {
 366                 uint64_t spa_active;
 367                 uint64_t spa_queued;
 368         } spa_queue_stats[ZIO_PRIORITY_NUM_QUEUEABLE];
 369 
 370         /* Pool-wide scrub & resilver priority values. */
 371         uint64_t        spa_scrub_prio;
 372         uint64_t        spa_resilver_prio;
 373 
 374         /* TRIM/UNMAP kstats */
 375         spa_trimstats_t *spa_trimstats;         /* alloc'd by kstat_create */
 376         struct kstat    *spa_trimstats_ks;
 377 
 378         hrtime_t        spa_ccw_fail_time;      /* Conf cache write fail time */
 379 
 380         /* total space on all L2ARC devices used for DDT (l2arc_ddt=on) */
 381         uint64_t spa_l2arc_ddt_devs_size;
 382 
 383         /* if 1 this means we have stopped DDT growth for this pool */
 384         uint8_t spa_ddt_capped;
 385 
 386         /* specialclass support */
 387         boolean_t       spa_usesc;              /* enable special class */
 388         uint64_t        spa_special_vdev_correction_rate;
 389         uint64_t        spa_minwat;             /* min watermark percent */
 390         uint64_t        spa_lowat;              /* low watermark percent */
 391         uint64_t        spa_hiwat;              /* high watermark percent */
 392         uint64_t        spa_lwm_space;          /* low watermark */
 393         uint64_t        spa_hwm_space;          /* high watermark */
 394         uint64_t        spa_wbc_wm_range;       /* high wm - low wm */
 395         uint8_t         spa_wbc_perc;           /* percent of writes to spec. */
 396         spa_watermark_t spa_watermark;
 397         boolean_t       spa_special_has_errors;
 398 
 399         /* Write Back Cache */
 400         uint64_t        spa_wbc_mode;
 401         wbc_data_t      spa_wbc;
 402 
 403         /* cos list */
 404         list_t          spa_cos_list;
 405 
 406         /*
 407          * utilization, latency and throughput statistics per metaslab_class
 408          * to aid dynamic balancing of I/O across normal and special classes
 409          */
 410         uint64_t                spa_avg_stat_rotor;
 411         spa_avg_stat_t          spa_avg_stat;
 412 
 413         spa_perfmon_data_t      spa_perfmon;
 414 
 415         /*
 416          * Percentage of total write traffic routed to the special class when
 417          * the latter is acting as a writeback cache.  This value is
 418          * continuously recomputed at runtime by the configured load-balancing
 419          * mechanism (see spa_special_selection); for instance, 0% means the
 420          * special class is not to be used for new writes.  An illustrative
 421          * routing sketch follows this struct definition.
 422          */
 423         uint64_t spa_special_to_normal_ratio;
 424 
 425         /*
 426          * last re-routing delta value for the spa_special_to_normal_ratio
 427          */
 428         int64_t spa_special_to_normal_delta;
 429 
 430         /* target percentage of data to be considered for dedup */
 431         int spa_dedup_percentage;
 432         uint64_t spa_dedup_rotor;
 433 
 434         /*
 435          * spa_refcount & spa_config_lock must be the last elements
 436          * because refcount_t changes size based on compilation options.
 437          * In order for the MDB module to function correctly, the other
 438          * fields must remain in the same location.
 439          */
 440         spa_config_lock_t spa_config_lock[SCL_LOCKS]; /* config changes */
 441         refcount_t      spa_refcount;           /* number of opens */
 442 
 443         uint64_t spa_ddt_meta_copies; /* amount of ddt-metadata copies */
 444 
 445         /*
 446          * The following two fields restrict the distribution of the
 447          * deduplication entries.  There are two possible states of these
 448          * variables:
 449          * 1) min=DITTO, max=DUPLICATED - the old behavior
 450          * 2) min=DUPLICATED, max=DUPLICATED - new behavior: all entries go
 451          * into a single ZAP.
 452          */
 453         enum ddt_class spa_ddt_class_min;
 454         enum ddt_class spa_ddt_class_max;
 455 
 456         spa_meta_placement_t spa_meta_policy;
 457 
 458         uint64_t spa_dedup_best_effort;
 459         uint64_t spa_dedup_lo_best_effort;
 460         uint64_t spa_dedup_hi_best_effort;
 461 
 462         zfs_autosnap_t spa_autosnap;
 463 
 464         zbookmark_phys_t spa_lszb;
 465 
 466         int spa_obj_mtx_sz;
 467 };
 468 
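/*
 * The routing sketch referenced from the spa_special_to_normal_ratio comment
 * above: a minimal rotor-based router that, assuming the ratio is kept as a
 * percentage, sends roughly that share of new writes to the special class.
 * The helper name and the caller-supplied rotor are hypothetical; the actual
 * ratio is recomputed by the load-balancing code (see spa_special_selection).
 */
static boolean_t
spa_route_write_to_special(spa_t *spa, uint64_t *rotor)
{
	/* a ratio of 0 means the special class takes no new writes */
	return ((*rotor)++ % 100 < spa->spa_special_to_normal_ratio ?
	    B_TRUE : B_FALSE);
}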
  469 /* possible in-core size of all DDTs */
 470 extern uint64_t zfs_ddts_msize;
 471 
 472 /* spa sysevent taskq */
 473 extern taskq_t *spa_sysevent_taskq;
 474 
 475 extern const char *spa_config_path;
 476 
 477 extern void spa_taskq_dispatch_ent(spa_t *spa, zio_type_t t, zio_taskq_type_t q,
 478     task_func_t *func, void *arg, uint_t flags, taskq_ent_t *ent);
 479 
 480 extern void spa_auto_trim_taskq_create(spa_t *spa);
 481 extern void spa_man_trim_taskq_create(spa_t *spa);
 482 extern void spa_auto_trim_taskq_destroy(spa_t *spa);
 483 extern void spa_man_trim_taskq_destroy(spa_t *spa);
 484 
 485 #ifdef  __cplusplus
 486 }
 487 #endif
 488 
 489 #endif  /* _SYS_SPA_IMPL_H */