NEX-9200 Improve the scalability of attribute locking in zfs_zget
Reviewed by: Joyce McIntosh <joyce.mcintosh@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-9552 zfs_scan_idle throttling harms performance and needs to be removed
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
NEX-13140 DVA-throttle support for special-class
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-13937 Improve kstat performance
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
NEX-6088 ZFS scrub/resilver take excessively long due to issuing lots of random IO
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-8711 backport illumos 7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information
7115 6922 generates ESC_ZFS_VDEV_REMOVE_AUX a bit too often
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Robert Mustacchi <rm@joyent.com>
NEX-6884 KRRP: replication deadlock due to unavailable resources
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-5856 ddt_capped isn't reset when deduped dataset is destroyed
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
NEX-5553 ZFS auto-trim, manual-trim and scrub can race and deadlock
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-5795 Rename 'wrc' as 'wbc' in the source and in the tech docs
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5367 special vdev: sync-write options (NEW)
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5318 Cleanup specialclass property (obsolete, not used) and fix related meta-to-special case
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5255 speed-up migration of the write-cached data
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5064 On-demand trim should store operation start and stop time
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5219 WBC: Add capability to delay migration
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5186 smf-tests contains built files and it shouldn't
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
NEX-5168 cleanup and productize non-default latency based writecache load-balancer
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4940 Special Vdev operation in presence (or absence) of IO Errors
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-4934 Add capability to remove special vdev
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4807 writecache load-balancing statistics: several distinct problems, must be revisited and revised
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4876 On-demand TRIM shouldn't use system_taskq and should queue jobs
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4794 Write Back Cache sync and async writes: adjust routing according to watermark limits
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4620 ZFS autotrim triggering is unreliable
NEX-4622 On-demand TRIM code illogically enumerates metaslabs via mg_ms_tree
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
NEX-4619 Want kstats to monitor TRIM and UNMAP operation
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R (fix studio build)
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Garrett D'Amore <garrett@damore.org>
5818 zfs {ref}compressratio is incorrect with 4k sector size
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Reviewed by: Don Brady <dev.fs.zfs@gmail.com>
Approved by: Albert Lee <trisk@omniti.com>
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Revert "NEX-4476 WRC: Allow to use write back cache per tree of datasets"
This reverts commit fe97b74444278a6f36fec93179133641296312da.
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-3502 dedup ceiling should set a pool prop when cap is in effect
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-3984 On-demand TRIM
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Conflicts:
        usr/src/common/zfs/zpool_prop.c
        usr/src/uts/common/sys/fs/zfs.h
NEX-3558 KRRP Integration
NEX-3508 CLONE - Port NEX-2946 Add UNMAP/TRIM functionality to ZFS and illumos
Reviewed by: Josef Sipek <josef.sipek@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Conflicts:
    usr/src/uts/common/io/scsi/targets/sd.c
    usr/src/uts/common/sys/scsi/targets/sddef.h
NEX-3165 need some dedup improvements
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Garrett D'Amore <garrett@damore.org>
SUP-577 deadlock between zpool detach and syseventd
OS-80 support for vdev and CoS properties for the new I/O scheduler
OS-95 lint warning introduced by OS-61
Issue #27: Auto best-effort dedup enable/disable - settable per pool
Issue #7: Reconcile L2ARC and "special" use by datasets
Issue #2: optimize DDE lookup in DDT objects
Added an option to control the number of classes of DDEs in the DDT.
The new default is one; that is, all DDEs are stored together
regardless of refcount.
Issue #3: Add support for parametrized number of copies for DDTs
Issue #25: Add a pool-level property that controls the number of copies of DDTs in the pool.
re #12643 rb4064 ZFS meta refactoring - vdev utilization tracking, auto-dedup
re #12585 rb4049 ZFS++ work port - refactoring to improve separation of open/closed code, bug fixes, performance improvements - open code
Bug 11205: add missing libzfs_closed_stubs.c to fix opensource-only build.
ZFS plus work: special vdevs, cos, cos/vdev properties

@@ -18,40 +18,48 @@
  *
  * CDDL HEADER END
  */
 /*
  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
- * Copyright (c) 2011, 2018 by Delphix. All rights reserved.
- * Copyright 2011 Nexenta Systems, Inc.  All rights reserved.
+ * Copyright (c) 2011, 2015 by Delphix. All rights reserved.
  * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved.
+ * Copyright 2017 Nexenta Systems, Inc.  All rights reserved.
  * Copyright 2013 Saso Kiselkov. All rights reserved.
  * Copyright (c) 2017 Datto Inc.
  */
 
 #ifndef _SYS_SPA_IMPL_H
 #define _SYS_SPA_IMPL_H
 
 #include <sys/spa.h>
 #include <sys/vdev.h>
-#include <sys/vdev_removal.h>
+#include <sys/vdev_impl.h>
 #include <sys/metaslab.h>
 #include <sys/dmu.h>
 #include <sys/dsl_pool.h>
 #include <sys/uberblock_impl.h>
 #include <sys/zfs_context.h>
 #include <sys/avl.h>
 #include <sys/refcount.h>
 #include <sys/bplist.h>
 #include <sys/bpobj.h>
+#include <sys/special_impl.h>
+#include <sys/wbc.h>
 #include <sys/zfeature.h>
-#include <sys/zthr.h>
 #include <zfeature_common.h>
+#include <sys/autosnap.h>
 
 #ifdef  __cplusplus
 extern "C" {
 #endif
 
+/*
+ * This (illegal) pool name is used when temporarily importing a spa_t in order
+ * to get the vdev stats associated with the imported devices.
+ */
+#define TRYIMPORT_NAME  "$import"
+
 typedef struct spa_error_entry {
         zbookmark_phys_t        se_bookmark;
         char                    *se_name;
         avl_node_t              se_avl;
 } spa_error_entry_t;

@@ -62,66 +70,10 @@
         uint64_t sh_bof;                /* logical BOF */
         uint64_t sh_eof;                /* logical EOF */
         uint64_t sh_records_lost;       /* num of records overwritten */
 } spa_history_phys_t;
 
-/*
- * All members must be uint64_t, for byteswap purposes.
- */
-typedef struct spa_removing_phys {
-        uint64_t sr_state; /* dsl_scan_state_t */
-
-        /*
-         * The vdev ID that we most recently attempted to remove,
-         * or -1 if no removal has been attempted.
-         */
-        uint64_t sr_removing_vdev;
-
-        /*
-         * The vdev ID that we most recently successfully removed,
-         * or -1 if no devices have been removed.
-         */
-        uint64_t sr_prev_indirect_vdev;
-
-        uint64_t sr_start_time;
-        uint64_t sr_end_time;
-
-        /*
-         * Note that we can not use the space map's or indirect mapping's
-         * accounting as a substitute for these values, because we need to
-         * count frees of not-yet-copied data as though it did the copy.
-         * Otherwise, we could get into a situation where copied > to_copy,
-         * or we complete before copied == to_copy.
-         */
-        uint64_t sr_to_copy; /* bytes that need to be copied */
-        uint64_t sr_copied; /* bytes that have been copied or freed */
-} spa_removing_phys_t;
-
-/*
- * This struct is stored as an entry in the DMU_POOL_DIRECTORY_OBJECT
- * (with key DMU_POOL_CONDENSING_INDIRECT).  It is present if a condense
- * of an indirect vdev's mapping object is in progress.
- */
-typedef struct spa_condensing_indirect_phys {
-        /*
-         * The vdev ID of the indirect vdev whose indirect mapping is
-         * being condensed.
-         */
-        uint64_t        scip_vdev;
-
-        /*
-         * The vdev's old obsolete spacemap.  This spacemap's contents are
-         * being integrated into the new mapping.
-         */
-        uint64_t        scip_prev_obsolete_sm_object;
-
-        /*
-         * The new mapping object that is being created.
-         */
-        uint64_t        scip_next_mapping_object;
-} spa_condensing_indirect_phys_t;
-
 struct spa_aux_vdev {
         uint64_t        sav_object;             /* MOS object for device list */
         nvlist_t        *sav_config;            /* cached device config */
         vdev_t          **sav_vdevs;            /* devices */
         int             sav_count;              /* number devices */

@@ -180,19 +132,62 @@
         AVZ_ACTION_DESTROY,     /* Destroy all per-vdev ZAPs and the AVZ. */
         AVZ_ACTION_REBUILD,     /* Populate the new AVZ, see spa_avz_rebuild */
         AVZ_ACTION_INITIALIZE
 } spa_avz_action_t;
 
-typedef enum spa_config_source {
-        SPA_CONFIG_SRC_NONE = 0,
-        SPA_CONFIG_SRC_SCAN,            /* scan of path (default: /dev/dsk) */
-        SPA_CONFIG_SRC_CACHEFILE,       /* any cachefile */
-        SPA_CONFIG_SRC_TRYIMPORT,       /* returned from call to tryimport */
-        SPA_CONFIG_SRC_SPLIT,           /* new pool in a pool split */
-        SPA_CONFIG_SRC_MOS              /* MOS, but not always from right txg */
-} spa_config_source_t;
+typedef enum spa_watermark {
+        SPA_WM_NONE,
+        SPA_WM_LOW,
+        SPA_WM_HIGH
+} spa_watermark_t;
 
+/*
+ * average utilization, latency and throughput
+ * for spa and special/normal classes
+ */
+typedef struct spa_avg_stat {
+        uint64_t spa_utilization;
+        uint64_t special_utilization;
+        uint64_t normal_utilization;
+        uint64_t special_latency;
+        uint64_t normal_latency;
+        uint64_t special_throughput;
+        uint64_t normal_throughput;
+} spa_avg_stat_t;
+
+typedef struct spa_perfmon_data {
+        kthread_t               *perfmon_thread;
+        boolean_t               perfmon_thr_exit;
+        kmutex_t                perfmon_lock;
+        kcondvar_t              perfmon_cv;
+} spa_perfmon_data_t;
+
+/*
+ * Metaplacement controls 3 types of metadata
+ * (see spa_refine_meta_placement() in special.c):
+ * - DDT-Meta (pool-level property) (see DMU_OT_IS_DDT_META())
+ * - ZPL-Meta (dataset-level property) (see DMU_OT_IS_ZPL_META())
+ * - ZFS-Meta (pool-level property): all other metadata except
+ * DDT-Meta and ZPL-Meta
+ *
+ * spa_enable_meta_placement_selection is the global switch
+ *
+ * spa_small_data_to_special contains the max size of data that
+ * can be placed on the special class
+ *
+ * spa_sync_to_special uses the special device for slog synchronous transactions
+ */
+typedef struct spa_meta_placement {
+        uint64_t spa_enable_meta_placement_selection;
+        uint64_t spa_ddt_meta_to_special;
+        uint64_t spa_zfs_meta_to_special;
+        uint64_t spa_small_data_to_special;
+        uint64_t spa_sync_to_special;
+} spa_meta_placement_t;
+
+typedef struct spa_trimstats spa_trimstats_t;
+
 struct spa {
         /*
          * Fields protected by spa_namespace_lock.
          */
         char            spa_name[ZFS_MAX_DATASET_NAME_LEN];     /* pool name */

@@ -206,19 +201,17 @@
         int             spa_sync_pass;          /* iterate-to-convergence */
         pool_state_t    spa_state;              /* pool state */
         int             spa_inject_ref;         /* injection references */
         uint8_t         spa_sync_on;            /* sync threads are running */
         spa_load_state_t spa_load_state;        /* current load operation */
-        boolean_t       spa_indirect_vdevs_loaded; /* mappings loaded? */
-        boolean_t       spa_trust_config;       /* do we trust vdev tree? */
-        spa_config_source_t spa_config_source;  /* where config comes from? */
         uint64_t        spa_import_flags;       /* import specific flags */
         spa_taskqs_t    spa_zio_taskq[ZIO_TYPES][ZIO_TASKQ_TYPES];
         dsl_pool_t      *spa_dsl_pool;
         boolean_t       spa_is_initializing;    /* true while opening pool */
         metaslab_class_t *spa_normal_class;     /* normal data class */
         metaslab_class_t *spa_log_class;        /* intent log data class */
+        metaslab_class_t *spa_special_class;    /* special usage class */
         uint64_t        spa_first_txg;          /* first txg after spa_open() */
         uint64_t        spa_final_txg;          /* txg of export/destroy */
         uint64_t        spa_freeze_txg;         /* freeze pool at this txg */
         uint64_t        spa_load_max_txg;       /* best initial ub_txg */
         uint64_t        spa_claim_max_txg;      /* highest claimed birth txg */

@@ -234,12 +227,10 @@
         uint64_t        spa_config_guid;        /* config pool guid */
         uint64_t        spa_load_guid;          /* spa_load initialized guid */
         uint64_t        spa_last_synced_guid;   /* last synced guid */
         list_t          spa_config_dirty_list;  /* vdevs with dirty config */
         list_t          spa_state_dirty_list;   /* vdevs with dirty state */
-        kmutex_t        spa_alloc_lock;
-        avl_tree_t      spa_alloc_tree;
         spa_aux_vdev_t  spa_spares;             /* hot spares */
         spa_aux_vdev_t  spa_l2cache;            /* L2ARC cache devices */
         nvlist_t        *spa_label_features;    /* Features for reading MOS */
         uint64_t        spa_config_object;      /* MOS object for pool config */
         uint64_t        spa_config_generation;  /* config generation number */

@@ -251,11 +242,10 @@
         kmutex_t        spa_cksum_tmpls_lock;
         void            *spa_cksum_tmpls[ZIO_CHECKSUM_FUNCTIONS];
         uberblock_t     spa_ubsync;             /* last synced uberblock */
         uberblock_t     spa_uberblock;          /* current uberblock */
         boolean_t       spa_extreme_rewind;     /* rewind past deferred frees */
-        uint64_t        spa_last_io;            /* lbolt of last non-scan I/O */
         kmutex_t        spa_scrub_lock;         /* resilver/scrub lock */
         uint64_t        spa_scrub_inflight;     /* in-flight scrub I/Os */
         kcondvar_t      spa_scrub_io_cv;        /* scrub I/O completion */
         uint8_t         spa_scrub_active;       /* active or suspended? */
         uint8_t         spa_scrub_type;         /* type of scrub we're doing */

@@ -264,25 +254,16 @@
         uint8_t         spa_scrub_reopen;       /* scrub doing vdev_reopen */
         uint64_t        spa_scan_pass_start;    /* start time per pass/reboot */
         uint64_t        spa_scan_pass_scrub_pause; /* scrub pause time */
         uint64_t        spa_scan_pass_scrub_spent_paused; /* total paused */
         uint64_t        spa_scan_pass_exam;     /* examined bytes per pass */
+        uint64_t        spa_scan_pass_work;     /* actually processed bytes */
         kmutex_t        spa_async_lock;         /* protect async state */
         kthread_t       *spa_async_thread;      /* thread doing async task */
         int             spa_async_suspended;    /* async tasks suspended */
         kcondvar_t      spa_async_cv;           /* wait for thread_exit() */
         uint16_t        spa_async_tasks;        /* async task mask */
-        uint64_t        spa_missing_tvds;       /* unopenable tvds on load */
-        uint64_t        spa_missing_tvds_allowed; /* allow loading spa? */
-
-        spa_removing_phys_t spa_removing_phys;
-        spa_vdev_removal_t *spa_vdev_removal;
-
-        spa_condensing_indirect_phys_t  spa_condensing_indirect_phys;
-        spa_condensing_indirect_t       *spa_condensing_indirect;
-        zthr_t          *spa_condense_zthr;     /* zthr doing condense. */
-
         char            *spa_root;              /* alternate root directory */
         uint64_t        spa_ena;                /* spa-wide ereport ENA */
         int             spa_last_open_failed;   /* error if last open failed */
         uint64_t        spa_last_ubsync_txg;    /* "best" uberblock txg */
         uint64_t        spa_last_ubsync_txg_ts; /* timestamp from that ub */

@@ -301,18 +282,21 @@
         uint64_t        spa_history;            /* history object */
         kmutex_t        spa_history_lock;       /* history lock */
         vdev_t          *spa_pending_vdev;      /* pending vdev additions */
         kmutex_t        spa_props_lock;         /* property lock */
         uint64_t        spa_pool_props_object;  /* object for properties */
+        kmutex_t        spa_cos_props_lock;     /* property lock */
+        uint64_t        spa_cos_props_object;   /* object for cos properties */
+        kmutex_t        spa_vdev_props_lock;    /* property lock */
+        uint64_t        spa_vdev_props_object;  /* object for vdev properties */
         uint64_t        spa_bootfs;             /* default boot filesystem */
         uint64_t        spa_failmode;           /* failure mode for the pool */
         uint64_t        spa_delegation;         /* delegation on/off */
         list_t          spa_config_list;        /* previous cache file(s) */
         /* per-CPU array of root of async I/O: */
         zio_t           **spa_async_zio_root;
         zio_t           *spa_suspend_zio_root;  /* root of all suspended I/O */
-        zio_t           *spa_txg_zio[TXG_SIZE]; /* spa_sync() waits for this */
         kmutex_t        spa_suspend_lock;       /* protects suspend_zio_root */
         kcondvar_t      spa_suspend_cv;         /* notification of resume */
         uint8_t         spa_suspended;          /* pool is suspended */
         uint8_t         spa_claiming;           /* pool is doing zil_claim() */
         boolean_t       spa_debug;              /* debug enabled? */

@@ -324,10 +308,12 @@
         uint64_t        spa_bootsize;           /* efi system partition size */
         ddt_t           *spa_ddt[ZIO_CHECKSUM_FUNCTIONS]; /* in-core DDTs */
         uint64_t        spa_ddt_stat_object;    /* DDT statistics */
         uint64_t        spa_dedup_ditto;        /* dedup ditto threshold */
         uint64_t        spa_dedup_checksum;     /* default dedup checksum */
+        uint64_t        spa_ddt_msize;          /* ddt size in core, from ddo */
+        uint64_t        spa_ddt_dsize;          /* ddt size on disk, from ddo */
         uint64_t        spa_dspace;             /* dspace in normal class */
         kmutex_t        spa_vdev_top_lock;      /* dueling offline/remove */
         kmutex_t        spa_proc_lock;          /* protects spa_proc* */
         kcondvar_t      spa_proc_cv;            /* spa_proc_state transitions */
         spa_proc_state_t spa_proc_state;        /* see definition */

@@ -348,40 +334,156 @@
 hrtime_t        spa_sync_starttime;     /* starting time of spa_sync */
         uint64_t        spa_deadman_synctime;   /* deadman expiration timer */
         uint64_t        spa_all_vdev_zaps;      /* ZAP of per-vd ZAP obj #s */
         spa_avz_action_t        spa_avz_action; /* destroy/rebuild AVZ? */
 
+        /* TRIM */
+        uint64_t        spa_force_trim;         /* force sending trim? */
+        uint64_t        spa_auto_trim;          /* see spa_auto_trim_t */
+
+        kmutex_t        spa_auto_trim_lock;
+        kcondvar_t      spa_auto_trim_done_cv;  /* all autotrim thrd's exited */
+        uint64_t        spa_num_auto_trimming;  /* # of autotrim threads */
+        taskq_t         *spa_auto_trim_taskq;
+
+        kmutex_t        spa_man_trim_lock;
+        uint64_t        spa_man_trim_rate;      /* rate of trim in bytes/sec */
+        uint64_t        spa_num_man_trimming;   /* # of manual trim threads */
+        boolean_t       spa_man_trim_stop;      /* requested manual trim stop */
+        kcondvar_t      spa_man_trim_update_cv; /* updates to TRIM settings */
+        kcondvar_t      spa_man_trim_done_cv;   /* manual trim has completed */
+        /* For details on trim start/stop times see spa_get_trim_prog. */
+        uint64_t        spa_man_trim_start_time;
+        uint64_t        spa_man_trim_stop_time;
+        taskq_t         *spa_man_trim_taskq;
+
         /*
          * spa_iokstat_lock protects spa_iokstat and
          * spa_queue_stats[].
          */
         kmutex_t        spa_iokstat_lock;
         struct kstat    *spa_iokstat;           /* kstat of io to this pool */
         struct {
-                int spa_active;
-                int spa_queued;
+                uint64_t spa_active;
+                uint64_t spa_queued;
         } spa_queue_stats[ZIO_PRIORITY_NUM_QUEUEABLE];
 
+        /* Pool-wide scrub & resilver priority values. */
+        uint64_t        spa_scrub_prio;
+        uint64_t        spa_resilver_prio;
+
+        /* TRIM/UNMAP kstats */
+        spa_trimstats_t *spa_trimstats;         /* alloc'd by kstat_create */
+        struct kstat    *spa_trimstats_ks;
+
         hrtime_t        spa_ccw_fail_time;      /* Conf cache write fail time */
 
+        /* total space on all L2ARC devices used for DDT (l2arc_ddt=on) */
+        uint64_t spa_l2arc_ddt_devs_size;
+
+        /* if 1, DDT growth has been stopped for this pool */
+        uint8_t spa_ddt_capped;
+
+        /* specialclass support */
+        boolean_t       spa_usesc;              /* enable special class */
+        uint64_t        spa_special_vdev_correction_rate;
+        uint64_t        spa_minwat;             /* min watermark percent */
+        uint64_t        spa_lowat;              /* low watermark percent */
+        uint64_t        spa_hiwat;              /* high watermark percent */
+        uint64_t        spa_lwm_space;          /* low watermark */
+        uint64_t        spa_hwm_space;          /* high watermark */
+        uint64_t        spa_wbc_wm_range;       /* high wm - low wm */
+        uint8_t         spa_wbc_perc;           /* percent of writes to spec. */
+        spa_watermark_t spa_watermark;
+        boolean_t       spa_special_has_errors;
+
+        /* Write Back Cache */
+        uint64_t        spa_wbc_mode;
+        wbc_data_t      spa_wbc;
+
+        /* cos list */
+        list_t          spa_cos_list;
+
         /*
-         * spa_refcount & spa_config_lock must be the last elements
+         * utilization, latency and throughput statistics per metaslab_class
+         * to aid dynamic balancing of I/O across normal and special classes
+         */
+        uint64_t                spa_avg_stat_rotor;
+        spa_avg_stat_t          spa_avg_stat;
+
+        spa_perfmon_data_t      spa_perfmon;
+
+        /*
+         * Percentage of total write traffic routed to the special class when
+         * the latter is working as a writeback cache.
+         * Note that this value is continuously recomputed at runtime based
+         * on the configured load-balancing mechanism (see
+         * spa_special_selection). For instance, 0% means the special class
+         * is not used for new writes.
+         */
+        uint64_t spa_special_to_normal_ratio;
+
+        /*
+         * last re-routing delta value for the spa_special_to_normal_ratio
+         */
+        int64_t spa_special_to_normal_delta;
+
+        /* target percentage of data to be considered for dedup */
+        int spa_dedup_percentage;
+        uint64_t spa_dedup_rotor;
+
+        /*
+         * spa_refcount & spa_config_lock must be the last elements
          * because refcount_t changes size based on compilation options.
          * In order for the MDB module to function correctly, the other
          * fields must remain in the same location.
          */
         spa_config_lock_t spa_config_lock[SCL_LOCKS]; /* config changes */
         refcount_t      spa_refcount;           /* number of opens */
+
+        uint64_t spa_ddt_meta_copies; /* number of ddt-metadata copies */
+
+        /*
+         * The following two fields restrict the distribution of the
+         * deduplication entries. There are two possible states of these
+         * vars:
+         * 1) min=DITTO, max=DUPLICATED - provides the old behavior
+         * 2) min=DUPLICATED, max=DUPLICATED - new behavior: all entries go
+         * into a single ZAP.
+         */
+        enum ddt_class spa_ddt_class_min;
+        enum ddt_class spa_ddt_class_max;
+
+        spa_meta_placement_t spa_meta_policy;
+
+        uint64_t spa_dedup_best_effort;
+        uint64_t spa_dedup_lo_best_effort;
+        uint64_t spa_dedup_hi_best_effort;
+
+        zfs_autosnap_t spa_autosnap;
+
+        zbookmark_phys_t spa_lszb;
+
+        int spa_obj_mtx_sz;
 };
 
+/* possible in-core size of all DDTs */
+extern uint64_t zfs_ddts_msize;
+
+/* spa sysevent taskq */
+extern taskq_t *spa_sysevent_taskq;
+
 extern const char *spa_config_path;
 
 extern void spa_taskq_dispatch_ent(spa_t *spa, zio_type_t t, zio_taskq_type_t q,
     task_func_t *func, void *arg, uint_t flags, taskq_ent_t *ent);
-extern void spa_load_spares(spa_t *spa);
-extern void spa_load_l2cache(spa_t *spa);
 
+extern void spa_auto_trim_taskq_create(spa_t *spa);
+extern void spa_man_trim_taskq_create(spa_t *spa);
+extern void spa_auto_trim_taskq_destroy(spa_t *spa);
+extern void spa_man_trim_taskq_destroy(spa_t *spa);
+
 #ifdef  __cplusplus
 }
 #endif
 
 #endif  /* _SYS_SPA_IMPL_H */