NEX-9200 Improve the scalability of attribute locking in zfs_zget
Reviewed by: Joyce McIntosh <joyce.mcintosh@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-9552 zfs_scan_idle throttling harms performance and needs to be removed
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
NEX-13140 DVA-throttle support for special-class
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-13937 Improve kstat performance
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
NEX-6088 ZFS scrub/resilver take excessively long due to issuing lots of random IO
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-8711 backport illumos 7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
7136 ESC_VDEV_REMOVE_AUX ought to always include vdev information
7115 6922 generates ESC_ZFS_VDEV_REMOVE_AUX a bit too often
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Robert Mustacchi <rm@joyent.com>
NEX-6884 KRRP: replication deadlock due to unavailable resources
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-5856 ddt_capped isn't reset when deduped dataset is destroyed
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
NEX-5553 ZFS auto-trim, manual-trim and scrub can race and deadlock
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-5795 Rename 'wrc' as 'wbc' in the source and in the tech docs
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5367 special vdev: sync-write options (NEW)
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5318 Cleanup specialclass property (obsolete, not used) and fix related meta-to-special case
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5255 speed-up migration of the write-cached data
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5064 On-demand trim should store operation start and stop time
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5219 WBC: Add capability to delay migration
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5186 smf-tests contains built files and it shouldn't
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
NEX-5168 cleanup and productize non-default latency based writecache load-balancer
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4940 Special Vdev operation in presence (or absence) of IO Errors
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-4934 Add capability to remove special vdev
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4807 writecache load-balancing statistics: several distinct problems, must be revisited and revised
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4876 On-demand TRIM shouldn't use system_taskq and should queue jobs
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4794 Write Back Cache sync and async writes: adjust routing according to watermark limits
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-4620 ZFS autotrim triggering is unreliable
NEX-4622 On-demand TRIM code illogically enumerates metaslabs via mg_ms_tree
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
NEX-4619 Want kstats to monitor TRIM and UNMAP operation
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R (fix studio build)
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Garrett D'Amore <garrett@damore.org>
5818 zfs {ref}compressratio is incorrect with 4k sector size
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Reviewed by: Don Brady <dev.fs.zfs@gmail.com>
Approved by: Albert Lee <trisk@omniti.com>
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Revert "NEX-4476 WRC: Allow to use write back cache per tree of datasets"
This reverts commit fe97b74444278a6f36fec93179133641296312da.
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-3502 dedup ceiling should set a pool prop when cap is in effect
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-3984 On-demand TRIM
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Conflicts:
usr/src/common/zfs/zpool_prop.c
usr/src/uts/common/sys/fs/zfs.h
NEX-3558 KRRP Integration
NEX-3508 CLONE - Port NEX-2946 Add UNMAP/TRIM functionality to ZFS and illumos
Reviewed by: Josef Sipek <josef.sipek@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Conflicts:
usr/src/uts/common/io/scsi/targets/sd.c
usr/src/uts/common/sys/scsi/targets/sddef.h
NEX-3165 need some dedup improvements
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Garrett D'Amore <garrett@damore.org>
SUP-577 deadlock between zpool detach and syseventd
OS-80 support for vdev and CoS properties for the new I/O scheduler
OS-95 lint warning introduced by OS-61
Issue #27: Auto best-effort dedup enable/disable - settable per pool
Issue #7: Reconcile L2ARC and "special" use by datasets
Issue #2: optimize DDE lookup in DDT objects
Added option to control number of classes of DDE's in DDT.
New default is one, that is all DDE's are stored together
regardless of refcount.
Issue #3: Add support for parametrized number of copies for DDTs
Issue #25: Add a pool-level property that controls the number of copies of DDTs in the pool.
re #12643 rb4064 ZFS meta refactoring - vdev utilization tracking, auto-dedup
re #12585 rb4049 ZFS++ work port - refactoring to improve separation of open/closed code, bug fixes, performance improvements - open code
Bug 11205: add missing libzfs_closed_stubs.c to fix opensource-only build.
ZFS plus work: special vdevs, cos, cos/vdev properties
--- old/usr/src/uts/common/fs/zfs/sys/spa_impl.h
+++ new/usr/src/uts/common/fs/zfs/sys/spa_impl.h
1 1 /*
2 2 * CDDL HEADER START
3 3 *
4 4 * The contents of this file are subject to the terms of the
5 5 * Common Development and Distribution License (the "License").
6 6 * You may not use this file except in compliance with the License.
7 7 *
8 8 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
9 9 * or http://www.opensolaris.org/os/licensing.
10 10 * See the License for the specific language governing permissions
11 11 * and limitations under the License.
12 12 *
13 13 * When distributing Covered Code, include this CDDL HEADER in each
14 14 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
15 15 * If applicable, add the following below this CDDL HEADER, with the
16 16 * fields enclosed by brackets "[]" replaced with your own identifying
17 17 * information: Portions Copyright [yyyy] [name of copyright owner]
18 18 *
19 19 * CDDL HEADER END
20 20 */
21 21 /*
22 22 * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
23 - * Copyright (c) 2011, 2018 by Delphix. All rights reserved.
24 - * Copyright 2011 Nexenta Systems, Inc. All rights reserved.
23 + * Copyright (c) 2011, 2015 by Delphix. All rights reserved.
25 24 * Copyright (c) 2014 Spectra Logic Corporation, All rights reserved.
25 + * Copyright 2017 Nexenta Systems, Inc. All rights reserved.
26 26 * Copyright 2013 Saso Kiselkov. All rights reserved.
27 27 * Copyright (c) 2017 Datto Inc.
28 28 */
29 29
30 30 #ifndef _SYS_SPA_IMPL_H
31 31 #define _SYS_SPA_IMPL_H
32 32
33 33 #include <sys/spa.h>
34 34 #include <sys/vdev.h>
35 -#include <sys/vdev_removal.h>
35 +#include <sys/vdev_impl.h>
36 36 #include <sys/metaslab.h>
37 37 #include <sys/dmu.h>
38 38 #include <sys/dsl_pool.h>
39 39 #include <sys/uberblock_impl.h>
40 40 #include <sys/zfs_context.h>
41 41 #include <sys/avl.h>
42 42 #include <sys/refcount.h>
43 43 #include <sys/bplist.h>
44 44 #include <sys/bpobj.h>
45 +#include <sys/special_impl.h>
46 +#include <sys/wbc.h>
45 47 #include <sys/zfeature.h>
46 -#include <sys/zthr.h>
47 48 #include <zfeature_common.h>
49 +#include <sys/autosnap.h>
48 50
49 51 #ifdef __cplusplus
50 52 extern "C" {
51 53 #endif
52 54
55 +/*
56 + * This (illegal) pool name is used when temporarily importing a spa_t in order
57 + * to get the vdev stats associated with the imported devices.
58 + */
59 +#define TRYIMPORT_NAME "$import"
60 +
53 61 typedef struct spa_error_entry {
54 62 zbookmark_phys_t se_bookmark;
55 63 char *se_name;
56 64 avl_node_t se_avl;
57 65 } spa_error_entry_t;
58 66
59 67 typedef struct spa_history_phys {
60 68 uint64_t sh_pool_create_len; /* ending offset of zpool create */
61 69 uint64_t sh_phys_max_off; /* physical EOF */
62 70 uint64_t sh_bof; /* logical BOF */
63 71 uint64_t sh_eof; /* logical EOF */
64 72 uint64_t sh_records_lost; /* num of records overwritten */
65 73 } spa_history_phys_t;
66 74
67 -/*
68 - * All members must be uint64_t, for byteswap purposes.
69 - */
70 -typedef struct spa_removing_phys {
71 - uint64_t sr_state; /* dsl_scan_state_t */
72 -
73 - /*
74 - * The vdev ID that we most recently attempted to remove,
75 - * or -1 if no removal has been attempted.
76 - */
77 - uint64_t sr_removing_vdev;
78 -
79 - /*
80 - * The vdev ID that we most recently successfully removed,
81 - * or -1 if no devices have been removed.
82 - */
83 - uint64_t sr_prev_indirect_vdev;
84 -
85 - uint64_t sr_start_time;
86 - uint64_t sr_end_time;
87 -
88 - /*
89 - * Note that we can not use the space map's or indirect mapping's
90 - * accounting as a substitute for these values, because we need to
91 - * count frees of not-yet-copied data as though it did the copy.
92 - * Otherwise, we could get into a situation where copied > to_copy,
93 - * or we complete before copied == to_copy.
94 - */
95 - uint64_t sr_to_copy; /* bytes that need to be copied */
96 - uint64_t sr_copied; /* bytes that have been copied or freed */
97 -} spa_removing_phys_t;
98 -
99 -/*
100 - * This struct is stored as an entry in the DMU_POOL_DIRECTORY_OBJECT
101 - * (with key DMU_POOL_CONDENSING_INDIRECT). It is present if a condense
102 - * of an indirect vdev's mapping object is in progress.
103 - */
104 -typedef struct spa_condensing_indirect_phys {
105 - /*
106 - * The vdev ID of the indirect vdev whose indirect mapping is
107 - * being condensed.
108 - */
109 - uint64_t scip_vdev;
110 -
111 - /*
112 - * The vdev's old obsolete spacemap. This spacemap's contents are
113 - * being integrated into the new mapping.
114 - */
115 - uint64_t scip_prev_obsolete_sm_object;
116 -
117 - /*
118 - * The new mapping object that is being created.
119 - */
120 - uint64_t scip_next_mapping_object;
121 -} spa_condensing_indirect_phys_t;
122 -
123 75 struct spa_aux_vdev {
124 76 uint64_t sav_object; /* MOS object for device list */
125 77 nvlist_t *sav_config; /* cached device config */
126 78 vdev_t **sav_vdevs; /* devices */
127 79 int sav_count; /* number devices */
128 80 boolean_t sav_sync; /* sync the device list */
129 81 nvlist_t **sav_pending; /* pending device additions */
130 82 uint_t sav_npending; /* # pending devices */
131 83 };
132 84
133 85 typedef struct spa_config_lock {
134 86 kmutex_t scl_lock;
135 87 kthread_t *scl_writer;
136 88 int scl_write_wanted;
137 89 kcondvar_t scl_cv;
138 90 refcount_t scl_count;
139 91 } spa_config_lock_t;
140 92
141 93 typedef struct spa_config_dirent {
142 94 list_node_t scd_link;
143 95 char *scd_path;
144 96 } spa_config_dirent_t;
145 97
146 98 typedef enum zio_taskq_type {
147 99 ZIO_TASKQ_ISSUE = 0,
148 100 ZIO_TASKQ_ISSUE_HIGH,
149 101 ZIO_TASKQ_INTERRUPT,
150 102 ZIO_TASKQ_INTERRUPT_HIGH,
151 103 ZIO_TASKQ_TYPES
152 104 } zio_taskq_type_t;
153 105
154 106 /*
155 107 * State machine for the zpool-poolname process. The states transitions
156 108 * are done as follows:
157 109 *
158 110 * From To Routine
159 111 * PROC_NONE -> PROC_CREATED spa_activate()
160 112 * PROC_CREATED -> PROC_ACTIVE spa_thread()
161 113 * PROC_ACTIVE -> PROC_DEACTIVATE spa_deactivate()
162 114 * PROC_DEACTIVATE -> PROC_GONE spa_thread()
163 115 * PROC_GONE -> PROC_NONE spa_deactivate()
164 116 */
165 117 typedef enum spa_proc_state {
166 118 SPA_PROC_NONE, /* spa_proc = &p0, no process created */
167 119 SPA_PROC_CREATED, /* spa_activate() has proc, is waiting */
168 120 SPA_PROC_ACTIVE, /* taskqs created, spa_proc set */
169 121 SPA_PROC_DEACTIVATE, /* spa_deactivate() requests process exit */
170 122 SPA_PROC_GONE /* spa_thread() is exiting, spa_proc = &p0 */
171 123 } spa_proc_state_t;
172 124
173 125 typedef struct spa_taskqs {
174 126 uint_t stqs_count;
175 127 taskq_t **stqs_taskq;
176 128 } spa_taskqs_t;
177 129
178 130 typedef enum spa_all_vdev_zap_action {
179 131 AVZ_ACTION_NONE = 0,
180 132 AVZ_ACTION_DESTROY, /* Destroy all per-vdev ZAPs and the AVZ. */
181 133 AVZ_ACTION_REBUILD, /* Populate the new AVZ, see spa_avz_rebuild */
182 134 AVZ_ACTION_INITIALIZE
183 135 } spa_avz_action_t;
184 136
185 -typedef enum spa_config_source {
186 - SPA_CONFIG_SRC_NONE = 0,
187 - SPA_CONFIG_SRC_SCAN, /* scan of path (default: /dev/dsk) */
188 - SPA_CONFIG_SRC_CACHEFILE, /* any cachefile */
189 - SPA_CONFIG_SRC_TRYIMPORT, /* returned from call to tryimport */
190 - SPA_CONFIG_SRC_SPLIT, /* new pool in a pool split */
191 - SPA_CONFIG_SRC_MOS /* MOS, but not always from right txg */
192 -} spa_config_source_t;
137 +typedef enum spa_watermark {
138 + SPA_WM_NONE,
139 + SPA_WM_LOW,
140 + SPA_WM_HIGH
141 +} spa_watermark_t;
193 142
143 +/*
144 + * average utilization, latency and throughput
145 + * for spa and special/normal classes
146 + */
147 +typedef struct spa_avg_stat {
148 + uint64_t spa_utilization;
149 + uint64_t special_utilization;
150 + uint64_t normal_utilization;
151 + uint64_t special_latency;
152 + uint64_t normal_latency;
153 + uint64_t special_throughput;
154 + uint64_t normal_throughput;
155 +} spa_avg_stat_t;
156 +
157 +typedef struct spa_perfmon_data {
158 + kthread_t *perfmon_thread;
159 + boolean_t perfmon_thr_exit;
160 + kmutex_t perfmon_lock;
161 + kcondvar_t perfmon_cv;
162 +} spa_perfmon_data_t;
163 +
164 +/*
165 + * Metaplacement controls 3-types of meta
166 + * (see spa_refine_meta_placement() in special.c):
167 + * - DDT-Meta (pool level property) (see DMU_OT_IS_DDT_META())
168 + * - ZPL-Meta (dataset level property) (see DMU_OT_IS_ZPL_META())
169 + * - ZFS-Meta (pool level property) all other metadata except
170 + * DDT-Meta and ZPL-Meta
171 + *
172 + * spa_enable_meta_placement_selection is global switch
173 + *
174 + * spa_small_data_to_special contains max size of data that
175 + * can be placed on special
176 + *
177 + * spa_sync_to_special uses special device for slog synchronous transactions
178 + */
179 +typedef struct spa_meta_placement {
180 + uint64_t spa_enable_meta_placement_selection;
181 + uint64_t spa_ddt_meta_to_special;
182 + uint64_t spa_zfs_meta_to_special;
183 + uint64_t spa_small_data_to_special;
184 + uint64_t spa_sync_to_special;
185 +} spa_meta_placement_t;
186 +
187 +typedef struct spa_trimstats spa_trimstats_t;
188 +
194 189 struct spa {
195 190 /*
196 191 * Fields protected by spa_namespace_lock.
197 192 */
198 193 char spa_name[ZFS_MAX_DATASET_NAME_LEN]; /* pool name */
199 194 char *spa_comment; /* comment */
200 195 avl_node_t spa_avl; /* node in spa_namespace_avl */
201 196 nvlist_t *spa_config; /* last synced config */
202 197 nvlist_t *spa_config_syncing; /* currently syncing config */
203 198 nvlist_t *spa_config_splitting; /* config for splitting */
204 199 nvlist_t *spa_load_info; /* info and errors from load */
205 200 uint64_t spa_config_txg; /* txg of last config change */
206 201 int spa_sync_pass; /* iterate-to-convergence */
207 202 pool_state_t spa_state; /* pool state */
208 203 int spa_inject_ref; /* injection references */
209 204 uint8_t spa_sync_on; /* sync threads are running */
210 205 spa_load_state_t spa_load_state; /* current load operation */
211 - boolean_t spa_indirect_vdevs_loaded; /* mappings loaded? */
212 - boolean_t spa_trust_config; /* do we trust vdev tree? */
213 - spa_config_source_t spa_config_source; /* where config comes from? */
214 206 uint64_t spa_import_flags; /* import specific flags */
215 207 spa_taskqs_t spa_zio_taskq[ZIO_TYPES][ZIO_TASKQ_TYPES];
216 208 dsl_pool_t *spa_dsl_pool;
217 209 boolean_t spa_is_initializing; /* true while opening pool */
218 210 metaslab_class_t *spa_normal_class; /* normal data class */
219 211 metaslab_class_t *spa_log_class; /* intent log data class */
212 + metaslab_class_t *spa_special_class; /* special usage class */
220 213 uint64_t spa_first_txg; /* first txg after spa_open() */
221 214 uint64_t spa_final_txg; /* txg of export/destroy */
222 215 uint64_t spa_freeze_txg; /* freeze pool at this txg */
223 216 uint64_t spa_load_max_txg; /* best initial ub_txg */
224 217 uint64_t spa_claim_max_txg; /* highest claimed birth txg */
225 218 timespec_t spa_loaded_ts; /* 1st successful open time */
226 219 objset_t *spa_meta_objset; /* copy of dp->dp_meta_objset */
227 220 kmutex_t spa_evicting_os_lock; /* Evicting objset list lock */
228 221 list_t spa_evicting_os_list; /* Objsets being evicted. */
229 222 kcondvar_t spa_evicting_os_cv; /* Objset Eviction Completion */
230 223 txg_list_t spa_vdev_txg_list; /* per-txg dirty vdev list */
231 224 vdev_t *spa_root_vdev; /* top-level vdev container */
232 225 int spa_min_ashift; /* of vdevs in normal class */
233 226 int spa_max_ashift; /* of vdevs in normal class */
234 227 uint64_t spa_config_guid; /* config pool guid */
235 228 uint64_t spa_load_guid; /* spa_load initialized guid */
236 229 uint64_t spa_last_synced_guid; /* last synced guid */
237 230 list_t spa_config_dirty_list; /* vdevs with dirty config */
238 231 list_t spa_state_dirty_list; /* vdevs with dirty state */
239 - kmutex_t spa_alloc_lock;
240 - avl_tree_t spa_alloc_tree;
241 232 spa_aux_vdev_t spa_spares; /* hot spares */
242 233 spa_aux_vdev_t spa_l2cache; /* L2ARC cache devices */
243 234 nvlist_t *spa_label_features; /* Features for reading MOS */
244 235 uint64_t spa_config_object; /* MOS object for pool config */
245 236 uint64_t spa_config_generation; /* config generation number */
246 237 uint64_t spa_syncing_txg; /* txg currently syncing */
247 238 bpobj_t spa_deferred_bpobj; /* deferred-free bplist */
248 239 bplist_t spa_free_bplist[TXG_SIZE]; /* bplist of stuff to free */
249 240 zio_cksum_salt_t spa_cksum_salt; /* secret salt for cksum */
250 241 /* checksum context templates */
251 242 kmutex_t spa_cksum_tmpls_lock;
252 243 void *spa_cksum_tmpls[ZIO_CHECKSUM_FUNCTIONS];
253 244 uberblock_t spa_ubsync; /* last synced uberblock */
254 245 uberblock_t spa_uberblock; /* current uberblock */
255 246 boolean_t spa_extreme_rewind; /* rewind past deferred frees */
256 - uint64_t spa_last_io; /* lbolt of last non-scan I/O */
257 247 kmutex_t spa_scrub_lock; /* resilver/scrub lock */
258 248 uint64_t spa_scrub_inflight; /* in-flight scrub I/Os */
259 249 kcondvar_t spa_scrub_io_cv; /* scrub I/O completion */
260 250 uint8_t spa_scrub_active; /* active or suspended? */
261 251 uint8_t spa_scrub_type; /* type of scrub we're doing */
262 252 uint8_t spa_scrub_finished; /* indicator to rotate logs */
263 253 uint8_t spa_scrub_started; /* started since last boot */
264 254 uint8_t spa_scrub_reopen; /* scrub doing vdev_reopen */
265 255 uint64_t spa_scan_pass_start; /* start time per pass/reboot */
266 256 uint64_t spa_scan_pass_scrub_pause; /* scrub pause time */
267 257 uint64_t spa_scan_pass_scrub_spent_paused; /* total paused */
268 258 uint64_t spa_scan_pass_exam; /* examined bytes per pass */
259 + uint64_t spa_scan_pass_work; /* actually processed bytes */
269 260 kmutex_t spa_async_lock; /* protect async state */
270 261 kthread_t *spa_async_thread; /* thread doing async task */
271 262 int spa_async_suspended; /* async tasks suspended */
272 263 kcondvar_t spa_async_cv; /* wait for thread_exit() */
273 264 uint16_t spa_async_tasks; /* async task mask */
274 - uint64_t spa_missing_tvds; /* unopenable tvds on load */
275 - uint64_t spa_missing_tvds_allowed; /* allow loading spa? */
276 -
277 - spa_removing_phys_t spa_removing_phys;
278 - spa_vdev_removal_t *spa_vdev_removal;
279 -
280 - spa_condensing_indirect_phys_t spa_condensing_indirect_phys;
281 - spa_condensing_indirect_t *spa_condensing_indirect;
282 - zthr_t *spa_condense_zthr; /* zthr doing condense. */
283 -
284 265 char *spa_root; /* alternate root directory */
285 266 uint64_t spa_ena; /* spa-wide ereport ENA */
286 267 int spa_last_open_failed; /* error if last open failed */
287 268 uint64_t spa_last_ubsync_txg; /* "best" uberblock txg */
288 269 uint64_t spa_last_ubsync_txg_ts; /* timestamp from that ub */
289 270 uint64_t spa_load_txg; /* ub txg that loaded */
290 271 uint64_t spa_load_txg_ts; /* timestamp from that ub */
291 272 uint64_t spa_load_meta_errors; /* verify metadata err count */
292 273 uint64_t spa_load_data_errors; /* verify data err count */
293 274 uint64_t spa_verify_min_txg; /* start txg of verify scrub */
294 275 kmutex_t spa_errlog_lock; /* error log lock */
295 276 uint64_t spa_errlog_last; /* last error log object */
296 277 uint64_t spa_errlog_scrub; /* scrub error log object */
297 278 kmutex_t spa_errlist_lock; /* error list/ereport lock */
298 279 avl_tree_t spa_errlist_last; /* last error list */
299 280 avl_tree_t spa_errlist_scrub; /* scrub error list */
300 281 uint64_t spa_deflate; /* should we deflate? */
301 282 uint64_t spa_history; /* history object */
302 283 kmutex_t spa_history_lock; /* history lock */
303 284 vdev_t *spa_pending_vdev; /* pending vdev additions */
304 285 kmutex_t spa_props_lock; /* property lock */
305 286 uint64_t spa_pool_props_object; /* object for properties */
287 + kmutex_t spa_cos_props_lock; /* property lock */
288 + uint64_t spa_cos_props_object; /* object for cos properties */
289 + kmutex_t spa_vdev_props_lock; /* property lock */
290 + uint64_t spa_vdev_props_object; /* object for vdev properties */
306 291 uint64_t spa_bootfs; /* default boot filesystem */
307 292 uint64_t spa_failmode; /* failure mode for the pool */
308 293 uint64_t spa_delegation; /* delegation on/off */
309 294 list_t spa_config_list; /* previous cache file(s) */
310 295 /* per-CPU array of root of async I/O: */
311 296 zio_t **spa_async_zio_root;
312 297 zio_t *spa_suspend_zio_root; /* root of all suspended I/O */
313 - zio_t *spa_txg_zio[TXG_SIZE]; /* spa_sync() waits for this */
314 298 kmutex_t spa_suspend_lock; /* protects suspend_zio_root */
315 299 kcondvar_t spa_suspend_cv; /* notification of resume */
316 300 uint8_t spa_suspended; /* pool is suspended */
317 301 uint8_t spa_claiming; /* pool is doing zil_claim() */
318 302 boolean_t spa_debug; /* debug enabled? */
319 303 boolean_t spa_is_root; /* pool is root */
320 304 int spa_minref; /* num refs when first opened */
321 305 int spa_mode; /* FREAD | FWRITE */
322 306 spa_log_state_t spa_log_state; /* log state */
323 307 uint64_t spa_autoexpand; /* lun expansion on/off */
324 308 uint64_t spa_bootsize; /* efi system partition size */
325 309 ddt_t *spa_ddt[ZIO_CHECKSUM_FUNCTIONS]; /* in-core DDTs */
326 310 uint64_t spa_ddt_stat_object; /* DDT statistics */
327 311 uint64_t spa_dedup_ditto; /* dedup ditto threshold */
328 312 uint64_t spa_dedup_checksum; /* default dedup checksum */
313 + uint64_t spa_ddt_msize; /* ddt size in core, from ddo */
314 + uint64_t spa_ddt_dsize; /* ddt size on disk, from ddo */
329 315 uint64_t spa_dspace; /* dspace in normal class */
330 316 kmutex_t spa_vdev_top_lock; /* dueling offline/remove */
331 317 kmutex_t spa_proc_lock; /* protects spa_proc* */
332 318 kcondvar_t spa_proc_cv; /* spa_proc_state transitions */
333 319 spa_proc_state_t spa_proc_state; /* see definition */
334 320 struct proc *spa_proc; /* "zpool-poolname" process */
335 321 uint64_t spa_did; /* if procp != p0, did of t1 */
336 322 boolean_t spa_autoreplace; /* autoreplace set in open */
337 323 int spa_vdev_locks; /* locks grabbed */
338 324 uint64_t spa_creation_version; /* version at pool creation */
339 325 uint64_t spa_prev_software_version; /* See ub_software_version */
340 326 uint64_t spa_feat_for_write_obj; /* required to write to pool */
341 327 uint64_t spa_feat_for_read_obj; /* required to read from pool */
342 328 uint64_t spa_feat_desc_obj; /* Feature descriptions */
343 329 uint64_t spa_feat_enabled_txg_obj; /* Feature enabled txg */
344 330 /* cache feature refcounts */
345 331 uint64_t spa_feat_refcount_cache[SPA_FEATURES];
346 332 cyclic_id_t spa_deadman_cycid; /* cyclic id */
347 333 uint64_t spa_deadman_calls; /* number of deadman calls */
348 334 hrtime_t spa_sync_starttime; /* starting time of spa_sync */
349 335 uint64_t spa_deadman_synctime; /* deadman expiration timer */
350 336 uint64_t spa_all_vdev_zaps; /* ZAP of per-vd ZAP obj #s */
351 337 spa_avz_action_t spa_avz_action; /* destroy/rebuild AVZ? */
352 338
339 + /* TRIM */
340 + uint64_t spa_force_trim; /* force sending trim? */
341 + uint64_t spa_auto_trim; /* see spa_auto_trim_t */
342 +
343 + kmutex_t spa_auto_trim_lock;
344 + kcondvar_t spa_auto_trim_done_cv; /* all autotrim thrd's exited */
345 + uint64_t spa_num_auto_trimming; /* # of autotrim threads */
346 + taskq_t *spa_auto_trim_taskq;
347 +
348 + kmutex_t spa_man_trim_lock;
349 + uint64_t spa_man_trim_rate; /* rate of trim in bytes/sec */
350 + uint64_t spa_num_man_trimming; /* # of manual trim threads */
351 + boolean_t spa_man_trim_stop; /* requested manual trim stop */
352 + kcondvar_t spa_man_trim_update_cv; /* updates to TRIM settings */
353 + kcondvar_t spa_man_trim_done_cv; /* manual trim has completed */
354 + /* For details on trim start/stop times see spa_get_trim_prog. */
355 + uint64_t spa_man_trim_start_time;
356 + uint64_t spa_man_trim_stop_time;
357 + taskq_t *spa_man_trim_taskq;
358 +
353 359 /*
354 360 * spa_iokstat_lock protects spa_iokstat and
355 361 * spa_queue_stats[].
356 362 */
357 363 kmutex_t spa_iokstat_lock;
358 364 struct kstat *spa_iokstat; /* kstat of io to this pool */
359 365 struct {
360 - int spa_active;
361 - int spa_queued;
366 + uint64_t spa_active;
367 + uint64_t spa_queued;
362 368 } spa_queue_stats[ZIO_PRIORITY_NUM_QUEUEABLE];
363 369
370 + /* Pool-wide scrub & resilver priority values. */
371 + uint64_t spa_scrub_prio;
372 + uint64_t spa_resilver_prio;
373 +
374 + /* TRIM/UNMAP kstats */
375 + spa_trimstats_t *spa_trimstats; /* alloc'd by kstat_create */
376 + struct kstat *spa_trimstats_ks;
377 +
364 378 hrtime_t spa_ccw_fail_time; /* Conf cache write fail time */
365 379
380 + /* total space on all L2ARC devices used for DDT (l2arc_ddt=on) */
381 + uint64_t spa_l2arc_ddt_devs_size;
382 +
383 + /* if 1 this means we have stopped DDT growth for this pool */
384 + uint8_t spa_ddt_capped;
385 +
386 + /* specialclass support */
387 + boolean_t spa_usesc; /* enable special class */
388 + uint64_t spa_special_vdev_correction_rate;
389 + uint64_t spa_minwat; /* min watermark percent */
390 + uint64_t spa_lowat; /* low watermark percent */
391 + uint64_t spa_hiwat; /* high watermark percent */
392 + uint64_t spa_lwm_space; /* low watermark */
393 + uint64_t spa_hwm_space; /* high watermark */
394 + uint64_t spa_wbc_wm_range; /* high wm - low wm */
395 + uint8_t spa_wbc_perc; /* percent of writes to spec. */
396 + spa_watermark_t spa_watermark;
397 + boolean_t spa_special_has_errors;
398 +
399 + /* Write Back Cache */
400 + uint64_t spa_wbc_mode;
401 + wbc_data_t spa_wbc;
402 +
403 + /* cos list */
404 + list_t spa_cos_list;
405 +
366 406 /*
367 - * spa_refcount & spa_config_lock must be the last elements
407 + * utilization, latency and throughput statistics per metaslab_class
408 + * to aid dynamic balancing of I/O across normal and special classes
409 + */
410 + uint64_t spa_avg_stat_rotor;
411 + spa_avg_stat_t spa_avg_stat;
412 +
413 + spa_perfmon_data_t spa_perfmon;
414 +
415 + /*
416 + * Percentage of total write traffic routed to the special class when
417 + * the latter is working as writeback cache.
418 + * Note that this value is continuously recomputed at runtime based on
419 + * the configured load-balancing mechanism (see spa_special_selection)
420 + * For instance, 0% would mean that special class is not to be used
421 + * for new writes, etc.
422 + */
423 + uint64_t spa_special_to_normal_ratio;
424 +
425 + /*
426 + * last re-routing delta value for the spa_special_to_normal_ratio
427 + */
428 + int64_t spa_special_to_normal_delta;
429 +
430 + /* target percentage of data to be considered for dedup */
431 + int spa_dedup_percentage;
432 + uint64_t spa_dedup_rotor;
433 +
434 + /*
435 + * spa_refcount & spa_config_lock must be the last elements
368 436 * because refcount_t changes size based on compilation options.
369 437 * In order for the MDB module to function correctly, the other
370 438 * fields must remain in the same location.
371 439 */
372 440 spa_config_lock_t spa_config_lock[SCL_LOCKS]; /* config changes */
373 441 refcount_t spa_refcount; /* number of opens */
442 +
443 + uint64_t spa_ddt_meta_copies; /* amount of ddt-metadata copies */
444 +
445 + /*
446 + * The following two fields are designed to restrict the distribution
447 + * of the deduplication entries. There are two possible states of these
448 + * vars:
449 + * 1) min=DITTO, max=DUPLICATED - it provides the old behavior
450 + * 2) min=DUPLICATED, MAX=DUPLICATED - new behavior: all entries into
451 + * the single zap.
452 + */
453 + enum ddt_class spa_ddt_class_min;
454 + enum ddt_class spa_ddt_class_max;
455 +
456 + spa_meta_placement_t spa_meta_policy;
457 +
458 + uint64_t spa_dedup_best_effort;
459 + uint64_t spa_dedup_lo_best_effort;
460 + uint64_t spa_dedup_hi_best_effort;
461 +
462 + zfs_autosnap_t spa_autosnap;
463 +
464 + zbookmark_phys_t spa_lszb;
465 +
466 + int spa_obj_mtx_sz;
374 467 };
375 468
469 +/* possible in core size of all DDTs */
470 +extern uint64_t zfs_ddts_msize;
471 +
472 +/* spa sysevent taskq */
473 +extern taskq_t *spa_sysevent_taskq;
474 +
376 475 extern const char *spa_config_path;
377 476
378 477 extern void spa_taskq_dispatch_ent(spa_t *spa, zio_type_t t, zio_taskq_type_t q,
379 478 task_func_t *func, void *arg, uint_t flags, taskq_ent_t *ent);
380 -extern void spa_load_spares(spa_t *spa);
381 -extern void spa_load_l2cache(spa_t *spa);
382 479
480 +extern void spa_auto_trim_taskq_create(spa_t *spa);
481 +extern void spa_man_trim_taskq_create(spa_t *spa);
482 +extern void spa_auto_trim_taskq_destroy(spa_t *spa);
483 +extern void spa_man_trim_taskq_destroy(spa_t *spa);
484 +
383 485 #ifdef __cplusplus
384 486 }
385 487 #endif
386 488
387 489 #endif /* _SYS_SPA_IMPL_H */
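
The header above only declares the watermark state (spa_watermark_t, spa_lowat, spa_hiwat, spa_lwm_space, spa_hwm_space) and the write-routing percentage (spa_special_to_normal_ratio); the logic that uses them lives elsewhere and is not part of this diff. The following is a minimal, self-contained sketch of how such fields could plausibly be consumed, not the actual implementation. The helpers wm_space(), wm_classify(), route_to_special() and count_special() are hypothetical names introduced for illustration, and the rotor-based routing is an assumption suggested only by the presence of rotor fields in the struct.

```c
#include <stdint.h>

/* Mirrors the spa_watermark_t enum added by this change. */
typedef enum spa_watermark {
	SPA_WM_NONE,
	SPA_WM_LOW,
	SPA_WM_HIGH
} spa_watermark_t;

/*
 * Hypothetical helper: convert a watermark expressed as a percentage
 * (as in spa_lowat / spa_hiwat) into an absolute amount of space
 * (as in spa_lwm_space / spa_hwm_space). Dividing first avoids
 * overflow for very large capacities at the cost of some precision.
 */
static uint64_t
wm_space(uint64_t capacity, uint64_t percent)
{
	return (capacity / 100 * percent);
}

/*
 * Hypothetical helper: classify the space used on the special class
 * against the low and high watermarks.
 */
static spa_watermark_t
wm_classify(uint64_t used, uint64_t lwm_space, uint64_t hwm_space)
{
	if (used >= hwm_space)
		return (SPA_WM_HIGH);
	if (used >= lwm_space)
		return (SPA_WM_LOW);
	return (SPA_WM_NONE);
}

/*
 * Hypothetical helper: decide whether the next write goes to the
 * special class, given a routing percentage in the range 0..100.
 * The rotor advances on every call, so out of each 100 consecutive
 * writes exactly ratio_percent are routed to special.
 * Returns 1 for special, 0 for normal.
 */
static int
route_to_special(uint64_t *rotor, uint64_t ratio_percent)
{
	uint64_t slot = (*rotor)++ % 100;

	return (slot < ratio_percent);
}

/* Count how many of n consecutive writes would be routed to special. */
static uint64_t
count_special(uint64_t ratio_percent, uint64_t n)
{
	uint64_t rotor = 0;
	uint64_t hits = 0;

	for (uint64_t i = 0; i < n; i++)
		hits += route_to_special(&rotor, ratio_percent);
	return (hits);
}
```

Under these assumptions, a ratio of 0 keeps all new writes off the special class, which matches the comment on spa_special_to_normal_ratio ("0% would mean that special class is not to be used for new writes").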