NEX-13140 DVA-throttle support for special-class
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-4620 ZFS autotrim triggering is unreliable
NEX-4622 On-demand TRIM code illogically enumerates metaslabs via mg_ms_tree
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
NEX-3984 On-demand TRIM
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Conflicts:
        usr/src/common/zfs/zpool_prop.c
        usr/src/uts/common/sys/fs/zfs.h
NEX-3508 CLONE - Port NEX-2946 Add UNMAP/TRIM functionality to ZFS and illumos
Reviewed by: Josef Sipek <josef.sipek@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Conflicts:
    usr/src/uts/common/io/scsi/targets/sd.c
    usr/src/uts/common/sys/scsi/targets/sddef.h

*** 23,32 ****
--- 23,33 ----
   * Use is subject to license terms.
   */
  
  /*
   * Copyright (c) 2011, 2015 by Delphix. All rights reserved.
+  * Copyright 2017 Nexenta Systems, Inc. All rights reserved.
   */
  
  #ifndef	_SYS_METASLAB_IMPL_H
  #define	_SYS_METASLAB_IMPL_H
  
*** 186,195 ****
--- 187,199 ----
  	uint64_t		mc_alloc;	/* total allocated space */
  	uint64_t		mc_deferred;	/* total deferred frees */
  	uint64_t		mc_space;	/* total space (alloc + free) */
  	uint64_t		mc_dspace;	/* total deflated space */
  	uint64_t		mc_histogram[RANGE_TREE_HISTOGRAM_SIZE];
+ 
+ 	kmutex_t		mc_alloc_lock;
+ 	avl_tree_t		mc_alloc_tree;
  };
  
  /*
   * Metaslab groups encapsulate all the allocatable regions (i.e. metaslabs)
   * of a top-level vdev. They are linked togther to form a circular linked
*** 244,253 ****
--- 248,262 ----
  	uint64_t		mg_failed_allocations;
  	uint64_t		mg_fragmentation;
  	uint64_t		mg_histogram[RANGE_TREE_HISTOGRAM_SIZE];
  };
  
+ typedef struct {
+ 	uint64_t	ts_birth;	/* TXG at which this trimset starts */
+ 	range_tree_t	*ts_tree;	/* tree of extents in the trimset */
+ } metaslab_trimset_t;
+ 
  /*
   * This value defines the number of elements in the ms_lbas array. The value
   * of 64 was chosen as it covers all power of 2 buckets up to UINT64_MAX.
   * This is the equivalent of highbit(UINT64_MAX).
   */
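The hunk above introduces metaslab_trimset_t: a birth TXG plus a range tree of extents queued for TRIM. The following is a minimal userland sketch, not part of this change, of how a set of that shape can collect freed extents; the helper names (trimset_create, trimset_add) and the flat extent array standing in for the kernel's range tree are illustrative assumptions only, and error handling is omitted.

#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel range tree: a flat array of extents. */
typedef struct {
    uint64_t e_start;
    uint64_t e_size;
} extent_t;

/* Userland model of metaslab_trimset_t; field names mirror the header. */
typedef struct {
    uint64_t  ts_birth;    /* TXG at which this trimset was opened */
    extent_t *ts_ext;      /* extents queued for a future TRIM pass */
    size_t    ts_count;
    size_t    ts_cap;
} trimset_model_t;

/* Hypothetical constructor: open an empty trimset for the given txg. */
static trimset_model_t *
trimset_create(uint64_t txg)
{
    trimset_model_t *ts = calloc(1, sizeof (*ts));
    ts->ts_birth = txg;
    return (ts);
}

/* Hypothetical: record a freed extent so a later pass can TRIM it. */
static void
trimset_add(trimset_model_t *ts, uint64_t start, uint64_t size)
{
    if (ts->ts_count == ts->ts_cap) {
        ts->ts_cap = (ts->ts_cap == 0) ? 16 : ts->ts_cap * 2;
        ts->ts_ext = realloc(ts->ts_ext, ts->ts_cap * sizeof (extent_t));
    }
    ts->ts_ext[ts->ts_count].e_start = start;
    ts->ts_ext[ts->ts_count].e_size = size;
    ts->ts_count++;
}

In the kernel code the ts_tree member is a range_tree_t, which also merges adjacent extents; the array above only models the "collect now, issue later" idea.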
*** 255,271 ****
  
  /*
   * Each metaslab maintains a set of in-core trees to track metaslab
   * operations. The in-core free tree (ms_tree) contains the list of
   * free segments which are eligible for allocation. As blocks are
! * allocated, the allocated segment are removed from the ms_tree and
! * added to a per txg allocation tree (ms_alloctree). As blocks are
! * freed, they are added to the free tree (ms_freeingtree). These trees
! * allow us to process all allocations and frees in syncing context
! * where it is safe to update the on-disk space maps. An additional set
! * of in-core trees is maintained to track deferred frees
! * (ms_defertree). Once a block is freed it will move from the
   * ms_freedtree to the ms_defertree. A deferred free means that a block
   * has been freed but cannot be used by the pool until TXG_DEFER_SIZE
   * transactions groups later. For example, a block that is freed in txg
   * 50 will not be available for reallocation until txg 52 (50 +
   * TXG_DEFER_SIZE). This provides a safety net for uberblock rollback.
--- 264,281 ----
  
  /*
   * Each metaslab maintains a set of in-core trees to track metaslab
   * operations. The in-core free tree (ms_tree) contains the list of
   * free segments which are eligible for allocation. As blocks are
! * allocated, the allocated segments are removed from the ms_tree and
! * added to a per txg allocation tree (ms_alloctree). This allows us to
! * process all allocations in syncing context where it is safe to update
! * the on-disk space maps. Frees are also processed in syncing context.
! * Most frees are generated from syncing context, and those that are not
! * are held in the spa_free_bplist for processing in syncing context.
! * An additional set of in-core trees is maintained to track deferred
! * frees (ms_defertree). Once a block is freed it will move from the
   * ms_freedtree to the ms_defertree. A deferred free means that a block
   * has been freed but cannot be used by the pool until TXG_DEFER_SIZE
   * transactions groups later. For example, a block that is freed in txg
   * 50 will not be available for reallocation until txg 52 (50 +
   * TXG_DEFER_SIZE). This provides a safety net for uberblock rollback.
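The comment above walks through the deferred-free arithmetic: a block freed in txg 50 becomes reusable in txg 52 (50 + TXG_DEFER_SIZE). A small illustrative program, not part of this change, assuming the usual TXG_DEFER_SIZE of 2: a free is parked in the defer slot indexed by txg % TXG_DEFER_SIZE and returns to the allocatable tree only when that slot comes around again, TXG_DEFER_SIZE txgs later.

#include <stdint.h>
#include <stdio.h>

#define TXG_DEFER_SIZE  2   /* matches the value used by ZFS */

/*
 * Toy model: a block freed in txg T sits in defer slot T % TXG_DEFER_SIZE
 * and becomes allocatable again when that slot is recycled, i.e. in
 * txg T + TXG_DEFER_SIZE.
 */
static uint64_t
txg_block_reusable(uint64_t freed_txg)
{
    return (freed_txg + TXG_DEFER_SIZE);
}

int
main(void)
{
    uint64_t freed = 50;

    printf("freed in txg %llu, defer slot %llu, reusable in txg %llu\n",
        (unsigned long long)freed,
        (unsigned long long)(freed % TXG_DEFER_SIZE),
        (unsigned long long)txg_block_reusable(freed));
    return (0);
}

This reproduces the example in the comment (freed in 50, reusable in 52) and is the safety window that makes uberblock rollback possible.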
*** 307,317 ****
   * ensure that allocations are not performed on the metaslab that is
   * being written.
   */
  struct metaslab {
  	kmutex_t	ms_lock;
- 	kmutex_t	ms_sync_lock;
  	kcondvar_t	ms_load_cv;
  	space_map_t	*ms_sm;
  	uint64_t	ms_id;
  	uint64_t	ms_start;
  	uint64_t	ms_size;
--- 317,326 ----
*** 318,327 ****
--- 327,341 ----
  	uint64_t	ms_fragmentation;
  
  	range_tree_t	*ms_alloctree[TXG_SIZE];
  	range_tree_t	*ms_tree;
  
+ 	metaslab_trimset_t *ms_cur_ts;	/* currently prepared trims */
+ 	metaslab_trimset_t *ms_prev_ts;	/* previous (aging) trims */
+ 	kcondvar_t	ms_trim_cv;
+ 	metaslab_trimset_t *ms_trimming_ts;
+ 
  	/*
  	 * The following range trees are accessed only from syncing context.
  	 * ms_free*tree only have entries while syncing, and are empty
  	 * between syncs.
  	 */
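Based only on the field comments above, the per-metaslab trim state appears to rotate through three slots: extents are collected into ms_cur_ts, aged into ms_prev_ts, and handed to ms_trimming_ts while a TRIM is in flight, with ms_trim_cv used to wait on an in-flight trim. The rough userland sketch below models that rotation under those assumptions; the helper names are hypothetical and the kernel code would do all of this under ms_lock, which is omitted here.

#include <stddef.h>

typedef struct trimset trimset_t;   /* opaque stand-in for metaslab_trimset_t */

typedef struct {
    trimset_t *ms_cur_ts;       /* collecting newly freed extents */
    trimset_t *ms_prev_ts;      /* aged set, waiting to be trimmed */
    trimset_t *ms_trimming_ts;  /* set currently handed to the TRIM issuer */
} trim_slots_t;

/* Hypothetical: start collecting into a fresh set and age the current one. */
static void
trim_slots_age(trim_slots_t *m, trimset_t *fresh)
{
    m->ms_prev_ts = m->ms_cur_ts;
    m->ms_cur_ts = fresh;
}

/* Hypothetical: hand the aged set to the TRIM issuer if none is in flight. */
static trimset_t *
trim_slots_start_trim(trim_slots_t *m)
{
    if (m->ms_trimming_ts != NULL || m->ms_prev_ts == NULL)
        return (NULL);      /* trim already running, or nothing aged yet */
    m->ms_trimming_ts = m->ms_prev_ts;
    m->ms_prev_ts = NULL;
    return (m->ms_trimming_ts);
}

/* Hypothetical: the issuer finished; in the kernel this would signal ms_trim_cv. */
static void
trim_slots_done(trim_slots_t *m)
{
    m->ms_trimming_ts = NULL;
}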