8520 lzc_rollback_to should support rolling back to origin
7198 libzfs should gracefully handle EINVAL from lzc_rollback
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
NEX-9673 Add capability to replicate cloned datasets relative to origin
Reviewed by: Alex Deiter <alex.deiter@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-13340 Continuous replication service may fail if a user removes nested datasets
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-3562 filename normalization doesn't work for removes (sync with upstream)
NEX-9406 Add a property to show that a dataset has been modified since a snapshot
Reviewed by: Alexey Komarov <alexey.komarov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Revert "NEX-7251 Resume_token is not cleared right after finishing receive"
This reverts commit 9e97a45e8cf6ca59307a39e2d3c11c6e845e4187.
NEX-7251 Resume_token is not cleared right after finishing receive
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alexey Komarov <alexey.komarov@nexenta.com>
NEX-6815 KRRP: 'sess-send-stop' hangs forever if the sources pool does not have free space
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
NEX-5366 Race between unique_insert() and unique_remove() causes ZFS fsid change
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Dan Vatca <dan.vatca@gmail.com>
NEX-5795 Rename 'wrc' as 'wbc' in the source and in the tech docs
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5272 KRRP: replicate snapshot properties
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Alexey Komarov <alexey.komarov@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
6328 Fix cstyle errors in zfs codebase (fix studio)
6328 Fix cstyle errors in zfs codebase
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Jorgen Lundman <lundman@lundman.net>
Approved by: Robert Mustacchi <rm@joyent.com>
2605 want to resume interrupted zfs send
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Arne Jansen <sensille@gmx.net>
Approved by: Dan McDonald <danmcd@omniti.com>
6160 /usr/lib/fs/zfs/bootinstall should use bootadm
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Adam Števko <adam.stevko@gmail.com>
Reviewed by: Josef Sipek <jeffpc@josefsipek.net>
Approved by: Richard Lowe <richlowe@richlowe.net>
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R (NULL is not an int)
6171 dsl_prop_unregister() slows down dataset eviction.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R (fix studio build)
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Garrett D'Amore <garrett@damore.org>
6047 SPARC boot should support feature@embedded_data
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
5959 clean up per-dataset feature count code
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
NEX-4582 update wrc test cases for allow to use write back cache per tree of datasets
Reviewed by: Steve Peng <steve.peng@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
5909 ensure that shared snap names don't become too long after promotion
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
5393 spurious failures from dsl_dataset_hold_obj()
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Will Andrews <willa@spectralogic.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Approved by: Dan McDonald <danmcd@omniti.com>
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Revert "NEX-4476 WRC: Allow to use write back cache per tree of datasets"
This reverts commit fe97b74444278a6f36fec93179133641296312da.
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-3964 It should not be allowed to rename a snapshot that its new name is matched to the prefix of in-kernel autosnapshots
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
NEX-3669 Faults for fans that don't exist
Reviewed by: Jeffry Molanus <jeffry.molanus@nexenta.com>
NEX-3891 Hide the snapshots that belong to in-kernel autosnap-service
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Alek Pinchuk <alek@nexenta.com>
NEX-3329 libnsl: set_up_connection() over TCP does not adhere the specified timeout
Reviewed by: Dan Fields <dan.fields@nexenta.com>
NEX-3521 CLONE - Port NEX-3209 normalization=formD and casesensitivity=mixed behaves improperly, squashing case
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Dan Fields <dan.fields@nexenta.com>
NEX-3558 KRRP Integration
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Garrett D'Amore <garrett@damore.org>
re #13318 rb4428 kernel panic during failover with autosync running on the active node
re #12619 rb4429 More dp->dp_config_rwlock holds
re #8279 rb3915 need a mechanism to notify NMS about ZFS config changes (fix lint -courtesy of Yuri Pankov)
re #12584 rb4049 zfsxx latest code merge (fix lint - courtesy of Yuri Pankov)
re #12585 rb4049 ZFS++ work port - refactoring to improve separation of open/closed code, bug fixes, performance improvements - open code
re 12063 rb 3793 Panic on zpool destroy
Bug 11205: add missing libzfs_closed_stubs.c to fix opensource-only build.
ZFS plus work: special vdevs, cos, cos/vdev properties


  29  * Copyright 2016, OmniTI Computer Consulting, Inc. All rights reserved.
  30  * Copyright 2017 Nexenta Systems, Inc.
  31  */
  32 
  33 #include <sys/dmu_objset.h>
  34 #include <sys/dsl_dataset.h>
  35 #include <sys/dsl_dir.h>
  36 #include <sys/dsl_prop.h>
  37 #include <sys/dsl_synctask.h>
  38 #include <sys/dmu_traverse.h>
  39 #include <sys/dmu_impl.h>
  40 #include <sys/dmu_tx.h>
  41 #include <sys/arc.h>
  42 #include <sys/zio.h>
  43 #include <sys/zap.h>
  44 #include <sys/zfeature.h>
  45 #include <sys/unique.h>
  46 #include <sys/zfs_context.h>
  47 #include <sys/zfs_ioctl.h>
  48 #include <sys/spa.h>
  49 #include <sys/vdev.h>
  50 #include <sys/zfs_znode.h>
  51 #include <sys/zfs_onexit.h>
  52 #include <sys/zvol.h>
  53 #include <sys/dsl_scan.h>
  54 #include <sys/dsl_deadlist.h>
  55 #include <sys/dsl_destroy.h>
  56 #include <sys/dsl_userhold.h>
  57 #include <sys/dsl_bookmark.h>

  58 #include <sys/dmu_send.h>
  59 #include <sys/zio_checksum.h>
  60 #include <sys/zio_compress.h>
  61 #include <zfs_fletcher.h>
  62 
  63 /*
  64  * The SPA supports block sizes up to 16MB.  However, very large blocks
  65  * can have an impact on i/o latency (e.g. tying up a spinning disk for
  66  * ~300ms), and also potentially on the memory allocator.  Therefore,
  67  * we do not allow the recordsize to be set larger than zfs_max_recordsize
  68  * (default 1MB).  Larger blocks can be created by changing this tunable,
  69  * and pools with larger blocks can always be imported and used, regardless
  70  * of this setting.
  71  */
  72 int zfs_max_recordsize = 1 * 1024 * 1024;
  73 
  74 #define SWITCH64(x, y) \
  75         { \
  76                 uint64_t __tmp = (x); \
  77                 (x) = (y); \
  78                 (y) = __tmp; \
  79         }
  80 
  81 #define DS_REF_MAX      (1ULL << 62)
  82 
  83 extern inline dsl_dataset_phys_t *dsl_dataset_phys(dsl_dataset_t *ds);
  84 
  85 static void dsl_dataset_set_remap_deadlist_object(dsl_dataset_t *ds,
  86     uint64_t obj, dmu_tx_t *tx);
  87 static void dsl_dataset_unset_remap_deadlist_object(dsl_dataset_t *ds,
  88     dmu_tx_t *tx);
  89 
  90 extern int spa_asize_inflation;
  91 
  92 static zil_header_t zero_zil;
  93 
  94 /*
  95  * Figure out how much of this delta should be propagated to the dsl_dir
  96  * layer.  If there's a refreservation, that space has already been
  97  * partially accounted for in our ancestors.
  98  */
  99 static int64_t
 100 parent_delta(dsl_dataset_t *ds, int64_t delta)
 101 {
 102         dsl_dataset_phys_t *ds_phys;
 103         uint64_t old_bytes, new_bytes;
 104 
 105         if (ds->ds_reserved == 0)
 106                 return (delta);
 107 
 108         ds_phys = dsl_dataset_phys(ds);
 109         old_bytes = MAX(ds_phys->ds_unique_bytes, ds->ds_reserved);
 110         new_bytes = MAX(ds_phys->ds_unique_bytes + delta, ds->ds_reserved);
 111 
 112         ASSERT3U(ABS((int64_t)(new_bytes - old_bytes)), <=, ABS(delta));
 113         return (new_bytes - old_bytes);
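The refreservation clamping above can be modeled as a standalone helper (a hypothetical, simplified sketch using plain integers rather than the kernel's `dsl_dataset_t`):

```c
#include <stdint.h>

#define	MAX(a, b)	((a) > (b) ? (a) : (b))

/*
 * Simplified model of parent_delta(): with a refreservation, space up
 * to "reserved" has already been charged to the ancestors, so only the
 * change in MAX(unique_bytes, reserved) propagates to the dsl_dir.
 */
int64_t
parent_delta_model(uint64_t unique_bytes, uint64_t reserved, int64_t delta)
{
	uint64_t old_bytes, new_bytes;

	if (reserved == 0)
		return (delta);

	old_bytes = MAX(unique_bytes, reserved);
	new_bytes = MAX(unique_bytes + delta, reserved);
	return ((int64_t)(new_bytes - old_bytes));
}
```

For example, with 100 unique bytes, a 200-byte refreservation, and a 50-byte free (delta of -50), nothing propagates upward: the reservation already accounts for that space.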


 143         dsl_dataset_phys(ds)->ds_compressed_bytes += compressed;
 144         dsl_dataset_phys(ds)->ds_uncompressed_bytes += uncompressed;
 145         dsl_dataset_phys(ds)->ds_unique_bytes += used;
 146 
 147         if (BP_GET_LSIZE(bp) > SPA_OLD_MAXBLOCKSIZE) {
 148                 ds->ds_feature_activation_needed[SPA_FEATURE_LARGE_BLOCKS] =
 149                     B_TRUE;
 150         }
 151 
 152         spa_feature_t f = zio_checksum_to_feature(BP_GET_CHECKSUM(bp));
 153         if (f != SPA_FEATURE_NONE)
 154                 ds->ds_feature_activation_needed[f] = B_TRUE;
 155 
 156         mutex_exit(&ds->ds_lock);
 157         dsl_dir_diduse_space(ds->ds_dir, DD_USED_HEAD, delta,
 158             compressed, uncompressed, tx);
 159         dsl_dir_transfer_space(ds->ds_dir, used - delta,
 160             DD_USED_REFRSRV, DD_USED_HEAD, tx);
 161 }
 162 
 163 /*
 164  * Called when the specified segment has been remapped, and is thus no
 165  * longer referenced in the head dataset.  The vdev must be indirect.
 166  *
 167  * If the segment is referenced by a snapshot, put it on the remap deadlist.
 168  * Otherwise, add this segment to the obsolete spacemap.
 169  */
 170 void
 171 dsl_dataset_block_remapped(dsl_dataset_t *ds, uint64_t vdev, uint64_t offset,
 172     uint64_t size, uint64_t birth, dmu_tx_t *tx)
 173 {
 174         spa_t *spa = ds->ds_dir->dd_pool->dp_spa;
 175 
 176         ASSERT(dmu_tx_is_syncing(tx));
 177         ASSERT(birth <= tx->tx_txg);
 178         ASSERT(!ds->ds_is_snapshot);
 179 
 180         if (birth > dsl_dataset_phys(ds)->ds_prev_snap_txg) {
 181                 spa_vdev_indirect_mark_obsolete(spa, vdev, offset, size, tx);
 182         } else {
 183                 blkptr_t fakebp;
 184                 dva_t *dva = &fakebp.blk_dva[0];
 185 
 186                 ASSERT(ds != NULL);
 187 
 188                 mutex_enter(&ds->ds_remap_deadlist_lock);
 189                 if (!dsl_dataset_remap_deadlist_exists(ds)) {
 190                         dsl_dataset_create_remap_deadlist(ds, tx);
 191                 }
 192                 mutex_exit(&ds->ds_remap_deadlist_lock);
 193 
 194                 BP_ZERO(&fakebp);
 195                 fakebp.blk_birth = birth;
 196                 DVA_SET_VDEV(dva, vdev);
 197                 DVA_SET_OFFSET(dva, offset);
 198                 DVA_SET_ASIZE(dva, size);
 199 
 200                 dsl_deadlist_insert(&ds->ds_remap_deadlist, &fakebp, tx);
 201         }
 202 }
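The routing decision in dsl_dataset_block_remapped() comes down to one txg comparison, which can be sketched on its own (hypothetical names, not the kernel's):

```c
#include <stdint.h>

typedef enum {
	REMAP_OBSOLETE_MAP,	/* only the head references the segment */
	REMAP_DEADLIST		/* a snapshot may still reference it */
} remap_target_t;

/*
 * A segment born after the most recent snapshot can only be referenced
 * by the head dataset, so it goes straight to the obsolete spacemap;
 * anything older may still be referenced by a snapshot and must stay
 * on the remap deadlist until those snapshots are destroyed.
 */
remap_target_t
classify_remapped_segment(uint64_t birth_txg, uint64_t prev_snap_txg)
{
	return (birth_txg > prev_snap_txg ?
	    REMAP_OBSOLETE_MAP : REMAP_DEADLIST);
}
```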
 203 
 204 int
 205 dsl_dataset_block_kill(dsl_dataset_t *ds, const blkptr_t *bp, dmu_tx_t *tx,
 206     boolean_t async)
 207 {
 208         int used = bp_get_dsize_sync(tx->tx_pool->dp_spa, bp);
 209         int compressed = BP_GET_PSIZE(bp);
 210         int uncompressed = BP_GET_UCSIZE(bp);

 211 
 212         if (BP_IS_HOLE(bp))
 213                 return (0);
 214 
 215         ASSERT(dmu_tx_is_syncing(tx));
 216         ASSERT(bp->blk_birth <= tx->tx_txg);
 217 
 218         if (ds == NULL) {
 219                 dsl_free(tx->tx_pool, tx->tx_txg, bp);
 220                 dsl_pool_mos_diduse_space(tx->tx_pool,
 221                     -used, -compressed, -uncompressed);
 222                 return (used);
 223         }
 224         ASSERT3P(tx->tx_pool, ==, ds->ds_dir->dd_pool);
 225 
 226         ASSERT(!ds->ds_is_snapshot);
 227         dmu_buf_will_dirty(ds->ds_dbuf, tx);
 228 
 229         if (bp->blk_birth > dsl_dataset_phys(ds)->ds_prev_snap_txg) {
 230                 int64_t delta;
 231 
 232                 dprintf_bp(bp, "freeing ds=%llu", ds->ds_object);
 233                 dsl_free(tx->tx_pool, tx->tx_txg, bp);
 234 
 235                 mutex_enter(&ds->ds_lock);
 236                 ASSERT(dsl_dataset_phys(ds)->ds_unique_bytes >= used ||
 237                     !DS_UNIQUE_IS_ACCURATE(ds));
 238                 delta = parent_delta(ds, -used);
 239                 dsl_dataset_phys(ds)->ds_unique_bytes -= used;
 240                 mutex_exit(&ds->ds_lock);
 241                 dsl_dir_diduse_space(ds->ds_dir, DD_USED_HEAD,
 242                     delta, -compressed, -uncompressed, tx);
 243                 dsl_dir_transfer_space(ds->ds_dir, -used - delta,
 244                     DD_USED_REFRSRV, DD_USED_HEAD, tx);
 245         } else {
 246                 dprintf_bp(bp, "putting on dead list: %s", "");
 247                 if (async) {
 248                         /*
 249                          * We are here as part of zio's write done callback,
 250                          * which means we're a zio interrupt thread.  We can't
 251                          * call dsl_deadlist_insert() now because it may block
 252                          * waiting for I/O.  Instead, put bp on the deferred
 253                          * queue and let dsl_pool_sync() finish the job.
 254                          */


 302 }
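The async branch of dsl_dataset_block_kill() above defers the blocking dsl_deadlist_insert() because it may be running on a zio interrupt thread. The underlying pattern is a non-blocking producer list drained later in syncing context; a minimal user-space sketch (hypothetical names, single-threaded for clarity):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one deferred free. */
typedef struct pending_free {
	struct pending_free	*pf_next;
	uint64_t		pf_blkid;
} pending_free_t;

/*
 * Interrupt-context side: linking onto a list never blocks, so this
 * is safe from a zio write-done callback.
 */
void
pending_free_push(pending_free_t **head, pending_free_t *pf)
{
	pf->pf_next = *head;
	*head = pf;
}

/*
 * Syncing-context side: drain the list and do the work that may block
 * on I/O (the dsl_deadlist_insert() calls in the real code).
 */
int
pending_free_drain(pending_free_t **head)
{
	int drained = 0;

	while (*head != NULL) {
		pending_free_t *pf = *head;
		*head = pf->pf_next;
		/* ... blocking deadlist insert would happen here ... */
		free(pf);
		drained++;
	}
	return (drained);
}
```

In the kernel the equivalent of the push is bplist_append() on ds_pending_deadlist, and the drain happens from dsl_pool_sync().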
 303 
 304 static void
 305 dsl_dataset_evict_async(void *dbu)
 306 {
 307         dsl_dataset_t *ds = dbu;
 308 
 309         ASSERT(ds->ds_owner == NULL);
 310 
 311         ds->ds_dbuf = NULL;
 312 
 313         if (ds->ds_objset != NULL)
 314                 dmu_objset_evict(ds->ds_objset);
 315 
 316         if (ds->ds_prev) {
 317                 dsl_dataset_rele(ds->ds_prev, ds);
 318                 ds->ds_prev = NULL;
 319         }
 320 
 321         bplist_destroy(&ds->ds_pending_deadlist);
 322         if (dsl_deadlist_is_open(&ds->ds_deadlist))
 323                 dsl_deadlist_close(&ds->ds_deadlist);
 324         if (dsl_deadlist_is_open(&ds->ds_remap_deadlist))
 325                 dsl_deadlist_close(&ds->ds_remap_deadlist);
 326         if (ds->ds_dir)
 327                 dsl_dir_async_rele(ds->ds_dir, ds);
 328 
 329         ASSERT(!list_link_active(&ds->ds_synced_link));
 330 
 331         list_destroy(&ds->ds_prop_cbs);
 332         mutex_destroy(&ds->ds_lock);
 333         mutex_destroy(&ds->ds_opening_lock);
 334         mutex_destroy(&ds->ds_sendstream_lock);
 335         mutex_destroy(&ds->ds_remap_deadlist_lock);
 336         refcount_destroy(&ds->ds_longholds);
 337         rrw_destroy(&ds->ds_bp_rwlock);
 338 
 339         kmem_free(ds, sizeof (dsl_dataset_t));
 340 }
 341 
 342 int
 343 dsl_dataset_get_snapname(dsl_dataset_t *ds)
 344 {
 345         dsl_dataset_phys_t *headphys;
 346         int err;
 347         dmu_buf_t *headdbuf;
 348         dsl_pool_t *dp = ds->ds_dir->dd_pool;
 349         objset_t *mos = dp->dp_meta_objset;
 350 
 351         if (ds->ds_snapname[0])
 352                 return (0);
 353         if (dsl_dataset_phys(ds)->ds_next_snap_obj == 0)
 354                 return (0);
 355 


 440         err = dmu_bonus_hold(mos, dsobj, tag, &dbuf);
 441         if (err != 0)
 442                 return (err);
 443 
 444         /* Make sure dsobj has the correct object type. */
 445         dmu_object_info_from_db(dbuf, &doi);
 446         if (doi.doi_bonus_type != DMU_OT_DSL_DATASET) {
 447                 dmu_buf_rele(dbuf, tag);
 448                 return (SET_ERROR(EINVAL));
 449         }
 450 
 451         ds = dmu_buf_get_user(dbuf);
 452         if (ds == NULL) {
 453                 dsl_dataset_t *winner = NULL;
 454 
 455                 ds = kmem_zalloc(sizeof (dsl_dataset_t), KM_SLEEP);
 456                 ds->ds_dbuf = dbuf;
 457                 ds->ds_object = dsobj;
 458                 ds->ds_is_snapshot = dsl_dataset_phys(ds)->ds_num_children != 0;
 459 
 460                 err = dsl_dir_hold_obj(dp, dsl_dataset_phys(ds)->ds_dir_obj,
 461                     NULL, ds, &ds->ds_dir);
 462                 if (err != 0) {
 463                         kmem_free(ds, sizeof (dsl_dataset_t));
 464                         dmu_buf_rele(dbuf, tag);
 465                         return (err);
 466                 }
 467 
 468                 mutex_init(&ds->ds_lock, NULL, MUTEX_DEFAULT, NULL);
 469                 mutex_init(&ds->ds_opening_lock, NULL, MUTEX_DEFAULT, NULL);
 470                 mutex_init(&ds->ds_sendstream_lock, NULL, MUTEX_DEFAULT, NULL);
 471                 mutex_init(&ds->ds_remap_deadlist_lock,
 472                     NULL, MUTEX_DEFAULT, NULL);
 473                 rrw_init(&ds->ds_bp_rwlock, B_FALSE);
 474                 refcount_create(&ds->ds_longholds);
 475 
 476                 bplist_create(&ds->ds_pending_deadlist);


 477 
 478                 list_create(&ds->ds_sendstreams, sizeof (dmu_sendarg_t),
 479                     offsetof(dmu_sendarg_t, dsa_link));
 480 
 481                 list_create(&ds->ds_prop_cbs, sizeof (dsl_prop_cb_record_t),
 482                     offsetof(dsl_prop_cb_record_t, cbr_ds_node));
 483 
 484                 if (doi.doi_type == DMU_OTN_ZAP_METADATA) {
 485                         for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {
 486                                 if (!(spa_feature_table[f].fi_flags &
 487                                     ZFEATURE_FLAG_PER_DATASET))
 488                                         continue;
 489                                 err = zap_contains(mos, dsobj,
 490                                     spa_feature_table[f].fi_guid);
 491                                 if (err == 0) {
 492                                         ds->ds_feature_inuse[f] = B_TRUE;
 493                                 } else {
 494                                         ASSERT3U(err, ==, ENOENT);
 495                                         err = 0;
 496                                 }
 497                         }
 498                 }
 499 
 500                 if (!ds->ds_is_snapshot) {
 501                         ds->ds_snapname[0] = '\0';
 502                         if (dsl_dataset_phys(ds)->ds_prev_snap_obj != 0) {
 503                                 err = dsl_dataset_hold_obj(dp,
 504                                     dsl_dataset_phys(ds)->ds_prev_snap_obj,
 505                                     ds, &ds->ds_prev);
 506                         }
 507                         if (doi.doi_type == DMU_OTN_ZAP_METADATA) {
 508                                 int zaperr = zap_lookup(mos, ds->ds_object,
 509                                     DS_FIELD_BOOKMARK_NAMES,
 510                                     sizeof (ds->ds_bookmarks), 1,
 511                                     &ds->ds_bookmarks);
 512                                 if (zaperr != ENOENT)
 513                                         VERIFY0(zaperr);
 514                         }
 515                 } else {
 516                         if (zfs_flags & ZFS_DEBUG_SNAPNAMES)
 517                                 err = dsl_dataset_get_snapname(ds);
 518                         if (err == 0 &&
 519                             dsl_dataset_phys(ds)->ds_userrefs_obj != 0) {
 520                                 err = zap_count(
 521                                     ds->ds_dir->dd_pool->dp_meta_objset,
 522                                     dsl_dataset_phys(ds)->ds_userrefs_obj,
 523                                     &ds->ds_userrefs);
 524                         }
 525                 }
 526 
 527                 if (err == 0 && !ds->ds_is_snapshot) {
 528                         err = dsl_prop_get_int_ds(ds,
 529                             zfs_prop_to_name(ZFS_PROP_REFRESERVATION),
 530                             &ds->ds_reserved);
 531                         if (err == 0) {
 532                                 err = dsl_prop_get_int_ds(ds,
 533                                     zfs_prop_to_name(ZFS_PROP_REFQUOTA),
 534                                     &ds->ds_quota);
 535                         }
 536                 } else {
 537                         ds->ds_reserved = ds->ds_quota = 0;
 538                 }
 539 
 540                 dsl_deadlist_open(&ds->ds_deadlist,
 541                     mos, dsl_dataset_phys(ds)->ds_deadlist_obj);
 542                 uint64_t remap_deadlist_obj =
 543                     dsl_dataset_get_remap_deadlist_object(ds);
 544                 if (remap_deadlist_obj != 0) {
 545                         dsl_deadlist_open(&ds->ds_remap_deadlist, mos,
 546                             remap_deadlist_obj);
 547                 }
 548 
 549                 dmu_buf_init_user(&ds->ds_dbu, dsl_dataset_evict_sync,
 550                     dsl_dataset_evict_async, &ds->ds_dbuf);
 551                 if (err == 0)
 552                         winner = dmu_buf_set_user_ie(dbuf, &ds->ds_dbu);
 553 
 554                 if (err != 0 || winner != NULL) {
 555                         bplist_destroy(&ds->ds_pending_deadlist);
 556                         dsl_deadlist_close(&ds->ds_deadlist);
 557                         if (dsl_deadlist_is_open(&ds->ds_remap_deadlist))
 558                                 dsl_deadlist_close(&ds->ds_remap_deadlist);
 559                         if (ds->ds_prev)
 560                                 dsl_dataset_rele(ds->ds_prev, ds);
 561                         dsl_dir_rele(ds->ds_dir, ds);
 562                         mutex_destroy(&ds->ds_lock);
 563                         mutex_destroy(&ds->ds_opening_lock);
 564                         mutex_destroy(&ds->ds_sendstream_lock);
 565                         refcount_destroy(&ds->ds_longholds);
 566                         kmem_free(ds, sizeof (dsl_dataset_t));
 567                         if (err != 0) {
 568                                 dmu_buf_rele(dbuf, tag);
 569                                 return (err);
 570                         }
 571                         ds = winner;
 572                 } else {
 573                         ds->ds_fsid_guid =
 574                             unique_insert(dsl_dataset_phys(ds)->ds_fsid_guid);
 575                         if (ds->ds_fsid_guid !=
 576                             dsl_dataset_phys(ds)->ds_fsid_guid) {
 577                                 zfs_dbgmsg("ds_fsid_guid changed from "
 578                                     "%llx to %llx for pool %s dataset id %llu",


1186                     ZFS_PROP_SNAPSHOT_LIMIT, NULL, cr);
1187                 if (error != 0)
1188                         return (error);
1189         }
1190 
1191         error = dsl_dataset_snapshot_reserve_space(ds, tx);
1192         if (error != 0)
1193                 return (error);
1194 
1195         return (0);
1196 }
1197 
1198 int
1199 dsl_dataset_snapshot_check(void *arg, dmu_tx_t *tx)
1200 {
1201         dsl_dataset_snapshot_arg_t *ddsa = arg;
1202         dsl_pool_t *dp = dmu_tx_pool(tx);
1203         nvpair_t *pair;
1204         int rv = 0;
1205 
1206         /*
1207          * Pre-compute how many total new snapshots will be created for each
1208          * level in the tree and below. This is needed for validating the
1209          * snapshot limit when either taking a recursive snapshot or when
1210          * taking multiple snapshots.
1211          *
1212          * The problem is that the counts are not actually adjusted when
1213          * we are checking, only when we finally sync. For a single snapshot,
1214          * this is easy, the count will increase by 1 at each node up the tree,
1215  * but it's more complicated for the recursive/multiple snapshot case.
1216          *
1217          * The dsl_fs_ss_limit_check function does recursively check the count
1218          * at each level up the tree but since it is validating each snapshot
1219          * independently we need to be sure that we are validating the complete
1220          * count for the entire set of snapshots. We do this by rolling up the
1221          * counts for each component of the name into an nvlist and then
1222          * checking each of those cases with the aggregated count.
1223          *
1224          * This approach properly handles not only the recursive snapshot
1225          * case (where we get all of those on the ddsa_snaps list) but also


1326                         if (ddsa->ddsa_errors != NULL) {
1327                                 fnvlist_add_int32(ddsa->ddsa_errors,
1328                                     name, error);
1329                         }
1330                         rv = error;
1331                 }
1332         }
1333 
1334         return (rv);
1335 }
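The count roll-up described in the comment above — aggregating the requested snapshots per filesystem before validating the limit — can be sketched as follows (a simplified model with hypothetical names and fixed-size buffers; it assumes every name contains '@'):

```c
#include <string.h>

#define	ROLLUP_NAMELEN	64

typedef struct {
	char	fc_fs[ROLLUP_NAMELEN];
	int	fc_count;
} fs_count_t;

/*
 * Tally how many of the requested snapshots land on each filesystem,
 * so a limit check can run once per filesystem against the aggregate
 * count instead of once per individual snapshot.
 */
int
rollup_snap_counts(const char *snaps[], int nsnaps, fs_count_t *out)
{
	int nfs = 0;

	for (int i = 0; i < nsnaps; i++) {
		size_t fslen = (size_t)(strchr(snaps[i], '@') - snaps[i]);
		int j;

		for (j = 0; j < nfs; j++) {
			if (strlen(out[j].fc_fs) == fslen &&
			    strncmp(out[j].fc_fs, snaps[i], fslen) == 0)
				break;
		}
		if (j == nfs) {
			(void) memcpy(out[j].fc_fs, snaps[i], fslen);
			out[j].fc_fs[fslen] = '\0';
			out[j].fc_count = 0;
			nfs++;
		}
		out[j].fc_count++;
	}
	return (nfs);
}
```

The real code does the same aggregation with an nvlist keyed by each ancestor component of the name, so the limit at every level of the tree sees the complete count.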
1336 
1337 void
1338 dsl_dataset_snapshot_sync_impl(dsl_dataset_t *ds, const char *snapname,
1339     dmu_tx_t *tx)
1340 {
1341         dsl_pool_t *dp = ds->ds_dir->dd_pool;
1342         dmu_buf_t *dbuf;
1343         dsl_dataset_phys_t *dsphys;
1344         uint64_t dsobj, crtxg;
1345         objset_t *mos = dp->dp_meta_objset;
1346         objset_t *os;

1347 
1348         ASSERT(RRW_WRITE_HELD(&dp->dp_config_rwlock));
1349 
1350         /*
1351          * If we are on an old pool, the zil must not be active, in which
1352          * case it will be zeroed.  Usually zil_suspend() accomplishes this.
1353          */
1354         ASSERT(spa_version(dmu_tx_pool(tx)->dp_spa) >= SPA_VERSION_FAST_SNAP ||
1355             dmu_objset_from_ds(ds, &os) != 0 ||
1356             bcmp(&os->os_phys->os_zil_header, &zero_zil,
1357             sizeof (zero_zil)) == 0);
1358 
1359         /* Should not snapshot a dirty dataset. */
1360         ASSERT(!txg_list_member(&ds->ds_dir->dd_pool->dp_dirty_datasets,
1361             ds, tx->tx_txg));
1362 
1363         dsl_fs_ss_count_adjust(ds->ds_dir, 1, DD_FIELD_SNAPSHOT_COUNT, tx);
1364 
1365         /*
1366          * The origin's ds_creation_txg has to be < TXG_INITIAL


1431          */
1432         if (ds->ds_reserved) {
1433                 int64_t delta;
1434                 ASSERT(DS_UNIQUE_IS_ACCURATE(ds));
1435                 delta = MIN(dsl_dataset_phys(ds)->ds_unique_bytes,
1436                     ds->ds_reserved);
1437                 dsl_dir_diduse_space(ds->ds_dir, DD_USED_REFRSRV,
1438                     delta, 0, 0, tx);
1439         }
1440 
1441         dmu_buf_will_dirty(ds->ds_dbuf, tx);
1442         dsl_dataset_phys(ds)->ds_deadlist_obj =
1443             dsl_deadlist_clone(&ds->ds_deadlist, UINT64_MAX,
1444             dsl_dataset_phys(ds)->ds_prev_snap_obj, tx);
1445         dsl_deadlist_close(&ds->ds_deadlist);
1446         dsl_deadlist_open(&ds->ds_deadlist, mos,
1447             dsl_dataset_phys(ds)->ds_deadlist_obj);
1448         dsl_deadlist_add_key(&ds->ds_deadlist,
1449             dsl_dataset_phys(ds)->ds_prev_snap_txg, tx);
1450 
1451         if (dsl_dataset_remap_deadlist_exists(ds)) {
1452                 uint64_t remap_deadlist_obj =
1453                     dsl_dataset_get_remap_deadlist_object(ds);
1454                 /*
1455                  * Move the remap_deadlist to the snapshot.  The head
1456                  * will create a new remap deadlist on demand, from
1457                  * dsl_dataset_block_remapped().
1458                  */
1459                 dsl_dataset_unset_remap_deadlist_object(ds, tx);
1460                 dsl_deadlist_close(&ds->ds_remap_deadlist);
1461 
1462                 dmu_object_zapify(mos, dsobj, DMU_OT_DSL_DATASET, tx);
1463                 VERIFY0(zap_add(mos, dsobj, DS_FIELD_REMAP_DEADLIST,
1464                     sizeof (remap_deadlist_obj), 1, &remap_deadlist_obj, tx));
1465         }
1466 
1467         ASSERT3U(dsl_dataset_phys(ds)->ds_prev_snap_txg, <, tx->tx_txg);
1468         dsl_dataset_phys(ds)->ds_prev_snap_obj = dsobj;
1469         dsl_dataset_phys(ds)->ds_prev_snap_txg = crtxg;

1470         dsl_dataset_phys(ds)->ds_unique_bytes = 0;
1471 
1472         if (spa_version(dp->dp_spa) >= SPA_VERSION_UNIQUE_ACCURATE)
1473                 dsl_dataset_phys(ds)->ds_flags |= DS_FLAG_UNIQUE_ACCURATE;
1474 
1475         VERIFY0(zap_add(mos, dsl_dataset_phys(ds)->ds_snapnames_zapobj,
1476             snapname, 8, 1, &dsobj, tx));
1477 
1478         if (ds->ds_prev)
1479                 dsl_dataset_rele(ds->ds_prev, ds);
1480         VERIFY0(dsl_dataset_hold_obj(dp,
1481             dsl_dataset_phys(ds)->ds_prev_snap_obj, ds, &ds->ds_prev));
1482 
1483         dsl_scan_ds_snapshotted(ds, tx);
1484 
1485         dsl_dir_snap_cmtime_update(ds->ds_dir);
1486 
1487         spa_history_log_internal_ds(ds->ds_prev, "snapshot", tx, "");
1488 }
1489 
1490 void
1491 dsl_dataset_snapshot_sync(void *arg, dmu_tx_t *tx)
1492 {
1493         dsl_dataset_snapshot_arg_t *ddsa = arg;
1494         dsl_pool_t *dp = dmu_tx_pool(tx);
1495         nvpair_t *pair;
1496 
1497         for (pair = nvlist_next_nvpair(ddsa->ddsa_snaps, NULL);
1498             pair != NULL; pair = nvlist_next_nvpair(ddsa->ddsa_snaps, pair)) {
1499                 dsl_dataset_t *ds;
1500                 char *name, *atp;
1501                 char dsname[ZFS_MAX_DATASET_NAME_LEN];
1502 
1503                 name = nvpair_name(pair);

1504                 atp = strchr(name, '@');
1505                 (void) strlcpy(dsname, name, atp - name + 1);
1506                 VERIFY0(dsl_dataset_hold(dp, dsname, FTAG, &ds));
1507 
1508                 dsl_dataset_snapshot_sync_impl(ds, atp + 1, tx);
1509                 if (ddsa->ddsa_props != NULL) {
1510                         dsl_props_set_sync_impl(ds->ds_prev,
1511                             ZPROP_SRC_LOCAL, ddsa->ddsa_props, tx);
1512                 }
1513                 dsl_dataset_rele(ds, FTAG);
1514         }
1515 }
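The name splitting in dsl_dataset_snapshot_sync() relies on strlcpy's size semantics: strlcpy copies at most size - 1 bytes and always NUL-terminates, so a size of `atp - name + 1` copies exactly the characters before the '@'. A portable model of that split (hypothetical helper name, using memcpy since strlcpy is not universally available):

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the filesystem component of "fs@snap" into dsname and return a
 * pointer to the snapshot component, or NULL if the name has no '@'.
 */
const char *
split_snap_name(const char *name, char *dsname, size_t dslen)
{
	const char *atp = strchr(name, '@');
	size_t fslen;

	if (atp == NULL)
		return (NULL);
	fslen = (size_t)(atp - name);
	if (fslen >= dslen)		/* truncate, as strlcpy would */
		fslen = dslen - 1;
	(void) memcpy(dsname, name, fslen);
	dsname[fslen] = '\0';
	return (atp + 1);
}
```

The sync path can VERIFY the '@' is present because dsl_dataset_snapshot_check() has already rejected malformed names; a general-purpose helper has to handle the NULL case itself.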
1516 
1517 /*
1518  * The snapshots must all be in the same pool.
1519  * All-or-nothing: if there are any failures, nothing will be modified.
1520  */
1521 int
1522 dsl_dataset_snapshot(nvlist_t *snaps, nvlist_t *props, nvlist_t *errors)
1523 {


1551 
1552                         atp = strchr(snapname, '@');
1553                         if (atp == NULL) {
1554                                 error = SET_ERROR(EINVAL);
1555                                 break;
1556                         }
1557                         (void) strlcpy(fsname, snapname, atp - snapname + 1);
1558 
1559                         error = zil_suspend(fsname, &cookie);
1560                         if (error != 0)
1561                                 break;
1562                         fnvlist_add_uint64(suspended, fsname,
1563                             (uintptr_t)cookie);
1564                 }
1565         }
1566 
1567         ddsa.ddsa_snaps = snaps;
1568         ddsa.ddsa_props = props;
1569         ddsa.ddsa_errors = errors;
1570         ddsa.ddsa_cr = CRED();
1571 
1572         if (error == 0) {
1573                 error = dsl_sync_task(firstname, dsl_dataset_snapshot_check,
1574                     dsl_dataset_snapshot_sync, &ddsa,
1575                     fnvlist_num_pairs(snaps) * 3, ZFS_SPACE_CHECK_NORMAL);
1576         }
1577 
1578         if (suspended != NULL) {
1579                 for (pair = nvlist_next_nvpair(suspended, NULL); pair != NULL;
1580                     pair = nvlist_next_nvpair(suspended, pair)) {
1581                         zil_resume((void *)(uintptr_t)
1582                             fnvpair_value_uint64(pair));
1583                 }
1584                 fnvlist_free(suspended);
1585         }
1586 
1587         return (error);
1588 }
1589 
1590 typedef struct dsl_dataset_snapshot_tmp_arg {


2185         return (0);
2186 }
2187 
2188 void
2189 dsl_dataset_stats(dsl_dataset_t *ds, nvlist_t *nv)
2190 {
2191         dsl_pool_t *dp = ds->ds_dir->dd_pool;
2192 
2193         ASSERT(dsl_pool_config_held(dp));
2194 
2195         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_REFRATIO,
2196             dsl_get_refratio(ds));
2197         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_LOGICALREFERENCED,
2198             dsl_get_logicalreferenced(ds));
2199         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_COMPRESSRATIO,
2200             dsl_get_compressratio(ds));
2201         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_USED,
2202             dsl_get_used(ds));
2203 
2204         if (ds->ds_is_snapshot) {
2205                 get_clones_stat(ds, nv);
2206         } else {
2207                 char buf[ZFS_MAX_DATASET_NAME_LEN];
2208                 if (dsl_get_prev_snap(ds, buf) == 0)
2209                         dsl_prop_nvlist_add_string(nv, ZFS_PROP_PREV_SNAP,
2210                             buf);
2211                 dsl_dir_stats(ds->ds_dir, nv);
2212         }
2213 
2214         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_AVAILABLE,
2215             dsl_get_available(ds));
2216         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_REFERENCED,
2217             dsl_get_referenced(ds));
2218         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_CREATION,
2219             dsl_get_creation(ds));
2220         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_CREATETXG,
2221             dsl_get_creationtxg(ds));
2222         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_REFQUOTA,
2223             dsl_get_refquota(ds));
2224         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_REFRESERVATION,


2262                     sizeof (recvname) &&
2263                     strlcat(recvname, recv_clone_name, sizeof (recvname)) <
2264                     sizeof (recvname) &&
2265                     dsl_dataset_hold(dp, recvname, FTAG, &recv_ds) == 0) {
2266                         get_receive_resume_stats(recv_ds, nv);
2267                         dsl_dataset_rele(recv_ds, FTAG);
2268                 }
2269         }
2270 }
2271 
2272 void
2273 dsl_dataset_fast_stat(dsl_dataset_t *ds, dmu_objset_stats_t *stat)
2274 {
2275         dsl_pool_t *dp = ds->ds_dir->dd_pool;
2276         ASSERT(dsl_pool_config_held(dp));
2277 
2278         stat->dds_creation_txg = dsl_get_creationtxg(ds);
2279         stat->dds_inconsistent = dsl_get_inconsistent(ds);
2280         stat->dds_guid = dsl_get_guid(ds);
2281         stat->dds_origin[0] = '\0';
2282         if (ds->ds_is_snapshot) {
2283                 stat->dds_is_snapshot = B_TRUE;
2284                 stat->dds_num_clones = dsl_get_numclones(ds);
2285         } else {
2286                 stat->dds_is_snapshot = B_FALSE;
2287                 stat->dds_num_clones = 0;
2288 
2289                 if (dsl_dir_is_clone(ds->ds_dir)) {
2290                         dsl_dir_get_origin(ds->ds_dir, stat->dds_origin);
2291                 }
2292         }
2293 }
2294 
2295 uint64_t
2296 dsl_dataset_fsid_guid(dsl_dataset_t *ds)
2297 {
2298         return (ds->ds_fsid_guid);
2299 }
2300 
2301 void
2302 dsl_dataset_space(dsl_dataset_t *ds,
2303     uint64_t *refdbytesp, uint64_t *availbytesp,
2304     uint64_t *usedobjsp, uint64_t *availobjsp)
2305 {
2306         *refdbytesp = dsl_dataset_phys(ds)->ds_referenced_bytes;


2383                 error = SET_ERROR(EEXIST);
2384         else if (error == ENOENT)
2385                 error = 0;
2386 
2387         /* dataset name + 1 for the "@" + the new snapshot name must fit */
2388         if (dsl_dir_namelen(hds->ds_dir) + 1 +
2389             strlen(ddrsa->ddrsa_newsnapname) >= ZFS_MAX_DATASET_NAME_LEN)
2390                 error = SET_ERROR(ENAMETOOLONG);
2391 
2392         return (error);
2393 }
2394 
2395 static int
2396 dsl_dataset_rename_snapshot_check(void *arg, dmu_tx_t *tx)
2397 {
2398         dsl_dataset_rename_snapshot_arg_t *ddrsa = arg;
2399         dsl_pool_t *dp = dmu_tx_pool(tx);
2400         dsl_dataset_t *hds;
2401         int error;
2402 
2403         error = dsl_dataset_hold(dp, ddrsa->ddrsa_fsname, FTAG, &hds);
2404         if (error != 0)
2405                 return (error);
2406 
2407         if (ddrsa->ddrsa_recursive) {
2408                 error = dmu_objset_find_dp(dp, hds->ds_dir->dd_object,
2409                     dsl_dataset_rename_snapshot_check_impl, ddrsa,
2410                     DS_FIND_CHILDREN);
2411         } else {
2412                 error = dsl_dataset_rename_snapshot_check_impl(dp, hds, ddrsa);
2413         }
2414         dsl_dataset_rele(hds, FTAG);
2415         return (error);
2416 }
2417 
2418 static int
2419 dsl_dataset_rename_snapshot_sync_impl(dsl_pool_t *dp,
2420     dsl_dataset_t *hds, void *arg)
2421 {
2422         dsl_dataset_rename_snapshot_arg_t *ddrsa = arg;


3325          * The clone can't be too much over the head's refquota.
3326          *
3327          * To ensure that the entire refquota can be used, we allow one
3328          * transaction to exceed the refquota.  Therefore, this check
3329          * needs to also allow for the space referenced to be more than the
3330          * refquota.  The maximum amount of space that one transaction can use
3331          * on disk is DMU_MAX_ACCESS * spa_asize_inflation.  Allowing this
3332          * overage ensures that we are able to receive a filesystem that
3333          * exceeds the refquota on the source system.
3334          *
3335          * So that overage is the refquota_slack we use below.
3336          */
3337         if (origin_head->ds_quota != 0 &&
3338             dsl_dataset_phys(clone)->ds_referenced_bytes >
3339             origin_head->ds_quota + refquota_slack)
3340                 return (SET_ERROR(EDQUOT));
3341 
3342         return (0);
3343 }
3344 
3345 static void
3346 dsl_dataset_swap_remap_deadlists(dsl_dataset_t *clone,
3347     dsl_dataset_t *origin, dmu_tx_t *tx)
3348 {
3349         uint64_t clone_remap_dl_obj, origin_remap_dl_obj;
3350         dsl_pool_t *dp = dmu_tx_pool(tx);
3351 
3352         ASSERT(dsl_pool_sync_context(dp));
3353 
3354         clone_remap_dl_obj = dsl_dataset_get_remap_deadlist_object(clone);
3355         origin_remap_dl_obj = dsl_dataset_get_remap_deadlist_object(origin);
3356 
3357         if (clone_remap_dl_obj != 0) {
3358                 dsl_deadlist_close(&clone->ds_remap_deadlist);
3359                 dsl_dataset_unset_remap_deadlist_object(clone, tx);
3360         }
3361         if (origin_remap_dl_obj != 0) {
3362                 dsl_deadlist_close(&origin->ds_remap_deadlist);
3363                 dsl_dataset_unset_remap_deadlist_object(origin, tx);
3364         }
3365 
3366         if (clone_remap_dl_obj != 0) {
3367                 dsl_dataset_set_remap_deadlist_object(origin,
3368                     clone_remap_dl_obj, tx);
3369                 dsl_deadlist_open(&origin->ds_remap_deadlist,
3370                     dp->dp_meta_objset, clone_remap_dl_obj);
3371         }
3372         if (origin_remap_dl_obj != 0) {
3373                 dsl_dataset_set_remap_deadlist_object(clone,
3374                     origin_remap_dl_obj, tx);
3375                 dsl_deadlist_open(&clone->ds_remap_deadlist,
3376                     dp->dp_meta_objset, origin_remap_dl_obj);
3377         }
3378 }
3379 
3380 void
3381 dsl_dataset_clone_swap_sync_impl(dsl_dataset_t *clone,
3382     dsl_dataset_t *origin_head, dmu_tx_t *tx)
3383 {
3384         dsl_pool_t *dp = dmu_tx_pool(tx);
3385         int64_t unused_refres_delta;
3386 
3387         ASSERT(clone->ds_reserved == 0);
3388         /*
3389          * NOTE: On DEBUG kernels there could be a race between this and
3390          * the check function if spa_asize_inflation is adjusted...
3391          */
3392         ASSERT(origin_head->ds_quota == 0 ||
3393             dsl_dataset_phys(clone)->ds_unique_bytes <= origin_head->ds_quota +
3394             DMU_MAX_ACCESS * spa_asize_inflation);
3395         ASSERT3P(clone->ds_prev, ==, origin_head->ds_prev);
3396 
3397         /*
3398          * Swap per-dataset feature flags.
3399          */


3529         SWITCH64(dsl_dataset_phys(origin_head)->ds_uncompressed_bytes,
3530             dsl_dataset_phys(clone)->ds_uncompressed_bytes);
3531         SWITCH64(dsl_dataset_phys(origin_head)->ds_unique_bytes,
3532             dsl_dataset_phys(clone)->ds_unique_bytes);
3533 
3534         /* apply any parent delta for change in unconsumed refreservation */
3535         dsl_dir_diduse_space(origin_head->ds_dir, DD_USED_REFRSRV,
3536             unused_refres_delta, 0, 0, tx);
3537 
3538         /*
3539          * Swap deadlists.
3540          */
3541         dsl_deadlist_close(&clone->ds_deadlist);
3542         dsl_deadlist_close(&origin_head->ds_deadlist);
3543         SWITCH64(dsl_dataset_phys(origin_head)->ds_deadlist_obj,
3544             dsl_dataset_phys(clone)->ds_deadlist_obj);
3545         dsl_deadlist_open(&clone->ds_deadlist, dp->dp_meta_objset,
3546             dsl_dataset_phys(clone)->ds_deadlist_obj);
3547         dsl_deadlist_open(&origin_head->ds_deadlist, dp->dp_meta_objset,
3548             dsl_dataset_phys(origin_head)->ds_deadlist_obj);
3549         dsl_dataset_swap_remap_deadlists(clone, origin_head, tx);
3550 
3551         dsl_scan_ds_clone_swapped(origin_head, clone, tx);
3552 
3553         spa_history_log_internal_ds(clone, "clone swap", tx,
3554             "parent=%s", origin_head->ds_dir->dd_myname);
3555 }
3556 
3557 /*
3558  * Given a pool name and a dataset object number in that pool,
3559  * return the name of that dataset.
3560  */
3561 int
3562 dsl_dsobj_to_dsname(char *pname, uint64_t obj, char *buf)
3563 {
3564         dsl_pool_t *dp;
3565         dsl_dataset_t *ds;
3566         int error;
3567 
3568         error = dsl_pool_hold(pname, FTAG, &dp);
3569         if (error != 0)


4022         if (dsl_dir_phys(later->ds_dir)->dd_origin_obj == earlier->ds_object)
4023                 return (B_TRUE);
4024         dsl_dataset_t *origin;
4025         error = dsl_dataset_hold_obj(dp,
4026             dsl_dir_phys(later->ds_dir)->dd_origin_obj, FTAG, &origin);
4027         if (error != 0)
4028                 return (B_FALSE);
4029         ret = dsl_dataset_is_before(origin, earlier, earlier_txg);
4030         dsl_dataset_rele(origin, FTAG);
4031         return (ret);
4032 }
4033 
4034 void
4035 dsl_dataset_zapify(dsl_dataset_t *ds, dmu_tx_t *tx)
4036 {
4037         objset_t *mos = ds->ds_dir->dd_pool->dp_meta_objset;
4038         dmu_object_zapify(mos, ds->ds_object, DMU_OT_DSL_DATASET, tx);
4039 }
4040 
4041 boolean_t
4042 dsl_dataset_is_zapified(dsl_dataset_t *ds)
4043 {
4044         dmu_object_info_t doi;
4045 
4046         dmu_object_info_from_db(ds->ds_dbuf, &doi);
4047         return (doi.doi_type == DMU_OTN_ZAP_METADATA);
4048 }
4049 
4050 boolean_t
4051 dsl_dataset_has_resume_receive_state(dsl_dataset_t *ds)
4052 {
4053         return (dsl_dataset_is_zapified(ds) &&
4054             zap_contains(ds->ds_dir->dd_pool->dp_meta_objset,
4055             ds->ds_object, DS_FIELD_RESUME_TOGUID) == 0);
4056 }
4057 
4058 uint64_t
4059 dsl_dataset_get_remap_deadlist_object(dsl_dataset_t *ds)
4060 {
4061         uint64_t remap_deadlist_obj;
4062         int err;
4063 
4064         if (!dsl_dataset_is_zapified(ds))
4065                 return (0);
4066 
4067         err = zap_lookup(ds->ds_dir->dd_pool->dp_meta_objset, ds->ds_object,
4068             DS_FIELD_REMAP_DEADLIST, sizeof (remap_deadlist_obj), 1,
4069             &remap_deadlist_obj);
4070 
4071         if (err != 0) {
4072                 VERIFY3S(err, ==, ENOENT);
4073                 return (0);
4074         }
4075 
4076         ASSERT(remap_deadlist_obj != 0);
4077         return (remap_deadlist_obj);
4078 }
4079 
4080 boolean_t
4081 dsl_dataset_remap_deadlist_exists(dsl_dataset_t *ds)
4082 {
4083         EQUIV(dsl_deadlist_is_open(&ds->ds_remap_deadlist),
4084             dsl_dataset_get_remap_deadlist_object(ds) != 0);
4085         return (dsl_deadlist_is_open(&ds->ds_remap_deadlist));
4086 }
4087 
4088 static void
4089 dsl_dataset_set_remap_deadlist_object(dsl_dataset_t *ds, uint64_t obj,
4090     dmu_tx_t *tx)
4091 {
4092         ASSERT(obj != 0);
4093         dsl_dataset_zapify(ds, tx);
4094         VERIFY0(zap_add(ds->ds_dir->dd_pool->dp_meta_objset, ds->ds_object,
4095             DS_FIELD_REMAP_DEADLIST, sizeof (obj), 1, &obj, tx));
4096 }
4097 
4098 static void
4099 dsl_dataset_unset_remap_deadlist_object(dsl_dataset_t *ds, dmu_tx_t *tx)
4100 {
4101         VERIFY0(zap_remove(ds->ds_dir->dd_pool->dp_meta_objset,
4102             ds->ds_object, DS_FIELD_REMAP_DEADLIST, tx));
4103 }
4104 
4105 void
4106 dsl_dataset_destroy_remap_deadlist(dsl_dataset_t *ds, dmu_tx_t *tx)
4107 {
4108         uint64_t remap_deadlist_object;
4109         spa_t *spa = ds->ds_dir->dd_pool->dp_spa;
4110 
4111         ASSERT(dmu_tx_is_syncing(tx));
4112         ASSERT(dsl_dataset_remap_deadlist_exists(ds));
4113 
4114         remap_deadlist_object = ds->ds_remap_deadlist.dl_object;
4115         dsl_deadlist_close(&ds->ds_remap_deadlist);
4116         dsl_deadlist_free(spa_meta_objset(spa), remap_deadlist_object, tx);
4117         dsl_dataset_unset_remap_deadlist_object(ds, tx);
4118         spa_feature_decr(spa, SPA_FEATURE_OBSOLETE_COUNTS, tx);
4119 }
4120 
4121 void
4122 dsl_dataset_create_remap_deadlist(dsl_dataset_t *ds, dmu_tx_t *tx)
4123 {
4124         uint64_t remap_deadlist_obj;
4125         spa_t *spa = ds->ds_dir->dd_pool->dp_spa;
4126 
4127         ASSERT(dmu_tx_is_syncing(tx));
4128         ASSERT(MUTEX_HELD(&ds->ds_remap_deadlist_lock));
4129         /*
4130          * Currently we only create remap deadlists when there are indirect
4131          * vdevs with referenced mappings.
4132          */
4133         ASSERT(spa_feature_is_active(spa, SPA_FEATURE_DEVICE_REMOVAL));
4134 
4135         remap_deadlist_obj = dsl_deadlist_clone(
4136             &ds->ds_deadlist, UINT64_MAX,
4137             dsl_dataset_phys(ds)->ds_prev_snap_obj, tx);
4138         dsl_dataset_set_remap_deadlist_object(ds,
4139             remap_deadlist_obj, tx);
4140         dsl_deadlist_open(&ds->ds_remap_deadlist, spa_meta_objset(spa),
4141             remap_deadlist_obj);
4142         spa_feature_incr(spa, SPA_FEATURE_OBSOLETE_COUNTS, tx);
4143 }


  29  * Copyright 2016, OmniTI Computer Consulting, Inc. All rights reserved.
  30  * Copyright 2017 Nexenta Systems, Inc.
  31  */
  32 
  33 #include <sys/dmu_objset.h>
  34 #include <sys/dsl_dataset.h>
  35 #include <sys/dsl_dir.h>
  36 #include <sys/dsl_prop.h>
  37 #include <sys/dsl_synctask.h>
  38 #include <sys/dmu_traverse.h>
  39 #include <sys/dmu_impl.h>
  40 #include <sys/dmu_tx.h>
  41 #include <sys/arc.h>
  42 #include <sys/zio.h>
  43 #include <sys/zap.h>
  44 #include <sys/zfeature.h>
  45 #include <sys/unique.h>
  46 #include <sys/zfs_context.h>
  47 #include <sys/zfs_ioctl.h>
  48 #include <sys/spa.h>
  49 #include <sys/spa_impl.h>
  50 #include <sys/zfs_znode.h>
  51 #include <sys/zfs_onexit.h>
  52 #include <sys/zvol.h>
  53 #include <sys/dsl_scan.h>
  54 #include <sys/dsl_deadlist.h>
  55 #include <sys/dsl_destroy.h>
  56 #include <sys/dsl_userhold.h>
  57 #include <sys/dsl_bookmark.h>
  58 #include <sys/autosnap.h>
  59 #include <sys/dmu_send.h>
  60 #include <sys/zio_checksum.h>
  61 #include <sys/zio_compress.h>
  62 #include <zfs_fletcher.h>
  63 
  64 /*
  65  * The SPA supports block sizes up to 16MB.  However, very large blocks
  66  * can have an impact on i/o latency (e.g. tying up a spinning disk for
  67  * ~300ms), and also potentially on the memory allocator.  Therefore,
  68  * we do not allow the recordsize to be set larger than zfs_max_recordsize
  69  * (default 1MB).  Larger blocks can be created by changing this tunable,
  70  * and pools with larger blocks can always be imported and used, regardless
  71  * of this setting.
  72  */
  73 int zfs_max_recordsize = 1 * 1024 * 1024;
  74 
  75 #define SWITCH64(x, y) \
  76         { \
  77                 uint64_t __tmp = (x); \
  78                 (x) = (y); \
  79                 (y) = __tmp; \
  80         }
  81 
  82 #define DS_REF_MAX      (1ULL << 62)
  83 
  84 extern inline dsl_dataset_phys_t *dsl_dataset_phys(dsl_dataset_t *ds);
  85 
  86 extern int spa_asize_inflation;
  87 
  88 static zil_header_t zero_zil;
  89 
  90 kmem_cache_t *zfs_ds_collector_cache = NULL;
  91 
  92 zfs_ds_collector_entry_t *
  93 dsl_dataset_collector_cache_alloc()
  94 {
  95         return (kmem_cache_alloc(zfs_ds_collector_cache, KM_SLEEP));
  96 }
  97 
  98 void
  99 dsl_dataset_collector_cache_free(zfs_ds_collector_entry_t *entry)
 100 {
 101         kmem_cache_free(zfs_ds_collector_cache, entry);
 102 }
 103 
 104 /*
 105  * Figure out how much of this delta should be propagated to the dsl_dir
 106  * layer.  If there's a refreservation, that space has already been
 107  * partially accounted for in our ancestors.
 108  */
 109 static int64_t
 110 parent_delta(dsl_dataset_t *ds, int64_t delta)
 111 {
 112         dsl_dataset_phys_t *ds_phys;
 113         uint64_t old_bytes, new_bytes;
 114 
 115         if (ds->ds_reserved == 0)
 116                 return (delta);
 117 
 118         ds_phys = dsl_dataset_phys(ds);
 119         old_bytes = MAX(ds_phys->ds_unique_bytes, ds->ds_reserved);
 120         new_bytes = MAX(ds_phys->ds_unique_bytes + delta, ds->ds_reserved);
 121 
 122         ASSERT3U(ABS((int64_t)(new_bytes - old_bytes)), <=, ABS(delta));
 123         return (new_bytes - old_bytes);


 153         dsl_dataset_phys(ds)->ds_compressed_bytes += compressed;
 154         dsl_dataset_phys(ds)->ds_uncompressed_bytes += uncompressed;
 155         dsl_dataset_phys(ds)->ds_unique_bytes += used;
 156 
 157         if (BP_GET_LSIZE(bp) > SPA_OLD_MAXBLOCKSIZE) {
 158                 ds->ds_feature_activation_needed[SPA_FEATURE_LARGE_BLOCKS] =
 159                     B_TRUE;
 160         }
 161 
 162         spa_feature_t f = zio_checksum_to_feature(BP_GET_CHECKSUM(bp));
 163         if (f != SPA_FEATURE_NONE)
 164                 ds->ds_feature_activation_needed[f] = B_TRUE;
 165 
 166         mutex_exit(&ds->ds_lock);
 167         dsl_dir_diduse_space(ds->ds_dir, DD_USED_HEAD, delta,
 168             compressed, uncompressed, tx);
 169         dsl_dir_transfer_space(ds->ds_dir, used - delta,
 170             DD_USED_REFRSRV, DD_USED_HEAD, tx);
 171 }
 172 
 173 int
 174 dsl_dataset_block_kill(dsl_dataset_t *ds, const blkptr_t *bp, dmu_tx_t *tx,
 175     boolean_t async)
 176 {
 177         int used = bp_get_dsize_sync(tx->tx_pool->dp_spa, bp);
 178         int compressed = BP_GET_PSIZE(bp);
 179         int uncompressed = BP_GET_UCSIZE(bp);
 180         wbc_data_t *wbc_data = spa_get_wbc_data(tx->tx_pool->dp_spa);
 181 
 182         if (BP_IS_HOLE(bp))
 183                 return (0);
 184 
 185         ASSERT(dmu_tx_is_syncing(tx));
 186         ASSERT(bp->blk_birth <= tx->tx_txg);
 187 
 188         if (ds == NULL) {
 189                 dsl_free(tx->tx_pool, tx->tx_txg, bp);
 190                 dsl_pool_mos_diduse_space(tx->tx_pool,
 191                     -used, -compressed, -uncompressed);
 192                 return (used);
 193         }
 194         ASSERT3P(tx->tx_pool, ==, ds->ds_dir->dd_pool);
 195 
 196         ASSERT(!ds->ds_is_snapshot);
 197         dmu_buf_will_dirty(ds->ds_dbuf, tx);
 198 
 199         if (bp->blk_birth > dsl_dataset_phys(ds)->ds_prev_snap_txg) {
 200                 int64_t delta;
 201 
 202                 dprintf_bp(bp, "freeing ds=%llu", ds->ds_object);
 203                 dsl_free(tx->tx_pool, tx->tx_txg, bp);
 204 
 205                 /* update amount of data which is changed in the window */
 206                 mutex_enter(&wbc_data->wbc_lock);
 207                 if (wbc_data->wbc_isvalid &&
 208                     bp->blk_birth && wbc_data->wbc_finish_txg &&
 209                     bp->blk_birth <= wbc_data->wbc_finish_txg &&
 210                     bp->blk_birth >= wbc_data->wbc_start_txg &&
 211                     !wbc_data->wbc_purge) {
 212 
 213                         wbc_data->wbc_altered_bytes += used;
 214                         if (wbc_data->wbc_altered_limit &&
 215                             wbc_data->wbc_altered_bytes >
 216                             wbc_data->wbc_altered_limit) {
 217                                 wbc_purge_window(tx->tx_pool->dp_spa, tx);
 218                         }
 219                 }
 220                 mutex_exit(&wbc_data->wbc_lock);
 221 
 222                 mutex_enter(&ds->ds_lock);
 223                 ASSERT(dsl_dataset_phys(ds)->ds_unique_bytes >= used ||
 224                     !DS_UNIQUE_IS_ACCURATE(ds));
 225                 delta = parent_delta(ds, -used);
 226                 dsl_dataset_phys(ds)->ds_unique_bytes -= used;
 227                 mutex_exit(&ds->ds_lock);
 228                 dsl_dir_diduse_space(ds->ds_dir, DD_USED_HEAD,
 229                     delta, -compressed, -uncompressed, tx);
 230                 dsl_dir_transfer_space(ds->ds_dir, -used - delta,
 231                     DD_USED_REFRSRV, DD_USED_HEAD, tx);
 232         } else {
 233                 dprintf_bp(bp, "putting on dead list: %s", "");
 234                 if (async) {
 235                         /*
 236                          * We are here as part of zio's write done callback,
 237                          * which means we're a zio interrupt thread.  We can't
 238                          * call dsl_deadlist_insert() now because it may block
 239                          * waiting for I/O.  Instead, put bp on the deferred
 240                          * queue and let dsl_pool_sync() finish the job.
 241                          */


 289 }
 290 
 291 static void
 292 dsl_dataset_evict_async(void *dbu)
 293 {
 294         dsl_dataset_t *ds = dbu;
 295 
 296         ASSERT(ds->ds_owner == NULL);
 297 
 298         ds->ds_dbuf = NULL;
 299 
 300         if (ds->ds_objset != NULL)
 301                 dmu_objset_evict(ds->ds_objset);
 302 
 303         if (ds->ds_prev) {
 304                 dsl_dataset_rele(ds->ds_prev, ds);
 305                 ds->ds_prev = NULL;
 306         }
 307 
 308         bplist_destroy(&ds->ds_pending_deadlist);
 309         if (ds->ds_deadlist.dl_os != NULL)
 310                 dsl_deadlist_close(&ds->ds_deadlist);
 311         if (ds->ds_dir)
 312                 dsl_dir_async_rele(ds->ds_dir, ds);
 313 
 314         ASSERT(!list_link_active(&ds->ds_synced_link));
 315 
 316         list_destroy(&ds->ds_prop_cbs);
 317         mutex_destroy(&ds->ds_lock);
 318         mutex_destroy(&ds->ds_opening_lock);
 319         mutex_destroy(&ds->ds_sendstream_lock);
 320         refcount_destroy(&ds->ds_longholds);
 321         rrw_destroy(&ds->ds_bp_rwlock);
 322 
 323         kmem_free(ds, sizeof (dsl_dataset_t));
 324 }
 325 
 326 int
 327 dsl_dataset_get_snapname(dsl_dataset_t *ds)
 328 {
 329         dsl_dataset_phys_t *headphys;
 330         int err;
 331         dmu_buf_t *headdbuf;
 332         dsl_pool_t *dp = ds->ds_dir->dd_pool;
 333         objset_t *mos = dp->dp_meta_objset;
 334 
 335         if (ds->ds_snapname[0])
 336                 return (0);
 337         if (dsl_dataset_phys(ds)->ds_next_snap_obj == 0)
 338                 return (0);
 339 


 424         err = dmu_bonus_hold(mos, dsobj, tag, &dbuf);
 425         if (err != 0)
 426                 return (err);
 427 
 428         /* Make sure dsobj has the correct object type. */
 429         dmu_object_info_from_db(dbuf, &doi);
 430         if (doi.doi_bonus_type != DMU_OT_DSL_DATASET) {
 431                 dmu_buf_rele(dbuf, tag);
 432                 return (SET_ERROR(EINVAL));
 433         }
 434 
 435         ds = dmu_buf_get_user(dbuf);
 436         if (ds == NULL) {
 437                 dsl_dataset_t *winner = NULL;
 438 
 439                 ds = kmem_zalloc(sizeof (dsl_dataset_t), KM_SLEEP);
 440                 ds->ds_dbuf = dbuf;
 441                 ds->ds_object = dsobj;
 442                 ds->ds_is_snapshot = dsl_dataset_phys(ds)->ds_num_children != 0;
 443 
 444                 mutex_init(&ds->ds_lock, NULL, MUTEX_DEFAULT, NULL);
 445                 mutex_init(&ds->ds_opening_lock, NULL, MUTEX_DEFAULT, NULL);
 446                 mutex_init(&ds->ds_sendstream_lock, NULL, MUTEX_DEFAULT, NULL);
 447                 rrw_init(&ds->ds_bp_rwlock, B_FALSE);
 448                 refcount_create(&ds->ds_longholds);
 449 
 450                 bplist_create(&ds->ds_pending_deadlist);
 451                 dsl_deadlist_open(&ds->ds_deadlist,
 452                     mos, dsl_dataset_phys(ds)->ds_deadlist_obj);
 453 
 454                 list_create(&ds->ds_sendstreams, sizeof (dmu_sendarg_t),
 455                     offsetof(dmu_sendarg_t, dsa_link));
 456 
 457                 list_create(&ds->ds_prop_cbs, sizeof (dsl_prop_cb_record_t),
 458                     offsetof(dsl_prop_cb_record_t, cbr_ds_node));
 459 
 460                 if (doi.doi_type == DMU_OTN_ZAP_METADATA) {
 461                         for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {
 462                                 if (!(spa_feature_table[f].fi_flags &
 463                                     ZFEATURE_FLAG_PER_DATASET))
 464                                         continue;
 465                                 err = zap_contains(mos, dsobj,
 466                                     spa_feature_table[f].fi_guid);
 467                                 if (err == 0) {
 468                                         ds->ds_feature_inuse[f] = B_TRUE;
 469                                 } else {
 470                                         ASSERT3U(err, ==, ENOENT);
 471                                         err = 0;
 472                                 }
 473                         }
 474                 }
 475 
 476                 err = dsl_dir_hold_obj(dp,
 477                     dsl_dataset_phys(ds)->ds_dir_obj, NULL, ds, &ds->ds_dir);
 478                 if (err != 0) {
 479                         mutex_destroy(&ds->ds_lock);
 480                         mutex_destroy(&ds->ds_opening_lock);
 481                         mutex_destroy(&ds->ds_sendstream_lock);
 482                         refcount_destroy(&ds->ds_longholds);
 483                         bplist_destroy(&ds->ds_pending_deadlist);
 484                         dsl_deadlist_close(&ds->ds_deadlist);
 485                         kmem_free(ds, sizeof (dsl_dataset_t));
 486                         dmu_buf_rele(dbuf, tag);
 487                         return (err);
 488                 }
 489 
 490                 if (!ds->ds_is_snapshot) {
 491                         ds->ds_snapname[0] = '\0';
 492                         if (dsl_dataset_phys(ds)->ds_prev_snap_obj != 0) {
 493                                 err = dsl_dataset_hold_obj(dp,
 494                                     dsl_dataset_phys(ds)->ds_prev_snap_obj,
 495                                     ds, &ds->ds_prev);
 496                         }
 497                         if (doi.doi_type == DMU_OTN_ZAP_METADATA) {
 498                                 int zaperr = zap_lookup(mos, ds->ds_object,
 499                                     DS_FIELD_BOOKMARK_NAMES,
 500                                     sizeof (ds->ds_bookmarks), 1,
 501                                     &ds->ds_bookmarks);
 502                                 if (zaperr != ENOENT)
 503                                         VERIFY0(zaperr);
 504                         }
 505                 } else {
 506                         if (zfs_flags & ZFS_DEBUG_SNAPNAMES)
 507                                 err = dsl_dataset_get_snapname(ds);
 508                         if (err == 0 &&
 509                             dsl_dataset_phys(ds)->ds_userrefs_obj != 0) {
 510                                 err = zap_count(
 511                                     ds->ds_dir->dd_pool->dp_meta_objset,
 512                                     dsl_dataset_phys(ds)->ds_userrefs_obj,
 513                                     &ds->ds_userrefs);
 514                         }
 515                 }
 516 
 517                 if (err == 0 && !ds->ds_is_snapshot) {
 518                         err = dsl_prop_get_int_ds(ds,
 519                             zfs_prop_to_name(ZFS_PROP_REFRESERVATION),
 520                             &ds->ds_reserved);
 521                         if (err == 0) {
 522                                 err = dsl_prop_get_int_ds(ds,
 523                                     zfs_prop_to_name(ZFS_PROP_REFQUOTA),
 524                                     &ds->ds_quota);
 525                         }
 526                 } else {
 527                         ds->ds_reserved = ds->ds_quota = 0;
 528                 }
 529 
 530                 dmu_buf_init_user(&ds->ds_dbu, dsl_dataset_evict_sync,
 531                     dsl_dataset_evict_async, &ds->ds_dbuf);
 532                 if (err == 0)
 533                         winner = dmu_buf_set_user_ie(dbuf, &ds->ds_dbu);
 534 
 535                 if (err != 0 || winner != NULL) {
 536                         bplist_destroy(&ds->ds_pending_deadlist);
 537                         dsl_deadlist_close(&ds->ds_deadlist);
 538                         if (ds->ds_prev)
 539                                 dsl_dataset_rele(ds->ds_prev, ds);
 540                         dsl_dir_rele(ds->ds_dir, ds);
 541                         mutex_destroy(&ds->ds_lock);
 542                         mutex_destroy(&ds->ds_opening_lock);
 543                         mutex_destroy(&ds->ds_sendstream_lock);
 544                         refcount_destroy(&ds->ds_longholds);
 545                         kmem_free(ds, sizeof (dsl_dataset_t));
 546                         if (err != 0) {
 547                                 dmu_buf_rele(dbuf, tag);
 548                                 return (err);
 549                         }
 550                         ds = winner;
 551                 } else {
 552                         ds->ds_fsid_guid =
 553                             unique_insert(dsl_dataset_phys(ds)->ds_fsid_guid);
 554                         if (ds->ds_fsid_guid !=
 555                             dsl_dataset_phys(ds)->ds_fsid_guid) {
 556                                 zfs_dbgmsg("ds_fsid_guid changed from "
 557                                     "%llx to %llx for pool %s dataset id %llu",


1165                     ZFS_PROP_SNAPSHOT_LIMIT, NULL, cr);
1166                 if (error != 0)
1167                         return (error);
1168         }
1169 
1170         error = dsl_dataset_snapshot_reserve_space(ds, tx);
1171         if (error != 0)
1172                 return (error);
1173 
1174         return (0);
1175 }
1176 
1177 int
1178 dsl_dataset_snapshot_check(void *arg, dmu_tx_t *tx)
1179 {
1180         dsl_dataset_snapshot_arg_t *ddsa = arg;
1181         dsl_pool_t *dp = dmu_tx_pool(tx);
1182         nvpair_t *pair;
1183         int rv = 0;
1184 
1185         if (ddsa->ddsa_autosnap && dmu_tx_is_syncing(tx))
1186                 autosnap_invalidate_list(dp, ddsa->ddsa_snaps);
1187 
1188         /*
1189          * Pre-compute how many total new snapshots will be created for each
1190          * level in the tree and below. This is needed for validating the
1191          * snapshot limit when either taking a recursive snapshot or when
1192          * taking multiple snapshots.
1193          *
1194          * The problem is that the counts are not actually adjusted when
1195          * we are checking, only when we finally sync. For a single snapshot,
1196          * this is easy: the count will increase by 1 at each node up the tree,
1197          * but it's more complicated for the recursive/multiple snapshot case.
1198          *
1199          * The dsl_fs_ss_limit_check function does recursively check the count
1200          * at each level up the tree, but since it is validating each snapshot
1201          * independently we need to be sure that we are validating the complete
1202          * count for the entire set of snapshots. We do this by rolling up the
1203          * counts for each component of the name into an nvlist and then
1204          * checking each of those cases with the aggregated count.
1205          *
1206          * This approach properly handles not only the recursive snapshot
1207          * case (where we get all of those on the ddsa_snaps list) but also


1308                         if (ddsa->ddsa_errors != NULL) {
1309                                 fnvlist_add_int32(ddsa->ddsa_errors,
1310                                     name, error);
1311                         }
1312                         rv = error;
1313                 }
1314         }
1315 
1316         return (rv);
1317 }
1318 
1319 void
1320 dsl_dataset_snapshot_sync_impl(dsl_dataset_t *ds, const char *snapname,
1321     dmu_tx_t *tx)
1322 {
1323         dsl_pool_t *dp = ds->ds_dir->dd_pool;
1324         dmu_buf_t *dbuf;
1325         dsl_dataset_phys_t *dsphys;
1326         uint64_t dsobj, crtxg;
1327         objset_t *mos = dp->dp_meta_objset;
1328         objset_t *os = NULL;
1329         uint64_t unique_bytes = 0;
1330 
1331         ASSERT(RRW_WRITE_HELD(&dp->dp_config_rwlock));
1332 
1333         /*
1334          * If we are on an old pool, the zil must not be active, in which
1335          * case it will be zeroed.  Usually zil_suspend() accomplishes this.
1336          */
1337         ASSERT(spa_version(dmu_tx_pool(tx)->dp_spa) >= SPA_VERSION_FAST_SNAP ||
1338             dmu_objset_from_ds(ds, &os) != 0 ||
1339             bcmp(&os->os_phys->os_zil_header, &zero_zil,
1340             sizeof (zero_zil)) == 0);
1341 
1342         /* Should not snapshot a dirty dataset. */
1343         ASSERT(!txg_list_member(&ds->ds_dir->dd_pool->dp_dirty_datasets,
1344             ds, tx->tx_txg));
1345 
1346         dsl_fs_ss_count_adjust(ds->ds_dir, 1, DD_FIELD_SNAPSHOT_COUNT, tx);
1347 
1348         /*
1349          * The origin's ds_creation_txg has to be < TXG_INITIAL


1414          */
1415         if (ds->ds_reserved) {
1416                 int64_t delta;
1417                 ASSERT(DS_UNIQUE_IS_ACCURATE(ds));
1418                 delta = MIN(dsl_dataset_phys(ds)->ds_unique_bytes,
1419                     ds->ds_reserved);
1420                 dsl_dir_diduse_space(ds->ds_dir, DD_USED_REFRSRV,
1421                     delta, 0, 0, tx);
1422         }
1423 
1424         dmu_buf_will_dirty(ds->ds_dbuf, tx);
1425         dsl_dataset_phys(ds)->ds_deadlist_obj =
1426             dsl_deadlist_clone(&ds->ds_deadlist, UINT64_MAX,
1427             dsl_dataset_phys(ds)->ds_prev_snap_obj, tx);
1428         dsl_deadlist_close(&ds->ds_deadlist);
1429         dsl_deadlist_open(&ds->ds_deadlist, mos,
1430             dsl_dataset_phys(ds)->ds_deadlist_obj);
1431         dsl_deadlist_add_key(&ds->ds_deadlist,
1432             dsl_dataset_phys(ds)->ds_prev_snap_txg, tx);
1433 
1434         ASSERT3U(dsl_dataset_phys(ds)->ds_prev_snap_txg, <, tx->tx_txg);
1435         dsl_dataset_phys(ds)->ds_prev_snap_obj = dsobj;
1436         dsl_dataset_phys(ds)->ds_prev_snap_txg = crtxg;
1437         unique_bytes = dsl_dataset_phys(ds)->ds_unique_bytes;
1438         dsl_dataset_phys(ds)->ds_unique_bytes = 0;
1439         if (spa_version(dp->dp_spa) >= SPA_VERSION_UNIQUE_ACCURATE)
1440                 dsl_dataset_phys(ds)->ds_flags |= DS_FLAG_UNIQUE_ACCURATE;
1441 
1442         VERIFY0(zap_add(mos, dsl_dataset_phys(ds)->ds_snapnames_zapobj,
1443             snapname, 8, 1, &dsobj, tx));
1444 
1445         if (ds->ds_prev)
1446                 dsl_dataset_rele(ds->ds_prev, ds);
1447         VERIFY0(dsl_dataset_hold_obj(dp,
1448             dsl_dataset_phys(ds)->ds_prev_snap_obj, ds, &ds->ds_prev));
1449 
1450         dsl_scan_ds_snapshotted(ds, tx);
1451 
1452         dsl_dir_snap_cmtime_update(ds->ds_dir);
1453 
1454         spa_history_log_internal_ds(ds->ds_prev, "snapshot", tx, "");
1455 
1456         if (autosnap_check_name(snapname)) {
1457                 autosnap_create_cb(spa_get_autosnap(dp->dp_spa),
1458                     ds, snapname, tx->tx_txg);
1459         }
1460 
1461         if (os == NULL)
1462                 VERIFY0(dmu_objset_from_ds(ds, &os));
1463 
1464         if (os->os_wbc_mode != ZFS_WBC_MODE_OFF)
1465                 wbc_add_bytes(dp->dp_spa, tx->tx_txg, unique_bytes);
1466 }
1467 
1468 void
1469 dsl_dataset_snapshot_sync(void *arg, dmu_tx_t *tx)
1470 {
1471         dsl_dataset_snapshot_arg_t *ddsa = arg;
1472         dsl_pool_t *dp = dmu_tx_pool(tx);
1473         nvpair_t *pair;
1474 
1475         for (pair = nvlist_next_nvpair(ddsa->ddsa_snaps, NULL);
1476             pair != NULL; pair = nvlist_next_nvpair(ddsa->ddsa_snaps, pair)) {
1477                 dsl_dataset_t *ds;
1478                 char *name, *atp;
1479                 char dsname[ZFS_MAX_DATASET_NAME_LEN];
1480 
1481                 name = nvpair_name(pair);
1482 
1483                 atp = strchr(name, '@');
1484                 (void) strlcpy(dsname, name, atp - name + 1);
1485                 VERIFY0(dsl_dataset_hold(dp, dsname, FTAG, &ds));
1486 
1487                 dsl_dataset_snapshot_sync_impl(ds, atp + 1, tx);
1488                 if (ddsa->ddsa_props != NULL) {
1489                         dsl_props_set_sync_impl(ds->ds_prev,
1490                             ZPROP_SRC_LOCAL, ddsa->ddsa_props, tx);
1491                 }
1492                 dsl_dataset_rele(ds, FTAG);
1493         }
1494 }
1495 
1496 /*
1497  * The snapshots must all be in the same pool.
1498  * All-or-nothing: if there are any failures, nothing will be modified.
1499  */
1500 int
1501 dsl_dataset_snapshot(nvlist_t *snaps, nvlist_t *props, nvlist_t *errors)
1502 {


1530 
1531                         atp = strchr(snapname, '@');
1532                         if (atp == NULL) {
1533                                 error = SET_ERROR(EINVAL);
1534                                 break;
1535                         }
1536                         (void) strlcpy(fsname, snapname, atp - snapname + 1);
1537 
1538                         error = zil_suspend(fsname, &cookie);
1539                         if (error != 0)
1540                                 break;
1541                         fnvlist_add_uint64(suspended, fsname,
1542                             (uintptr_t)cookie);
1543                 }
1544         }
1545 
1546         ddsa.ddsa_snaps = snaps;
1547         ddsa.ddsa_props = props;
1548         ddsa.ddsa_errors = errors;
1549         ddsa.ddsa_cr = CRED();
1550         ddsa.ddsa_autosnap = B_FALSE;
1551 
1552         if (error == 0) {
1553                 error = dsl_sync_task(firstname, dsl_dataset_snapshot_check,
1554                     dsl_dataset_snapshot_sync, &ddsa,
1555                     fnvlist_num_pairs(snaps) * 3, ZFS_SPACE_CHECK_NORMAL);
1556         }
1557 
1558         if (suspended != NULL) {
1559                 for (pair = nvlist_next_nvpair(suspended, NULL); pair != NULL;
1560                     pair = nvlist_next_nvpair(suspended, pair)) {
1561                         zil_resume((void *)(uintptr_t)
1562                             fnvpair_value_uint64(pair));
1563                 }
1564                 fnvlist_free(suspended);
1565         }
1566 
1567         return (error);
1568 }
1569 
1570 typedef struct dsl_dataset_snapshot_tmp_arg {


2165         return (0);
2166 }
2167 
2168 void
2169 dsl_dataset_stats(dsl_dataset_t *ds, nvlist_t *nv)
2170 {
2171         dsl_pool_t *dp = ds->ds_dir->dd_pool;
2172 
2173         ASSERT(dsl_pool_config_held(dp));
2174 
2175         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_REFRATIO,
2176             dsl_get_refratio(ds));
2177         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_LOGICALREFERENCED,
2178             dsl_get_logicalreferenced(ds));
2179         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_COMPRESSRATIO,
2180             dsl_get_compressratio(ds));
2181         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_USED,
2182             dsl_get_used(ds));
2183 
2184         if (ds->ds_is_snapshot) {
2185                 dsl_dataset_t *hds = NULL;
2186                 boolean_t modified = B_FALSE;
2187 
2188                 if (dsl_dataset_hold_obj(dp,
2189                     dsl_dir_phys(ds->ds_dir)->dd_head_dataset_obj,
2190                     FTAG, &hds) == 0) {
2191                         modified = dsl_dataset_modified_since_snap(hds, ds);
2192                         dsl_dataset_rele(hds, FTAG);
2193                 }
2194 
2195                 dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_MODIFIED,
2196                     modified ? 1 : 0);
2197 
2198                 get_clones_stat(ds, nv);
2199         } else {
2200                 char buf[ZFS_MAX_DATASET_NAME_LEN];
2201                 if (dsl_get_prev_snap(ds, buf) == 0)
2202                         dsl_prop_nvlist_add_string(nv, ZFS_PROP_PREV_SNAP,
2203                             buf);
2204                 dsl_dir_stats(ds->ds_dir, nv);
2205         }
2206 
2207         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_AVAILABLE,
2208             dsl_get_available(ds));
2209         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_REFERENCED,
2210             dsl_get_referenced(ds));
2211         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_CREATION,
2212             dsl_get_creation(ds));
2213         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_CREATETXG,
2214             dsl_get_creationtxg(ds));
2215         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_REFQUOTA,
2216             dsl_get_refquota(ds));
2217         dsl_prop_nvlist_add_uint64(nv, ZFS_PROP_REFRESERVATION,


2255                     sizeof (recvname) &&
2256                     strlcat(recvname, recv_clone_name, sizeof (recvname)) <
2257                     sizeof (recvname) &&
2258                     dsl_dataset_hold(dp, recvname, FTAG, &recv_ds) == 0) {
2259                         get_receive_resume_stats(recv_ds, nv);
2260                         dsl_dataset_rele(recv_ds, FTAG);
2261                 }
2262         }
2263 }
2264 
2265 void
2266 dsl_dataset_fast_stat(dsl_dataset_t *ds, dmu_objset_stats_t *stat)
2267 {
2268         dsl_pool_t *dp = ds->ds_dir->dd_pool;
2269         ASSERT(dsl_pool_config_held(dp));
2270 
2271         stat->dds_creation_txg = dsl_get_creationtxg(ds);
2272         stat->dds_inconsistent = dsl_get_inconsistent(ds);
2273         stat->dds_guid = dsl_get_guid(ds);
2274         stat->dds_origin[0] = '\0';
2275         stat->dds_is_snapshot = B_FALSE;
2276         stat->dds_is_autosnapshot = B_FALSE;
2277         if (ds->ds_is_snapshot) {
2278                 if (autosnap_is_autosnap(ds))
2279                         stat->dds_is_autosnapshot = B_TRUE;
2280                 else
2281                         stat->dds_is_snapshot = B_TRUE;
2282 
2283                 stat->dds_num_clones = dsl_get_numclones(ds);
2284         } else {
2285                 stat->dds_num_clones = 0;
2286 
2287                 if (dsl_dir_is_clone(ds->ds_dir)) {
2288                         dsl_dir_get_origin(ds->ds_dir, stat->dds_origin);
2289                 }
2290         }
2291 }
2292 
2293 uint64_t
2294 dsl_dataset_fsid_guid(dsl_dataset_t *ds)
2295 {
2296         return (ds->ds_fsid_guid);
2297 }
2298 
2299 void
2300 dsl_dataset_space(dsl_dataset_t *ds,
2301     uint64_t *refdbytesp, uint64_t *availbytesp,
2302     uint64_t *usedobjsp, uint64_t *availobjsp)
2303 {
2304         *refdbytesp = dsl_dataset_phys(ds)->ds_referenced_bytes;


2381                 error = SET_ERROR(EEXIST);
2382         else if (error == ENOENT)
2383                 error = 0;
2384 
2385         /* dataset name + 1 for the "@" + the new snapshot name must fit */
2386         if (dsl_dir_namelen(hds->ds_dir) + 1 +
2387             strlen(ddrsa->ddrsa_newsnapname) >= ZFS_MAX_DATASET_NAME_LEN)
2388                 error = SET_ERROR(ENAMETOOLONG);
2389 
2390         return (error);
2391 }
2392 
2393 static int
2394 dsl_dataset_rename_snapshot_check(void *arg, dmu_tx_t *tx)
2395 {
2396         dsl_dataset_rename_snapshot_arg_t *ddrsa = arg;
2397         dsl_pool_t *dp = dmu_tx_pool(tx);
2398         dsl_dataset_t *hds;
2399         int error;
2400 
2401         /* You cannot rename an autosnapshot */
2402         if (autosnap_check_name(ddrsa->ddrsa_oldsnapname))
2403                 return (SET_ERROR(EPERM));
2404 
2405         /* New name cannot match the AUTOSNAP prefix */
2406         if (autosnap_check_name(ddrsa->ddrsa_newsnapname))
2407                 return (SET_ERROR(EPERM));
2408 
2409         error = dsl_dataset_hold(dp, ddrsa->ddrsa_fsname, FTAG, &hds);
2410         if (error != 0)
2411                 return (error);
2412 
2413         if (ddrsa->ddrsa_recursive) {
2414                 error = dmu_objset_find_dp(dp, hds->ds_dir->dd_object,
2415                     dsl_dataset_rename_snapshot_check_impl, ddrsa,
2416                     DS_FIND_CHILDREN);
2417         } else {
2418                 error = dsl_dataset_rename_snapshot_check_impl(dp, hds, ddrsa);
2419         }
2420         dsl_dataset_rele(hds, FTAG);
2421         return (error);
2422 }
2423 
2424 static int
2425 dsl_dataset_rename_snapshot_sync_impl(dsl_pool_t *dp,
2426     dsl_dataset_t *hds, void *arg)
2427 {
2428         dsl_dataset_rename_snapshot_arg_t *ddrsa = arg;


3331          * The clone can't be too much over the head's refquota.
3332          *
3333          * To ensure that the entire refquota can be used, we allow one
3334          * transaction to exceed the refquota.  Therefore, this check
3335          * needs to also allow for the space referenced to be more than the
3336          * refquota.  The maximum amount of space that one transaction can use
3337          * on disk is DMU_MAX_ACCESS * spa_asize_inflation.  Allowing this
3338          * overage ensures that we are able to receive a filesystem that
3339          * exceeds the refquota on the source system.
3340          *
3341          * So that overage is the refquota_slack we use below.
3342          */
3343         if (origin_head->ds_quota != 0 &&
3344             dsl_dataset_phys(clone)->ds_referenced_bytes >
3345             origin_head->ds_quota + refquota_slack)
3346                 return (SET_ERROR(EDQUOT));
3347 
3348         return (0);
3349 }
3350 
3351 void
3352 dsl_dataset_clone_swap_sync_impl(dsl_dataset_t *clone,
3353     dsl_dataset_t *origin_head, dmu_tx_t *tx)
3354 {
3355         dsl_pool_t *dp = dmu_tx_pool(tx);
3356         int64_t unused_refres_delta;
3357 
3358         ASSERT(clone->ds_reserved == 0);
3359         /*
3360          * NOTE: On DEBUG kernels there could be a race between this and
3361          * the check function if spa_asize_inflation is adjusted...
3362          */
3363         ASSERT(origin_head->ds_quota == 0 ||
3364             dsl_dataset_phys(clone)->ds_unique_bytes <= origin_head->ds_quota +
3365             DMU_MAX_ACCESS * spa_asize_inflation);
3366         ASSERT3P(clone->ds_prev, ==, origin_head->ds_prev);
3367 
3368         /*
3369          * Swap per-dataset feature flags.
3370          */


3500         SWITCH64(dsl_dataset_phys(origin_head)->ds_uncompressed_bytes,
3501             dsl_dataset_phys(clone)->ds_uncompressed_bytes);
3502         SWITCH64(dsl_dataset_phys(origin_head)->ds_unique_bytes,
3503             dsl_dataset_phys(clone)->ds_unique_bytes);
3504 
3505         /* apply any parent delta for change in unconsumed refreservation */
3506         dsl_dir_diduse_space(origin_head->ds_dir, DD_USED_REFRSRV,
3507             unused_refres_delta, 0, 0, tx);
3508 
3509         /*
3510          * Swap deadlists.
3511          */
3512         dsl_deadlist_close(&clone->ds_deadlist);
3513         dsl_deadlist_close(&origin_head->ds_deadlist);
3514         SWITCH64(dsl_dataset_phys(origin_head)->ds_deadlist_obj,
3515             dsl_dataset_phys(clone)->ds_deadlist_obj);
3516         dsl_deadlist_open(&clone->ds_deadlist, dp->dp_meta_objset,
3517             dsl_dataset_phys(clone)->ds_deadlist_obj);
3518         dsl_deadlist_open(&origin_head->ds_deadlist, dp->dp_meta_objset,
3519             dsl_dataset_phys(origin_head)->ds_deadlist_obj);
3520 
3521         dsl_scan_ds_clone_swapped(origin_head, clone, tx);
3522 
3523         spa_history_log_internal_ds(clone, "clone swap", tx,
3524             "parent=%s", origin_head->ds_dir->dd_myname);
3525 }
3526 
3527 /*
3528  * Given a pool name and a dataset object number in that pool,
3529  * return the name of that dataset.
3530  */
3531 int
3532 dsl_dsobj_to_dsname(char *pname, uint64_t obj, char *buf)
3533 {
3534         dsl_pool_t *dp;
3535         dsl_dataset_t *ds;
3536         int error;
3537 
3538         error = dsl_pool_hold(pname, FTAG, &dp);
3539         if (error != 0)


3992         if (dsl_dir_phys(later->ds_dir)->dd_origin_obj == earlier->ds_object)
3993                 return (B_TRUE);
3994         dsl_dataset_t *origin;
3995         error = dsl_dataset_hold_obj(dp,
3996             dsl_dir_phys(later->ds_dir)->dd_origin_obj, FTAG, &origin);
3997         if (error != 0)
3998                 return (B_FALSE);
3999         ret = dsl_dataset_is_before(origin, earlier, earlier_txg);
4000         dsl_dataset_rele(origin, FTAG);
4001         return (ret);
4002 }
4003 
4004 void
4005 dsl_dataset_zapify(dsl_dataset_t *ds, dmu_tx_t *tx)
4006 {
4007         objset_t *mos = ds->ds_dir->dd_pool->dp_meta_objset;
4008         dmu_object_zapify(mos, ds->ds_object, DMU_OT_DSL_DATASET, tx);
4009 }
4010 
4011 boolean_t
4012 dataset_name_hidden(const char *name)
4013 {
4014         if (strchr(name, '$') != NULL)
4015                 return (B_TRUE);
4016         if (strchr(name, '%') != NULL)
4017                 return (B_TRUE);
4018         if (!INGLOBALZONE(curproc) && !zone_dataset_visible(name, NULL))
4019                 return (B_TRUE);
4020         return (B_FALSE);
4021 }
4022 
4023 uint64_t
4024 dsl_dataset_creation_txg(const char *name)
4025 {
4026         dsl_pool_t *dp;
4027         dsl_dataset_t *ds;
4028         int err;
4029         uint64_t ret = UINT64_MAX;
4030 
4031         err = dsl_pool_hold(name, FTAG, &dp);
4032 
4033         if (err)
4034                 return (ret);
4035 
4036         err = dsl_dataset_hold(dp, name, FTAG, &ds);
4037 
4038         if (!err) {
4039                 ret = dsl_dataset_phys(ds)->ds_creation_txg;
4040                 dsl_dataset_rele(ds, FTAG);
4041         }
4042 
4043         dsl_pool_rele(dp, FTAG);
4044 
4045         return (ret);
4046 }
4047 
4048 boolean_t
4049 dsl_dataset_is_zapified(dsl_dataset_t *ds)
4050 {
4051         dmu_object_info_t doi;
4052 
4053         dmu_object_info_from_db(ds->ds_dbuf, &doi);
4054         return (doi.doi_type == DMU_OTN_ZAP_METADATA);
4055 }
4056 
4057 boolean_t
4058 dsl_dataset_has_resume_receive_state(dsl_dataset_t *ds)
4059 {
4060         return (dsl_dataset_is_zapified(ds) &&
4061             zap_contains(ds->ds_dir->dd_pool->dp_meta_objset,
4062             ds->ds_object, DS_FIELD_RESUME_TOGUID) == 0);
4063 }