NEX-6855 System fails to boot up after a large number of datasets created
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-9301 BAD Trap: Double Fault panic on zfs destroy snapshot
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-7641 Impossible to remove special vdev from pool if WBC-ed dataset was removed before disabling WBC
Reviewed by: Alek Pinchuk <alek@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-5795 Rename 'wrc' as 'wbc' in the source and in the tech docs
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
2605 want to resume interrupted zfs send
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Arne Jansen <sensille@gmx.net>
Approved by: Dan McDonald <danmcd@omniti.com>
6047 SPARC boot should support feature@embedded_data
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
5959 clean up per-dataset feature count code
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
NEX-4582 update wrc test cases for allow to use write back cache per tree of datasets
Reviewed by: Steve Peng <steve.peng@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Revert "NEX-4476 WRC: Allow to use write back cache per tree of datasets"
This reverts commit fe97b74444278a6f36fec93179133641296312da.
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-3964 It should not be allowed to rename a snapshot that its new name is matched to the prefix of in-kernel autosnapshots (lint)
NEX-3964 It should not be allowed to rename a snapshot that its new name is matched to the prefix of in-kernel autosnapshots
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
NEX-3558 KRRP Integration
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Garrett D'Amore <garrett@damore.org>
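
In the new revision shown in this review, dsl_dataset_remove_clones_key() obtains its zap_cursor_t and zap_attribute_t with kmem_alloc(..., KM_SLEEP) instead of declaring them as locals. The function recurses once per nested clone, so this plausibly serves to keep each stack frame small (possibly in connection with the NEX-9301 double-fault panic listed above, although the log does not say so). The following is a minimal userland sketch of that general pattern, not the kernel code: malloc()/free() stand in for kmem_alloc()/kmem_free(), and scratch_t, node_t and sum_tree() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large per-call scratch structure. */
typedef struct scratch {
	char buf[4096];
} scratch_t;

/* Hypothetical tree node; a NULL child means "no child". */
typedef struct node {
	struct node *child[2];
	int value;
} node_t;

/*
 * Recursively sum the values in a tree.  The scratch buffer lives on
 * the heap, so each recursion level costs a pointer on the stack
 * rather than sizeof (scratch_t) bytes.
 */
static long
sum_tree(const node_t *n)
{
	scratch_t *s;
	long total;
	int len;

	if (n == NULL)
		return (0);

	s = malloc(sizeof (scratch_t));
	if (s == NULL)
		abort();

	/* Use the scratch space: round-trip the value through text. */
	len = snprintf(s->buf, sizeof (s->buf), "%d", n->value);
	assert(len > 0 && (size_t)len < sizeof (s->buf));
	total = strtol(s->buf, NULL, 10);

	for (int i = 0; i < 2; i++)
		total += sum_tree(n->child[i]);

	/* Free on every path that allocated, as the kernel code does. */
	free(s);
	return (total);
}
```

The trade-off is one allocation per call in exchange for a much shallower stack footprint per recursion level, which matters more in a kernel thread with a small fixed stack than in a typical userland program.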


   3  *
   4  * The contents of this file are subject to the terms of the
   5  * Common Development and Distribution License (the "License").
   6  * You may not use this file except in compliance with the License.
   7  *
   8  * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
   9  * or http://www.opensolaris.org/os/licensing.
  10  * See the License for the specific language governing permissions
  11  * and limitations under the License.
  12  *
  13  * When distributing Covered Code, include this CDDL HEADER in each
  14  * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  15  * If applicable, add the following below this CDDL HEADER, with the
  16  * fields enclosed by brackets "[]" replaced with your own identifying
  17  * information: Portions Copyright [yyyy] [name of copyright owner]
  18  *
  19  * CDDL HEADER END
  20  */
  21 /*
  22  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
  23  * Copyright (c) 2012, 2017 by Delphix. All rights reserved.
  24  * Copyright (c) 2013 Steven Hartland. All rights reserved.
  25  * Copyright (c) 2013 by Joyent, Inc. All rights reserved.
  26  * Copyright (c) 2014 Integros [integros.com]

  27  */
  28 

  29 #include <sys/zfs_context.h>
  30 #include <sys/dsl_userhold.h>
  31 #include <sys/dsl_dataset.h>
  32 #include <sys/dsl_synctask.h>
  33 #include <sys/dsl_destroy.h>
  34 #include <sys/dmu_tx.h>
  35 #include <sys/dsl_pool.h>
  36 #include <sys/dsl_dir.h>
  37 #include <sys/dmu_traverse.h>
  38 #include <sys/dsl_scan.h>
  39 #include <sys/dmu_objset.h>
  40 #include <sys/zap.h>
  41 #include <sys/zfeature.h>
  42 #include <sys/zfs_ioctl.h>
  43 #include <sys/dsl_deleg.h>
  44 #include <sys/dmu_impl.h>

  45 #include <sys/zcp.h>
  46 
  47 int
  48 dsl_destroy_snapshot_check_impl(dsl_dataset_t *ds, boolean_t defer)
  49 {
  50         if (!ds->ds_is_snapshot)
  51                 return (SET_ERROR(EINVAL));
  52 
  53         if (dsl_dataset_long_held(ds))
  54                 return (SET_ERROR(EBUSY));
  55 
  56         /*
  57          * Only allow deferred destroy on pools that support it.
  58          * NOTE: deferred destroy is only supported on snapshots.
  59          */
  60         if (defer) {
  61                 if (spa_version(ds->ds_dir->dd_pool->dp_spa) <
  62                     SPA_VERSION_USERREFS)
  63                         return (SET_ERROR(ENOTSUP));
  64                 return (0);


 167         dsl_dir_diduse_space(ds->ds_dir, DD_USED_SNAP,
 168             -poa.used, -poa.comp, -poa.uncomp, tx);
 169 
 170         /* swap next's deadlist to our deadlist */
 171         dsl_deadlist_close(&ds->ds_deadlist);
 172         dsl_deadlist_close(&ds_next->ds_deadlist);
 173         deadlist_obj = dsl_dataset_phys(ds)->ds_deadlist_obj;
 174         dsl_dataset_phys(ds)->ds_deadlist_obj =
 175             dsl_dataset_phys(ds_next)->ds_deadlist_obj;
 176         dsl_dataset_phys(ds_next)->ds_deadlist_obj = deadlist_obj;
 177         dsl_deadlist_open(&ds->ds_deadlist, mos,
 178             dsl_dataset_phys(ds)->ds_deadlist_obj);
 179         dsl_deadlist_open(&ds_next->ds_deadlist, mos,
 180             dsl_dataset_phys(ds_next)->ds_deadlist_obj);
 181 }
 182 
 183 static void
 184 dsl_dataset_remove_clones_key(dsl_dataset_t *ds, uint64_t mintxg, dmu_tx_t *tx)
 185 {
 186         objset_t *mos = ds->ds_dir->dd_pool->dp_meta_objset;
 187         zap_cursor_t zc;
 188         zap_attribute_t za;
 189 
 190         /*
 191          * If it is the old version, dd_clones doesn't exist so we can't
 192          * find the clones, but dsl_deadlist_remove_key() is a no-op so it
 193          * doesn't matter.
 194          */
 195         if (dsl_dir_phys(ds->ds_dir)->dd_clones == 0)
 196                 return;


 197 
 198         for (zap_cursor_init(&zc, mos, dsl_dir_phys(ds->ds_dir)->dd_clones);
 199             zap_cursor_retrieve(&zc, &za) == 0;
 200             zap_cursor_advance(&zc)) {
 201                 dsl_dataset_t *clone;
 202 
 203                 VERIFY0(dsl_dataset_hold_obj(ds->ds_dir->dd_pool,
 204                     za.za_first_integer, FTAG, &clone));
 205                 if (clone->ds_dir->dd_origin_txg > mintxg) {
 206                         dsl_deadlist_remove_key(&clone->ds_deadlist,
 207                             mintxg, tx);
 208                         if (dsl_dataset_remap_deadlist_exists(clone)) {
 209                                 dsl_deadlist_remove_key(
 210                                     &clone->ds_remap_deadlist, mintxg, tx);
 211                         }
 212                         dsl_dataset_remove_clones_key(clone, mintxg, tx);
 213                 }
 214                 dsl_dataset_rele(clone, FTAG);
 215         }
 216         zap_cursor_fini(&zc);


 217 }
 218 
 219 static void
 220 dsl_destroy_snapshot_handle_remaps(dsl_dataset_t *ds, dsl_dataset_t *ds_next,
 221     dmu_tx_t *tx)
 222 {
 223         dsl_pool_t *dp = ds->ds_dir->dd_pool;
 224 
 225         /* Move blocks to be obsoleted to pool's obsolete list. */
 226         if (dsl_dataset_remap_deadlist_exists(ds_next)) {
 227                 if (!bpobj_is_open(&dp->dp_obsolete_bpobj))
 228                         dsl_pool_create_obsolete_bpobj(dp, tx);
 229 
 230                 dsl_deadlist_move_bpobj(&ds_next->ds_remap_deadlist,
 231                     &dp->dp_obsolete_bpobj,
 232                     dsl_dataset_phys(ds)->ds_prev_snap_txg, tx);
 233         }
 234 
 235         /* Merge our deadlist into next's and free it. */
 236         if (dsl_dataset_remap_deadlist_exists(ds)) {
 237                 uint64_t remap_deadlist_object =
 238                     dsl_dataset_get_remap_deadlist_object(ds);
 239                 ASSERT(remap_deadlist_object != 0);
 240 
 241                 mutex_enter(&ds_next->ds_remap_deadlist_lock);
 242                 if (!dsl_dataset_remap_deadlist_exists(ds_next))
 243                         dsl_dataset_create_remap_deadlist(ds_next, tx);
 244                 mutex_exit(&ds_next->ds_remap_deadlist_lock);
 245 
 246                 dsl_deadlist_merge(&ds_next->ds_remap_deadlist,
 247                     remap_deadlist_object, tx);
 248                 dsl_dataset_destroy_remap_deadlist(ds, tx);
 249         }
 250 }
 251 
 252 void
 253 dsl_destroy_snapshot_sync_impl(dsl_dataset_t *ds, boolean_t defer, dmu_tx_t *tx)
 254 {
 255         int err;
 256         int after_branch_point = FALSE;
 257         dsl_pool_t *dp = ds->ds_dir->dd_pool;


 258         objset_t *mos = dp->dp_meta_objset;
 259         dsl_dataset_t *ds_prev = NULL;
 260         uint64_t obj;
 261 
 262         ASSERT(RRW_WRITE_HELD(&dp->dp_config_rwlock));
 263         rrw_enter(&ds->ds_bp_rwlock, RW_READER, FTAG);
 264         ASSERT3U(dsl_dataset_phys(ds)->ds_bp.blk_birth, <=, tx->tx_txg);
 265         rrw_exit(&ds->ds_bp_rwlock, FTAG);
 266         ASSERT(refcount_is_zero(&ds->ds_longholds));
 267 
 268         if (defer &&
 269             (ds->ds_userrefs > 0 ||
 270             dsl_dataset_phys(ds)->ds_num_children > 1)) {
 271                 ASSERT(spa_version(dp->dp_spa) >= SPA_VERSION_USERREFS);
 272                 dmu_buf_will_dirty(ds->ds_dbuf, tx);
 273                 dsl_dataset_phys(ds)->ds_flags |= DS_FLAG_DEFER_DESTROY;
 274                 spa_history_log_internal_ds(ds, "defer_destroy", tx, "");
 275                 return;
 276         }
 277 
 278         ASSERT3U(dsl_dataset_phys(ds)->ds_num_children, <=, 1);
 279 
 280         /* We need to log before removing it from the namespace. */
 281         spa_history_log_internal_ds(ds, "destroy", tx, "");
 282 
 283         dsl_scan_ds_destroyed(ds, tx);
 284 
 285         obj = ds->ds_object;
 286 
 287         for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {


 347                 }
 348 
 349                 /* Adjust snapused. */
 350                 dsl_deadlist_space_range(&ds_next->ds_deadlist,
 351                     dsl_dataset_phys(ds)->ds_prev_snap_txg, UINT64_MAX,
 352                     &used, &comp, &uncomp);
 353                 dsl_dir_diduse_space(ds->ds_dir, DD_USED_SNAP,
 354                     -used, -comp, -uncomp, tx);
 355 
 356                 /* Move blocks to be freed to pool's free list. */
 357                 dsl_deadlist_move_bpobj(&ds_next->ds_deadlist,
 358                     &dp->dp_free_bpobj, dsl_dataset_phys(ds)->ds_prev_snap_txg,
 359                     tx);
 360                 dsl_dir_diduse_space(tx->tx_pool->dp_free_dir,
 361                     DD_USED_HEAD, used, comp, uncomp, tx);
 362 
 363                 /* Merge our deadlist into next's and free it. */
 364                 dsl_deadlist_merge(&ds_next->ds_deadlist,
 365                     dsl_dataset_phys(ds)->ds_deadlist_obj, tx);
 366         }
 367 
 368         dsl_deadlist_close(&ds->ds_deadlist);
 369         dsl_deadlist_free(mos, dsl_dataset_phys(ds)->ds_deadlist_obj, tx);
 370         dmu_buf_will_dirty(ds->ds_dbuf, tx);
 371         dsl_dataset_phys(ds)->ds_deadlist_obj = 0;
 372 
 373         dsl_destroy_snapshot_handle_remaps(ds, ds_next, tx);
 374 
 375         /* Collapse range in clone heads */
 376         dsl_dataset_remove_clones_key(ds,
 377             dsl_dataset_phys(ds)->ds_creation_txg, tx);
 378 
 379         if (ds_next->ds_is_snapshot) {
 380                 dsl_dataset_t *ds_nextnext;
 381 
 382                 /*
 383                  * Update next's unique to include blocks which
 384                  * were previously shared by only this snapshot
 385                  * and it.  Those blocks will be born after the
 386                  * prev snap and before this snap, and will have
 387                  * died after the next snap and before the one
 388                  * after that (ie. be on the snap after next's
 389                  * deadlist).
 390                  */
 391                 VERIFY0(dsl_dataset_hold_obj(dp,
 392                     dsl_dataset_phys(ds_next)->ds_next_snap_obj,
 393                     FTAG, &ds_nextnext));
 394                 dsl_deadlist_space_range(&ds_nextnext->ds_deadlist,
 395                     dsl_dataset_phys(ds)->ds_prev_snap_txg,
 396                     dsl_dataset_phys(ds)->ds_creation_txg,
 397                     &used, &comp, &uncomp);
 398                 dsl_dataset_phys(ds_next)->ds_unique_bytes += used;
 399                 dsl_dataset_rele(ds_nextnext, FTAG);
 400                 ASSERT3P(ds_next->ds_prev, ==, NULL);
 401 
 402                 /* Collapse range in this head. */
 403                 dsl_dataset_t *hds;
 404                 VERIFY0(dsl_dataset_hold_obj(dp,
 405                     dsl_dir_phys(ds->ds_dir)->dd_head_dataset_obj, FTAG, &hds));
 406                 dsl_deadlist_remove_key(&hds->ds_deadlist,
 407                     dsl_dataset_phys(ds)->ds_creation_txg, tx);
 408                 if (dsl_dataset_remap_deadlist_exists(hds)) {
 409                         dsl_deadlist_remove_key(&hds->ds_remap_deadlist,
 410                             dsl_dataset_phys(ds)->ds_creation_txg, tx);
 411                 }
 412                 dsl_dataset_rele(hds, FTAG);
 413 
 414         } else {
 415                 ASSERT3P(ds_next->ds_prev, ==, ds);
 416                 dsl_dataset_rele(ds_next->ds_prev, ds_next);
 417                 ds_next->ds_prev = NULL;
 418                 if (ds_prev) {
 419                         VERIFY0(dsl_dataset_hold_obj(dp,
 420                             dsl_dataset_phys(ds)->ds_prev_snap_obj,
 421                             ds_next, &ds_next->ds_prev));
 422                 }
 423 
 424                 dsl_dataset_recalc_head_uniq(ds_next);
 425 
 426                 /*
 427                  * Reduce the amount of our unconsumed refreservation
 428                  * being charged to our parent by the amount of
 429                  * new unique data we have gained.
 430                  */
 431                 if (old_unique < ds_next->ds_reserved) {


 491                     tx));
 492         dsl_dir_rele(ds->ds_dir, ds);
 493         ds->ds_dir = NULL;
 494         dmu_object_free_zapified(mos, obj, tx);
 495 }
 496 
 497 void
 498 dsl_destroy_snapshot_sync(void *arg, dmu_tx_t *tx)
 499 {
 500         dsl_destroy_snapshot_arg_t *ddsa = arg;
 501         const char *dsname = ddsa->ddsa_name;
 502         boolean_t defer = ddsa->ddsa_defer;
 503 
 504         dsl_pool_t *dp = dmu_tx_pool(tx);
 505         dsl_dataset_t *ds;
 506 
 507         int error = dsl_dataset_hold(dp, dsname, FTAG, &ds);
 508         if (error == ENOENT)
 509                 return;
 510         ASSERT0(error);




 511         dsl_destroy_snapshot_sync_impl(ds, defer, tx);
 512         dsl_dataset_rele(ds, FTAG);
 513 }
 514 
 515 /*
 516  * The semantics of this function are described in the comment above
 517  * lzc_destroy_snaps().  To summarize:
 518  *
 519  * The snapshots must all be in the same pool.
 520  *
 521  * Snapshots that don't exist will be silently ignored (considered to be
 522  * "already deleted").
 523  *
 524  * On success, all snaps will be destroyed and this will return 0.
 525  * On failure, no snaps will be destroyed, the errlist will be filled in,
 526  * and this will return an errno.
 527  */
 528 int
 529 dsl_destroy_snapshots_nvl(nvlist_t *snaps, boolean_t defer,
 530     nvlist_t *errlist)


 840 
 841         if (dsl_dataset_phys(ds)->ds_prev_snap_obj != 0) {
 842                 /* This is a clone */
 843                 ASSERT(ds->ds_prev != NULL);
 844                 ASSERT3U(dsl_dataset_phys(ds->ds_prev)->ds_next_snap_obj, !=,
 845                     obj);
 846                 ASSERT0(dsl_dataset_phys(ds)->ds_next_snap_obj);
 847 
 848                 dmu_buf_will_dirty(ds->ds_prev->ds_dbuf, tx);
 849                 if (dsl_dataset_phys(ds->ds_prev)->ds_next_clones_obj != 0) {
 850                         dsl_dataset_remove_from_next_clones(ds->ds_prev,
 851                             obj, tx);
 852                 }
 853 
 854                 ASSERT3U(dsl_dataset_phys(ds->ds_prev)->ds_num_children, >, 1);
 855                 dsl_dataset_phys(ds->ds_prev)->ds_num_children--;
 856         }
 857 
 858         /*
 859          * Destroy the deadlist.  Unless it's a clone, the
 860          * deadlist should be empty since the dataset has no snapshots.
 861          * (If it's a clone, it's safe to ignore the deadlist contents
 862          * since they are still referenced by the origin snapshot.)
 863          */
 864         dsl_deadlist_close(&ds->ds_deadlist);
 865         dsl_deadlist_free(mos, dsl_dataset_phys(ds)->ds_deadlist_obj, tx);
 866         dmu_buf_will_dirty(ds->ds_dbuf, tx);
 867         dsl_dataset_phys(ds)->ds_deadlist_obj = 0;
 868 
 869         if (dsl_dataset_remap_deadlist_exists(ds))
 870                 dsl_dataset_destroy_remap_deadlist(ds, tx);
 871 
 872         objset_t *os;
 873         VERIFY0(dmu_objset_from_ds(ds, &os));
 874 
 875         if (!spa_feature_is_enabled(dp->dp_spa, SPA_FEATURE_ASYNC_DESTROY)) {
 876                 old_synchronous_dataset_destroy(ds, tx);
 877         } else {
 878                 /*
 879                  * Move the bptree into the pool's list of trees to
 880                  * clean up and update space accounting information.
 881                  */
 882                 uint64_t used, comp, uncomp;
 883 
 884                 zil_destroy_sync(dmu_objset_zil(os), tx);
 885 
 886                 if (!spa_feature_is_active(dp->dp_spa,
 887                     SPA_FEATURE_ASYNC_DESTROY)) {
 888                         dsl_scan_t *scn = dp->dp_scan;
 889                         spa_feature_incr(dp->dp_spa, SPA_FEATURE_ASYNC_DESTROY,
 890                             tx);
 891                         dp->dp_bptree_obj = bptree_alloc(mos, tx);
 892                         VERIFY0(zap_add(mos,
 893                             DMU_POOL_DIRECTORY_OBJECT,
 894                             DMU_POOL_BPTREE_OBJ, sizeof (uint64_t), 1,


 996         spa_history_log_internal_ds(ds, "destroy begin", tx, "");
 997         dsl_dataset_rele(ds, FTAG);
 998 }
 999 
1000 int
1001 dsl_destroy_head(const char *name)
1002 {
1003         dsl_destroy_head_arg_t ddha;
1004         int error;
1005         spa_t *spa;
1006         boolean_t isenabled;
1007 
1008 #ifdef _KERNEL
1009         zfs_destroy_unmount_origin(name);
1010 #endif
1011 
1012         error = spa_open(name, &spa, FTAG);
1013         if (error != 0)
1014                 return (error);
1015         isenabled = spa_feature_is_enabled(spa, SPA_FEATURE_ASYNC_DESTROY);

1016         spa_close(spa, FTAG);
1017 
1018         ddha.ddha_name = name;
1019 
1020         if (!isenabled) {
1021                 objset_t *os;
1022 
1023                 error = dsl_sync_task(name, dsl_destroy_head_check,
1024                     dsl_destroy_head_begin_sync, &ddha,
1025                     0, ZFS_SPACE_CHECK_NONE);
1026                 if (error != 0)
1027                         return (error);
1028 
1029                 /*
1030                  * Head deletion is processed in one txg on old pools;
1031                  * remove the objects from open context so that the txg sync
1032                  * is not too long.
1033                  */
1034                 error = dmu_objset_own(name, DMU_OST_ANY, B_FALSE, FTAG, &os);
1035                 if (error == 0) {
1036                         uint64_t prev_snap_txg =
1037                             dsl_dataset_phys(dmu_objset_ds(os))->
1038                             ds_prev_snap_txg;
1039                         for (uint64_t obj = 0; error == 0;
1040                             error = dmu_object_next(os, &obj, FALSE,
1041                             prev_snap_txg))
1042                                 (void) dmu_free_long_object(os, obj);
1043                         /* sync out all frees */
1044                         txg_wait_synced(dmu_objset_pool(os), 0);
1045                         dmu_objset_disown(os, FTAG);
1046                 }
1047         }
1048 
1049         return (dsl_sync_task(name, dsl_destroy_head_check,
1050             dsl_destroy_head_sync, &ddha, 0, ZFS_SPACE_CHECK_NONE));
1051 }
1052 
1053 /*
1054  * Note, this function is used as the callback for dmu_objset_find().  We
1055  * always return 0 so that we will continue to find and process
1056  * inconsistent datasets, even if we encounter an error trying to
1057  * process one of them.
1058  */




1059 /* ARGSUSED */
1060 int
1061 dsl_destroy_inconsistent(const char *dsname, void *arg)

1062 {
1063         objset_t *os;


1064 
1065         if (dmu_objset_hold(dsname, FTAG, &os) == 0) {
1066                 boolean_t need_destroy = DS_IS_INCONSISTENT(dmu_objset_ds(os));
1067 
1068                 /*
1069                  * If the dataset is inconsistent because a resumable receive
1070                  * has failed, then do not destroy it.
1071                  */
1072                 if (dsl_dataset_has_resume_receive_state(dmu_objset_ds(os)))
1073                         need_destroy = B_FALSE;
1074 
1075                 dmu_objset_rele(os, FTAG);
1076                 if (need_destroy)
1077                         (void) dsl_destroy_head(dsname);
1078         }
1079         return (0);
1080 }


   3  *
   4  * The contents of this file are subject to the terms of the
   5  * Common Development and Distribution License (the "License").
   6  * You may not use this file except in compliance with the License.
   7  *
   8  * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
   9  * or http://www.opensolaris.org/os/licensing.
  10  * See the License for the specific language governing permissions
  11  * and limitations under the License.
  12  *
  13  * When distributing Covered Code, include this CDDL HEADER in each
  14  * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  15  * If applicable, add the following below this CDDL HEADER, with the
  16  * fields enclosed by brackets "[]" replaced with your own identifying
  17  * information: Portions Copyright [yyyy] [name of copyright owner]
  18  *
  19  * CDDL HEADER END
  20  */
  21 /*
  22  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
  23  * Copyright (c) 2012, 2016 by Delphix. All rights reserved.
  24  * Copyright (c) 2013 Steven Hartland. All rights reserved.
  25  * Copyright (c) 2013 by Joyent, Inc. All rights reserved.
  26  * Copyright (c) 2014 Integros [integros.com]
  27  * Copyright 2016 Nexenta Systems, Inc. All rights reserved.
  28  */
  29 
  30 #include <sys/autosnap.h>
  31 #include <sys/zfs_context.h>
  32 #include <sys/dsl_userhold.h>
  33 #include <sys/dsl_dataset.h>
  34 #include <sys/dsl_synctask.h>
  35 #include <sys/dsl_destroy.h>
  36 #include <sys/dmu_tx.h>
  37 #include <sys/dsl_pool.h>
  38 #include <sys/dsl_dir.h>
  39 #include <sys/dmu_traverse.h>
  40 #include <sys/dsl_scan.h>
  41 #include <sys/dmu_objset.h>
  42 #include <sys/zap.h>
  43 #include <sys/zfeature.h>
  44 #include <sys/zfs_ioctl.h>
  45 #include <sys/dsl_deleg.h>
  46 #include <sys/dmu_impl.h>
  47 #include <sys/wbc.h>
  48 #include <sys/zcp.h>
  49 
  50 int
  51 dsl_destroy_snapshot_check_impl(dsl_dataset_t *ds, boolean_t defer)
  52 {
  53         if (!ds->ds_is_snapshot)
  54                 return (SET_ERROR(EINVAL));
  55 
  56         if (dsl_dataset_long_held(ds))
  57                 return (SET_ERROR(EBUSY));
  58 
  59         /*
  60          * Only allow deferred destroy on pools that support it.
  61          * NOTE: deferred destroy is only supported on snapshots.
  62          */
  63         if (defer) {
  64                 if (spa_version(ds->ds_dir->dd_pool->dp_spa) <
  65                     SPA_VERSION_USERREFS)
  66                         return (SET_ERROR(ENOTSUP));
  67                 return (0);


 170         dsl_dir_diduse_space(ds->ds_dir, DD_USED_SNAP,
 171             -poa.used, -poa.comp, -poa.uncomp, tx);
 172 
 173         /* swap next's deadlist to our deadlist */
 174         dsl_deadlist_close(&ds->ds_deadlist);
 175         dsl_deadlist_close(&ds_next->ds_deadlist);
 176         deadlist_obj = dsl_dataset_phys(ds)->ds_deadlist_obj;
 177         dsl_dataset_phys(ds)->ds_deadlist_obj =
 178             dsl_dataset_phys(ds_next)->ds_deadlist_obj;
 179         dsl_dataset_phys(ds_next)->ds_deadlist_obj = deadlist_obj;
 180         dsl_deadlist_open(&ds->ds_deadlist, mos,
 181             dsl_dataset_phys(ds)->ds_deadlist_obj);
 182         dsl_deadlist_open(&ds_next->ds_deadlist, mos,
 183             dsl_dataset_phys(ds_next)->ds_deadlist_obj);
 184 }
 185 
 186 static void
 187 dsl_dataset_remove_clones_key(dsl_dataset_t *ds, uint64_t mintxg, dmu_tx_t *tx)
 188 {
 189         objset_t *mos = ds->ds_dir->dd_pool->dp_meta_objset;
 190         zap_cursor_t *zc;
 191         zap_attribute_t *za;
 192 
 193         /*
 194          * If it is the old version, dd_clones doesn't exist so we can't
 195          * find the clones, but dsl_deadlist_remove_key() is a no-op so it
 196          * doesn't matter.
 197          */
 198         if (dsl_dir_phys(ds->ds_dir)->dd_clones == 0)
 199                 return;
 200         zc = kmem_alloc(sizeof (zap_cursor_t), KM_SLEEP);
 201         za = kmem_alloc(sizeof (zap_attribute_t), KM_SLEEP);
 202 
 203         for (zap_cursor_init(zc, mos, dsl_dir_phys(ds->ds_dir)->dd_clones);
 204             zap_cursor_retrieve(zc, za) == 0;
 205             zap_cursor_advance(zc)) {
 206                 dsl_dataset_t *clone;
 207 
 208                 VERIFY0(dsl_dataset_hold_obj(ds->ds_dir->dd_pool,
 209                     za->za_first_integer, FTAG, &clone));
 210                 if (clone->ds_dir->dd_origin_txg > mintxg) {
 211                         dsl_deadlist_remove_key(&clone->ds_deadlist,
 212                             mintxg, tx);




 213                         dsl_dataset_remove_clones_key(clone, mintxg, tx);
 214                 }
 215                 dsl_dataset_rele(clone, FTAG);
 216         }
 217         zap_cursor_fini(zc);
 218         kmem_free(zc, sizeof (zap_cursor_t));
 219         kmem_free(za, sizeof (zap_attribute_t));
 220 }
 221 
 222 void
 223 dsl_destroy_snapshot_sync_impl(dsl_dataset_t *ds, boolean_t defer, dmu_tx_t *tx)
 224 {
 225         int err;
 226         int after_branch_point = FALSE;
 227         dsl_pool_t *dp = ds->ds_dir->dd_pool;
 228         spa_t *spa = dp->dp_spa;
 229         wbc_data_t *wbc_data = spa_get_wbc_data(spa);
 230         objset_t *mos = dp->dp_meta_objset;
 231         dsl_dataset_t *ds_prev = NULL;
 232         uint64_t obj;
 233 
 234         ASSERT(RRW_WRITE_HELD(&dp->dp_config_rwlock));
 235         rrw_enter(&ds->ds_bp_rwlock, RW_READER, FTAG);
 236         ASSERT3U(dsl_dataset_phys(ds)->ds_bp.blk_birth, <=, tx->tx_txg);
 237         rrw_exit(&ds->ds_bp_rwlock, FTAG);
 238         ASSERT(refcount_is_zero(&ds->ds_longholds));
 239 
 240         /*
 241          * If an edge snapshot of the WBC window is destroyed, the window must
 242          * be aborted.
 243          */
 244         mutex_enter(&wbc_data->wbc_lock);
 245         if (dsl_dataset_phys(ds)->ds_creation_txg == wbc_data->wbc_finish_txg)
 246                 wbc_purge_window(spa, tx);
 247         mutex_exit(&wbc_data->wbc_lock);
 248 
 249         if (defer &&
 250             (ds->ds_userrefs > 0 ||
 251             dsl_dataset_phys(ds)->ds_num_children > 1)) {
 252                 ASSERT(spa_version(dp->dp_spa) >= SPA_VERSION_USERREFS);
 253                 dmu_buf_will_dirty(ds->ds_dbuf, tx);
 254                 dsl_dataset_phys(ds)->ds_flags |= DS_FLAG_DEFER_DESTROY;
 255                 spa_history_log_internal_ds(ds, "defer_destroy", tx, "");
 256                 return;
 257         }
 258 
 259         ASSERT3U(dsl_dataset_phys(ds)->ds_num_children, <=, 1);
 260 
 261         /* We need to log before removing it from the namespace. */
 262         spa_history_log_internal_ds(ds, "destroy", tx, "");
 263 
 264         dsl_scan_ds_destroyed(ds, tx);
 265 
 266         obj = ds->ds_object;
 267 
 268         for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {


 328                 }
 329 
 330                 /* Adjust snapused. */
 331                 dsl_deadlist_space_range(&ds_next->ds_deadlist,
 332                     dsl_dataset_phys(ds)->ds_prev_snap_txg, UINT64_MAX,
 333                     &used, &comp, &uncomp);
 334                 dsl_dir_diduse_space(ds->ds_dir, DD_USED_SNAP,
 335                     -used, -comp, -uncomp, tx);
 336 
 337                 /* Move blocks to be freed to pool's free list. */
 338                 dsl_deadlist_move_bpobj(&ds_next->ds_deadlist,
 339                     &dp->dp_free_bpobj, dsl_dataset_phys(ds)->ds_prev_snap_txg,
 340                     tx);
 341                 dsl_dir_diduse_space(tx->tx_pool->dp_free_dir,
 342                     DD_USED_HEAD, used, comp, uncomp, tx);
 343 
 344                 /* Merge our deadlist into next's and free it. */
 345                 dsl_deadlist_merge(&ds_next->ds_deadlist,
 346                     dsl_dataset_phys(ds)->ds_deadlist_obj, tx);
 347         }

 348         dsl_deadlist_close(&ds->ds_deadlist);
 349         dsl_deadlist_free(mos, dsl_dataset_phys(ds)->ds_deadlist_obj, tx);
 350         dmu_buf_will_dirty(ds->ds_dbuf, tx);
 351         dsl_dataset_phys(ds)->ds_deadlist_obj = 0;
 352 


 353         /* Collapse range in clone heads */
 354         dsl_dataset_remove_clones_key(ds,
 355             dsl_dataset_phys(ds)->ds_creation_txg, tx);
 356 
 357         if (ds_next->ds_is_snapshot) {
 358                 dsl_dataset_t *ds_nextnext;
 359 
 360                 /*
 361                  * Update next's unique to include blocks which
 362                  * were previously shared by only this snapshot
 363                  * and it.  Those blocks will be born after the
 364                  * prev snap and before this snap, and will have
 365                  * died after the next snap and before the one
 366                  * after that (ie. be on the snap after next's
 367                  * deadlist).
 368                  */
 369                 VERIFY0(dsl_dataset_hold_obj(dp,
 370                     dsl_dataset_phys(ds_next)->ds_next_snap_obj,
 371                     FTAG, &ds_nextnext));
 372                 dsl_deadlist_space_range(&ds_nextnext->ds_deadlist,
 373                     dsl_dataset_phys(ds)->ds_prev_snap_txg,
 374                     dsl_dataset_phys(ds)->ds_creation_txg,
 375                     &used, &comp, &uncomp);
 376                 dsl_dataset_phys(ds_next)->ds_unique_bytes += used;
 377                 dsl_dataset_rele(ds_nextnext, FTAG);
 378                 ASSERT3P(ds_next->ds_prev, ==, NULL);
 379 
 380                 /* Collapse range in this head. */
 381                 dsl_dataset_t *hds;
 382                 VERIFY0(dsl_dataset_hold_obj(dp,
 383                     dsl_dir_phys(ds->ds_dir)->dd_head_dataset_obj, FTAG, &hds));
 384                 dsl_deadlist_remove_key(&hds->ds_deadlist,
 385                     dsl_dataset_phys(ds)->ds_creation_txg, tx);
 386                 dsl_dataset_rele(hds, FTAG);
 387 
 388         } else {
 389                 ASSERT3P(ds_next->ds_prev, ==, ds);
 390                 dsl_dataset_rele(ds_next->ds_prev, ds_next);
 391                 ds_next->ds_prev = NULL;
 392                 if (ds_prev) {
 393                         VERIFY0(dsl_dataset_hold_obj(dp,
 394                             dsl_dataset_phys(ds)->ds_prev_snap_obj,
 395                             ds_next, &ds_next->ds_prev));
 396                 }
 397 
 398                 dsl_dataset_recalc_head_uniq(ds_next);
 399 
 400                 /*
 401                  * Reduce the amount of our unconsumed refreservation
 402                  * being charged to our parent by the amount of
 403                  * new unique data we have gained.
 404                  */
 405                 if (old_unique < ds_next->ds_reserved) {


 465                     tx));
 466         dsl_dir_rele(ds->ds_dir, ds);
 467         ds->ds_dir = NULL;
 468         dmu_object_free_zapified(mos, obj, tx);
 469 }
 470 
 471 void
 472 dsl_destroy_snapshot_sync(void *arg, dmu_tx_t *tx)
 473 {
 474         dsl_destroy_snapshot_arg_t *ddsa = arg;
 475         const char *dsname = ddsa->ddsa_name;
 476         boolean_t defer = ddsa->ddsa_defer;
 477 
 478         dsl_pool_t *dp = dmu_tx_pool(tx);
 479         dsl_dataset_t *ds;
 480 
 481         int error = dsl_dataset_hold(dp, dsname, FTAG, &ds);
 482         if (error == ENOENT)
 483                 return;
 484         ASSERT0(error);
 485 
 486         if (autosnap_check_name(strchr(dsname, '@')))
 487                 autosnap_exempt_snapshot(dp->dp_spa, dsname);
 488 
 489         dsl_destroy_snapshot_sync_impl(ds, defer, tx);
 490         dsl_dataset_rele(ds, FTAG);
 491 }
 492 
 493 /*
 494  * The semantics of this function are described in the comment above
 495  * lzc_destroy_snaps().  To summarize:
 496  *
 497  * The snapshots must all be in the same pool.
 498  *
 499  * Snapshots that don't exist will be silently ignored (considered to be
 500  * "already deleted").
 501  *
 502  * On success, all snaps will be destroyed and this will return 0.
 503  * On failure, no snaps will be destroyed, the errlist will be filled in,
 504  * and this will return an errno.
 505  */
 506 int
 507 dsl_destroy_snapshots_nvl(nvlist_t *snaps, boolean_t defer,
 508     nvlist_t *errlist)


 818 
 819         if (dsl_dataset_phys(ds)->ds_prev_snap_obj != 0) {
 820                 /* This is a clone */
 821                 ASSERT(ds->ds_prev != NULL);
 822                 ASSERT3U(dsl_dataset_phys(ds->ds_prev)->ds_next_snap_obj, !=,
 823                     obj);
 824                 ASSERT0(dsl_dataset_phys(ds)->ds_next_snap_obj);
 825 
 826                 dmu_buf_will_dirty(ds->ds_prev->ds_dbuf, tx);
 827                 if (dsl_dataset_phys(ds->ds_prev)->ds_next_clones_obj != 0) {
 828                         dsl_dataset_remove_from_next_clones(ds->ds_prev,
 829                             obj, tx);
 830                 }
 831 
 832                 ASSERT3U(dsl_dataset_phys(ds->ds_prev)->ds_num_children, >, 1);
 833                 dsl_dataset_phys(ds->ds_prev)->ds_num_children--;
 834         }
 835 
 836         /*
 837          * Destroy the deadlist.  Unless it's a clone, the
 838          * deadlist should be empty.  (If it's a clone, it's
 839          * safe to ignore the deadlist contents.)
 840          */
 841         dsl_deadlist_close(&ds->ds_deadlist);
 842         dsl_deadlist_free(mos, dsl_dataset_phys(ds)->ds_deadlist_obj, tx);
 843         dmu_buf_will_dirty(ds->ds_dbuf, tx);
 844         dsl_dataset_phys(ds)->ds_deadlist_obj = 0;
 845 
 846         objset_t *os;
 847         VERIFY0(dmu_objset_from_ds(ds, &os));
 848 
 849         if (spa_feature_is_active(dp->dp_spa, SPA_FEATURE_WBC)) {
 850                 wbc_process_objset(spa_get_wbc_data(dp->dp_spa), os, B_TRUE);
 851 
 852                 /*
 853                  * If WBC was activated for this dataset and it is the root
 854                  * of a WBC-ed tree of datasets, decrement the WBC feature
 855                  * flag refcount so that 'feature@wbc' reports the correct
 856                  * WBC status.
 857                  */
 858                 if (os->os_wbc_root_ds_obj != 0 &&
 859                     ds->ds_object == os->os_wbc_root_ds_obj)
 860                         spa_feature_decr(os->os_spa, SPA_FEATURE_WBC, tx);
 861         }
 862 
 863         if (!spa_feature_is_enabled(dp->dp_spa, SPA_FEATURE_ASYNC_DESTROY)) {
 864                 old_synchronous_dataset_destroy(ds, tx);
 865         } else {
 866                 /*
 867                  * Move the bptree into the pool's list of trees to
 868                  * clean up and update space accounting information.
 869                  */
 870                 uint64_t used, comp, uncomp;
 871 
 872                 zil_destroy_sync(dmu_objset_zil(os), tx);
 873 
 874                 if (!spa_feature_is_active(dp->dp_spa,
 875                     SPA_FEATURE_ASYNC_DESTROY)) {
 876                         dsl_scan_t *scn = dp->dp_scan;
 877                         spa_feature_incr(dp->dp_spa, SPA_FEATURE_ASYNC_DESTROY,
 878                             tx);
 879                         dp->dp_bptree_obj = bptree_alloc(mos, tx);
 880                         VERIFY0(zap_add(mos,
 881                             DMU_POOL_DIRECTORY_OBJECT,
 882                             DMU_POOL_BPTREE_OBJ, sizeof (uint64_t), 1,


 984         spa_history_log_internal_ds(ds, "destroy begin", tx, "");
 985         dsl_dataset_rele(ds, FTAG);
 986 }
 987 
 988 int
 989 dsl_destroy_head(const char *name)
 990 {
 991         dsl_destroy_head_arg_t ddha;
 992         int error;
 993         spa_t *spa;
 994         boolean_t isenabled;
 995 
 996 #ifdef _KERNEL
 997         zfs_destroy_unmount_origin(name);
 998 #endif
 999 
1000         error = spa_open(name, &spa, FTAG);
1001         if (error != 0)
1002                 return (error);
1003         isenabled = spa_feature_is_enabled(spa, SPA_FEATURE_ASYNC_DESTROY);
1004 
1005         spa_close(spa, FTAG);
1006 
1007         ddha.ddha_name = name;
1008 
1009         if (!isenabled) {
1010                 objset_t *os;
1011 
1012                 error = dsl_sync_task(name, dsl_destroy_head_check,
1013                     dsl_destroy_head_begin_sync, &ddha,
1014                     0, ZFS_SPACE_CHECK_NONE);
1015                 if (error != 0)
1016                         return (error);
1017 
1018                 /*
1019                  * Head deletion is processed in one txg on old pools;
1020                  * remove the objects from open context so that the txg sync
1021                  * is not too long.
1022                  */
1023                 error = dmu_objset_own(name, DMU_OST_ANY, B_FALSE, FTAG, &os);
1024                 if (error == 0) {
1025                         uint64_t prev_snap_txg =
1026                             dsl_dataset_phys(dmu_objset_ds(os))->
1027                             ds_prev_snap_txg;
1028                         for (uint64_t obj = 0; error == 0;
1029                             error = dmu_object_next(os, &obj, FALSE,
1030                             prev_snap_txg))
1031                                 (void) dmu_free_long_object(os, obj);
1032                         /* sync out all frees */
1033                         txg_wait_synced(dmu_objset_pool(os), 0);
1034                         dmu_objset_disown(os, FTAG);
1035                 }
1036         }
1037 
1038         return (dsl_sync_task(name, dsl_destroy_head_check,
1039             dsl_destroy_head_sync, &ddha, 0, ZFS_SPACE_CHECK_NONE));
1040 }
1041 
1042 typedef struct {
1043         kmutex_t        lock;
1044         list_t          list;
1045 } dsl_inconsistent_walker_cb_t;
1046 
1047 typedef struct {
1048         char name[ZFS_MAX_DATASET_NAME_LEN];
1049         list_node_t node;
1050 } dsl_inconsistent_node_t;
1051 
1052 /* ARGSUSED */
1053 static int
1054 dsl_collect_inconsistent_datasets_cb(dsl_pool_t *dp,
1055     dsl_dataset_t *ds, void *arg)
1056 {
1057         dsl_inconsistent_node_t *ds_node;
1058         dsl_inconsistent_walker_cb_t *walker =
1059             (dsl_inconsistent_walker_cb_t *)arg;
1060 
1061         if (!DS_IS_INCONSISTENT(ds))
1062                 return (0);
1063 
1064         /*
1065          * If the dataset is inconsistent because a resumable receive
1066          * has failed, then do not destroy it.
1067          */
1068         if (dsl_dataset_has_resume_receive_state(ds))
1069                 return (0);
1070 
1071         ds_node = kmem_alloc(sizeof (dsl_inconsistent_node_t), KM_SLEEP);
1072         dsl_dataset_name(ds, ds_node->name);
1073 
1074         mutex_enter(&walker->lock);
1075         list_insert_tail(&walker->list, ds_node);
1076         mutex_exit(&walker->lock);
1077 
1078         return (0);
1079 }
1080 
1081 /*
1082  * Walk the entire pool in parallel, gather the inconsistent datasets
1083  * that do not have a resume token, and destroy them.
1084  */
1085 void
1086 dsl_destroy_inconsistent(dsl_pool_t *dp)
1087 {
1088         dsl_inconsistent_walker_cb_t walker;
1089         dsl_inconsistent_node_t *ds_node;
1090 
1091         mutex_init(&walker.lock, NULL, MUTEX_DEFAULT, NULL);
1092         list_create(&walker.list, sizeof (dsl_inconsistent_node_t),
1093             offsetof(dsl_inconsistent_node_t, node));
1094 
1095         VERIFY0(dmu_objset_find_dp(dp, dp->dp_root_dir_obj,
1096             dsl_collect_inconsistent_datasets_cb,
1097             &walker, DS_FIND_CHILDREN));
1098 
1099         while ((ds_node = list_remove_head(&walker.list)) != NULL) {
1100                 (void) dsl_destroy_head(ds_node->name);
1101                 kmem_free(ds_node, sizeof (dsl_inconsistent_node_t));
1102         }
1103 
1104         list_destroy(&walker.list);
1105         mutex_destroy(&walker.lock);
1106 }
1107 
1108 typedef struct {
1109         const char *from_ds;
1110         boolean_t defer;
1111 } dmu_destroy_atomically_arg_t;
1112 
1113 static int
1114 dsl_destroy_atomically_sync(void *arg, dmu_tx_t *tx)
1115 {
1116         dmu_destroy_atomically_arg_t *ddaa = arg;
1117         boolean_t defer = ddaa->defer;
1118         dsl_pool_t *dp = dmu_tx_pool(tx);
1119         zfs_ds_collector_entry_t *tail;
1120         list_t namestack;
1121         int err = 0;
1122 
1123         /* do not perform checks in ioctl */
1124         if (!dmu_tx_is_syncing(tx))
1125                 return (0);
1126 
1127         ASSERT(dsl_pool_config_held(dp));
1128 
1129         if (!spa_feature_is_enabled(dp->dp_spa, SPA_FEATURE_ASYNC_DESTROY))
1130                 return (SET_ERROR(ENOTSUP));
1131 
1132         /* It is possible that autosnap watches the DS */
1133         if (spa_feature_is_active(dp->dp_spa, SPA_FEATURE_WBC)) {
1134                 objset_t *os = NULL;
1135                 dsl_dataset_t *ds = NULL;
1136 
1137                 err = dsl_dataset_hold(dp, ddaa->from_ds, FTAG, &ds);
1138                 if (err != 0)
1139                         return (err);
1140 
1141                 err = dmu_objset_from_ds(ds, &os);
1142                 if (err != 0) {
1143                         dsl_dataset_rele(ds, FTAG);
1144                         return (err);
1145                 }
1146 
1147                 if (!dmu_objset_is_snapshot(os)) {
1148                         wbc_process_objset(spa_get_wbc_data(dp->dp_spa),
1149                             os, B_TRUE);
1150                 }
1151 
1152                 dsl_dataset_rele(ds, FTAG);
1153         }
1154 
1155         /* initialize the stack of datasets */
1156         list_create(&namestack, sizeof (zfs_ds_collector_entry_t),
1157             offsetof(zfs_ds_collector_entry_t, node));
1158         tail = dsl_dataset_collector_cache_alloc();
1159 
1160         /* push the head */
1161         tail->cookie = 0;
1162         tail->cookie_is_snap = B_FALSE;
1163         (void) strcpy(tail->name, ddaa->from_ds);
1164         list_insert_tail(&namestack, tail);
1165 
1166         /* the head is processed last, after all of its descendants */
1167         while (err == 0 && ((tail = list_tail(&namestack)) != NULL)) {
1168                 zfs_ds_collector_entry_t *el;
1169                 objset_t *os;
1170                 dsl_dataset_t *ds;
1171                 char *p;
1172 
1173                 /* init new entry */
1174                 el = dsl_dataset_collector_cache_alloc();
1175                 el->cookie = 0;
1176                 el->cookie_is_snap = B_FALSE;
1177                 (void) strcpy(el->name, tail->name);
1178                 p = el->name + strlen(el->name);
1179 
1180                 /* hold the current dataset to traverse its children */
1181                 err = dsl_dataset_hold(dp, tail->name, FTAG, &ds);
1182                 if (err != 0) {
1183                         dsl_dataset_collector_cache_free(el);
1184                         break;
1185                 }
1186 
1187                 err = dmu_objset_from_ds(ds, &os);
1188                 if (err != 0) {
1189                         dsl_dataset_rele(ds, FTAG);
1190                         dsl_dataset_collector_cache_free(el);
1191                         break;
1192                 }
1193 
1194                 if (dmu_objset_is_snapshot(os)) {
1195                         /* traverse clones for snapshots */
1196                         err = dmu_clone_list_next(os, MAXNAMELEN,
1197                             el->name, NULL, &tail->cookie);
1198                 } else {
1199                         /* for filesystems traverse fs first, then snaps */
1200                         if (!tail->cookie_is_snap) {
1201                                 *p++ = '/';
1202                                 do {
1203                                         *p = '\0';
1204                                         err = dmu_dir_list_next(os,
1205                                             MAXNAMELEN - (p - el->name),
1206                                             p, NULL, &tail->cookie);
1207                                 } while (err == 0 &&
1208                                     dataset_name_hidden(el->name));
1209 
1210                                 /* no more fs, move to snapshots */
1211                                 if (err == ENOENT) {
1212                                         *(--p) = '\0';
1213                                         tail->cookie_is_snap = B_TRUE;
1214                                         tail->cookie = 0;
1215                                         err = 0;
1216                                 }
1217                         }
1218 
1219                         if (err == 0 && tail->cookie_is_snap) {
1220                                 *p++ = '@';
1221                                 *p = '\0';
1222                                 err = dmu_snapshot_list_next(os,
1223                                     MAXNAMELEN - (p - el->name),
1224                                     p, NULL, &tail->cookie, NULL);
1225                         }
1226                 }
1227 
1228                 if (err == 0) {
1229                         /* a child was found, add it and continue */
1230                         list_insert_tail(&namestack, el);
1231                         dsl_dataset_rele(ds, FTAG);
1232                         continue;
1233                 }
1234 
1235                 dsl_dataset_collector_cache_free(el);
1236 
1237                 if (err != ENOENT) {
1238                         dsl_dataset_rele(ds, FTAG);
1239                         break;
1240                 }
1241 
1242                 /*
1243                  * There are no more children of the dataset; pop it from the
1244                  * stack and destroy it.
1245                  */
1246 
1247                 err = 0;
1248 
1249                 list_remove(&namestack, tail);
1250 
1251                 if (dmu_objset_is_snapshot(os)) {
1252                         err = dsl_destroy_snapshot_check_impl(ds, defer);
1253                         if (err == 0)
1254                                 dsl_destroy_snapshot_sync_impl(ds, defer, tx);
1255                 } else if (strchr(tail->name, '/') != NULL) {
1256                         err = dsl_destroy_head_check_impl(ds, 0);
1257                         if (err == 0)
1258                                 dsl_destroy_head_sync_impl(ds, tx);
1259                 }
1260 
1261                 dsl_dataset_rele(ds, FTAG);
1262                 dsl_dataset_collector_cache_free(tail);
1263         }
1264 
1265         if (err != 0) {
1266                 while ((tail = list_remove_tail(&namestack)) != NULL)
1267                         dsl_dataset_collector_cache_free(tail);
1268         }
1269 
1270         ASSERT(list_head(&namestack) == NULL);
1271 
1272         list_destroy(&namestack);
1273 
1274         return (err);
1275 }
1276 
1277 /*ARGSUSED*/
1278 void
1279 dsl_destroy_atomically_sync_dummy(void *arg, dmu_tx_t *tx)
1280 {
1281 }
1282 
1283 int
1284 dsl_destroy_atomically(const char *name, boolean_t defer)
1285 {
1286         dmu_destroy_atomically_arg_t ddaa;
1287 
1288         ddaa.from_ds = name;
1289         ddaa.defer = defer;
1290 
1291         return (dsl_sync_task(name, dsl_destroy_atomically_sync,
1292             dsl_destroy_atomically_sync_dummy, &ddaa, 0, ZFS_SPACE_CHECK_NONE));
1293 }