NEX-9752 backport illumos 6950 ARC should cache compressed data
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
6950 ARC should cache compressed data
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
NEX-8521 zdb -h <pool> raises core dump
Reviewed by: Alex Deiter <alex.deiter@nexenta.com>
Reviewed by: Dan Fields <dan.fields@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
SUP-918: zdb -h infinite loop when buffering records larger than static limit
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
NEX-3650 KRRP needs to clean up cstyle, hdrchk, and mapfile issues
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-3214 remove cos object type from dmu.h
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
6391 Override default SPA config location via environment
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Richard Yao <ryao@gentoo.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
6268 zfs diff confused by moving a file to another directory
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Approved by: Dan McDonald <danmcd@omniti.com>
6290 zdb -h overflows stack
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brian Donohue <brian.donohue@delphix.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Don Brady <dev.fs.zfs@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
6047 SPARC boot should support feature@embedded_data
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
5959 clean up per-dataset feature count code
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
NEX-4582 update wrc test cases for allow to use write back cache per tree of datasets
Reviewed by: Steve Peng <steve.peng@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
5812 assertion failed in zrl_tryenter(): zr_owner==NULL
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Approved by: Gordon Ross <gwr@nexenta.com>
5810 zdb should print details of bpobj
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
Approved by: Gordon Ross <gwr@nexenta.com>
NEX-3558 KRRP Integration
NEX-3212 remove vdev prop object type from dmu.h
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Josef Sipek <josef.sipek@nexenta.com>
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Garrett D'Amore <garrett@damore.org>
Make special vdev subtree topology the same as regular vdev subtree to simplify testcase setup
Fixup merge issues
Issue #40: ZDB shouldn't crash with new code
re #12611 rb4105 zpool import panic in ddt_zap_count()
re #8279 rb3915 need a mechanism to notify NMS about ZFS config changes (fix lint - courtesy of Yuri Pankov)
re #12584 rb4049 zfsxx latest code merge (fix lint - courtesy of Yuri Pankov)
re #12585 rb4049 ZFS++ work port - refactoring to improve separation of open/closed code, bug fixes, performance improvements - open code
Bug 11205: add missing libzfs_closed_stubs.c to fix opensource-only build.
ZFS plus work: special vdevs, cos, cos/vdev properties

          --- old/usr/src/cmd/zdb/zdb.c
          +++ new/usr/src/cmd/zdb/zdb.c
   1    1  /*
   2    2   * CDDL HEADER START
   3    3   *
   4    4   * The contents of this file are subject to the terms of the
   5    5   * Common Development and Distribution License (the "License").
   6    6   * You may not use this file except in compliance with the License.
   7    7   *
   8    8   * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
   9    9   * or http://www.opensolaris.org/os/licensing.
  10   10   * See the License for the specific language governing permissions
  11   11   * and limitations under the License.
  12   12   *
  13   13   * When distributing Covered Code, include this CDDL HEADER in each
  14   14   * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  15   15   * If applicable, add the following below this CDDL HEADER, with the
  16   16   * fields enclosed by brackets "[]" replaced with your own identifying
  17   17   * information: Portions Copyright [yyyy] [name of copyright owner]
  18   18   *
  19   19   * CDDL HEADER END
  20   20   */
  21   21  
  22   22  /*
  23   23   * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
  24      - * Copyright (c) 2011, 2017 by Delphix. All rights reserved.
       24 + * Copyright (c) 2011, 2016 by Delphix. All rights reserved.
  25   25   * Copyright (c) 2014 Integros [integros.com]
  26   26   * Copyright 2017 Nexenta Systems, Inc.
  27   27   * Copyright 2017 RackTop Systems.
  28   28   */
  29   29  
  30   30  #include <stdio.h>
  31   31  #include <unistd.h>
  32   32  #include <stdio_ext.h>
  33   33  #include <stdlib.h>
  34   34  #include <ctype.h>
       35 +#include <string.h>
       36 +#include <errno.h>
  35   37  #include <sys/zfs_context.h>
  36   38  #include <sys/spa.h>
  37   39  #include <sys/spa_impl.h>
  38   40  #include <sys/dmu.h>
  39   41  #include <sys/zap.h>
  40   42  #include <sys/fs/zfs.h>
  41   43  #include <sys/zfs_znode.h>
  42   44  #include <sys/zfs_sa.h>
  43   45  #include <sys/sa.h>
  44   46  #include <sys/sa_impl.h>
  45   47  #include <sys/vdev.h>
  46   48  #include <sys/vdev_impl.h>
  47   49  #include <sys/metaslab_impl.h>
  48   50  #include <sys/dmu_objset.h>
  49   51  #include <sys/dsl_dir.h>
  50   52  #include <sys/dsl_dataset.h>
  51   53  #include <sys/dsl_pool.h>
  52   54  #include <sys/dbuf.h>
  53   55  #include <sys/zil.h>
  54   56  #include <sys/zil_impl.h>
  55   57  #include <sys/stat.h>
  56   58  #include <sys/resource.h>
  57   59  #include <sys/dmu_traverse.h>
  58   60  #include <sys/zio_checksum.h>
  59   61  #include <sys/zio_compress.h>
  60   62  #include <sys/zfs_fuid.h>
  61   63  #include <sys/arc.h>
  62   64  #include <sys/ddt.h>
  63   65  #include <sys/zfeature.h>
  64   66  #include <sys/abd.h>
  65   67  #include <sys/blkptr.h>
  66   68  #include <zfs_comutil.h>
  67   69  #include <libcmdutils.h>
  68   70  #undef verify
  69   71  #include <libzfs.h>
  70   72  
  71   73  #include "zdb.h"
  72   74  
  73   75  #define ZDB_COMPRESS_NAME(idx) ((idx) < ZIO_COMPRESS_FUNCTIONS ?        \
  74   76          zio_compress_table[(idx)].ci_name : "UNKNOWN")
  75   77  #define ZDB_CHECKSUM_NAME(idx) ((idx) < ZIO_CHECKSUM_FUNCTIONS ?        \
  76   78          zio_checksum_table[(idx)].ci_name : "UNKNOWN")
  77   79  #define ZDB_OT_NAME(idx) ((idx) < DMU_OT_NUMTYPES ?     \
  78   80          dmu_ot[(idx)].ot_name : DMU_OT_IS_VALID(idx) ?  \
  79   81          dmu_ot_byteswap[DMU_OT_BYTESWAP(idx)].ob_name : "UNKNOWN")
  80   82  #define ZDB_OT_TYPE(idx) ((idx) < DMU_OT_NUMTYPES ? (idx) :             \
  81      -        (idx) == DMU_OTN_ZAP_DATA || (idx) == DMU_OTN_ZAP_METADATA ?    \
  82      -        DMU_OT_ZAP_OTHER : \
  83      -        (idx) == DMU_OTN_UINT64_DATA || (idx) == DMU_OTN_UINT64_METADATA ? \
  84      -        DMU_OT_UINT64_OTHER : DMU_OT_NUMTYPES)
       83 +        (((idx) == DMU_OTN_ZAP_DATA || (idx) == DMU_OTN_ZAP_METADATA) ? \
       84 +        DMU_OT_ZAP_OTHER : DMU_OT_NUMTYPES))
  85   85  
  86   86  #ifndef lint
  87   87  extern int reference_tracking_enable;
  88   88  extern boolean_t zfs_recover;
  89   89  extern uint64_t zfs_arc_max, zfs_arc_meta_limit;
  90   90  extern int zfs_vdev_async_read_max_active;
  91   91  extern int aok;
  92      -extern boolean_t spa_load_verify_dryrun;
  93   92  #else
  94   93  int reference_tracking_enable;
  95   94  boolean_t zfs_recover;
  96   95  uint64_t zfs_arc_max, zfs_arc_meta_limit;
  97   96  int zfs_vdev_async_read_max_active;
  98   97  int aok;
  99      -boolean_t spa_load_verify_dryrun;
 100   98  #endif
 101   99  
 102  100  static const char cmdname[] = "zdb";
 103  101  uint8_t dump_opt[256];
 104  102  
 105  103  typedef void object_viewer_t(objset_t *, uint64_t, void *data, size_t size);
 106  104  
 107  105  uint64_t *zopt_object = NULL;
 108  106  static unsigned zopt_objects = 0;
 109  107  libzfs_handle_t *g_zfs;
 110  108  uint64_t max_inflight = 1000;
 111  109  
 112  110  static void snprintf_blkptr_compact(char *, size_t, const blkptr_t *);
 113  111  
 114  112  /*
 115  113   * These libumem hooks provide a reasonable set of defaults for the allocator's
 116  114   * debugging facilities.
 117  115   */
 118  116  const char *
 119  117  _umem_debug_init()
 120  118  {
 121  119          return ("default,verbose"); /* $UMEM_DEBUG setting */
 122  120  }
 123  121  
 124  122  const char *
 125  123  _umem_logging_init(void)
 126  124  {
 127  125          return ("fail,contents"); /* $UMEM_LOGGING setting */
 128  126  }
 129  127  
 130  128  static void
 131  129  usage(void)
 132  130  {
 133  131          (void) fprintf(stderr,
 134  132              "Usage:\t%s [-AbcdDFGhiLMPsvX] [-e [-V] [-p <path> ...]] "
 135  133              "[-I <inflight I/Os>]\n"
 136  134              "\t\t[-o <var>=<value>]... [-t <txg>] [-U <cache>] [-x <dumpdir>]\n"
 137  135              "\t\t[<poolname> [<object> ...]]\n"
 138  136              "\t%s [-AdiPv] [-e [-V] [-p <path> ...]] [-U <cache>] <dataset> "
 139  137              "[<object> ...]\n"
 140  138              "\t%s -C [-A] [-U <cache>]\n"
 141  139              "\t%s -l [-Aqu] <device>\n"
 142  140              "\t%s -m [-AFLPX] [-e [-V] [-p <path> ...]] [-t <txg>] "
 143  141              "[-U <cache>]\n\t\t<poolname> [<vdev> [<metaslab> ...]]\n"
 144  142              "\t%s -O <dataset> <path>\n"
 145  143              "\t%s -R [-A] [-e [-V] [-p <path> ...]] [-U <cache>]\n"
 146  144              "\t\t<poolname> <vdev>:<offset>:<size>[:<flags>]\n"
 147  145              "\t%s -E [-A] word0:word1:...:word15\n"
 148  146              "\t%s -S [-AP] [-e [-V] [-p <path> ...]] [-U <cache>] "
 149  147              "<poolname>\n\n",
 150  148              cmdname, cmdname, cmdname, cmdname, cmdname, cmdname, cmdname,
 151  149              cmdname, cmdname);
 152  150  
 153  151          (void) fprintf(stderr, "    Dataset name must include at least one "
 154  152              "separator character '/' or '@'\n");
 155  153          (void) fprintf(stderr, "    If dataset name is specified, only that "
 156  154              "dataset is dumped\n");
 157  155          (void) fprintf(stderr, "    If object numbers are specified, only "
 158  156              "those objects are dumped\n\n");
 159  157          (void) fprintf(stderr, "    Options to control amount of output:\n");
 160  158          (void) fprintf(stderr, "        -b block statistics\n");
 161  159          (void) fprintf(stderr, "        -c checksum all metadata (twice for "
 162  160              "all data) blocks\n");
 163  161          (void) fprintf(stderr, "        -C config (or cachefile if alone)\n");
 164  162          (void) fprintf(stderr, "        -d dataset(s)\n");
 165  163          (void) fprintf(stderr, "        -D dedup statistics\n");
 166  164          (void) fprintf(stderr, "        -E decode and display block from an "
 167  165              "embedded block pointer\n");
 168  166          (void) fprintf(stderr, "        -h pool history\n");
 169  167          (void) fprintf(stderr, "        -i intent logs\n");
 170  168          (void) fprintf(stderr, "        -l read label contents\n");
 171  169          (void) fprintf(stderr, "        -L disable leak tracking (do not "
 172  170              "load spacemaps)\n");
 173  171          (void) fprintf(stderr, "        -m metaslabs\n");
 174  172          (void) fprintf(stderr, "        -M metaslab groups\n");
 175  173          (void) fprintf(stderr, "        -O perform object lookups by path\n");
 176  174          (void) fprintf(stderr, "        -R read and display block from a "
 177  175              "device\n");
 178  176          (void) fprintf(stderr, "        -s report stats on zdb's I/O\n");
 179  177          (void) fprintf(stderr, "        -S simulate dedup to measure effect\n");
 180  178          (void) fprintf(stderr, "        -v verbose (applies to all "
 181  179              "others)\n\n");
 182  180          (void) fprintf(stderr, "    Below options are intended for use "
 183  181              "with other options:\n");
 184  182          (void) fprintf(stderr, "        -A ignore assertions (-A), enable "
 185  183              "panic recovery (-AA) or both (-AAA)\n");
 186  184          (void) fprintf(stderr, "        -e pool is exported/destroyed/"
 187  185              "has altroot/not in a cachefile\n");
 188  186          (void) fprintf(stderr, "        -F attempt automatic rewind within "
 189  187              "safe range of transaction groups\n");
 190  188          (void) fprintf(stderr, "        -G dump zfs_dbgmsg buffer before "
 191  189              "exiting\n");
 192  190          (void) fprintf(stderr, "        -I <number of inflight I/Os> -- "
 193  191              "specify the maximum number of "
 194  192              "checksumming I/Os [default is 1000]\n");
 195  193          (void) fprintf(stderr, "        -o <variable>=<value> set global "
 196  194              "variable to an unsigned 32-bit integer value\n");
 197  195          (void) fprintf(stderr, "        -p <path> -- use one or more with "
 198  196              "-e to specify path to vdev dir\n");
 199  197          (void) fprintf(stderr, "        -P print numbers in parseable form\n");
 200  198          (void) fprintf(stderr, "        -q don't print label contents\n");
 201  199          (void) fprintf(stderr, "        -t <txg> -- highest txg to use when "
 202  200              "searching for uberblocks\n");
 203  201          (void) fprintf(stderr, "        -u uberblock\n");
 204  202          (void) fprintf(stderr, "        -U <cachefile_path> -- use alternate "
 205  203              "cachefile\n");
 206  204          (void) fprintf(stderr, "        -V do verbatim import\n");
 207  205          (void) fprintf(stderr, "        -x <dumpdir> -- "
 208  206              "dump all read blocks into specified directory\n");
 209  207          (void) fprintf(stderr, "        -X attempt extreme rewind (does not "
 210  208              "work with dataset)\n\n");
 211  209          (void) fprintf(stderr, "Specify an option more than once (e.g. -bb) "
 212  210              "to make only that option verbose\n");
 213  211          (void) fprintf(stderr, "Default is to dump everything non-verbosely\n");
 214  212          exit(1);
 215  213  }
 216  214  
 217  215  static void
 218  216  dump_debug_buffer()
 219  217  {
 220  218          if (dump_opt['G']) {
 221  219                  (void) printf("\n");
 222  220                  zfs_dbgmsg_print("zdb");
 223  221          }
 224  222  }
 225  223  
 226  224  /*
 227  225   * Called for usage errors that are discovered after a call to spa_open(),
 228  226   * dmu_bonus_hold(), or pool_match().  abort() is called for other errors.
 229  227   */
 230  228  
 231  229  static void
 232  230  fatal(const char *fmt, ...)
 233  231  {
 234  232          va_list ap;
 235  233  
 236  234          va_start(ap, fmt);
 237  235          (void) fprintf(stderr, "%s: ", cmdname);
 238  236          (void) vfprintf(stderr, fmt, ap);
 239  237          va_end(ap);
 240  238          (void) fprintf(stderr, "\n");
 241  239  
 242  240          dump_debug_buffer();
 243  241  
 244  242          exit(1);
 245  243  }
 246  244  
 247  245  /* ARGSUSED */
 248  246  static void
 249  247  dump_packed_nvlist(objset_t *os, uint64_t object, void *data, size_t size)
 250  248  {
 251  249          nvlist_t *nv;
 252  250          size_t nvsize = *(uint64_t *)data;
 253  251          char *packed = umem_alloc(nvsize, UMEM_NOFAIL);
 254  252  
 255  253          VERIFY(0 == dmu_read(os, object, 0, nvsize, packed, DMU_READ_PREFETCH));
 256  254  
 257  255          VERIFY(nvlist_unpack(packed, nvsize, &nv, 0) == 0);
 258  256  
 259  257          umem_free(packed, nvsize);
 260  258  
 261  259          dump_nvlist(nv, 8);
 262  260  
 263  261          nvlist_free(nv);
 264  262  }
 265  263  
 266  264  /* ARGSUSED */
 267  265  static void
 268  266  dump_history_offsets(objset_t *os, uint64_t object, void *data, size_t size)
 269  267  {
 270  268          spa_history_phys_t *shp = data;
 271  269  
 272  270          if (shp == NULL)
 273  271                  return;
 274  272  
 275  273          (void) printf("\t\tpool_create_len = %llu\n",
 276  274              (u_longlong_t)shp->sh_pool_create_len);
 277  275          (void) printf("\t\tphys_max_off = %llu\n",
 278  276              (u_longlong_t)shp->sh_phys_max_off);
 279  277          (void) printf("\t\tbof = %llu\n",
 280  278              (u_longlong_t)shp->sh_bof);
 281  279          (void) printf("\t\teof = %llu\n",
 282  280              (u_longlong_t)shp->sh_eof);
 283  281          (void) printf("\t\trecords_lost = %llu\n",
 284  282              (u_longlong_t)shp->sh_records_lost);
 285  283  }
 286  284  
 287  285  static void
 288  286  zdb_nicenum(uint64_t num, char *buf, size_t buflen)
 289  287  {
 290  288          if (dump_opt['P'])
 291  289                  (void) snprintf(buf, buflen, "%llu", (u_longlong_t)num);
 292  290          else
 293  291                  nicenum(num, buf, buflen);
 294  292  }
 295  293  
 296  294  static const char histo_stars[] = "****************************************";
 297  295  static const uint64_t histo_width = sizeof (histo_stars) - 1;
 298  296  
 299  297  static void
 300  298  dump_histogram(const uint64_t *histo, int size, int offset)
 301  299  {
 302  300          int i;
 303  301          int minidx = size - 1;
 304  302          int maxidx = 0;
 305  303          uint64_t max = 0;
 306  304  
 307  305          for (i = 0; i < size; i++) {
 308  306                  if (histo[i] > max)
 309  307                          max = histo[i];
 310  308                  if (histo[i] > 0 && i > maxidx)
 311  309                          maxidx = i;
 312  310                  if (histo[i] > 0 && i < minidx)
 313  311                          minidx = i;
 314  312          }
 315  313  
 316  314          if (max < histo_width)
 317  315                  max = histo_width;
 318  316  
 319  317          for (i = minidx; i <= maxidx; i++) {
 320  318                  (void) printf("\t\t\t%3u: %6llu %s\n",
 321  319                      i + offset, (u_longlong_t)histo[i],
 322  320                      &histo_stars[(max - histo[i]) * histo_width / max]);
 323  321          }
 324  322  }
 325  323  
 326  324  static void
 327  325  dump_zap_stats(objset_t *os, uint64_t object)
 328  326  {
 329  327          int error;
 330  328          zap_stats_t zs;
 331  329  
 332  330          error = zap_get_stats(os, object, &zs);
 333  331          if (error)
 334  332                  return;
 335  333  
 336  334          if (zs.zs_ptrtbl_len == 0) {
 337  335                  ASSERT(zs.zs_num_blocks == 1);
 338  336                  (void) printf("\tmicrozap: %llu bytes, %llu entries\n",
 339  337                      (u_longlong_t)zs.zs_blocksize,
 340  338                      (u_longlong_t)zs.zs_num_entries);
 341  339                  return;
 342  340          }
 343  341  
 344  342          (void) printf("\tFat ZAP stats:\n");
 345  343  
 346  344          (void) printf("\t\tPointer table:\n");
 347  345          (void) printf("\t\t\t%llu elements\n",
 348  346              (u_longlong_t)zs.zs_ptrtbl_len);
 349  347          (void) printf("\t\t\tzt_blk: %llu\n",
 350  348              (u_longlong_t)zs.zs_ptrtbl_zt_blk);
 351  349          (void) printf("\t\t\tzt_numblks: %llu\n",
 352  350              (u_longlong_t)zs.zs_ptrtbl_zt_numblks);
 353  351          (void) printf("\t\t\tzt_shift: %llu\n",
 354  352              (u_longlong_t)zs.zs_ptrtbl_zt_shift);
 355  353          (void) printf("\t\t\tzt_blks_copied: %llu\n",
 356  354              (u_longlong_t)zs.zs_ptrtbl_blks_copied);
 357  355          (void) printf("\t\t\tzt_nextblk: %llu\n",
 358  356              (u_longlong_t)zs.zs_ptrtbl_nextblk);
 359  357  
 360  358          (void) printf("\t\tZAP entries: %llu\n",
 361  359              (u_longlong_t)zs.zs_num_entries);
 362  360          (void) printf("\t\tLeaf blocks: %llu\n",
 363  361              (u_longlong_t)zs.zs_num_leafs);
 364  362          (void) printf("\t\tTotal blocks: %llu\n",
 365  363              (u_longlong_t)zs.zs_num_blocks);
 366  364          (void) printf("\t\tzap_block_type: 0x%llx\n",
 367  365              (u_longlong_t)zs.zs_block_type);
 368  366          (void) printf("\t\tzap_magic: 0x%llx\n",
 369  367              (u_longlong_t)zs.zs_magic);
 370  368          (void) printf("\t\tzap_salt: 0x%llx\n",
 371  369              (u_longlong_t)zs.zs_salt);
 372  370  
 373  371          (void) printf("\t\tLeafs with 2^n pointers:\n");
 374  372          dump_histogram(zs.zs_leafs_with_2n_pointers, ZAP_HISTOGRAM_SIZE, 0);
 375  373  
 376  374          (void) printf("\t\tBlocks with n*5 entries:\n");
 377  375          dump_histogram(zs.zs_blocks_with_n5_entries, ZAP_HISTOGRAM_SIZE, 0);
 378  376  
 379  377          (void) printf("\t\tBlocks n/10 full:\n");
 380  378          dump_histogram(zs.zs_blocks_n_tenths_full, ZAP_HISTOGRAM_SIZE, 0);
 381  379  
 382  380          (void) printf("\t\tEntries with n chunks:\n");
 383  381          dump_histogram(zs.zs_entries_using_n_chunks, ZAP_HISTOGRAM_SIZE, 0);
 384  382  
 385  383          (void) printf("\t\tBuckets with n entries:\n");
 386  384          dump_histogram(zs.zs_buckets_with_n_entries, ZAP_HISTOGRAM_SIZE, 0);
 387  385  }
 388  386  
 389  387  /*ARGSUSED*/
 390  388  static void
 391  389  dump_none(objset_t *os, uint64_t object, void *data, size_t size)
 392  390  {
 393  391  }
 394  392  
 395  393  /*ARGSUSED*/
 396  394  static void
 397  395  dump_unknown(objset_t *os, uint64_t object, void *data, size_t size)
 398  396  {
 399  397          (void) printf("\tUNKNOWN OBJECT TYPE\n");
 400  398  }
 401  399  
 402  400  /*ARGSUSED*/
 403  401  static void
 404  402  dump_uint8(objset_t *os, uint64_t object, void *data, size_t size)
 405  403  {
 406  404  }
 407  405  
 408  406  /*ARGSUSED*/
 409  407  static void
 410  408  dump_uint64(objset_t *os, uint64_t object, void *data, size_t size)
 411  409  {
 412  410  }
 413  411  
 414  412  /*ARGSUSED*/
 415  413  static void
 416  414  dump_zap(objset_t *os, uint64_t object, void *data, size_t size)
 417  415  {
 418  416          zap_cursor_t zc;
 419  417          zap_attribute_t attr;
 420  418          void *prop;
 421  419          unsigned i;
 422  420  
 423  421          dump_zap_stats(os, object);
 424  422          (void) printf("\n");
 425  423  
 426  424          for (zap_cursor_init(&zc, os, object);
 427  425              zap_cursor_retrieve(&zc, &attr) == 0;
 428  426              zap_cursor_advance(&zc)) {
 429  427                  (void) printf("\t\t%s = ", attr.za_name);
 430  428                  if (attr.za_num_integers == 0) {
 431  429                          (void) printf("\n");
 432  430                          continue;
 433  431                  }
 434  432                  prop = umem_zalloc(attr.za_num_integers *
 435  433                      attr.za_integer_length, UMEM_NOFAIL);
 436  434                  (void) zap_lookup(os, object, attr.za_name,
 437  435                      attr.za_integer_length, attr.za_num_integers, prop);
 438  436                  if (attr.za_integer_length == 1) {
 439  437                          (void) printf("%s", (char *)prop);
 440  438                  } else {
 441  439                          for (i = 0; i < attr.za_num_integers; i++) {
 442  440                                  switch (attr.za_integer_length) {
 443  441                                  case 2:
 444  442                                          (void) printf("%u ",
 445  443                                              ((uint16_t *)prop)[i]);
 446  444                                          break;
 447  445                                  case 4:
 448  446                                          (void) printf("%u ",
 449  447                                              ((uint32_t *)prop)[i]);
 450  448                                          break;
 451  449                                  case 8:
 452  450                                          (void) printf("%lld ",
 453  451                                              (u_longlong_t)((int64_t *)prop)[i]);
 454  452                                          break;
 455  453                                  }
 456  454                          }
 457  455                  }
 458  456                  (void) printf("\n");
 459  457                  umem_free(prop, attr.za_num_integers * attr.za_integer_length);
 460  458          }
 461  459          zap_cursor_fini(&zc);
 462  460  }
 463  461  
 464  462  static void
 465  463  dump_bpobj(objset_t *os, uint64_t object, void *data, size_t size)
 466  464  {
 467  465          bpobj_phys_t *bpop = data;
 468  466          char bytes[32], comp[32], uncomp[32];
 469  467  
 470  468          /* make sure the output won't get truncated */
 471  469          CTASSERT(sizeof (bytes) >= NN_NUMBUF_SZ);
 472  470          CTASSERT(sizeof (comp) >= NN_NUMBUF_SZ);
 473  471          CTASSERT(sizeof (uncomp) >= NN_NUMBUF_SZ);
 474  472  
 475  473          if (bpop == NULL)
 476  474                  return;
 477  475  
 478  476          zdb_nicenum(bpop->bpo_bytes, bytes, sizeof (bytes));
 479  477          zdb_nicenum(bpop->bpo_comp, comp, sizeof (comp));
 480  478          zdb_nicenum(bpop->bpo_uncomp, uncomp, sizeof (uncomp));
 481  479  
 482  480          (void) printf("\t\tnum_blkptrs = %llu\n",
 483  481              (u_longlong_t)bpop->bpo_num_blkptrs);
 484  482          (void) printf("\t\tbytes = %s\n", bytes);
 485  483          if (size >= BPOBJ_SIZE_V1) {
 486  484                  (void) printf("\t\tcomp = %s\n", comp);
 487  485                  (void) printf("\t\tuncomp = %s\n", uncomp);
 488  486          }
 489  487          if (size >= sizeof (*bpop)) {
 490  488                  (void) printf("\t\tsubobjs = %llu\n",
 491  489                      (u_longlong_t)bpop->bpo_subobjs);
 492  490                  (void) printf("\t\tnum_subobjs = %llu\n",
 493  491                      (u_longlong_t)bpop->bpo_num_subobjs);
 494  492          }
 495  493  
 496  494          if (dump_opt['d'] < 5)
 497  495                  return;
 498  496  
 499  497          for (uint64_t i = 0; i < bpop->bpo_num_blkptrs; i++) {
 500  498                  char blkbuf[BP_SPRINTF_LEN];
 501  499                  blkptr_t bp;
 502  500  
 503  501                  int err = dmu_read(os, object,
 504  502                      i * sizeof (bp), sizeof (bp), &bp, 0);
 505  503                  if (err != 0) {
 506  504                          (void) printf("got error %u from dmu_read\n", err);
 507  505                          break;
 508  506                  }
 509  507                  snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), &bp);
 510  508                  (void) printf("\t%s\n", blkbuf);
 511  509          }
 512  510  }
 513  511  
 514  512  /* ARGSUSED */
 515  513  static void
 516  514  dump_bpobj_subobjs(objset_t *os, uint64_t object, void *data, size_t size)
 517  515  {
 518  516          dmu_object_info_t doi;
 519  517  
 520  518          VERIFY0(dmu_object_info(os, object, &doi));
 521  519          uint64_t *subobjs = kmem_alloc(doi.doi_max_offset, KM_SLEEP);
 522  520  
 523  521          int err = dmu_read(os, object, 0, doi.doi_max_offset, subobjs, 0);
 524  522          if (err != 0) {
 525  523                  (void) printf("got error %u from dmu_read\n", err);
 526  524                  kmem_free(subobjs, doi.doi_max_offset);
 527  525                  return;
 528  526          }
 529  527  
 530  528          int64_t last_nonzero = -1;
 531  529          for (uint64_t i = 0; i < doi.doi_max_offset / 8; i++) {
 532  530                  if (subobjs[i] != 0)
 533  531                          last_nonzero = i;
 534  532          }
 535  533  
 536  534          for (int64_t i = 0; i <= last_nonzero; i++) {
 537  535                  (void) printf("\t%llu\n", (u_longlong_t)subobjs[i]);
 538  536          }
 539  537          kmem_free(subobjs, doi.doi_max_offset);
 540  538  }
 541  539  
 542  540  /*ARGSUSED*/
 543  541  static void
 544  542  dump_ddt_zap(objset_t *os, uint64_t object, void *data, size_t size)
 545  543  {
 546  544          dump_zap_stats(os, object);
 547  545          /* contents are printed elsewhere, properly decoded */
 548  546  }
 549  547  
 550  548  /*ARGSUSED*/
 551  549  static void
 552  550  dump_sa_attrs(objset_t *os, uint64_t object, void *data, size_t size)
 553  551  {
 554  552          zap_cursor_t zc;
 555  553          zap_attribute_t attr;
 556  554  
 557  555          dump_zap_stats(os, object);
 558  556          (void) printf("\n");
 559  557  
 560  558          for (zap_cursor_init(&zc, os, object);
 561  559              zap_cursor_retrieve(&zc, &attr) == 0;
 562  560              zap_cursor_advance(&zc)) {
 563  561                  (void) printf("\t\t%s = ", attr.za_name);
 564  562                  if (attr.za_num_integers == 0) {
 565  563                          (void) printf("\n");
 566  564                          continue;
 567  565                  }
 568  566                  (void) printf(" %llx : [%d:%d:%d]\n",
 569  567                      (u_longlong_t)attr.za_first_integer,
 570  568                      (int)ATTR_LENGTH(attr.za_first_integer),
 571  569                      (int)ATTR_BSWAP(attr.za_first_integer),
 572  570                      (int)ATTR_NUM(attr.za_first_integer));
 573  571          }
 574  572          zap_cursor_fini(&zc);
 575  573  }
 576  574  
 577  575  /*ARGSUSED*/
 578  576  static void
 579  577  dump_sa_layouts(objset_t *os, uint64_t object, void *data, size_t size)
 580  578  {
 581  579          zap_cursor_t zc;
 582  580          zap_attribute_t attr;
 583  581          uint16_t *layout_attrs;
 584  582          unsigned i;
 585  583  
 586  584          dump_zap_stats(os, object);
 587  585          (void) printf("\n");
 588  586  
 589  587          for (zap_cursor_init(&zc, os, object);
 590  588              zap_cursor_retrieve(&zc, &attr) == 0;
 591  589              zap_cursor_advance(&zc)) {
 592  590                  (void) printf("\t\t%s = [", attr.za_name);
 593  591                  if (attr.za_num_integers == 0) {
 594  592                          (void) printf("\n");
 595  593                          continue;
 596  594                  }
 597  595  
 598  596                  VERIFY(attr.za_integer_length == 2);
 599  597                  layout_attrs = umem_zalloc(attr.za_num_integers *
 600  598                      attr.za_integer_length, UMEM_NOFAIL);
 601  599  
 602  600                  VERIFY(zap_lookup(os, object, attr.za_name,
 603  601                      attr.za_integer_length,
 604  602                      attr.za_num_integers, layout_attrs) == 0);
 605  603  
 606  604                  for (i = 0; i != attr.za_num_integers; i++)
 607  605                          (void) printf(" %d ", (int)layout_attrs[i]);
 608  606                  (void) printf("]\n");
 609  607                  umem_free(layout_attrs,
 610  608                      attr.za_num_integers * attr.za_integer_length);
 611  609          }
 612  610          zap_cursor_fini(&zc);
 613  611  }
 614  612  
 615  613  /*ARGSUSED*/
 616  614  static void
 617  615  dump_zpldir(objset_t *os, uint64_t object, void *data, size_t size)
 618  616  {
 619  617          zap_cursor_t zc;
 620  618          zap_attribute_t attr;
 621  619          const char *typenames[] = {
 622  620                  /* 0 */ "not specified",
 623  621                  /* 1 */ "FIFO",
 624  622                  /* 2 */ "Character Device",
 625  623                  /* 3 */ "3 (invalid)",
 626  624                  /* 4 */ "Directory",
 627  625                  /* 5 */ "5 (invalid)",
 628  626                  /* 6 */ "Block Device",
 629  627                  /* 7 */ "7 (invalid)",
 630  628                  /* 8 */ "Regular File",
 631  629                  /* 9 */ "9 (invalid)",
 632  630                  /* 10 */ "Symbolic Link",
 633  631                  /* 11 */ "11 (invalid)",
 634  632                  /* 12 */ "Socket",
 635  633                  /* 13 */ "Door",
 636  634                  /* 14 */ "Event Port",
 637  635                  /* 15 */ "15 (invalid)",
 638  636          };
 639  637  
 640  638          dump_zap_stats(os, object);
 641  639          (void) printf("\n");
 642  640  
 643  641          for (zap_cursor_init(&zc, os, object);
 644  642              zap_cursor_retrieve(&zc, &attr) == 0;
 645  643              zap_cursor_advance(&zc)) {
 646  644                  (void) printf("\t\t%s = %lld (type: %s)\n",
 647  645                      attr.za_name, ZFS_DIRENT_OBJ(attr.za_first_integer),
 648  646                      typenames[ZFS_DIRENT_TYPE(attr.za_first_integer)]);
 649  647          }
 650  648          zap_cursor_fini(&zc);
 651  649  }
 652  650  
 653  651  static int
 654  652  get_dtl_refcount(vdev_t *vd)
 655  653  {
 656  654          int refcount = 0;
 657  655  
 658  656          if (vd->vdev_ops->vdev_op_leaf) {
 659  657                  space_map_t *sm = vd->vdev_dtl_sm;
 660  658  
 661  659                  if (sm != NULL &&
 662  660                      sm->sm_dbuf->db_size == sizeof (space_map_phys_t))
 663  661                          return (1);
 664  662                  return (0);
 665  663          }
 666  664  
  
 667  665          for (unsigned c = 0; c < vd->vdev_children; c++)
 668  666                  refcount += get_dtl_refcount(vd->vdev_child[c]);
 669  667          return (refcount);
 670  668  }
 671  669  
 672  670  static int
 673  671  get_metaslab_refcount(vdev_t *vd)
 674  672  {
 675  673          int refcount = 0;
 676  674  
 677      -        if (vd->vdev_top == vd) {
 678      -                for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
      675 +        if (vd->vdev_top == vd && !vd->vdev_removing) {
      676 +                for (unsigned m = 0; m < vd->vdev_ms_count; m++) {
 679  677                          space_map_t *sm = vd->vdev_ms[m]->ms_sm;
 680  678  
 681  679                          if (sm != NULL &&
 682  680                              sm->sm_dbuf->db_size == sizeof (space_map_phys_t))
 683  681                                  refcount++;
 684  682                  }
 685  683          }
 686  684          for (unsigned c = 0; c < vd->vdev_children; c++)
 687  685                  refcount += get_metaslab_refcount(vd->vdev_child[c]);
 688  686  
 689  687          return (refcount);
 690  688  }
 691  689  
 692  690  static int
 693      -get_obsolete_refcount(vdev_t *vd)
 694      -{
 695      -        int refcount = 0;
 696      -
 697      -        uint64_t obsolete_sm_obj = vdev_obsolete_sm_object(vd);
 698      -        if (vd->vdev_top == vd && obsolete_sm_obj != 0) {
 699      -                dmu_object_info_t doi;
 700      -                VERIFY0(dmu_object_info(vd->vdev_spa->spa_meta_objset,
 701      -                    obsolete_sm_obj, &doi));
 702      -                if (doi.doi_bonus_size == sizeof (space_map_phys_t)) {
 703      -                        refcount++;
 704      -                }
 705      -        } else {
 706      -                ASSERT3P(vd->vdev_obsolete_sm, ==, NULL);
 707      -                ASSERT3U(obsolete_sm_obj, ==, 0);
 708      -        }
 709      -        for (unsigned c = 0; c < vd->vdev_children; c++) {
 710      -                refcount += get_obsolete_refcount(vd->vdev_child[c]);
 711      -        }
 712      -
 713      -        return (refcount);
 714      -}
 715      -
 716      -static int
 717      -get_prev_obsolete_spacemap_refcount(spa_t *spa)
 718      -{
 719      -        uint64_t prev_obj =
 720      -            spa->spa_condensing_indirect_phys.scip_prev_obsolete_sm_object;
 721      -        if (prev_obj != 0) {
 722      -                dmu_object_info_t doi;
 723      -                VERIFY0(dmu_object_info(spa->spa_meta_objset, prev_obj, &doi));
 724      -                if (doi.doi_bonus_size == sizeof (space_map_phys_t)) {
 725      -                        return (1);
 726      -                }
 727      -        }
 728      -        return (0);
 729      -}
 730      -
 731      -static int
 732  691  verify_spacemap_refcounts(spa_t *spa)
 733  692  {
 734  693          uint64_t expected_refcount = 0;
 735  694          uint64_t actual_refcount;
 736  695  
 737  696          (void) feature_get_refcount(spa,
 738  697              &spa_feature_table[SPA_FEATURE_SPACEMAP_HISTOGRAM],
 739  698              &expected_refcount);
 740  699          actual_refcount = get_dtl_refcount(spa->spa_root_vdev);
 741  700          actual_refcount += get_metaslab_refcount(spa->spa_root_vdev);
 742      -        actual_refcount += get_obsolete_refcount(spa->spa_root_vdev);
 743      -        actual_refcount += get_prev_obsolete_spacemap_refcount(spa);
 744  701  
 745  702          if (expected_refcount != actual_refcount) {
 746  703                  (void) printf("space map refcount mismatch: expected %lld != "
 747  704                      "actual %lld\n",
 748  705                      (longlong_t)expected_refcount,
 749  706                      (longlong_t)actual_refcount);
 750  707                  return (2);
 751  708          }
 752  709          return (0);
 753  710  }
 754  711  
 755  712  static void
 756  713  dump_spacemap(objset_t *os, space_map_t *sm)
 757  714  {
 758  715          uint64_t alloc, offset, entry;
 759      -        char *ddata[] = { "ALLOC", "FREE", "CONDENSE", "INVALID",
 760      -            "INVALID", "INVALID", "INVALID", "INVALID" };
      716 +        const char *ddata[] = { "ALLOC", "FREE", "CONDENSE", "INVALID",
      717 +                            "INVALID", "INVALID", "INVALID", "INVALID" };
 761  718  
 762  719          if (sm == NULL)
 763  720                  return;
 764  721  
 765      -        (void) printf("space map object %llu:\n",
 766      -            (longlong_t)sm->sm_phys->smp_object);
 767      -        (void) printf("  smp_objsize = 0x%llx\n",
 768      -            (longlong_t)sm->sm_phys->smp_objsize);
 769      -        (void) printf("  smp_alloc = 0x%llx\n",
 770      -            (longlong_t)sm->sm_phys->smp_alloc);
 771      -
 772  722          /*
 773  723           * Print out the freelist entries in both encoded and decoded form.
 774  724           */
 775  725          alloc = 0;
 776  726          for (offset = 0; offset < space_map_length(sm);
 777  727              offset += sizeof (entry)) {
 778  728                  uint8_t mapshift = sm->sm_shift;
 779  729  
 780  730                  VERIFY0(dmu_read(os, space_map_object(sm), offset,
 781  731                      sizeof (entry), &entry, DMU_READ_PREFETCH));
 782  732                  if (SM_DEBUG_DECODE(entry)) {
 783  733  
 784  734                          (void) printf("\t    [%6llu] %s: txg %llu, pass %llu\n",
 785  735                              (u_longlong_t)(offset / sizeof (entry)),
 786  736                              ddata[SM_DEBUG_ACTION_DECODE(entry)],
 787  737                              (u_longlong_t)SM_DEBUG_TXG_DECODE(entry),
 788  738                              (u_longlong_t)SM_DEBUG_SYNCPASS_DECODE(entry));
 789  739                  } else {
 790  740                          (void) printf("\t    [%6llu]    %c  range:"
 791  741                              " %010llx-%010llx  size: %06llx\n",
 792  742                              (u_longlong_t)(offset / sizeof (entry)),
 793  743                              SM_TYPE_DECODE(entry) == SM_ALLOC ? 'A' : 'F',
 794  744                              (u_longlong_t)((SM_OFFSET_DECODE(entry) <<
 795  745                              mapshift) + sm->sm_start),
 796  746                              (u_longlong_t)((SM_OFFSET_DECODE(entry) <<
 797  747                              mapshift) + sm->sm_start +
 798  748                              (SM_RUN_DECODE(entry) << mapshift)),
 799  749                              (u_longlong_t)(SM_RUN_DECODE(entry) << mapshift));
 800  750                          if (SM_TYPE_DECODE(entry) == SM_ALLOC)
 801  751                                  alloc += SM_RUN_DECODE(entry) << mapshift;
 802  752                          else
 803  753                                  alloc -= SM_RUN_DECODE(entry) << mapshift;
 804  754                  }
 805  755          }
 806  756          if (alloc != space_map_allocated(sm)) {
 807  757                  (void) printf("space_map_object alloc (%llu) INCONSISTENT "
 808  758                      "with space map summary (%llu)\n",
 809  759                      (u_longlong_t)space_map_allocated(sm), (u_longlong_t)alloc);
 810  760          }
 811  761  }
 812  762  
 813  763  static void
 814  764  dump_metaslab_stats(metaslab_t *msp)
 815  765  {
 816  766          char maxbuf[32];
 817  767          range_tree_t *rt = msp->ms_tree;
 818  768          avl_tree_t *t = &msp->ms_size_tree;
 819  769          int free_pct = range_tree_space(rt) * 100 / msp->ms_size;
 820  770  
 821  771          /* make sure nicenum has enough space */
 822  772          CTASSERT(sizeof (maxbuf) >= NN_NUMBUF_SZ);
 823  773  
 824  774          zdb_nicenum(metaslab_block_maxsize(msp), maxbuf, sizeof (maxbuf));
 825  775  
 826  776          (void) printf("\t %25s %10lu   %7s  %6s   %4s %4d%%\n",
 827  777              "segments", avl_numnodes(t), "maxsize", maxbuf,
 828  778              "freepct", free_pct);
 829  779          (void) printf("\tIn-memory histogram:\n");
 830  780          dump_histogram(rt->rt_histogram, RANGE_TREE_HISTOGRAM_SIZE, 0);
 831  781  }
 832  782  
 833  783  static void
 834  784  dump_metaslab(metaslab_t *msp)
 835  785  {
 836  786          vdev_t *vd = msp->ms_group->mg_vd;
 837  787          spa_t *spa = vd->vdev_spa;
 838  788          space_map_t *sm = msp->ms_sm;
 839  789          char freebuf[32];
 840  790  
 841  791          zdb_nicenum(msp->ms_size - space_map_allocated(sm), freebuf,
 842  792              sizeof (freebuf));
 843  793  
 844  794          (void) printf(
 845  795              "\tmetaslab %6llu   offset %12llx   spacemap %6llu   free    %5s\n",
 846  796              (u_longlong_t)msp->ms_id, (u_longlong_t)msp->ms_start,
 847  797              (u_longlong_t)space_map_object(sm), freebuf);
 848  798  
 849  799          if (dump_opt['m'] > 2 && !dump_opt['L']) {
 850  800                  mutex_enter(&msp->ms_lock);
 851  801                  metaslab_load_wait(msp);
 852  802                  if (!msp->ms_loaded) {
 853  803                          VERIFY0(metaslab_load(msp));
 854  804                          range_tree_stat_verify(msp->ms_tree);
 855  805                  }
 856  806                  dump_metaslab_stats(msp);
 857  807                  metaslab_unload(msp);
 858  808                  mutex_exit(&msp->ms_lock);
 859  809          }
 860  810  
 861  811          if (dump_opt['m'] > 1 && sm != NULL &&
 862  812              spa_feature_is_active(spa, SPA_FEATURE_SPACEMAP_HISTOGRAM)) {
 863  813                  /*
 864  814                   * The space map histogram represents free space in chunks
 865  815                   * of sm_shift (i.e. bucket 0 refers to 2^sm_shift).
  
 866  816                   */
 867  817                  (void) printf("\tOn-disk histogram:\t\tfragmentation %llu\n",
 868  818                      (u_longlong_t)msp->ms_fragmentation);
 869  819                  dump_histogram(sm->sm_phys->smp_histogram,
 870  820                      SPACE_MAP_HISTOGRAM_SIZE, sm->sm_shift);
 871  821          }
 872  822  
 873  823          if (dump_opt['d'] > 5 || dump_opt['m'] > 3) {
 874  824                  ASSERT(msp->ms_size == (1ULL << vd->vdev_ms_shift));
 875  825  
      826 +                mutex_enter(&msp->ms_lock);
 876  827                  dump_spacemap(spa->spa_meta_objset, msp->ms_sm);
      828 +                mutex_exit(&msp->ms_lock);
 877  829          }
 878  830  }
 879  831  
 880  832  static void
 881  833  print_vdev_metaslab_header(vdev_t *vd)
 882  834  {
 883  835          (void) printf("\tvdev %10llu\n\t%-10s%5llu   %-19s   %-15s   %-10s\n",
 884  836              (u_longlong_t)vd->vdev_id,
 885  837              "metaslabs", (u_longlong_t)vd->vdev_ms_count,
 886  838              "offset", "spacemap", "free");
 887  839          (void) printf("\t%15s   %19s   %15s   %10s\n",
 888  840              "---------------", "-------------------",
 889  841              "---------------", "-------------");
 890  842  }
 891  843  
 892  844  static void
 893  845  dump_metaslab_groups(spa_t *spa)
 894  846  {
 895  847          vdev_t *rvd = spa->spa_root_vdev;
 896  848          metaslab_class_t *mc = spa_normal_class(spa);
 897  849          uint64_t fragmentation;
 898  850  
 899  851          metaslab_class_histogram_verify(mc);
 900  852  
 901  853          for (unsigned c = 0; c < rvd->vdev_children; c++) {
 902  854                  vdev_t *tvd = rvd->vdev_child[c];
 903  855                  metaslab_group_t *mg = tvd->vdev_mg;
 904  856  
 905  857                  if (mg->mg_class != mc)
 906  858                          continue;
 907  859  
 908  860                  metaslab_group_histogram_verify(mg);
 909  861                  mg->mg_fragmentation = metaslab_group_fragmentation(mg);
 910  862  
 911  863                  (void) printf("\tvdev %10llu\t\tmetaslabs%5llu\t\t"
 912  864                      "fragmentation",
 913  865                      (u_longlong_t)tvd->vdev_id,
 914  866                      (u_longlong_t)tvd->vdev_ms_count);
 915  867                  if (mg->mg_fragmentation == ZFS_FRAG_INVALID) {
 916  868                          (void) printf("%3s\n", "-");
 917  869                  } else {
 918  870                          (void) printf("%3llu%%\n",
 919  871                              (u_longlong_t)mg->mg_fragmentation);
 920  872                  }
 921  873                  dump_histogram(mg->mg_histogram, RANGE_TREE_HISTOGRAM_SIZE, 0);
 922  874          }
 923  875  
  
 924  876          (void) printf("\tpool %s\tfragmentation", spa_name(spa));
 925  877          fragmentation = metaslab_class_fragmentation(mc);
 926  878          if (fragmentation == ZFS_FRAG_INVALID)
 927  879                  (void) printf("\t%3s\n", "-");
 928  880          else
 929  881                  (void) printf("\t%3llu%%\n", (u_longlong_t)fragmentation);
 930  882          dump_histogram(mc->mc_histogram, RANGE_TREE_HISTOGRAM_SIZE, 0);
 931  883  }
 932  884  
 933  885  static void
 934      -print_vdev_indirect(vdev_t *vd)
 935      -{
 936      -        vdev_indirect_config_t *vic = &vd->vdev_indirect_config;
 937      -        vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
 938      -        vdev_indirect_births_t *vib = vd->vdev_indirect_births;
 939      -
 940      -        if (vim == NULL) {
 941      -                ASSERT3P(vib, ==, NULL);
 942      -                return;
 943      -        }
 944      -
 945      -        ASSERT3U(vdev_indirect_mapping_object(vim), ==,
 946      -            vic->vic_mapping_object);
 947      -        ASSERT3U(vdev_indirect_births_object(vib), ==,
 948      -            vic->vic_births_object);
 949      -
 950      -        (void) printf("indirect births obj %llu:\n",
 951      -            (longlong_t)vic->vic_births_object);
 952      -        (void) printf("    vib_count = %llu\n",
 953      -            (longlong_t)vdev_indirect_births_count(vib));
 954      -        for (uint64_t i = 0; i < vdev_indirect_births_count(vib); i++) {
 955      -                vdev_indirect_birth_entry_phys_t *cur_vibe =
 956      -                    &vib->vib_entries[i];
 957      -                (void) printf("\toffset %llx -> txg %llu\n",
 958      -                    (longlong_t)cur_vibe->vibe_offset,
 959      -                    (longlong_t)cur_vibe->vibe_phys_birth_txg);
 960      -        }
 961      -        (void) printf("\n");
 962      -
 963      -        (void) printf("indirect mapping obj %llu:\n",
 964      -            (longlong_t)vic->vic_mapping_object);
 965      -        (void) printf("    vim_max_offset = 0x%llx\n",
 966      -            (longlong_t)vdev_indirect_mapping_max_offset(vim));
 967      -        (void) printf("    vim_bytes_mapped = 0x%llx\n",
 968      -            (longlong_t)vdev_indirect_mapping_bytes_mapped(vim));
 969      -        (void) printf("    vim_count = %llu\n",
 970      -            (longlong_t)vdev_indirect_mapping_num_entries(vim));
 971      -
 972      -        if (dump_opt['d'] <= 5 && dump_opt['m'] <= 3)
 973      -                return;
 974      -
 975      -        uint32_t *counts = vdev_indirect_mapping_load_obsolete_counts(vim);
 976      -
 977      -        for (uint64_t i = 0; i < vdev_indirect_mapping_num_entries(vim); i++) {
 978      -                vdev_indirect_mapping_entry_phys_t *vimep =
 979      -                    &vim->vim_entries[i];
 980      -                (void) printf("\t<%llx:%llx:%llx> -> "
 981      -                    "<%llx:%llx:%llx> (%x obsolete)\n",
 982      -                    (longlong_t)vd->vdev_id,
 983      -                    (longlong_t)DVA_MAPPING_GET_SRC_OFFSET(vimep),
 984      -                    (longlong_t)DVA_GET_ASIZE(&vimep->vimep_dst),
 985      -                    (longlong_t)DVA_GET_VDEV(&vimep->vimep_dst),
 986      -                    (longlong_t)DVA_GET_OFFSET(&vimep->vimep_dst),
 987      -                    (longlong_t)DVA_GET_ASIZE(&vimep->vimep_dst),
 988      -                    counts[i]);
 989      -        }
 990      -        (void) printf("\n");
 991      -
 992      -        uint64_t obsolete_sm_object = vdev_obsolete_sm_object(vd);
 993      -        if (obsolete_sm_object != 0) {
 994      -                objset_t *mos = vd->vdev_spa->spa_meta_objset;
 995      -                (void) printf("obsolete space map object %llu:\n",
 996      -                    (u_longlong_t)obsolete_sm_object);
 997      -                ASSERT(vd->vdev_obsolete_sm != NULL);
 998      -                ASSERT3U(space_map_object(vd->vdev_obsolete_sm), ==,
 999      -                    obsolete_sm_object);
1000      -                dump_spacemap(mos, vd->vdev_obsolete_sm);
1001      -                (void) printf("\n");
1002      -        }
1003      -}
1004      -
1005      -static void
1006  886  dump_metaslabs(spa_t *spa)
1007  887  {
1008  888          vdev_t *vd, *rvd = spa->spa_root_vdev;
1009  889          uint64_t m, c = 0, children = rvd->vdev_children;
1010  890  
1011  891          (void) printf("\nMetaslabs:\n");
1012  892  
1013  893          if (!dump_opt['d'] && zopt_objects > 0) {
1014  894                  c = zopt_object[0];
1015  895  
1016  896                  if (c >= children)
1017  897                          (void) fatal("bad vdev id: %llu", (u_longlong_t)c);
1018  898  
1019  899                  if (zopt_objects > 1) {
1020  900                          vd = rvd->vdev_child[c];
1021  901                          print_vdev_metaslab_header(vd);
1022  902  
1023  903                          for (m = 1; m < zopt_objects; m++) {
1024  904                                  if (zopt_object[m] < vd->vdev_ms_count)
1025  905                                          dump_metaslab(
1026  906                                              vd->vdev_ms[zopt_object[m]]);
1027  907                                  else
1028  908                                          (void) fprintf(stderr, "bad metaslab "
1029  909                                              "number %llu\n",
1030  910                                              (u_longlong_t)zopt_object[m]);
  
1031  911                          }
1032  912                          (void) printf("\n");
1033  913                          return;
1034  914                  }
1035  915                  children = c + 1;
1036  916          }
1037  917          for (; c < children; c++) {
1038  918                  vd = rvd->vdev_child[c];
1039  919                  print_vdev_metaslab_header(vd);
1040  920  
1041      -                print_vdev_indirect(vd);
1042      -
1043  921                  for (m = 0; m < vd->vdev_ms_count; m++)
1044  922                          dump_metaslab(vd->vdev_ms[m]);
1045  923                  (void) printf("\n");
1046  924          }
1047  925  }
1048  926  
1049  927  static void
1050  928  dump_dde(const ddt_t *ddt, const ddt_entry_t *dde, uint64_t index)
1051  929  {
1052  930          const ddt_phys_t *ddp = dde->dde_phys;
1053  931          const ddt_key_t *ddk = &dde->dde_key;
1054  932          const char *types[4] = { "ditto", "single", "double", "triple" };
1055  933          char blkbuf[BP_SPRINTF_LEN];
1056  934          blkptr_t blk;
1057  935  
1058  936          for (int p = 0; p < DDT_PHYS_TYPES; p++, ddp++) {
1059  937                  if (ddp->ddp_phys_birth == 0)
1060  938                          continue;
1061  939                  ddt_bp_create(ddt->ddt_checksum, ddk, ddp, &blk);
1062  940                  snprintf_blkptr(blkbuf, sizeof (blkbuf), &blk);
1063  941                  (void) printf("index %llx refcnt %llu %s %s\n",
1064  942                      (u_longlong_t)index, (u_longlong_t)ddp->ddp_refcnt,
1065  943                      types[p], blkbuf);
1066  944          }
1067  945  }
1068  946  
1069  947  static void
1070  948  dump_dedup_ratio(const ddt_stat_t *dds)
1071  949  {
1072  950          double rL, rP, rD, D, dedup, compress, copies;
1073  951  
1074  952          if (dds->dds_blocks == 0)
1075  953                  return;
1076  954  
1077  955          rL = (double)dds->dds_ref_lsize;
1078  956          rP = (double)dds->dds_ref_psize;
1079  957          rD = (double)dds->dds_ref_dsize;
1080  958          D = (double)dds->dds_dsize;
1081  959  
1082  960          dedup = rD / D;
1083  961          compress = rL / rP;
1084  962          copies = rD / rP;
1085  963  
1086  964          (void) printf("dedup = %.2f, compress = %.2f, copies = %.2f, "
1087  965              "dedup * compress / copies = %.2f\n\n",
1088  966              dedup, compress, copies, dedup * compress / copies);
1089  967  }
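
The ratio arithmetic in dump_dedup_ratio() can be checked by hand with a small standalone sketch. The struct and function names below (demo_dds_t, demo_ratios_t, demo_dedup_ratios) are hypothetical stand-ins, not part of zdb or ddt.h, and the field descriptions in the comments are an assumed reading of the dds_ref_lsize, dds_ref_psize, dds_ref_dsize and dds_dsize fields used above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the four ddt_stat_t fields read by
 * dump_dedup_ratio(); the meanings given here are assumptions.
 */
typedef struct demo_dds {
	uint64_t ref_lsize;	/* referenced logical bytes */
	uint64_t ref_psize;	/* referenced physical (compressed) bytes */
	uint64_t ref_dsize;	/* referenced on-disk bytes */
	uint64_t dsize;		/* on-disk bytes actually stored */
} demo_dds_t;

typedef struct demo_ratios {
	double dedup;
	double compress;
	double copies;
	double combined;	/* dedup * compress / copies */
} demo_ratios_t;

/* Same formulas as dump_dedup_ratio(), returned instead of printed. */
static demo_ratios_t
demo_dedup_ratios(const demo_dds_t *dds)
{
	demo_ratios_t r;

	/* The divisors must be nonzero, as they are for a non-empty DDT. */
	assert(dds->dsize != 0 && dds->ref_psize != 0);

	r.dedup = (double)dds->ref_dsize / (double)dds->dsize;
	r.compress = (double)dds->ref_lsize / (double)dds->ref_psize;
	r.copies = (double)dds->ref_dsize / (double)dds->ref_psize;
	r.combined = r.dedup * r.compress / r.copies;
	return (r);
}
```

With 4096 referenced logical bytes, 2048 referenced physical bytes, 4096 referenced on-disk bytes and 1024 stored on-disk bytes, the sketch yields dedup 4.00, compress 2.00, copies 2.00 and a combined ratio of 4.00.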
1090  968  
1091  969  static void
1092  970  dump_ddt(ddt_t *ddt, enum ddt_type type, enum ddt_class class)
1093  971  {
1094  972          char name[DDT_NAMELEN];
1095  973          ddt_entry_t dde;
1096  974          uint64_t walk = 0;
  
1097  975          dmu_object_info_t doi;
1098  976          uint64_t count, dspace, mspace;
1099  977          int error;
1100  978  
1101  979          error = ddt_object_info(ddt, type, class, &doi);
1102  980  
1103  981          if (error == ENOENT)
1104  982                  return;
1105  983          ASSERT(error == 0);
1106  984  
1107      -        if ((count = ddt_object_count(ddt, type, class)) == 0)
      985 +        (void) ddt_object_count(ddt, type, class, &count);
      986 +        if (count == 0)
1108  987                  return;
1109  988  
1110  989          dspace = doi.doi_physical_blocks_512 << 9;
1111  990          mspace = doi.doi_fill_count * doi.doi_data_block_size;
1112  991  
1113  992          ddt_object_name(ddt, type, class, name);
1114  993  
1115  994          (void) printf("%s: %llu entries, size %llu on disk, %llu in core\n",
1116  995              name,
1117  996              (u_longlong_t)count,
1118  997              (u_longlong_t)(dspace / count),
1119  998              (u_longlong_t)(mspace / count));
1120  999  
1121 1000          if (dump_opt['D'] < 3)
1122 1001                  return;
1123 1002  
1124 1003          zpool_dump_ddt(NULL, &ddt->ddt_histogram[type][class]);
1125 1004  
1126 1005          if (dump_opt['D'] < 4)
1127 1006                  return;
1128 1007  
1129 1008          if (dump_opt['D'] < 5 && class == DDT_CLASS_UNIQUE)
1130 1009                  return;
1131 1010  
1132 1011          (void) printf("%s contents:\n\n", name);
1133 1012  
1134 1013          while ((error = ddt_object_walk(ddt, type, class, &walk, &dde)) == 0)
1135 1014                  dump_dde(ddt, &dde, walk);
1136 1015  
1137 1016          ASSERT(error == ENOENT);
1138 1017  
1139 1018          (void) printf("\n");
1140 1019  }
1141 1020  
1142 1021  static void
1143 1022  dump_all_ddts(spa_t *spa)
1144 1023  {
1145 1024          ddt_histogram_t ddh_total;
1146 1025          ddt_stat_t dds_total;
1147 1026  
1148 1027          bzero(&ddh_total, sizeof (ddh_total));
1149 1028          bzero(&dds_total, sizeof (dds_total));
1150 1029  
1151 1030          for (enum zio_checksum c = 0; c < ZIO_CHECKSUM_FUNCTIONS; c++) {
1152 1031                  ddt_t *ddt = spa->spa_ddt[c];
1153 1032                  for (enum ddt_type type = 0; type < DDT_TYPES; type++) {
1154 1033                          for (enum ddt_class class = 0; class < DDT_CLASSES;
1155 1034                              class++) {
1156 1035                                  dump_ddt(ddt, type, class);
1157 1036                          }
1158 1037                  }
1159 1038          }
1160 1039  
1161 1040          ddt_get_dedup_stats(spa, &dds_total);
1162 1041  
1163 1042          if (dds_total.dds_blocks == 0) {
1164 1043                  (void) printf("All DDTs are empty\n");
1165 1044                  return;
1166 1045          }
1167 1046  
1168 1047          (void) printf("\n");
1169 1048  
1170 1049          if (dump_opt['D'] > 1) {
1171 1050                  (void) printf("DDT histogram (aggregated over all DDTs):\n");
1172 1051                  ddt_get_dedup_histogram(spa, &ddh_total);
1173 1052                  zpool_dump_ddt(&dds_total, &ddh_total);
1174 1053          }
1175 1054  
1176 1055          dump_dedup_ratio(&dds_total);
1177 1056  }
1178 1057  
1179 1058  static void
1180 1059  dump_dtl_seg(void *arg, uint64_t start, uint64_t size)
1181 1060  {
1182 1061          char *prefix = arg;
1183 1062  
1184 1063          (void) printf("%s [%llu,%llu) length %llu\n",
1185 1064              prefix,
1186 1065              (u_longlong_t)start,
1187 1066              (u_longlong_t)(start + size),
1188 1067              (u_longlong_t)(size));
1189 1068  }
1190 1069  
1191 1070  static void
1192 1071  dump_dtl(vdev_t *vd, int indent)
1193 1072  {
1194 1073          spa_t *spa = vd->vdev_spa;
1195 1074          boolean_t required;
1196 1075          const char *name[DTL_TYPES] = { "missing", "partial", "scrub",
1197 1076                  "outage" };
1198 1077          char prefix[256];
1199 1078  
1200 1079          spa_vdev_state_enter(spa, SCL_NONE);
1201 1080          required = vdev_dtl_required(vd);
1202 1081          (void) spa_vdev_state_exit(spa, NULL, 0);
1203 1082  
1204 1083          if (indent == 0)
1205 1084                  (void) printf("\nDirty time logs:\n\n");
1206 1085  
1207 1086          (void) printf("\t%*s%s [%s]\n", indent, "",
  
1208 1087              vd->vdev_path ? vd->vdev_path :
1209 1088              vd->vdev_parent ? vd->vdev_ops->vdev_op_type : spa_name(spa),
1210 1089              required ? "DTL-required" : "DTL-expendable");
1211 1090  
1212 1091          for (int t = 0; t < DTL_TYPES; t++) {
1213 1092                  range_tree_t *rt = vd->vdev_dtl[t];
1214 1093                  if (range_tree_space(rt) == 0)
1215 1094                          continue;
1216 1095                  (void) snprintf(prefix, sizeof (prefix), "\t%*s%s",
1217 1096                      indent + 2, "", name[t]);
     1097 +                mutex_enter(rt->rt_lock);
1218 1098                  range_tree_walk(rt, dump_dtl_seg, prefix);
     1099 +                mutex_exit(rt->rt_lock);
1219 1100                  if (dump_opt['d'] > 5 && vd->vdev_children == 0)
1220 1101                          dump_spacemap(spa->spa_meta_objset, vd->vdev_dtl_sm);
1221 1102          }
1222 1103  
1223 1104          for (unsigned c = 0; c < vd->vdev_children; c++)
1224 1105                  dump_dtl(vd->vdev_child[c], indent + 4);
1225 1106  }
1226 1107  
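
The read loop in dump_history() grows its buffer whenever an unpack pass consumes nothing (resid == len), which is how the change avoids looping forever on a record larger than the initial buffer. The following is a minimal standalone sketch of that retry pattern, not the zdb code itself: the record format (one length byte followed by the payload) and the demo_read(), demo_unpack() and demo_count_records() helpers are all hypothetical, and malloc()/free() stand in for umem_alloc()/umem_free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copy up to buflen bytes of the log starting at off; return bytes read. */
static size_t
demo_read(const unsigned char *hist, size_t histlen, size_t off,
    unsigned char *buf, size_t buflen)
{
	size_t n = (off < histlen) ? histlen - off : 0;

	if (n > buflen)
		n = buflen;
	memcpy(buf, hist + off, n);
	return (n);
}

/* Consume whole records; return the number of trailing bytes left over. */
static size_t
demo_unpack(const unsigned char *buf, size_t len, unsigned *nrec)
{
	size_t pos = 0;

	while (pos < len && pos + 1 + buf[pos] <= len) {
		pos += 1 + buf[pos];
		(*nrec)++;
	}
	return (len - pos);
}

/* Count records, doubling the buffer when a full buffer holds no record. */
static int
demo_count_records(const unsigned char *hist, size_t histlen, size_t buflen,
    unsigned *nrec)
{
	unsigned char *buf;
	size_t off = 0, len, resid;

	assert(buflen > 0);
	*nrec = 0;
	if ((buf = malloc(buflen)) == NULL)
		return (-1);
	do {
		len = demo_read(hist, histlen, off, buf, buflen);
		resid = demo_unpack(buf, len, nrec);
		off += len - resid;	/* re-read the unconsumed tail */
		if (len != 0 && resid == len) {
			if (len < buflen)
				break;	/* truncated final record */
			free(buf);
			buflen *= 2;
			if ((buf = malloc(buflen)) == NULL)
				return (-1);
		}
	} while (len != 0);
	free(buf);
	return (0);
}
```

Unlike the loop below, the sketch only grows the buffer when the read actually filled it, so a truncated final record ends the loop instead of triggering further doubling; that guard is a design choice of the sketch, not something the diff does.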
1227 1108  static void
1228 1109  dump_history(spa_t *spa)
1229 1110  {
1230 1111          nvlist_t **events = NULL;
1231 1112          uint64_t resid, len, off = 0;
     1113 +        uint64_t buflen;
1232 1114          uint_t num = 0;
1233 1115          int error;
1234 1116          time_t tsec;
1235 1117          struct tm t;
1236 1118          char tbuf[30];
1237 1119          char internalstr[MAXPATHLEN];
1238 1120  
1239      -        char *buf = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
     1121 +        buflen = SPA_MAXBLOCKSIZE;
     1122 +        char *buf = umem_alloc(buflen, UMEM_NOFAIL);
1240 1123          do {
1241      -                len = SPA_MAXBLOCKSIZE;
     1124 +                len = buflen;
1242 1125  
1243 1126                  if ((error = spa_history_get(spa, &off, &len, buf)) != 0) {
1244      -                        (void) fprintf(stderr, "Unable to read history: "
1245      -                            "error %d\n", error);
1246      -                        umem_free(buf, SPA_MAXBLOCKSIZE);
1247      -                        return;
     1127 +                        break;
1248 1128                  }
1249 1129  
1250      -                if (zpool_history_unpack(buf, len, &resid, &events, &num) != 0)
     1130 +                error = zpool_history_unpack(buf, len, &resid, &events, &num);
     1131 +                if (error != 0) {
1251 1132                          break;
     1133 +                }
1252 1134  
1253 1135                  off -= resid;
     1136 +                if (resid == len) {
     1137 +                         umem_free(buf, buflen);
     1138 +                         buflen *= 2;
     1139 +                         buf = umem_alloc(buflen, UMEM_NOFAIL);
     1140 +                         if (buf == NULL) {
     1141 +                                (void) fprintf(stderr, "Unable to read history: %s\n",
     1142 +                                    strerror(error));
     1143 +                                goto err;
     1144 +                         }
     1145 +                }
1254 1146          } while (len != 0);
1255      -        umem_free(buf, SPA_MAXBLOCKSIZE);
     1147 +        umem_free(buf, buflen);
1256 1148  
     1149 +        if (error != 0) {
     1150 +                (void) fprintf(stderr, "Unable to read history: %s\n",
     1151 +                    strerror(error));
     1152 +                goto err;
     1153 +        }
     1154 +
1257 1155          (void) printf("\nHistory:\n");
1258 1156          for (unsigned i = 0; i < num; i++) {
1259 1157                  uint64_t time, txg, ievent;
1260 1158                  char *cmd, *intstr;
1261 1159                  boolean_t printed = B_FALSE;
1262 1160  
1263 1161                  if (nvlist_lookup_uint64(events[i], ZPOOL_HIST_TIME,
1264 1162                      &time) != 0)
1265 1163                          goto next;
1266 1164                  if (nvlist_lookup_string(events[i], ZPOOL_HIST_CMD,
1267 1165                      &cmd) != 0) {
1268 1166                          if (nvlist_lookup_uint64(events[i],
1269 1167                              ZPOOL_HIST_INT_EVENT, &ievent) != 0)
1270 1168                                  goto next;
1271 1169                          verify(nvlist_lookup_uint64(events[i],
1272 1170                              ZPOOL_HIST_TXG, &txg) == 0);
1273 1171                          verify(nvlist_lookup_string(events[i],
1274 1172                              ZPOOL_HIST_INT_STR, &intstr) == 0);
1275 1173                          if (ievent >= ZFS_NUM_LEGACY_HISTORY_EVENTS)
1276 1174                                  goto next;
1277 1175  
1278 1176                          (void) snprintf(internalstr,
1279 1177                              sizeof (internalstr),
1280 1178                              "[internal %s txg:%ju] %s",
1281 1179                              zfs_history_event_names[ievent], (uintmax_t)txg,
1282 1180                              intstr);
1283 1181                          cmd = internalstr;
1284 1182                  }
1285 1183                  tsec = time;
1286 1184                  (void) localtime_r(&tsec, &t);
1287 1185                  (void) strftime(tbuf, sizeof (tbuf), "%F.%T", &t);
  
1288 1186                  (void) printf("%s %s\n", tbuf, cmd);
1289 1187                  printed = B_TRUE;
1290 1188  
1291 1189  next:
1292 1190                  if (dump_opt['h'] > 1) {
1293 1191                          if (!printed)
1294 1192                                  (void) printf("unrecognized record:\n");
1295 1193                          dump_nvlist(events[i], 2);
1296 1194                  }
1297 1195          }
     1196 +err:
     1197 +        for (unsigned i = 0; i < num; i++) {
     1198 +                nvlist_free(events[i]);
     1199 +        }
     1200 +        free(events);
1298 1201  }
1299 1202  
1300 1203  /*ARGSUSED*/
1301 1204  static void
1302 1205  dump_dnode(objset_t *os, uint64_t object, void *data, size_t size)
1303 1206  {
1304 1207  }
1305 1208  
1306 1209  static uint64_t
1307 1210  blkid2offset(const dnode_phys_t *dnp, const blkptr_t *bp,
1308 1211      const zbookmark_phys_t *zb)
1309 1212  {
1310 1213          if (dnp == NULL) {
1311 1214                  ASSERT(zb->zb_level < 0);
1312 1215                  if (zb->zb_object == 0)
1313 1216                          return (zb->zb_blkid);
1314 1217                  return (zb->zb_blkid * BP_GET_LSIZE(bp));
1315 1218          }
1316 1219  
1317 1220          ASSERT(zb->zb_level >= 0);
1318 1221  
1319 1222          return ((zb->zb_blkid <<
1320 1223              (zb->zb_level * (dnp->dn_indblkshift - SPA_BLKPTRSHIFT))) *
1321 1224              dnp->dn_datablkszsec << SPA_MINBLOCKSHIFT);
1322 1225  }
1323 1226  
1324 1227  static void
1325 1228  snprintf_blkptr_compact(char *blkbuf, size_t buflen, const blkptr_t *bp)
1326 1229  {
1327 1230          const dva_t *dva = bp->blk_dva;
1328 1231          int ndvas = dump_opt['d'] > 5 ? BP_GET_NDVAS(bp) : 1;
1329 1232  
1330 1233          if (dump_opt['b'] >= 6) {
1331 1234                  snprintf_blkptr(blkbuf, buflen, bp);
1332 1235                  return;
1333 1236          }
1334 1237  
1335 1238          if (BP_IS_EMBEDDED(bp)) {
1336 1239                  (void) sprintf(blkbuf,
1337 1240                      "EMBEDDED et=%u %llxL/%llxP B=%llu",
1338 1241                      (int)BPE_GET_ETYPE(bp),
1339 1242                      (u_longlong_t)BPE_GET_LSIZE(bp),
1340 1243                      (u_longlong_t)BPE_GET_PSIZE(bp),
1341 1244                      (u_longlong_t)bp->blk_birth);
1342 1245                  return;
1343 1246          }
1344 1247  
1345 1248          blkbuf[0] = '\0';
1346 1249          for (int i = 0; i < ndvas; i++)
1347 1250                  (void) snprintf(blkbuf + strlen(blkbuf),
1348 1251                      buflen - strlen(blkbuf), "%llu:%llx:%llx ",
1349 1252                      (u_longlong_t)DVA_GET_VDEV(&dva[i]),
1350 1253                      (u_longlong_t)DVA_GET_OFFSET(&dva[i]),
1351 1254                      (u_longlong_t)DVA_GET_ASIZE(&dva[i]));
1352 1255  
1353 1256          if (BP_IS_HOLE(bp)) {
1354 1257                  (void) snprintf(blkbuf + strlen(blkbuf),
1355 1258                      buflen - strlen(blkbuf),
1356 1259                      "%llxL B=%llu",
1357 1260                      (u_longlong_t)BP_GET_LSIZE(bp),
1358 1261                      (u_longlong_t)bp->blk_birth);
1359 1262          } else {
1360 1263                  (void) snprintf(blkbuf + strlen(blkbuf),
1361 1264                      buflen - strlen(blkbuf),
1362 1265                      "%llxL/%llxP F=%llu B=%llu/%llu",
1363 1266                      (u_longlong_t)BP_GET_LSIZE(bp),
1364 1267                      (u_longlong_t)BP_GET_PSIZE(bp),
1365 1268                      (u_longlong_t)BP_GET_FILL(bp),
1366 1269                      (u_longlong_t)bp->blk_birth,
1367 1270                      (u_longlong_t)BP_PHYSICAL_BIRTH(bp));
1368 1271          }
1369 1272  }
1370 1273  
1371 1274  static void
1372 1275  print_indirect(blkptr_t *bp, const zbookmark_phys_t *zb,
1373 1276      const dnode_phys_t *dnp)
1374 1277  {
1375 1278          char blkbuf[BP_SPRINTF_LEN];
1376 1279          int l;
1377 1280  
1378 1281          if (!BP_IS_EMBEDDED(bp)) {
1379 1282                  ASSERT3U(BP_GET_TYPE(bp), ==, dnp->dn_type);
1380 1283                  ASSERT3U(BP_GET_LEVEL(bp), ==, zb->zb_level);
1381 1284          }
1382 1285  
1383 1286          (void) printf("%16llx ", (u_longlong_t)blkid2offset(dnp, bp, zb));
1384 1287  
1385 1288          ASSERT(zb->zb_level >= 0);
1386 1289  
1387 1290          for (l = dnp->dn_nlevels - 1; l >= -1; l--) {
1388 1291                  if (l == zb->zb_level) {
1389 1292                          (void) printf("L%llx", (u_longlong_t)zb->zb_level);
1390 1293                  } else {
1391 1294                          (void) printf(" ");
1392 1295                  }
1393 1296          }
1394 1297  
1395 1298          snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp);
1396 1299          (void) printf("%s\n", blkbuf);
1397 1300  }
1398 1301  
1399 1302  static int
1400 1303  visit_indirect(spa_t *spa, const dnode_phys_t *dnp,
1401 1304      blkptr_t *bp, const zbookmark_phys_t *zb)
1402 1305  {
1403 1306          int err = 0;
1404 1307  
1405 1308          if (bp->blk_birth == 0)
1406 1309                  return (0);
1407 1310  
1408 1311          print_indirect(bp, zb, dnp);
1409 1312  
1410 1313          if (BP_GET_LEVEL(bp) > 0 && !BP_IS_HOLE(bp)) {
1411 1314                  arc_flags_t flags = ARC_FLAG_WAIT;
1412 1315                  int i;
1413 1316                  blkptr_t *cbp;
1414 1317                  int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT;
1415 1318                  arc_buf_t *buf;
1416 1319                  uint64_t fill = 0;
1417 1320  
1418 1321                  err = arc_read(NULL, spa, bp, arc_getbuf_func, &buf,
1419 1322                      ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL, &flags, zb);
1420 1323                  if (err)
1421 1324                          return (err);
1422 1325                  ASSERT(buf->b_data);
1423 1326  
1424 1327                  /* recursively visit blocks below this */
1425 1328                  cbp = buf->b_data;
1426 1329                  for (i = 0; i < epb; i++, cbp++) {
1427 1330                          zbookmark_phys_t czb;
1428 1331  
1429 1332                          SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object,
1430 1333                              zb->zb_level - 1,
1431 1334                              zb->zb_blkid * epb + i);
1432 1335                          err = visit_indirect(spa, dnp, cbp, &czb);
1433 1336                          if (err)
1434 1337                                  break;
1435 1338                          fill += BP_GET_FILL(cbp);
1436 1339                  }
1437 1340                  if (!err)
1438 1341                          ASSERT3U(fill, ==, BP_GET_FILL(bp));
1439 1342                  arc_buf_destroy(buf, &buf);
1440 1343          }
1441 1344  
1442 1345          return (err);
1443 1346  }
1444 1347  
1445 1348  /*ARGSUSED*/
1446 1349  static void
1447 1350  dump_indirect(dnode_t *dn)
1448 1351  {
1449 1352          dnode_phys_t *dnp = dn->dn_phys;
1450 1353          int j;
1451 1354          zbookmark_phys_t czb;
1452 1355  
1453 1356          (void) printf("Indirect blocks:\n");
1454 1357  
1455 1358          SET_BOOKMARK(&czb, dmu_objset_id(dn->dn_objset),
1456 1359              dn->dn_object, dnp->dn_nlevels - 1, 0);
1457 1360          for (j = 0; j < dnp->dn_nblkptr; j++) {
1458 1361                  czb.zb_blkid = j;
1459 1362                  (void) visit_indirect(dmu_objset_spa(dn->dn_objset), dnp,
1460 1363                      &dnp->dn_blkptr[j], &czb);
1461 1364          }
1462 1365  
1463 1366          (void) printf("\n");
1464 1367  }
1465 1368  
1466 1369  /*ARGSUSED*/
1467 1370  static void
1468 1371  dump_dsl_dir(objset_t *os, uint64_t object, void *data, size_t size)
1469 1372  {
1470 1373          dsl_dir_phys_t *dd = data;
1471 1374          time_t crtime;
1472 1375          char nice[32];
1473 1376  
1474 1377          /* make sure nicenum has enough space */
1475 1378          CTASSERT(sizeof (nice) >= NN_NUMBUF_SZ);
1476 1379  
1477 1380          if (dd == NULL)
1478 1381                  return;
1479 1382  
1480 1383          ASSERT3U(size, >=, sizeof (dsl_dir_phys_t));
1481 1384  
1482 1385          crtime = dd->dd_creation_time;
1483 1386          (void) printf("\t\tcreation_time = %s", ctime(&crtime));
1484 1387          (void) printf("\t\thead_dataset_obj = %llu\n",
1485 1388              (u_longlong_t)dd->dd_head_dataset_obj);
1486 1389          (void) printf("\t\tparent_dir_obj = %llu\n",
1487 1390              (u_longlong_t)dd->dd_parent_obj);
1488 1391          (void) printf("\t\torigin_obj = %llu\n",
1489 1392              (u_longlong_t)dd->dd_origin_obj);
1490 1393          (void) printf("\t\tchild_dir_zapobj = %llu\n",
1491 1394              (u_longlong_t)dd->dd_child_dir_zapobj);
1492 1395          zdb_nicenum(dd->dd_used_bytes, nice, sizeof (nice));
1493 1396          (void) printf("\t\tused_bytes = %s\n", nice);
1494 1397          zdb_nicenum(dd->dd_compressed_bytes, nice, sizeof (nice));
1495 1398          (void) printf("\t\tcompressed_bytes = %s\n", nice);
1496 1399          zdb_nicenum(dd->dd_uncompressed_bytes, nice, sizeof (nice));
1497 1400          (void) printf("\t\tuncompressed_bytes = %s\n", nice);
1498 1401          zdb_nicenum(dd->dd_quota, nice, sizeof (nice));
1499 1402          (void) printf("\t\tquota = %s\n", nice);
1500 1403          zdb_nicenum(dd->dd_reserved, nice, sizeof (nice));
1501 1404          (void) printf("\t\treserved = %s\n", nice);
1502 1405          (void) printf("\t\tprops_zapobj = %llu\n",
1503 1406              (u_longlong_t)dd->dd_props_zapobj);
1504 1407          (void) printf("\t\tdeleg_zapobj = %llu\n",
1505 1408              (u_longlong_t)dd->dd_deleg_zapobj);
1506 1409          (void) printf("\t\tflags = %llx\n",
1507 1410              (u_longlong_t)dd->dd_flags);
1508 1411  
1509 1412  #define DO(which) \
1510 1413          zdb_nicenum(dd->dd_used_breakdown[DD_USED_ ## which], nice, \
1511 1414              sizeof (nice)); \
1512 1415          (void) printf("\t\tused_breakdown[" #which "] = %s\n", nice)
1513 1416          DO(HEAD);
1514 1417          DO(SNAP);
1515 1418          DO(CHILD);
1516 1419          DO(CHILD_RSRV);
1517 1420          DO(REFRSRV);
1518 1421  #undef DO
1519 1422  }
1520 1423  
1521 1424  /*ARGSUSED*/
1522 1425  static void
1523 1426  dump_dsl_dataset(objset_t *os, uint64_t object, void *data, size_t size)
1524 1427  {
1525 1428          dsl_dataset_phys_t *ds = data;
1526 1429          time_t crtime;
1527 1430          char used[32], compressed[32], uncompressed[32], unique[32];
1528 1431          char blkbuf[BP_SPRINTF_LEN];
1529 1432  
1530 1433          /* make sure nicenum has enough space */
1531 1434          CTASSERT(sizeof (used) >= NN_NUMBUF_SZ);
1532 1435          CTASSERT(sizeof (compressed) >= NN_NUMBUF_SZ);
1533 1436          CTASSERT(sizeof (uncompressed) >= NN_NUMBUF_SZ);
1534 1437          CTASSERT(sizeof (unique) >= NN_NUMBUF_SZ);
1535 1438  
1536 1439          if (ds == NULL)
1537 1440                  return;
1538 1441  
1539 1442          ASSERT(size == sizeof (*ds));
1540 1443          crtime = ds->ds_creation_time;
1541 1444          zdb_nicenum(ds->ds_referenced_bytes, used, sizeof (used));
1542 1445          zdb_nicenum(ds->ds_compressed_bytes, compressed, sizeof (compressed));
1543 1446          zdb_nicenum(ds->ds_uncompressed_bytes, uncompressed,
1544 1447              sizeof (uncompressed));
1545 1448          zdb_nicenum(ds->ds_unique_bytes, unique, sizeof (unique));
1546 1449          snprintf_blkptr(blkbuf, sizeof (blkbuf), &ds->ds_bp);
1547 1450  
1548 1451          (void) printf("\t\tdir_obj = %llu\n",
1549 1452              (u_longlong_t)ds->ds_dir_obj);
1550 1453          (void) printf("\t\tprev_snap_obj = %llu\n",
1551 1454              (u_longlong_t)ds->ds_prev_snap_obj);
1552 1455          (void) printf("\t\tprev_snap_txg = %llu\n",
1553 1456              (u_longlong_t)ds->ds_prev_snap_txg);
1554 1457          (void) printf("\t\tnext_snap_obj = %llu\n",
1555 1458              (u_longlong_t)ds->ds_next_snap_obj);
1556 1459          (void) printf("\t\tsnapnames_zapobj = %llu\n",
1557 1460              (u_longlong_t)ds->ds_snapnames_zapobj);
1558 1461          (void) printf("\t\tnum_children = %llu\n",
1559 1462              (u_longlong_t)ds->ds_num_children);
1560 1463          (void) printf("\t\tuserrefs_obj = %llu\n",
1561 1464              (u_longlong_t)ds->ds_userrefs_obj);
1562 1465          (void) printf("\t\tcreation_time = %s", ctime(&crtime));
1563 1466          (void) printf("\t\tcreation_txg = %llu\n",
1564 1467              (u_longlong_t)ds->ds_creation_txg);
1565 1468          (void) printf("\t\tdeadlist_obj = %llu\n",
1566 1469              (u_longlong_t)ds->ds_deadlist_obj);
1567 1470          (void) printf("\t\tused_bytes = %s\n", used);
1568 1471          (void) printf("\t\tcompressed_bytes = %s\n", compressed);
1569 1472          (void) printf("\t\tuncompressed_bytes = %s\n", uncompressed);
1570 1473          (void) printf("\t\tunique = %s\n", unique);
1571 1474          (void) printf("\t\tfsid_guid = %llu\n",
1572 1475              (u_longlong_t)ds->ds_fsid_guid);
1573 1476          (void) printf("\t\tguid = %llu\n",
1574 1477              (u_longlong_t)ds->ds_guid);
1575 1478          (void) printf("\t\tflags = %llx\n",
1576 1479              (u_longlong_t)ds->ds_flags);
1577 1480          (void) printf("\t\tnext_clones_obj = %llu\n",
1578 1481              (u_longlong_t)ds->ds_next_clones_obj);
1579 1482          (void) printf("\t\tprops_obj = %llu\n",
1580 1483              (u_longlong_t)ds->ds_props_obj);
1581 1484          (void) printf("\t\tbp = %s\n", blkbuf);
1582 1485  }
1583 1486  
1584 1487  /* ARGSUSED */
1585 1488  static int
1586 1489  dump_bptree_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx)
1587 1490  {
1588 1491          char blkbuf[BP_SPRINTF_LEN];
1589 1492  
1590 1493          if (bp->blk_birth != 0) {
1591 1494                  snprintf_blkptr(blkbuf, sizeof (blkbuf), bp);
1592 1495                  (void) printf("\t%s\n", blkbuf);
1593 1496          }
1594 1497          return (0);
1595 1498  }
1596 1499  
1597 1500  static void
1598 1501  dump_bptree(objset_t *os, uint64_t obj, const char *name)
1599 1502  {
1600 1503          char bytes[32];
1601 1504          bptree_phys_t *bt;
1602 1505          dmu_buf_t *db;
1603 1506  
1604 1507          /* make sure nicenum has enough space */
1605 1508          CTASSERT(sizeof (bytes) >= NN_NUMBUF_SZ);
1606 1509  
1607 1510          if (dump_opt['d'] < 3)
1608 1511                  return;
1609 1512  
1610 1513          VERIFY3U(0, ==, dmu_bonus_hold(os, obj, FTAG, &db));
1611 1514          bt = db->db_data;
1612 1515          zdb_nicenum(bt->bt_bytes, bytes, sizeof (bytes));
1613 1516          (void) printf("\n    %s: %llu datasets, %s\n",
1614 1517              name, (unsigned long long)(bt->bt_end - bt->bt_begin), bytes);
1615 1518          dmu_buf_rele(db, FTAG);
1616 1519  
1617 1520          if (dump_opt['d'] < 5)
1618 1521                  return;
1619 1522  
1620 1523          (void) printf("\n");
1621 1524  
1622 1525          (void) bptree_iterate(os, obj, B_FALSE, dump_bptree_cb, NULL, NULL);
1623 1526  }
1624 1527  
1625 1528  /* ARGSUSED */
1626 1529  static int
1627 1530  dump_bpobj_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx)
1628 1531  {
1629 1532          char blkbuf[BP_SPRINTF_LEN];
1630 1533  
1631 1534          ASSERT(bp->blk_birth != 0);
1632 1535          snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp);
1633 1536          (void) printf("\t%s\n", blkbuf);
1634 1537          return (0);
1635 1538  }
1636 1539  
1637 1540  static void
1638 1541  dump_full_bpobj(bpobj_t *bpo, const char *name, int indent)
1639 1542  {
1640 1543          char bytes[32];
1641 1544          char comp[32];
1642 1545          char uncomp[32];
1643 1546  
1644 1547          /* make sure nicenum has enough space */
1645 1548          CTASSERT(sizeof (bytes) >= NN_NUMBUF_SZ);
1646 1549          CTASSERT(sizeof (comp) >= NN_NUMBUF_SZ);
1647 1550          CTASSERT(sizeof (uncomp) >= NN_NUMBUF_SZ);
1648 1551  
1649 1552          if (dump_opt['d'] < 3)
1650 1553                  return;
1651 1554  
1652 1555          zdb_nicenum(bpo->bpo_phys->bpo_bytes, bytes, sizeof (bytes));
1653 1556          if (bpo->bpo_havesubobj && bpo->bpo_phys->bpo_subobjs != 0) {
1654 1557                  zdb_nicenum(bpo->bpo_phys->bpo_comp, comp, sizeof (comp));
1655 1558                  zdb_nicenum(bpo->bpo_phys->bpo_uncomp, uncomp, sizeof (uncomp));
1656 1559                  (void) printf("    %*s: object %llu, %llu local blkptrs, "
1657 1560                      "%llu subobjs in object %llu, %s (%s/%s comp)\n",
1658 1561                      indent * 8, name,
1659 1562                      (u_longlong_t)bpo->bpo_object,
1660 1563                      (u_longlong_t)bpo->bpo_phys->bpo_num_blkptrs,
1661 1564                      (u_longlong_t)bpo->bpo_phys->bpo_num_subobjs,
1662 1565                      (u_longlong_t)bpo->bpo_phys->bpo_subobjs,
1663 1566                      bytes, comp, uncomp);
1664 1567  
1665 1568                  for (uint64_t i = 0; i < bpo->bpo_phys->bpo_num_subobjs; i++) {
1666 1569                          uint64_t subobj;
1667 1570                          bpobj_t subbpo;
1668 1571                          int error;
1669 1572                          VERIFY0(dmu_read(bpo->bpo_os,
1670 1573                              bpo->bpo_phys->bpo_subobjs,
1671 1574                              i * sizeof (subobj), sizeof (subobj), &subobj, 0));
1672 1575                          error = bpobj_open(&subbpo, bpo->bpo_os, subobj);
1673 1576                          if (error != 0) {
1674 1577                                  (void) printf("ERROR %u while trying to open "
1675 1578                                      "subobj id %llu\n",
1676 1579                                      error, (u_longlong_t)subobj);
1677 1580                                  continue;
1678 1581                          }
1679 1582                          dump_full_bpobj(&subbpo, "subobj", indent + 1);
1680 1583                          bpobj_close(&subbpo);
1681 1584                  }
1682 1585          } else {
1683 1586                  (void) printf("    %*s: object %llu, %llu blkptrs, %s\n",
1684 1587                      indent * 8, name,
1685 1588                      (u_longlong_t)bpo->bpo_object,
1686 1589                      (u_longlong_t)bpo->bpo_phys->bpo_num_blkptrs,
1687 1590                      bytes);
1688 1591          }
1689 1592  
1690 1593          if (dump_opt['d'] < 5)
1691 1594                  return;
1692 1595  
1693 1596  
1694 1597          if (indent == 0) {
1695 1598                  (void) bpobj_iterate_nofree(bpo, dump_bpobj_cb, NULL, NULL);
1696 1599                  (void) printf("\n");
1697 1600          }
1698 1601  }
1699 1602  
1700 1603  static void
1701 1604  dump_deadlist(dsl_deadlist_t *dl)
1702 1605  {
1703 1606          dsl_deadlist_entry_t *dle;
1704 1607          uint64_t unused;
1705 1608          char bytes[32];
1706 1609          char comp[32];
1707 1610          char uncomp[32];
1708 1611  
1709 1612          /* make sure nicenum has enough space */
1710 1613          CTASSERT(sizeof (bytes) >= NN_NUMBUF_SZ);
1711 1614          CTASSERT(sizeof (comp) >= NN_NUMBUF_SZ);
1712 1615          CTASSERT(sizeof (uncomp) >= NN_NUMBUF_SZ);
1713 1616  
1714 1617          if (dump_opt['d'] < 3)
1715 1618                  return;
1716 1619  
1717 1620          if (dl->dl_oldfmt) {
1718 1621                  dump_full_bpobj(&dl->dl_bpobj, "old-format deadlist", 0);
1719 1622                  return;
1720 1623          }
1721 1624  
1722 1625          zdb_nicenum(dl->dl_phys->dl_used, bytes, sizeof (bytes));
1723 1626          zdb_nicenum(dl->dl_phys->dl_comp, comp, sizeof (comp));
1724 1627          zdb_nicenum(dl->dl_phys->dl_uncomp, uncomp, sizeof (uncomp));
1725 1628          (void) printf("\n    Deadlist: %s (%s/%s comp)\n",
1726 1629              bytes, comp, uncomp);
1727 1630  
1728 1631          if (dump_opt['d'] < 4)
1729 1632                  return;
1730 1633  
1731 1634          (void) printf("\n");
1732 1635  
1733 1636          /* force the tree to be loaded */
1734 1637          dsl_deadlist_space_range(dl, 0, UINT64_MAX, &unused, &unused, &unused);
1735 1638  
1736 1639          for (dle = avl_first(&dl->dl_tree); dle;
1737 1640              dle = AVL_NEXT(&dl->dl_tree, dle)) {
1738 1641                  if (dump_opt['d'] >= 5) {
1739 1642                          char buf[128];
1740 1643                          (void) snprintf(buf, sizeof (buf),
1741 1644                              "mintxg %llu -> obj %llu",
1742 1645                              (longlong_t)dle->dle_mintxg,
1743 1646                              (longlong_t)dle->dle_bpobj.bpo_object);
1744 1647  
1745 1648                          dump_full_bpobj(&dle->dle_bpobj, buf, 0);
1746 1649                  } else {
1747 1650                          (void) printf("mintxg %llu -> obj %llu\n",
1748 1651                              (longlong_t)dle->dle_mintxg,
1749 1652                              (longlong_t)dle->dle_bpobj.bpo_object);
1750 1653  
1751 1654                  }
1752 1655          }
1753 1656  }
1754 1657  
1755 1658  static avl_tree_t idx_tree;
1756 1659  static avl_tree_t domain_tree;
1757 1660  static boolean_t fuid_table_loaded;
1758 1661  static objset_t *sa_os = NULL;
1759 1662  static sa_attr_type_t *sa_attr_table = NULL;
1760 1663  
1761 1664  static int
1762 1665  open_objset(const char *path, dmu_objset_type_t type, void *tag, objset_t **osp)
1763 1666  {
1764 1667          int err;
1765 1668          uint64_t sa_attrs = 0;
1766 1669          uint64_t version = 0;
1767 1670  
1768 1671          VERIFY3P(sa_os, ==, NULL);
1769 1672          err = dmu_objset_own(path, type, B_TRUE, tag, osp);
1770 1673          if (err != 0) {
1771 1674                  (void) fprintf(stderr, "failed to own dataset '%s': %s\n", path,
1772 1675                      strerror(err));
1773 1676                  return (err);
1774 1677          }
1775 1678  
1776 1679          if (dmu_objset_type(*osp) == DMU_OST_ZFS) {
1777 1680                  (void) zap_lookup(*osp, MASTER_NODE_OBJ, ZPL_VERSION_STR,
1778 1681                      8, 1, &version);
1779 1682                  if (version >= ZPL_VERSION_SA) {
1780 1683                          (void) zap_lookup(*osp, MASTER_NODE_OBJ, ZFS_SA_ATTRS,
1781 1684                              8, 1, &sa_attrs);
1782 1685                  }
1783 1686                  err = sa_setup(*osp, sa_attrs, zfs_attr_table, ZPL_END,
1784 1687                      &sa_attr_table);
1785 1688                  if (err != 0) {
1786 1689                          (void) fprintf(stderr, "sa_setup failed: %s\n",
1787 1690                              strerror(err));
1788 1691                          dmu_objset_disown(*osp, tag);
1789 1692                          *osp = NULL;
1790 1693                  }
1791 1694          }
1792 1695          sa_os = *osp;
1793 1696  
1794 1697          return (0);
1795 1698  }
1796 1699  
1797 1700  static void
1798 1701  close_objset(objset_t *os, void *tag)
1799 1702  {
1800 1703          VERIFY3P(os, ==, sa_os);
1801 1704          if (os->os_sa != NULL)
1802 1705                  sa_tear_down(os);
1803 1706          dmu_objset_disown(os, tag);
1804 1707          sa_attr_table = NULL;
1805 1708          sa_os = NULL;
1806 1709  }
1807 1710  
1808 1711  static void
1809 1712  fuid_table_destroy()
1810 1713  {
1811 1714          if (fuid_table_loaded) {
1812 1715                  zfs_fuid_table_destroy(&idx_tree, &domain_tree);
1813 1716                  fuid_table_loaded = B_FALSE;
1814 1717          }
1815 1718  }
1816 1719  
1817 1720  /*
1818 1721   * print uid or gid information.
1819 1722   * For normal POSIX id just the id is printed in decimal format.
1820 1723   * For CIFS files with FUID the fuid is printed in hex followed by
1821 1724   * the domain-rid string.
1822 1725   */
1823 1726  static void
1824 1727  print_idstr(uint64_t id, const char *id_type)
1825 1728  {
1826 1729          if (FUID_INDEX(id)) {
1827 1730                  char *domain;
1828 1731  
1829 1732                  domain = zfs_fuid_idx_domain(&idx_tree, FUID_INDEX(id));
1830 1733                  (void) printf("\t%s     %llx [%s-%d]\n", id_type,
1831 1734                      (u_longlong_t)id, domain, (int)FUID_RID(id));
1832 1735          } else {
1833 1736                  (void) printf("\t%s     %llu\n", id_type, (u_longlong_t)id);
1834 1737          }
1835 1738  
1836 1739  }
1837 1740  
1838 1741  static void
1839 1742  dump_uidgid(objset_t *os, uint64_t uid, uint64_t gid)
1840 1743  {
1841 1744          uint32_t uid_idx, gid_idx;
1842 1745  
1843 1746          uid_idx = FUID_INDEX(uid);
1844 1747          gid_idx = FUID_INDEX(gid);
1845 1748  
1846 1749          /* Load domain table, if not already loaded */
1847 1750          if (!fuid_table_loaded && (uid_idx || gid_idx)) {
1848 1751                  uint64_t fuid_obj;
1849 1752  
1850 1753                  /* first find the fuid object.  It lives in the master node */
1851 1754                  VERIFY(zap_lookup(os, MASTER_NODE_OBJ, ZFS_FUID_TABLES,
1852 1755                      8, 1, &fuid_obj) == 0);
1853 1756                  zfs_fuid_avl_tree_create(&idx_tree, &domain_tree);
1854 1757                  (void) zfs_fuid_table_load(os, fuid_obj,
1855 1758                      &idx_tree, &domain_tree);
1856 1759                  fuid_table_loaded = B_TRUE;
1857 1760          }
1858 1761  
1859 1762          print_idstr(uid, "uid");
1860 1763          print_idstr(gid, "gid");
1861 1764  }
1862 1765  
1863 1766  /*ARGSUSED*/
1864 1767  static void
1865 1768  dump_znode(objset_t *os, uint64_t object, void *data, size_t size)
1866 1769  {
1867 1770          char path[MAXPATHLEN * 2];      /* allow for xattr and failure prefix */
1868 1771          sa_handle_t *hdl;
1869 1772          uint64_t xattr, rdev, gen;
1870 1773          uint64_t uid, gid, mode, fsize, parent, links;
1871 1774          uint64_t pflags;
1872 1775          uint64_t acctm[2], modtm[2], chgtm[2], crtm[2];
1873 1776          time_t z_crtime, z_atime, z_mtime, z_ctime;
1874 1777          sa_bulk_attr_t bulk[12];
1875 1778          int idx = 0;
1876 1779          int error;
1877 1780  
1878 1781          VERIFY3P(os, ==, sa_os);
1879 1782          if (sa_handle_get(os, object, NULL, SA_HDL_PRIVATE, &hdl)) {
1880 1783                  (void) printf("Failed to get handle for SA znode\n");
1881 1784                  return;
1882 1785          }
1883 1786  
1884 1787          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_UID], NULL, &uid, 8);
1885 1788          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_GID], NULL, &gid, 8);
1886 1789          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_LINKS], NULL,
1887 1790              &links, 8);
1888 1791          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_GEN], NULL, &gen, 8);
1889 1792          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_MODE], NULL,
1890 1793              &mode, 8);
1891 1794          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_PARENT],
1892 1795              NULL, &parent, 8);
1893 1796          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_SIZE], NULL,
1894 1797              &fsize, 8);
1895 1798          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_ATIME], NULL,
1896 1799              acctm, 16);
1897 1800          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_MTIME], NULL,
1898 1801              modtm, 16);
1899 1802          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_CRTIME], NULL,
1900 1803              crtm, 16);
1901 1804          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_CTIME], NULL,
1902 1805              chgtm, 16);
1903 1806          SA_ADD_BULK_ATTR(bulk, idx, sa_attr_table[ZPL_FLAGS], NULL,
1904 1807              &pflags, 8);
1905 1808  
1906 1809          if (sa_bulk_lookup(hdl, bulk, idx)) {
1907 1810                  (void) sa_handle_destroy(hdl);
1908 1811                  return;
1909 1812          }
1910 1813  
1911 1814          z_crtime = (time_t)crtm[0];
1912 1815          z_atime = (time_t)acctm[0];
1913 1816          z_mtime = (time_t)modtm[0];
1914 1817          z_ctime = (time_t)chgtm[0];
1915 1818  
1916 1819          if (dump_opt['d'] > 4) {
1917 1820                  error = zfs_obj_to_path(os, object, path, sizeof (path));
1918 1821                  if (error != 0) {
1919 1822                          (void) snprintf(path, sizeof (path),
1920 1823                              "\?\?\?<object#%llu>", (u_longlong_t)object);
1921 1824                  }
1922 1825                  (void) printf("\tpath   %s\n", path);
1923 1826          }
1924 1827          dump_uidgid(os, uid, gid);
1925 1828          (void) printf("\tatime  %s", ctime(&z_atime));
1926 1829          (void) printf("\tmtime  %s", ctime(&z_mtime));
1927 1830          (void) printf("\tctime  %s", ctime(&z_ctime));
1928 1831          (void) printf("\tcrtime %s", ctime(&z_crtime));
1929 1832          (void) printf("\tgen    %llu\n", (u_longlong_t)gen);
1930 1833          (void) printf("\tmode   %llo\n", (u_longlong_t)mode);
1931 1834          (void) printf("\tsize   %llu\n", (u_longlong_t)fsize);
1932 1835          (void) printf("\tparent %llu\n", (u_longlong_t)parent);
1933 1836          (void) printf("\tlinks  %llu\n", (u_longlong_t)links);
1934 1837          (void) printf("\tpflags %llx\n", (u_longlong_t)pflags);
1935 1838          if (sa_lookup(hdl, sa_attr_table[ZPL_XATTR], &xattr,
1936 1839              sizeof (uint64_t)) == 0)
1937 1840                  (void) printf("\txattr  %llu\n", (u_longlong_t)xattr);
1938 1841          if (sa_lookup(hdl, sa_attr_table[ZPL_RDEV], &rdev,
1939 1842              sizeof (uint64_t)) == 0)
1940 1843                  (void) printf("\trdev   0x%016llx\n", (u_longlong_t)rdev);
1941 1844          sa_handle_destroy(hdl);
1942 1845  }
1943 1846  
1944 1847  /*ARGSUSED*/
1945 1848  static void
1946 1849  dump_acl(objset_t *os, uint64_t object, void *data, size_t size)
1947 1850  {
1948 1851  }
1949 1852  
1950 1853  /*ARGSUSED*/
1951 1854  static void
1952 1855  dump_dmu_objset(objset_t *os, uint64_t object, void *data, size_t size)
1953 1856  {
1954 1857  }
1955 1858  
1956 1859  static object_viewer_t *object_viewer[DMU_OT_NUMTYPES + 1] = {
1957 1860          dump_none,              /* unallocated                  */
1958 1861          dump_zap,               /* object directory             */
1959 1862          dump_uint64,            /* object array                 */
1960 1863          dump_none,              /* packed nvlist                */
1961 1864          dump_packed_nvlist,     /* packed nvlist size           */
1962 1865          dump_none,              /* bpobj                        */
1963 1866          dump_bpobj,             /* bpobj header                 */
1964 1867          dump_none,              /* SPA space map header         */
1965 1868          dump_none,              /* SPA space map                */
1966 1869          dump_none,              /* ZIL intent log               */
1967 1870          dump_dnode,             /* DMU dnode                    */
1968 1871          dump_dmu_objset,        /* DMU objset                   */
1969 1872          dump_dsl_dir,           /* DSL directory                */
1970 1873          dump_zap,               /* DSL directory child map      */
1971 1874          dump_zap,               /* DSL dataset snap map         */
1972 1875          dump_zap,               /* DSL props                    */
1973 1876          dump_dsl_dataset,       /* DSL dataset                  */
1974 1877          dump_znode,             /* ZFS znode                    */
1975 1878          dump_acl,               /* ZFS V0 ACL                   */
1976 1879          dump_uint8,             /* ZFS plain file               */
1977 1880          dump_zpldir,            /* ZFS directory                */
1978 1881          dump_zap,               /* ZFS master node              */
1979 1882          dump_zap,               /* ZFS delete queue             */
1980 1883          dump_uint8,             /* zvol object                  */
1981 1884          dump_zap,               /* zvol prop                    */
1982 1885          dump_uint8,             /* other uint8[]                */
1983 1886          dump_uint64,            /* other uint64[]               */
1984 1887          dump_zap,               /* other ZAP                    */
1985 1888          dump_zap,               /* persistent error log         */
1986 1889          dump_uint8,             /* SPA history                  */
1987 1890          dump_history_offsets,   /* SPA history offsets          */
1988 1891          dump_zap,               /* Pool properties              */
1989 1892          dump_zap,               /* DSL permissions              */
1990 1893          dump_acl,               /* ZFS ACL                      */
1991 1894          dump_uint8,             /* ZFS SYSACL                   */
1992 1895          dump_none,              /* FUID nvlist                  */
1993 1896          dump_packed_nvlist,     /* FUID nvlist size             */
1994 1897          dump_zap,               /* DSL dataset next clones      */
1995 1898          dump_zap,               /* DSL scrub queue              */
1996 1899          dump_zap,               /* ZFS user/group used          */
1997 1900          dump_zap,               /* ZFS user/group quota         */
1998 1901          dump_zap,               /* snapshot refcount tags       */
1999 1902          dump_ddt_zap,           /* DDT ZAP object               */
2000 1903          dump_zap,               /* DDT statistics               */
2001 1904          dump_znode,             /* SA object                    */
2002 1905          dump_zap,               /* SA Master Node               */
2003 1906          dump_sa_attrs,          /* SA attribute registration    */
2004 1907          dump_sa_layouts,        /* SA attribute layouts         */
2005 1908          dump_zap,               /* DSL scrub translations       */
2006 1909          dump_none,              /* fake dedup BP                */
2007 1910          dump_zap,               /* deadlist                     */
2008 1911          dump_none,              /* deadlist hdr                 */
2009 1912          dump_zap,               /* dsl clones                   */
2010 1913          dump_bpobj_subobjs,     /* bpobj subobjs                */
2011 1914          dump_unknown,           /* Unknown type, must be last   */
2012 1915  };
2013 1916  
2014 1917  static void
2015 1918  dump_object(objset_t *os, uint64_t object, int verbosity, int *print_header)
2016 1919  {
2017 1920          dmu_buf_t *db = NULL;
2018 1921          dmu_object_info_t doi;
2019 1922          dnode_t *dn;
2020 1923          void *bonus = NULL;
2021 1924          size_t bsize = 0;
2022 1925          char iblk[32], dblk[32], lsize[32], asize[32], fill[32];
2023 1926          char bonus_size[32];
2024 1927          char aux[50];
2025 1928          int error;
2026 1929  
2027 1930          /* make sure nicenum has enough space */
2028 1931          CTASSERT(sizeof (iblk) >= NN_NUMBUF_SZ);
2029 1932          CTASSERT(sizeof (dblk) >= NN_NUMBUF_SZ);
2030 1933          CTASSERT(sizeof (lsize) >= NN_NUMBUF_SZ);
2031 1934          CTASSERT(sizeof (asize) >= NN_NUMBUF_SZ);
2032 1935          CTASSERT(sizeof (bonus_size) >= NN_NUMBUF_SZ);
2033 1936  
2034 1937          if (*print_header) {
2035 1938                  (void) printf("\n%10s  %3s  %5s  %5s  %5s  %5s  %6s  %s\n",
2036 1939                      "Object", "lvl", "iblk", "dblk", "dsize", "lsize",
2037 1940                      "%full", "type");
2038 1941                  *print_header = 0;
2039 1942          }
2040 1943  
2041 1944          if (object == 0) {
2042 1945                  dn = DMU_META_DNODE(os);
2043 1946          } else {
2044 1947                  error = dmu_bonus_hold(os, object, FTAG, &db);
2045 1948                  if (error)
2046 1949                          fatal("dmu_bonus_hold(%llu) failed, errno %u",
2047 1950                              object, error);
2048 1951                  bonus = db->db_data;
2049 1952                  bsize = db->db_size;
2050 1953                  dn = DB_DNODE((dmu_buf_impl_t *)db);
2051 1954          }
2052 1955          dmu_object_info_from_dnode(dn, &doi);
2053 1956  
2054 1957          zdb_nicenum(doi.doi_metadata_block_size, iblk, sizeof (iblk));
2055 1958          zdb_nicenum(doi.doi_data_block_size, dblk, sizeof (dblk));
2056 1959          zdb_nicenum(doi.doi_max_offset, lsize, sizeof (lsize));
2057 1960          zdb_nicenum(doi.doi_physical_blocks_512 << 9, asize, sizeof (asize));
2058 1961          zdb_nicenum(doi.doi_bonus_size, bonus_size, sizeof (bonus_size));
2059 1962          (void) sprintf(fill, "%6.2f", 100.0 * doi.doi_fill_count *
2060 1963              doi.doi_data_block_size / (object == 0 ? DNODES_PER_BLOCK : 1) /
2061 1964              doi.doi_max_offset);
2062 1965  
2063 1966          aux[0] = '\0';
2064 1967  
2065 1968          if (doi.doi_checksum != ZIO_CHECKSUM_INHERIT || verbosity >= 6) {
2066 1969                  (void) snprintf(aux + strlen(aux), sizeof (aux), " (K=%s)",
2067 1970                      ZDB_CHECKSUM_NAME(doi.doi_checksum));
2068 1971          }
2069 1972  
2070 1973          if (doi.doi_compress != ZIO_COMPRESS_INHERIT || verbosity >= 6) {
2071 1974                  (void) snprintf(aux + strlen(aux), sizeof (aux), " (Z=%s)",
2072 1975                      ZDB_COMPRESS_NAME(doi.doi_compress));
2073 1976          }
2074 1977  
2075 1978          (void) printf("%10lld  %3u  %5s  %5s  %5s  %5s  %6s  %s%s\n",
2076 1979              (u_longlong_t)object, doi.doi_indirection, iblk, dblk,
2077 1980              asize, lsize, fill, ZDB_OT_NAME(doi.doi_type), aux);
2078 1981  
2079 1982          if (doi.doi_bonus_type != DMU_OT_NONE && verbosity > 3) {
2080 1983                  (void) printf("%10s  %3s  %5s  %5s  %5s  %5s  %6s  %s\n",
2081 1984                      "", "", "", "", "", bonus_size, "bonus",
2082 1985                      ZDB_OT_NAME(doi.doi_bonus_type));
2083 1986          }
2084 1987  
2085 1988          if (verbosity >= 4) {
2086 1989                  (void) printf("\tdnode flags: %s%s%s\n",
2087 1990                      (dn->dn_phys->dn_flags & DNODE_FLAG_USED_BYTES) ?
2088 1991                      "USED_BYTES " : "",
2089 1992                      (dn->dn_phys->dn_flags & DNODE_FLAG_USERUSED_ACCOUNTED) ?
2090 1993                      "USERUSED_ACCOUNTED " : "",
2091 1994                      (dn->dn_phys->dn_flags & DNODE_FLAG_SPILL_BLKPTR) ?
2092 1995                      "SPILL_BLKPTR" : "");
2093 1996                  (void) printf("\tdnode maxblkid: %llu\n",
2094 1997                      (longlong_t)dn->dn_phys->dn_maxblkid);
2095 1998  
2096 1999                  object_viewer[ZDB_OT_TYPE(doi.doi_bonus_type)](os, object,
2097 2000                      bonus, bsize);
2098 2001                  object_viewer[ZDB_OT_TYPE(doi.doi_type)](os, object, NULL, 0);
2099 2002                  *print_header = 1;
2100 2003          }
2101 2004  
2102 2005          if (verbosity >= 5)
2103 2006                  dump_indirect(dn);
2104 2007  
2105 2008          if (verbosity >= 5) {
2106 2009                  /*
2107 2010                   * Report the list of segments that comprise the object.
2108 2011                   */
2109 2012                  uint64_t start = 0;
2110 2013                  uint64_t end;
2111 2014                  uint64_t blkfill = 1;
2112 2015                  int minlvl = 1;
2113 2016  
2114 2017                  if (dn->dn_type == DMU_OT_DNODE) {
2115 2018                          minlvl = 0;
2116 2019                          blkfill = DNODES_PER_BLOCK;
2117 2020                  }
2118 2021  
2119 2022                  for (;;) {
2120 2023                          char segsize[32];
2121 2024                          /* make sure nicenum has enough space */
2122 2025                          CTASSERT(sizeof (segsize) >= NN_NUMBUF_SZ);
2123 2026                          error = dnode_next_offset(dn,
2124 2027                              0, &start, minlvl, blkfill, 0);
2125 2028                          if (error)
2126 2029                                  break;
2127 2030                          end = start;
2128 2031                          error = dnode_next_offset(dn,
2129 2032                              DNODE_FIND_HOLE, &end, minlvl, blkfill, 0);
2130 2033                          zdb_nicenum(end - start, segsize, sizeof (segsize));
2131 2034                          (void) printf("\t\tsegment [%016llx, %016llx)"
2132 2035                              " size %5s\n", (u_longlong_t)start,
2133 2036                              (u_longlong_t)end, segsize);
2134 2037                          if (error)
2135 2038                                  break;
2136 2039                          start = end;
2137 2040                  }
2138 2041          }
2139 2042  
2140 2043          if (db != NULL)
2141 2044                  dmu_buf_rele(db, FTAG);
2142 2045  }
2143 2046  
2144 2047  static const char *objset_types[DMU_OST_NUMTYPES] = {
2145 2048          "NONE", "META", "ZPL", "ZVOL", "OTHER", "ANY" };
2146 2049  
2147 2050  static void
2148 2051  dump_dir(objset_t *os)
2149 2052  {
2150 2053          dmu_objset_stats_t dds;
2151 2054          uint64_t object, object_count;
2152 2055          uint64_t refdbytes, usedobjs, scratch;
2153 2056          char numbuf[32];
2154 2057          char blkbuf[BP_SPRINTF_LEN + 20];
2155 2058          char osname[ZFS_MAX_DATASET_NAME_LEN];
2156 2059          const char *type = "UNKNOWN";
2157 2060          int verbosity = dump_opt['d'];
2158 2061          int print_header = 1;
2159 2062          unsigned i;
2160 2063          int error;
2161 2064  
2162 2065          /* make sure nicenum has enough space */
2163 2066          CTASSERT(sizeof (numbuf) >= NN_NUMBUF_SZ);
2164 2067  
2165 2068          dsl_pool_config_enter(dmu_objset_pool(os), FTAG);
2166 2069          dmu_objset_fast_stat(os, &dds);
2167 2070          dsl_pool_config_exit(dmu_objset_pool(os), FTAG);
2168 2071  
2169 2072          if (dds.dds_type < DMU_OST_NUMTYPES)
2170 2073                  type = objset_types[dds.dds_type];
2171 2074  
2172 2075          if (dds.dds_type == DMU_OST_META) {
2173 2076                  dds.dds_creation_txg = TXG_INITIAL;
2174 2077                  usedobjs = BP_GET_FILL(os->os_rootbp);
2175 2078                  refdbytes = dsl_dir_phys(os->os_spa->spa_dsl_pool->dp_mos_dir)->
2176 2079                      dd_used_bytes;
2177 2080          } else {
2178 2081                  dmu_objset_space(os, &refdbytes, &scratch, &usedobjs, &scratch);
2179 2082          }
2180 2083  
2181 2084          ASSERT3U(usedobjs, ==, BP_GET_FILL(os->os_rootbp));
2182 2085  
2183 2086          zdb_nicenum(refdbytes, numbuf, sizeof (numbuf));
2184 2087  
2185 2088          if (verbosity >= 4) {
2186 2089                  (void) snprintf(blkbuf, sizeof (blkbuf), ", rootbp ");
2187 2090                  (void) snprintf_blkptr(blkbuf + strlen(blkbuf),
2188 2091                      sizeof (blkbuf) - strlen(blkbuf), os->os_rootbp);
2189 2092          } else {
2190 2093                  blkbuf[0] = '\0';
2191 2094          }
2192 2095  
2193 2096          dmu_objset_name(os, osname);
2194 2097  
2195 2098          (void) printf("Dataset %s [%s], ID %llu, cr_txg %llu, "
2196 2099              "%s, %llu objects%s\n",
2197 2100              osname, type, (u_longlong_t)dmu_objset_id(os),
2198 2101              (u_longlong_t)dds.dds_creation_txg,
2199 2102              numbuf, (u_longlong_t)usedobjs, blkbuf);
2200 2103  
2201 2104          if (zopt_objects != 0) {
  
  
2202 2105                  for (i = 0; i < zopt_objects; i++)
2203 2106                          dump_object(os, zopt_object[i], verbosity,
2204 2107                              &print_header);
2205 2108                  (void) printf("\n");
2206 2109                  return;
2207 2110          }
2208 2111  
2209 2112          if (dump_opt['i'] != 0 || verbosity >= 2)
2210 2113                  dump_intent_log(dmu_objset_zil(os));
2211 2114  
2212      -        if (dmu_objset_ds(os) != NULL) {
2213      -                dsl_dataset_t *ds = dmu_objset_ds(os);
2214      -                dump_deadlist(&ds->ds_deadlist);
     2115 +        if (dmu_objset_ds(os) != NULL)
     2116 +                dump_deadlist(&dmu_objset_ds(os)->ds_deadlist);
2215 2117  
2216      -                if (dsl_dataset_remap_deadlist_exists(ds)) {
2217      -                        (void) printf("ds_remap_deadlist:\n");
2218      -                        dump_deadlist(&ds->ds_remap_deadlist);
2219      -                }
2220      -        }
2221      -
2222 2118          if (verbosity < 2)
2223 2119                  return;
2224 2120  
2225 2121          if (BP_IS_HOLE(os->os_rootbp))
2226 2122                  return;
2227 2123  
2228 2124          dump_object(os, 0, verbosity, &print_header);
2229 2125          object_count = 0;
2230 2126          if (DMU_USERUSED_DNODE(os) != NULL &&
2231 2127              DMU_USERUSED_DNODE(os)->dn_type != 0) {
2232 2128                  dump_object(os, DMU_USERUSED_OBJECT, verbosity, &print_header);
2233 2129                  dump_object(os, DMU_GROUPUSED_OBJECT, verbosity, &print_header);
2234 2130          }
2235 2131  
2236 2132          object = 0;
2237 2133          while ((error = dmu_object_next(os, &object, B_FALSE, 0)) == 0) {
2238 2134                  dump_object(os, object, verbosity, &print_header);
2239 2135                  object_count++;
2240 2136          }
2241 2137  
2242 2138          ASSERT3U(object_count, ==, usedobjs);
2243 2139  
2244 2140          (void) printf("\n");
2245 2141  
2246 2142          if (error != ESRCH) {
2247 2143                  (void) fprintf(stderr, "dmu_object_next() = %d\n", error);
2248 2144                  abort();
2249 2145          }
2250 2146  }
2251 2147  
2252 2148  static void
2253 2149  dump_uberblock(uberblock_t *ub, const char *header, const char *footer)
2254 2150  {
2255 2151          time_t timestamp = ub->ub_timestamp;
2256 2152  
2257 2153          (void) printf("%s", header ? header : "");
2258 2154          (void) printf("\tmagic = %016llx\n", (u_longlong_t)ub->ub_magic);
2259 2155          (void) printf("\tversion = %llu\n", (u_longlong_t)ub->ub_version);
2260 2156          (void) printf("\ttxg = %llu\n", (u_longlong_t)ub->ub_txg);
2261 2157          (void) printf("\tguid_sum = %llu\n", (u_longlong_t)ub->ub_guid_sum);
2262 2158          (void) printf("\ttimestamp = %llu UTC = %s",
2263 2159              (u_longlong_t)ub->ub_timestamp, asctime(localtime(&timestamp)));
2264 2160          if (dump_opt['u'] >= 3) {
2265 2161                  char blkbuf[BP_SPRINTF_LEN];
2266 2162                  snprintf_blkptr(blkbuf, sizeof (blkbuf), &ub->ub_rootbp);
2267 2163                  (void) printf("\trootbp = %s\n", blkbuf);
2268 2164          }
2269 2165          (void) printf("%s", footer ? footer : "");
2270 2166  }
2271 2167  
2272 2168  static void
2273 2169  dump_config(spa_t *spa)
2274 2170  {
2275 2171          dmu_buf_t *db;
2276 2172          size_t nvsize = 0;
2277 2173          int error = 0;
2278 2174  
2279 2175  
2280 2176          error = dmu_bonus_hold(spa->spa_meta_objset,
2281 2177              spa->spa_config_object, FTAG, &db);
2282 2178  
2283 2179          if (error == 0) {
2284 2180                  nvsize = *(uint64_t *)db->db_data;
2285 2181                  dmu_buf_rele(db, FTAG);
2286 2182  
2287 2183                  (void) printf("\nMOS Configuration:\n");
2288 2184                  dump_packed_nvlist(spa->spa_meta_objset,
2289 2185                      spa->spa_config_object, (void *)&nvsize, 1);
2290 2186          } else {
2291 2187                  (void) fprintf(stderr, "dmu_bonus_hold(%llu) failed, errno %d",
2292 2188                      (u_longlong_t)spa->spa_config_object, error);
2293 2189          }
2294 2190  }
2295 2191  
2296 2192  static void
2297 2193  dump_cachefile(const char *cachefile)
2298 2194  {
2299 2195          int fd;
2300 2196          struct stat64 statbuf;
2301 2197          char *buf;
2302 2198          nvlist_t *config;
2303 2199  
2304 2200          if ((fd = open64(cachefile, O_RDONLY)) < 0) {
2305 2201                  (void) printf("cannot open '%s': %s\n", cachefile,
2306 2202                      strerror(errno));
2307 2203                  exit(1);
2308 2204          }
2309 2205  
2310 2206          if (fstat64(fd, &statbuf) != 0) {
2311 2207                  (void) printf("failed to stat '%s': %s\n", cachefile,
2312 2208                      strerror(errno));
2313 2209                  exit(1);
2314 2210          }
2315 2211  
2316 2212          if ((buf = malloc(statbuf.st_size)) == NULL) {
2317 2213                  (void) fprintf(stderr, "failed to allocate %llu bytes\n",
2318 2214                      (u_longlong_t)statbuf.st_size);
2319 2215                  exit(1);
2320 2216          }
2321 2217  
2322 2218          if (read(fd, buf, statbuf.st_size) != statbuf.st_size) {
2323 2219                  (void) fprintf(stderr, "failed to read %llu bytes\n",
2324 2220                      (u_longlong_t)statbuf.st_size);
2325 2221                  exit(1);
2326 2222          }
2327 2223  
2328 2224          (void) close(fd);
2329 2225  
2330 2226          if (nvlist_unpack(buf, statbuf.st_size, &config, 0) != 0) {
2331 2227                  (void) fprintf(stderr, "failed to unpack nvlist\n");
2332 2228                  exit(1);
2333 2229          }
2334 2230  
2335 2231          free(buf);
2336 2232  
2337 2233          dump_nvlist(config, 0);
2338 2234  
2339 2235          nvlist_free(config);
2340 2236  }
2341 2237  
2342 2238  #define ZDB_MAX_UB_HEADER_SIZE 32
2343 2239  
2344 2240  static void
2345 2241  dump_label_uberblocks(vdev_label_t *lbl, uint64_t ashift)
2346 2242  {
2347 2243          vdev_t vd;
2348 2244          vdev_t *vdp = &vd;
2349 2245          char header[ZDB_MAX_UB_HEADER_SIZE];
2350 2246  
2351 2247          vd.vdev_ashift = ashift;
2352 2248          vdp->vdev_top = vdp;
2353 2249  
2354 2250          for (int i = 0; i < VDEV_UBERBLOCK_COUNT(vdp); i++) {
2355 2251                  uint64_t uoff = VDEV_UBERBLOCK_OFFSET(vdp, i);
2356 2252                  uberblock_t *ub = (void *)((char *)lbl + uoff);
2357 2253  
2358 2254                  if (uberblock_verify(ub))
2359 2255                          continue;
2360 2256                  (void) snprintf(header, ZDB_MAX_UB_HEADER_SIZE,
2361 2257                      "Uberblock[%d]\n", i);
2362 2258                  dump_uberblock(ub, header, "");
2363 2259          }
2364 2260  }
2365 2261  
2366 2262  static char curpath[PATH_MAX];
2367 2263  
2368 2264  /*
2369 2265   * Iterate through the path components, recursively passing
2370 2266   * current one's obj and remaining path until we find the obj
2371 2267   * for the last one.
2372 2268   */
2373 2269  static int
2374 2270  dump_path_impl(objset_t *os, uint64_t obj, char *name)
2375 2271  {
2376 2272          int err;
2377 2273          int header = 1;
2378 2274          uint64_t child_obj;
2379 2275          char *s;
2380 2276          dmu_buf_t *db;
2381 2277          dmu_object_info_t doi;
2382 2278  
2383 2279          if ((s = strchr(name, '/')) != NULL)
2384 2280                  *s = '\0';
2385 2281          err = zap_lookup(os, obj, name, 8, 1, &child_obj);
2386 2282  
2387 2283          (void) strlcat(curpath, name, sizeof (curpath));
2388 2284  
2389 2285          if (err != 0) {
2390 2286                  (void) fprintf(stderr, "failed to lookup %s: %s\n",
2391 2287                      curpath, strerror(err));
2392 2288                  return (err);
2393 2289          }
2394 2290  
2395 2291          child_obj = ZFS_DIRENT_OBJ(child_obj);
2396 2292          err = sa_buf_hold(os, child_obj, FTAG, &db);
2397 2293          if (err != 0) {
2398 2294                  (void) fprintf(stderr,
2399 2295                      "failed to get SA dbuf for obj %llu: %s\n",
2400 2296                      (u_longlong_t)child_obj, strerror(err));
2401 2297                  return (EINVAL);
2402 2298          }
2403 2299          dmu_object_info_from_db(db, &doi);
2404 2300          sa_buf_rele(db, FTAG);
2405 2301  
2406 2302          if (doi.doi_bonus_type != DMU_OT_SA &&
2407 2303              doi.doi_bonus_type != DMU_OT_ZNODE) {
2408 2304                  (void) fprintf(stderr, "invalid bonus type %d for obj %llu\n",
2409 2305                      doi.doi_bonus_type, (u_longlong_t)child_obj);
2410 2306                  return (EINVAL);
2411 2307          }
2412 2308  
2413 2309          if (dump_opt['v'] > 6) {
2414 2310                  (void) printf("obj=%llu %s type=%d bonustype=%d\n",
2415 2311                      (u_longlong_t)child_obj, curpath, doi.doi_type,
2416 2312                      doi.doi_bonus_type);
2417 2313          }
2418 2314  
2419 2315          (void) strlcat(curpath, "/", sizeof (curpath));
2420 2316  
2421 2317          switch (doi.doi_type) {
2422 2318          case DMU_OT_DIRECTORY_CONTENTS:
2423 2319                  if (s != NULL && *(s + 1) != '\0')
2424 2320                          return (dump_path_impl(os, child_obj, s + 1));
2425 2321                  /*FALLTHROUGH*/
2426 2322          case DMU_OT_PLAIN_FILE_CONTENTS:
2427 2323                  dump_object(os, child_obj, dump_opt['v'], &header);
2428 2324                  return (0);
2429 2325          default:
2430 2326                  (void) fprintf(stderr, "object %llu has non-file/directory "
2431 2327                      "type %d\n", (u_longlong_t)obj, doi.doi_type);
2432 2328                  break;
2433 2329          }
2434 2330  
2435 2331          return (EINVAL);
2436 2332  }
2437 2333  
2438 2334  /*
2439 2335   * Dump the blocks for the object specified by path inside the dataset.
2440 2336   */
2441 2337  static int
2442 2338  dump_path(char *ds, char *path)
2443 2339  {
2444 2340          int err;
2445 2341          objset_t *os;
2446 2342          uint64_t root_obj;
2447 2343  
2448 2344          err = open_objset(ds, DMU_OST_ZFS, FTAG, &os);
2449 2345          if (err != 0)
2450 2346                  return (err);
2451 2347  
2452 2348          err = zap_lookup(os, MASTER_NODE_OBJ, ZFS_ROOT_OBJ, 8, 1, &root_obj);
2453 2349          if (err != 0) {
2454 2350                  (void) fprintf(stderr, "can't lookup root znode: %s\n",
2455 2351                      strerror(err));
2456 2352                  dmu_objset_disown(os, FTAG);
2457 2353                  return (EINVAL);
2458 2354          }
2459 2355  
2460 2356          (void) snprintf(curpath, sizeof (curpath), "dataset=%s path=/", ds);
2461 2357  
2462 2358          err = dump_path_impl(os, root_obj, path);
2463 2359  
2464 2360          close_objset(os, FTAG);
2465 2361          return (err);
2466 2362  }
2467 2363  
2468 2364  static int
2469 2365  dump_label(const char *dev)
2470 2366  {
2471 2367          int fd;
2472 2368          vdev_label_t label;
2473 2369          char path[MAXPATHLEN];
2474 2370          char *buf = label.vl_vdev_phys.vp_nvlist;
2475 2371          size_t buflen = sizeof (label.vl_vdev_phys.vp_nvlist);
2476 2372          struct stat64 statbuf;
2477 2373          uint64_t psize, ashift;
2478 2374          boolean_t label_found = B_FALSE;
2479 2375  
2480 2376          (void) strlcpy(path, dev, sizeof (path));
2481 2377          if (dev[0] == '/') {
2482 2378                  if (strncmp(dev, ZFS_DISK_ROOTD,
2483 2379                      strlen(ZFS_DISK_ROOTD)) == 0) {
2484 2380                          (void) snprintf(path, sizeof (path), "%s%s",
2485 2381                              ZFS_RDISK_ROOTD, dev + strlen(ZFS_DISK_ROOTD));
2486 2382                  }
2487 2383          } else if (stat64(path, &statbuf) != 0) {
2488 2384                  char *s;
2489 2385  
2490 2386                  (void) snprintf(path, sizeof (path), "%s%s", ZFS_RDISK_ROOTD,
2491 2387                      dev);
2492 2388                  if (((s = strrchr(dev, 's')) == NULL &&
2493 2389                      (s = strchr(dev, 'p')) == NULL) ||
2494 2390                      !isdigit(*(s + 1)))
2495 2391                          (void) strlcat(path, "s0", sizeof (path));
2496 2392          }
2497 2393  
2498 2394          if ((fd = open64(path, O_RDONLY)) < 0) {
2499 2395                  (void) fprintf(stderr, "cannot open '%s': %s\n", path,
2500 2396                      strerror(errno));
2501 2397                  exit(1);
2502 2398          }
2503 2399  
2504 2400          if (fstat64(fd, &statbuf) != 0) {
2505 2401                  (void) fprintf(stderr, "failed to stat '%s': %s\n", path,
2506 2402                      strerror(errno));
2507 2403                  (void) close(fd);
2508 2404                  exit(1);
2509 2405          }
2510 2406  
2511 2407          if (S_ISBLK(statbuf.st_mode)) {
2512 2408                  (void) fprintf(stderr,
2513 2409                      "cannot use '%s': character device required\n", path);
2514 2410                  (void) close(fd);
2515 2411                  exit(1);
2516 2412          }
2517 2413  
2518 2414          psize = statbuf.st_size;
2519 2415          psize = P2ALIGN(psize, (uint64_t)sizeof (vdev_label_t));
2520 2416  
2521 2417          for (int l = 0; l < VDEV_LABELS; l++) {
2522 2418                  nvlist_t *config = NULL;
2523 2419  
2524 2420                  if (!dump_opt['q']) {
2525 2421                          (void) printf("------------------------------------\n");
2526 2422                          (void) printf("LABEL %d\n", l);
2527 2423                          (void) printf("------------------------------------\n");
2528 2424                  }
2529 2425  
2530 2426                  if (pread64(fd, &label, sizeof (label),
2531 2427                      vdev_label_offset(psize, l, 0)) != sizeof (label)) {
2532 2428                          if (!dump_opt['q'])
2533 2429                                  (void) printf("failed to read label %d\n", l);
2534 2430                          continue;
2535 2431                  }
2536 2432  
2537 2433                  if (nvlist_unpack(buf, buflen, &config, 0) != 0) {
2538 2434                          if (!dump_opt['q'])
2539 2435                                  (void) printf("failed to unpack label %d\n", l);
2540 2436                          ashift = SPA_MINBLOCKSHIFT;
2541 2437                  } else {
2542 2438                          nvlist_t *vdev_tree = NULL;
2543 2439  
2544 2440                          if (!dump_opt['q'])
2545 2441                                  dump_nvlist(config, 4);
2546 2442                          if ((nvlist_lookup_nvlist(config,
2547 2443                              ZPOOL_CONFIG_VDEV_TREE, &vdev_tree) != 0) ||
2548 2444                              (nvlist_lookup_uint64(vdev_tree,
2549 2445                              ZPOOL_CONFIG_ASHIFT, &ashift) != 0))
2550 2446                                  ashift = SPA_MINBLOCKSHIFT;
2551 2447                          nvlist_free(config);
2552 2448                          label_found = B_TRUE;
2553 2449                  }
  
  
2554 2450                  if (dump_opt['u'])
2555 2451                          dump_label_uberblocks(&label, ashift);
2556 2452          }
2557 2453  
2558 2454          (void) close(fd);
2559 2455  
2560 2456          return (label_found ? 0 : 2);
2561 2457  }
2562 2458  
2563 2459  static uint64_t dataset_feature_count[SPA_FEATURES];
2564      -static uint64_t remap_deadlist_count = 0;
2565 2460  
2566 2461  /*ARGSUSED*/
2567 2462  static int
2568 2463  dump_one_dir(const char *dsname, void *arg)
2569 2464  {
2570 2465          int error;
2571 2466          objset_t *os;
2572 2467  
2573 2468          error = open_objset(dsname, DMU_OST_ANY, FTAG, &os);
2574 2469          if (error != 0)
2575 2470                  return (0);
2576 2471  
2577 2472          for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {
2578 2473                  if (!dmu_objset_ds(os)->ds_feature_inuse[f])
2579 2474                          continue;
2580 2475                  ASSERT(spa_feature_table[f].fi_flags &
2581 2476                      ZFEATURE_FLAG_PER_DATASET);
2582 2477                  dataset_feature_count[f]++;
2583 2478          }
2584 2479  
2585      -        if (dsl_dataset_remap_deadlist_exists(dmu_objset_ds(os))) {
2586      -                remap_deadlist_count++;
2587      -        }
2588      -
2589 2480          dump_dir(os);
2590 2481          close_objset(os, FTAG);
2591 2482          fuid_table_destroy();
2592 2483          return (0);
2593 2484  }
2594 2485  
2595 2486  /*
2596 2487   * Block statistics.
2597 2488   */
2598 2489  #define PSIZE_HISTO_SIZE (SPA_OLD_MAXBLOCKSIZE / SPA_MINBLOCKSIZE + 2)
2599 2490  typedef struct zdb_blkstats {
2600 2491          uint64_t zb_asize;
2601 2492          uint64_t zb_lsize;
2602 2493          uint64_t zb_psize;
2603 2494          uint64_t zb_count;
2604 2495          uint64_t zb_gangs;
2605 2496          uint64_t zb_ditto_samevdev;
2606 2497          uint64_t zb_psize_histogram[PSIZE_HISTO_SIZE];
2607 2498  } zdb_blkstats_t;
2608 2499  
2609 2500  /*
2610 2501   * Extended object types to report deferred frees and dedup auto-ditto blocks.
2611 2502   */
2612 2503  #define ZDB_OT_DEFERRED (DMU_OT_NUMTYPES + 0)
2613 2504  #define ZDB_OT_DITTO    (DMU_OT_NUMTYPES + 1)
2614 2505  #define ZDB_OT_OTHER    (DMU_OT_NUMTYPES + 2)
2615 2506  #define ZDB_OT_TOTAL    (DMU_OT_NUMTYPES + 3)
2616 2507  
2617 2508  static const char *zdb_ot_extname[] = {
  
  
2618 2509          "deferred free",
2619 2510          "dedup ditto",
2620 2511          "other",
2621 2512          "Total",
2622 2513  };
2623 2514  
2624 2515  #define ZB_TOTAL        DN_MAX_LEVELS
2625 2516  
2626 2517  typedef struct zdb_cb {
2627 2518          zdb_blkstats_t  zcb_type[ZB_TOTAL + 1][ZDB_OT_TOTAL + 1];
2628      -        uint64_t        zcb_removing_size;
2629 2519          uint64_t        zcb_dedup_asize;
2630 2520          uint64_t        zcb_dedup_blocks;
2631 2521          uint64_t        zcb_embedded_blocks[NUM_BP_EMBEDDED_TYPES];
2632 2522          uint64_t        zcb_embedded_histogram[NUM_BP_EMBEDDED_TYPES]
2633 2523              [BPE_PAYLOAD_SIZE];
2634 2524          uint64_t        zcb_start;
2635 2525          hrtime_t        zcb_lastprint;
2636 2526          uint64_t        zcb_totalasize;
2637 2527          uint64_t        zcb_errors[256];
2638 2528          int             zcb_readfails;
2639 2529          int             zcb_haderrors;
2640 2530          spa_t           *zcb_spa;
2641      -        uint32_t        **zcb_vd_obsolete_counts;
2642 2531  } zdb_cb_t;
2643 2532  
2644 2533  static void
2645 2534  zdb_count_block(zdb_cb_t *zcb, zilog_t *zilog, const blkptr_t *bp,
2646 2535      dmu_object_type_t type)
2647 2536  {
2648 2537          uint64_t refcnt = 0;
2649 2538  
2650 2539          ASSERT(type < ZDB_OT_TOTAL);
2651 2540  
2652 2541          if (zilog && zil_bp_tree_add(zilog, bp) != 0)
2653 2542                  return;
2654 2543  
2655 2544          for (int i = 0; i < 4; i++) {
2656 2545                  int l = (i < 2) ? BP_GET_LEVEL(bp) : ZB_TOTAL;
2657 2546                  int t = (i & 1) ? type : ZDB_OT_TOTAL;
2658 2547                  int equal;
2659 2548                  zdb_blkstats_t *zb = &zcb->zcb_type[l][t];
2660 2549  
2661 2550                  zb->zb_asize += BP_GET_ASIZE(bp);
2662 2551                  zb->zb_lsize += BP_GET_LSIZE(bp);
2663 2552                  zb->zb_psize += BP_GET_PSIZE(bp);
2664 2553                  zb->zb_count++;
2665 2554  
2666 2555                  /*
2667 2556                   * The histogram is only big enough to record blocks up to
2668 2557                   * SPA_OLD_MAXBLOCKSIZE; larger blocks go into the last,
2669 2558                   * "other", bucket.
2670 2559                   */
2671 2560                  unsigned idx = BP_GET_PSIZE(bp) >> SPA_MINBLOCKSHIFT;
2672 2561                  idx = MIN(idx, SPA_OLD_MAXBLOCKSIZE / SPA_MINBLOCKSIZE + 1);
2673 2562                  zb->zb_psize_histogram[idx]++;
2674 2563  
2675 2564                  zb->zb_gangs += BP_COUNT_GANG(bp);
2676 2565  
2677 2566                  switch (BP_GET_NDVAS(bp)) {
2678 2567                  case 2:
2679 2568                          if (DVA_GET_VDEV(&bp->blk_dva[0]) ==
2680 2569                              DVA_GET_VDEV(&bp->blk_dva[1]))
2681 2570                                  zb->zb_ditto_samevdev++;
2682 2571                          break;
2683 2572                  case 3:
2684 2573                          equal = (DVA_GET_VDEV(&bp->blk_dva[0]) ==
2685 2574                              DVA_GET_VDEV(&bp->blk_dva[1])) +
2686 2575                              (DVA_GET_VDEV(&bp->blk_dva[0]) ==
2687 2576                              DVA_GET_VDEV(&bp->blk_dva[2])) +
2688 2577                              (DVA_GET_VDEV(&bp->blk_dva[1]) ==
2689 2578                              DVA_GET_VDEV(&bp->blk_dva[2]));
2690 2579                          if (equal != 0)
2691 2580                                  zb->zb_ditto_samevdev++;
2692 2581                          break;
2693 2582                  }
2694 2583  
2695 2584          }
2696 2585  
2697 2586          if (BP_IS_EMBEDDED(bp)) {
2698 2587                  zcb->zcb_embedded_blocks[BPE_GET_ETYPE(bp)]++;
2699 2588                  zcb->zcb_embedded_histogram[BPE_GET_ETYPE(bp)]
2700 2589                      [BPE_GET_PSIZE(bp)]++;
2701 2590                  return;
  
  
2702 2591          }
2703 2592  
2704 2593          if (dump_opt['L'])
2705 2594                  return;
2706 2595  
2707 2596          if (BP_GET_DEDUP(bp)) {
2708 2597                  ddt_t *ddt;
2709 2598                  ddt_entry_t *dde;
2710 2599  
2711 2600                  ddt = ddt_select(zcb->zcb_spa, bp);
2712      -                ddt_enter(ddt);
2713 2601                  dde = ddt_lookup(ddt, bp, B_FALSE);
2714 2602  
2715 2603                  if (dde == NULL) {
2716 2604                          refcnt = 0;
2717 2605                  } else {
2718 2606                          ddt_phys_t *ddp = ddt_phys_select(dde, bp);
     2607 +
     2608 +                        /* no other competitors for dde */
     2609 +                        dde_exit(dde);
     2610 +
2719 2611                          ddt_phys_decref(ddp);
2720 2612                          refcnt = ddp->ddp_refcnt;
2721 2613                          if (ddt_phys_total_refcnt(dde) == 0)
2722 2614                                  ddt_remove(ddt, dde);
2723 2615                  }
2724      -                ddt_exit(ddt);
2725 2616          }
2726 2617  
2727 2618          VERIFY3U(zio_wait(zio_claim(NULL, zcb->zcb_spa,
2728 2619              refcnt ? 0 : spa_first_txg(zcb->zcb_spa),
2729 2620              bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0);
2730 2621  }
2731 2622  
2732 2623  static void
2733 2624  zdb_blkptr_done(zio_t *zio)
2734 2625  {
2735 2626          spa_t *spa = zio->io_spa;
2736 2627          blkptr_t *bp = zio->io_bp;
2737 2628          int ioerr = zio->io_error;
2738 2629          zdb_cb_t *zcb = zio->io_private;
2739 2630          zbookmark_phys_t *zb = &zio->io_bookmark;
2740 2631  
2741 2632          abd_free(zio->io_abd);
2742 2633  
2743 2634          mutex_enter(&spa->spa_scrub_lock);
2744 2635          spa->spa_scrub_inflight--;
2745 2636          cv_broadcast(&spa->spa_scrub_io_cv);
2746 2637  
2747 2638          if (ioerr && !(zio->io_flags & ZIO_FLAG_SPECULATIVE)) {
2748 2639                  char blkbuf[BP_SPRINTF_LEN];
2749 2640  
2750 2641                  zcb->zcb_haderrors = 1;
2751 2642                  zcb->zcb_errors[ioerr]++;
2752 2643  
2753 2644                  if (dump_opt['b'] >= 2)
2754 2645                          snprintf_blkptr(blkbuf, sizeof (blkbuf), bp);
2755 2646                  else
2756 2647                          blkbuf[0] = '\0';
2757 2648  
2758 2649                  (void) printf("zdb_blkptr_cb: "
2759 2650                      "Got error %d reading "
2760 2651                      "<%llu, %llu, %lld, %llx> %s -- skipping\n",
2761 2652                      ioerr,
2762 2653                      (u_longlong_t)zb->zb_objset,
2763 2654                      (u_longlong_t)zb->zb_object,
2764 2655                      (u_longlong_t)zb->zb_level,
2765 2656                      (u_longlong_t)zb->zb_blkid,
2766 2657                      blkbuf);
2767 2658          }
2768 2659          mutex_exit(&spa->spa_scrub_lock);
2769 2660  }
2770 2661  
2771 2662  static int
2772 2663  zdb_blkptr_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp,
2773 2664      const zbookmark_phys_t *zb, const dnode_phys_t *dnp, void *arg)
2774 2665  {
2775 2666          zdb_cb_t *zcb = arg;
2776 2667          dmu_object_type_t type;
2777 2668          boolean_t is_metadata;
2778 2669  
2779 2670          if (bp == NULL)
2780 2671                  return (0);
2781 2672  
2782 2673          if (dump_opt['b'] >= 5 && bp->blk_birth > 0) {
2783 2674                  char blkbuf[BP_SPRINTF_LEN];
2784 2675                  snprintf_blkptr(blkbuf, sizeof (blkbuf), bp);
2785 2676                  (void) printf("objset %llu object %llu "
2786 2677                      "level %lld offset 0x%llx %s\n",
2787 2678                      (u_longlong_t)zb->zb_objset,
2788 2679                      (u_longlong_t)zb->zb_object,
2789 2680                      (longlong_t)zb->zb_level,
2790 2681                      (u_longlong_t)blkid2offset(dnp, bp, zb),
2791 2682                      blkbuf);
  
  
2792 2683          }
2793 2684  
2794 2685          if (BP_IS_HOLE(bp))
2795 2686                  return (0);
2796 2687  
2797 2688          type = BP_GET_TYPE(bp);
2798 2689  
2799 2690          zdb_count_block(zcb, zilog, bp,
2800 2691              (type & DMU_OT_NEWTYPE) ? ZDB_OT_OTHER : type);
2801 2692  
2802      -        is_metadata = (BP_GET_LEVEL(bp) != 0 || DMU_OT_IS_METADATA(type));
     2693 +        is_metadata = BP_IS_METADATA(bp);
2803 2694  
2804 2695          if (!BP_IS_EMBEDDED(bp) &&
2805 2696              (dump_opt['c'] > 1 || (dump_opt['c'] && is_metadata))) {
2806 2697                  size_t size = BP_GET_PSIZE(bp);
2807 2698                  abd_t *abd = abd_alloc(size, B_FALSE);
2808 2699                  int flags = ZIO_FLAG_CANFAIL | ZIO_FLAG_SCRUB | ZIO_FLAG_RAW;
2809 2700  
2810 2701                  /* If it's an intent log block, failure is expected. */
2811 2702                  if (zb->zb_level == ZB_ZIL_LEVEL)
2812 2703                          flags |= ZIO_FLAG_SPECULATIVE;
2813 2704  
2814 2705                  mutex_enter(&spa->spa_scrub_lock);
2815 2706                  while (spa->spa_scrub_inflight > max_inflight)
2816 2707                          cv_wait(&spa->spa_scrub_io_cv, &spa->spa_scrub_lock);
2817 2708                  spa->spa_scrub_inflight++;
2818 2709                  mutex_exit(&spa->spa_scrub_lock);
2819 2710  
2820 2711                  zio_nowait(zio_read(NULL, spa, bp, abd, size,
2821 2712                      zdb_blkptr_done, zcb, ZIO_PRIORITY_ASYNC_READ, flags, zb));
2822 2713          }
2823 2714  
2824 2715          zcb->zcb_readfails = 0;
2825 2716  
2826 2717          /* only call gethrtime() every 100 blocks */
2827 2718          static int iters;
2828 2719          if (++iters > 100)
2829 2720                  iters = 0;
2830 2721          else
2831 2722                  return (0);
2832 2723  
2833 2724          if (dump_opt['b'] < 5 && gethrtime() > zcb->zcb_lastprint + NANOSEC) {
2834 2725                  uint64_t now = gethrtime();
2835 2726                  char buf[10];
2836 2727                  uint64_t bytes = zcb->zcb_type[ZB_TOTAL][ZDB_OT_TOTAL].zb_asize;
2837 2728                  int kb_per_sec =
2838 2729                      1 + bytes / (1 + ((now - zcb->zcb_start) / 1000 / 1000));
2839 2730                  int sec_remaining =
2840 2731                      (zcb->zcb_totalasize - bytes) / 1024 / kb_per_sec;
2841 2732  
2842 2733                  /* make sure nicenum has enough space */
2843 2734                  CTASSERT(sizeof (buf) >= NN_NUMBUF_SZ);
2844 2735  
2845 2736                  zfs_nicenum(bytes, buf, sizeof (buf));
2846 2737                  (void) fprintf(stderr,
2847 2738                      "\r%5s completed (%4dMB/s) "
2848 2739                      "estimated time remaining: %uhr %02umin %02usec        ",
2849 2740                      buf, kb_per_sec / 1024,
2850 2741                      sec_remaining / 60 / 60,
2851 2742                      sec_remaining / 60 % 60,
2852 2743                      sec_remaining % 60);
2853 2744  
2854 2745                  zcb->zcb_lastprint = now;
2855 2746          }
2856 2747  
2857 2748          return (0);
2858 2749  }
2859 2750  
2860 2751  static void
2861 2752  zdb_leak(void *arg, uint64_t start, uint64_t size)
2862 2753  {
2863 2754          vdev_t *vd = arg;
2864 2755  
2865 2756          (void) printf("leaked space: vdev %llu, offset 0x%llx, size %llu\n",
2866 2757              (u_longlong_t)vd->vdev_id, (u_longlong_t)start, (u_longlong_t)size);
2867 2758  }
2868 2759  
2869 2760  static metaslab_ops_t zdb_metaslab_ops = {
2870 2761          NULL    /* alloc */
2871 2762  };
2872 2763  
2873 2764  static void
2874 2765  zdb_ddt_leak_init(spa_t *spa, zdb_cb_t *zcb)
2875 2766  {
2876 2767          ddt_bookmark_t ddb;
2877 2768          ddt_entry_t dde;
2878 2769          int error;
2879 2770  
2880 2771          bzero(&ddb, sizeof (ddb));
2881 2772          while ((error = ddt_walk(spa, &ddb, &dde)) == 0) {
2882 2773                  blkptr_t blk;
2883 2774                  ddt_phys_t *ddp = dde.dde_phys;
2884 2775  
2885 2776                  if (ddb.ddb_class == DDT_CLASS_UNIQUE)
2886 2777                          return;
2887 2778  
2888 2779                  ASSERT(ddt_phys_total_refcnt(&dde) > 1);
2889 2780  
2890 2781                  for (int p = 0; p < DDT_PHYS_TYPES; p++, ddp++) {
2891 2782                          if (ddp->ddp_phys_birth == 0)
2892 2783                                  continue;
2893 2784                          ddt_bp_create(ddb.ddb_checksum,
2894 2785                              &dde.dde_key, ddp, &blk);
  
  
2895 2786                          if (p == DDT_PHYS_DITTO) {
2896 2787                                  zdb_count_block(zcb, NULL, &blk, ZDB_OT_DITTO);
2897 2788                          } else {
2898 2789                                  zcb->zcb_dedup_asize +=
2899 2790                                      BP_GET_ASIZE(&blk) * (ddp->ddp_refcnt - 1);
2900 2791                                  zcb->zcb_dedup_blocks++;
2901 2792                          }
2902 2793                  }
2903 2794                  if (!dump_opt['L']) {
2904 2795                          ddt_t *ddt = spa->spa_ddt[ddb.ddb_checksum];
2905      -                        ddt_enter(ddt);
2906      -                        VERIFY(ddt_lookup(ddt, &blk, B_TRUE) != NULL);
2907      -                        ddt_exit(ddt);
     2796 +                        ddt_entry_t *dde;
     2797 +                        VERIFY((dde = ddt_lookup(ddt, &blk, B_TRUE)) != NULL);
     2798 +                        dde_exit(dde);
2908 2799                  }
2909 2800          }
2910 2801  
2911 2802          ASSERT(error == ENOENT);
2912 2803  }
2913 2804  
2914      -/* ARGSUSED */
2915 2805  static void
2916      -claim_segment_impl_cb(uint64_t inner_offset, vdev_t *vd, uint64_t offset,
2917      -    uint64_t size, void *arg)
2918      -{
2919      -        /*
2920      -         * This callback was called through a remap from
2921      -         * a device being removed. Therefore, the vdev that
2922      -         * this callback is applied to is a concrete
2923      -         * vdev.
2924      -         */
2925      -        ASSERT(vdev_is_concrete(vd));
2926      -
2927      -        VERIFY0(metaslab_claim_impl(vd, offset, size,
2928      -            spa_first_txg(vd->vdev_spa)));
2929      -}
2930      -
2931      -static void
2932      -claim_segment_cb(void *arg, uint64_t offset, uint64_t size)
2933      -{
2934      -        vdev_t *vd = arg;
2935      -
2936      -        vdev_indirect_ops.vdev_op_remap(vd, offset, size,
2937      -            claim_segment_impl_cb, NULL);
2938      -}
2939      -
2940      -/*
2941      - * After accounting for all allocated blocks that are directly referenced,
2942      - * we might have missed a reference to a block from a partially complete
2943      - * (and thus unused) indirect mapping object. We perform a secondary pass
2944      - * through the metaslabs we have already mapped and claim the destination
2945      - * blocks.
2946      - */
2947      -static void
2948      -zdb_claim_removing(spa_t *spa, zdb_cb_t *zcb)
2949      -{
2950      -        if (spa->spa_vdev_removal == NULL)
2951      -                return;
2952      -
2953      -        spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
2954      -
2955      -        spa_vdev_removal_t *svr = spa->spa_vdev_removal;
2956      -        vdev_t *vd = svr->svr_vdev;
2957      -        vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
2958      -
2959      -        for (uint64_t msi = 0; msi < vd->vdev_ms_count; msi++) {
2960      -                metaslab_t *msp = vd->vdev_ms[msi];
2961      -
2962      -                if (msp->ms_start >= vdev_indirect_mapping_max_offset(vim))
2963      -                        break;
2964      -
2965      -                ASSERT0(range_tree_space(svr->svr_allocd_segs));
2966      -
2967      -                if (msp->ms_sm != NULL) {
2968      -                        VERIFY0(space_map_load(msp->ms_sm,
2969      -                            svr->svr_allocd_segs, SM_ALLOC));
2970      -
2971      -                        /*
2972      -                         * Clear everything past what has been synced,
2973      -                         * because we have not allocated mappings for it yet.
2974      -                         */
2975      -                        range_tree_clear(svr->svr_allocd_segs,
2976      -                            vdev_indirect_mapping_max_offset(vim),
2977      -                            msp->ms_sm->sm_start + msp->ms_sm->sm_size -
2978      -                            vdev_indirect_mapping_max_offset(vim));
2979      -                }
2980      -
2981      -                zcb->zcb_removing_size +=
2982      -                    range_tree_space(svr->svr_allocd_segs);
2983      -                range_tree_vacate(svr->svr_allocd_segs, claim_segment_cb, vd);
2984      -        }
2985      -
2986      -        spa_config_exit(spa, SCL_CONFIG, FTAG);
2987      -}
2988      -
2989      -/*
2990      - * vm_idxp is an in-out parameter which (for indirect vdevs) is the
2991      - * index in vim_entries that has the first entry in this metaslab.  On
2992      - * return, it will be set to the first entry after this metaslab.
2993      - */
2994      -static void
2995      -zdb_leak_init_ms(metaslab_t *msp, uint64_t *vim_idxp)
2996      -{
2997      -        metaslab_group_t *mg = msp->ms_group;
2998      -        vdev_t *vd = mg->mg_vd;
2999      -        vdev_t *rvd = vd->vdev_spa->spa_root_vdev;
3000      -
3001      -        mutex_enter(&msp->ms_lock);
3002      -        metaslab_unload(msp);
3003      -
3004      -        /*
3005      -         * We don't want to spend the CPU manipulating the size-ordered
3006      -         * tree, so clear the range_tree ops.
3007      -         */
3008      -        msp->ms_tree->rt_ops = NULL;
3009      -
3010      -        (void) fprintf(stderr,
3011      -            "\rloading vdev %llu of %llu, metaslab %llu of %llu ...",
3012      -            (longlong_t)vd->vdev_id,
3013      -            (longlong_t)rvd->vdev_children,
3014      -            (longlong_t)msp->ms_id,
3015      -            (longlong_t)vd->vdev_ms_count);
3016      -
3017      -        /*
3018      -         * For leak detection, we overload the metaslab ms_tree to
3019      -         * contain allocated segments instead of free segments. As a
3020      -         * result, we can't use the normal metaslab_load/unload
3021      -         * interfaces.
3022      -         */
3023      -        if (vd->vdev_ops == &vdev_indirect_ops) {
3024      -                vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
3025      -                for (; *vim_idxp < vdev_indirect_mapping_num_entries(vim);
3026      -                    (*vim_idxp)++) {
3027      -                        vdev_indirect_mapping_entry_phys_t *vimep =
3028      -                            &vim->vim_entries[*vim_idxp];
3029      -                        uint64_t ent_offset = DVA_MAPPING_GET_SRC_OFFSET(vimep);
3030      -                        uint64_t ent_len = DVA_GET_ASIZE(&vimep->vimep_dst);
3031      -                        ASSERT3U(ent_offset, >=, msp->ms_start);
3032      -                        if (ent_offset >= msp->ms_start + msp->ms_size)
3033      -                                break;
3034      -
3035      -                        /*
3036      -                         * Mappings do not cross metaslab boundaries,
3037      -                         * because we create them by walking the metaslabs.
3038      -                         */
3039      -                        ASSERT3U(ent_offset + ent_len, <=,
3040      -                            msp->ms_start + msp->ms_size);
3041      -                        range_tree_add(msp->ms_tree, ent_offset, ent_len);
3042      -                }
3043      -        } else if (msp->ms_sm != NULL) {
3044      -                VERIFY0(space_map_load(msp->ms_sm, msp->ms_tree, SM_ALLOC));
3045      -        }
3046      -
3047      -        if (!msp->ms_loaded) {
3048      -                msp->ms_loaded = B_TRUE;
3049      -        }
3050      -        mutex_exit(&msp->ms_lock);
3051      -}
3052      -
3053      -/* ARGSUSED */
3054      -static int
3055      -increment_indirect_mapping_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx)
3056      -{
3057      -        zdb_cb_t *zcb = arg;
3058      -        spa_t *spa = zcb->zcb_spa;
3059      -        vdev_t *vd;
3060      -        const dva_t *dva = &bp->blk_dva[0];
3061      -
3062      -        ASSERT(!dump_opt['L']);
3063      -        ASSERT3U(BP_GET_NDVAS(bp), ==, 1);
3064      -
3065      -        spa_config_enter(spa, SCL_VDEV, FTAG, RW_READER);
3066      -        vd = vdev_lookup_top(zcb->zcb_spa, DVA_GET_VDEV(dva));
3067      -        ASSERT3P(vd, !=, NULL);
3068      -        spa_config_exit(spa, SCL_VDEV, FTAG);
3069      -
3070      -        ASSERT(vd->vdev_indirect_config.vic_mapping_object != 0);
3071      -        ASSERT3P(zcb->zcb_vd_obsolete_counts[vd->vdev_id], !=, NULL);
3072      -
3073      -        vdev_indirect_mapping_increment_obsolete_count(
3074      -            vd->vdev_indirect_mapping,
3075      -            DVA_GET_OFFSET(dva), DVA_GET_ASIZE(dva),
3076      -            zcb->zcb_vd_obsolete_counts[vd->vdev_id]);
3077      -
3078      -        return (0);
3079      -}
3080      -
3081      -static uint32_t *
3082      -zdb_load_obsolete_counts(vdev_t *vd)
3083      -{
3084      -        vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
3085      -        spa_t *spa = vd->vdev_spa;
3086      -        spa_condensing_indirect_phys_t *scip =
3087      -            &spa->spa_condensing_indirect_phys;
3088      -        uint32_t *counts;
3089      -
3090      -        EQUIV(vdev_obsolete_sm_object(vd) != 0, vd->vdev_obsolete_sm != NULL);
3091      -        counts = vdev_indirect_mapping_load_obsolete_counts(vim);
3092      -        if (vd->vdev_obsolete_sm != NULL) {
3093      -                vdev_indirect_mapping_load_obsolete_spacemap(vim, counts,
3094      -                    vd->vdev_obsolete_sm);
3095      -        }
3096      -        if (scip->scip_vdev == vd->vdev_id &&
3097      -            scip->scip_prev_obsolete_sm_object != 0) {
3098      -                space_map_t *prev_obsolete_sm = NULL;
3099      -                VERIFY0(space_map_open(&prev_obsolete_sm, spa->spa_meta_objset,
3100      -                    scip->scip_prev_obsolete_sm_object, 0, vd->vdev_asize, 0));
3101      -                space_map_update(prev_obsolete_sm);
3102      -                vdev_indirect_mapping_load_obsolete_spacemap(vim, counts,
3103      -                    prev_obsolete_sm);
3104      -                space_map_close(prev_obsolete_sm);
3105      -        }
3106      -        return (counts);
3107      -}
3108      -
3109      -static void
3110 2806  zdb_leak_init(spa_t *spa, zdb_cb_t *zcb)
3111 2807  {
3112 2808          zcb->zcb_spa = spa;
3113 2809  
3114 2810          if (!dump_opt['L']) {
3115      -                dsl_pool_t *dp = spa->spa_dsl_pool;
3116 2811                  vdev_t *rvd = spa->spa_root_vdev;
3117 2812  
3118 2813                  /*
3119 2814                   * We are going to be changing the meaning of the metaslab's
3120 2815                   * ms_tree.  Ensure that the allocator doesn't try to
3121 2816                   * use the tree.
3122 2817                   */
3123 2818                  spa->spa_normal_class->mc_ops = &zdb_metaslab_ops;
3124 2819                  spa->spa_log_class->mc_ops = &zdb_metaslab_ops;
3125 2820  
3126      -                zcb->zcb_vd_obsolete_counts =
3127      -                    umem_zalloc(rvd->vdev_children * sizeof (uint32_t *),
3128      -                    UMEM_NOFAIL);
3129      -
3130      -
3131 2821                  for (uint64_t c = 0; c < rvd->vdev_children; c++) {
3132 2822                          vdev_t *vd = rvd->vdev_child[c];
3133      -                        uint64_t vim_idx = 0;
     2823 +                        metaslab_group_t *mg = vd->vdev_mg;
     2824 +                        for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
     2825 +                                metaslab_t *msp = vd->vdev_ms[m];
     2826 +                                ASSERT3P(msp->ms_group, ==, mg);
     2827 +                                mutex_enter(&msp->ms_lock);
     2828 +                                metaslab_unload(msp);
3134 2829  
3135      -                        ASSERT3U(c, ==, vd->vdev_id);
3136      -
3137      -                        /*
3138      -                         * Note: we don't check for mapping leaks on
3139      -                         * removing vdevs because their ms_tree's are
3140      -                         * used to look for leaks in allocated space.
3141      -                         */
3142      -                        if (vd->vdev_ops == &vdev_indirect_ops) {
3143      -                                zcb->zcb_vd_obsolete_counts[c] =
3144      -                                    zdb_load_obsolete_counts(vd);
3145      -
3146 2830                                  /*
3147      -                                 * Normally, indirect vdevs don't have any
3148      -                                 * metaslabs.  We want to set them up for
3149      -                                 * zio_claim().
     2831 +                                 * For leak detection, we overload the metaslab
     2832 +                                 * ms_tree to contain allocated segments
     2833 +                                 * instead of free segments. As a result,
     2834 +                                 * we can't use the normal metaslab_load/unload
     2835 +                                 * interfaces.
3150 2836                                   */
3151      -                                VERIFY0(vdev_metaslab_init(vd, 0));
3152      -                        }
     2837 +                                if (msp->ms_sm != NULL) {
     2838 +                                        (void) fprintf(stderr,
     2839 +                                            "\rloading space map for "
     2840 +                                            "vdev %llu of %llu, "
     2841 +                                            "metaslab %llu of %llu ...",
     2842 +                                            (longlong_t)c,
     2843 +                                            (longlong_t)rvd->vdev_children,
     2844 +                                            (longlong_t)m,
     2845 +                                            (longlong_t)vd->vdev_ms_count);
3153 2846  
3154      -                        for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
3155      -                                zdb_leak_init_ms(vd->vdev_ms[m], &vim_idx);
     2847 +                                        /*
     2848 +                                         * We don't want to spend the CPU
     2849 +                                         * manipulating the size-ordered
     2850 +                                         * tree, so clear the range_tree
     2851 +                                         * ops.
     2852 +                                         */
     2853 +                                        msp->ms_tree->rt_ops = NULL;
     2854 +                                        VERIFY0(space_map_load(msp->ms_sm,
     2855 +                                            msp->ms_tree, SM_ALLOC));
     2856 +
     2857 +                                        if (!msp->ms_loaded) {
     2858 +                                                msp->ms_loaded = B_TRUE;
     2859 +                                        }
     2860 +                                }
     2861 +                                mutex_exit(&msp->ms_lock);
3156 2862                          }
3157      -                        if (vd->vdev_ops == &vdev_indirect_ops) {
3158      -                                ASSERT3U(vim_idx, ==,
3159      -                                    vdev_indirect_mapping_num_entries(
3160      -                                    vd->vdev_indirect_mapping));
3161      -                        }
3162 2863                  }
3163 2864                  (void) fprintf(stderr, "\n");
3164      -
3165      -                if (bpobj_is_open(&dp->dp_obsolete_bpobj)) {
3166      -                        ASSERT(spa_feature_is_enabled(spa,
3167      -                            SPA_FEATURE_DEVICE_REMOVAL));
3168      -                        (void) bpobj_iterate_nofree(&dp->dp_obsolete_bpobj,
3169      -                            increment_indirect_mapping_cb, zcb, NULL);
3170      -                }
3171 2865          }
3172 2866  
3173 2867          spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
3174 2868  
3175 2869          zdb_ddt_leak_init(spa, zcb);
3176 2870  
3177 2871          spa_config_exit(spa, SCL_CONFIG, FTAG);
3178 2872  }
3179 2873  
3180      -static boolean_t
3181      -zdb_check_for_obsolete_leaks(vdev_t *vd, zdb_cb_t *zcb)
     2874 +static void
     2875 +zdb_leak_fini(spa_t *spa)
3182 2876  {
3183      -        boolean_t leaks = B_FALSE;
3184      -        vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
3185      -        uint64_t total_leaked = 0;
3186      -
3187      -        ASSERT(vim != NULL);
3188      -
3189      -        for (uint64_t i = 0; i < vdev_indirect_mapping_num_entries(vim); i++) {
3190      -                vdev_indirect_mapping_entry_phys_t *vimep =
3191      -                    &vim->vim_entries[i];
3192      -                uint64_t obsolete_bytes = 0;
3193      -                uint64_t offset = DVA_MAPPING_GET_SRC_OFFSET(vimep);
3194      -                metaslab_t *msp = vd->vdev_ms[offset >> vd->vdev_ms_shift];
3195      -
3196      -                /*
3197      -                 * This is not very efficient but it's easy to
3198      -                 * verify correctness.
3199      -                 */
3200      -                for (uint64_t inner_offset = 0;
3201      -                    inner_offset < DVA_GET_ASIZE(&vimep->vimep_dst);
3202      -                    inner_offset += 1 << vd->vdev_ashift) {
3203      -                        if (range_tree_contains(msp->ms_tree,
3204      -                            offset + inner_offset, 1 << vd->vdev_ashift)) {
3205      -                                obsolete_bytes += 1 << vd->vdev_ashift;
3206      -                        }
3207      -                }
3208      -
3209      -                int64_t bytes_leaked = obsolete_bytes -
3210      -                    zcb->zcb_vd_obsolete_counts[vd->vdev_id][i];
3211      -                ASSERT3U(DVA_GET_ASIZE(&vimep->vimep_dst), >=,
3212      -                    zcb->zcb_vd_obsolete_counts[vd->vdev_id][i]);
3213      -                if (bytes_leaked != 0 &&
3214      -                    (vdev_obsolete_counts_are_precise(vd) ||
3215      -                    dump_opt['d'] >= 5)) {
3216      -                        (void) printf("obsolete indirect mapping count "
3217      -                            "mismatch on %llu:%llx:%llx : %llx bytes leaked\n",
3218      -                            (u_longlong_t)vd->vdev_id,
3219      -                            (u_longlong_t)DVA_MAPPING_GET_SRC_OFFSET(vimep),
3220      -                            (u_longlong_t)DVA_GET_ASIZE(&vimep->vimep_dst),
3221      -                            (u_longlong_t)bytes_leaked);
3222      -                }
3223      -                total_leaked += ABS(bytes_leaked);
3224      -        }
3225      -
3226      -        if (!vdev_obsolete_counts_are_precise(vd) && total_leaked > 0) {
3227      -                int pct_leaked = total_leaked * 100 /
3228      -                    vdev_indirect_mapping_bytes_mapped(vim);
3229      -                (void) printf("cannot verify obsolete indirect mapping "
3230      -                    "counts of vdev %llu because precise feature was not "
3231      -                    "enabled when it was removed: %d%% (%llx bytes) of mapping"
3232      -                    "unreferenced\n",
3233      -                    (u_longlong_t)vd->vdev_id, pct_leaked,
3234      -                    (u_longlong_t)total_leaked);
3235      -        } else if (total_leaked > 0) {
3236      -                (void) printf("obsolete indirect mapping count mismatch "
3237      -                    "for vdev %llu -- %llx total bytes mismatched\n",
3238      -                    (u_longlong_t)vd->vdev_id,
3239      -                    (u_longlong_t)total_leaked);
3240      -                leaks |= B_TRUE;
3241      -        }
3242      -
3243      -        vdev_indirect_mapping_free_obsolete_counts(vim,
3244      -            zcb->zcb_vd_obsolete_counts[vd->vdev_id]);
3245      -        zcb->zcb_vd_obsolete_counts[vd->vdev_id] = NULL;
3246      -
3247      -        return (leaks);
3248      -}
3249      -
3250      -static boolean_t
3251      -zdb_leak_fini(spa_t *spa, zdb_cb_t *zcb)
3252      -{
3253      -        boolean_t leaks = B_FALSE;
3254 2877          if (!dump_opt['L']) {
3255 2878                  vdev_t *rvd = spa->spa_root_vdev;
3256 2879                  for (unsigned c = 0; c < rvd->vdev_children; c++) {
3257 2880                          vdev_t *vd = rvd->vdev_child[c];
3258 2881                          metaslab_group_t *mg = vd->vdev_mg;
3259      -
3260      -                        if (zcb->zcb_vd_obsolete_counts[c] != NULL) {
3261      -                                leaks |= zdb_check_for_obsolete_leaks(vd, zcb);
3262      -                        }
3263      -
3264      -                        for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
     2882 +                        for (unsigned m = 0; m < vd->vdev_ms_count; m++) {
3265 2883                                  metaslab_t *msp = vd->vdev_ms[m];
3266 2884                                  ASSERT3P(mg, ==, msp->ms_group);
     2885 +                                mutex_enter(&msp->ms_lock);
3267 2886  
3268 2887                                  /*
3269 2888                                   * The ms_tree has been overloaded to
3270 2889                                   * contain allocated segments. Now that we
3271 2890                                   * finished traversing all blocks, any
3272 2891                                   * block that remains in the ms_tree
3273 2892                                   * represents an allocated block that we
3274 2893                                   * did not claim during the traversal.
3275 2894                                   * Claimed blocks would have been removed
3276      -                                 * from the ms_tree.  For indirect vdevs,
3277      -                                 * space remaining in the tree represents
3278      -                                 * parts of the mapping that are not
3279      -                                 * referenced, which is not a bug.
     2895 +                                 * from the ms_tree.
3280 2896                                   */
3281      -                                if (vd->vdev_ops == &vdev_indirect_ops) {
3282      -                                        range_tree_vacate(msp->ms_tree,
3283      -                                            NULL, NULL);
3284      -                                } else {
3285      -                                        range_tree_vacate(msp->ms_tree,
3286      -                                            zdb_leak, vd);
3287      -                                }
     2897 +                                range_tree_vacate(msp->ms_tree, zdb_leak, vd);
3288 2898  
3289 2899                                  if (msp->ms_loaded) {
3290 2900                                          msp->ms_loaded = B_FALSE;
3291 2901                                  }
     2902 +
     2903 +                                mutex_exit(&msp->ms_lock);
3292 2904                          }
3293 2905                  }
3294      -
3295      -                umem_free(zcb->zcb_vd_obsolete_counts,
3296      -                    rvd->vdev_children * sizeof (uint32_t *));
3297      -                zcb->zcb_vd_obsolete_counts = NULL;
3298 2906          }
3299      -        return (leaks);
3300 2907  }
3301 2908  
3302 2909  /* ARGSUSED */
3303 2910  static int
3304 2911  count_block_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx)
3305 2912  {
3306 2913          zdb_cb_t *zcb = arg;
3307 2914  
3308 2915          if (dump_opt['b'] >= 5) {
3309 2916                  char blkbuf[BP_SPRINTF_LEN];
3310 2917                  snprintf_blkptr(blkbuf, sizeof (blkbuf), bp);
3311 2918                  (void) printf("[%s] %s\n",
3312 2919                      "deferred free", blkbuf);
  
3313 2920          }
3314 2921          zdb_count_block(zcb, NULL, bp, ZDB_OT_DEFERRED);
3315 2922          return (0);
3316 2923  }
3317 2924  
3318 2925  static int
3319 2926  dump_block_stats(spa_t *spa)
3320 2927  {
3321 2928          zdb_cb_t zcb;
3322 2929          zdb_blkstats_t *zb, *tzb;
3323      -        uint64_t norm_alloc, norm_space, total_alloc, total_found;
     2930 +        uint64_t norm_alloc, spec_alloc, norm_space, total_alloc, total_found;
3324 2931          int flags = TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA | TRAVERSE_HARD;
3325 2932          boolean_t leaks = B_FALSE;
3326 2933  
3327 2934          bzero(&zcb, sizeof (zcb));
3328 2935          (void) printf("\nTraversing all blocks %s%s%s%s%s...\n\n",
3329 2936              (dump_opt['c'] || !dump_opt['L']) ? "to verify " : "",
3330 2937              (dump_opt['c'] == 1) ? "metadata " : "",
3331 2938              dump_opt['c'] ? "checksums " : "",
3332 2939              (dump_opt['c'] && !dump_opt['L']) ? "and verify " : "",
3333 2940              !dump_opt['L'] ? "nothing leaked " : "");
3334 2941  
3335 2942          /*
3336 2943           * Load all space maps as SM_ALLOC maps, then traverse the pool
3337 2944           * claiming each block we discover.  If the pool is perfectly
3338 2945           * consistent, the space maps will be empty when we're done.
3339 2946           * Anything left over is a leak; any block we can't claim (because
  
3340 2947           * it's not part of any space map) is a double allocation,
3341 2948           * reference to a freed block, or an unclaimed log block.
3342 2949           */
3343 2950          zdb_leak_init(spa, &zcb);
3344 2951  
3345 2952          /*
3346 2953           * If there's a deferred-free bplist, process that first.
3347 2954           */
3348 2955          (void) bpobj_iterate_nofree(&spa->spa_deferred_bpobj,
3349 2956              count_block_cb, &zcb, NULL);
3350      -
3351 2957          if (spa_version(spa) >= SPA_VERSION_DEADLISTS) {
3352 2958                  (void) bpobj_iterate_nofree(&spa->spa_dsl_pool->dp_free_bpobj,
3353 2959                      count_block_cb, &zcb, NULL);
3354 2960          }
3355      -
3356      -        zdb_claim_removing(spa, &zcb);
3357      -
3358 2961          if (spa_feature_is_active(spa, SPA_FEATURE_ASYNC_DESTROY)) {
3359 2962                  VERIFY3U(0, ==, bptree_iterate(spa->spa_meta_objset,
3360 2963                      spa->spa_dsl_pool->dp_bptree_obj, B_FALSE, count_block_cb,
3361 2964                      &zcb, NULL));
3362 2965          }
3363 2966  
3364 2967          if (dump_opt['c'] > 1)
3365 2968                  flags |= TRAVERSE_PREFETCH_DATA;
3366 2969  
3367 2970          zcb.zcb_totalasize = metaslab_class_get_alloc(spa_normal_class(spa));
3368 2971          zcb.zcb_start = zcb.zcb_lastprint = gethrtime();
3369      -        zcb.zcb_haderrors |= traverse_pool(spa, 0, flags, zdb_blkptr_cb, &zcb);
     2972 +        zcb.zcb_haderrors |= traverse_pool(spa, 0, UINT64_MAX,
     2973 +            flags, zdb_blkptr_cb, &zcb, NULL);
3370 2974  
3371 2975          /*
3372 2976           * If we've traversed the data blocks then we need to wait for those
3373 2977           * I/Os to complete. We leverage "The Godfather" zio to wait on
3374 2978           * all async I/Os to complete.
3375 2979           */
3376 2980          if (dump_opt['c']) {
3377 2981                  for (int i = 0; i < max_ncpus; i++) {
3378 2982                          (void) zio_wait(spa->spa_async_zio_root[i]);
3379 2983                          spa->spa_async_zio_root[i] = zio_root(spa, NULL, NULL,
3380 2984                              ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE |
3381 2985                              ZIO_FLAG_GODFATHER);
3382 2986                  }
3383 2987          }
3384 2988  
3385 2989          if (zcb.zcb_haderrors) {
3386 2990                  (void) printf("\nError counts:\n\n");
3387 2991                  (void) printf("\t%5s  %s\n", "errno", "count");
3388 2992                  for (int e = 0; e < 256; e++) {
  
3389 2993                          if (zcb.zcb_errors[e] != 0) {
3390 2994                                  (void) printf("\t%5d  %llu\n",
3391 2995                                      e, (u_longlong_t)zcb.zcb_errors[e]);
3392 2996                          }
3393 2997                  }
3394 2998          }
3395 2999  
3396 3000          /*
3397 3001           * Report any leaked segments.
3398 3002           */
3399      -        leaks |= zdb_leak_fini(spa, &zcb);
     3003 +        zdb_leak_fini(spa);
3400 3004  
3401 3005          tzb = &zcb.zcb_type[ZB_TOTAL][ZDB_OT_TOTAL];
3402 3006  
3403 3007          norm_alloc = metaslab_class_get_alloc(spa_normal_class(spa));
     3008 +        spec_alloc = metaslab_class_get_alloc(spa_special_class(spa));
3404 3009          norm_space = metaslab_class_get_space(spa_normal_class(spa));
3405 3010  
     3011 +        norm_alloc += spec_alloc;
3406 3012          total_alloc = norm_alloc + metaslab_class_get_alloc(spa_log_class(spa));
3407      -        total_found = tzb->zb_asize - zcb.zcb_dedup_asize +
3408      -            zcb.zcb_removing_size;
     3013 +        total_found = tzb->zb_asize - zcb.zcb_dedup_asize;
3409 3014  
3410 3015          if (total_found == total_alloc) {
3411 3016                  if (!dump_opt['L'])
3412 3017                          (void) printf("\n\tNo leaks (block sum matches space"
3413 3018                              " maps exactly)\n");
3414 3019          } else {
3415 3020                  (void) printf("block traversal size %llu != alloc %llu "
3416 3021                      "(%s %lld)\n",
3417 3022                      (u_longlong_t)total_found,
3418 3023                      (u_longlong_t)total_alloc,
3419 3024                      (dump_opt['L']) ? "unreachable" : "leaked",
3420 3025                      (longlong_t)(total_alloc - total_found));
3421 3026                  leaks = B_TRUE;
3422 3027          }
3423 3028  
3424 3029          if (tzb->zb_count == 0)
3425 3030                  return (2);
3426 3031  
3427 3032          (void) printf("\n");
3428 3033          (void) printf("\tbp count:      %10llu\n",
3429 3034              (u_longlong_t)tzb->zb_count);
3430 3035          (void) printf("\tganged count:  %10llu\n",
3431 3036              (longlong_t)tzb->zb_gangs);
3432 3037          (void) printf("\tbp logical:    %10llu      avg: %6llu\n",
3433 3038              (u_longlong_t)tzb->zb_lsize,
3434 3039              (u_longlong_t)(tzb->zb_lsize / tzb->zb_count));
3435 3040          (void) printf("\tbp physical:   %10llu      avg:"
3436 3041              " %6llu     compression: %6.2f\n",
3437 3042              (u_longlong_t)tzb->zb_psize,
3438 3043              (u_longlong_t)(tzb->zb_psize / tzb->zb_count),
3439 3044              (double)tzb->zb_lsize / tzb->zb_psize);
  
3440 3045          (void) printf("\tbp allocated:  %10llu      avg:"
3441 3046              " %6llu     compression: %6.2f\n",
3442 3047              (u_longlong_t)tzb->zb_asize,
3443 3048              (u_longlong_t)(tzb->zb_asize / tzb->zb_count),
3444 3049              (double)tzb->zb_lsize / tzb->zb_asize);
3445 3050          (void) printf("\tbp deduped:    %10llu    ref>1:"
3446 3051              " %6llu   deduplication: %6.2f\n",
3447 3052              (u_longlong_t)zcb.zcb_dedup_asize,
3448 3053              (u_longlong_t)zcb.zcb_dedup_blocks,
3449 3054              (double)zcb.zcb_dedup_asize / tzb->zb_asize + 1.0);
     3055 +        if (spec_alloc != 0) {
     3056 +                (void) printf("\tspecial allocated: %10llu\n",
     3057 +                    (u_longlong_t)spec_alloc);
     3058 +        }
3450 3059          (void) printf("\tSPA allocated: %10llu     used: %5.2f%%\n",
3451 3060              (u_longlong_t)norm_alloc, 100.0 * norm_alloc / norm_space);
3452 3061  
3453 3062          for (bp_embedded_type_t i = 0; i < NUM_BP_EMBEDDED_TYPES; i++) {
3454 3063                  if (zcb.zcb_embedded_blocks[i] == 0)
3455 3064                          continue;
3456 3065                  (void) printf("\n");
3457 3066                  (void) printf("\tadditional, non-pointer bps of type %u: "
3458 3067                      "%10llu\n",
3459 3068                      i, (u_longlong_t)zcb.zcb_embedded_blocks[i]);
3460 3069  
3461 3070                  if (dump_opt['b'] >= 3) {
3462 3071                          (void) printf("\t number of (compressed) bytes:  "
3463 3072                              "number of bps\n");
3464 3073                          dump_histogram(zcb.zcb_embedded_histogram[i],
  
3465 3074                              sizeof (zcb.zcb_embedded_histogram[i]) /
3466 3075                              sizeof (zcb.zcb_embedded_histogram[i][0]), 0);
3467 3076                  }
3468 3077          }
3469 3078  
3470 3079          if (tzb->zb_ditto_samevdev != 0) {
3471 3080                  (void) printf("\tDittoed blocks on same vdev: %llu\n",
3472 3081                      (longlong_t)tzb->zb_ditto_samevdev);
3473 3082          }
3474 3083  
3475      -        for (uint64_t v = 0; v < spa->spa_root_vdev->vdev_children; v++) {
3476      -                vdev_t *vd = spa->spa_root_vdev->vdev_child[v];
3477      -                vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
3478      -
3479      -                if (vim == NULL) {
3480      -                        continue;
3481      -                }
3482      -
3483      -                char mem[32];
3484      -                zdb_nicenum(vdev_indirect_mapping_num_entries(vim),
3485      -                    mem, vdev_indirect_mapping_size(vim));
3486      -
3487      -                (void) printf("\tindirect vdev id %llu has %llu segments "
3488      -                    "(%s in memory)\n",
3489      -                    (longlong_t)vd->vdev_id,
3490      -                    (longlong_t)vdev_indirect_mapping_num_entries(vim), mem);
3491      -        }
3492      -
3493 3084          if (dump_opt['b'] >= 2) {
3494 3085                  int l, t, level;
3495 3086                  (void) printf("\nBlocks\tLSIZE\tPSIZE\tASIZE"
3496 3087                      "\t  avg\t comp\t%%Total\tType\n");
3497 3088  
3498 3089                  for (t = 0; t <= ZDB_OT_TOTAL; t++) {
3499 3090                          char csize[32], lsize[32], psize[32], asize[32];
3500 3091                          char avg[32], gang[32];
3501 3092                          const char *typename;
3502 3093  
3503 3094                          /* make sure nicenum has enough space */
3504 3095                          CTASSERT(sizeof (csize) >= NN_NUMBUF_SZ);
3505 3096                          CTASSERT(sizeof (lsize) >= NN_NUMBUF_SZ);
3506 3097                          CTASSERT(sizeof (psize) >= NN_NUMBUF_SZ);
3507 3098                          CTASSERT(sizeof (asize) >= NN_NUMBUF_SZ);
3508 3099                          CTASSERT(sizeof (avg) >= NN_NUMBUF_SZ);
3509 3100                          CTASSERT(sizeof (gang) >= NN_NUMBUF_SZ);
3510 3101  
3511 3102                          if (t < DMU_OT_NUMTYPES)
3512 3103                                  typename = dmu_ot[t].ot_name;
3513 3104                          else
3514 3105                                  typename = zdb_ot_extname[t - DMU_OT_NUMTYPES];
3515 3106  
3516 3107                          if (zcb.zcb_type[ZB_TOTAL][t].zb_asize == 0) {
3517 3108                                  (void) printf("%6s\t%5s\t%5s\t%5s"
3518 3109                                      "\t%5s\t%5s\t%6s\t%s\n",
3519 3110                                      "-",
3520 3111                                      "-",
3521 3112                                      "-",
3522 3113                                      "-",
3523 3114                                      "-",
3524 3115                                      "-",
3525 3116                                      "-",
3526 3117                                      typename);
3527 3118                                  continue;
3528 3119                          }
3529 3120  
3530 3121                          for (l = ZB_TOTAL - 1; l >= -1; l--) {
3531 3122                                  level = (l == -1 ? ZB_TOTAL : l);
3532 3123                                  zb = &zcb.zcb_type[level][t];
3533 3124  
3534 3125                                  if (zb->zb_asize == 0)
3535 3126                                          continue;
3536 3127  
3537 3128                                  if (dump_opt['b'] < 3 && level != ZB_TOTAL)
3538 3129                                          continue;
3539 3130  
3540 3131                                  if (level == 0 && zb->zb_asize ==
3541 3132                                      zcb.zcb_type[ZB_TOTAL][t].zb_asize)
3542 3133                                          continue;
3543 3134  
3544 3135                                  zdb_nicenum(zb->zb_count, csize,
3545 3136                                      sizeof (csize));
3546 3137                                  zdb_nicenum(zb->zb_lsize, lsize,
3547 3138                                      sizeof (lsize));
3548 3139                                  zdb_nicenum(zb->zb_psize, psize,
3549 3140                                      sizeof (psize));
3550 3141                                  zdb_nicenum(zb->zb_asize, asize,
3551 3142                                      sizeof (asize));
3552 3143                                  zdb_nicenum(zb->zb_asize / zb->zb_count, avg,
3553 3144                                      sizeof (avg));
3554 3145                                  zdb_nicenum(zb->zb_gangs, gang, sizeof (gang));
3555 3146  
3556 3147                                  (void) printf("%6s\t%5s\t%5s\t%5s\t%5s"
3557 3148                                      "\t%5.2f\t%6.2f\t",
3558 3149                                      csize, lsize, psize, asize, avg,
3559 3150                                      (double)zb->zb_lsize / zb->zb_psize,
3560 3151                                      100.0 * zb->zb_asize / tzb->zb_asize);
3561 3152  
3562 3153                                  if (level == ZB_TOTAL)
3563 3154                                          (void) printf("%s\n", typename);
3564 3155                                  else
3565 3156                                          (void) printf("    L%d %s\n",
3566 3157                                              level, typename);
3567 3158  
3568 3159                                  if (dump_opt['b'] >= 3 && zb->zb_gangs > 0) {
3569 3160                                          (void) printf("\t number of ganged "
3570 3161                                              "blocks: %s\n", gang);
3571 3162                                  }
3572 3163  
3573 3164                                  if (dump_opt['b'] >= 4) {
3574 3165                                          (void) printf("psize "
3575 3166                                              "(in 512-byte sectors): "
3576 3167                                              "number of blocks\n");
3577 3168                                          dump_histogram(zb->zb_psize_histogram,
3578 3169                                              PSIZE_HISTO_SIZE, 0);
3579 3170                                  }
3580 3171                          }
3581 3172                  }
3582 3173          }
3583 3174  
3584 3175          (void) printf("\n");
3585 3176  
3586 3177          if (leaks)
3587 3178                  return (2);
3588 3179  
3589 3180          if (zcb.zcb_haderrors)
3590 3181                  return (3);
3591 3182  
3592 3183          return (0);
3593 3184  }
3594 3185  
3595 3186  typedef struct zdb_ddt_entry {
3596 3187          ddt_key_t       zdde_key;
3597 3188          uint64_t        zdde_ref_blocks;
3598 3189          uint64_t        zdde_ref_lsize;
3599 3190          uint64_t        zdde_ref_psize;
3600 3191          uint64_t        zdde_ref_dsize;
3601 3192          avl_node_t      zdde_node;
3602 3193  } zdb_ddt_entry_t;
3603 3194  
3604 3195  /* ARGSUSED */
3605 3196  static int
3606 3197  zdb_ddt_add_cb(spa_t *spa, zilog_t *zilog, const blkptr_t *bp,
3607 3198      const zbookmark_phys_t *zb, const dnode_phys_t *dnp, void *arg)
3608 3199  {
3609 3200          avl_tree_t *t = arg;
3610 3201          avl_index_t where;
3611 3202          zdb_ddt_entry_t *zdde, zdde_search;
3612 3203  
3613 3204          if (bp == NULL || BP_IS_HOLE(bp) || BP_IS_EMBEDDED(bp))
3614 3205                  return (0);
  
3615 3206  
3616 3207          if (dump_opt['S'] > 1 && zb->zb_level == ZB_ROOT_LEVEL) {
3617 3208                  (void) printf("traversing objset %llu, %llu objects, "
3618 3209                      "%lu blocks so far\n",
3619 3210                      (u_longlong_t)zb->zb_objset,
3620 3211                      (u_longlong_t)BP_GET_FILL(bp),
3621 3212                      avl_numnodes(t));
3622 3213          }
3623 3214  
3624 3215          if (BP_IS_HOLE(bp) || BP_GET_CHECKSUM(bp) == ZIO_CHECKSUM_OFF ||
3625      -            BP_GET_LEVEL(bp) > 0 || DMU_OT_IS_METADATA(BP_GET_TYPE(bp)))
     3216 +            BP_IS_METADATA(bp))
3626 3217                  return (0);
3627 3218  
3628 3219          ddt_key_fill(&zdde_search.zdde_key, bp);
3629 3220  
3630 3221          zdde = avl_find(t, &zdde_search, &where);
3631 3222  
3632 3223          if (zdde == NULL) {
3633 3224                  zdde = umem_zalloc(sizeof (*zdde), UMEM_NOFAIL);
3634 3225                  zdde->zdde_key = zdde_search.zdde_key;
3635 3226                  avl_insert(t, zdde, where);
3636 3227          }
3637 3228  
3638 3229          zdde->zdde_ref_blocks += 1;
3639 3230          zdde->zdde_ref_lsize += BP_GET_LSIZE(bp);
3640 3231          zdde->zdde_ref_psize += BP_GET_PSIZE(bp);
3641 3232          zdde->zdde_ref_dsize += bp_get_dsize_sync(spa, bp);
3642 3233  
3643 3234          return (0);
3644 3235  }
3645 3236  
3646 3237  static void
3647 3238  dump_simulated_ddt(spa_t *spa)
3648 3239  {
3649 3240          avl_tree_t t;
3650 3241          void *cookie = NULL;
3651 3242          zdb_ddt_entry_t *zdde;
  
3652 3243          ddt_histogram_t ddh_total;
3653 3244          ddt_stat_t dds_total;
3654 3245  
3655 3246          bzero(&ddh_total, sizeof (ddh_total));
3656 3247          bzero(&dds_total, sizeof (dds_total));
3657 3248          avl_create(&t, ddt_entry_compare,
3658 3249              sizeof (zdb_ddt_entry_t), offsetof(zdb_ddt_entry_t, zdde_node));
3659 3250  
3660 3251          spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
3661 3252  
3662      -        (void) traverse_pool(spa, 0, TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA,
3663      -            zdb_ddt_add_cb, &t);
     3253 +        (void) traverse_pool(spa, 0, UINT64_MAX,
     3254 +            TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA,
     3255 +            zdb_ddt_add_cb, &t, NULL);
3664 3256  
3665 3257          spa_config_exit(spa, SCL_CONFIG, FTAG);
3666 3258  
3667 3259          while ((zdde = avl_destroy_nodes(&t, &cookie)) != NULL) {
3668 3260                  ddt_stat_t dds;
3669 3261                  uint64_t refcnt = zdde->zdde_ref_blocks;
3670 3262                  ASSERT(refcnt != 0);
3671 3263  
3672 3264                  dds.dds_blocks = zdde->zdde_ref_blocks / refcnt;
3673 3265                  dds.dds_lsize = zdde->zdde_ref_lsize / refcnt;
3674 3266                  dds.dds_psize = zdde->zdde_ref_psize / refcnt;
3675 3267                  dds.dds_dsize = zdde->zdde_ref_dsize / refcnt;
3676 3268  
3677 3269                  dds.dds_ref_blocks = zdde->zdde_ref_blocks;
3678 3270                  dds.dds_ref_lsize = zdde->zdde_ref_lsize;
3679 3271                  dds.dds_ref_psize = zdde->zdde_ref_psize;
3680 3272                  dds.dds_ref_dsize = zdde->zdde_ref_dsize;
3681 3273  
3682 3274                  ddt_stat_add(&ddh_total.ddh_stat[highbit64(refcnt) - 1],
3683 3275                      &dds, 0);
3684 3276  
3685 3277                  umem_free(zdde, sizeof (*zdde));
3686 3278          }
3687 3279  
3688 3280          avl_destroy(&t);
  
3689 3281  
3690 3282          ddt_histogram_stat(&dds_total, &ddh_total);
3691 3283  
3692 3284          (void) printf("Simulated DDT histogram:\n");
3693 3285  
3694 3286          zpool_dump_ddt(&dds_total, &ddh_total);
3695 3287  
3696 3288          dump_dedup_ratio(&dds_total);
3697 3289  }
3698 3290  
3699      -static int
3700      -verify_device_removal_feature_counts(spa_t *spa)
3701      -{
3702      -        uint64_t dr_feature_refcount = 0;
3703      -        uint64_t oc_feature_refcount = 0;
3704      -        uint64_t indirect_vdev_count = 0;
3705      -        uint64_t precise_vdev_count = 0;
3706      -        uint64_t obsolete_counts_object_count = 0;
3707      -        uint64_t obsolete_sm_count = 0;
3708      -        uint64_t obsolete_counts_count = 0;
3709      -        uint64_t scip_count = 0;
3710      -        uint64_t obsolete_bpobj_count = 0;
3711      -        int ret = 0;
3712      -
3713      -        spa_condensing_indirect_phys_t *scip =
3714      -            &spa->spa_condensing_indirect_phys;
3715      -        if (scip->scip_next_mapping_object != 0) {
3716      -                vdev_t *vd = spa->spa_root_vdev->vdev_child[scip->scip_vdev];
3717      -                ASSERT(scip->scip_prev_obsolete_sm_object != 0);
3718      -                ASSERT3P(vd->vdev_ops, ==, &vdev_indirect_ops);
3719      -
3720      -                (void) printf("Condensing indirect vdev %llu: new mapping "
3721      -                    "object %llu, prev obsolete sm %llu\n",
3722      -                    (u_longlong_t)scip->scip_vdev,
3723      -                    (u_longlong_t)scip->scip_next_mapping_object,
3724      -                    (u_longlong_t)scip->scip_prev_obsolete_sm_object);
3725      -                if (scip->scip_prev_obsolete_sm_object != 0) {
3726      -                        space_map_t *prev_obsolete_sm = NULL;
3727      -                        VERIFY0(space_map_open(&prev_obsolete_sm,
3728      -                            spa->spa_meta_objset,
3729      -                            scip->scip_prev_obsolete_sm_object,
3730      -                            0, vd->vdev_asize, 0));
3731      -                        space_map_update(prev_obsolete_sm);
3732      -                        dump_spacemap(spa->spa_meta_objset, prev_obsolete_sm);
3733      -                        (void) printf("\n");
3734      -                        space_map_close(prev_obsolete_sm);
3735      -                }
3736      -
3737      -                scip_count += 2;
3738      -        }
3739      -
3740      -        for (uint64_t i = 0; i < spa->spa_root_vdev->vdev_children; i++) {
3741      -                vdev_t *vd = spa->spa_root_vdev->vdev_child[i];
3742      -                vdev_indirect_config_t *vic = &vd->vdev_indirect_config;
3743      -
3744      -                if (vic->vic_mapping_object != 0) {
3745      -                        ASSERT(vd->vdev_ops == &vdev_indirect_ops ||
3746      -                            vd->vdev_removing);
3747      -                        indirect_vdev_count++;
3748      -
3749      -                        if (vd->vdev_indirect_mapping->vim_havecounts) {
3750      -                                obsolete_counts_count++;
3751      -                        }
3752      -                }
3753      -                if (vdev_obsolete_counts_are_precise(vd)) {
3754      -                        ASSERT(vic->vic_mapping_object != 0);
3755      -                        precise_vdev_count++;
3756      -                }
3757      -                if (vdev_obsolete_sm_object(vd) != 0) {
3758      -                        ASSERT(vic->vic_mapping_object != 0);
3759      -                        obsolete_sm_count++;
3760      -                }
3761      -        }
3762      -
3763      -        (void) feature_get_refcount(spa,
3764      -            &spa_feature_table[SPA_FEATURE_DEVICE_REMOVAL],
3765      -            &dr_feature_refcount);
3766      -        (void) feature_get_refcount(spa,
3767      -            &spa_feature_table[SPA_FEATURE_OBSOLETE_COUNTS],
3768      -            &oc_feature_refcount);
3769      -
3770      -        if (dr_feature_refcount != indirect_vdev_count) {
3771      -                ret = 1;
3772      -                (void) printf("Number of indirect vdevs (%llu) " \
3773      -                    "does not match feature count (%llu)\n",
3774      -                    (u_longlong_t)indirect_vdev_count,
3775      -                    (u_longlong_t)dr_feature_refcount);
3776      -        } else {
3777      -                (void) printf("Verified device_removal feature refcount " \
3778      -                    "of %llu is correct\n",
3779      -                    (u_longlong_t)dr_feature_refcount);
3780      -        }
3781      -
3782      -        if (zap_contains(spa_meta_objset(spa), DMU_POOL_DIRECTORY_OBJECT,
3783      -            DMU_POOL_OBSOLETE_BPOBJ) == 0) {
3784      -                obsolete_bpobj_count++;
3785      -        }
3786      -
3787      -
3788      -        obsolete_counts_object_count = precise_vdev_count;
3789      -        obsolete_counts_object_count += obsolete_sm_count;
3790      -        obsolete_counts_object_count += obsolete_counts_count;
3791      -        obsolete_counts_object_count += scip_count;
3792      -        obsolete_counts_object_count += obsolete_bpobj_count;
3793      -        obsolete_counts_object_count += remap_deadlist_count;
3794      -
3795      -        if (oc_feature_refcount != obsolete_counts_object_count) {
3796      -                ret = 1;
3797      -                (void) printf("Number of obsolete counts objects (%llu) " \
3798      -                    "does not match feature count (%llu)\n",
3799      -                    (u_longlong_t)obsolete_counts_object_count,
3800      -                    (u_longlong_t)oc_feature_refcount);
3801      -                (void) printf("pv:%llu os:%llu oc:%llu sc:%llu "
3802      -                    "ob:%llu rd:%llu\n",
3803      -                    (u_longlong_t)precise_vdev_count,
3804      -                    (u_longlong_t)obsolete_sm_count,
3805      -                    (u_longlong_t)obsolete_counts_count,
3806      -                    (u_longlong_t)scip_count,
3807      -                    (u_longlong_t)obsolete_bpobj_count,
3808      -                    (u_longlong_t)remap_deadlist_count);
3809      -        } else {
3810      -                (void) printf("Verified indirect_refcount feature refcount " \
3811      -                    "of %llu is correct\n",
3812      -                    (u_longlong_t)oc_feature_refcount);
3813      -        }
3814      -        return (ret);
3815      -}
3816      -
3817 3291  static void
3818 3292  dump_zpool(spa_t *spa)
3819 3293  {
3820 3294          dsl_pool_t *dp = spa_get_dsl(spa);
3821 3295          int rc = 0;
3822 3296  
3823 3297          if (dump_opt['S']) {
3824 3298                  dump_simulated_ddt(spa);
3825 3299                  return;
3826 3300          }
3827 3301  
3828 3302          if (!dump_opt['e'] && dump_opt['C'] > 1) {
3829 3303                  (void) printf("\nCached configuration:\n");
3830 3304                  dump_nvlist(spa->spa_config, 8);
3831 3305          }
3832 3306  
3833 3307          if (dump_opt['C'])
3834 3308                  dump_config(spa);
3835 3309  
3836 3310          if (dump_opt['u'])
3837 3311                  dump_uberblock(&spa->spa_uberblock, "\nUberblock:\n", "\n");
3838 3312  
3839 3313          if (dump_opt['D'])
  
3840 3314                  dump_all_ddts(spa);
3841 3315  
3842 3316          if (dump_opt['d'] > 2 || dump_opt['m'])
3843 3317                  dump_metaslabs(spa);
3844 3318          if (dump_opt['M'])
3845 3319                  dump_metaslab_groups(spa);
3846 3320  
3847 3321          if (dump_opt['d'] || dump_opt['i']) {
3848 3322                  dump_dir(dp->dp_meta_objset);
3849 3323                  if (dump_opt['d'] >= 3) {
3850      -                        dsl_pool_t *dp = spa->spa_dsl_pool;
3851 3324                          dump_full_bpobj(&spa->spa_deferred_bpobj,
3852 3325                              "Deferred frees", 0);
3853 3326                          if (spa_version(spa) >= SPA_VERSION_DEADLISTS) {
3854      -                                dump_full_bpobj(&dp->dp_free_bpobj,
     3327 +                                dump_full_bpobj(
     3328 +                                    &spa->spa_dsl_pool->dp_free_bpobj,
3855 3329                                      "Pool snapshot frees", 0);
3856 3330                          }
3857      -                        if (bpobj_is_open(&dp->dp_obsolete_bpobj)) {
3858      -                                ASSERT(spa_feature_is_enabled(spa,
3859      -                                    SPA_FEATURE_DEVICE_REMOVAL));
3860      -                                dump_full_bpobj(&dp->dp_obsolete_bpobj,
3861      -                                    "Pool obsolete blocks", 0);
3862      -                        }
3863 3331  
3864 3332                          if (spa_feature_is_active(spa,
3865 3333                              SPA_FEATURE_ASYNC_DESTROY)) {
3866 3334                                  dump_bptree(spa->spa_meta_objset,
3867      -                                    dp->dp_bptree_obj,
     3335 +                                    spa->spa_dsl_pool->dp_bptree_obj,
3868 3336                                      "Pool dataset frees");
3869 3337                          }
3870 3338                          dump_dtl(spa->spa_root_vdev, 0);
3871 3339                  }
3872 3340                  (void) dmu_objset_find(spa_name(spa), dump_one_dir,
3873 3341                      NULL, DS_FIND_SNAPSHOTS | DS_FIND_CHILDREN);
3874 3342  
3875 3343                  for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {
3876 3344                          uint64_t refcount;
3877 3345  
3878 3346                          if (!(spa_feature_table[f].fi_flags &
3879 3347                              ZFEATURE_FLAG_PER_DATASET) ||
3880 3348                              !spa_feature_is_enabled(spa, f)) {
3881 3349                                  ASSERT0(dataset_feature_count[f]);
3882 3350                                  continue;
3883 3351                          }
3884 3352                          (void) feature_get_refcount(spa,
3885 3353                              &spa_feature_table[f], &refcount);
3886 3354                          if (dataset_feature_count[f] != refcount) {
3887 3355                                  (void) printf("%s feature refcount mismatch: "
3888 3356                                      "%lld datasets != %lld refcount\n",
3889 3357                                      spa_feature_table[f].fi_uname,
  
3890 3358                                      (longlong_t)dataset_feature_count[f],
3891 3359                                      (longlong_t)refcount);
3892 3360                                  rc = 2;
3893 3361                          } else {
3894 3362                                  (void) printf("Verified %s feature refcount "
3895 3363                                      "of %llu is correct\n",
3896 3364                                      spa_feature_table[f].fi_uname,
3897 3365                                      (longlong_t)refcount);
3898 3366                          }
3899 3367                  }
3900      -
3901      -                if (rc == 0) {
3902      -                        rc = verify_device_removal_feature_counts(spa);
3903      -                }
3904 3368          }
3905 3369          if (rc == 0 && (dump_opt['b'] || dump_opt['c']))
3906 3370                  rc = dump_block_stats(spa);
3907 3371  
3908 3372          if (rc == 0)
3909 3373                  rc = verify_spacemap_refcounts(spa);
3910 3374  
3911 3375          if (dump_opt['s'])
3912 3376                  show_pool_stats(spa);
3913 3377  
3914 3378          if (dump_opt['h'])
3915 3379                  dump_history(spa);
3916 3380  
3917 3381          if (rc != 0) {
3918 3382                  dump_debug_buffer();
3919 3383                  exit(rc);
3920 3384          }
3921 3385  }
3922 3386  
3923 3387  #define ZDB_FLAG_CHECKSUM       0x0001
3924 3388  #define ZDB_FLAG_DECOMPRESS     0x0002
3925 3389  #define ZDB_FLAG_BSWAP          0x0004
3926 3390  #define ZDB_FLAG_GBH            0x0008
3927 3391  #define ZDB_FLAG_INDIRECT       0x0010
3928 3392  #define ZDB_FLAG_PHYS           0x0020
3929 3393  #define ZDB_FLAG_RAW            0x0040
3930 3394  #define ZDB_FLAG_PRINT_BLKPTR   0x0080
3931 3395  
3932 3396  static int flagbits[256];
3933 3397  
3934 3398  static void
3935 3399  zdb_print_blkptr(blkptr_t *bp, int flags)
3936 3400  {
3937 3401          char blkbuf[BP_SPRINTF_LEN];
3938 3402  
3939 3403          if (flags & ZDB_FLAG_BSWAP)
3940 3404                  byteswap_uint64_array((void *)bp, sizeof (blkptr_t));
3941 3405  
3942 3406          snprintf_blkptr(blkbuf, sizeof (blkbuf), bp);
3943 3407          (void) printf("%s\n", blkbuf);
3944 3408  }
3945 3409  
3946 3410  static void
3947 3411  zdb_dump_indirect(blkptr_t *bp, int nbps, int flags)
3948 3412  {
3949 3413          int i;
3950 3414  
3951 3415          for (i = 0; i < nbps; i++)
3952 3416                  zdb_print_blkptr(&bp[i], flags);
3953 3417  }
3954 3418  
3955 3419  static void
3956 3420  zdb_dump_gbh(void *buf, int flags)
3957 3421  {
3958 3422          zdb_dump_indirect((blkptr_t *)buf, SPA_GBH_NBLKPTRS, flags);
3959 3423  }
3960 3424  
3961 3425  static void
3962 3426  zdb_dump_block_raw(void *buf, uint64_t size, int flags)
3963 3427  {
3964 3428          if (flags & ZDB_FLAG_BSWAP)
3965 3429                  byteswap_uint64_array(buf, size);
3966 3430          (void) write(1, buf, size);
3967 3431  }
3968 3432  
3969 3433  static void
3970 3434  zdb_dump_block(char *label, void *buf, uint64_t size, int flags)
3971 3435  {
3972 3436          uint64_t *d = (uint64_t *)buf;
3973 3437          unsigned nwords = size / sizeof (uint64_t);
3974 3438          int do_bswap = !!(flags & ZDB_FLAG_BSWAP);
3975 3439          unsigned i, j;
3976 3440          const char *hdr;
3977 3441          char *c;
3978 3442  
3979 3443  
3980 3444          if (do_bswap)
3981 3445                  hdr = " 7 6 5 4 3 2 1 0   f e d c b a 9 8";
3982 3446          else
3983 3447                  hdr = " 0 1 2 3 4 5 6 7   8 9 a b c d e f";
3984 3448  
3985 3449          (void) printf("\n%s\n%6s   %s  0123456789abcdef\n", label, "", hdr);
3986 3450  
3987 3451          for (i = 0; i < nwords; i += 2) {
3988 3452                  (void) printf("%06llx:  %016llx  %016llx  ",
3989 3453                      (u_longlong_t)(i * sizeof (uint64_t)),
3990 3454                      (u_longlong_t)(do_bswap ? BSWAP_64(d[i]) : d[i]),
3991 3455                      (u_longlong_t)(do_bswap ? BSWAP_64(d[i + 1]) : d[i + 1]));
3992 3456  
3993 3457                  c = (char *)&d[i];
3994 3458                  for (j = 0; j < 2 * sizeof (uint64_t); j++)
3995 3459                          (void) printf("%c", isprint(c[j]) ? c[j] : '.');
3996 3460                  (void) printf("\n");
3997 3461          }
3998 3462  }
3999 3463  
4000 3464  /*
4001 3465   * There are two acceptable formats:
4002 3466   *      leaf_name         - For example: c1t0d0 or /tmp/ztest.0a
4003 3467   *      child[.child]*    - For example: 0.1.1
4004 3468   *
4005 3469   * The second form can be used to specify arbitrary vdevs anywhere
4006 3470   * in the heirarchy.  For example, in a pool with a mirror of
4007 3471   * RAID-Zs, you can specify either RAID-Z vdev with 0.0 or 0.1 .
4008 3472   */
4009 3473  static vdev_t *
4010 3474  zdb_vdev_lookup(vdev_t *vdev, const char *path)
4011 3475  {
4012 3476          char *s, *p, *q;
4013 3477          unsigned i;
4014 3478  
4015 3479          if (vdev == NULL)
4016 3480                  return (NULL);
4017 3481  
4018 3482          /* First, assume the x.x.x.x format */
4019 3483          i = strtoul(path, &s, 10);
4020 3484          if (s == path || (s && *s != '.' && *s != '\0'))
4021 3485                  goto name;
4022 3486          if (i >= vdev->vdev_children)
4023 3487                  return (NULL);
4024 3488  
4025 3489          vdev = vdev->vdev_child[i];
4026 3490          if (*s == '\0')
4027 3491                  return (vdev);
4028 3492          return (zdb_vdev_lookup(vdev, s+1));
4029 3493  
4030 3494  name:
4031 3495          for (i = 0; i < vdev->vdev_children; i++) {
4032 3496                  vdev_t *vc = vdev->vdev_child[i];
4033 3497  
4034 3498                  if (vc->vdev_path == NULL) {
4035 3499                          vc = zdb_vdev_lookup(vc, path);
4036 3500                          if (vc == NULL)
4037 3501                                  continue;
4038 3502                          else
4039 3503                                  return (vc);
4040 3504                  }
4041 3505  
4042 3506                  p = strrchr(vc->vdev_path, '/');
4043 3507                  p = p ? p + 1 : vc->vdev_path;
4044 3508                  q = &vc->vdev_path[strlen(vc->vdev_path) - 2];
4045 3509  
4046 3510                  if (strcmp(vc->vdev_path, path) == 0)
4047 3511                          return (vc);
4048 3512                  if (strcmp(p, path) == 0)
4049 3513                          return (vc);
4050 3514                  if (strcmp(q, "s0") == 0 && strncmp(p, path, q - p) == 0)
4051 3515                          return (vc);
4052 3516          }
4053 3517  
4054 3518          return (NULL);
4055 3519  }
4056 3520  
4057 3521  /* ARGSUSED */
4058 3522  static int
4059 3523  random_get_pseudo_bytes_cb(void *buf, size_t len, void *unused)
4060 3524  {
4061 3525          return (random_get_pseudo_bytes(buf, len));
4062 3526  }
4063 3527  
4064 3528  /*
4065 3529   * Read a block from a pool and print it out.  The syntax of the
4066 3530   * block descriptor is:
4067 3531   *
4068 3532   *      pool:vdev_specifier:offset:size[:flags]
4069 3533   *
4070 3534   *      pool           - The name of the pool you wish to read from
4071 3535   *      vdev_specifier - Which vdev (see comment for zdb_vdev_lookup)
4072 3536   *      offset         - offset, in hex, in bytes
4073 3537   *      size           - Amount of data to read, in hex, in bytes
4074 3538   *      flags          - A string of characters specifying options
4075 3539   *               b: Decode a blkptr at given offset within block
4076 3540   *              *c: Calculate and display checksums
4077 3541   *               d: Decompress data before dumping
4078 3542   *               e: Byteswap data before dumping
4079 3543   *               g: Display data as a gang block header
4080 3544   *               i: Display as an indirect block
4081 3545   *               p: Do I/O to physical offset
4082 3546   *               r: Dump raw data to stdout
4083 3547   *
4084 3548   *              * = not yet implemented
4085 3549   */
4086 3550  static void
4087 3551  zdb_read_block(char *thing, spa_t *spa)
4088 3552  {
4089 3553          blkptr_t blk, *bp = &blk;
4090 3554          dva_t *dva = bp->blk_dva;
4091 3555          int flags = 0;
4092 3556          uint64_t offset = 0, size = 0, psize = 0, lsize = 0, blkptr_offset = 0;
4093 3557          zio_t *zio;
4094 3558          vdev_t *vd;
4095 3559          abd_t *pabd;
4096 3560          void *lbuf, *buf;
4097 3561          const char *s, *vdev;
4098 3562          char *p, *dup, *flagstr;
4099 3563          int i, error;
4100 3564  
4101 3565          dup = strdup(thing);
4102 3566          s = strtok(dup, ":");
4103 3567          vdev = s ? s : "";
4104 3568          s = strtok(NULL, ":");
4105 3569          offset = strtoull(s ? s : "", NULL, 16);
4106 3570          s = strtok(NULL, ":");
4107 3571          size = strtoull(s ? s : "", NULL, 16);
4108 3572          s = strtok(NULL, ":");
4109 3573          if (s)
4110 3574                  flagstr = strdup(s);
4111 3575          else
4112 3576                  flagstr = strdup("");
4113 3577  
4114 3578          s = NULL;
4115 3579          if (size == 0)
4116 3580                  s = "size must not be zero";
4117 3581          if (!IS_P2ALIGNED(size, DEV_BSIZE))
4118 3582                  s = "size must be a multiple of sector size";
4119 3583          if (!IS_P2ALIGNED(offset, DEV_BSIZE))
4120 3584                  s = "offset must be a multiple of sector size";
4121 3585          if (s) {
4122 3586                  (void) printf("Invalid block specifier: %s  - %s\n", thing, s);
4123 3587                  free(dup);
4124 3588                  return;
4125 3589          }
4126 3590  
4127 3591          for (s = strtok(flagstr, ":"); s; s = strtok(NULL, ":")) {
4128 3592                  for (i = 0; flagstr[i]; i++) {
4129 3593                          int bit = flagbits[(uchar_t)flagstr[i]];
4130 3594  
4131 3595                          if (bit == 0) {
4132 3596                                  (void) printf("***Invalid flag: %c\n",
4133 3597                                      flagstr[i]);
4134 3598                                  continue;
4135 3599                          }
4136 3600                          flags |= bit;
4137 3601  
4138 3602                          /* If it's not something with an argument, keep going */
4139 3603                          if ((bit & (ZDB_FLAG_CHECKSUM |
4140 3604                              ZDB_FLAG_PRINT_BLKPTR)) == 0)
4141 3605                                  continue;
4142 3606  
4143 3607                          p = &flagstr[i + 1];
4144 3608                          if (bit == ZDB_FLAG_PRINT_BLKPTR)
4145 3609                                  blkptr_offset = strtoull(p, &p, 16);
4146 3610                          if (*p != ':' && *p != '\0') {
4147 3611                                  (void) printf("***Invalid flag arg: '%s'\n", s);
4148 3612                                  free(dup);
4149 3613                                  return;
4150 3614                          }
4151 3615                  }
4152 3616          }
4153 3617          free(flagstr);
4154 3618  
4155 3619          vd = zdb_vdev_lookup(spa->spa_root_vdev, vdev);
4156 3620          if (vd == NULL) {
4157 3621                  (void) printf("***Invalid vdev: %s\n", vdev);
4158 3622                  free(dup);
4159 3623                  return;
4160 3624          } else {
4161 3625                  if (vd->vdev_path)
4162 3626                          (void) fprintf(stderr, "Found vdev: %s\n",
4163 3627                              vd->vdev_path);
4164 3628                  else
4165 3629                          (void) fprintf(stderr, "Found vdev type: %s\n",
4166 3630                              vd->vdev_ops->vdev_op_type);
4167 3631          }
4168 3632  
4169 3633          psize = size;
4170 3634          lsize = size;
4171 3635  
4172 3636          pabd = abd_alloc_linear(SPA_MAXBLOCKSIZE, B_FALSE);
4173 3637          lbuf = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
4174 3638  
4175 3639          BP_ZERO(bp);
4176 3640  
4177 3641          DVA_SET_VDEV(&dva[0], vd->vdev_id);
4178 3642          DVA_SET_OFFSET(&dva[0], offset);
4179 3643          DVA_SET_GANG(&dva[0], !!(flags & ZDB_FLAG_GBH));
4180 3644          DVA_SET_ASIZE(&dva[0], vdev_psize_to_asize(vd, psize));
4181 3645  
4182 3646          BP_SET_BIRTH(bp, TXG_INITIAL, TXG_INITIAL);
4183 3647  
4184 3648          BP_SET_LSIZE(bp, lsize);
4185 3649          BP_SET_PSIZE(bp, psize);
4186 3650          BP_SET_COMPRESS(bp, ZIO_COMPRESS_OFF);
4187 3651          BP_SET_CHECKSUM(bp, ZIO_CHECKSUM_OFF);
4188 3652          BP_SET_TYPE(bp, DMU_OT_NONE);
4189 3653          BP_SET_LEVEL(bp, 0);
4190 3654          BP_SET_DEDUP(bp, 0);
4191 3655          BP_SET_BYTEORDER(bp, ZFS_HOST_BYTEORDER);
4192 3656  
4193 3657          spa_config_enter(spa, SCL_STATE, FTAG, RW_READER);
4194 3658          zio = zio_root(spa, NULL, NULL, 0);
4195 3659  
4196 3660          if (vd == vd->vdev_top) {
4197 3661                  /*
4198 3662                   * Treat this as a normal block read.
4199 3663                   */
4200 3664                  zio_nowait(zio_read(zio, spa, bp, pabd, psize, NULL, NULL,
  
4201 3665                      ZIO_PRIORITY_SYNC_READ,
4202 3666                      ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW, NULL));
4203 3667          } else {
4204 3668                  /*
4205 3669                   * Treat this as a vdev child I/O.
4206 3670                   */
4207 3671                  zio_nowait(zio_vdev_child_io(zio, bp, vd, offset, pabd,
4208 3672                      psize, ZIO_TYPE_READ, ZIO_PRIORITY_SYNC_READ,
4209 3673                      ZIO_FLAG_DONT_CACHE | ZIO_FLAG_DONT_QUEUE |
4210 3674                      ZIO_FLAG_DONT_PROPAGATE | ZIO_FLAG_DONT_RETRY |
4211      -                    ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW | ZIO_FLAG_OPTIONAL,
4212      -                    NULL, NULL));
     3675 +                    ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW, NULL, NULL));
4213 3676          }
4214 3677  
4215 3678          error = zio_wait(zio);
4216 3679          spa_config_exit(spa, SCL_STATE, FTAG);
4217 3680  
4218 3681          if (error) {
4219 3682                  (void) printf("Read of %s failed, error: %d\n", thing, error);
4220 3683                  goto out;
4221 3684          }
4222 3685  
4223 3686          if (flags & ZDB_FLAG_DECOMPRESS) {
4224 3687                  /*
4225 3688                   * We don't know how the data was compressed, so just try
4226 3689                   * every decompress function at every inflated blocksize.
4227 3690                   */
4228 3691                  enum zio_compress c;
4229 3692                  void *pbuf2 = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
4230 3693                  void *lbuf2 = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
4231 3694  
4232 3695                  abd_copy_to_buf(pbuf2, pabd, psize);
4233 3696  
4234 3697                  VERIFY0(abd_iterate_func(pabd, psize, SPA_MAXBLOCKSIZE - psize,
4235 3698                      random_get_pseudo_bytes_cb, NULL));
4236 3699  
4237 3700                  VERIFY0(random_get_pseudo_bytes((uint8_t *)pbuf2 + psize,
4238 3701                      SPA_MAXBLOCKSIZE - psize));
4239 3702  
4240 3703                  for (lsize = SPA_MAXBLOCKSIZE; lsize > psize;
4241 3704                      lsize -= SPA_MINBLOCKSIZE) {
4242 3705                          for (c = 0; c < ZIO_COMPRESS_FUNCTIONS; c++) {
4243 3706                                  if (zio_decompress_data(c, pabd,
4244 3707                                      lbuf, psize, lsize) == 0 &&
4245 3708                                      zio_decompress_data_buf(c, pbuf2,
4246 3709                                      lbuf2, psize, lsize) == 0 &&
4247 3710                                      bcmp(lbuf, lbuf2, lsize) == 0)
4248 3711                                          break;
4249 3712                          }
4250 3713                          if (c != ZIO_COMPRESS_FUNCTIONS)
4251 3714                                  break;
4252 3715                          lsize -= SPA_MINBLOCKSIZE;
4253 3716                  }
4254 3717  
4255 3718                  umem_free(pbuf2, SPA_MAXBLOCKSIZE);
4256 3719                  umem_free(lbuf2, SPA_MAXBLOCKSIZE);
4257 3720  
4258 3721                  if (lsize <= psize) {
4259 3722                          (void) printf("Decompress of %s failed\n", thing);
4260 3723                          goto out;
4261 3724                  }
4262 3725                  buf = lbuf;
4263 3726                  size = lsize;
4264 3727          } else {
4265 3728                  buf = abd_to_buf(pabd);
4266 3729                  size = psize;
4267 3730          }
4268 3731  
4269 3732          if (flags & ZDB_FLAG_PRINT_BLKPTR)
4270 3733                  zdb_print_blkptr((blkptr_t *)(void *)
4271 3734                      ((uintptr_t)buf + (uintptr_t)blkptr_offset), flags);
4272 3735          else if (flags & ZDB_FLAG_RAW)
4273 3736                  zdb_dump_block_raw(buf, size, flags);
4274 3737          else if (flags & ZDB_FLAG_INDIRECT)
4275 3738                  zdb_dump_indirect((blkptr_t *)buf, size / sizeof (blkptr_t),
4276 3739                      flags);
4277 3740          else if (flags & ZDB_FLAG_GBH)
4278 3741                  zdb_dump_gbh(buf, flags);
4279 3742          else
4280 3743                  zdb_dump_block(thing, buf, size, flags);
4281 3744  
4282 3745  out:
4283 3746          abd_free(pabd);
4284 3747          umem_free(lbuf, SPA_MAXBLOCKSIZE);
4285 3748          free(dup);
4286 3749  }
4287 3750  
4288 3751  static void
4289 3752  zdb_embedded_block(char *thing)
4290 3753  {
4291 3754          blkptr_t bp;
4292 3755          unsigned long long *words = (void *)&bp;
4293 3756          char buf[SPA_MAXBLOCKSIZE];
4294 3757          int err;
4295 3758  
4296 3759          bzero(&bp, sizeof (bp));
4297 3760          err = sscanf(thing, "%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx:"
4298 3761              "%llx:%llx:%llx:%llx:%llx:%llx:%llx:%llx",
4299 3762              words + 0, words + 1, words + 2, words + 3,
4300 3763              words + 4, words + 5, words + 6, words + 7,
4301 3764              words + 8, words + 9, words + 10, words + 11,
4302 3765              words + 12, words + 13, words + 14, words + 15);
4303 3766          if (err != 16) {
4304 3767                  (void) printf("invalid input format\n");
4305 3768                  exit(1);
4306 3769          }
4307 3770          ASSERT3U(BPE_GET_LSIZE(&bp), <=, SPA_MAXBLOCKSIZE);
4308 3771          err = decode_embedded_bp(&bp, buf, BPE_GET_LSIZE(&bp));
4309 3772          if (err != 0) {
4310 3773                  (void) printf("decode failed: %u\n", err);
4311 3774                  exit(1);
4312 3775          }
4313 3776          zdb_dump_block_raw(buf, BPE_GET_LSIZE(&bp), 0);
4314 3777  }
4315 3778  
4316 3779  static boolean_t
4317 3780  pool_match(nvlist_t *cfg, char *tgt)
4318 3781  {
4319 3782          uint64_t v, guid = strtoull(tgt, NULL, 0);
4320 3783          char *s;
4321 3784  
4322 3785          if (guid != 0) {
4323 3786                  if (nvlist_lookup_uint64(cfg, ZPOOL_CONFIG_POOL_GUID, &v) == 0)
4324 3787                          return (v == guid);
4325 3788          } else {
4326 3789                  if (nvlist_lookup_string(cfg, ZPOOL_CONFIG_POOL_NAME, &s) == 0)
4327 3790                          return (strcmp(s, tgt) == 0);
4328 3791          }
4329 3792          return (B_FALSE);
4330 3793  }
4331 3794  
4332 3795  static char *
4333 3796  find_zpool(char **target, nvlist_t **configp, int dirc, char **dirv)
4334 3797  {
4335 3798          nvlist_t *pools;
4336 3799          nvlist_t *match = NULL;
4337 3800          char *name = NULL;
4338 3801          char *sepp = NULL;
4339 3802          char sep = '\0';
4340 3803          int count = 0;
4341 3804          importargs_t args;
4342 3805  
4343 3806          bzero(&args, sizeof (args));
4344 3807          args.paths = dirc;
4345 3808          args.path = dirv;
4346 3809          args.can_be_active = B_TRUE;
4347 3810  
4348 3811          if ((sepp = strpbrk(*target, "/@")) != NULL) {
4349 3812                  sep = *sepp;
4350 3813                  *sepp = '\0';
4351 3814          }
4352 3815  
4353 3816          pools = zpool_search_import(g_zfs, &args);
4354 3817  
4355 3818          if (pools != NULL) {
4356 3819                  nvpair_t *elem = NULL;
4357 3820                  while ((elem = nvlist_next_nvpair(pools, elem)) != NULL) {
4358 3821                          verify(nvpair_value_nvlist(elem, configp) == 0);
4359 3822                          if (pool_match(*configp, *target)) {
4360 3823                                  count++;
4361 3824                                  if (match != NULL) {
4362 3825                                          /* print previously found config */
4363 3826                                          if (name != NULL) {
4364 3827                                                  (void) printf("%s\n", name);
4365 3828                                                  dump_nvlist(match, 8);
4366 3829                                                  name = NULL;
4367 3830                                          }
4368 3831                                          (void) printf("%s\n",
4369 3832                                              nvpair_name(elem));
4370 3833                                          dump_nvlist(*configp, 8);
4371 3834                                  } else {
4372 3835                                          match = *configp;
4373 3836                                          name = nvpair_name(elem);
4374 3837                                  }
4375 3838                          }
4376 3839                  }
4377 3840          }
4378 3841          if (count > 1)
4379 3842                  (void) fatal("\tMatched %d pools - use pool GUID "
4380 3843                      "instead of pool name or \n"
4381 3844                      "\tpool name part of a dataset name to select pool", count);
4382 3845  
4383 3846          if (sepp)
4384 3847                  *sepp = sep;
4385 3848          /*
4386 3849           * If pool GUID was specified for pool id, replace it with pool name
4387 3850           */
4388 3851          if (name && (strstr(*target, name) != *target)) {
4389 3852                  int sz = 1 + strlen(name) + ((sepp) ? strlen(sepp) : 0);
4390 3853  
4391 3854                  *target = umem_alloc(sz, UMEM_NOFAIL);
4392 3855                  (void) snprintf(*target, sz, "%s%s", name, sepp ? sepp : "");
4393 3856          }
4394 3857  
4395 3858          *configp = name ? match : NULL;
4396 3859  
4397 3860          return (name);
4398 3861  }
4399 3862  
4400 3863  int
4401 3864  main(int argc, char **argv)
4402 3865  {
4403 3866          int c;
4404 3867          struct rlimit rl = { 1024, 1024 };
4405 3868          spa_t *spa = NULL;
4406 3869          objset_t *os = NULL;
4407 3870          int dump_all = 1;
4408 3871          int verbose = 0;
4409 3872          int error = 0;
4410 3873          char **searchdirs = NULL;
4411 3874          int nsearch = 0;
4412 3875          char *target;
4413 3876          nvlist_t *policy = NULL;
4414 3877          uint64_t max_txg = UINT64_MAX;
4415 3878          int flags = ZFS_IMPORT_MISSING_LOG;
4416 3879          int rewind = ZPOOL_NEVER_REWIND;
4417 3880          char *spa_config_path_env;
4418 3881          boolean_t target_is_spa = B_TRUE;
4419 3882  
4420 3883          (void) setrlimit(RLIMIT_NOFILE, &rl);
4421 3884          (void) enable_extended_FILE_stdio(-1, -1);
4422 3885  
4423 3886          dprintf_setup(&argc, argv);
4424 3887  
4425 3888          /*
4426 3889           * If the SPA_CONFIG_PATH environment variable is set, it overrides the
4427 3890           * default spa_config_path setting. If the -U flag is specified, it
4428 3891           * overrides the environment variable setting in turn.
4429 3892           */
4430 3893          spa_config_path_env = getenv("SPA_CONFIG_PATH");
4431 3894          if (spa_config_path_env != NULL)
4432 3895                  spa_config_path = spa_config_path_env;
4433 3896  
4434 3897          while ((c = getopt(argc, argv,
4435 3898              "AbcCdDeEFGhiI:lLmMo:Op:PqRsSt:uU:vVx:X")) != -1) {
4436 3899                  switch (c) {
4437 3900                  case 'b':
4438 3901                  case 'c':
4439 3902                  case 'C':
4440 3903                  case 'd':
4441 3904                  case 'D':
4442 3905                  case 'E':
4443 3906                  case 'G':
4444 3907                  case 'h':
4445 3908                  case 'i':
4446 3909                  case 'l':
4447 3910                  case 'm':
4448 3911                  case 'M':
4449 3912                  case 'O':
4450 3913                  case 'R':
4451 3914                  case 's':
4452 3915                  case 'S':
4453 3916                  case 'u':
4454 3917                          dump_opt[c]++;
4455 3918                          dump_all = 0;
4456 3919                          break;
4457 3920                  case 'A':
4458 3921                  case 'e':
4459 3922                  case 'F':
4460 3923                  case 'L':
4461 3924                  case 'P':
4462 3925                  case 'q':
4463 3926                  case 'X':
4464 3927                          dump_opt[c]++;
4465 3928                          break;
4466 3929                  /* NB: Sort single match options below. */
4467 3930                  case 'I':
4468 3931                          max_inflight = strtoull(optarg, NULL, 0);
4469 3932                          if (max_inflight == 0) {
4470 3933                                  (void) fprintf(stderr, "maximum number "
4471 3934                                      "of inflight I/Os must be greater "
4472 3935                                      "than 0\n");
4473 3936                                  usage();
4474 3937                          }
4475 3938                          break;
4476 3939                  case 'o':
4477 3940                          error = set_global_var(optarg);
4478 3941                          if (error != 0)
4479 3942                                  usage();
4480 3943                          break;
4481 3944                  case 'p':
4482 3945                          if (searchdirs == NULL) {
4483 3946                                  searchdirs = umem_alloc(sizeof (char *),
4484 3947                                      UMEM_NOFAIL);
4485 3948                          } else {
4486 3949                                  char **tmp = umem_alloc((nsearch + 1) *
4487 3950                                      sizeof (char *), UMEM_NOFAIL);
4488 3951                                  bcopy(searchdirs, tmp, nsearch *
4489 3952                                      sizeof (char *));
4490 3953                                  umem_free(searchdirs,
4491 3954                                      nsearch * sizeof (char *));
4492 3955                                  searchdirs = tmp;
4493 3956                          }
4494 3957                          searchdirs[nsearch++] = optarg;
4495 3958                          break;
4496 3959                  case 't':
4497 3960                          max_txg = strtoull(optarg, NULL, 0);
4498 3961                          if (max_txg < TXG_INITIAL) {
4499 3962                                  (void) fprintf(stderr, "incorrect txg "
4500 3963                                      "specified: %s\n", optarg);
4501 3964                                  usage();
4502 3965                          }
4503 3966                          break;
4504 3967                  case 'U':
4505 3968                          spa_config_path = optarg;
4506 3969                          if (spa_config_path[0] != '/') {
4507 3970                                  (void) fprintf(stderr,
4508 3971                                      "cachefile must be an absolute path "
4509 3972                                      "(i.e. start with a slash)\n");
4510 3973                                  usage();
4511 3974                          }
4512 3975                          break;
4513 3976                  case 'v':
4514 3977                          verbose++;
4515 3978                          break;
4516 3979                  case 'V':
4517 3980                          flags = ZFS_IMPORT_VERBATIM;
4518 3981                          break;
4519 3982                  case 'x':
4520 3983                          vn_dumpdir = optarg;
4521 3984                          break;
4522 3985                  default:
4523 3986                          usage();
4524 3987                          break;
4525 3988                  }
4526 3989          }
4527 3990  
4528 3991          if (!dump_opt['e'] && searchdirs != NULL) {
4529 3992                  (void) fprintf(stderr, "-p option requires use of -e\n");
4530 3993                  usage();
4531 3994          }
4532 3995  
4533 3996          /*
4534 3997           * ZDB does not typically re-read blocks; therefore limit the ARC
4535 3998           * to 256 MB, which can be used entirely for metadata.
4536 3999           */
4537 4000          zfs_arc_max = zfs_arc_meta_limit = 256 * 1024 * 1024;
4538 4001  
4539 4002          /*
4540 4003           * "zdb -c" uses checksum-verifying scrub i/os which are async reads.
  
4541 4004           * "zdb -b" uses traversal prefetch which uses async reads.
4542 4005           * For good performance, let several of them be active at once.
4543 4006           */
4544 4007          zfs_vdev_async_read_max_active = 10;
4545 4008  
4546 4009          /*
4547 4010           * Disable reference tracking for better performance.
4548 4011           */
4549 4012          reference_tracking_enable = B_FALSE;
4550 4013  
4551      -        /*
4552      -         * Do not fail spa_load when spa_load_verify fails. This is needed
4553      -         * to load non-idle pools.
4554      -         */
4555      -        spa_load_verify_dryrun = B_TRUE;
4556      -
4557 4014          kernel_init(FREAD);
4558 4015          g_zfs = libzfs_init();
4559 4016          ASSERT(g_zfs != NULL);
4560 4017  
4561 4018          if (dump_all)
4562 4019                  verbose = MAX(verbose, 1);
4563 4020  
4564 4021          for (c = 0; c < 256; c++) {
4565 4022                  if (dump_all && strchr("AeEFlLOPRSX", c) == NULL)
4566 4023                          dump_opt[c] = 1;
4567 4024                  if (dump_opt[c])
4568 4025                          dump_opt[c] += verbose;
4569 4026          }
4570 4027  
4571 4028          aok = (dump_opt['A'] == 1) || (dump_opt['A'] > 2);
4572 4029          zfs_recover = (dump_opt['A'] > 1);
4573 4030  
4574 4031          argc -= optind;
4575 4032          argv += optind;
4576 4033  
4577 4034          if (argc < 2 && dump_opt['R'])
4578 4035                  usage();
4579 4036  
4580 4037          if (dump_opt['E']) {
4581 4038                  if (argc != 1)
4582 4039                          usage();
4583 4040                  zdb_embedded_block(argv[0]);
4584 4041                  return (0);
4585 4042          }
4586 4043  
4587 4044          if (argc < 1) {
4588 4045                  if (!dump_opt['e'] && dump_opt['C']) {
4589 4046                          dump_cachefile(spa_config_path);
4590 4047                          return (0);
4591 4048                  }
4592 4049                  usage();
4593 4050          }
4594 4051  
4595 4052          if (dump_opt['l'])
4596 4053                  return (dump_label(argv[0]));
4597 4054  
4598 4055          if (dump_opt['O']) {
4599 4056                  if (argc != 2)
4600 4057                          usage();
4601 4058                  dump_opt['v'] = verbose + 3;
4602 4059                  return (dump_path(argv[0], argv[1]));
4603 4060          }
4604 4061  
4605 4062          if (dump_opt['X'] || dump_opt['F'])
4606 4063                  rewind = ZPOOL_DO_REWIND |
4607 4064                      (dump_opt['X'] ? ZPOOL_EXTREME_REWIND : 0);
4608 4065  
4609 4066          if (nvlist_alloc(&policy, NV_UNIQUE_NAME_TYPE, 0) != 0 ||
4610 4067              nvlist_add_uint64(policy, ZPOOL_REWIND_REQUEST_TXG, max_txg) != 0 ||
4611 4068              nvlist_add_uint32(policy, ZPOOL_REWIND_REQUEST, rewind) != 0)
4612 4069                  fatal("internal error: %s", strerror(ENOMEM));
4613 4070  
4614 4071          error = 0;
4615 4072          target = argv[0];
4616 4073  
4617 4074          if (dump_opt['e']) {
4618 4075                  nvlist_t *cfg = NULL;
4619 4076                  char *name = find_zpool(&target, &cfg, nsearch, searchdirs);
4620 4077  
4621 4078                  error = ENOENT;
4622 4079                  if (name) {
4623 4080                          if (dump_opt['C'] > 1) {
4624 4081                                  (void) printf("\nConfiguration for import:\n");
4625 4082                                  dump_nvlist(cfg, 8);
4626 4083                          }
4627 4084                          if (nvlist_add_nvlist(cfg,
4628 4085                              ZPOOL_REWIND_POLICY, policy) != 0) {
4629 4086                                  fatal("can't open '%s': %s",
4630 4087                                      target, strerror(ENOMEM));
4631 4088                          }
4632 4089                          error = spa_import(name, cfg, NULL, flags);
4633 4090                  }
4634 4091          }
4635 4092  
4636 4093          if (strpbrk(target, "/@") != NULL) {
4637 4094                  size_t targetlen;
4638 4095  
4639 4096                  target_is_spa = B_FALSE;
4640 4097                  /*
4641 4098                   * Remove any trailing slash.  Later code would get confused
4642 4099                   * by it, but we want to allow it so that "pool/" can
4643 4100                   * indicate that we want to dump the topmost filesystem,
4644 4101                   * rather than the whole pool.
4645 4102                   */
4646 4103                  targetlen = strlen(target);
4647 4104                  if (targetlen != 0 && target[targetlen - 1] == '/')
4648 4105                          target[targetlen - 1] = '\0';
4649 4106          }
4650 4107  
4651 4108          if (error == 0) {
4652 4109                  if (target_is_spa || dump_opt['R']) {
4653 4110                          error = spa_open_rewind(target, &spa, FTAG, policy,
4654 4111                              NULL);
4655 4112                          if (error) {
4656 4113                                  /*
4657 4114                                   * If we're missing the log device then
4658 4115                                   * try opening the pool after clearing the
4659 4116                                   * log state.
4660 4117                                   */
4661 4118                                  mutex_enter(&spa_namespace_lock);
4662 4119                                  if ((spa = spa_lookup(target)) != NULL &&
4663 4120                                      spa->spa_log_state == SPA_LOG_MISSING) {
4664 4121                                          spa->spa_log_state = SPA_LOG_CLEAR;
4665 4122                                          error = 0;
4666 4123                                  }
4667 4124                                  mutex_exit(&spa_namespace_lock);
4668 4125  
4669 4126                                  if (!error) {
4670 4127                                          error = spa_open_rewind(target, &spa,
4671 4128                                              FTAG, policy, NULL);
4672 4129                                  }
4673 4130                          }
4674 4131                  } else {
4675 4132                          error = open_objset(target, DMU_OST_ANY, FTAG, &os);
4676 4133                  }
4677 4134          }
4678 4135          nvlist_free(policy);
4679 4136  
4680 4137          if (error)
4681 4138                  fatal("can't open '%s': %s", target, strerror(error));
4682 4139  
4683 4140          argv++;
4684 4141          argc--;
4685 4142          if (!dump_opt['R']) {
4686 4143                  if (argc > 0) {
4687 4144                          zopt_objects = argc;
4688 4145                          zopt_object = calloc(zopt_objects, sizeof (uint64_t));
4689 4146                          for (unsigned i = 0; i < zopt_objects; i++) {
4690 4147                                  errno = 0;
4691 4148                                  zopt_object[i] = strtoull(argv[i], NULL, 0);
4692 4149                                  if (zopt_object[i] == 0 && errno != 0)
4693 4150                                          fatal("bad number %s: %s",
4694 4151                                              argv[i], strerror(errno));
4695 4152                          }
4696 4153                  }
4697 4154                  if (os != NULL) {
4698 4155                          dump_dir(os);
4699 4156                  } else if (zopt_objects > 0 && !dump_opt['m']) {
4700 4157                          dump_dir(spa->spa_meta_objset);
4701 4158                  } else {
4702 4159                          dump_zpool(spa);
4703 4160                  }
4704 4161          } else {
4705 4162                  flagbits['b'] = ZDB_FLAG_PRINT_BLKPTR;
4706 4163                  flagbits['c'] = ZDB_FLAG_CHECKSUM;
4707 4164                  flagbits['d'] = ZDB_FLAG_DECOMPRESS;
4708 4165                  flagbits['e'] = ZDB_FLAG_BSWAP;
4709 4166                  flagbits['g'] = ZDB_FLAG_GBH;
4710 4167                  flagbits['i'] = ZDB_FLAG_INDIRECT;
4711 4168                  flagbits['p'] = ZDB_FLAG_PHYS;
4712 4169                  flagbits['r'] = ZDB_FLAG_RAW;
4713 4170  
4714 4171                  for (int i = 0; i < argc; i++)
4715 4172                          zdb_read_block(argv[i], spa);
4716 4173          }
4717 4174  
4718 4175          if (os != NULL)
4719 4176                  close_objset(os, FTAG);
4720 4177          else
4721 4178                  spa_close(spa, FTAG);
4722 4179  
4723 4180          fuid_table_destroy();
4724 4181  
4725 4182          dump_debug_buffer();
4726 4183  
4727 4184          libzfs_fini(g_zfs);
4728 4185          kernel_fini();
4729 4186  
4730 4187          return (0);
4731 4188  }
  