NEX-9752 backport illumos 6950 ARC should cache compressed data
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
6950 ARC should cache compressed data
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
NEX-8521 zdb -h <pool> raises core dump
Reviewed by: Alex Deiter <alex.deiter@nexenta.com>
Reviewed by: Dan Fields <dan.fields@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
SUP-918: zdb -h infinite loop when buffering records larger than static limit
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
NEX-3650 KRRP needs to clean up cstyle, hdrchk, and mapfile issues
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-3214 remove cos object type from dmu.h
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
6391 Override default SPA config location via environment
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Richard Yao <ryao@gentoo.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
6268 zfs diff confused by moving a file to another directory
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Justin Gibbs <gibbs@scsiguy.com>
Approved by: Dan McDonald <danmcd@omniti.com>
6290 zdb -h overflows stack
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brian Donohue <brian.donohue@delphix.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Don Brady <dev.fs.zfs@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
6047 SPARC boot should support feature@embedded_data
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
5959 clean up per-dataset feature count code
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
NEX-4582 update wrc test cases for allow to use write back cache per tree of datasets
Reviewed by: Steve Peng <steve.peng@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
5812 assertion failed in zrl_tryenter(): zr_owner==NULL
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Approved by: Gordon Ross <gwr@nexenta.com>
5810 zdb should print details of bpobj
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Reviewed by: Simon Klinkert <simon.klinkert@gmail.com>
Approved by: Gordon Ross <gwr@nexenta.com>
NEX-3558 KRRP Integration
NEX-3212 remove vdev prop object type from dmu.h
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Josef Sipek <josef.sipek@nexenta.com>
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Garrett D'Amore <garrett@damore.org>
Make special vdev subtree topology the same as regular vdev subtree to simplify testcase setup
Fixup merge issues
Issue #40: ZDB shouldn't crash with new code
re #12611 rb4105 zpool import panic in ddt_zap_count()
re #8279 rb3915 need a mechanism to notify NMS about ZFS config changes (fix lint - courtesy of Yuri Pankov)
re #12584 rb4049 zfsxx latest code merge (fix lint - courtesy of Yuri Pankov)
re #12585 rb4049 ZFS++ work port - refactoring to improve separation of open/closed code, bug fixes, performance improvements - open code
Bug 11205: add missing libzfs_closed_stubs.c to fix opensource-only build.
ZFS plus work: special vdevs, cos, cos/vdev properties
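
SUP-918 in the list above concerns `zdb -h` looping forever when a history record is larger than the static read buffer. The loop below is a minimal, self-contained sketch of the general grow-on-no-progress pattern, not the zdb code itself. The names `log_src_t`, `log_read`, `log_unpack`, and `count_records` are hypothetical stand-ins (they are not libzfs or zdb interfaces), and the record layout (a 4-byte host-endian length followed by the payload) is assumed purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory log: length-prefixed records, back to back. */
typedef struct log_src {
	const uint8_t *data;
	size_t size;
} log_src_t;

/* Copy up to *len bytes from *off into buf; update *off and *len. */
static void
log_read(const log_src_t *src, size_t *off, size_t *len, uint8_t *buf)
{
	size_t avail = src->size - *off;

	if (*len > avail)
		*len = avail;
	memcpy(buf, src->data + *off, *len);
	*off += *len;
}

/* Count whole records in buf; *resid gets the unconsumed tail bytes. */
static size_t
log_unpack(const uint8_t *buf, size_t len, size_t *resid)
{
	size_t nrec = 0;

	while (len >= sizeof (uint32_t)) {
		uint32_t reclen;

		memcpy(&reclen, buf, sizeof (reclen));
		if (len - sizeof (reclen) < reclen)
			break;		/* record continues past the buffer */
		buf += sizeof (reclen) + reclen;
		len -= sizeof (reclen) + reclen;
		nrec++;
	}
	*resid = len;
	return (nrec);
}

/* Return the number of complete records, or (size_t)-1 if malloc fails. */
static size_t
count_records(const log_src_t *src, size_t buflen)
{
	size_t off = 0, len, resid, total = 0;
	uint8_t *buf;

	if (buflen == 0)
		buflen = 1;
	if ((buf = malloc(buflen)) == NULL)
		return ((size_t)-1);

	do {
		len = buflen;
		log_read(src, &off, &len, buf);
		total += log_unpack(buf, len, &resid);
		off -= resid;		/* re-read the partial record next time */

		if (len != 0 && resid == len) {
			/* Nothing was consumed on this pass. */
			if (len < buflen)
				break;	/* source ended mid-record; give up */
			free(buf);
			buflen *= 2;	/* record did not fit; retry with more room */
			if ((buf = malloc(buflen)) == NULL)
				return ((size_t)-1);
		}
	} while (len != 0);

	free(buf);
	return (total);
}
```

The key check is `resid == len`: if a full pass consumes nothing, re-reading with the same buffer size can never make progress, so the buffer either grows or, when the source itself is exhausted, the loop stops.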

          --- old/usr/src/cmd/zdb/zdb.c
          +++ new/usr/src/cmd/zdb/zdb.c
[... 13 lines elided ...]
  14   14   * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  15   15   * If applicable, add the following below this CDDL HEADER, with the
  16   16   * fields enclosed by brackets "[]" replaced with your own identifying
  17   17   * information: Portions Copyright [yyyy] [name of copyright owner]
  18   18   *
  19   19   * CDDL HEADER END
  20   20   */
  21   21  
  22   22  /*
  23   23   * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
  24      - * Copyright (c) 2011, 2017 by Delphix. All rights reserved.
       24 + * Copyright (c) 2011, 2016 by Delphix. All rights reserved.
  25   25   * Copyright (c) 2014 Integros [integros.com]
  26   26   * Copyright 2017 Nexenta Systems, Inc.
  27   27   * Copyright 2017 RackTop Systems.
  28   28   */
  29   29  
  30   30  #include <stdio.h>
  31   31  #include <unistd.h>
  32   32  #include <stdio_ext.h>
  33   33  #include <stdlib.h>
  34   34  #include <ctype.h>
       35 +#include <string.h>
       36 +#include <errno.h>
  35   37  #include <sys/zfs_context.h>
  36   38  #include <sys/spa.h>
  37   39  #include <sys/spa_impl.h>
  38   40  #include <sys/dmu.h>
  39   41  #include <sys/zap.h>
  40   42  #include <sys/fs/zfs.h>
  41   43  #include <sys/zfs_znode.h>
  42   44  #include <sys/zfs_sa.h>
  43   45  #include <sys/sa.h>
  44   46  #include <sys/sa_impl.h>
[... 26 lines elided ...]
  71   73  #include "zdb.h"
  72   74  
  73   75  #define ZDB_COMPRESS_NAME(idx) ((idx) < ZIO_COMPRESS_FUNCTIONS ?        \
  74   76          zio_compress_table[(idx)].ci_name : "UNKNOWN")
  75   77  #define ZDB_CHECKSUM_NAME(idx) ((idx) < ZIO_CHECKSUM_FUNCTIONS ?        \
  76   78          zio_checksum_table[(idx)].ci_name : "UNKNOWN")
  77   79  #define ZDB_OT_NAME(idx) ((idx) < DMU_OT_NUMTYPES ?     \
  78   80          dmu_ot[(idx)].ot_name : DMU_OT_IS_VALID(idx) ?  \
  79   81          dmu_ot_byteswap[DMU_OT_BYTESWAP(idx)].ob_name : "UNKNOWN")
  80   82  #define ZDB_OT_TYPE(idx) ((idx) < DMU_OT_NUMTYPES ? (idx) :             \
  81      -        (idx) == DMU_OTN_ZAP_DATA || (idx) == DMU_OTN_ZAP_METADATA ?    \
  82      -        DMU_OT_ZAP_OTHER : \
  83      -        (idx) == DMU_OTN_UINT64_DATA || (idx) == DMU_OTN_UINT64_METADATA ? \
  84      -        DMU_OT_UINT64_OTHER : DMU_OT_NUMTYPES)
       83 +        (((idx) == DMU_OTN_ZAP_DATA || (idx) == DMU_OTN_ZAP_METADATA) ? \
       84 +        DMU_OT_ZAP_OTHER : DMU_OT_NUMTYPES))
  85   85  
  86   86  #ifndef lint
  87   87  extern int reference_tracking_enable;
  88   88  extern boolean_t zfs_recover;
  89   89  extern uint64_t zfs_arc_max, zfs_arc_meta_limit;
  90   90  extern int zfs_vdev_async_read_max_active;
  91   91  extern int aok;
  92      -extern boolean_t spa_load_verify_dryrun;
  93   92  #else
  94   93  int reference_tracking_enable;
  95   94  boolean_t zfs_recover;
  96   95  uint64_t zfs_arc_max, zfs_arc_meta_limit;
  97   96  int zfs_vdev_async_read_max_active;
  98   97  int aok;
  99      -boolean_t spa_load_verify_dryrun;
 100   98  #endif
 101   99  
 102  100  static const char cmdname[] = "zdb";
 103  101  uint8_t dump_opt[256];
 104  102  
 105  103  typedef void object_viewer_t(objset_t *, uint64_t, void *data, size_t size);
 106  104  
 107  105  uint64_t *zopt_object = NULL;
 108  106  static unsigned zopt_objects = 0;
 109  107  libzfs_handle_t *g_zfs;
[... 557 lines elided ...]
 667  665          for (unsigned c = 0; c < vd->vdev_children; c++)
 668  666                  refcount += get_dtl_refcount(vd->vdev_child[c]);
 669  667          return (refcount);
 670  668  }
 671  669  
 672  670  static int
 673  671  get_metaslab_refcount(vdev_t *vd)
 674  672  {
 675  673          int refcount = 0;
 676  674  
 677      -        if (vd->vdev_top == vd) {
 678      -                for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
      675 +        if (vd->vdev_top == vd && !vd->vdev_removing) {
      676 +                for (unsigned m = 0; m < vd->vdev_ms_count; m++) {
 679  677                          space_map_t *sm = vd->vdev_ms[m]->ms_sm;
 680  678  
 681  679                          if (sm != NULL &&
 682  680                              sm->sm_dbuf->db_size == sizeof (space_map_phys_t))
 683  681                                  refcount++;
 684  682                  }
 685  683          }
 686  684          for (unsigned c = 0; c < vd->vdev_children; c++)
 687  685                  refcount += get_metaslab_refcount(vd->vdev_child[c]);
 688  686  
 689  687          return (refcount);
 690  688  }
 691  689  
 692  690  static int
 693      -get_obsolete_refcount(vdev_t *vd)
 694      -{
 695      -        int refcount = 0;
 696      -
 697      -        uint64_t obsolete_sm_obj = vdev_obsolete_sm_object(vd);
 698      -        if (vd->vdev_top == vd && obsolete_sm_obj != 0) {
 699      -                dmu_object_info_t doi;
 700      -                VERIFY0(dmu_object_info(vd->vdev_spa->spa_meta_objset,
 701      -                    obsolete_sm_obj, &doi));
 702      -                if (doi.doi_bonus_size == sizeof (space_map_phys_t)) {
 703      -                        refcount++;
 704      -                }
 705      -        } else {
 706      -                ASSERT3P(vd->vdev_obsolete_sm, ==, NULL);
 707      -                ASSERT3U(obsolete_sm_obj, ==, 0);
 708      -        }
 709      -        for (unsigned c = 0; c < vd->vdev_children; c++) {
 710      -                refcount += get_obsolete_refcount(vd->vdev_child[c]);
 711      -        }
 712      -
 713      -        return (refcount);
 714      -}
 715      -
 716      -static int
 717      -get_prev_obsolete_spacemap_refcount(spa_t *spa)
 718      -{
 719      -        uint64_t prev_obj =
 720      -            spa->spa_condensing_indirect_phys.scip_prev_obsolete_sm_object;
 721      -        if (prev_obj != 0) {
 722      -                dmu_object_info_t doi;
 723      -                VERIFY0(dmu_object_info(spa->spa_meta_objset, prev_obj, &doi));
 724      -                if (doi.doi_bonus_size == sizeof (space_map_phys_t)) {
 725      -                        return (1);
 726      -                }
 727      -        }
 728      -        return (0);
 729      -}
 730      -
 731      -static int
 732  691  verify_spacemap_refcounts(spa_t *spa)
 733  692  {
 734  693          uint64_t expected_refcount = 0;
 735  694          uint64_t actual_refcount;
 736  695  
 737  696          (void) feature_get_refcount(spa,
 738  697              &spa_feature_table[SPA_FEATURE_SPACEMAP_HISTOGRAM],
 739  698              &expected_refcount);
 740  699          actual_refcount = get_dtl_refcount(spa->spa_root_vdev);
 741  700          actual_refcount += get_metaslab_refcount(spa->spa_root_vdev);
 742      -        actual_refcount += get_obsolete_refcount(spa->spa_root_vdev);
 743      -        actual_refcount += get_prev_obsolete_spacemap_refcount(spa);
 744  701  
 745  702          if (expected_refcount != actual_refcount) {
 746  703                  (void) printf("space map refcount mismatch: expected %lld != "
 747  704                      "actual %lld\n",
 748  705                      (longlong_t)expected_refcount,
 749  706                      (longlong_t)actual_refcount);
 750  707                  return (2);
 751  708          }
 752  709          return (0);
 753  710  }
 754  711  
 755  712  static void
 756  713  dump_spacemap(objset_t *os, space_map_t *sm)
 757  714  {
 758  715          uint64_t alloc, offset, entry;
 759      -        char *ddata[] = { "ALLOC", "FREE", "CONDENSE", "INVALID",
 760      -            "INVALID", "INVALID", "INVALID", "INVALID" };
      716 +        const char *ddata[] = { "ALLOC", "FREE", "CONDENSE", "INVALID",
      717 +                            "INVALID", "INVALID", "INVALID", "INVALID" };
 761  718  
 762  719          if (sm == NULL)
 763  720                  return;
 764  721  
 765      -        (void) printf("space map object %llu:\n",
 766      -            (longlong_t)sm->sm_phys->smp_object);
 767      -        (void) printf("  smp_objsize = 0x%llx\n",
 768      -            (longlong_t)sm->sm_phys->smp_objsize);
 769      -        (void) printf("  smp_alloc = 0x%llx\n",
 770      -            (longlong_t)sm->sm_phys->smp_alloc);
 771      -
 772  722          /*
 773  723           * Print out the freelist entries in both encoded and decoded form.
 774  724           */
 775  725          alloc = 0;
 776  726          for (offset = 0; offset < space_map_length(sm);
 777  727              offset += sizeof (entry)) {
 778  728                  uint8_t mapshift = sm->sm_shift;
 779  729  
 780  730                  VERIFY0(dmu_read(os, space_map_object(sm), offset,
 781  731                      sizeof (entry), &entry, DMU_READ_PREFETCH));
[... 84 lines elided ...]
 866  816                   */
 867  817                  (void) printf("\tOn-disk histogram:\t\tfragmentation %llu\n",
 868  818                      (u_longlong_t)msp->ms_fragmentation);
 869  819                  dump_histogram(sm->sm_phys->smp_histogram,
 870  820                      SPACE_MAP_HISTOGRAM_SIZE, sm->sm_shift);
 871  821          }
 872  822  
 873  823          if (dump_opt['d'] > 5 || dump_opt['m'] > 3) {
 874  824                  ASSERT(msp->ms_size == (1ULL << vd->vdev_ms_shift));
 875  825  
      826 +                mutex_enter(&msp->ms_lock);
 876  827                  dump_spacemap(spa->spa_meta_objset, msp->ms_sm);
      828 +                mutex_exit(&msp->ms_lock);
 877  829          }
 878  830  }
 879  831  
 880  832  static void
 881  833  print_vdev_metaslab_header(vdev_t *vd)
 882  834  {
 883  835          (void) printf("\tvdev %10llu\n\t%-10s%5llu   %-19s   %-15s   %-10s\n",
 884  836              (u_longlong_t)vd->vdev_id,
 885  837              "metaslabs", (u_longlong_t)vd->vdev_ms_count,
 886  838              "offset", "spacemap", "free");
[... 37 lines elided ...]
 924  876          (void) printf("\tpool %s\tfragmentation", spa_name(spa));
 925  877          fragmentation = metaslab_class_fragmentation(mc);
 926  878          if (fragmentation == ZFS_FRAG_INVALID)
 927  879                  (void) printf("\t%3s\n", "-");
 928  880          else
 929  881                  (void) printf("\t%3llu%%\n", (u_longlong_t)fragmentation);
 930  882          dump_histogram(mc->mc_histogram, RANGE_TREE_HISTOGRAM_SIZE, 0);
 931  883  }
 932  884  
 933  885  static void
 934      -print_vdev_indirect(vdev_t *vd)
 935      -{
 936      -        vdev_indirect_config_t *vic = &vd->vdev_indirect_config;
 937      -        vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
 938      -        vdev_indirect_births_t *vib = vd->vdev_indirect_births;
 939      -
 940      -        if (vim == NULL) {
 941      -                ASSERT3P(vib, ==, NULL);
 942      -                return;
 943      -        }
 944      -
 945      -        ASSERT3U(vdev_indirect_mapping_object(vim), ==,
 946      -            vic->vic_mapping_object);
 947      -        ASSERT3U(vdev_indirect_births_object(vib), ==,
 948      -            vic->vic_births_object);
 949      -
 950      -        (void) printf("indirect births obj %llu:\n",
 951      -            (longlong_t)vic->vic_births_object);
 952      -        (void) printf("    vib_count = %llu\n",
 953      -            (longlong_t)vdev_indirect_births_count(vib));
 954      -        for (uint64_t i = 0; i < vdev_indirect_births_count(vib); i++) {
 955      -                vdev_indirect_birth_entry_phys_t *cur_vibe =
 956      -                    &vib->vib_entries[i];
 957      -                (void) printf("\toffset %llx -> txg %llu\n",
 958      -                    (longlong_t)cur_vibe->vibe_offset,
 959      -                    (longlong_t)cur_vibe->vibe_phys_birth_txg);
 960      -        }
 961      -        (void) printf("\n");
 962      -
 963      -        (void) printf("indirect mapping obj %llu:\n",
 964      -            (longlong_t)vic->vic_mapping_object);
 965      -        (void) printf("    vim_max_offset = 0x%llx\n",
 966      -            (longlong_t)vdev_indirect_mapping_max_offset(vim));
 967      -        (void) printf("    vim_bytes_mapped = 0x%llx\n",
 968      -            (longlong_t)vdev_indirect_mapping_bytes_mapped(vim));
 969      -        (void) printf("    vim_count = %llu\n",
 970      -            (longlong_t)vdev_indirect_mapping_num_entries(vim));
 971      -
 972      -        if (dump_opt['d'] <= 5 && dump_opt['m'] <= 3)
 973      -                return;
 974      -
 975      -        uint32_t *counts = vdev_indirect_mapping_load_obsolete_counts(vim);
 976      -
 977      -        for (uint64_t i = 0; i < vdev_indirect_mapping_num_entries(vim); i++) {
 978      -                vdev_indirect_mapping_entry_phys_t *vimep =
 979      -                    &vim->vim_entries[i];
 980      -                (void) printf("\t<%llx:%llx:%llx> -> "
 981      -                    "<%llx:%llx:%llx> (%x obsolete)\n",
 982      -                    (longlong_t)vd->vdev_id,
 983      -                    (longlong_t)DVA_MAPPING_GET_SRC_OFFSET(vimep),
 984      -                    (longlong_t)DVA_GET_ASIZE(&vimep->vimep_dst),
 985      -                    (longlong_t)DVA_GET_VDEV(&vimep->vimep_dst),
 986      -                    (longlong_t)DVA_GET_OFFSET(&vimep->vimep_dst),
 987      -                    (longlong_t)DVA_GET_ASIZE(&vimep->vimep_dst),
 988      -                    counts[i]);
 989      -        }
 990      -        (void) printf("\n");
 991      -
 992      -        uint64_t obsolete_sm_object = vdev_obsolete_sm_object(vd);
 993      -        if (obsolete_sm_object != 0) {
 994      -                objset_t *mos = vd->vdev_spa->spa_meta_objset;
 995      -                (void) printf("obsolete space map object %llu:\n",
 996      -                    (u_longlong_t)obsolete_sm_object);
 997      -                ASSERT(vd->vdev_obsolete_sm != NULL);
 998      -                ASSERT3U(space_map_object(vd->vdev_obsolete_sm), ==,
 999      -                    obsolete_sm_object);
1000      -                dump_spacemap(mos, vd->vdev_obsolete_sm);
1001      -                (void) printf("\n");
1002      -        }
1003      -}
1004      -
1005      -static void
1006  886  dump_metaslabs(spa_t *spa)
1007  887  {
1008  888          vdev_t *vd, *rvd = spa->spa_root_vdev;
1009  889          uint64_t m, c = 0, children = rvd->vdev_children;
1010  890  
1011  891          (void) printf("\nMetaslabs:\n");
1012  892  
1013  893          if (!dump_opt['d'] && zopt_objects > 0) {
1014  894                  c = zopt_object[0];
1015  895  
[... 15 lines elided ...]
1031  911                          }
1032  912                          (void) printf("\n");
1033  913                          return;
1034  914                  }
1035  915                  children = c + 1;
1036  916          }
1037  917          for (; c < children; c++) {
1038  918                  vd = rvd->vdev_child[c];
1039  919                  print_vdev_metaslab_header(vd);
1040  920  
1041      -                print_vdev_indirect(vd);
1042      -
1043  921                  for (m = 0; m < vd->vdev_ms_count; m++)
1044  922                          dump_metaslab(vd->vdev_ms[m]);
1045  923                  (void) printf("\n");
1046  924          }
1047  925  }
1048  926  
1049  927  static void
1050  928  dump_dde(const ddt_t *ddt, const ddt_entry_t *dde, uint64_t index)
1051  929  {
1052  930          const ddt_phys_t *ddp = dde->dde_phys;
[... 44 lines elided ...]
1097  975          dmu_object_info_t doi;
1098  976          uint64_t count, dspace, mspace;
1099  977          int error;
1100  978  
1101  979          error = ddt_object_info(ddt, type, class, &doi);
1102  980  
1103  981          if (error == ENOENT)
1104  982                  return;
1105  983          ASSERT(error == 0);
1106  984  
1107      -        if ((count = ddt_object_count(ddt, type, class)) == 0)
      985 +        (void) ddt_object_count(ddt, type, class, &count);
      986 +        if (count == 0)
1108  987                  return;
1109  988  
1110  989          dspace = doi.doi_physical_blocks_512 << 9;
1111  990          mspace = doi.doi_fill_count * doi.doi_data_block_size;
1112  991  
1113  992          ddt_object_name(ddt, type, class, name);
1114  993  
1115  994          (void) printf("%s: %llu entries, size %llu on disk, %llu in core\n",
1116  995              name,
1117  996              (u_longlong_t)count,
[... 90 lines elided ...]
1208 1087              vd->vdev_path ? vd->vdev_path :
1209 1088              vd->vdev_parent ? vd->vdev_ops->vdev_op_type : spa_name(spa),
1210 1089              required ? "DTL-required" : "DTL-expendable");
1211 1090  
1212 1091          for (int t = 0; t < DTL_TYPES; t++) {
1213 1092                  range_tree_t *rt = vd->vdev_dtl[t];
1214 1093                  if (range_tree_space(rt) == 0)
1215 1094                          continue;
1216 1095                  (void) snprintf(prefix, sizeof (prefix), "\t%*s%s",
1217 1096                      indent + 2, "", name[t]);
     1097 +                mutex_enter(rt->rt_lock);
1218 1098                  range_tree_walk(rt, dump_dtl_seg, prefix);
     1099 +                mutex_exit(rt->rt_lock);
1219 1100                  if (dump_opt['d'] > 5 && vd->vdev_children == 0)
1220 1101                          dump_spacemap(spa->spa_meta_objset, vd->vdev_dtl_sm);
1221 1102          }
1222 1103  
1223 1104          for (unsigned c = 0; c < vd->vdev_children; c++)
1224 1105                  dump_dtl(vd->vdev_child[c], indent + 4);
1225 1106  }
1226 1107  
1227 1108  static void
1228 1109  dump_history(spa_t *spa)
1229 1110  {
1230 1111          nvlist_t **events = NULL;
1231 1112          uint64_t resid, len, off = 0;
     1113 +        uint64_t buflen;
1232 1114          uint_t num = 0;
1233 1115          int error;
1234 1116          time_t tsec;
1235 1117          struct tm t;
1236 1118          char tbuf[30];
1237 1119          char internalstr[MAXPATHLEN];
1238 1120  
1239      -        char *buf = umem_alloc(SPA_MAXBLOCKSIZE, UMEM_NOFAIL);
     1121 +        buflen = SPA_MAXBLOCKSIZE;
     1122 +        char *buf = umem_alloc(buflen, UMEM_NOFAIL);
1240 1123          do {
1241      -                len = SPA_MAXBLOCKSIZE;
     1124 +                len = buflen;
1242 1125  
1243 1126                  if ((error = spa_history_get(spa, &off, &len, buf)) != 0) {
1244      -                        (void) fprintf(stderr, "Unable to read history: "
1245      -                            "error %d\n", error);
1246      -                        umem_free(buf, SPA_MAXBLOCKSIZE);
1247      -                        return;
     1127 +                        break;
1248 1128                  }
1249 1129  
1250      -                if (zpool_history_unpack(buf, len, &resid, &events, &num) != 0)
     1130 +                error = zpool_history_unpack(buf, len, &resid, &events, &num);
     1131 +                if (error != 0) {
1251 1132                          break;
     1133 +                }
1252 1134  
1253 1135                  off -= resid;
     1136 +                if (resid == len) {
     1137 +                         umem_free(buf, buflen);
     1138 +                         buflen *= 2;
     1139 +                         buf = umem_alloc(buflen, UMEM_NOFAIL);
     1140 +                         if (buf == NULL) {
     1141 +                                (void) fprintf(stderr, "Unable to read history: %s\n",
     1142 +                                    strerror(error));
     1143 +                                goto err;
     1144 +                         }
     1145 +                }
1254 1146          } while (len != 0);
1255      -        umem_free(buf, SPA_MAXBLOCKSIZE);
     1147 +        umem_free(buf, buflen);
1256 1148  
     1149 +        if (error != 0) {
     1150 +                (void) fprintf(stderr, "Unable to read history: %s\n",
     1151 +                    strerror(error));
     1152 +                goto err;
     1153 +        }
     1154 +
1257 1155          (void) printf("\nHistory:\n");
1258 1156          for (unsigned i = 0; i < num; i++) {
1259 1157                  uint64_t time, txg, ievent;
1260 1158                  char *cmd, *intstr;
1261 1159                  boolean_t printed = B_FALSE;
1262 1160  
1263 1161                  if (nvlist_lookup_uint64(events[i], ZPOOL_HIST_TIME,
1264 1162                      &time) != 0)
1265 1163                          goto next;
1266 1164                  if (nvlist_lookup_string(events[i], ZPOOL_HIST_CMD,
[... 21 lines elided ...]
1288 1186                  (void) printf("%s %s\n", tbuf, cmd);
1289 1187                  printed = B_TRUE;
1290 1188  
1291 1189  next:
1292 1190                  if (dump_opt['h'] > 1) {
1293 1191                          if (!printed)
1294 1192                                  (void) printf("unrecognized record:\n");
1295 1193                          dump_nvlist(events[i], 2);
1296 1194                  }
1297 1195          }
     1196 +err:
     1197 +        for (unsigned i = 0; i < num; i++) {
     1198 +                nvlist_free(events[i]);
     1199 +        }
     1200 +        free(events);
1298 1201  }
1299 1202  
1300 1203  /*ARGSUSED*/
1301 1204  static void
1302 1205  dump_dnode(objset_t *os, uint64_t object, void *data, size_t size)
1303 1206  {
1304 1207  }
1305 1208  
1306 1209  static uint64_t
1307 1210  blkid2offset(const dnode_phys_t *dnp, const blkptr_t *bp,
[... 894 lines elided ...]
2202 2105                  for (i = 0; i < zopt_objects; i++)
2203 2106                          dump_object(os, zopt_object[i], verbosity,
2204 2107                              &print_header);
2205 2108                  (void) printf("\n");
2206 2109                  return;
2207 2110          }
2208 2111  
2209 2112          if (dump_opt['i'] != 0 || verbosity >= 2)
2210 2113                  dump_intent_log(dmu_objset_zil(os));
2211 2114  
2212      -        if (dmu_objset_ds(os) != NULL) {
2213      -                dsl_dataset_t *ds = dmu_objset_ds(os);
2214      -                dump_deadlist(&ds->ds_deadlist);
     2115 +        if (dmu_objset_ds(os) != NULL)
     2116 +                dump_deadlist(&dmu_objset_ds(os)->ds_deadlist);
2215 2117  
2216      -                if (dsl_dataset_remap_deadlist_exists(ds)) {
2217      -                        (void) printf("ds_remap_deadlist:\n");
2218      -                        dump_deadlist(&ds->ds_remap_deadlist);
2219      -                }
2220      -        }
2221      -
2222 2118          if (verbosity < 2)
2223 2119                  return;
2224 2120  
2225 2121          if (BP_IS_HOLE(os->os_rootbp))
2226 2122                  return;
2227 2123  
2228 2124          dump_object(os, 0, verbosity, &print_header);
2229 2125          object_count = 0;
2230 2126          if (DMU_USERUSED_DNODE(os) != NULL &&
2231 2127              DMU_USERUSED_DNODE(os)->dn_type != 0) {
[... 322 lines elided ...]
2554 2450                  if (dump_opt['u'])
2555 2451                          dump_label_uberblocks(&label, ashift);
2556 2452          }
2557 2453  
2558 2454          (void) close(fd);
2559 2455  
2560 2456          return (label_found ? 0 : 2);
2561 2457  }
2562 2458  
2563 2459  static uint64_t dataset_feature_count[SPA_FEATURES];
2564      -static uint64_t remap_deadlist_count = 0;
2565 2460  
2566 2461  /*ARGSUSED*/
2567 2462  static int
2568 2463  dump_one_dir(const char *dsname, void *arg)
2569 2464  {
2570 2465          int error;
2571 2466          objset_t *os;
2572 2467  
2573 2468          error = open_objset(dsname, DMU_OST_ANY, FTAG, &os);
2574 2469          if (error != 0)
2575 2470                  return (0);
2576 2471  
2577 2472          for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {
2578 2473                  if (!dmu_objset_ds(os)->ds_feature_inuse[f])
2579 2474                          continue;
2580 2475                  ASSERT(spa_feature_table[f].fi_flags &
2581 2476                      ZFEATURE_FLAG_PER_DATASET);
2582 2477                  dataset_feature_count[f]++;
2583 2478          }
2584 2479  
2585      -        if (dsl_dataset_remap_deadlist_exists(dmu_objset_ds(os))) {
2586      -                remap_deadlist_count++;
2587      -        }
2588      -
2589 2480          dump_dir(os);
2590 2481          close_objset(os, FTAG);
2591 2482          fuid_table_destroy();
2592 2483          return (0);
2593 2484  }
2594 2485  
2595 2486  /*
2596 2487   * Block statistics.
2597 2488   */
2598 2489  #define PSIZE_HISTO_SIZE (SPA_OLD_MAXBLOCKSIZE / SPA_MINBLOCKSIZE + 2)
[... 19 lines elided ...]
2618 2509          "deferred free",
2619 2510          "dedup ditto",
2620 2511          "other",
2621 2512          "Total",
2622 2513  };
2623 2514  
2624 2515  #define ZB_TOTAL        DN_MAX_LEVELS
2625 2516  
2626 2517  typedef struct zdb_cb {
2627 2518          zdb_blkstats_t  zcb_type[ZB_TOTAL + 1][ZDB_OT_TOTAL + 1];
2628      -        uint64_t        zcb_removing_size;
2629 2519          uint64_t        zcb_dedup_asize;
2630 2520          uint64_t        zcb_dedup_blocks;
2631 2521          uint64_t        zcb_embedded_blocks[NUM_BP_EMBEDDED_TYPES];
2632 2522          uint64_t        zcb_embedded_histogram[NUM_BP_EMBEDDED_TYPES]
2633 2523              [BPE_PAYLOAD_SIZE];
2634 2524          uint64_t        zcb_start;
2635 2525          hrtime_t        zcb_lastprint;
2636 2526          uint64_t        zcb_totalasize;
2637 2527          uint64_t        zcb_errors[256];
2638 2528          int             zcb_readfails;
2639 2529          int             zcb_haderrors;
2640 2530          spa_t           *zcb_spa;
2641      -        uint32_t        **zcb_vd_obsolete_counts;
2642 2531  } zdb_cb_t;
2643 2532  
2644 2533  static void
2645 2534  zdb_count_block(zdb_cb_t *zcb, zilog_t *zilog, const blkptr_t *bp,
2646 2535      dmu_object_type_t type)
2647 2536  {
2648 2537          uint64_t refcnt = 0;
2649 2538  
2650 2539          ASSERT(type < ZDB_OT_TOTAL);
2651 2540  
[... 50 lines elided ...]
2702 2591          }
2703 2592  
2704 2593          if (dump_opt['L'])
2705 2594                  return;
2706 2595  
2707 2596          if (BP_GET_DEDUP(bp)) {
2708 2597                  ddt_t *ddt;
2709 2598                  ddt_entry_t *dde;
2710 2599  
2711 2600                  ddt = ddt_select(zcb->zcb_spa, bp);
2712      -                ddt_enter(ddt);
2713 2601                  dde = ddt_lookup(ddt, bp, B_FALSE);
2714 2602  
2715 2603                  if (dde == NULL) {
2716 2604                          refcnt = 0;
2717 2605                  } else {
2718 2606                          ddt_phys_t *ddp = ddt_phys_select(dde, bp);
     2607 +
     2608 +                        /* no other competitors for dde */
     2609 +                        dde_exit(dde);
     2610 +
2719 2611                          ddt_phys_decref(ddp);
2720 2612                          refcnt = ddp->ddp_refcnt;
2721 2613                          if (ddt_phys_total_refcnt(dde) == 0)
2722 2614                                  ddt_remove(ddt, dde);
2723 2615                  }
2724      -                ddt_exit(ddt);
2725 2616          }
2726 2617  
2727 2618          VERIFY3U(zio_wait(zio_claim(NULL, zcb->zcb_spa,
2728 2619              refcnt ? 0 : spa_first_txg(zcb->zcb_spa),
2729 2620              bp, NULL, NULL, ZIO_FLAG_CANFAIL)), ==, 0);
2730 2621  }
2731 2622  
2732 2623  static void
2733 2624  zdb_blkptr_done(zio_t *zio)
2734 2625  {
... 57 lines elided ...
2792 2683          }
2793 2684  
2794 2685          if (BP_IS_HOLE(bp))
2795 2686                  return (0);
2796 2687  
2797 2688          type = BP_GET_TYPE(bp);
2798 2689  
2799 2690          zdb_count_block(zcb, zilog, bp,
2800 2691              (type & DMU_OT_NEWTYPE) ? ZDB_OT_OTHER : type);
2801 2692  
2802      -        is_metadata = (BP_GET_LEVEL(bp) != 0 || DMU_OT_IS_METADATA(type));
     2693 +        is_metadata = BP_IS_METADATA(bp);
2803 2694  
2804 2695          if (!BP_IS_EMBEDDED(bp) &&
2805 2696              (dump_opt['c'] > 1 || (dump_opt['c'] && is_metadata))) {
2806 2697                  size_t size = BP_GET_PSIZE(bp);
2807 2698                  abd_t *abd = abd_alloc(size, B_FALSE);
2808 2699                  int flags = ZIO_FLAG_CANFAIL | ZIO_FLAG_SCRUB | ZIO_FLAG_RAW;
2809 2700  
2810 2701                  /* If it's an intent log block, failure is expected. */
2811 2702                  if (zb->zb_level == ZB_ZIL_LEVEL)
2812 2703                          flags |= ZIO_FLAG_SPECULATIVE;
... 82 lines elided ...
2895 2786                          if (p == DDT_PHYS_DITTO) {
2896 2787                                  zdb_count_block(zcb, NULL, &blk, ZDB_OT_DITTO);
2897 2788                          } else {
2898 2789                                  zcb->zcb_dedup_asize +=
2899 2790                                      BP_GET_ASIZE(&blk) * (ddp->ddp_refcnt - 1);
2900 2791                                  zcb->zcb_dedup_blocks++;
2901 2792                          }
2902 2793                  }
2903 2794                  if (!dump_opt['L']) {
2904 2795                          ddt_t *ddt = spa->spa_ddt[ddb.ddb_checksum];
2905      -                        ddt_enter(ddt);
2906      -                        VERIFY(ddt_lookup(ddt, &blk, B_TRUE) != NULL);
2907      -                        ddt_exit(ddt);
     2796 +                        ddt_entry_t *dde;
     2797 +                        VERIFY((dde = ddt_lookup(ddt, &blk, B_TRUE)) != NULL);
     2798 +                        dde_exit(dde);
2908 2799                  }
2909 2800          }
2910 2801  
2911 2802          ASSERT(error == ENOENT);
2912 2803  }
2913 2804  
2914      -/* ARGSUSED */
2915 2805  static void
2916      -claim_segment_impl_cb(uint64_t inner_offset, vdev_t *vd, uint64_t offset,
2917      -    uint64_t size, void *arg)
2918      -{
2919      -        /*
2920      -         * This callback was called through a remap from
2921      -         * a device being removed. Therefore, the vdev that
2922      -         * this callback is applied to is a concrete
2923      -         * vdev.
2924      -         */
2925      -        ASSERT(vdev_is_concrete(vd));
2926      -
2927      -        VERIFY0(metaslab_claim_impl(vd, offset, size,
2928      -            spa_first_txg(vd->vdev_spa)));
2929      -}
2930      -
2931      -static void
2932      -claim_segment_cb(void *arg, uint64_t offset, uint64_t size)
2933      -{
2934      -        vdev_t *vd = arg;
2935      -
2936      -        vdev_indirect_ops.vdev_op_remap(vd, offset, size,
2937      -            claim_segment_impl_cb, NULL);
2938      -}
2939      -
2940      -/*
2941      - * After accounting for all allocated blocks that are directly referenced,
2942      - * we might have missed a reference to a block from a partially complete
2943      - * (and thus unused) indirect mapping object. We perform a secondary pass
2944      - * through the metaslabs we have already mapped and claim the destination
2945      - * blocks.
2946      - */
2947      -static void
2948      -zdb_claim_removing(spa_t *spa, zdb_cb_t *zcb)
2949      -{
2950      -        if (spa->spa_vdev_removal == NULL)
2951      -                return;
2952      -
2953      -        spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
2954      -
2955      -        spa_vdev_removal_t *svr = spa->spa_vdev_removal;
2956      -        vdev_t *vd = svr->svr_vdev;
2957      -        vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
2958      -
2959      -        for (uint64_t msi = 0; msi < vd->vdev_ms_count; msi++) {
2960      -                metaslab_t *msp = vd->vdev_ms[msi];
2961      -
2962      -                if (msp->ms_start >= vdev_indirect_mapping_max_offset(vim))
2963      -                        break;
2964      -
2965      -                ASSERT0(range_tree_space(svr->svr_allocd_segs));
2966      -
2967      -                if (msp->ms_sm != NULL) {
2968      -                        VERIFY0(space_map_load(msp->ms_sm,
2969      -                            svr->svr_allocd_segs, SM_ALLOC));
2970      -
2971      -                        /*
2972      -                         * Clear everything past what has been synced,
2973      -                         * because we have not allocated mappings for it yet.
2974      -                         */
2975      -                        range_tree_clear(svr->svr_allocd_segs,
2976      -                            vdev_indirect_mapping_max_offset(vim),
2977      -                            msp->ms_sm->sm_start + msp->ms_sm->sm_size -
2978      -                            vdev_indirect_mapping_max_offset(vim));
2979      -                }
2980      -
2981      -                zcb->zcb_removing_size +=
2982      -                    range_tree_space(svr->svr_allocd_segs);
2983      -                range_tree_vacate(svr->svr_allocd_segs, claim_segment_cb, vd);
2984      -        }
2985      -
2986      -        spa_config_exit(spa, SCL_CONFIG, FTAG);
2987      -}
2988      -
2989      -/*
2990      - * vm_idxp is an in-out parameter which (for indirect vdevs) is the
2991      - * index in vim_entries that has the first entry in this metaslab.  On
2992      - * return, it will be set to the first entry after this metaslab.
2993      - */
2994      -static void
2995      -zdb_leak_init_ms(metaslab_t *msp, uint64_t *vim_idxp)
2996      -{
2997      -        metaslab_group_t *mg = msp->ms_group;
2998      -        vdev_t *vd = mg->mg_vd;
2999      -        vdev_t *rvd = vd->vdev_spa->spa_root_vdev;
3000      -
3001      -        mutex_enter(&msp->ms_lock);
3002      -        metaslab_unload(msp);
3003      -
3004      -        /*
3005      -         * We don't want to spend the CPU manipulating the size-ordered
3006      -         * tree, so clear the range_tree ops.
3007      -         */
3008      -        msp->ms_tree->rt_ops = NULL;
3009      -
3010      -        (void) fprintf(stderr,
3011      -            "\rloading vdev %llu of %llu, metaslab %llu of %llu ...",
3012      -            (longlong_t)vd->vdev_id,
3013      -            (longlong_t)rvd->vdev_children,
3014      -            (longlong_t)msp->ms_id,
3015      -            (longlong_t)vd->vdev_ms_count);
3016      -
3017      -        /*
3018      -         * For leak detection, we overload the metaslab ms_tree to
3019      -         * contain allocated segments instead of free segments. As a
3020      -         * result, we can't use the normal metaslab_load/unload
3021      -         * interfaces.
3022      -         */
3023      -        if (vd->vdev_ops == &vdev_indirect_ops) {
3024      -                vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
3025      -                for (; *vim_idxp < vdev_indirect_mapping_num_entries(vim);
3026      -                    (*vim_idxp)++) {
3027      -                        vdev_indirect_mapping_entry_phys_t *vimep =
3028      -                            &vim->vim_entries[*vim_idxp];
3029      -                        uint64_t ent_offset = DVA_MAPPING_GET_SRC_OFFSET(vimep);
3030      -                        uint64_t ent_len = DVA_GET_ASIZE(&vimep->vimep_dst);
3031      -                        ASSERT3U(ent_offset, >=, msp->ms_start);
3032      -                        if (ent_offset >= msp->ms_start + msp->ms_size)
3033      -                                break;
3034      -
3035      -                        /*
3036      -                         * Mappings do not cross metaslab boundaries,
3037      -                         * because we create them by walking the metaslabs.
3038      -                         */
3039      -                        ASSERT3U(ent_offset + ent_len, <=,
3040      -                            msp->ms_start + msp->ms_size);
3041      -                        range_tree_add(msp->ms_tree, ent_offset, ent_len);
3042      -                }
3043      -        } else if (msp->ms_sm != NULL) {
3044      -                VERIFY0(space_map_load(msp->ms_sm, msp->ms_tree, SM_ALLOC));
3045      -        }
3046      -
3047      -        if (!msp->ms_loaded) {
3048      -                msp->ms_loaded = B_TRUE;
3049      -        }
3050      -        mutex_exit(&msp->ms_lock);
3051      -}
3052      -
3053      -/* ARGSUSED */
3054      -static int
3055      -increment_indirect_mapping_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx)
3056      -{
3057      -        zdb_cb_t *zcb = arg;
3058      -        spa_t *spa = zcb->zcb_spa;
3059      -        vdev_t *vd;
3060      -        const dva_t *dva = &bp->blk_dva[0];
3061      -
3062      -        ASSERT(!dump_opt['L']);
3063      -        ASSERT3U(BP_GET_NDVAS(bp), ==, 1);
3064      -
3065      -        spa_config_enter(spa, SCL_VDEV, FTAG, RW_READER);
3066      -        vd = vdev_lookup_top(zcb->zcb_spa, DVA_GET_VDEV(dva));
3067      -        ASSERT3P(vd, !=, NULL);
3068      -        spa_config_exit(spa, SCL_VDEV, FTAG);
3069      -
3070      -        ASSERT(vd->vdev_indirect_config.vic_mapping_object != 0);
3071      -        ASSERT3P(zcb->zcb_vd_obsolete_counts[vd->vdev_id], !=, NULL);
3072      -
3073      -        vdev_indirect_mapping_increment_obsolete_count(
3074      -            vd->vdev_indirect_mapping,
3075      -            DVA_GET_OFFSET(dva), DVA_GET_ASIZE(dva),
3076      -            zcb->zcb_vd_obsolete_counts[vd->vdev_id]);
3077      -
3078      -        return (0);
3079      -}
3080      -
3081      -static uint32_t *
3082      -zdb_load_obsolete_counts(vdev_t *vd)
3083      -{
3084      -        vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
3085      -        spa_t *spa = vd->vdev_spa;
3086      -        spa_condensing_indirect_phys_t *scip =
3087      -            &spa->spa_condensing_indirect_phys;
3088      -        uint32_t *counts;
3089      -
3090      -        EQUIV(vdev_obsolete_sm_object(vd) != 0, vd->vdev_obsolete_sm != NULL);
3091      -        counts = vdev_indirect_mapping_load_obsolete_counts(vim);
3092      -        if (vd->vdev_obsolete_sm != NULL) {
3093      -                vdev_indirect_mapping_load_obsolete_spacemap(vim, counts,
3094      -                    vd->vdev_obsolete_sm);
3095      -        }
3096      -        if (scip->scip_vdev == vd->vdev_id &&
3097      -            scip->scip_prev_obsolete_sm_object != 0) {
3098      -                space_map_t *prev_obsolete_sm = NULL;
3099      -                VERIFY0(space_map_open(&prev_obsolete_sm, spa->spa_meta_objset,
3100      -                    scip->scip_prev_obsolete_sm_object, 0, vd->vdev_asize, 0));
3101      -                space_map_update(prev_obsolete_sm);
3102      -                vdev_indirect_mapping_load_obsolete_spacemap(vim, counts,
3103      -                    prev_obsolete_sm);
3104      -                space_map_close(prev_obsolete_sm);
3105      -        }
3106      -        return (counts);
3107      -}
3108      -
3109      -static void
3110 2806  zdb_leak_init(spa_t *spa, zdb_cb_t *zcb)
3111 2807  {
3112 2808          zcb->zcb_spa = spa;
3113 2809  
3114 2810          if (!dump_opt['L']) {
3115      -                dsl_pool_t *dp = spa->spa_dsl_pool;
3116 2811                  vdev_t *rvd = spa->spa_root_vdev;
3117 2812  
3118 2813                  /*
3119 2814                   * We are going to be changing the meaning of the metaslab's
3120 2815                   * ms_tree.  Ensure that the allocator doesn't try to
3121 2816                   * use the tree.
3122 2817                   */
3123 2818                  spa->spa_normal_class->mc_ops = &zdb_metaslab_ops;
3124 2819                  spa->spa_log_class->mc_ops = &zdb_metaslab_ops;
3125 2820  
3126      -                zcb->zcb_vd_obsolete_counts =
3127      -                    umem_zalloc(rvd->vdev_children * sizeof (uint32_t *),
3128      -                    UMEM_NOFAIL);
3129      -
3130      -
3131 2821                  for (uint64_t c = 0; c < rvd->vdev_children; c++) {
3132 2822                          vdev_t *vd = rvd->vdev_child[c];
3133      -                        uint64_t vim_idx = 0;
     2823 +                        metaslab_group_t *mg = vd->vdev_mg;
     2824 +                        for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
     2825 +                                metaslab_t *msp = vd->vdev_ms[m];
     2826 +                                ASSERT3P(msp->ms_group, ==, mg);
     2827 +                                mutex_enter(&msp->ms_lock);
     2828 +                                metaslab_unload(msp);
3134 2829  
3135      -                        ASSERT3U(c, ==, vd->vdev_id);
3136      -
3137      -                        /*
3138      -                         * Note: we don't check for mapping leaks on
3139      -                         * removing vdevs because their ms_tree's are
3140      -                         * used to look for leaks in allocated space.
3141      -                         */
3142      -                        if (vd->vdev_ops == &vdev_indirect_ops) {
3143      -                                zcb->zcb_vd_obsolete_counts[c] =
3144      -                                    zdb_load_obsolete_counts(vd);
3145      -
3146 2830                                  /*
3147      -                                 * Normally, indirect vdevs don't have any
3148      -                                 * metaslabs.  We want to set them up for
3149      -                                 * zio_claim().
     2831 +                                 * For leak detection, we overload the metaslab
     2832 +                                 * ms_tree to contain allocated segments
     2833 +                                 * instead of free segments. As a result,
     2834 +                                 * we can't use the normal metaslab_load/unload
     2835 +                                 * interfaces.
3150 2836                                   */
3151      -                                VERIFY0(vdev_metaslab_init(vd, 0));
3152      -                        }
     2837 +                                if (msp->ms_sm != NULL) {
     2838 +                                        (void) fprintf(stderr,
     2839 +                                            "\rloading space map for "
     2840 +                                            "vdev %llu of %llu, "
     2841 +                                            "metaslab %llu of %llu ...",
     2842 +                                            (longlong_t)c,
     2843 +                                            (longlong_t)rvd->vdev_children,
     2844 +                                            (longlong_t)m,
     2845 +                                            (longlong_t)vd->vdev_ms_count);
3153 2846  
3154      -                        for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
3155      -                                zdb_leak_init_ms(vd->vdev_ms[m], &vim_idx);
     2847 +                                        /*
     2848 +                                         * We don't want to spend the CPU
     2849 +                                         * manipulating the size-ordered
     2850 +                                         * tree, so clear the range_tree
     2851 +                                         * ops.
     2852 +                                         */
     2853 +                                        msp->ms_tree->rt_ops = NULL;
     2854 +                                        VERIFY0(space_map_load(msp->ms_sm,
     2855 +                                            msp->ms_tree, SM_ALLOC));
     2856 +
     2857 +                                        if (!msp->ms_loaded) {
     2858 +                                                msp->ms_loaded = B_TRUE;
     2859 +                                        }
     2860 +                                }
     2861 +                                mutex_exit(&msp->ms_lock);
3156 2862                          }
3157      -                        if (vd->vdev_ops == &vdev_indirect_ops) {
3158      -                                ASSERT3U(vim_idx, ==,
3159      -                                    vdev_indirect_mapping_num_entries(
3160      -                                    vd->vdev_indirect_mapping));
3161      -                        }
3162 2863                  }
3163 2864                  (void) fprintf(stderr, "\n");
3164      -
3165      -                if (bpobj_is_open(&dp->dp_obsolete_bpobj)) {
3166      -                        ASSERT(spa_feature_is_enabled(spa,
3167      -                            SPA_FEATURE_DEVICE_REMOVAL));
3168      -                        (void) bpobj_iterate_nofree(&dp->dp_obsolete_bpobj,
3169      -                            increment_indirect_mapping_cb, zcb, NULL);
3170      -                }
3171 2865          }
3172 2866  
3173 2867          spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
3174 2868  
3175 2869          zdb_ddt_leak_init(spa, zcb);
3176 2870  
3177 2871          spa_config_exit(spa, SCL_CONFIG, FTAG);
3178 2872  }
3179 2873  
3180      -static boolean_t
3181      -zdb_check_for_obsolete_leaks(vdev_t *vd, zdb_cb_t *zcb)
     2874 +static void
     2875 +zdb_leak_fini(spa_t *spa)
3182 2876  {
3183      -        boolean_t leaks = B_FALSE;
3184      -        vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
3185      -        uint64_t total_leaked = 0;
3186      -
3187      -        ASSERT(vim != NULL);
3188      -
3189      -        for (uint64_t i = 0; i < vdev_indirect_mapping_num_entries(vim); i++) {
3190      -                vdev_indirect_mapping_entry_phys_t *vimep =
3191      -                    &vim->vim_entries[i];
3192      -                uint64_t obsolete_bytes = 0;
3193      -                uint64_t offset = DVA_MAPPING_GET_SRC_OFFSET(vimep);
3194      -                metaslab_t *msp = vd->vdev_ms[offset >> vd->vdev_ms_shift];
3195      -
3196      -                /*
3197      -                 * This is not very efficient but it's easy to
3198      -                 * verify correctness.
3199      -                 */
3200      -                for (uint64_t inner_offset = 0;
3201      -                    inner_offset < DVA_GET_ASIZE(&vimep->vimep_dst);
3202      -                    inner_offset += 1 << vd->vdev_ashift) {
3203      -                        if (range_tree_contains(msp->ms_tree,
3204      -                            offset + inner_offset, 1 << vd->vdev_ashift)) {
3205      -                                obsolete_bytes += 1 << vd->vdev_ashift;
3206      -                        }
3207      -                }
3208      -
3209      -                int64_t bytes_leaked = obsolete_bytes -
3210      -                    zcb->zcb_vd_obsolete_counts[vd->vdev_id][i];
3211      -                ASSERT3U(DVA_GET_ASIZE(&vimep->vimep_dst), >=,
3212      -                    zcb->zcb_vd_obsolete_counts[vd->vdev_id][i]);
3213      -                if (bytes_leaked != 0 &&
3214      -                    (vdev_obsolete_counts_are_precise(vd) ||
3215      -                    dump_opt['d'] >= 5)) {
3216      -                        (void) printf("obsolete indirect mapping count "
3217      -                            "mismatch on %llu:%llx:%llx : %llx bytes leaked\n",
3218      -                            (u_longlong_t)vd->vdev_id,
3219      -                            (u_longlong_t)DVA_MAPPING_GET_SRC_OFFSET(vimep),
3220      -                            (u_longlong_t)DVA_GET_ASIZE(&vimep->vimep_dst),
3221      -                            (u_longlong_t)bytes_leaked);
3222      -                }
3223      -                total_leaked += ABS(bytes_leaked);
3224      -        }
3225      -
3226      -        if (!vdev_obsolete_counts_are_precise(vd) && total_leaked > 0) {
3227      -                int pct_leaked = total_leaked * 100 /
3228      -                    vdev_indirect_mapping_bytes_mapped(vim);
3229      -                (void) printf("cannot verify obsolete indirect mapping "
3230      -                    "counts of vdev %llu because precise feature was not "
3231      -                    "enabled when it was removed: %d%% (%llx bytes) of mapping"
3232      -                    "unreferenced\n",
3233      -                    (u_longlong_t)vd->vdev_id, pct_leaked,
3234      -                    (u_longlong_t)total_leaked);
3235      -        } else if (total_leaked > 0) {
3236      -                (void) printf("obsolete indirect mapping count mismatch "
3237      -                    "for vdev %llu -- %llx total bytes mismatched\n",
3238      -                    (u_longlong_t)vd->vdev_id,
3239      -                    (u_longlong_t)total_leaked);
3240      -                leaks |= B_TRUE;
3241      -        }
3242      -
3243      -        vdev_indirect_mapping_free_obsolete_counts(vim,
3244      -            zcb->zcb_vd_obsolete_counts[vd->vdev_id]);
3245      -        zcb->zcb_vd_obsolete_counts[vd->vdev_id] = NULL;
3246      -
3247      -        return (leaks);
3248      -}
3249      -
3250      -static boolean_t
3251      -zdb_leak_fini(spa_t *spa, zdb_cb_t *zcb)
3252      -{
3253      -        boolean_t leaks = B_FALSE;
3254 2877          if (!dump_opt['L']) {
3255 2878                  vdev_t *rvd = spa->spa_root_vdev;
3256 2879                  for (unsigned c = 0; c < rvd->vdev_children; c++) {
3257 2880                          vdev_t *vd = rvd->vdev_child[c];
3258 2881                          metaslab_group_t *mg = vd->vdev_mg;
3259      -
3260      -                        if (zcb->zcb_vd_obsolete_counts[c] != NULL) {
3261      -                                leaks |= zdb_check_for_obsolete_leaks(vd, zcb);
3262      -                        }
3263      -
3264      -                        for (uint64_t m = 0; m < vd->vdev_ms_count; m++) {
     2882 +                        for (unsigned m = 0; m < vd->vdev_ms_count; m++) {
3265 2883                                  metaslab_t *msp = vd->vdev_ms[m];
3266 2884                                  ASSERT3P(mg, ==, msp->ms_group);
     2885 +                                mutex_enter(&msp->ms_lock);
3267 2886  
3268 2887                                  /*
3269 2888                                   * The ms_tree has been overloaded to
3270 2889                                   * contain allocated segments. Now that we
3271 2890                                   * finished traversing all blocks, any
3272 2891                                   * block that remains in the ms_tree
3273 2892                                   * represents an allocated block that we
3274 2893                                   * did not claim during the traversal.
3275 2894                                   * Claimed blocks would have been removed
3276      -                                 * from the ms_tree.  For indirect vdevs,
3277      -                                 * space remaining in the tree represents
3278      -                                 * parts of the mapping that are not
3279      -                                 * referenced, which is not a bug.
     2895 +                                 * from the ms_tree.
3280 2896                                   */
3281      -                                if (vd->vdev_ops == &vdev_indirect_ops) {
3282      -                                        range_tree_vacate(msp->ms_tree,
3283      -                                            NULL, NULL);
3284      -                                } else {
3285      -                                        range_tree_vacate(msp->ms_tree,
3286      -                                            zdb_leak, vd);
3287      -                                }
     2897 +                                range_tree_vacate(msp->ms_tree, zdb_leak, vd);
3288 2898  
3289 2899                                  if (msp->ms_loaded) {
3290 2900                                          msp->ms_loaded = B_FALSE;
3291 2901                                  }
     2902 +
     2903 +                                mutex_exit(&msp->ms_lock);
3292 2904                          }
3293 2905                  }
3294      -
3295      -                umem_free(zcb->zcb_vd_obsolete_counts,
3296      -                    rvd->vdev_children * sizeof (uint32_t *));
3297      -                zcb->zcb_vd_obsolete_counts = NULL;
3298 2906          }
3299      -        return (leaks);
3300 2907  }
3301 2908  
3302 2909  /* ARGSUSED */
3303 2910  static int
3304 2911  count_block_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx)
3305 2912  {
3306 2913          zdb_cb_t *zcb = arg;
3307 2914  
3308 2915          if (dump_opt['b'] >= 5) {
3309 2916                  char blkbuf[BP_SPRINTF_LEN];
... 3 lines elided ...
3313 2920          }
3314 2921          zdb_count_block(zcb, NULL, bp, ZDB_OT_DEFERRED);
3315 2922          return (0);
3316 2923  }
3317 2924  
3318 2925  static int
3319 2926  dump_block_stats(spa_t *spa)
3320 2927  {
3321 2928          zdb_cb_t zcb;
3322 2929          zdb_blkstats_t *zb, *tzb;
3323      -        uint64_t norm_alloc, norm_space, total_alloc, total_found;
     2930 +        uint64_t norm_alloc, spec_alloc, norm_space, total_alloc, total_found;
3324 2931          int flags = TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA | TRAVERSE_HARD;
3325 2932          boolean_t leaks = B_FALSE;
3326 2933  
3327 2934          bzero(&zcb, sizeof (zcb));
3328 2935          (void) printf("\nTraversing all blocks %s%s%s%s%s...\n\n",
3329 2936              (dump_opt['c'] || !dump_opt['L']) ? "to verify " : "",
3330 2937              (dump_opt['c'] == 1) ? "metadata " : "",
3331 2938              dump_opt['c'] ? "checksums " : "",
3332 2939              (dump_opt['c'] && !dump_opt['L']) ? "and verify " : "",
3333 2940              !dump_opt['L'] ? "nothing leaked " : "");
... 6 lines elided ...
3340 2947           * it's not part of any space map) is a double allocation,
3341 2948           * reference to a freed block, or an unclaimed log block.
3342 2949           */
3343 2950          zdb_leak_init(spa, &zcb);
3344 2951  
3345 2952          /*
3346 2953           * If there's a deferred-free bplist, process that first.
3347 2954           */
3348 2955          (void) bpobj_iterate_nofree(&spa->spa_deferred_bpobj,
3349 2956              count_block_cb, &zcb, NULL);
3350      -
3351 2957          if (spa_version(spa) >= SPA_VERSION_DEADLISTS) {
3352 2958                  (void) bpobj_iterate_nofree(&spa->spa_dsl_pool->dp_free_bpobj,
3353 2959                      count_block_cb, &zcb, NULL);
3354 2960          }
3355      -
3356      -        zdb_claim_removing(spa, &zcb);
3357      -
3358 2961          if (spa_feature_is_active(spa, SPA_FEATURE_ASYNC_DESTROY)) {
3359 2962                  VERIFY3U(0, ==, bptree_iterate(spa->spa_meta_objset,
3360 2963                      spa->spa_dsl_pool->dp_bptree_obj, B_FALSE, count_block_cb,
3361 2964                      &zcb, NULL));
3362 2965          }
3363 2966  
3364 2967          if (dump_opt['c'] > 1)
3365 2968                  flags |= TRAVERSE_PREFETCH_DATA;
3366 2969  
3367 2970          zcb.zcb_totalasize = metaslab_class_get_alloc(spa_normal_class(spa));
3368 2971          zcb.zcb_start = zcb.zcb_lastprint = gethrtime();
3369      -        zcb.zcb_haderrors |= traverse_pool(spa, 0, flags, zdb_blkptr_cb, &zcb);
     2972 +        zcb.zcb_haderrors |= traverse_pool(spa, 0, UINT64_MAX,
     2973 +            flags, zdb_blkptr_cb, &zcb, NULL);
3370 2974  
3371 2975          /*
3372 2976           * If we've traversed the data blocks then we need to wait for those
3373 2977           * I/Os to complete. We leverage "The Godfather" zio to wait on
3374 2978           * all async I/Os to complete.
3375 2979           */
3376 2980          if (dump_opt['c']) {
3377 2981                  for (int i = 0; i < max_ncpus; i++) {
3378 2982                          (void) zio_wait(spa->spa_async_zio_root[i]);
3379 2983                          spa->spa_async_zio_root[i] = zio_root(spa, NULL, NULL,
... 9 lines elided ...
3389 2993                          if (zcb.zcb_errors[e] != 0) {
3390 2994                                  (void) printf("\t%5d  %llu\n",
3391 2995                                      e, (u_longlong_t)zcb.zcb_errors[e]);
3392 2996                          }
3393 2997                  }
3394 2998          }
3395 2999  
3396 3000          /*
3397 3001           * Report any leaked segments.
3398 3002           */
3399      -        leaks |= zdb_leak_fini(spa, &zcb);
     3003 +        zdb_leak_fini(spa);
3400 3004  
3401 3005          tzb = &zcb.zcb_type[ZB_TOTAL][ZDB_OT_TOTAL];
3402 3006  
3403 3007          norm_alloc = metaslab_class_get_alloc(spa_normal_class(spa));
     3008 +        spec_alloc = metaslab_class_get_alloc(spa_special_class(spa));
3404 3009          norm_space = metaslab_class_get_space(spa_normal_class(spa));
3405 3010  
     3011 +        norm_alloc += spec_alloc;
3406 3012          total_alloc = norm_alloc + metaslab_class_get_alloc(spa_log_class(spa));
3407      -        total_found = tzb->zb_asize - zcb.zcb_dedup_asize +
3408      -            zcb.zcb_removing_size;
     3013 +        total_found = tzb->zb_asize - zcb.zcb_dedup_asize;
3409 3014  
3410 3015          if (total_found == total_alloc) {
3411 3016                  if (!dump_opt['L'])
3412 3017                          (void) printf("\n\tNo leaks (block sum matches space"
3413 3018                              " maps exactly)\n");
3414 3019          } else {
3415 3020                  (void) printf("block traversal size %llu != alloc %llu "
3416 3021                      "(%s %lld)\n",
3417 3022                      (u_longlong_t)total_found,
3418 3023                      (u_longlong_t)total_alloc,
... 21 lines elided ...
3440 3045          (void) printf("\tbp allocated:  %10llu      avg:"
3441 3046              " %6llu     compression: %6.2f\n",
3442 3047              (u_longlong_t)tzb->zb_asize,
3443 3048              (u_longlong_t)(tzb->zb_asize / tzb->zb_count),
3444 3049              (double)tzb->zb_lsize / tzb->zb_asize);
3445 3050          (void) printf("\tbp deduped:    %10llu    ref>1:"
3446 3051              " %6llu   deduplication: %6.2f\n",
3447 3052              (u_longlong_t)zcb.zcb_dedup_asize,
3448 3053              (u_longlong_t)zcb.zcb_dedup_blocks,
3449 3054              (double)zcb.zcb_dedup_asize / tzb->zb_asize + 1.0);
     3055 +        if (spec_alloc != 0) {
     3056 +                (void) printf("\tspecial allocated: %10llu\n",
     3057 +                    (u_longlong_t)spec_alloc);
     3058 +        }
3450 3059          (void) printf("\tSPA allocated: %10llu     used: %5.2f%%\n",
3451 3060              (u_longlong_t)norm_alloc, 100.0 * norm_alloc / norm_space);
3452 3061  
3453 3062          for (bp_embedded_type_t i = 0; i < NUM_BP_EMBEDDED_TYPES; i++) {
3454 3063                  if (zcb.zcb_embedded_blocks[i] == 0)
3455 3064                          continue;
3456 3065                  (void) printf("\n");
3457 3066                  (void) printf("\tadditional, non-pointer bps of type %u: "
3458 3067                      "%10llu\n",
3459 3068                      i, (u_longlong_t)zcb.zcb_embedded_blocks[i]);
[5 lines elided]
3465 3074                              sizeof (zcb.zcb_embedded_histogram[i]) /
3466 3075                              sizeof (zcb.zcb_embedded_histogram[i][0]), 0);
3467 3076                  }
3468 3077          }
3469 3078  
3470 3079          if (tzb->zb_ditto_samevdev != 0) {
3471 3080                  (void) printf("\tDittoed blocks on same vdev: %llu\n",
3472 3081                      (longlong_t)tzb->zb_ditto_samevdev);
3473 3082          }
3474 3083  
3475      -        for (uint64_t v = 0; v < spa->spa_root_vdev->vdev_children; v++) {
3476      -                vdev_t *vd = spa->spa_root_vdev->vdev_child[v];
3477      -                vdev_indirect_mapping_t *vim = vd->vdev_indirect_mapping;
3478      -
3479      -                if (vim == NULL) {
3480      -                        continue;
3481      -                }
3482      -
3483      -                char mem[32];
3484      -                zdb_nicenum(vdev_indirect_mapping_num_entries(vim),
3485      -                    mem, vdev_indirect_mapping_size(vim));
3486      -
3487      -                (void) printf("\tindirect vdev id %llu has %llu segments "
3488      -                    "(%s in memory)\n",
3489      -                    (longlong_t)vd->vdev_id,
3490      -                    (longlong_t)vdev_indirect_mapping_num_entries(vim), mem);
3491      -        }
3492      -
3493 3084          if (dump_opt['b'] >= 2) {
3494 3085                  int l, t, level;
3495 3086                  (void) printf("\nBlocks\tLSIZE\tPSIZE\tASIZE"
3496 3087                      "\t  avg\t comp\t%%Total\tType\n");
3497 3088  
3498 3089                  for (t = 0; t <= ZDB_OT_TOTAL; t++) {
3499 3090                          char csize[32], lsize[32], psize[32], asize[32];
3500 3091                          char avg[32], gang[32];
3501 3092                          const char *typename;
3502 3093  
[112 lines elided]
3615 3206  
3616 3207          if (dump_opt['S'] > 1 && zb->zb_level == ZB_ROOT_LEVEL) {
3617 3208                  (void) printf("traversing objset %llu, %llu objects, "
3618 3209                      "%lu blocks so far\n",
3619 3210                      (u_longlong_t)zb->zb_objset,
3620 3211                      (u_longlong_t)BP_GET_FILL(bp),
3621 3212                      avl_numnodes(t));
3622 3213          }
3623 3214  
3624 3215          if (BP_IS_HOLE(bp) || BP_GET_CHECKSUM(bp) == ZIO_CHECKSUM_OFF ||
3625      -            BP_GET_LEVEL(bp) > 0 || DMU_OT_IS_METADATA(BP_GET_TYPE(bp)))
     3216 +            BP_IS_METADATA(bp))
3626 3217                  return (0);
3627 3218  
3628 3219          ddt_key_fill(&zdde_search.zdde_key, bp);
3629 3220  
3630 3221          zdde = avl_find(t, &zdde_search, &where);
3631 3222  
3632 3223          if (zdde == NULL) {
3633 3224                  zdde = umem_zalloc(sizeof (*zdde), UMEM_NOFAIL);
3634 3225                  zdde->zdde_key = zdde_search.zdde_key;
3635 3226                  avl_insert(t, zdde, where);
[16 lines elided]
3652 3243          ddt_histogram_t ddh_total;
3653 3244          ddt_stat_t dds_total;
3654 3245  
3655 3246          bzero(&ddh_total, sizeof (ddh_total));
3656 3247          bzero(&dds_total, sizeof (dds_total));
3657 3248          avl_create(&t, ddt_entry_compare,
3658 3249              sizeof (zdb_ddt_entry_t), offsetof(zdb_ddt_entry_t, zdde_node));
3659 3250  
3660 3251          spa_config_enter(spa, SCL_CONFIG, FTAG, RW_READER);
3661 3252  
3662      -        (void) traverse_pool(spa, 0, TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA,
3663      -            zdb_ddt_add_cb, &t);
     3253 +        (void) traverse_pool(spa, 0, UINT64_MAX,
     3254 +            TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA,
     3255 +            zdb_ddt_add_cb, &t, NULL);
3664 3256  
3665 3257          spa_config_exit(spa, SCL_CONFIG, FTAG);
3666 3258  
3667 3259          while ((zdde = avl_destroy_nodes(&t, &cookie)) != NULL) {
3668 3260                  ddt_stat_t dds;
3669 3261                  uint64_t refcnt = zdde->zdde_ref_blocks;
3670 3262                  ASSERT(refcnt != 0);
3671 3263  
3672 3264                  dds.dds_blocks = zdde->zdde_ref_blocks / refcnt;
3673 3265                  dds.dds_lsize = zdde->zdde_ref_lsize / refcnt;
[15 lines elided]
3689 3281  
3690 3282          ddt_histogram_stat(&dds_total, &ddh_total);
3691 3283  
3692 3284          (void) printf("Simulated DDT histogram:\n");
3693 3285  
3694 3286          zpool_dump_ddt(&dds_total, &ddh_total);
3695 3287  
3696 3288          dump_dedup_ratio(&dds_total);
3697 3289  }
3698 3290  
3699      -static int
3700      -verify_device_removal_feature_counts(spa_t *spa)
3701      -{
3702      -        uint64_t dr_feature_refcount = 0;
3703      -        uint64_t oc_feature_refcount = 0;
3704      -        uint64_t indirect_vdev_count = 0;
3705      -        uint64_t precise_vdev_count = 0;
3706      -        uint64_t obsolete_counts_object_count = 0;
3707      -        uint64_t obsolete_sm_count = 0;
3708      -        uint64_t obsolete_counts_count = 0;
3709      -        uint64_t scip_count = 0;
3710      -        uint64_t obsolete_bpobj_count = 0;
3711      -        int ret = 0;
3712      -
3713      -        spa_condensing_indirect_phys_t *scip =
3714      -            &spa->spa_condensing_indirect_phys;
3715      -        if (scip->scip_next_mapping_object != 0) {
3716      -                vdev_t *vd = spa->spa_root_vdev->vdev_child[scip->scip_vdev];
3717      -                ASSERT(scip->scip_prev_obsolete_sm_object != 0);
3718      -                ASSERT3P(vd->vdev_ops, ==, &vdev_indirect_ops);
3719      -
3720      -                (void) printf("Condensing indirect vdev %llu: new mapping "
3721      -                    "object %llu, prev obsolete sm %llu\n",
3722      -                    (u_longlong_t)scip->scip_vdev,
3723      -                    (u_longlong_t)scip->scip_next_mapping_object,
3724      -                    (u_longlong_t)scip->scip_prev_obsolete_sm_object);
3725      -                if (scip->scip_prev_obsolete_sm_object != 0) {
3726      -                        space_map_t *prev_obsolete_sm = NULL;
3727      -                        VERIFY0(space_map_open(&prev_obsolete_sm,
3728      -                            spa->spa_meta_objset,
3729      -                            scip->scip_prev_obsolete_sm_object,
3730      -                            0, vd->vdev_asize, 0));
3731      -                        space_map_update(prev_obsolete_sm);
3732      -                        dump_spacemap(spa->spa_meta_objset, prev_obsolete_sm);
3733      -                        (void) printf("\n");
3734      -                        space_map_close(prev_obsolete_sm);
3735      -                }
3736      -
3737      -                scip_count += 2;
3738      -        }
3739      -
3740      -        for (uint64_t i = 0; i < spa->spa_root_vdev->vdev_children; i++) {
3741      -                vdev_t *vd = spa->spa_root_vdev->vdev_child[i];
3742      -                vdev_indirect_config_t *vic = &vd->vdev_indirect_config;
3743      -
3744      -                if (vic->vic_mapping_object != 0) {
3745      -                        ASSERT(vd->vdev_ops == &vdev_indirect_ops ||
3746      -                            vd->vdev_removing);
3747      -                        indirect_vdev_count++;
3748      -
3749      -                        if (vd->vdev_indirect_mapping->vim_havecounts) {
3750      -                                obsolete_counts_count++;
3751      -                        }
3752      -                }
3753      -                if (vdev_obsolete_counts_are_precise(vd)) {
3754      -                        ASSERT(vic->vic_mapping_object != 0);
3755      -                        precise_vdev_count++;
3756      -                }
3757      -                if (vdev_obsolete_sm_object(vd) != 0) {
3758      -                        ASSERT(vic->vic_mapping_object != 0);
3759      -                        obsolete_sm_count++;
3760      -                }
3761      -        }
3762      -
3763      -        (void) feature_get_refcount(spa,
3764      -            &spa_feature_table[SPA_FEATURE_DEVICE_REMOVAL],
3765      -            &dr_feature_refcount);
3766      -        (void) feature_get_refcount(spa,
3767      -            &spa_feature_table[SPA_FEATURE_OBSOLETE_COUNTS],
3768      -            &oc_feature_refcount);
3769      -
3770      -        if (dr_feature_refcount != indirect_vdev_count) {
3771      -                ret = 1;
3772      -                (void) printf("Number of indirect vdevs (%llu) " \
3773      -                    "does not match feature count (%llu)\n",
3774      -                    (u_longlong_t)indirect_vdev_count,
3775      -                    (u_longlong_t)dr_feature_refcount);
3776      -        } else {
3777      -                (void) printf("Verified device_removal feature refcount " \
3778      -                    "of %llu is correct\n",
3779      -                    (u_longlong_t)dr_feature_refcount);
3780      -        }
3781      -
3782      -        if (zap_contains(spa_meta_objset(spa), DMU_POOL_DIRECTORY_OBJECT,
3783      -            DMU_POOL_OBSOLETE_BPOBJ) == 0) {
3784      -                obsolete_bpobj_count++;
3785      -        }
3786      -
3787      -
3788      -        obsolete_counts_object_count = precise_vdev_count;
3789      -        obsolete_counts_object_count += obsolete_sm_count;
3790      -        obsolete_counts_object_count += obsolete_counts_count;
3791      -        obsolete_counts_object_count += scip_count;
3792      -        obsolete_counts_object_count += obsolete_bpobj_count;
3793      -        obsolete_counts_object_count += remap_deadlist_count;
3794      -
3795      -        if (oc_feature_refcount != obsolete_counts_object_count) {
3796      -                ret = 1;
3797      -                (void) printf("Number of obsolete counts objects (%llu) " \
3798      -                    "does not match feature count (%llu)\n",
3799      -                    (u_longlong_t)obsolete_counts_object_count,
3800      -                    (u_longlong_t)oc_feature_refcount);
3801      -                (void) printf("pv:%llu os:%llu oc:%llu sc:%llu "
3802      -                    "ob:%llu rd:%llu\n",
3803      -                    (u_longlong_t)precise_vdev_count,
3804      -                    (u_longlong_t)obsolete_sm_count,
3805      -                    (u_longlong_t)obsolete_counts_count,
3806      -                    (u_longlong_t)scip_count,
3807      -                    (u_longlong_t)obsolete_bpobj_count,
3808      -                    (u_longlong_t)remap_deadlist_count);
3809      -        } else {
3810      -                (void) printf("Verified indirect_refcount feature refcount " \
3811      -                    "of %llu is correct\n",
3812      -                    (u_longlong_t)oc_feature_refcount);
3813      -        }
3814      -        return (ret);
3815      -}
3816      -
3817 3291  static void
3818 3292  dump_zpool(spa_t *spa)
3819 3293  {
3820 3294          dsl_pool_t *dp = spa_get_dsl(spa);
3821 3295          int rc = 0;
3822 3296  
3823 3297          if (dump_opt['S']) {
3824 3298                  dump_simulated_ddt(spa);
3825 3299                  return;
3826 3300          }
[13 lines elided]
3840 3314                  dump_all_ddts(spa);
3841 3315  
3842 3316          if (dump_opt['d'] > 2 || dump_opt['m'])
3843 3317                  dump_metaslabs(spa);
3844 3318          if (dump_opt['M'])
3845 3319                  dump_metaslab_groups(spa);
3846 3320  
3847 3321          if (dump_opt['d'] || dump_opt['i']) {
3848 3322                  dump_dir(dp->dp_meta_objset);
3849 3323                  if (dump_opt['d'] >= 3) {
3850      -                        dsl_pool_t *dp = spa->spa_dsl_pool;
3851 3324                          dump_full_bpobj(&spa->spa_deferred_bpobj,
3852 3325                              "Deferred frees", 0);
3853 3326                          if (spa_version(spa) >= SPA_VERSION_DEADLISTS) {
3854      -                                dump_full_bpobj(&dp->dp_free_bpobj,
     3327 +                                dump_full_bpobj(
     3328 +                                    &spa->spa_dsl_pool->dp_free_bpobj,
3855 3329                                      "Pool snapshot frees", 0);
3856 3330                          }
3857      -                        if (bpobj_is_open(&dp->dp_obsolete_bpobj)) {
3858      -                                ASSERT(spa_feature_is_enabled(spa,
3859      -                                    SPA_FEATURE_DEVICE_REMOVAL));
3860      -                                dump_full_bpobj(&dp->dp_obsolete_bpobj,
3861      -                                    "Pool obsolete blocks", 0);
3862      -                        }
3863 3331  
3864 3332                          if (spa_feature_is_active(spa,
3865 3333                              SPA_FEATURE_ASYNC_DESTROY)) {
3866 3334                                  dump_bptree(spa->spa_meta_objset,
3867      -                                    dp->dp_bptree_obj,
     3335 +                                    spa->spa_dsl_pool->dp_bptree_obj,
3868 3336                                      "Pool dataset frees");
3869 3337                          }
3870 3338                          dump_dtl(spa->spa_root_vdev, 0);
3871 3339                  }
3872 3340                  (void) dmu_objset_find(spa_name(spa), dump_one_dir,
3873 3341                      NULL, DS_FIND_SNAPSHOTS | DS_FIND_CHILDREN);
3874 3342  
3875 3343                  for (spa_feature_t f = 0; f < SPA_FEATURES; f++) {
3876 3344                          uint64_t refcount;
3877 3345  
[12 lines elided]
3890 3358                                      (longlong_t)dataset_feature_count[f],
3891 3359                                      (longlong_t)refcount);
3892 3360                                  rc = 2;
3893 3361                          } else {
3894 3362                                  (void) printf("Verified %s feature refcount "
3895 3363                                      "of %llu is correct\n",
3896 3364                                      spa_feature_table[f].fi_uname,
3897 3365                                      (longlong_t)refcount);
3898 3366                          }
3899 3367                  }
3900      -
3901      -                if (rc == 0) {
3902      -                        rc = verify_device_removal_feature_counts(spa);
3903      -                }
3904 3368          }
3905 3369          if (rc == 0 && (dump_opt['b'] || dump_opt['c']))
3906 3370                  rc = dump_block_stats(spa);
3907 3371  
3908 3372          if (rc == 0)
3909 3373                  rc = verify_spacemap_refcounts(spa);
3910 3374  
3911 3375          if (dump_opt['s'])
3912 3376                  show_pool_stats(spa);
3913 3377  
[287 lines elided]
4201 3665                      ZIO_PRIORITY_SYNC_READ,
4202 3666                      ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW, NULL));
4203 3667          } else {
4204 3668                  /*
4205 3669                   * Treat this as a vdev child I/O.
4206 3670                   */
4207 3671                  zio_nowait(zio_vdev_child_io(zio, bp, vd, offset, pabd,
4208 3672                      psize, ZIO_TYPE_READ, ZIO_PRIORITY_SYNC_READ,
4209 3673                      ZIO_FLAG_DONT_CACHE | ZIO_FLAG_DONT_QUEUE |
4210 3674                      ZIO_FLAG_DONT_PROPAGATE | ZIO_FLAG_DONT_RETRY |
4211      -                    ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW | ZIO_FLAG_OPTIONAL,
4212      -                    NULL, NULL));
     3675 +                    ZIO_FLAG_CANFAIL | ZIO_FLAG_RAW, NULL, NULL));
4213 3676          }
4214 3677  
4215 3678          error = zio_wait(zio);
4216 3679          spa_config_exit(spa, SCL_STATE, FTAG);
4217 3680  
4218 3681          if (error) {
4219 3682                  (void) printf("Read of %s failed, error: %d\n", thing, error);
4220 3683                  goto out;
4221 3684          }
4222 3685  
[318 lines elided]
4541 4004           * "zdb -b" uses traversal prefetch which uses async reads.
4542 4005           * For good performance, let several of them be active at once.
4543 4006           */
4544 4007          zfs_vdev_async_read_max_active = 10;
4545 4008  
4546 4009          /*
4547 4010           * Disable reference tracking for better performance.
4548 4011           */
4549 4012          reference_tracking_enable = B_FALSE;
4550 4013  
4551      -        /*
4552      -         * Do not fail spa_load when spa_load_verify fails. This is needed
4553      -         * to load non-idle pools.
4554      -         */
4555      -        spa_load_verify_dryrun = B_TRUE;
4556      -
4557 4014          kernel_init(FREAD);
4558 4015          g_zfs = libzfs_init();
4559 4016          ASSERT(g_zfs != NULL);
4560 4017  
4561 4018          if (dump_all)
4562 4019                  verbose = MAX(verbose, 1);
4563 4020  
4564 4021          for (c = 0; c < 256; c++) {
4565 4022                  if (dump_all && strchr("AeEFlLOPRSX", c) == NULL)
4566 4023                          dump_opt[c] = 1;
[165 lines elided]