NEX-15281 zfs_panic_recover() during hpr disable/enable
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-13629 zfs send -s: assertion failed: err != 0 || (dsp->dsa_sent_begin && dsp->dsa_sent_end), file: ../../common/fs/zfs/dmu_send.c, line: 1010
Reviewed by: Alex Deiter <alex.deiter@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
NEX-9752 backport illumos 6950 ARC should cache compressed data
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
6950 ARC should cache compressed data
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
NEX-9575 zfs send -s panics
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Revert "NEX-7251 Resume_token is not cleared right after finishing receive"
This reverts commit 9e97a45e8cf6ca59307a39e2d3c11c6e845e4187.
NEX-7251 Resume_token is not cleared right after finishing receive
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alexey Komarov <alexey.komarov@nexenta.com>
NEX-5928 KRRP: Integrate illumos/openzfs resume-token, to resume replication from a given synced offset
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alexey Komarov <alexey.komarov@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-5795 Rename 'wrc' as 'wbc' in the source and in the tech docs
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
NEX-5272 KRRP: replicate snapshot properties
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Alexey Komarov <alexey.komarov@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-5270 WBC: Incorrect error message when trying to 'zfs recv' into wrcached dataset
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
NEX-5132 WBC: Do not allow recv to datasets with enabled writecache
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
6358 A faulted pool with only unavailable vdevs triggers assertion failure in libzfs
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Reviewed by: Serban Maduta <serban.maduta@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
6393 zfs receive a full send as a clone
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Dan McDonald <danmcd@omniti.com>
2605 want to resume interrupted zfs send
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed by: Xin Li <delphij@freebsd.org>
Reviewed by: Arne Jansen <sensille@gmx.net>
Approved by: Dan McDonald <danmcd@omniti.com>
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R (fix studio build)
4185 add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Garrett D'Amore <garrett@damore.org>
6047 SPARC boot should support feature@embedded_data
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
5959 clean up per-dataset feature count code
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
NEX-4582 update wrc test cases for allow to use write back cache per tree of datasets
Reviewed by: Steve Peng <steve.peng@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
5960 zfs recv should prefetch indirect blocks
5925 zfs receive -o origin=
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
5946 zfs_ioc_space_snaps must check that firstsnap and lastsnap refer to snapshots
5945 zfs_ioc_send_space must ensure that fromsnap refers to a snapshot
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
5870 dmu_recv_end_check() leaks origin_head hold if error happens in drc_force branch
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
5912 full stream can not be force-received into a dataset if it has a snapshot
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
5809 Blowaway full receive in v1 pool causes kernel panic
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Approved by: Gordon Ross <gwr@nexenta.com>
5746 more checksumming in zfs send
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Approved by: Albert Lee <trisk@omniti.com>
5765 add support for estimating send stream size with lzc_send_space when source is a bookmark
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Approved by: Albert Lee <trisk@nexenta.com>
5769 Cast 'zfs bad bloc' to ULL for x86
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Richard PALO <richard@NetBSD.org>
Approved by: Dan McDonald <danmcd@omniti.com>
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
Revert "NEX-4476 WRC: Allow to use write back cache per tree of datasets"
This reverts commit fe97b74444278a6f36fec93179133641296312da.
NEX-4476 WRC: Allow to use write back cache per tree of datasets
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Alex Aizman <alex.aizman@nexenta.com>
NEX-3588 krrp panics in zfs:dmu_recv_end_check+13b () when running zfs tests.
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Reviewed by: Kevin Crowe <kevin.crowe@nexenta.com>
NEX-3558 KRRP Integration
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Garrett D'Amore <garrett@damore.org>
Fixup merge results
re #12619 rb4429 More dp->dp_config_rwlock holds
Bug 10481 - Dry run option in 'zfs send' isn't the same as in NexentaStor 3.1

*** 18,31 ****
   *
   * CDDL HEADER END
   */
  /*
   * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
- * Copyright 2011 Nexenta Systems, Inc. All rights reserved.
   * Copyright (c) 2011, 2015 by Delphix. All rights reserved.
   * Copyright (c) 2014, Joyent, Inc. All rights reserved.
   * Copyright 2014 HybridCluster. All rights reserved.
   * Copyright 2016 RackTop Systems.
   * Copyright (c) 2014 Integros [integros.com]
   */
  
  #include <sys/dmu.h>
--- 18,31 ----
   *
   * CDDL HEADER END
   */
  /*
   * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
   * Copyright (c) 2011, 2015 by Delphix. All rights reserved.
   * Copyright (c) 2014, Joyent, Inc. All rights reserved.
   * Copyright 2014 HybridCluster. All rights reserved.
+ * Copyright 2017 Nexenta Systems, Inc. All rights reserved.
   * Copyright 2016 RackTop Systems.
   * Copyright (c) 2014 Integros [integros.com]
   */
  
  #include <sys/dmu.h>
*** 52,63 ****
--- 52,66 ----
  #include <sys/dmu_send.h>
  #include <sys/dsl_destroy.h>
  #include <sys/blkptr.h>
  #include <sys/dsl_bookmark.h>
  #include <sys/zfeature.h>
+ #include <sys/autosnap.h>
  #include <sys/bqueue.h>
+ #include "zfs_errno.h"
+ 
  /* Set this tunable to TRUE to replace corrupt data with 0x2f5baddb10c */
  int zfs_send_corrupt_data = B_FALSE;
  int zfs_send_queue_length = 16 * 1024 * 1024;
  int zfs_recv_queue_length = 16 * 1024 * 1024;
  
  /* Set this tunable to FALSE to disable setting of DRR_FLAG_FREERECORDS */
*** 108,161 ****
   * data that isn't 8-byte aligned; if the assertions were removed, a
   * feature flag would have to be added.
   */
      ASSERT0(len % 8);
  
      dsp->dsa_err = vn_rdwr(UIO_WRITE, dsp->dsa_vp, (caddr_t)buf, len,
!         0, UIO_SYSSPACE, FAPPEND, RLIM64_INFINITY, CRED(), &resid);
! 
      mutex_enter(&ds->ds_sendstream_lock);
      *dsp->dsa_off += len;
      mutex_exit(&ds->ds_sendstream_lock);
      return (dsp->dsa_err);
  }
  
  /*
   * For all record types except BEGIN, fill in the checksum (overlaid in
   * drr_u.drr_checksum.drr_checksum).  The checksum verifies everything
   * up to the start of the checksum itself.
   */
  static int
  dump_record(dmu_sendarg_t *dsp, void *payload, int payload_len)
  {
      ASSERT3U(offsetof(dmu_replay_record_t, drr_u.drr_checksum.drr_checksum),
          ==, sizeof (dmu_replay_record_t) - sizeof (zio_cksum_t));
!     (void) fletcher_4_incremental_native(dsp->dsa_drr,
!         offsetof(dmu_replay_record_t, drr_u.drr_checksum.drr_checksum),
!         &dsp->dsa_zc);
      if (dsp->dsa_drr->drr_type == DRR_BEGIN) {
          dsp->dsa_sent_begin = B_TRUE;
-     } else {
-         ASSERT(ZIO_CHECKSUM_IS_ZERO(&dsp->dsa_drr->drr_u.
-             drr_checksum.drr_checksum));
-         dsp->dsa_drr->drr_u.drr_checksum.drr_checksum = dsp->dsa_zc;
      }
      if (dsp->dsa_drr->drr_type == DRR_END) {
          dsp->dsa_sent_end = B_TRUE;
      }
      (void) fletcher_4_incremental_native(&dsp->dsa_drr->
          drr_u.drr_checksum.drr_checksum, sizeof (zio_cksum_t), &dsp->dsa_zc);
      if (dump_bytes(dsp, dsp->dsa_drr, sizeof (dmu_replay_record_t)) != 0)
          return (SET_ERROR(EINTR));
      if (payload_len != 0) {
!         (void) fletcher_4_incremental_native(payload, payload_len,
!             &dsp->dsa_zc);
!         if (dump_bytes(dsp, payload, payload_len) != 0)
              return (SET_ERROR(EINTR));
      }
      return (0);
  }
--- 111,197 ----
   * data that isn't 8-byte aligned; if the assertions were removed, a
   * feature flag would have to be added.
   */
      ASSERT0(len % 8);
+     ASSERT(buf != NULL);
+ 
+     dsp->dsa_err = 0;
+     if (!dsp->sendsize) {
+         /* if vp is NULL, then the send is from krrp */
+         if (dsp->dsa_vp != NULL) {
      dsp->dsa_err = vn_rdwr(UIO_WRITE, dsp->dsa_vp, (caddr_t)buf, len,
!                 0, UIO_SYSSPACE, FAPPEND, RLIM64_INFINITY,
!                 CRED(), &resid);
!         } else {
!             ASSERT(dsp->dsa_krrp_task != NULL);
!             dsp->dsa_err = dmu_krrp_buffer_write(buf, len,
!                 dsp->dsa_krrp_task);
!         }
!     }
      mutex_enter(&ds->ds_sendstream_lock);
      *dsp->dsa_off += len;
      mutex_exit(&ds->ds_sendstream_lock);
      return (dsp->dsa_err);
  }
  
+ static int
+ dump_bytes_with_checksum(dmu_sendarg_t *dsp, void *buf, int len)
+ {
+     if (!dsp->sendsize && (dsp->dsa_krrp_task == NULL ||
+         dsp->dsa_krrp_task->buffer_args.force_cksum)) {
+         (void) fletcher_4_incremental_native(buf, len, &dsp->dsa_zc);
+     }
+ 
+     return (dump_bytes(dsp, buf, len));
+ }
+ 
  /*
   * For all record types except BEGIN, fill in the checksum (overlaid in
   * drr_u.drr_checksum.drr_checksum).  The checksum verifies everything
   * up to the start of the checksum itself.
   */
  static int
  dump_record(dmu_sendarg_t *dsp, void *payload, int payload_len)
  {
+     boolean_t do_checksum = (dsp->dsa_krrp_task == NULL ||
+         dsp->dsa_krrp_task->buffer_args.force_cksum);
+ 
      ASSERT3U(offsetof(dmu_replay_record_t, drr_u.drr_checksum.drr_checksum),
          ==, sizeof (dmu_replay_record_t) - sizeof (zio_cksum_t));
! 
      if (dsp->dsa_drr->drr_type == DRR_BEGIN) {
          dsp->dsa_sent_begin = B_TRUE;
      }
+ 
      if (dsp->dsa_drr->drr_type == DRR_END) {
          dsp->dsa_sent_end = B_TRUE;
      }
+ 
+     if (!dsp->sendsize && do_checksum) {
+         (void) fletcher_4_incremental_native(dsp->dsa_drr,
+             offsetof(dmu_replay_record_t,
+             drr_u.drr_checksum.drr_checksum),
+             &dsp->dsa_zc);
+         if (dsp->dsa_drr->drr_type != DRR_BEGIN) {
+             ASSERT(ZIO_CHECKSUM_IS_ZERO(&dsp->dsa_drr->drr_u.
+                 drr_checksum.drr_checksum));
+             dsp->dsa_drr->drr_u.drr_checksum.drr_checksum =
+                 dsp->dsa_zc;
+         }
+         (void) fletcher_4_incremental_native(&dsp->dsa_drr->
          drr_u.drr_checksum.drr_checksum, sizeof (zio_cksum_t), &dsp->dsa_zc);
+     }
+ 
      if (dump_bytes(dsp, dsp->dsa_drr, sizeof (dmu_replay_record_t)) != 0)
          return (SET_ERROR(EINTR));
      if (payload_len != 0) {
!         if (dump_bytes_with_checksum(dsp, payload, payload_len) != 0)
              return (SET_ERROR(EINTR));
      }
      return (0);
  }
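The reworked dump_bytes()/dump_record() above make every fletcher-4 update conditional: the checksum is skipped for size-estimation passes (sendsize) and for krrp streams that did not ask for an on-wire checksum. Below is a minimal user-space sketch of that pattern; cksum_t, dump_ctx_t, and write_with_cksum() are illustrative stand-ins, not the kernel types or the fletcher_4_incremental_native() interface.

#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t a, b, c, d;    /* fletcher-4 style running state */
} cksum_t;

typedef struct {
    FILE *out;              /* stand-in for dsa_vp or the krrp buffer */
    int do_cksum;           /* 0 when the transport skips checksums */
    cksum_t zc;             /* running checksum, like dsa_zc */
} dump_ctx_t;

/* fold a buffer of 32-bit words into the running checksum */
static void
cksum_fold(cksum_t *zc, const uint32_t *p, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        zc->a += p[i];
        zc->b += zc->a;
        zc->c += zc->b;
        zc->d += zc->c;
    }
}

/* analogous to dump_bytes_with_checksum(): checksum only if enabled */
static int
write_with_cksum(dump_ctx_t *ctx, const uint32_t *buf, size_t words)
{
    if (ctx->do_cksum)
        cksum_fold(&ctx->zc, buf, words);
    return (fwrite(buf, sizeof (*buf), words, ctx->out) == words ? 0 : -1);
}

int
main(void)
{
    dump_ctx_t ctx = { stdout, 1, { 0 } };
    const uint32_t payload[2] = { 0xbaddcafe, 0xdeadbeef };

    (void) write_with_cksum(&ctx, payload, 2);
    (void) fprintf(stderr, "cksum.d = %llu\n",
        (unsigned long long)ctx.zc.d);
    return (0);
}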
*** 359,371 ****
              return (EINTR);
  
      return (0);
  }
  
  static int
! dump_spill(dmu_sendarg_t *dsp, uint64_t object, int blksz, void *data)
  {
      struct drr_spill *drrs = &(dsp->dsa_drr->drr_u.drr_spill);
  
      if (dsp->dsa_pending_op != PENDING_NONE) {
          if (dump_record(dsp, NULL, 0) != 0)
              return (SET_ERROR(EINTR));
          dsp->dsa_pending_op = PENDING_NONE;
--- 395,412 ----
              return (EINTR);
  
      return (0);
  }
  
  static int
! dump_spill(dmu_sendarg_t *dsp, uint64_t object,
!     const blkptr_t *bp, const zbookmark_phys_t *zb)
  {
+     int rc = 0;
      struct drr_spill *drrs = &(dsp->dsa_drr->drr_u.drr_spill);
+     enum arc_flags aflags = ARC_FLAG_WAIT;
+     int blksz = BP_GET_LSIZE(bp);
+     arc_buf_t *abuf;
  
      if (dsp->dsa_pending_op != PENDING_NONE) {
          if (dump_record(dsp, NULL, 0) != 0)
              return (SET_ERROR(EINTR));
          dsp->dsa_pending_op = PENDING_NONE;
*** 376,387 ****
      dsp->dsa_drr->drr_type = DRR_SPILL;
      drrs->drr_object = object;
      drrs->drr_length = blksz;
      drrs->drr_toguid = dsp->dsa_toguid;
  
!     if (dump_record(dsp, data, blksz) != 0)
          return (SET_ERROR(EINTR));
  
      return (0);
  }
  
  static int
  dump_freeobjects(dmu_sendarg_t *dsp, uint64_t firstobj, uint64_t numobjs)
--- 417,458 ----
      dsp->dsa_drr->drr_type = DRR_SPILL;
      drrs->drr_object = object;
      drrs->drr_length = blksz;
      drrs->drr_toguid = dsp->dsa_toguid;
  
!     if (dump_record(dsp, NULL, 0))
          return (SET_ERROR(EINTR));
+ 
+     /*
+      * if dsa_krrp task is not NULL, then the send is from krrp and we can
+      * try to bypass copying data to an intermediate buffer.
+      */
+     if (!dsp->sendsize && dsp->dsa_krrp_task != NULL) {
+         rc = dmu_krrp_direct_arc_read(dsp->dsa_os->os_spa,
+             dsp->dsa_krrp_task, &dsp->dsa_zc, bp);
+         /*
+          * rc == 0 means that we successfully copy
+          * the data directly from ARC to krrp buffer
+          * rc != 0 && rc != EINTR means that we cannot
+          * zerocopy the data and need to use slow-path
+          */
+         if (rc == 0 || rc == EINTR)
+             return (rc);
+ 
+         ASSERT3U(rc, ==, ENODATA);
+     }
+ 
+     if (arc_read(NULL, dsp->dsa_os->os_spa, bp, arc_getbuf_func, &abuf,
+         ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL,
+         &aflags, zb) != 0)
+         return (SET_ERROR(EIO));
+ 
+     rc = dump_bytes_with_checksum(dsp, abuf->b_data, blksz);
+     arc_buf_destroy(abuf, &abuf);
+     if (rc != 0)
+         return (SET_ERROR(EINTR));
  
      return (0);
  }
  
  static int
  dump_freeobjects(dmu_sendarg_t *dsp, uint64_t firstobj, uint64_t numobjs)
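The new dump_spill() above tries a zero-copy transfer out of the ARC first and falls back to the buffered path only when the block is not eligible (ENODATA). A hedged sketch of that control flow follows; try_direct_send() and copy_send() are invented stubs standing in for dmu_krrp_direct_arc_read() and the arc_read() slow path.

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* pretend only 8-byte-multiple blocks are eligible for zero-copy */
static int
try_direct_send(const char *block)
{
    return (strlen(block) % 8 == 0 ? 0 : ENODATA);
}

static int
copy_send(const char *block)
{
    char staging[64];   /* the intermediate buffer we tried to avoid */

    (void) snprintf(staging, sizeof (staging), "%s", block);
    (void) printf("slow path sent: %s\n", staging);
    return (0);
}

static int
send_block(const char *block)
{
    int rc = try_direct_send(block);

    /* 0: sent zero-copy; EINTR: interrupted; both are final */
    if (rc == 0 || rc == EINTR)
        return (rc);

    /* anything else must mean "not eligible", so copy instead */
    assert(rc == ENODATA);
    return (copy_send(block));
}

int
main(void)
{
    (void) send_block("8bytes__");  /* length 8: fast path */
    (void) send_block("seven__");   /* length 7: slow path */
    return (0);
}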
*** 634,654 ****
              if (err != 0)
                  break;
          }
          arc_buf_destroy(abuf, &abuf);
      } else if (type == DMU_OT_SA) {
!         arc_flags_t aflags = ARC_FLAG_WAIT;
!         arc_buf_t *abuf;
!         int blksz = BP_GET_LSIZE(bp);
! 
!         if (arc_read(NULL, spa, bp, arc_getbuf_func, &abuf,
!             ZIO_PRIORITY_ASYNC_READ, ZIO_FLAG_CANFAIL,
!             &aflags, zb) != 0)
!             return (SET_ERROR(EIO));
! 
!         err = dump_spill(dsa, zb->zb_object, blksz, abuf->b_data);
!         arc_buf_destroy(abuf, &abuf);
      } else if (backup_do_embed(dsa, bp)) {
          /* it's an embedded level-0 block of a regular object */
          int blksz = dblkszsec << SPA_MINBLOCKSHIFT;
          ASSERT0(zb->zb_level);
          err = dump_write_embedded(dsa, zb->zb_object,
--- 705,720 ----
              if (err != 0)
                  break;
          }
          arc_buf_destroy(abuf, &abuf);
      } else if (type == DMU_OT_SA) {
!         /*
!          * The upstream code has arc_read() call here, but we moved
!          * it to dump_spill() since we want to take advantage of
!          * zero copy of the buffer if possible
!          */
!         err = dump_spill(dsa, zb->zb_object, bp, zb);
      } else if (backup_do_embed(dsa, bp)) {
          /* it's an embedded level-0 block of a regular object */
          int blksz = dblkszsec << SPA_MINBLOCKSHIFT;
          ASSERT0(zb->zb_level);
          err = dump_write_embedded(dsa, zb->zb_object,
*** 685,699 ****
      ASSERT0(zb->zb_level);
      ASSERT(zb->zb_object > dsa->dsa_resume_object ||
          (zb->zb_object == dsa->dsa_resume_object &&
          zb->zb_blkid * blksz >= dsa->dsa_resume_offset));
-     ASSERT0(zb->zb_level);
-    ASSERT(zb->zb_object > dsa->dsa_resume_object ||
-        (zb->zb_object == dsa->dsa_resume_object &&
-        zb->zb_blkid * blksz >= dsa->dsa_resume_offset));
      ASSERT3U(blksz, ==, BP_GET_LSIZE(bp));
  
      enum zio_flag zioflags = ZIO_FLAG_CANFAIL;
      if (request_compressed)
          zioflags |= ZIO_FLAG_RAW;
--- 751,760 ----
*** 722,732 ****
          while (blksz > 0 && err == 0) {
              int n = MIN(blksz, SPA_OLD_MAXBLOCKSIZE);
              err = dump_write(dsa, type, zb->zb_object,
                  offset, n, n, NULL, buf);
              offset += n;
-             buf += n;
              blksz -= n;
          }
      } else {
          err = dump_write(dsa, type, zb->zb_object, offset,
              blksz, arc_buf_size(abuf), bp, abuf->b_data);
--- 783,792 ----
*** 753,767 ****
   * Actually do the bulk of the work in a zfs send.
   *
   * Note: Releases dp using the specified tag.
   */
  static int
! dmu_send_impl(void *tag, dsl_pool_t *dp, dsl_dataset_t *to_ds,
      zfs_bookmark_phys_t *ancestor_zb, boolean_t is_clone,
      boolean_t embedok, boolean_t large_block_ok, boolean_t compressok,
!     int outfd, uint64_t resumeobj, uint64_t resumeoff,
!     vnode_t *vp, offset_t *off)
  {
      objset_t *os;
      dmu_replay_record_t *drr;
      dmu_sendarg_t *dsp;
      int err;
--- 813,827 ----
   * Actually do the bulk of the work in a zfs send.
   *
   * Note: Releases dp using the specified tag.
   */
  static int
! dmu_send_impl_ss(void *tag, dsl_pool_t *dp, dsl_dataset_t *to_ds,
      zfs_bookmark_phys_t *ancestor_zb, boolean_t is_clone,
      boolean_t embedok, boolean_t large_block_ok, boolean_t compressok,
!     int outfd, uint64_t resumeobj, uint64_t resumeoff, vnode_t *vp,
!     offset_t *off, boolean_t sendsize, dmu_krrp_task_t *krrp_task)
  {
      objset_t *os;
      dmu_replay_record_t *drr;
      dmu_sendarg_t *dsp;
      int err;
*** 848,859 ****
--- 908,921 ----
      dsp->dsa_outfd = outfd;
      dsp->dsa_proc = curproc;
      dsp->dsa_os = os;
      dsp->dsa_off = off;
      dsp->dsa_toguid = dsl_dataset_phys(to_ds)->ds_guid;
+     dsp->dsa_krrp_task = krrp_task;
      dsp->dsa_pending_op = PENDING_NONE;
      dsp->dsa_featureflags = featureflags;
+     dsp->sendsize = sendsize;
      dsp->dsa_resume_object = resumeobj;
      dsp->dsa_resume_offset = resumeoff;
  
      mutex_enter(&to_ds->ds_sendstream_lock);
      list_insert_head(&to_ds->ds_sendstreams, dsp);
*** 901,911 ****
      to_data = bqueue_dequeue(&to_arg.q);
  
      while (!to_data->eos_marker && err == 0) {
          err = do_dump(dsp, to_data);
          to_data = get_next_record(&to_arg.q, to_data);
!         if (issig(JUSTLOOKING) && issig(FORREAL))
              err = EINTR;
      }
  
      if (err != 0) {
          to_arg.cancel = B_TRUE;
--- 963,973 ----
      to_data = bqueue_dequeue(&to_arg.q);
  
      while (!to_data->eos_marker && err == 0) {
          err = do_dump(dsp, to_data);
          to_data = get_next_record(&to_arg.q, to_data);
!         if (vp != NULL && issig(JUSTLOOKING) && issig(FORREAL))
              err = EINTR;
      }
  
      if (err != 0) {
          to_arg.cancel = B_TRUE;
*** 955,967 ****
      return (err);
  }
  
  int
  dmu_send_obj(const char *pool, uint64_t tosnap, uint64_t fromsnap,
      boolean_t embedok, boolean_t large_block_ok, boolean_t compressok,
!     int outfd, vnode_t *vp, offset_t *off)
  {
      dsl_pool_t *dp;
      dsl_dataset_t *ds;
      dsl_dataset_t *fromds = NULL;
      int err;
--- 1017,1041 ----
      return (err);
  }
  
  int
+ dmu_send_impl(void *tag, dsl_pool_t *dp, dsl_dataset_t *to_ds,
+     zfs_bookmark_phys_t *ancestor_zb, boolean_t is_clone, boolean_t embedok,
+     boolean_t large_block_ok, boolean_t compressok, int outfd,
+     uint64_t resumeobj, uint64_t resumeoff, vnode_t *vp, offset_t *off,
+     dmu_krrp_task_t *krrp_task)
+ {
+     return (dmu_send_impl_ss(tag, dp, to_ds, ancestor_zb, is_clone,
+         embedok, large_block_ok, compressok, outfd, resumeobj, resumeoff,
+         vp, off, B_FALSE, krrp_task));
+ }
+ 
+ int
  dmu_send_obj(const char *pool, uint64_t tosnap, uint64_t fromsnap,
      boolean_t embedok, boolean_t large_block_ok, boolean_t compressok,
!     int outfd, vnode_t *vp, offset_t *off, boolean_t sendsize)
  {
      dsl_pool_t *dp;
      dsl_dataset_t *ds;
      dsl_dataset_t *fromds = NULL;
      int err;
*** 992,1006 ****
              dsl_dataset_phys(fromds)->ds_creation_time;
          zb.zbm_creation_txg =
              dsl_dataset_phys(fromds)->ds_creation_txg;
          zb.zbm_guid = dsl_dataset_phys(fromds)->ds_guid;
          is_clone = (fromds->ds_dir != ds->ds_dir);
          dsl_dataset_rele(fromds, FTAG);
!         err = dmu_send_impl(FTAG, dp, ds, &zb, is_clone,
!             embedok, large_block_ok, compressok, outfd, 0, 0, vp, off);
      } else {
!         err = dmu_send_impl(FTAG, dp, ds, NULL, B_FALSE,
!             embedok, large_block_ok, compressok, outfd, 0, 0, vp, off);
      }
      dsl_dataset_rele(ds, FTAG);
      return (err);
  }
--- 1066,1082 ----
              dsl_dataset_phys(fromds)->ds_creation_time;
          zb.zbm_creation_txg =
              dsl_dataset_phys(fromds)->ds_creation_txg;
          zb.zbm_guid = dsl_dataset_phys(fromds)->ds_guid;
          is_clone = (fromds->ds_dir != ds->ds_dir);
          dsl_dataset_rele(fromds, FTAG);
!         err = dmu_send_impl_ss(FTAG, dp, ds, &zb, is_clone,
!             embedok, large_block_ok, compressok, outfd, 0, 0, vp, off,
!             sendsize, NULL);
      } else {
!         err = dmu_send_impl_ss(FTAG, dp, ds, NULL, B_FALSE,
!             embedok, large_block_ok, compressok, outfd, 0, 0, vp, off,
!             sendsize, NULL);
      }
      dsl_dataset_rele(ds, FTAG);
      return (err);
  }
*** 1073,1088 ****
              dsl_dataset_rele(ds, FTAG);
              dsl_pool_rele(dp, FTAG);
              return (err);
          }
          err = dmu_send_impl(FTAG, dp, ds, &zb, is_clone,
!             embedok, large_block_ok, compressok,
!             outfd, resumeobj, resumeoff, vp, off);
      } else {
          err = dmu_send_impl(FTAG, dp, ds, NULL, B_FALSE,
!             embedok, large_block_ok, compressok,
!             outfd, resumeobj, resumeoff, vp, off);
      }
      if (owned)
          dsl_dataset_disown(ds, FTAG);
      else
          dsl_dataset_rele(ds, FTAG);
--- 1149,1164 ----
              dsl_dataset_rele(ds, FTAG);
              dsl_pool_rele(dp, FTAG);
              return (err);
          }
          err = dmu_send_impl(FTAG, dp, ds, &zb, is_clone,
!             embedok, large_block_ok, compressok, outfd,
!             resumeobj, resumeoff, vp, off, NULL);
      } else {
          err = dmu_send_impl(FTAG, dp, ds, NULL, B_FALSE,
!             embedok, large_block_ok, compressok, outfd,
!             resumeobj, resumeoff, vp, off, NULL);
      }
      if (owned)
          dsl_dataset_disown(ds, FTAG);
      else
          dsl_dataset_rele(ds, FTAG);
*** 1255,1283 ****
      uint64_t drba_snapobj;
  } dmu_recv_begin_arg_t;
  
  static int
  recv_begin_check_existing_impl(dmu_recv_begin_arg_t *drba, dsl_dataset_t *ds,
!     uint64_t fromguid)
  {
      uint64_t val;
      int error;
      dsl_pool_t *dp = ds->ds_dir->dd_pool;
  
      /* temporary clone name must not exist */
      error = zap_lookup(dp->dp_meta_objset,
!         dsl_dir_phys(ds->ds_dir)->dd_child_dir_zapobj, recv_clone_name,
!         8, 1, &val);
!     if (error != ENOENT)
!         return (error == 0 ? EBUSY : error);
  
      /* new snapshot name must not exist */
      error = zap_lookup(dp->dp_meta_objset,
          dsl_dataset_phys(ds)->ds_snapnames_zapobj,
          drba->drba_cookie->drc_tosnap, 8, 1, &val);
      if (error != ENOENT)
!         return (error == 0 ? EEXIST : error);
  
      /*
       * Check snapshot limit before receiving. We'll recheck again at the
       * end, but might as well abort before receiving if we're already over
       * the limit.
--- 1331,1384 ----
      uint64_t drba_snapobj;
  } dmu_recv_begin_arg_t;
  
  static int
  recv_begin_check_existing_impl(dmu_recv_begin_arg_t *drba, dsl_dataset_t *ds,
!     uint64_t fromguid, dmu_tx_t *tx)
  {
      uint64_t val;
      int error;
      dsl_pool_t *dp = ds->ds_dir->dd_pool;
  
+     if (dmu_tx_is_syncing(tx)) {
      /* temporary clone name must not exist */
      error = zap_lookup(dp->dp_meta_objset,
!         dsl_dir_phys(ds->ds_dir)->dd_child_dir_zapobj,
!         recv_clone_name, 8, 1, &val);
!     if (error == 0) {
!         dsl_dataset_t *tds;
  
+         /* check that if it is currently used */
+         error = dsl_dataset_own_obj(dp, val, FTAG, &tds);
+         if (!error) {
+             char name[ZFS_MAX_DATASET_NAME_LEN];
+ 
+             dsl_dataset_name(tds, name);
+             dsl_dataset_disown(tds, FTAG);
+ 
+             error = dsl_dataset_hold(dp, name, FTAG, &tds);
+             if (!error) {
+                 dsl_destroy_head_sync_impl(tds, tx);
+                 dsl_dataset_rele(tds, FTAG);
+                 error = ENOENT;
+             }
+         } else {
+             error = 0;
+         }
+     }
+     if (error != ENOENT) {
+         return (error == 0 ?
+             SET_ERROR(EBUSY) : SET_ERROR(error));
+     }
+     }
  
      /* new snapshot name must not exist */
      error = zap_lookup(dp->dp_meta_objset,
          dsl_dataset_phys(ds)->ds_snapnames_zapobj,
          drba->drba_cookie->drc_tosnap, 8, 1, &val);
      if (error != ENOENT)
!         return (error == 0 ? SET_ERROR(EEXIST) : SET_ERROR(error));
  
      /*
       * Check snapshot limit before receiving. We'll recheck again at the
       * end, but might as well abort before receiving if we're already over
       * the limit.
--- 1331,1384 ----
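The temporary-clone cleanup above runs only when dmu_tx_is_syncing() is true. Assuming the usual DSL sync-task behavior, the check callback runs first as an open-context preflight and then again authoritatively in syncing context, so a destructive repair (destroying a stale %recv clone left by an interrupted receive) must be confined to the syncing pass. A hypothetical skeleton of that gating follows; tx_t and the globals are invented for illustration.

#include <errno.h>
#include <stdio.h>

typedef struct {
    int syncing;                    /* stand-in for dmu_tx_is_syncing(tx) */
} tx_t;

static int stale_clone_exists = 1; /* leftover clone from a crashed recv */
static int stale_clone_in_use = 0; /* someone still owns it */

static int
recv_begin_check(tx_t *tx)
{
    /* the destructive branch runs only in the syncing pass */
    if (tx->syncing && stale_clone_exists) {
        if (stale_clone_in_use)
            return (EBUSY);
        /* unused leftover of an interrupted receive: reclaim it */
        stale_clone_exists = 0;
        (void) printf("destroyed stale temporary clone\n");
    }
    return (0);
}

int
main(void)
{
    tx_t open_tx = { 0 }, sync_tx = { 1 };

    (void) printf("preflight: %d\n", recv_begin_check(&open_tx));
    (void) printf("syncing:   %d\n", recv_begin_check(&sync_tx));
    return (0);
}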
*** 1397,1413 ****
      error = dsl_dataset_hold(dp, tofs, FTAG, &ds);
      if (error == 0) {
          /* target fs already exists; recv into temp clone */
  
          /* Can't recv a clone into an existing fs */
          if (flags & DRR_FLAG_CLONE || drba->drba_origin) {
              dsl_dataset_rele(ds, FTAG);
              return (SET_ERROR(EINVAL));
          }
  
!         error = recv_begin_check_existing_impl(drba, ds, fromguid);
          dsl_dataset_rele(ds, FTAG);
      } else if (error == ENOENT) {
          /* target fs does not exist; must be a full backup or clone */
          char buf[ZFS_MAX_DATASET_NAME_LEN];
--- 1498,1530 ----
      error = dsl_dataset_hold(dp, tofs, FTAG, &ds);
      if (error == 0) {
          /* target fs already exists; recv into temp clone */
  
+         if (spa_feature_is_active(dp->dp_spa, SPA_FEATURE_WBC)) {
+             objset_t *os = NULL;
+ 
+             error = dmu_objset_from_ds(ds, &os);
+             if (error) {
+                 dsl_dataset_rele(ds, FTAG);
+                 return (error);
+             }
+ 
+             /* Recv is impossible into DS that uses WBC */
+             if (os->os_wbc_mode != ZFS_WBC_MODE_OFF) {
+                 dsl_dataset_rele(ds, FTAG);
+                 return (SET_ERROR(EKZFS_WBCNOTSUP));
+             }
+         }
+ 
          /* Can't recv a clone into an existing fs */
          if (flags & DRR_FLAG_CLONE || drba->drba_origin) {
              dsl_dataset_rele(ds, FTAG);
              return (SET_ERROR(EINVAL));
          }
  
!         error = recv_begin_check_existing_impl(drba, ds, fromguid, tx);
          dsl_dataset_rele(ds, FTAG);
      } else if (error == ENOENT) {
          /* target fs does not exist; must be a full backup or clone */
          char buf[ZFS_MAX_DATASET_NAME_LEN];
*** 1433,1442 ****
--- 1550,1575 ----
          (void) strlcpy(buf, tofs, strrchr(tofs, '/') - tofs + 1);
  
          error = dsl_dataset_hold(dp, buf, FTAG, &ds);
          if (error != 0)
              return (error);
  
+         if (spa_feature_is_active(dp->dp_spa, SPA_FEATURE_WBC)) {
+             objset_t *os = NULL;
+ 
+             error = dmu_objset_from_ds(ds, &os);
+             if (error) {
+                 dsl_dataset_rele(ds, FTAG);
+                 return (error);
+             }
+ 
+             /* Recv is impossible into DS that uses WBC */
+             if (os->os_wbc_mode != ZFS_WBC_MODE_OFF) {
+                 dsl_dataset_rele(ds, FTAG);
+                 return (SET_ERROR(EKZFS_WBCNOTSUP));
+             }
+         }
+ 
          /*
           * Check filesystem and snapshot limits before receiving. We'll
           * recheck snapshot limits again at the end (we create the
           * filesystems and increment those counts during begin_sync).
           */
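This write-back-cache guard appears three times in the diff: in both dmu_recv_begin_check() paths above and again in dmu_recv_end_check() further down. A sketch of the common pattern is below, assuming a hypothetical recv_check_wbc() helper; dataset_t, its fields, and the EKZFS_WBCNOTSUP value are illustrative stand-ins for the objset/feature machinery.

#include <stdio.h>

#define EKZFS_WBCNOTSUP 122     /* illustrative errno value */

typedef struct {
    int wbc_feature_active;     /* pool-wide SPA_FEATURE_WBC flag */
    int wbc_mode_on;            /* per-dataset write-back cache mode */
} dataset_t;

/* return 0 if receiving into ds is allowed, else an error */
static int
recv_check_wbc(const dataset_t *ds)
{
    if (!ds->wbc_feature_active)
        return (0);             /* feature off: nothing to check */
    if (ds->wbc_mode_on)
        return (EKZFS_WBCNOTSUP);
    return (0);
}

int
main(void)
{
    dataset_t plain = { 1, 0 }, wbc = { 1, 1 };

    (void) printf("plain: %d, wbc: %d\n",
        recv_check_wbc(&plain), recv_check_wbc(&wbc));
    return (0);
}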
*** 1647,1657 ****
      /* check that there is resuming data, and that the toguid matches */
      if (!dsl_dataset_is_zapified(ds)) {
          dsl_dataset_rele(ds, FTAG);
          return (SET_ERROR(EINVAL));
      }
!     uint64_t val;
      error = zap_lookup(dp->dp_meta_objset, ds->ds_object,
          DS_FIELD_RESUME_TOGUID, sizeof (val), 1, &val);
      if (error != 0 || drrb->drr_toguid != val) {
          dsl_dataset_rele(ds, FTAG);
          return (SET_ERROR(EINVAL));
--- 1780,1790 ----
      /* check that there is resuming data, and that the toguid matches */
      if (!dsl_dataset_is_zapified(ds)) {
          dsl_dataset_rele(ds, FTAG);
          return (SET_ERROR(EINVAL));
      }
!     uint64_t val = 0;
      error = zap_lookup(dp->dp_meta_objset, ds->ds_object,
          DS_FIELD_RESUME_TOGUID, sizeof (val), 1, &val);
      if (error != 0 || drrb->drr_toguid != val) {
          dsl_dataset_rele(ds, FTAG);
          return (SET_ERROR(EINVAL));
*** 1736,1746 ****
   * NB: callers *MUST* call dmu_recv_stream() if dmu_recv_begin()
   * succeeds; otherwise we will leak the holds on the datasets.
   */
  int
  dmu_recv_begin(char *tofs, char *tosnap, dmu_replay_record_t *drr_begin,
!     boolean_t force, boolean_t resumable, char *origin, dmu_recv_cookie_t *drc)
  {
      dmu_recv_begin_arg_t drba = { 0 };
  
      bzero(drc, sizeof (dmu_recv_cookie_t));
      drc->drc_drr_begin = drr_begin;
--- 1869,1880 ----
   * NB: callers *MUST* call dmu_recv_stream() if dmu_recv_begin()
   * succeeds; otherwise we will leak the holds on the datasets.
   */
  int
  dmu_recv_begin(char *tofs, char *tosnap, dmu_replay_record_t *drr_begin,
!     boolean_t force, boolean_t resumable, boolean_t force_cksum,
!     char *origin, dmu_recv_cookie_t *drc)
  {
      dmu_recv_begin_arg_t drba = { 0 };
  
      bzero(drc, sizeof (dmu_recv_cookie_t));
      drc->drc_drr_begin = drr_begin;
*** 1751,1766 ****
      drc->drc_resumable = resumable;
      drc->drc_cred = CRED();
  
      if (drc->drc_drrb->drr_magic == BSWAP_64(DMU_BACKUP_MAGIC)) {
          drc->drc_byteswap = B_TRUE;
          (void) fletcher_4_incremental_byteswap(drr_begin,
              sizeof (dmu_replay_record_t), &drc->drc_cksum);
          byteswap_record(drr_begin);
      } else if (drc->drc_drrb->drr_magic == DMU_BACKUP_MAGIC) {
          (void) fletcher_4_incremental_native(drr_begin,
              sizeof (dmu_replay_record_t), &drc->drc_cksum);
      } else {
          return (SET_ERROR(EINVAL));
      }
  
      drba.drba_origin = origin;
--- 1885,1907 ----
      drc->drc_resumable = resumable;
      drc->drc_cred = CRED();
  
      if (drc->drc_drrb->drr_magic == BSWAP_64(DMU_BACKUP_MAGIC)) {
          drc->drc_byteswap = B_TRUE;
+ 
+         /* on-wire checksum can be disabled for krrp */
+         if (force_cksum) {
          (void) fletcher_4_incremental_byteswap(drr_begin,
              sizeof (dmu_replay_record_t), &drc->drc_cksum);
          byteswap_record(drr_begin);
+         }
      } else if (drc->drc_drrb->drr_magic == DMU_BACKUP_MAGIC) {
+         /* on-wire checksum can be disabled for krrp */
+         if (force_cksum) {
          (void) fletcher_4_incremental_native(drr_begin,
              sizeof (dmu_replay_record_t), &drc->drc_cksum);
+         }
      } else {
          return (SET_ERROR(EINVAL));
      }
  
      drba.drba_origin = origin;
*** 1840,1849 ****
--- 1981,1991 ----
      struct receive_record_arg *rrd;
      /* A record that has had its header read in, but not its payload. */
      struct receive_record_arg *next_rrd;
      zio_cksum_t cksum;
      zio_cksum_t prev_cksum;
+     dmu_krrp_task_t *krrp_task;
      int err;
      boolean_t byteswap;
      /* Sorted list of objects not to issue prefetches for. */
      struct objlist ignore_objlist;
  };
*** 1892,1922 ****
   * The code doesn't rely on this (lengths being multiples of 8).  See
   * comment in dump_bytes.
   */
      ASSERT0(len % 8);
  
      while (done < len) {
!         ssize_t resid;
  
          ra->err = vn_rdwr(UIO_READ, ra->vp,
              (char *)buf + done, len - done,
              ra->voff, UIO_SYSSPACE, FAPPEND,
              RLIM64_INFINITY, CRED(), &resid);
- 
          if (resid == len - done) {
              /*
!              * Note: ECKSUM indicates that the receive
!              * was interrupted and can potentially be resumed.
               */
              ra->err = SET_ERROR(ECKSUM);
          }
          ra->voff += len - done - resid;
          done = len - resid;
          if (ra->err != 0)
              return (ra->err);
      }
  
      ra->bytes_read += len;
  
      ASSERT3U(done, ==, len);
      return (0);
  }
--- 2034,2076 ----
   * The code doesn't rely on this (lengths being multiples of 8).  See
   * comment in dump_bytes.
   */
      ASSERT0(len % 8);
  
+     /*
+      * if vp is NULL, then the send is from krrp and we can try to bypass
+      * copying data to an intermediate buffer.
+      */
+     if (ra->vp != NULL) {
      while (done < len) {
!         ssize_t resid = 0;
  
          ra->err = vn_rdwr(UIO_READ, ra->vp,
              (char *)buf + done, len - done,
              ra->voff, UIO_SYSSPACE, FAPPEND,
              RLIM64_INFINITY, CRED(), &resid);
          if (resid == len - done) {
              /*
!              * Note: ECKSUM indicates that the receive was
!              * interrupted and can potentially be resumed.
               */
              ra->err = SET_ERROR(ECKSUM);
          }
          ra->voff += len - done - resid;
          done = len - resid;
          if (ra->err != 0)
              return (ra->err);
      }
+     } else {
+         ASSERT(ra->krrp_task != NULL);
+         ra->err = dmu_krrp_buffer_read(buf, len, ra->krrp_task);
+         if (ra->err != 0)
+             return (ra->err);
+ 
+         done = len;
+     }
+ 
      ra->bytes_read += len;
  
      ASSERT3U(done, ==, len);
      return (0);
  }
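receive_read() above now drains either a vnode or the krrp transport buffer behind a single interface. Here is a user-space sketch of the same two-source read; recv_src_t and its fields are hypothetical, with a FILE * and an in-memory buffer standing in for the vnode and dmu_krrp_buffer_read().

#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    FILE *fp;               /* non-NULL: read from the stream */
    const char *krrp_buf;   /* else: read from this buffer */
    size_t krrp_off;
} recv_src_t;

static int
receive_read(recv_src_t *src, void *buf, size_t len)
{
    if (src->fp != NULL) {
        /* stream source: loop until len bytes have arrived */
        size_t done = 0;
        while (done < len) {
            size_t n = fread((char *)buf + done, 1,
                len - done, src->fp);
            if (n == 0)
                return (-1);    /* truncated stream */
            done += n;
        }
    } else {
        /* krrp-style source: copy straight out of the buffer */
        assert(src->krrp_buf != NULL);
        memcpy(buf, src->krrp_buf + src->krrp_off, len);
        src->krrp_off += len;
    }
    return (0);
}

int
main(void)
{
    recv_src_t mem = { .fp = NULL, .krrp_buf = "payloadX" };
    char out[8];

    if (receive_read(&mem, out, sizeof (out)) == 0)
        (void) printf("read: %.8s\n", out);
    return (0);
}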
*** 2216,2225 ****
--- 2370,2380 ----
      err = dmu_tx_assign(tx, TXG_WAIT);
      if (err != 0) {
          dmu_tx_abort(tx);
          return (err);
      }
+ 
      if (rwa->byteswap) {
          dmu_object_byteswap_t byteswap =
              DMU_OT_BYTESWAP(drrw->drr_type);
          dmu_ot_byteswap[byteswap].ob_func(abuf->b_data,
              DRR_WRITE_PAYLOAD_SIZE(drrw));
*** 2445,2454 ****
--- 2600,2611 ----
   */
  static int
  receive_read_payload_and_next_header(struct receive_arg *ra, int len, void *buf)
  {
      int err;
+     boolean_t checksum_enable = (ra->krrp_task == NULL ||
+         ra->krrp_task->buffer_args.force_cksum);
  
      if (len != 0) {
          ASSERT3U(len, <=, SPA_MAXBLOCKSIZE);
          err = receive_read(ra, len, buf);
          if (err != 0)
*** 2478,2495 ****
          kmem_free(ra->next_rrd, sizeof (*ra->next_rrd));
          ra->next_rrd = NULL;
          return (SET_ERROR(EINVAL));
      }
  
      /*
       * Note: checksum is of everything up to but not including the
       * checksum itself.
       */
!     ASSERT3U(offsetof(dmu_replay_record_t, drr_u.drr_checksum.drr_checksum),
          ==, sizeof (dmu_replay_record_t) - sizeof (zio_cksum_t));
      receive_cksum(ra,
!         offsetof(dmu_replay_record_t, drr_u.drr_checksum.drr_checksum),
          &ra->next_rrd->header);
  
      zio_cksum_t cksum_orig =
          ra->next_rrd->header.drr_u.drr_checksum.drr_checksum;
      zio_cksum_t *cksump =
--- 2635,2655 ----
          kmem_free(ra->next_rrd, sizeof (*ra->next_rrd));
          ra->next_rrd = NULL;
          return (SET_ERROR(EINVAL));
      }
  
+     if (checksum_enable) {
      /*
       * Note: checksum is of everything up to but not including the
       * checksum itself.
       */
!     ASSERT3U(offsetof(dmu_replay_record_t,
!         drr_u.drr_checksum.drr_checksum), ==,
          sizeof (dmu_replay_record_t) - sizeof (zio_cksum_t));
      receive_cksum(ra,
!         offsetof(dmu_replay_record_t,
!         drr_u.drr_checksum.drr_checksum),
          &ra->next_rrd->header);
  
      zio_cksum_t cksum_orig =
          ra->next_rrd->header.drr_u.drr_checksum.drr_checksum;
      zio_cksum_t *cksump =
*** 2504,2513 ****
--- 2664,2674 ----
          ra->next_rrd = NULL;
          return (SET_ERROR(ECKSUM));
      }
  
      receive_cksum(ra, sizeof (cksum_orig), &cksum_orig);
+     }
  
      return (0);
  }
  
  static void
*** 2700,2712 ****
          err = receive_read_payload_and_next_header(ra, 0, NULL);
          return (err);
      }
      case DRR_END:
      {
          struct drr_end *drre = &ra->rrd->header.drr_u.drr_end;
!         if (!ZIO_CHECKSUM_EQUAL(ra->prev_cksum, drre->drr_checksum))
              return (SET_ERROR(ECKSUM));
          return (0);
      }
      case DRR_SPILL:
      {
          struct drr_spill *drrs = &ra->rrd->header.drr_u.drr_spill;
--- 2861,2877 ----
          err = receive_read_payload_and_next_header(ra, 0, NULL);
          return (err);
      }
      case DRR_END:
      {
+         if (ra->krrp_task == NULL ||
+             ra->krrp_task->buffer_args.force_cksum) {
          struct drr_end *drre = &ra->rrd->header.drr_u.drr_end;
!         if (!ZIO_CHECKSUM_EQUAL(ra->prev_cksum,
!             drre->drr_checksum))
              return (SET_ERROR(ECKSUM));
+         }
          return (0);
      }
      case DRR_SPILL:
      {
          struct drr_spill *drrs = &ra->rrd->header.drr_u.drr_spill;
*** 2868,2878 ****
   *
   * NB: callers *must* call dmu_recv_end() if this succeeds.
   */
  int
  dmu_recv_stream(dmu_recv_cookie_t *drc, vnode_t *vp, offset_t *voffp,
!     int cleanup_fd, uint64_t *action_handlep)
  {
      int err = 0;
      struct receive_arg ra = { 0 };
      struct receive_writer_arg rwa = { 0 };
      int featureflags;
--- 3033,3043 ----
   *
   * NB: callers *must* call dmu_recv_end() if this succeeds.
   */
  int
  dmu_recv_stream(dmu_recv_cookie_t *drc, vnode_t *vp, offset_t *voffp,
!     int cleanup_fd, uint64_t *action_handlep, dmu_krrp_task_t *krrp_task)
  {
      int err = 0;
      struct receive_arg ra = { 0 };
      struct receive_writer_arg rwa = { 0 };
      int featureflags;
*** 2880,2889 ****
--- 3045,3055 ----
      ra.byteswap = drc->drc_byteswap;
      ra.cksum = drc->drc_cksum;
      ra.vp = vp;
      ra.voff = *voffp;
+     ra.krrp_task = krrp_task;
  
      if (dsl_dataset_is_zapified(drc->drc_ds)) {
          (void) zap_lookup(drc->drc_ds->ds_dir->dd_pool->dp_meta_objset,
              drc->drc_ds->ds_object, DS_FIELD_RESUME_BYTES,
              sizeof (ra.bytes_read), 1, &ra.bytes_read);
*** 2988,2998 ****
       * has been handed off to the writer thread who will free it.  Finally,
       * if receive_read_record fails or we're at the end of the stream, then
       * we free ra.rrd and exit.
       */
      while (rwa.err == 0) {
!         if (issig(JUSTLOOKING) && issig(FORREAL)) {
              err = SET_ERROR(EINTR);
              break;
          }
  
          ASSERT3P(ra.rrd, ==, NULL);
--- 3154,3164 ----
       * has been handed off to the writer thread who will free it.  Finally,
       * if receive_read_record fails or we're at the end of the stream, then
       * we free ra.rrd and exit.
       */
      while (rwa.err == 0) {
!         if (vp && issig(JUSTLOOKING) && issig(FORREAL)) {
              err = SET_ERROR(EINTR);
              break;
          }
  
          ASSERT3P(ra.rrd, ==, NULL);
*** 3054,3063 ****
--- 3220,3241 ----
      dsl_pool_t *dp = dmu_tx_pool(tx);
      int error;
  
      ASSERT3P(drc->drc_ds->ds_owner, ==, dmu_recv_tag);
  
+     if (spa_feature_is_active(dp->dp_spa, SPA_FEATURE_WBC)) {
+         objset_t *os = NULL;
+ 
+         error = dmu_objset_from_ds(drc->drc_ds, &os);
+         if (error)
+             return (error);
+ 
+         /* Recv is impossible into DS that uses WBC */
+         if (os->os_wbc_mode != ZFS_WBC_MODE_OFF)
+             return (SET_ERROR(EKZFS_WBCNOTSUP));
+     }
+ 
      if (!drc->drc_newfs) {
          dsl_dataset_t *origin_head;
  
          error = dsl_dataset_hold(dp, drc->drc_tofs, FTAG, &origin_head);
          if (error != 0)
*** 3110,3119 ****
--- 3288,3310 ----
          error = dsl_destroy_head_check_impl(drc->drc_ds, 1);
      } else {
          error = dsl_dataset_snapshot_check_impl(drc->drc_ds,
              drc->drc_tosnap, tx, B_TRUE, 1, drc->drc_cred);
      }
+ 
+     if (dmu_tx_is_syncing(tx) && drc->drc_krrp_task != NULL) {
+         const char *token =
+             drc->drc_krrp_task->buffer_args.to_ds;
+         const char *cookie = drc->drc_krrp_task->cookie;
+         dsl_pool_t *dp = tx->tx_pool;
+ 
+         if (*token != '\0') {
+             error = zap_update(dp->dp_meta_objset,
+                 DMU_POOL_DIRECTORY_OBJECT, token, 1,
+                 strlen(cookie) + 1, cookie, tx);
+         }
+     }
      return (error);
  }
  
  static void
  dmu_recv_end_sync(void *arg, dmu_tx_t *tx)
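The hunk above records the krrp resume cookie under the stream token in the same synctask transaction that completes the receive, so (assuming standard synctask semantics) the cookie and the received state commit together or not at all. A toy sketch of that idea follows, with kv_update() standing in for zap_update() on the pool directory ZAP; tx_t and the one-slot store are invented.

#include <stdio.h>
#include <string.h>

typedef struct {
    int syncing;            /* stand-in for dmu_tx_is_syncing(tx) */
} tx_t;

/* one-slot stand-in for the pool directory ZAP */
static char stored_token[64];
static char stored_cookie[64];

/* stand-in for zap_update(): persist cookie under token */
static void
kv_update(const char *token, const char *cookie)
{
    (void) snprintf(stored_token, sizeof (stored_token), "%s", token);
    (void) snprintf(stored_cookie, sizeof (stored_cookie), "%s", cookie);
}

static int
recv_end_check(tx_t *tx, const char *token, const char *cookie)
{
    /*
     * Only the syncing pass writes, so the cookie commits (or not)
     * atomically with the rest of the receive.
     */
    if (tx->syncing && token[0] != '\0')
        kv_update(token, cookie);
    return (0);
}

int
main(void)
{
    tx_t sync_tx = { 1 };

    (void) recv_end_check(&sync_tx, "stream-token", "resume-cookie");
    (void) printf("%s -> %s\n", stored_token, stored_cookie);
    return (0);
}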