NEX-6832 fcsm module's debug level default should be 0 (cstyle fix)
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
NEX-7503 backport illumos #7307
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
NEX-7048 COMSTAR MODE_SENSE support is broken
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
NEX-6018 Return of the walking dead idm_refcnt_wait_ref comstar threads
Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
Reviewed by: Evan Layton <evan.layton@nexenta.com>
NEX-5428 Backout the 5.0 changes
NEX-2937 Continuous write_same starves all other commands
Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
NEX-3508 CLONE - Port NEX-2946 Add UNMAP/TRIM functionality to ZFS and illumos
Reviewed by: Josef Sipek <josef.sipek@nexenta.com>
Reviewed by: Alek Pinchuk <alek.pinchuk@nexenta.com>
Conflicts:
    usr/src/uts/common/io/scsi/targets/sd.c
    usr/src/uts/common/sys/scsi/targets/sddef.h
NEX-3217 Panic running benchmark at ESX VM
NEX-3204 Panic doing FC rescan from ESXi 5.5u1 with VAAI enabled
        Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
        Reviewed by: Tony Nguyen <tony.nguyen@nexenta.com>
NEX-2613 There should be a tunable to enable/disable SCSI UNMAP in NexentaStor
        Reviewed by: Steve Peng <steve.peng@nexenta.com>
        Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
NEX-1825 LUN's not discovered with ALUA - Commands sent down standby path
Reviewed by: Rob Gittins <rob.gittins@nexenta.com>
Reviewed by: Steve Peng <steve.peng@nexenta.com>
NEX-3171 VAAI Disable not yet included in 5.0
        Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
        Reviewed by: Tony Nguyen <tony.nguyen@nexenta.com>
        Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
NEX-3111 Comstar does not pass cstyle and hdrchk
        Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
        Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
        Reviewed by: Tony Nguyen <tony.nguyen@nexenta.com>
NEX-3023 Panics and hangs when using write_same and compare_and_write
Reviewed by: Bayard Bell <bayard.bell@nexenta.com>
Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
Approved by: Jean McCormack <jean.mccormack@nexenta.com>
Related bug: NEX-2723 Kernel panic in xfer_completion code for write_same (0x93) and compare_and_write (0x89)
NEX-2378 RW_LOCK_HELD assertion blown for sl_access_state_lock in sbd_flush_data_cache()
NEX-2178 Multi-block transfers on memory constrained systems for write_same (0x93) and compare_and_write (0x89) cause memory corruption
NEX-2105 assertion failed: (scmd->flags & SBD_SCSI_CMD_TRANS_DATA) && scmd->trans_data != NULL, file: ../../common/io/comstar/lu/stmf_sbd/sbd_scsi.c, line: 2447
SUP-761 sbd_flush_data_cache() call against closed zvol results in NULL pointer deref in zil_commit() call further down the stack
NEX-1965 Page fault at netbios_first_level_name_decode+0xbb
Support simultaneous compare_and_write operations for VAAI
Bug IDs SUP-505
                SUP-1768
                SUP-1928
Code Reviewers:
        Sarah Jelinek
        Jeffry Molanus
        Albert Lee
        Harold Shaw
SUP-782 COMSTAR UNMAP support should limit number of LBAs per operation
NEX-988 itask_lu_[read|write]_time was inadvertently removed by the Illumos 3862 fix
OS-69 Open source VAAI
re #13117, rb4251 EEE Setting for I350 on by default
re #12981, rbYYYY Panic due to possible race between LU coming to ready state and COMSTAR (stmf) sbd task
re #12375 rb4141 Create ALUA Support on NexentaStor; Failover causes loss of storage
re #7936 rb3706 Support for COMSTAR/OEM
re #8002 rb3706 Allow setting iSCSI vendor ID via stmf_sbd.conf
re #11454 rb3750 Fix inconsistent vid/pid in stmf
re #8499 rb3117 Unreleased ATS lock possibly caused by concurrent svmotions
8226 nza-kernel needs to be buildable by itself
re #6919, rb2433 ATS not dropping locks upon completion
Re #6790 backspace should perform delete on console
VAAI (XXX ATS support for COMSTAR, YYY Block-copy support for COMSTAR)

@@ -16,14 +16,15 @@
  * fields enclosed by brackets "[]" replaced with your own identifying
  * information: Portions Copyright [yyyy] [name of copyright owner]
  *
  * CDDL HEADER END
  */
+
 /*
  * Copyright (c) 2008, 2010, Oracle and/or its affiliates. All rights reserved.
- * Copyright 2011 Nexenta Systems, Inc.  All rights reserved.
  * Copyright (c) 2013 by Delphix. All rights reserved.
+ * Copyright 2016 Nexenta Systems, Inc. All rights reserved.
  */
 
 #include <sys/conf.h>
 #include <sys/file.h>
 #include <sys/ddi.h>

@@ -35,10 +36,11 @@
 #include <sys/disp.h>
 #include <sys/byteorder.h>
 #include <sys/atomic.h>
 #include <sys/sdt.h>
 #include <sys/dkio.h>
+#include <sys/dkioc_free_util.h>
 
 #include <sys/stmf.h>
 #include <sys/lpif.h>
 #include <sys/portif.h>
 #include <sys/stmf_ioctl.h>

@@ -84,19 +86,32 @@
         /* START STOP UNIT with START bit 0 and POWER CONDITION 0  */      \
         (((cdb[0]) == SCMD_START_STOP) && (                                \
             (((cdb[4]) & 0xF0) == 0) && (((cdb[4]) & 0x01) == 0))))
 /* End of SCSI2_CONFLICT_FREE_CMDS */
 
+uint8_t HardwareAcceleratedInit = 1;
+uint8_t sbd_unmap_enable = 1;           /* allow unmap by default */
+
+/*
+ * An /etc/system tunable which specifies the maximum number of LBAs supported
+ * in a single UNMAP operation. Default is 0x002000 blocks or 4MB in size.
+ */
+int stmf_sbd_unmap_max_nblks  = 0x002000;
+
+/*
+ * An /etc/system tunable which indicates if READ ops can run on the standby
+ * path or return an error.
+ */
+int stmf_standby_fail_reads = 0;
+
 stmf_status_t sbd_lu_reset_state(stmf_lu_t *lu);
 static void sbd_handle_sync_cache(struct scsi_task *task,
     struct stmf_data_buf *initial_dbuf);
 void sbd_handle_read_xfer_completion(struct scsi_task *task,
     sbd_cmd_t *scmd, struct stmf_data_buf *dbuf);
 void sbd_handle_short_write_xfer_completion(scsi_task_t *task,
     stmf_data_buf_t *dbuf);
-void sbd_handle_short_write_transfers(scsi_task_t *task,
-    stmf_data_buf_t *dbuf, uint32_t cdb_xfer_size);
 void sbd_handle_mode_select_xfer(scsi_task_t *task, uint8_t *buf,
     uint32_t buflen);
 void sbd_handle_mode_select(scsi_task_t *task, stmf_data_buf_t *dbuf);
 void sbd_handle_identifying_info(scsi_task_t *task, stmf_data_buf_t *dbuf);
 

@@ -103,11 +118,11 @@
 static void sbd_handle_unmap_xfer(scsi_task_t *task, uint8_t *buf,
     uint32_t buflen);
 static void sbd_handle_unmap(scsi_task_t *task, stmf_data_buf_t *dbuf);
 
 extern void sbd_pgr_initialize_it(scsi_task_t *, sbd_it_data_t *);
-extern int sbd_pgr_reservation_conflict(scsi_task_t *);
+extern int sbd_pgr_reservation_conflict(scsi_task_t *, struct sbd_lu *sl);
 extern void sbd_pgr_reset(sbd_lu_t *);
 extern void sbd_pgr_remove_it_handle(sbd_lu_t *, sbd_it_data_t *);
 extern void sbd_handle_pgr_in_cmd(scsi_task_t *, stmf_data_buf_t *);
 extern void sbd_handle_pgr_out_cmd(scsi_task_t *, stmf_data_buf_t *);
 extern void sbd_handle_pgr_out_data(scsi_task_t *, stmf_data_buf_t *);

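The hunk above introduces two tunables, sbd_unmap_enable and stmf_sbd_unmap_max_nblks, whose comments say they are meant to be set from /etc/system. A hypothetical sketch of doing so, assuming the module is named stmf_sbd and with illustrative values only:

```
* Disable SCSI UNMAP entirely (the NEX-2613 tunable)
set stmf_sbd:sbd_unmap_enable = 0
* Cap a single UNMAP at 0x1000 LBAs instead of the 0x2000 default
set stmf_sbd:stmf_sbd_unmap_max_nblks = 0x1000
```

A reboot is required for /etc/system settings to take effect.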
@@ -117,10 +132,11 @@
     struct stmf_data_buf *initial_dbuf);
 static void sbd_do_write_same_xfer(struct scsi_task *task, sbd_cmd_t *scmd,
     struct stmf_data_buf *dbuf, uint8_t dbuf_reusable);
 static void sbd_handle_write_same_xfer_completion(struct scsi_task *task,
     sbd_cmd_t *scmd, struct stmf_data_buf *dbuf, uint8_t dbuf_reusable);
+
 /*
  * IMPORTANT NOTE:
  * =================
  * The whole world here is based on the assumption that everything within
  * a scsi task executes in a single threaded manner, even the aborts.

@@ -141,11 +157,12 @@
         /* Lets try not to hog all the buffers the port has. */
         bufs_to_take = ((task->task_max_nbufs > 2) &&
             (task->task_cmd_xfer_length < (32 * 1024))) ? 2 :
             task->task_max_nbufs;
 
-        len = scmd->len > dbuf->db_buf_size ? dbuf->db_buf_size : scmd->len;
+        len = ATOMIC32_GET(scmd->len) > dbuf->db_buf_size ?
+            dbuf->db_buf_size : ATOMIC32_GET(scmd->len);
         laddr = scmd->addr + scmd->current_ro;
 
         for (buflen = 0, ndx = 0; (buflen < len) &&
             (ndx < dbuf->db_sglist_length); ndx++) {
                 iolen = min(len - buflen, dbuf->db_sglist[ndx].seg_length);

@@ -165,16 +182,18 @@
         }
         dbuf->db_relative_offset = scmd->current_ro;
         dbuf->db_data_size = buflen;
         dbuf->db_flags = DB_DIRECTION_TO_RPORT;
         (void) stmf_xfer_data(task, dbuf, 0);
-        scmd->len -= buflen;
+        atomic_add_32(&scmd->len, -buflen);
         scmd->current_ro += buflen;
-        if (scmd->len && (scmd->nbufs < bufs_to_take)) {
+        if (ATOMIC32_GET(scmd->len) &&
+            (ATOMIC8_GET(scmd->nbufs) < bufs_to_take)) {
                 uint32_t maxsize, minsize, old_minsize;
 
-                maxsize = (scmd->len > (128*1024)) ? 128*1024 : scmd->len;
+                maxsize = (ATOMIC32_GET(scmd->len) > (128*1024)) ? 128*1024 :
+                    ATOMIC32_GET(scmd->len);
                 minsize = maxsize >> 2;
                 do {
                         /*
                          * A bad port implementation can keep on failing the
                          * the request but keep on sending us a false

@@ -185,11 +204,11 @@
                 } while ((dbuf == NULL) && (old_minsize > minsize) &&
                     (minsize >= 512));
                 if (dbuf == NULL) {
                         return;
                 }
-                scmd->nbufs++;
+                atomic_inc_8(&scmd->nbufs);
                 sbd_do_read_xfer(task, scmd, dbuf);
         }
 }
 
 /*

@@ -216,10 +235,11 @@
         stmf_status_t xstat;
         stmf_data_buf_t *dbuf;
         uint_t nblks;
         uint64_t blksize = sl->sl_blksize;
         size_t db_private_sz;
+        hrtime_t xfer_start;
         uintptr_t pad;
 
         ASSERT(rw_read_held(&sl->sl_access_state_lock));
         ASSERT((sl->sl_flags & SL_MEDIA_LOADED) != 0);
 

@@ -257,18 +277,19 @@
                         first_len = task->task_1st_xfer_len;
         } else {
                 first_len = 0;
         }
 
-        while (scmd->len && scmd->nbufs < task->task_max_nbufs) {
+        while (ATOMIC32_GET(scmd->len) &&
+            ATOMIC8_GET(scmd->nbufs) < task->task_max_nbufs) {
 
-                xfer_len = MIN(max_len, scmd->len);
+                xfer_len = MIN(max_len, ATOMIC32_GET(scmd->len));
                 if (first_len) {
                         xfer_len = MIN(xfer_len, first_len);
                         first_len = 0;
                 }
-                if (scmd->len == xfer_len) {
+                if (ATOMIC32_GET(scmd->len) == xfer_len) {
                         final_xfer = 1;
                 } else {
                         /*
                          * Attempt to end xfer on a block boundary.
                          * The only way this does not happen is if the

@@ -331,28 +352,32 @@
 
                 /*
                  * Accounting for start of read.
                  * Note there is no buffer address for the probe yet.
                  */
+                xfer_start = gethrtime();
                 DTRACE_PROBE5(backing__store__read__start, sbd_lu_t *, sl,
                     uint8_t *, NULL, uint64_t, xfer_len,
                     uint64_t, offset, scsi_task_t *, task);
 
                 ret = sbd_zvol_alloc_read_bufs(sl, dbuf);
 
+                stmf_lu_xfer_done(task, B_TRUE /* read */,
+                    (gethrtime() - xfer_start));
                 DTRACE_PROBE6(backing__store__read__end, sbd_lu_t *, sl,
                     uint8_t *, NULL, uint64_t, xfer_len,
                     uint64_t, offset, int, ret, scsi_task_t *, task);
 
                 if (ret != 0) {
                         /*
                          * Read failure from the backend.
                          */
                         stmf_free(dbuf);
-                        if (scmd->nbufs == 0) {
+                        if (ATOMIC8_GET(scmd->nbufs) == 0) {
                                 /* nothing queued, just finish */
                                 scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                                sbd_ats_remove_by_task(task);
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_READ_ERROR);
                                 rw_exit(&sl->sl_access_state_lock);
                         } else {
                                 /* process failure when other dbufs finish */

@@ -359,11 +384,10 @@
                                 scmd->flags |= SBD_SCSI_CMD_XFER_FAIL;
                         }
                         return;
                 }
 
-
                 /*
                  * Allow PP to do setup
                  */
                 xstat = stmf_setup_dbuf(task, dbuf, 0);
                 if (xstat != STMF_SUCCESS) {

@@ -373,18 +397,19 @@
                          * If other dbufs are queued, try again when the next
                          * one completes, otherwise give up.
                          */
                         sbd_zvol_rele_read_bufs(sl, dbuf);
                         stmf_free(dbuf);
-                        if (scmd->nbufs > 0) {
+                        if (ATOMIC8_GET(scmd->nbufs) > 0) {
                                 /* completion of previous dbuf will retry */
                                 return;
                         }
                         /*
                          * Done with this command.
                          */
                         scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                        sbd_ats_remove_by_task(task);
                         if (first_xfer)
                                 stmf_scsilib_send_status(task, STATUS_QFULL, 0);
                         else
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_READ_ERROR);

@@ -392,11 +417,11 @@
                         return;
                 }
                 /*
                  * dbuf is now queued on task
                  */
-                scmd->nbufs++;
+                atomic_inc_8(&scmd->nbufs);
 
                 /* XXX leave this in for FW? */
                 DTRACE_PROBE4(sbd__xfer, struct scsi_task *, task,
                     struct stmf_data_buf *, dbuf, uint64_t, offset,
                     uint32_t, xfer_len);

@@ -414,20 +439,21 @@
                          * to the PP, thus no completion will occur.
                          */
                         sbd_zvol_rele_read_bufs(sl, dbuf);
                         stmf_teardown_dbuf(task, dbuf);
                         stmf_free(dbuf);
-                        scmd->nbufs--;
-                        if (scmd->nbufs > 0) {
+                        atomic_dec_8(&scmd->nbufs);
+                        if (ATOMIC8_GET(scmd->nbufs) > 0) {
                                 /* completion of previous dbuf will retry */
                                 return;
                         }
                         /*
                          * Done with this command.
                          */
                         rw_exit(&sl->sl_access_state_lock);
                         scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                        sbd_ats_remove_by_task(task);
                         if (first_xfer)
                                 stmf_scsilib_send_status(task, STATUS_QFULL, 0);
                         else
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_READ_ERROR);

@@ -435,17 +461,18 @@
                 case STMF_ABORTED:
                         /*
                          * Completion from task_done will cleanup
                          */
                         scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                        sbd_ats_remove_by_task(task);
                         return;
                 }
                 /*
                  * Update the xfer progress.
                  */
                 ASSERT(scmd->len >= xfer_len);
-                scmd->len -= xfer_len;
+                atomic_add_32(&scmd->len, -xfer_len);
                 scmd->current_ro += xfer_len;
         }
 }
 
 void

@@ -456,16 +483,18 @@
                 stmf_abort(STMF_QUEUE_TASK_ABORT, task,
                     dbuf->db_xfer_status, NULL);
                 return;
         }
         task->task_nbytes_transferred += dbuf->db_data_size;
-        if (scmd->len == 0 || scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
+        if (ATOMIC32_GET(scmd->len) == 0 ||
+            scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
                 stmf_free_dbuf(task, dbuf);
-                scmd->nbufs--;
-                if (scmd->nbufs)
+                atomic_dec_8(&scmd->nbufs);
+                if (ATOMIC8_GET(scmd->nbufs))
                         return; /* wait for all buffers to complete */
                 scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                sbd_ats_remove_by_task(task);
                 if (scmd->flags & SBD_SCSI_CMD_XFER_FAIL)
                         stmf_scsilib_send_status(task, STATUS_CHECK,
                             STMF_SAA_READ_ERROR);
                 else
                         stmf_scsilib_send_status(task, STATUS_GOOD, 0);

@@ -474,20 +503,21 @@
         if (dbuf->db_flags & DB_DONT_REUSE) {
                 /* allocate new dbuf */
                 uint32_t maxsize, minsize, old_minsize;
                 stmf_free_dbuf(task, dbuf);
 
-                maxsize = (scmd->len > (128*1024)) ? 128*1024 : scmd->len;
+                maxsize = (ATOMIC32_GET(scmd->len) > (128*1024)) ?
+                    128*1024 : ATOMIC32_GET(scmd->len);
                 minsize = maxsize >> 2;
                 do {
                         old_minsize = minsize;
                         dbuf = stmf_alloc_dbuf(task, maxsize, &minsize, 0);
                 } while ((dbuf == NULL) && (old_minsize > minsize) &&
                     (minsize >= 512));
                 if (dbuf == NULL) {
-                        scmd->nbufs --;
-                        if (scmd->nbufs == 0) {
+                        atomic_dec_8(&scmd->nbufs);
+                        if (ATOMIC8_GET(scmd->nbufs) == 0) {
                                 stmf_abort(STMF_QUEUE_TASK_ABORT, task,
                                     STMF_ALLOC_FAILURE, NULL);
                         }
                         return;
                 }

@@ -511,11 +541,11 @@
         int scmd_err;
 
         ASSERT(dbuf->db_lu_private);
         ASSERT(scmd->cmd_type == SBD_CMD_SCSI_READ);
 
-        scmd->nbufs--;  /* account for this dbuf */
+        atomic_dec_8(&scmd->nbufs);     /* account for this dbuf */
         /*
          * Release the DMU resources.
          */
         sbd_zvol_rele_read_bufs(sl, dbuf);
         /*

@@ -532,11 +562,12 @@
          * will be queued.
          */
         scmd_err = (((scmd->flags & SBD_SCSI_CMD_ACTIVE) == 0) ||
             (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) ||
             (xfer_status != STMF_SUCCESS));
-        if (scmd->nbufs == 0 && (scmd->len == 0 || scmd_err)) {
+        if ((ATOMIC8_GET(scmd->nbufs) == 0) &&
+            (ATOMIC32_GET(scmd->len) == 0 || scmd_err)) {
                 /* all DMU state has been released */
                 rw_exit(&sl->sl_access_state_lock);
         }
 
         /*

@@ -546,18 +577,20 @@
         if (!scmd_err) {
                 /*
                  * This chunk completed successfully
                  */
                 task->task_nbytes_transferred += data_size;
-                if (scmd->nbufs == 0 && scmd->len == 0) {
+                if (ATOMIC8_GET(scmd->nbufs) == 0 &&
+                    ATOMIC32_GET(scmd->len) == 0) {
                         /*
                          * This command completed successfully
                          *
                          * Status was sent along with data, so no status
                          * completion will occur. Tell stmf we are done.
                          */
                         scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                        sbd_ats_remove_by_task(task);
                         stmf_task_lu_done(task);
                         return;
                 }
                 /*
                  * Start more xfers

@@ -572,22 +605,30 @@
                 /*
                  * If a previous error occurred, leave the command active
                  * and wait for the last completion to send the status check.
                  */
                 if (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
-                        if (scmd->nbufs == 0) {
+                        if (ATOMIC8_GET(scmd->nbufs) == 0) {
                                 scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                                sbd_ats_remove_by_task(task);
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_READ_ERROR);
                         }
                         return;
                 }
                 /*
                  * Must have been a failure on current dbuf
                  */
                 ASSERT(xfer_status != STMF_SUCCESS);
+
+                /*
+                 * Actually this is a bug: the stmf abort path should have
+                 * reset the active flag, but since it has been this way for
+                 * some time, we won't change it.
+                 */
                 scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                sbd_ats_remove_by_task(task);
                 stmf_abort(STMF_QUEUE_TASK_ABORT, task, xfer_status, NULL);
         }
 }
 
 void

@@ -598,29 +639,32 @@
         sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
         int ret;
         int scmd_err, scmd_xfer_done;
         stmf_status_t xfer_status = dbuf->db_xfer_status;
         uint32_t data_size = dbuf->db_data_size;
+        hrtime_t xfer_start;
 
         ASSERT(zvio);
 
         /*
          * Allow PP to free up resources before releasing the write bufs
          * as writing to the backend could take some time.
          */
         stmf_teardown_dbuf(task, dbuf);
 
-        scmd->nbufs--;  /* account for this dbuf */
+        atomic_dec_8(&scmd->nbufs);     /* account for this dbuf */
         /*
          * All data was queued and this is the last completion,
          * but there could still be an error.
          */
-        scmd_xfer_done = (scmd->len == 0 && scmd->nbufs == 0);
+        scmd_xfer_done = (ATOMIC32_GET(scmd->len) == 0 &&
+            (ATOMIC8_GET(scmd->nbufs) == 0));
         scmd_err = (((scmd->flags & SBD_SCSI_CMD_ACTIVE) == 0) ||
             (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) ||
             (xfer_status != STMF_SUCCESS));
 
+        xfer_start = gethrtime();
         DTRACE_PROBE5(backing__store__write__start, sbd_lu_t *, sl,
             uint8_t *, NULL, uint64_t, data_size,
             uint64_t, zvio->zvio_offset, scsi_task_t *, task);
 
         if (scmd_err) {

@@ -634,10 +678,12 @@
                         zvio->zvio_flags = 0;
                 /* write the data */
                 ret = sbd_zvol_rele_write_bufs(sl, dbuf);
         }
 
+        stmf_lu_xfer_done(task, B_FALSE /* write */,
+            (gethrtime() - xfer_start));
         DTRACE_PROBE6(backing__store__write__end, sbd_lu_t *, sl,
             uint8_t *, NULL, uint64_t, data_size,
             uint64_t, zvio->zvio_offset, int, ret,  scsi_task_t *, task);
 
         if (ret != 0) {

@@ -653,11 +699,12 @@
          * Release the state lock if this is the last completion.
          * If this is the last dbuf on task and all data has been
          * transferred or an error encountered, then no more dbufs
          * will be queued.
          */
-        if (scmd->nbufs == 0 && (scmd->len == 0 || scmd_err)) {
+        if ((ATOMIC8_GET(scmd->nbufs) == 0) &&
+            (ATOMIC32_GET(scmd->len) == 0 || scmd_err)) {
                 /* all DMU state has been released */
                 rw_exit(&sl->sl_access_state_lock);
         }
         /*
          * If there have been no errors, either complete the task

@@ -667,10 +714,11 @@
                 /* This chunk completed successfully */
                 task->task_nbytes_transferred += data_size;
                 if (scmd_xfer_done) {
                         /* This command completed successfully */
                         scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                        sbd_ats_remove_by_task(task);
                         if ((scmd->flags & SBD_SCSI_CMD_SYNC_WRITE) &&
                             (sbd_flush_data_cache(sl, 0) != SBD_SUCCESS)) {
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_WRITE_ERROR);
                         } else {

@@ -687,21 +735,23 @@
         /*
          * Sort out the failure
          */
         if (scmd->flags & SBD_SCSI_CMD_ACTIVE) {
                 if (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
-                        if (scmd->nbufs == 0) {
+                        if (ATOMIC8_GET(scmd->nbufs) == 0) {
                                 scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                                sbd_ats_remove_by_task(task);
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_WRITE_ERROR);
                         }
                         /*
                          * Leave the command active until last dbuf completes.
                          */
                         return;
                 }
                 scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                sbd_ats_remove_by_task(task);
                 ASSERT(xfer_status != STMF_SUCCESS);
                 stmf_abort(STMF_QUEUE_TASK_ABORT, task, xfer_status, NULL);
         }
 }
 

@@ -723,10 +773,11 @@
         sbd_lu_t                *sl = task->task_lu->lu_provider_private;
         struct uio              uio;
         struct iovec            *iov, *tiov, iov1[8];
         uint32_t                len, resid;
         int                     ret, i, iovcnt, flags;
+        hrtime_t                xfer_start;
         boolean_t               is_read;
 
         ASSERT(cmd == SBD_CMD_SCSI_READ || cmd == SBD_CMD_SCSI_WRITE);
 
         is_read = (cmd == SBD_CMD_SCSI_READ) ? B_TRUE : B_FALSE;

@@ -760,10 +811,11 @@
         uio.uio_loffset = laddr;
         uio.uio_segflg = (short)UIO_SYSSPACE;
         uio.uio_resid = (uint64_t)len;
         uio.uio_llimit = RLIM64_INFINITY;
 
+        xfer_start = gethrtime();
         if (is_read == B_TRUE) {
                 uio.uio_fmode = FREAD;
                 uio.uio_extflg = UIO_COPY_CACHED;
                 DTRACE_PROBE5(backing__store__read__start, sbd_lu_t *, sl,
                     uint8_t *, NULL, uint64_t, len, uint64_t, laddr,

@@ -788,10 +840,12 @@
 
                 DTRACE_PROBE6(backing__store__write__end, sbd_lu_t *, sl,
                     uint8_t *, NULL, uint64_t, len, uint64_t, laddr, int, ret,
                     scsi_task_t *, task);
         }
+        /* finalize accounting */
+        stmf_lu_xfer_done(task, is_read, (gethrtime() - xfer_start));
 
         if (iov != &iov1[0])
                 kmem_free(iov, iovcnt * sizeof (*iov));
         if (ret != 0) {
                 /* Backend I/O error */

@@ -802,17 +856,30 @@
 
 void
 sbd_handle_read(struct scsi_task *task, struct stmf_data_buf *initial_dbuf)
 {
         uint64_t lba, laddr;
+        uint64_t blkcount;
         uint32_t len;
         uint8_t op = task->task_cdb[0];
         sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
         sbd_cmd_t *scmd;
         stmf_data_buf_t *dbuf;
         int fast_path;
+        boolean_t fua_bit = B_FALSE;
 
+        /*
+         * Check to see if the command is READ(10), READ(12), or READ(16).
+         * If it is then check for bit 3 being set to indicate if Forced
+         * Unit Access is being requested. If so, we'll bypass the use of
+         * DMA buffers to simplify support of this feature.
+         */
+        if (((op == SCMD_READ_G1) || (op == SCMD_READ_G4) ||
+            (op == SCMD_READ_G5)) &&
+            (task->task_cdb[1] & BIT_3)) {
+                fua_bit = B_TRUE;
+        }
         if (op == SCMD_READ) {
                 lba = READ_SCSI21(&task->task_cdb[1], uint64_t);
                 len = (uint32_t)task->task_cdb[4];
 
                 if (len == 0) {

@@ -832,10 +899,11 @@
                     STMF_SAA_INVALID_OPCODE);
                 return;
         }
 
         laddr = lba << sl->sl_data_blocksize_shift;
+        blkcount = len;
         len <<= sl->sl_data_blocksize_shift;
 
         if ((laddr + (uint64_t)len) > sl->sl_lu_size) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_LBA_OUT_OF_RANGE);

@@ -858,19 +926,26 @@
         if (len == 0) {
                 stmf_scsilib_send_status(task, STATUS_GOOD, 0);
                 return;
         }
 
+        if (sbd_ats_handling_before_io(task, sl, lba, blkcount) !=
+            SBD_SUCCESS) {
+                if (stmf_task_poll_lu(task, 10) != STMF_SUCCESS) {
+                        stmf_scsilib_send_status(task, STATUS_BUSY, 0);
+                }
+                return;
+        }
         /*
          * Determine if this read can directly use DMU buffers.
          */
         if (sbd_zcopy & (2|1) &&                /* Debug switch */
             initial_dbuf == NULL &&             /* No PP buffer passed in */
             sl->sl_flags & SL_CALL_ZVOL &&      /* zvol backing store */
             (task->task_additional_flags &
-            TASK_AF_ACCEPT_LU_DBUF))            /* PP allows it */
-        {
+            TASK_AF_ACCEPT_LU_DBUF) &&          /* PP allows it */
+            !fua_bit) {
                 /*
                  * Reduced copy path
                  */
                 uint32_t copy_threshold, minsize;
                 int ret;

@@ -881,10 +956,11 @@
                  * dbufs have completed.
                  */
                 rw_enter(&sl->sl_access_state_lock, RW_READER);
                 if ((sl->sl_flags & SL_MEDIA_LOADED) == 0) {
                         rw_exit(&sl->sl_access_state_lock);
+                        sbd_ats_remove_by_task(task);
                         stmf_scsilib_send_status(task, STATUS_CHECK,
                             STMF_SAA_READ_ERROR);
                         return;
                 }
 

@@ -901,10 +977,11 @@
 
                         ret = sbd_copy_rdwr(task, laddr, dbuf,
                             SBD_CMD_SCSI_READ, 0);
                         /* done with the backend */
                         rw_exit(&sl->sl_access_state_lock);
+                        sbd_ats_remove_by_task(task);
                         if (ret != 0) {
                                 /* backend error */
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_READ_ERROR);
                         } else {

@@ -933,17 +1010,16 @@
                         task->task_lu_private = scmd;
                 }
                 /*
                  * Setup scmd to track read progress.
                  */
-                scmd->flags = SBD_SCSI_CMD_ACTIVE;
+                scmd->flags = SBD_SCSI_CMD_ACTIVE | SBD_SCSI_CMD_ATS_RELATED;
                 scmd->cmd_type = SBD_CMD_SCSI_READ;
                 scmd->nbufs = 0;
                 scmd->addr = laddr;
                 scmd->len = len;
                 scmd->current_ro = 0;
-
                 /*
                  * Kick-off the read.
                  */
                 sbd_do_sgl_read_xfer(task, scmd, 1);
                 return;

@@ -959,10 +1035,11 @@
                         initial_dbuf = stmf_alloc_dbuf(task, maxsize,
                             &minsize, 0);
                 } while ((initial_dbuf == NULL) && (old_minsize > minsize) &&
                     (minsize >= 512));
                 if (initial_dbuf == NULL) {
+                        sbd_ats_remove_by_task(task);
                         stmf_scsilib_send_status(task, STATUS_QFULL, 0);
                         return;
                 }
         }
         dbuf = initial_dbuf;

@@ -982,20 +1059,21 @@
                         (void) stmf_xfer_data(task, dbuf, STMF_IOF_LU_DONE);
                 } else {
                         stmf_scsilib_send_status(task, STATUS_CHECK,
                             STMF_SAA_READ_ERROR);
                 }
+                sbd_ats_remove_by_task(task);
                 return;
         }
 
         if (task->task_lu_private) {
                 scmd = (sbd_cmd_t *)task->task_lu_private;
         } else {
                 scmd = (sbd_cmd_t *)kmem_alloc(sizeof (sbd_cmd_t), KM_SLEEP);
                 task->task_lu_private = scmd;
         }
-        scmd->flags = SBD_SCSI_CMD_ACTIVE;
+        scmd->flags = SBD_SCSI_CMD_ACTIVE | SBD_SCSI_CMD_ATS_RELATED;
         scmd->cmd_type = SBD_CMD_SCSI_READ;
         scmd->nbufs = 1;
         scmd->addr = laddr;
         scmd->len = len;
         scmd->current_ro = 0;

@@ -1008,11 +1086,11 @@
     struct stmf_data_buf *dbuf, uint8_t dbuf_reusable)
 {
         uint32_t len;
         int bufs_to_take;
 
-        if (scmd->len == 0) {
+        if (ATOMIC32_GET(scmd->len) == 0) {
                 goto DO_WRITE_XFER_DONE;
         }
 
         /* Lets try not to hog all the buffers the port has. */
         bufs_to_take = ((task->task_max_nbufs > 2) &&

@@ -1023,45 +1101,47 @@
             ((dbuf->db_flags & DB_DONT_REUSE) || (dbuf_reusable == 0))) {
                 /* free current dbuf and allocate a new one */
                 stmf_free_dbuf(task, dbuf);
                 dbuf = NULL;
         }
-        if (scmd->nbufs >= bufs_to_take) {
+        if (ATOMIC8_GET(scmd->nbufs) >= bufs_to_take) {
                 goto DO_WRITE_XFER_DONE;
         }
         if (dbuf == NULL) {
                 uint32_t maxsize, minsize, old_minsize;
 
-                maxsize = (scmd->len > (128*1024)) ? 128*1024 :
-                    scmd->len;
+                maxsize = (ATOMIC32_GET(scmd->len) > (128*1024)) ? 128*1024 :
+                    ATOMIC32_GET(scmd->len);
                 minsize = maxsize >> 2;
                 do {
                         old_minsize = minsize;
                         dbuf = stmf_alloc_dbuf(task, maxsize, &minsize, 0);
                 } while ((dbuf == NULL) && (old_minsize > minsize) &&
                     (minsize >= 512));
                 if (dbuf == NULL) {
-                        if (scmd->nbufs == 0) {
+                        if (ATOMIC8_GET(scmd->nbufs) == 0) {
                                 stmf_abort(STMF_QUEUE_TASK_ABORT, task,
                                     STMF_ALLOC_FAILURE, NULL);
                         }
                         return;
                 }
         }
 
-        len = scmd->len > dbuf->db_buf_size ? dbuf->db_buf_size :
-            scmd->len;
+        len = ATOMIC32_GET(scmd->len) > dbuf->db_buf_size ? dbuf->db_buf_size :
+            ATOMIC32_GET(scmd->len);
 
         dbuf->db_relative_offset = scmd->current_ro;
         dbuf->db_data_size = len;
         dbuf->db_flags = DB_DIRECTION_FROM_RPORT;
         (void) stmf_xfer_data(task, dbuf, 0);
-        scmd->nbufs++; /* outstanding port xfers and bufs used */
-        scmd->len -= len;
+        /* outstanding port xfers and bufs used */
+        atomic_inc_8(&scmd->nbufs);
+        atomic_add_32(&scmd->len, -len);
         scmd->current_ro += len;
 
-        if ((scmd->len != 0) && (scmd->nbufs < bufs_to_take)) {
+        if ((ATOMIC32_GET(scmd->len) != 0) &&
+            (ATOMIC8_GET(scmd->nbufs) < bufs_to_take)) {
                 sbd_do_write_xfer(task, scmd, NULL, 0);
         }
         return;
 
 DO_WRITE_XFER_DONE:

@@ -1123,18 +1203,18 @@
         } else {
                 first_len = 0;
         }
 
 
-        while (scmd->len && scmd->nbufs < task->task_max_nbufs) {
-
-                xfer_len = MIN(max_len, scmd->len);
+        while (ATOMIC32_GET(scmd->len) &&
+            ATOMIC8_GET(scmd->nbufs) < task->task_max_nbufs) {
+                xfer_len = MIN(max_len, ATOMIC32_GET(scmd->len));
                 if (first_len) {
                         xfer_len = MIN(xfer_len, first_len);
                         first_len = 0;
                 }
-                if (xfer_len < scmd->len) {
+                if (xfer_len < ATOMIC32_GET(scmd->len)) {
                         /*
                          * Attempt to end xfer on a block boundary.
                          * The only way this does not happen is if the
                          * xfer_len is small enough to stay contained
                          * within the same block.

@@ -1194,14 +1274,15 @@
                          * Could not allocate buffers from the backend;
                          * treat it like an IO error.
                          */
                         stmf_free(dbuf);
                         scmd->flags |= SBD_SCSI_CMD_XFER_FAIL;
-                        if (scmd->nbufs == 0) {
+                        if (ATOMIC8_GET(scmd->nbufs) == 0) {
                                 /*
                                  * Nothing queued, so no completions coming
                                  */
+                                sbd_ats_remove_by_task(task);
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_WRITE_ERROR);
                                 rw_exit(&sl->sl_access_state_lock);
                         }
                         /*

@@ -1221,18 +1302,19 @@
                          * If other dbufs are queued, try again when the next
                          * one completes, otherwise give up.
                          */
                         sbd_zvol_rele_write_bufs_abort(sl, dbuf);
                         stmf_free(dbuf);
-                        if (scmd->nbufs > 0) {
+                        if (ATOMIC8_GET(scmd->nbufs) > 0) {
                                 /* completion of previous dbuf will retry */
                                 return;
                         }
                         /*
                          * Done with this command.
                          */
                         scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                        sbd_ats_remove_by_task(task);
                         if (first_xfer)
                                 stmf_scsilib_send_status(task, STATUS_QFULL, 0);
                         else
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_WRITE_ERROR);

@@ -1241,11 +1323,11 @@
                 }
 
                 /*
                  * dbuf is now queued on task
                  */
-                scmd->nbufs++;
+                atomic_inc_8(&scmd->nbufs);
 
                 xstat = stmf_xfer_data(task, dbuf, 0);
                 switch (xstat) {
                 case STMF_SUCCESS:
                         break;

@@ -1255,19 +1337,20 @@
                          * to the PP, thus no completion will occur.
                          */
                         sbd_zvol_rele_write_bufs_abort(sl, dbuf);
                         stmf_teardown_dbuf(task, dbuf);
                         stmf_free(dbuf);
-                        scmd->nbufs--;
-                        if (scmd->nbufs > 0) {
+                        atomic_dec_8(&scmd->nbufs);
+                        if (ATOMIC8_GET(scmd->nbufs) > 0) {
                                 /* completion of previous dbuf will retry */
                                 return;
                         }
                         /*
                          * Done with this command.
                          */
                         scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                        sbd_ats_remove_by_task(task);
                         if (first_xfer)
                                 stmf_scsilib_send_status(task, STATUS_QFULL, 0);
                         else
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_WRITE_ERROR);

@@ -1281,11 +1364,11 @@
                         return;
                 }
                 /*
                  * Update the xfer progress.
                  */
-                scmd->len -= xfer_len;
+                atomic_add_32(&scmd->len, -xfer_len);
                 scmd->current_ro += xfer_len;
         }
 }
 
 void

@@ -1294,54 +1377,71 @@
 {
         sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
         uint64_t laddr;
         uint32_t buflen, iolen;
         int ndx;
+        uint8_t op = task->task_cdb[0];
+        boolean_t fua_bit = B_FALSE;
 
-        if (scmd->nbufs > 0) {
+        if (ATOMIC8_GET(scmd->nbufs) > 0) {
                 /*
                  * Decrement the count to indicate the port xfer
                  * into the dbuf has completed even though the buf is
                  * still in use here in the LU provider.
                  */
-                scmd->nbufs--;
+                atomic_dec_8(&scmd->nbufs);
         }
 
         if (dbuf->db_xfer_status != STMF_SUCCESS) {
+                sbd_ats_remove_by_task(task);
                 stmf_abort(STMF_QUEUE_TASK_ABORT, task,
                     dbuf->db_xfer_status, NULL);
                 return;
         }
 
         if (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
                 goto WRITE_XFER_DONE;
         }
 
-        if (scmd->len != 0) {
+        if (ATOMIC32_GET(scmd->len) != 0) {
                 /*
                  * Initiate the next port xfer to occur in parallel
                  * with writing this buf.
                  */
                 sbd_do_write_xfer(task, scmd, NULL, 0);
         }
 
+        /*
+         * Check whether the command is WRITE(10), WRITE(12), or WRITE(16).
+         * If it is, check whether bit 3 is set, indicating that Forced
+         * Unit Access is requested. If so, we'll bypass the direct
+         * call and handle it in sbd_data_write().
+         */
+        if (((op == SCMD_WRITE_G1) || (op == SCMD_WRITE_G4) ||
+            (op == SCMD_WRITE_G5)) && (task->task_cdb[1] & BIT_3)) {
+                fua_bit = B_TRUE;
+        }
         laddr = scmd->addr + dbuf->db_relative_offset;
 
         /*
          * If this is going to a zvol, use the direct call to
          * sbd_zvol_copy_{read,write}. The direct call interface is
          * restricted to PPs that accept sglists, but that is not required.
          */
         if (sl->sl_flags & SL_CALL_ZVOL &&
             (task->task_additional_flags & TASK_AF_ACCEPT_LU_DBUF) &&
-            (sbd_zcopy & (4|1))) {
+            (sbd_zcopy & (4|1)) && !fua_bit) {
                 int commit;
 
-                commit = (scmd->len == 0 && scmd->nbufs == 0);
-                if (sbd_copy_rdwr(task, laddr, dbuf, SBD_CMD_SCSI_WRITE,
+                commit = (ATOMIC32_GET(scmd->len) == 0 &&
+                    ATOMIC8_GET(scmd->nbufs) == 0);
+                rw_enter(&sl->sl_access_state_lock, RW_READER);
+                if ((sl->sl_flags & SL_MEDIA_LOADED) == 0 ||
+                    sbd_copy_rdwr(task, laddr, dbuf, SBD_CMD_SCSI_WRITE,
                     commit) != STMF_SUCCESS)
                         scmd->flags |= SBD_SCSI_CMD_XFER_FAIL;
+                rw_exit(&sl->sl_access_state_lock);
                 buflen = dbuf->db_data_size;
         } else {
                 for (buflen = 0, ndx = 0; (buflen < dbuf->db_data_size) &&
                     (ndx < dbuf->db_sglist_length); ndx++) {
                         iolen = min(dbuf->db_data_size - buflen,

@@ -1357,15 +1457,17 @@
                         laddr += (uint64_t)iolen;
                 }
         }
         task->task_nbytes_transferred += buflen;
 WRITE_XFER_DONE:
-        if (scmd->len == 0 || scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
+        if (ATOMIC32_GET(scmd->len) == 0 ||
+            scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
                 stmf_free_dbuf(task, dbuf);
-                if (scmd->nbufs)
+                if (ATOMIC8_GET(scmd->nbufs))
                         return; /* wait for all buffers to complete */
                 scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                sbd_ats_remove_by_task(task);
                 if (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
                         stmf_scsilib_send_status(task, STATUS_CHECK,
                             STMF_SAA_WRITE_ERROR);
                 } else {
                         /*

@@ -1427,17 +1529,29 @@
         uint32_t len;
         uint8_t op = task->task_cdb[0], do_immediate_data = 0;
         sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
         sbd_cmd_t *scmd;
         stmf_data_buf_t *dbuf;
+        uint64_t blkcount;
         uint8_t sync_wr_flag = 0;
+        boolean_t fua_bit = B_FALSE;
 
         if (sl->sl_flags & SL_WRITE_PROTECTED) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_WRITE_PROTECTED);
                 return;
         }
+        /*
+         * Check whether the command is WRITE(10), WRITE(12), or WRITE(16).
+         * If it is, check whether bit 3 is set, indicating that Forced
+         * Unit Access is requested. If so, we'll bypass the fast path
+         * code to simplify support of this feature.
+         */
+        if (((op == SCMD_WRITE_G1) || (op == SCMD_WRITE_G4) ||
+            (op == SCMD_WRITE_G5)) && (task->task_cdb[1] & BIT_3)) {
+                fua_bit = B_TRUE;
+        }
         if (op == SCMD_WRITE) {
                 lba = READ_SCSI21(&task->task_cdb[1], uint64_t);
                 len = (uint32_t)task->task_cdb[4];
 
                 if (len == 0) {

@@ -1469,10 +1583,11 @@
                     STMF_SAA_INVALID_OPCODE);
                 return;
         }
 
         laddr = lba << sl->sl_data_blocksize_shift;
+        blkcount = len;
         len <<= sl->sl_data_blocksize_shift;
 
         if ((laddr + (uint64_t)len) > sl->sl_lu_size) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_LBA_OUT_OF_RANGE);

@@ -1490,16 +1605,25 @@
         if (len == 0) {
                 stmf_scsilib_send_status(task, STATUS_GOOD, 0);
                 return;
         }
 
+        if (sbd_ats_handling_before_io(task, sl, lba, blkcount) !=
+            SBD_SUCCESS) {
+                if (stmf_task_poll_lu(task, 10) != STMF_SUCCESS) {
+                        stmf_scsilib_send_status(task, STATUS_BUSY, 0);
+                }
+                return;
+        }
+
         if (sbd_zcopy & (4|1) &&                /* Debug switch */
             initial_dbuf == NULL &&             /* No PP buf passed in */
             sl->sl_flags & SL_CALL_ZVOL &&      /* zvol backing store */
             (task->task_additional_flags &
             TASK_AF_ACCEPT_LU_DBUF) &&          /* PP allows it */
-            sbd_zcopy_write_useful(task, laddr, len, sl->sl_blksize)) {
+            sbd_zcopy_write_useful(task, laddr, len, sl->sl_blksize) &&
+            !fua_bit) {
 
                 /*
                  * XXX Note that disallowing initial_dbuf will eliminate
                  * iSCSI from participating. For small writes, that is
                  * probably ok. For large writes, it may be best to just

@@ -1507,10 +1631,11 @@
                  * the rest.
                  */
                 rw_enter(&sl->sl_access_state_lock, RW_READER);
                 if ((sl->sl_flags & SL_MEDIA_LOADED) == 0) {
                         rw_exit(&sl->sl_access_state_lock);
+                        sbd_ats_remove_by_task(task);
                         stmf_scsilib_send_status(task, STATUS_CHECK,
                             STMF_SAA_READ_ERROR);
                         return;
                 }
                 /*

@@ -1521,11 +1646,12 @@
                 } else {
                         scmd = (sbd_cmd_t *)kmem_alloc(sizeof (sbd_cmd_t),
                             KM_SLEEP);
                         task->task_lu_private = scmd;
                 }
-                scmd->flags = SBD_SCSI_CMD_ACTIVE | sync_wr_flag;
+                scmd->flags = SBD_SCSI_CMD_ACTIVE | SBD_SCSI_CMD_ATS_RELATED |
+                    sync_wr_flag;
                 scmd->cmd_type = SBD_CMD_SCSI_WRITE;
                 scmd->nbufs = 0;
                 scmd->addr = laddr;
                 scmd->len = len;
                 scmd->current_ro = 0;

@@ -1552,11 +1678,12 @@
                 scmd = (sbd_cmd_t *)task->task_lu_private;
         } else {
                 scmd = (sbd_cmd_t *)kmem_alloc(sizeof (sbd_cmd_t), KM_SLEEP);
                 task->task_lu_private = scmd;
         }
-        scmd->flags = SBD_SCSI_CMD_ACTIVE | sync_wr_flag;
+        scmd->flags = SBD_SCSI_CMD_ACTIVE | SBD_SCSI_CMD_ATS_RELATED |
+            sync_wr_flag;
         scmd->cmd_type = SBD_CMD_SCSI_WRITE;
         scmd->nbufs = 0;
         scmd->addr = laddr;
         scmd->len = len;
         scmd->current_ro = 0;

@@ -1564,11 +1691,11 @@
         if (do_immediate_data) {
                 /*
                  * Account for data passed in this write command
                  */
                 (void) stmf_xfer_data(task, dbuf, STMF_IOF_STATS_ONLY);
-                scmd->len -= dbuf->db_data_size;
+                atomic_add_32(&scmd->len, -dbuf->db_data_size);
                 scmd->current_ro += dbuf->db_data_size;
                 dbuf->db_xfer_status = STMF_SUCCESS;
                 sbd_handle_write_xfer_completion(task, scmd, dbuf, 0);
         } else {
                 sbd_do_write_xfer(task, scmd, dbuf, 0);

@@ -1747,10 +1874,13 @@
                 break;
         case SCMD_UNMAP:
                 sbd_handle_unmap_xfer(task,
                     dbuf->db_sglist[0].seg_addr, dbuf->db_data_size);
                 break;
+        case SCMD_EXTENDED_COPY:
+                sbd_handle_xcopy_xfer(task, dbuf->db_sglist[0].seg_addr);
+                break;
         case SCMD_PERSISTENT_RESERVE_OUT:
                 if (sl->sl_access_state == SBD_LU_STANDBY) {
                         st_ret = stmf_proxy_scsi_cmd(task, dbuf);
                         if (st_ret != STMF_SUCCESS) {
                                 stmf_scsilib_send_status(task, STATUS_CHECK,

@@ -1841,11 +1971,11 @@
         sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
         uint32_t cmd_size, n;
         uint8_t *cdb;
         uint32_t ncyl;
         uint8_t nsectors, nheads;
-        uint8_t page, ctrl, header_size, pc_valid;
+        uint8_t page, ctrl, header_size;
         uint16_t nbytes;
         uint8_t *p;
         uint64_t s = sl->sl_lu_size;
         uint32_t dev_spec_param_offset;
 

@@ -1852,29 +1982,25 @@
         p = buf;        /* buf is assumed to be zeroed out and large enough */
         n = 0;
         cdb = &task->task_cdb[0];
         page = cdb[2] & 0x3F;
         ctrl = (cdb[2] >> 6) & 3;
-        cmd_size = (cdb[0] == SCMD_MODE_SENSE) ? cdb[4] :
-            READ_SCSI16(&cdb[7], uint32_t);
 
         if (cdb[0] == SCMD_MODE_SENSE) {
+                cmd_size = cdb[4];
                 header_size = 4;
                 dev_spec_param_offset = 2;
         } else {
+                cmd_size = READ_SCSI16(&cdb[7], uint32_t);
                 header_size = 8;
                 dev_spec_param_offset = 3;
         }
 
         /* Now validate the command */
-        if ((cdb[2] == 0) || (page == MODEPAGE_ALLPAGES) || (page == 0x08) ||
-            (page == 0x0A) || (page == 0x03) || (page == 0x04)) {
-                pc_valid = 1;
-        } else {
-                pc_valid = 0;
-        }
-        if ((cmd_size < header_size) || (pc_valid == 0)) {
+        if ((cdb[2] != 0) && (page != MODEPAGE_ALLPAGES) &&
+            (page != MODEPAGE_CACHING) && (page != MODEPAGE_CTRL_MODE) &&
+            (page != MODEPAGE_FORMAT) && (page != MODEPAGE_GEOMETRY)) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_INVALID_FIELD_IN_CDB);
                 return;
         }
 

@@ -1888,11 +2014,11 @@
         /* We are not going to return any block descriptor */
 
         nbytes = ((uint16_t)1) << sl->sl_data_blocksize_shift;
         sbd_calc_geometry(s, nbytes, &nsectors, &nheads, &ncyl);
 
-        if ((page == 0x03) || (page == MODEPAGE_ALLPAGES)) {
+        if ((page == MODEPAGE_FORMAT) || (page == MODEPAGE_ALLPAGES)) {
                 p[n] = 0x03;
                 p[n+1] = 0x16;
                 if (ctrl != 1) {
                         p[n + 11] = nsectors;
                         p[n + 12] = nbytes >> 8;

@@ -1899,11 +2025,11 @@
                         p[n + 13] = nbytes & 0xff;
                         p[n + 20] = 0x80;
                 }
                 n += 24;
         }
-        if ((page == 0x04) || (page == MODEPAGE_ALLPAGES)) {
+        if ((page == MODEPAGE_GEOMETRY) || (page == MODEPAGE_ALLPAGES)) {
                 p[n] = 0x04;
                 p[n + 1] = 0x16;
                 if (ctrl != 1) {
                         p[n + 2] = ncyl >> 16;
                         p[n + 3] = ncyl >> 8;

@@ -1974,15 +2100,15 @@
                 /*
                  * Mode parameter header length doesn't include the number
                  * of bytes in the length field, so adjust the count.
                  * Byte count minus header length field size.
                  */
-                buf[0] = (n - 1) & 0xff;
+                buf[0] = (n - header_size) & 0xff;
         } else {
                 /* Byte count minus header length field size. */
-                buf[1] = (n - 2) & 0xff;
-                buf[0] = ((n - 2) >> 8) & 0xff;
+                buf[1] = (n - header_size) & 0xff;
+                buf[0] = ((n - header_size) >> 8) & 0xff;
         }
 
         sbd_handle_short_read_transfers(task, initial_dbuf, buf,
             cmd_size, n);
 }

@@ -2187,11 +2313,12 @@
  * i/p : pointer to pointer to a url string
  * o/p : Adjust the pointer to the url to the first non white character
  *       and returns the length of the URL
  */
 uint16_t
-sbd_parse_mgmt_url(char **url_addr) {
+sbd_parse_mgmt_url(char **url_addr)
+{
         uint16_t url_length = 0;
         char *url;
         url = *url_addr;
 
         while (*url != '\0') {

@@ -2298,28 +2425,58 @@
 
         return (ret);
 }
 
 static void
+sbd_write_same_release_resources(struct scsi_task *task)
+{
+        sbd_cmd_t *scmd = (sbd_cmd_t *)task->task_lu_private;
+
+        if (scmd->nbufs == 0xFF)
+                cmn_err(CE_WARN, "%s invalid buffer count %x",
+                    __func__, scmd->nbufs);
+        if ((scmd->trans_data_len != 0) && (scmd->trans_data != NULL))
+                kmem_free(scmd->trans_data, scmd->trans_data_len);
+        scmd->trans_data = NULL;
+        scmd->trans_data_len = 0;
+        scmd->flags &= ~SBD_SCSI_CMD_TRANS_DATA;
+}
+
+static void
 sbd_handle_write_same_xfer_completion(struct scsi_task *task, sbd_cmd_t *scmd,
     struct stmf_data_buf *dbuf, uint8_t dbuf_reusable)
 {
         uint64_t laddr;
         uint32_t buflen, iolen;
         int ndx, ret;
 
+        if (ATOMIC8_GET(scmd->nbufs) > 0) {
+                atomic_dec_8(&scmd->nbufs);
+        }
+
         if (dbuf->db_xfer_status != STMF_SUCCESS) {
+                sbd_write_same_release_resources(task);
+                sbd_ats_remove_by_task(task);
                 stmf_abort(STMF_QUEUE_TASK_ABORT, task,
                     dbuf->db_xfer_status, NULL);
                 return;
         }
 
         if (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
                 goto write_same_xfer_done;
         }
 
-        if (scmd->len != 0) {
+        /* if this is an unnecessary callback, just return */
+        if (((scmd->flags & SBD_SCSI_CMD_TRANS_DATA) == 0) ||
+            ((scmd->flags & SBD_SCSI_CMD_ACTIVE) == 0) ||
+            (scmd->trans_data == NULL)) {
+                sbd_ats_remove_by_task(task);
+                scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
+                return;
+        }
+
+        if (ATOMIC32_GET(scmd->len) != 0) {
                 /*
                  * Initiate the next port xfer to occur in parallel
                  * with writing this buf.
                  */
                 sbd_do_write_same_xfer(task, scmd, NULL, 0);

@@ -2339,33 +2496,32 @@
                 laddr += (uint64_t)iolen;
         }
         task->task_nbytes_transferred += buflen;
 
 write_same_xfer_done:
-        if (scmd->len == 0 || scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
+        if (ATOMIC32_GET(scmd->len) == 0 ||
+            scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
                 stmf_free_dbuf(task, dbuf);
+                if (ATOMIC8_GET(scmd->nbufs) > 0)
+                        return;
                 scmd->flags &= ~SBD_SCSI_CMD_ACTIVE;
                 if (scmd->flags & SBD_SCSI_CMD_XFER_FAIL) {
+                        sbd_ats_remove_by_task(task);
+                        sbd_write_same_release_resources(task);
                         stmf_scsilib_send_status(task, STATUS_CHECK,
                             STMF_SAA_WRITE_ERROR);
                 } else {
                         ret = sbd_write_same_data(task, scmd);
+                        sbd_ats_remove_by_task(task);
+                        sbd_write_same_release_resources(task);
                         if (ret != SBD_SUCCESS) {
                                 stmf_scsilib_send_status(task, STATUS_CHECK,
                                     STMF_SAA_WRITE_ERROR);
                         } else {
                                 stmf_scsilib_send_status(task, STATUS_GOOD, 0);
                         }
                 }
-                /*
-                 * Only way we should get here is via handle_write_same(),
-                 * and that should make the following assertion always pass.
-                 */
-                ASSERT((scmd->flags & SBD_SCSI_CMD_TRANS_DATA) &&
-                    scmd->trans_data != NULL);
-                kmem_free(scmd->trans_data, scmd->trans_data_len);
-                scmd->flags &= ~SBD_SCSI_CMD_TRANS_DATA;
                 return;
         }
         sbd_do_write_same_xfer(task, scmd, dbuf, dbuf_reusable);
 }
 

@@ -2373,11 +2529,11 @@
 sbd_do_write_same_xfer(struct scsi_task *task, sbd_cmd_t *scmd,
     struct stmf_data_buf *dbuf, uint8_t dbuf_reusable)
 {
         uint32_t len;
 
-        if (scmd->len == 0) {
+        if (ATOMIC32_GET(scmd->len) == 0) {
                 if (dbuf != NULL)
                         stmf_free_dbuf(task, dbuf);
                 return;
         }
 

@@ -2388,36 +2544,39 @@
                 dbuf = NULL;
         }
         if (dbuf == NULL) {
                 uint32_t maxsize, minsize, old_minsize;
 
-                maxsize = (scmd->len > (128*1024)) ? 128*1024 :
-                    scmd->len;
+                maxsize = (ATOMIC32_GET(scmd->len) > (128*1024)) ? 128*1024 :
+                    ATOMIC32_GET(scmd->len);
                 minsize = maxsize >> 2;
                 do {
                         old_minsize = minsize;
                         dbuf = stmf_alloc_dbuf(task, maxsize, &minsize, 0);
                 } while ((dbuf == NULL) && (old_minsize > minsize) &&
                     (minsize >= 512));
                 if (dbuf == NULL) {
-                        if (scmd->nbufs == 0) {
+                        sbd_ats_remove_by_task(task);
+                        sbd_write_same_release_resources(task);
+                        if (ATOMIC8_GET(scmd->nbufs) == 0) {
                                 stmf_abort(STMF_QUEUE_TASK_ABORT, task,
                                     STMF_ALLOC_FAILURE, NULL);
                         }
                         return;
                 }
         }
 
-        len = scmd->len > dbuf->db_buf_size ? dbuf->db_buf_size :
-            scmd->len;
+        len = ATOMIC32_GET(scmd->len) > dbuf->db_buf_size ? dbuf->db_buf_size :
+            ATOMIC32_GET(scmd->len);
 
         dbuf->db_relative_offset = scmd->current_ro;
         dbuf->db_data_size = len;
         dbuf->db_flags = DB_DIRECTION_FROM_RPORT;
         (void) stmf_xfer_data(task, dbuf, 0);
-        scmd->nbufs++; /* outstanding port xfers and bufs used */
-        scmd->len -= len;
+        /* outstanding port xfers and bufs used */
+        atomic_inc_8(&scmd->nbufs);
+        atomic_add_32(&scmd->len, -len);
         scmd->current_ro += len;
 }
 
 static void
 sbd_handle_write_same(scsi_task_t *task, struct stmf_data_buf *initial_dbuf)

@@ -2427,10 +2586,16 @@
         sbd_cmd_t *scmd;
         stmf_data_buf_t *dbuf;
         uint8_t unmap;
         uint8_t do_immediate_data = 0;
 
+        if (HardwareAcceleratedInit == 0) {
+                stmf_scsilib_send_status(task, STATUS_CHECK,
+                    STMF_SAA_INVALID_OPCODE);
+                return;
+        }
+
         task->task_cmd_xfer_length = 0;
         if (task->task_additional_flags &
             TASK_AF_NO_EXPECTED_XFER_LENGTH) {
                 task->task_expected_xfer_length = 0;
         }

@@ -2443,38 +2608,55 @@
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_INVALID_FIELD_IN_CDB);
                 return;
         }
         unmap = task->task_cdb[1] & 0x08;
+
         if (unmap && ((sl->sl_flags & SL_UNMAP_ENABLED) == 0)) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_INVALID_FIELD_IN_CDB);
                 return;
         }
+
         if (task->task_cdb[0] == SCMD_WRITE_SAME_G1) {
                 addr = READ_SCSI32(&task->task_cdb[2], uint64_t);
                 len = READ_SCSI16(&task->task_cdb[7], uint64_t);
         } else {
                 addr = READ_SCSI64(&task->task_cdb[2], uint64_t);
                 len = READ_SCSI32(&task->task_cdb[10], uint64_t);
         }
+
         if (len == 0) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_INVALID_FIELD_IN_CDB);
                 return;
         }
+
+        if (sbd_ats_handling_before_io(task, sl, addr, len) !=
+            SBD_SUCCESS) {
+                if (stmf_task_poll_lu(task, 10) != STMF_SUCCESS)
+                        stmf_scsilib_send_status(task, STATUS_BUSY, 0);
+                return;
+        }
+
         addr <<= sl->sl_data_blocksize_shift;
         len <<= sl->sl_data_blocksize_shift;
 
         /* Check if the command is for the unmap function */
         if (unmap) {
-                if (sbd_unmap(sl, addr, len) != 0) {
+                dkioc_free_list_t *dfl = kmem_zalloc(DFL_SZ(1), KM_SLEEP);
+
+                dfl->dfl_num_exts = 1;
+                dfl->dfl_exts[0].dfle_start = addr;
+                dfl->dfl_exts[0].dfle_length = len;
+                if (sbd_unmap(sl, dfl) != 0) {
                         stmf_scsilib_send_status(task, STATUS_CHECK,
                             STMF_SAA_LBA_OUT_OF_RANGE);
                 } else {
                         stmf_scsilib_send_status(task, STATUS_GOOD, 0);
                 }
+                dfl_free(dfl);
                 return;
         }
 
         /* Write same function */
 

@@ -2482,10 +2664,11 @@
         if (task->task_additional_flags &
             TASK_AF_NO_EXPECTED_XFER_LENGTH) {
                 task->task_expected_xfer_length = task->task_cmd_xfer_length;
         }
         if ((addr + len) > sl->sl_lu_size) {
+                sbd_ats_remove_by_task(task);
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_LBA_OUT_OF_RANGE);
                 return;
         }
 

@@ -2492,10 +2675,11 @@
         /* For rest of this I/O the transfer length is 1 block */
         len = ((uint64_t)1) << sl->sl_data_blocksize_shift;
 
         /* Some basic checks */
         if ((len == 0) || (len != task->task_expected_xfer_length)) {
+                sbd_ats_remove_by_task(task);
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_INVALID_FIELD_IN_CDB);
                 return;
         }
 

@@ -2503,10 +2687,11 @@
         if ((initial_dbuf != NULL) && (task->task_flags & TF_INITIAL_BURST)) {
                 if (initial_dbuf->db_data_size > len) {
                         if (initial_dbuf->db_data_size >
                             task->task_expected_xfer_length) {
                                 /* protocol error */
+                                sbd_ats_remove_by_task(task);
                                 stmf_abort(STMF_QUEUE_TASK_ABORT, task,
                                     STMF_INVALID_ARG, NULL);
                                 return;
                         }
                         initial_dbuf->db_data_size = (uint32_t)len;

@@ -2519,11 +2704,12 @@
                 scmd = (sbd_cmd_t *)task->task_lu_private;
         } else {
                 scmd = (sbd_cmd_t *)kmem_alloc(sizeof (sbd_cmd_t), KM_SLEEP);
                 task->task_lu_private = scmd;
         }
-        scmd->flags = SBD_SCSI_CMD_ACTIVE | SBD_SCSI_CMD_TRANS_DATA;
+        scmd->flags = SBD_SCSI_CMD_ACTIVE | SBD_SCSI_CMD_TRANS_DATA |
+            SBD_SCSI_CMD_ATS_RELATED;
         scmd->cmd_type = SBD_CMD_SCSI_WRITE;
         scmd->nbufs = 0;
         scmd->len = (uint32_t)len;
         scmd->trans_data_len = (uint32_t)len;
         scmd->trans_data = kmem_alloc((size_t)len, KM_SLEEP);

@@ -2532,11 +2718,11 @@
         if (do_immediate_data) {
                 /*
                  * Account for data passed in this write command
                  */
                 (void) stmf_xfer_data(task, dbuf, STMF_IOF_STATS_ONLY);
-                scmd->len -= dbuf->db_data_size;
+                atomic_add_32(&scmd->len, -dbuf->db_data_size);
                 scmd->current_ro += dbuf->db_data_size;
                 dbuf->db_xfer_status = STMF_SUCCESS;
                 sbd_handle_write_same_xfer_completion(task, scmd, dbuf, 0);
         } else {
                 sbd_do_write_same_xfer(task, scmd, dbuf, 0);

@@ -2544,12 +2730,24 @@
 }
 
 static void
 sbd_handle_unmap(scsi_task_t *task, stmf_data_buf_t *dbuf)
 {
+        sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
         uint32_t cmd_xfer_len;
 
+        if (sbd_unmap_enable == 0) {
+                stmf_scsilib_send_status(task, STATUS_CHECK,
+                    STMF_SAA_INVALID_OPCODE);
+                return;
+        }
+
+        if (sl->sl_flags & SL_WRITE_PROTECTED) {
+                stmf_scsilib_send_status(task, STATUS_CHECK,
+                    STMF_SAA_WRITE_PROTECTED);
+                return;
+        }
+
         cmd_xfer_len = READ_SCSI16(&task->task_cdb[7], uint32_t);
 
         if (task->task_cdb[1] & 1) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_INVALID_FIELD_IN_CDB);

@@ -2574,11 +2772,13 @@
 {
         sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
         uint32_t ulen, dlen, num_desc;
         uint64_t addr, len;
         uint8_t *p;
+        dkioc_free_list_t *dfl;
         int ret;
+        int i;
 
         if (buflen < 24) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_INVALID_FIELD_IN_CDB);
                 return;

@@ -2591,24 +2791,44 @@
                 stmf_scsilib_send_status(task, STATUS_CHECK,
                     STMF_SAA_INVALID_FIELD_IN_CDB);
                 return;
         }
 
-        for (p = buf + 8; num_desc; num_desc--, p += 16) {
+        dfl = kmem_zalloc(DFL_SZ(num_desc), KM_SLEEP);
+        dfl->dfl_num_exts = num_desc;
+        /*
+         * This should use ATS locking, but that locking was disabled by
+         * the changes made to ZFS to take advantage of TRIM in SSDs.
+         *
+         * Since the entire list is passed to ZFS in one call, ATS
+         * locking is not done.  If this turns out to be a detectable
+         * problem, the entire list needs to be locked before the unmap
+         * is issued and unlocked after the unmap completes.
+         */
+        for (p = buf + 8, i = 0; num_desc; num_desc--, p += 16, i++) {
                 addr = READ_SCSI64(p, uint64_t);
-                addr <<= sl->sl_data_blocksize_shift;
                 len = READ_SCSI32(p+8, uint64_t);
+                addr <<= sl->sl_data_blocksize_shift;
                 len <<= sl->sl_data_blocksize_shift;
-                ret = sbd_unmap(sl, addr, len);
+
+                /* Prepare a list of extents to unmap */
+                dfl->dfl_exts[i].dfle_start = addr;
+                dfl->dfl_exts[i].dfle_length = len;
+        }
+        ASSERT(i == dfl->dfl_num_exts);
+
+        /* Finally execute the unmap operations in a single step */
+        ret = sbd_unmap(sl, dfl);
+        dfl_free(dfl);
-                if (ret != 0) {
-                        stmf_scsilib_send_status(task, STATUS_CHECK,
-                            STMF_SAA_LBA_OUT_OF_RANGE);
-                        return;
-                }
-        }
+        if (ret != 0) {
+                stmf_scsilib_send_status(task, STATUS_CHECK,
+                    STMF_SAA_LBA_OUT_OF_RANGE);
+                return;
+        }
 
-unmap_done:
         stmf_scsilib_send_status(task, STATUS_GOOD, 0);
 }
 
 void
 sbd_handle_inquiry(struct scsi_task *task, struct stmf_data_buf *initial_dbuf)

@@ -2675,10 +2895,11 @@
                 inq->inq_rdf = 2;       /* Response data format for SPC-3 */
                 inq->inq_len = page_length;
 
                 inq->inq_tpgs = TPGS_FAILOVER_IMPLICIT;
                 inq->inq_cmdque = 1;
+                inq->inq_3pc = 1;
 
                 if (sl->sl_flags & SL_VID_VALID) {
                         bcopy(sl->sl_vendor_id, inq->inq_vid, 8);
                 } else {
                         bcopy(sbd_vendor_id, inq->inq_vid, 8);

@@ -2773,27 +2994,29 @@
         }
         p = (uint8_t *)kmem_zalloc(bsize, KM_SLEEP);
 
         switch (cdbp[2]) {
         case 0x00:
-                page_length = 4 + (mgmt_url_size ? 1 : 0);
+                page_length = 5 + (mgmt_url_size ? 1 : 0);
+
                 if (sl->sl_flags & SL_UNMAP_ENABLED)
-                        page_length += 2;
+                        page_length += 1;
 
                 p[0] = byte0;
                 p[3] = page_length;
                 /* Supported VPD pages in ascending order */
+                /* CSTYLED */
                 {
                         uint8_t i = 5;
 
                         p[i++] = 0x80;
                         p[i++] = 0x83;
                         if (mgmt_url_size != 0)
                                 p[i++] = 0x85;
                         p[i++] = 0x86;
-                        if (sl->sl_flags & SL_UNMAP_ENABLED) {
-                                p[i++] = 0xb0;
+                        p[i++] = 0xb0;
+                        if (sl->sl_flags & SL_UNMAP_ENABLED) {
                                 p[i++] = 0xb2;
                         }
                 }
                 xfer_size = page_length + 4;
                 break;

@@ -2822,11 +3045,11 @@
         case 0x85:
                 if (mgmt_url_size == 0) {
                         stmf_scsilib_send_status(task, STATUS_CHECK,
                             STMF_SAA_INVALID_FIELD_IN_CDB);
                         goto err_done;
-                }
+                } /* CSTYLED */
                 {
                         uint16_t idx, newidx, sz, url_size;
                         char *url;
 
                         p[0] = byte0;

@@ -2883,21 +3106,27 @@
                 p[5] = 1;
                 xfer_size = page_length + 4;
                 break;
 
         case 0xb0:
-                if ((sl->sl_flags & SL_UNMAP_ENABLED) == 0) {
-                        stmf_scsilib_send_status(task, STATUS_CHECK,
-                            STMF_SAA_INVALID_FIELD_IN_CDB);
-                        goto err_done;
-                }
                 page_length = 0x3c;
                 p[0] = byte0;
                 p[1] = 0xb0;
                 p[3] = page_length;
-                p[20] = p[21] = p[22] = p[23] = 0xFF;
-                p[24] = p[25] = p[26] = p[27] = 0xFF;
+                p[4] = 1;
+                p[5] = sbd_ats_max_nblks();
+                if ((sl->sl_flags & SL_UNMAP_ENABLED) && sbd_unmap_enable) {
+                        p[20] = (stmf_sbd_unmap_max_nblks >> 24) & 0xff;
+                        p[21] = (stmf_sbd_unmap_max_nblks >> 16) & 0xff;
+                        p[22] = (stmf_sbd_unmap_max_nblks >> 8) & 0xff;
+                        p[23] = stmf_sbd_unmap_max_nblks & 0xff;
+
+                        p[24] = 0;
+                        p[25] = 0;
+                        p[26] = 0;
+                        p[27] = 0xFF;
+                }
                 xfer_size = page_length + 4;
                 break;
 
         case 0xb2:
                 if ((sl->sl_flags & SL_UNMAP_ENABLED) == 0) {

@@ -2915,11 +3144,11 @@
                 while (s & ((uint64_t)0xFFFFFFFF80000000ull)) {
                         s >>= 1;
                         exp++;
                 }
                 p[4] = exp;
-                p[5] = 0xc0;
+                p[5] = 0xc0;    /* LBPU and LBPWS (UNMAP and WRITE SAME) */
                 xfer_size = page_length + 4;
                 break;
 
         default:
                 stmf_scsilib_send_status(task, STATUS_CHECK,

@@ -2936,11 +3165,11 @@
 
 stmf_status_t
 sbd_task_alloc(struct scsi_task *task)
 {
         if ((task->task_lu_private =
-            kmem_alloc(sizeof (sbd_cmd_t), KM_NOSLEEP)) != NULL) {
+            kmem_zalloc(sizeof (sbd_cmd_t), KM_NOSLEEP)) != NULL) {
                 sbd_cmd_t *scmd = (sbd_cmd_t *)task->task_lu_private;
                 scmd->flags = 0;
                 return (STMF_SUCCESS);
         }
         return (STMF_ALLOC_FAILURE);

@@ -3002,12 +3231,44 @@
         it->sbd_it_flags &= ~SBD_IT_HAS_SCSI2_RESERVATION;
         sl->sl_flags &= ~SL_LU_HAS_SCSI2_RESERVATION;
         mutex_exit(&sl->sl_lock);
 }
 
+/*
+ * Given a LU and a task, check if the task is causing a reservation
+ * conflict.  Returns 1 in case of conflict, 0 otherwise.
+ * Note that the LU might not be the same LU as in the task, but the
+ * caller makes sure that the LU can be accessed.
+ */
+int
+sbd_check_reservation_conflict(struct sbd_lu *sl, struct scsi_task *task)
+{
+        sbd_it_data_t *it;
 
+        it = task->task_lu_itl_handle;
+        ASSERT(it);
+        if (sl->sl_access_state == SBD_LU_ACTIVE) {
+                if (SBD_PGR_RSVD(sl->sl_pgr)) {
+                        if (sbd_pgr_reservation_conflict(task, sl)) {
+                                return (1);
+                        }
+                } else if ((sl->sl_flags & SL_LU_HAS_SCSI2_RESERVATION) &&
+                    ((it->sbd_it_flags & SBD_IT_HAS_SCSI2_RESERVATION) == 0)) {
+                        if (!(SCSI2_CONFLICT_FREE_CMDS(task->task_cdb))) {
+                                return (1);
+                        }
+                }
+        }
 
+        return (0);
+}
+
+/*
+ * Keep in mind that sbd_new_task can be called multiple times for the same
+ * task, because our calls to stmf_task_poll_lu result in later calls to
+ * sbd_new_task() via sbd_task_poll().
+ */
 void
 sbd_new_task(struct scsi_task *task, struct stmf_data_buf *initial_dbuf)
 {
         sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
         sbd_it_data_t *it;

@@ -3067,47 +3328,103 @@
          * states, return NOT READY
          */
         if (sl->sl_access_state == SBD_LU_TRANSITION_TO_STANDBY ||
             sl->sl_access_state == SBD_LU_TRANSITION_TO_ACTIVE) {
                 stmf_scsilib_send_status(task, STATUS_CHECK,
-                    STMF_SAA_LU_NO_ACCESS_UNAVAIL);
+                    STMF_SAA_LU_NO_ACCESS_TRANSITION);
                 return;
         }
 
-        /* Checking ua conditions as per SAM3R14 5.3.2 specified order */
-        if ((it->sbd_it_ua_conditions) && (task->task_cdb[0] != SCMD_INQUIRY)) {
+        cdb0 = task->task_cdb[0];
+        cdb1 = task->task_cdb[1];
+        /*
+         * Special case for different versions of Windows.
+         * 1) Windows 2012 and VMware will fail to discover LU's if a READ
+         *    operation sent down the standby path returns an error.  By
+         *    default stmf_standby_fail_reads will be set to 0.
+         * 2) Windows 2008 R2 has a severe performance problem if READ ops
+         *    aren't rejected on the standby path.  2008 sends commands
+         *    down the standby path which then must be proxied over to the
+         *    active node and back.
+         */
+        if ((sl->sl_access_state == SBD_LU_STANDBY) &&
+            stmf_standby_fail_reads &&
+            (cdb0 == SCMD_READ || cdb0 == SCMD_READ_G1 ||
+            cdb0 == SCMD_READ_G4 || cdb0 == SCMD_READ_G5)) {
+                stmf_scsilib_send_status(task, STATUS_CHECK,
+                    STMF_SAA_LU_NO_ACCESS_STANDBY);
+                return;
+        }
+
+        /*
+         * Don't go further if cmd is unsupported in standby mode
+         */
+        if (sl->sl_access_state == SBD_LU_STANDBY) {
+                if (cdb0 != SCMD_INQUIRY &&
+                    cdb0 != SCMD_MODE_SENSE &&
+                    cdb0 != SCMD_MODE_SENSE_G1 &&
+                    cdb0 != SCMD_MODE_SELECT &&
+                    cdb0 != SCMD_MODE_SELECT_G1 &&
+                    cdb0 != SCMD_RESERVE &&
+                    cdb0 != SCMD_RELEASE &&
+                    cdb0 != SCMD_PERSISTENT_RESERVE_OUT &&
+                    cdb0 != SCMD_PERSISTENT_RESERVE_IN &&
+                    cdb0 != SCMD_REQUEST_SENSE &&
+                    cdb0 != SCMD_READ_CAPACITY &&
+                    cdb0 != SCMD_TEST_UNIT_READY &&
+                    cdb0 != SCMD_START_STOP &&
+                    cdb0 != SCMD_READ &&
+                    cdb0 != SCMD_READ_G1 &&
+                    cdb0 != SCMD_READ_G4 &&
+                    cdb0 != SCMD_READ_G5 &&
+                    !(cdb0 == SCMD_SVC_ACTION_IN_G4 &&
+                    cdb1 == SSVC_ACTION_READ_CAPACITY_G4) &&
+                    !(cdb0 == SCMD_MAINTENANCE_IN &&
+                    (cdb1 & 0x1F) == 0x05) &&
+                    !(cdb0 == SCMD_MAINTENANCE_IN &&
+                    (cdb1 & 0x1F) == 0x0A)) {
+                        stmf_scsilib_send_status(task, STATUS_CHECK,
+                            STMF_SAA_LU_NO_ACCESS_STANDBY);
+                        return;
+                }
+        }
+
+        /*
+         * Checking ua conditions as per SAM3R14 5.3.2 specified order.  During
+         * MPIO/ALUA failover, cmds come in through both local ports and the
+         * proxy port provider (i.e. pppt).  We want to report unit attention
+         * only on local cmds, since initiators (Windows MPIO/DSM) would
+         * continue sending I/O to the target that reported unit attention.
+         */
+        if ((it->sbd_it_ua_conditions) &&
+            !(task->task_additional_flags & TASK_AF_PPPT_TASK) &&
+            (task->task_cdb[0] != SCMD_INQUIRY)) {
                 uint32_t saa = 0;
 
                 mutex_enter(&sl->sl_lock);
                 if (it->sbd_it_ua_conditions & SBD_UA_POR) {
                         it->sbd_it_ua_conditions &= ~SBD_UA_POR;
                         saa = STMF_SAA_POR;
+                } else if (it->sbd_it_ua_conditions &
+                    SBD_UA_ASYMMETRIC_ACCESS_CHANGED) {
+                        it->sbd_it_ua_conditions &=
+                            ~SBD_UA_ASYMMETRIC_ACCESS_CHANGED;
+                        saa = STMF_SAA_ASYMMETRIC_ACCESS_CHANGED;
                 }
                 mutex_exit(&sl->sl_lock);
                 if (saa) {
                         stmf_scsilib_send_status(task, STATUS_CHECK, saa);
                         return;
                 }
         }
 
         /* Reservation conflict checks */
-        if (sl->sl_access_state == SBD_LU_ACTIVE) {
-                if (SBD_PGR_RSVD(sl->sl_pgr)) {
-                        if (sbd_pgr_reservation_conflict(task)) {
-                                stmf_scsilib_send_status(task,
-                                    STATUS_RESERVATION_CONFLICT, 0);
-                                return;
-                        }
-                } else if ((sl->sl_flags & SL_LU_HAS_SCSI2_RESERVATION) &&
-                    ((it->sbd_it_flags & SBD_IT_HAS_SCSI2_RESERVATION) == 0)) {
-                        if (!(SCSI2_CONFLICT_FREE_CMDS(task->task_cdb))) {
-                                stmf_scsilib_send_status(task,
-                                    STATUS_RESERVATION_CONFLICT, 0);
-                                return;
-                        }
-                }
-        }
+        if (sbd_check_reservation_conflict(sl, task)) {
+                stmf_scsilib_send_status(task,
+                    STATUS_RESERVATION_CONFLICT, 0);
+                return;
+        }
 
         /* Rest of the ua conndition checks */
         if ((it->sbd_it_ua_conditions) && (task->task_cdb[0] != SCMD_INQUIRY)) {
                 uint32_t saa = 0;
 

@@ -3127,13 +3444,13 @@
                         it->sbd_it_ua_conditions &=
                             ~SBD_UA_MODE_PARAMETERS_CHANGED;
                         saa = STMF_SAA_MODE_PARAMETERS_CHANGED;
                 } else if (it->sbd_it_ua_conditions &
                     SBD_UA_ASYMMETRIC_ACCESS_CHANGED) {
-                        it->sbd_it_ua_conditions &=
-                            ~SBD_UA_ASYMMETRIC_ACCESS_CHANGED;
-                        saa = STMF_SAA_ASYMMETRIC_ACCESS_CHANGED;
+                        saa = 0;
+                } else if (it->sbd_it_ua_conditions & SBD_UA_POR) {
+                        saa = 0;
                 } else if (it->sbd_it_ua_conditions &
                     SBD_UA_ACCESS_STATE_TRANSITION) {
                         it->sbd_it_ua_conditions &=
                             ~SBD_UA_ACCESS_STATE_TRANSITION;
                         saa = STMF_SAA_LU_NO_ACCESS_TRANSITION;

@@ -3146,42 +3463,11 @@
                         stmf_scsilib_send_status(task, STATUS_CHECK, saa);
                         return;
                 }
         }
 
-        cdb0 = task->task_cdb[0];
-        cdb1 = task->task_cdb[1];
-
         if (sl->sl_access_state == SBD_LU_STANDBY) {
-                if (cdb0 != SCMD_INQUIRY &&
-                    cdb0 != SCMD_MODE_SENSE &&
-                    cdb0 != SCMD_MODE_SENSE_G1 &&
-                    cdb0 != SCMD_MODE_SELECT &&
-                    cdb0 != SCMD_MODE_SELECT_G1 &&
-                    cdb0 != SCMD_RESERVE &&
-                    cdb0 != SCMD_RELEASE &&
-                    cdb0 != SCMD_PERSISTENT_RESERVE_OUT &&
-                    cdb0 != SCMD_PERSISTENT_RESERVE_IN &&
-                    cdb0 != SCMD_REQUEST_SENSE &&
-                    cdb0 != SCMD_READ_CAPACITY &&
-                    cdb0 != SCMD_TEST_UNIT_READY &&
-                    cdb0 != SCMD_START_STOP &&
-                    cdb0 != SCMD_READ &&
-                    cdb0 != SCMD_READ_G1 &&
-                    cdb0 != SCMD_READ_G4 &&
-                    cdb0 != SCMD_READ_G5 &&
-                    !(cdb0 == SCMD_SVC_ACTION_IN_G4 &&
-                    cdb1 == SSVC_ACTION_READ_CAPACITY_G4) &&
-                    !(cdb0 == SCMD_MAINTENANCE_IN &&
-                    (cdb1 & 0x1F) == 0x05) &&
-                    !(cdb0 == SCMD_MAINTENANCE_IN &&
-                    (cdb1 & 0x1F) == 0x0A)) {
-                        stmf_scsilib_send_status(task, STATUS_CHECK,
-                            STMF_SAA_LU_NO_ACCESS_STANDBY);
-                        return;
-                }
-
                 /*
                  * is this a short write?
                  * if so, we'll need to wait until we have the buffer
                  * before proxying the command
                  */

@@ -3355,10 +3641,25 @@
         if ((cdb0 == SCMD_WRITE_SAME_G4) || (cdb0 == SCMD_WRITE_SAME_G1)) {
                 sbd_handle_write_same(task, initial_dbuf);
                 return;
         }
 
+        if (cdb0 == SCMD_COMPARE_AND_WRITE) {
+                sbd_handle_ats(task, initial_dbuf);
+                return;
+        }
+
+        if (cdb0 == SCMD_EXTENDED_COPY) {
+                sbd_handle_xcopy(task, initial_dbuf);
+                return;
+        }
+
+        if (cdb0 == SCMD_RECV_COPY_RESULTS) {
+                sbd_handle_recv_copy_results(task, initial_dbuf);
+                return;
+        }
+
         if (cdb0 == SCMD_TEST_UNIT_READY) {     /* Test unit ready */
                 task->task_cmd_xfer_length = 0;
                 stmf_scsilib_send_status(task, STATUS_GOOD, 0);
                 return;
         }

@@ -3461,16 +3762,22 @@
         case (SBD_CMD_SCSI_READ):
                 sbd_handle_read_xfer_completion(task, scmd, dbuf);
                 break;
 
         case (SBD_CMD_SCSI_WRITE):
-                if ((task->task_cdb[0] == SCMD_WRITE_SAME_G1) ||
-                    (task->task_cdb[0] == SCMD_WRITE_SAME_G4)) {
+                switch (task->task_cdb[0]) {
+                case SCMD_WRITE_SAME_G1:
+                case SCMD_WRITE_SAME_G4:
                         sbd_handle_write_same_xfer_completion(task, scmd, dbuf,
                             1);
-                } else {
+                        break;
+                case SCMD_COMPARE_AND_WRITE:
+                        sbd_handle_ats_xfer_completion(task, scmd, dbuf, 1);
+                        break;
+                default:
                         sbd_handle_write_xfer_completion(task, scmd, dbuf, 1);
+                        break;
                 }
                 break;
 
         case (SBD_CMD_SMALL_READ):
                 sbd_handle_short_read_xfer_completion(task, scmd, dbuf);

@@ -3532,10 +3839,11 @@
                 return (STMF_SUCCESS);
         }
 
         ASSERT(abort_cmd == STMF_LU_ABORT_TASK);
         task = (scsi_task_t *)arg;
+        sbd_ats_remove_by_task(task);
         if (task->task_lu_private) {
                 sbd_cmd_t *scmd = (sbd_cmd_t *)task->task_lu_private;
 
                 if (scmd->flags & SBD_SCSI_CMD_ACTIVE) {
                         if (scmd->flags & SBD_SCSI_CMD_TRANS_DATA) {

@@ -3549,10 +3857,19 @@
         }
 
         return (STMF_NOT_FOUND);
 }
 
+void
+sbd_task_poll(struct scsi_task *task)
+{
+        stmf_data_buf_t *initial_dbuf;
+
+        initial_dbuf = stmf_handle_to_buf(task, 0);
+        sbd_new_task(task, initial_dbuf);
+}
+
 /*
  * This function is called during task clean-up if the
  * DB_LU_FLAG is set on the dbuf. This should only be called for
  * abort processing after sbd_abort has been called for the task.
  */

@@ -3561,11 +3878,11 @@
 {
         sbd_cmd_t *scmd = (sbd_cmd_t *)task->task_lu_private;
         sbd_lu_t *sl = (sbd_lu_t *)task->task_lu->lu_provider_private;
 
         ASSERT(dbuf->db_lu_private);
-        ASSERT(scmd && scmd->nbufs > 0);
+        ASSERT(scmd && ATOMIC8_GET(scmd->nbufs) > 0);
         ASSERT((scmd->flags & SBD_SCSI_CMD_ACTIVE) == 0);
         ASSERT(dbuf->db_flags & DB_LU_DATA_BUF);
         ASSERT(task->task_additional_flags & TASK_AF_ACCEPT_LU_DBUF);
         ASSERT((curthread->t_flag & T_INTR_THREAD) == 0);
 

@@ -3575,11 +3892,11 @@
                 sbd_zvol_rele_write_bufs_abort(sl, dbuf);
         } else {
                 cmn_err(CE_PANIC, "Unknown cmd type %d, task = %p",
                     scmd->cmd_type, (void *)task);
         }
-        if (--scmd->nbufs == 0)
+        if (atomic_dec_8_nv(&scmd->nbufs) == 0)
                 rw_exit(&sl->sl_access_state_lock);
         stmf_teardown_dbuf(task, dbuf);
         stmf_free(dbuf);
 }
 

@@ -3672,33 +3989,42 @@
 
 sbd_status_t
 sbd_flush_data_cache(sbd_lu_t *sl, int fsync_done)
 {
         int r = 0;
-        int ret;
+        sbd_status_t ret = SBD_SUCCESS;
 
+        rw_enter(&sl->sl_access_state_lock, RW_READER);
+        if ((sl->sl_flags & SL_MEDIA_LOADED) == 0) {
+                ret = SBD_FILEIO_FAILURE;
+                goto flush_fail;
+        }
         if (fsync_done)
                 goto over_fsync;
         if ((sl->sl_data_vtype == VREG) || (sl->sl_data_vtype == VBLK)) {
-                if (VOP_FSYNC(sl->sl_data_vp, FSYNC, kcred, NULL))
-                        return (SBD_FAILURE);
+                if (VOP_FSYNC(sl->sl_data_vp, FSYNC, kcred, NULL)) {
+                        ret = SBD_FAILURE;
+                        goto flush_fail;
+                }
         }
 over_fsync:
         if (((sl->sl_data_vtype == VCHR) || (sl->sl_data_vtype == VBLK)) &&
             ((sl->sl_flags & SL_NO_DATA_DKIOFLUSH) == 0)) {
                 ret = VOP_IOCTL(sl->sl_data_vp, DKIOCFLUSHWRITECACHE, NULL,
                     FKIOCTL, kcred, &r, NULL);
                 if ((ret == ENOTTY) || (ret == ENOTSUP)) {
                         mutex_enter(&sl->sl_lock);
                         sl->sl_flags |= SL_NO_DATA_DKIOFLUSH;
                         mutex_exit(&sl->sl_lock);
+                        ret = SBD_SUCCESS;
-                } else if (ret != 0) {
-                        return (SBD_FAILURE);
+                } else {
+                        ret = (ret != 0) ? SBD_FAILURE : SBD_SUCCESS;
                 }
         }
+flush_fail:
+        rw_exit(&sl->sl_access_state_lock);
 
-        return (SBD_SUCCESS);
+        return (ret);
 }
 
 /* ARGSUSED */
 static void
 sbd_handle_sync_cache(struct scsi_task *task,