15254 %ymm registers not restored after signal handler
15367 x86 getfpregs() summons corrupting %xmm ghosts
15333 want x86 /proc xregs support (libc_db, libproc, mdb, etc.)
15336 want libc functions for extended ucontext_t
15334 want ps_lwphandle-specific reg routines
15328 FPU_CW_INIT mistreats reserved bit
15335 i86pc fpu_subr.c isn't really platform-specific
15332 setcontext(2) isn't actually noreturn
15331 need <sys/stdalign.h>
Change-Id: I7060aa86042dfb989f77fc3323c065ea2eafa9ad
Conflicts:
    usr/src/uts/common/fs/proc/prcontrol.c
    usr/src/uts/intel/os/archdep.c
    usr/src/uts/intel/sys/ucontext.h
    usr/src/uts/intel/syscall/getcontext.c

*** 1252,1261 ****
--- 1252,1485 ----
       lwp, named by the contract type. Changes made to an active template
       descriptor do not affect the original template which was activated,
       though they do affect the active template. It is not possible to
       activate an active template descriptor. See contract(5).

+ ARCHITECTURE-SPECIFIC STRUCTURES
+    x86
+      The x86 prxregset_t structure is opaque and is made up of several
+      different components due to the fact that different x86 processors
+      enumerate different architectural extensions.
+
+      The structure begins with a header, the prxregset_hdr_t, which is
+      followed by a number of different information sections which describe
+      different possible extended registers. Each of those is covered by a
+      prxregset_info_t, and then finally there are different data payloads that
+      represent each extended register.
+
+      The number of different informational entries varies from system to
+      system based on the set of architectural features that the system
+      supports and the corresponding OS enablement for them. This structure is
+      built around the idea of the x86 xsave structure. That is, there is a
+      central header which describes a bit-vector of what extended features are
+      present and have valid state.
+
+      Each x86 xregs file begins with the prxregset_hdr_t which looks like:
+
+          typedef struct prxregset_hdr {
+              uint32_t pr_type;
+              uint32_t pr_size;
+              uint32_t pr_flags;
+              uint32_t pr_pad[4];
+              uint32_t pr_ninfo;
+              prxregset_info_t pr_info[];
+          } prxregset_hdr_t;
+
+      The pr_type member is always set to PR_TYPE_XSAVE. This is used to
+      indicate the type of file that is present. There may be different file
+      types in the future on x86 so this value should always be checked. If it
+      is not PR_TYPE_XSAVE then the rest of the structure may look different.
+      The pr_size member indicates the size in bytes of the overall structure.
+      The pr_flags and pr_pad values are currently reserved for future use.
+      They will be set to zero right now when read and must be set to zero when
+      writing the data. The pr_ninfo member indicates the number of
+      informational items that are present in pr_info. There will be one
+      informational item for each register set that exists.
+
+      The pr_info member points to an array of informational members. These
+      immediately follow the structure, though the pr_info member may not be
+      available directly if not in an environment compatible with some C99
+      features. Each prxregset_info_t structure looks like:
+
+          typedef struct prxregset_info {
+              uint32_t pri_type;
+              uint32_t pri_flags;
+              uint32_t pri_size;
+              uint32_t pri_offset;
+          } prxregset_info_t;
+
+      The pri_type member is used to indicate the type of data and its format
+      that this represents. Types are listed below. The pri_flags member is
+      used to indicate future extensions or information about these items.
+      Right now, these are all zero. The pri_size member indicates the size in
+      bytes of the type's data. The pri_offset member indicates the offset to
+      the start of the data section from the beginning of the xregs file. That
+      is, an offset of 0 would be the first byte of the prxregset_hdr_t.
+
+      The following types of structures and their corresponding data structures
+      are currently defined:
+
+      PRX_INFO_XCR - prxregset_xcr_t
+              This structure provides read-only access to understanding the
+              CPU's settings for this thread. In particular, it lets you see
+              what is set in the x86 %xcr0 register which is the extended
+              feature control register and controls what extended features the
+              CPU actually uses. It also contains the x86 extended feature
+              disable MSR which controls features that are ignored. The
+              prxregset_xcr_t looks like:
+
+                  typedef struct prxregset_xcr {
+                      uint64_t prx_xcr_xcr0;
+                      uint64_t prx_xcr_xfd;
+                      uint64_t prx_xcr_pad[2];
+                  } prxregset_xcr_t;
+
+              When setting the xregs, this entry can be left out. If it is
+              included, it must match the existing entries, otherwise an error
+              will be generated.
+
+      PRX_INFO_XSAVE - prxregset_xsave_t
+              This structure represents the same as the actual Intel xsave
+              structure, which has both the traditional XMM state that comes
+              from the fxsave instruction and then also contains the xsave
+              header itself. The structure varies between 32-bit and 64-bit
+              applications. The structure itself looks like:
+
+                  typedef struct prxregset_xsave {
+                      uint16_t prx_fx_fcw;
+                      uint16_t prx_fx_fsw;
+                      uint16_t prx_fx_fctw; /* compressed tag word */
+                      uint16_t prx_fx_fop;
+                  #if defined(__amd64)
+                      uint64_t prx_fx_rip;
+                      uint64_t prx_fx_rdp;
+                  #else
+                      uint32_t prx_fx_eip;
+                      uint16_t prx_fx_cs;
+                      uint16_t __prx_fx_ign0;
+                      uint32_t prx_fx_dp;
+                      uint16_t prx_fx_ds;
+                      uint16_t __prx_fx_ign1;
+                  #endif
+                      uint32_t prx_fx_mxcsr;
+                      uint32_t prx_fx_mxcsr_mask;
+                      union {
+                          uint16_t prx_fpr_16[5]; /* 80-bits of x87 state */
+                          u_longlong_t prx_fpr_mmx; /* 64-bit mmx register */
+                          uint32_t _prx__fpr_pad[4]; /* (pad out to 128-bits) */
+                      } fx_st[8];
+                  #if defined(__amd64)
+                      upad128_t prx_fx_xmm[16]; /* 128-bit registers */
+                      upad128_t __prx_fx_ign2[6];
+                  #else
+                      upad128_t prx_fx_xmm[8]; /* 128-bit registers */
+                      upad128_t __prx_fx_ign2[14];
+                  #endif
+                      uint64_t prx_xsh_xstate_bv;
+                      uint64_t prx_xsh_xcomp_bv;
+                      uint64_t prx_xsh_reserved[6];
+                  } prxregset_xsave_t;
+
+              In the classical fxsave portion of the structure, most of the
+              members follow the same meaning and match their presence in the
+              fpregs file and their use as discussed in the Intel and AMD
+              software developer manuals. The one exception is that when
+              setting the prx_fx_mxcsr member reserved bits that are set will
+              be masked off and ignored.
+
+              The most notable fields to consider here right now are the last
+              few members which are part of the xsave header itself. In
+              particular, the prx_xsh_xstate_bv component is used to track the
+              actual features whose content is valid. When reading the
+              registers, if a given entry is not valid, the register state will
+              write out the informational entry in its default state. When
+              setting the extended registers, this notes which features will be
+              loaded from their default state (as defined by Intel and AMD's
+              manuals) and which will be loaded from the informational entries.
+              If a bit is set in the prx_xsh_xstate_bv entry, then it must be
+              present as its own informational entry otherwise a write will
+              fail. If an informational entry is present in a write, but not
+              set in the prx_xsh_xstate_bv then its contents will be ignored.
+
+              The xregs format currently does not support any compressed items
+              being specified nor does it specify any, so the prx_xsh_xcomp_bv
+              member will be always set to zero and it and the reserved members
+              prx_xsh_reserved must all be left as zero.
+
+      PRX_INFO_YMM - prxregset_ymm_t
+              This structure contains the upper 128-bits of the first 16 %ymm
+              registers (8 for 32-bit applications). To construct a full
+              vector register, it must be combined with the prx_fx_xmm member
+              of the PRX_INFO_XSAVE data. In 32-bit applications, the reserved
+              registers must be written as zero. The structure itself looks
+              like:
+
+                  typedef struct prxregset_ymm {
+                  #if defined(__amd64)
+                      upad128_t prx_ymm[16];
+                  #else
+                      upad128_t prx_ymm[8];
+                      upad128_t prx_rsvd[8];
+                  #endif
+                  } prxregset_ymm_t;
+
+      PRX_INFO_OPMASK - prxregset_opmask_t
+              This structure represents one portion of Intel's AVX-512 state:
+              the 8 64-bit mask registers, %k0 through %k7. The structure
+              looks like:
+
+                  typedef struct prxregset_opmask {
+                      uint64_t prx_opmask[8];
+                  } prxregset_opmask_t;
+
+      PRX_INFO_ZMM - prxregset_zmm_t
+              This structure represents one portion of Intel's AVX-512 state:
+              the upper 256 bits of the 512-bit %zmm0 through %zmm15 registers.
+              Bits 0-127 are found in the prx_fx_xmm member of the
+              PRX_INFO_XSAVE data and bits 128-255 are found in the prx_ymm
+              member of the PRX_INFO_YMM. 32-bit applications only have access
+              to %zmm0 through %zmm7. This structure looks like:
+
+                  typedef struct prxregset_zmm {
+                  #if defined(__amd64)
+                      upad256_t prx_zmm[16];
+                  #else
+                      upad256_t prx_zmm[8];
+                      upad256_t prx_rsvd[8];
+                  #endif
+                  } prxregset_zmm_t;
+
+      PRX_INFO_HI_ZMM - prxregset_hi_zmm_t
+              This structure represents the third portion of Intel's AVX-512
+              state: the additional 16 512-bit registers that are available to
+              64-bit applications, but not 32-bit applications. This
+              represents %zmm16 through %zmm31. This structure looks like:
+
+                  typedef struct prxregset_hi_zmm {
+                  #if defined(__amd64)
+                      upad512_t prx_hi_zmm[16];
+                  #else
+                      upad512_t prx_rsvd[16];
+                  #endif
+                  } prxregset_hi_zmm_t;
+
+              Unlike the other lower %zmm registers of %zmm0 through %zmm15, this contains the
+              entire 512-bit register in one spot and there is no need to look at other
+              information items to reconstitute the entire vector.
+
+              When setting the extended registers, at least the PRX_INFO_XSAVE
+              component must be present. None of the component offsets may
+              overlap with the prxregset_hdr_t or any of the prxregset_info_t
+              structures. In the written data file, it is expected that the
+              various structures start with their naturally expected alignment,
+              which is most often 16 bytes (that is the value that the C
+              alignof() keyword will return). The structures that we use are
+              all multiples of 16 bytes to make this easier. The kernel will
+              write out structures with a greater alignment such that the
+              portions of registers are aligned and safely usable with
+              instructions that move aligned integers such as vmovdqa64.
+
  CONTROL MESSAGES
       Process state changes are effected through messages written to a process's
       ctl file or to an individual lwp's lwpctl file. All control messages
       consist of a long that names the specific operation followed by additional
       data containing the operand, if any.
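
As a reading aid for the layout added in this hunk, the following is a minimal sketch of walking an xregs image to locate one payload. It checks pr_type, bounds the informational array by pr_size, and treats pri_offset as relative to the first byte of the header, as the text describes. Everything prefixed ex_ or EX_ is hypothetical and exists only for this illustration; in particular the numeric values given to EX_PR_TYPE_XSAVE and EX_PRX_INFO_XSAVE are placeholders, and a real consumer would use the types and constants from the system headers instead of these local mirrors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the layouts in the text; not the system definitions. */
typedef struct ex_info {
	uint32_t pri_type;
	uint32_t pri_flags;
	uint32_t pri_size;
	uint32_t pri_offset;	/* measured from the first byte of the header */
} ex_info_t;

typedef struct ex_hdr {
	uint32_t pr_type;
	uint32_t pr_size;
	uint32_t pr_flags;
	uint32_t pr_pad[4];
	uint32_t pr_ninfo;
	/* pr_ninfo ex_info_t entries follow immediately */
} ex_hdr_t;

/* Placeholder values chosen for this sketch only. */
#define	EX_PR_TYPE_XSAVE	1u
#define	EX_PRX_INFO_XSAVE	2u

/*
 * Return a pointer to the payload of the first entry of the given type
 * and store its size in *sizep. Returns NULL if the image is malformed
 * or has no such entry.
 */
static const void *
ex_find_payload(const void *buf, size_t len, uint32_t type, uint32_t *sizep)
{
	const uint8_t *base = buf;
	ex_hdr_t hdr;
	ex_info_t info;
	size_t infoend;
	uint32_t i;

	if (len < sizeof (hdr))
		return (NULL);
	memcpy(&hdr, base, sizeof (hdr));
	if (hdr.pr_type != EX_PR_TYPE_XSAVE)
		return (NULL);
	if (hdr.pr_size < sizeof (hdr) || hdr.pr_size > len)
		return (NULL);
	if (hdr.pr_ninfo > (hdr.pr_size - sizeof (hdr)) / sizeof (info))
		return (NULL);
	infoend = sizeof (hdr) + (size_t)hdr.pr_ninfo * sizeof (info);

	for (i = 0; i < hdr.pr_ninfo; i++) {
		memcpy(&info, base + sizeof (hdr) + i * sizeof (info),
		    sizeof (info));
		if (info.pri_type != type)
			continue;
		/* Payloads may not overlap the header or the info array. */
		if (info.pri_offset < infoend ||
		    info.pri_offset > hdr.pr_size ||
		    info.pri_size > hdr.pr_size - info.pri_offset)
			return (NULL);
		*sizep = info.pri_size;
		return (base + info.pri_offset);
	}
	return (NULL);
}
```

The header and entries are copied out with memcpy() rather than cast in place so that the sketch makes no assumption about the alignment of the caller's buffer.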
*** 1275,1290 ****
       When applied to the process control file, PCSTOP directs all lwps to stop
       and waits for them to stop, PCDSTOP directs all lwps to stop without
       waiting for them to stop, and PCWSTOP simply waits for all lwps to stop.
       When applied to an lwp control file, PCSTOP directs the specific lwp to
       stop and waits until it has stopped, PCDSTOP directs the specific lwp to
!      stop without waiting for it to stop, and PCWSTOP
!      simply waits for the specific lwp to stop. When applied to an lwp
!      control file, PCSTOP and PCWSTOP complete when the lwp stops on an event
!      of interest, immediately if already so stopped; when applied to the
!      process control file, they complete when every lwp has stopped either on
!      an event of interest or on a PR_SUSPENDED stop.

       PCTWSTOP is identical to PCWSTOP except that it enables the operation to
       time out, to avoid waiting forever for a process or lwp that may never
       stop on an event of interest. PCTWSTOP takes a long operand specifying a
       number of milliseconds; the wait will terminate successfully after the
--- 1499,1514 ----
       When applied to the process control file, PCSTOP directs all lwps to stop
       and waits for them to stop, PCDSTOP directs all lwps to stop without
       waiting for them to stop, and PCWSTOP simply waits for all lwps to stop.
       When applied to an lwp control file, PCSTOP directs the specific lwp to
       stop and waits until it has stopped, PCDSTOP directs the specific lwp to
!      stop without waiting for it to stop, and PCWSTOP simply waits for the
!      specific lwp to stop. When applied to an lwp control file, PCSTOP and
!      PCWSTOP complete when the lwp stops on an event of interest, immediately
!      if already so stopped; when applied to the process control file, they
!      complete when every lwp has stopped either on an event of interest or on
!      a PR_SUSPENDED stop.

       PCTWSTOP is identical to PCWSTOP except that it enables the operation to
       time out, to avoid waiting forever for a process or lwp that may never
       stop on an event of interest. PCTWSTOP takes a long operand specifying a
       number of milliseconds; the wait will terminate successfully after the
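
To make the message format in this hunk concrete: the text says a control message is a long naming the operation followed by its operand, and that PCTWSTOP takes a long number of milliseconds. The sketch below packs exactly that pair into a caller-supplied buffer. The helper ex_build_twstop() is hypothetical, and EX_PCTWSTOP is a placeholder value standing in for the real PCTWSTOP constant from the system headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder; real code would use PCTWSTOP from the system headers. */
#define	EX_PCTWSTOP	4L

/*
 * Pack a "wait for stop with timeout" message: one long naming the
 * operation, then one long operand holding the timeout in milliseconds.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t
ex_build_twstop(void *buf, size_t len, long timeout_ms)
{
	long msg[2];

	msg[0] = EX_PCTWSTOP;
	msg[1] = timeout_ms;
	if (len < sizeof (msg))
		return (0);
	memcpy(buf, msg, sizeof (msg));
	return (sizeof (msg));
}
```

A caller would then hand the returned byte count to a single write(2) on an open ctl or lwpctl descriptor. Per the text, a timeout of zero makes the operation behave like PCWSTOP.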
*** 1292,1310 ****
       stopped; a timeout value of zero makes the operation identical to
       PCWSTOP.

       An "event of interest" is either a PR_REQUESTED stop or a stop that has
       been specified in the process's tracing flags (set by PCSTRACE, PCSFAULT,
!      PCSENTRY, and PCSEXIT). PR_JOBCONTROL
!      and PR_SUSPENDED stops are specifically not events of interest. (An lwp
!      may stop twice due to a stop signal, first showing PR_SIGNALLED if the
!      signal is traced and again showing PR_JOBCONTROL if the lwp is set
!      running without clearing the signal.) If PCSTOP or PCDSTOP is applied to
!      an lwp that is stopped, but not on an event of interest, the stop
!      directive takes effect when the lwp is restarted by the competing
!      mechanism. At that time, the lwp enters a PR_REQUESTED stop before
!      executing any user-level code.

       A write of a control message that blocks is interruptible by a signal so
       that, for example, an alarm(2) can be set to avoid waiting forever for a
       process or lwp that may never stop on an event of interest. If PCSTOP is
       interrupted, the lwp stop directives remain in effect even though the
--- 1516,1533 ----
       stopped; a timeout value of zero makes the operation identical to
       PCWSTOP.

       An "event of interest" is either a PR_REQUESTED stop or a stop that has
       been specified in the process's tracing flags (set by PCSTRACE, PCSFAULT,
!      PCSENTRY, and PCSEXIT). PR_JOBCONTROL and PR_SUSPENDED stops are
!      specifically not events of interest. (An lwp may stop twice due to a
!      stop signal, first showing PR_SIGNALLED if the signal is traced and again
!      showing PR_JOBCONTROL if the lwp is set running without clearing the
!      signal.) If PCSTOP or PCDSTOP is applied to an lwp that is stopped, but
!      not on an event of interest, the stop directive takes effect when the lwp
!      is restarted by the competing mechanism. At that time, the lwp enters a
!      PR_REQUESTED stop before executing any user-level code.

       A write of a control message that blocks is interruptible by a signal so
       that, for example, an alarm(2) can be set to avoid waiting forever for a
       process or lwp that may never stop on an event of interest. If PCSTOP is
       interrupted, the lwp stop directives remain in effect even though the
*** 1674,1685 ****

     PCSXREG
       Set the extra state registers for the specific or representative lwp
       according to the architecture-dependent operand prxregset_t structure.
       An error (EINVAL) is returned if the system does not support extra state
!      registers. PCSXREG fails with EBUSY if the lwp is not stopped on an
!      event of interest.

     PCSASRS
       Set the ancillary state registers for the specific or representative lwp
       according to the SPARC V9 platform-dependent operand asrset_t structure.
       An error (EINVAL) is returned if either the target process or the
--- 1897,1908 ----

     PCSXREG
       Set the extra state registers for the specific or representative lwp
       according to the architecture-dependent operand prxregset_t structure.
       An error (EINVAL) is returned if the system does not support extra state
!      registers or the register state is invalid. PCSXREG fails with EBUSY if
!      the lwp is not stopped on an event of interest.

     PCSASRS
       Set the ancillary state registers for the specific or representative lwp
       according to the SPARC V9 platform-dependent operand asrset_t structure.
       An error (EINVAL) is returned if either the target process or the
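
One way register state can be invalid follows from the prx_xsh_xstate_bv rule in the x86 section: a bit set there must have a matching informational entry, and PRX_INFO_XSAVE itself must always be present. The sketch below is a pre-flight check a tool might run before issuing PCSXREG. The bit positions are the XSAVE state-component numbers from the Intel manuals; pairing them with the YMM, OPMASK, ZMM and HI_ZMM entries is this sketch's assumption, not something the text states. The EX_HAVE_ flags and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* XSAVE state-component bits as numbered in the Intel manuals. */
#define	EX_XFEATURE_AVX		(1ULL << 2)
#define	EX_XFEATURE_OPMASK	(1ULL << 5)
#define	EX_XFEATURE_ZMM		(1ULL << 6)
#define	EX_XFEATURE_HI_ZMM	(1ULL << 7)

/* Hypothetical flags recording which entries the caller is writing. */
#define	EX_HAVE_XSAVE	0x01u
#define	EX_HAVE_YMM	0x02u
#define	EX_HAVE_OPMASK	0x04u
#define	EX_HAVE_ZMM	0x08u
#define	EX_HAVE_HI_ZMM	0x10u

/*
 * Return true if every feature marked valid in xstate_bv has its own
 * entry and the mandatory XSAVE entry is present. Entries whose bit is
 * clear are tolerated because the text says their contents are ignored.
 */
static bool
ex_xregs_entries_ok(uint64_t xstate_bv, unsigned int have)
{
	static const struct {
		uint64_t bit;
		unsigned int need;
	} map[] = {
		{ EX_XFEATURE_AVX, EX_HAVE_YMM },
		{ EX_XFEATURE_OPMASK, EX_HAVE_OPMASK },
		{ EX_XFEATURE_ZMM, EX_HAVE_ZMM },
		{ EX_XFEATURE_HI_ZMM, EX_HAVE_HI_ZMM },
	};
	size_t i;

	if ((have & EX_HAVE_XSAVE) == 0)
		return (false);
	for (i = 0; i < sizeof (map) / sizeof (map[0]); i++) {
		if ((xstate_bv & map[i].bit) != 0 &&
		    (have & map[i].need) == 0)
			return (false);
	}
	return (true);
}
```

This covers only the entry-presence rule. The kernel may reject a write for other reasons, such as nonzero reserved fields or a PRX_INFO_XCR entry that does not match the existing values.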