12482 Have /usr/bin/awk point to /usr/bin/nawk
Reviewed by: Peter Tribble <peter.tribble@gmail.com>
Reviewed by: Toomas Soome <tsoome@me.com>

          --- old/usr/src/man/man1/awk.1
          +++ new/usr/src/man/man1/awk.1
33 lines elided
  34   34  .\" and limitations under the License.
  35   35  .\"
  36   36  .\" When distributing Covered Code, include this CDDL HEADER in each
  37   37  .\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  38   38  .\" If applicable, add the following below this CDDL HEADER, with the
  39   39  .\" fields enclosed by brackets "[]" replaced with your own identifying
  40   40  .\" information: Portions Copyright [yyyy] [name of copyright owner]
  41   41  .\"
  42   42  .\"
  43   43  .\" Copyright 1989 AT&T
  44      -.\" Portions Copyright (c) 1992, X/Open Company Limited.  All Rights Reserved.
  45      -.\" Copyright (c) 2005, Sun Microsystems, Inc.  All Rights Reserved
       44 +.\" Copyright 1992, X/Open Company Limited  All Rights Reserved
       45 +.\" Portions Copyright (c) 2005, 2006 Sun Microsystems, Inc. All Rights Reserved
       46 +.\" Copyright 2020 Joyent, Inc.
  46   47  .\"
  47      -.TH AWK 1 "Jun 22, 2005"
       48 +.TH AWK 1 "Apr 20, 2020"
  48   49  .SH NAME
  49   50  awk \- pattern scanning and processing language
  50   51  .SH SYNOPSIS
       52 +.nf
       53 +\fB/usr/bin/awk\fR [\fB-F\fR \fIERE\fR] [\fB-v\fR \fIassignment\fR] \fI\&'program'\fR | \fB-f\fR \fIprogfile\fR...
       54 +     [\fIargument\fR]...
       55 +.fi
       56 +
  51   57  .LP
  52   58  .nf
  53      -\fB/usr/bin/awk\fR [\fB-f\fR \fIprogfile\fR] [\fB-F\fIc\fR\fR] [' \fIprog\fR '] [\fIparameters\fR]
  54      -     [\fIfilename\fR]...
       59 +\fB/usr/bin/nawk\fR [\fB-F\fR \fIERE\fR] [\fB-v\fR \fIassignment\fR] \fI\&'program'\fR | \fB-f\fR \fIprogfile\fR...
       60 +     [\fIargument\fR]...
  55   61  .fi
  56   62  
  57   63  .LP
  58   64  .nf
  59      -\fB/usr/xpg4/bin/awk\fR [\fB-F\fR\fIcERE\fR] [\fB-v\fR \fIassignment\fR]... \fI\&'program'\fR \fB-f\fR \fIprogfile\fR...
       65 +\fB/usr/xpg4/bin/awk\fR [\fB-F\fR \fIERE\fR] [\fB-v\fR \fIassignment\fR]... \fI\&'program'\fR | \fB-f\fR \fIprogfile\fR...
  60   66       [\fIargument\fR]...
  61   67  .fi
  62   68  
  63   69  .SH DESCRIPTION
  64      -.sp
       70 +NOTE: The \fBnawk\fR command is now the system default awk for illumos.
  65   71  .LP
  66      -The \fB/usr/xpg4/bin/awk\fR utility is described on the \fBnawk\fR(1) manual
  67      -page.
       72 +The \fB/usr/bin/awk\fR and \fB/usr/xpg4/bin/awk\fR utilities execute
       73 +\fIprogram\fRs written in the \fBawk\fR programming language, which is
       74 +specialized for textual data manipulation. An \fBawk\fR \fIprogram\fR is a
       75 +sequence of patterns and corresponding actions. The string specifying
       76 +\fIprogram\fR must be enclosed in single quotes (') to protect it from
       77 +interpretation by the shell. The sequence of pattern-action statements can be
       78 +specified in the command line as \fIprogram\fR or in one or more files
       79 +specified by the \fB-f\fR \fIprogfile\fR option. When input is read that matches
       80 +a pattern, the action associated with the pattern is performed.
  68   81  .sp
  69   82  .LP
  70      -The \fB/usr/bin/awk\fR utility scans each input \fIfilename\fR for lines that
  71      -match any of a set of patterns specified in \fIprog\fR. The \fIprog\fR string
  72      -must be enclosed in single quotes (\fB a\'\fR) to protect it from the shell.
  73      -For each pattern in \fIprog\fR there can be an associated action performed when
  74      -a line of a \fIfilename\fR matches the pattern. The set of pattern-action
  75      -statements can appear literally as \fIprog\fR or in a file specified with the
  76      -\fB-f\fR\fI progfile\fR option. Input files are read in order; if there are no
  77      -files, the standard input is read. The file name \fB\&'\(mi'\fR means the
  78      -standard input.
  79      -.SH OPTIONS
       83 +Input is interpreted as a sequence of records. By default, a record is a line,
       84 +but this can be changed by using the \fBRS\fR built-in variable. Each record of
       85 +input is matched to each pattern in the \fIprogram\fR. For each pattern
       86 +matched, the associated action is executed.
  80   87  .sp
  81   88  .LP
       89 +The \fBawk\fR utility interprets each input record as a sequence of fields
       90 +where, by default, a field is a string of non-blank characters. This default
       91 +white-space field delimiter (blanks and/or tabs) can be changed by using the
       92 +\fBFS\fR built-in variable or the \fB-F\fR \fIERE\fR option. The \fBawk\fR
       93 +utility denotes the first field in a record \fB$1\fR, the second \fB$2\fR, and
       94 +so forth. The symbol \fB$0\fR refers to the entire record; setting any other
       95 +field causes the reevaluation of \fB$0\fR. Assigning to \fB$0\fR resets the
       96 +values of all fields and the \fBNF\fR built-in variable.
       97 +
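The field and record behavior described above can be illustrated with a short sketch. It assumes a POSIX-style `awk` on the `PATH` and the default `FS` and `OFS`. The sample input and the shell variable names `rebuilt` and `nf` are made up for illustration.

```shell
# Setting a field causes $0 to be rebuilt from the fields
# (joined by OFS, which is a single space by default).
rebuilt=$(echo 'a b c' | awk '{ $2 = "X"; print $0 }')

# Assigning to $0 re-splits the record and resets NF.
nf=$(echo 'a b c' | awk '{ $0 = "p q"; print NF }')

echo "$rebuilt"
echo "$nf"
```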
       98 +.SH OPTIONS
  82   99  The following options are supported:
  83  100  .sp
  84  101  .ne 2
  85  102  .na
  86      -\fB\fB-f\fR\fI progfile\fR \fR
      103 +\fB\fB-F\fR \fIERE\fR\fR
  87  104  .ad
  88      -.RS 16n
  89      -\fBawk\fR uses the set of patterns it reads from \fIprogfile\fR.
      105 +.RS 17n
      106 +Define the input field separator to be the extended regular expression
      107 +\fIERE\fR, before any input is read (can be a character).
  90  108  .RE
  91  109  
  92  110  .sp
  93  111  .ne 2
  94  112  .na
  95      -\fB\fB-F\fR\fIc\fR \fR
      113 +\fB\fB-f\fR \fIprogfile\fR\fR
  96  114  .ad
  97      -.RS 16n
  98      -Uses the character \fIc\fR as the field separator (FS) character.  See the
  99      -discussion of \fBFS\fR below.
      115 +.RS 17n
      116 +Specifies the pathname of the file \fIprogfile\fR containing an \fBawk\fR
      117 +program. If multiple instances of this option are specified, the concatenation
      118 +of the files specified as \fIprogfile\fR in the order specified is the
      119 +\fBawk\fR program. The \fBawk\fR program can alternatively be specified in
      120 +the command line as a single argument.
 100  121  .RE
 101  122  
 102      -.SH USAGE
 103      -.SS "Input Lines"
 104  123  .sp
      124 +.ne 2
      125 +.na
      126 +\fB\fB-v\fR \fIassignment\fR\fR
      127 +.ad
      128 +.RS 17n
      129 +The \fIassignment\fR argument must be in the same form as an \fIassignment\fR
      130 +operand. The assignment is of the form \fIvar=value\fR, where \fIvar\fR is the
      131 +name of one of the variables described below. The specified assignment occurs
      132 +before executing the \fBawk\fR program, including the actions associated with
      133 +\fBBEGIN\fR patterns (if any). Multiple occurrences of this option can be
      134 +specified.
      135 +.RE
      136 +
      137 +.sp
      138 +.ne 2
      139 +.na
      140 +\fB\fB-safe\fR\fR
      141 +.ad
      142 +.RS 17n
      143 +When passed to \fBawk\fR, this flag will prevent the program from opening new
      144 +files or running child processes. The \fBENVIRON\fR array will also not be
      145 +initialized.
      146 +.RE
      147 +
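As a rough illustration of the `-F` and `-v` options described above, the following sketch uses a colon as the field separator and passes a variable that is already set when `BEGIN` runs. The input line and the variable name `name` are hypothetical.

```shell
# -F sets the input field separator before any input is read.
fields=$(printf 'root:x:0\n' | awk -F: '{ print $1, $3 }')

# -v assignments take effect before BEGIN actions are executed.
greeting=$(awk -v name=world 'BEGIN { print "hello, " name }')

echo "$fields"
echo "$greeting"
```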
      148 +.SH OPERANDS
      149 +The following operands are supported:
      150 +.sp
      151 +.ne 2
      152 +.na
      153 +\fB\fIprogram\fR\fR
      154 +.ad
      155 +.RS 12n
      156 +If no \fB-f\fR option is specified, the first operand to \fBawk\fR is the text
      157 +of the \fBawk\fR program. The application supplies the \fIprogram\fR operand
      158 +as a single argument to \fBawk\fR. If the text does not end in a newline
      159 +character, \fBawk\fR interprets the text as if it did.
      160 +.RE
      161 +
      162 +.sp
      163 +.ne 2
      164 +.na
      165 +\fB\fIargument\fR\fR
      166 +.ad
      167 +.RS 12n
      168 +Either of the following two types of \fIargument\fR can be intermixed:
      169 +.sp
      170 +.ne 2
      171 +.na
      172 +\fB\fIfile\fR\fR
      173 +.ad
      174 +.RS 14n
      175 +A pathname of a file that contains the input to be read, which is matched
      176 +against the set of patterns in the program. If no \fIfile\fR operands are
      177 +specified, or if a \fIfile\fR operand is \fB\(mi\fR, the standard input is
      178 +used.
      179 +.RE
      180 +
      181 +.sp
      182 +.ne 2
      183 +.na
      184 +\fB\fIassignment\fR\fR
      185 +.ad
      186 +.RS 14n
      187 +An operand that begins with an underscore or alphabetic character from the
      188 +portable character set, followed by a sequence of underscores, digits and
      189 +alphabetics from the portable character set, followed by the \fB=\fR character
      190 +specifies a variable assignment rather than a pathname. The characters before
      191 +the \fB=\fR represent the name of an \fBawk\fR variable. If that name is an
      192 +\fBawk\fR reserved word, the behavior is undefined. The characters following
      193 +the equal sign are interpreted as if they appeared in the \fBawk\fR program
      194 +preceded and followed by a double-quote (\fB"\fR) character, as a \fBSTRING\fR
      195 +token, except that if the last character is an unescaped backslash, it is
      196 +interpreted as a literal backslash rather than as the first character of the
      197 +sequence \fB\e"\fR. The variable is assigned the value of that \fBSTRING\fR
      198 +token. If the value is considered a \fInumeric string\fR, the variable is
      199 +assigned its numeric value. Each such variable assignment is performed just
      200 +before the processing of the following \fIfile\fR, if any. Thus, an assignment
      201 +before the first \fIfile\fR argument is executed after the \fBBEGIN\fR actions
      202 +(if any), while an assignment after the last \fIfile\fR argument is executed
      203 +before the \fBEND\fR actions (if any).  If there are no \fIfile\fR arguments,
      204 +assignments are executed before processing the standard input.
      205 +.RE
      206 +
      207 +.RE
      208 +
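The timing of `assignment` operands relative to `BEGIN`, file processing, and `END` can be sketched as follows. This is only an illustration: the temporary file created with `mktemp` and the variable name `v` are assumptions, not part of the manual page.

```shell
# One-line input file (hypothetical) so the main rule runs exactly once.
tmp=$(mktemp)
printf 'line\n' > "$tmp"

# v=1 is applied just before the file is read (after BEGIN);
# v=2 follows the last file, so it is applied before END.
out=$(awk 'BEGIN { print "begin:" v }
                 { print "main:" v }
           END   { print "end:" v }' v=1 "$tmp" v=2)

rm -f "$tmp"
echo "$out"
```

In this sketch `v` is still uninitialized inside `BEGIN`, which is the practical difference from passing the same assignment with `-v`.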
      209 +.SH INPUT FILES
      210 +Input files to the \fBawk\fR program from any of the following sources:
      211 +.RS +4
      212 +.TP
      213 +.ie t \(bu
      214 +.el o
      215 +any \fIfile\fR operands or their equivalents, achieved by modifying the
      216 +\fBawk\fR variables \fBARGV\fR and \fBARGC\fR
      217 +.RE
      218 +.RS +4
      219 +.TP
      220 +.ie t \(bu
      221 +.el o
      222 +standard input in the absence of any \fIfile\fR operands
      223 +.RE
      224 +.RS +4
      225 +.TP
      226 +.ie t \(bu
      227 +.el o
      228 +arguments to the \fBgetline\fR function
      229 +.RE
      230 +.sp
 105  231  .LP
 106      -Each input line is matched against the pattern portion of every pattern-action
 107      -statement; the associated action is performed for each matched pattern. Any
 108      -\fIfilename\fR of the form \fIvar=value\fR is treated as an assignment, not a
 109      -filename, and is executed at the time it would have been opened if it were a
 110      -filename. \fIVariables\fR assigned in this manner are not available inside a
 111      -\fBBEGIN\fR rule, and are assigned after previously specified files have been
 112      -read.
      232 +must be text files. Whether the variable \fBRS\fR is set to a value other than
      233 +a newline character or not, for these files, implementations support records
      234 +terminated with the specified separator up to \fB{LINE_MAX}\fR bytes and can
      235 +support longer records.
 113  236  .sp
 114  237  .LP
 115      -An input line is normally made up of fields separated by white spaces. (This
 116      -default can be changed by using the \fBFS\fR built-in variable or the
 117      -\fB-F\fR\fIc\fR option.) The default is to ignore leading blanks and to
 118      -separate fields by blanks and/or tab characters. However, if \fBFS\fR is
 119      -assigned a value that does not include any of the white spaces, then leading
 120      -blanks are not ignored. The fields are denoted \fB$1\fR, \fB$2\fR,
 121      -\fB\&.\|.\|.\fR\|; \fB$0\fR refers to the entire line.
 122      -.SS "Pattern-action Statements"
      238 +If \fB-\fR\fBf\fR \fIprogfile\fR is specified, the files named by each of the
      239 +\fIprogfile\fR option-arguments must be text files containing an \fBawk\fR
      240 +program.
 123  241  .sp
 124  242  .LP
 125      -A pattern-action statement has the form:
      243 +The standard input is used only if no \fIfile\fR operands are specified, or if
      244 +a \fIfile\fR operand is \fB\(mi\fR\&.
      245 +
      246 +.SH EXTENDED DESCRIPTION
      247 +An \fBawk\fR program is composed of pairs of the form:
 126  248  .sp
 127  249  .in +2
 128  250  .nf
 129      -\fIpattern\fR\fB { \fR\fIaction\fR\fB } \fR
      251 +\fIpattern\fR { \fIaction\fR }
 130  252  .fi
 131  253  .in -2
 132      -.sp
 133  254  
 134  255  .sp
 135  256  .LP
 136      -Either pattern or action can be omitted. If there is no action, the matching
 137      -line is printed. If there is no pattern, the action is performed on every input
 138      -line. Pattern-action statements are separated by newlines or semicolons.
      257 +Either the pattern or the action (including the enclosing brace characters) can
      258 +be omitted. Pattern-action statements are separated by a semicolon or by a
      259 +newline.
 139  260  .sp
 140  261  .LP
 141      -Patterns are arbitrary Boolean combinations ( \fB!\fR, ||, \fB&&\fR, and
 142      -parentheses) of relational expressions and regular expressions. A relational
 143      -expression is one of the following:
      262 +A missing pattern matches any record of input, and a missing action is
      263 +equivalent to an action that writes the matched record of input to standard
      264 +output.
 144  265  .sp
      266 +.LP
      267 +Execution of the \fBawk\fR program starts by first executing the actions
      268 +associated with all \fBBEGIN\fR patterns in the order they occur in the
      269 +program. Then each \fIfile\fR operand (or standard input if no files were
      270 +specified) is processed by reading data from the file until a record separator
      271 +is seen (a newline character by default), splitting the current record into
      272 +fields using the current value of \fBFS\fR, evaluating each pattern in the
      273 +program in the order of occurrence, and executing the action associated with
      274 +each pattern that matches the current record. The action for a matching pattern
      275 +is executed before evaluating subsequent patterns. Last, the actions associated
      276 +with all \fBEND\fR patterns are executed in the order they occur in the program.
      277 +
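The execution order described above (all `BEGIN` actions, then each record tested against every pattern in program order, then all `END` actions) can be seen in a small sketch. The two-record input is invented for illustration. The rule with no pattern matches every record, and the rule with no action prints the matching record.

```shell
out=$(printf 'a\nb\n' | awk '
BEGIN   { print "start" }
/a/     { print "matched a: " $0 }
        { print "any: " $0 }
NR == 2
END     { print "records: " NR }')

echo "$out"
```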
      278 +.SS "Expressions in awk"
      279 +Expressions describe computations used in \fIpatterns\fR and \fIactions\fR. In
      280 +the following table, valid expression operations are given in groups from
      281 +highest precedence first to lowest precedence last, with equal-precedence
      282 +operators grouped between horizontal lines. In expression evaluation, where the
      283 +grammar is formally ambiguous, higher precedence operators are evaluated before
      284 +lower precedence operators.  In this table \fIexpr,\fR \fIexpr1,\fR
      285 +\fIexpr2,\fR and \fIexpr3\fR represent any expression, while \fIlvalue\fR
      286 +represents any entity that can be assigned to (that is, on the left side of an
      287 +assignment operator).
      288 +.sp
      289 +
      290 +.sp
      291 +.TS
      292 +c c c c
      293 +l l l l .
      294 +\fBSyntax\fR    \fBName\fR      \fBType of Result\fR    \fBAssociativity\fR
      295 +_
      296 +( \fIexpr\fR )  Grouping        type of \fIexpr\fR      n/a
      297 +_
      298 +$\fIexpr\fR     Field reference string  n/a
      299 +_
      300 +++ \fIlvalue\fR Pre-increment   numeric n/a
      301 +\(mi\(mi \fIlvalue\fR   Pre-decrement   numeric n/a
      302 +\fIlvalue\fR ++ Post-increment  numeric n/a
      303 +\fIlvalue\fR \(mi\(mi   Post-decrement  numeric n/a
      304 +_
      305 +\fIexpr\fR ^ \fIexpr\fR Exponentiation  numeric right
      306 +_
      307 +! \fIexpr\fR    Logical not     numeric n/a
      308 ++ \fIexpr\fR    Unary plus      numeric n/a
      309 +\(mi \fIexpr\fR Unary minus     numeric n/a
      310 +_
      311 +\fIexpr\fR * \fIexpr\fR Multiplication  numeric left
      312 +\fIexpr\fR / \fIexpr\fR Division        numeric left
      313 +\fIexpr\fR % \fIexpr\fR Modulus numeric left
      314 +_
      315 +\fIexpr\fR + \fIexpr\fR Addition        numeric left
      316 +\fIexpr\fR \(mi \fIexpr\fR      Subtraction     numeric left
      317 +_
      318 +\fIexpr\fR \fIexpr\fR   String concatenation    string  left
      319 +_
      320 +\fIexpr\fR < \fIexpr\fR Less than       numeric none
      321 +\fIexpr\fR <= \fIexpr\fR        Less than or equal to   numeric none
      322 +\fIexpr\fR != \fIexpr\fR        Not equal to    numeric none
      323 +\fIexpr\fR == \fIexpr\fR        Equal to        numeric none
      324 +\fIexpr\fR > \fIexpr\fR Greater than    numeric none
      325 +\fIexpr\fR >= \fIexpr\fR        Greater than or equal to        numeric none
      326 +_
      327 +\fIexpr\fR ~ \fIexpr\fR ERE match       numeric none
      328 +\fIexpr\fR !~ \fIexpr\fR        ERE non-match    numeric        none
      329 +_
      330 +\fIexpr\fR in array     Array membership        numeric left
      331 +( \fIindex\fR ) in      Multi-dimension array   numeric left
      332 +    \fIarray\fR     membership
      333 +_
      334 +\fIexpr\fR && \fIexpr\fR        Logical AND     numeric left
      335 +_
      336 +\fIexpr\fR |\|| \fIexpr\fR      Logical OR      numeric left
      337 +_
      338 +\fIexpr1\fR ? \fIexpr2\fR       Conditional expression  type of selected        right
      339 +    : \fIexpr3\fR                  \fIexpr2\fR or \fIexpr3\fR
      340 +_
      341 +\fIlvalue\fR ^= \fIexpr\fR      Exponentiation  numeric right
      342 +        assignment
      343 +\fIlvalue\fR %= \fIexpr\fR      Modulus assignment      numeric right
      344 +\fIlvalue\fR *= \fIexpr\fR      Multiplication  numeric right
      345 +        assignment
      346 +\fIlvalue\fR /= \fIexpr\fR      Division assignment     numeric right
      347 +\fIlvalue\fR +=  \fIexpr\fR     Addition assignment     numeric right
      348 +\fIlvalue\fR \(mi= \fIexpr\fR   Subtraction assignment  numeric right
      349 +\fIlvalue\fR = \fIexpr\fR       Assignment      type of \fIexpr\fR      right
      350 +.TE
      351 +
      352 +.sp
      353 +.LP
      354 +Each expression has either a string value, a numeric value or both. Except as
      355 +stated for specific contexts, the value of an expression is implicitly
      356 +converted to the type needed for the context in which it is used.  A string
      357 +value is converted to a numeric value by the equivalent of the following calls:
      358 +.sp
 145  359  .in +2
 146  360  .nf
 147      -\fIexpression relop expression
 148      -expression matchop regular_expression\fR
      361 +setlocale(LC_NUMERIC, "");
      362 +\fInumeric_value\fR = atof(\fIstring_value\fR);
 149  363  .fi
 150  364  .in -2
 151  365  
 152  366  .sp
 153  367  .LP
 154      -where a \fIrelop\fR is any of the six relational operators in C, and a
 155      -\fImatchop\fR is either \fB~\fR (contains) or \fB!~\fR (does not contain). An
 156      -\fIexpression\fR is an arithmetic expression, a relational expression, the
 157      -special expression
      368 +A numeric value that is exactly equal to the value of an integer is converted
      369 +to a string by the equivalent of a call to the \fBsprintf\fR function with the
      370 +string \fB%d\fR as the \fBfmt\fR argument and the numeric value being converted
      371 +as the first and only \fIexpr\fR argument.  Any other numeric value is
      372 +converted to a string by the equivalent of a call to the \fBsprintf\fR function
      373 +with the value of the variable \fBCONVFMT\fR as the \fBfmt\fR argument and the
      374 +numeric value being converted as the first and only \fIexpr\fR argument.
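A minimal sketch of the number-to-string rule above: integral values are converted as if with `%d`, while other values use `CONVFMT`. The `CONVFMT` value `%.2f` and the variable names are chosen only for illustration, and `LC_ALL=C` is set so the decimal-point character is predictable.

```shell
out=$(LC_ALL=C awk 'BEGIN {
    CONVFMT = "%.2f"
    a = 3.14159; b = 12
    s = a ""      # non-integral value: converted using CONVFMT
    t = b ""      # integral value: converted as if by %d
    print s " " t
}')

echo "$out"
```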
 158  375  .sp
      376 +.LP
      377 +A string value is considered to be a \fInumeric string\fR in the following
      378 +case:
      379 +.RS +4
      380 +.TP
      381 +1.
      382 +Any leading and trailing blank characters are ignored.
      383 +.RE
      384 +.RS +4
      385 +.TP
      386 +2.
      387 +If the first unignored character is a \fB+\fR or \fB\(mi\fR, it is ignored.
      388 +.RE
      389 +.RS +4
      390 +.TP
      391 +3.
      392 +If the remaining unignored characters would be lexically recognized as a
      393 +\fBNUMBER\fR token, the string is considered a \fInumeric string\fR.
      394 +.RE
      395 +.sp
      396 +.LP
      397 +If a \fB\(mi\fR character is ignored in the above steps, the numeric value of
      398 +the \fInumeric string\fR is the negation of the numeric value of the recognized
      399 +\fBNUMBER\fR token. Otherwise the numeric value of the \fInumeric string\fR is
      400 +the numeric value of the recognized \fBNUMBER\fR token. Whether or not a string
      401 +is a \fInumeric string\fR is relevant only in contexts where that term is used
      402 +in this section.
      403 +.sp
      404 +.LP
      405 +When an expression is used in a Boolean context, if it has a numeric value, a
      406 +value of zero is treated as false and any other value is treated as true.
      407 +Otherwise, a string value of the null string is treated as false and any other
      408 +value is treated as true. A Boolean context is one of the following:
      409 +.RS +4
      410 +.TP
      411 +.ie t \(bu
      412 +.el o
      413 +the first subexpression of a conditional expression.
      414 +.RE
      415 +.RS +4
      416 +.TP
      417 +.ie t \(bu
      418 +.el o
      419 +an expression operated on by logical NOT, logical AND, or logical OR.
      420 +.RE
      421 +.RS +4
      422 +.TP
      423 +.ie t \(bu
      424 +.el o
      425 +the second expression of a \fBfor\fR statement.
      426 +.RE
      427 +.RS +4
      428 +.TP
      429 +.ie t \(bu
      430 +.el o
      431 +the expression of an \fBif\fR statement.
      432 +.RE
      433 +.RS +4
      434 +.TP
      435 +.ie t \(bu
      436 +.el o
      437 +the expression of the \fBwhile\fR clause in either a \fBwhile\fR or \fBdo\fR
      438 +\fB\&.\|.\|.\fR \fBwhile\fR statement.
      439 +.RE
      440 +.RS +4
      441 +.TP
      442 +.ie t \(bu
      443 +.el o
      444 +an expression used as a pattern (as in Overall Program Structure).
      445 +.RE
      446 +.sp
      447 +.LP
      448 +The \fBawk\fR language supplies arrays that are used for storing numbers or
      449 +strings. Arrays need not be declared. They are initially empty, and their sizes
      450 +change dynamically. The subscripts, or element identifiers, are strings,
      451 +providing a type of associative array capability. An array name followed by a
      452 +subscript within square brackets can be used as an \fIlvalue\fR and as an
      453 +expression, as described in the grammar.  Unsubscripted array names are used in
      454 +only the following contexts:
      455 +.RS +4
      456 +.TP
      457 +.ie t \(bu
      458 +.el o
      459 +a parameter in a function definition or function call.
      460 +.RE
      461 +.RS +4
      462 +.TP
      463 +.ie t \(bu
      464 +.el o
      465 +the \fBNAME\fR token following any use of the keyword \fBin\fR.
      466 +.RE
      467 +.sp
      468 +.LP
      469 +A valid array \fIindex\fR consists of one or more comma-separated expressions,
      470 +similar to the way in which multi-dimensional arrays are indexed in some
      471 +programming languages. Because \fBawk\fR arrays are really one-dimensional,
      472 +such a comma-separated list is converted to a single string by concatenating
      473 +the string values of the separate expressions, each separated from the other by
      474 +the value of the \fBSUBSEP\fR variable.
      475 +.sp
      476 +.LP
      477 +Thus, the following two index operations are equivalent:
      478 +.sp
 159  479  .in +2
 160  480  .nf
 161      -\fIvar \fRin \fIarray\fR
      481 +var[expr1, expr2, ... exprn]
      482 +var[expr1 SUBSEP expr2 SUBSEP ... SUBSEP exprn]
 162  483  .fi
 163  484  .in -2
 164  485  
 165  486  .sp
 166  487  .LP
 167      -or a Boolean combination of these.
      488 +A multi-dimensioned \fIindex\fR used with the \fBin\fR operator must be put in
      489 +parentheses. The \fBin\fR operator, which tests for the existence of a
      490 +particular array element, does not create the element if it does not exist.
      491 +Any other reference to a non-existent array element automatically creates it.
      492 +
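The array rules above can be sketched as follows: a comma-separated index is the same key as the expressions joined by `SUBSEP`, the `in` operator does not create elements, and any other reference does. The array name `grid` and the helper variables are hypothetical.

```shell
out=$(awk 'BEGIN {
    grid[1, 2] = "x"
    if ((1, 2) in grid)        print "has (1,2)"
    if ((1 SUBSEP 2) in grid)  print "same key via SUBSEP"
    if (!((3, 4) in grid))     print "no (3,4)"

    n = 0; for (k in grid) n++
    print "elements: " n       # the in tests above created nothing

    tmp = grid[3, 4]           # a plain reference creates the element
    n = 0; for (k in grid) n++
    print "elements: " n
}')

echo "$out"
```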
      493 +.SS "Variables and Special Variables"
      494 +Variables can be used in an \fBawk\fR program by referencing them. With the
      495 +exception of function parameters, they are not explicitly declared.
      496 +Uninitialized scalar variables and array elements have both a numeric value of
      497 +zero and a string value of the empty string.
 168  498  .sp
 169  499  .LP
 170      -Regular expressions are as in \fBegrep\fR(1). In patterns they must be
 171      -surrounded by slashes. Isolated regular expressions in a pattern apply to the
 172      -entire line. Regular expressions can also occur in relational expressions. A
 173      -pattern can consist of two patterns separated by a comma; in this case, the
 174      -action is performed for all lines between the occurrence of the first pattern
 175      -to the occurrence of the second pattern.
      500 +Field variables are designated by a \fB$\fR followed by a number or numerical
      501 +expression. The effect of the field number \fIexpression\fR evaluating to
      502 +anything other than a non-negative integer is unspecified. Uninitialized
      503 +variables or string values need not be converted to numeric values in this
      504 +context. New field variables are created by assigning a value to them.
      505 +References to non-existent fields (that is, fields after \fB$NF\fR) produce the
      506 +null string. However, assigning to a non-existent field (for example,
      507 +\fB$(NF+2) = 5\fR) increases the value of \fBNF\fR, creates any intervening
      508 +fields with the null string as their values and causes the value of \fB$0\fR to
      509 +be recomputed, with the fields being separated by the value of \fBOFS\fR. Each
      510 +field variable has a string value when created. If the string, with any
      511 +occurrence of the decimal-point character from the current locale changed to a
      512 +period character, is considered a \fInumeric string\fR (see \fBExpressions in
      513 +awk\fR above), the field variable also has the numeric value of the \fInumeric
      514 +string\fR.
      515 +
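The effect of assigning to a non-existent field can be sketched with the `$(NF+2) = 5` case mentioned above. The input record and the choice of `-` for `OFS` are assumptions made so the rebuilt `$0` is easy to read.

```shell
out=$(echo 'a b' | awk 'BEGIN { OFS = "-" }
{
    $(NF + 2) = 5    # NF grows from 2 to 4; field 3 is created as the null string
    print NF
    print $0         # $0 is recomputed with fields joined by OFS
}')

echo "$out"
```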
      516 +.SS "/usr/bin/awk, /usr/xpg4/bin/awk"
      517 +\fBawk\fR sets the following special variables that are supported by both
      518 +\fB/usr/bin/awk\fR and \fB/usr/xpg4/bin/awk\fR:
 176  519  .sp
 177      -.LP
 178      -The special patterns \fBBEGIN\fR and \fBEND\fR can be used to capture control
 179      -before the first input line has been read and after the last input line has
 180      -been read respectively. These keywords do not combine with any other patterns.
 181      -.SS "Built-in Variables"
      520 +.ne 2
      521 +.na
      522 +\fB\fBARGC\fR\fR
      523 +.ad
      524 +.RS 12n
      525 +The number of elements in the \fBARGV\fR array.
      526 +.RE
      527 +
 182  528  .sp
 183      -.LP
 184      -Built-in variables include:
      529 +.ne 2
      530 +.na
      531 +\fB\fBARGV\fR\fR
      532 +.ad
      533 +.RS 12n
      534 +An array of command line arguments, excluding options and the \fIprogram\fR
      535 +argument, numbered from zero to \fBARGC\fR\(mi1.
 185  536  .sp
      537 +The arguments in \fBARGV\fR can be modified or added to; \fBARGC\fR can be
      538 +altered.  As each input file ends, \fBawk\fR treats the next non-null element
      539 +of \fBARGV\fR, up to the current value of \fBARGC\fR\(mi1, inclusive, as the
      540 +name of the next input file.  Setting an element of \fBARGV\fR to null means
      541 +that it is not treated as an input file. The name \fB\(mi\fR indicates the
      542 +standard input. If an argument matches the format of an \fIassignment\fR
      543 +operand, this argument is treated as an assignment rather than a \fIfile\fR
      544 +argument.
      545 +.RE
      546 +
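As a small illustration of `ARGC` and `ARGV`, the sketch below lists the operands from a `BEGIN` action, so no file is ever opened. The operand names `first` and `second` are placeholders. `ARGV[0]` is skipped because its value depends on how the implementation was invoked.

```shell
out=$(awk 'BEGIN {
    print "ARGC: " ARGC
    for (i = 1; i < ARGC; i++)
        print i ": " ARGV[i]
}' first second)

echo "$out"
```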
      547 +.sp
 186  548  .ne 2
 187  549  .na
 188      -\fB\fBFILENAME\fR \fR
      550 +\fB\fBCONVFMT\fR\fR
 189  551  .ad
 190      -.RS 13n
 191      -name of the current input file
      552 +.RS 12n
      553 +The \fBprintf\fR format for converting numbers to strings (except for output
      554 +statements, where \fBOFMT\fR is used). The default is \fB%.6g\fR.
 192  555  .RE
 193  556  
 194  557  .sp
 195  558  .ne 2
 196  559  .na
 197      -\fB\fBFS\fR \fR
      560 +\fB\fBENVIRON\fR\fR
 198  561  .ad
 199      -.RS 13n
 200      -input field separator regular expression (default blank and tab)
      562 +.RS 12n
      563 +The variable \fBENVIRON\fR is an array representing the value of the
      564 +environment. The indices of the array are strings consisting of the names of
      565 +the environment variables, and the value of each array element is a string
      566 +consisting of the value of that variable. If the value of an environment
      567 +variable is considered a \fInumeric string\fR, the array element also has its
      568 +numeric value.
      569 +.sp
      570 +In all cases where \fBawk\fR behavior is affected by environment variables
      571 +(including the environment of any commands that \fBawk\fR executes via the
      572 +\fBsystem\fR function or via pipeline redirections with the \fBprint\fR
      573 +statement, the \fBprintf\fR statement, or the \fBgetline\fR function), the
      574 +environment used is the environment at the time \fBawk\fR began executing.
 201  575  .RE
 202  576  
 203  577  .sp
 204  578  .ne 2
 205  579  .na
 206      -\fB\fBNF\fR \fR
      580 +\fB\fBFILENAME\fR\fR
 207  581  .ad
 208      -.RS 13n
 209      -number of fields in the current record
      582 +.RS 12n
      583 +A pathname of the current input file. Inside a \fBBEGIN\fR action the value is
      584 +undefined. Inside an \fBEND\fR action the value is the name of the last input
      585 +file processed.
 210  586  .RE
 211  587  
 212  588  .sp
 213  589  .ne 2
 214  590  .na
 215      -\fB\fBNR\fR \fR
      591 +\fB\fBFNR\fR\fR
 216  592  .ad
 217      -.RS 13n
 218      -ordinal number of the current record
      593 +.RS 12n
      594 +The ordinal number of the current record in the current file. Inside a
      595 +\fBBEGIN\fR action the value is zero. Inside an \fBEND\fR action the value is
      596 +the number of the last record processed in the last file processed.
 219  597  .RE
 220  598  
 221  599  .sp
 222  600  .ne 2
 223  601  .na
 224      -\fB\fBOFMT\fR \fR
      602 +\fB\fBFS\fR\fR
 225  603  .ad
 226      -.RS 13n
 227      -output format for numbers (default \fB%.6g\fR)
      604 +.RS 12n
      605 +Input field separator regular expression; a space character by default.
 228  606  .RE
 229  607  
 230  608  .sp
 231  609  .ne 2
 232  610  .na
 233      -\fB\fBOFS\fR \fR
      611 +\fB\fBNF\fR\fR
 234  612  .ad
 235      -.RS 13n
 236      -output field separator (default blank)
      613 +.RS 12n
      614 +The number of fields in the current record. Inside a \fBBEGIN\fR action, the
      615 +use of \fBNF\fR is undefined unless a \fBgetline\fR function without a
      616 +\fIvar\fR argument is executed previously. Inside an \fBEND\fR action, \fBNF\fR
      617 +retains the value it had for the last record read, unless a subsequent,
      618 +redirected, \fBgetline\fR function without a \fIvar\fR argument is performed
      619 +prior to entering the \fBEND\fR action.
 237  620  .RE
 238  621  
 239  622  .sp
 240  623  .ne 2
 241  624  .na
 242      -\fB\fBORS\fR \fR
      625 +\fB\fBNR\fR\fR
 243  626  .ad
 244      -.RS 13n
 245      -output record separator (default new-line)
      627 +.RS 12n
      628 +The ordinal number of the current record from the start of input. Inside a
      629 +\fBBEGIN\fR action the value is zero. Inside an \fBEND\fR action the value is
      630 +the number of the last record processed.
 246  631  .RE
 247  632  
 248  633  .sp
 249  634  .ne 2
 250  635  .na
 251      -\fB\fBRS\fR \fR
      636 +\fB\fBOFMT\fR\fR
 252  637  .ad
 253      -.RS 13n
 254      -input record separator (default new-line)
      638 +.RS 12n
      639 +The \fBprintf\fR format for converting numbers to strings in output statements;
      640 +\fB"%.6g"\fR by default. The result of the conversion is unspecified if the
      641 +value of \fBOFMT\fR is not a floating-point format specification.
 255  642  .RE
 256  643  
 257  644  .sp
      645 +.ne 2
      646 +.na
      647 +\fB\fBOFS\fR\fR
      648 +.ad
      649 +.RS 12n
      650 +The \fBprint\fR statement output field separator; a space character by default.
      651 +.RE
      652 +
      653 +.sp
      654 +.ne 2
      655 +.na
      656 +\fB\fBORS\fR\fR
      657 +.ad
      658 +.RS 12n
      659 +The \fBprint\fR output record separator; a newline character by default.
      660 +.RE
      661 +
      662 +.sp
      663 +.ne 2
      664 +.na
      665 +\fB\fBRLENGTH\fR\fR
      666 +.ad
      667 +.RS 12n
      668 +The length of the string matched by the \fBmatch\fR function.
      669 +.RE
      670 +
      671 +.sp
      672 +.ne 2
      673 +.na
      674 +\fB\fBRS\fR\fR
      675 +.ad
      676 +.RS 12n
      677 +The first character of the string value of \fBRS\fR is the input record
      678 +separator; a newline character by default. If \fBRS\fR contains more than one
      679 +character, the results are unspecified. If \fBRS\fR is null, then records are
      680 +separated by sequences of one or more blank lines. Leading or trailing blank
      681 +lines do not produce empty records at the beginning or end of input, and the
      682 +field separator is always newline, no matter what the value of \fBFS\fR.
      683 +.RE
      684 +
      685 +.sp
      686 +.ne 2
      687 +.na
      688 +\fB\fBRSTART\fR\fR
      689 +.ad
      690 +.RS 12n
      691 +The starting position of the string matched by the \fBmatch\fR function,
      692 +numbering from 1. This is always equivalent to the return value of the
      693 +\fBmatch\fR function.
      694 +.RE
      695 +
      696 +.sp
      697 +.ne 2
      698 +.na
      699 +\fB\fBSUBSEP\fR\fR
      700 +.ad
      701 +.RS 12n
      702 +The subscript separator string for multi-dimensional arrays. The default value
      703 +is \fB\e034\fR\&.
      704 +.RE
      705 +
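The null \fBRS\fR ("paragraph mode") behavior described above can be sketched as follows. This is a minimal illustration, assuming a POSIX shell with an awk on the PATH; the sample input and the shell variable name `out` are made up for the example.

```shell
# Hypothetical input: two records separated by one blank line.
# With RS set to the null string, newline also acts as a field separator,
# so the first record "a b\nc" has three fields.
out=$(printf 'a b\nc\n\nd\n' | awk 'BEGIN { RS = "" } { print NR ": " NF }')
printf '%s\n' "$out"
```

The first record reports three fields and the second reports one, because the newline inside the first record separates fields regardless of the value of FS.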
      706 +.SS "/usr/bin/awk"
      707 +The following variable is supported for \fB/usr/bin/awk\fR only:
      708 +.sp
      709 +.ne 2
      710 +.na
      711 +\fB\fBRT\fR\fR
      712 +.ad
      713 +.RS 12n
      714 +The record terminator for the most recent record read. For most records this
      715 +will be the same value as \fBRS\fR. At the end of a file with no trailing
      716 +separator value, though, this will be set to the empty string (\fB""\fR).
      717 +.RE
      718 +
      719 +.SS "Regular Expressions"
      720 +The \fBawk\fR utility makes use of the extended regular expression notation
      721 +(see \fBregex\fR(5)) except that it allows the use of C-language conventions to
      722 +escape special characters within the EREs, namely \fB\e\e\fR, \fB\ea\fR,
      723 +\fB\eb\fR, \fB\ef\fR, \fB\en\fR, \fB\er\fR, \fB\et\fR, \fB\ev\fR, and those
      724 +specified in the following table.  These escape sequences are recognized both
      725 +inside and outside bracket expressions.  Note that records need not be
      726 +separated by newline characters and string constants can contain newline
      727 +characters, so even the \fB\en\fR sequence is valid in \fBawk\fR EREs.  Using
      728 +a slash character within the regular expression requires escaping as shown in
      729 +the table below:
      730 +.sp
      731 +
      732 +.sp
      733 +.TS
      734 +l l l
      735 +l l l .
      736 +\fBEscape Sequence\fR   \fBDescription\fR       \fBMeaning\fR
      737 +_
      738 +\fB\e"\fR       Backslash quotation-mark        Quotation-mark character
      739 +_
      740 +\fB\e/\fR       Backslash slash Slash character
      741 +_
      742 +\fB\e\fR\fIddd\fR       T{
      743 +A backslash character followed by the longest sequence of one, two, or three octal-digit characters (01234567).  If all of the digits are 0 (that is, representation of the NULL character), the behavior is undefined.
      744 +T}      T{
      745 +The character encoded by the one-, two- or three-digit octal integer. Multi-byte characters require multiple, concatenated escape sequences, including the leading \e for each byte.
      746 +T}
      747 +_
      748 +\fB\e\fR\fIc\fR T{
      749 +A backslash character followed by any character not described in this table or special characters (\fB\e\e\fR, \fB\ea\fR, \fB\eb\fR, \fB\ef\fR, \fB\en\fR, \fB\er\fR, \fB\et\fR, \fB\ev\fR).
      750 +T}      Undefined
      751 +.TE
      752 +
      753 +.sp
 258  754  .LP
      755 +A regular expression can be matched against a specific field or string by using
      756 +one of the two regular expression matching operators, \fB~\fR and \fB!\|~\fR.
      757 +These operators interpret their right-hand operand as a regular expression and
      758 +their left-hand operand as a string. If the regular expression matches the
      759 +string, the \fB~\fR expression evaluates to the value \fB1\fR, and the
      760 +\fB!\|~\fR expression evaluates to the value \fB0\fR. If the regular expression
      761 +does not match the string, the \fB~\fR expression evaluates to the value
      762 +\fB0\fR, and the \fB!\|~\fR expression evaluates to the value \fB1\fR. If the
      763 +right-hand operand is any expression other than the lexical token \fBERE\fR,
      764 +the string value of the expression is interpreted as an extended regular
      765 +expression, including the escape conventions described above. Notice that these
      766 +same escape conventions are also applied in determining the value of a
      767 +string literal (the lexical token \fBSTRING\fR), and are applied a second time
      768 +when a string literal is used in this context.
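The double application of escape conventions to string literals can be seen in a small sketch. It assumes a POSIX shell with an awk on the PATH; the input lines and the variable name `out` are illustrative only.

```shell
# A string literal used as a regular expression is processed twice:
# once as a string ("a\\.b" becomes a\.b) and once as an ERE (a literal dot).
out=$(printf 'a.b\naxb\n' | awk '
    $0 ~ "a\\.b" { print "dynamic:", $0 }   # string literal on the right-hand side
    $0 ~ /a\.b/  { print "token:", $0 }     # ERE token on the right-hand side
    END {
        m = ("abc" ~ /b/)     # 1: the expression matches
        n = ("abc" !~ /b/)    # 0: negated match
        print m, n
    }')
printf '%s\n' "$out"
```

Only the line `a.b` is reported by both patterns; `axb` is not, because the dot is escaped in both forms.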
      769 +.sp
      770 +.LP
      771 +When an \fBERE\fR token appears as an expression in any context other than as
      772 +the right-hand of the \fB~\fR or \fB!\|~\fR operator or as one of the built-in
      773 +function arguments described below, the value of the resulting expression is
      774 +the equivalent of:
      775 +.sp
      776 +.in +2
      777 +.nf
      778 +$0 ~ /\fIere\fR/
      779 +.fi
      780 +.in -2
      781 +
      782 +.sp
      783 +.LP
      784 +The \fIere\fR argument to the \fBgsub\fR, \fBmatch\fR, and \fBsub\fR functions, and
      785 +the \fIfs\fR argument to the \fBsplit\fR function (see \fBString Functions\fR),
      786 +are interpreted as extended regular expressions. These can be either \fBERE\fR
      787 +tokens or arbitrary expressions, and are interpreted in the same manner as the
      788 +right-hand side of the \fB~\fR or \fB!\|~\fR operator.
      789 +.sp
      790 +.LP
      791 +An extended regular expression can be used to separate fields by using the
      792 +\fB-F\fR \fIERE\fR option or by assigning a string containing the expression to
      793 +the built-in variable \fBFS\fR. The default value of the \fBFS\fR variable is a
      794 +single space character. The following describes \fBFS\fR behavior:
      795 +.RS +4
      796 +.TP
      797 +1.
      798 +If \fBFS\fR is a single character:
      799 +.RS +4
      800 +.TP
      801 +.ie t \(bu
      802 +.el o
      803 +If \fBFS\fR is the space character, skip leading and trailing blank characters;
      804 +fields are delimited by sets of one or more blank characters.
      805 +.RE
      806 +.RS +4
      807 +.TP
      808 +.ie t \(bu
      809 +.el o
      810 +Otherwise, if \fBFS\fR is any other character \fIc\fR, fields are delimited by
      811 +each single occurrence of \fIc\fR.
      812 +.RE
      813 +.RE
      814 +.RS +4
      815 +.TP
      816 +2.
      817 +Otherwise, the string value of \fBFS\fR is considered to be an extended
      818 +regular expression. Each occurrence of a sequence matching the extended regular
      819 +expression delimits fields.
      820 +.RE
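The three \fBFS\fR cases listed above can be compared side by side. This is a sketch under the assumption of a POSIX shell with an awk on the PATH; the inputs and shell variable names are hypothetical.

```shell
# Case 1a: default FS (a space). Leading and trailing blanks are skipped and
# runs of blanks delimit fields.
default_fs=$(printf '  a   b  \n' | awk '{ print NF }')

# Case 1b: any other single character. Each occurrence delimits a field,
# so the empty string between the two commas counts as a field.
comma_fs=$(printf 'a,,b\n' | awk -F, '{ print NF }')

# Case 2: a longer value is treated as an ERE. Each matching sequence
# delimits fields.
ere_fs=$(printf 'a1b22c\n' | awk -F'[0-9]+' '{ print $1 $2 $3 }')

printf '%s %s %s\n' "$default_fs" "$comma_fs" "$ere_fs"
```

This prints `2 3 abc`.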
      821 +.sp
      822 +.LP
      823 +Except in the \fBgsub\fR, \fBmatch\fR, \fBsplit\fR, and \fBsub\fR built-in
      824 +functions, regular expression matching is based on input records. That is,
      825 +record separator characters (the first character of the value of the variable
      826 +\fBRS\fR, a newline character by default) cannot be embedded in the expression,
      827 +and no expression matches the record separator character. If the record
      828 +separator is not a newline character, newline characters embedded in the
      829 +expression can be matched. In those four built-in functions, regular expression
      830 +matching is based on text strings. So, any character (including the newline
      831 +character and the record separator) can be embedded in the pattern and an
      832 +appropriate pattern matches any character. However, in all \fBawk\fR regular
      833 +expression matching, the use of one or more NULL characters in the pattern,
      834 +input record or text string produces undefined results.
      835 +
      836 +.SS "Patterns"
      837 +A \fIpattern\fR is any valid \fIexpression,\fR a range specified by two
      838 +expressions separated by comma, or one of the two special patterns \fBBEGIN\fR
      839 +or \fBEND\fR.
      840 +
      841 +.SS "Special Patterns"
      842 +The \fBawk\fR utility recognizes two special patterns, \fBBEGIN\fR and
      843 +\fBEND\fR. Each \fBBEGIN\fR pattern is matched once and its associated action
      844 +executed before the first record of input is read (except possibly by use of
      845 +the \fBgetline\fR function in a prior \fBBEGIN\fR action) and before command
      846 +line assignment is done. Each \fBEND\fR pattern is matched once and its
      847 +associated action executed after the last record of input has been read. These
      848 +two patterns have associated actions.
      849 +.sp
      850 +.LP
      851 +\fBBEGIN\fR and \fBEND\fR do not combine with other patterns.  Multiple
      852 +\fBBEGIN\fR and \fBEND\fR patterns are allowed. The actions associated with the
      853 +\fBBEGIN\fR patterns are executed in the order specified in the program, as are
      854 +the \fBEND\fR actions. An \fBEND\fR pattern can precede a \fBBEGIN\fR pattern
      855 +in a program.
      856 +.sp
      857 +.LP
      858 +If an \fBawk\fR program consists of only actions with the pattern \fBBEGIN\fR,
      859 +and the \fBBEGIN\fR action contains no \fBgetline\fR function, \fBawk\fR exits
      860 +without reading its input when the last statement in the last \fBBEGIN\fR
      861 +action is executed. If an \fBawk\fR program consists of only actions with the
      862 +pattern \fBEND\fR or only actions with the patterns \fBBEGIN\fR and \fBEND\fR,
      863 +the input is read before the statements in the \fBEND\fR actions are executed.
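The ordering rules for \fBBEGIN\fR and \fBEND\fR can be illustrated with a short sketch, assuming a POSIX shell with an awk on the PATH. The program text and the variable name `out` are invented for the example.

```shell
# END appears first in the source, yet both BEGIN actions run before any
# input is read, in program order; END runs after the last record.
out=$(printf 'x\ny\n' | awk '
    END   { print "records:", NR }
    BEGIN { print "start" }
    BEGIN { print "second begin" }')
printf '%s\n' "$out"
```

The output is `start`, `second begin`, then `records: 2`.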
      864 +
      865 +.SS "Expression Patterns"
      866 +An expression pattern is evaluated as if it were an expression in a Boolean
      867 +context. If the result is true, the pattern is considered to match, and the
      868 +associated action (if any) is executed. If the result is false, the action is
      869 +not executed.
      870 +
      871 +.SS "Pattern Ranges"
      872 +A pattern range consists of two expressions separated by a comma. In this case,
      873 +the action is performed for all records between a match of the first expression
      874 +and the following match of the second expression, inclusive. At this point, the
      875 +pattern range can be repeated starting at input records subsequent to the end
      876 +of the matched range.
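A pattern range can be sketched as follows, assuming a POSIX shell with an awk on the PATH. The `start` and `end` markers are hypothetical input, not part of the utility.

```shell
# The range opens on a line matching /start/ and closes on the next line
# matching /end/, inclusive. It can open again later; a range that never
# sees its end pattern runs to the end of input.
out=$(printf '1\nstart\n2\nend\n3\nstart\n4\n' | awk '/start/,/end/')
printf '%s\n' "$out"
```

Lines `1` and `3` fall outside both ranges and are not printed.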
      877 +
      878 +.SS "Actions"
 259  879  An action is a sequence of statements. A statement can be one of the following:
 260  880  .sp
 261  881  .in +2
 262  882  .nf
 263  883  if ( \fIexpression\fR ) \fIstatement\fR [ else \fIstatement\fR ]
 264  884  while ( \fIexpression\fR ) \fIstatement\fR
 265  885  do \fIstatement\fR while ( \fIexpression\fR )
 266  886  for ( \fIexpression\fR ; \fIexpression\fR ; \fIexpression\fR ) \fIstatement\fR
 267  887  for ( \fIvar\fR in \fIarray\fR ) \fIstatement\fR
      888 +delete \fIarray\fR[\fIsubscript\fR] #delete an array element
      889 +delete \fIarray\fR #delete all elements within an array
 268  890  break
 269  891  continue
 270  892  { [ \fIstatement\fR ] .\|.\|. }
 271      -\fIexpression\fR      # commonly variable = expression
      893 +\fIexpression\fR        # commonly variable = expression
 272  894  print [ \fIexpression-list\fR ] [ >\fIexpression\fR ]
 273  895  printf format [ ,\fIexpression-list\fR ] [ >\fIexpression\fR ]
 274      -next            # skip remaining patterns on this input line
 275      -exit [expr]     # skip the rest of the input; exit status is expr
      896 +next              # skip remaining patterns on this input line
      897 +nextfile          # skip remaining patterns on this input file
      898 +exit [expr] # skip the rest of the input; exit status is expr
      899 +return [expr]
 276  900  .fi
 277  901  .in -2
 278  902  
 279  903  .sp
 280  904  .LP
 281      -Statements are terminated by semicolons, newlines, or right braces. An empty
 282      -expression-list stands for the whole input line. Expressions take on string or
 283      -numeric values as appropriate, and are built using the operators \fB+\fR,
 284      -\fB\(mi\fR, \fB*\fR, \fB/\fR, \fB%\fR, \fB^\fR and concatenation (indicated by
 285      -a blank). The operators \fB++\fR, \fB\(mi\(mi\fR, \fB+=\fR, \fB\(mi=\fR,
 286      -\fB*=\fR, \fB/=\fR, \fB%=\fR, \fB^=\fR, \fB>\fR, \fB>=\fR, \fB<\fR, \fB<=\fR,
 287      -\fB==\fR, \fB!=\fR, and \fB?:\fR are also available in expressions. Variables
 288      -can be scalars, array elements (denoted x[i]), or fields. Variables are
 289      -initialized to the null string or zero. Array subscripts can be any string, not
 290      -necessarily numeric; this allows for a form of associative memory. String
 291      -constants are quoted (\fB""\fR), with the usual C escapes recognized within.
      905 +Any single statement can be replaced by a statement list enclosed in braces.
      906 +The statements are terminated by newline characters or semicolons, and are
      907 +executed sequentially in the order that they appear.
 292  908  .sp
 293  909  .LP
 294      -The \fBprint\fR statement prints its arguments on the standard output, or on a
 295      -file if \fB>\fR\fIexpression\fR is present, or on a pipe if '\fB|\fR\fIcmd\fR'
 296      -is present. The output resulted from the print statement is terminated by the
 297      -output record separator with each argument separated by the current output
 298      -field separator. The \fBprintf\fR statement formats its expression list
 299      -according to the format (see \fBprintf\fR(3C)).
 300      -.SS "Built-in Functions"
      910 +The \fBnext\fR statement causes all further processing of the current input
      911 +record to be abandoned. The behavior is undefined if a \fBnext\fR statement
      912 +appears or is invoked in a \fBBEGIN\fR or \fBEND\fR action.
 301  913  .sp
 302  914  .LP
 303      -The arithmetic functions are as follows:
      915 +The \fBnextfile\fR statement is similar to \fBnext\fR, but also skips all other
      916 +records in the current file, and moves on to processing the next input file if
      917 +available (or exits the program if there are none). (Note that this keyword is
      918 +not supported by \fB/usr/xpg4/bin/awk\fR.)
 304  919  .sp
      920 +.LP
      921 +The \fBexit\fR statement invokes all \fBEND\fR actions in the order in which
      922 +they occur in the program source and then terminates the program without reading
      923 +further input. An \fBexit\fR statement inside an \fBEND\fR action terminates
      924 +the program without further execution of \fBEND\fR actions.  If an expression
      925 +is specified in an \fBexit\fR statement, its numeric value is the exit status
      926 +of \fBawk\fR, unless subsequent errors are encountered or a subsequent
      927 +\fBexit\fR statement with an expression is executed.
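The interaction between \fBexit\fR and \fBEND\fR can be sketched like this, assuming a POSIX shell with an awk on the PATH. The shell variables `out` and `status` are hypothetical names used only for the illustration.

```shell
# exit stops reading input, still runs the END action, and sets the
# exit status to the value of its expression.
status=0
out=$(printf '1\n2\n3\n' | awk '
    $1 == 2 { exit 3 }
    { print }
    END { print "end" }') || status=$?
printf '%s\nstatus=%s\n' "$out" "$status"
```

Only the first record is printed before `end`, and the recorded status is 3.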
      928 +
      929 +.SS "Output Statements"
      930 +Both \fBprint\fR and \fBprintf\fR statements write to standard output by
      931 +default.  The output is written to the location specified by
      932 +\fIoutput_redirection\fR if one is supplied, as follows:
      933 +.sp
      934 +.in +2
      935 +.nf
      936 +\fB>\fR \fIexpression\fR     \fB>>\fR \fIexpression\fR     \fB|\fR \fIexpression\fR
      937 +.fi
      938 +.in -2
      939 +
      940 +.sp
      941 +.LP
      942 +In all cases, the \fIexpression\fR is evaluated to produce a string that is
      943 +used as a full pathname to write into (for \fB>\fR or \fB>>\fR) or as a command
      944 +to be executed (for \fB|\fR). Using the first two forms, if the file of that
      945 +name is not currently open, it is opened, creating it if necessary and, using
      946 +the first form, truncating the file. The output then is appended to the file.
      947 +As long as the file remains open, subsequent calls in which \fIexpression\fR
      948 +evaluates to the same string value simply append output to the file. The file
      949 +remains open until the \fBclose\fR function is called with an expression
      950 +that evaluates to the same string value.
      951 +.sp
      952 +.LP
      953 +The third form writes output onto a stream piped to the input of a command. The
      954 +stream is created if no stream is currently open with the value of
      955 +\fIexpression\fR as its command name.  The stream created is equivalent to one
      956 +created by a call to the \fBpopen\fR(3C) function with the value of
      957 +\fIexpression\fR as the \fIcommand\fR argument and a value of \fBw\fR as the
      958 +\fImode\fR argument.  As long as the stream remains open, subsequent calls in
      959 +which \fIexpression\fR evaluates to the same string value write output to the
      960 +existing stream. The stream remains open until the \fBclose\fR function is
      961 +called with an expression that evaluates to the same string value.  At that
      962 +time, the stream is closed as if by a call to the \fBpclose\fR function.
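Writing to a pipe and closing it can be sketched as below. This assumes a POSIX shell with awk and sort on the PATH; the awk variable `cmd` and the shell variable `out` are illustrative names.

```shell
# Both print statements use the same command string, so they share one
# stream. close(cmd) waits for sort to finish, so its output appears
# before the final "done".
out=$(awk 'BEGIN {
    cmd = "sort"
    print "b" | cmd
    print "a" | cmd
    close(cmd)
    print "done"
}')
printf '%s\n' "$out"
```

Without the call to close, the sorted lines would not be guaranteed to appear before `done`.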
      963 +.sp
      964 +.LP
      965 +These output statements take a comma-separated list of \fIexpression\fRs,
      966 +referred to in the grammar by the non-terminal symbols \fBexpr_list\fR,
      967 +\fBprint_expr_list\fR or \fBprint_expr_list_opt.\fR This list is referred to
      968 +here as the \fIexpression list\fR, and each member is referred to as an
      969 +\fIexpression argument\fR.
      970 +.sp
      971 +.LP
      972 +The \fBprint\fR statement writes the value of each expression argument onto the
      973 +indicated output stream separated by the current output field separator (see
      974 +variable \fBOFS\fR above), and terminated by the output record separator (see
      975 +variable \fBORS\fR above). All expression arguments are taken as strings, being
      976 +converted if necessary; with the exception that the \fBprintf\fR format in
      977 +\fBOFMT\fR is used instead of the value in \fBCONVFMT\fR. An empty expression
      978 +list stands for the whole input record \fB(\fR$0\fB)\fR.
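The roles of \fBOFS\fR, \fBORS\fR, and \fBOFMT\fR in a print statement can be seen in a small sketch, assuming a POSIX shell with an awk on the PATH. The separator values chosen here are arbitrary.

```shell
# OFS goes between expression arguments, ORS ends the record, and OFMT
# converts the non-integer number 3.14159 to a string for output.
out=$(printf 'a b\n' | awk '
    BEGIN { OFS = "-"; ORS = "!\n"; OFMT = "%.2f" }
    { print $1, $2, 3.14159 }')
printf '%s\n' "$out"
```

This prints `a-b-3.14!`.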
      979 +.sp
      980 +.LP
      981 +The \fBprintf\fR statement produces output based on a notation similar to the
      982 +File Format Notation used to describe file formats in this document. Output is
      983 +produced as specified with the first expression argument as the string
      984 +\fBformat\fR and subsequent expression arguments as the strings \fBarg1\fR to
      985 +\fBargn,\fR inclusive, with the following exceptions:
      986 +.RS +4
      987 +.TP
      988 +1.
      989 +The \fIformat\fR is an actual character string rather than a graphical
      990 +representation. Therefore, it cannot contain empty character positions. The
      991 +space character in the \fIformat\fR string, in any context other than a
      992 +\fIflag\fR of a conversion specification, is treated as an ordinary character
      993 +that is copied to the output.
      994 +.RE
      995 +.RS +4
      996 +.TP
      997 +2.
      998 +If the character set contains a Delta character and that character appears
      999 +in the \fIformat\fR string, it is treated as an ordinary character that is
     1000 +copied to the output.
     1001 +.RE
     1002 +.RS +4
     1003 +.TP
     1004 +3.
     1005 +The \fIescape sequences\fR beginning with a backslash character are treated
     1006 +as sequences of ordinary characters that are copied to the output. Note that
     1007 +these same sequences are interpreted lexically by \fBawk\fR when they appear in
     1008 +literal strings, but they are not treated specially by the \fBprintf\fR
     1009 +statement.
     1010 +.RE
     1011 +.RS +4
     1012 +.TP
     1013 +4.
     1014 +A \fIfield width\fR or \fIprecision\fR can be specified as the \fB*\fR
     1015 +character instead of a digit string. In this case the next argument from the
     1016 +expression list is fetched and its numeric value taken as the field width or
     1017 +precision.
     1018 +.RE
     1019 +.RS +4
     1020 +.TP
     1021 +5.
     1022 +The implementation does not precede or follow output from the \fBd\fR or
     1023 +\fBu\fR conversion specifications with blank characters not specified by the
     1024 +\fIformat\fR string.
     1025 +.RE
     1026 +.RS +4
     1027 +.TP
     1028 +6.
     1029 +The implementation does not precede output from the \fBo\fR conversion
     1030 +specification with leading zeros not specified by the \fIformat\fR string.
     1031 +.RE
     1032 +.RS +4
     1033 +.TP
     1034 +7.
     1035 +For the \fBc\fR conversion specification: if the argument has a numeric
     1036 +value, the character whose encoding is that value is output.  If the value is
     1037 +zero or is not the encoding of any character in the character set, the behavior
     1038 +is undefined.  If the argument does not have a numeric value, the first
     1039 +character of the string value is output; if the string does not contain any
     1040 +characters the behavior is undefined.
     1041 +.RE
     1042 +.RS +4
     1043 +.TP
     1044 +8.
     1045 +For each conversion specification that consumes an argument, the next
     1046 +expression argument is evaluated. With the exception of the \fBc\fR conversion,
     1047 +the value is converted to the appropriate type for the conversion
     1048 +specification.
     1049 +.RE
     1050 +.RS +4
     1051 +.TP
     1052 +9.
     1053 +If there are insufficient expression arguments to satisfy all the conversion
     1054 +specifications in the \fIformat\fR string, the behavior is undefined.
     1055 +.RE
     1056 +.RS +4
     1057 +.TP
     1058 +10.
     1059 +If any character sequence in the \fIformat\fR string begins with a %
     1060 +character, but does not form a valid conversion specification, the behavior is
     1061 +unspecified.
     1062 +.RE
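A few of the conversion rules above, in particular the two cases of the \fBc\fR conversion, can be sketched as follows. This assumes a POSIX shell with an awk on the PATH and an ASCII-compatible character set; the arguments are arbitrary sample values.

```shell
# %c with a numeric argument outputs the character with that encoding (65 is
# "A" in ASCII); with a string argument it outputs the first character.
# The remaining arguments are converted to the type each conversion expects.
out=$(awk 'BEGIN { printf "%c%c|%5.1f|%-4s|%d%%\n", 65, "hello", 3.14159, "ab", 42.9 }')
printf '%s\n' "$out"
```

This prints `Ah|  3.1|ab  |42%`.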
     1063 +.sp
     1064 +.LP
     1065 +Both \fBprint\fR and \fBprintf\fR can output at least \fB{LINE_MAX}\fR bytes.
     1066 +
     1067 +.SS "Functions"
     1068 +The \fBawk\fR language has a variety of built-in functions: arithmetic,
     1069 +string, input/output and general.
     1070 +
     1071 +.SS "Arithmetic Functions"
     1072 +The arithmetic functions, except for \fBint\fR, are based on the \fBISO\fR
     1073 +\fBC\fR standard. The behavior is undefined in cases where the \fBISO\fR
     1074 +\fBC\fR standard specifies that an error be returned or that the behavior is
     1075 +undefined. Although the grammar permits built-in functions to appear with no
     1076 +arguments or parentheses, unless the argument or parentheses are indicated as
     1077 +optional in the following list (by displaying them within the \fB[ ]\fR
     1078 +brackets), such use is undefined.
     1079 +.sp
 305 1080  .ne 2
 306 1081  .na
     1082 +\fB\fBatan2(\fR\fIy\fR,\fIx\fR\fB)\fR\fR
     1083 +.ad
     1084 +.RS 17n
     1085 +Return arctangent of \fIy\fR/\fIx\fR.
     1086 +.RE
     1087 +
     1088 +.sp
     1089 +.ne 2
     1090 +.na
 307 1091  \fB\fBcos\fR(\fIx\fR)\fR
 308 1092  .ad
 309      -.RS 11n
 310      -Return cosine of \fIx\fR, where \fIx\fR is in radians. (In
 311      -\fB/usr/xpg4/bin/awk\fR only. See \fBnawk\fR(1).)
     1093 +.RS 17n
     1094 +Return cosine of \fIx,\fR where \fIx\fR is in radians.
 312 1095  .RE
 313 1096  
 314 1097  .sp
 315 1098  .ne 2
 316 1099  .na
 317 1100  \fB\fBsin\fR(\fIx\fR)\fR
 318 1101  .ad
 319      -.RS 11n
 320      -Return sine of \fIx\fR, where \fIx\fR is in radians. (In
 321      -\fB/usr/xpg4/bin/awk\fR only. See \fBnawk\fR(1).)
     1102 +.RS 17n
     1103 +Return sine of \fIx,\fR where \fIx\fR is in radians.
 322 1104  .RE
 323 1105  
 324 1106  .sp
 325 1107  .ne 2
 326 1108  .na
 327 1109  \fB\fBexp\fR(\fIx\fR)\fR
 328 1110  .ad
 329      -.RS 11n
     1111 +.RS 17n
 330 1112  Return the exponential function of \fIx\fR.
 331 1113  .RE
 332 1114  
 333 1115  .sp
 334 1116  .ne 2
 335 1117  .na
 336 1118  \fB\fBlog\fR(\fIx\fR)\fR
 337 1119  .ad
 338      -.RS 11n
     1120 +.RS 17n
 339 1121  Return the natural logarithm of \fIx\fR.
 340 1122  .RE
 341 1123  
 342 1124  .sp
 343 1125  .ne 2
 344 1126  .na
 345 1127  \fB\fBsqrt\fR(\fIx\fR)\fR
 346 1128  .ad
 347      -.RS 11n
     1129 +.RS 17n
 348 1130  Return the square root of \fIx\fR.
 349 1131  .RE
 350 1132  
 351 1133  .sp
 352 1134  .ne 2
 353 1135  .na
 354 1136  \fB\fBint\fR(\fIx\fR)\fR
 355 1137  .ad
 356      -.RS 11n
 357      -Truncate its argument to an integer. It is truncated toward \fB0\fR when
 358      -\fIx\fR >\fB 0\fR.
     1138 +.RS 17n
     1139 +Truncate its argument to an integer. It is truncated toward 0 when \fIx\fR > 0.
 359 1140  .RE
 360 1141  
 361 1142  .sp
 362      -.LP
 363      -The string functions are as follows:
     1143 +.ne 2
     1144 +.na
     1145 +\fB\fBrand()\fR\fR
     1146 +.ad
     1147 +.RS 17n
     1148 +Return a random number \fIn\fR, such that 0 \(<= \fIn\fR < 1.
     1149 +.RE
     1150 +
 364 1151  .sp
 365 1152  .ne 2
 366 1153  .na
 367      -\fB\fBindex(\fR\fIs\fR\fB, \fR\fIt\fR\fB)\fR\fR
     1154 +\fB\fBsrand\fR([\fIexpr\fR])\fR
 368 1155  .ad
     1156 +.RS 17n
     1157 +Set the seed value for \fBrand\fR to \fIexpr\fR or use the time of day if
     1158 +\fIexpr\fR is omitted. The previous seed value is returned.
     1159 +.RE
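The arithmetic functions listed above can be exercised with a brief sketch, assuming a POSIX shell with an awk on the PATH. The seed value 1 and the variable names are arbitrary choices for the illustration.

```shell
# int truncates toward zero for both signs; atan2(0, -1) yields pi.
nums=$(awk 'BEGIN { printf "%d %d %.4f %.1f\n", int(3.9), int(-3.9), atan2(0, -1), sqrt(16) }')

# rand returns a value n with 0 <= n < 1; srand(1) makes the sequence repeatable.
in_range=$(awk 'BEGIN { srand(1); r = rand(); ok = (r >= 0 && r < 1); print ok }')

printf '%s\n%s\n' "$nums" "$in_range"
```

The first line printed is `3 -3 3.1416 4.0` and the second is `1`.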
     1160 +
     1161 +.SS "String Functions"
     1162 +The string functions in the following list are supported. Although the
     1163 +grammar permits built-in functions to appear with no arguments or parentheses,
     1164 +unless the argument or parentheses are indicated as optional in the following
     1165 +list (by displaying them within the \fB[ ]\fR brackets), such use is undefined.
     1166 +.sp
     1167 +.ne 2
     1168 +.na
     1169 +\fB\fBgsub\fR(\fIere\fR,\fIrepl\fR[,\|\fIin\fR])\fR
     1170 +.ad
 369 1171  .sp .6
 370 1172  .RS 4n
 371      -Return the position in string \fIs\fR where string \fIt\fR first occurs, or
 372      -\fB0\fR if it does not occur at all.
     1173 +Behave like \fBsub\fR (see below), except that it replaces all occurrences of
     1174 +the regular expression (like the \fBed\fR utility global substitute) in
     1175 +\fB$0\fR or in the \fIin\fR argument, when specified.
 373 1176  .RE
 374 1177  
 375 1178  .sp
 376 1179  .ne 2
 377 1180  .na
 378      -\fB\fBint(\fR\fIs\fR\fB)\fR\fR
     1181 +\fB\fBindex\fR(\fIs\fR,\fIt\fR)\fR
 379 1182  .ad
 380 1183  .sp .6
 381 1184  .RS 4n
 382      -truncates \fIs\fR to an integer value. If \fIs\fR is not specified, $0 is used.
     1185 +Return the position, in characters, numbering from 1, in string \fIs\fR where
     1186 +string \fIt\fR first occurs, or zero if it does not occur at all.
 383 1187  .RE
 384 1188  
 385 1189  .sp
 386 1190  .ne 2
 387 1191  .na
 388      -\fB\fBlength(\fR\fIs\fR\fB)\fR\fR
     1192 +\fB\fBlength\fR[([\fIv\fR])]\fR
 389 1193  .ad
 390 1194  .sp .6
 391 1195  .RS 4n
 392      -Return the length of its argument taken as a string, or of the whole line if
 393      -there is no argument.
     1196 +Given no argument, this function returns the length of the whole record,
     1197 +\fB$0\fR. If given an array as an argument (and using \fB/usr/bin/awk\fR),
     1198 +then this returns the number of elements it contains. Otherwise, this function
     1199 +interprets the argument as a string (performing any needed conversions) and
     1200 +returns its length in characters.
 394 1201  .RE
 395 1202  
 396 1203  .sp
 397 1204  .ne 2
 398 1205  .na
 399      -\fB\fBsplit(\fR\fIs\fR, \fIa\fR, \fIfs\fR\fB)\fR\fR
     1206 +\fB\fBmatch\fR(\fIs\fR,\fIere\fR)\fR
 400 1207  .ad
 401 1208  .sp .6
 402 1209  .RS 4n
 403      -Split the string \fIs\fR into array elements \fIa\fR[\fI1\fR],
 404      -\fIa\fR[\fI2\fR], \|.\|.\|. \fIa\fR[\fIn\fR], and returns \fIn\fR. The
 405      -separation is done with the regular expression \fIfs\fR or with the field
 406      -separator \fBFS\fR if \fIfs\fR is not given.
     1210 +Return the position, in characters, numbering from 1, in string \fIs\fR where
     1211 +the extended regular expression \fIere\fR occurs, or zero if it does not occur
     1212 +at all. \fBRSTART\fR is set to the starting position (which is the same as the
     1213 +returned value), zero if no match is found; \fBRLENGTH\fR is set to the length
     1214 +of the matched string, \(mi1 if no match is found.
 407 1215  .RE
 408 1216  
 409 1217  .sp
 410 1218  .ne 2
 411 1219  .na
 412      -\fB\fBsprintf(\fR\fIfmt\fR, \fIexpr\fR, \fIexpr\fR,\|.\|.\|.\|\fB)\fR\fR
     1220 +\fB\fBsplit\fR(\fIs\fR,\fIa\fR[,\|\fIfs\fR])\fR
 413 1221  .ad
 414 1222  .sp .6
 415 1223  .RS 4n
 416      -Format the expressions according to the \fBprintf\fR(3C) format given by
 417      -\fIfmt\fR and returns the resulting string.
     1224 +Split the string \fIs\fR into array elements \fIa\fR[1], \fIa\fR[2],
     1225 +\fB\&...,\fR \fIa\fR[\fIn\fR], and return \fIn\fR. The separation is done with
     1226 +the extended regular expression \fIfs\fR or with the field separator \fBFS\fR
     1227 +if \fIfs\fR is not given. Each array element has a string value when created.
     1228 +If the string assigned to any array element, with any occurrence of the
     1229 +decimal-point character from the current locale changed to a period character,
     1230 +would be considered a \fInumeric string\fR, the array element also has the
     1231 +numeric value of the \fInumeric string\fR. The effect of a null string as the
     1232 +value of \fIfs\fR is unspecified.
 418 1233  .RE
 419 1234  
 420 1235  .sp
 421 1236  .ne 2
 422 1237  .na
 423      -\fB\fBsubstr(\fR\fIs\fR, \fIm\fR, \fIn\fR\fB)\fR\fR
     1238 +\fB\fBsprintf\fR(\fIfmt\fR,\fIexpr\fR,\fIexpr\fR,\fB\&...\fR)\fR
 424 1239  .ad
 425 1240  .sp .6
 426 1241  .RS 4n
 427      -returns the \fIn\fR-character substring of \fIs\fR that begins at position
 428      -\fIm\fR.
     1242 +Format the expressions according to the \fBprintf\fR format given by \fIfmt\fR
     1243 +and return the resulting string.
 429 1244  .RE
 430 1245  
 431 1246  .sp
     1247 +.ne 2
     1248 +.na
     1249 +\fB\fBsub\fR(\fIere\fR,\fIrepl\fR[,\|\fIin\fR])\fR
     1250 +.ad
     1251 +.sp .6
     1252 +.RS 4n
     1253 +Substitute the string \fIrepl\fR in place of the first instance of the extended
     1254 +regular expression \fIere\fR in string \fIin\fR and return the number of
     1255 +substitutions. An ampersand ( \fB&\fR ) appearing in the string \fIrepl\fR is
     1256 +replaced by the string from \fIin\fR that matches the regular expression. An
     1257 +ampersand preceded with a backslash ( \fB\e\fR ) is interpreted as the literal
     1258 +ampersand character. An occurrence of two consecutive backslashes is
     1259 +interpreted as just a single literal backslash character.  Any other occurrence
     1260 +of a backslash (for example, preceding any other character) is treated as a
     1261 +literal backslash character. If \fIrepl\fR is a string literal, the handling of
     1262 +the ampersand character occurs after any lexical processing, including any
     1263 +lexical backslash escape sequence processing. If \fIin\fR is specified and it
     1264 +is not an \fBlvalue\fR, the behavior is undefined. If \fIin\fR is omitted, \fBawk\fR
     1265 +uses the current record (\fB$0\fR) in its place.
     1266 +.RE
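The ampersand rules above can be sketched as follows. The function name sub_demo and the sample input are hypothetical. The second call writes the replacement as the string literal "\\&", which lexical processing reduces to a backslash followed by an ampersand before sub sees it.

```shell
# Hypothetical helper: contrast an unescaped & with an escaped one.
sub_demo() {
    printf '%s\n' "$1" | awk '{
        a = $0; b = $0
        n = sub(/[0-9]+/, "<&>", a)   # & stands for the matched text
        sub(/[0-9]+/, "\\&", b)       # escaped: a literal ampersand
        print n, a, b
    }'
}

sub_demo "abc123def456"
```

Only the first run of digits is replaced in each copy, and the returned count is 1.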
     1267 +
     1268 +.sp
     1269 +.ne 2
     1270 +.na
     1271 +\fB\fBsubstr\fR(\fIs\fR,\fIm\fR[,\|\fIn\fR])\fR
     1272 +.ad
     1273 +.sp .6
     1274 +.RS 4n
     1275 +Return the at most \fIn\fR-character substring of \fIs\fR that begins at
     1276 +position \fIm\fR, numbering from 1. If \fIn\fR is missing, the length of the
     1277 +substring is limited by the length of the string \fIs\fR.
     1278 +.RE
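A short sketch of the two forms of substr described above, using a hypothetical function name and sample string:

```shell
# Hypothetical helper: substr with and without the length argument.
substr_demo() {
    awk 'BEGIN {
        s = "hello world"
        print substr(s, 7)        # n omitted: runs to the end of s
        print substr(s, 1, 5)     # at most 5 characters from position 1
        print substr(s, 7, 100)   # n larger than what remains
    }'
}

substr_demo
```

The third call shows why the description says "at most": asking for more characters than remain simply returns the rest of the string.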
     1279 +
     1280 +.sp
     1281 +.ne 2
     1282 +.na
     1283 +\fB\fBtolower\fR(\fIs\fR)\fR
     1284 +.ad
     1285 +.sp .6
     1286 +.RS 4n
     1287 +Return a string based on the string \fIs\fR. Each character in \fIs\fR that is
     1288 +an upper-case letter specified to have a \fBtolower\fR mapping by the
     1289 +\fBLC_CTYPE\fR category of the current locale is replaced in the returned
     1290 +string by the lower-case letter specified by the mapping. Other characters in
     1291 +\fIs\fR are unchanged in the returned string.
     1292 +.RE
     1293 +
     1294 +.sp
     1295 +.ne 2
     1296 +.na
     1297 +\fB\fBtoupper\fR(\fIs\fR)\fR
     1298 +.ad
     1299 +.sp .6
     1300 +.RS 4n
     1301 +Return a string based on the string \fIs\fR. Each character in \fIs\fR that is
     1302 +a lower-case letter specified to have a \fBtoupper\fR mapping by the
     1303 +\fBLC_CTYPE\fR category of the current locale is replaced in the returned
     1304 +string by the upper-case letter specified by the mapping. Other characters in
     1305 +\fIs\fR are unchanged in the returned string.
     1306 +.RE
     1307 +
     1308 +.sp
 432 1309  .LP
 433      -The input/output function is as follows:
     1310 +All of the preceding functions that take \fIERE\fR as a parameter expect a
     1311 +pattern or a string valued expression that is a regular expression as defined
     1312 +below.
     1313 +
     1314 +.SS "Input/Output and General Functions"
     1315 +The input/output and general functions are:
 434 1316  .sp
 435 1317  .ne 2
 436 1318  .na
     1319 +\fB\fBclose(\fR\fIexpression\fR)\fR
     1320 +.ad
     1321 +.RS 27n
     1322 +Close the file or pipe opened by a \fBprint\fR or \fBprintf\fR statement or a
     1323 +call to \fBgetline\fR with the same string-valued \fIexpression\fR. If the
     1324 +close was successful, the function returns \fB0\fR; otherwise, it returns
     1325 +non-zero.
     1326 +.RE
     1327 +
     1328 +.sp
     1329 +.ne 2
     1330 +.na
     1331 +\fB\fBfflush(\fR\fIexpression\fR)\fR
     1332 +.ad
     1333 +.RS 27n
     1334 +Flush any buffered output for the file or pipe opened by a \fBprint\fR or
     1335 +\fBprintf\fR statement or a call to \fBgetline\fR with the same string-valued
     1336 +\fIexpression\fR. If the flush was successful, the function returns \fB0\fR;
     1337 +otherwise, it returns \fBEOF\fR. If no arguments or the empty string
     1338 +(\fB""\fR) are given, then all open files will be flushed. (Note that
     1339 +\fBfflush\fR is supported in \fB/usr/bin/awk\fR only.)
     1340 +.RE
     1341 +
     1342 +.sp
     1343 +.ne 2
     1344 +.na
     1345 +\fB\fIexpression\fR|\fBgetline\fR[\fIvar\fR]\fR
     1346 +.ad
     1347 +.RS 27n
     1348 +Read a record of input from a stream piped from the output of a command. The
     1349 +stream is created if no stream is currently open with the value of
     1350 +\fIexpression\fR as its command name. The stream created is equivalent to one
     1351 +created by a call to the \fBpopen\fR function with the value of
     1352 +\fIexpression\fR as the \fIcommand\fR argument and a value of \fBr\fR as the
     1353 +\fImode\fR argument. As long as the stream remains open, subsequent calls in
     1354 +which \fIexpression\fR evaluates to the same string value read subsequent
     1355 +records from the stream. The stream remains open until the \fBclose\fR function
     1356 +is called with an expression that evaluates to the same string value. At that
     1357 +time, the stream is closed as if by a call to the \fBpclose\fR function. If
     1358 +\fIvar\fR is missing, \fB$0\fR and \fBNF\fR are set. Otherwise, \fIvar\fR is
     1359 +set.
     1360 +.sp
     1361 +The \fBgetline\fR operator can form ambiguous constructs when there are
     1362 +operators that are not in parentheses (including concatenate) to the left of
     1363 +the \fB|\fR (to the beginning of the expression containing \fBgetline\fR). In
     1364 +the context of the \fB$\fR operator, \fB|\fR behaves as if it had a lower
     1365 +precedence than \fB$\fR. The result of evaluating other operators is
     1366 +unspecified, and portable applications must properly parenthesize all such
     1367 +uses.
     1368 +.RE
     1369 +
     1370 +.sp
     1371 +.ne 2
     1372 +.na
 437 1373  \fB\fBgetline\fR\fR
 438 1374  .ad
 439      -.RS 11n
 440      -Set \fB$0\fR to the next input record from the current input file.
 441      -\fBgetline\fR returns \fB1\fR for successful input, \fB0\fR for end of file,
 442      -and \fB\(mi1\fR for an error.
     1375 +.RS 27n
     1376 +Set \fB$0\fR to the next input record from the current input file. This form of
     1377 +\fBgetline\fR sets the \fBNF\fR, \fBNR\fR, and \fBFNR\fR variables.
 443 1378  .RE
 444 1379  
 445      -.SS "Large File Behavior"
 446 1380  .sp
     1381 +.ne 2
     1382 +.na
     1383 +\fB\fBgetline\fR \fIvar\fR\fR
     1384 +.ad
     1385 +.RS 27n
     1386 +Set variable \fIvar\fR to the next input record from the current input file.
     1387 +This form of \fBgetline\fR sets the \fBFNR\fR and \fBNR\fR variables.
     1388 +.RE
     1389 +
     1390 +.sp
     1391 +.ne 2
     1392 +.na
     1393 +\fB\fBgetline\fR [\fIvar\fR] \fB<\fR \fIexpression\fR\fR
     1394 +.ad
     1395 +.RS 27n
     1396 +Read the next record of input from a named file. The \fIexpression\fR is
     1397 +evaluated to produce a string that is used as a full pathname. If the file of
     1398 +that name is not currently open, it is opened. As long as the stream remains
     1399 +open, subsequent calls in which \fIexpression\fR evaluates to the same string
     1400 +value read subsequent records from the file. The file remains open until the
     1401 +\fBclose\fR function is called with an expression that evaluates to the same
     1402 +string value. If \fIvar\fR is missing, \fB$0\fR and \fBNF\fR are set. Otherwise,
     1403 +\fIvar\fR is set.
     1404 +.sp
     1405 +The \fBgetline\fR operator can form ambiguous constructs when there are binary
     1406 +operators that are not in parentheses (including concatenate) to the right of
     1407 +the \fB<\fR (up to the end of the expression containing the \fBgetline\fR). The
     1408 +result of evaluating such a construct is unspecified, and portable
     1409 +applications must properly parenthesize all such uses.
     1410 +.RE
     1411 +
     1412 +.sp
     1413 +.ne 2
     1414 +.na
     1415 +\fB\fBsystem\fR(\fIexpression\fR)\fR
     1416 +.ad
     1417 +.RS 27n
     1418 +Execute the command given by \fIexpression\fR in a manner equivalent to the
     1419 +\fBsystem\fR(3C) function and return the exit status of the command.
     1420 +.RE
     1421 +
     1422 +.sp
 447 1423  .LP
     1424 +All forms of \fBgetline\fR return \fB1\fR for successful input, \fB0\fR for end
     1425 +of file, and \fB\(mi1\fR for an error.
     1426 +.sp
     1427 +.LP
     1428 +Where strings are used as the name of a file or pipeline, the strings must be
     1429 +textually identical. The terminology ``same string value'' implies that
     1430 +``equivalent strings'', even those that differ only by space characters,
     1431 +represent different files.
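The interaction between getline, close, and the "same string value" rule can be sketched as below. The function name getline_demo and the temporary file are hypothetical. The getline expressions are parenthesized, in line with the advice on ambiguous constructs.

```shell
# Hypothetical helper: read a two-line file to the end, close it, then
# read again through the same string value.
getline_demo() {
    tmp=$(mktemp) || return 1
    printf 'one\ntwo\n' > "$tmp"
    awk -v f="$tmp" 'BEGIN {
        while ((getline line < f) > 0)   # 1 per record, 0 at end of file
            n++
        close(f)                         # same string value as used to open
        if ((getline line < f) > 0)      # reopened: first record again
            print n, line
    }'
    rm -f "$tmp"
}

getline_demo
```

Without the call to close, the second getline would return 0, because the file would still be open and positioned at end of file.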
     1432 +
     1433 +.SS "User-defined Functions"
     1434 +The \fBawk\fR language also provides user-defined functions. Such functions
     1435 +can be defined as:
     1436 +.sp
     1437 +.in +2
     1438 +.nf
     1439 +\fBfunction\fR \fIname\fR(\fIargs\fR,\|.\|.\|.) { \fIstatements\fR }
     1440 +.fi
     1441 +.in -2
     1442 +
     1443 +.sp
     1444 +.LP
     1445 +A function can be referred to anywhere in an \fBawk\fR program; in particular,
     1446 +its use can precede its definition. The scope of a function is global.
     1447 +.sp
     1448 +.LP
     1449 +Function arguments can be either scalars or arrays; the behavior is undefined
     1450 +if an array name is passed as an argument that the function uses as a scalar,
     1451 +or if a scalar expression is passed as an argument that the function uses as an
     1452 +array. Function arguments are passed by value if scalar and by reference if
     1453 +array name. Argument names are local to the function; all other variable names
     1454 +are global. The same name must not be used as both an argument name and as the name
     1455 +of a function or a special \fBawk\fR variable. The same name must not be used
     1456 +both as a variable name with global scope and as the name of a function. The
     1457 +same name must not be used within the same scope both as a scalar variable and
     1458 +as an array.
     1459 +.sp
     1460 +.LP
     1461 +The number of parameters in the function definition need not match the number
     1462 +of parameters in the function call. Excess formal parameters can be used as
     1463 +local variables. If fewer arguments are supplied in a function call than are in
     1464 +the function definition, the extra parameters that are used in the function
     1465 +body as scalars are initialized with a string value of the null string and a
     1466 +numeric value of zero, and the extra parameters that are used in the function
     1467 +body as arrays are initialized as empty arrays. If more arguments are supplied
     1468 +in a function call than are in the function definition, the behavior is
     1469 +undefined.
     1470 +.sp
     1471 +.LP
     1472 +When invoking a function, no white space can be placed between the function
     1473 +name and the opening parenthesis. Function calls can be nested and recursive
     1474 +calls can be made upon functions. Upon return from any nested or recursive
     1475 +function call, the values of all of the calling function's parameters are
     1476 +unchanged, except for array parameters passed by reference. The \fBreturn\fR
     1477 +statement can be used to return a value. If a \fBreturn\fR statement appears
     1478 +outside of a function definition, the behavior is undefined.
     1479 +.sp
     1480 +.LP
     1481 +In the function definition, newline characters are optional before the opening
     1482 +brace and after the closing brace. Function definitions can appear anywhere in
     1483 +the program where a \fIpattern-action\fR pair is allowed.
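The use of excess formal parameters as local variables, mentioned above, might look like the following sketch. The names locals_demo and sum_to are hypothetical. The extra spaces in the parameter list are only a common convention for separating real arguments from locals.

```shell
# Hypothetical helper: a function whose extra formal parameters act as
# local variables, leaving a global of the same name untouched.
locals_demo() {
    awk '
    # i and total are extra formal parameters used as local variables
    function sum_to(n,    i, total) {
        for (i = 1; i <= n; i++)
            total += i
        return total
    }
    BEGIN {
        i = 99                 # global i, unaffected by the call below
        print sum_to(4), i
    }'
}

locals_demo
```

Because total is not supplied by the caller, it starts with a numeric value of zero, as the text describes for scalar parameters.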
     1484 +
     1485 +.SH USAGE
     1486 +The \fBindex\fR, \fBlength\fR, \fBmatch\fR, and \fBsubstr\fR functions should
     1487 +not be confused with similar functions in the \fBISO C\fR standard; the
     1488 +\fBawk\fR versions deal with characters, while the \fBISO C\fR standard deals
     1489 +with bytes.
     1490 +.sp
     1491 +.LP
     1492 +Because the concatenation operation is represented by adjacent expressions
     1493 +rather than an explicit operator, it is often necessary to use parentheses to
     1494 +enforce the proper evaluation precedence.
     1495 +.sp
     1496 +.LP
 448 1497  See \fBlargefile\fR(5) for the description of the behavior of \fBawk\fR when
 449      -encountering files greater than or equal to 2 Gbyte ( 2^31 bytes).
     1498 +encountering files greater than or equal to 2 Gbyte (2^31 bytes).
     1499 +
 450 1500  .SH EXAMPLES
     1501 +The \fBawk\fR program specified in the command line is most easily specified
     1502 +within single-quotes (for example, \fB\&'\fR\fIprogram\fR\fB\&'\fR) for
     1503 +applications using \fBsh\fR, because \fBawk\fR programs commonly contain
     1504 +characters that are special to the shell, including double-quotes. In the cases
     1505 +where an \fBawk\fR program contains single-quote characters, it is usually
     1506 +easiest to specify most of the program as strings within single-quotes
     1507 +concatenated by the shell with quoted single-quote characters. For example:
     1508 +.sp
     1509 +.in +2
     1510 +.nf
     1511 +awk '/'\e''/ { print "quote:", $0 }'
     1512 +.fi
     1513 +.in -2
     1514 +
     1515 +.sp
 451 1516  .LP
 452      -\fBExample 1 \fRPrinting Lines Longer Than 72 Characters
     1517 +prints all lines from the standard input containing a single-quote character,
     1518 +prefixed with \fBquote:\fR.
 453 1519  .sp
 454 1520  .LP
 455      -The following example is an \fBawk\fR script that can be executed by an \fBawk
 456      --f examplescript\fR style command. It prints lines longer than seventy two
 457      -characters:
     1521 +The following are examples of simple \fBawk\fR programs:
     1522 +.LP
     1523 +\fBExample 1 \fRWrite to the standard output all input lines for which field 3
     1524 +is greater than 5:
     1525 +.sp
     1526 +.in +2
     1527 +.nf
     1528 +\fB$3 > 5\fR
     1529 +.fi
     1530 +.in -2
     1531 +.sp
 458 1532  
     1533 +.LP
     1534 +\fBExample 2 \fRWrite every tenth line:
 459 1535  .sp
 460 1536  .in +2
 461 1537  .nf
 462      -\fBlength > 72\fR
     1538 +\fB(NR % 10) == 0\fR
 463 1539  .fi
 464 1540  .in -2
 465 1541  .sp
 466 1542  
 467 1543  .LP
 468      -\fBExample 2 \fRPrinting Fields in Opposite Order
     1544 +\fBExample 3 \fRWrite any line with a substring matching the regular
     1545 +expression:
 469 1546  .sp
     1547 +.in +2
     1548 +.nf
     1549 +\fB/(G|D)(2[0-9][[:alpha:]]*)/\fR
     1550 +.fi
     1551 +.in -2
     1552 +.sp
     1553 +
 470 1554  .LP
 471      -The following example is an \fBawk\fR script that can be executed by an \fBawk
 472      --f examplescript\fR style command. It prints the first two fields in opposite
 473      -order:
     1555 +\fBExample 4 \fRPrint any line with a substring containing a G or D, followed
     1556 +by a sequence of digits and characters:
     1557 +.sp
     1558 +.LP
     1559 +This example uses character classes \fBdigit\fR and \fBalpha\fR to match
     1560 +language-independent digit and alphabetic characters, respectively.
 474 1561  
 475 1562  .sp
 476 1563  .in +2
 477 1564  .nf
 478      -\fB{ print $2, $1 }\fR
     1565 +\fB/(G|D)([[:digit:][:alpha:]]*)/\fR
 479 1566  .fi
 480 1567  .in -2
 481 1568  .sp
 482 1569  
 483 1570  .LP
 484      -\fBExample 3 \fRPrinting Fields in Opposite Order with the Input Fields
 485      -Separated
     1571 +\fBExample 5 \fRWrite any line in which the second field matches the regular
     1572 +expression and the fourth field does not:
 486 1573  .sp
     1574 +.in +2
     1575 +.nf
     1576 +\fB$2 ~ /xyz/ && $4 !~ /xyz/\fR
     1577 +.fi
     1578 +.in -2
     1579 +.sp
     1580 +
 487 1581  .LP
 488      -The following example is an \fBawk\fR script that can be executed by an \fBawk
 489      --f examplescript\fR style command. It prints the first two input fields in
 490      -opposite order, separated by a comma, blanks or tabs:
     1582 +\fBExample 6 \fRWrite any line in which the second field contains a backslash:
     1583 +.sp
     1584 +.in +2
     1585 +.nf
     1586 +\fB$2 ~ /\e\e/\fR
     1587 +.fi
     1588 +.in -2
     1589 +.sp
 491 1590  
     1591 +.LP
     1592 +\fBExample 7 \fRWrite any line in which the second field contains a backslash
     1593 +(alternate method):
 492 1594  .sp
     1595 +.LP
     1596 +Notice that backslash escapes are interpreted twice, once in lexical processing
     1597 +of the string and once in processing the regular expression.
     1598 +
     1599 +.sp
 493 1600  .in +2
 494 1601  .nf
 495      -\fBBEGIN { FS = ",[ \et]*|[ \et]+" }
 496      -      { print $2, $1 }\fR
     1602 +\fB$2 ~ "\e\e\e\e"\fR
 497 1603  .fi
 498 1604  .in -2
 499 1605  .sp
 500 1606  
 501 1607  .LP
 502      -\fBExample 4 \fRAdding Up the First Column, Printing the Sum and Average
     1608 +\fBExample 8 \fRWrite the second to the last and the last field in each line,
     1609 +separating the fields by a colon:
 503 1610  .sp
     1611 +.in +2
     1612 +.nf
     1613 +\fB{OFS=":";print $(NF-1), $NF}\fR
     1614 +.fi
     1615 +.in -2
     1616 +.sp
     1617 +
 504 1618  .LP
 505      -The following example is an \fBawk\fR script that can be executed by an \fBawk
 506      --f examplescript\fR style command.  It adds up the first column, and prints the
 507      -sum and average:
     1619 +\fBExample 9 \fRWrite the line number and number of fields in each line:
     1620 +.sp
     1621 +.LP
     1622 +The three strings representing the line number, the colon and the number of
     1623 +fields are concatenated and that string is written to standard output.
 508 1624  
 509 1625  .sp
 510 1626  .in +2
 511 1627  .nf
 512      -\fB{ s += $1 }
 513      -END  { print "sum is", s, " average is", s/NR }\fR
     1628 +\fB{print NR ":" NF}\fR
 514 1629  .fi
 515 1630  .in -2
 516 1631  .sp
 517 1632  
 518 1633  .LP
 519      -\fBExample 5 \fRPrinting Fields in Reverse Order
     1634 +\fBExample 10 \fRWrite lines longer than 72 characters:
 520 1635  .sp
     1636 +.in +2
     1637 +.nf
     1638 +\fBlength($0) > 72\fR
     1639 +.fi
     1640 +.in -2
     1641 +.sp
     1642 +
 521 1643  .LP
 522      -The following example is an \fBawk\fR script that can be executed by an \fBawk
 523      --f examplescript\fR style command. It prints fields in reverse order:
     1644 +\fBExample 11 \fRWrite first two fields in opposite order separated by the OFS:
     1645 +.sp
     1646 +.in +2
     1647 +.nf
     1648 +\fB{ print $2, $1 }\fR
     1649 +.fi
     1650 +.in -2
     1651 +.sp
 524 1652  
     1653 +.LP
     1654 +\fBExample 12 \fRSame, with input fields separated by comma or space and tab
     1655 +characters, or both:
 525 1656  .sp
 526 1657  .in +2
 527 1658  .nf
 528      -\fB{ for (i = NF; i > 0; \(mi\(mii) print $i }\fR
     1659 +\fBBEGIN { FS = ",[ \et]*|[ \et]+" }
     1660 +      { print $2, $1 }\fR
 529 1661  .fi
 530 1662  .in -2
 531 1663  .sp
 532 1664  
 533 1665  .LP
 534      -\fBExample 6 \fRPrinting All lines Between \fBstart/stop\fR Pairs
     1666 +\fBExample 13 \fRAdd up first column, print sum and average:
 535 1667  .sp
     1668 +.in +2
     1669 +.nf
     1670 +\fB{s += $1 }
     1671 +END {print "sum is ", s, " average is", s/NR}\fR
     1672 +.fi
     1673 +.in -2
     1674 +.sp
     1675 +
 536 1676  .LP
 537      -The following example is an \fBawk\fR script that can be executed by an \fBawk
 538      --f examplescript\fR style command. It prints all lines between start/stop
 539      -pairs.
     1677 +\fBExample 14 \fRWrite fields in reverse order, one per line (many lines out
     1678 +for each line in):
     1679 +.sp
     1680 +.in +2
     1681 +.nf
     1682 +\fB{ for (i = NF; i > 0; --i) print $i }\fR
     1683 +.fi
     1684 +.in -2
     1685 +.sp
 540 1686  
     1687 +.LP
     1688 +\fBExample 15 \fRWrite all lines between occurrences of the strings "start" and
     1689 +"stop":
 541 1690  .sp
 542 1691  .in +2
 543 1692  .nf
 544 1693  \fB/start/, /stop/\fR
 545 1694  .fi
 546 1695  .in -2
 547 1696  .sp
 548 1697  
 549 1698  .LP
 550      -\fBExample 7 \fRPrinting All Lines Whose First Field is Different from the
 551      -Previous One
     1699 +\fBExample 16 \fRWrite all lines whose first field is different from the
     1700 +previous one:
 552 1701  .sp
     1702 +.in +2
     1703 +.nf
     1704 +\fB$1 != prev { print; prev = $1 }\fR
     1705 +.fi
     1706 +.in -2
     1707 +.sp
     1708 +
 553 1709  .LP
 554      -The following example is an \fBawk\fR script that can be executed by an \fBawk
 555      --f examplescript\fR style command. It prints all lines whose first field is
 556      -different from the previous one.
     1710 +\fBExample 17 \fRSimulate the echo command:
     1711 +.sp
     1712 +.in +2
     1713 +.nf
     1714 +\fBBEGIN  {
     1715 +       for (i = 1; i < ARGC; ++i)
     1716 +             printf "%s%s", ARGV[i], i==ARGC-1?"\en":""
     1717 +       }\fR
     1718 +.fi
     1719 +.in -2
     1720 +.sp
 557 1721  
     1722 +.LP
     1723 +\fBExample 18 \fRWrite the path prefixes contained in the PATH environment
     1724 +variable, one per line:
 558 1725  .sp
 559 1726  .in +2
 560 1727  .nf
 561      -\fB$1 != prev { print; prev = $1 }\fR
     1728 +\fBBEGIN  {
     1729 +       n = split (ENVIRON["PATH"], path, ":")
     1730 +       for (i = 1; i <= n; ++i)
     1731 +              print path[i]
     1732 +       }\fR
 562 1733  .fi
 563 1734  .in -2
 564 1735  .sp
 565 1736  
 566 1737  .LP
 567      -\fBExample 8 \fRPrinting a File and Filling in Page numbers
     1738 +\fBExample 19 \fRPrint the file "input", filling in page numbers starting at 5:
 568 1739  .sp
 569 1740  .LP
 570      -The following example is an \fBawk\fR script that can be executed by an \fBawk
 571      --f examplescript\fR style command. It prints a file and fills in page numbers
 572      -starting at 5:
     1741 +If there is a file named \fBinput\fR containing page headers of the form
 573 1742  
 574 1743  .sp
 575 1744  .in +2
 576 1745  .nf
 577      -\fB/Page/       { $2 = n++; }
 578      -           { print }\fR
     1746 +Page#
 579 1747  .fi
 580 1748  .in -2
 581      -.sp
 582 1749  
     1750 +.sp
 583 1751  .LP
 584      -\fBExample 9 \fRPrinting a File and Numbering Its Pages
     1752 +and a file named \fBprogram\fR that contains
     1753 +
 585 1754  .sp
     1755 +.in +2
     1756 +.nf
     1757 +/Page/{ $2 = n++; }
     1758 +{ print }
     1759 +.fi
     1760 +.in -2
     1761 +
     1762 +.sp
 586 1763  .LP
 587      -Assuming this program is in a file named \fBprog\fR, the following example
 588      -prints the file \fBinput\fR numbering its pages starting at \fB5\fR:
     1764 +then the command line
 589 1765  
 590 1766  .sp
 591 1767  .in +2
 592 1768  .nf
 593      -example% \fBawk -f prog n=5 input\fR
     1769 +\fBawk -f program n=5 input\fR
 594 1770  .fi
 595 1771  .in -2
 596 1772  .sp
 597 1773  
 598      -.SH ENVIRONMENT VARIABLES
 599 1774  .sp
 600 1775  .LP
     1776 +prints the file \fBinput\fR, filling in page numbers starting at 5.
     1777 +
     1778 +.SH ENVIRONMENT VARIABLES
 601 1779  See \fBenviron\fR(5) for descriptions of the following environment variables
 602      -that affect the execution of \fBawk\fR: \fBLANG\fR, \fBLC_ALL\fR,
 603      -\fBLC_COLLATE\fR, \fBLC_CTYPE\fR, \fBLC_MESSAGES\fR, \fBNLSPATH\fR, and
 604      -\fBPATH\fR.
     1780 +that affect execution: \fBLC_COLLATE\fR, \fBLC_CTYPE\fR, \fBLC_MESSAGES\fR, and
     1781 +\fBNLSPATH\fR.
 605 1782  .sp
 606 1783  .ne 2
 607 1784  .na
 608 1785  \fB\fBLC_NUMERIC\fR\fR
 609 1786  .ad
 610 1787  .RS 14n
 611 1788  Determine the radix character used when interpreting numeric input, performing
 612 1789  conversions between numeric and string values and formatting numeric output.
 613 1790  Regardless of locale, the period character (the decimal-point character of the
 614 1791  POSIX locale) is the decimal-point character recognized in processing \fBawk\fR
 615 1792  programs (including assignments in command-line arguments).
 616 1793  .RE
 617 1794  
 618      -.SH ATTRIBUTES
     1795 +.SH EXIT STATUS
     1796 +The following exit values are returned:
 619 1797  .sp
 620      -.LP
 621      -See \fBattributes\fR(5) for descriptions of the following attributes:
 622      -.SS "/usr/bin/awk"
 623      -.sp
     1798 +.ne 2
     1799 +.na
     1800 +\fB\fB0\fR\fR
     1801 +.ad
     1802 +.RS 6n
     1803 +All input files were processed successfully.
     1804 +.RE
 624 1805  
 625 1806  .sp
 626      -.TS
 627      -box;
 628      -c | c
 629      -l | l .
 630      -ATTRIBUTE TYPE  ATTRIBUTE VALUE
 631      -_
 632      -CSI     Not Enabled
 633      -.TE
     1807 +.ne 2
     1808 +.na
     1809 +\fB\fB>0\fR\fR
     1810 +.ad
     1811 +.RS 6n
     1812 +An error occurred.
     1813 +.RE
 634 1814  
 635      -.SS "/usr/xpg4/bin/awk"
 636 1815  .sp
     1816 +.LP
     1817 +The exit status can be altered within the program by using an \fBexit\fR
     1818 +expression.
 637 1819  
 638      -.sp
 639      -.TS
 640      -box;
 641      -c | c
 642      -l | l .
 643      -ATTRIBUTE TYPE  ATTRIBUTE VALUE
 644      -_
 645      -CSI     Enabled
 646      -_
 647      -Interface Stability     Standard
 648      -.TE
 649      -
 650 1820  .SH SEE ALSO
     1821 +\fBed\fR(1), \fBegrep\fR(1), \fBgrep\fR(1), \fBlex\fR(1), \fBoawk\fR(1),
     1822 +\fBsed\fR(1), \fBpopen\fR(3C), \fBprintf\fR(3C), \fBsystem\fR(3C),
     1823 +\fBattributes\fR(5), \fBenviron\fR(5), \fBlargefile\fR(5), \fBregex\fR(5),
     1824 +\fBXPG4\fR(5)
 651 1825  .sp
 652 1826  .LP
 653      -\fBegrep\fR(1), \fBgrep\fR(1), \fBnawk\fR(1), \fBsed\fR(1), \fBprintf\fR(3C),
 654      -\fBattributes\fR(5), \fBenviron\fR(5), \fBlargefile\fR(5), \fBstandards\fR(5)
 655      -.SH NOTES
     1827 +Aho, A. V., B. W. Kernighan, and P. J. Weinberger, \fIThe AWK Programming
     1828 +Language\fR, Addison-Wesley, 1988.
     1829 +
     1830 +.SH DIAGNOSTICS
     1831 +If any \fIfile\fR operand is specified and the named file cannot be accessed,
     1832 +\fBawk\fR writes a diagnostic message to standard error and terminates without
     1833 +any further action.
 656 1834  .sp
 657 1835  .LP
     1836 +If the program specified by either the \fIprogram\fR operand or a
     1837 +\fIprogfile\fR operand is not a valid \fBawk\fR program (as specified in
     1838 +\fBEXTENDED DESCRIPTION\fR), the behavior is undefined.
     1839 +
     1840 +.SH NOTES
 658 1841  Input white space is not preserved on output if fields are involved.
 659 1842  .sp
 660 1843  .LP
 661 1844  There are no explicit conversions between numbers and strings. To force an
 662      -expression to be treated as a number, add \fB0\fR to it. To force an expression
 663      -to be treated as a string, concatenate the null string (\fB""\fR) to it.
     1845 +expression to be treated as a number add 0 to it; to force it to be treated as
     1846 +a string concatenate the null string (\fB""\fR) to it.
    