1 .\"
   2 .\" Sun Microsystems, Inc. gratefully acknowledges The Open Group for
   3 .\" permission to reproduce portions of its copyrighted documentation.
   4 .\" Original documentation from The Open Group can be obtained online at
   5 .\" http://www.opengroup.org/bookstore/.
   6 .\"
   7 .\" The Institute of Electrical and Electronics Engineers and The Open
   8 .\" Group, have given us permission to reprint portions of their
   9 .\" documentation.
  10 .\"
  11 .\" In the following statement, the phrase ``this text'' refers to portions
  12 .\" of the system documentation.
  13 .\"
  14 .\" Portions of this text are reprinted and reproduced in electronic form
  15 .\" in the SunOS Reference Manual, from IEEE Std 1003.1, 2004 Edition,
  16 .\" Standard for Information Technology -- Portable Operating System
  17 .\" Interface (POSIX), The Open Group Base Specifications Issue 6,
  18 .\" Copyright (C) 2001-2004 by the Institute of Electrical and Electronics
  19 .\" Engineers, Inc and The Open Group.  In the event of any discrepancy
  20 .\" between these versions and the original IEEE and The Open Group
  21 .\" Standard, the original IEEE and The Open Group Standard is the referee
  22 .\" document.  The original Standard can be obtained online at
  23 .\" http://www.opengroup.org/unix/online.html.
  24 .\"
  25 .\" This notice shall appear on any product containing this material.
  26 .\"
  27 .\" The contents of this file are subject to the terms of the
  28 .\" Common Development and Distribution License (the "License").
  29 .\" You may not use this file except in compliance with the License.
  30 .\"
  31 .\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
  32 .\" or http://www.opensolaris.org/os/licensing.
  33 .\" See the License for the specific language governing permissions
  34 .\" and limitations under the License.
  35 .\"
  36 .\" When distributing Covered Code, include this CDDL HEADER in each
  37 .\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
  38 .\" If applicable, add the following below this CDDL HEADER, with the
  39 .\" fields enclosed by brackets "[]" replaced with your own identifying
  40 .\" information: Portions Copyright [yyyy] [name of copyright owner]
  41 .\"
  42 .\"
  43 .\" Copyright 1989 AT&T
  44 .\" Copyright 1992, X/Open Company Limited  All Rights Reserved
  45 .\" Portions Copyright (c) 2005, 2006 Sun Microsystems, Inc. All Rights Reserved
  46 .\" Copyright 2020 Joyent, Inc.
  47 .\"
  48 .TH AWK 1 "Apr 20, 2020"
  49 .SH NAME
  50 awk \- pattern scanning and processing language
  51 .SH SYNOPSIS
  52 .nf
  53 \fB/usr/bin/awk\fR [\fB-F\fR \fIERE\fR] [\fB-v\fR \fIassignment\fR] \fI\&'program'\fR | \fB-f\fR \fIprogfile\fR...
  54      [\fIargument\fR]...
  55 .fi
  56 
  57 .LP
  58 .nf
  59 \fB/usr/bin/nawk\fR [\fB-F\fR \fIERE\fR] [\fB-v\fR \fIassignment\fR] \fI\&'program'\fR | \fB-f\fR \fIprogfile\fR...
  60      [\fIargument\fR]...
  61 .fi
  62 
  63 .LP
  64 .nf
  65 \fB/usr/xpg4/bin/awk\fR [\fB-F\fR \fIERE\fR] [\fB-v\fR \fIassignment\fR]... \fI\&'program'\fR | \fB-f\fR \fIprogfile\fR...
  66      [\fIargument\fR]...
  67 .fi
  68 
  69 .SH DESCRIPTION
  70 NOTE: The \fBnawk\fR command is now the system default awk for illumos.
  71 .LP
  72 The \fB/usr/bin/awk\fR and \fB/usr/xpg4/bin/awk\fR utilities execute
  73 \fIprogram\fRs written in the \fBawk\fR programming language, which is
specialized for textual data manipulation. An \fBawk\fR \fIprogram\fR is a
sequence of patterns and corresponding actions. The string specifying
\fIprogram\fR must be enclosed in single quotes (') to protect it from
interpretation by the shell. The sequence of pattern-action statements can be
specified on the command line as \fIprogram\fR or in one or more files
specified by the \fB-f\fR \fIprogfile\fR option. When input is read that matches
  80 a pattern, the action associated with the pattern is performed.
  81 .sp
  82 .LP
  83 Input is interpreted as a sequence of records. By default, a record is a line,
  84 but this can be changed by using the \fBRS\fR built-in variable. Each record of
  85 input is matched to each pattern in the \fIprogram\fR. For each pattern
  86 matched, the associated action is executed.
  87 .sp
  88 .LP
  89 The \fBawk\fR utility interprets each input record as a sequence of fields
  90 where, by default, a field is a string of non-blank characters. This default
  91 white-space field delimiter (blanks and/or tabs) can be changed by using the
\fBFS\fR built-in variable or the \fB-F\fR \fIERE\fR option. The \fBawk\fR
  93 utility denotes the first field in a record \fB$1\fR, the second \fB$2\fR, and
  94 so forth. The symbol \fB$0\fR refers to the entire record; setting any other
  95 field causes the reevaluation of \fB$0\fR. Assigning to \fB$0\fR resets the
  96 values of all fields and the \fBNF\fR built-in variable.
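.sp
.LP
As a minimal sketch of the field rules above (the sample input line and the
shell variable \fBout\fR are illustrative assumptions, not part of the
specification), the following reads one record, prints a field, and then
assigns to a field so that \fB$0\fR is rebuilt:
.sp
.in +2
.nf
```shell
# Print the second field, replace it, then show the rebuilt record and NF.
out=$(echo "alpha beta gamma" | awk '{ print $2; $2 = "X"; print $0; print NF }')
echo "$out"
```
.fi
.in -2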
  97 
  98 .SH OPTIONS
  99 The following options are supported:
 100 .sp
 101 .ne 2
 102 .na
 103 \fB\fB-F\fR \fIERE\fR\fR
 104 .ad
 105 .RS 17n
Define the input field separator to be the extended regular expression
\fIERE\fR (which can be a single character) before any input is read.
 108 .RE
 109 
 110 .sp
 111 .ne 2
 112 .na
 113 \fB\fB-f\fR \fIprogfile\fR\fR
 114 .ad
 115 .RS 17n
Specifies the pathname of the file \fIprogfile\fR containing an \fBawk\fR
 117 program. If multiple instances of this option are specified, the concatenation
 118 of the files specified as \fIprogfile\fR in the order specified is the
 119 \fBawk\fR program. The \fBawk\fR program can alternatively be specified in
 120 the command line as a single argument.
 121 .RE
 122 
 123 .sp
 124 .ne 2
 125 .na
 126 \fB\fB-v\fR \fIassignment\fR\fR
 127 .ad
 128 .RS 17n
 129 The \fIassignment\fR argument must be in the same form as an \fIassignment\fR
 130 operand. The assignment is of the form \fIvar=value\fR, where \fIvar\fR is the
 131 name of one of the variables described below. The specified assignment occurs
 132 before executing the \fBawk\fR program, including the actions associated with
 133 \fBBEGIN\fR patterns (if any). Multiple occurrences of this option can be
 134 specified.
 135 .RE
 136 
 137 .sp
 138 .ne 2
 139 .na
 140 \fB\fB-safe\fR\fR
 141 .ad
 142 .RS 17n
 143 When passed to \fBawk\fR, this flag will prevent the program from opening new
 144 files or running child processes. The \fBENVIRON\fR array will also not be
 145 initialized.
 146 .RE
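.sp
.LP
The following sketch shows \fB-F\fR and \fB-v\fR used together. The input
line, the variable name \fBlabel\fR, and the shell variable \fBout\fR are
assumptions chosen for illustration only:
.sp
.in +2
.nf
```shell
# -F sets the field separator; -v assigns label before BEGIN actions run.
out=$(echo "root:x:0:0" | awk -F: -v label=uid 'BEGIN { print label } { print $3 }')
echo "$out"
```
.fi
.in -2
.sp
.LP
Because the \fB-v\fR assignment occurs before the program starts, the value
is already visible inside the \fBBEGIN\fR action.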
 147 
 148 .SH OPERANDS
 149 The following operands are supported:
 150 .sp
 151 .ne 2
 152 .na
 153 \fB\fIprogram\fR\fR
 154 .ad
 155 .RS 12n
 156 If no \fB-f\fR option is specified, the first operand to \fBawk\fR is the text
 157 of the \fBawk\fR program. The application supplies the \fIprogram\fR operand
as a single argument to \fBawk\fR. If the text does not end in a newline
 159 character, \fBawk\fR interprets the text as if it did.
 160 .RE
 161 
 162 .sp
 163 .ne 2
 164 .na
 165 \fB\fIargument\fR\fR
 166 .ad
 167 .RS 12n
 168 Either of the following two types of \fIargument\fR can be intermixed:
 169 .sp
 170 .ne 2
 171 .na
 172 \fB\fIfile\fR\fR
 173 .ad
 174 .RS 14n
 175 A pathname of a file that contains the input to be read, which is matched
 176 against the set of patterns in the program. If no \fIfile\fR operands are
 177 specified, or if a \fIfile\fR operand is \fB\(mi\fR, the standard input is
 178 used.
 179 .RE
 180 
 181 .sp
 182 .ne 2
 183 .na
 184 \fB\fIassignment\fR\fR
 185 .ad
 186 .RS 14n
 187 An operand that begins with an underscore or alphabetic character from the
 188 portable character set, followed by a sequence of underscores, digits and
 189 alphabetics from the portable character set, followed by the \fB=\fR character
 190 specifies a variable assignment rather than a pathname. The characters before
the \fB=\fR represent the name of an \fBawk\fR variable. If that name is an
\fBawk\fR reserved word, the behavior is undefined. The characters following
the equal sign are interpreted as if they appeared in the \fBawk\fR program
preceded and followed by a double-quote (\fB"\fR) character, as a \fBSTRING\fR
token, except that if the last character is an unescaped backslash, it is
interpreted as a literal backslash rather than as the first character of the
sequence \fB\e"\fR\&. The variable is assigned the value of that \fBSTRING\fR
token. If the value is considered a \fInumeric string\fR, the variable is
assigned its numeric value. Each such variable assignment is performed just
before the processing of the following \fIfile\fR, if any. Thus, an assignment
before the first \fIfile\fR argument is executed after the \fBBEGIN\fR actions
 202 (if any), while an assignment after the last \fIfile\fR argument is executed
 203 before the \fBEND\fR actions (if any).  If there are no \fIfile\fR arguments,
 204 assignments are executed before processing the standard input.
 205 .RE
 206 
 207 .RE
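.sp
.LP
The timing of \fIassignment\fR operands can be sketched as follows. The
temporary file created with \fBmktemp\fR(1), the variable name \fBv\fR, and
the shell variables \fBtmp\fR and \fBout\fR are assumptions made for this
example:
.sp
.in +2
.nf
```shell
# One input file with a single record.
tmp=$(mktemp)
echo "data" > "$tmp"
# v=1 is applied before the file is read, v=2 after it and before END.
out=$(awk 'BEGIN { print "begin:" v } { print "main:" v } END { print "end:" v }' v=1 "$tmp" v=2)
rm -f "$tmp"
echo "$out"
```
.fi
.in -2
.sp
.LP
The \fBBEGIN\fR action sees \fBv\fR as uninitialized, the record from the
file sees the first assignment, and the \fBEND\fR action sees the second.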
 208 
 209 .SH INPUT FILES
 210 Input files to the \fBawk\fR program from any of the following sources:
 211 .RS +4
 212 .TP
 213 .ie t \(bu
 214 .el o
 215 any \fIfile\fR operands or their equivalents, achieved by modifying the
 216 \fBawk\fR variables \fBARGV\fR and \fBARGC\fR
 217 .RE
 218 .RS +4
 219 .TP
 220 .ie t \(bu
 221 .el o
 222 standard input in the absence of any \fIfile\fR operands
 223 .RE
 224 .RS +4
 225 .TP
 226 .ie t \(bu
 227 .el o
 228 arguments to the \fBgetline\fR function
 229 .RE
 230 .sp
 231 .LP
must be text files. Whether or not the variable \fBRS\fR is set to a value
other than a newline character, implementations support records in these files
that are terminated with the specified separator, up to \fB{LINE_MAX}\fR bytes,
and can support longer records.
 236 .sp
 237 .LP
 238 If \fB-\fR\fBf\fR \fIprogfile\fR is specified, the files named by each of the
 239 \fIprogfile\fR option-arguments must be text files containing an \fBawk\fR
 240 program.
 241 .sp
 242 .LP
The standard input is used only if no \fIfile\fR operands are specified, or if
 244 a \fIfile\fR operand is \fB\(mi\fR\&.
 245 
 246 .SH EXTENDED DESCRIPTION
An \fBawk\fR program is composed of pairs of the form:
 248 .sp
 249 .in +2
 250 .nf
 251 pattern { \fIaction\fR }
 252 .fi
 253 .in -2
 254 
 255 .sp
 256 .LP
 257 Either the pattern or the action (including the enclosing brace characters) can
 258 be omitted. Pattern-action statements are separated by a semicolon or by a
 259 newline.
 260 .sp
 261 .LP
 262 A missing pattern matches any record of input, and a missing action is
 263 equivalent to an action that writes the matched record of input to standard
 264 output.
 265 .sp
 266 .LP
 267 Execution of the \fBawk\fR program starts by first executing the actions
 268 associated with all \fBBEGIN\fR patterns in the order they occur in the
 269 program. Then each \fIfile\fR operand (or standard input if no files were
 270 specified) is processed by reading data from the file until a record separator
 271 is seen (a newline character by default), splitting the current record into
 272 fields using the current value of \fBFS\fR, evaluating each pattern in the
 273 program in the order of occurrence, and executing the action associated with
 274 each pattern that matches the current record. The action for a matching pattern
 275 is executed before evaluating subsequent patterns. Last, the actions associated
with all \fBEND\fR patterns are executed in the order they occur in the program.
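.sp
.LP
The execution order described above can be sketched with a small program.
The input line and the shell variable \fBout\fR are illustrative assumptions.
The third statement has no pattern, so it matches every record:
.sp
.in +2
.nf
```shell
out=$(echo "one two" | awk '
BEGIN   { print "start" }
/one/   { print "matched: " $0 }
        { print NF " fields" }
END     { print "done" }')
echo "$out"
```
.fi
.in -2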
 277 
 278 .SS "Expressions in awk"
 279 Expressions describe computations used in \fIpatterns\fR and \fIactions\fR. In
 280 the following table, valid expression operations are given in groups from
 281 highest precedence first to lowest precedence last, with equal-precedence
 282 operators grouped between horizontal lines. In expression evaluation, where the
 283 grammar is formally ambiguous, higher precedence operators are evaluated before
 284 lower precedence operators.  In this table \fIexpr,\fR \fIexpr1,\fR
 285 \fIexpr2,\fR and \fIexpr3\fR represent any expression, while \fIlvalue\fR
 286 represents any entity that can be assigned to (that is, on the left side of an
 287 assignment operator).
 288 .sp
 289 
 290 .sp
.TS
tab(@);
c c c c
l l l l .
\fBSyntax\fR@\fBName\fR@\fBType of Result\fR@\fBAssociativity\fR
_
( \fIexpr\fR )@Grouping@type of \fIexpr\fR@n/a
_
$\fIexpr\fR@Field reference@string@n/a
_
++ \fIlvalue\fR@Pre-increment@numeric@n/a
\(mi\(mi \fIlvalue\fR@Pre-decrement@numeric@n/a
\fIlvalue\fR ++@Post-increment@numeric@n/a
\fIlvalue\fR \(mi\(mi@Post-decrement@numeric@n/a
_
\fIexpr\fR ^ \fIexpr\fR@Exponentiation@numeric@right
_
! \fIexpr\fR@Logical not@numeric@n/a
+ \fIexpr\fR@Unary plus@numeric@n/a
\(mi \fIexpr\fR@Unary minus@numeric@n/a
_
\fIexpr\fR * \fIexpr\fR@Multiplication@numeric@left
\fIexpr\fR / \fIexpr\fR@Division@numeric@left
\fIexpr\fR % \fIexpr\fR@Modulus@numeric@left
_
\fIexpr\fR + \fIexpr\fR@Addition@numeric@left
\fIexpr\fR \(mi \fIexpr\fR@Subtraction@numeric@left
_
\fIexpr\fR \fIexpr\fR@String concatenation@string@left
_
\fIexpr\fR < \fIexpr\fR@Less than@numeric@none
\fIexpr\fR <= \fIexpr\fR@Less than or equal to@numeric@none
\fIexpr\fR != \fIexpr\fR@Not equal to@numeric@none
\fIexpr\fR == \fIexpr\fR@Equal to@numeric@none
\fIexpr\fR > \fIexpr\fR@Greater than@numeric@none
\fIexpr\fR >= \fIexpr\fR@Greater than or equal to@numeric@none
_
\fIexpr\fR ~ \fIexpr\fR@ERE match@numeric@none
\fIexpr\fR !~ \fIexpr\fR@ERE non-match@numeric@none
_
\fIexpr\fR in array@Array membership@numeric@left
( \fIindex\fR ) in@Multi-dimension array@numeric@left
    \fIarray\fR@membership@@
_
\fIexpr\fR && \fIexpr\fR@Logical AND@numeric@left
_
\fIexpr\fR |\|| \fIexpr\fR@Logical OR@numeric@left
_
\fIexpr1\fR ? \fIexpr2\fR@Conditional expression@type of selected@right
    : \fIexpr3\fR@@\fIexpr2\fR or \fIexpr3\fR@
_
\fIlvalue\fR ^= \fIexpr\fR@Exponentiation@numeric@right
@assignment@@
\fIlvalue\fR %= \fIexpr\fR@Modulus assignment@numeric@right
\fIlvalue\fR *= \fIexpr\fR@Multiplication@numeric@right
@assignment@@
\fIlvalue\fR /= \fIexpr\fR@Division assignment@numeric@right
\fIlvalue\fR += \fIexpr\fR@Addition assignment@numeric@right
\fIlvalue\fR \(mi= \fIexpr\fR@Subtraction assignment@numeric@right
\fIlvalue\fR = \fIexpr\fR@Assignment@type of \fIexpr\fR@right
.TE
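.sp
.LP
A few consequences of the precedence table can be checked directly. This is
only a sketch. The expressions and the shell variable \fBout\fR are chosen
for illustration:
.sp
.in +2
.nf
```shell
out=$(awk 'BEGIN {
    print 2 ^ 3 ^ 2       # right-associative, same as 2 ^ 9
    print -2 ^ 2          # ^ binds tighter than unary minus
    print 1 + 2 * 3       # * binds tighter than +
    print 1 " " 2 + 3     # + binds tighter than concatenation
}')
echo "$out"
```
.fi
.in -2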
 351 
 352 .sp
 353 .LP
 354 Each expression has either a string value, a numeric value or both. Except as
 355 stated for specific contexts, the value of an expression is implicitly
 356 converted to the type needed for the context in which it is used.  A string
 357 value is converted to a numeric value by the equivalent of the following calls:
 358 .sp
 359 .in +2
 360 .nf
 361 setlocale(LC_NUMERIC, "");
 362 \fInumeric_value\fR = atof(\fIstring_value\fR);
 363 .fi
 364 .in -2
 365 
 366 .sp
 367 .LP
 368 A numeric value that is exactly equal to the value of an integer is converted
 369 to a string by the equivalent of a call to the \fBsprintf\fR function with the
 370 string \fB%d\fR as the \fBfmt\fR argument and the numeric value being converted
 371 as the first and only \fIexpr\fR argument.  Any other numeric value is
 372 converted to a string by the equivalent of a call to the \fBsprintf\fR function
 373 with the value of the variable \fBCONVFMT\fR as the \fBfmt\fR argument and the
 374 numeric value being converted as the first and only \fIexpr\fR argument.
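.sp
.LP
The two conversion rules can be sketched as follows, assuming a
\fBCONVFMT\fR value of \fB%.3g\fR chosen only to make the effect visible.
The variable names are hypothetical. Concatenating with the empty string
forces a number-to-string conversion:
.sp
.in +2
.nf
```shell
out=$(awk 'BEGIN {
    x = 3.14159265
    n = 42.0
    CONVFMT = "%.3g"
    s = x ""              # non-integer value: converted using CONVFMT
    t = n ""              # integer value: converted as if by %d
    print s
    print t
}')
echo "$out"
```
.fi
.in -2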
 375 .sp
 376 .LP
 377 A string value is considered to be a \fInumeric string\fR in the following
 378 case:
 379 .RS +4
 380 .TP
 381 1.
Any leading and trailing blank characters are ignored.
 383 .RE
 384 .RS +4
 385 .TP
 386 2.
 387 If the first unignored character is a \fB+\fR or \fB\(mi\fR, it is ignored.
 388 .RE
 389 .RS +4
 390 .TP
 391 3.
 392 If the remaining unignored characters would be lexically recognized as a
 393 \fBNUMBER\fR token, the string is considered a \fInumeric string\fR.
 394 .RE
 395 .sp
 396 .LP
 397 If a \fB\(mi\fR character is ignored in the above steps, the numeric value of
 398 the \fInumeric string\fR is the negation of the numeric value of the recognized
 399 \fBNUMBER\fR token. Otherwise the numeric value of the \fInumeric string\fR is
 400 the numeric value of the recognized \fBNUMBER\fR token. Whether or not a string
 401 is a \fInumeric string\fR is relevant only in contexts where that term is used
 402 in this section.
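.sp
.LP
One place where the \fInumeric string\fR distinction is visible is
comparison of input fields. The following sketch uses an assumed input line.
Fields that look like numbers compare numerically, while values forced to
strings, or fields that are not numeric, compare as strings:
.sp
.in +2
.nf
```shell
out=$(echo "10 9 abc" | awk '{
    print ($1 > $2)         # both fields are numeric strings: 10 > 9
    print ($1 "" > $2 "")   # forced to strings: "10" sorts before "9"
    print ($3 > $2)         # "abc" is not numeric: string comparison
}')
echo "$out"
```
.fi
.in -2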
 403 .sp
 404 .LP
 405 When an expression is used in a Boolean context, if it has a numeric value, a
 406 value of zero is treated as false and any other value is treated as true.
 407 Otherwise, a string value of the null string is treated as false and any other
 408 value is treated as true. A Boolean context is one of the following:
 409 .RS +4
 410 .TP
 411 .ie t \(bu
 412 .el o
 413 the first subexpression of a conditional expression.
 414 .RE
 415 .RS +4
 416 .TP
 417 .ie t \(bu
 418 .el o
an expression operated on by logical NOT, logical AND, or logical OR.
 420 .RE
 421 .RS +4
 422 .TP
 423 .ie t \(bu
 424 .el o
 425 the second expression of a \fBfor\fR statement.
 426 .RE
 427 .RS +4
 428 .TP
 429 .ie t \(bu
 430 .el o
 431 the expression of an \fBif\fR statement.
 432 .RE
 433 .RS +4
 434 .TP
 435 .ie t \(bu
 436 .el o
 437 the expression of the \fBwhile\fR clause in either a \fBwhile\fR or \fBdo\fR
 438 \fB\&.\|.\|.\fR \fBwhile\fR statement.
 439 .RE
 440 .RS +4
 441 .TP
 442 .ie t \(bu
 443 .el o
 444 an expression used as a pattern (as in Overall Program Structure).
 445 .RE
 446 .sp
 447 .LP
 448 The \fBawk\fR language supplies arrays that are used for storing numbers or
 449 strings. Arrays need not be declared. They are initially empty, and their sizes
change dynamically. The subscripts, or element identifiers, are strings,
 451 providing a type of associative array capability. An array name followed by a
 452 subscript within square brackets can be used as an \fIlvalue\fR and as an
 453 expression, as described in the grammar.  Unsubscripted array names are used in
 454 only the following contexts:
 455 .RS +4
 456 .TP
 457 .ie t \(bu
 458 .el o
 459 a parameter in a function definition or function call.
 460 .RE
 461 .RS +4
 462 .TP
 463 .ie t \(bu
 464 .el o
 465 the \fBNAME\fR token following any use of the keyword \fBin\fR.
 466 .RE
 467 .sp
 468 .LP
 469 A valid array \fIindex\fR consists of one or more comma-separated expressions,
 470 similar to the way in which multi-dimensional arrays are indexed in some
 471 programming languages. Because \fBawk\fR arrays are really one-dimensional,
 472 such a comma-separated list is converted to a single string by concatenating
 473 the string values of the separate expressions, each separated from the other by
 474 the value of the \fBSUBSEP\fR variable.
 475 .sp
 476 .LP
 477 Thus, the following two index operations are equivalent:
 478 .sp
 479 .in +2
 480 .nf
 481 var[expr1, expr2, ... exprn]
 482 var[expr1 SUBSEP expr2 SUBSEP ... SUBSEP exprn]
 483 .fi
 484 .in -2
 485 
 486 .sp
 487 .LP
 488 A multi-dimensioned \fIindex\fR used with the \fBin\fR operator must be put in
 489 parentheses. The \fBin\fR operator, which tests for the existence of a
 490 particular array element, does not create the element if it does not exist.
 491 Any other reference to a non-existent array element automatically creates it.
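.sp
.LP
The following sketch exercises these array rules. The array names
\fBgrid\fR and \fBseen\fR and the printed messages are hypothetical:
.sp
.in +2
.nf
```shell
out=$(awk 'BEGIN {
    grid[1, 2] = "x"
    if ((1, 2) in grid)         print "found via index list"
    if ((1 SUBSEP 2) in grid)   print "found via SUBSEP"
    if (!("nope" in seen))      print "in did not create the element"
    x = seen["nope"]            # a plain reference creates the element
    if ("nope" in seen)         print "reference created the element"
}')
echo "$out"
```
.fi
.in -2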
 492 
 493 .SS "Variables and Special Variables"
 494 Variables can be used in an \fBawk\fR program by referencing them. With the
 495 exception of function parameters, they are not explicitly declared.
 496 Uninitialized scalar variables and array elements have both a numeric value of
 497 zero and a string value of the empty string.
 498 .sp
 499 .LP
 500 Field variables are designated by a \fB$\fR followed by a number or numerical
 501 expression. The effect of the field number \fIexpression\fR evaluating to
 502 anything other than a non-negative integer is unspecified. Uninitialized
 503 variables or string values need not be converted to numeric values in this
 504 context. New field variables are created by assigning a value to them.
 505 References to non-existent fields (that is, fields after \fB$NF\fR) produce the
 506 null string. However, assigning to a non-existent field (for example,
\fB$(NF+2) = 5\fR) increases the value of \fBNF\fR, creates any intervening
fields with the null string as their values, and causes the value of \fB$0\fR to
 509 be recomputed, with the fields being separated by the value of \fBOFS\fR. Each
 510 field variable has a string value when created. If the string, with any
 511 occurrence of the decimal-point character from the current locale changed to a
 512 period character, is considered a \fInumeric string\fR (see \fBExpressions in
 513 awk\fR above), the field variable also has the numeric value of the \fInumeric
 514 string\fR.
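.sp
.LP
As a sketch of assigning past the last field, the following uses an assumed
two-field input and sets \fBOFS\fR to a hyphen so that the rebuilt record is
easy to read:
.sp
.in +2
.nf
```shell
# $3 is created empty, $4 becomes 5, and $0 is rebuilt using OFS.
out=$(echo "a b" | awk 'BEGIN { OFS = "-" } { $(NF + 2) = 5; print NF; print $0 }')
echo "$out"
```
.fi
.in -2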
 515 
 516 .SS "/usr/bin/awk, /usr/xpg4/bin/awk"
 517 \fBawk\fR sets the following special variables that are supported by both
 518 \fB/usr/bin/awk\fR and \fB/usr/xpg4/bin/awk\fR:
 519 .sp
 520 .ne 2
 521 .na
 522 \fB\fBARGC\fR\fR
 523 .ad
 524 .RS 12n
 525 The number of elements in the \fBARGV\fR array.
 526 .RE
 527 
 528 .sp
 529 .ne 2
 530 .na
 531 \fB\fBARGV\fR\fR
 532 .ad
 533 .RS 12n
 534 An array of command line arguments, excluding options and the \fIprogram\fR
 535 argument, numbered from zero to \fBARGC\fR\(mi1.
 536 .sp
 537 The arguments in \fBARGV\fR can be modified or added to; \fBARGC\fR can be
 538 altered.  As each input file ends, \fBawk\fR treats the next non-null element
 539 of \fBARGV\fR, up to the current value of \fBARGC\fR\(mi1, inclusive, as the
 540 name of the next input file.  Setting an element of \fBARGV\fR to null means
 541 that it is not treated as an input file. The name \fB\(mi\fR indicates the
 542 standard input. If an argument matches the format of an \fIassignment\fR
 543 operand, this argument is treated as an assignment rather than a \fIfile\fR
 544 argument.
 545 .RE
 546 
 547 .sp
 548 .ne 2
 549 .na
 550 \fB\fBCONVFMT\fR\fR
 551 .ad
 552 .RS 12n
 553 The \fBprintf\fR format for converting numbers to strings (except for output
 554 statements, where \fBOFMT\fR is used). The default is \fB%.6g\fR.
 555 .RE
 556 
 557 .sp
 558 .ne 2
 559 .na
 560 \fB\fBENVIRON\fR\fR
 561 .ad
 562 .RS 12n
 563 The variable \fBENVIRON\fR is an array representing the value of the
 564 environment. The indices of the array are strings consisting of the names of
 565 the environment variables, and the value of each array element is a string
 566 consisting of the value of that variable. If the value of an environment
 567 variable is considered a \fInumeric string\fR, the array element also has its
 568 numeric value.
 569 .sp
 570 In all cases where \fBawk\fR behavior is affected by environment variables
 571 (including the environment of any commands that \fBawk\fR executes via the
 572 \fBsystem\fR function or via pipeline redirections with the \fBprint\fR
 573 statement, the \fBprintf\fR statement, or the \fBgetline\fR function), the
 574 environment used is the environment at the time \fBawk\fR began executing.
 575 .RE
 576 
 577 .sp
 578 .ne 2
 579 .na
 580 \fB\fBFILENAME\fR\fR
 581 .ad
 582 .RS 12n
 583 A pathname of the current input file. Inside a \fBBEGIN\fR action the value is
 584 undefined. Inside an \fBEND\fR action the value is the name of the last input
 585 file processed.
 586 .RE
 587 
 588 .sp
 589 .ne 2
 590 .na
 591 \fB\fBFNR\fR\fR
 592 .ad
 593 .RS 12n
 594 The ordinal number of the current record in the current file. Inside a
 595 \fBBEGIN\fR action the value is zero. Inside an \fBEND\fR action the value is
 596 the number of the last record processed in the last file processed.
 597 .RE
 598 
 599 .sp
 600 .ne 2
 601 .na
 602 \fB\fBFS\fR\fR
 603 .ad
 604 .RS 12n
 605 Input field separator regular expression; a space character by default.
 606 .RE
 607 
 608 .sp
 609 .ne 2
 610 .na
 611 \fB\fBNF\fR\fR
 612 .ad
 613 .RS 12n
 614 The number of fields in the current record. Inside a \fBBEGIN\fR action, the
 615 use of \fBNF\fR is undefined unless a \fBgetline\fR function without a
\fIvar\fR argument has been executed previously. Inside an \fBEND\fR action, \fBNF\fR
 617 retains the value it had for the last record read, unless a subsequent,
 618 redirected, \fBgetline\fR function without a \fIvar\fR argument is performed
 619 prior to entering the \fBEND\fR action.
 620 .RE
 621 
 622 .sp
 623 .ne 2
 624 .na
 625 \fB\fBNR\fR\fR
 626 .ad
 627 .RS 12n
 628 The ordinal number of the current record from the start of input. Inside a
 629 \fBBEGIN\fR action the value is zero. Inside an \fBEND\fR action the value is
 630 the number of the last record processed.
 631 .RE
 632 
 633 .sp
 634 .ne 2
 635 .na
 636 \fB\fBOFMT\fR\fR
 637 .ad
 638 .RS 12n
The \fBprintf\fR format for converting numbers to strings in output statements;
\fB"%.6g"\fR by default. The result of the conversion is unspecified if the
 641 value of \fBOFMT\fR is not a floating-point format specification.
 642 .RE
 643 
 644 .sp
 645 .ne 2
 646 .na
 647 \fB\fBOFS\fR\fR
 648 .ad
 649 .RS 12n
 650 The \fBprint\fR statement output field separator; a space character by default.
 651 .RE
 652 
 653 .sp
 654 .ne 2
 655 .na
 656 \fB\fBORS\fR\fR
 657 .ad
 658 .RS 12n
 659 The \fBprint\fR output record separator; a newline character by default.
 660 .RE
 661 
 662 .sp
 663 .ne 2
 664 .na
 665 \fB\fBRLENGTH\fR\fR
 666 .ad
 667 .RS 12n
 668 The length of the string matched by the \fBmatch\fR function.
 669 .RE
 670 
 671 .sp
 672 .ne 2
 673 .na
 674 \fB\fBRS\fR\fR
 675 .ad
 676 .RS 12n
 677 The first character of the string value of \fBRS\fR is the input record
 678 separator; a newline character by default. If \fBRS\fR contains more than one
 679 character, the results are unspecified. If \fBRS\fR is null, then records are
 680 separated by sequences of one or more blank lines. Leading or trailing blank
 681 lines do not produce empty records at the beginning or end of input, and the
 682 field separator is always newline, no matter what the value of \fBFS\fR.
 683 .RE
 684 
 685 .sp
 686 .ne 2
 687 .na
 688 \fB\fBRSTART\fR\fR
 689 .ad
 690 .RS 12n
 691 The starting position of the string matched by the \fBmatch\fR function,
 692 numbering from 1. This is always equivalent to the return value of the
 693 \fBmatch\fR function.
 694 .RE
 695 
 696 .sp
 697 .ne 2
 698 .na
 699 \fB\fBSUBSEP\fR\fR
 700 .ad
 701 .RS 12n
 702 The subscript separator string for multi-dimensional arrays. The default value
 703 is \fB\e034\fR\&.
 704 .RE
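.sp
.LP
Several of the variables above can be observed with short commands. This is
a sketch only. The operands, the sample records, and the shell variables
\fBargs\fR, \fBparas\fR, and \fBpos\fR are assumptions made for illustration:
.sp
.in +2
.nf
```shell
# ARGV holds the operands; ARGV[0] is normally the command name.
args=$(awk 'BEGIN { for (i = 1; i < ARGC; i++) print i, ARGV[i] }' first second)
echo "$args"

# A null RS selects paragraph mode: blank lines separate records.
paras=$(awk 'BEGIN { RS = "" } { print NR ": " $1 " (" NF " fields)" }' <<EOF

alice
admin

bob
guest
staff
EOF
)
echo "$paras"

# match sets RSTART and RLENGTH for the leftmost match.
pos=$(awk 'BEGIN { if (match("foobarbaz", /ba[rz]/)) print RSTART, RLENGTH }')
echo "$pos"
```
.fi
.in -2
.sp
.LP
In the second command the leading blank line does not produce an empty
record, and each line within a paragraph becomes a separate field.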
 705 
 706 .SS "/usr/bin/awk"
 707 The following variable is supported for \fB/usr/bin/awk\fR only:
 708 .sp
 709 .ne 2
 710 .na
 711 \fB\fBRT\fR\fR
 712 .ad
 713 .RS 12n
 714 The record terminator for the most recent record read. For most records this
 715 will be the same value as \fBRS\fR. At the end of a file with no trailing
 716 separator value, though, this will be set to the empty string (\fB""\fR).
 717 .RE
 718 
 719 .SS "Regular Expressions"
 720 The \fBawk\fR utility makes use of the extended regular expression notation
 721 (see \fBregex\fR(5)) except that it allows the use of C-language conventions to
 722 escape special characters within the EREs, namely \fB\e\e\fR, \fB\ea\fR,
 723 \fB\eb\fR, \fB\ef\fR, \fB\en\fR, \fB\er\fR, \fB\et\fR, \fB\ev\fR, and those
 724 specified in the following table.  These escape sequences are recognized both
 725 inside and outside bracket expressions.  Note that records need not be
 726 separated by newline characters and string constants can contain newline
 727 characters, so even the \fB\en\fR sequence is valid in \fBawk\fR EREs.  Using
 728 a slash character within the regular expression requires escaping as shown in
 729 the table below:
 730 .sp
 731 
 732 .sp
.TS
tab(@);
l l l
l l l .
\fBEscape Sequence\fR@\fBDescription\fR@\fBMeaning\fR
_
\fB\e"\fR@Backslash quotation-mark@Quotation-mark character
_
\fB\e/\fR@Backslash slash@Slash character
_
\fB\e\fR\fIddd\fR@T{
A backslash character followed by the longest sequence of one, two, or three octal-digit characters (01234567).  If all of the digits are 0, (that is, representation of the NULL character), the behavior is undefined.
T}@T{
The character encoded by the one-, two- or three-digit octal integer. Multi-byte characters require multiple, concatenated escape sequences, including the leading \e for each byte.
T}
_
\fB\e\fR\fIc\fR@T{
A backslash character followed by any character not described in this table or special characters (\fB\e\e\fR, \fB\ea\fR, \fB\eb\fR, \fB\ef\fR, \fB\en\fR, \fB\er\fR, \fB\et\fR, \fB\ev\fR).
T}@Undefined
.TE
 752 
 753 .sp
 754 .LP
 755 A regular expression can be matched against a specific field or string by using
 756 one of the two regular expression matching operators, \fB~\fR and \fB!\|~\fR.
 757 These operators interpret their right-hand operand as a regular expression and
 758 their left-hand operand as a string. If the regular expression matches the
 759 string, the \fB~\fR expression evaluates to the value \fB1\fR, and the
 760 \fB!\|~\fR expression evaluates to the value \fB0\fR. If the regular expression
 761 does not match the string, the \fB~\fR expression evaluates to the value
 762 \fB0\fR, and the \fB!\|~\fR expression evaluates to the value \fB1\fR. If the
 763 right-hand operand is any expression other than the lexical token \fBERE\fR,
 764 the string value of the expression is interpreted as an extended regular
expression, including the escape conventions described above. Notice that these
same escape conventions are also applied in determining the value of a
string literal (the lexical token \fBSTRING\fR), and so are applied a second time
when a string literal is used in this context.
 769 .sp
 770 .LP
 771 When an \fBERE\fR token appears as an expression in any context other than as
 772 the right-hand of the \fB~\fR or \fB!\|~\fR operator or as one of the built-in
 773 function arguments described below, the value of the resulting expression is
 774 the equivalent of:
 775 .sp
 776 .in +2
 777 .nf
 778 $0 ~ /\fIere\fR/
 779 .fi
 780 .in -2
 781 
 782 .sp
 783 .LP
The \fIere\fR argument to the \fBgsub\fR, \fBmatch\fR, and \fBsub\fR functions, and
the \fIfs\fR argument to the \fBsplit\fR function (see \fBString Functions\fR),
are interpreted as extended regular expressions. These can be either \fBERE\fR
 787 tokens or arbitrary expressions, and are interpreted in the same manner as the
 788 right-hand side of the \fB~\fR or \fB!\|~\fR operator.
 789 .sp
 790 .LP
 791 An extended regular expression can be used to separate fields by using the
 792 \fB-F\fR \fIERE\fR option or by assigning a string containing the expression to
 793 the built-in variable \fBFS\fR. The default value of the \fBFS\fR variable is a
 794 single space character. The following describes \fBFS\fR behavior:
 795 .RS +4
 796 .TP
 797 1.
 798 If \fBFS\fR is a single character:
 799 .RS +4
 800 .TP
 801 .ie t \(bu
 802 .el o
 803 If \fBFS\fR is the space character, skip leading and trailing blank characters;
 804 fields are delimited by sets of one or more blank characters.
 805 .RE
 806 .RS +4
 807 .TP
 808 .ie t \(bu
 809 .el o
 810 Otherwise, if \fBFS\fR is any other character \fIc\fR, fields are delimited by
 811 each single occurrence of \fIc\fR.
 812 .RE
 813 .RE
 814 .RS +4
 815 .TP
 816 2.
 817 Otherwise, the string value of \fBFS\fR is considered to be an extended
 818 regular expression. Each occurrence of a sequence matching the extended regular
 819 expression delimits fields.
 820 .RE
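.sp
.LP
The three cases above can be sketched as follows. The sample input strings are
assumptions chosen only for illustration:
.sp
.in +2
.nf
echo '  a   b  ' | awk '{ print NF }'             # default FS: writes 2
echo 'a::b' | awk -F: '{ print NF }'              # single character: writes 3
echo 'a1b22c' | awk -F'[0-9]+' '{ print $3 }'     # ERE: writes c
.fi
.in -2
.sp
.LP
In the second command the two adjacent colons delimit an empty field, because
each single occurrence of the character separates fields.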
 821 .sp
 822 .LP
 823 Except in the \fBgsub\fR, \fBmatch\fR, \fBsplit\fR, and \fBsub\fR built-in
 824 functions, regular expression matching is based on input records. That is,
 825 record separator characters (the first character of the value of the variable
 826 \fBRS\fR, a newline character by default) cannot be embedded in the expression,
 827 and no expression matches the record separator character. If the record
 828 separator is not a newline character, newline characters embedded in the
expression can be matched. In those four built-in functions, regular expression
matching is based on text strings. So, any character (including the newline
character and the record separator) can be embedded in the pattern and an
appropriate pattern matches any character. However, in all \fBawk\fR regular
expression matching, the use of one or more NUL characters in the pattern,
input record, or text string produces undefined results.
 835 
 836 .SS "Patterns"
A \fIpattern\fR is any valid \fIexpression\fR, a range specified by two
expressions separated by a comma, or one of the two special patterns
\fBBEGIN\fR or \fBEND\fR.
 840 
 841 .SS "Special Patterns"
 842 The \fBawk\fR utility recognizes two special patterns, \fBBEGIN\fR and
 843 \fBEND\fR. Each \fBBEGIN\fR pattern is matched once and its associated action
 844 executed before the first record of input is read (except possibly by use of
 845 the \fBgetline\fR function in a prior \fBBEGIN\fR action) and before command
line assignment is done. Each \fBEND\fR pattern is matched once and its
associated action executed after the last record of input has been read. Unlike
other patterns, \fBBEGIN\fR and \fBEND\fR must each be followed by an action.
 849 .sp
 850 .LP
 851 \fBBEGIN\fR and \fBEND\fR do not combine with other patterns.  Multiple
 852 \fBBEGIN\fR and \fBEND\fR patterns are allowed. The actions associated with the
 853 \fBBEGIN\fR patterns are executed in the order specified in the program, as are
 854 the \fBEND\fR actions. An \fBEND\fR pattern can precede a \fBBEGIN\fR pattern
 855 in a program.
 856 .sp
 857 .LP
 858 If an \fBawk\fR program consists of only actions with the pattern \fBBEGIN\fR,
 859 and the \fBBEGIN\fR action contains no \fBgetline\fR function, \fBawk\fR exits
 860 without reading its input when the last statement in the last \fBBEGIN\fR
 861 action is executed. If an \fBawk\fR program consists of only actions with the
 862 pattern \fBEND\fR or only actions with the patterns \fBBEGIN\fR and \fBEND\fR,
 863 the input is read before the statements in the \fBEND\fR actions are executed.
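.sp
.LP
The ordering rules can be seen in a small sketch. The program text and the
one-line input are illustrative assumptions:
.sp
.in +2
.nf
echo record | awk '
END   { print "end" }
BEGIN { print "first" }
BEGIN { print "second" }
      { print "got", $0 }'
.fi
.in -2
.sp
.LP
This should write \fBfirst\fR, \fBsecond\fR, \fBgot record\fR, and \fBend\fR,
in that order, even though the \fBEND\fR pattern appears first in the program.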
 864 
 865 .SS "Expression Patterns"
 866 An expression pattern is evaluated as if it were an expression in a Boolean
 867 context. If the result is true, the pattern is considered to match, and the
 868 associated action (if any) is executed. If the result is false, the action is
 869 not executed.
 870 
 871 .SS "Pattern Ranges"
 872 A pattern range consists of two expressions separated by a comma. In this case,
 873 the action is performed for all records between a match of the first expression
 874 and the following match of the second expression, inclusive. At this point, the
 875 pattern range can be repeated starting at input records subsequent to the end
 876 of the matched range.
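.sp
.LP
A brief sketch of a range that matches twice in the same input (the input
lines are assumptions for illustration):
.sp
.in +2
.nf
(echo a; echo start; echo b; echo stop; echo c;
 echo start; echo d; echo stop; echo e) | awk '/start/, /stop/'
.fi
.in -2
.sp
.LP
The lines \fBa\fR, \fBc\fR, and \fBe\fR fall outside both ranges and are not
written. Both \fBstart\fR and \fBstop\fR lines are written because the range is
inclusive.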
 877 
 878 .SS "Actions"
 879 An action is a sequence of statements. A statement can be one of the following:
 880 .sp
 881 .in +2
 882 .nf
 883 if ( \fIexpression\fR ) \fIstatement\fR [ else \fIstatement\fR ]
 884 while ( \fIexpression\fR ) \fIstatement\fR
 885 do \fIstatement\fR while ( \fIexpression\fR )
 886 for ( \fIexpression\fR ; \fIexpression\fR ; \fIexpression\fR ) \fIstatement\fR
 887 for ( \fIvar\fR in \fIarray\fR ) \fIstatement\fR
 888 delete \fIarray\fR[\fIsubscript\fR] #delete an array element
 889 delete \fIarray\fR #delete all elements within an array
 890 break
 891 continue
 892 { [ \fIstatement\fR ] .\|.\|. }
 893 \fIexpression\fR        # commonly variable = expression
 894 print [ \fIexpression-list\fR ] [ >\fIexpression\fR ]
 895 printf format [ ,\fIexpression-list\fR ] [ >\fIexpression\fR ]
 896 next              # skip remaining patterns on this input line
 897 nextfile          # skip remaining patterns on this input file
 898 exit [expr] # skip the rest of the input; exit status is expr
 899 return [expr]
 900 .fi
 901 .in -2
 902 
 903 .sp
 904 .LP
 905 Any single statement can be replaced by a statement list enclosed in braces.
 906 The statements are terminated by newline characters or semicolons, and are
 907 executed sequentially in the order that they appear.
 908 .sp
 909 .LP
 910 The \fBnext\fR statement causes all further processing of the current input
 911 record to be abandoned. The behavior is undefined if a \fBnext\fR statement
 912 appears or is invoked in a \fBBEGIN\fR or \fBEND\fR action.
 913 .sp
 914 .LP
 915 The \fBnextfile\fR statement is similar to \fBnext\fR, but also skips all other
 916 records in the current file, and moves on to processing the next input file if
 917 available (or exits the program if there are none). (Note that this keyword is
 918 not supported by \fB/usr/xpg4/bin/awk\fR.)
 919 .sp
 920 .LP
The \fBexit\fR statement invokes all \fBEND\fR actions in the order in which
they occur in the program source and then terminates the program without reading
further input. An \fBexit\fR statement inside an \fBEND\fR action terminates
 924 the program without further execution of \fBEND\fR actions.  If an expression
 925 is specified in an \fBexit\fR statement, its numeric value is the exit status
 926 of \fBawk\fR, unless subsequent errors are encountered or a subsequent
 927 \fBexit\fR statement with an expression is executed.
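.sp
.LP
As a small sketch of this behavior (the status value 3 is an arbitrary
assumption), the following should write \fBend ran\fR and then \fB3\fR:
.sp
.in +2
.nf
echo x | awk '{ exit 3 } END { print "end ran" }'
echo $?
.fi
.in -2
.sp
.LP
The \fBEND\fR action still runs after \fBexit\fR, and because it does not
itself call \fBexit\fR with an expression, the earlier status is kept.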
 928 
 929 .SS "Output Statements"
 930 Both \fBprint\fR and \fBprintf\fR statements write to standard output by
 931 default.  The output is written to the location specified by
 932 \fIoutput_redirection\fR if one is supplied, as follows:
 933 .sp
 934 .in +2
 935 .nf
\fB>\fR \fIexpression\fR
\fB>>\fR \fIexpression\fR
\fB|\fR \fIexpression\fR
 937 .fi
 938 .in -2
 939 
 940 .sp
 941 .LP
In all cases, the \fIexpression\fR is evaluated to produce a string that is
used as a full pathname to write into (for \fB>\fR or \fB>>\fR) or as a command
to be executed (for \fB|\fR). Using the first two forms, if the file of that
name is not currently open, it is opened, creating it if necessary and, using
the first form, truncating the file. The output then is appended to the file.
As long as the file remains open, subsequent calls in which \fIexpression\fR
evaluates to the same string value simply append output to the file. The file
remains open until the \fBclose\fR function is called with an expression
that evaluates to the same string value.
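.sp
.LP
The following sketch illustrates that the first form truncates the file only
when it is opened. The pathname \fB/tmp/awk_demo.out\fR is a hypothetical
scratch file chosen for this example:
.sp
.in +2
.nf
awk -v f=/tmp/awk_demo.out 'BEGIN {
    print "one" > f      # opens f, truncating it
    print "two" > f      # f is already open, so this appends
    close(f)
    while ((getline line < f) > 0)
        print line
    close(f)
}'
.fi
.in -2
.sp
.LP
Both lines should be read back, \fBone\fR followed by \fBtwo\fR.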
 951 .sp
 952 .LP
 953 The third form writes output onto a stream piped to the input of a command. The
 954 stream is created if no stream is currently open with the value of
 955 \fIexpression\fR as its command name.  The stream created is equivalent to one
 956 created by a call to the \fBpopen\fR(3C) function with the value of
 957 \fIexpression\fR as the \fIcommand\fR argument and a value of \fBw\fR as the
\fImode\fR argument.  As long as the stream remains open, subsequent calls in
which \fIexpression\fR evaluates to the same string value write output to the
existing stream. The stream remains open until the \fBclose\fR function is
 961 called with an expression that evaluates to the same string value.  At that
 962 time, the stream is closed as if by a call to the \fBpclose\fR function.
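.sp
.LP
A minimal sketch of the third form, assuming the \fBsort\fR utility is
available on the search path:
.sp
.in +2
.nf
awk 'BEGIN {
    print "pear"  | "sort"
    print "apple" | "sort"
    close("sort")          # sort sees end of input and writes its output
    print "done"
}'
.fi
.in -2
.sp
.LP
Both \fBprint\fR statements write to the same stream because the command
strings are identical. Calling \fBclose\fR before the final \fBprint\fR is what
lets the sorted lines appear ahead of \fBdone\fR.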
 963 .sp
 964 .LP
These output statements take a comma-separated list of \fIexpression\fRs,
referred to in the grammar by the non-terminal symbols \fBexpr_list\fR,
\fBprint_expr_list\fR, or \fBprint_expr_list_opt\fR. This list is referred to
here as the \fIexpression list\fR, and each member is referred to as an
\fIexpression argument\fR.
 970 .sp
 971 .LP
 972 The \fBprint\fR statement writes the value of each expression argument onto the
 973 indicated output stream separated by the current output field separator (see
 974 variable \fBOFS\fR above), and terminated by the output record separator (see
variable \fBORS\fR above). All expression arguments are taken as strings, being
converted if necessary; with the exception that the \fBprintf\fR format in
\fBOFMT\fR is used instead of the value in \fBCONVFMT\fR. An empty expression
list stands for the whole input record (\fB$0\fR).
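.sp
.LP
The difference between \fBOFMT\fR and \fBCONVFMT\fR can be sketched as follows.
The values are assumptions for illustration, and \fBCONVFMT\fR is assumed to
still hold its default of \fB%.6g\fR:
.sp
.in +2
.nf
awk 'BEGIN {
    OFS = "-"
    print "a", "b"     # a-b
    OFMT = "%.2f"
    x = 3.14159265
    print x            # 3.14     (number converted with OFMT)
    print x ""         # 3.14159  (concatenation converts with CONVFMT)
}'
.fi
.in -2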
 979 .sp
 980 .LP
The \fBprintf\fR statement produces output based on a notation similar to the
File Format Notation used to describe file formats in this document. Output is
produced as specified with the first expression argument as the string
\fBformat\fR and subsequent expression arguments as the strings \fBarg1\fR to
\fBargn\fR, inclusive, with the following exceptions:
 986 .RS +4
 987 .TP
 988 1.
 989 The \fIformat\fR is an actual character string rather than a graphical
 990 representation. Therefore, it cannot contain empty character positions. The
 991 space character in the \fIformat\fR string, in any context other than a
 992 \fIflag\fR of a conversion specification, is treated as an ordinary character
 993 that is copied to the output.
 994 .RE
 995 .RS +4
 996 .TP
 997 2.
 998 If the character set contains a Delta character and that character appears
 999 in the \fIformat\fR string, it is treated as an ordinary character that is
1000 copied to the output.
1001 .RE
1002 .RS +4
1003 .TP
1004 3.
The \fIescape sequences\fR beginning with a backslash character are treated
as sequences of ordinary characters that are copied to the output. Note that
these same sequences are interpreted lexically by \fBawk\fR when they appear in
literal strings, but they are not treated specially by the \fBprintf\fR
statement.
1010 .RE
1011 .RS +4
1012 .TP
1013 4.
1014 A \fIfield width\fR or \fIprecision\fR can be specified as the \fB*\fR
1015 character instead of a digit string. In this case the next argument from the
1016 expression list is fetched and its numeric value taken as the field width or
1017 precision.
1018 .RE
1019 .RS +4
1020 .TP
1021 5.
1022 The implementation does not precede or follow output from the \fBd\fR or
1023 \fBu\fR conversion specifications with blank characters not specified by the
1024 \fIformat\fR string.
1025 .RE
1026 .RS +4
1027 .TP
1028 6.
1029 The implementation does not precede output from the \fBo\fR conversion
1030 specification with leading zeros not specified by the \fIformat\fR string.
1031 .RE
1032 .RS +4
1033 .TP
1034 7.
1035 For the \fBc\fR conversion specification: if the argument has a numeric
1036 value, the character whose encoding is that value is output.  If the value is
1037 zero or is not the encoding of any character in the character set, the behavior
1038 is undefined.  If the argument does not have a numeric value, the first
1039 character of the string value is output; if the string does not contain any
1040 characters the behavior is undefined.
1041 .RE
1042 .RS +4
1043 .TP
1044 8.
1045 For each conversion specification that consumes an argument, the next
1046 expression argument is evaluated. With the exception of the \fBc\fR conversion,
1047 the value is converted to the appropriate type for the conversion
1048 specification.
1049 .RE
1050 .RS +4
1051 .TP
1052 9.
1053 If there are insufficient expression arguments to satisfy all the conversion
1054 specifications in the \fIformat\fR string, the behavior is undefined.
1055 .RE
1056 .RS +4
1057 .TP
1058 10.
1059 If any character sequence in the \fIformat\fR string begins with a %
1060 character, but does not form a valid conversion specification, the behavior is
1061 unspecified.
1062 .RE
1063 .sp
1064 .LP
1065 Both \fBprint\fR and \fBprintf\fR can output at least \fB{LINE_MAX}\fR bytes.
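.sp
.LP
A short sketch of the \fB*\fR field width and the \fBc\fR conversion described
above. The argument values are assumptions, and the result for the numeric
\fBc\fR argument assumes an ASCII-compatible character set:
.sp
.in +2
.nf
awk 'BEGIN {
    printf "%*d|%-5s|%c%c", 5, 42, "ab", 65, "xyz"
    print ""
}'
.fi
.in -2
.sp
.LP
The first argument, 5, is consumed as the field width for \fB42\fR. The numeric
argument 65 is written as the character with that encoding, and the string
argument contributes only its first character.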
1066 
1067 .SS "Functions"
1068 The \fBawk\fR language has a variety of built-in functions: arithmetic,
1069 string, input/output and general.
1070 
1071 .SS "Arithmetic Functions"
1072 The arithmetic functions, except for \fBint\fR, are based on the \fBISO\fR
1073 \fBC\fR standard. The behavior is undefined in cases where the \fBISO\fR
1074 \fBC\fR standard specifies that an error be returned or that the behavior is
1075 undefined. Although the grammar permits built-in functions to appear with no
1076 arguments or parentheses, unless the argument or parentheses are indicated as
1077 optional in the following list (by displaying them within the \fB[ ]\fR
1078 brackets), such use is undefined.
1079 .sp
1080 .ne 2
1081 .na
1082 \fB\fBatan2(\fR\fIy\fR,\fIx\fR\fB)\fR\fR
1083 .ad
1084 .RS 17n
1085 Return arctangent of \fIy\fR/\fIx\fR.
1086 .RE
1087 
1088 .sp
1089 .ne 2
1090 .na
1091 \fB\fBcos\fR(\fIx\fR)\fR
1092 .ad
1093 .RS 17n
1094 Return cosine of \fIx,\fR where \fIx\fR is in radians.
1095 .RE
1096 
1097 .sp
1098 .ne 2
1099 .na
1100 \fB\fBsin\fR(\fIx\fR)\fR
1101 .ad
1102 .RS 17n
1103 Return sine of \fIx,\fR where \fIx\fR is in radians.
1104 .RE
1105 
1106 .sp
1107 .ne 2
1108 .na
1109 \fB\fBexp\fR(\fIx\fR)\fR
1110 .ad
1111 .RS 17n
1112 Return the exponential function of \fIx\fR.
1113 .RE
1114 
1115 .sp
1116 .ne 2
1117 .na
1118 \fB\fBlog\fR(\fIx\fR)\fR
1119 .ad
1120 .RS 17n
1121 Return the natural logarithm of \fIx\fR.
1122 .RE
1123 
1124 .sp
1125 .ne 2
1126 .na
1127 \fB\fBsqrt\fR(\fIx\fR)\fR
1128 .ad
1129 .RS 17n
1130 Return the square root of \fIx\fR.
1131 .RE
1132 
1133 .sp
1134 .ne 2
1135 .na
1136 \fB\fBint\fR(\fIx\fR)\fR
1137 .ad
1138 .RS 17n
Truncate \fIx\fR to an integer by discarding the fractional part. Truncation is
toward 0 for both positive and negative values of \fIx\fR.
1140 .RE
1141 
1142 .sp
1143 .ne 2
1144 .na
1145 \fB\fBrand()\fR\fR
1146 .ad
1147 .RS 17n
1148 Return a random number \fIn\fR, such that 0 \(<= \fIn\fR < 1.
1149 .RE
1150 
1151 .sp
1152 .ne 2
1153 .na
1154 \fB\fBsrand\fR([\fBexpr\fR])\fR
1155 .ad
1156 .RS 17n
1157 Set the seed value for \fBrand\fR to \fIexpr\fR or use the time of day if
1158 \fIexpr\fR is omitted. The previous seed value is returned.
1159 .RE
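.sp
.LP
The following sketch exercises several of these functions. The arguments are
arbitrary assumptions, and the comments show the output expected with the
default \fBOFMT\fR:
.sp
.in +2
.nf
awk 'BEGIN {
    print int(3.9), int(-3.9)        # 3 -3
    print sqrt(16), exp(0), log(1)   # 4 1 0
    print atan2(0, -1)               # 3.14159
    srand(1); a = rand()
    srand(1); b = rand()
    same = (a == b)
    inrange = (a >= 0 && a < 1)
    print same, inrange              # 1 1
}'
.fi
.in -2
.sp
.LP
Seeding \fBrand\fR twice with the same value is used here only to show that the
sequence is repeatable for a given seed.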
1160 
1161 .SS "String Functions"
The string functions in the following list are supported. Although the
1163 grammar permits built-in functions to appear with no arguments or parentheses,
1164 unless the argument or parentheses are indicated as optional in the following
1165 list (by displaying them within the \fB[ ]\fR brackets), such use is undefined.
1166 .sp
1167 .ne 2
1168 .na
1169 \fB\fBgsub\fR(\fIere\fR,\fIrepl\fR[,\|\fIin\fR])\fR
1170 .ad
1171 .sp .6
1172 .RS 4n
1173 Behave like \fBsub\fR (see below), except that it replaces all occurrences of
1174 the regular expression (like the \fBed\fR utility global substitute) in
1175 \fB$0\fR or in the \fIin\fR argument, when specified.
1176 .RE
1177 
1178 .sp
1179 .ne 2
1180 .na
1181 \fB\fBindex\fR(\fIs\fR,\fIt\fR)\fR
1182 .ad
1183 .sp .6
1184 .RS 4n
1185 Return the position, in characters, numbering from 1, in string \fIs\fR where
1186 string \fIt\fR first occurs, or zero if it does not occur at all.
1187 .RE
1188 
1189 .sp
1190 .ne 2
1191 .na
1192 \fB\fBlength\fR[([\fIv\fR])]\fR
1193 .ad
1194 .sp .6
1195 .RS 4n
1196 Given no argument, this function returns the length of the whole record,
1197 \fB$0\fR. If given an array as an argument (and using \fB/usr/bin/awk\fR),
1198 then this returns the number of elements it contains. Otherwise, this function
1199 interprets the argument as a string (performing any needed conversions) and
1200 returns its length in characters.
1201 .RE
1202 
1203 .sp
1204 .ne 2
1205 .na
1206 \fB\fBmatch\fR(\fIs\fR,\fIere\fR)\fR
1207 .ad
1208 .sp .6
1209 .RS 4n
1210 Return the position, in characters, numbering from 1, in string \fIs\fR where
1211 the extended regular expression \fIere\fR occurs, or zero if it does not occur
1212 at all. \fBRSTART\fR is set to the starting position (which is the same as the
1213 returned value), zero if no match is found; \fBRLENGTH\fR is set to the length
1214 of the matched string, \(mi1 if no match is found.
1215 .RE
1216 
1217 .sp
1218 .ne 2
1219 .na
1220 \fB\fBsplit\fR(\fIs\fR,\fIa\fR[,\|\fIfs\fR])\fR
1221 .ad
1222 .sp .6
1223 .RS 4n
1224 Split the string \fIs\fR into array elements \fIa\fR[1], \fIa\fR[2],
1225 \fB\&...,\fR \fIa\fR[\fIn\fR], and return \fIn\fR. The separation is done with
1226 the extended regular expression \fIfs\fR or with the field separator \fBFS\fR
1227 if \fIfs\fR is not given. Each array element has a string value when created.
If the string assigned to any array element, with any occurrence of the
decimal-point character from the current locale changed to a period character,
would be considered a \fInumeric string\fR, the array element also has the
numeric value of the \fInumeric string\fR. The effect of a null string as the
1232 value of \fIfs\fR is unspecified.
1233 .RE
1234 
1235 .sp
1236 .ne 2
1237 .na
1238 \fB\fBsprintf\fR(\fBfmt\fR,\fIexpr\fR,\fIexpr\fR,\fB\&...\fR)\fR
1239 .ad
1240 .sp .6
1241 .RS 4n
1242 Format the expressions according to the \fBprintf\fR format given by \fIfmt\fR
1243 and return the resulting string.
1244 .RE
1245 
1246 .sp
1247 .ne 2
1248 .na
1249 \fB\fBsub\fR(\fIere\fR,\fIrepl\fR[,\|\fIin\fR])\fR
1250 .ad
1251 .sp .6
1252 .RS 4n
Substitute the string \fIrepl\fR in place of the first instance of the extended
regular expression \fIere\fR in string \fIin\fR and return the number of
substitutions. An ampersand ( \fB&\fR ) appearing in the string \fIrepl\fR is
replaced by the string from \fIin\fR that matches the regular expression. An
ampersand preceded with a backslash ( \fB\e\fR ) is interpreted as the literal
ampersand character. An occurrence of two consecutive backslashes is
interpreted as just a single literal backslash character.  Any other occurrence
of a backslash (for example, preceding any other character) is treated as a
literal backslash character. If \fIrepl\fR is a string literal, the handling of
the ampersand character occurs after any lexical processing, including any
lexical backslash escape sequence processing. If \fIin\fR is specified and it
is not an lvalue, the behavior is undefined. If \fIin\fR is omitted, \fBawk\fR
uses the current record (\fB$0\fR) in its place.
1266 .RE
1267 
1268 .sp
1269 .ne 2
1270 .na
1271 \fB\fBsubstr\fR(\fIs\fR,\fIm\fR[,\|\fIn\fR])\fR
1272 .ad
1273 .sp .6
1274 .RS 4n
1275 Return the at most \fIn\fR-character substring of \fIs\fR that begins at
1276 position \fIm,\fR numbering from 1. If \fIn\fR is missing, the length of the
1277 substring is limited by the length of the string \fIs\fR.
1278 .RE
1279 
1280 .sp
1281 .ne 2
1282 .na
1283 \fB\fBtolower\fR(\fIs\fR)\fR
1284 .ad
1285 .sp .6
1286 .RS 4n
1287 Return a string based on the string \fIs\fR. Each character in \fIs\fR that is
1288 an upper-case letter specified to have a \fBtolower\fR mapping by the
1289 \fBLC_CTYPE\fR category of the current locale is replaced in the returned
1290 string by the lower-case letter specified by the mapping. Other characters in
1291 \fIs\fR are unchanged in the returned string.
1292 .RE
1293 
1294 .sp
1295 .ne 2
1296 .na
1297 \fB\fBtoupper\fR(\fIs\fR)\fR
1298 .ad
1299 .sp .6
1300 .RS 4n
1301 Return a string based on the string \fIs\fR. Each character in \fIs\fR that is
1302 a lower-case letter specified to have a \fBtoupper\fR mapping by the
1303 \fBLC_CTYPE\fR category of the current locale is replaced in the returned
1304 string by the upper-case letter specified by the mapping. Other characters in
1305 \fIs\fR are unchanged in the returned string.
1306 .RE
1307 
1308 .sp
1309 .LP
All of the preceding functions that take \fIere\fR as a parameter expect a
pattern or a string-valued expression that is a regular expression as
described above.
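.sp
.LP
As a hedged illustration of several of these functions, using an arbitrary
sample string (the variable names are assumptions), the comments show the
output each statement should produce:
.sp
.in +2
.nf
awk 'BEGIN {
    s = "hello world"
    print index(s, "o"), length(s)          # 5 11
    print match(s, /o+/), RSTART, RLENGTH   # 5 5 1
    n = split("a:b:c", parts, ":")
    print n, parts[3]                       # 3 c
    t = s
    k = gsub(/o/, "[&]", t)
    print k, t                              # 2 hell[o] w[o]rld
    print substr(s, 7), substr(s, 1, 5)     # world hello
    print toupper(s)                        # HELLO WORLD
}'
.fi
.in -2
.sp
.LP
The copy \fBt\fR is modified by \fBgsub\fR so that \fBs\fR stays intact for the
later calls. The third argument must be an lvalue for this reason.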
1313 
1314 .SS "Input/Output and General Functions"
1315 The input/output and general functions are:
1316 .sp
1317 .ne 2
1318 .na
1319 \fB\fBclose(\fR\fIexpression\fR)\fR
1320 .ad
1321 .RS 27n
1322 Close the file or pipe opened by a \fBprint\fR or \fBprintf\fR statement or a
1323 call to \fBgetline\fR with the same string-valued \fIexpression\fR. If the
1324 close was successful, the function returns \fB0\fR; otherwise, it returns
1325 non-zero.
1326 .RE
1327 
1328 .sp
1329 .ne 2
1330 .na
1331 \fB\fBfflush(\fR\fIexpression\fR)\fR
1332 .ad
1333 .RS 27n
1334 Flush any buffered output for the file or pipe opened by a \fBprint\fR or
1335 \fBprintf\fR statement or a call to \fBgetline\fR with the same string-valued
1336 \fIexpression\fR. If the flush was successful, the function returns \fB0\fR;
otherwise, it returns \fBEOF\fR. If no argument or the empty string
(\fB""\fR) is given, all open files are flushed. (Note that
\fBfflush\fR is supported in \fB/usr/bin/awk\fR only.)
1340 .RE
1341 
1342 .sp
1343 .ne 2
1344 .na
1345 \fB\fIexpression\fR|\fBgetline\fR[\fIvar\fR]\fR
1346 .ad
1347 .RS 27n
1348 Read a record of input from a stream piped from the output of a command. The
1349 stream is created if no stream is currently open with the value of
1350 \fIexpression\fR as its command name. The stream created is equivalent to one
1351 created by a call to the \fBpopen\fR function with the value of
1352 \fIexpression\fR as the \fIcommand\fR argument and a value of \fBr\fR as the
\fImode\fR argument. As long as the stream remains open, subsequent calls in
which \fIexpression\fR evaluates to the same string value read subsequent
records from the stream. The stream remains open until the \fBclose\fR function
is called with an expression that evaluates to the same string value. At that
time, the stream is closed as if by a call to the \fBpclose\fR function. If
\fIvar\fR is missing, \fB$0\fR and \fBNF\fR are set. Otherwise, \fIvar\fR is
set.
1360 .sp
1361 The \fBgetline\fR operator can form ambiguous constructs when there are
1362 operators that are not in parentheses (including concatenate) to the left of
1363 the \fB|\fR (to the beginning of the expression containing \fBgetline\fR). In
1364 the context of the \fB$\fR operator, \fB|\fR behaves as if it had a lower
precedence than \fB$\fR. The result of evaluating other operators is
unspecified, and portable applications must parenthesize all such uses
properly.
1368 .RE
1369 
1370 .sp
1371 .ne 2
1372 .na
1373 \fB\fBgetline\fR\fR
1374 .ad
1375 .RS 27n
1376 Set \fB$0\fR to the next input record from the current input file. This form of
1377 \fBgetline\fR sets the \fBNF\fR, \fBNR\fR, and \fBFNR\fR variables.
1378 .RE
1379 
1380 .sp
1381 .ne 2
1382 .na
1383 \fB\fBgetline\fR \fIvar\fR\fR
1384 .ad
1385 .RS 27n
1386 Set variable \fIvar\fR to the next input record from the current input file.
1387 This form of \fBgetline\fR sets the \fBFNR\fR and \fBNR\fR variables.
1388 .RE
1389 
1390 .sp
1391 .ne 2
1392 .na
1393 \fB\fBgetline\fR [\fIvar\fR] \fB<\fR \fIexpression\fR\fR
1394 .ad
1395 .RS 27n
1396 Read the next record of input from a named file. The \fIexpression\fR is
1397 evaluated to produce a string that is used as a full pathname. If the file of
that name is not currently open, it is opened. As long as the file remains
open, subsequent calls in which \fIexpression\fR evaluates to the same string
value read subsequent records from the file. The file remains open until the
\fBclose\fR function is called with an expression that evaluates to the same
string value. If \fIvar\fR is missing, \fB$0\fR and \fBNF\fR are set. Otherwise,
\fIvar\fR is set.
1404 .sp
1405 The \fBgetline\fR operator can form ambiguous constructs when there are binary
1406 operators that are not in parentheses (including concatenate) to the right of
1407 the \fB<\fR (up to the end of the expression containing the \fBgetline\fR). The
result of evaluating such a construct is unspecified, and portable applications
must parenthesize all such uses properly.
1410 .RE
1411 
1412 .sp
1413 .ne 2
1414 .na
1415 \fB\fBsystem\fR(\fIexpression\fR)\fR
1416 .ad
1417 .RS 27n
1418 Execute the command given by \fIexpression\fR in a manner equivalent to the
1419 \fBsystem\fR(3C) function and return the exit status of the command.
1420 .RE
1421 
1422 .sp
1423 .LP
1424 All forms of \fBgetline\fR return \fB1\fR for successful input, \fB0\fR for end
1425 of file, and \fB\(mi1\fR for an error.
1426 .sp
1427 .LP
1428 Where strings are used as the name of a file or pipeline, the strings must be
1429 textually identical. The terminology ``same string value'' implies that
1430 ``equivalent strings'', even those that differ only by space characters,
1431 represent different files.
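.sp
.LP
A minimal sketch that combines a piped \fBgetline\fR, \fBclose\fR, and
\fBsystem\fR. The shell commands are illustrative assumptions, and the piped
\fBgetline\fR is parenthesized as recommended above:
.sp
.in +2
.nf
awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)
        n++
    close(cmd)
    print n, line          # 2 two
    status = system("exit 3")
    print status           # 3
}'
.fi
.in -2
.sp
.LP
The loop ends when \fBgetline\fR returns \fB0\fR at end of input. The same
variable \fBcmd\fR is passed to \fBclose\fR so that the string is textually
identical to the one that opened the stream.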
1432 
1433 .SS "User-defined Functions"
1434 The \fBawk\fR language also provides user-defined functions. Such functions
1435 can be defined as:
1436 .sp
1437 .in +2
1438 .nf
1439 \fBfunction\fR \fIname\fR(\fIargs\fR,\|.\|.\|.) { \fIstatements\fR }
1440 .fi
1441 .in -2
1442 
1443 .sp
1444 .LP
1445 A function can be referred to anywhere in an \fBawk\fR program; in particular,
1446 its use can precede its definition. The scope of a function is global.
1447 .sp
1448 .LP
1449 Function arguments can be either scalars or arrays; the behavior is undefined
1450 if an array name is passed as an argument that the function uses as a scalar,
1451 or if a scalar expression is passed as an argument that the function uses as an
1452 array. Function arguments are passed by value if scalar and by reference if
1453 array name. Argument names are local to the function; all other variable names
are global. The same name must not be used as both an argument name and as the
name of a function or a special \fBawk\fR variable. The same name must not be used
1456 both as a variable name with global scope and as the name of a function. The
1457 same name must not be used within the same scope both as a scalar variable and
1458 as an array.
1459 .sp
1460 .LP
1461 The number of parameters in the function definition need not match the number
1462 of parameters in the function call. Excess formal parameters can be used as
1463 local variables. If fewer arguments are supplied in a function call than are in
1464 the function definition, the extra parameters that are used in the function
1465 body as scalars are initialized with a string value of the null string and a
1466 numeric value of zero, and the extra parameters that are used in the function
1467 body as arrays are initialized as empty arrays. If more arguments are supplied
1468 in a function call than are in the function definition, the behavior is
1469 undefined.
1470 .sp
1471 .LP
1472 When invoking a function, no white space can be placed between the function
1473 name and the opening parenthesis. Function calls can be nested and recursive
1474 calls can be made upon functions. Upon return from any nested or recursive
1475 function call, the values of all of the calling function's parameters are
1476 unchanged, except for array parameters passed by reference. The \fBreturn\fR
1477 statement can be used to return a value. If a \fBreturn\fR statement appears
1478 outside of a function definition, the behavior is undefined.
1479 .sp
1480 .LP
1481 In the function definition, newline characters are optional before the opening
1482 brace and after the closing brace. Function definitions can appear anywhere in
1483 the program where a \fIpattern-action\fR pair is allowed.
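.sp
.LP
The argument-passing rules can be sketched as follows. The function name
\fBfill\fR and the variable names are hypothetical, chosen only for this
example:
.sp
.in +2
.nf
awk '
function fill(arr, n,    i) {    # i is an extra parameter used as a local
    for (i = 1; i <= n; i++)
        arr[i] = i * i
    n = 0                        # changes only the local copy
}
BEGIN {
    count = 3
    fill(squares, count)
    print count, squares[3]      # 3 9
}'
.fi
.in -2
.sp
.LP
The scalar \fBcount\fR is unchanged after the call because it is passed by
value, while the array \fBsquares\fR is filled in because it is passed by
reference.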
1484 
1485 .SH USAGE
1486 The \fBindex\fR, \fBlength\fR, \fBmatch\fR, and \fBsubstr\fR functions should
1487 not be confused with similar functions in the \fBISO C\fR standard; the
1488 \fBawk\fR versions deal with characters, while the \fBISO C\fR standard deals
1489 with bytes.
1490 .sp
1491 .LP
1492 Because the concatenation operation is represented by adjacent expressions
1493 rather than an explicit operator, it is often necessary to use parentheses to
1494 enforce the proper evaluation precedence.
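.sp
.LP
As a small sketch of why the parentheses matter (the strings are assumptions
for illustration):
.sp
.in +2
.nf
awk 'BEGIN {
    print "n=" -1          # parsed as ("n=" - 1), so this writes -1
    print "n=" (-1)        # writes n=-1
}'
.fi
.in -2
.sp
.LP
In the first statement the minus sign is taken as binary subtraction, which
binds more tightly than concatenation, so the string is converted to the
number 0.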
1495 .sp
1496 .LP
1497 See \fBlargefile\fR(5) for the description of the behavior of \fBawk\fR when
1498 encountering files greater than or equal to 2 Gbyte (2^31 bytes).
1499 
1500 .SH EXAMPLES
1501 The \fBawk\fR program specified in the command line is most easily specified
1502 within single-quotes (for example, \fB\&'\fR\fIprogram\fR\fB\&'\fR) for
1503 applications using \fBsh\fR, because \fBawk\fR programs commonly contain
1504 characters that are special to the shell, including double-quotes. In the cases
where an \fBawk\fR program contains single-quote characters, it is usually
1506 easiest to specify most of the program as strings within single-quotes
1507 concatenated by the shell with quoted single-quote characters. For example:
1508 .sp
1509 .in +2
1510 .nf
1511 awk '/'\e''/ { print "quote:", $0 }'
1512 .fi
1513 .in -2
1514 
1515 .sp
1516 .LP
1517 prints all lines from the standard input containing a single-quote character,
1518 prefixed with \fBquote:\fR.
1519 .sp
1520 .LP
1521 The following are examples of simple \fBawk\fR programs:
1522 .LP
1523 \fBExample 1 \fRWrite to the standard output all input lines for which field 3
1524 is greater than 5:
1525 .sp
1526 .in +2
1527 .nf
1528 \fB$3 > 5\fR
1529 .fi
1530 .in -2
1531 .sp
1532 
1533 .LP
1534 \fBExample 2 \fRWrite every tenth line:
1535 .sp
1536 .in +2
1537 .nf
1538 \fB(NR % 10) == 0\fR
1539 .fi
1540 .in -2
1541 .sp
1542 
1543 .LP
1544 \fBExample 3 \fRWrite any line with a substring matching the regular
1545 expression:
1546 .sp
1547 .in +2
1548 .nf
1549 \fB/(G|D)(2[0-9][[:alpha:]]*)/\fR
1550 .fi
1551 .in -2
1552 .sp
1553 
1554 .LP
1555 \fBExample 4 \fRPrint any line with a substring containing a G or D, followed
1556 by a sequence of digits and characters:
1557 .sp
1558 .LP
1559 This example uses character classes \fBdigit\fR and \fBalpha\fR to match
1560 language-independent digit and alphabetic characters, respectively.
1561 
1562 .sp
1563 .in +2
1564 .nf
1565 \fB/(G|D)([[:digit:][:alpha:]]*)/\fR
1566 .fi
1567 .in -2
1568 .sp
1569 
1570 .LP
1571 \fBExample 5 \fRWrite any line in which the second field matches the regular
1572 expression and the fourth field does not:
1573 .sp
1574 .in +2
1575 .nf
1576 \fB$2 ~ /xyz/ && $4 !~ /xyz/\fR
1577 .fi
1578 .in -2
1579 .sp
1580 
1581 .LP
1582 \fBExample 6 \fRWrite any line in which the second field contains a backslash:
1583 .sp
1584 .in +2
1585 .nf
1586 \fB$2 ~ /\e\e/\fR
1587 .fi
1588 .in -2
1589 .sp
1590 
1591 .LP
1592 \fBExample 7 \fRWrite any line in which the second field contains a backslash
1593 (alternate method):
1594 .sp
1595 .LP
1596 Notice that backslash escapes are interpreted twice, once in lexical processing
1597 of the string and once in processing the regular expression.
1598 
1599 .sp
1600 .in +2
1601 .nf
1602 \fB$2 ~ "\e\e\e\e"\fR
1603 .fi
1604 .in -2
1605 .sp
1606 
1607 .LP
1608 \fBExample 8 \fRWrite the second to the last and the last field in each line,
1609 separating the fields by a colon:
1610 .sp
1611 .in +2
1612 .nf
1613 \fB{OFS=":";print $(NF-1), $NF}\fR
1614 .fi
1615 .in -2
1616 .sp
1617 
1618 .LP
1619 \fBExample 9 \fRWrite the line number and number of fields in each line:
1620 .sp
1621 .LP
1622 The three strings representing the line number, the colon and the number of
1623 fields are concatenated and that string is written to standard output.
1624 
1625 .sp
1626 .in +2
1627 .nf
1628 \fB{print NR ":" NF}\fR
1629 .fi
1630 .in -2
1631 .sp
1632 
1633 .LP
1634 \fBExample 10 \fRWrite lines longer than 72 characters:
1635 .sp
1636 .in +2
1637 .nf
\fBlength($0) > 72\fR
1639 .fi
1640 .in -2
1641 .sp
1642 
1643 .LP
\fBExample 11 \fRWrite the first two fields in opposite order, separated by the
OFS:
1645 .sp
1646 .in +2
1647 .nf
1648 \fB{ print $2, $1 }\fR
1649 .fi
1650 .in -2
1651 .sp
1652 
1653 .LP
\fBExample 12 \fRSame, with input fields separated by a comma, by space and tab
characters, or by both:
1656 .sp
1657 .in +2
1658 .nf
\fBBEGIN { FS = ",[ \et]*|[ \et]+" }
1660       { print $2, $1 }\fR
1661 .fi
1662 .in -2
1663 .sp
1664 
1665 .LP
\fBExample 13 \fRAdd up the first column, then print the sum and average:
1667 .sp
1668 .in +2
1669 .nf
1670 \fB{s += $1 }
1671 END {print "sum is ", s, " average is", s/NR}\fR
1672 .fi
1673 .in -2
1674 .sp
1675 
1676 .LP
1677 \fBExample 14 \fRWrite fields in reverse order, one per line (many lines out
1678 for each line in):
1679 .sp
1680 .in +2
1681 .nf
1682 \fB{ for (i = NF; i > 0; --i) print $i }\fR
1683 .fi
1684 .in -2
1685 .sp
1686 
1687 .LP
1688 \fBExample 15 \fRWrite all lines between occurrences of the strings "start" and
1689 "stop":
1690 .sp
1691 .in +2
1692 .nf
1693 \fB/start/, /stop/\fR
1694 .fi
1695 .in -2
1696 .sp
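.LP
As a hedged illustration of the range pattern above, the following sketch runs
it over a hypothetical five-line input. The function name and the input lines
are assumptions made for this illustration. The lines that match "start" and
"stop" are themselves written, along with the lines between them.
.sp
.in +2
.nf
```shell
# Hypothetical input; only the awk program comes from Example 15.
range_demo() {
    { echo before; echo start; echo middle; echo stop; echo after; } |
        awk '/start/, /stop/'
}
range_demo
```
.fi
.in -2
.sp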
1697 
1698 .LP
1699 \fBExample 16 \fRWrite all lines whose first field is different from the
1700 previous one:
1701 .sp
1702 .in +2
1703 .nf
1704 \fB$1 != prev { print; prev = $1 }\fR
1705 .fi
1706 .in -2
1707 .sp
1708 
1709 .LP
1710 \fBExample 17 \fRSimulate the echo command:
1711 .sp
1712 .in +2
1713 .nf
1714 \fBBEGIN  {
1715        for (i = 1; i < ARGC; ++i)
             printf "%s%s", ARGV[i], i==ARGC-1?"\en":" "
1717        }\fR
1718 .fi
1719 .in -2
1720 .sp
1721 
1722 .LP
1723 \fBExample 18 \fRWrite the path prefixes contained in the PATH environment
1724 variable, one per line:
1725 .sp
1726 .in +2
1727 .nf
1728 \fBBEGIN  {
1729        n = split (ENVIRON["PATH"], path, ":")
1730        for (i = 1; i <= n; ++i)
1731               print path[i]
1732        }\fR
1733 .fi
1734 .in -2
1735 .sp
1736 
1737 .LP
1738 \fBExample 19 \fRPrint the file "input", filling in page numbers starting at 5:
1739 .sp
1740 .LP
1741 If there is a file named \fBinput\fR containing page headers of the form
1742 
1743 .sp
1744 .in +2
1745 .nf
Page #
1747 .fi
1748 .in -2
1749 
1750 .sp
1751 .LP
1752 and a file named \fBprogram\fR that contains
1753 
1754 .sp
1755 .in +2
1756 .nf
1757 /Page/{ $2 = n++; }
1758 { print }
1759 .fi
1760 .in -2
1761 
1762 .sp
1763 .LP
1764 then the command line
1765 
1766 .sp
1767 .in +2
1768 .nf
1769 \fBawk -f program n=5 input\fR
1770 .fi
1771 .in -2
1772 .sp
1773 
1774 .sp
1775 .LP
1776 prints the file \fBinput\fR, filling in page numbers starting at 5.
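.sp
.LP
The same effect can be sketched without creating any files. In the sketch
below, the function name and the three input lines are assumptions made for
illustration, and the sample headers are written with the # as a separate
second field. The operand "-" tells \fBawk\fR to read standard input, and the
assignment operand \fBn=5\fR takes effect before that input is read.
.sp
.in +2
.nf
```shell
# Hypothetical three-line input standing in for the file "input".
page_demo() {
    { echo "Page #"; echo "body text"; echo "Page #"; } |
        awk '/Page/ { $2 = n++ } { print }' n=5 -
}
page_demo
```
.fi
.in -2
.sp
.LP
With this input, the two header lines should be written as "Page 5" and
"Page 6", and the body line is passed through unchanged.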
1777 
1778 .SH ENVIRONMENT VARIABLES
1779 See \fBenviron\fR(5) for descriptions of the following environment variables
1780 that affect execution: \fBLC_COLLATE\fR, \fBLC_CTYPE\fR, \fBLC_MESSAGES\fR, and
1781 \fBNLSPATH\fR.
1782 .sp
1783 .ne 2
1784 .na
1785 \fB\fBLC_NUMERIC\fR\fR
1786 .ad
1787 .RS 14n
1788 Determine the radix character used when interpreting numeric input, performing
1789 conversions between numeric and string values and formatting numeric output.
1790 Regardless of locale, the period character (the decimal-point character of the
1791 POSIX locale) is the decimal-point character recognized in processing \fBawk\fR
1792 programs (including assignments in command-line arguments).
1793 .RE
1794 
1795 .SH EXIT STATUS
1796 The following exit values are returned:
1797 .sp
1798 .ne 2
1799 .na
1800 \fB\fB0\fR\fR
1801 .ad
1802 .RS 6n
1803 All input files were processed successfully.
1804 .RE
1805 
1806 .sp
1807 .ne 2
1808 .na
1809 \fB\fB>0\fR\fR
1810 .ad
1811 .RS 6n
1812 An error occurred.
1813 .RE
1814 
1815 .sp
1816 .LP
The exit status can be altered within the program by using an \fBexit\fR
statement with an expression.
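.sp
.LP
A minimal sketch of setting the exit status from within a program follows. The
function name and the status value 3 are arbitrary choices made for this
illustration.
.sp
.in +2
.nf
```shell
# The status value 3 is an arbitrary choice for illustration.
exit_demo() {
    awk 'BEGIN { exit 3 }'
}
status=0
exit_demo || status=$?
echo "exit status: $status"
```
.fi
.in -2
.sp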
1819 
1820 .SH SEE ALSO
1821 \fBed\fR(1), \fBegrep\fR(1), \fBgrep\fR(1), \fBlex\fR(1), \fBoawk\fR(1),
1822 \fBsed\fR(1), \fBpopen\fR(3C), \fBprintf\fR(3C), \fBsystem\fR(3C),
1823 \fBattributes\fR(5), \fBenviron\fR(5), \fBlargefile\fR(5), \fBregex\fR(5),
1824 \fBXPG4\fR(5)
1825 .sp
1826 .LP
1827 Aho, A. V., B. W. Kernighan, and P. J. Weinberger, \fIThe AWK Programming
1828 Language\fR, Addison-Wesley, 1988.
1829 
1830 .SH DIAGNOSTICS
If any \fIfile\fR operand is specified and the named file cannot be accessed,
\fBawk\fR writes a diagnostic message to standard error and terminates without
any further action.
1834 .sp
1835 .LP
1836 If the program specified by either the \fIprogram\fR operand or a
1837 \fIprogfile\fR operand is not a valid \fBawk\fR program (as specified in
1838 \fBEXTENDED DESCRIPTION\fR), the behavior is undefined.
1839 
1840 .SH NOTES
1841 Input white space is not preserved on output if fields are involved.
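.sp
.LP
As a hedged illustration, assigning to any field causes the record to be
rebuilt with the fields joined by the output field separator, which is a single
space by default. The function name and the input line below are assumptions
made for this sketch.
.sp
.in +2
.nf
```shell
# Hypothetical input with runs of blanks between fields.
squeeze_demo() {
    echo "a   b     c" | awk '{ $1 = $1; print }'
}
squeeze_demo
```
.fi
.in -2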
1842 .sp
1843 .LP
1844 There are no explicit conversions between numbers and strings. To force an
1845 expression to be treated as a number add 0 to it; to force it to be treated as
1846 a string concatenate the null string (\fB""\fR) to it.
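.sp
.LP
The following sketch shows both idioms applied to the same pair of values. The
function and variable names and the values 10 and 9 are arbitrary choices made
for this illustration. The numeric comparison is false (0) because 10 is not
less than 9, while the string comparison is true (1) because the character "1"
sorts before "9".
.sp
.in +2
.nf
```shell
# Compare the same two values first as numbers, then as strings.
convert_demo() {
    awk 'BEGIN {
        a = 10; b = 9
        as_numbers = ((a + 0) < (b + 0))   # adding 0 forces numeric comparison
        as_strings = ((a "") < (b ""))     # appending "" forces string comparison
        print as_numbers, as_strings
    }'
}
convert_demo
```
.fi
.in -2
.sp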