Getopt: Difference between revisions
Revision as of 20:39, 22 April 2024
Getopt is a C library function used to parse command-line options of the Unix/POSIX style. It is a part of the POSIX specification, and is universal to Unix-like systems. It is also the name of a Unix program for parsing command line arguments in shell scripts.
History
A long-standing issue with command line programs was how to specify options; early programs used many ways of doing so, including single character options (-a), multiple options specified together (-abc is equivalent to -a -b -c), multicharacter options (-inum), options with arguments (-a arg, -inum 3, -a=arg), and different prefix characters (-a, +b, /c).
The getopt function was written to be a standard mechanism that all programs could use to parse command-line options so that there would be a common interface on which everyone could depend. To that end, the original authors chose to support only a subset of these variations: single character options, multiple options specified together, and options with arguments (-a arg or -aarg), all controllable by an option string.
getopt dates back to at least 1980[1] and was first published by AT&T at the 1985 UNIFORUM conference in Dallas, Texas, with the intent for it to be available in the public domain.[2] Versions of it were subsequently picked up by other flavors of Unix (4.3BSD, Linux, etc.). It is specified in the POSIX.2 standard as part of the unistd.h header file. Derivatives of getopt have been created for many programming languages to parse command-line options.
A POSIX-standard companion function to getopt is getsubopt. It parses a string of comma-separated sub-options. It appeared in 4.4BSD (1995).[3]
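As a rough illustration of what parsing comma-separated sub-options looks like, the sketch below feeds a string such as "ro,name=disk0" to getsubopt. The token names (ro, rw, name), the struct mount_opts record, and the parse_subopts helper are all hypothetical and exist only for this example; note that getsubopt writes into the string it is given, so the input must be modifiable.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>   /* getsubopt */

/* Hypothetical sub-option indices and result record for this sketch. */
enum { SUB_RO, SUB_RW, SUB_NAME };

struct mount_opts {
    int read_only;      /* 1 after "ro", 0 after "rw" */
    const char *name;   /* value of "name=...", or NULL */
};

/* Parse a writable string of comma-separated sub-options.
 * Returns 0 on success, -1 on an unknown token or a missing value. */
static int parse_subopts(char *spec, struct mount_opts *out)
{
    static char *const tokens[] = {
        [SUB_RO] = "ro", [SUB_RW] = "rw", [SUB_NAME] = "name", NULL
    };
    char *value;

    assert(spec != NULL && out != NULL);
    out->read_only = 0;
    out->name = NULL;

    while (*spec != '\0') {
        /* getsubopt advances spec past the sub-option it just handled. */
        switch (getsubopt(&spec, tokens, &value)) {
        case SUB_RO:
            out->read_only = 1;
            break;
        case SUB_RW:
            out->read_only = 0;
            break;
        case SUB_NAME:
            if (value == NULL)
                return -1;      /* "name" given without "=value" */
            out->name = value;
            break;
        default:
            return -1;          /* token not listed in tokens[] */
        }
    }
    return 0;
}
```

A caller would typically pass optarg (the argument of an option such as -o) as spec, since optarg points into the writable argument vector.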
Extensions
getopt is a system-dependent function, and its behavior depends on the implementation in the C library. Some custom implementations, such as the one in gnulib, are available, however.[4]
The conventional (POSIX and BSD) handling is that the options end when the first non-option argument is encountered, and that getopt returns -1 to signal that. In the glibc extension, however, options are allowed anywhere for ease of use; getopt implicitly permutes the argument vector so that the non-options still end up at the end. Since POSIX already has the convention of returning -1 on -- and skipping it, one can always portably use it as an end-of-options signifier.[4]
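The effect of -- can be seen in a small sketch. The helper below, skip_options, is a hypothetical name introduced for this example; it accepts the options -a and -b and reports where the non-option arguments begin. Assigning 1 to optind before scanning is assumed here to be enough to restart the scan, which holds for common implementations when the previous scan ran to completion.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>   /* getopt, optind */

/* Hypothetical helper: scan the options of a program that accepts -a and -b.
 * Returns the index in argv of the first non-option argument. */
static int skip_options(int argc, char *argv[], int *a_seen, int *b_seen)
{
    int c;

    assert(a_seen != NULL && b_seen != NULL);
    *a_seen = 0;
    *b_seen = 0;
    optind = 1;   /* start at argv[1]; assumed sufficient to rescan */

    while ((c = getopt(argc, argv, "ab")) != -1) {
        if (c == 'a')
            *a_seen = 1;
        else if (c == 'b')
            *b_seen = 1;
    }
    /* getopt has stopped: either at the first operand or just after "--". */
    return optind;
}
```

For the arguments prog -a -- -b file, this reports that -a was seen and -b was not, and returns the index of the "-b" string, which is treated as an ordinary operand because it follows --.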
A GNU extension, getopt_long, allows parsing of more readable, multicharacter options, which are introduced by two dashes instead of one. The choice of two dashes allows multicharacter options (--inum) to be differentiated from single character options specified together (-abc). The GNU extension also allows an alternative format for options with arguments: --name=arg.[4] This interface proved popular, and has been taken up (sans the permutation) by many BSD distributions including FreeBSD as well as Solaris.[5] An alternative way to support long options is seen in Solaris and Korn Shell (extending optstring), but it was not as popular.[6]
Another common advanced extension of getopt is resetting the state of argument parsing; this is useful as a replacement for the options-anywhere GNU extension, or as a way to "layer" a set of command-line interfaces with different options at different levels. This is achieved in BSD systems using an optreset variable, and on GNU systems by setting optind to 0.[4]
Usage
For users
The command-line syntax for getopt-based programs is the POSIX-recommended Utility Argument Syntax. In short:[7]
- Options are single-character alphanumerics preceded by a - (hyphen-minus) character.
- Options can take an argument, mandatory or optional, or none.
- In order to specify that an option takes an argument, include : after the option name (only during initial specification, that is, in the option string and not on the command line).
- When an option takes an argument, this can be in the same token or in the next one. In other words, if o takes an argument, -ofoo is the same as -o foo.
- Multiple options can be chained together, as long as the non-last ones are not argument taking. If a and b take no arguments while e takes an optional argument, -abe is the same as -a -b -e, but -bea is not the same as -b -e a due to the preceding rule.
- All options precede non-option arguments (except for in the GNU extension).
- -- always marks the end of options.
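The chaining and argument-placement rules above can be checked with a short sketch. It assumes a hypothetical program whose option string is "abo:" (a and b take no argument, o requires one); struct parsed and parse_args are names invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>   /* getopt, optarg, optind, opterr */

/* Hypothetical result record for this sketch. */
struct parsed {
    int a, b;            /* 1 when -a / -b was given */
    const char *o;       /* argument of -o, or NULL */
    int first_operand;   /* index of the first non-option argument */
};

/* Parse argv against "abo:". Returns 0 on success, -1 if any option
 * was unknown or -o lacked its argument. */
static int parse_args(int argc, char *argv[], struct parsed *out)
{
    int c, status = 0;

    assert(out != NULL);
    out->a = 0;
    out->b = 0;
    out->o = NULL;
    optind = 1;   /* start at argv[1]; assumed sufficient to rescan */
    opterr = 0;   /* keep getopt from printing its own diagnostics */

    while ((c = getopt(argc, argv, "abo:")) != -1) {
        switch (c) {
        case 'a': out->a = 1; break;
        case 'b': out->b = 1; break;
        case 'o': out->o = optarg; break;
        default:  status = -1; break;   /* '?': remember the error, keep scanning */
        }
    }
    out->first_operand = optind;
    return status;
}
```

Under these assumptions, prog -abofoo file and prog -a -b -o foo file produce the same result: both flags set, "foo" as the argument of -o, and "file" as the first operand.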
Extensions on the syntax include the GNU convention and Sun's CLIP specification.[8][9]
For programmers
The getopt manual from GNU specifies such a usage for getopt:[10]
#include <unistd.h>
int getopt(int argc, char * const argv[],
const char *optstring);
Here argc and argv are defined exactly as they are in the C main function prototype; i.e., argc indicates the length of the argv array of strings. The optstring specifies which options to look for (ordinary alphanumeric characters other than W) and which of them accept arguments (marked with colons). For example, "vf::o:" refers to three options: an argumentless v, an optional-argument f, and a mandatory-argument o. GNU here implements a W extension for long option synonyms.[10]
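The way colons in optstring are read can be modeled with a few lines of string handling. This is only an illustrative model, not how any particular C library implements getopt; option_kind and the arg_kind enumeration are hypothetical names, and special leading characters that some implementations accept in optstring are ignored.

```c
#include <assert.h>
#include <string.h>   /* strchr */

/* Hypothetical classification of an option character. */
enum arg_kind {
    ARG_UNKNOWN  = -1,   /* character is not listed in optstring */
    ARG_NONE     = 0,    /* listed with no colon */
    ARG_REQUIRED = 1,    /* listed with one colon */
    ARG_OPTIONAL = 2     /* listed with two colons */
};

/* Look up opt in optstring and report what the colons after it mean. */
static enum arg_kind option_kind(const char *optstring, char opt)
{
    const char *p;

    assert(optstring != NULL);
    if (opt == ':' || opt == '\0')
        return ARG_UNKNOWN;        /* never valid option characters */

    p = strchr(optstring, opt);
    if (p == NULL)
        return ARG_UNKNOWN;
    if (p[1] != ':')
        return ARG_NONE;
    return p[2] == ':' ? ARG_OPTIONAL : ARG_REQUIRED;
}
```

Applied to "vf::o:", the model classifies v as taking no argument, f as taking an optional argument, and o as requiring one.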
getopt itself returns an integer that is either an option character or -1 for end-of-options.[10] The idiom is to use a while-loop to go through options, and to use a switch-case statement to pick and act on options. See the example section of this article.
To communicate extra information back to the program, a few global extern variables are referenced by the program to fetch information from getopt:
extern char *optarg;
extern int optind, opterr, optopt;
- optarg: A pointer to the argument of the current option, if present.
- optind: Where getopt is currently looking at in argv. Can be used to control where to start parsing (again).
- opterr: A boolean switch controlling whether getopt should print error messages.
- optopt: If an unrecognized option occurs, the value of that unrecognized character.
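A program that wants to produce its own diagnostics might use opterr and optopt roughly as sketched below. The check_options helper and its "ab:" option set are hypothetical. The leading colon in the option string is standard POSIX behavior: it makes getopt return ':' for a missing argument instead of '?', so the two error cases can be told apart.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>   /* getopt, optind, opterr, optopt */

/* Hypothetical helper for a program that accepts -a and -b ARG.
 * Returns 0 if every option was valid, '?' if the first problem was an
 * unknown option, or ':' if it was a missing argument. The offending
 * option character is stored in *bad_option. */
static int check_options(int argc, char *argv[], int *bad_option)
{
    int c, status = 0;

    assert(bad_option != NULL);
    *bad_option = 0;
    optind = 1;   /* start at argv[1]; assumed sufficient to rescan */
    opterr = 0;   /* getopt should stay silent; the caller reports errors */

    while ((c = getopt(argc, argv, ":ab:")) != -1) {
        if ((c == '?' || c == ':') && status == 0) {
            status = c;             /* remember only the first problem */
            *bad_option = optopt;   /* optopt holds the offending character */
        }
    }
    return status;
}
```

The caller can then format a message of its own, for example naming the option stored in bad_option, instead of relying on the text getopt would have printed.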
The GNU extension getopt_long interface is similar, although it belongs to a different header file and takes two extra parameters: a table of struct option entries that describes the long options, including the value to return for each (typically its "short" name), and a longindex pointer. When a long option is matched and longindex is not NULL, getopt_long stores the index of the matching table entry there, which lets the program identify long options that have no short name.[10]
#include <getopt.h>
int getopt_long(int argc, char * const argv[],
const char *optstring,
const struct option *longopts, int *longindex);
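The third field of struct option, flag, offers another way to handle a long option: when it is not NULL, getopt_long stores the entry's val in the pointed-to variable and returns 0. The sketch below uses this for a --verbose switch and maps --output to the short option -o. The option names, struct settings, and parse_long are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <getopt.h>   /* getopt_long, struct option, optarg, optind */
#include <stddef.h>

/* Hypothetical settings record for this sketch. */
struct settings {
    int verbose;          /* 1 if --verbose was given */
    const char *output;   /* argument of --output / -o, or NULL */
};

/* Written directly by getopt_long through the flag pointer below. */
static int verbose_flag;

/* Returns the index of the first non-option argument, or -1 on error. */
static int parse_long(int argc, char *argv[], struct settings *s)
{
    static const struct option longopts[] = {
        /* name       has_arg            flag           val */
        {"verbose", no_argument,       &verbose_flag, 1},
        {"output",  required_argument, NULL,          'o'},
        {NULL,      0,                 NULL,          0}
    };
    int c, failed = 0;

    assert(s != NULL);
    verbose_flag = 0;
    s->output = NULL;
    optind = 1;   /* start at argv[1]; assumed sufficient to rescan */
    opterr = 0;   /* no automatic diagnostics in this sketch */

    while ((c = getopt_long(argc, argv, "o:", longopts, NULL)) != -1) {
        switch (c) {
        case 0:                  /* a flag-style long option was stored */
            break;
        case 'o':                /* reached via -o or --output */
            s->output = optarg;
            break;
        default:                 /* unknown option or missing argument */
            failed = 1;
            break;
        }
    }
    s->verbose = verbose_flag;
    return failed ? -1 : optind;
}
```

Because --output returns 'o', a single case handles both spellings, and a later occurrence simply overrides an earlier one.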
Examples
Using POSIX standard getopt
#include <stdio.h>     /* for printf */
#include <stdlib.h>    /* for exit */
#include <unistd.h>    /* for getopt */
int main (int argc, char **argv) {
    int c;
    int digit_optind = 0;
    int aopt = 0, bopt = 0;
    char *copt = 0, *dopt = 0;
    /* a, b and the digits take no argument; c and d require one */
    while ((c = getopt(argc, argv, "abc:d:012")) != -1) {
        int this_option_optind = optind ? optind : 1;
        switch (c) {
        case '0':
        case '1':
        case '2':
            if (digit_optind != 0 && digit_optind != this_option_optind) {
                printf ("digits occur in two different argv-elements.\n");
            }
            digit_optind = this_option_optind;
            printf ("option %c\n", c);
            break;
        case 'a':
            printf ("option a\n");
            aopt = 1;
            break;
        case 'b':
            printf ("option b\n");
            bopt = 1;
            break;
        case 'c':
            printf ("option c with value '%s'\n", optarg);
            copt = optarg;
            break;
        case 'd':
            printf ("option d with value '%s'\n", optarg);
            dopt = optarg;
            break;
        case '?':
            /* unknown option or missing argument; getopt has printed a message */
            break;
        default:
            printf ("?? getopt returned character code 0%o ??\n", c);
        }
    }
    /* optind now indexes the first non-option argument, if any */
    if (optind < argc) {
        printf ("non-option ARGV-elements: ");
        while (optind < argc) {
            printf ("%s ", argv[optind++]);
        }
        printf ("\n");
    }
    exit (0);
}
Using GNU extension getopt_long
#include <stdio.h>     /* for printf */
#include <stdlib.h>    /* for exit */
#include <getopt.h>    /* for getopt_long; POSIX standard getopt is in unistd.h */
int main (int argc, char **argv) {
    int c;
    int digit_optind = 0;
    int aopt = 0, bopt = 0;
    char *copt = 0, *dopt = 0;
    static struct option long_options[] = {
    /*   NAME       ARGUMENT           FLAG  SHORTNAME */
        {"add",     required_argument, NULL, 0},
        {"append",  no_argument,       NULL, 0},
        {"delete",  required_argument, NULL, 0},
        {"verbose", no_argument,       NULL, 0},
        {"create",  required_argument, NULL, 'c'},
        {"file",    required_argument, NULL, 0},
        {NULL,      0,                 NULL, 0}
    };
    int option_index = 0;
    while ((c = getopt_long(argc, argv, "abc:d:012",
                            long_options, &option_index)) != -1) {
        int this_option_optind = optind ? optind : 1;
        switch (c) {
        case 0:
            /* a long option without a short name; option_index identifies it */
            printf ("option %s", long_options[option_index].name);
            if (optarg) {
                printf (" with arg %s", optarg);
            }
            printf ("\n");
            break;
        case '0':
        case '1':
        case '2':
            if (digit_optind != 0 && digit_optind != this_option_optind) {
                printf ("digits occur in two different argv-elements.\n");
            }
            digit_optind = this_option_optind;
            printf ("option %c\n", c);
            break;
        case 'a':
            printf ("option a\n");
            aopt = 1;
            break;
        case 'b':
            printf ("option b\n");
            bopt = 1;
            break;
        case 'c':
            /* reached via -c or --create */
            printf ("option c with value '%s'\n", optarg);
            copt = optarg;
            break;
        case 'd':
            printf ("option d with value '%s'\n", optarg);
            dopt = optarg;
            break;
        case '?':
            /* unknown option or missing argument; getopt_long has printed a message */
            break;
        default:
            printf ("?? getopt returned character code 0%o ??\n", c);
        }
    }
    /* optind now indexes the first non-option argument, if any */
    if (optind < argc) {
        printf ("non-option ARGV-elements: ");
        while (optind < argc) {
            printf ("%s ", argv[optind++]);
        }
        printf ("\n");
    }
    exit (0);
}
In Shell
Shell script programmers commonly want to provide a consistent way of providing options. To achieve this goal, they turn to getopt and seek to port it to their own language.
The first attempt at porting was the program getopt, implemented by Unix System Laboratories (USL). This version was unable to deal with quoting and shell metacharacters, because it makes no attempt to quote its output. It was inherited by FreeBSD.[11]
In 1986, USL decided that being unsafe around metacharacters and whitespace was no longer acceptable, and they created the builtin getopts command for the Unix SVR3 Bourne Shell instead. The advantage of building the command into the shell is that it has access to the shell's variables, so values can be written safely without quoting. It uses the shell's own variables, OPTIND and OPTARG, to track the current position and the current option's argument, and returns the option name in a shell variable.
In 1995, getopts was included in the Single UNIX Specification version 1 / X/Open Portability Guidelines Issue 4.[12] Now a part of the POSIX Shell standard, getopts has spread far and wide to many other shells trying to be POSIX-compliant.
getopt was basically forgotten until util-linux came out with an enhanced version that fixed all of old getopt's problems by escaping. It also supports GNU's long option names.[13] On the other hand, long options have rarely been implemented in the getopts command in other shells, ksh93 being an exception.
In other languages
getopt is a concise description of the common POSIX command argument structure, and it is replicated widely by programmers seeking to provide a similar interface, both to themselves and to the user on the command-line.
- C: non-POSIX systems do not ship getopt in the C library, but gnulib[4] and MinGW (both accept GNU-style), as well as some more minimal libraries, can be used to provide the functionality.[14] Alternative interfaces also exist:
  - The popt library, used by RPM package manager, has the additional advantage of being reentrant.
  - The argp family of functions in glibc and gnulib provides some more convenience and modularity.
- D programming language: has getopt module in the D standard library.
- Go: comes with the flag package,[15] which allows long flag names. The getopt package[16] supports processing closer to the C function. There is also another getopt package[17] providing an interface much closer to the original POSIX one.
- Haskell: comes with System.Console.GetOpt, which is essentially a Haskell port of the GNU getopt library.[18]
- Java: There is no implementation of getopt in the Java standard library. Several open source modules exist, including gnu.getopt.Getopt, which is ported from GNU getopt,[19] and Apache Commons CLI.[20]
- Lisp: has many different dialects with no common standard library. There are some third party implementations of getopt for some dialects of Lisp. Common Lisp has a prominent third party implementation.
- Free Pascal: has its own implementation as one of its standard units named GetOpts. It is supported on all platforms.
- Perl programming language: has two separate derivatives of getopt in its standard library: Getopt::Long[21] and Getopt::Std.[22]
- PHP: has a getopt function.[23]
- Python: contains a module in its standard library based on C's getopt and GNU extensions.[24] Python's standard library also contains other modules to parse options that are more convenient to use.[25][26]
- Ruby: has an implementation of getopt_long in its standard library, GetoptLong. Ruby also has modules in its standard library with a more sophisticated and convenient interface. A third party implementation of the original getopt interface is available.
- .NET Framework: does not have getopt functionality in its standard library. Third-party implementations are available.[27]
References
- ^ "usr/src/lib/libc/pdp11/gen/getopt.c". From System III, released June 1980, linked here from Warren Toomey's The Unix Tree project. Archived from the original on 2023-05-12. Retrieved 2024-04-22.
- ^ Quarterman, John (1985-11-03). "public domain AT&T getopt source". linux.co.cr (originally in mod.std.unix newsgroup). Archived from the original on 2023-05-12. Retrieved 2024-04-22.
- ^ FreeBSD Library Functions Manual –
- ^ a b c d e "getopt". GNU Gnulib. Retrieved 23 January 2020.
- ^ FreeBSD Library Functions Manual –
- ^ "getopt(3)". Oracle Solaris 11.2 Information Library.
- ^ "Utility Conventions". POSIX.1-2018.
- ^ "Argument Syntax". The GNU C Library. Retrieved 24 January 2020.
- ^ David-John, Burrowes; Kowalski III, Joseph E. (22 Jan 2003). "CLIP Specification, Version 1.0, PSARC 1999/645" (PDF).
- ^ a b c d Linux Library Functions Manual –
- ^ FreeBSD General Commands Manual –
- ^ "getopts". The Open Group (POSIX 2018).
- ^ Linux User Manual – User Commands –
- ^ "visual studio - getopt.h: Compiling Linux C-Code in Windows". Stack Overflow.
- ^ "Package flag".
- ^ "Package getopt".
- ^ "Package getopt".
- ^ "System.Console.GetOpt".
- ^ "Class gnu.getopt.Getopt". Retrieved 2013-06-24.
- ^ "Commons CLI". Apache Commons. Apache Software Foundation. February 27, 2013. Retrieved June 24, 2013.
- ^ "Getopt::Long - perldoc.perl.org".
- ^ "Getopt::Std - perldoc.perl.org".
- ^ "PHP: getopt - Manual".
- ^ "16.5. getopt — C-style parser for command line options — Python 3.6.0 documentation".
- ^ "Parser for command line options". Retrieved 2013-04-30. Deprecated since version 2.7
- ^ "Parser for command-line options, arguments and sub-commands". Retrieved 2013-04-30.
- ^ "GNU Getopt .NET". GitHub.