Atoms, Lists and Strings

This section includes predicates for identifying, comparing and manipulating atoms, lists and strings. The categories are:

Atom Processing

List Processing

String Processing

Character List Processing

Atom Processing

A number of predicates are provided for manipulating and converting atoms.

atom_codes(Atom, CharList)

atom_codes/2 converts between an atom and a list of character codes. If Atom is an atom, then CharList is unified with the list of character codes for the name of Atom. If Atom is uninstantiated and CharList is a list of character codes, then Atom is instantiated to the atom whose name is formed from CharList.

?- atom_codes(abc, X).
X = [0w0061, 0w0062, 0w0063] 
yes

?- atom_codes(X, "abc").
X = abc 
yes

atomlist_concat(AtomList, Atom)

atomlist_concat/2 concatenates all of the atoms in the list AtomList to create a single atom, Atom.

?- atomlist_concat([cats,and,dogs], X).
X = catsanddogs 
yes

atom_concat(Atom1, Atom2, AtomVar3)

atom_concat/3 concatenates atoms Atom1 and Atom2 and unifies the result with AtomVar3. It can also generate, on backtracking, all possible pairs of atoms Atom1 and Atom2 that concatenate to a given atom AtomVar3.

?- atom_concat(cats, dogs, X).
X = catsdogs 
yes

?- atom_concat(A, B, abc).
A = ''
B = abc ;

A = a
B = bc ;

A = ab
B = c ;

A = abc
B = '' ;
no

atom_length(Atom, Length)

atom_length/2 unifies the length of Atom with Length.
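
As a small illustration, the sketch below uses atom_length/2 to check an atom against a maximum size. The helper name fits/2 is hypothetical and not part of the library.

```prolog
% fits(+Atom, +Max): hypothetical helper, true if Atom has at most Max characters.
fits(Atom, Max) :-
   atom_length(Atom, Len),
   Len =< Max.

% ?- atom_length(catsanddogs, L).   % L = 11
% ?- fits(cats, 8).                 % succeeds
% ?- fits(catsanddogs, 8).          % fails
```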

atom_uplow(AtomUpper, AtomLower)

atom_uplow/2 creates a new uppercase atom, AtomUpper, from a lowercase atom, AtomLower, and vice versa.
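
A minimal sketch of calling atom_uplow/2 in the lower-to-upper direction, assuming the argument order given in the signature above (uppercase first, lowercase second). The wrapper shout/2 is a hypothetical name used only for illustration.

```prolog
% shout(+Lower, -Upper): hypothetical wrapper that puts the input first.
shout(Lower, Upper) :-
   atom_uplow(Upper, Lower).

% ?- shout(cats, X).
% X should be bound to the uppercase atom 'CATS'.
```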

is_atom(X)

is_atom/1 succeeds if X is an atom.

name(Atom, CharList)

name/2 is the same as atom_codes/2. It is preserved because it is used in many old Prolog programs.

sub_atom(Atom, Index, Length, SubAtom)

sub_atom/4 extracts parts of atoms like sub_string/4 does for strings. The parameters are:

Atom: must be an atom
Index: the starting position, beginning with 1, of subatom
Length: the length of the subatom
SubAtom: the subatom

In addition to the requirement that Atom be instantiated, either Index and Length must be instantiated, or SubAtom must be. In the first case the subatom is found; in the second, the index and length are found. Backtracking is fruitful in the second case if the subatom occurs more than once.

If Index is instantiated and Length isn't, SubAtom becomes the rest of the Atom and Length is instantiated to its length.

The instantiation requirements are somewhat more restrictive than the ISO standard which specifies that sub_atom can be used to generate all possible subatoms, with index and length, from a given atom.

Example:

?- sub_atom(ratatatat, 1, 3, X).
X = rat

?- sub_atom(ratatatat, I, L, tat).
I = 3
L = 3 ;

I = 5
L = 3 ;

I = 7
L = 3 ;
no
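
The remaining case, where Index is instantiated and Length is not, can be sketched as below. This relies on the four-argument, 1-based sub_atom/4 described in this section (not the ISO sub_atom/5). The helper atom_suffix/3 is a hypothetical name.

```prolog
% atom_suffix(+Atom, +Index, -Suffix): hypothetical helper.
% Suffix is the rest of Atom starting at 1-based position Index.
atom_suffix(Atom, Index, Suffix) :-
   sub_atom(Atom, Index, _Length, Suffix).

% ?- atom_suffix(ratatatat, 7, X).
% X should be tat, the last three characters.
```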

List Processing

Some list utility predicates are basic built-in predicates. There are many more that are part of the list library. These are the built-in ones.

is_member(Term, List)

is_member/2 is a restricted version of the classic member/2 predicate (in the LIST.PLM library) that can be used for fast testing of whether Term is a member of List. It compares Term with each element using strict identity (==) rather than unification, and it cannot be used to backtrack through the various members of a list. The definition is equivalent to:

is_member(X, [Y|_]) :- X == Y, !.
is_member(X,[_|Z]) :- is_member(X,Z).
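
To show what the == test implies, the sketch below copies the two clauses above under the hypothetical name ident_member/2, so it can be loaded without clashing with the built-in. The queries show that an unbound variable is not treated as a member unless that same variable appears in the list.

```prolog
% ident_member/2: hypothetical copy of the definition shown above.
ident_member(X, [Y|_]) :- X == Y, !.
ident_member(X, [_|Z]) :- ident_member(X, Z).

% ?- ident_member(b, [a,b,c]).   % succeeds
% ?- ident_member(X, [a,b,c]).   % fails: X is not identical to a, b or c
% ?- ident_member(X, [a,X,c]).   % succeeds: the same variable is in the list
```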

sort(List, SortedList)

sort/2 sorts a list. List should be bound to a list. SortedList gets unified with the list whose elements are those of List arranged according to standard order.

?- sort([q,w,e,r,t,y,u], X).
X = [e, q, r, t, u, w, y] 
yes

keysort(KeyedList, KeyedSortedList)

keysort/2 sorts a list of keyed elements of the form Key-Item. KeyedList should be bound to a list. KeyedSortedList gets unified with the list whose elements are those of KeyedList sorted by key in standard order.

?- keysort([a-1, r-9, w-3, b-2, y-2, c-1], X).
X = [a - 1, b - 2, c - 1, r - 9, w - 3, y - 2] 
yes
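
A common use of keysort/2 is sorting by a computed key. The sketch below, which is only one way to do it, pairs each atom with its length, sorts the pairs, and strips the keys again. All three predicate names are hypothetical.

```prolog
% sort_by_length(+Atoms, -Sorted): hypothetical helper, shortest atom first.
sort_by_length(Atoms, Sorted) :-
   key_by_length(Atoms, Keyed),
   keysort(Keyed, KeyedSorted),
   strip_keys(KeyedSorted, Sorted).

% Pair each atom with its length as the key.
key_by_length([], []).
key_by_length([A|As], [L-A|Ks]) :-
   atom_length(A, L),
   key_by_length(As, Ks).

% Drop the keys, keeping the items in sorted order.
strip_keys([], []).
strip_keys([_-A|Ks], [A|As]) :-
   strip_keys(Ks, As).

% ?- sort_by_length([ccc, a, bb], X).
% X = [a, bb, ccc]
```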

String Processing

A number of predicates are designed to work with strings.

is_string(X)

is_string/1 succeeds if X is a string.

nonblank_string(String)

nonblank_string/1 takes a String as its argument and tests to make sure it contains at least one non-whitespace character. It succeeds if the string is nonblank, and fails if it's blank.
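
A minimal sketch of using nonblank_string/1 as a test. The helper classify/2 is a hypothetical name.

```prolog
% classify(+String, -Class): hypothetical helper.
% Class is nonblank if String has a non-whitespace character, else blank.
classify(String, nonblank) :- nonblank_string(String), !.
classify(_, blank).

% ?- classify(`  x  `, C).   % C = nonblank
% ?- classify(`     `, C).   % C = blank
```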

strcat(StringA, StringB, StringAB)

strcat/3 concatenates the first two strings to form the third. The first two arguments must be strings.
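
For instance, a sketch that builds a greeting from a name string. The helper greeting/2 is a hypothetical name.

```prolog
% greeting(+Name, -Greeting): hypothetical helper.
% Both inputs to strcat/3 are strings, as required.
greeting(Name, Greeting) :-
   strcat(`Hello, `, Name, Greeting).

% ?- greeting(`Leona`, G).
% G should be `Hello, Leona`
```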

stringlist_concat(StringList, String), stringlist_concat(StringList, Separator, String)

stringlist_concat/2 concatenates all of the strings or atoms in StringList to create the output String. For example:

?- stringlist_concat([ `one `, two, ` three` ], X).
X = `one two three`

stringlist_concat/3 allows the use of a separator string that will appear between all of the elements in StringList:

?- stringlist_concat([`one`,`two`,`three`], `..`, X).
X = `one..two..three`

string_atom(String, Atom)

string_atom/2 is used to convert between a string and an atom. If String is bound to a string, then Atom is unified with the corresponding atom. If Atom is bound to an atom, then String is unified with the string representation. The more general string_term/2 can be used as well.
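
A sketch of using string_atom/2 in both directions, here to build a new atom by appending a string suffix. The helper atom_with_suffix/3 is a hypothetical name, and the sketch assumes strcat/3 behaves as described in this section.

```prolog
% atom_with_suffix(+Atom, +SuffixString, -NewAtom): hypothetical helper.
atom_with_suffix(Atom, SuffixString, NewAtom) :-
   string_atom(S, Atom),            % atom -> string
   strcat(S, SuffixString, S2),     % both arguments are strings
   string_atom(S2, NewAtom).        % string -> atom

% ?- atom_with_suffix(ducky, `_v2`, X).
% X should be ducky_v2
```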

string_float(String, Float)

string_float/2 is used to convert between a string and a floating point number.

string_icomp(String1, String2)

string_icomp/2 performs a case-insensitive comparison of two strings.

?- string_icomp(`ABC`, `abc`).
yes

string_integer(String, Integer)

string_integer/2 converts back and forth between a string and its integer value (i.e., `33` and 33). The more general string_term/2 can be used as well.

If String is bound to a string of digits, then Integer is bound to the corresponding value. If Integer is bound to a value, then String is bound to the ASCII digits representing that value. For example:

?- string_integer(StrVal, 33).
StrVal = `33`

string_number(String, Number)

string_number/2 converts back and forth between a string and its numeric value.
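
As an illustration, numeric strings can be converted and then used in arithmetic. The helper total/3 is a hypothetical name, and the sketch assumes both inputs hold valid numbers.

```prolog
% total(+StringA, +StringB, -Sum): hypothetical helper.
% Converts two numeric strings to numbers and adds them.
total(StringA, StringB, Sum) :-
   string_number(StringA, A),
   string_number(StringB, B),
   Sum is A + B.

% ?- total(`2`, `3`, S).
% S should be 5
```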

string_length(String, Length)

string_length/2 requires that String be bound to a string. Length is unified with the length of the string. For example:

?- string_length(`how long am i?`, Len).
Len = 14

string_list(String, List)

string_list/2 converts back and forth between a string and its representation as a list of character codes (i.e., `foo` and "foo").

If String is bound to a string then string_list succeeds if it can unify List with the list of character codes of the characters in the string.

If List is bound to a list of character codes then this predicate succeeds if String can be unified with the string comprising the characters whose codes are in List.

For example:

?- string_list(X, [70, 79, 79]).
X = `FOO`

string_list/2 is especially useful for applications that do character-by-character parsing on input/output strings. DCG applications often use string_list/2. The following example uses some list library predicates to add a file extension to a string file name if it doesn't have one.

:- ensure_loaded(list).
:- import(list).

add_file_extension(S_FILE, S_EXT, S_OUT) :-
   string_list(S_FILE, L_FILE),
   (member(0'., L_FILE) ->
      S_OUT = S_FILE
      ;
      string_list(S_EXT, L_EXT),
      append(L_FILE, L_EXT, L_OUT),
      string_list(S_OUT, L_OUT)
   ).

Trying it:

?- [ex_string_list].
yes

?- add_file_extension(`ducky`, `.pro`, X).
X = `ducky.pro` 
yes

?- add_file_extension(`ducky.pro`, `.plm`, X).
X = `ducky.pro` 
yes

string_split(String, DelimitersS, List)

string_split/3 splits String into a list of strings, breaking it at any of the characters in the DelimitersS string, and unifies the result with List.

?- string_split(`one/two:three`, `/:`, X).
X = [`one`,`two`,`three`]

Note that string_split/3 differs from string_tokens/2,3 in that it preserves whitespace and does not return the delimiters as part of the list.
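
Because the delimiters are dropped, string_split/3 pairs naturally with stringlist_concat/3 to replace one set of separators with another. The helper replace_delims/4 below is a hypothetical name, and the sketch assumes both predicates behave as described in this section.

```prolog
% replace_delims(+String, +Delims, +Separator, -Out): hypothetical helper.
% Splits String at any character in Delims, then rejoins with Separator.
replace_delims(String, Delims, Separator, Out) :-
   string_split(String, Delims, Parts),
   stringlist_concat(Parts, Separator, Out).

% ?- replace_delims(`one/two:three`, `/:`, `, `, X).
% X should be `one, two, three`
```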

string_term(String, Term)

string_term/2 converts back and forth between a string and a term. For example:

?- string_term(`whiz(bang)`, X).
X = whiz(bang)
?- string_term(X, whiz(bang)).
X = `whiz(bang)`

string_term/2 uses the normal Prolog I/O reader and writer to read the term from the string or write the term to it. You can use catch/3 to catch bad syntax in the string, as the following example shows:

check_string :-
   repeat,
   write(`Enter term: `),
   read_string(STRING),
   catch(
      string_term(STRING, TERM),
      X,
      process_error(X) ).

process_error(X) :-
   write(`Error: `),
   write(X), nl,
   write(`Try again`), nl,
   fail.

Trying it:

?- check_string.
Enter term: duck+*(le
Error: error(syntax_error,
   [type = read, rc = 407,
    message = Unexpected operator,
    ...
    read_buffer = duck+*( **NEAR HERE** le .,
    ...])
Try again
Enter term: duck(leona)

yes

string_termq(StringQ, Term)

string_termq/2 is exactly like string_term/2, except that when creating a string from a term it adds syntax such as quotes where necessary to ensure that the string can be converted back to the same term. An atom with a leading uppercase letter, for instance, needs to be quoted so it isn't mistaken for a variable. For example:

?- string_termq(S, person(name('Walter'), addr(`Maple Way`))).
S = `person(name('Walter'), addr(``Maple Way``))`

is_string_term(String, Term)

is_string_term/2 is exactly like string_term/2, except that instead of throwing an error when an input string has a syntax error, it simply fails.
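
Because is_string_term/2 fails rather than throws, it can be used in an if-then-else without catch/3. The sketch below substitutes a default term when the string does not parse. The helper parse_or_default/3 is a hypothetical name.

```prolog
% parse_or_default(+String, +Default, -Term): hypothetical helper.
% Term is the parsed String, or Default if String has a syntax error.
parse_or_default(String, Default, Term) :-
   (  is_string_term(String, T)
   -> Term = T
   ;  Term = Default
   ).

% ?- parse_or_default(`duck(leona)`, none, X).   % X should be duck(leona)
% ?- parse_or_default(`duck+*(le`, none, X).     % X should be none
```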

string_tokens(String, TokenList), string_tokens(String, TokenList, DelimitersS)

string_tokens/2,3 takes a string and parses it into a list of tokens, where a token is a sequence of alphanumeric characters or a punctuation mark. Whitespace between the tokens is removed, with one exception: an ending period. A period embedded in text is simply a period, but one followed by white space is period space ( '. ' ). For example:

?- string_tokens(`the file log.txt is 3.4 pages. or so.`, X).
X = [the, file, log, ., txt, is, '3', ., '4', pages, '. ', or, so, '. ']

?- string_tokens(`Don't go near, or in, the water!`, X). 
X = ['Don','''',t,go,near,',',or,in,',',the,water,!]

The three-argument version allows you to specify the characters considered as punctuation marks. For example, this query might be used to parse HTML strings:

?- string_tokens(`<H2>This: A heading?</H2>`, X, `<>/`).
X = [<,'H2',>,'This:','A','heading?',<,/,'H2',>]

Note, string_tokens/2,3 differs from string_split/3 in that all delimiters are returned in the TokenList, and whitespace is not preserved.

string_trim(String, TrimmedString)

string_trim/2 trims the leading and trailing white space from String and unifies the resulting string with TrimmedString.

?- string_trim(` hello `, X).
X = `hello`
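
A sketch of a typical input-cleaning step that combines string_trim/2 with nonblank_string/1: blank input is rejected and anything else is trimmed. The helper clean_input/2 is a hypothetical name.

```prolog
% clean_input(+Raw, -Clean): hypothetical helper.
% Fails on blank input, otherwise strips surrounding white space.
clean_input(Raw, Clean) :-
   nonblank_string(Raw),
   string_trim(Raw, Clean).

% ?- clean_input(`  hello  `, X).   % X should be `hello`
% ?- clean_input(`     `, X).       % fails
```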

sub_string(String, Index, Length, SubString)

sub_string/4 is used to locate or generate a substring of a given string. String must be bound to a string. There are two cases to consider:

SubString is bound to a string. In this case sub_string/4 looks for the first occurrence of SubString in String and unifies Index and Length with the position of the start of SubString in String and its length. Backtracking will attempt to find the next occurrence of SubString in String. Note that index 1 means the first character in String.
SubString is unbound. Index and Length are bound to valid integers (i.e., 1 =< Index =< length_of(String) and Index + Length - 1 =< length_of(String)). In this case sub_string/4 unifies SubString with the substring of String at the given Index and of the given Length. If Index is bound and Length is unbound, then SubString is the rest of the string and Length is bound to its length.

For example:

?- sub_string(`i am hiding`, Where, Len, `am`).
Where = 3
Len = 2
?- sub_string(`1Q93`, 3, 2, Year).
Year = `93`
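
The case where Index is bound and Length is left unbound can be sketched as follows. The helper string_suffix/3 is a hypothetical name.

```prolog
% string_suffix(+String, +Index, -Suffix): hypothetical helper.
% Suffix is the rest of String starting at 1-based position Index.
string_suffix(String, Index, Suffix) :-
   sub_string(String, Index, _Length, Suffix).

% ?- string_suffix(`1Q93`, 3, X).
% X should be the string `93`
```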

tilt_slashes(InAS, OutAS)

tilt_slashes/2 converts slashes of either tilt in InAS to the correct path separator for the platform, which is / on Unix and \ on Windows, and unifies the result with OutAS. Note that this is not usually necessary because the file open predicates accept slashes of either tilt.

For example, this was run on a Windows version of Amzi!:

?- tilt_slashes(`abc/def/ghi`, X).
X = `abc\def\ghi`
yes

Character List Processing

A number of predicates are designed to work with character lists.

atom_chars(Atom, CharList)

atom_chars/2 converts back and forth between an atom and a list of characters.
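
As an illustration of converting in both directions, the sketch below reverses the characters of an atom. The helper names atom_reverse/2 and rev/3 are hypothetical; rev/3 is defined here so that no list library is needed.

```prolog
% atom_reverse(+Atom, -Reversed): hypothetical helper.
atom_reverse(Atom, Reversed) :-
   atom_chars(Atom, Chars),         % atom -> list of characters
   rev(Chars, [], RevChars),
   atom_chars(Reversed, RevChars).  % list of characters -> atom

% rev(+List, +Acc, -Reversed): accumulator-based list reversal.
rev([], Acc, Acc).
rev([X|Xs], Acc, R) :-
   rev(Xs, [X|Acc], R).

% ?- atom_chars(abc, L).      % L = [a, b, c]
% ?- atom_reverse(abc, X).    % X = cba
```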

char(X)

char/1 succeeds if X is a character.

char_code(Char, Code)

char_code/2 converts back and forth between a character and its character code.
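
A minimal sketch of char_code/2 used in both directions, assuming it follows the ISO convention of a single character as the first argument and its character code as the second. The helper next_char/2 is a hypothetical name.

```prolog
% next_char(+Char, -Next): hypothetical helper.
% Next is the character whose code is one higher than that of Char.
next_char(Char, Next) :-
   char_code(Char, Code),        % character -> code
   NextCode is Code + 1,
   char_code(Next, NextCode).    % code -> character

% ?- next_char(a, X).
% X = b
```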

number_chars(Number, CharList)

number_chars/2 converts back and forth between a number and a list of characters.
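
For example, number_chars/2 can be used to count the characters in the written form of a number. The helper names count_chars/2 and len/2 are hypothetical; len/2 is defined here so that no list library is needed.

```prolog
% count_chars(+Number, -Count): hypothetical helper.
count_chars(Number, Count) :-
   number_chars(Number, Chars),
   len(Chars, Count).

% len(+List, -N): length of a list.
len([], 0).
len([_|T], N) :-
   len(T, M),
   N is M + 1.

% ?- number_chars(X, ['4', '2']).   % X = 42
% ?- count_chars(2024, N).          % N = 4
```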