See also
- is_of_type/2 - checks that a variable is bound to a term of a given type and fails if not
- must_be/2 - checks that a variable is bound to a term of a given type and throws if not
There needs to be an additional which outputs the type or better yet the "type path" to the "anytype" (the root of the type tree, see below).
Note that an unbound variable could (going forward in the Prolog proof) take on any type, as in:
morphy(X) :- var(X),X=12,integer(X). % X is now bound to an integer morphy(X) :- var(X),X=foo,atom(X). % X is now bound to an atom morphy(X) :- var(X),X=f(x),compound(X). % X is now bound to a compound term
Then:
?- morphy(X). X = 12 ; X = foo ; X = f(x).
But if unbound, X itself is not considered an integer, atom or compound. It is just var(X)
- it "can become anything".
If you have a a Prolog term denoted by variable name T, what could it be?
This is basically a decision tree that a program could run, although in practice you just immediately use the predictate at the very botoom: for example, integer(T)
will tell you whether T denotes an integer or not. If yes, you immediately know that it is also a rational, a number, an atomic, a nonvar, and evidently "any".
Note the subtree under number
:
float
is a "representation type" (for want of a better word). You can have a numeric value that has a representation both as integer and float and these representations are distinguishable (1 =:= 1.0
but 1 \== 1.0
).
Similarly, every number that has a float
representation can equivalently be represented as a rational (except for infinities and negative zeros and NaNs), but the representations are distinguishable: 0.5 =:= 1r2
but 0.5 \== 1r2
.
On the other hand, integer
is a "subset type" of rational
: There is no way to distinguish the (infinitely many) representations of an integer written as a non-canonical rational from "the" integer: 5r1 =:= 10r2
and also 5r1 == 10r2
, as well as 5 =:= 10r2
and also 5 == 10r2
hold.
Special floats that could be added to the diagram:
- The NaN (Not-a-Number), which is actually a number/1, see nan/0
- The +Infinity and -Infinity, see inf/0
- The "negative 0" -0.0, which is subtly different from +0.0 alias 0.0 (see Wikipedia: Signed Zero and NaN and Infinity floats and their syntax
See also in this manual:
2020-12-14 any T | +-----------------------------------+---------------------------+ | | var(T) nonvar(T) the variable name T denotes an 'unbound the complement of var(T) variable' (aka. uninstantiated variable), | which is an empty memory cell distinguishable | by '==' at the time the call to var/1 occurs | - a nonlogical predicate about the state of the | computation | - this predicate should have been called | unbound(T); a historical accident! | +---------------------------+-----------------------------+ | | atomic(T) compound(T) | | +----------------------------------+------------------------+ +--------------+-------------+ | | | | | blob(T,_) string(T) number(T) compound term compound term | | of arity 0 of arity > 0 | | (SWI-Prolog only) | | | | | | | +-----------------------+-----------------------+ +---------+---------+ +------------------+-------------------+ | | | | | | | | (other blob types) blob(T,reserved_symbol) blob(T,text) rational(T,Nu,De) float(T) SWI-Prolog dict first listcell others encapsulated foreign | atom(T) | of a list backbone resources | | | an encapsulated '[|]'(H,Rs) | | | data structure; for alias [H|Rs] +--------+--------+ | +----------+----------+ now implemented as | | | | | a compound term) "list" is a *nonlocal* T==[] T\==[] | rational(X),\+integer(X) integer(T) structure based on convention empty list for now only | "proper rational" there may or may not be an the unfakeable | actual list beyond the first dict functor name | listcell | +------------------+------------------+ | | | length=0 length=1 atom with the empty atom character/char length>1
The above as easily printable A4 PDF
More complete in in SVG.
Some notes I took about this.
The SWI-Prolog Wiki has more on SWI-Prolog datatypes here.
Interestingly, the empty list symbol []
can be used as functor name:
?- compound_name_arguments(F, [],[x,y]). F = [](x,y).
Blobs can be used as functor names in general. However, performing that party trick is dis-recommended.
A symbol that is missing
In SQL, we have the value NULL, indicating missing data. This might be of some use in Prolog facts, and "officialize" a NULL value (what do we use currently? _? , '', "", []? It depends!)
Yes we can use the atom null
, but having a dedicated NULL would be more expressive.
No booleans
Perphas surprisingly for a logic programming language, there is no dedicated "boolean" type (there seem to be no ready-to-use logical connectives working on "boolean" either but they are easy to write). Instead the atoms true
and false
are (generally) used when you want to "reify" the truth values:
xor(true,false,true) :- !. xor(false,true,true) :- !. xor(true,true,false) :- !. xor(false,false,false) :- !.
There are 0-ary predicates of the same name true/0 and false/0 so you can call those to "un-reify" the truth value, obtaining proper success or failure. An atom at goal position is interpreted as a call to the similarly named 0-ary predicate:
?- true. true. ?- false. false.
There is also an undefined/0 truth-value predicate used in well-founded semantics:
?- undefined. undefined.
Indeed, on can refer to must_be/2: must_be(boolean,X)
checks whether X is one of the atoms true
or false
. (It doesn't accept a third truth value undefined
, used tabling, either).
Thoughts
- There should be a special "NULL" atomic value similar as for SQL to designate missing information. Currently one has to resort to contortions express NULL, choosing atoms with special meanings or using _, which is an unbund variable, not "NULL".
- Actual arrays with access-by-numeric-index can be implemented with (large) compound terms.
- The blob type gives the possibility to create unfakeable identifiers (i.e. enum types); there is absolutely use for this (think of manipulating a syntax tree where you really want to distinguish the variables that appear in the AST from the rest without awkward conventions about allowed names or using tagging) but it is not supported for now. You will have to add C code. See: blob
- Why is Prolog still not "sorted" after all these years? It would be Prolog++ of course. Would still be worth it.
The concept of the Prolog "term" is still very primitive. In Higher-Order Logic Programming as Constraint Logic Programming (Spiro Michaylov, Frank Pfenning), 1993, we read:
Higher-order logic programming (HOLP) languages typically use a typed λ-calculus as their domain of computation. In the case of λProlog it is the simply-typed λ-calculus, while in the case of Elf it is a dependently typed λ-calculus. These languages are particularly useful for various kinds of meta-programming and theorem proving tasks because of the logical support for variable binding via λ-abstraction. They have been used for a wide range of applications including theorem proving, programming language interpretation, type inference, compilation, and natural language parsing.
However:
Full unification in higher-order languages is clearly impractical, due to the non- existence of minimal complete sets of most-general unifiers. Therefore, work on λProlog has used Huet’s algorithm for pre-unification, where so-called flex-flex pairs (which are always unifiable) are maintained as constraints, rather than being incorporated in an explicit parametric form. Yet, even pre-unifiability is undecidable, and sets of most general pre-unifiers may be infinite. While undecidability has not turned out to be a severe problem, the lack of unique most general unifiers makes it difficult to accurately predict the run-time behavior of λProlog programs that attempt to take advantage of full higher-order pre-unification. It can result in thrashing when certain combinations of unification problems have to be solved by extensive backtracking. Moreover, in a straightforward implementation, common cases of unification incur a high overhead compared to first-order unification. These problems have led to a search for natural, decidable subcases of higher-order unification where most general unifiers exist. Miller has suggested a syntactic restriction (Lλ) to λProlog, easily extensible to related languages, where most general unifiers are unique modulo βηα-equivalence
Going further, one attains logic programming on top of proper object-oriented database objects: F-Logic
A type tree just for "terms that are _texty_"
So I was thinking about what the type of a "text thing" could be and this seems appropriate:
Categorization always categorizes a term as not belonging to this tree or else as belonging to exactly one of the leaf types. Types nearer the root type just regroup sutypes in a meaningful way (for the programmer).
+--- var?: an unbound variable; can become anything | | | +--- openvarlist?: nonempty open list of > 0 vars | | +--- openlist? -----+--- opencharylist?: nonempty open list of > 0 char, may contain vars | | | +--- opencodeylist?: nonempty open list of > 0 unicode code points, may contain vars | | +--- varlist?: list of N > 0 vars; edge case | | +--- changylist? ---+--- charylist?: list of C > 0 chars and V > 0 vars | | | +--- codeylist?: list of C> 0 unicode code points and V > 0 vars | | +--- emptylist!: empty list; edge case as not sure whether charlist or codelist | | | +--- textlist! ---+--- charlist!: nonempty list of chars (atoms of length 1) | | | | | +--- codelist!: nonempty list of unicode code points (integers between 0 and 0x10FFFF) | | texty* ---+ +--- text! ---+ | | | | | | +--- atom!: Prolog atoms including the empty atom '' (but not the empty list []) | | | | +---anytext! ---+ +--- stringy! ---+ | | | +--- string!: SWI-Prolog strings including the empty string "" | +--- number! acceptable because a number can be transformed into text (according to some unspecified convention...)
Show me the code you say?
An implementation of the above type tree can be found as pre-release here
See also: Predicates that operate on strings