Did you know ... | Search Documentation: |
Managing external tables |
This section describes the predicates required for creating and destroying the access to external database tables.
ColumnName(
Type [, ColumnOptions])
Type denotes the Prolog type to which the field should be converted and is one of:
integer | Convert to a Prolog integer. The input is treated as a decimal number. |
hexadecimal | Convert to a Prolog integer. The input is treated as a hex number. |
float | Convert to a Prolog floating point
number. The input is handled by the C-library function
strtod() . |
atom | Convert to a Prolog atom. |
string | Convert to a SWI-Prolog string object. |
code_list | Convert to a list of ASCII codes. |
ColumnOptions is a list of additional properties of the column. Supported values are:
sorted | The field is strictly sorted, but may have (adjacent) duplicate entries. If the field is textual, it should be sorted alphabetically, otherwise it should be sorted numerically. |
sorted(+Table) | The (textual) field is sorted using the ordering declared by the named ordering table. This option may be used to define reverse order,‘dictionary’order or other irregular alphabetical ordering. See new_order_table/2. |
unique | This column has distinct values for each row in the table. |
downcase | Map all uppercase in the field to lowercase before converting to a Prolog atom, string or code_list. |
map_space_to_underscore | Map spaces to underscores before converting to a Prolog atom, string or code_list. |
syntax | For numerical fields. If the field does not contain a valid number, matching the value fails. Reading the value returns the value as an atom. |
width(+Chars) | Field has fixed width of the specified number of characters. The column-separator is not considered for this column. |
arg(+Index) | For read_table_record/4, unify the field with the given argument of the record term. Further fields will be assigned index+1, ... . |
skip | Don't convert this field to Prolog. The field is simply skipped without checking for consistency. |
The Options argument is a list of global options for the table. Defined options are:
record_separator(+Code) | Character (ASCII) value of the character separating two records. Default is the newline (ASCII 10). |
field_separator(+Code) | Character (ASCII) value of the character separating two fields in a record. Default is the space (ASCII 32), which also has a special meaning. Two fields separated by a space may be separated by any non-empty sequence of spaces and tab (ASCII 9) characters. For all other separators, a single character separates the fields. |
encoding(+Encoding) | Text
encoding of the file. Values are iso_latin_1 (default),
utf8 or native . The latter uses the native
multibyte to unicode conversion. |
escape(+Code, +ListOfMap) | Sometimes,
a table defines escape sequences to make it possible to use the
separator-characters in text-fields. This options provides a simple way
to handle some standard cases. Code is the ASCII
code of the character that leads the escape sequence. The default is
-1 , and thus never matched.
ListOfMap is a list of
From = To character mappings. The
default map table is the identity map, unless Code refers to
the
\ character, in which case
\b , \e , \n , \r and \t
have their usual meaning. |
functor(+Head) | Functor used by read_table_record/4.
Default is record using the maximal argument index of the
fields as arity. |
If the options are parsed successfully, Handle is unified with a term that may be used as a handle to the table for future operations on it. Note that new_table/4 does not access the file system, so its success only indicates the description could be parsed, not the presence, access or format of the file.
This section describes the predicates to read data from a table.
Records are addressed by their offset in the table (file). As records have generally non-fixed length, searching is often required. The predicates below allow for finding records in the file.
file | Unify value with the name of the file with which the table is associated. |
field(N) | Unify value with declaration of n-th (1-based) field. |
field_separator | Unify value with the field separator character. |
record_separator | Unify value with the record separator character. |
key_field | Unify value with the 1-based index of the field that is sorted or fails if the table contains no sorted fields. |
field_count | Unify value with the total number of columns in the table. |
size | Unify value with the number of characters in the table-file, not the number of records. |
window | Unify value with a term Start
- Size, indicating the properties of the current
window. |
There are two predicates for reading records. The read_table_record/4 reads an entire record, while read_table_fields/4 reads one or more fields from a record.
record
and arity
the number of not-skipped columns), each of the arguments containing the
converted data. An error is raised if the data could not be converted. Next
is unified with the start position for the next record.
Fields is a list of field specifiers. Each specifier is of the format:
FieldName(Value [, Options])
Options is a list of options to specify the search. By
default, the package will search for an exact match, possibly using the
ordering table associated with the field (see order
option
in new_table/4).
Options are:
prefix | Uses prefix search with the default table. |
prefix(Table) | Uses prefix search with the specified ordering table. |
substring | Searches for a substring in the field. This requires linear search of the table. |
substring(Table) | Searches for a substring, using the table information for determining the equivalence of characters. |
= | Default equivalence. |
=(Table) | Equivalence using the given table. |
If Value is unbound (i.e. a variable), the record is considered not specified. The possible option list is ignored. If a match is found on the remaining fields, the variable is unified with the value found in the field.
First, the system checks whether there is an ordered field that is specified. In this case, binary search is employed to find the matching record(s). Otherwise, linear search is used.
If the match contains a specified field that has the property
unique
set (see new_table/4), in_table/3
succeeds deterministically. Otherwise it will create a backtrack-point
and backtracking will yield further solutions to the query.
in_table/3
may be comfortable used to bind the table transparently to a predicate.
For example, we have a file with lines of the format.1This
is the disproot.dat
table from the AAT
database used in GRASP
C1C2,Full Name
C1C2 is a two-character identifier used in the other tables, and FullName is the description of the identifier. We want to have a predicate identifier_name(?Id, ?FullName) to reflect this table. The code below does the trick:
:- dynamic stored_idtable_handle/1. idtable(Handle) :- stored_idtable_handle(Handle). idtable(Handle) :- new_table('disproot.dat', [ id(atom, [downcase, sorted, unique]), name(atom) ], [ field_separator(0',) ], Handle), assert(stored_idtable_handle(Handle)). identifier_name(Id, Name) :- idtable(Handle), in_table(Handle, [id(Id), name(Name)], _).