This is a library for working with data matrices, taking off from where library(csv) ends.
The library will hopefully grow to become a useful tool for logic programming based data science.
In theory the library supports polymorphic representations of matrices, but in its
current form it is best to assume that the canonical form (see mtx/1) is the only one supported.
The library should be considered as still in developmental flux.
License: MIT.
At the very least, library(mtx) can be viewed as an enhancement to the I/O of matrices to and from files via mtx/2.
The library can interrogate the data/ subdirectory of all installed packs for csv files using alias data.
?- mtx( data(mtcars), Mtcars ).
Mtcars = [row(mpg, cyl, disp, hp, ....
Where mtcars.csv is in some pack's data directory.
?- mtx_data( mtcars, Mtcars ).
Mtcars = [row(mpg, cyl, disp, hp, ....
Where mtcars.csv is in the data subdirectory of pack(mtx).
mtx/2 works both as input and output.
If the 2nd argument is ground, mtx/2 will output the 2nd argument to the file pointed to by the 1st.
Otherwise, the file given in the 1st argument is read into the 2nd argument in standard form.
?- tmp_file( mtc, TmpF ), mtx( pack('mtx/data/mtcars'), Mtc ), mtx( TmpF, Mtc ).
TmpF = '/tmp/pl_mtc_14092_0',
Mtc = [row(mpg, cyl,
The first call to mtx/2 above reads the test csv mtcars.csv into Mtc (instantiated to a list of rows).
The second call outputs Mtc to the temporary file TmpF.
mtx/3 provides a couple of options on top of csv_read_file/3 and csv_write_file/3.
sep(Sep)
is short for separator; it also understands comma, tab and space (see mtx_sep/2).
match(Match)
is short for match_arity(Match).
?- mtx( data(mtcars), Mtcars, sep(comma) ).
Mtcars = [row(mpg, cyl, disp, hp, ....)|...]
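As a rough illustration of the two library(csv) predicates that mtx/3 is described as building on, the sketch below writes rows with an explicit tab separator and reads them back. It uses only library(csv), not pack(mtx); the helper name csv_round_trip/2 is made up for this example.

```prolog
:- use_module(library(csv)).

% csv_round_trip(+Rows, -Back)
% Hypothetical helper: write Rows tab-separated to a temporary file,
% read them back with the same options, then remove the file.
csv_round_trip(Rows, Back) :-
    tmp_file(mtx_sketch, TmpF),
    Opts = [separator(0'\t), match_arity(false)],
    csv_write_file(TmpF, Rows, Opts),
    csv_read_file(TmpF, Back, Opts),
    delete_file(TmpF).
```

A call such as `csv_round_trip([row(a,b,c),row(1,2,3)], Back)` should bind Back to the same list of rows. The separator(0'\t) and match_arity(false) terms are the long forms that sep(tab) and match(false) abbreviate.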
If a predicate definition has both Cnm and Cps define them in that order.
Good starting points are the documentation for mtx/1, mtx/2 and mtx/3.
This is a synonym for mtx(Mtx, _Canonical). Cite this predicate for valid input representations of Mtx variables.
Valid representations are (see mtx_type/2):
Notes for developers
For examples use:
?- mtx_data( mtcars, Mtcars ).
Mtcars = [row(mpg, cyl, disp, hp, ....
?- mtx( pack(mtx/data/mtcars), Mtc ).
?- mtx( data(mtcars), Mtx ).
Variable naming conventions
If a predicate definition has both Cnm and Cps define them in that order.
?- mtx_data( mtcars, Cars ), mtx( Cars ).
The canonical representation of a matrix is a list of compounds, the first of which is the header and the rest are the rows. The term name of the compounds is not enforced; by convention the header is named either hdr or row, and the rows are usually named row.
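A minimal sketch of working directly with the canonical form, using plain SWI-Prolog rather than the library's own predicates. The helper canonical_column/3 is hypothetical and assumes the first term of the list is the header.

```prolog
:- use_module(library(lists)).

% canonical_column(+Mtx, +Cnm, -Values)
% Hypothetical helper: locate column name Cnm among the header's
% arguments, then collect that argument from every remaining row.
canonical_column([Hdr|Rows], Cnm, Values) :-
    arg(Pos, Hdr, Cnm), !,
    findall(V, (member(Row, Rows), arg(Pos, Row, V)), Values).
```

Because only arg/3 is used, the term names of the header and the rows play no role, which matches the convention that they are not enforced.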
When Opts is missing, it is set to the empty list (see options/2).
Modes
When +Any is ground and -Canonical is unbound, Any is converted from any of the accepted input formats (see mtx_type/2) to the canonical form.
When both +Canonical and +Res are ground, Res is taken to be a file to write Canonical on.
Under +Canonical and -Res, Res is bound to Canonical (allows non-output).
This predicate is often called from within mtx pack predicates to translate inputs/outputs to canonical matrices, before and after performing the intended operations.
The predicate can be used with data/1 alias, to look at data directories of packs for input data matrices.
The following three calls are equivalent.
?- mtx( data(mtcars), Mtcars, sep(comma) ).
?- mtx( data(mtcars), Mtcars ).
?- mtx( pack('mtx/data/mtcars.csv'), Mtcars ).
Data matrices can be debugged via the dims and length goals in debug_call/3.
?- debug(mtx_ex).
?- use_module(library(lib)).
?- lib(debug_call).
?- mtx( data(mtcars), Mtcars ), debug_call( mtx_ex, dims, mtcars/Mtcars ).
% Dimensions for matrix, (mtcars) nR: 33, nC: 11.
Mtcars = [row(mpg, cyl, disp, hp, ....)|...]
?- mtx( data(mtcars), Mtcars ), debug_call( mtx_ex, len, mtcars/Mtcars ).
?- mtx( data(mtcars), Mtcars ), debug_call( mtx_ex, length, mtcars/Mtcars ).
% Length for list, mtcars: 33
Mtcars = [row(mpg, cyl, disp, hp, ....)|...]
Options
Opts is a term or list of terms from the following:
mtx(Handle,Mtx).
convert(Conv)
to Wopts and Ropts (the default here, flips the current convert(true) default in csv_write_file/3 - also for read)
match_arity(Match)
into Wopts and Ropts
call(RowG,Ln,RowIn,RowOut)
which allows arbitrary transformation of Rows while reading-in (see example below)
separator(SepCode)
into Wopts and Ropts, via mtx_sep(Sep,SepCode), mtx_sep/2
?- mtx( pack(mtx/data/mtcars), Cars ), length( Cars, Length ).
Cars = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, ....],
Length = 33.
?- mtx( What, [hdr(a,b,c),row(1,2,3),row(4,5,6),row(7,8,9)], [output_file(testo)] ).
What = testo.
?- shell( 'more testo' ).
a,b,c
1,2,3
4,5,6
7,8,9
true.
?- mtx( What, [hdr(a,b,c),row(1,2,3),row(4,5,6),row(7,8,9)], [input_file('testo.csv'),output_postfix('_demo')] ).
What = testo_demo.csv.
?- mtx( pack(mtx/data/mtcars), Cars, cache(cars) ).
Cars = [row(mpg, cyl...)|...]
?- debug(mtx(mtx)).
?- mtx( cars, Cars ).
Using cached mtx with handle: cars
Cars = [row(mpg, cyl...)|...]
?- mtx( pack(mtx/data/mtcars), Mtx, cache(mtcars) ), assert(mc(Mtx)), length( Mtx, Len ).
...
Len = 33.
?- mtx( mtcars, Mtcars ), length( Mtcars, Len ).
...
Len = 33.
?- mtx( mc, Mc), length( Mc, Len ).
...
Len = 33.
?- assert( ( only_c_b(Cb,Ln,RowIn,RowOut) :-
       ( Ln=:=1 -> once(arg(Cb,RowIn,c_b)), RowOut = row(c_b)
       ; arg(Cb,RowIn,CbItem), RowOut = row(CbItem) ) ) ).
?- tmp_file( testo, TmpF ),
   csv_write_file( TmpF, [row(c_a,c_b,c_c),row(1,a,b),row(2,aa,bb)], [] ),
   mtx( TmpF, Mtx, row_call(only_c_b(_)) ).
TmpF = '/tmp/swipl_testo_8588_1',
Mtx = [row(c_b), row(a), row(aa)].
?- mtx( '/tmp/swipl_testo_8588_1', Full ).
Full = [row(c_a, c_b, c_c), row(1, a, b), row(2, aa, bb)].
?- mtx_data( mtcars, Mt ), mtx_column_kv( Mt, mpg, KVs ).
KVs = [21.0-row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), 21.0-row(21.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), 22.8-row(22.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), 21.4-row(21.4, 6...)|...]
has_header(HasH)
in Options.
If HasH is false, Header is a made up row of the shape row(1,...,N)
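The shape row(1,...,N) can be sketched in plain SWI-Prolog as below. This is an illustration only: the helper made_up_header/2 is hypothetical, and taking N from the arity of a sample row is an assumption rather than something the text above confirms.

```prolog
:- use_module(library(lists)).

% made_up_header(+Row, -Hdr)
% Hypothetical helper: build row(1,...,N), where N is the arity of Row
% (assumed here to be how N is chosen).
made_up_header(Row, Hdr) :-
    functor(Row, _, N),
    numlist(1, N, Ns),
    Hdr =.. [row|Ns].
```

For instance, `made_up_header(row(a,b,c), Hdr)` gives `Hdr = row(1,2,3)`.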
Opts
The predicate is meant as a companion to mtx_header_body/5.
Here, unlike in the alternative implementation, we first look for Cid among the arguments of Hdr; if that succeeds, the corresponding position is returned. Only then do we check whether Cid is an integer before returning it as the requested position. In this case we also check that Pos is within range.
?- mtx_mtcars( Mt ), Mt = [Hdr|_Rows], mtx_header_column_name_pos( Hdr, mpg, Cnm, Cpos ).
Cnm = mpg,
Cpos = 1.
?- mtx_mtcars( Mt ), Mt = [Hdr|_Rows], mtx_header_column_name_pos( Hdr, 3, Cnm, Cpos ).
Cnm = disp,
Cpos = 3.
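The lookup order described for this predicate (name first, integer position second, with a range check) might be sketched as follows. This is not the library's code; header_name_pos_sketch/4 is a hypothetical name and only arg/3, functor/3 and between/3 are used.

```prolog
% header_name_pos_sketch(+Hdr, +Cid, -Cnm, -Pos)
% Hypothetical sketch. Clause 1: Cid is a column name found in Hdr.
% Clause 2: Cid is an integer position, checked to be within range.
header_name_pos_sketch(Hdr, Cid, Cnm, Pos) :-
    arg(Pos, Hdr, Cid), !,
    Cnm = Cid.
header_name_pos_sketch(Hdr, Cid, Cnm, Pos) :-
    integer(Cid),
    functor(Hdr, _, Arity),
    between(1, Arity, Cid),
    Pos = Cid,
    arg(Pos, Hdr, Cnm).
```

One consequence of trying names first is that a header which itself contains an integer as a column name will match on that name before the integer is tried as a position.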
mtx_header_column_name_pos( Hdr, Cid, _, Pos ).
?- mtx_mtcars(M), mtx_header(M,H), mtx:mtx_header_column_pos(H,carb,Pos).
?- mtx_header_column_multi_pos( hdr(a,b,a,c), =(a), Cnms, Poss ).
Cnms = [a, a],
Poss = [1, 3].
?- mtx_header_column_multi_pos( hdr(a,b,a,c), [b,c], Cnms, Pos ).
Cnms = [b, c],
Pos = [2, 4].
?- mtx_facts( data('mtcars.csv'), Mtcars ).
?- mtx_in_memory( Mod ).
Mod = mtcars.
?- mtx_in_memory( Mod, File ).
Mod = mtcars,
File = '/home/nicos/.local/share/swi-prolog/pack/mtx/data/mtcars.csv'.
?- mtx_facts( data('mtcars.csv'), Mtcars ).
?- mtx_matrices_in_memory( Mtcs ).
Mtcs = [mtcars-'/home/nicos/.local/share/swi-prolog/pack/mtx/data/mtcars.csv'].
?- mtx_sort( [row(a,b,c),row(1,2,3),row(7,8,9),row(4,5,6)], b, Ord ). Ord = [row(a, b, c), row(1, 2, 3), row(4, 5, 6), row(7, 8, 9)].
When Module is missing or is a variable, it is taken to be the stem of the base name of CsvF.
When Opts is missing it defaults to the empty list.
If basename(CsvF).pl exists and no option pl_ignore(true) is given, then the .pl file is consulted into Module with no further questions asked of Opts.
A warning message is printed on user_output except if pl_warning(false) is in Opts.
Opts should be one, or a list, of the following
hdr(1,...,n)
is asserted
Any remaining options are passed to csv_read_file/3.
?- debug(mtx(facts)).
true.
?- mtx_facts( data('mtcars.csv'), Mtcars ).
% Expanded facts file to: /home/nicos/.local/share/swi-prolog/pack/mtx/data/mtcars.csv, (type: csv)
% Asserting rows of file:'/home/nicos/.local/share/swi-prolog/pack/mtx/data/mtcars.csv' to module:mtcars.
Mtcars = mtcars.
?- listing( mtcars:_ ).
:- dynamic hdr/11.
hdr(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb).
:- dynamic row/11.
row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0).
row(21.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0).
...
Listens to debug(mtx(facts)).
FileOrModule can be either the absolute filename of the input matrix file or the module the facts are in.
% assumes the example on mtx_facts/2 has run, then:
?- debug(mtx(facts)).
?- mtx_facts_remove(mtcars).
% Removing mod: mtcars, from file:'/home/nicos/.local/share/swi-prolog/pack/mtx/data/mtcars.csv'
true.
?- listing(mtcars:_).
true.
Values should be a list of values, or a term of the form:
call(WholeG,AllClmData), where AllClmData is the whole Kth Column (minus header).
Note that for callable K, all columns of Mtx that succeed on the K(Cid) are transformed.
N is taken to be relative to each input and can be an expression, except if it is of the form abs_pos(Abs) (see mtx_relative_pos/5).
?- Mtx = [row(a, b, d), row(1, 2, 4), row(5, 6, 8)], assert( an_mtx(Mtx) ).
?- an_mtx(Mtx), mtx_column_add( Mtx, 3, [c,3,7], New ).
New = [row(a, b, c, d), row(1, 2, 3, 4), row(5, 6, 7, 8)].
?- an_mtx(Mtx), mtx_column_add( Mtx, 1+2, [c,3,7], New ).
New = [row(a, b, c, d), row(1, 2, 3, 4), row(5, 6, 7, 8)].
?- an_mtx(Mtx), mtx_column_add( Mtx, -1, [c,3,7], New ).
New = [row(a, b, c, d), row(1, 2, 3, 4), row(5, 6, 7, 8)].
?- an_mtx(Mtx), mtx_column_add( Mtx, d, [c,3,7], New ).
New = [row(a, b, c, d), row(1, 2, 3, 4), row(5, 6, 7, 8)].
?- an_mtx(Mtx), mtx_column_add( Mtx, 3, transform(3,plus(1),plus1), New ).
New = [row(a, b, d, plus1), row(1, 2, 4, 5), row(5, 6, 8, 9)].
?- Mtx = [hdr(a,b,a,c), row(1,2,1,3), row(2,3,2,4)],
   mtx_column_add( Mtx, +(1), transform(=(a),plus(2),plus2), Out ).
Out = [hdr(a, plus2, b, a, plus2, c), row(1, 3, 2, 1, 3, 3), row(2, 4, 3, 2, 4, 4)].
?- Mtx = [hdr(a,b,a,c), row(1,2,1,3), row(2,3,2,4)],
   mtx_column_add( Mtx, 1, transform(=(a),plus(2),atom_concat('2+')), Out ).
Out = [hdr(a, '2+a', b, a, '2+a', c), row(1, 3, 2, 1, 3, 3), row(2, 4, 3, 2, 4, 4)].
?- Mtx = [hdr(a, b, c), row(1, 2, 3), row(4,5,6)],
   mtx_column_add( Mtx, 4, transform([1,2],sum_list,atom_concat('a+b')), Out ).
Out = [hdr(a, b, c, ab), row(1, 2, 3, 3), row(4, 5, 6, 9)].
?- ['/home/nicos/pl/lib/src/meta/aggregate'].
?- Mtx = [r(a,b,c,d),r(x,1,2,3),r(y,4,5,6),r(z,7,8,9)],
   mtx_column_add( Mtx, 5, derive(aggregate(plus(),0,indices([3,2,4])),1,3,sum), Otx ).
Otx = [r(a, b, c, d, sum), r(x, 1, 2, 3, 6), r(y, 4, 5, 6, 15), r(z, 7, 8, 9, 24)].
When Cid is unbound, all possible values are enumerated, with Cid = Cname.
?- mtx_mtcars(Mtc), mtx_column( Mtc, carb, Carbs ).
Carbs = [4.0, 4.0, 1.0, 1.0, 2.0, 1.0, 4.0, 2.0, 2.0|...].
Since v.0.2 supports memory csvs.
Since v.0.3 supports Order. Previously Order = true was assumed, which remains the default for backward compatibility.
% fixme: use the cars csv from pac()
?- mtx_read_file( 'example.csv', Ex )
, mtx_columns( Ex, [c,b], ABs )
.
Ex = [row(a, b, c)
, row(1, 2, 3)
, row(4, 5, 6)
, row(7, 8, 9)
],
ABs = [row(2, 3)
, row(5, 6)
, row(8, 9)
].
% fixme: use the cars csv from pac()
?- mtx_read_file( 'example.csv', Ex )
, mtx_columns( Ex, [c,b], false, ABs )
.
Ex = [row(a, b, c)
, row(1, 2, 3)
, row(4, 5, 6)
, row(7, 8, 9)
],
ABs = [row(3, 2)
, row(6, 5)
, row(9, 8)
].
?- mtx_data( mtcars, Mt ), mtx_column_default( Mt, mpg, true, Mpg ).
Mt =...,
Mpg = [21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8|...].
?- mtx_data( mtcars, Mt ), mtx_column( Mt, typo, NaL ).
ERROR: Unhandled exception: could_not_locate_column_in_header_row(typo,row(mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb))
?- G = ( Mpg=[] ), mtx_data( mtcars, Mt ), mtx_column_default( Mt, typo, G, Mpg ).
G = ([]=[]),
Mpg = [],
Mt = ... .
cnm_StdCnm(Cnm)
is in Opts.
Def is propagated as the 3rd argument to mtx_column_default/4,
except when it is an atomic different to true and false.
In the latter case, a ball is prepared which includes Def in its arguments,
with the intuition that in that case Def is an atom identifying the
matrix or its source to the user.
?- Mtx = [r(a,sec,c),r(1,2,3),r(4,5,6)], assert( m(Mtx) ).
?- m(Mtx), mtx_column_name_options( Mtx, b, example, Column, [] ).
ERROR: Unhandled exception: matrix_required_column_missing(example,b)
?- m(Mtx), mtx_column_name_options( Mtx, b, false, Column, [] ).
false.
?- m(Mtx), mtx_column_name_options( Mtx, b, example, Column, [cnm_b(sec)] ).
Mtx = [r(a, sec, c), r(1, 2, 3), r(4, 5, 6)],
Column = [2, 5].
Opts
cnm_from(From=from) | from |
cnm_to(To=to) | to |
cnm_weight(Weight=weight) | weight |
cnm_StdCnm(Cnm)
is in Opts and Cnm is ground.
Mtx and Out can be either files or read in rows: see mtx/3. Opts are passed to the two calls.
Opts
?- assert( mtx1([row(a,b,c),row(1,2,3),row(4,5,6)]) ).
?- mtx1( Mtx1 ), mtx_column_include_rows( Mtx1, 2, =:=(2), Rows ).
Rows = [row(a, b, c), row(1, 2, 3)].
?- mtx1( Mtx1 ), mtx_column_include_rows( Mtx1, 2, =:=(4), Rows ).
Mtx1 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
Rows = [row(a, b, c)].
?- mtx1( Mtx1 ), mtx_column_include_rows( Mtx1, 2, =:=(2), Rows, excludes(Exc) ).
Mtx1 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
Rows = [row(a, b, c), row(1, 2, 3)],
Exc = [row(4, 5, 6)].
header(s) or number(s) of) from Mtx to produce Sel with remainder Rem.
Sel is the removed column(s), and Rem is the remainder of Mtx.
Rem is a matrix, whereas Sel is a list of values if ColumnS was atomic, or a list of lists of values if
ColumnS was a list.
When CallStr is of the form @(Goal) or call(Goal), it will be applied to each column, with
succeeding columns Selected for Sel.
(Note that dealing with presence/absence of column name is delegated to Goal).
Goal is called in user if it is not module prepended (see mod_goal/4).
?- Mtx = [row(a,b,c,d),row(1,1,1,1),row(1,1,2,3),row(2,2,2,2)], assert( ex_mtx(Mtx) ).
?- ex_mtx(Mtx), mtx_column_select( Mtx, b, Red, Sel ).
Mtx = [row(a,b,c,d),row(1,1,1,1),row(1,1,2,3),row(2,2,2,2)],
?- mtx_column_select( Mtx, [a,b], Red, Sel ).
Red = [row(c, d), row(1, 1), row(2, 3), row(2, 2)],
Sel = [[a, b], [1, 1], [1, 1], [2, 2]].
?- assert( ( has_at_least(Tms,Val,List) :-
       findall( 1, member(Val,List), Ones ), sum_list(Ones,Sum), Tms =< Sum) ).
?- has_at_least(2,a,[a,b,c,a] ).
true.
?- has_at_least(2,b,[a,b,c,a] ).
false.
?- ex_mtx(Mtx), mtx_column_select( Mtx, call(has_at_least(2,1)), Red, Sel ).
Mtx = [row(a, b, c, d), row(1, 1, 1, 1), row(1, 1, 2, 3), row(2, 2, 2, 2)],
Red = [row(c, d), row(1, 1), row(2, 3), row(2, 2)],
Sel = [[a, b], [1, 1], [1, 1], [2, 2]].
The predicate assumes Csv is of the form [Hdr|Rows] and includes Hdr in the result. If you want to call it on Rows without a header, then with numeric NumClm you can call:
?- mtx_column_threshold( [_|Rows], NumClm, Val, Dir, [_|OutRows] ).
Examples
?- assert( csv([row(a,b,c),row(1,2,3),row(1,4,5),row(3,6,7),row('',8,9),row(3,b,10)]) ).
?- csv( Csv ), mtx_column_threshold( Csv, a, 2, <, Out ).
Out = [row(a, b, c), row(1, 2, 3), row(1, 4, 5)].
?- csv( Csv ), mtx_column_threshold( Csv, 1, 2, >, Out ).
Out = [row(a, b, c), row(3, 6, 7), row(3, b, 10)].
Header is assumed.
Op should be a recognisable operator (see stoics_lib:op_compare/3).
The predicate will call op_compare( Op, Freq, Thresh ), for the Frequency of every distinct value on column Cid in Mtx.
?- assert( a_mtx([r(a,b,c),r(1,2,1),r(1,2,1),r(1,6,7),r(8,9,10)]) ).
?- a_mtx(Mtx), mtx_column_frequency_threshold( Mtx, a, >, 2, Red ).
Red = [r(a, b, c), r(1, 2, 1), r(1, 2, 1), r(1, 6, 7)].
?- a_mtx(Mtx), mtx_column_frequency_threshold( Mtx, a, <, 2, Red ).
Red = [r(a, b, c), r(8, 9, 10)].
?- a_mtx(Mtx), mtx_column_frequency_threshold( Mtx, a, <, 1, Red ).
Red = [r(a, b, c)].
?- a_mtx(Mtx), mtx_column_frequency_threshold( Mtx, a, =<, 1, Red ).
Red = [r(a, b, c), r(8, 9, 10)].
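The filtering idea can be sketched in plain SWI-Prolog as below. It is not the library's implementation: column_frequency_filter/5 and keep_row/5 are hypothetical names, the column is given by position only, and call/3 on an arithmetic comparison stands in for op_compare/3 (so Op is assumed to be one of <, =<, >, >=, =:=).

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).
:- use_module(library(aggregate)).

% column_frequency_filter(+Mtx, +Pos, +Op, +Thresh, -Out)
% Keep the header, plus every row whose value at column Pos occurs
% with a frequency Freq such that call(Op, Freq, Thresh) succeeds.
column_frequency_filter([Hdr|Rows], Pos, Op, Thresh, [Hdr|Kept]) :-
    findall(V, (member(R, Rows), arg(Pos, R, V)), Vals),
    include(keep_row(Vals, Pos, Op, Thresh), Rows, Kept).

% keep_row(+Vals, +Pos, +Op, +Thresh, +Row)
% Count how often Row's value at Pos appears in Vals, then compare.
keep_row(Vals, Pos, Op, Thresh, Row) :-
    arg(Pos, Row, Val),
    aggregate_all(count, (member(X, Vals), X == Val), Freq),
    call(Op, Freq, Thresh).
```

With the a_mtx matrix above, `column_frequency_filter(Mtx, 1, (>), 2, Red)` keeps the three rows whose first value is 1, since that value occurs three times, and drops r(8,9,10).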
NewClmName(ClmName,New)
where New is used as the new column name.
?- assert( (plus_one(A,B):-B is A + 1) ). % plus/3 only works on integers...
?- mtx( pack('mtx/data/mtcars'), Mtx, cache(mtcars) ), mtx_column_replace( Mtx, mpg, mpgp1, @(user:plus_one()), _, New ).
Mtx = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), row(22.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), row(21.4, 6.0, 258.0, 110.0, 3.08, 3.215, 19.44, 1.0, 0.0, 3.0, 1.0), row(18.7, 8.0, 360.0, 175.0, 3.15, 3.44, 17.02, 0.0, 0.0, 3.0, 2.0), row(18.1, 6.0, 225.0, 105.0, 2.76, 3.46, 20.22, 1.0, 0.0, 3.0, 1.0), row(14.3, 8.0, 360.0, 245.0, 3.21, 3.57, 15.84, 0.0, 0.0, 3.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...],
New = [row(mpgp1, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(22.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(22.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), row(23.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), row(22.4, 6.0, 258.0, 110.0, 3.08, 3.215, 19.44, 1.0, 0.0, 3.0, 1.0), row(19.7, 8.0, 360.0, 175.0, 3.15, 3.44, 17.02, 0.0, 0.0, 3.0, 2.0), row(19.1, 6.0, 225.0, 105.0, 2.76, 3.46, 20.22, 1.0, 0.0, 3.0, 1.0), row(15.3, 8.0, 360.0, 245.0, 3.21, 3.57, 15.84, 0.0, 0.0, 3.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...].
?- assert( (psfx_one(Name,Psfxed) :- atomic_list_concat([Name,one],'_',Psfxed)) ).
?- mtx_column_replace( mtcars, mpg, user:psfx_one(), @(user:plus_one()), _, New ).
New = [row(mpg_one, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(22.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...].
?- mtx_column_replace( mtcars, mpg, mpgp1, @(plus_one()), _, New ).
New = [row(mpgp1, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(22.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...].
Hdr is protected and added to both Sel and Rej.
Opts
?- Csv = [row(a,b,c),row(1,2,3),row(4,5,6)], csv_column_values_select( Csv, c, 3, Red, _ ).
Csv = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
Red = [row(a, b, c), row(1, 2, 3)].
% throws error
?- Mtx = [hdr(aa,ab,ba,bb),row(1,2,3)], mtx_name_prefix_column( Mtx, a, Pos, Cnm, Clm ).
?- Mtx = [hdr(aa,ab,ba,bb),row(1,2,3)], mtx_name_prefix_column( Mtx, aa, Pos, Cnm, Clm ).
Pos = 1,
Cnm = aa,
Clm = [1].
?- mtx_relative_pos( 2, 2, _, Pos ).
Pos = 4.
?- mtx_relative_pos( -2, 0, c(a,b,c), Pos ).
Pos = 2.
?- mtx_relative_pos( -2, 0, c(a,b,c), Nadj, Pos ).
Pos = 2.
Opts
by(By=column)
use row to get the report row-wise
frequency(Freq=false)
to report factors, or add number each factor appeared
max(Max=0)
if positive, the maximum number of items to be displayed for each vector. if negative no reporting takes place.
?- mtx( pack(mtx/data/mtcars), Cars ), mtx_factors( Cars, _, [max(5)] ), fail.
mpg: [10.4,13.3,14.3,14.7,15.0,...]
cyl: [4.0,6.0,8.0]
disp: [71.1,75.7,78.7,79.0,95.1,...]
hp: [52.0,62.0,65.0,66.0,91.0,...]
drat: [2.76,2.93,3.0,3.07,3.08,...]
wt: [1.513,1.615,1.835,1.935,2.14,...]
qsec: [14.5,14.6,15.41,15.5,15.84,...]
vs: [0.0,1.0]
am: [0.0,1.0]
gear: [3.0,4.0,5.0]
carb: [1.0,2.0,3.0,4.0,6.0,...]
false.
?- mtx( pack(mtx/data/mtcars), Cars ), mtx_factors( Cars, _, [max(3),frequency(true)] ), fail.
mpg: [21.0-2,22.8-2,21.4-2,...]
cyl: [6.0-7,4.0-11,8.0-14]
disp: [160.0-2,108.0-1,258.0-1,...]
hp: [110.0-3,93.0-1,175.0-3,...]
drat: [3.9-2,3.85-1,3.08-2,...]
wt: [2.62-1,2.875-1,2.32-1,...]
qsec: [16.46-1,17.02-2,18.61-1,...]
vs: [0.0-18,1.0-14]
am: [1.0-13,0.0-19]
gear: [4.0-12,3.0-15,5.0-5]
carb: [4.0-10,1.0-7,2.0-10,...]
false.
column(CidIn,PosOut)
term in Opts column
with Cid, CidIn, is copied from Mtx to MtxOut.
In MtxOut, the column is placed in position PosOut.
The predicate scans Opts as they come, so PosOut should
take account of all operations to its left.
?- M1 = [r(a,b,c),r(1,2,3),r(4,5,6)], M2 = [r(d,e,f),r(7,8,9),r(10,11,12)], mtx_columns_copy( M1, M2, M3, column_copy(c,2) ).
M3 = [r(d, c, e, f), r(7, 3, 8, 9), r(10, 6, 11, 12)].
?- mtx_data( mtcars, Mt ), mtx_columns_kv( Mt, mpg, hp, KVs, _, _ ).
Mt = [row(mpg, cyl, disp,..)|...],
KVs = [21.0-110.0, 21.0-110.0, 22.8-93.0, 21.4-110.0, 18.7-175.0, 18.1-105.0, ... - ...|...].
?- mtx_data( mtcars, Mt ), mtx_header( Mt, Hdr ), mtx_header_cids_order( Hdr, [drat,cyl], Order ).
Mt = ...,
Hdr = row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb),
Order = [2, 5].
CidsOrGoal should be either a Cid, a list of Cids or a Goal.
?- mtx_data( mtcars, Mt ), mtx_columns_remove( Mt, [wt,cyl], Red ).
Mt = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), .... ],
Red = [row(mpg, disp, hp, drat, qsec, vs, am, gear, carb), row(21.0, 160.0, 110.0, 3.9, 16.46, 0.0, 1.0, 4.0, 4.0), row(21.0, 160.0, 110.0, 3.9, 17.02, 0.0, 1.0, 4.0, 4.0), ...].
?- mtx_data( mtcars, Mt ), mtx_columns_remove( Mt, [wt,cyl], Red ), mtx_dims( Mt, MtR, MtC ), mtx_dims( Red, RdR, RdC ).
MtR = RdR, RdR = 33,
MtC = 11,
RdC = 9.
?- assert( mtx1( [row(a,b,c,c), row(1,2,3,4), row(1,5,6,7), row(1,8,9,10)] ) ).
true.
?- assert( ( below_min_length_of_factor(Min,Clm) :-
       Clm = [_|Vals], sort( Vals, Ord ), length( Ord, Len ), Len < Min) ).
true.
?- mtx1( Mtx1 ), mtx_columns_remove( Mtx1, below_min_length_of_factor(2), Red ).
Mtx1 = [row(a, b, c, c), row(1, 2, 3, 4), row(1, 5, 6, 7), row(1, 8, 9, 10)],
Red = [row(b, c, c), row(2, 3, 4), row(5, 6, 7), row(8, 9, 10)].
?- lib(stoics_lib:has_at_least/3).
?- mtx1( Mtx1 ), mtx_columns_remove( Mtx1, has_at_least(2,1), Red ).
Red = [row(b, c, c), row(2, 3, 4), row(5, 6, 7), row(8, 9, 10)].
?- lib(stoics_lib:has_at_most/3).
?- mtx1( Mtx1 ), mtx_columns_remove( Mtx1, has_at_most(2,1), Red ).
?- mtx_data( mtcars, Mt ), mtx_dims( Mt, Nr, Nc ).
Mt = ...,
Nr = 33,
Nc = 11.
?- mtx_dims( Mtx, 2, 3 ).
Mtx = [row(0, 0, 0), row(0, 0, 0)].
Prolog can be given, in which case it is considered to be a full filename.
If Prolog is free, it instantiates to the filename of the file the facts
were dumped on, or the Rows themselves if consult(consult)
was in Opts.
In what follows, Stem is the first of:
Opts
file_stem(Fstem).)
option(s) to be passed to mtx/3
maplist(Pred)
if you want to use maplist on each row for Pred rather than the default of calling Pred with RowsIn and RowsOut
Modalities
Goal is elliptically expanded to an expression.
?- assert( mtx1([row(a,b,c),row(1,2,3),row(4,5,6),row(7,8,9)] ) ).
?- lib(lists). % this is needed for sum_list/2
?- mtx1( Mtx1 ), mtx_columns_partition( Mtx1, sum_list > 0, Mtx2, Excl ).
Mtx1 = Mtx2, Mtx2 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6), row(7, 8, 9)],
Excl = [].
?- mtx1( Mtx1 ), mtx_columns_partition( Mtx1, sum_list > 12, body, Acc, Rej ).
Mtx1 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6), row(7, 8, 9)],
Acc = [row(b, c), row(2, 3), row(5, 6), row(8, 9)],
Rej = [row(a), row(1), row(4), row(7)].
?- mtx1( Mtx1 ), mtx_columns_partition( Mtx1, sum_list > 15, body, Acc, Rej ).
Mtx1 = [row(a, b, c), row(1, 2, 3), row(4, 5, 6), row(7, 8, 9)],
Acc = [row(c), row(3), row(6), row(9)],
Rej = [row(a, b), row(1, 2), row(4, 5), row(7, 8)].
?- assert( (chkmember(List,Elem):-memberchk(Elem,List)) ).
?- mtx1( Mtx1 ), mtx_columns_partition( Mtx1, chkmember([a,c]), head, Acc, Rej ).
If Mtx, Incl and Excl are ground and non-lists, they are taken to be files to read from or write to,
in which case an optimised version is used that does not read the whole file
into memory but processes each line as it is read. In this case Incl and Excl
can be the special atom false, which indicates that the specified channel is
not required.
Opts
?- assert( (arg_val(N,Val,Row) :- arg(N,Row,Val)) ).
?- mtx_data( mtcars, Mtcars ),
   mtx_rows_partition( Mtcars, arg_val(1,21.0), Incl, Excl, true ),
   length( Excl, Nxcl ), maplist( writeln, Incl ), write( xLen:Nxcl ), nl, fail.
row(mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb)
row(21.0,6.0,160.0,110.0,3.9,2.62,16.46,0.0,1.0,4.0,4.0)
row(21.0,6.0,160.0,110.0,3.9,2.875,17.02,0.0,1.0,4.0,4.0)
xLen:31
Opts
?- mtx_data( mtcars, MtCars ), mtx_columns_sets( MtCars, Sets, true ),
   maplist( length, Sets, Lengths ), write( lengths(Lengths) ), nl.
lengths([25,3,27,22,22,29,30,2,2,3,6])
...
Requires pack(mlu).
Opts
?- [pack(mtx/examples/ones_plots)].
?- ones_plots.
% displays 2 frequency plots one with a vertical separator line and
% the other with 3 frequency groups distinguished by colour.
?- Mtx = [r(a,b,c,d),r(1,0,0,0),r(1,1,0,0),r(1,1,1,0)], maplist(writeln,Mtx), mtx_value_column_frequencies(Mtx,1,VC).
r(a,b,c,d)
r(1,0,0,0)
r(1,1,0,0)
r(1,1,1,0)
Mtx = [r(a, b, c, d), r(1, 0, 0, 0), r(1, 1, 0, 0), r(1, 1, 1, 0)],
VC = [a-3, b-2, c-1, d-0].
Opts
?- Mtx = [w(lets,nums),w(a,1),w(a,2),w(b,3),w(c,2),w(c,3)],
   mtx_columns_cross_table( Mtx, lets, nums, Tbl, true ),
   maplist( writeln, Mtx ), maplist( writeln, Tbl ).
w(lets,nums)
w(a,1)
w(a,2)
w(b,3)
w(c,2)
w(c,3)
hdr(,1,2,3)
row(a,1,1,0)
row(b,0,0,1)
row(c,0,1,1)
Mtx = [w(lets, nums), w(a, 1), w(a, 2), w(b, 3), w(c, 2), w(c, 3)],
Tbl = [hdr('', 1, 2, 3), row(a, 1, 1, 0), row(b, 0, 0, 1), row(c, 0, 1, 1)].
mtx_pos_elem/5 can be used to generate all positions and elements
Please note this uses the canonical representation and is not optimised for other formats.
Opts
?- Mtx = [row(a,b,c),row(1,2,3),row(4,5,6)], assert( a_mtx(Mtx) ).
?- a_mtx(Amtx), mtx_pos_elem(Amtx,I,J,Elem,true).
Amtx = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
I = J, J = Elem, Elem = 1 ;
...
?- a_mtx(Amtx), mtx_pos_elem(Amtx,2,3,0,Bmtx,true).
Amtx = [row(a, b, c), row(1, 2, 3), row(4, 5, 6)],
Bmtx = [row(a, b, c), row(1, 2, 3), row(4, 5, 0)].
Opts
value(Val)
=DefV when you want to set the elements that fail ij_constraint
call(Gname,Scf,I,J,Elem|Gargs,NtxScf), else it is call(Gname,Elem|Gargs,OutElem)
?- Mtx = [row(a,b,c),row(1,2,3),row(4,5,6),row(7,8,9)], assert( a_mtx(Mtx) ).
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, true ).
Bmtx = [row(a, b, c), row(2, 3, 4), row(5, 6, 7), row(8, 9, 10)].
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, ij_constraint(<) ).
Bmtx = [row(a, b, c), row(1, 3, 4), row(4, 5, 7), row(7, 8, 9)].
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, [ij_constraint(=<),default_value(0),row_start(bottom)] ).
Bmtx = [row(a, b, c), row(0, 0, 4), row(0, 6, 7), row(8, 9, 10)].
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, [ij_constraint(=<),default_value(0),row_start(top)] ).
Bmtx = [row(a, b, c), row(2, 3, 4), row(0, 6, 7), row(0, 0, 10)].
?- a_mtx( Amtx ), mtx_apply( Amtx, plus(1), Bmtx, [ij_constraint(=<),default_value(0),row_start(top)] ).
Bmtx = [row(a, b, c), row(0, 3, 4), row(0, 0, 7), row(0, 0, 0)].
Types:
asserted (atomic) when Mtx is not a current handle and given that predicate Mtx/1 exists with its argument instantiating to a list, this list is taken to be a matrix in canonical representation
by_column (list of lists) which is assumed to be a per-column representation (see mtx_lists/2)
by_row (list of compounds) such as those read in with csv_read_file/2, but there is no restriction on term name and arity. This is the canonical representation and each term is a row of the matrix
predicated (Pid of the form Pname/Arity) where the atom Pname corresponds to a predicate name and the predicate with arity N is defined to succeed with the returned arguments
predfile (atomic) when Mtx is not a current mtx handle and given that predicate Mtx/1 exists with its argument instantiating to a non-list; this argument is taken to be the stem (with possible exts csv and tsv) or filename of a csv/tsv file which csv_read_file/3 can read as a canonical matrix
on_file (ground; non-list) (atomic or compound: csv file or its stem) as possible to be read by csv_read_file/2; alias paths can be used and the normal delimited file extension can be omitted
asserted (atomic)
atomic, when mtx was cached at loading time (see option cache(Cache) in mtx/3)
If Mtx is a list, its contents are first checked for sublists (by_column) and then
for compounds (by_row). When Mtx is a predicate identifier of the form Pname/Arity,
it is taken to define the corresponding Mtx (predicated). If Mtx is atomic the options are
Mtx matrix handle exists (see mtx/2)
then the type is in_memory
Mtx/1 is defined and returns a list
type is asserted
Mtx/1 is defined and returns a non list
type on_file(File)
?- mtx_type( [[a],[b],[c]], Type ).
Type = by_column.
?- mtx_type( [r(a,b,c),r(1,2,3),r(4,5,6)], Type ).
Type = by_row.
?- mtx_type( pack(mtx/data/mtcars), Type ).
Type = on_file. % was: Type = on_file('/usr/local/users/na11/local/git/lib/swipl-7.3.29/pack/mtx/data/mtcars.csv').
?- assert( mc_file(pack(mtx/data/mtcars)) ).
?- mtx_type( mc_file, Type ).
?- mtx( pack(mtx/data/mtcars), Mtx, cache(mtcars) ), assert(mc(Mtx)).
?- mtx_type( mtcars, Type ).
Type = handled.
?- mtx_type( mc, Type ).
Type = asserted.
?- mtx( mc, Mc ), findall( _, (member(Row,Mc),assert(Row)), _ ).
?- mtx( mc, [Hdr|_Rows] ), functor( Hdr, Pname, Arity ), mtx_type( Pname/Arity, Type ).
Hdr = ...,
Rows = ...,
Pname = row,
Arity = 11,
Type = predicated.
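The order of the two list checks matters, because in Prolog a non-empty list is itself a compound term. A minimal sketch of that part of the decision, assuming only the first element needs inspecting (list_mtx_type/2 is a hypothetical helper, not part of the library):

```prolog
% list_mtx_type(+Mtx, -Type)
% Hypothetical sketch: test for sublists first (by_column), and only
% then for compounds (by_row), since every non-empty list is also a
% compound term and would otherwise be classified as by_row.
list_mtx_type([First|_], by_column) :-
    is_list(First), !.
list_mtx_type([First|_], by_row) :-
    compound(First).
```

Swapping the two clauses would classify `[[a],[b],[c]]` as by_row, which is why sublists are tested before compounds.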
OptS
match_arity(Match)
rows read in (see csv//2 options).
separator(Sep)
option of csv//2 (mtx_sep/2). Defaults to csv//2 version which is based on filename extension.
Any other OptS are passed to csv//2.
As per mtx/3 convention OptS can be a single option (un-listed) or a list of options.
?- tmp_file( testo, TmpF ),
   csv_write_file( TmpF, [row(c_a,c_b),row(1,a,b),row(2,aa,bb)], [match_arity(false),separator(0'\t)] ),
   mtx_read_table( TmpF, samples, Tbl, sep(tab) ).
TmpF = '/tmp/pl_testo_12445_0',
Tbl = [row(samples, c_a, c_b), row(1, a, b), row(2, aa, bb)].
?- assert( ( or_gate(List,And) :- sum_list(List,Sum), ( Sum > 0 -> And is 1; And is 0)) ).
?- Mtx = [r(a,b1,b2,c),r(0,1,0,1),r(0,0,1,0),r(1,0,0,1),r(1,1,1,0)],
   mtx_columns_collapse( Mtx, [b1,b2], b, or_gate, 2, OutMtx ).
Mtx = ...
OutMtx = [r(a, b, c), r(0, 0, 1), r(0, 0, 0), r(1, 0, 1), r(1, 1, 0)].
Please note that Out would usually be another matrix;
however, the predicate can also produce other outputs.
You need to set is_mtx(false) in this case (note though that
this will also (a) change the default of Hdr to false and
(b) bypass calling mtx/2 on the output).
Opts
In addition you can give any option that you want to pass to both mtx/3 calls from those
that are recognised by mtx/3 (see mtx_options_select/5). For example, convert(true)
will be passed to both mtx/3 calls, whereas in_convert(true) will only be passed to the
input call.
?- mtx( data('mtcars.csv'), MtC ), mtx_row_apply( =, MtC, MtA, [] ).
MtC = MtA, MtA = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row('21.0', ...), ... ].
?- mtx( data('mtcars.csv'), MtC ), mtx_row_apply( =, MtC, MtA, out_has_header(false) ).
MtC = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, ...), ... ],
MtA = [row('21.0', '6.0', '160.0', '110.0', '3.9', '2.62', '16.46', '0.0', '1.0', '4.0', '4.0'), ...].
?- assert((sum_args(Term,Sum) :- Term=..[_|Args], sumlist(Args,Sum))).
?- sum_args( a(1,2,3), Sum ).
Sum = 6.
?- mtx_row_apply(sum_args,data('mtcars.csv'),Sums,[convert(true),out_is_mtx(false)]).
Sums = [328.97999999999996, 329.79499999999996, 259.58, ... ].
?- tmp_file( mtcars_clone, TmpF ), mtx_row_apply( =, data('mtcars.csv'), TmpF, [] ).
On *nix only:
?- use_module( library(by_unix) ).
?- tmp_file( mtcars_clone, TmpF ), mtx_row_apply( =, data('mtcars.csv'), TmpF, [] ), @ head( -2, TmpF ).
mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
21.0,6.0,160.0,110.0,3.9,2.62,16.46,0.0,1.0,4.0,4.0
TmpF = '/tmp/swipl_mtcars_clone_21824_1'.
Opts
?- mtx_bi_opts( [], true.csv, out.csv, Ins, Outs ).
min([])-sin([])-mou([])-sou([sep(44)])
Ins = [],
Outs = [sep(44)].
?- mtx_column_subsets( [w(c1,c2),w(a,1),w(a,2),w(b,1),w(b,2),w(c,3)], 1, Subs ).
Subs = [a-[w(a, 1), w(a, 2)], b-[w(b, 1), w(b, 2)], c-[w(c, 3)]].
This should really be in library(csv).
CsvOpts are Csv specifically compiled options.
?- mtx_read_stream( S, D, O ).
Opts
mtx(column_join(multi))
triggers the printing of discarded matching rows))
?- assert( mtx1([r(a,b,c),r(1,2,3),r(4,5,6)]) ).
?- assert( mtx2([r(a,e,f),r(1,7,7),r(4,8,8)]) ).
?- mtx1(Mtx1),mtx2(Mtx2),mtx_column_join(Mtx1, a, Mtx2, Mtx, []).
?- mtx1(Mtx1),mtx2(Mtx2),mtx_column_join(Mtx1, a, Mtx2, Mtx, [at(2)]).
Default values for mtx/3 are fished out at run-time.
Opts
?- mtx_options_select( [convert(false)], in, Ms, Rs, [] ).
Ms = [convert(false)],
Rs = [].
?- mtx_options_select( [in_convert(false)], in, Ms, Rs, [] ).
Ms = [convert(false)],
Rs = [].
?- mtx_options_select( [convert(false)], in, Ms, Rs, [match_generic(false)] ).
Ms = [],
Rs = [convert(false)].
This could possibly be folded into mtx/3 with prefix(Pfx) and rem_opts(RemOpts),
however, it is handy to clean the options before the output call. So the current model is:
mtx_lib_pred( MtxIn, MtxOut, Args ) :-
    options_append( mtx_lib_pred, Args, AllOpts ),
    mtx_options_select( AllOpts, in, InMtxOpts, NonInOpts ),
    mtx( MtxIn, Mtx, InMtxOpts ),
    mtx_options_select( NonInOpts, out, OutMtxOpts, Opts ),
    ...
    mtx( MtxOut, MtxForOut, Opts ).
pack(mtx/data).
Data is in canonical Mtx format.
SetName
?- mtx( pack(mtx/data/mtcars), Mtcars ), mtx_data(mtcars, Mtcars).
Mtcars = [row(mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.62, 16.46, 0.0, 1.0, 4.0, 4.0), row(21.0, 6.0, 160.0, 110.0, 3.9, 2.875, 17.02, 0.0, 1.0, 4.0, 4.0), row(22.8, 4.0, 108.0, 93.0, 3.85, 2.32, 18.61, 1.0, 1.0, 4.0, 1.0), row(21.4, 6.0, 258.0, 110.0, 3.08, 3.215, 19.44, 1.0, 0.0, 3.0, 1.0), row(18.7, 8.0, 360.0, 175.0, 3.15, 3.44, 17.02, 0.0, 0.0, 3.0, 2.0), row(18.1, 6.0, 225.0, 105.0, 2.76, 3.46, 20.22, 1.0, 0.0, 3.0, 1.0), row(14.3, 8.0, 360.0, 245.0, 3.21, 3.57, 15.84, 0.0, 0.0, 3.0, 4.0), row(..., ..., ..., ..., ..., ..., ..., ..., ..., ..., ...)|...]
Sep can be a code, or one of:
mtx
.
The pack is distributed under the MIT license.
?- mtx_version( Ver, Date ).
Ver = 0:6:0,
Date = date(2021, 6, 17).