API Reference
Types
Tables.AbstractColumns
— Type
Tables.AbstractColumns
An interface type defined as an ordered set of columns that support retrieval of individual columns by name or index. A retrieved column must be a 1-based indexable collection with known length, i.e. an object that supports length(col) and col[i] for any i = 1:length(col). Tables.columns must return an object that satisfies the Tables.AbstractColumns interface. While Tables.AbstractColumns is an abstract type that custom "columns" types may subtype for useful default behavior (indexing, iteration, property-access, etc.), users should not use it for dispatch, as Tables.jl interface objects are not required to subtype it, but only to implement the required interface methods.
Interface definition:
Required Methods | Default Definition | Brief Description |
---|---|---|
Tables.getcolumn(table, i::Int) | getfield(table, i) | Retrieve a column by index |
Tables.getcolumn(table, nm::Symbol) | getproperty(table, nm) | Retrieve a column by name |
Tables.columnnames(table) | propertynames(table) | Return column names for a table as a 1-based indexable collection |
Optional methods | ||
Tables.getcolumn(table, ::Type{T}, i::Int, nm::Symbol) | Tables.getcolumn(table, nm) | Given a column eltype T, index i, and column name nm, retrieve the column. Provides a type-stable or even constant-prop-able mechanism for efficiency. |
Note that subtypes of Tables.AbstractColumns must overload all required methods listed above instead of relying on these methods' default definitions.
While types aren't required to subtype Tables.AbstractColumns, benefits of doing so include:
- Indexing interface defined (using getcolumn); i.e. tbl[i] will retrieve the column at index i
- Property access interface defined (using columnnames and getcolumn); i.e. tbl.col1 will retrieve the column named col1
- Iteration interface defined; i.e. for col in table will iterate each column in the table
- AbstractDict methods defined (get, haskey, etc.) for checking and retrieving columns
- A default show method
This allows a custom table type to behave as close as possible to a builtin NamedTuple of vectors object.
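As a minimal sketch of the interface above, the hypothetical MyColumns type below (its name and fields are illustrative and not part of Tables.jl) subtypes Tables.AbstractColumns and overloads the three required methods. Because subtyping routes property access through getcolumn, the struct's own fields are read with getfield.

```julia
using Tables

# Hypothetical columns type: column names and column vectors stored side by side.
struct MyColumns <: Tables.AbstractColumns
    names::Vector{Symbol}
    data::Vector{Vector{Float64}}
end

# Required methods; getfield avoids the getproperty overload that
# AbstractColumns provides for column access.
Tables.columnnames(c::MyColumns) = getfield(c, :names)
Tables.getcolumn(c::MyColumns, i::Int) = getfield(c, :data)[i]
Tables.getcolumn(c::MyColumns, nm::Symbol) =
    getfield(c, :data)[findfirst(==(nm), getfield(c, :names))]

cols = MyColumns([:a, :b], [[1.0, 2.0], [3.0, 4.0]])

cols.a            # property access, routed through getcolumn
cols[2]           # indexing, routed through getcolumn
haskey(cols, :b)  # AbstractDict-style lookup
```

With only these three methods defined, indexing, property access, iteration, and the dictionary-style helpers come from the abstract type.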
Tables.AbstractRow
— Type
Tables.AbstractRow
Abstract interface type representing the expected eltype of the iterator returned from Tables.rows(table). Tables.rows must return an iterator of elements that satisfy the Tables.AbstractRow interface. While Tables.AbstractRow is an abstract type that custom "row" types may subtype for useful default behavior (indexing, iteration, property-access, etc.), users should not use it for dispatch, as Tables.jl interface objects are not required to subtype it, but only to implement the required interface methods.
Interface definition:
Required Methods | Default Definition | Brief Description |
---|---|---|
Tables.getcolumn(row, i::Int) | getfield(row, i) | Retrieve a column value by index |
Tables.getcolumn(row, nm::Symbol) | getproperty(row, nm) | Retrieve a column value by name |
Tables.columnnames(row) | propertynames(row) | Return column names for a row as a 1-based indexable collection |
Optional methods | ||
Tables.getcolumn(row, ::Type{T}, i::Int, nm::Symbol) | Tables.getcolumn(row, nm) | Given a column element type T, index i, and column name nm, retrieve the column value. Provides a type-stable or even constant-prop-able mechanism for efficiency. |
Note that subtypes of Tables.AbstractRow must overload all required methods listed above instead of relying on these methods' default definitions.
While custom row types aren't required to subtype Tables.AbstractRow, benefits of doing so include:
- Indexing interface defined (using getcolumn); i.e. row[i] will return the column value at index i
- Property access interface defined (using columnnames and getcolumn); i.e. row.col1 will retrieve the value for the column named col1
- Iteration interface defined; i.e. for x in row will iterate each column value in the row
- AbstractDict methods defined (get, haskey, etc.) for checking and retrieving column values
- A default show method
This allows the custom row type to behave as close as possible to a builtin NamedTuple object.
Tables.ByRow
— Type
ByRow <: Function
ByRow(f) returns a function which applies function f to each element in a vector.
ByRow(f) can be passed two types of arguments:
- One or more 1-based AbstractVectors of equal length: In this case the returned value is a vector resulting from applying f to elements of the passed vectors element-wise. Function f is called exactly once for each element of the passed vectors (as opposed to map, which assumes for some types of source vectors (e.g. SparseVector) that the wrapped function is pure, and may call the function f only once for multiple equal values).
- A Tables.ColumnTable holding 1-based columns of equal length: In this case the function f is passed a NamedTuple created for each row of the passed table.
The return value of ByRow(f) is always a vector.
ByRow expects that at least one argument is passed to it and, when a Tables.ColumnTable is passed, that the table has at least one column. In some contexts of operations on tables (for example DataFrame) the user might want to pass no arguments (or an empty Tables.ColumnTable) to ByRow. This case must be separately handled by the code implementing the logic of processing the ByRow operation on this specific parent table (the reason is that passing such arguments to ByRow does not allow it to determine the number of rows of the source table).
Examples
julia> Tables.ByRow(x -> x^2)(1:3)
3-element Vector{Int64}:
 1
 4
 9

julia> Tables.ByRow((x, y) -> x*y)(1:3, 2:4)
3-element Vector{Int64}:
  2
  6
 12

julia> Tables.ByRow(x -> x.a)((a=1:2, b=3:4))
2-element Vector{Int64}:
 1
 2

julia> Tables.ByRow(x -> (a=x.a*2, b=sin(x.b), c=x.c))((a=[1, 2, 3],
                                                         b=[1.2, 3.4, 5.6],
                                                         c=["a", "b", "c"]))
3-element Vector{NamedTuple{(:a, :b, :c), Tuple{Int64, Float64, String}}}:
 (a = 2, b = 0.9320390859672263, c = "a")
 (a = 4, b = -0.2555411020268312, c = "b")
 (a = 6, b = -0.6312666378723216, c = "c")
Tables.Columns
— Type
Tables.Columns(tbl)
Convenience type that calls Tables.columns on an input tbl and wraps the resulting AbstractColumns interface object in a dedicated struct to provide useful default behaviors (allows any AbstractColumns to be used like a NamedTuple of Vectors):
- Indexing interface defined; i.e. cols[i] will return the column at index i, cols[nm] will return the column for column name nm
- Property access interface defined; i.e. cols.col1 will retrieve the column named col1
- Iteration interface defined; i.e. for col in cols will iterate each column
- AbstractDict methods defined (get, haskey, etc.) for checking and retrieving columns
Note that Tables.Columns calls Tables.columns internally on the provided table argument. Tables.Columns can be used for dispatch if needed.
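A short sketch of the behaviors listed above, assuming a Vector of NamedTuples as the input table; the variable names are illustrative.

```julia
using Tables

# A row-oriented source: a Vector of NamedTuples.
rowsource = [(a = 1, b = "x"), (a = 2, b = "y")]

# Tables.columns is called internally, then the result is wrapped.
cols = Tables.Columns(rowsource)

cols.a            # column by property access
cols[2]           # column by index
cols[:b]          # column by name
haskey(cols, :a)  # AbstractDict-style lookup
```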
Tables.CopiedColumns
— Type
Tables.CopiedColumns
For some sinks, there's a concern about whether they can safely "own" columns from the input. If mutation will be allowed, to be safe, they should always copy input columns, to avoid unintended mutation to the original source. When we've called buildcolumns, however, Tables.jl essentially built/owns the columns, and it's happy to pass ownership to the sink. Thus, any built columns will be wrapped in a CopiedColumns struct to signal to the sink that essentially "a copy has already been made" and they're safe to assume ownership.
Tables.LazyTable
— Type
Tables.LazyTable(f, arg)
A "table" type that delays materialization until Tables.columns or Tables.rows is called. This allows, for example, sending a LazyTable to a remote process or thread which can then call Tables.columns or Tables.rows to "materialize" the table. It is used by default in Tables.partitioner(f, itr), where a materializer function f is passed to each element of an iterable itr, allowing distributed/concurrent patterns like:
for tbl in Tables.partitions(Tables.partitioner(CSV.File, list_of_csv_files))
    Threads.@spawn begin
        cols = Tables.columns(tbl)
        # do stuff with cols
    end
end
In this example, CSV.File will be called like CSV.File(x) for each element of the list_of_csv_files iterable, but not until Tables.columns(tbl) is called, which in this case happens in a thread-spawned task, allowing files to be parsed and processed in parallel.
Tables.Row
— Type
Tables.Row(row)
Convenience type to wrap any AbstractRow interface object in a dedicated struct to provide useful default behaviors (allows any AbstractRow to be used like a NamedTuple):
- Indexing interface defined; i.e. row[i] will return the column value at index i, row[nm] will return the column value for column name nm
- Property access interface defined; i.e. row.col1 will retrieve the value for the column named col1
- Iteration interface defined; i.e. for x in row will iterate each column value in the row
- AbstractDict methods defined (get, haskey, etc.) for checking and retrieving column values
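The sketch below wraps a plain struct that does not subtype AbstractRow. The hypothetical Point type is illustrative; it is assumed to satisfy the row interface only through the default definitions (getfield, getproperty, propertynames).

```julia
using Tables

# Hypothetical plain struct; it relies on the default getcolumn/columnnames
# definitions rather than subtyping Tables.AbstractRow.
struct Point
    x::Float64
    y::Float64
end

r = Tables.Row(Point(1.0, 2.0))

r.x             # value by property access
r[2]            # value by index
r[:y]           # value by name
collect(r)      # iterate all column values
get(r, :z, 0)   # AbstractDict-style lookup with a default
```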
Tables.Schema
— Type
Tables.Schema(names, types)
Create a Tables.Schema object that holds the column names and types for an AbstractRow iterator returned from Tables.rows or an AbstractColumns object returned from Tables.columns. Tables.Schema is dual-purposed: provide an easy interface for users to query these properties, as well as provide a convenient "structural" type for code generation.
To get a table's schema, one can call Tables.schema on the result of Tables.rows or Tables.columns, but also note that a table may return nothing, indicating that its column names and/or column element types are unknown (usually not inferable). This is similar to the Base.EltypeUnknown() trait for iterators when Base.IteratorEltype is called. Users should account for the Tables.schema(tbl) => nothing case by using the properties of the results of Tables.rows(x) and Tables.columns(x) directly.
To access the names, one can simply call sch.names to return a collection of Symbols (Tuple or Vector). To access column element types, one can similarly call sch.types, which will return a collection of types (like (Int64, Float64, String)).
The actual type definition is
struct Schema{names, types}
    storednames::Union{Nothing, Vector{Symbol}}
    storedtypes::Union{Nothing, Vector{Type}}
end
Where names is a tuple of Symbols or nothing, and types is a tuple type of types (like Tuple{Int64, Float64, String}) or nothing. Encoding the names & types as type parameters allows convenient use of the type in generated functions and other optimization use-cases, but users should note that when names and/or types are the nothing value, the names and/or types are stored in the storednames and storedtypes fields. This is to account for extremely wide tables with columns in the 10s of thousands where encoding the names/types as type parameters becomes prohibitive to the compiler. So while optimizations can be written on the typed names/types type parameters, users should also consider handling the extremely wide tables by specializing on Tables.Schema{nothing, nothing}.
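A brief sketch of constructing and querying a schema, assuming a small table whose names and types fit comfortably in type parameters; the variable names are illustrative.

```julia
using Tables

# Build a schema directly from names and element types.
sch = Tables.Schema((:a, :b), (Int64, Float64))
sch.names   # (:a, :b)
sch.types   # (Int64, Float64)

# Query the schema of a column table (a NamedTuple of vectors).
ct = (a = [1, 2], b = [1.0, 2.0])
ct_sch = Tables.schema(ct)

# Callers should handle the case where no schema is available.
names = ct_sch === nothing ? Tables.columnnames(ct) : ct_sch.names
```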
Functions
Tables.allocatecolumn
— Method
Tables.allocatecolumn(::Type{T}, len) => returns a column type (usually `AbstractVector`) with size to hold `len` elements
Custom column types can override with an appropriate "scalar" element type that should dispatch to their column allocator. Alternatively, and more generally, custom scalars can overload DataAPI.defaultarray to signal the default array type. In this case the signaled array type must support a constructor accepting undef for initialization.
Tables.columnaccess
— Function
Tables.columnaccess(x) => Bool
Check whether an object has specifically defined that it implements the Tables.columns function that does not copy table data. That is to say, Tables.columns(x) must be done with O(1) time and space complexity when Tables.columnaccess(x) == true. Note that Tables.columns has generic fallbacks allowing it to produce AbstractColumns objects, even if the input doesn't define columnaccess. However, this generic fallback may copy the data from input table x. Also note that just because an object defines columnaccess doesn't mean a user should call Tables.columns on it; Tables.rows will also work, providing a valid AbstractRow iterator. Hence, users should call Tables.rows or Tables.columns depending on what is most natural for them to consume instead of worrying about what and how the input is oriented.
It is recommended that users implementing MyType define only columnaccess(::Type{MyType}). columnaccess(::MyType) will then automatically delegate to this method.
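As a sketch of the recommendation above, the hypothetical ColTable type below (name and field are illustrative) declares the traits on the type and returns its stored NamedTuple of vectors from Tables.columns without copying.

```julia
using Tables

# Hypothetical column-oriented table wrapping a NamedTuple of vectors.
struct ColTable
    cols::NamedTuple
end

# Traits are defined on the type; instance calls delegate to these.
Tables.istable(::Type{ColTable}) = true
Tables.columnaccess(::Type{ColTable}) = true

# O(1): hand back the stored columns as they are.
Tables.columns(t::ColTable) = t.cols

t = ColTable((a = [1, 2], b = ["x", "y"]))

Tables.columnaccess(t)   # true, delegated from the type-level method
Tables.rowtable(t)       # rows are still available through the generic fallback
```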
Tables.columnindex
— Method
Tables.columnindex(table, name::Symbol)
Return the column index (1-based) of a column by name in a table with a known schema; returns 0 if name doesn't exist in table.
Tables.columnindex
— Method
Given names and a Symbol name, compute the index (1-based) of the name in names.
Tables.columnnames
— Function
Tables.columnnames(::Union{AbstractColumns, AbstractRow}) => Indexable collection
Retrieves the list of column names as a 1-based indexable collection (like a Tuple or Vector) for an AbstractColumns or AbstractRow interface object. The default definition calls propertynames(x). The returned column names must be unique.
Tables.columns
— Function
Tables.columns(x) => AbstractColumns-compatible object
Accesses data of input table source x by returning an AbstractColumns-compatible object, which allows retrieving entire columns by name or index. A retrieved column is a 1-based indexable object that has a known length, i.e. supports length(col) and col[i] for any i = 1:length(col). Note that even if the input table source is row-oriented by nature, an efficient generic definition of Tables.columns is defined in Tables.jl to build an AbstractColumns-compatible object from the input rows.
The Tables.Schema of an AbstractColumns object can be queried via Tables.schema(columns), which may return nothing if the schema is unknown. Column names can always be queried by calling Tables.columnnames(columns), and individual columns can be accessed by calling Tables.getcolumn(columns, i::Int) or Tables.getcolumn(columns, nm::Symbol) with a column index or name, respectively.
Note that if x is an object in which columns are stored as vectors, the check that these vectors use 1-based indexing is not performed (it should be ensured when x is constructed).
Tables.columntable
— Function
Tables.columntable(x) => NamedTuple of AbstractVectors
Takes any input table source x and returns a NamedTuple of AbstractVectors, also known as a "column table". A "column table" is a kind of default table type of sorts, since it satisfies the Tables.jl column interface naturally.
Note that if x is an object in which columns are stored as vectors, the check that these vectors use 1-based indexing is not performed (it should be ensured when x is constructed).
Not for use with extremely wide tables with # of columns > 67K; current fundamental compiler limits prevent constructing NamedTuples that large.
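A small sketch of converting between the two default table shapes, assuming a Vector of NamedTuples as the starting point.

```julia
using Tables

# Start from a "row table": a Vector of NamedTuples.
rt = [(a = 1, b = "x"), (a = 2, b = "y")]

# Convert to a "column table": a NamedTuple of vectors.
ct = Tables.columntable(rt)

# Converting back yields the original rows.
roundtrip = Tables.rowtable(ct)
```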
Tables.columntype
— Method
Tables.columntype(table, name::Symbol)
Return the column element type of a column by name in a table with a known schema; returns Union{} if name doesn't exist in table.
Tables.columntype
— Method
Given a tuple type and a Symbol name, compute the type of the name in the tuple's types.
Tables.datavaluerows
— Method
Tables.datavaluerows(x) => NamedTuple iterator
Takes any table input x and returns a NamedTuple iterator that will replace missing values with DataValue-wrapped values; this allows any table type to satisfy the TableTraits.jl Queryverse integration interface by defining:
IteratorInterfaceExtensions.getiterator(x::MyTable) = Tables.datavaluerows(x)
Tables.dictcolumntable
— Method
Tables.dictcolumntable(x) => Tables.DictColumnTable
Take any Tables.jl-compatible source x and return a DictColumnTable, which can be thought of as an OrderedDict mapping column names as Symbols to AbstractVectors. The order of the input table columns is preserved via the Tables.schema(::DictColumnTable).
For "schema-less" input tables, dictcolumntable employs a "column unioning" behavior, as opposed to inferring the schema from the first row like Tables.columns. This means that as rows are iterated, each value from the row is joined into an aggregate final set of columns. This is especially useful when input table rows may omit a column whose value is missing, instead of including an actual missing value, which is common in JSON, for example. This results in a performance cost tracking all seen values and inferring the final unioned schemas, so it's recommended to use only when the union behavior is needed.
Tables.dictrowtable
— Method
Tables.dictrowtable(x) => Tables.DictRowTable
Take any Tables.jl-compatible source x and return a DictRowTable, which can be thought of as a Vector of OrderedDict rows mapping column names as Symbols to values. The order of the input table columns is preserved via the Tables.schema(::DictRowTable).
For "schema-less" input tables, dictrowtable employs a "column unioning" behavior, as opposed to inferring the schema from the first row like Tables.columns. This means that as rows are iterated, each value from the row is joined into an aggregate final set of columns. This is especially useful when input table rows may omit a column whose value is missing, instead of including an actual missing value, which is common in JSON, for example. This results in a performance cost tracking all seen values and inferring the final unioned schemas, so it's recommended to use only when the union behavior is needed.
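The column unioning described above can be sketched with rows that do not all carry the same columns. The input below is an illustrative assumption: the first row has no b column at all.

```julia
using Tables

# Rows with differing column sets, as might come from parsed JSON objects.
ragged = [(a = 1,), (a = 2, b = "x")]

drt = Tables.dictrowtable(ragged)

# The unioned schema includes every column seen across all rows.
Tables.schema(drt).names   # (:a, :b)

# A column absent from a row is reported as missing for that row.
first_row = first(Tables.rows(drt))
Tables.getcolumn(first_row, :b)
```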
Tables.eachcolumn
— Function
Tables.eachcolumn(f, sch::Tables.Schema{names, types}, x::Union{Tables.AbstractRow, Tables.AbstractColumns})
Tables.eachcolumn(f, sch::Tables.Schema{names, nothing}, x::Union{Tables.AbstractRow, Tables.AbstractColumns})
Takes a function f, table schema sch, and x, which is an object that satisfies the AbstractRow or AbstractColumns interfaces; it generates calls to get the value for each column (Tables.getcolumn(x, nm)) and then calls f(val, index, name), where f is the user-provided function, val is the column value (AbstractRow) or entire column (AbstractColumns), index is the column index as an Int, and name is the column name as a Symbol.
An example using Tables.eachcolumn is:
rows = Tables.rows(tbl)
sch = Tables.schema(rows)
if sch === nothing
    state = iterate(rows)
    state === nothing && return
    row, st = state
    # schema unknown: build a partial schema from the first row's column names
    sch = Tables.Schema(Tables.columnnames(row), nothing)
    while state !== nothing
        Tables.eachcolumn(sch, row) do val, i, nm
            bind!(stmt, i, val)
        end
        state = iterate(rows, st)
        state === nothing && return
        row, st = state
    end
else
    for row in rows
        Tables.eachcolumn(sch, row) do val, i, nm
            bind!(stmt, i, val)
        end
    end
end
Note in this example we account for the input table potentially returning nothing from Tables.schema(rows); in that case, we start iterating the rows, and build a partial schema using the column names from the first row, sch = Tables.Schema(Tables.columnnames(row), nothing), which is valid to pass to Tables.eachcolumn.
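A self-contained sketch of the partial-schema case, assuming a single NamedTuple row and recording what the callback receives; the seen vector is illustrative.

```julia
using Tables

row = (a = 1, b = "x")

# Partial schema: column names known, element types unknown.
sch = Tables.Schema(Tables.columnnames(row), nothing)

# Record each (value, index, name) triple passed to the callback.
seen = Tuple{Any, Int, Symbol}[]
Tables.eachcolumn(sch, row) do val, i, nm
    push!(seen, (val, i, nm))
end
```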
Tables.getcolumn
— Function
Tables.getcolumn(::AbstractColumns, nm::Symbol) => Indexable collection with known length
Tables.getcolumn(::AbstractColumns, i::Int) => Indexable collection with known length
Tables.getcolumn(::AbstractColumns, T, i::Int, nm::Symbol) => Indexable collection with known length
Tables.getcolumn(::AbstractRow, nm::Symbol) => Column value
Tables.getcolumn(::AbstractRow, i::Int) => Column value
Tables.getcolumn(::AbstractRow, T, i::Int, nm::Symbol) => Column value
Retrieve an entire column (from an AbstractColumns) or a single row column value (from an AbstractRow) by column name (nm), index (i), or if desired, by column element type (T), index (i), and name (nm). When called on an AbstractColumns interface object, the returned object should be a 1-based indexable collection with known length. When called on an AbstractRow interface object, it returns the single column value. The methods taking a single Symbol or Int are both required for the AbstractColumns and AbstractRow interfaces; the third method is optional and can be defined when type stability is possible. The default definition of Tables.getcolumn(x, i::Int) is getfield(x, i). The default definition of Tables.getcolumn(x, nm::Symbol) is getproperty(x, nm).
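A brief sketch of the two required forms, using NamedTuples, which satisfy both interfaces through the default definitions.

```julia
using Tables

# A NamedTuple of vectors acts as an AbstractColumns-compatible object.
ct = (a = [1, 2, 3], b = ["x", "y", "z"])
col_by_name = Tables.getcolumn(ct, :a)
col_by_index = Tables.getcolumn(ct, 2)

# A plain NamedTuple acts as an AbstractRow-compatible object.
row = (a = 1, b = "x")
val_by_name = Tables.getcolumn(row, :b)
val_by_index = Tables.getcolumn(row, 1)
```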
Tables.isrowtable
— Function
Tables.isrowtable(x) => Bool
For convenience, some table objects that are naturally "row oriented" can define Tables.isrowtable(::Type{TableType}) = true to simplify satisfying the Tables.jl interface. Requirements for defining isrowtable include:
- Tables.rows(x) === x, i.e. the table object itself is a Row iterator
- If the table object is mutable, it should support:
  - push!(x, row): allow pushing a single row onto the table
  - append!(x, rows): allow appending a set of rows onto the table
- If the table object is mutable and indexable, it should support:
  - x[i] = row: allow replacing a row with another row by index
A table object that defines Tables.isrowtable will have definitions for Tables.istable, Tables.rowaccess, and Tables.rows automatically defined.
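A minimal sketch of opting in, assuming a hypothetical immutable RowSource type (name and field are illustrative) that iterates NamedTuples; since it is not mutable, the push!, append!, and setindex! requirements do not apply.

```julia
using Tables

# Hypothetical row-oriented table that iterates NamedTuple rows.
struct RowSource
    rows::Vector{NamedTuple{(:a,), Tuple{Int}}}
end

# The object itself is the row iterator.
Base.iterate(t::RowSource, st...) = iterate(t.rows, st...)
Base.length(t::RowSource) = length(t.rows)

# Opt in; istable, rowaccess, and rows follow from this definition.
Tables.isrowtable(::Type{RowSource}) = true

t = RowSource([(a = 1,), (a = 2,)])

Tables.istable(t)
Tables.rowaccess(t)
Tables.rows(t) === t
Tables.columntable(t)   # columns are built by the generic fallback
```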
Tables.istable
— Function
Tables.istable(x) => Bool
Check if an object has specifically defined that it is a table. Note that not all valid tables will return true, since it's possible to satisfy the Tables.jl interface at "run-time", e.g. a Generator of NamedTuples iterates NamedTuples, which satisfies the AbstractRow interface, but there's no static way of knowing that the generator is a table.
It is recommended that users implementing MyType define only istable(::Type{MyType}). istable(::MyType) will then automatically delegate to this method.
istable calls TableTraits.isiterabletable as a fallback. This can have a considerable runtime overhead in some contexts. To avoid this overhead and use istable as a compile-time trait, it can be called on a type as istable(typeof(obj)).
Tables.materializer
— Function
Tables.materializer(x) => Callable
For a table input, return the "sink" function or "materializing" function that can take a Tables.jl-compatible table input and make an instance of the table type. This enables "transform" workflows that take table inputs, apply transformations, potentially converting the table to a different form, and end with producing a table of the same type as the original input. The default materializer is Tables.columntable, which converts any table input into a NamedTuple of Vectors.
It is recommended that users implementing MyType define only materializer(::Type{<:MyType}). materializer(::MyType) will then automatically delegate to this method.
Tables.matrix
— Method
Tables.matrix(table; transpose::Bool=false)
Materialize any table source input as a new Matrix or in the case of a MatrixTable return the originally wrapped matrix. If the table column element types are not homogeneous, they will be promoted to a common type in the materialized Matrix. Note that column names are ignored in the conversion. By default, input table columns will be materialized as corresponding matrix columns; passing transpose=true will transpose the input with input columns as matrix rows or in the case of a MatrixTable apply permutedims to the originally wrapped matrix.
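A short sketch of the promotion and transpose behavior, assuming a column table that mixes Int64 and Float64 columns.

```julia
using Tables

ct = (a = [1, 2, 3], b = [4.0, 5.0, 6.0])

# Columns become matrix columns; Int64 and Float64 promote to Float64.
m = Tables.matrix(ct)

# With transpose=true, input columns become matrix rows.
mt = Tables.matrix(ct; transpose = true)
```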
Tables.namedtupleiterator
— Method
Tables.namedtupleiterator(x)
Pass any table input source and return a NamedTuple iterator.
Not for use with extremely wide tables with # of columns > 67K; current fundamental compiler limits prevent constructing NamedTuples that large.
Tables.nondatavaluerows
— Method
Tables.nondatavaluerows(x)
Takes any Queryverse-compatible NamedTuple iterator source and converts to a Tables.jl-compatible AbstractRow iterator. Will automatically unwrap any DataValues, replacing NA with missing. Useful for translating Query.jl results back to non-DataValue-based tables.
Tables.partitioner
— Method
Tables.partitioner(f, itr)
Tables.partitioner(x)
Convenience methods to generate table iterators. The first method takes a "materializer" function f and an iterator itr, and will call Tables.LazyTable(f, x) for x in itr for each iteration. This allows delaying table materialization until Tables.columns or Tables.rows are called on the LazyTable object (which will call f(x)). This allows a common desired pattern of materializing and processing a table on a remote process or thread, like:
for tbl in Tables.partitions(Tables.partitioner(CSV.File, list_of_csv_files))
    Threads.@spawn begin
        cols = Tables.columns(tbl)
        # do stuff with cols
    end
end
The second method is provided because the default behavior of Tables.partitions(x) is to treat x as a single, non-partitioned table. This method allows users to easily wrap a Vector or generator of tables as table partitions to pass to sink functions able to utilize Tables.partitions.
Tables.partitions
— Method
Tables.partitions(x)
Request a "table" iterator from x. Each iterated element must be a "table" in the sense that one may call Tables.rows or Tables.columns to get a row-iterator or collection of columns. All iterated elements must have identical schema, so that users may call Tables.schema(first_element) on the first iterated element and know that each subsequent iteration will match the same schema. The default definition is:
Tables.partitions(x) = (x,)
So that any input is assumed to be a single "table". This means users should feel free to call Tables.partitions anywhere they're currently calling Tables.columns or Tables.rows, and get back an iterator of those instead. In other words, "sink" functions can use Tables.partitions whether or not the user passes a partitionable table, since the default is to treat a single input as a single, non-partitioned table.
Tables.partitioner(itr) is a convenience wrapper to provide table partitions from any table iterator; this allows for easy wrapping of a Vector or iterator of tables as valid partitions, since by default, they'd be treated as a single table.
A second convenience method is provided with the definition:
Tables.partitions(x...) = x
That allows passing vararg tables and they'll be treated as separate partitions. Sink functions may allow vararg table inputs and can "splat them through" to partitions.
For convenience, Tables.partitions(x::Iterators.PartitionIterator) = x and Tables.partitions(x::Tables.Partitioner) = x are defined to handle cases where the user created partitioning with the Iterators.partition or Tables.partitioner functions.
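The sketch below exercises the definitions above on small in-memory tables. The make_table function is a hypothetical materializer standing in for something like a file reader.

```julia
using Tables

t1 = (id = [1, 2],)
t2 = (id = [3, 4],)

# Default: a single input is treated as one partition.
single = Tables.partitions(t1)

# Vararg form: each table is its own partition.
multi = Tables.partitions(t1, t2)

# Lazy partitions: make_table (hypothetical) runs only when
# Tables.columns is called on each partition.
make_table(n) = (id = collect(1:n),)
lens = Int[]
for part in Tables.partitions(Tables.partitioner(make_table, [2, 3]))
    cols = Tables.columns(part)
    push!(lens, length(Tables.getcolumn(cols, :id)))
end
```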
Tables.rowaccess
— Function
Tables.rowaccess(x) => Bool
Check whether an object has specifically defined that it implements the Tables.rows function that does not copy table data. That is to say, Tables.rows(x) must be done with O(1) time and space complexity when Tables.rowaccess(x) == true. Note that Tables.rows will work on any object that iterates AbstractRow-compatible objects, even if they don't define rowaccess, e.g. a Generator of NamedTuples. However, this generic fallback may copy the data from input table x. Also note that just because an object defines rowaccess doesn't mean a user should call Tables.rows on it; Tables.columns will also work, providing a valid AbstractColumns object from the rows. Hence, users should call Tables.rows or Tables.columns depending on what is most natural for them to consume instead of worrying about what and how the input is oriented.
It is recommended that users implementing MyType define only rowaccess(::Type{MyType}). rowaccess(::MyType) will then automatically delegate to this method.
Tables.rowmerge
— Method
rowmerge(row, other_rows...)
rowmerge(row; fields_to_merge...)
Return a NamedTuple by merging row (an AbstractRow-compliant value) with other_rows (one or more AbstractRow-compliant values) via Base.merge. This function is similar to Base.merge(::NamedTuple, ::NamedTuple...), but accepts AbstractRow-compliant values instead of NamedTuples.
A convenience method rowmerge(row; fields_to_merge...) = rowmerge(row, fields_to_merge) is defined that enables the fields_to_merge to be specified as keyword arguments.
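A quick sketch of both forms, using NamedTuples as the AbstractRow-compliant values.

```julia
using Tables

row = (a = 1, b = 2)

# Later rows win on shared names, as with Base.merge.
merged = Tables.rowmerge(row, (b = 3, c = 4))

# Keyword form for adding or overriding individual fields.
with_kw = Tables.rowmerge(row; c = 10)
```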
Tables.rows
— Function
Tables.rows(x) => Row iterator
Accesses data of input table source x row-by-row by returning an AbstractRow-compatible iterator. Note that even if the input table source is column-oriented by nature, an efficient generic definition of Tables.rows is defined in Tables.jl to return an iterator of row views into the columns of the input.
The Tables.Schema of an AbstractRow iterator can be queried via Tables.schema(rows), which may return nothing if the schema is unknown. Column names can always be queried by calling Tables.columnnames(row) on an individual row, and row values can be accessed by calling Tables.getcolumn(row, i::Int) or Tables.getcolumn(row, nm::Symbol) with a column index or name, respectively.
See also rowtable and namedtupleiterator.
Tables.rowtable
— Function
Tables.rowtable(x) => Vector{NamedTuple}
Take any input table source, and produce a Vector of NamedTuples, also known as a "row table". A "row table" is a kind of default table type of sorts, since it satisfies the Tables.jl row interface naturally, i.e. a Vector naturally iterates its elements, and NamedTuple satisfies the AbstractRow interface by default (allows indexing value by index, name, and getting all names).
For a lazy iterator over rows see rows and namedtupleiterator.
Not for use with extremely wide tables with # of columns > 67K; current fundamental compiler limits prevent constructing NamedTuples that large.
Tables.runlength
— Method
Helper function to calculate a run-length encoding of a tuple type.
Tables.schema
— Function
Tables.schema(x) => Union{Nothing, Tables.Schema}
Attempt to retrieve the schema of the object returned by Tables.rows or Tables.columns. If the AbstractRow iterator or AbstractColumns object can't determine its schema, nothing will be returned. Otherwise, a Tables.Schema object is returned, with the column names and types available for use.
Tables.subset
— Method
Tables.subset(x, inds; viewhint=nothing)
Return one or more rows from table x according to the position(s) specified by inds:
- If inds is a single non-boolean integer, return a row object.
- If inds is a vector of non-boolean integers, a vector of booleans, or a colon (:), return a subset of the original table according to the indices. In this case, the returned type is not necessarily the same as the original table type.
If other types of inds are passed than specified above, the behavior is undefined.
The viewhint argument tries to influence whether the returned object is a view of the original table or an independent copy:
- If viewhint=nothing (the default), then the implementation for a specific table type is free to decide whether to return a copy or a view.
- If viewhint=true, then a view is returned, and if viewhint=false, a copy is returned. This applies both to returning a row or a table.
Any specialized implementation of subset must support the viewhint=nothing argument. Support for viewhint=true or viewhint=false is optional (i.e. implementations may ignore the keyword argument and return a view or a copy regardless of viewhint value).
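A sketch of the two kinds of inds, assuming a column table as input. Because the returned table type is not guaranteed to match the input, the result is normalized with Tables.columntable before inspection.

```julia
using Tables

ct = (a = [1, 2, 3], b = ["x", "y", "z"])

# A single integer returns a row object.
r = Tables.subset(ct, 2)
b_value = Tables.getcolumn(r, :b)

# A vector of integers returns a table; normalize it before inspecting.
sub = Tables.columntable(Tables.subset(ct, [1, 3]))
```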
Tables.table
— Method
Tables.table(m::AbstractVecOrMat; [header])
Wrap an AbstractVecOrMat (Matrix, Vector, Adjoint, etc.) in a MatrixTable, which satisfies the Tables.jl interface. (An AbstractVector is treated as a 1-column matrix.) This allows accessing the matrix via Tables.rows and Tables.columns. An optional keyword argument header, given as an iterator, can be passed; it will be converted to a Vector{Symbol} to be used as the column names. Note that no copy of the AbstractVecOrMat is made.
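A short sketch of wrapping a matrix with explicit column names; the header values are illustrative.

```julia
using Tables

m = [1 2; 3 4]

# Wrap without copying; each matrix column becomes a table column.
t = Tables.table(m; header = [:x, :y])

ct = Tables.columntable(t)

# Tables.matrix on a MatrixTable returns the originally wrapped matrix.
same = Tables.matrix(t) === m
```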