init research
@@ -0,0 +1,41 @@
|
||||
[//]: # (title: DataColumn)
|
||||
<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Create-->
|
||||
|
||||
[`DataColumn`](DataColumn.md) represents a column of values.
|
||||
It can store objects of primitive or reference types,
|
||||
or other [`DataFrame`](DataFrame.md) objects.
|
||||
|
||||
See [how to create columns](createColumn.md).
|
||||
|
||||
### Properties
|
||||
* `name: String` — name of the column; should be unique within containing dataframe
|
||||
* `path: ColumnPath` — path to the column; depends on the way column was retrieved from dataframe
|
||||
* `type: KType` — type of elements in the column
|
||||
* `hasNulls: Boolean` — flag indicating whether column contains `null` values
|
||||
* `values: Iterable<T>` — column data
|
||||
* `size: Int` — number of elements in the column
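
The properties above can be inspected on any column taken from a dataframe. The following is a minimal sketch, assuming the Kotlin DataFrame library is on the classpath; the sample data and the helper name `sampleFrame` are made up for illustration, and the sketch reads the properties through accessor functions such as `name()` and `type()`.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data: two columns, one of them containing a null.
fun sampleFrame(): AnyFrame =
    dataFrameOf("name", "age")(
        "Alice", 15,
        "Bob", null,
        "Charlie", 20,
    )

fun main() {
    // Retrieve a column by name (String API).
    val age = sampleFrame()["age"]

    println(age.name())     // column name
    println(age.type())     // element type as a KType
    println(age.hasNulls()) // whether the column contains null values
    println(age.size())     // number of elements
    println(age.values())   // column data
}
```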
|
||||
|
||||
### Column kinds
|
||||
[`DataColumn`](DataColumn.md) instances can be one of three subtypes: `ValueColumn`, [`ColumnGroup`](DataColumn.md#columngroup), or [`FrameColumn`](DataColumn.md#framecolumn).
|
||||
|
||||
#### ValueColumn
|
||||
|
||||
Represents a sequence of values.
|
||||
|
||||
It can store values of primitive (integers, strings, decimals, etc.) or reference types.
|
||||
Currently, it uses [`List`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/-list/) as underlying data storage.
|
||||
|
||||
#### ColumnGroup
|
||||
|
||||
Container for nested columns. Used to create column hierarchy.
|
||||
|
||||
You can create column groups using the group operation or by splitting inward — see [group](group.md) and [split](split.md) for details.
|
||||
|
||||
#### FrameColumn
|
||||
|
||||
Special case of [`ValueColumn`](#valuecolumn) that stores other [`DataFrame`](DataFrame.md) objects as elements.
|
||||
|
||||
[`DataFrame`](DataFrame.md) objects stored in a [`FrameColumn`](DataColumn.md#framecolumn) may have different schemas.
|
||||
|
||||
[`FrameColumn`](DataColumn.md#framecolumn) may appear after [reading](read.md) from JSON or other hierarchical data structures, or after grouping operations such as [groupBy](groupBy.md) or [pivot](pivot.md).
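
To make the three column kinds concrete, here is a small sketch that produces each of them from one flat dataframe. It assumes the Kotlin DataFrame library is on the classpath; the sample data and the helper name `people` are invented for the example.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data with three flat columns.
fun people(): AnyFrame =
    dataFrameOf("firstName", "lastName", "city")(
        "Alice", "Cooper", "London",
        "Bob", "Dylan", "Dubai",
        "Charlie", "Daniels", "London",
    )

fun main() {
    val df = people()

    // A plain column of strings is a ValueColumn.
    println(df["city"].kind())

    // Grouping two columns nests them under a new ColumnGroup called "name".
    val grouped = df.group("firstName", "lastName").into("name")
    println(grouped["name"].kind())

    // Converting a GroupBy to a dataframe keeps the key column and adds
    // a FrameColumn that holds one sub-dataframe per distinct city.
    val byCity = df.groupBy("city").toDataFrame()
    println(byCity.schema())
}
```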
|
||||
|
||||
@@ -0,0 +1,14 @@
|
||||
[//]: # (title: DataFrame)
|
||||
|
||||
[`DataFrame`](DataFrame.md) represents a list of [`DataColumn`](DataColumn.md).
|
||||
|
||||
Columns in [`DataFrame`](DataFrame.md) must have equal size and unique names.
|
||||
|
||||
**Learn how to:**
|
||||
- [Create DataFrame](createDataFrame.md)
|
||||
- [Read DataFrame](read.md)
|
||||
- [Get an overview of DataFrame](info.md)
|
||||
- [Access data in DataFrame](access.md)
|
||||
- [Modify data in DataFrame](modify.md)
|
||||
- [Compute statistics for DataFrame](summaryStatistics.md)
|
||||
- [Combine several DataFrame objects](multipleDataFrames.md)
|
||||
@@ -0,0 +1,103 @@
|
||||
[//]: # (title: DataRow)
|
||||
<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.DataRowApi-->
|
||||
|
||||
`DataRow` represents a single record, one piece of data within a [`DataFrame`](DataFrame.md).
|
||||
|
||||
## Row functions
|
||||
|
||||
<snippet id="rowFunctions">
|
||||
|
||||
* `index(): Int` — sequential row number in [`DataFrame`](DataFrame.md), starts from 0
|
||||
* `prev(): DataRow?` — previous row (`null` for the first row)
|
||||
* `next(): DataRow?` — next row (`null` for the last row)
|
||||
* `diff(T) { rowExpression }: T / diffOrNull { rowExpression }: T?` — difference between the results of a [row expression](DataRow.md#row-expressions) calculated for current and previous rows
|
||||
* `explode(columns): DataFrame<T>` — spread lists and [`DataFrame`](DataFrame.md) objects vertically into new rows
|
||||
* `values(): List<Any?>` — list of all cell values from the current row
|
||||
* `valuesOf<T>(): List<T>` — list of values of the given type
|
||||
* `columnsCount(): Int` — number of columns
|
||||
* `columnNames(): List<String>` — list of all column names
|
||||
* `columnTypes(): List<KType>` — list of all column types
|
||||
* `namedValues(): List<NameValuePair<Any?>>` — list of name-value pairs where `name` is a column name and `value` is cell value
|
||||
* `namedValuesOf<T>(): List<NameValuePair<T>>` — list of name-value pairs where value has given type
|
||||
* `transpose(): DataFrame<NameValuePair<*>>` — [`DataFrame`](DataFrame.md) of two columns: `name: String` is column names and `value: Any?` is cell values
|
||||
* `transposeTo<T>(): DataFrame<NameValuePair<T>>`— [`DataFrame`](DataFrame.md) of two columns: `name: String` is column names and `value: T` is cell values
|
||||
* `getRow(Int): DataRow` — row from [`DataFrame`](DataFrame.md) by row index
|
||||
* `getRows(Iterable<Int>): DataFrame` — [`DataFrame`](DataFrame.md) with subset of rows selected by absolute row index.
|
||||
* `relative(Iterable<Int>): DataFrame` — [`DataFrame`](DataFrame.md) with subset of rows selected by relative row index: `relative(-1..1)` will return previous, current and next row. Requested indices will be coerced to the valid range and invalid indices will be skipped
|
||||
* `getValue<T>(columnName)` — cell value of type `T` by this row and given `columnName`
|
||||
* `getValueOrNull<T>(columnName)` — cell value of type `T?` by this row and given `columnName` or `null` if there's no such column
|
||||
* `get(column): T` — cell value by this row and given `column`
|
||||
* `String.invoke<T>(): T` — cell value of type `T` by this row and given `this` column name
|
||||
* `ColumnPath.invoke<T>(): T` — cell value of type `T` by this row and given `this` column path
|
||||
* `ColumnReference.invoke(): T` — cell value of type `T` by this row and given `this` column
|
||||
* `df()` — [`DataFrame`](DataFrame.md) that current row belongs to
|
||||
|
||||
</snippet>
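
A few of the row functions listed above can be tried on a row picked by index. This is only a sketch, assuming the Kotlin DataFrame library is on the classpath; the sample data and the helper name `ages` are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data.
fun ages(): AnyFrame =
    dataFrameOf("name", "age")(
        "Alice", 15,
        "Bob", 20,
        "Charlie", 20,
    )

fun main() {
    val df = ages()
    val row = df[1] // the second row; indices start from 0

    println(row.index())             // sequential row number
    println(row.values())            // all cell values of this row
    println(row.columnNames())       // names of all columns
    println(row.prev()?.get("name")) // cell from the previous row
    println(row.next()?.get("name")) // cell from the next row
    println(df[0].prev())            // null, because the first row has no predecessor
}
```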
|
||||
|
||||
## Row expressions
|
||||
Row expressions provide a value for every row of [`DataFrame`](DataFrame.md) and are used in [add](add.md), [filter](filter.md), [forEach](iterate.md), [update](update.md) and other operations.
|
||||
|
||||
<!---FUN expressions-->
|
||||
|
||||
```kotlin
|
||||
// Row expression computes values for a new column
|
||||
df.add("fullName") { name.firstName + " " + name.lastName }
|
||||
|
||||
// Row expression computes updated values
|
||||
df.update { weight }.at(1, 3, 4).with { prev()?.weight }
|
||||
|
||||
// Row expression computes cell content for values of pivoted column
|
||||
df.pivot { city }.with { name.lastName.uppercase() }
|
||||
```
|
||||
|
||||
<inline-frame src="resources/org.jetbrains.kotlinx.dataframe.samples.api.DataRowApi.expressions.html" width="100%"/>
|
||||
<!---END-->
|
||||
|
||||
Row expression signature: `DataRow.(DataRow) -> T`. Row values can be accessed with or without the `it` keyword: the implicit receiver and the explicit argument refer to the same `DataRow` object.
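
The shape of this signature is plain Kotlin: a function type with a receiver that also takes one argument. The sketch below illustrates it without the library; `Row`, `RowExpression`, and `mapRows` are hypothetical stand-ins and not part of the Kotlin DataFrame API.

```kotlin
// Hypothetical stand-in for DataRow, used only to illustrate the function type.
class Row(private val cells: Map<String, Any?>) {
    operator fun get(name: String): Any? = cells[name]
}

// Same shape as the signature above: the row is both receiver and argument.
typealias RowExpression<T> = Row.(Row) -> T

// Evaluates the expression for every row, passing the row as `this` and as `it`.
fun <T> List<Row>.mapRows(expression: RowExpression<T>): List<T> =
    map { row -> row.expression(row) }

fun main() {
    val rows = listOf(
        Row(mapOf("name" to "Alice", "age" to 15)),
        Row(mapOf("name" to "Bob", "age" to 20)),
    )

    // Access through the implicit receiver.
    val viaReceiver = rows.mapRows { this["age"] as Int + 1 }
    // Access through the explicit argument.
    val viaArgument = rows.mapRows { it["age"] as Int + 1 }

    println(viaReceiver == viaArgument) // true: both refer to the same row
}
```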
|
||||
|
||||
## Row conditions
|
||||
Row condition is a special case of [row expression](#row-expressions) that returns `Boolean`.
|
||||
|
||||
<!---FUN conditions-->
|
||||
|
||||
```kotlin
|
||||
// Row condition is used to filter rows by index
|
||||
df.filter { index() % 5 == 0 }
|
||||
|
||||
// Row condition is used to drop rows where `age` is the same as in the previous row
|
||||
df.drop { diffOrNull { age } == 0 }
|
||||
|
||||
// Row condition is used to filter rows for value update
|
||||
df.update { weight }.where { index() > 4 && city != "Paris" }.with { 50 }
|
||||
```
|
||||
|
||||
<inline-frame src="resources/org.jetbrains.kotlinx.dataframe.samples.api.DataRowApi.conditions.html" width="100%"/>
|
||||
<!---END-->
|
||||
|
||||
Row condition signature: ```DataRow.(DataRow) -> Boolean```
|
||||
|
||||
|
||||
|
||||
## Row statistics
|
||||
|
||||
<snippet id="rowStatistics">
|
||||
|
||||
The following [statistics](summaryStatistics.md) are available for `DataRow`:
|
||||
* `rowSum`
|
||||
* `rowMean`
|
||||
* `rowStd`
|
||||
|
||||
These statistics will be applied only to values of appropriate types, and incompatible values will be ignored.
|
||||
For example, if a [dataframe](DataFrame.md) has columns of types `String` and `Int`,
|
||||
`rowSum()` will compute the sum of the `Int` values in the row and ignore `String` values.
|
||||
|
||||
To apply statistics only to values of a particular type use `-Of` versions:
|
||||
* `rowSumOf<T>`
|
||||
* `rowMeanOf<T>`
|
||||
* `rowStdOf<T>`
|
||||
* `rowMinOf<T>`
|
||||
* `rowMaxOf<T>`
|
||||
* `rowMedianOf<T>`
|
||||
* `rowPercentileOf<T>`
|
||||
|
||||
</snippet>
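
The "ignore incompatible values" rule can be pictured with ordinary collections. The following sketch is not the library implementation; `sumOfNumbers` and `sumOfType` are hypothetical helpers, and returning `Double` is a simplification made for this example.

```kotlin
// Hypothetical helper: sums every numeric cell and skips everything else,
// in the spirit of how rowSum is described above.
fun sumOfNumbers(cells: List<Any?>): Double =
    cells.filterIsInstance<Number>().sumOf { it.toDouble() }

// Hypothetical typed variant: only cells of exactly the requested type take part,
// in the spirit of the -Of versions.
inline fun <reified T : Number> sumOfType(cells: List<Any?>): Double =
    cells.filterIsInstance<T>().sumOf { it.toDouble() }

fun main() {
    // One row with a String, an Int, a Double, and a missing value.
    val cells = listOf("Alice", 15, 54.5, null)

    println(sumOfNumbers(cells))   // 69.5: the String and the null are ignored
    println(sumOfType<Int>(cells)) // 15.0: only the Int cell is counted
}
```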
|
||||
@@ -0,0 +1,117 @@
|
||||
[//]: # (title: Access APIs)
|
||||
|
||||
<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.ApiLevels-->
|
||||
|
||||
By nature, dataframes are dynamic objects;
|
||||
column labels depend on the input source and new columns can be added
|
||||
or deleted while wrangling.
|
||||
Kotlin, in contrast, is a statically typed language where all types are defined and verified
|
||||
ahead of execution.
|
||||
|
||||
That's why creating a flexible, handy, and, at the same time, safe API to a dataframe is tricky.
|
||||
|
||||
In the Kotlin DataFrame library, we provide two different ways to access columns.
|
||||
|
||||
## List of Access APIs
|
||||
|
||||
Here's a list of all APIs in order of increasing safety.
|
||||
|
||||
* **String API** <br/>
|
||||
Columns are accessed by a `String` representing their name. Both type-checking and name-checking are done at runtime.
|
||||
|
||||
* [**Extension Properties API**](extensionPropertiesApi.md) <br/>
|
||||
Extension access properties are generated based on the dataframe schema. The name and type of properties are inferred
|
||||
from the name and type of the corresponding columns.
|
||||
|
||||
## Example
|
||||
|
||||
Here's an example of how the same operations can be performed via different Access APIs:
|
||||
|
||||
<note>
|
||||
In most of the code snippets in this documentation, there's a tab selector that allows switching between Access APIs.
|
||||
</note>
|
||||
|
||||
<tabs>
|
||||
|
||||
<tab title="String API">
|
||||
|
||||
<!---FUN strings-->
|
||||
|
||||
```kotlin
|
||||
DataFrame.read("titanic.csv")
|
||||
.add("lastName") { "name"<String>().split(",").last() }
|
||||
.dropNulls("age")
|
||||
.filter {
|
||||
"survived"<Boolean>() &&
|
||||
"home"<String>().endsWith("NY") &&
|
||||
"age"<Int>() in 10..20
|
||||
}
|
||||
```
|
||||
|
||||
<!---END-->
|
||||
|
||||
</tab>
|
||||
|
||||
<tab title = "Extension Properties API">
|
||||
|
||||
<!---FUN extensionProperties1-->
|
||||
|
||||
```kotlin
|
||||
val df /* : AnyFrame */ = DataFrame.read("titanic.csv")
|
||||
```
|
||||
|
||||
<!---END-->
|
||||
|
||||
<!---FUN extensionProperties2-->
|
||||
|
||||
```kotlin
|
||||
df.add("lastName") { name.split(",").last() }
|
||||
.dropNulls { age }
|
||||
.filter { survived && home.endsWith("NY") && age in 10..20 }
|
||||
```
|
||||
|
||||
<!---END-->
|
||||
|
||||
</tab>
|
||||
|
||||
</tabs>
|
||||
|
||||
The `titanic.csv` file can be found [here](https://github.com/Kotlin/dataframe/blob/master/data/titanic.csv).
|
||||
|
||||
## Comparing APIs
|
||||
|
||||
The String API is the simplest and the least safe of them all. Its main advantage is that it can be
|
||||
used at any time, including when accessing new columns in chain calls. So we can write something like:
|
||||
|
||||
```kotlin
|
||||
df.add("weight") { ... } // add a new column `weight`, calculated by some expression
|
||||
.sortBy("weight") // sorting dataframe rows by its value
|
||||
```
|
||||
|
||||
In contrast, generated [extension properties](extensionPropertiesApi.md) form the most convenient and the safest API.
|
||||
Using them, you can always be sure that you work with correct data and types.
|
||||
However, there's a bottleneck at the moment of generation.
|
||||
To get new extension properties, you have to run a cell in a notebook,
|
||||
which could lead to unnecessary variable declarations.
|
||||
Currently, we are working on a compiler plugin that generates these properties on the fly while typing!
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<td> API </td>
|
||||
<td> Type-checking </td>
|
||||
<td> Column names checking </td>
|
||||
<td> Column existence checking </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> String API </td>
|
||||
<td> Runtime </td>
|
||||
<td> Runtime </td>
|
||||
<td> Runtime </td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td> Extension Properties API </td>
|
||||
<td> Generation-time </td>
|
||||
<td> Generation-time </td>
|
||||
<td> Generation-time </td>
|
||||
</tr>
|
||||
</table>
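
The difference between runtime and generation-time checking can be illustrated without the library. In the sketch below, `Row` is a hypothetical stand-in for a row, and the two extension properties are written by hand to mimic what generated properties might look like; the real generated code may differ.

```kotlin
// Hypothetical stand-in for a row with string-keyed cells.
class Row(private val cells: Map<String, Any?>) {
    operator fun get(name: String): Any? = cells[name]
}

// Hand-written imitation of generated extension properties:
// the column name and type are fixed once, in one place.
val Row.name: String get() = this["name"] as String
val Row.age: Int get() = this["age"] as Int

fun main() {
    val row = Row(mapOf("name" to "Alice", "age" to 15))

    // String-style access: a misspelled name or a wrong cast
    // is only discovered when this line runs.
    val ageByString = row["age"] as Int

    // Property-style access: a misspelled property or a type mismatch
    // is rejected by the compiler before the program runs.
    val ageByProperty: Int = row.age

    println(ageByString == ageByProperty) // true
}
```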
|
||||
@@ -0,0 +1,62 @@
|
||||
# Concepts And Principles
|
||||
|
||||
<web-summary>
|
||||
Learn what Kotlin DataFrame is about — its core concepts, design principles, and usage philosophy.
|
||||
</web-summary>
|
||||
|
||||
<card-summary>
|
||||
Discover the fundamentals of the library —
|
||||
understand key concepts, motivation, and the overall structure of the library.
|
||||
</card-summary>
|
||||
|
||||
<link-summary>
|
||||
Explore the fundamentals of Kotlin DataFrame —
|
||||
understand key concepts, motivation, and the overall structure of the library.
|
||||
</link-summary>
|
||||
|
||||
|
||||
<show-structure depth="3"/>
|
||||
|
||||
|
||||
## What is a dataframe
|
||||
|
||||
A *dataframe* is an abstraction for working with structured data.
|
||||
Essentially, it’s a 2-dimensional table with labeled columns of potentially different types.
|
||||
You can think of it like a spreadsheet or SQL table, or a dictionary of series objects.
|
||||
|
||||
The handiness of this abstraction is not in the table itself but in a set of operations defined on it.
|
||||
The Kotlin DataFrame library is an idiomatic Kotlin DSL defining such operations.
|
||||
The process of working with a dataframe is often called *data wrangling*, which
|
||||
is the process of transforming and mapping data from one "raw" data form into another format
|
||||
that is more appropriate for analytics and visualization.
|
||||
The goal of data wrangling is to ensure that the data is of high quality and useful.
|
||||
|
||||
## Main Features and Concepts
|
||||
|
||||
* [**Hierarchical**](hierarchical.md) — the Kotlin DataFrame library provides an ability to read and present data from different sources,
|
||||
including not only plain **CSV** but also **JSON** or **[SQL databases](readSqlDatabases.md)**.
|
||||
This is why it was designed to be hierarchical and allows nesting of columns and cells.
|
||||
* **Functional** — the data processing pipeline is organized in a chain of [`DataFrame`](DataFrame.md) transformation operations.
|
||||
* **Immutable** — every operation returns a new instance of [`DataFrame`](DataFrame.md) reusing underlying storage wherever it's possible.
|
||||
* **Readable** — data transformation operations are defined in DSL close to natural language.
|
||||
* **Practical** — provides simple solutions for common problems and the ability to perform complex tasks.
|
||||
* **Minimalistic** — simple, yet powerful data model of three [column kinds](DataColumn.md#column-kinds).
|
||||
* [**Interoperable**](collectionsInterop.md) — convertible to and from Kotlin data classes and collections.
|
||||
This also means conversion to/from other libraries' data structures is usually quite straightforward!
|
||||
See our [examples](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources)
|
||||
for some conversions between DataFrame and [Apache Spark](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/spark), [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik), and [JetBrains Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed).
|
||||
* **Generic** — can store objects of any type, not only numbers or strings.
|
||||
* **Typesafe** — the Kotlin DataFrame library provides a mechanism of on-the-fly [**generation of extension properties**](extensionPropertiesApi.md)
|
||||
that correspond to the columns of a dataframe.
|
||||
In interactive notebooks like Jupyter or Datalore, the generation runs after each cell execution.
|
||||
In IntelliJ IDEA, there's a Gradle plugin for generating properties based on a CSV or JSON file.
|
||||
Also, we’re working on a compiler plugin that infers and transforms [`DataFrame`](DataFrame.md) schema while typing.
|
||||
You can now clone this [project with many examples](https://github.com/koperagen/df-plugin-demo) showcasing how it allows you to reliably use our most convenient extension properties API.
|
||||
The generated properties ensure you’ll never misspell a column name or mix up its type, and nullability is preserved as well.
|
||||
* [**Polymorphic**](schemas.md) —
|
||||
if all columns of a [`DataFrame`](DataFrame.md) instance are present in another dataframe,
|
||||
then the first one will be seen as a superclass for the latter.
|
||||
This means you can define a function on an interface with some set of columns
|
||||
and then execute it safely on any [`DataFrame`](DataFrame.md) which contains this same set of columns.
|
||||
In notebooks, this works out-of-the-box.
|
||||
In ordinary projects, this requires casting (for now).
|
||||
@@ -0,0 +1,20 @@
|
||||
[//]: # (title: Hierarchical data structures)
|
||||
|
||||
[`DataFrame`](DataFrame.md) can represent hierarchical data structures using two special types of columns:
|
||||
|
||||
* [`ColumnGroup`](DataColumn.md#columngroup) is a group of [columns](DataColumn.md)
|
||||
* [`FrameColumn`](DataColumn.md#framecolumn) is a column of [dataframes](DataFrame.md)
|
||||
|
||||
You can read a [`DataFrame`](DataFrame.md) [from JSON](read.md#read-from-json) or [from an in-memory object graph](createDataFrame.md#todataframe), preserving the original tree structure.
|
||||
|
||||
Hierarchical columns can also appear as a result of some [modification operations](modify.md):
|
||||
* [group](group.md) produces [`ColumnGroup`](DataColumn.md#columngroup)
|
||||
* [groupBy](groupBy.md) produces [`FrameColumn`](DataColumn.md#framecolumn)
|
||||
* [pivot](pivot.md) may produce [`FrameColumn`](DataColumn.md#framecolumn)
|
||||
* [split](split.md) of [`FrameColumn`](DataColumn.md#framecolumn) will produce several [`ColumnGroup`](DataColumn.md#columngroup)
|
||||
* [implode](implode.md) converts [`ColumnGroup`](DataColumn.md#columngroup) into [`FrameColumn`](DataColumn.md#framecolumn)
|
||||
* [explode](explode.md) converts [`FrameColumn`](DataColumn.md#framecolumn) into [`ColumnGroup`](DataColumn.md#columngroup)
|
||||
* [merge](merge.md) converts [`ColumnGroup`](DataColumn.md#columngroup) into [`FrameColumn`](DataColumn.md#framecolumn)
|
||||
* etc.
|
||||
|
||||
Operations in the navigation tree are grouped such that you can find operations and their respective inverse together, like `group` and `ungroup`. This allows you to quickly find out how to simplify any hierarchical structure you come across.
|
||||
@@ -0,0 +1,24 @@
|
||||
[//]: # (title: NaN and NA)
|
||||
|
||||
Using the Kotlin DataFrame library, you might come across the terms `NaN` and `NA`.
|
||||
This page explains what they mean and how to work with them.
|
||||
|
||||
## NaN
|
||||
|
||||
`Float` or `Double` values can be represented as `NaN`,
|
||||
in cases where a mathematical operation is undefined, such as dividing zero by zero (`0.0 / 0.0`). The
|
||||
result of such an operation can only be described as "**N**ot **a** **N**umber".
|
||||
|
||||
This is different from `null`, which means that a value is missing and, in Kotlin, can only occur
|
||||
for nullable types, such as `Float?` and `Double?`.
|
||||
|
||||
You can use [fillNaNs](fill.md#fillnans) to replace `NaNs` in certain columns with a given value or expression
|
||||
or [dropNaNs](drop.md#dropnans) to drop rows with `NaNs` in them.
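
The distinction between `NaN` and `null` can be checked with the Kotlin standard library alone. The helper `divide` below is introduced only for this illustration.

```kotlin
// Helper introduced only for this illustration.
fun divide(a: Double, b: Double): Double = a / b

fun main() {
    val undefined = divide(0.0, 0.0) // zero divided by zero has no defined result
    val infinite = divide(1.0, 0.0)  // a non-zero value divided by zero is infinite

    println(undefined.isNaN())       // true
    println(infinite.isNaN())        // false
    println(infinite.isInfinite())   // true

    // NaN is a value of a non-nullable type, so it is not the same as a missing value.
    val present: Double = undefined
    val missing: Double? = null
    println(present.isNaN())         // true
    println(missing == null)         // true
}
```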
|
||||
|
||||
## NA
|
||||
|
||||
`NA` in Kotlin DataFrame means either [`NaN`](#nan) or `null`, which is another way to say that the value
|
||||
is "**N**ot **A**vailable".
|
||||
|
||||
You can use [fillNA](fill.md#fillna) to replace `NAs` in certain columns with a given value or expression
|
||||
or [dropNA](drop.md#dropna) to drop rows with `NAs` in them.
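
The definition of `NA` can be written down as a small predicate. This is a sketch of the rule as stated on this page, not the library's own function; the name `isNotAvailable` is hypothetical.

```kotlin
// Hypothetical helper that mirrors the definition above: NA is null or NaN.
fun isNotAvailable(value: Any?): Boolean = when (value) {
    null -> true
    is Double -> value.isNaN()
    is Float -> value.isNaN()
    else -> false
}

fun main() {
    val cells = listOf(1.0, Double.NaN, null, "text", Float.NaN)

    println(cells.map(::isNotAvailable))       // [false, true, true, false, true]
    println(cells.filterNot(::isNotAvailable)) // [1.0, text]
}
```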
|
||||
@@ -0,0 +1,46 @@
|
||||
[//]: # (title: Number Unification)
|
||||
|
||||
Unifying numbers means converting them to a common number type without losing information.
|
||||
|
||||
This is currently an internal part of the library,
|
||||
but its logic can be encountered in multiple places, such as
|
||||
[statistics](summaryStatistics.md) and [reading JSON](read.md#read-from-json).
|
||||
|
||||
The following graph shows the hierarchy of number types in Kotlin DataFrame.
|
||||
|
||||
<inline-frame src="resources/org.jetbrains.kotlinx.dataframe.documentation.UnifyingNumbers.Graph.html" width="100%"/>
|
||||
|
||||
The order is top-down from the most complex type to the simplest one.
|
||||
|
||||
For each number type in the graph, it holds that a number of that type can be expressed losslessly by
|
||||
a number of a more complex type (any of its parents).
|
||||
This is because the more complex type has either a larger range or higher precision (in terms of bits).
|
||||
|
||||
Nullability, while not displayed in the graph, is also taken into account.
|
||||
This means that `Int?` and `Float` will be unified to `Double?`.
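
Why `Int` and `Float` meet at `Double` rather than at `Float` can be checked with plain Kotlin. The helper names below are made up for this sketch, which only demonstrates the underlying arithmetic and does not call the library's internal unification logic.

```kotlin
// Helpers made up for this sketch: does an Int survive a round trip through the type?
fun survivesFloat(value: Int): Boolean = value.toFloat().toInt() == value
fun survivesDouble(value: Int): Boolean = value.toDouble().toInt() == value

fun main() {
    val value = 16_777_217 // 2^24 + 1, too precise for the 24-bit significand of Float

    println(survivesFloat(value))  // false: Float rounds it to 16777216
    println(survivesDouble(value)) // true: Double represents every Int exactly

    // Nullability is carried along: converting Int? and Float values gives Double? values.
    val mixed: List<Number?> = listOf(1, null, 2.5f)
    val unified: List<Double?> = mixed.map { it?.toDouble() }
    println(unified)               // [1.0, null, 2.5]
}
```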
|
||||
|
||||
`Nothing` is at the bottom of the graph and is the starting point in unification.
|
||||
This can be interpreted as "no type" and can have no instance, while `Nothing?` can only be `null`.
|
||||
|
||||
> There may be parts of the library that "unify" numbers, such as [`readCsv`](read.md#column-type-inference-from-csv),
|
||||
> or [`readExcel`](read.md#read-from-excel).
|
||||
> However, because they rely on another library (like [Deephaven CSV](https://github.com/deephaven/deephaven-csv))
|
||||
> this may behave slightly differently.
|
||||
|
||||
### Unified Number Type Options
|
||||
|
||||
There are variants of this graph that exclude some types, such as `BigDecimal` and `BigInteger`, or
|
||||
allow some slightly lossy conversions, like from `Long` to `Double`.
|
||||
|
||||
This follows either `UnifiedNumberTypeOptions.PRIMITIVES_ONLY` or
|
||||
`UnifiedNumberTypeOptions.DEFAULT`.
|
||||
|
||||
For `PRIMITIVES_ONLY`, used by [statistics](summaryStatistics.md), big numbers are excluded from the graph.
|
||||
Additionally, `Double` is considered the most complex type,
|
||||
meaning `Long`/`ULong` and `Double` can be joined to `Double`,
|
||||
potentially losing a little precision(!).
|
||||
|
||||
For `DEFAULT`, used by [`readJson`](read.md#read-from-json), big numbers can appear.
|
||||
`BigDecimal` is considered the most complex type, meaning that `Long`/`ULong` and `Double` will be joined
|
||||
to `BigDecimal` instead.
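
The precision trade-off between the two options can be reproduced with the standard library and `java.math.BigDecimal`. The helper names are hypothetical, and the sketch shows only the arithmetic behind the two choices, not the library's internal code.

```kotlin
import java.math.BigDecimal

// Hypothetical helpers: does a Long survive a round trip through the target type?
fun roundTripsThroughDouble(value: Long): Boolean =
    value.toDouble().toLong() == value

fun roundTripsThroughBigDecimal(value: Long): Boolean =
    BigDecimal.valueOf(value).longValueExact() == value

fun main() {
    val value = (1L shl 53) + 1 // 9007199254740993, one past the exact integer range of Double

    // Joining Long with Double, as PRIMITIVES_ONLY does, can drop the lowest bits.
    println(roundTripsThroughDouble(value))     // false

    // Joining to BigDecimal, as DEFAULT does, keeps the value exact.
    println(roundTripsThroughBigDecimal(value)) // true

    // A Double converts to BigDecimal and back without loss as well.
    println(BigDecimal(0.1).toDouble() == 0.1)  // true
}
```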
|
||||
|
||||
@@ -0,0 +1,25 @@
|
||||
# Spelling Conventions
|
||||
|
||||
<web-summary>
|
||||
Clarifies naming conventions used in Kotlin DataFrame documentation for the library, data format, and Kotlin type.
|
||||
</web-summary>
|
||||
|
||||
<card-summary>
|
||||
Understand how to distinguish between "Kotlin DataFrame", "dataframe", and `DataFrame` in the documentation.
|
||||
</card-summary>
|
||||
|
||||
<link-summary>
|
||||
Spelling and naming rules for using "Kotlin DataFrame", "dataframe", and `DataFrame` properly.
|
||||
</link-summary>
|
||||
|
||||
While reading Kotlin DataFrame documentation, you may come across several similar terms referring to different concepts:
|
||||
|
||||
* **Kotlin DataFrame** (or just "DataFrame") — the name of the official library.
|
||||
* *dataframe* — a general term for data in a tabular (frame) format.
|
||||
* [`DataFrame`](DataFrame.md) — a Kotlin type or its instance that represents a wrapper around a dataframe.
|
||||
|
||||
Here’s a correct usage example:
|
||||
|
||||
```markdown
|
||||
Kotlin DataFrame allows you to read a dataframe from a CSV file into a `DataFrame`.
|
||||
```
|
||||
@@ -0,0 +1,5 @@
|
||||
[//]: # (title: Data Abstractions)
|
||||
|
||||
* [`DataColumn`](DataColumn.md) is a named, typed and ordered collection of elements
|
||||
* [`DataFrame`](DataFrame.md) consists of one or several [`DataColumns`](DataColumn.md) with unique names and equal size
|
||||
* [`DataRow`](DataRow.md) is a single row of [`DataFrame`](DataFrame.md) and provides a single value for every [`DataColumn`](DataColumn.md)
|
||||