init research
@@ -0,0 +1,24 @@
# asIterable

<web-summary>
Discover the `asIterable` operation in Kotlin DataFrame.
</web-summary>

<card-summary>
Discover the `asIterable` operation in Kotlin DataFrame.
</card-summary>

<link-summary>
Discover the `asIterable` operation in Kotlin DataFrame.
</link-summary>

Returns the values of this [`DataColumn`](DataColumn.md) as an
[Iterable](https://kotlinlang.org/api/core/kotlin-stdlib/kotlin.collections/-iterable/).

```kotlin
col.asIterable()
```

**Related operation**: [](asSequenceColumn.md).
@@ -0,0 +1,24 @@
# asSequence

<web-summary>
Discover the `asSequence` operation in Kotlin DataFrame.
</web-summary>

<card-summary>
Discover the `asSequence` operation in Kotlin DataFrame.
</card-summary>

<link-summary>
Discover the `asSequence` operation in Kotlin DataFrame.
</link-summary>

Returns the values of this [`DataColumn`](DataColumn.md) as a
[Sequence](https://kotlinlang.org/api/core/kotlin-stdlib/kotlin.sequences/-sequence/).

```kotlin
col.asSequence()
```

**Related operation**: [](asIterable.md).

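The practical difference between an `Iterable` and a `Sequence` is eager versus lazy evaluation. The following is a minimal standard-library sketch of that difference; it uses a plain `List` as a stand-in for column values, and the names `values`, `eagerPipeline`, and `lazyPipeline` are illustrative assumptions, not part of the DataFrame API.

```kotlin
// A plain list stands in for the values of a column (assumption for illustration).
val values = listOf(15, 20, 25, 30)

// Iterable pipeline: `map` processes every element before `first` runs.
fun eagerPipeline(log: MutableList<String>): Int =
    values.asIterable()
        .map { log += "map $it"; it * 2 }
        .first { it > 30 }

// Sequence pipeline: elements flow through one at a time and
// processing stops as soon as `first` finds a match.
fun lazyPipeline(log: MutableList<String>): Int =
    values.asSequence()
        .map { log += "map $it"; it * 2 }
        .first { it > 30 }

fun main() {
    val eagerLog = mutableListOf<String>()
    val lazyLog = mutableListOf<String>()
    println(eagerPipeline(eagerLog)) // 40
    println(eagerLog.size)           // 4 (all elements were mapped)
    println(lazyPipeline(lazyLog))   // 40
    println(lazyLog.size)            // 2 (stopped after the first match)
}
```

A sequence tends to pay off for long chains of operations or early-terminating queries; for short pipelines over small data the iterable form is usually simpler.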
@@ -0,0 +1,57 @@
# between

<web-summary>
Return a Boolean DataColumn indicating whether each value lies between two bounds.
</web-summary>

<card-summary>
Return a Boolean DataColumn indicating whether each value lies between two bounds.
</card-summary>

<link-summary>
Return a Boolean DataColumn indicating whether each value lies between two bounds.
</link-summary>

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.column.BetweenSamples-->

Returns a [`DataColumn`](DataColumn.md) of `Boolean` values indicating whether each element in this column
lies between the given lower and upper boundaries.

If `includeBoundaries` is `true` (default), values equal to the lower or upper boundary are also considered in range.

```kotlin
col.between(left, right, includeBoundaries)
```

### Examples

<!---FUN notebook_test_between_1-->

```kotlin
df
```

<!---END-->
<inline-frame src="./resources/notebook_test_between_1.html" width="100%" height="500px"></inline-frame>

Check whether ages are between 18 and 25, inclusive:
<!---FUN notebook_test_between_2-->

```kotlin
df.age.between(left = 18, right = 25)
```

<!---END-->
<inline-frame src="./resources/notebook_test_between_2.html" width="100%" height="500px"></inline-frame>

Check whether ages are strictly between 18 and 25 (excluding boundaries):
<!---FUN notebook_test_between_3-->

```kotlin
df.age.between(left = 18, right = 25, includeBoundaries = false)
```

<!---END-->
<inline-frame src="./resources/notebook_test_between_3.html" width="100%" height="500px"></inline-frame>

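The effect of `includeBoundaries` can be sketched with a plain Kotlin helper. This is an illustration of the described semantics only, not the library's implementation; `isBetween` and the sample `ages` list are hypothetical.

```kotlin
// Hypothetical helper mirroring the documented semantics of `between`.
fun <T : Comparable<T>> isBetween(
    value: T,
    left: T,
    right: T,
    includeBoundaries: Boolean = true,
): Boolean =
    if (includeBoundaries) {
        value >= left && value <= right // boundaries count as "in range"
    } else {
        value > left && value < right   // boundaries are excluded
    }

fun main() {
    val ages = listOf(15, 18, 20, 25, 30) // made-up sample values
    println(ages.map { isBetween(it, 18, 25) })
    // [false, true, true, true, false]
    println(ages.map { isBetween(it, 18, 25, includeBoundaries = false) })
    // [false, false, true, false, false]
}
```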
@@ -0,0 +1,397 @@
[//]: # (title: join)

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.multiple.JoinSamples-->

Joins two [`DataFrame`](DataFrame.md) objects by join columns.

A *join* creates a new dataframe by combining rows from two input dataframes according to one or more key columns.
Rows are merged when the values in the join columns match.
If there is no match, whether the row is included and how missing values are filled depends on the type of join (e.g., inner, left, right, full).

Returns a new [`DataFrame`](DataFrame.md) that contains the merged rows and columns from both inputs.

```kotlin
join(otherDf, type = JoinType.Inner) [ { joinColumns } ]

joinColumns: JoinDsl.(LeftDataFrame) -> Columns

interface JoinDsl: LeftDataFrame {

    val right: RightDataFrame

    fun DataColumn.match(rightColumn: DataColumn)
}
```

`joinColumns` is a special case of [columns selector](ColumnSelectors.md) that defines column mapping for join.

Related operations: [](multipleDataFrames.md)

## Examples

### Join with explicit keys (with different names) {collapsible="true"}

Use the Join DSL when the key column names differ:

- access the right `DataFrame` via `right`;
- define the join condition with **`match`**.

<!---FUN notebook_test_join_3-->

```kotlin
dfAges
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_3.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_5-->

```kotlin
dfCities
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_5.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_6-->

```kotlin
// INNER JOIN on differently named keys:
// Merge a row when dfAges.firstName == dfCities.name.
// With the given data all 3 names match → all rows merge.
dfAges.join(dfCities) { firstName match right.name }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_6.html" width="100%" height="500px"></inline-frame>

### Join with explicit keys (with the same names) {collapsible="true"}

If the mapped columns have the same name, just select the join columns (one or several) from the left [`DataFrame`](DataFrame.md):

<!---FUN notebook_test_join_8-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_8.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_10-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_10.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_11-->

```kotlin
// INNER JOIN on "name" only:
// Merge when left.name == right.name.
// Duplicate keys produce multiple merged rows (one per pairing).
dfLeft.join(dfRight) { name }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_11.html" width="100%" height="500px"></inline-frame>

> In this example, the "city" columns from the left and right dataframes are not used as join keys,
> so their values are not matched against each other.
> After joining, the "city" column from the right dataframe is included in the result dataframe
> with the name **"city1"** to avoid a name conflict.
> { style = "note" }

### Join with implicit keys (all columns with the same name) {collapsible="true"}

If `joinColumns` is not specified, columns with the same name from both [`DataFrame`](DataFrame.md)
objects will be used as join columns:

<!---FUN dfLeftImplicit-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/dfLeftImplicit.html" width="100%" height="500px"></inline-frame>

<!---FUN dfRightImplicit-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/dfRightImplicit.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_12-->

```kotlin
// INNER JOIN on all same-named columns ("name" and "city"):
// Merge when BOTH name AND city are equal; otherwise the row is dropped.
dfLeft.join(dfRight)
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_12.html" width="100%" height="500px"></inline-frame>

## Join types

Supported join types:
* `Inner` (default) — only matched rows from left and right [`DataFrame`](DataFrame.md) objects
* `Filter` — only matched rows from left [`DataFrame`](DataFrame.md)
* `Left` — all rows from left [`DataFrame`](DataFrame.md), mismatches from right [`DataFrame`](DataFrame.md) filled with `null`
* `Right` — all rows from right [`DataFrame`](DataFrame.md), mismatches from left [`DataFrame`](DataFrame.md) filled with `null`
* `Full` — all rows from left and right [`DataFrame`](DataFrame.md) objects, any mismatches filled with `null`
* `Exclude` — only mismatched rows from left [`DataFrame`](DataFrame.md)

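To make the difference between the join types concrete, here is a minimal plain-Kotlin sketch that models four of them over two small lists, matching on the `(name, city)` key. It illustrates the row-selection rules only and is not how the library implements joins. The row classes, function names, and most of the sample data are hypothetical; only the unmatched `("Bob", "Dubai")` and `("Bob", "Tokyo")` rows are borrowed from the examples in this document.

```kotlin
// Hypothetical row types and data, for illustration only.
data class LeftRow(val name: String, val city: String, val age: Int)
data class RightRow(val name: String, val city: String, val isBusy: Boolean)

val leftRows = listOf(
    LeftRow("Alice", "London", 15),
    LeftRow("Bob", "Dubai", 45),
    LeftRow("Charlie", "Moscow", 20),
)
val rightRows = listOf(
    RightRow("Alice", "London", true),
    RightRow("Bob", "Tokyo", true),
    RightRow("Charlie", "Moscow", false),
)

// Right rows whose (name, city) key equals the key of the given left row.
fun matchesFor(l: LeftRow): List<RightRow> =
    rightRows.filter { it.name == l.name && it.city == l.city }

// Inner: one merged row per matching (left, right) pair.
fun innerJoin(): List<Pair<LeftRow, RightRow>> =
    leftRows.flatMap { l -> matchesFor(l).map { r -> l to r } }

// Left: every left row is kept; the right side is null when nothing matches.
fun leftJoin(): List<Pair<LeftRow, RightRow?>> =
    leftRows.flatMap { l ->
        val matches = matchesFor(l)
        if (matches.isEmpty()) listOf<Pair<LeftRow, RightRow?>>(l to null)
        else matches.map { r -> l to r }
    }

// Filter: left rows with at least one match; nothing from the right is added.
fun filterJoin(): List<LeftRow> = leftRows.filter { matchesFor(it).isNotEmpty() }

// Exclude: left rows with no match at all.
fun excludeJoin(): List<LeftRow> = leftRows.filter { matchesFor(it).isEmpty() }

fun main() {
    println(innerJoin().map { it.first.name })  // [Alice, Charlie]
    println(leftJoin().map { it.second?.city }) // [London, null, Moscow]
    println(filterJoin().map { it.name })       // [Alice, Charlie]
    println(excludeJoin().map { it.name })      // [Bob]
}
```

`Right` follows the same pattern as `Left` with the roles of the two inputs swapped, and `Full` keeps the unmatched rows from both sides.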

For every join type there is a shortcut operation:

```kotlin
df.innerJoin(otherDf) [ { joinColumns } ]
df.filterJoin(otherDf) [ { joinColumns } ]
df.leftJoin(otherDf) [ { joinColumns } ]
df.rightJoin(otherDf) [ { joinColumns } ]
df.fullJoin(otherDf) [ { joinColumns } ]
df.excludeJoin(otherDf) [ { joinColumns } ]
```

### Examples {id="examples_1"}

#### Inner {collapsible="true"}

<!---FUN notebook_test_join_13-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_13.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_14-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_14.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_15-->

```kotlin
// INNER JOIN:
// Combine columns from the left and right dataframes
// and keep only rows where (name, city) matches on both sides.
dfLeft.innerJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_15.html" width="100%" height="500px"></inline-frame>

#### Filter {collapsible="true"}

<!---FUN notebook_test_join_13-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_13.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_14-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_14.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_16-->

```kotlin
// FILTER JOIN:
// Keep ONLY left rows that have ANY match on (name, city).
// No right-side columns are added.
dfLeft.filterJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_16.html" width="100%" height="500px"></inline-frame>

#### Left {collapsible="true"}

<!---FUN notebook_test_join_13-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_13.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_14-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_14.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_17-->

```kotlin
// LEFT JOIN:
// Keep ALL left rows and add columns from the right dataframe.
// If (name, city) matches, attach the right-side column values from
// the corresponding row in the right dataframe;
// if not (e.g. the ("Bob", "Dubai") row), fill them with `null`.
dfLeft.leftJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_17.html" width="100%" height="500px"></inline-frame>

#### Right {collapsible="true"}

<!---FUN notebook_test_join_13-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_13.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_14-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_14.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_18-->

```kotlin
// RIGHT JOIN:
// Keep ALL right rows and add columns from the left dataframe.
// If (name, city) matches, attach the left-side column values from
// the corresponding row in the left dataframe;
// if not (e.g. the ("Bob", "Tokyo") row), fill them with `null`.
dfLeft.rightJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_18.html" width="100%" height="500px"></inline-frame>

#### Full {collapsible="true"}

<!---FUN notebook_test_join_13-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_13.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_14-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_14.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_19-->

```kotlin
// FULL JOIN:
// Keep ALL rows from both sides. Where there's no match on (name, city),
// the other side is filled with nulls.
dfLeft.fullJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_19.html" width="100%" height="500px"></inline-frame>

#### Exclude {collapsible="true"}

<!---FUN notebook_test_join_13-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_13.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_14-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_14.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_20-->

```kotlin
// EXCLUDE JOIN:
// Keep ONLY left rows that have NO match on (name, city).
// Useful to find "unpaired" left rows.
dfLeft.excludeJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_20.html" width="100%" height="500px"></inline-frame>

@@ -0,0 +1,22 @@
# Utility functions

<web-summary>
Overview of common utility operations in Kotlin DataFrame.
</web-summary>

<card-summary>
Overview of common utility operations in Kotlin DataFrame.
</card-summary>

<link-summary>
Overview of common utility operations in Kotlin DataFrame.
</link-summary>

Explore frequently used helpers for querying and transforming your data:

- [`all`](all.md) — Check whether all rows satisfy a predicate.
- [`any`](any.md) — Check whether any row satisfies a predicate.
- [`chunked`](chunked.md) — Split a [`DataFrame`](DataFrame.md) into consecutive chunks and return them as a
  [`FrameColumn`](DataColumn.md#framecolumn).
- [`shuffle`](shuffle.md) — Randomly reorder rows.
@@ -0,0 +1,70 @@
# all

<web-summary>
Discover the `all` operation in Kotlin DataFrame.
</web-summary>

<card-summary>
Discover the `all` operation in Kotlin DataFrame.
</card-summary>

<link-summary>
Discover the `all` operation in Kotlin DataFrame.
</link-summary>

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.utils.AllSamples-->

Checks if all rows in the [](DataFrame.md) satisfy the predicate.

Returns `Boolean` — `true` if every row satisfies the predicate, `false` otherwise.

```kotlin
df.all { rowCondition }

rowCondition: (DataRow) -> Boolean
```

**Related operations**: [](any.md), [](filter.md), [](single.md), [](count.md).

### Examples

<!---FUN notebook_test_all_3-->

```kotlin
df
```

<!---END-->

<inline-frame src="./resources/notebook_test_all_3.html" width="100%" height="500px"></inline-frame>

Check whether every person's `age` is greater than 21:

<!---FUN notebook_test_all_4-->

```kotlin
df.all { age > 21 }
```

<!---END-->

Output:
```text
false
```

Check whether every person has a `name` starting with an uppercase letter and an `age` greater than or equal to 15:

<!---FUN notebook_test_all_5-->

```kotlin
df.all { name.first().isUpperCase() && age >= 15 }
```

<!---END-->

Output:
```text
true
```
@@ -0,0 +1,70 @@
# any

<web-summary>
Discover the `any` operation in Kotlin DataFrame.
</web-summary>

<card-summary>
Discover the `any` operation in Kotlin DataFrame.
</card-summary>

<link-summary>
Discover the `any` operation in Kotlin DataFrame.
</link-summary>

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.utils.AnySamples-->

Checks if there is at least one row in the [](DataFrame.md) that satisfies the predicate.

Returns `Boolean` — `true` if there is at least one row that satisfies the predicate, `false` otherwise.

```kotlin
df.any { rowCondition }

rowCondition: (DataRow) -> Boolean
```

**Related operations**: [](all.md), [](filter.md), [](single.md), [](count.md).

### Examples

<!---FUN notebook_test_any_3-->

```kotlin
df
```

<!---END-->

<inline-frame src="./resources/notebook_test_any_3.html" width="100%" height="500px"></inline-frame>

Check whether any person's `age` is greater than 21:

<!---FUN notebook_test_any_4-->

```kotlin
df.any { age > 21 }
```

<!---END-->

Output:
```text
false
```

Check whether there is any person with `age` equal to 15 and `name` equal to "Alice":

<!---FUN notebook_test_any_5-->

```kotlin
df.any { age == 15 && name == "Alice" }
```

<!---END-->

Output:
```text
true
```

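As a rough analogy, the row predicates above behave like `any` and `all` on a Kotlin collection. The sketch below uses only the standard library; the `Person` class and the `people` list are hypothetical sample data chosen to be consistent with the outputs shown above, and nothing here is a statement about how the library evaluates rows.

```kotlin
// Hypothetical sample rows; not the data frame used in the examples above.
data class Person(val name: String, val age: Int)

val people = listOf(
    Person("Alice", 15),
    Person("Bob", 20),
    Person("Charlie", 21),
)

fun main() {
    println(people.any { it.age > 21 })                        // false
    println(people.any { it.age == 15 && it.name == "Alice" }) // true
    println(people.all { it.age >= 15 })                       // true

    // `all` and `any` are duals: "all rows satisfy p" is the same as
    // "no row violates p".
    println(people.all { it.age >= 15 } == !people.any { it.age < 15 }) // true
}
```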
@@ -0,0 +1,64 @@
# chunked

<web-summary>
Discover the `chunked` operation in Kotlin DataFrame.
</web-summary>

<card-summary>
Discover the `chunked` operation in Kotlin DataFrame.
</card-summary>

<link-summary>
Discover the `chunked` operation in Kotlin DataFrame.
</link-summary>

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.utils.ChunkedSamples-->

Splits a [`DataFrame`](DataFrame.md) into consecutive sub-dataframes (chunks) and returns them as a
[`FrameColumn`](DataColumn.md#framecolumn). Chunks are formed in order and do not overlap.

Each chunk contains at most the specified number of rows.
The name of the resulting `FrameColumn` can be customized; by default, it is `"groups"`.

A `DataFrame` can be split into chunks in two ways:
- By fixed size: split into chunks of up to the given size.
- By start indices: split using custom zero-based start indices for each chunk; each chunk ends right before the next start index or at the end of the `DataFrame`.

```kotlin
df.chunked(size: Int, name: String)
df.chunked(startIndices: List<Int>, name: String)
```

### Examples

<!---FUN notebook_test_chunked_1-->

```kotlin
df
```

<!---END-->
<inline-frame src="./resources/notebook_test_chunked_1.html" width="100%" height="500px"></inline-frame>

Fixed-size chunks:
<!---FUN notebook_test_chunked_2-->

```kotlin
df.chunked(size = 2)
```

<!---END-->

<inline-frame src="./resources/notebook_test_chunked_2.html" width="100%" height="500px"></inline-frame>

Custom start indices:
<!---FUN notebook_test_chunked_3-->

```kotlin
df.chunked(startIndices = listOf(0, 1, 3), name = "segments")
```

<!---END-->

<inline-frame src="./resources/notebook_test_chunked_3.html" width="100%" height="500px"></inline-frame>

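The two splitting modes can be sketched on a plain list. This is a standard-library illustration of the rule "each chunk ends right before the next start index", not the library's implementation; `chunkedByStartIndices` is a hypothetical helper and it assumes the start indices are sorted and within bounds.

```kotlin
// Hypothetical helper: split `rows` at the given zero-based start indices.
// Assumes `startIndices` is sorted ascending and every index is < rows.size.
fun <T> chunkedByStartIndices(rows: List<T>, startIndices: List<Int>): List<List<T>> =
    startIndices.mapIndexed { i, start ->
        // A chunk ends right before the next start index, or at the end of the data.
        val end = startIndices.getOrElse(i + 1) { rows.size }
        rows.subList(start, end)
    }

fun main() {
    val rows = listOf("a", "b", "c", "d", "e") // five made-up rows

    // Fixed size: the last chunk may be shorter than the requested size.
    println(rows.chunked(2))
    // [[a, b], [c, d], [e]]

    // Start indices 0, 1, 3 give chunks covering rows 0..0, 1..2, and 3..4.
    println(chunkedByStartIndices(rows, listOf(0, 1, 3)))
    // [[a], [b, c], [d, e]]
}
```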
@@ -0,0 +1,47 @@
# shuffle

<web-summary>
Discover the `shuffle` operation in Kotlin DataFrame.
</web-summary>

<card-summary>
Discover the `shuffle` operation in Kotlin DataFrame.
</card-summary>

<link-summary>
Discover the `shuffle` operation in Kotlin DataFrame.
</link-summary>

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.utils.ShuffleSamples-->

Returns a new [`DataFrame`](DataFrame.md) with rows in random order.

You can supply a [kotlin.random.Random](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.random/-random/)
instance with a fixed seed for reproducible results.

```kotlin
df.shuffle()
df.shuffle(random: Random)
```

### Examples

<!---FUN notebook_test_shuffle_1-->

```kotlin
df
```

<!---END-->
<inline-frame src="./resources/notebook_test_shuffle_1.html" width="100%" height="500px"></inline-frame>

Deterministic shuffle using a fixed seed:
<!---FUN notebook_test_shuffle_2-->

```kotlin
df.shuffle(Random(42))
```

<!---END-->
<inline-frame src="./resources/notebook_test_shuffle_2.html" width="100%" height="500px"></inline-frame>

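Why a fixed seed gives reproducible results can be shown with the standard library alone. The sketch below shuffles a plain list rather than a `DataFrame`; `shuffledRows` and the sample names are hypothetical, and the example makes no claim about the exact order produced.

```kotlin
import kotlin.random.Random

// Hypothetical helper: shuffle rows with a generator created from a fixed seed.
fun shuffledRows(rows: List<String>, seed: Int): List<String> =
    rows.shuffled(Random(seed))

fun main() {
    val rows = listOf("Alice", "Bob", "Charlie", "Dana") // made-up rows

    val first = shuffledRows(rows, seed = 42)
    val second = shuffledRows(rows, seed = 42)

    // Two generators created from the same seed produce the same permutation.
    println(first == second)                   // true
    // Shuffling only reorders rows; it never adds or drops any.
    println(first.sorted() == rows.sorted())   // true
}
```

Note that reusing a single `Random` instance across calls advances its internal state, so consecutive shuffles with the same instance will generally differ; create a fresh `Random(seed)` when the same order is needed again.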