Warning: The current Gradle plugin is under consideration for deprecation and may be officially marked as deprecated in future releases.
At the moment, data schema generation is handled via dedicated methods instead of relying on the plugin.
In Gradle projects, the Kotlin DataFrame library provides:
- Annotation processing for generation of extension properties.
- Annotation processing for DataSchema inference from datasets.
- A Gradle task for DataSchema inference from datasets.
Configuration
To use the extension properties API in a Gradle project, add the dataframe plugin as follows:
Kotlin DSL (build.gradle.kts):
plugins {
id("org.jetbrains.kotlinx.dataframe") version "%dataFrameVersion%"
}
dependencies {
implementation("org.jetbrains.kotlinx:dataframe:%dataFrameVersion%")
}
Groovy DSL (build.gradle):
plugins {
id("org.jetbrains.kotlinx.dataframe") version "%dataFrameVersion%"
}
dependencies {
implementation 'org.jetbrains.kotlinx:dataframe:%dataFrameVersion%'
}
Annotation processing
Declare data schemas in your code and use them to access data in DataFrame objects.
A data schema is a class or interface annotated with @DataSchema:
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
@DataSchema
interface Person {
val name: String
val age: Int
}
Execute the assemble task to generate type-safe accessors for the declared schemas. You can then use them as follows:
val df = dataFrameOf("name", "age")(
"Alice", 15,
"Bob", 20,
).cast<Person>()
// `age` is only available after executing `build` or `kspKotlin`!
val teens = df.filter { age in 10..19 }
teens.print()
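To make the generated accessors less opaque, here is a hand-written approximation of what annotation processing produces for a schema like Person. This is a sketch, not the processor's actual output: the interface name PersonSketch is hypothetical and deliberately left without @DataSchema, so these declarations cannot clash with generated ones.

```kotlin
import org.jetbrains.kotlinx.dataframe.ColumnsContainer
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.DataRow

// Hypothetical stand-in for the Person schema; not annotated on purpose.
interface PersonSketch {
    val name: String
    val age: Int
}

// Row-level accessors: these let `age` resolve to an Int inside filter { ... }.
val DataRow<PersonSketch>.name: String get() = this["name"] as String
val DataRow<PersonSketch>.age: Int get() = this["age"] as Int

// Column-level accessors: these let `df.age` resolve to a typed column.
@Suppress("UNCHECKED_CAST")
val ColumnsContainer<PersonSketch>.name: DataColumn<String>
    get() = this["name"] as DataColumn<String>

@Suppress("UNCHECKED_CAST")
val ColumnsContainer<PersonSketch>.age: DataColumn<Int>
    get() = this["age"] as DataColumn<Int>
```

Because the accessors are extension properties keyed on the schema type, they only become visible on a dataframe after a cast to that schema, which is why the example above calls cast<Person>().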
Schema inference
Specify a schema using your preferred method and execute the assemble task.
The @ImportDataSchema annotation must be placed above the package directive.
You can import schemas from a URL or from a relative file path.
By default, relative paths are resolved against the project root directory.
To change this, pass the dataframe.resolutionDir option to the preprocessor.
For example:
ksp {
arg("dataframe.resolutionDir", file("data").absolutePath)
}
Note that due to incremental processing, the imported schema is regenerated only if the source code has changed since the previous invocation, even by a single character.
With the following configuration, the file Repository.Generated.kt will be generated in the build/generated/ksp/ folder, in the same package as the file containing the annotation.
@file:ImportDataSchema(
"Repository",
"https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv",
)
import org.jetbrains.kotlinx.dataframe.annotations.ImportDataSchema
import org.jetbrains.kotlinx.dataframe.api.*
See the KDocs for @ImportDataSchema in the IDE or on GitHub for more details.
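The same annotation can point at a local file instead of a URL. A minimal sketch, assuming a hypothetical local copy of the dataset saved at data/jetbrains_repositories.csv under the project root, with the default resolution directory; the package name is also an assumption:

```kotlin
// Hypothetical path: assumes the CSV was saved to <project root>/data/.
@file:ImportDataSchema(
    "Repository",
    "data/jetbrains_repositories.csv",
)

package org.example // hypothetical package; the annotation stays above this line

import org.jetbrains.kotlinx.dataframe.annotations.ImportDataSchema
```

If dataframe.resolutionDir is set to the data directory, as in the ksp example above, the path would shorten to "jetbrains_repositories.csv".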
Put this in build.gradle or build.gradle.kts.
With the following configuration, the file Repository.Generated.kt will be generated in the build/generated/dataframe/org/example folder.
dataframes {
schema {
data = "https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv"
name = "org.example.Repository"
}
}
See reference and examples for more details.
After running assemble, the following code should compile and run:
// Repository.readCsv() has an argument 'path' with the default value https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv
val df = Repository.readCsv()
// Use generated properties to access data in rows
df.maxBy { stargazersCount }.print()
// Or to access columns in the dataframe
print(df.fullName.count { it.contains("kotlin") })