init research
@@ -0,0 +1,74 @@
# Examples of Kotlin DataFrame

### Idea examples

* [Gradle plugin example](kotlin-dataframe-plugin-gradle-example) IDEA project with a
[Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html) example.
* [Maven plugin example](kotlin-dataframe-plugin-maven-example) IDEA project with a
[Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html) example.
* [android example](android-example) A minimal Android project showcasing integration with Kotlin DataFrame.
Also includes the [Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html).
* [movies](idea-examples/movies) Uses extension properties from the [Access API](https://kotlin.github.io/dataframe/apilevels.html) to perform a data-cleaning task.
* [titanic](idea-examples/titanic)
* [youtube](idea-examples/youtube)
* [json](idea-examples/json) Uses OpenAPI support in DataFrame's Gradle and KSP plugins to access data from [API guru](https://apis.guru/) in a type-safe manner.
* [imdb sql database](https://github.com/zaleslaw/KotlinDataFrame-SQL-Examples) Showcases how to convert data from an SQL table to a Kotlin DataFrame
and how to transform the result of an SQL query into a DataFrame.
* [spark-parquet-dataframe](idea-examples/spark-parquet-dataframe) Showcases how to read data and ML models exported from Apache Spark via Parquet files.
* [unsupported-data-sources](idea-examples/unsupported-data-sources) Showcases how to use DataFrame with
(currently) unsupported data libraries such as [Spark](https://spark.apache.org/) and [Exposed](https://github.com/JetBrains/Exposed).
The examples show how to convert to and from Kotlin DataFrame and their respective tables.
    * **JetBrains Exposed**: See the [exposed folder](./idea-examples/unsupported-data-sources/exposed)
    for an example of using Kotlin DataFrame with [Exposed](https://github.com/JetBrains/Exposed).
    * **Hibernate**: See the [hibernate folder](./idea-examples/unsupported-data-sources/hibernate)
    for an example of using Kotlin DataFrame with [Hibernate](https://hibernate.org/orm/).
    * **Apache Spark**: See the [spark folder](./idea-examples/unsupported-data-sources/spark)
    for an example of using Kotlin DataFrame with [Spark](https://spark.apache.org/) and with the [Kotlin Spark API](https://github.com/JetBrains/kotlin-spark-api).
    * **Multik**: See the [multik folder](./idea-examples/unsupported-data-sources/multik)
    for an example of using Kotlin DataFrame with [Multik](https://github.com/Kotlin/multik).
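The conversion these examples perform can be pictured without any library at all: an ORM such as Exposed hands you rows as objects, while a DataFrame stores the same data as named columns. Below is a minimal stdlib-only sketch of that pivot; the `PersonRow` class and both helper functions are illustrative names, not part of any of the libraries above.

```kotlin
// Row <-> column conversion, the core idea behind the examples above.
// ORMs return a list of row objects; a DataFrame is column-oriented,
// so converting means pivoting the list into named columns and back.
data class PersonRow(val name: String, val age: Int)

fun toColumns(rows: List<PersonRow>): Map<String, List<Any>> = mapOf(
    "name" to rows.map { it.name },
    "age" to rows.map { it.age },
)

fun toRows(columns: Map<String, List<Any>>): List<PersonRow> =
    (columns.getValue("name") zip columns.getValue("age"))
        .map { (n, a) -> PersonRow(n as String, a as Int) }

fun main() {
    val rows = listOf(PersonRow("Alice", 30), PersonRow("Bob", 25))
    val columns = toColumns(rows)
    println(columns["age"])           // [30, 25]
    println(toRows(columns) == rows)  // true
}
```

In the actual example projects, the hand-rolled helpers are replaced by the library's own conversions, such as the `toDataFrame()` extension on lists of data classes.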

### Notebook examples

* people ([Datalore](https://datalore.jetbrains.com/view/notebook/aOTioEClQQrsZZBKeUPAQj)) –
Small artificial dataset used in [DataFrame API examples](https://kotlin.github.io/dataframe/operations.html)
___
* puzzles ([notebook](notebooks/puzzles/40%20puzzles.ipynb)/[Datalore](https://datalore.jetbrains.com/view/notebook/CVp3br3CDXjUGaxxqfJjFF)) –
Inspired by [100 pandas puzzles](https://github.com/ajcr/100-pandas-puzzles). You will go from the simplest tasks to
complex problems that require thought. This notebook shows you how to solve these tasks with Kotlin
DataFrame in a concise, elegant style.
___
* movies ([notebook](notebooks/movies/movies.ipynb)/[Datalore](https://datalore.jetbrains.com/view/notebook/89IMYb1zbHZxHfwAta6eKP)) –
In this notebook you can see the basic operations of Kotlin DataFrame on data from [movielens](https://movielens.org/).
You can download the data from [this link](https://grouplens.org/datasets/movielens/latest/).
___
* netflix ([notebook](notebooks/netflix/netflix.ipynb)/[Datalore](https://datalore.jetbrains.com/view/notebook/xSJ4rx49hcH71pPnFgZBCq)) –
Explore TV shows and movies from Netflix with the powerful Kotlin DataFrame API and beautiful
visualizations from [lets-plot](https://github.com/JetBrains/lets-plot-kotlin).
___
* github ([notebook](notebooks/github/github.ipynb)/[Datalore](https://datalore.jetbrains.com/view/notebook/P9n6jYL4mmY1gx3phz5TsX)) –
This notebook shows what hierarchical dataframes look like and how to work with them.
___
* titanic ([notebook](notebooks/titanic/Titanic.ipynb)/[Datalore](https://datalore.jetbrains.com/view/notebook/B5YeMMONSAR78FgKQ9yJyW)) –
Let's see how the library performs on the famous Titanic dataset.
___
* Financial analysis of the top 12 German companies ([notebook](notebooks/top_12_german_companies)/[Datalore](https://datalore.jetbrains.com/report/static/KQKedA4jDrKu63O53gEN0z/MDg5pHcGvRdDVQnPLmwjuc)) –
Analyze key financial metrics for several major German companies.
___
* wine ([notebook](notebooks/wine/WineNetWIthKotlinDL.ipynb)/[Datalore](https://datalore.jetbrains.com/view/notebook/aK9vYHH8pCA8H1KbKB5WsI)) –
Wine. Kotlin DataFrame. KotlinDL. What came out of this combination can be seen in this notebook.
___
* youtube ([notebook](notebooks/youtube/Youtube.ipynb)/[Datalore](https://datalore.jetbrains.com/view/notebook/uXH0VfIM6qrrmwPJnLBi0j)) –
Explore YouTube videos with the YouTube REST API and Kotlin DataFrame.
___
* imdb sql database ([notebook](https://github.com/zaleslaw/KotlinDataFrame-SQL-Examples/blob/master/notebooks/imdb.ipynb)) – In this notebook, we use Kotlin DataFrame and the Kandy library to analyze data from [IMDB](https://datasets.imdbws.com/) (an SQL dump for the MariaDB database named "imdb" can be downloaded from this [link](https://drive.google.com/file/d/10HnOu0Yem2Tkz_34SfvDoHTVqF_8b4N7/view?usp=sharing)).

---
* Feature overviews ([notebook folder](notebooks/feature_overviews)) –
Overviews of the new features available in a given version.

The example notebooks always target the latest stable version of the library.
Notebooks compatible with the latest dev/master version are located in the [dev](notebooks/dev) folder.

These [dev versions](notebooks/dev) are tested by the
[:dataframe-jupyter module](../dataframe-jupyter/src/test/kotlin/org/jetbrains/kotlinx/dataframe/jupyter).
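To give a taste of the operation style these notebooks teach (filter rows, group, aggregate), here is a sketch written with plain Kotlin collections so it runs without any dependencies. The `Show` class and the numbers are made up for illustration; the comment shows the roughly equivalent DataFrame-style call chain.

```kotlin
// A typical notebook task, mirrored on a plain list of rows:
// keep well-rated titles, group by country, average the ratings.
data class Show(val title: String, val country: String, val rating: Double)

fun main() {
    val shows = listOf(
        Show("A", "US", 8.1),
        Show("B", "US", 6.4),
        Show("C", "UK", 7.9),
    )
    // DataFrame style: df.filter { rating >= 7.0 }.groupBy { country }.mean { rating }
    val meanByCountry = shows
        .filter { it.rating >= 7.0 }
        .groupBy { it.country }
        .mapValues { (_, rows) -> rows.map { it.rating }.average() }
    println(meanByCountry)  // {US=8.1, UK=7.9}
}
```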
@@ -0,0 +1,15 @@
*.iml
.gradle
/local.properties
/.idea/caches
/.idea/libraries
/.idea/modules.xml
/.idea/workspace.xml
/.idea/navEditor.xml
/.idea/assetWizardSettings.xml
.DS_Store
/build
/captures
.externalNativeBuild
.cxx
local.properties
@@ -0,0 +1,15 @@
# 📱 Android Example

A minimal Android project showcasing integration with **Kotlin DataFrame**.

<p align="center">
  <img src="screen.jpg" alt="App screenshot" height="320"/>
</p>

It also includes the [Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html).

We recommend using an up-to-date Android Studio for the best experience.
Proper functionality requires Android Studio Otter | 2025.2.3 or newer.

[Download Android Example](https://github.com/Kotlin/dataframe/raw/example-projects-archives/android-example.zip)
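For quick reference, these are the DataFrame-specific parts of this example's module `build.gradle.kts` (the full script appears later in this change); everything else is a stock Compose app setup.

```kotlin
// build.gradle.kts (module) – DataFrame-related parts only
plugins {
    // The compiler plugin version follows the Kotlin version
    alias(libs.plugins.dataframe)
}

dependencies {
    // Core API plus the IO modules this example uses.
    // 'dataframe-arrow' is deliberately left out: Apache Arrow
    // is not supported well on Android.
    implementation("org.jetbrains.kotlinx:dataframe-core:1.0.0-Beta4")
    implementation("org.jetbrains.kotlinx:dataframe-json:1.0.0-Beta4")
    implementation("org.jetbrains.kotlinx:dataframe-csv:1.0.0-Beta4")
}
```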
@@ -0,0 +1 @@
/build
@@ -0,0 +1,74 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget

plugins {
    alias(libs.plugins.android.application)
    alias(libs.plugins.kotlin.compose)

    // DataFrame Compiler plugin, matching the Kotlin version
    alias(libs.plugins.dataframe)
}

android {
    namespace = "com.example.myapplication"
    compileSdk = 36

    defaultConfig {
        applicationId = "com.example.myapplication"
        minSdk = 21
        targetSdk = 36
        versionCode = 1
        versionName = "1.0"

        testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner"
    }

    buildTypes {
        release {
            isMinifyEnabled = true
            proguardFiles(
                getDefaultProguardFile("proguard-android-optimize.txt"),
                "proguard-rules.pro",
            )
        }
    }
    compileOptions {
        sourceCompatibility = JavaVersion.VERSION_1_8
        targetCompatibility = JavaVersion.VERSION_1_8
    }
    kotlin {
        compilerOptions {
            jvmTarget = JvmTarget.JVM_1_8
        }
    }
    buildFeatures {
        compose = true
    }
}

dependencies {

    implementation(libs.androidx.core.ktx)
    implementation(libs.androidx.lifecycle.runtime.ktx)
    implementation(libs.androidx.activity.compose)
    implementation(platform(libs.androidx.compose.bom))
    implementation(libs.androidx.ui)
    implementation(libs.androidx.ui.graphics)
    implementation(libs.androidx.ui.tooling.preview)
    implementation(libs.androidx.material3)
    testImplementation(libs.junit)
    androidTestImplementation(libs.androidx.junit)
    androidTestImplementation(libs.androidx.espresso.core)
    androidTestImplementation(platform(libs.androidx.compose.bom))
    androidTestImplementation(libs.androidx.ui.test.junit4)
    debugImplementation(libs.androidx.ui.tooling)
    debugImplementation(libs.androidx.ui.test.manifest)

    // Core Kotlin DataFrame API, JSON and CSV IO.
    // See custom Gradle setup:
    // https://kotlin.github.io/dataframe/setupcustomgradle.html
    implementation("org.jetbrains.kotlinx:dataframe-core:1.0.0-Beta4")
    implementation("org.jetbrains.kotlinx:dataframe-json:1.0.0-Beta4")
    implementation("org.jetbrains.kotlinx:dataframe-csv:1.0.0-Beta4")
    // You can add any additional IO modules you like, except for 'dataframe-arrow'.
    // Apache Arrow is not supported well on Android.
}
@@ -0,0 +1,21 @@
# Add project specific ProGuard rules here.
# You can control the set of applied configuration files using the
# proguardFiles setting in build.gradle.
#
# For more details, see
#   http://developer.android.com/guide/developing/tools/proguard.html

# If your project uses WebView with JS, uncomment the following
# and specify the fully qualified class name to the JavaScript interface
# class:
#-keepclassmembers class fqcn.of.javascript.interface.for.webview {
#   public *;
#}

# Uncomment this to preserve the line number information for
# debugging stack traces.
#-keepattributes SourceFile,LineNumberTable

# If you keep the line number information, uncomment this to
# hide the original source file name.
#-renamesourcefileattribute SourceFile
@@ -0,0 +1,24 @@
package com.example.myapplication

import androidx.test.platform.app.InstrumentationRegistry
import androidx.test.ext.junit.runners.AndroidJUnit4

import org.junit.Test
import org.junit.runner.RunWith

import org.junit.Assert.*

/**
 * Instrumented test, which will execute on an Android device.
 *
 * See [testing documentation](http://d.android.com/tools/testing).
 */
@RunWith(AndroidJUnit4::class)
class ExampleInstrumentedTest {
    @Test
    fun useAppContext() {
        // Context of the app under test.
        val appContext = InstrumentationRegistry.getInstrumentation().targetContext
        assertEquals("com.example.myapplication", appContext.packageName)
    }
}
@@ -0,0 +1,27 @@
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools" >

    <application
        android:allowBackup="true"
        android:dataExtractionRules="@xml/data_extraction_rules"
        android:fullBackupContent="@xml/backup_rules"
        android:icon="@mipmap/ic_launcher"
        android:label="@string/app_name"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/Theme.MyApplication" >
        <activity
            android:name=".MainActivity"
            android:exported="true"
            android:label="@string/app_name"
            android:theme="@style/Theme.MyApplication" >
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />

                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>

</manifest>
@@ -0,0 +1,148 @@
package com.example.myapplication

import android.os.Bundle
import androidx.activity.ComponentActivity
import androidx.activity.compose.setContent
import androidx.activity.enableEdgeToEdge
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.lazy.LazyColumn
import androidx.compose.foundation.lazy.items
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.remember
import androidx.compose.ui.Modifier
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.tooling.preview.Preview
import androidx.compose.ui.unit.dp
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.api.filter
import org.jetbrains.kotlinx.dataframe.api.rows

@DataSchema
data class Person(
    val age: Int,
    val name: String
)

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        enableEdgeToEdge()

        val df = dataFrameOf(
            "name" to listOf("Andrei", "Nikita", "Jolan"),
            "age" to listOf(22, 16, 37)
        ).cast<Person>()

        setContent {
            MaterialTheme {
                Surface(modifier = Modifier.fillMaxSize()) {
                    DataFrameScreen(df)
                }
            }
        }
    }
}

@Preview(showBackground = true)
@Composable
fun DefaultDataFrameScreenPreview() {
    val df = dataFrameOf(
        "name" to listOf("Andrei", "Nikita", "Jolan"),
        "age" to listOf(22, 16, 37)
    ).cast<Person>()
    DataFrameScreen(df)
}

@Composable
fun DataFrameScreen(df: DataFrame<Person>) {
    val filtered = remember(df) { df.filter { age >= 20 } }
    Column(
        modifier = Modifier
            .fillMaxSize()
            .padding(top = 48.dp, start = 16.dp, end = 16.dp)
    ) {
        Text(
            text = "Kotlin DataFrame on Android",
            style = MaterialTheme.typography.headlineSmall,
            modifier = Modifier.padding(bottom = 16.dp)
        )

        Text(
            text = "df",
            modifier = Modifier
                .background(color = Color.LightGray)
                .padding(2.dp)
        )

        DataFrameTable(df)

        Text(
            text = "df.filter { age >= 20 }",
            modifier = Modifier
                .background(color = Color.LightGray)
                .padding(2.dp)
        )

        DataFrameTable(filtered)
    }
}

@Preview(showBackground = true)
@Composable
fun DefaultDataFrameTablePreview() {
    val df = dataFrameOf(
        "name" to listOf("Andrei", "Nikita", "Jolan"),
        "age" to listOf(22, 16, 37)
    ).cast<Person>()
    DataFrameTable(df)
}

@Composable
fun DataFrameTable(df: DataFrame<*>) {
    val columnNames = remember(df) { df.columnNames() }
    val rows = remember(df) { df.rows().toList() }

    LazyColumn {
        item {
            // Header
            Row {
                for (name in columnNames) {
                    Text(
                        text = name,
                        modifier = Modifier
                            .weight(1f)
                            .padding(4.dp),
                        style = MaterialTheme.typography.labelLarge
                    )
                }
            }

            Spacer(Modifier.height(4.dp))
        }
        // Rows
        items(rows) { row ->
            Row {
                for (cell in row.values()) {
                    Text(
                        text = cell.toString(),
                        modifier = Modifier
                            .weight(1f)
                            .padding(4.dp)
                    )
                }
            }
        }
    }
}
@@ -0,0 +1,170 @@
<?xml version="1.0" encoding="utf-8"?>
<vector xmlns:android="http://schemas.android.com/apk/res/android"
    android:width="108dp"
    android:height="108dp"
    android:viewportWidth="108"
    android:viewportHeight="108">
    <path
        android:fillColor="#3DDC84"
        android:pathData="M0,0h108v108h-108z" />
    <path
        android:fillColor="#00000000"
        android:pathData="M9,0L9,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M19,0L19,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M29,0L29,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M39,0L39,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M49,0L49,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M59,0L59,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M69,0L69,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M79,0L79,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M89,0L89,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M99,0L99,108"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,9L108,9"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,19L108,19"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,29L108,29"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,39L108,39"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,49L108,49"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,59L108,59"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,69L108,69"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,79L108,79"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,89L108,89"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M0,99L108,99"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M19,29L89,29"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M19,39L89,39"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M19,49L89,49"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M19,59L89,59"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M19,69L89,69"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M19,79L89,79"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M29,19L29,89"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M39,19L39,89"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M49,19L49,89"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M59,19L59,89"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M69,19L69,89"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
    <path
        android:fillColor="#00000000"
        android:pathData="M79,19L79,89"
        android:strokeWidth="0.8"
        android:strokeColor="#33FFFFFF" />
</vector>
@@ -0,0 +1,30 @@
<vector xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:aapt="http://schemas.android.com/aapt"
    android:width="108dp"
    android:height="108dp"
    android:viewportWidth="108"
    android:viewportHeight="108">
    <path android:pathData="M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z">
        <aapt:attr name="android:fillColor">
            <gradient
                android:endX="85.84757"
                android:endY="92.4963"
                android:startX="42.9492"
                android:startY="49.59793"
                android:type="linear">
                <item
                    android:color="#44000000"
                    android:offset="0.0" />
                <item
                    android:color="#00000000"
                    android:offset="1.0" />
            </gradient>
        </aapt:attr>
    </path>
    <path
        android:fillColor="#FFFFFF"
        android:fillType="nonZero"
        android:pathData="M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z"
        android:strokeWidth="1"
        android:strokeColor="#00000000" />
</vector>
@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
    <background android:drawable="@drawable/ic_launcher_background" />
    <foreground android:drawable="@drawable/ic_launcher_foreground" />
    <monochrome android:drawable="@drawable/ic_launcher_foreground" />
</adaptive-icon>
@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
    <background android:drawable="@drawable/ic_launcher_background" />
    <foreground android:drawable="@drawable/ic_launcher_foreground" />
    <monochrome android:drawable="@drawable/ic_launcher_foreground" />
</adaptive-icon>
@@ -0,0 +1,10 @@
<?xml version="1.0" encoding="utf-8"?>
<resources>
    <color name="purple_200">#FFBB86FC</color>
    <color name="purple_500">#FF6200EE</color>
    <color name="purple_700">#FF3700B3</color>
    <color name="teal_200">#FF03DAC5</color>
    <color name="teal_700">#FF018786</color>
    <color name="black">#FF000000</color>
    <color name="white">#FFFFFFFF</color>
</resources>
@@ -0,0 +1,3 @@
<resources>
    <string name="app_name">Kotlin Dataframe Simple App</string>
</resources>
@@ -0,0 +1,4 @@
<?xml version="1.0" encoding="utf-8"?>
<resources>
    <style name="Theme.MyApplication" parent="android:Theme.Material.Light.NoActionBar" />
</resources>
@@ -0,0 +1,13 @@
<?xml version="1.0" encoding="utf-8"?><!--
   Sample backup rules file; uncomment and customize as necessary.
   See https://developer.android.com/guide/topics/data/autobackup
   for details.
   Note: This file is ignored for devices older than API 31
   See https://developer.android.com/about/versions/12/backup-restore
-->
<full-backup-content>
    <!--
    <include domain="sharedpref" path="."/>
    <exclude domain="sharedpref" path="device.xml"/>
    -->
</full-backup-content>
@@ -0,0 +1,19 @@
<?xml version="1.0" encoding="utf-8"?><!--
   Sample data extraction rules file; uncomment and customize as necessary.
   See https://developer.android.com/about/versions/12/backup-restore#xml-changes
   for details.
-->
<data-extraction-rules>
    <cloud-backup>
        <!-- TODO: Use <include> and <exclude> to control what is backed up.
        <include .../>
        <exclude .../>
        -->
    </cloud-backup>
    <!--
    <device-transfer>
        <include .../>
        <exclude .../>
    </device-transfer>
    -->
</data-extraction-rules>
@@ -0,0 +1,17 @@
package com.example.myapplication

import org.junit.Test

import org.junit.Assert.*

/**
 * Example local unit test, which will execute on the development machine (host).
 *
 * See [testing documentation](http://d.android.com/tools/testing).
 */
class ExampleUnitTest {
    @Test
    fun addition_isCorrect() {
        assertEquals(4, 2 + 2)
    }
}
@@ -0,0 +1,6 @@
// Top-level build file where you can add configuration options common to all sub-projects/modules.
plugins {
    alias(libs.plugins.android.application) apply false
    alias(libs.plugins.kotlin.android) apply false
    alias(libs.plugins.kotlin.compose) apply false
}
@@ -0,0 +1,25 @@
# Project-wide Gradle settings.
# IDE (e.g. Android Studio) users:
# Gradle settings configured through the IDE *will override*
# any settings specified in this file.
# For more details on how to configure your build environment visit
# http://www.gradle.org/docs/current/userguide/build_environment.html
# Specifies the JVM arguments used for the daemon process.
# The setting is particularly useful for tweaking memory settings.
org.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8
# When configured, Gradle will run in incubating parallel mode.
# This option should only be used with decoupled projects. For more details, visit
# https://developer.android.com/r/tools/gradle-multi-project-decoupled-projects
# org.gradle.parallel=true
# AndroidX package structure to make it clearer which packages are bundled with the
# Android operating system, and which are packaged with your app's APK
# https://developer.android.com/topic/libraries/support-library/androidx-rn
android.useAndroidX=true
# Kotlin code style for this project: "official" or "obsolete":
kotlin.code.style=official
# Enables namespacing of each library's R class so that its R class includes only the
# resources declared in the library itself and none from the library's dependencies,
# thereby reducing the size of the R class for that library
android.nonTransitiveRClass=true

kotlin.incremental=false
@@ -0,0 +1,33 @@
[versions]
agp = "9.0.0"
kotlin = "2.3.0-RC2"
coreKtx = "1.10.1"
junit = "4.13.2"
junitVersion = "1.1.5"
espressoCore = "3.5.1"
lifecycleRuntimeKtx = "2.6.1"
activityCompose = "1.8.0"
composeBom = "2024.09.00"

[libraries]
androidx-core-ktx = { group = "androidx.core", name = "core-ktx", version.ref = "coreKtx" }
junit = { group = "junit", name = "junit", version.ref = "junit" }
androidx-junit = { group = "androidx.test.ext", name = "junit", version.ref = "junitVersion" }
androidx-espresso-core = { group = "androidx.test.espresso", name = "espresso-core", version.ref = "espressoCore" }
androidx-lifecycle-runtime-ktx = { group = "androidx.lifecycle", name = "lifecycle-runtime-ktx", version.ref = "lifecycleRuntimeKtx" }
androidx-activity-compose = { group = "androidx.activity", name = "activity-compose", version.ref = "activityCompose" }
androidx-compose-bom = { group = "androidx.compose", name = "compose-bom", version.ref = "composeBom" }
androidx-ui = { group = "androidx.compose.ui", name = "ui" }
androidx-ui-graphics = { group = "androidx.compose.ui", name = "ui-graphics" }
androidx-ui-tooling = { group = "androidx.compose.ui", name = "ui-tooling" }
androidx-ui-tooling-preview = { group = "androidx.compose.ui", name = "ui-tooling-preview" }
androidx-ui-test-manifest = { group = "androidx.compose.ui", name = "ui-test-manifest" }
androidx-ui-test-junit4 = { group = "androidx.compose.ui", name = "ui-test-junit4" }
androidx-material3 = { group = "androidx.compose.material3", name = "material3" }

[plugins]
android-application = { id = "com.android.application", version.ref = "agp" }
kotlin-android = { id = "org.jetbrains.kotlin.android", version.ref = "kotlin" }
kotlin-compose = { id = "org.jetbrains.kotlin.plugin.compose", version.ref = "kotlin" }
dataframe = { id = "org.jetbrains.kotlin.plugin.dataframe", version.ref = "kotlin" }
@@ -0,0 +1,7 @@
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-9.1.0-bin.zip
networkTimeout=10000
validateDistributionUrl=true
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
@@ -0,0 +1,185 @@
#!/usr/bin/env sh

#
# Copyright 2015 the original author or authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

##############################################################################
##
##  Gradle start up script for UN*X
##
##############################################################################

# Attempt to set APP_HOME
# Resolve links: $0 may be a link
PRG="$0"
# Need this for relative symlinks.
while [ -h "$PRG" ] ; do
    ls=`ls -ld "$PRG"`
    link=`expr "$ls" : '.*-> \(.*\)$'`
    if expr "$link" : '/.*' > /dev/null; then
        PRG="$link"
    else
        PRG=`dirname "$PRG"`"/$link"
    fi
done
SAVED="`pwd`"
cd "`dirname \"$PRG\"`/" >/dev/null
APP_HOME="`pwd -P`"
cd "$SAVED" >/dev/null

APP_NAME="Gradle"
APP_BASE_NAME=`basename "$0"`

# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
DEFAULT_JVM_OPTS='"-Xmx64m" "-Xms64m"'

# Use the maximum available, or set MAX_FD != -1 to use that value.
MAX_FD="maximum"

warn () {
    echo "$*"
}

die () {
    echo
    echo "$*"
    echo
    exit 1
}

# OS specific support (must be 'true' or 'false').
cygwin=false
msys=false
darwin=false
nonstop=false
case "`uname`" in
  CYGWIN* )
    cygwin=true
    ;;
  Darwin* )
    darwin=true
    ;;
  MINGW* )
    msys=true
    ;;
  NONSTOP* )
    nonstop=true
    ;;
esac

CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar


# Determine the Java command to use to start the JVM.
if [ -n "$JAVA_HOME" ] ; then
    if [ -x "$JAVA_HOME/jre/sh/java" ] ; then
        # IBM's JDK on AIX uses strange locations for the executables
        JAVACMD="$JAVA_HOME/jre/sh/java"
    else
        JAVACMD="$JAVA_HOME/bin/java"
    fi
    if [ ! -x "$JAVACMD" ] ; then
        die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME

Please set the JAVA_HOME variable in your environment to match the
location of your Java installation."
    fi
else
    JAVACMD="java"
    which java >/dev/null 2>&1 || die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.

Please set the JAVA_HOME variable in your environment to match the
location of your Java installation."
fi

# Increase the maximum file descriptors if we can.
if [ "$cygwin" = "false" -a "$darwin" = "false" -a "$nonstop" = "false" ] ; then
    MAX_FD_LIMIT=`ulimit -H -n`
    if [ $? -eq 0 ] ; then
        if [ "$MAX_FD" = "maximum" -o "$MAX_FD" = "max" ] ; then
            MAX_FD="$MAX_FD_LIMIT"
        fi
        ulimit -n $MAX_FD
        if [ $? -ne 0 ] ; then
            warn "Could not set maximum file descriptor limit: $MAX_FD"
        fi
    else
        warn "Could not query maximum file descriptor limit: $MAX_FD_LIMIT"
    fi
fi

# For Darwin, add options to specify how the application appears in the dock
if $darwin; then
    GRADLE_OPTS="$GRADLE_OPTS \"-Xdock:name=$APP_NAME\" \"-Xdock:icon=$APP_HOME/media/gradle.icns\""
fi

# For Cygwin or MSYS, switch paths to Windows format before running java
if [ "$cygwin" = "true" -o "$msys" = "true" ] ; then
    APP_HOME=`cygpath --path --mixed "$APP_HOME"`
    CLASSPATH=`cygpath --path --mixed "$CLASSPATH"`

    JAVACMD=`cygpath --unix "$JAVACMD"`

    # We build the pattern for arguments to be converted via cygpath
    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`
    SEP=""
    for dir in $ROOTDIRSRAW ; do
        ROOTDIRS="$ROOTDIRS$SEP$dir"
        SEP="|"
    done
    OURCYGPATTERN="(^($ROOTDIRS))"
    # Add a user-defined pattern to the cygpath arguments
    if [ "$GRADLE_CYGPATTERN" != "" ] ; then
        OURCYGPATTERN="$OURCYGPATTERN|($GRADLE_CYGPATTERN)"
    fi
    # Now convert the arguments - kludge to limit ourselves to /bin/sh
    i=0
    for arg in "$@" ; do
        CHECK=`echo "$arg"|egrep -c "$OURCYGPATTERN" -`
        CHECK2=`echo "$arg"|egrep -c "^-"` ### Determine if an option

        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then ### Added a condition
            eval `echo args$i`=`cygpath --path --ignore --mixed "$arg"`
        else
            eval `echo args$i`="\"$arg\""
        fi
        i=`expr $i + 1`
    done
    case $i in
        0) set -- ;;
        1) set -- "$args0" ;;
        2) set -- "$args0" "$args1" ;;
        3) set -- "$args0" "$args1" "$args2" ;;
        4) set -- "$args0" "$args1" "$args2" "$args3" ;;
        5) set -- "$args0" "$args1" "$args2" "$args3" "$args4" ;;
        6) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" ;;
        7) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" ;;
        8) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" ;;
        9) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" "$args8" ;;
    esac
fi

# Escape application args
save () {
    for i do printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/" ; done
    echo " "
}
APP_ARGS=`save "$@"`

# Collect all arguments for the java command, following the shell quoting and substitution rules
eval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS "\"-Dorg.gradle.appname=$APP_BASE_NAME\"" -classpath "\"$CLASSPATH\"" org.gradle.wrapper.GradleWrapperMain "$APP_ARGS"

exec "$JAVACMD" "$@"
@@ -0,0 +1,89 @@
@rem
@rem Copyright 2015 the original author or authors.
@rem
@rem Licensed under the Apache License, Version 2.0 (the "License");
@rem you may not use this file except in compliance with the License.
@rem You may obtain a copy of the License at
@rem
@rem      https://www.apache.org/licenses/LICENSE-2.0
@rem
@rem Unless required by applicable law or agreed to in writing, software
@rem distributed under the License is distributed on an "AS IS" BASIS,
@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@rem See the License for the specific language governing permissions and
@rem limitations under the License.
@rem

@if "%DEBUG%" == "" @echo off
@rem ##########################################################################
@rem
@rem  Gradle startup script for Windows
@rem
@rem ##########################################################################

@rem Set local scope for the variables with windows NT shell
if "%OS%"=="Windows_NT" setlocal

set DIRNAME=%~dp0
if "%DIRNAME%" == "" set DIRNAME=.
set APP_BASE_NAME=%~n0
set APP_HOME=%DIRNAME%

@rem Resolve any "." and ".." in APP_HOME to make it shorter.
for %%i in ("%APP_HOME%") do set APP_HOME=%%~fi

@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
set DEFAULT_JVM_OPTS="-Xmx64m" "-Xms64m"

@rem Find java.exe
if defined JAVA_HOME goto findJavaFromJavaHome

set JAVA_EXE=java.exe
%JAVA_EXE% -version >NUL 2>&1
if "%ERRORLEVEL%" == "0" goto execute

echo.
echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.

goto fail

:findJavaFromJavaHome
set JAVA_HOME=%JAVA_HOME:"=%
set JAVA_EXE=%JAVA_HOME%/bin/java.exe

if exist "%JAVA_EXE%" goto execute

echo.
echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.

goto fail

:execute
@rem Setup the command line

set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar


@rem Execute Gradle
"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %*

:end
@rem End local scope for the variables with windows NT shell
if "%ERRORLEVEL%"=="0" goto mainEnd

:fail
rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of
rem the _cmd.exe /c_ return code!
if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1
exit /b 1

:mainEnd
if "%OS%"=="Windows_NT" endlocal

:omega
(binary image file added: 27 KiB)
@@ -0,0 +1,23 @@
pluginManagement {
    repositories {
        google {
            content {
                includeGroupByRegex("com\\.android.*")
                includeGroupByRegex("com\\.google.*")
                includeGroupByRegex("androidx.*")
            }
        }
        mavenCentral()
        gradlePluginPortal()
    }
}
dependencyResolutionManagement {
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
    repositories {
        google()
        mavenCentral()
    }
}

rootProject.name = "android-example"
include(":app")
@@ -0,0 +1,69 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget
import org.jetbrains.kotlinx.dataframe.api.JsonPath

plugins {
    application
    kotlin("jvm")

    id("org.jetbrains.kotlinx.dataframe")

    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenLocal() // in case of local dataframe development
    mavenCentral()
}

dependencies {
    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
    implementation(project(":"))

    // explicitly depend on openApi
    implementation(projects.dataframeOpenapi)
}

kotlin {
    compilerOptions {
        jvmTarget = JvmTarget.JVM_1_8
        freeCompilerArgs.add("-Xjdk-release=8")
    }
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_1_8.toString()
    targetCompatibility = JavaVersion.VERSION_1_8.toString()
    options.release.set(8)
}

dataframes {
    // Metrics, no key-value paths
    schema {
        data = "src/main/resources/apiGuruMetrics.json"
        name = "org.jetbrains.kotlinx.dataframe.examples.openapi.gradle.noOpenApi.MetricsNoKeyValue"
    }

    // Metrics, with key-value paths
    schema {
        data = "src/main/resources/apiGuruMetrics.json"
        name = "org.jetbrains.kotlinx.dataframe.examples.openapi.gradle.noOpenApi.MetricsKeyValue"
        jsonOptions {
            keyValuePaths = listOf(
                JsonPath()
                    .append("datasets")
                    .appendArrayWithWildcard()
                    .append("data"),
            )
        }
    }

    // ApiGuru, OpenApi
    schema {
        data = "src/main/resources/ApiGuruOpenApi.yaml"
        // name is still needed to get the full path
        name = "org.jetbrains.kotlinx.dataframe.examples.openapi.ApiGuruOpenApiGradle"
    }

    enableExperimentalOpenApi = true
}
@@ -0,0 +1,77 @@
@file:ImportDataSchema(
    // Using just a sample since the full file will cause OOM errors
    path = "src/main/resources/ApiGuruSample.json",
    name = "APIsNoKeyValue",
    enableExperimentalOpenApi = true,
)
@file:ImportDataSchema(
    // Now we can use the full file; either a URL or a local path
    path = "src/main/resources/api_guru_list.json",
    name = "APIsKeyValue",
    jsonOptions = JsonOptions(
        // paths in the json that should be converted to KeyValue columns
        keyValuePaths = ["""$""", """$[*]["versions"]"""],
    ),
    enableExperimentalOpenApi = true,
)

package org.jetbrains.kotlinx.dataframe.examples.openapi

import org.jetbrains.kotlinx.dataframe.annotations.ImportDataSchema
import org.jetbrains.kotlinx.dataframe.annotations.JsonOptions
import org.jetbrains.kotlinx.dataframe.api.first
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.examples.openapi.gradle.noOpenApi.MetricsKeyValue
import org.jetbrains.kotlinx.dataframe.examples.openapi.gradle.noOpenApi.MetricsNoKeyValue

/**
 * In this file we'll demonstrate how to use the jsonOption `keyValuePaths`
 * with both the Gradle and KSP plugins, and what it does.
 */
fun main() {
    gradleNoKeyValue()
    gradleKeyValue()

    kspNoKeyValue()
    kspKeyValue()
}

/**
 * Gradle example of reading a JSON file with no key-value pairs.
 * Ctrl+Click on [MetricsNoKeyValue] to see the generated code.
 */
private fun gradleNoKeyValue() {
    val df = MetricsNoKeyValue.readJson("examples/idea-examples/json/src/main/resources/apiGuruMetrics.json")
    df.print(columnTypes = true, title = true, borders = true)
}

/**
 * Gradle example of reading a JSON file with key-value pairs.
 * Ctrl+Click on [MetricsKeyValue] to see the generated code.
 */
private fun gradleKeyValue() {
    val df = MetricsKeyValue.readJson("examples/idea-examples/json/src/main/resources/apiGuruMetrics.json")
    df.print(columnTypes = true, title = true, borders = true)
}

/**
 * KSP example of reading a JSON file with no key-value pairs.
 * Ctrl+Click on [APIsNoKeyValue] to see the generated code.
 *
 * Note the many generated interfaces. You can imagine larger files crashing the code generator.
 */
private fun kspNoKeyValue() {
    val df = APIsNoKeyValue.readJson("examples/idea-examples/json/src/main/resources/ApiGuruSample.json")
    df.print(columnTypes = true, title = true, borders = true)
}

/**
 * KSP example of reading a JSON file with key-value pairs.
 * Ctrl+Click on [APIsKeyValue] to see the generated code.
 */
private fun kspKeyValue() {
    val df = APIsKeyValue.readJson("examples/idea-examples/json/src/main/resources/ApiGuruSample.json")
        .value.first()

    df.print(columnTypes = true, title = true, borders = true)
}
@@ -0,0 +1,61 @@
@file:ImportDataSchema(
    path = "src/main/resources/ApiGuruOpenApi.yaml",
    name = "ApiGuruOpenApiKsp",
    enableExperimentalOpenApi = true,
)
@file:ImportDataSchema(
    path = "https://raw.githubusercontent.com/1Password/connect/aac5e44b27570036e6b56e9f5b2a398a824ae5fc/docs/openapi/spec.yaml",
    name = "OnePassword",
    enableExperimentalOpenApi = true,
)

package org.jetbrains.kotlinx.dataframe.examples.openapi

import org.jetbrains.kotlinx.dataframe.annotations.ImportDataSchema
import org.jetbrains.kotlinx.dataframe.api.any
import org.jetbrains.kotlinx.dataframe.api.filter
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.value

/**
 * In this file we'll demonstrate how to use OpenApi schemas
 * to generate DataSchemas and how to use them.
 */
fun main() {
    gradle()
    ksp()
}

/**
 * Gradle example of reading JSON files with OpenApi schemas.
 * Ctrl+Click on [GradleAPIs] or [GradleMetrics] to see the generated code.
 *
 * (We use import aliases to avoid clashes with the KSP example)
 */
private fun gradle() {
    val apis = ApiGuruOpenApiGradle.APIs.readJson("examples/idea-examples/json/src/main/resources/ApiGuruSample.json")
    apis.print(columnTypes = true, title = true, borders = true)

    apis.filter {
        value.versions.value.any {
            (it.updated ?: it.added).year >= 2021
        }
    }

    val metrics =
        ApiGuruOpenApiGradle.Metrics.readJson("examples/idea-examples/json/src/main/resources/apiGuruMetrics.json")
    metrics.print(columnTypes = true, title = true, borders = true)
}

/**
 * KSP example of reading JSON files with OpenApi schemas.
 * Ctrl+Click on [APIs] or [Metrics] to see the generated code.
 */
private fun ksp() {
    val apis = ApiGuruOpenApiKsp.APIs.readJson("examples/idea-examples/json/src/main/resources/ApiGuruSample.json")
    apis.print(columnTypes = true, title = true, borders = true)

    val metrics =
        ApiGuruOpenApiKsp.Metrics.readJson("examples/idea-examples/json/src/main/resources/apiGuruMetrics.json")
    metrics.print(columnTypes = true, title = true, borders = true)
}
@@ -0,0 +1,304 @@
# DEMO for DataFrame, this might differ from the actual API (it's updated a bit)
openapi: 3.0.0
info:
  version: 2.0.2
  title: APIs.guru
  description: >
    Wikipedia for Web APIs. Repository of API specs in OpenAPI format.


    **Warning**: If you want to be notified about changes in advance please join our [Slack channel](https://join.slack.com/t/mermade/shared_invite/zt-g78g7xir-MLE_CTCcXCdfJfG3CJe9qA).


    Client sample: [[Demo]](https://apis.guru/simple-ui) [[Repo]](https://github.com/APIs-guru/simple-ui)
  contact:
    name: APIs.guru
    url: https://APIs.guru
    email: mike.ralphson@gmail.com
  license:
    name: CC0 1.0
    url: https://github.com/APIs-guru/openapi-directory#licenses
  x-logo:
    url: https://apis.guru/branding/logo_vertical.svg
externalDocs:
  url: https://github.com/APIs-guru/openapi-directory/blob/master/API.md
security: [ ]
tags:
  - name: APIs
    description: Actions relating to APIs in the collection
paths:
  /list.json:
    get:
      operationId: listAPIs
      tags:
        - APIs
      summary: List all APIs
      description: >
        List all APIs in the directory.

        Returns links to OpenAPI specification for each API in the directory.

        If API exist in multiple versions `preferred` one is explicitly marked.


        Some basic info from OpenAPI spec is cached inside each object.

        This allows to generate some simple views without need to fetch OpenAPI spec for each API.
      responses:
        "200":
          description: OK
          content:
            application/json; charset=utf-8:
              schema:
                $ref: "#/components/schemas/APIs"
            application/json:
              schema:
                $ref: "#/components/schemas/APIs"
  /metrics.json:
    get:
      operationId: getMetrics
      summary: Get basic metrics
      description: >
        Some basic metrics for the entire directory.

        Just stunning numbers to put on a front page and are intended purely for WoW effect :)
      tags:
        - APIs
      responses:
        "200":
          description: OK
          content:
            application/json; charset=utf-8:
              schema:
                $ref: "#/components/schemas/Metrics"
            application/json:
              schema:
                $ref: "#/components/schemas/Metrics"
components:
  schemas:
    APIs:
      description: |
        List of API details.
        It is a JSON object with API IDs(`<provider>[:<service>]`) as keys.
      type: object
      additionalProperties:
        $ref: "#/components/schemas/API"
      minProperties: 1
      example:
        googleapis.com:drive:
          added: 2015-02-22T20:00:45.000Z
          preferred: v3
          versions:
            v2:
              added: 2015-02-22T20:00:45.000Z
              info:
                title: Drive
                version: v2
                x-apiClientRegistration:
                  url: https://console.developers.google.com
                x-logo:
                  url: https://api.apis.guru/v2/cache/logo/https_www.gstatic.com_images_icons_material_product_2x_drive_32dp.png
                x-origin:
                  format: google
                  url: https://www.googleapis.com/discovery/v1/apis/drive/v2/rest
                  version: v1
                x-preferred: false
                x-providerName: googleapis.com
                x-serviceName: drive
              swaggerUrl: https://api.apis.guru/v2/specs/googleapis.com/drive/v2/swagger.json
              swaggerYamlUrl: https://api.apis.guru/v2/specs/googleapis.com/drive/v2/swagger.yaml
              updated: 2016-06-17T00:21:44.000Z
            v3:
              added: 2015-12-12T00:25:13.000Z
              info:
                title: Drive
                version: v3
                x-apiClientRegistration:
                  url: https://console.developers.google.com
                x-logo:
                  url: https://api.apis.guru/v2/cache/logo/https_www.gstatic.com_images_icons_material_product_2x_drive_32dp.png
                x-origin:
                  format: google
                  url: https://www.googleapis.com/discovery/v1/apis/drive/v3/rest
                  version: v1
                x-preferred: true
                x-providerName: googleapis.com
                x-serviceName: drive
              swaggerUrl: https://api.apis.guru/v2/specs/googleapis.com/drive/v3/swagger.json
              swaggerYamlUrl: https://api.apis.guru/v2/specs/googleapis.com/drive/v3/swagger.yaml
              updated: 2016-06-17T00:21:44.000Z
    API:
      description: Meta information about API
      type: object
      required:
        - added
        - preferred
        - versions
      properties:
        added:
          description: Timestamp when the API was first added to the directory
          type: string
          format: date-time
        preferred:
          description: Recommended version
          type: string
        versions:
          description: List of supported versions of the API
          type: object
          additionalProperties:
            $ref: "#/components/schemas/ApiVersion"
          minProperties: 1
      additionalProperties: false
    ApiVersion:
      type: object
      required:
        - added
        # - updated apparently not required!
        - swaggerUrl
        - swaggerYamlUrl
        - info
        - openapiVer
      properties:
        added:
          description: Timestamp when the version was added
          type: string
          format: date-time
        updated: # apparently not required!
          description: Timestamp when the version was updated
          type: string
          format: date-time
        swaggerUrl:
          description: URL to OpenAPI definition in JSON format
          type: string
          format: url
        swaggerYamlUrl:
          description: URL to OpenAPI definition in YAML format
          type: string
          format: url
        info:
          description: Copy of `info` section from OpenAPI definition
          type: object
          minProperties: 1
        externalDocs:
          description: Copy of `externalDocs` section from OpenAPI definition
          type: object
          minProperties: 1
        openapiVer:
          description: OpenAPI version
          type: string
      additionalProperties: false

    Metrics:
      description: List of basic metrics
      type: object
      required:
        - numSpecs
        - numAPIs
        - numEndpoints
        - unreachable
        - invalid
        - unofficial
        - fixes
        - fixedPct
        - datasets
        - stars
        - issues
        - thisWeek
      properties:
        numSpecs:
          description: Number of API specifications including different versions of the
            same API
          type: integer
          minimum: 1
        numAPIs:
          description: Number of APIs
          type: integer
          minimum: 1
        numEndpoints:
          description: Total number of endpoints inside all specifications
          type: integer
          minimum: 1
        unreachable:
          description: Number of unreachable specifications
          type: integer
          minimum: 0
        invalid:
          description: Number of invalid specifications
          type: integer
          minimum: 0
        unofficial:
          description: Number of unofficial specifications
          type: integer
          minimum: 0
        fixes:
          description: Number of fixes applied to specifications
          type: integer
          minimum: 0
        fixedPct:
          description: Percentage of fixed specifications
          type: number
          minimum: 0
          maximum: 100
        datasets:
          description: An overview of the datasets used to gather the APIs
          type: array
          items:
            description: A single metric per dataset
            type: object
            required:
              - title
              - data
            properties:
              title:
                description: Title of the metric
                type: string
              data:
                description: Value of the metric per dataset
                type: object
                additionalProperties:
                  type: integer
                  minimum: 0
        stars:
          description: Number of stars on GitHub
          type: integer
          minimum: 0
        issues:
          description: Number of issues on GitHub
          type: integer
          minimum: 0
        thisWeek:
          description: Number of new specifications added/updated this week
          type: object
          required:
            - added
            - updated
          properties:
            added:
              description: Number of new specifications added this week
              type: integer
              minimum: 0
            updated:
              description: Number of specifications updated this week
              type: integer
              minimum: 0
      additionalProperties: false
      example:
        numSpecs: 1000
        numAPIs: 100
        numEndpoints: 10000
        unreachable: 10
        invalid: 10
        unofficial: 10
        fixes: 10
        fixedPct: 10
        datasets:
          - title: providerCount
            data:
              "a.com": 10
              "b.com": 20
              "c.com": 30
        stars: 1000
        issues: 100
        thisWeek:
          added: 10
          updated: 10
@@ -0,0 +1,717 @@
|
||||
{
|
||||
"1forge.com": {
|
||||
"added": "2017-05-30T08:34:14.000Z",
|
||||
"preferred": "0.0.1",
|
||||
"versions": {
|
||||
"0.0.1": {
|
||||
"added": "2017-05-30T08:34:14.000Z",
|
||||
"info": {
|
||||
"contact": {
|
||||
"email": "contact@1forge.com",
|
||||
"name": "1Forge",
|
||||
"url": "http://1forge.com"
|
||||
},
|
||||
"description": "Stock and Forex Data and Realtime Quotes",
|
||||
"title": "1Forge Finance APIs",
|
||||
"version": "0.0.1",
|
||||
"x-apisguru-categories": [
|
||||
"financial"
|
||||
],
|
||||
"x-logo": {
|
||||
"backgroundColor": "#24292e",
|
||||
"url": "https://api.apis.guru/v2/cache/logo/https_1forge.com_assets_images_f-blue.svg"
|
||||
},
|
||||
"x-origin": [
|
||||
{
|
||||
"format": "swagger",
|
||||
"url": "http://1forge.com/openapi.json",
|
||||
"version": "2.0"
|
||||
}
|
||||
],
|
||||
"x-providerName": "1forge.com"
|
||||
},
|
||||
"updated": "2017-06-27T16:49:57.000Z",
|
||||
"swaggerUrl": "https://api.apis.guru/v2/specs/1forge.com/0.0.1/swagger.json",
|
||||
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/1forge.com/0.0.1/swagger.yaml",
|
||||
"openapiVer": "2.0"
|
||||
}
|
||||
}
|
||||
},
|
||||
"1password.com:events": {
|
||||
"added": "2021-07-19T10:17:09.188Z",
|
||||
"preferred": "1.0.0",
|
||||
"versions": {
|
||||
"1.0.0": {
|
||||
"added": "2021-07-19T10:17:09.188Z",
|
||||
"info": {
|
||||
"description": "1Password Events API Specification.",
|
||||
"title": "Events API",
|
||||
"version": "1.0.0",
|
||||
"x-apisguru-categories": [
|
||||
"security"
|
||||
],
|
||||
"x-logo": {
|
||||
"url": "https://api.apis.guru/v2/cache/logo/https_upload.wikimedia.org_wikipedia_commons_thumb_e_e3_1password-logo.svg_1280px-1password-logo.svg.png"
|
||||
},
|
||||
"x-origin": [
|
||||
{
|
||||
"format": "openapi",
|
||||
"url": "https://i.1password.com/media/1password-events-reporting/1password-events-api.yaml",
|
||||
"version": "3.0"
|
||||
}
|
||||
],
|
||||
"x-providerName": "1password.com",
|
||||
"x-serviceName": "events"
|
||||
},
|
||||
"updated": "2021-07-22T10:32:52.774Z",
|
||||
"swaggerUrl": "https://api.apis.guru/v2/specs/1password.com/events/1.0.0/openapi.json",
|
||||
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/1password.com/events/1.0.0/openapi.yaml",
|
||||
"openapiVer": "3.0.0"
|
||||
}
|
||||
}
|
||||
},
|
||||
"1password.local:connect": {
|
||||
"added": "2021-04-16T15:56:45.939Z",
|
||||
"preferred": "1.3.0",
|
||||
"versions": {
|
||||
"1.3.0": {
|
||||
"added": "2021-04-16T15:56:45.939Z",
|
||||
"info": {
|
||||
"contact": {
|
||||
"email": "support@1password.com",
|
||||
"name": "1Password Integrations",
|
||||
"url": "https://support.1password.com/"
|
||||
},
|
||||
"description": "REST API interface for 1Password Connect.",
|
||||
"title": "1Password Connect",
|
||||
"version": "1.3.0",
|
||||
"x-apisguru-categories": [
|
||||
"security"
|
||||
],
|
||||
"x-logo": {
|
||||
"url": "https://api.apis.guru/v2/cache/logo/https_upload.wikimedia.org_wikipedia_commons_thumb_e_e3_1password-logo.svg_1280px-1password-logo.svg.png"
|
||||
},
|
||||
"x-origin": [
|
||||
{
|
||||
"format": "openapi",
|
||||
"url": "https://i.1password.com/media/1password-connect/1password-connect-api.yaml",
|
||||
"version": "3.0"
|
||||
}
|
||||
],
|
||||
"x-providerName": "1password.local",
|
||||
"x-serviceName": "connect"
|
||||
},
|
||||
"updated": "2021-07-26T08:51:53.432Z",
|
||||
"swaggerUrl": "https://api.apis.guru/v2/specs/1password.local/connect/1.3.0/openapi.json",
|
||||
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/1password.local/connect/1.3.0/openapi.yaml",
|
||||
"openapiVer": "3.0.2"
|
||||
}
|
||||
}
|
||||
},
|
||||
"6-dot-authentiqio.appspot.com": {
|
||||
"added": "2017-03-15T14:45:58.000Z",
|
||||
"preferred": "6",
|
||||
"versions": {
|
||||
"6": {
|
||||
"added": "2017-03-15T14:45:58.000Z",
|
||||
"info": {
|
||||
"contact": {
|
||||
"email": "hello@authentiq.com",
|
||||
"name": "Authentiq team",
|
||||
"url": "http://authentiq.io/support"
|
||||
},
|
||||
"description": "Strong authentication, without the passwords.",
|
||||
"license": {
|
||||
"name": "Apache 2.0",
|
||||
"url": "http://www.apache.org/licenses/LICENSE-2.0.html"
|
||||
},
|
||||
"termsOfService": "http://authentiq.com/terms/",
|
||||
"title": "Authentiq API",
|
||||
"version": "6",
|
||||
"x-apisguru-categories": [
|
||||
"security"
|
||||
],
|
||||
"x-logo": {
|
||||
"backgroundColor": "#F26641",
"url": "https://api.apis.guru/v2/cache/logo/https_www.authentiq.com_theme_images_authentiq-logo-a-inverse.svg"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/AuthentiqID/authentiq-docs/master/docs/swagger/issuer.yaml",
"version": "3.0"
}
],
"x-providerName": "6-dot-authentiqio.appspot.com"
},
"updated": "2021-06-21T12:16:53.715Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/6-dot-authentiqio.appspot.com/6/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/6-dot-authentiqio.appspot.com/6/openapi.yaml",
"openapiVer": "3.0.0"
}
}
},
"ably.io:platform": {
"added": "2019-07-13T11:28:07.000Z",
"preferred": "1.1.0",
"versions": {
"1.1.0": {
"added": "2019-07-13T11:28:07.000Z",
"info": {
"contact": {
"email": "support@ably.io",
"name": "Ably Support",
"url": "https://www.ably.io/contact",
"x-twitter": "ablyrealtime"
},
"description": "The [REST API specification](https://www.ably.io/documentation/rest-api) for Ably.",
"title": "Platform API",
"version": "1.1.0",
"x-apisguru-categories": [
"cloud"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_ablyrealtime_profile_image"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/ably/open-specs/main/definitions/platform-v1.yaml",
"version": "3.0"
}
],
"x-providerName": "ably.io",
"x-serviceName": "platform"
},
"updated": "2021-07-26T09:42:14.653Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/ably.io/platform/1.1.0/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/ably.io/platform/1.1.0/openapi.yaml",
"openapiVer": "3.0.1"
}
}
},
"ably.net:control": {
"added": "2021-07-26T09:45:31.536Z",
"preferred": "1.0.14",
"versions": {
"1.0.14": {
"added": "2021-07-26T09:45:31.536Z",
"info": {
"contact": {
"x-twitter": "ablyrealtime"
},
"description": "Use the Control API to manage your applications, namespaces, keys, queues, rules, and more.\n\nDetailed information on using this API can be found in the Ably <a href=\"https://ably.com/documentation/control-api\">developer documentation</a>.\n\nControl API is currently in Beta.\n",
"title": "Control API v1",
"version": "1.0.14",
"x-apisguru-categories": [
"cloud"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_ablyrealtime_profile_image"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/ably/open-specs/main/definitions/control-v1.yaml",
"version": "3.0"
}
],
"x-providerName": "ably.net",
"x-serviceName": "control"
},
"updated": "2021-07-26T09:47:48.565Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/ably.net/control/1.0.14/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/ably.net/control/1.0.14/openapi.yaml",
"openapiVer": "3.0.1"
}
}
},
"abstractapi.com:geolocation": {
"added": "2021-04-14T17:12:40.648Z",
"preferred": "1.0.0",
"versions": {
"1.0.0": {
"added": "2021-04-14T17:12:40.648Z",
"info": {
"description": "Abstract IP geolocation API allows developers to retrieve the region, country and city behind any IP worldwide. The API covers the geolocation of IPv4 and IPv6 addresses in 180+ countries worldwide. Extra information can be retrieved like the currency, flag or language associated to an IP.",
"title": "IP geolocation API",
"version": "1.0.0",
"x-apisguru-categories": [
"location"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_global-uploads.webflow.com_5ebbd0a566a3996636e55959_5ec2ba29feeeb05d69160e7b_webclip.png"
},
"x-origin": [
{
"format": "openapi",
"url": "https://documentation.abstractapi.com/ip-geolocation-openapi.json",
"version": "3.0"
}
],
"x-providerName": "abstractapi.com",
"x-serviceName": "geolocation"
},
"externalDocs": {
"description": "API Documentation",
"url": "https://www.abstractapi.com/ip-geolocation-api#docs"
},
"updated": "2021-06-21T12:16:53.715Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/abstractapi.com/geolocation/1.0.0/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/abstractapi.com/geolocation/1.0.0/openapi.yaml",
"openapiVer": "3.0.1"
}
}
},
"adafruit.com": {
"added": "2018-02-10T10:41:43.000Z",
"preferred": "2.0.0",
"versions": {
"2.0.0": {
"added": "2018-02-10T10:41:43.000Z",
"info": {
"description": "### The Internet of Things for Everyone\n\nThe Adafruit IO HTTP API provides access to your Adafruit IO data from any programming language or hardware environment that can speak HTTP. The easiest way to get started is with [an Adafruit IO learn guide](https://learn.adafruit.com/series/adafruit-io-basics) and [a simple Internet of Things capable device like the Feather Huzzah](https://www.adafruit.com/product/2821).\n\nThis API documentation is hosted on GitHub Pages and is available at [https://github.com/adafruit/io-api](https://github.com/adafruit/io-api). For questions or comments visit the [Adafruit IO Forums](https://forums.adafruit.com/viewforum.php?f=56) or the [adafruit-io channel on the Adafruit Discord server](https://discord.gg/adafruit).\n\n#### Authentication\n\nAuthentication for every API request happens through the `X-AIO-Key` header or query parameter and your IO API key. A simple cURL request to get all available feeds for a user with the username \"io_username\" and the key \"io_key_12345\" could look like this:\n\n $ curl -H \"X-AIO-Key: io_key_12345\" https://io.adafruit.com/api/v2/io_username/feeds\n\nOr like this:\n\n $ curl \"https://io.adafruit.com/api/v2/io_username/feeds?X-AIO-Key=io_key_12345\n\nUsing the node.js [request](https://github.com/request/request) library, IO HTTP requests are as easy as:\n\n```js\nvar request = require('request');\n\nvar options = {\n url: 'https://io.adafruit.com/api/v2/io_username/feeds',\n headers: {\n 'X-AIO-Key': 'io_key_12345',\n 'Content-Type': 'application/json'\n }\n};\n\nfunction callback(error, response, body) {\n if (!error && response.statusCode == 200) {\n var feeds = JSON.parse(body);\n console.log(feeds.length + \" FEEDS AVAILABLE\");\n\n feeds.forEach(function (feed) {\n console.log(feed.name, feed.key);\n })\n }\n}\n\nrequest(options, callback);\n```\n\nUsing the ESP8266 Arduino HTTPClient library, an HTTPS GET request would look like this (replacing `---` with your own 
values in the appropriate locations):\n\n```arduino\n/// based on\n/// https://github.com/esp8266/Arduino/blob/master/libraries/ESP8266HTTPClient/examples/Authorization/Authorization.ino\n\n#include <Arduino.h>\n#include <ESP8266WiFi.h>\n#include <ESP8266WiFiMulti.h>\n#include <ESP8266HTTPClient.h>\n\nESP8266WiFiMulti WiFiMulti;\n\nconst char* ssid = \"---\";\nconst char* password = \"---\";\n\nconst char* host = \"io.adafruit.com\";\n\nconst char* io_key = \"---\";\nconst char* path_with_username = \"/api/v2/---/dashboards\";\n\n// Use web browser to view and copy\n// SHA1 fingerprint of the certificate\nconst char* fingerprint = \"77 00 54 2D DA E7 D8 03 27 31 23 99 EB 27 DB CB A5 4C 57 18\";\n\nvoid setup() {\n Serial.begin(115200);\n\n for(uint8_t t = 4; t > 0; t--) {\n Serial.printf(\"[SETUP] WAIT %d...\\n\", t);\n Serial.flush();\n delay(1000);\n }\n\n WiFi.mode(WIFI_STA);\n WiFiMulti.addAP(ssid, password);\n\n // wait for WiFi connection\n while(WiFiMulti.run() != WL_CONNECTED) {\n Serial.print('.');\n delay(1000);\n }\n\n Serial.println(\"[WIFI] connected!\");\n\n HTTPClient http;\n\n // start request with URL and TLS cert fingerprint for verification\n http.begin(\"https://\" + String(host) + String(path_with_username), fingerprint);\n\n // IO API authentication\n http.addHeader(\"X-AIO-Key\", io_key);\n\n // start connection and send HTTP header\n int httpCode = http.GET();\n\n // httpCode will be negative on error\n if(httpCode > 0) {\n // HTTP header has been send and Server response header has been handled\n Serial.printf(\"[HTTP] GET response: %d\\n\", httpCode);\n\n // HTTP 200 OK\n if(httpCode == HTTP_CODE_OK) {\n String payload = http.getString();\n Serial.println(payload);\n }\n\n http.end();\n }\n}\n\nvoid loop() {}\n```\n\n#### Client Libraries\n\nWe have client libraries to help you get started with your project: [Python](https://github.com/adafruit/io-client-python), [Ruby](https://github.com/adafruit/io-client-ruby), [Arduino 
C++](https://github.com/adafruit/Adafruit_IO_Arduino), [Javascript](https://github.com/adafruit/adafruit-io-node), and [Go](https://github.com/adafruit/io-client-go) are available. They're all open source, so if they don't already do what you want, you can fork and add any feature you'd like.\n\n",
"title": "Adafruit IO REST API",
"version": "2.0.0",
"x-apisguru-categories": [
"iot"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_adafruit_profile_image.jpeg"
},
"x-origin": [
{
"format": "swagger",
"url": "https://raw.githubusercontent.com/adafruit/io-api/gh-pages/v2.json",
"version": "2.0"
}
],
"x-providerName": "adafruit.com"
},
"updated": "2021-06-21T12:16:53.715Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adafruit.com/2.0.0/swagger.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adafruit.com/2.0.0/swagger.yaml",
"openapiVer": "2.0"
}
}
},
"adobe.com:aem": {
"added": "2019-01-03T07:01:34.000Z",
"preferred": "3.5.0-pre.0",
"versions": {
"3.5.0-pre.0": {
"added": "2019-01-03T07:01:34.000Z",
"info": {
"contact": {
"email": "opensource@shinesolutions.com",
"name": "Shine Solutions",
"url": "http://shinesolutions.com",
"x-twitter": "Adobe"
},
"description": "Swagger AEM is an OpenAPI specification for Adobe Experience Manager (AEM) API",
"title": "Adobe Experience Manager (AEM) API",
"version": "3.5.0-pre.0",
"x-apisguru-categories": [
"marketing"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_Adobe_profile_image.jpeg"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/shinesolutions/swagger-aem/master/conf/api.yml",
"version": "3.0"
}
],
"x-providerName": "adobe.com",
"x-serviceName": "aem",
"x-unofficialSpec": true
},
"updated": "2021-06-21T12:16:53.715Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adobe.com/aem/3.5.0-pre.0/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adobe.com/aem/3.5.0-pre.0/openapi.yaml",
"openapiVer": "3.0.0"
}
}
},
"adyen.com:AccountService": {
"added": "2020-11-03T12:51:40.318Z",
"preferred": "6",
"versions": {
"6": {
"added": "2020-11-03T12:51:40.318Z",
"info": {
"contact": {
"email": "developer-experience@adyen.com",
"name": "Adyen Developer Experience team",
"url": "https://www.adyen.help/hc/en-us/community/topics",
"x-twitter": "Adyen"
},
"description": "The Account API provides endpoints for managing account-related entities on your platform. These related entities include account holders, accounts, bank accounts, shareholders, and KYC-related documents. The management operations include actions such as creation, retrieval, updating, and deletion of them.\n\nFor more information, refer to our [documentation](https://docs.adyen.com/platforms).\n## Authentication\nTo connect to the Account API, you must use basic authentication credentials of your web service user. If you don't have one, contact the [Adyen Support Team](https://support.adyen.com/hc/en-us/requests/new). Then use its credentials to authenticate your request, for example:\n\n```\ncurl\n-U \"ws@MarketPlace.YourMarketPlace\":\"YourWsPassword\" \\\n-H \"Content-Type: application/json\" \\\n...\n```\nNote that when going live, you need to generate new web service user credentials to access the [live endpoints](https://docs.adyen.com/development-resources/live-endpoints).\n\n## Versioning\nThe Account API supports versioning of its endpoints through a version suffix in the endpoint URL. This suffix has the following format: \"vXX\", where XX is the version number.\n\nFor example:\n```\nhttps://cal-test.adyen.com/cal/services/Account/v6/createAccountHolder\n```",
"termsOfService": "https://www.adyen.com/legal/terms-and-conditions",
"title": "Adyen for Platforms: Account API",
"version": "6",
"x-apisguru-categories": [
"payment"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_Adyen_profile_image.jpeg"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/Adyen/adyen-openapi/master/json/AccountService-v6.json",
"version": "3.1"
}
],
"x-preferred": true,
"x-providerName": "adyen.com",
"x-publicVersion": true,
"x-serviceName": "AccountService"
},
"updated": "2021-11-12T23:18:19.544Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/AccountService/6/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/AccountService/6/openapi.yaml",
"openapiVer": "3.1.0"
}
}
},
"adyen.com:BalancePlatformService": {
"added": "2021-06-14T12:42:12.263Z",
"preferred": "1",
"versions": {
"1": {
"added": "2021-06-14T12:42:12.263Z",
"info": {
"contact": {
"email": "developer-experience@adyen.com",
"name": "Adyen Developer Experience team",
"url": "https://www.adyen.help/hc/en-us/community/topics",
"x-twitter": "Adyen"
},
"description": "The Balance Platform API enables you to create a platform, onboard users as account holders, create balance accounts, and issue cards.\n\nFor information about use cases, refer to [Adyen Issuing](https://docs.adyen.com/issuing).\n\n ## Authentication\nYour Adyen contact will provide your API credential and an API key. To connect to the API, add an `X-API-Key` header with the API key as the value, for example:\n\n ```\ncurl\n-H \"Content-Type: application/json\" \\\n-H \"X-API-Key: YOUR_API_KEY\" \\\n...\n```\n\nAlternatively, you can use the username and password to connect to the API using basic authentication. For example:\n\n```\ncurl\n-H \"Content-Type: application/json\" \\\n-U \"ws@BalancePlatform.YOUR_BALANCE_PLATFORM\":\"YOUR_WS_PASSWORD\" \\\n...\n```\n## Versioning\nBalance Platform API supports versioning of its endpoints through a version suffix in the endpoint URL. This suffix has the following format: \"vXX\", where XX is the version number.\n\nFor example:\n```\nhttps://balanceplatform-api-test.adyen.com/bcl/v1\n```\n## Going live\nWhen going live, your Adyen contact will provide your API credential for the live environment. You can then use the API key or the username and password to send requests to `https://balanceplatform-api-live.adyen.com/bcl/v1`.\n\nFor more information, refer to our [Going live documentation](https://docs.adyen.com/issuing/integration-checklist#going-live).",
"termsOfService": "https://www.adyen.com/legal/terms-and-conditions",
"title": "Issuing: Balance Platform API",
"version": "1",
"x-apisguru-categories": [
"payment"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_adyen.com_.resources_adyen-website_themes_images_apple-icon-180x180.png"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/Adyen/adyen-openapi/master/json/BalancePlatformService-v1.json",
"version": "3.1"
}
],
"x-providerName": "adyen.com",
"x-publicVersion": true,
"x-serviceName": "BalancePlatformService"
},
"updated": "2021-11-22T23:16:57.458Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/BalancePlatformService/1/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/BalancePlatformService/1/openapi.yaml",
"openapiVer": "3.1.0"
}
}
},
"adyen.com:BinLookupService": {
"added": "2020-11-03T12:51:40.318Z",
"preferred": "50",
"versions": {
"50": {
"added": "2020-11-03T12:51:40.318Z",
"info": {
"contact": {
"email": "developer-experience@adyen.com",
"name": "Adyen Developer Experience team",
"url": "https://www.adyen.help/hc/en-us/community/topics",
"x-twitter": "Adyen"
},
"description": "The BIN Lookup API provides endpoints for retrieving information, such as cost estimates, and 3D Secure supported version based on a given BIN.",
"termsOfService": "https://www.adyen.com/legal/terms-and-conditions",
"title": "Adyen BinLookup API",
"version": "50",
"x-apisguru-categories": [
"payment"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_Adyen_profile_image.jpeg"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/Adyen/adyen-openapi/master/json/BinLookupService-v50.json",
"version": "3.1"
}
],
"x-preferred": true,
"x-providerName": "adyen.com",
"x-publicVersion": true,
"x-serviceName": "BinLookupService"
},
"updated": "2021-11-01T23:17:40.475Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/BinLookupService/50/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/BinLookupService/50/openapi.yaml",
"openapiVer": "3.1.0"
}
}
},
"adyen.com:CheckoutService": {
"added": "2021-11-01T23:17:40.475Z",
"preferred": "68",
"versions": {
"68": {
"added": "2021-11-01T23:17:40.475Z",
"info": {
"contact": {
"email": "developer-experience@adyen.com",
"name": "Adyen Developer Experience team",
"url": "https://www.adyen.help/hc/en-us/community/topics",
"x-twitter": "Adyen"
},
"description": "Adyen Checkout API provides a simple and flexible way to initiate and authorise online payments. You can use the same integration for payments made with cards (including 3D Secure), mobile wallets, and local payment methods (for example, iDEAL and Sofort).\n\nThis API reference provides information on available endpoints and how to interact with them. To learn more about the API, visit [Checkout documentation](https://docs.adyen.com/online-payments).\n\n## Authentication\nEach request to the Checkout API must be signed with an API key. For this, obtain an API Key from your Customer Area, as described in [How to get the API key](https://docs.adyen.com/development-resources/api-credentials#generate-api-key). Then set this key to the `X-API-Key` header value, for example:\n\n```\ncurl\n-H \"Content-Type: application/json\" \\\n-H \"X-API-Key: Your_Checkout_API_key\" \\\n...\n```\nNote that when going live, you need to generate a new API Key to access the [live endpoints](https://docs.adyen.com/development-resources/live-endpoints).\n\n## Versioning\nCheckout API supports versioning of its endpoints through a version suffix in the endpoint URL. This suffix has the following format: \"vXX\", where XX is the version number.\n\nFor example:\n```\nhttps://checkout-test.adyen.com/v68/payments\n```",
"termsOfService": "https://www.adyen.com/legal/terms-and-conditions",
"title": "Adyen Checkout API",
"version": "68",
"x-apisguru-categories": [
"payment"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_adyen.com_.resources_adyen-website_themes_images_apple-icon-180x180.png"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/Adyen/adyen-openapi/master/json/CheckoutService-v68.json",
"version": "3.1"
}
],
"x-preferred": true,
"x-providerName": "adyen.com",
"x-publicVersion": true,
"x-serviceName": "CheckoutService"
},
"updated": "2021-11-12T23:18:19.544Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/CheckoutService/68/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/CheckoutService/68/openapi.yaml",
"openapiVer": "3.1.0"
}
}
},
"adyen.com:CheckoutUtilityService": {
"added": "2021-06-18T13:57:32.889Z",
"preferred": "1",
"versions": {
"1": {
"added": "2021-06-18T13:57:32.889Z",
"info": {
"contact": {
"email": "support@adyen.com",
"name": "Adyen Support",
"url": "https://support.adyen.com/",
"x-twitter": "Adyen"
},
"description": "A web service containing utility functions available for merchants integrating with Checkout APIs.\n## Authentication\nEach request to the Checkout Utility API must be signed with an API key. For this, obtain an API Key from your Customer Area, as described in [How to get the Checkout API key](https://docs.adyen.com/developers/user-management/how-to-get-the-checkout-api-key). Then set this key to the `X-API-Key` header value, for example:\n\n```\ncurl\n-H \"Content-Type: application/json\" \\\n-H \"X-API-Key: Your_Checkout_API_key\" \\\n...\n```\nNote that when going live, you need to generate a new API Key to access the [live endpoints](https://docs.adyen.com/developers/api-reference/live-endpoints).\n\n## Versioning\nCheckout API supports versioning of its endpoints through a version suffix in the endpoint URL. This suffix has the following format: \"vXX\", where XX is the version number.\n\nFor example:\n```\nhttps://checkout-test.adyen.com/v1/originKeys\n```",
"termsOfService": "https://docs.adyen.com/legal/terms-conditions",
"title": "Adyen Checkout Utility Service",
"version": "1",
"x-apisguru-categories": [
"payment"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_Adyen_profile_image.jpeg"
},
"x-origin": [
{
"converter": {
"url": "https://github.com/lucybot/api-spec-converter",
"version": "2.7.11"
},
"format": "openapi",
"url": "https://raw.githubusercontent.com/adyen/adyen-openapi/master/specs/3.0/CheckoutUtilityService-v1.json",
"version": "3.0"
}
],
"x-providerName": "adyen.com",
"x-serviceName": "CheckoutUtilityService"
},
"updated": "2021-06-18T13:57:32.889Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/CheckoutUtilityService/1/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/CheckoutUtilityService/1/openapi.yaml",
"openapiVer": "3.0.0"
}
}
},
"adyen.com:FundService": {
"added": "2020-11-03T12:51:40.318Z",
"preferred": "6",
"versions": {
"6": {
"added": "2020-11-03T12:51:40.318Z",
"info": {
"contact": {
"email": "developer-experience@adyen.com",
"name": "Adyen Developer Experience team",
"url": "https://www.adyen.help/hc/en-us/community/topics",
"x-twitter": "Adyen"
},
"description": "The Fund API provides endpoints for managing the funds in the accounts on your platform. These management operations include actions such as the transfer of funds from one account to another, the payout of funds to an account holder, and the retrieval of balances in an account.\n\nFor more information, refer to our [documentation](https://docs.adyen.com/platforms).\n## Authentication\nTo connect to the Fund API, you must use basic authentication credentials of your web service user. If you don't have one, please contact the [Adyen Support Team](https://support.adyen.com/hc/en-us/requests/new). Then use its credentials to authenticate your request, for example:\n\n```\ncurl\n-U \"ws@MarketPlace.YourMarketPlace\":\"YourWsPassword\" \\\n-H \"Content-Type: application/json\" \\\n...\n```\nNote that when going live, you need to generate new web service user credentials to access the [live endpoints](https://docs.adyen.com/development-resources/live-endpoints).\n\n## Versioning\nThe Fund API supports versioning of its endpoints through a version suffix in the endpoint URL. This suffix has the following format: \"vXX\", where XX is the version number.\n\nFor example:\n```\nhttps://cal-test.adyen.com/cal/services/Fund/v6/accountHolderBalance\n```",
"termsOfService": "https://www.adyen.com/legal/terms-and-conditions",
"title": "Adyen for Platforms: Fund API",
"version": "6",
"x-apisguru-categories": [
"payment"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_Adyen_profile_image.jpeg"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/Adyen/adyen-openapi/master/json/FundService-v6.json",
"version": "3.1"
}
],
"x-preferred": true,
"x-providerName": "adyen.com",
"x-publicVersion": true,
"x-serviceName": "FundService"
},
"updated": "2021-11-01T23:17:40.475Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/FundService/6/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/FundService/6/openapi.yaml",
"openapiVer": "3.1.0"
}
}
},
"adyen.com:HopService": {
"added": "2020-11-03T12:51:40.318Z",
"preferred": "6",
"versions": {
"6": {
"added": "2020-11-03T12:51:40.318Z",
"info": {
"contact": {
"email": "developer-experience@adyen.com",
"name": "Adyen Developer Experience team",
"url": "https://www.adyen.help/hc/en-us/community/topics",
"x-twitter": "Adyen"
},
"description": "The Hosted onboarding API provides endpoints that you can use to generate links to Adyen-hosted pages, such as an [onboarding page](https://docs.adyen.com/platforms/hosted-onboarding-page) or a [PCI compliance questionnaire](https://docs.adyen.com/platforms/platforms-for-partners). Then you can provide the link to your account holder so they can complete their onboarding.\n\n## Authentication\nTo connect to the Hosted onboarding API, you must use basic authentication credentials of your web service user. If you don't have one, contact our [Support Team](https://support.adyen.com/hc/en-us/requests/new). Then use your credentials to authenticate your request, for example:\n\n```\ncurl\n-U \"ws@MarketPlace.YourMarketPlace\":\"YourWsPassword\" \\\n-H \"Content-Type: application/json\" \\\n...\n```\nWhen going live, you need to generate new web service user credentials to access the [live endpoints](https://docs.adyen.com/development-resources/live-endpoints).\n\n## Versioning\nThe Hosted onboarding API supports versioning of its endpoints through a version suffix in the endpoint URL. This suffix has the following format: \"vXX\", where XX is the version number.\n\nFor example:\n```\nhttps://cal-test.adyen.com/cal/services/Hop/v6/getOnboardingUrl\n```",
"termsOfService": "https://www.adyen.com/legal/terms-and-conditions",
"title": "Adyen for Platforms: Hosted Onboarding",
"version": "6",
"x-apisguru-categories": [
"payment"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_Adyen_profile_image.jpeg"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/Adyen/adyen-openapi/master/json/HopService-v6.json",
"version": "3.1"
}
],
"x-preferred": true,
"x-providerName": "adyen.com",
"x-publicVersion": true,
"x-serviceName": "HopService"
},
"updated": "2021-11-01T23:17:40.475Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/HopService/6/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/HopService/6/openapi.yaml",
"openapiVer": "3.1.0"
}
}
},
"adyen.com:MarketPayNotificationService": {
"added": "2021-06-21T10:54:37.877Z",
"preferred": "6",
"versions": {
"6": {
"added": "2021-06-21T10:54:37.877Z",
"info": {
"contact": {
"email": "developer-experience@adyen.com",
"name": "Adyen Developer Experience team",
"url": "https://www.adyen.help/hc/en-us/community/topics",
"x-twitter": "Adyen"
},
"description": "The Notification API sends notifications to the endpoints specified in a given subscription. Subscriptions are managed through the Notification Configuration API. The API specifications listed here detail the format of each notification.\n\nFor more information, refer to our [documentation](https://docs.adyen.com/platforms/notifications).",
"termsOfService": "https://www.adyen.com/legal/terms-and-conditions",
"title": "Adyen for Platforms: Notifications",
"version": "6",
"x-apisguru-categories": [
"payment"
],
"x-logo": {
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_Adyen_profile_image"
},
"x-origin": [
{
"format": "openapi",
"url": "https://raw.githubusercontent.com/Adyen/adyen-openapi/master/json/MarketPayNotificationService-v6.json",
"version": "3.1"
}
],
"x-preferred": true,
"x-providerName": "adyen.com",
"x-publicVersion": true,
"x-serviceName": "MarketPayNotificationService"
},
"updated": "2021-11-12T23:18:19.544Z",
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/MarketPayNotificationService/6/openapi.json",
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/MarketPayNotificationService/6/openapi.yaml",
"openapiVer": "3.1.0"
}
}
},
"adyen.com:NotificationConfigurationService": {
"added": "2020-11-03T12:51:40.318Z",
"preferred": "6",
"versions": {
"6": {
"added": "2020-11-03T12:51:40.318Z",
"info": {
"contact": {
"email": "developer-experience@adyen.com",
"name": "Adyen Developer Experience team",
"url": "https://www.adyen.help/hc/en-us/community/topics",
"x-twitter": "Adyen"
},
"description": "The Notification Configuration API provides endpoints for setting up and testing notifications that inform you of events on your platform, for example when a KYC check or a payout has been completed.\n\nFor more information, refer to our [documentation](https://docs.adyen.com/platforms/notifications).\n## Authentication\nTo connect to the Notification Configuration API, you must use basic authentication credentials of your web service user. If you don't have one, contact our [Adyen Support Team](https://support.adyen.com/hc/en-us/requests/new). Then use its credentials to authenticate your request, for example:\n\n```\ncurl\n-U \"ws@MarketPlace.YourMarketPlace\":\"YourWsPassword\" \\\n-H \"Content-Type: application/json\" \\\n...\n```\nNote that when going live, you need to generate new web service user credentials to access the [live endpoints](https://docs.adyen.com/development-resources/live-endpoints).\n\n## Versioning\nThe Notification Configuration API supports versioning of its endpoints through a version suffix in the endpoint URL. This suffix has the following format: \"vXX\", where XX is the version number.\n\nFor example:\n```\nhttps://cal-test.adyen.com/cal/services/Notification/v6/createNotificationConfiguration\n```",
|
||||
"termsOfService": "https://www.adyen.com/legal/terms-and-conditions",
|
||||
"title": "Adyen for Platforms: Notification Configuration API",
|
||||
"version": "6",
|
||||
"x-apisguru-categories": [
|
||||
"payment"
|
||||
],
|
||||
"x-logo": {
|
||||
"url": "https://api.apis.guru/v2/cache/logo/https_twitter.com_Adyen_profile_image.jpeg"
|
||||
},
|
||||
"x-origin": [
|
||||
{
|
||||
"format": "openapi",
|
||||
"url": "https://raw.githubusercontent.com/Adyen/adyen-openapi/master/json/NotificationConfigurationService-v6.json",
|
||||
"version": "3.1"
|
||||
}
|
||||
],
|
||||
"x-preferred": true,
|
||||
"x-providerName": "adyen.com",
|
||||
"x-publicVersion": true,
|
||||
"x-serviceName": "NotificationConfigurationService"
|
||||
},
|
||||
"updated": "2021-11-12T23:18:19.544Z",
|
||||
"swaggerUrl": "https://api.apis.guru/v2/specs/adyen.com/NotificationConfigurationService/6/openapi.json",
|
||||
"swaggerYamlUrl": "https://api.apis.guru/v2/specs/adyen.com/NotificationConfigurationService/6/openapi.yaml",
|
||||
"openapiVer": "3.1.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,42 @@
{
  "numSpecs": 3809,
  "numAPIs": 2362,
  "numEndpoints": 79405,
  "unreachable": 138,
  "invalid": 634,
  "unofficial": 24,
  "fixes": 34001,
  "fixedPct": 21,
  "datasets": [
    {
      "title": "providerCount",
      "data": {
        "adyen.com": 69,
        "amazonaws.com": 295,
        "apideck.com": 14,
        "apisetu.gov.in": 181,
        "azure.com": 1832,
        "ebay.com": 20,
        "fungenerators.com": 12,
        "googleapis.com": 443,
        "hubapi.com": 11,
        "interzoid.com": 20,
        "mastercard.com": 14,
        "microsoft.com": 27,
        "nexmo.com": 20,
        "nytimes.com": 11,
        "parliament.uk": 11,
        "sportsdata.io": 35,
        "twilio.com": 41,
        "windows.net": 10,
        "Others": 743
      }
    }
  ],
  "stars": 2964,
  "issues": 206,
  "thisWeek": {
    "added": 123,
    "updated": 119
  }
}
@@ -0,0 +1,36 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget

plugins {
    application
    kotlin("jvm")

    id("org.jetbrains.kotlinx.dataframe")

    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenCentral()
    mavenLocal() // in case of local dataframe development
}

application.mainClass = "org.jetbrains.kotlinx.dataframe.examples.movies.MoviesWithDataClassKt"

dependencies {
    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
    implementation(project(":"))
}

kotlin {
    compilerOptions {
        jvmTarget = JvmTarget.JVM_1_8
        freeCompilerArgs.add("-Xjdk-release=8")
    }
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_1_8.toString()
    targetCompatibility = JavaVersion.VERSION_1_8.toString()
    options.release.set(8)
}
@@ -0,0 +1,66 @@
package org.jetbrains.kotlinx.dataframe.examples.movies

import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*

/**
 *   movieId                          title                  genres
 * 0 9b30aff7943f44579e92c261f3adc193 Women in Black (1997)  Fantasy|Suspenseful|Comedy
 * 1 2a1ba1fc5caf492a80188e032995843e Bumblebee Movie (2007) Comedy|Jazz|Family|Animation
 */
@DataSchema
interface Movie {
    val movieId: String
    val title: String
    val genres: String
}

private const val pathToCsv = "examples/idea-examples/movies/src/main/resources/movies.csv"
// Uncomment this line if you want to copy-paste and run the code in your project without downloading the file
//private const val pathToCsv = "https://raw.githubusercontent.com/Kotlin/dataframe/master/examples/idea-examples/movies/src/main/resources/movies.csv"

fun main() {
    // This example shows how to use the extension properties API to address columns in different operations
    // https://kotlin.github.io/dataframe/apilevels.html

    // Add the Gradle plugin and run `assemble`;
    // check the README https://github.com/Kotlin/dataframe?tab=readme-ov-file#setup
    val step1 = DataFrame
        .read(pathToCsv).convertTo<Movie>()
        .split { genres }.by("|").inplace()
        .split { title }.by {
            listOf<Any>(
                """\s*\(\d{4}\)\s*$""".toRegex().replace(it, ""),
                "\\d{4}".toRegex().findAll(it).lastOrNull()?.value?.toIntOrNull() ?: -1,
            )
        }.into("title", "year")
        .explode("genres")
    step1.print()

    /**
     * Data is parsed and prepared for aggregation
     *   movieId                          title           year genres
     * 0 9b30aff7943f44579e92c261f3adc193 Women in Black  1997 Fantasy
     * 1 9b30aff7943f44579e92c261f3adc193 Women in Black  1997 Suspenseful
     * 2 9b30aff7943f44579e92c261f3adc193 Women in Black  1997 Comedy
     * 3 2a1ba1fc5caf492a80188e032995843e Bumblebee Movie 2007 Comedy
     * 4 2a1ba1fc5caf492a80188e032995843e Bumblebee Movie 2007 Jazz
     * 5 2a1ba1fc5caf492a80188e032995843e Bumblebee Movie 2007 Family
     * 6 2a1ba1fc5caf492a80188e032995843e Bumblebee Movie 2007 Animation
     */
    val step2 = step1
        .filter { "year"<Int>() >= 0 && genres != "(no genres listed)" }
        .groupBy("year")
        .sortBy("year")
        .pivot("genres", inward = false)
        .aggregate {
            count() into "count"
            mean() into "mean"
        }

    step2.print(10)
    // Discover the final reshaped data in an interactive HTML table
    // step2.toStandaloneHTML().openInBrowser()
}
@@ -0,0 +1,21 @@
movieId,title,genres
9b30aff7943f44579e92c261f3adc193,Women in Black (1997),Fantasy|Suspenseful|Comedy
2a1ba1fc5caf492a80188e032995843e,Bumblebee Movie (2007),Comedy|Jazz|Family|Animation
f44ceb4771504342bb856d76c112d5a6,Magical School Boy and the Rock of Wise Men (2001),Fantasy|Growing up|Magic
43d02fb064514ff3bd30d1e3a7398357,Master of the Jewlery: The Company of the Jewel (2001),Fantasy|Magic|Suspenseful
6aa0d26a483148998c250b9c80ddf550,Sun Conflicts: Part IV: A Novel Espair (1977),Fantasy
eace16e59ce24eff90bf8924eb6a926c,The Outstanding Bulk (2008),Fantasy|Superhero|Family
ae916bc4844a4bb7b42b70d9573d05cd,In Automata (2014),Horror|Existential
c1f0a868aeb44c5ea8d154ec3ca295ac,Interplanetary (2014),Sci-fi|Futuristic
9595b771f87f42a3b8dd07d91e7cb328,Woods Run (1994),Family|Drama
aa9fc400e068443488b259ea0802a975,Anthropod-Dude (2002),Superhero|Fantasy|Family|Growing up
22d20c2ba11d44cab83aceea39dc00bd,The Chamber (2003),Comedy|Drama
8cf4d0c1bd7b41fab6af9d92c892141f,That Thing About an Iceberg (1997),Drama|History|Family|Romance
c2f3e7588da84684a7d78d6bd8d8e1f4,Vehicles (2006),Animation|Family
ce06175106af4105945f245161eac3c7,Playthings Tale (1995),Animation|Family
ee28d7e69103485c83e10b8055ef15fb,Metal Man 2 (2010),Fantasy|Superhero|Family
c32bdeed466f4ec09de828bb4b6fc649,Surgeon Odd in the Omniverse of Crazy (2022),Fantasy|Superhero|Family|Horror
d4a325ab648a42c4a2d6f35dfabb387f,Bad Dream on Pine Street (1984),Horror
60ebe74947234ddcab49dea1a958faed,The Shimmering (1980),Horror
f24327f2b05147b197ca34bf13ae3524,Krubit: Societal Teachings for Do Many Good Amazing Country of Uzbekistan (2006),Comedy
2bb29b3a245e434fa80542e711fd2cee,This is No Movie (1950),(no genres listed)
@@ -0,0 +1,39 @@
userId,movieId,tag,timestamp
3,9595b771f87f42a3b8dd07d91e7cb328,classic,1439472355
3,6aa0d26a483148998c250b9c80ddf550,sci-fi,1439472256
4,f24327f2b05147b197ca34bf13ae3524,dark comedy,1573943598
4,ae916bc4844a4bb7b42b70d9573d05cd,great dialogue,1573943604
4,f24327f2b05147b197ca34bf13ae3524,so bad it's good,1573943455
4,d4a325ab648a42c4a2d6f35dfabb387f,tense,1573943077
4,ae916bc4844a4bb7b42b70d9573d05cd,artificial intelligence,1573942979
4,ae916bc4844a4bb7b42b70d9573d05cd,philosophical,1573943033
4,c1f0a868aeb44c5ea8d154ec3ca295ac,tense,1573943042
4,22d20c2ba11d44cab83aceea39dc00bd,so bad it's good,1573942965
4,8cf4d0c1bd7b41fab6af9d92c892141f,cliche,1573943721
4,2bb29b3a245e434fa80542e711fd2cee,musical,1573943714
4,60ebe74947234ddcab49dea1a958faed,horror,1573945163
4,2bb29b3a245e434fa80542e711fd2cee,unpredictable,1573945171
19,9b30aff7943f44579e92c261f3adc193,Oscar (Best Supporting Actress),1446909853
19,43d02fb064514ff3bd30d1e3a7398357,adventure,1445286141
19,f44ceb4771504342bb856d76c112d5a6,fantasy,1445286144
19,c1f0a868aeb44c5ea8d154ec3ca295ac,post-apocalyptic,1445286136
20,2a1ba1fc5caf492a80188e032995843e,bah,1155082282
84,f24327f2b05147b197ca34bf13ae3524,documentary,1549387432
87,c1f0a868aeb44c5ea8d154ec3ca295ac,sci-fi,1542308464
87,ae916bc4844a4bb7b42b70d9573d05cd,android(s)/cyborg(s),1542309549
87,c1f0a868aeb44c5ea8d154ec3ca295ac,apocalypse,1542309703
87,ae916bc4844a4bb7b42b70d9573d05cd,artificial intelligence,1542309599
87,ee28d7e69103485c83e10b8055ef15fb,franchise,1542309536
87,ee28d7e69103485c83e10b8055ef15fb,sci-fi,1542308408
87,ee28d7e69103485c83e10b8055ef15fb,science fiction,1542308395
87,eace16e59ce24eff90bf8924eb6a926c,bad science,1522676752
87,ae916bc4844a4bb7b42b70d9573d05cd,philosophical issues,1522676687
87,6aa0d26a483148998c250b9c80ddf550,sci-fi,1522676660
87,6aa0d26a483148998c250b9c80ddf550,science fiction,1522676703
87,6aa0d26a483148998c250b9c80ddf550,space,1522676664
87,c1f0a868aeb44c5ea8d154ec3ca295ac,space travel,1522676685
87,c1f0a868aeb44c5ea8d154ec3ca295ac,visually appealing,1522676682
91,aa9fc400e068443488b259ea0802a975,quirky,1415914797
91,8cf4d0c1bd7b41fab6af9d92c892141f,romantic,1415131173
91,ae916bc4844a4bb7b42b70d9573d05cd,thought-provoking,1415131203
91,f44ceb4771504342bb856d76c112d5a6,based on book,1414248543
@@ -0,0 +1,177 @@
# spark-parquet-dataframe

This example shows how to:
- Load a CSV (California Housing) with local Apache Spark
- Write it to Parquet, then read the Parquet back with Kotlin DataFrame (Arrow-based reader)
- Train a simple Linear Regression model with Spark MLlib
- Export the model in two ways and explain why we do both
- Inspect the saved Spark model artifacts
- Build a 2D plot for a single model coefficient

Below is a faithful, step-by-step walkthrough matching the code in `SparkParquetDataframe.kt`.

## The 11 steps of the example (with explanations)

1. Start local Spark
   - A local `SparkSession` is created. The example configures Spark to work against the local filesystem and sets the Java options required by Arrow/Parquet.

2. Read `housing.csv` with Spark
   - Spark loads the CSV with a header and automatic schema inference into a Spark DataFrame.

3. Show the Spark DataFrame and write it to Parquet
   - `show(10, false)` prints the first rows for inspection.
   - The DataFrame is written to a temporary directory in Parquet format.

4. Read this Parquet with Kotlin DataFrame (Arrow backend)
   - Kotlin DataFrame reads the concrete `part-*.parquet` files produced by Spark using the Arrow-based Parquet reader.

5. Print `head()` of the Kotlin DataFrame
   - A quick glance at the loaded data in Kotlin DataFrame form.

6. Train a regression model with Spark MLlib
   - Numeric features are assembled with `VectorAssembler` (the categorical `ocean_proximity` is excluded).
   - A `LinearRegression` model (no intercept in the code, elasticNet=0.5, maxIter=10) is trained on a train split.

7. Export the model summary to Parquet (tabular, portable)
   - The learned coefficients are paired with their feature names, plus a special row for the intercept.
   - This small, explicit summary table is written to Parquet. It's easy to exchange and read without Spark.

8. Read the model-summary Parquet with Kotlin DataFrame
   - Kotlin DataFrame reads the summary Parquet and prints its head. This is the portable path for analytics/visualization.

9. Save the full fitted PipelineModel
   - The entire fitted `PipelineModel` is saved using Spark's native ML writer. This produces a directory with both JSON metadata and Parquet data.

10. Inspect pipeline internals using Kotlin DataFrame
    - For exploration, the example then reads some of those JSON and Parquet files back using Kotlin DataFrame.
    - Notes:
      - Internal folder names contain stage indices and UIDs (e.g., `0_...`, `1_...`) and may vary across Spark versions.
      - This inspection method is for exploration only. For reuse in Spark, you should load using `PipelineModel.load(...)`.
    - Sub-steps:
      - 10.1 Root metadata (JSON): read each file under `.../metadata/` and print the heads.
      - 10.2 Stage 0 (VectorAssembler): read the JSON metadata and Parquet data under `.../stages/0_*/{metadata,data}` if present.
      - 10.3 Stage 1 (LinearRegressionModel): read the JSON metadata and Parquet data under `.../stages/1_*/{metadata,data}` if present.

11. Build a 2D plot using one coefficient
    - We choose the feature `median_income` and the label `median_house_value` to produce a 2D scatter plot.
    - From the summary table, we extract the slope for `median_income` and the intercept, and draw the line `y = slope * x + intercept`.
    - Sub-steps:
      - 11.1 Concatenate any metadata JSON frames that were successfully read (optional, for inspection).
      - 11.2 Use the model-summary table (coefficients + intercept) as the unified model data source.
      - 11.3 Compute the slope/intercept for the chosen feature from the summary table.
      - 11.4 Create a Kandy plot (points + abLine) and save it to `linear_model_plot.jpg`.
    - The plot is saved as `linear_model_plot.jpg` (an example image is committed at `lets-plot-images/linear_model_plot.jpg`).



## Why two ways to serialize the model?

We deliberately show both because they serve different goals:
- Tabular summary (Parquet):
  - A small, human- and tool-friendly table of coefficients + intercept.
  - Portable across tools; easy to read directly in Kotlin DataFrame, pandas, SQL engines, etc.
  - Great for analytics, reporting, and plotting.
- Full Spark ML writer (`PipelineModel.save`):
  - Contains everything needed to reuse the trained model inside Spark (including metadata and internal data).
  - Directory layout and file names aren't guaranteed to be stable across versions; the intended way to consume it is `PipelineModel.load(...)` in Spark.
  - Not ideal as a cross-tool tabular export, but perfect for production use in Spark pipelines.

## Why do we plot only one coefficient?

The linear model has multiple coefficients (one per feature). A 2D chart can only show two axes. To visualize the learned relationship, we pick a single feature (here, `median_income`) and the target (`median_house_value`) and draw the corresponding fitted line. You can repeat the procedure with any other feature to obtain a different 2D projection of the multi-dimensional model.
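Concretely, the drawn line is just `y = slope * x + intercept` evaluated along the `median_income` axis. A tiny sketch with made-up numbers (the real slope and intercept come from the model-summary table, not these values):

```kotlin
// Made-up stand-ins for the trained values; in the example they are read
// from the model-summary Parquet table ("term" -> "coefficient" rows).
const val slope = 41_000.0      // hypothetical coefficient for median_income
const val intercept = 45_000.0  // hypothetical intercept

// The 2D projection of the multi-feature model along one axis.
fun predict(medianIncome: Double): Double = slope * medianIncome + intercept

fun main() {
    println(predict(0.0))
    println(predict(3.0))
}
```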

## About the dataset (`housing.csv`)

1. __longitude:__ How far west a house is; higher values are farther west
2. __latitude:__ How far north a house is; higher values are farther north
3. __housingMedianAge:__ Median age of a house within a block; lower means newer
4. __totalRooms:__ Total number of rooms within a block
5. __totalBedrooms:__ Total number of bedrooms within a block
6. __population:__ Total number of people residing within a block
7. __households:__ Total number of households within a block
8. __medianIncome:__ Median household income (in tens of thousands of USD)
9. __medianHouseValue:__ Median house value (in USD)
10. __oceanProximity:__ Location of the house with respect to the ocean/sea

The CSV file is located at `examples/housing.csv` in the repository root.

## Windows note

<details>
<summary>Running on Windows: install winutils and set Hadoop environment variables</summary>

On Windows, Spark may require Hadoop native helpers. If you see errors like "winutils.exe not found" or permission/FS issues, do the following:

1. Install a `winutils.exe` that matches your Spark/Hadoop version and place it under a Hadoop directory, e.g. `C:\hadoop\bin\winutils.exe`.
2. Set environment variables:
   - `HADOOP_HOME=C:\hadoop`
   - Add `%HADOOP_HOME%\bin` to your `PATH`
3. Restart your IDE/terminal so the variables are picked up and re-run the example.

This ensures Spark can operate correctly with Hadoop on Windows.
</details>

## SparkSession configuration to bypass Hadoop/winutils and enable Arrow

Use the following `SparkSession` builder if you want to completely avoid native Hadoop libraries (including winutils on Windows) and enable the Arrow-related add-opens:

```kotlin
val spark = SparkSession.builder()
    .appName("spark-parquet-dataframe")
    .master("local[*]")
    .config("spark.sql.warehouse.dir", Files.createTempDirectory("spark-warehouse").toString())
    // Completely bypass native Hadoop libraries and winutils
    .config("spark.hadoop.fs.defaultFS", "file:///")
    .config("spark.hadoop.fs.AbstractFileSystem.file.impl", "org.apache.hadoop.fs.local.LocalFs")
    .config("spark.hadoop.fs.file.impl.disable.cache", "true")
    // Disable Hadoop native library requirements and native warnings
    .config("spark.hadoop.hadoop.native.lib", "false")
    .config("spark.hadoop.io.native.lib.available", "false")
    .config(
        "spark.driver.extraJavaOptions",
        "--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"
    )
    .config(
        "spark.executor.extraJavaOptions",
        "--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"
    )
    .getOrCreate()
```

Notes:
- This configuration uses the pure-Java local filesystem (`file://`) and disables Hadoop native library checks, making winutils unnecessary.
- If you rely on HDFS or native Hadoop tooling, omit these overrides and configure Hadoop as usual.

## What each Spark config does (and why it matters on JDK 21 and the Java module system)

- `spark.sql.warehouse.dir = Files.createTempDirectory("spark-warehouse").toString()`
  - Points Spark SQL's warehouse to an ephemeral, writable temp directory.
  - Avoids permission issues and clutter in the project directory, especially on Windows.
- `spark.hadoop.fs.defaultFS = file:///`
  - Forces Hadoop to use the local filesystem instead of HDFS.
  - Bypasses native Hadoop bits and makes winutils unnecessary on Windows for this example.
- `spark.hadoop.fs.AbstractFileSystem.file.impl = org.apache.hadoop.fs.local.LocalFs`
  - Ensures the AbstractFileSystem implementation resolves to the pure-Java LocalFs.
- `spark.hadoop.fs.file.impl.disable.cache = true`
  - Disables FS implementation caching so the LocalFs overrides are applied immediately within the current JVM.
- `spark.hadoop.hadoop.native.lib = false` and `spark.hadoop.io.native.lib.available = false`
  - Tell Hadoop not to load native libraries and suppress related warnings.
  - Prevents errors stemming from missing native binaries (e.g., winutils) when you only need local file IO.
- `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions` with
  `--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED`
  - Why needed: Starting with the Java Platform Module System (JDK 9+) and especially under JDK 17/21 (JEP 403 strong encapsulation), reflective access into JDK internals is restricted. Apache Arrow (used by the vectorized Parquet reader in Kotlin DataFrame) may need reflective access within `java.nio` for memory management and buffer internals. Without opening the package, you can get errors like:
    - `java.lang.reflect.InaccessibleObjectException: module java.base does not open java.nio to org.apache.arrow.memory.core`
    - `...does not open java.nio to unnamed module @xxxx`
  - What it does: Opens the `java.nio` package in module `java.base` at runtime to both the named module `org.apache.arrow.memory.core` (when Arrow is on the module path) and to `ALL-UNNAMED` (when Arrow is on the classpath). This enables Arrow's memory code to work on modern JDKs.
  - Driver vs executor: In `local[*]` both apply to the same process, but keeping both symmetric makes this snippet cluster-ready (executors are separate JVMs).
  - When you might not need it: On JDK 8 (no module system) or if your stack does not use Arrow's vectorized path. On JDK 17/21+, keep it if you see `InaccessibleObjectException` referencing `java.nio`.
  - Other packages: Some environments/libraries (e.g., Netty) may require additional opens such as `--add-opens=java.base/sun.nio.ch=ALL-UNNAMED`. Only add the opens that your error messages explicitly mention.
  - Security note: `--add-opens` affects only the current JVM process at runtime; it doesn't change compile-time checks or system-wide settings.

## Troubleshooting on JDK 17+

- Symptom: `InaccessibleObjectException` mentioning `java.nio` or "illegal reflective access" warnings.
  - Fix: Ensure both `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions` include the exact `--add-opens` string shown above.
- Symptom: Works in the IDE, fails with `spark-submit`.
  - Fix: Pass the options with `--conf spark.driver.extraJavaOptions=...` and `--conf spark.executor.extraJavaOptions=...` (or via `SPARK_SUBMIT_OPTS`), not only in IDE settings.
- Symptom: On Windows, "winutils.exe not found".
  - Fix: Either use this configuration block (bypassing native Hadoop) or install winutils as described in the Windows note above.
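For the `spark-submit` case, the invocation looks roughly like this (the main class and jar name are placeholders for your own artifact; this is a command sketch, not something tied to this repo):

```shell
# Hypothetical submit command; replace the class and jar with your own artifact.
OPENS="--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"

spark-submit \
  --master "local[*]" \
  --conf "spark.driver.extraJavaOptions=$OPENS" \
  --conf "spark.executor.extraJavaOptions=$OPENS" \
  --class org.example.MainKt \
  app.jar
```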
@@ -0,0 +1,70 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget

plugins {
    application
    kotlin("jvm")

    id("org.jetbrains.kotlinx.dataframe")

    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenCentral()
    mavenLocal() // in case of local dataframe development
}

application.mainClass = "org.jetbrains.kotlinx.dataframe.examples.spark.parquet.SparkParquetDataframeKt"

dependencies {
    implementation(project(":"))

    // Spark SQL + MLlib (Spark 4.0.0)
    implementation("org.apache.spark:spark-sql_2.13:4.0.0")
    implementation("org.apache.spark:spark-mllib_2.13:4.0.0")

    // Kandy (Lets-Plot backend) for plotting
    implementation(libs.kandy) {
        // Avoid pulling the transitive kotlinx-dataframe from Kandy; we use the monorepo modules
        exclude("org.jetbrains.kotlinx", "dataframe")
    }

    // Logging to keep Spark quiet
    implementation(libs.log4j.core)
    implementation(libs.log4j.api)
}

// for Java 17+, and Arrow/Parquet support
application {
    applicationDefaultJvmArgs = listOf(
        "--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED",
    )
}

java {
    toolchain {
        languageVersion.set(JavaLanguageVersion.of(11))
    }
}

tasks.withType<JavaExec> {
    jvmArgs(
        "--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED",
    )
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_11.toString()
    targetCompatibility = JavaVersion.VERSION_11.toString()
    options.release.set(11)
}

// Configure Kotlin compilation to use the same JVM target
kotlin {
    compilerOptions {
        jvmTarget.set(JvmTarget.JVM_11)
        freeCompilerArgs.add("-Xjdk-release=11")
    }
}
@@ -0,0 +1,333 @@
package org.jetbrains.kotlinx.dataframe.examples.spark.parquet

import org.apache.spark.ml.PipelineStage
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.ml.regression.LinearRegressionModel
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.add
import org.jetbrains.kotlinx.dataframe.api.concat
import org.jetbrains.kotlinx.dataframe.api.head
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.dropNA
import org.jetbrains.kotlinx.dataframe.api.getColumn
import org.jetbrains.kotlinx.dataframe.io.readJson
import org.jetbrains.kotlinx.dataframe.io.readParquet
import org.jetbrains.kotlinx.kandy.dsl.plot
import org.jetbrains.kotlinx.kandy.letsplot.layers.points
import org.jetbrains.kotlinx.kandy.letsplot.layers.abLine
import org.jetbrains.kotlinx.kandy.letsplot.export.save
import org.jetbrains.kotlinx.kandy.util.color.Color
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths
import java.util.stream.Collectors
import kotlin.io.path.exists
import kotlin.io.path.isDirectory
import kotlin.io.path.notExists

/**
 * Demonstrates reading CSV with Apache Spark, writing Parquet, and reading Parquet with Kotlin DataFrame via Arrow.
 * Also trains a simple Spark ML regression model and exports a summary as Parquet, then reads it back with Kotlin DataFrame.
 *
 * NOTE: This example doesn't use the Kotlin API for Apache Spark; it relies on the Java Spark API directly.
 */
fun main() {
    // 1) Start local Spark
    val spark = SparkSession.builder()
        .appName("spark-parquet-dataframe")
        .master("local[*]")
        .config("spark.sql.warehouse.dir", Files.createTempDirectory("spark-warehouse").toString())
        // Completely bypass native Hadoop libraries and winutils
        .config("spark.hadoop.fs.defaultFS", "file:///")
        .config("spark.hadoop.fs.AbstractFileSystem.file.impl", "org.apache.hadoop.fs.local.LocalFs")
        .config("spark.hadoop.fs.file.impl.disable.cache", "true")
        // Disable Hadoop native library requirements and native warnings
        .config("spark.hadoop.hadoop.native.lib", false)
        .config("spark.hadoop.io.native.lib.available", false)
        .config(
            "spark.driver.extraJavaOptions",
            "--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"
        )
        .config(
            "spark.executor.extraJavaOptions",
            "--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"
        )
        .getOrCreate()

    // Make Spark a bit quieter
    spark.sparkContext().setLogLevel("WARN")

    // 2) Read housing.csv (from classpath resources) with Spark
    val csvResource = object {}::class.java.getResource("/housing.csv")
        ?: throw IllegalStateException("housing.csv not found in classpath resources")
    val csvPath = Paths.get(csvResource.toURI()).toAbsolutePath().toString()

    val sdf = spark.read()
        .option("header", "true")
        .option("inferSchema", "true")
        .csv(csvPath)

    // 3) Print the Spark DataFrame and export to Parquet in a temp directory
    println("Spark DataFrame (head):")
    sdf.show(10, false)

    val parquetDir: Path = Files.createTempDirectory("housing_spark_parquet_")
    val parquetPath = parquetDir.toString()
    sdf.write().mode("overwrite").parquet(parquetPath)
    println("Saved Spark Parquet to: $parquetPath")

    // 4) Read this Parquet with Kotlin DataFrame (Arrow backend)
    // Pass the actual part-*.parquet files instead of the directory
    val parquetFiles = listParquetFilesIfAny(parquetDir)
    val kdf = DataFrame.readParquet(*parquetFiles)

    // 5) Print out head() for this Kotlin DataFrame
    println("Kotlin DataFrame (head):")
    kdf.head().print()

    // 6) Train a regression model with Spark MLlib
    // Use numeric features only, drop the categorical 'ocean_proximity'
    val labelCol = "median_house_value"
    val candidateFeatureCols = listOf(
        "longitude", "latitude", "housing_median_age", "total_rooms", "total_bedrooms",
        "population", "households", "median_income"
    )

    val colsArray = (candidateFeatureCols + labelCol).map { col(it) }.toTypedArray()
    val sdfNumeric = sdf.select(*colsArray)
        .na().drop()

    val assembler = VectorAssembler()
        .setInputCols(candidateFeatureCols.toTypedArray())
        .setOutputCol("features")

    // Build Pipeline (VectorAssembler -> LinearRegression) and train/test split WITHOUT prebuilt 'features'
    val lr = LinearRegression()
        .setFeaturesCol("features")
        .setLabelCol(labelCol)
        .setFitIntercept(false)
        .setElasticNetParam(0.5)
        .setMaxIter(10)

    val fullPipeline = org.apache.spark.ml.Pipeline().setStages(arrayOf<PipelineStage>(assembler, lr))

    val fullPipelineModel = fullPipeline.fit(sdfNumeric)
    val lrModel = fullPipelineModel.stages()[1] as LinearRegressionModel

    val summary = lrModel.summary()
    println("Training RMSE: ${summary.rootMeanSquaredError()}")
    println("Training r2: ${summary.r2()}")

    // 7) Export model information to Parquet (coefficients per feature + intercept row)
    val coeffs = lrModel.coefficients().toArray()
    val rows =
        candidateFeatureCols.mapIndexed { idx, name -> org.apache.spark.sql.RowFactory.create(name, coeffs[idx]) } +
            listOf(org.apache.spark.sql.RowFactory.create("intercept", lrModel.intercept()))

    val schema = org.apache.spark.sql.types.StructType(
        arrayOf(
            org.apache.spark.sql.types.StructField(
                "term",
                org.apache.spark.sql.types.DataTypes.StringType,
                false,
                org.apache.spark.sql.types.Metadata.empty()
            ),
            org.apache.spark.sql.types.StructField(
                "coefficient",
                org.apache.spark.sql.types.DataTypes.DoubleType,
                false,
                org.apache.spark.sql.types.Metadata.empty()
            )
        )
    )

    val modelDf = spark.createDataFrame(rows, schema)
    val modelParquetDir = parquetDir.resolve("model")
    modelDf.write().mode("overwrite").parquet(modelParquetDir.toString())
    println("Saved model summary Parquet to: $modelParquetDir")

    // 8) Read this model Parquet with Kotlin DataFrame and print
    val modelParquetFiles = listParquetFilesIfAny(modelParquetDir)
    val modelKdf = DataFrame.readParquet(*modelParquetFiles)

    println("Model summary Kotlin DataFrame (head):")
    modelKdf.head().print()

    // 9) Save the entire PipelineModel using the standard Spark ML mechanism
    // The model is already fitted above; just save it.
    val pipelinePath = parquetDir.resolve("pipeline_model_spark").toString()
    fullPipelineModel.write().overwrite().save(pipelinePath)
    println("Step 9: Saved PipelineModel to: $pipelinePath")

    // 10) Inspect pipeline internals using Kotlin DataFrame from concrete paths (no directory walking)
    // IMPORTANT (why this is not the most convenient way to export/import):
    // - The ML writer saves a directory with mixed JSON (metadata) and Parquet (model data).
    // - Internal folder names for stages include indexes and algorithm/uids (e.g., "0_VectorAssembler_xxx", "1_LinearRegressionModel_xxx"),
    //   which are not guaranteed to be stable across Spark versions.
    // - Reading internals is suitable only for inspection/exploration. For reuse, prefer PipelineModel.load();
    //   for portable/tabular exchange, write an explicit summary DataFrame.
    //
    // Concrete layout this demo relies on:
    //   $pipelinePath/metadata/
    //   $pipelinePath/stages/0_*/metadata/, $pipelinePath/stages/0_*/data/
    //   $pipelinePath/stages/1_*/metadata/, $pipelinePath/stages/1_*/data/

    val pipelineRoot = Paths.get(pipelinePath)
    val stagesDir = pipelineRoot.resolve("stages")
    val stage0Dir = findStageDir(stagesDir, "0_")
    val stage1Dir = findStageDir(stagesDir, "1_")

    // Accumulate the Kotlin DataFrames found in step 10 so we can optionally join only the existing ones in step 11
    val metaKdfs = mutableListOf<DataFrame<*>>()
    val stageDataKdfs = mutableListOf<DataFrame<*>>()
|
||||
|
||||
// 10.1) Root metadata (JSON) -> read each file one-by-one
|
||||
val rootMetaDir = pipelineRoot.resolve("metadata")
|
||||
val rootMetaFiles = listTextOrJsonFiles(rootMetaDir)
|
||||
for (file in rootMetaFiles) {
|
||||
val df = DataFrame.readJson(file.toFile())
|
||||
println("Step 9: Pipeline root metadata JSON (${file.fileName}) head:")
|
||||
df.head().print()
|
||||
metaKdfs += df
|
||||
}
|
||||
|
||||
// 10.2) Stage 0 (VectorAssembler) metadata/data
|
||||
val stage0MetaDir = stage0Dir.resolve("metadata")
|
||||
for (file in listTextOrJsonFiles(stage0MetaDir)) {
|
||||
val df = DataFrame.readJson(file.toFile())
|
||||
println("Step 9: Stage 0 metadata (${file.fileName}) head:")
|
||||
df.head().print()
|
||||
metaKdfs += df
|
||||
}
|
||||
val stage0DataDir = stage0Dir.resolve("data")
|
||||
val stage0ParquetFiles = listParquetFilesIfAny(stage0DataDir)
|
||||
if (stage0ParquetFiles.isNotEmpty()) {
|
||||
val stage0Kdf = DataFrame.readParquet(*stage0ParquetFiles)
|
||||
println("Step 9: Stage 0 data (Parquet) head:")
|
||||
stage0Kdf.head().print()
|
||||
stageDataKdfs += stage0Kdf
|
||||
} else {
|
||||
println("Step 9: Stage 0 data directory is missing or has no .parquet files, skipping.")
|
||||
}
|
||||
|
||||
// 10.3) Stage 1 (LinearRegressionModel) metadata/data
|
||||
val stage1MetaDir = stage1Dir.resolve("metadata")
|
||||
for (file in listTextOrJsonFiles(stage1MetaDir)) {
|
||||
val df = DataFrame.readJson(file.toFile())
|
||||
println("Step 9: Stage 1 metadata (${file.fileName}) head:")
|
||||
df.head().print()
|
||||
metaKdfs += df
|
||||
}
|
||||
val stage1DataDir = stage1Dir.resolve("data")
|
||||
val stage1ParquetFiles = listParquetFilesIfAny(stage1DataDir)
|
||||
if (stage1ParquetFiles.isNotEmpty()) {
|
||||
val stage1Kdf = DataFrame.readParquet(*stage1ParquetFiles)
|
||||
println("Step 9: Stage 1 data (Parquet) head:")
|
||||
stage1Kdf.head().print()
|
||||
stageDataKdfs += stage1Kdf
|
||||
} else {
|
||||
println("Step 9: Stage 1 data directory is missing or has no .parquet files, skipping.")
|
||||
}
|
||||
|
||||
// 11) Join only existing Kotlin DataFrames and build a plot from the linear model
|
||||
// 11.1) Unified metadata from any JSON files we successfully parsed above
|
||||
val unifiedMeta = if (metaKdfs.isNotEmpty()) metaKdfs.concat() else null
|
||||
if (unifiedMeta != null) {
|
||||
println("Step 10: Unified metadata head:")
|
||||
unifiedMeta.head().print()
|
||||
} else {
|
||||
println("Step 10: No metadata DataFrames were found to unify.")
|
||||
}
|
||||
|
||||
// 11.2) Unified model data: in this demo we already have a single modelKdf (coefficients + intercept)
|
||||
val unifiedModelDf = modelKdf
|
||||
println("Step 10: Unified model data (coefficients) head:")
|
||||
unifiedModelDf.head().print()
|
||||
|
||||
// 11.3) Build a linear plot: dataset points and model line y = a*x + b for the chosen feature
|
||||
// Choose feature 'median_income' vs. label 'median_house_value'
|
||||
val pointsDf = kdf.dropNA("median_income", "median_house_value")
|
||||
|
||||
// Extract slope (coefficient for 'median_income') and intercept from modelKdf
|
||||
val terms = unifiedModelDf.getColumn("term").cast<String>().toList()
|
||||
val coefs = unifiedModelDf.getColumn("coefficient").cast<Double>().toList()
|
||||
val slopeIdx = terms.indexOf("median_income")
|
||||
val interceptIdx = terms.indexOf("intercept")
|
||||
val slopeValue = if (slopeIdx >= 0) coefs[slopeIdx] else 0.0
|
||||
val interceptValue = if (interceptIdx >= 0) coefs[interceptIdx] else 0.0
|
||||
|
||||
println("slope: $slopeValue intercept: $interceptValue")
|
||||
|
||||
// Prepare DF for plotting: add constant columns for abLine mapping
|
||||
val dfForPlot = pointsDf
|
||||
.add("slope_const") { slopeValue }
|
||||
.add("intercept_const") { interceptValue }
|
||||
|
||||
// 11.4) Create Kandy plot using abLine (slope/intercept) and export to a .jpg file
|
||||
val plot = dfForPlot.plot {
|
||||
points {
|
||||
x("median_income")
|
||||
y("median_house_value")
|
||||
// Visual hint: small circles
|
||||
color = Color.LIGHT_BLUE
|
||||
size = 2.0
|
||||
}
|
||||
abLine {
|
||||
// Use linear model parameters: y = slope * x + intercept
|
||||
slope.constant(slopeValue)
|
||||
intercept.constant(interceptValue)
|
||||
color = Color.RED
|
||||
width = 2.0
|
||||
}
|
||||
}
|
||||
|
||||
val targetDir = Paths.get("").normalize()
|
||||
Files.createDirectories(targetDir)
|
||||
val plotPath = targetDir.resolve("linear_model_plot.jpg").toAbsolutePath().toString()
|
||||
|
||||
plot.save(plotPath)
|
||||
println("Step 10: Saved plot to: $plotPath")
|
||||
|
||||
spark.stop()
|
||||
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns .parquet files if the directory exists and contains any; otherwise returns an empty array.
|
||||
* Safe to use for Spark ML stage "data" subfolders that may be absent.
|
||||
*/
|
||||
private fun listParquetFilesIfAny(dir: Path): Array<Path> {
|
||||
if (dir.notExists() || !dir.isDirectory()) return emptyArray()
|
||||
val files: List<Path> = Files.list(dir).use { stream ->
|
||||
stream
|
||||
.filter { Files.isRegularFile(it) && it.fileName.toString().endsWith(".parquet", ignoreCase = true) }
|
||||
.collect(Collectors.toList())
|
||||
}
|
||||
return files.toTypedArray()
|
||||
}
|
||||
|
||||
/**
|
||||
* Finds a stage directory inside 'stagesDir' by prefix (e.g., "0_", "1_").
|
||||
* No extra checks: assumes such a directory exists.
|
||||
*/
|
||||
private fun findStageDir(stagesDir: Path, prefix: String): Path {
|
||||
return Files.list(stagesDir).use { s ->
|
||||
s.filter { Files.isDirectory(it) && it.fileName.toString().startsWith(prefix) }
|
||||
.findFirst().get()
|
||||
}
|
||||
}
|
||||
|
||||
private fun listTextOrJsonFiles(dir: Path): List<Path> {
|
||||
return Files.list(dir).use { s ->
|
||||
s.filter {
|
||||
Files.isRegularFile(it) &&
|
||||
(it.fileName.toString().endsWith(".json", ignoreCase = true) ||
|
||||
it.fileName.toString().endsWith(".txt", ignoreCase = true))
|
||||
}.collect(Collectors.toList())
|
||||
}
|
||||
}
|
||||
|
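The model-export step above (step 7) boils down to zipping feature names with fitted coefficients and appending an "intercept" row. A minimal, library-free sketch of that arithmetic, with made-up illustrative values (the helper name `modelSummaryRows` is hypothetical, not part of the example project):

```kotlin
// Pair each feature name with its fitted coefficient and append an
// "intercept" row, producing the (term, coefficient) table that the
// example writes to Parquet. Values below are made up for illustration.
fun modelSummaryRows(
    featureNames: List<String>,
    coefficients: DoubleArray,
    intercept: Double,
): List<Pair<String, Double>> =
    featureNames.mapIndexed { idx, name -> name to coefficients[idx] } +
        ("intercept" to intercept)

fun main() {
    val rows = modelSummaryRows(
        featureNames = listOf("median_income", "housing_median_age"),
        coefficients = doubleArrayOf(41793.85, 1153.02),
        intercept = 0.0, // setFitIntercept(false) above forces a zero intercept
    )
    rows.forEach { (term, coef) -> println("$term -> $coef") }
}
```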
After Width: | Height: | Size: 36 KiB |
@@ -0,0 +1,52 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget

plugins {
    application
    kotlin("jvm")

    id("org.jetbrains.kotlinx.dataframe")

    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenCentral()
    mavenLocal() // in case of local dataframe development
}

application.mainClass = "org.jetbrains.kotlinx.dataframe.examples.titanic.ml.TitanicKt"

dependencies {
    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
    implementation(project(":"))

    // note: needs to target Java 11 for these dependencies
    implementation("org.jetbrains.kotlinx:kotlin-deeplearning-api:0.5.2")
    implementation("org.jetbrains.kotlinx:kotlin-deeplearning-impl:0.5.2")
    implementation("org.jetbrains.kotlinx:kotlin-deeplearning-tensorflow:0.5.2")
    implementation("org.jetbrains.kotlinx:kotlin-deeplearning-dataset:0.5.2")
}

dataframes {
    schema {
        data = "src/main/resources/titanic.csv"
        name = "org.jetbrains.kotlinx.dataframe.examples.titanic.ml.Passenger"
        csvOptions {
            delimiter = ';'
        }
    }
}

kotlin {
    compilerOptions {
        jvmTarget = JvmTarget.JVM_11
        freeCompilerArgs.add("-Xjdk-release=11")
    }
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_11.toString()
    targetCompatibility = JavaVersion.VERSION_11.toString()
    options.release.set(11)
}
@@ -0,0 +1,95 @@
package org.jetbrains.kotlinx.dataframe.examples.titanic.ml

import org.jetbrains.kotlinx.dataframe.ColumnSelector
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dl.api.core.Sequential
import org.jetbrains.kotlinx.dl.api.core.activation.Activations
import org.jetbrains.kotlinx.dl.api.core.initializer.HeNormal
import org.jetbrains.kotlinx.dl.api.core.initializer.Zeros
import org.jetbrains.kotlinx.dl.api.core.layer.core.Dense
import org.jetbrains.kotlinx.dl.api.core.layer.core.Input
import org.jetbrains.kotlinx.dl.api.core.loss.Losses
import org.jetbrains.kotlinx.dl.api.core.metric.Metrics
import org.jetbrains.kotlinx.dl.api.core.optimizer.Adam
import org.jetbrains.kotlinx.dl.dataset.OnHeapDataset
import java.util.Locale

private const val SEED = 12L
private const val TEST_BATCH_SIZE = 100
private const val EPOCHS = 50
private const val TRAINING_BATCH_SIZE = 50

private val model = Sequential.of(
    Input(9),
    Dense(50, Activations.Relu, kernelInitializer = HeNormal(SEED), biasInitializer = Zeros()),
    Dense(50, Activations.Relu, kernelInitializer = HeNormal(SEED), biasInitializer = Zeros()),
    Dense(2, Activations.Linear, kernelInitializer = HeNormal(SEED), biasInitializer = Zeros())
)

fun main() {
    // Set Locale for correct number parsing
    Locale.setDefault(Locale.FRANCE)

    val df = Passenger.readCsv()

    // Calculating imputing values
    val (train, test) = df
        // imputing
        .fillNulls { sibsp and parch and age and fare }.perCol { it.mean() }
        .fillNulls { sex }.with { "female" }
        // one-hot encoding
        .pivotMatches { pclass and sex }
        // feature extraction
        .select { survived and pclass and sibsp and parch and age and fare and sex }
        .shuffle()
        .toTrainTest(0.7) { survived }

    model.use {
        it.compile(
            optimizer = Adam(),
            loss = Losses.SOFT_MAX_CROSS_ENTROPY_WITH_LOGITS,
            metric = Metrics.ACCURACY
        )

        it.summary()
        it.fit(dataset = train, epochs = EPOCHS, batchSize = TRAINING_BATCH_SIZE)

        val accuracy = it.evaluate(dataset = test, batchSize = TEST_BATCH_SIZE).metrics[Metrics.ACCURACY]

        println("Accuracy: $accuracy")
    }
}

fun <T> DataFrame<T>.toTrainTest(
    trainRatio: Double,
    yColumn: ColumnSelector<T, Number>,
): Pair<OnHeapDataset, OnHeapDataset> =
    toOnHeapDataset(yColumn)
        .split(trainRatio)

private fun <T> DataFrame<T>.toOnHeapDataset(yColumn: ColumnSelector<T, Number>): OnHeapDataset =
    OnHeapDataset.create(
        dataframe = this,
        yColumn = yColumn,
    )

private fun <T> OnHeapDataset.Companion.create(
    dataframe: DataFrame<T>,
    yColumn: ColumnSelector<T, Number>,
): OnHeapDataset {
    fun extractX(): Array<FloatArray> =
        dataframe.remove(yColumn)
            .convert { colsAtAnyDepth().filter { !it.isColumnGroup() } }.toFloat()
            .merge { colsAtAnyDepth().colsOf<Float>() }.by { it.toFloatArray() }.into("X")
            .getColumn("X").cast<FloatArray>().toTypedArray()

    fun extractY(): FloatArray = dataframe.get(yColumn).toFloatArray()

    return create(
        ::extractX,
        ::extractY,
    )
}
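The `toTrainTest(0.7)` call above shuffles the rows and then cuts them at a ratio; `OnHeapDataset.split` performs the actual cut. A library-free sketch of just the split arithmetic (the extension name `trainTestSplit` is hypothetical, introduced here for illustration):

```kotlin
// Cut a list at floor(n * trainRatio): the first part is the training set,
// the rest is the test set. Shuffling should happen before this cut.
fun <T> List<T>.trainTestSplit(trainRatio: Double): Pair<List<T>, List<T>> {
    val cut = (size * trainRatio).toInt()
    return take(cut) to drop(cut)
}

fun main() {
    val rows = (1..10).toList()
    val (train, test) = rows.trainTestSplit(0.7)
    println(train.size) // 7
    println(test.size)  // 3
}
```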
@@ -0,0 +1,43 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget

plugins {
    application
    kotlin("jvm")

    // uses the 'old' Gradle plugin instead of the compiler plugin for now
    id("org.jetbrains.kotlinx.dataframe")

    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenLocal() // in case of local dataframe development
    mavenCentral()
}

dependencies {
    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
    implementation(project(":"))

    // Exposed + SQLite database support
    implementation(libs.sqlite)
    implementation(libs.exposed.core)
    implementation(libs.exposed.kotlin.datetime)
    implementation(libs.exposed.jdbc)
    implementation(libs.exposed.json)
    implementation(libs.exposed.money)
}

kotlin {
    compilerOptions {
        jvmTarget = JvmTarget.JVM_1_8
        freeCompilerArgs.add("-Xjdk-release=8")
    }
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_1_8.toString()
    targetCompatibility = JavaVersion.VERSION_1_8.toString()
    options.release.set(8)
}
@@ -0,0 +1,107 @@
package org.jetbrains.kotlinx.dataframe.examples.exposed

import org.jetbrains.exposed.v1.core.BiCompositeColumn
import org.jetbrains.exposed.v1.core.Column
import org.jetbrains.exposed.v1.core.Expression
import org.jetbrains.exposed.v1.core.ExpressionAlias
import org.jetbrains.exposed.v1.core.ResultRow
import org.jetbrains.exposed.v1.core.Table
import org.jetbrains.exposed.v1.jdbc.Query
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.convertTo
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.dataframe.codeGen.NameNormalizer
import org.jetbrains.kotlinx.dataframe.impl.schema.DataFrameSchemaImpl
import org.jetbrains.kotlinx.dataframe.schema.ColumnSchema
import org.jetbrains.kotlinx.dataframe.schema.DataFrameSchema
import kotlin.reflect.KProperty1
import kotlin.reflect.full.isSubtypeOf
import kotlin.reflect.full.memberProperties
import kotlin.reflect.typeOf

/**
 * Retrieves all columns of any [Iterable][Iterable]`<`[ResultRow][ResultRow]`>`, like [Query][Query],
 * from Exposed row by row and converts the resulting [Map] into a [DataFrame], cast to type [T].
 *
 * In notebooks, the untyped version works just as well due to runtime inference :)
 */
inline fun <reified T : Any> Iterable<ResultRow>.convertToDataFrame(): DataFrame<T> =
    convertToDataFrame().convertTo<T>()

/**
 * Retrieves all columns of an [Iterable][Iterable]`<`[ResultRow][ResultRow]`>` from Exposed, like [Query][Query],
 * row by row and converts the resulting [Map] of lists into a [DataFrame] by calling
 * [Map.toDataFrame].
 */
@JvmName("convertToAnyFrame")
fun Iterable<ResultRow>.convertToDataFrame(): AnyFrame {
    val map = mutableMapOf<String, MutableList<Any?>>()
    for (row in this) {
        for (expression in row.fieldIndex.keys) {
            map.getOrPut(expression.readableName) {
                mutableListOf()
            } += row[expression]
        }
    }
    return map.toDataFrame()
}

/**
 * Retrieves a simple column name from [this] [Expression].
 *
 * Might need to be expanded with multiple types of [Expression].
 */
val Expression<*>.readableName: String
    get() = when (this) {
        is Column<*> -> name
        is ExpressionAlias<*> -> alias
        is BiCompositeColumn<*, *, *> -> getRealColumns().joinToString("_") { it.readableName }
        else -> toString()
    }

/**
 * Creates a [DataFrameSchema] from the declared [Table] instance.
 *
 * This is not needed for conversion, but it can be useful to create a DataFrame [@DataSchema][DataSchema] instance.
 *
 * @param columnNameToAccessor Optional [MutableMap] which will be filled with entries mapping
 *   the SQL column name to the accessor name from the [Table].
 *   This can be used to define a [NameNormalizer] later.
 * @see toDataFrameSchemaWithNameNormalizer
 */
@Suppress("UNCHECKED_CAST")
fun Table.toDataFrameSchema(columnNameToAccessor: MutableMap<String, String> = mutableMapOf()): DataFrameSchema {
    // we use reflection to go over all `Column<*>` properties in the Table object
    val columns = this::class.memberProperties
        .filter { it.returnType.isSubtypeOf(typeOf<Column<*>>()) }
        .associate { prop ->
            prop as KProperty1<Table, Column<*>>

            // retrieve the SQL column name
            val columnName = prop.get(this).name
            // store the SQL column name together with the accessor name in the map
            columnNameToAccessor[columnName] = prop.name

            // get the column type from `val a: Column<Type>`
            val type = prop.returnType.arguments.first().type!!

            // and we add the name and column schema type to the `columns` map :)
            columnName to ColumnSchema.Value(type)
        }
    return DataFrameSchemaImpl(columns)
}

/**
 * Creates a [DataFrameSchema] from the declared [Table] instance with a [NameNormalizer] to
 * convert the SQL column names to the corresponding Kotlin property names.
 *
 * This is not needed for conversion, but it can be useful to create a DataFrame [@DataSchema][DataSchema] instance.
 *
 * @see toDataFrameSchema
 */
fun Table.toDataFrameSchemaWithNameNormalizer(): Pair<DataFrameSchema, NameNormalizer> {
    val columnNameToAccessor = mutableMapOf<String, String>()
    // note: the map must be passed in; otherwise it stays empty and the normalizer becomes a no-op
    return Pair(toDataFrameSchema(columnNameToAccessor), NameNormalizer { columnNameToAccessor[it] ?: it })
}
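The accumulation in `convertToDataFrame` above is a row-to-column pivot: each result row contributes one value to every column's list. A minimal, library-free sketch of that pattern, with plain maps standing in for Exposed rows (the helper name `rowsToColumns` and the sample data are hypothetical):

```kotlin
// Pivot row-oriented records into per-column lists, mirroring the
// getOrPut-based accumulation used by convertToDataFrame above.
fun rowsToColumns(rows: List<Map<String, Any?>>): Map<String, List<Any?>> {
    val columns = mutableMapOf<String, MutableList<Any?>>()
    for (row in rows) {
        for ((name, value) in row) {
            columns.getOrPut(name) { mutableListOf() } += value
        }
    }
    return columns
}

fun main() {
    val rows = listOf(
        mapOf("Country" to "Brazil", "CustomerId" to 1),
        mapOf("Country" to "Canada", "CustomerId" to 2),
    )
    val columns = rowsToColumns(rows)
    println(columns["Country"])    // [Brazil, Canada]
    println(columns["CustomerId"]) // [1, 2]
}
```

From here, a `Map<String, List<Any?>>` is exactly the shape that `Map.toDataFrame` consumes.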
@@ -0,0 +1,96 @@
package org.jetbrains.kotlinx.dataframe.examples.exposed

import org.jetbrains.exposed.v1.core.Column
import org.jetbrains.exposed.v1.core.SortOrder
import org.jetbrains.exposed.v1.core.count
import org.jetbrains.exposed.v1.jdbc.Database
import org.jetbrains.exposed.v1.jdbc.SchemaUtils
import org.jetbrains.exposed.v1.jdbc.batchInsert
import org.jetbrains.exposed.v1.jdbc.deleteAll
import org.jetbrains.exposed.v1.jdbc.select
import org.jetbrains.exposed.v1.jdbc.selectAll
import org.jetbrains.exposed.v1.jdbc.transactions.transaction
import org.jetbrains.kotlinx.dataframe.api.asSequence
import org.jetbrains.kotlinx.dataframe.api.count
import org.jetbrains.kotlinx.dataframe.api.describe
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.sortByDesc
import org.jetbrains.kotlinx.dataframe.size
import java.io.File

/**
 * Describes a simple bridge between [Exposed](https://www.jetbrains.com/exposed/) and DataFrame!
 */
fun main() {
    // defining where to find our SQLite database for Exposed
    val resourceDb = "chinook.db"
    val dbPath = File(object {}.javaClass.classLoader.getResource(resourceDb)!!.toURI()).absolutePath
    val db = Database.connect(url = "jdbc:sqlite:$dbPath", driver = "org.sqlite.JDBC")

    // let's read the database!
    val df = transaction(db) {
        // addLogger(StdOutSqlLogger) // enable if you want to see verbose logs

        // tables in Exposed need to be defined, see tables.kt
        SchemaUtils.create(Customers, Artists, Albums)

        println()

        // In Exposed, we can write queries like this.
        // Here, we count per country how many customers there are and print the results:
        Customers
            .select(Customers.country, Customers.customerId.count())
            .groupBy(Customers.country)
            .orderBy(Customers.customerId.count() to SortOrder.DESC)
            .forEach {
                println("${it[Customers.country]}: ${it[Customers.customerId.count()]} customers")
            }

        println()

        // Perform the specific query you want to read into the DataFrame.
        // Note: DataFrames are in-memory structures, so don't make it too large if you don't have the RAM ;)
        val query = Customers.selectAll() // .where { Customers.company.isNotNull() }

        println()

        // read and convert the query to a typed DataFrame
        // see compatibilityLayer.kt for how we created convertToDataFrame<>()
        // and see tables.kt for how we created DfCustomers!
        query.convertToDataFrame<DfCustomers>()
    }

    println(df.size())

    // now that we have a DataFrame, we can perform DataFrame operations,
    // like doing the same operation as we did in Exposed above
    df.groupBy { country }.count()
        .sortByDesc { "count"<Int>() }
        .print(columnTypes = true, borders = true)

    // or just general statistics
    df.describe()
        .print(columnTypes = true, borders = true)

    // or make plots using Kandy! It's all up to you

    // writing a DataFrame back into an SQL database with Exposed can also be done easily!
    transaction(db) {
        // addLogger(StdOutSqlLogger) // enable if you want to see verbose logs

        // first delete the original contents
        Customers.deleteAll()

        println()

        // batch-insert our dataframe back into the SQL database as a sequence of rows
        Customers.batchInsert(df.asSequence()) { dfRow ->
            // we simply go over each value in the row and put it in the right place in the Exposed statement
            for (column in Customers.columns) {
                @Suppress("UNCHECKED_CAST")
                this[column as Column<Any?>] = dfRow[column.name]
            }
        }
    }
}
@@ -0,0 +1,97 @@
package org.jetbrains.kotlinx.dataframe.examples.exposed

import org.jetbrains.exposed.v1.core.Column
import org.jetbrains.exposed.v1.core.Table
import org.jetbrains.kotlinx.dataframe.annotations.ColumnName
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.generateDataClasses
import org.jetbrains.kotlinx.dataframe.api.print

object Albums : Table() {
    val albumId: Column<Int> = integer("AlbumId").autoIncrement()
    val title: Column<String> = varchar("Title", 160)
    val artistId: Column<Int> = integer("ArtistId")

    override val primaryKey = PrimaryKey(albumId)
}

object Artists : Table() {
    val artistId: Column<Int> = integer("ArtistId").autoIncrement()
    val name: Column<String> = varchar("Name", 120)

    override val primaryKey = PrimaryKey(artistId)
}

object Customers : Table() {
    val customerId: Column<Int> = integer("CustomerId").autoIncrement()
    val firstName: Column<String> = varchar("FirstName", 40)
    val lastName: Column<String> = varchar("LastName", 20)
    val company: Column<String?> = varchar("Company", 80).nullable()
    val address: Column<String?> = varchar("Address", 70).nullable()
    val city: Column<String?> = varchar("City", 40).nullable()
    val state: Column<String?> = varchar("State", 40).nullable()
    val country: Column<String?> = varchar("Country", 40).nullable()
    val postalCode: Column<String?> = varchar("PostalCode", 10).nullable()
    val phone: Column<String?> = varchar("Phone", 24).nullable()
    val fax: Column<String?> = varchar("Fax", 24).nullable()
    val email: Column<String> = varchar("Email", 60)
    val supportRepId: Column<Int?> = integer("SupportRepId").nullable()

    override val primaryKey = PrimaryKey(customerId)
}

/**
 * Exposed requires you to provide [Table] instances to
 * provide type-safe access to your columns and data.
 *
 * While DataFrame can infer types at runtime, which is enough for Kotlin Notebook,
 * to get type-safe access at compile time, we need to define a [@DataSchema][DataSchema].
 *
 * This is what we created the [toDataFrameSchema] function for!
 */
fun main() {
    val (schema, nameNormalizer) = Customers.toDataFrameSchemaWithNameNormalizer()

    // checking whether the schema is converted correctly.
    // schema.print()

    // printing a @DataSchema data class to copy-paste into the code.
    // we use a NameNormalizer to let DataFrame generate the same accessors as in the Table
    // while keeping the correct column names
    schema.generateDataClasses(
        markerName = "DfCustomers",
        nameNormalizer = nameNormalizer,
    ).print()
}

// created by Customers.toDataFrameSchema()
// The same can be done for the other tables
@DataSchema
data class DfCustomers(
    @ColumnName("Address")
    val address: String?,
    @ColumnName("City")
    val city: String?,
    @ColumnName("Company")
    val company: String?,
    @ColumnName("Country")
    val country: String?,
    @ColumnName("CustomerId")
    val customerId: Int,
    @ColumnName("Email")
    val email: String,
    @ColumnName("Fax")
    val fax: String?,
    @ColumnName("FirstName")
    val firstName: String,
    @ColumnName("LastName")
    val lastName: String,
    @ColumnName("Phone")
    val phone: String?,
    @ColumnName("PostalCode")
    val postalCode: String?,
    @ColumnName("State")
    val state: String?,
    @ColumnName("SupportRepId")
    val supportRepId: Int?,
)
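The `NameNormalizer` used above maps SQL column names (e.g. `FirstName`) to the `Table` accessor names (e.g. `firstName`), falling back to the original name when no mapping exists. A tiny stand-in sketch of that lookup-with-identity-fallback, without the DataFrame types (the function name `normalizer` is hypothetical):

```kotlin
// Stand-in for NameNormalizer { columnNameToAccessor[it] ?: it }:
// return the mapped accessor name, or the input name unchanged.
fun normalizer(columnNameToAccessor: Map<String, String>): (String) -> String =
    { name -> columnNameToAccessor[name] ?: name }

fun main() {
    val mapping = mapOf("FirstName" to "firstName", "SupportRepId" to "supportRepId")
    val normalize = normalizer(mapping)
    println(normalize("FirstName")) // firstName
    println(normalize("Email"))     // Email (identity fallback: not in the map)
}
```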
@@ -0,0 +1,43 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget

plugins {
    application
    kotlin("jvm")

    // uses the 'old' Gradle plugin instead of the compiler plugin for now
    id("org.jetbrains.kotlinx.dataframe")

    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenLocal() // in case of local dataframe development
    mavenCentral()
}

dependencies {
    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
    implementation(project(":"))

    // Hibernate + H2 + HikariCP (for Hibernate example)
    implementation(libs.hibernate.core)
    implementation(libs.hibernate.hikaricp)
    implementation(libs.hikaricp)

    implementation(libs.h2db)
    implementation(libs.sl4jsimple)
}

kotlin {
    compilerOptions {
        jvmTarget = JvmTarget.JVM_11
        freeCompilerArgs.add("-Xjdk-release=11")
    }
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_11.toString()
    targetCompatibility = JavaVersion.VERSION_11.toString()
    options.release.set(11)
}
@@ -0,0 +1,100 @@
package org.jetbrains.kotlinx.dataframe.examples.hibernate

import jakarta.persistence.Column
import jakarta.persistence.Entity
import jakarta.persistence.GeneratedValue
import jakarta.persistence.GenerationType
import jakarta.persistence.Id
import jakarta.persistence.Table
import org.jetbrains.kotlinx.dataframe.annotations.ColumnName
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema

@Entity
@Table(name = "Albums")
class AlbumsEntity(
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "AlbumId")
    var albumId: Int? = null,

    @Column(name = "Title", length = 160, nullable = false)
    var title: String = "",

    @Column(name = "ArtistId", nullable = false)
    var artistId: Int = 0,
)

@Entity
@Table(name = "Artists")
class ArtistsEntity(
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "ArtistId")
    var artistId: Int? = null,

    @Column(name = "Name", length = 120, nullable = false)
    var name: String = "",
)

@Entity
@Table(name = "Customers")
class CustomersEntity(
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "CustomerId")
    var customerId: Int? = null,

    @Column(name = "FirstName", length = 40, nullable = false)
    var firstName: String = "",

    @Column(name = "LastName", length = 20, nullable = false)
    var lastName: String = "",

    @Column(name = "Company", length = 80)
    var company: String? = null,

    @Column(name = "Address", length = 70)
    var address: String? = null,

    @Column(name = "City", length = 40)
    var city: String? = null,

    @Column(name = "State", length = 40)
    var state: String? = null,

    @Column(name = "Country", length = 40)
    var country: String? = null,

    @Column(name = "PostalCode", length = 10)
    var postalCode: String? = null,

    @Column(name = "Phone", length = 24)
    var phone: String? = null,

    @Column(name = "Fax", length = 24)
    var fax: String? = null,

    @Column(name = "Email", length = 60, nullable = false)
    var email: String = "",

    @Column(name = "SupportRepId")
    var supportRepId: Int? = null,
)

// DataFrame schema to get typed accessors similar to the Exposed example
@DataSchema
data class DfCustomers(
    @ColumnName("Address") val address: String?,
    @ColumnName("City") val city: String?,
    @ColumnName("Company") val company: String?,
    @ColumnName("Country") val country: String?,
    @ColumnName("CustomerId") val customerId: Int,
    @ColumnName("Email") val email: String,
    @ColumnName("Fax") val fax: String?,
    @ColumnName("FirstName") val firstName: String,
    @ColumnName("LastName") val lastName: String,
    @ColumnName("Phone") val phone: String?,
    @ColumnName("PostalCode") val postalCode: String?,
    @ColumnName("State") val state: String?,
    @ColumnName("SupportRepId") val supportRepId: Int?,
)
@@ -0,0 +1,251 @@
package org.jetbrains.kotlinx.dataframe.examples.hibernate

import jakarta.persistence.criteria.CriteriaBuilder
import jakarta.persistence.criteria.CriteriaDelete
import jakarta.persistence.criteria.CriteriaQuery
import jakarta.persistence.criteria.Root
import org.hibernate.FlushMode
import org.hibernate.SessionFactory
import org.hibernate.cfg.Configuration
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.DataRow
import org.jetbrains.kotlinx.dataframe.api.asSequence
import org.jetbrains.kotlinx.dataframe.api.count
import org.jetbrains.kotlinx.dataframe.api.describe
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.sortByDesc
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.dataframe.size

/**
 * Example showing Kotlin DataFrame with Hibernate ORM + an H2 in-memory DB.
 * Mirrors the logic of the Exposed example: load data, convert to a DataFrame, group/describe, write back.
 */
fun main() {
    val sessionFactory: SessionFactory = buildSessionFactory()

    sessionFactory.insertSampleData()

    val df = sessionFactory.loadCustomersAsDataFrame()

    // Pure Hibernate + Criteria API approach for counting customers per country
    println("=== Hibernate + Criteria API Approach ===")
    sessionFactory.countCustomersPerCountryWithHibernate()

    println("\n=== DataFrame Approach ===")
    df.analyzeAndPrintResults()

    sessionFactory.replaceCustomersFromDataFrame(df)

    sessionFactory.close()
}

private fun SessionFactory.insertSampleData() {
    withTransaction { session ->
        // a few artists and albums (minimal, not used further; just demo schema)
        val artist1 = ArtistsEntity(name = "AC/DC")
        val artist2 = ArtistsEntity(name = "Queen")
        session.persist(artist1)
        session.persist(artist2)
        session.flush()

        session.persist(AlbumsEntity(title = "High Voltage", artistId = artist1.artistId!!))
        session.persist(AlbumsEntity(title = "Back in Black", artistId = artist1.artistId!!))
        session.persist(AlbumsEntity(title = "A Night at the Opera", artistId = artist2.artistId!!))

        // customers we'll analyze using DataFrame
        session.persist(
            CustomersEntity(
                firstName = "John",
                lastName = "Doe",
                email = "john.doe@example.com",
                country = "USA",
            ),
        )
        session.persist(
            CustomersEntity(
                firstName = "Jane",
                lastName = "Smith",
                email = "jane.smith@example.com",
                country = "USA",
            ),
        )
        session.persist(
            CustomersEntity(
                firstName = "Alice",
                lastName = "Wang",
                email = "alice.wang@example.com",
                country = "Canada",
            ),
        )
    }
}

private fun SessionFactory.loadCustomersAsDataFrame(): DataFrame<DfCustomers> {
    return withReadOnlyTransaction { session ->
        val criteriaBuilder: CriteriaBuilder = session.criteriaBuilder
        val criteriaQuery: CriteriaQuery<CustomersEntity> = criteriaBuilder.createQuery(CustomersEntity::class.java)
        val root: Root<CustomersEntity> = criteriaQuery.from(CustomersEntity::class.java)
        criteriaQuery.select(root)

        session.createQuery(criteriaQuery)
            .resultList
            .map { c ->
                DfCustomers(
                    address = c.address,
                    city = c.city,
                    company = c.company,
                    country = c.country,
                    customerId = c.customerId ?: -1,
                    email = c.email,
                    fax = c.fax,
                    firstName = c.firstName,
                    lastName = c.lastName,
                    phone = c.phone,
                    postalCode = c.postalCode,
                    state = c.state,
                    supportRepId = c.supportRepId,
                )
            }
            .toDataFrame()
    }
}

/** DTO used for aggregation projection. */
private data class CountryCountDto(
    val country: String,
    val customerCount: Long,
)

/**
 * **Hibernate + Criteria API:**
 * - ✅ Database-level aggregation (efficient)
 * - ✅ Type-safe queries
 * - ❌ Verbose syntax
 * - ❌ Limited to SQL-like operations
 */
private fun SessionFactory.countCustomersPerCountryWithHibernate() {
    withReadOnlyTransaction { session ->
        val cb = session.criteriaBuilder
        val cq: CriteriaQuery<CountryCountDto> = cb.createQuery(CountryCountDto::class.java)
        val root: Root<CustomersEntity> = cq.from(CustomersEntity::class.java)

        val countryPath = root.get<String>("country")
        val idPath = root.get<Int>("customerId") // the entity attribute is Int, not Long

        val countExpr = cb.count(idPath)

        cq.select(
            cb.construct(
                CountryCountDto::class.java,
                countryPath, // country
                countExpr, // customerCount
            ),
        )
        cq.groupBy(countryPath)
        cq.orderBy(cb.desc(countExpr))

        val results = session.createQuery(cq).resultList
        results.forEach { dto ->
            println("${dto.country}: ${dto.customerCount} customers")
        }
    }
}

/**
 * **DataFrame approach:**
 * - ✅ Rich analytical operations
 * - ✅ Fluent, readable API
 * - ✅ Flexible data transformations
 * - ❌ In-memory processing (less efficient for large datasets)
 */
private fun DataFrame<DfCustomers>.analyzeAndPrintResults() {
    println(size())

    // same operation as in the Exposed example: customers per country
    groupBy { country }.count()
        .sortByDesc { "count"<Int>() }
        .print(columnTypes = true, borders = true)

    // general statistics
    describe()
        .print(columnTypes = true, borders = true)
}
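The groupBy → count → sortByDesc pipeline above maps directly onto Kotlin's standard library; a minimal sketch of the same "customers per country" aggregation with no DataFrame dependency (the function name is illustrative, not part of this example project):

```kotlin
// Count occurrences per key and sort descending, conceptually mirroring
// groupBy { country }.count().sortByDesc { "count"<Int>() }.
fun countPerCountry(countries: List<String>): List<Pair<String, Int>> =
    countries
        .groupingBy { it } // like groupBy { country }
        .eachCount()       // like count()
        .toList()
        .sortedByDescending { it.second } // like sortByDesc

fun main() {
    // same sample data as insertSampleData(): two USA customers, one from Canada
    println(countPerCountry(listOf("USA", "USA", "Canada"))) // [(USA, 2), (Canada, 1)]
}
```

Like the DataFrame version (and unlike the Criteria API version), this aggregates in memory rather than in the database.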

private fun SessionFactory.replaceCustomersFromDataFrame(df: DataFrame<DfCustomers>) {
    withTransaction { session ->
        val criteriaBuilder: CriteriaBuilder = session.criteriaBuilder
        val criteriaDelete: CriteriaDelete<CustomersEntity> =
            criteriaBuilder.createCriteriaDelete(CustomersEntity::class.java)
        criteriaDelete.from(CustomersEntity::class.java)

        session.createMutationQuery(criteriaDelete).executeUpdate()
    }

    withTransaction { session ->
        df.asSequence().forEach { row ->
            session.persist(row.toCustomersEntity())
        }
    }
}

private fun DataRow<DfCustomers>.toCustomersEntity(): CustomersEntity {
    return CustomersEntity(
        customerId = null, // let DB generate
        firstName = this.firstName,
        lastName = this.lastName,
        company = this.company,
        address = this.address,
        city = this.city,
        state = this.state,
        country = this.country,
        postalCode = this.postalCode,
        phone = this.phone,
        fax = this.fax,
        email = this.email,
        supportRepId = this.supportRepId,
    )
}

private inline fun <T> SessionFactory.withSession(block: (session: org.hibernate.Session) -> T): T {
    return openSession().use(block)
}

private inline fun SessionFactory.withTransaction(block: (session: org.hibernate.Session) -> Unit) {
    withSession { session ->
        session.beginTransaction()
        try {
            block(session)
            session.transaction.commit()
        } catch (e: Exception) {
            session.transaction.rollback()
            throw e
        }
    }
}

/** Read-only transaction helper for SELECT queries to minimize overhead. */
private inline fun <T> SessionFactory.withReadOnlyTransaction(block: (session: org.hibernate.Session) -> T): T {
    return withSession { session ->
        session.beginTransaction()
        // Minimize overhead for read operations
        session.isDefaultReadOnly = true
        session.hibernateFlushMode = FlushMode.MANUAL
        try {
            val result = block(session)
            session.transaction.commit()
            result
        } catch (e: Exception) {
            session.transaction.rollback()
            throw e
        }
    }
}

private fun buildSessionFactory(): SessionFactory {
    // Load configuration from resources/hibernate/hibernate.cfg.xml
    return Configuration().configure("hibernate/hibernate.cfg.xml").buildSessionFactory()
}
@@ -0,0 +1,32 @@
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
    "-//Hibernate/Hibernate Configuration DTD 5.3//EN"
    "http://hibernate.org/dtd/hibernate-configuration-5.3.dtd">
<hibernate-configuration>
    <session-factory>
        <!-- H2 in-memory -->
        <property name="hibernate.connection.driver_class">org.h2.Driver</property>
        <property name="hibernate.connection.url">jdbc:h2:mem:testdb;DB_CLOSE_DELAY=-1</property>
        <property name="hibernate.connection.username">sa</property>
        <property name="hibernate.connection.password"></property>

        <!-- Connection pool: HikariCP via Hibernate integration -->
        <property name="hibernate.connection.provider_class">org.hibernate.hikaricp.internal.HikariCPConnectionProvider</property>
        <property name="hibernate.hikari.maximumPoolSize">5</property>

        <!-- Hibernate Dialect -->
        <property name="hibernate.dialect">org.hibernate.dialect.H2Dialect</property>

        <!-- Automatic schema generation -->
        <property name="hibernate.hbm2ddl.auto">create-drop</property>

        <!-- Logging -->
        <property name="hibernate.show_sql">true</property>
        <property name="hibernate.format_sql">true</property>

        <!-- Mappings -->
        <mapping class="org.jetbrains.kotlinx.dataframe.examples.hibernate.CustomersEntity"/>
        <mapping class="org.jetbrains.kotlinx.dataframe.examples.hibernate.ArtistsEntity"/>
        <mapping class="org.jetbrains.kotlinx.dataframe.examples.hibernate.AlbumsEntity"/>
    </session-factory>
</hibernate-configuration>
@@ -0,0 +1,39 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget

plugins {
    application
    kotlin("jvm")

    // uses the 'old' Gradle plugin instead of the compiler plugin for now
    id("org.jetbrains.kotlinx.dataframe")

    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenLocal() // in case of local dataframe development
    mavenCentral()
}

dependencies {
    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
    implementation(project(":"))

    // multik support
    implementation(libs.multik.core)
    implementation(libs.multik.default)
}

kotlin {
    compilerOptions {
        jvmTarget = JvmTarget.JVM_1_8
        freeCompilerArgs.add("-Xjdk-release=8")
    }
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_1_8.toString()
    targetCompatibility = JavaVersion.VERSION_1_8.toString()
    options.release.set(8)
}
@@ -0,0 +1,374 @@
@file:OptIn(ExperimentalTypeInference::class)

package org.jetbrains.kotlinx.dataframe.examples.multik

import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.ColumnSelector
import org.jetbrains.kotlinx.dataframe.ColumnsSelector
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.ValueProperty
import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.colsOf
import org.jetbrains.kotlinx.dataframe.api.column
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.api.getColumn
import org.jetbrains.kotlinx.dataframe.api.getColumns
import org.jetbrains.kotlinx.dataframe.api.map
import org.jetbrains.kotlinx.dataframe.api.named
import org.jetbrains.kotlinx.dataframe.api.toColumn
import org.jetbrains.kotlinx.dataframe.api.toColumnGroup
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.dataframe.columns.BaseColumn
import org.jetbrains.kotlinx.dataframe.columns.ColumnGroup
import org.jetbrains.kotlinx.multik.api.mk
import org.jetbrains.kotlinx.multik.api.ndarray
import org.jetbrains.kotlinx.multik.ndarray.complex.Complex
import org.jetbrains.kotlinx.multik.ndarray.data.D1Array
import org.jetbrains.kotlinx.multik.ndarray.data.D2Array
import org.jetbrains.kotlinx.multik.ndarray.data.D3Array
import org.jetbrains.kotlinx.multik.ndarray.data.MultiArray
import org.jetbrains.kotlinx.multik.ndarray.data.NDArray
import org.jetbrains.kotlinx.multik.ndarray.data.get
import org.jetbrains.kotlinx.multik.ndarray.operations.toList
import org.jetbrains.kotlinx.multik.ndarray.operations.toListD2
import kotlin.experimental.ExperimentalTypeInference
import kotlin.reflect.KClass
import kotlin.reflect.KType
import kotlin.reflect.full.isSubtypeOf
import kotlin.reflect.typeOf

// region 1D

/** Converts a one-dimensional array ([D1Array]) to a [DataColumn] with optional [name]. */
inline fun <reified N> D1Array<N>.convertToColumn(name: String = ""): DataColumn<N> {
    // we can simply convert the 1D array to a typed list and create a typed column from it
    // by using the reified type parameter, DataFrame needs to do no inference :)
    val values = this.toList()
    return column<N>(values) named name
}

/**
 * Converts a one-dimensional array ([D1Array]) of type [N] into a DataFrame.
 * The resulting DataFrame contains a single column named "value", where each element of the array becomes a row in the DataFrame.
 *
 * @return a DataFrame where each element of the source array is represented as a row in a column named "value" under the schema [ValueProperty].
 */
@JvmName("convert1dArrayToDataFrame")
inline fun <reified N> D1Array<N>.convertToDataFrame(): DataFrame<ValueProperty<N>> {
    // do the conversion like above, but name the column "value"...
    val column = this.convertToColumn(ValueProperty<*>::value.name)
    // ...so we can cast it to a ValueProperty DataFrame
    return dataFrameOf(column).cast<ValueProperty<N>>()
}

/** Converts a [DataColumn] to a one-dimensional array ([D1Array]). */
@JvmName("convertNumberColumnToMultik")
inline fun <reified N> DataColumn<N>.convertToMultik(): D1Array<N> where N : Number, N : Comparable<N> {
    // we can convert our column to a typed list again to convert it to a multik array
    val values = this.toList()
    return mk.ndarray(values)
}

/** Converts a [DataColumn] to a one-dimensional array ([D1Array]). */
@JvmName("convertComplexColumnToMultik")
inline fun <reified N : Complex> DataColumn<N>.convertToMultik(): D1Array<N> {
    // we can convert our column to a typed list again to convert it to a multik array
    val values = this.toList()
    return mk.ndarray(values)
}

/** Converts a [DataColumn] selected by [column] to a one-dimensional array ([D1Array]). */
@JvmName("convertNumberColumnFromDfToMultik")
@OverloadResolutionByLambdaReturnType
inline fun <T, reified N> DataFrame<T>.convertToMultik(
    crossinline column: ColumnSelector<T, N>,
): D1Array<N>
    where N : Number, N : Comparable<N> {
    // use the selector to get the column from this DataFrame and convert it
    val col = this.getColumn { column(it) }
    return col.convertToMultik()
}

/** Converts a [DataColumn] selected by [column] to a one-dimensional array ([D1Array]). */
@JvmName("convertComplexColumnFromDfToMultik")
@OverloadResolutionByLambdaReturnType
inline fun <T, reified N : Complex> DataFrame<T>.convertToMultik(crossinline column: ColumnSelector<T, N>): D1Array<N> {
    // use the selector to get the column from this DataFrame and convert it
    val col = this.getColumn { column(it) }
    return col.convertToMultik()
}

// endregion

// region 2D

/**
 * Converts a two-dimensional array ([D2Array]) to a DataFrame.
 * It will contain `shape[0]` rows and `shape[1]` columns.
 *
 * Column names can be specified using the [columnNameGenerator] lambda.
 *
 * The conversion enforces that `multikArray[x][y] == dataframe[x][y]`
 */
@JvmName("convert2dArrayToDataFrame")
inline fun <reified N> D2Array<N>.convertToDataFrame(columnNameGenerator: (Int) -> String = { "col$it" }): AnyFrame {
    // Turning the 2D array into a list of typed columns first, no inference needed
    val columns: List<DataColumn<N>> = List(shape[1]) { i ->
        this[0..<shape[0], i] // get all cells of column i
            .toList()
            .toColumn<N>(name = columnNameGenerator(i))
    }
    // and make a DataFrame from it
    return columns.toDataFrame()
}
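The `multikArray[x][y] == dataframe[x][y]` guarantee above comes from slicing the array column by column; the same extraction can be sketched with plain nested lists, without multik or DataFrame (the function name is illustrative only):

```kotlin
// Build shape[1] columns from a row-major 2D structure:
// column i is cell i of every row, so result[i][x] == rows[x][i].
fun <T> columnsOf(rows: List<List<T>>): List<List<T>> =
    List(rows.first().size) { i -> rows.map { row -> row[i] } }

fun main() {
    val rows = listOf(listOf(1, 2, 3), listOf(4, 5, 6)) // a 2 x 3 "array"
    println(columnsOf(rows)) // [[1, 4], [2, 5], [3, 6]]
}
```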

/**
 * Converts a [DataFrame] to a two-dimensional array ([D2Array]).
 * You'll need to specify which columns to convert using the [columns] selector.
 *
 * All selected columns need to be of the same type [N].
 *
 * @see convertToMultikOf
 */
@JvmName("convertNumberColumnsFromDfToMultik")
@OverloadResolutionByLambdaReturnType
inline fun <T, reified N> DataFrame<T>.convertToMultik(
    crossinline columns: ColumnsSelector<T, N>,
): D2Array<N>
    where N : Number, N : Comparable<N> {
    // use the selector to get the columns from this DataFrame and convert them
    val cols = this.getColumns { columns(it) }
    return cols.convertToMultik()
}

/**
 * Converts a [DataFrame] to a two-dimensional array ([D2Array]).
 * You'll need to specify which columns to convert using the [columns] selector.
 *
 * All selected columns need to be of the same type [N].
 *
 * @see convertToMultikOf
 */
@JvmName("convertComplexColumnsFromDfToMultik")
@OverloadResolutionByLambdaReturnType
inline fun <T, reified N : Complex> DataFrame<T>.convertToMultik(
    crossinline columns: ColumnsSelector<T, N>,
): D2Array<N> {
    // use the selector to get the columns from this DataFrame and convert them
    val cols = this.getColumns { columns(it) }
    return cols.convertToMultik()
}

/**
 * Converts a [DataFrame] to a two-dimensional array ([D2Array]).
 *
 * This overload guesses the element type: it only succeeds if all columns
 * in [this] are of the same (supported) type.
 *
 * @see convertToMultikOf
 */
@JvmName("convertToMultikGuess")
fun AnyFrame.convertToMultik(): D2Array<*> {
    val columnTypes = this.columnTypes().distinct()
    val type = columnTypes.singleOrNull() ?: error("expected exactly one column type, but found: $columnTypes")
    return when {
        type == typeOf<Complex>() -> convertToMultik { colsOf<Complex>() }
        type.isSubtypeOf(typeOf<Byte>()) -> convertToMultik { colsOf<Byte>() }
        type.isSubtypeOf(typeOf<Short>()) -> convertToMultik { colsOf<Short>() }
        type.isSubtypeOf(typeOf<Int>()) -> convertToMultik { colsOf<Int>() }
        type.isSubtypeOf(typeOf<Long>()) -> convertToMultik { colsOf<Long>() }
        type.isSubtypeOf(typeOf<Float>()) -> convertToMultik { colsOf<Float>() }
        type.isSubtypeOf(typeOf<Double>()) -> convertToMultik { colsOf<Double>() }
        else -> error("unsupported column type: $type")
    }
}

/**
 * Converts a [DataFrame] to a two-dimensional array ([D2Array]) by taking all
 * columns of type [N].
 *
 * Allows you to write `df.convertToMultikOf<Complex>()`.
 *
 * @see convertToMultik
 */
@JvmName("convertToMultikOfComplex")
@Suppress("LocalVariableName")
inline fun <reified N : Complex> AnyFrame.convertToMultikOf(
    // unused param to avoid overload resolution ambiguity
    _klass: KClass<Complex> = Complex::class,
): D2Array<N> =
    convertToMultik { colsOf<N>() }

/**
 * Converts a [DataFrame] to a two-dimensional array ([D2Array]) by taking all
 * columns of type [N].
 *
 * Allows you to write `df.convertToMultikOf<Int>()`.
 *
 * @see convertToMultik
 */
@JvmName("convertToMultikOfNumber")
@Suppress("LocalVariableName")
inline fun <reified N> AnyFrame.convertToMultikOf(
    // unused param to avoid overload resolution ambiguity
    _klass: KClass<Number> = Number::class,
): D2Array<N> where N : Number, N : Comparable<N> = convertToMultik { colsOf<N>() }

/**
 * Helper function to convert a list of same-typed [DataColumn]s to a two-dimensional array ([D2Array]).
 * We cannot enforce all columns have the same type if we require just a [DataFrame].
 */
@Suppress("UNCHECKED_CAST")
@JvmName("convertNumberColumnsToMultik")
inline fun <reified N> List<DataColumn<N>>.convertToMultik(): D2Array<N> where N : Number, N : Comparable<N> {
    // to get the list of columns as a list of rows, we need to convert them back to a dataframe first,
    // then we can get the values of each row
    val rows = this.toDataFrame().map { row -> row.values() as List<N> }
    return mk.ndarray(rows)
}

/**
 * Helper function to convert a list of same-typed [DataColumn]s to a two-dimensional array ([D2Array]).
 * We cannot enforce all columns have the same type if we require just a [DataFrame].
 */
@Suppress("UNCHECKED_CAST")
@JvmName("convertComplexColumnsToMultik")
inline fun <reified N : Complex> List<DataColumn<N>>.convertToMultik(): D2Array<N> {
    // to get the list of columns as a list of rows, we need to convert them back to a dataframe first,
    // then we can get the values of each row
    val rows = this.toDataFrame().map { row -> row.values() as List<N> }
    return mk.ndarray(rows)
}

// endregion

// region higher dimensions

/**
 * Converts a three-dimensional array ([D3Array]) to a DataFrame.
 * It will contain `shape[0]` rows and `shape[1]` columns containing lists of size `shape[2]`.
 *
 * Column names can be specified using the [columnNameGenerator] lambda.
 *
 * The conversion enforces that `multikArray[x][y][z] == dataframe[x][y][z]`
 */
inline fun <reified N> D3Array<N>.convertToDataFrameWithLists(
    columnNameGenerator: (Int) -> String = { "col$it" },
): AnyFrame {
    val columns: List<DataColumn<List<N>>> = List(shape[1]) { y ->
        this[0..<shape[0], y, 0..<shape[2]] // get all cells of column y, each is a 2d array of size shape[0] x shape[2]
            .toListD2() // get a shape[0]-sized list/column filled with lists of size shape[2]
            .toColumn<List<N>>(name = columnNameGenerator(y))
    }
    return columns.toDataFrame()
}

/**
 * Converts a three-dimensional array ([D3Array]) to a DataFrame.
 * It will contain `shape[0]` rows and `shape[1]` column groups containing `shape[2]` columns each.
 *
 * Column names can be specified using the [columnNameGenerator] lambda.
 *
 * The conversion enforces that `multikArray[x][y][z] == dataframe[x][y][z]`
 */
@JvmName("convert3dArrayToDataFrame")
inline fun <reified N> D3Array<N>.convertToDataFrame(columnNameGenerator: (Int) -> String = { "col$it" }): AnyFrame {
    val columns: List<ColumnGroup<*>> = List(shape[1]) { y ->
        this[0..<shape[0], y, 0..<shape[2]] // get all cells of column y, each is a 2d array of size shape[0] x shape[2]
            .transpose(1, 0) // flip, so we get shape[2] x shape[0]
            .toListD2() // get a shape[2]-sized list filled with lists of size shape[0]
            .mapIndexed { z, list ->
                list.toColumn<N>(name = columnNameGenerator(z))
            } // we get shape[2] columns inside each column group
            .toColumnGroup(name = columnNameGenerator(y))
    }
    return columns.toDataFrame()
}

/**
 * Exploratory recursive function to convert a [MultiArray] of any number of dimensions
 * to a `List<List<...>>` of the same number of dimensions.
 */
fun <T> MultiArray<T, *>.toListDn(): List<*> {
    // Recursive helper function to handle traversal across dimensions
    fun toListRecursive(indices: IntArray): List<*> {
        // If we are at the last dimension (1D case)
        if (indices.size == shape.lastIndex) {
            return List(shape[indices.size]) { i ->
                this[intArrayOf(*indices, i)] // Collect values for this dimension
            }
        }

        // For higher dimensions, recursively process smaller dimensions
        return List(shape[indices.size]) { i ->
            toListRecursive(indices + i) // Add `i` to the current index array
        }
    }
    return toListRecursive(intArrayOf())
}
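The recursion in `toListDn` can be illustrated without multik: the same index-building walk, here over a flat, row-major array with an explicit shape (all names in this sketch are illustrative, not part of the compatibility layer):

```kotlin
// Slice a flat, row-major array into nested lists following `shape`,
// descending one dimension per recursive call, like toListRecursive above.
fun toNestedLists(flat: IntArray, shape: List<Int>): List<*> {
    fun recurse(offset: Int, dims: List<Int>): List<*> {
        // number of flat elements covered by one slice of the current dimension
        val stride = dims.drop(1).fold(1) { acc, d -> acc * d }
        return List(dims.first()) { i ->
            if (dims.size == 1) flat[offset + i] // last dimension: emit values
            else recurse(offset + i * stride, dims.drop(1)) // descend one dimension
        }
    }
    return recurse(0, shape)
}

fun main() {
    // a 2 x 3 array stored flat
    println(toNestedLists(intArrayOf(1, 2, 3, 4, 5, 6), listOf(2, 3))) // [[1, 2, 3], [4, 5, 6]]
}
```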

/**
 * Converts a multidimensional array ([NDArray]) to a DataFrame.
 * Inspired by [toListDn].
 *
 * For a single-dimensional array, it will call [D1Array.convertToDataFrame].
 *
 * Column names can be specified using the [columnNameGenerator] lambda.
 *
 * The conversion enforces that `multikArray[a][b][c][d]... == dataframe[a][b][c][d]...`
 */
@Suppress("UNCHECKED_CAST")
inline fun <reified N> NDArray<N, *>.convertToDataFrameNestedGroups(
    noinline columnNameGenerator: (Int) -> String = { "col$it" },
): AnyFrame {
    if (shape.size == 1) return (this as D1Array<N>).convertToDataFrame()

    // push the first dimension to the end, because this represents the rows in DataFrame,
    // and they are accessed by []'s first
    return transpose(*(1..<dim.d).toList().toIntArray(), 0)
        .convertToDataFrameNestedGroupsRecursive(
            indices = intArrayOf(),
            type = typeOf<N>(), // cannot inline a recursive function, so pass the type explicitly
            columnNameGenerator = columnNameGenerator,
        ).let {
            // we could just cast this to a DataFrame<*>, because a ColumnGroup<*>: DataFrame
            // however, this can sometimes cause issues where instance checks are done at runtime
            // this converts it to an actual DataFrame instance
            dataFrameOf((it as ColumnGroup<*>).columns())
        }
}

/**
 * Recursive helper function to handle traversal across dimensions. Do not call directly,
 * use [convertToDataFrameNestedGroups] instead.
 */
@PublishedApi
internal fun NDArray<*, *>.convertToDataFrameNestedGroupsRecursive(
    indices: IntArray,
    type: KType,
    columnNameGenerator: (Int) -> String,
): BaseColumn<*> {
    // If we are at the last dimension (1D case)
    if (indices.size == shape.lastIndex) {
        return List(shape[indices.size]) { i ->
            this[intArrayOf(*indices, i)] // Collect values for this dimension
        }.let {
            DataColumn.createByType(name = "", values = it, type = type)
        }
    }

    // For higher dimensions, recursively process smaller dimensions
    return List(shape[indices.size]) { i ->
        convertToDataFrameNestedGroupsRecursive(
            indices = indices + i, // Add `i` to the current index array
            type = type,
            columnNameGenerator = columnNameGenerator,
        ).rename(columnNameGenerator(i))
    }.toColumnGroup("")
}

// endregion
@@ -0,0 +1,23 @@
|
||||
package org.jetbrains.kotlinx.dataframe.examples.multik
|
||||
|
||||
import org.jetbrains.kotlinx.dataframe.api.print
|
||||
import org.jetbrains.kotlinx.multik.api.io.readNPY
|
||||
import org.jetbrains.kotlinx.multik.api.mk
|
||||
import org.jetbrains.kotlinx.multik.ndarray.data.D1
|
||||
import java.io.File
|
||||
|
||||
/**
|
||||
* Multik can read/write data from NPY/NPZ files.
|
||||
* We can use this from DataFrame too!
|
||||
*
|
||||
* We use compatibilityLayer.kt for the conversions, check it out for the implementation details of the conversion!
|
||||
*/
|
||||
fun main() {
|
||||
val npyFilename = "a1d.npy"
|
||||
val npyFile = File(object {}.javaClass.classLoader.getResource(npyFilename)!!.toURI())
|
||||
|
||||
val mk1 = mk.readNPY<Long, D1>(npyFile)
|
||||
val df1 = mk1.convertToDataFrame()
|
||||
|
||||
df1.print(borders = true, columnTypes = true)
|
||||
}
|
||||
@@ -0,0 +1,99 @@
package org.jetbrains.kotlinx.dataframe.examples.multik

import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.colsOf
import org.jetbrains.kotlinx.dataframe.api.describe
import org.jetbrains.kotlinx.dataframe.api.mean
import org.jetbrains.kotlinx.dataframe.api.meanFor
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.value
import org.jetbrains.kotlinx.multik.api.mk
import org.jetbrains.kotlinx.multik.api.rand
import org.jetbrains.kotlinx.multik.ndarray.data.get

/**
 * Let's explore some ways we can combine Multik with Kotlin DataFrame.
 *
 * We will use compatibilityLayer.kt for the conversions.
 * Take a look at that file for the implementation details!
 */
fun main() {
    oneDimension()
    twoDimensions()
    higherDimensions()
}

fun oneDimension() {
    // we can convert a 1D ndarray to a column of a DataFrame:
    val mk1 = mk.rand<Double>(50)
    val col1 by mk1.convertToColumn()
    println(col1)

    // or straight to a DataFrame. It will become the `value` column.
    val df1 = mk1.convertToDataFrame()
    println(df1)

    // this allows us to perform any DataFrame operation:
    println(df1.mean { value })
    df1.describe().print(borders = true)

    // we can convert back to Multik:
    val mk2 = df1.convertToMultik { value }
    // or
    df1.value.convertToMultik()

    println(mk2)
}

fun twoDimensions() {
    // we can also convert a 2D ndarray to a DataFrame.
    // This conversion will create columns like "col0", "col1", etc.
    // (careful: when the number of columns is very large, this can cause problems)
    // but it allows access similar to Multik's,
    // i.e., `multikArray[x][y] == dataframe[x][y]`
    val mk1 = mk.rand<Int>(5, 10)
    println(mk1)
    val df = mk1.convertToDataFrame()
    df.print()

    // this allows us to perform any DataFrame operation:
    val means = df.meanFor { ("col0".."col9").cast<Int>() }
    means.print()

    // we can convert back to Multik in multiple ways.
    // Multik can only store one type of data, so we need to specify the type or select
    // only the columns we want:
    val mk2 = df.convertToMultik { colsOf<Int>() }
    // or
    df.convertToMultikOf<Int>()
    // or, if all columns are of the same type:
    df.convertToMultik()

    println(mk2)
}

fun higherDimensions() {
    // Multik can store higher dimensions as well;
    // however, to convert these to a DataFrame, we need to specify the particular conversion.
    // For instance, for 3D, we could store a list in each cell of the DataFrame to represent the extra dimension:
    val mk1 = mk.rand<Int>(5, 4, 3)

    println(mk1)

    val df1 = mk1.convertToDataFrameWithLists()
    df1.print()

    // Alternatively, this could be solved using column groups.
    // This subdivides each column into more columns, while ensuring `multikArray[x][y][z] == dataframe[x][y][z]`
    val df2 = mk1.convertToDataFrame()
    df2.print()

    // For even higher dimensions, we can keep adding more column groups
    val mk2 = mk.rand<Int>(5, 4, 3, 2)
    val df3 = mk2.convertToDataFrameNestedGroups()
    df3.print()

    // ...or use nested DataFrames (in FrameColumns)
    // (for instance, a 4D matrix could be stored in a 2D DataFrame where each cell is another DataFrame)
    // but we'll leave that as an exercise for the reader :)
}
@@ -0,0 +1,115 @@
package org.jetbrains.kotlinx.dataframe.examples.multik

import kotlinx.datetime.LocalDate
import kotlinx.datetime.Month
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.append
import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.mapToFrame
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.single
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.multik.api.mk
import org.jetbrains.kotlinx.multik.api.rand
import org.jetbrains.kotlinx.multik.ndarray.data.D3Array
import org.jetbrains.kotlinx.multik.ndarray.data.D4Array

/**
 * DataFrames can store anything, including Multik ndarrays.
 * This can be useful for storing matrices for easier access later, or simply for organizing data read from other files.
 * For example, MRI data is often stored as 3D, and sometimes even 4D, arrays.
 */
fun main() {
    // imaginary list of patient data
    @Suppress("ktlint:standard:argument-list-wrapping")
    val metadata = listOf(
        MriMetadata(10012L, 25, "Healthy", LocalDate(2023, 1, 1)),
        MriMetadata(10013L, 45, "Tuberculosis", LocalDate(2023, 2, 15)),
        MriMetadata(10014L, 32, "Healthy", LocalDate(2023, 3, 22)),
        MriMetadata(10015L, 58, "Pneumonia", LocalDate(2023, 4, 8)),
        MriMetadata(10016L, 29, "Tuberculosis", LocalDate(2023, 5, 30)),
        MriMetadata(10017L, 42, "Healthy", LocalDate(2023, 6, 15)),
        MriMetadata(10018L, 37, "Healthy", LocalDate(2023, 7, 1)),
        MriMetadata(10019L, 55, "Healthy", LocalDate(2023, 8, 15)),
        MriMetadata(10020L, 28, "Healthy", LocalDate(2023, 9, 1)),
        MriMetadata(10021L, 44, "Healthy", LocalDate(2023, 10, 15)),
        MriMetadata(10022L, 31, "Healthy", LocalDate(2023, 11, 1)),
    ).toDataFrame()

    // "reading" the results from "files"
    val results = metadata.mapToFrame {
        +patientId
        +age
        +diagnosis
        +scanDate
        "t1WeightedMri" from { readT1WeightedMri(patientId) }
        "fMriBoldSeries" from { readFMRiBoldSeries(patientId) }
    }.cast<MriResults>(verify = true)
        .append()

    results.print(borders = true)

    // now, when we want to check and visualize the T1-weighted MRI scan
    // of that one healthy patient in July, we can do:
    val scan = results
        .single { scanDate.month == Month.JULY && diagnosis == "Healthy" }
        .t1WeightedMri

    // easy :)
    visualize(scan)
}

@DataSchema
data class MriMetadata(
    /** Unique patient ID. */
    val patientId: Long,
    /** Patient age. */
    val age: Int,
    /** Clinical diagnosis (e.g., "Healthy", "Tuberculosis"). */
    val diagnosis: String,
    /** Date of the scan. */
    val scanDate: LocalDate,
)

@DataSchema
data class MriResults(
    /** Unique patient ID. */
    val patientId: Long,
    /** Patient age. */
    val age: Int,
    /** Clinical diagnosis (e.g., "Healthy", "Tuberculosis"). */
    val diagnosis: String,
    /** Date of the scan. */
    val scanDate: LocalDate,
    /**
     * T1-weighted anatomical MRI scan.
     *
     * Dimensions: (256 x 256 x 180)
     * - 256 width x 256 height
     * - 180 slices
     */
    val t1WeightedMri: D3Array<Float>,
    /**
     * Blood oxygenation level-dependent (BOLD) time series from an fMRI scan.
     *
     * Dimensions: (64 x 64 x 30 x 200)
     * - 64 width x 64 height
     * - 30 slices
     * - 200 timepoints
     */
    val fMriBoldSeries: D4Array<Float>,
)

fun readT1WeightedMri(id: Long): D3Array<Float> {
    // In practice, this should of course read the actual data, but for this example we just return a dummy array
    return mk.rand(256, 256, 180)
}

fun readFMRiBoldSeries(id: Long): D4Array<Float> {
    // In practice, this should of course read the actual data, but for this example we just return a dummy array
    return mk.rand(64, 64, 30, 200)
}

fun visualize(scan: D3Array<Float>) {
    // This would then actually visualize the scan
}
@@ -0,0 +1,77 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget

plugins {
    application
    kotlin("jvm")

    // uses the 'old' Gradle plugin instead of the compiler plugin for now
    id("org.jetbrains.kotlinx.dataframe")

    // only required if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenLocal() // in case of local dataframe development
    mavenCentral()
}

dependencies {
    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
    implementation(project(":"))

    // (Kotlin) Spark support
    implementation(libs.kotlin.spark)
    compileOnly(libs.spark)
    implementation(libs.log4j.core)
    implementation(libs.log4j.api)
}

/**
 * Runs the kotlinSpark/typedDataset example with Java 11.
 */
val runKotlinSparkTypedDataset by tasks.registering(JavaExec::class) {
    classpath = sourceSets["main"].runtimeClasspath
    javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) }
    mainClass = "org.jetbrains.kotlinx.dataframe.examples.kotlinSpark.TypedDatasetKt"
}

/**
 * Runs the kotlinSpark/untypedDataset example with Java 11.
 */
val runKotlinSparkUntypedDataset by tasks.registering(JavaExec::class) {
    classpath = sourceSets["main"].runtimeClasspath
    javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) }
    mainClass = "org.jetbrains.kotlinx.dataframe.examples.kotlinSpark.UntypedDatasetKt"
}

/**
 * Runs the spark/typedDataset example with Java 11.
 */
val runSparkTypedDataset by tasks.registering(JavaExec::class) {
    classpath = sourceSets["main"].runtimeClasspath
    javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) }
    mainClass = "org.jetbrains.kotlinx.dataframe.examples.spark.TypedDatasetKt"
}

/**
 * Runs the spark/untypedDataset example with Java 11.
 */
val runSparkUntypedDataset by tasks.registering(JavaExec::class) {
    classpath = sourceSets["main"].runtimeClasspath
    javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) }
    mainClass = "org.jetbrains.kotlinx.dataframe.examples.spark.UntypedDatasetKt"
}

kotlin {
    compilerOptions {
        jvmTarget = JvmTarget.JVM_11
        freeCompilerArgs.add("-Xjdk-release=11")
    }
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_11.toString()
    targetCompatibility = JavaVersion.VERSION_11.toString()
    options.release.set(11)
}
@@ -0,0 +1,8 @@
@file:Suppress("ktlint:standard:no-empty-file")

package org.jetbrains.kotlinx.dataframe.examples.kotlinSpark

/*
 * See ../spark/compatibilityLayer.kt for the implementation.
 * It's the same with and without the Kotlin Spark API.
 */
@@ -0,0 +1,78 @@
@file:Suppress("ktlint:standard:function-signature")

package org.jetbrains.kotlinx.dataframe.examples.kotlinSpark

import org.apache.spark.sql.Dataset
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.aggregate
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.max
import org.jetbrains.kotlinx.dataframe.api.mean
import org.jetbrains.kotlinx.dataframe.api.min
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.schema
import org.jetbrains.kotlinx.dataframe.api.std
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.dataframe.api.toList
import org.jetbrains.kotlinx.spark.api.withSpark

/**
 * With the Kotlin Spark API, normal Kotlin data classes are supported,
 * meaning we can reuse the same class for Spark and DataFrame!
 *
 * Also, since we use an actual class to define the schema, we need no type conversion!
 *
 * See [Person] and [Name] for an example.
 *
 * NOTE: You will likely need to run this function with Java 8 or 11 for it to work correctly.
 * Use the `runKotlinSparkTypedDataset` Gradle task to do so.
 */
fun main() = withSpark {
    // Creating a Spark Dataset. Usually, this is loaded from some server or database.
    val rawDataset: Dataset<Person> = listOf(
        Person(Name("Alice", "Cooper"), 15, "London", 54, true),
        Person(Name("Bob", "Dylan"), 45, "Dubai", 87, true),
        Person(Name("Charlie", "Daniels"), 20, "Moscow", null, false),
        Person(Name("Charlie", "Chaplin"), 40, "Milan", null, true),
        Person(Name("Bob", "Marley"), 30, "Tokyo", 68, true),
        Person(Name("Alice", "Wolf"), 20, null, 55, false),
        Person(Name("Charlie", "Byrd"), 30, "Moscow", 90, true),
    ).toDS()

    // we can perform large operations in Spark.
    // DataFrames are in-memory structures, so this is a good place to limit the number of rows if you don't have the RAM ;)
    val dataset = rawDataset.filter { it.age > 17 }

    // and convert it to a DataFrame via a typed List
    val dataframe = dataset.collectAsList().toDataFrame()
    dataframe.schema().print()
    dataframe.print(columnTypes = true, borders = true)

    // now we can use DataFrame-specific functions
    val ageStats = dataframe
        .groupBy { city }.aggregate {
            mean { age } into "meanAge"
            std { age } into "stdAge"
            min { age } into "minAge"
            max { age } into "maxAge"
        }

    ageStats.print(columnTypes = true, borders = true)

    // and when we want to convert a DataFrame back to Spark, we can do the same trick via a typed List
    val sparkDatasetAgain = dataframe.toList().toDS()
    sparkDatasetAgain.printSchema()
    sparkDatasetAgain.show()
}

@DataSchema
data class Name(val firstName: String, val lastName: String)

@DataSchema
data class Person(
    val name: Name,
    val age: Int,
    val city: String?,
    val weight: Int?,
    val isHappy: Boolean,
)
@@ -0,0 +1,74 @@
@file:Suppress("ktlint:standard:function-signature")

package org.jetbrains.kotlinx.dataframe.examples.kotlinSpark

import org.apache.spark.sql.Dataset
import org.apache.spark.sql.Row
import org.jetbrains.kotlinx.dataframe.api.aggregate
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.max
import org.jetbrains.kotlinx.dataframe.api.mean
import org.jetbrains.kotlinx.dataframe.api.min
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.schema
import org.jetbrains.kotlinx.dataframe.api.std
import org.jetbrains.kotlinx.dataframe.examples.spark.convertToDataFrame
import org.jetbrains.kotlinx.dataframe.examples.spark.convertToDataFrameByInference
import org.jetbrains.kotlinx.dataframe.examples.spark.convertToSpark
import org.jetbrains.kotlinx.spark.api.col
import org.jetbrains.kotlinx.spark.api.gt
import org.jetbrains.kotlinx.spark.api.withSpark

/**
 * Since, this time, we don't know the schema at compile time, we need to do
 * some schema mapping between Spark and DataFrame.
 *
 * We will use spark/compatibilityLayer.kt to do this.
 * Take a look at that file for the implementation details!
 *
 * NOTE: You will likely need to run this function with Java 8 or 11 for it to work correctly.
 * Use the `runKotlinSparkUntypedDataset` Gradle task to do so.
 */
fun main() = withSpark {
    // Creating a Spark DataFrame (untyped Dataset). Usually, this is loaded from some server or database.
    val rawDataset: Dataset<Row> = listOf(
        Person(Name("Alice", "Cooper"), 15, "London", 54, true),
        Person(Name("Bob", "Dylan"), 45, "Dubai", 87, true),
        Person(Name("Charlie", "Daniels"), 20, "Moscow", null, false),
        Person(Name("Charlie", "Chaplin"), 40, "Milan", null, true),
        Person(Name("Bob", "Marley"), 30, "Tokyo", 68, true),
        Person(Name("Alice", "Wolf"), 20, null, 55, false),
        Person(Name("Charlie", "Byrd"), 30, "Moscow", 90, true),
    ).toDF()

    // we can perform large operations in Spark.
    // DataFrames are in-memory structures, so this is a good place to limit the number of rows if you don't have the RAM ;)
    val dataset = rawDataset.filter(col("age") gt 17)

    // Using inference
    val df1 = dataset.convertToDataFrameByInference()
    df1.schema().print()
    df1.print(columnTypes = true, borders = true)

    // Using full schema mapping
    val df2 = dataset.convertToDataFrame()
    df2.schema().print()
    df2.print(columnTypes = true, borders = true)

    // now we can use DataFrame-specific functions
    val ageStats = df1
        .groupBy("city").aggregate {
            mean("age") into "meanAge"
            std("age") into "stdAge"
            min("age") into "minAge"
            max("age") into "maxAge"
        }

    ageStats.print(columnTypes = true, borders = true)

    // and when we want to convert a DataFrame back to Spark, we use the `convertToSpark()` extension function.
    // This performs the necessary schema mapping under the hood.
    val sparkDataset = df2.convertToSpark(spark, sc)
    sparkDataset.printSchema()
    sparkDataset.show()
}
@@ -0,0 +1,330 @@
package org.jetbrains.kotlinx.dataframe.examples.spark

import org.apache.spark.api.java.JavaRDD
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.Row
import org.apache.spark.sql.RowFactory
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.ArrayType
import org.apache.spark.sql.types.DataType
import org.apache.spark.sql.types.DataTypes
import org.apache.spark.sql.types.Decimal
import org.apache.spark.sql.types.DecimalType
import org.apache.spark.sql.types.MapType
import org.apache.spark.sql.types.StructType
import org.apache.spark.unsafe.types.CalendarInterval
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.DataRow
import org.jetbrains.kotlinx.dataframe.api.rows
import org.jetbrains.kotlinx.dataframe.api.schema
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.dataframe.columns.ColumnGroup
import org.jetbrains.kotlinx.dataframe.columns.TypeSuggestion
import org.jetbrains.kotlinx.dataframe.schema.ColumnSchema
import org.jetbrains.kotlinx.dataframe.schema.DataFrameSchema
import java.math.BigDecimal
import java.math.BigInteger
import java.sql.Date
import java.sql.Timestamp
import java.time.Instant
import java.time.LocalDate
import kotlin.reflect.KType
import kotlin.reflect.KTypeProjection
import kotlin.reflect.full.createType
import kotlin.reflect.full.isSubtypeOf
import kotlin.reflect.full.withNullability
import kotlin.reflect.typeOf

// region Spark to DataFrame

/**
 * Converts an untyped Spark [Dataset] (DataFrame) to a Kotlin [DataFrame].
 * [StructTypes][StructType] are converted to [ColumnGroups][ColumnGroup].
 *
 * DataFrame supports type inference to do the conversion automatically.
 * This is usually fine for smaller datasets, but when working with larger ones, a type map might be a good idea.
 * See [convertToDataFrame] for more information.
 */
fun Dataset<Row>.convertToDataFrameByInference(
    schema: StructType = schema(),
    prefix: List<String> = emptyList(),
): AnyFrame {
    val columns = schema.fields().map { field ->
        val name = field.name()
        when (val dataType = field.dataType()) {
            is StructType ->
                // a column group can easily be created from a dataframe and a name
                DataColumn.createColumnGroup(
                    name = name,
                    df = this.convertToDataFrameByInference(dataType, prefix + name),
                )

            else ->
                // we can use DataFrame type inference to create a column with the correct type.
                // From Spark, we use `select()` to select a single column
                // and `collectAsList()` to get all the values in a list of single-celled rows
                DataColumn.createByInference(
                    name = name,
                    values = this.select((prefix + name).joinToString("."))
                        .collectAsList()
                        .map { it[0] },
                    suggestedType = TypeSuggestion.Infer,
                    // Spark provides nullability :) you can leave this out if you want it to be inferred too
                    nullable = field.nullable(),
                )
        }
    }
    return columns.toDataFrame()
}

/**
 * Converts an untyped Spark [Dataset] (DataFrame) to a Kotlin [DataFrame].
 * [StructTypes][StructType] are converted to [ColumnGroups][ColumnGroup].
 *
 * This version uses a [type map][DataType.convertToDataFrame] to convert the schemas, with a fallback to inference.
 * For smaller datasets, inference is usually fine too.
 * See [convertToDataFrameByInference] for more information.
 */
fun Dataset<Row>.convertToDataFrame(schema: StructType = schema(), prefix: List<String> = emptyList()): AnyFrame {
    val columns = schema.fields().map { field ->
        val name = field.name()
        when (val dataType = field.dataType()) {
            is StructType ->
                // a column group can easily be created from a dataframe and a name
                DataColumn.createColumnGroup(
                    name = name,
                    df = convertToDataFrame(dataType, prefix + name),
                )

            else ->
                // we create a column with the correct type using our type map, with a fallback to inference.
                // From Spark, we use `select()` to select a single column
                // and `collectAsList()` to get all the values in a list of single-celled rows
                DataColumn.createByInference(
                    name = name,
                    values = select((prefix + name).joinToString("."))
                        .collectAsList()
                        .map { it[0] },
                    suggestedType =
                        dataType.convertToDataFrame()
                            ?.let(TypeSuggestion::Use)
                            ?: TypeSuggestion.Infer, // fallback to inference if needed
                    nullable = field.nullable(),
                )
        }
    }
    return columns.toDataFrame()
}

/**
 * Returns the corresponding [Kotlin type][KType] for a given Spark [DataType].
 *
 * This list may be incomplete, but it can at least give you a good start.
 *
 * @return The [KType] that corresponds to the Spark [DataType], or null if no matching [KType] is found.
 */
fun DataType.convertToDataFrame(): KType? =
    when {
        this == DataTypes.ByteType -> typeOf<Byte>()

        this == DataTypes.ShortType -> typeOf<Short>()

        this == DataTypes.IntegerType -> typeOf<Int>()

        this == DataTypes.LongType -> typeOf<Long>()

        this == DataTypes.BooleanType -> typeOf<Boolean>()

        this == DataTypes.FloatType -> typeOf<Float>()

        this == DataTypes.DoubleType -> typeOf<Double>()

        this == DataTypes.StringType -> typeOf<String>()

        this == DataTypes.DateType -> typeOf<Date>()

        this == DataTypes.TimestampType -> typeOf<Timestamp>()

        this is DecimalType -> typeOf<Decimal>()

        this == DataTypes.CalendarIntervalType -> typeOf<CalendarInterval>()

        this == DataTypes.NullType -> nullableNothingType

        this == DataTypes.BinaryType -> typeOf<ByteArray>()

        this is ArrayType -> {
            when (elementType()) {
                DataTypes.ShortType -> typeOf<ShortArray>()
                DataTypes.IntegerType -> typeOf<IntArray>()
                DataTypes.LongType -> typeOf<LongArray>()
                DataTypes.FloatType -> typeOf<FloatArray>()
                DataTypes.DoubleType -> typeOf<DoubleArray>()
                DataTypes.BooleanType -> typeOf<BooleanArray>()
                else -> null
            }
        }

        this is MapType -> {
            val key = keyType().convertToDataFrame() ?: return null
            val value = valueType().convertToDataFrame() ?: return null
            Map::class.createType(
                listOf(
                    KTypeProjection.invariant(key),
                    KTypeProjection.invariant(value.withNullability(valueContainsNull())),
                ),
            )
        }

        else -> null
    }

// endregion

// region DataFrame to Spark

/**
 * Converts the [DataFrame] to a Spark [Dataset] of [Rows][Row] using the provided [SparkSession] and [JavaSparkContext].
 *
 * Spark needs both the data and the schema to be converted to create a correct [Dataset],
 * so we need to map our types somehow.
 *
 * @param spark The [SparkSession] object to use for creating the [Dataset].
 * @param sc The [JavaSparkContext] object to use for converting the [DataFrame] to an [RDD][JavaRDD].
 * @return A [Dataset] of [Rows][Row] representing the converted DataFrame.
 */
fun DataFrame<*>.convertToSpark(spark: SparkSession, sc: JavaSparkContext): Dataset<Row> {
    // convert each row to a Spark row
    val rows = sc.parallelize(this.rows().map { it.convertToSpark() })
    // convert the data schema to a Spark StructType
    val schema = this.schema().convertToSpark()
    return spark.createDataFrame(rows, schema)
}

/**
 * Converts a [DataRow] to a Spark [Row] object.
 *
 * @return The converted Spark [Row].
 */
fun DataRow<*>.convertToSpark(): Row =
    RowFactory.create(
        *values().map {
            when (it) {
                // a row can be nested inside another row if it's a column group
                is DataRow<*> -> it.convertToSpark()

                is DataFrame<*> -> error("nested dataframes are not supported")

                else -> it
            }
        }.toTypedArray(),
    )

/**
 * Converts a [DataFrameSchema] to a Spark [StructType].
 *
 * @return The converted Spark [StructType].
 */
fun DataFrameSchema.convertToSpark(): StructType =
    DataTypes.createStructType(
        this.columns.map { (name, schema) ->
            DataTypes.createStructField(name, schema.convertToSpark(), schema.nullable)
        },
    )

/**
 * Converts a [ColumnSchema] object to a Spark [DataType].
 *
 * @return The Spark [DataType] corresponding to the given [ColumnSchema] object.
 * @throws IllegalArgumentException if the column type or kind is unknown.
 */
fun ColumnSchema.convertToSpark(): DataType =
    when (this) {
        is ColumnSchema.Value -> type.convertToSpark() ?: error("unknown data type: $type")
        is ColumnSchema.Group -> schema.convertToSpark()
        is ColumnSchema.Frame -> error("nested dataframes are not supported")
        else -> error("unknown column kind: $this")
    }

/**
 * Returns the corresponding Spark [DataType] for a given [Kotlin type][KType].
 *
 * This list may be incomplete, but it can at least give you a good start.
 *
 * @return The Spark [DataType] that corresponds to the [Kotlin type][KType], or null if no matching [DataType] is found.
 */
fun KType.convertToSpark(): DataType? =
    when {
        isSubtypeOf(typeOf<Byte?>()) -> DataTypes.ByteType

        isSubtypeOf(typeOf<Short?>()) -> DataTypes.ShortType

        isSubtypeOf(typeOf<Int?>()) -> DataTypes.IntegerType

        isSubtypeOf(typeOf<Long?>()) -> DataTypes.LongType

        isSubtypeOf(typeOf<Boolean?>()) -> DataTypes.BooleanType

        isSubtypeOf(typeOf<Float?>()) -> DataTypes.FloatType

        isSubtypeOf(typeOf<Double?>()) -> DataTypes.DoubleType

        isSubtypeOf(typeOf<String?>()) -> DataTypes.StringType

        isSubtypeOf(typeOf<LocalDate?>()) -> DataTypes.DateType

        isSubtypeOf(typeOf<Date?>()) -> DataTypes.DateType

        isSubtypeOf(typeOf<Timestamp?>()) -> DataTypes.TimestampType

        isSubtypeOf(typeOf<Instant?>()) -> DataTypes.TimestampType

        isSubtypeOf(typeOf<Decimal?>()) -> DecimalType.SYSTEM_DEFAULT()

        isSubtypeOf(typeOf<BigDecimal?>()) -> DecimalType.SYSTEM_DEFAULT()

        isSubtypeOf(typeOf<BigInteger?>()) -> DecimalType.SYSTEM_DEFAULT()

        isSubtypeOf(typeOf<CalendarInterval?>()) -> DataTypes.CalendarIntervalType

        isSubtypeOf(nullableNothingType) -> DataTypes.NullType

        isSubtypeOf(typeOf<ByteArray?>()) -> DataTypes.BinaryType

        isSubtypeOf(typeOf<ShortArray?>()) -> DataTypes.createArrayType(DataTypes.ShortType, false)

        isSubtypeOf(typeOf<IntArray?>()) -> DataTypes.createArrayType(DataTypes.IntegerType, false)

        isSubtypeOf(typeOf<LongArray?>()) -> DataTypes.createArrayType(DataTypes.LongType, false)

        isSubtypeOf(typeOf<FloatArray?>()) -> DataTypes.createArrayType(DataTypes.FloatType, false)

        isSubtypeOf(typeOf<DoubleArray?>()) -> DataTypes.createArrayType(DataTypes.DoubleType, false)

        isSubtypeOf(typeOf<BooleanArray?>()) -> DataTypes.createArrayType(DataTypes.BooleanType, false)

        isSubtypeOf(typeOf<Array<*>>()) ->
            error("non-primitive arrays are not supported for now, you can add it yourself")

        isSubtypeOf(typeOf<List<*>>()) -> error("lists are not supported for now, you can add it yourself")

        isSubtypeOf(typeOf<Set<*>>()) -> error("sets are not supported for now, you can add it yourself")

        classifier == Map::class -> {
            val (key, value) = arguments
            DataTypes.createMapType(
                key.type?.convertToSpark(),
                value.type?.convertToSpark(),
                value.type?.isMarkedNullable ?: true,
            )
        }

        else -> null
    }

// `typeOf<Nothing?>()` is not allowed directly, so we obtain the `Nothing?` type from a type argument instead
private val nullableNothingType: KType = typeOf<List<Nothing?>>().arguments.first().type!!

// endregion
@@ -0,0 +1,105 @@
@file:Suppress("ktlint:standard:function-signature")

package org.jetbrains.kotlinx.dataframe.examples.spark

import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.aggregate
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.max
import org.jetbrains.kotlinx.dataframe.api.mean
import org.jetbrains.kotlinx.dataframe.api.min
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.schema
import org.jetbrains.kotlinx.dataframe.api.std
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.dataframe.api.toList
import java.io.Serializable

/**
 * For Spark, Kotlin data classes are supported if we:
 * - Add [@JvmOverloads][JvmOverloads] to the constructor
 * - Make all constructor parameters mutable and give them default values
 * - Make them [Serializable]
 *
 * But by adding [@DataSchema][DataSchema], we can reuse the same class for Spark and DataFrame!
 *
 * See [Person] and [Name] for an example.
 *
 * Also, since we use an actual class to define the schema, we need no type conversion!
 *
 * NOTE: You will likely need to run this function with Java 8 or 11 for it to work correctly.
 * Use the `runSparkTypedDataset` Gradle task to do so.
 */
fun main() {
    val spark = SparkSession.builder()
        .master(SparkConf().get("spark.master", "local[*]"))
        .appName("Kotlin Spark Sample")
        .getOrCreate()
    val sc = JavaSparkContext(spark.sparkContext())

    // Creating a Spark Dataset. Usually, this is loaded from some server or database.
    val rawDataset: Dataset<Person> = spark.createDataset(
        listOf(
            Person(Name("Alice", "Cooper"), 15, "London", 54, true),
            Person(Name("Bob", "Dylan"), 45, "Dubai", 87, true),
            Person(Name("Charlie", "Daniels"), 20, "Moscow", null, false),
            Person(Name("Charlie", "Chaplin"), 40, "Milan", null, true),
            Person(Name("Bob", "Marley"), 30, "Tokyo", 68, true),
            Person(Name("Alice", "Wolf"), 20, null, 55, false),
            Person(Name("Charlie", "Byrd"), 30, "Moscow", 90, true),
        ),
        beanEncoderOf(),
    )

    // we can perform large operations in Spark.
    // DataFrames are in-memory structures, so this is a good place to limit the number of rows if you don't have the RAM ;)
    val dataset = rawDataset.filter { it.age > 17 }

    // and convert it to a DataFrame via a typed List
    val dataframe = dataset.collectAsList().toDataFrame()
    dataframe.schema().print()
    dataframe.print(columnTypes = true, borders = true)

    // now we can use DataFrame-specific functions
    val ageStats = dataframe
        .groupBy { city }.aggregate {
            mean { age } into "meanAge"
            std { age } into "stdAge"
            min { age } into "minAge"
            max { age } into "maxAge"
        }

    ageStats.print(columnTypes = true, borders = true)

    // and when we want to convert a DataFrame back to Spark, we can do the same trick via a typed List
    val sparkDatasetAgain = spark.createDataset(dataframe.toList(), beanEncoderOf())
    sparkDatasetAgain.printSchema()
    sparkDatasetAgain.show()

    spark.stop()
}

/** Creates a [bean encoder][Encoders.bean] for the given type [T]. */
inline fun <reified T : Serializable> beanEncoderOf(): Encoder<T> = Encoders.bean(T::class.java)

@DataSchema
data class Name
@JvmOverloads
constructor(var firstName: String = "", var lastName: String = "") : Serializable

@DataSchema
data class Person
@JvmOverloads
constructor(
    var name: Name = Name(),
    var age: Int = -1,
    var city: String? = null,
|
||||
var weight: Int? = null,
|
||||
var isHappy: Boolean = false,
|
||||
) : Serializable
|
||||
@@ -0,0 +1,87 @@
@file:Suppress("ktlint:standard:function-signature")

package org.jetbrains.kotlinx.dataframe.examples.spark

import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.dataframe.api.aggregate
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.max
import org.jetbrains.kotlinx.dataframe.api.mean
import org.jetbrains.kotlinx.dataframe.api.min
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.schema
import org.jetbrains.kotlinx.dataframe.api.std
import org.jetbrains.kotlinx.dataframe.examples.spark.convertToDataFrame
import org.jetbrains.kotlinx.dataframe.examples.spark.convertToDataFrameByInference
import org.jetbrains.kotlinx.dataframe.examples.spark.convertToSpark
import org.jetbrains.kotlinx.spark.api.col
import org.jetbrains.kotlinx.spark.api.gt

/**
 * Since, this time, we don't know the schema at compile time, we need to do
 * some schema mapping between Spark and DataFrame.
 *
 * We will use spark/compatibilityLayer.kt to do this.
 * Take a look at that file for the implementation details!
 *
 * NOTE: You will likely need to run this function with Java 8 or 11 for it to work correctly.
 * Use the `runSparkUntypedDataset` Gradle task to do so.
 */
fun main() {
    val spark = SparkSession.builder()
        .master(SparkConf().get("spark.master", "local[*]"))
        .appName("Kotlin Spark Sample")
        .getOrCreate()
    val sc = JavaSparkContext(spark.sparkContext())

    // Creating a Spark DataFrame (untyped Dataset). Usually, this is loaded from some server or database.
    val rawDataset: Dataset<Row> = spark.createDataset(
        listOf(
            Person(Name("Alice", "Cooper"), 15, "London", 54, true),
            Person(Name("Bob", "Dylan"), 45, "Dubai", 87, true),
            Person(Name("Charlie", "Daniels"), 20, "Moscow", null, false),
            Person(Name("Charlie", "Chaplin"), 40, "Milan", null, true),
            Person(Name("Bob", "Marley"), 30, "Tokyo", 68, true),
            Person(Name("Alice", "Wolf"), 20, null, 55, false),
            Person(Name("Charlie", "Byrd"), 30, "Moscow", 90, true),
        ),
        beanEncoderOf<Person>(),
    ).toDF()

    // We can perform large operations in Spark.
    // DataFrames are in-memory structures, so this is a good place to limit the number of rows if you don't have the RAM ;)
    val dataset = rawDataset.filter(col("age") gt 17)

    // Using inference.
    val df1 = dataset.convertToDataFrameByInference()
    df1.schema().print()
    df1.print(columnTypes = true, borders = true)

    // Using full schema mapping.
    val df2 = dataset.convertToDataFrame()
    df2.schema().print()
    df2.print(columnTypes = true, borders = true)

    // Now we can use DataFrame-specific functions.
    val ageStats = df1
        .groupBy("city").aggregate {
            mean("age") into "meanAge"
            std("age") into "stdAge"
            min("age") into "minAge"
            max("age") into "maxAge"
        }

    ageStats.print(columnTypes = true, borders = true)

    // And when we want to convert a DataFrame back to Spark, we will use the `convertToSpark()` extension function.
    // This performs the necessary schema mapping under the hood.
    val sparkDataset = df2.convertToSpark(spark, sc)
    sparkDataset.printSchema()
    sparkDataset.show()

    spark.stop()
}
@@ -0,0 +1,42 @@
import org.jetbrains.kotlin.gradle.dsl.JvmTarget
import org.jetbrains.kotlin.gradle.tasks.KotlinCompile

plugins {
    application
    kotlin("jvm")

    id("org.jetbrains.kotlinx.dataframe")

    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
    id("com.google.devtools.ksp")
}

repositories {
    mavenCentral()
    mavenLocal() // in case of local dataframe development
}

application.mainClass = "org.jetbrains.kotlinx.dataframe.examples.youtube.YoutubeKt"

dependencies {
    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
    implementation(project(":"))
    implementation(libs.kotlin.datetimeJvm)
}

tasks.withType<KotlinCompile> {
    compilerOptions.jvmTarget = JvmTarget.JVM_1_8
}

kotlin {
    compilerOptions {
        jvmTarget = JvmTarget.JVM_1_8
        freeCompilerArgs.add("-Xjdk-release=8")
    }
}

tasks.withType<JavaCompile> {
    sourceCompatibility = JavaVersion.VERSION_1_8.toString()
    targetCompatibility = JavaVersion.VERSION_1_8.toString()
    options.release.set(8)
}
@@ -0,0 +1,4 @@
package org.jetbrains.kotlinx.dataframe.examples.youtube

val apiKey: String = TODO("Insert your API key here")
const val basePath = "https://www.googleapis.com/youtube/v3"
@@ -0,0 +1,94 @@
@file:ImportDataSchema(
    "SearchResponse",
    "src/main/resources/searchResponse.json",
)
@file:ImportDataSchema(
    "StatisticsResponse",
    "src/main/resources/statisticsResponse.json",
)

package org.jetbrains.kotlinx.dataframe.examples.youtube

import kotlinx.datetime.Instant
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.AnyRow
import org.jetbrains.kotlinx.dataframe.DataRow
import org.jetbrains.kotlinx.dataframe.annotations.ImportDataSchema
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.dataTypes.IFRAME
import org.jetbrains.kotlinx.dataframe.dataTypes.IMG
import org.jetbrains.kotlinx.dataframe.io.read
import java.net.URL

fun load(path: String) = DataRow.read("$basePath/$path&key=$apiKey")

fun load(path: String, maxPages: Int): AnyFrame {
    val rows = mutableListOf<AnyRow>()
    var pagePath = path
    do {
        val row = load(pagePath)
        rows.add(row)
        val next = row.getValueOrNull<String>("nextPageToken")
        pagePath = "$path&pageToken=$next"
    } while (next != null && rows.size < maxPages)
    return rows.concat()
}

fun main() {
    val searchRequest = "cute%20cats"
    val resultsPerPage = 50
    val maxPages = 5

    val videoId by column<String>("id")
    val channel by columnGroup()

    val videos = load("search?q=$searchRequest&maxResults=$resultsPerPage&part=snippet", maxPages)
        .convertTo<SearchResponse> {
            convert<String?>().with { it.toString() }
            convert<Int?>().with { it ?: 0 }
        }
        .items.concat()
        .dropNulls { id.videoId }
        .select { id.videoId into videoId and snippet }
        .distinct()
        .parse()
        .convert { colsAtAnyDepth().colsOf<URL>() }.with {
            IMG(it, maxHeight = 150)
        }.add("video") {
            val id = videoId()
            IFRAME("http://www.youtube.com/embed/$id")
        }.move { snippet.title and snippet.publishTime }.toTop()
        .move { snippet.channelId and snippet.channelTitle }.under(channel)
        .remove { snippet }

    val stats = videos[videoId]
        .chunked(50)
        .map {
            val ids = it.joinToString("%2C")
            load("videos?part=statistics&id=$ids").cast<StatisticsResponse>()
        }.asColumnGroup()
        .items.concat()
        .select { id and statistics.allCols() }
        .parse()

    val withStat = videos.join(stats) { videoId match right.id }

    val viewCount by column<Int>()
    val publishTime by column<Instant>()

    val channels = withStat
        .groupBy { channel }.sum { viewCount }
        .sortByDesc { viewCount }
        .flatten()

    channels.print(borders = true, columnTypes = true)

    val growth = withStat
        .select { publishTime and viewCount }
        .convert { publishTime and viewCount }.toLong()
        .sortBy { publishTime }
        .cumSum { viewCount }

    growth.print(borders = true, columnTypes = true)
}
@@ -0,0 +1,182 @@
{
  "kind": "youtube#searchListResponse",
  "etag": "nl77cg-yrK-TW2q2RtoGXrdkkfo",
  "nextPageToken": "CAUQAA",
  "regionCode": "NL",
  "pageInfo": {
    "totalResults": 1000000,
    "resultsPerPage": 5
  },
  "items": [
    {
      "kind": "youtube#searchResult",
      "etag": "gsRtDXx5RZlp-qILhP65o2oF-go",
      "id": {
        "kind": "youtube#video",
        "videoId": "Dix58mO0Pbc"
      },
      "snippet": {
        "publishedAt": "2022-08-10T14:30:04Z",
        "channelId": "UC7wafFu5c8AO0YF5U7R7xFA",
        "title": "Cat TV for Cats to Watch 😺 Summer birds and ducks by the lake 🐦 Cute squirrels 🐿 8 Hours(4K HDR)",
        "description": "8 hours of pleasing video for cats, dogs, parrots, or other nature lovers to enjoy. It can relax your kitten or puppy and minimize ...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/Dix58mO0Pbc/default.jpg",
            "width": 120,
            "height": 90
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/Dix58mO0Pbc/mqdefault.jpg",
            "width": 320,
            "height": 180
          },
          "high": {
            "url": "https://i.ytimg.com/vi/Dix58mO0Pbc/hqdefault.jpg",
            "width": 480,
            "height": 360
          }
        },
        "channelTitle": "Birder King",
        "liveBroadcastContent": "none",
        "publishTime": "2022-08-10T14:30:04Z"
      }
    },
    {
      "kind": "youtube#searchResult",
      "etag": "_7QEwCZHKtgnPTcYmsxNaol-I0Q",
      "id": {
        "kind": "youtube#video",
        "videoId": "bGsN7jzp5DE"
      },
      "snippet": {
        "publishedAt": "2022-08-09T17:00:30Z",
        "channelId": "UCINb0wqPz-A0dV9nARjJlOQ",
        "title": "Cat Is Obsessed With His Tiny Love Bird | The Dodo Odd Couples",
        "description": "This cat is glued to his favorite little love bird and even climbs inside her cage to hang out longer This video is dedicated to ...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/bGsN7jzp5DE/default.jpg",
            "width": 120,
            "height": 90
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/bGsN7jzp5DE/mqdefault.jpg",
            "width": 320,
            "height": 180
          },
          "high": {
            "url": "https://i.ytimg.com/vi/bGsN7jzp5DE/hqdefault.jpg",
            "width": 480,
            "height": 360
          }
        },
        "channelTitle": "The Dodo",
        "liveBroadcastContent": "none",
        "publishTime": "2022-08-09T17:00:30Z"
      }
    },
    {
      "kind": "youtube#searchResult",
      "etag": "IHNyBgppiApI3KGzkUV5AuPMftM",
      "id": {
        "kind": "youtube#video",
        "videoId": "U1OxDRxNEMM"
      },
      "snippet": {
        "publishedAt": "2022-08-10T14:45:00Z",
        "channelId": "UCcnThqTwvub5ykbII9WkR5g",
        "title": "Funny animals - Funny cats / dogs - Funny animal videos 218",
        "description": "Funny animals! Compilation number 218. Only the best! Sit back and charge positively Funny animal videos (funny cats, dogs ...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/U1OxDRxNEMM/default.jpg",
            "width": 120,
            "height": 90
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/U1OxDRxNEMM/mqdefault.jpg",
            "width": 320,
            "height": 180
          },
          "high": {
            "url": "https://i.ytimg.com/vi/U1OxDRxNEMM/hqdefault.jpg",
            "width": 480,
            "height": 360
          }
        },
        "channelTitle": "Happy Dog",
        "liveBroadcastContent": "none",
        "publishTime": "2022-08-10T14:45:00Z"
      }
    },
    {
      "kind": "youtube#searchResult",
      "etag": "0OYjMrtwAzJH7yo_jOPvF2hwSao",
      "id": {
        "kind": "youtube#video",
        "videoId": "ByH9LuSILxU"
      },
      "snippet": {
        "publishedAt": "2020-06-19T02:18:53Z",
        "channelId": "UC8hC-augAnujJeprhjI0YkA",
        "title": "Baby Cats - Cute and Funny Cat Videos Compilation #34 | Aww Animals",
        "description": "Baby cats are amazing creature because they are the cutest and most funny. Watching funny baby cats is the hardest try not to ...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/ByH9LuSILxU/default.jpg",
            "width": 120,
            "height": 90
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/ByH9LuSILxU/mqdefault.jpg",
            "width": 320,
            "height": 180
          },
          "high": {
            "url": "https://i.ytimg.com/vi/ByH9LuSILxU/hqdefault.jpg",
            "width": 480,
            "height": 360
          }
        },
        "channelTitle": "Aww Animals",
        "liveBroadcastContent": "none",
        "publishTime": "2020-06-19T02:18:53Z"
      }
    },
    {
      "kind": "youtube#searchResult",
      "etag": "S1UukOVi_sofJQLHSU0jX5GSv2M",
      "id": {
        "kind": "youtube#video",
        "videoId": "VkqVsCPAIag"
      },
      "snippet": {
        "publishedAt": "2022-08-10T11:03:15Z",
        "channelId": "UCHBnS9TR-4h2nvuiiq3XCAA",
        "title": "Awesome SO Cute Cat ! Cute and Funny Cat Videos to Keep You Smiling! 🐱",
        "description": "The featured clips in our video are used with permission from the original video owners. The highlight clips can be done by our ...",
        "thumbnails": {
          "default": {
            "url": "https://i.ytimg.com/vi/VkqVsCPAIag/default.jpg",
            "width": 120,
            "height": 90
          },
          "medium": {
            "url": "https://i.ytimg.com/vi/VkqVsCPAIag/mqdefault.jpg",
            "width": 320,
            "height": 180
          },
          "high": {
            "url": "https://i.ytimg.com/vi/VkqVsCPAIag/hqdefault.jpg",
            "width": 480,
            "height": 360
          }
        },
        "channelTitle": "Best awesome",
        "liveBroadcastContent": "none",
        "publishTime": "2022-08-10T11:03:15Z"
      }
    }
  ]
}
@@ -0,0 +1,21 @@
{
  "kind": "youtube#videoListResponse",
  "etag": "rHk7psLWXLIjjx8rGeiNKUxrD-s",
  "items": [
    {
      "kind": "youtube#video",
      "etag": "hiKGiry1Gc19FmHigb3sMfjnzP8",
      "id": "uHKfrz65KSU",
      "statistics": {
        "viewCount": "67715094",
        "likeCount": "641192",
        "favoriteCount": "0",
        "commentCount": "22174"
      }
    }
  ],
  "pageInfo": {
    "totalResults": 1,
    "resultsPerPage": 1
  }
}
@@ -0,0 +1,14 @@
# Kotlin DataFrame Compiler Gradle Plugin Example

An IntelliJ IDEA Gradle Kotlin project demonstrating the use of the
[Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html).

We recommend using an up-to-date IntelliJ IDEA for the best experience,
as well as the latest Kotlin plugin version.

> [!WARNING]
> Proper functionality requires IntelliJ IDEA 2025.2 or newer.

[Download Kotlin DataFrame Compiler Plugin Gradle Example](https://github.com/Kotlin/dataframe/raw/example-projects-archives/kotlin-dataframe-plugin-gradle-example.zip)

See also the [Kotlin DataFrame Compiler Maven Plugin Example](../kotlin-dataframe-plugin-maven-example)
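The schema propagation that this example project demonstrates can be sketched in a few lines. This is a minimal, hypothetical snippet (not the project's actual sources); it assumes the `org.jetbrains.kotlinx:dataframe` dependency and the `kotlin("plugin.dataframe")` compiler plugin are applied in the build script:

```kotlin
import org.jetbrains.kotlinx.dataframe.api.add
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.api.filter
import org.jetbrains.kotlinx.dataframe.api.print

fun main() {
    // The compiler plugin derives a schema from the dataFrameOf literal,
    // so `name` and `stars` become typed extension properties.
    val df = dataFrameOf("name", "stars")(
        "dataframe", 871,
        "kandy", 550,
    )
    df.add("popular") { stars > 600 }
        // The column added above is immediately usable by name, with its inferred type.
        .filter { popular }
        .print()
}
```

Without the compiler plugin, the same code would need a `@DataSchema` declaration or string-based column access.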
@@ -0,0 +1,39 @@
import org.jlleitschuh.gradle.ktlint.KtlintExtension

plugins {
    id("org.jlleitschuh.gradle.ktlint") version "12.3.0"

    val kotlinVersion = "2.3.0-RC3"
    kotlin("jvm") version kotlinVersion
    // Add the Kotlin DataFrame Compiler plugin of the same version as the Kotlin plugin.
    kotlin("plugin.dataframe") version kotlinVersion

    application
}

group = "org.example"
version = "1.0-SNAPSHOT"

repositories {
    mavenCentral()
}

dependencies {
    // Add general `dataframe` dependency
    implementation("org.jetbrains.kotlinx:dataframe:1.0.0-Beta4")
    // Add `kandy` dependency
    implementation("org.jetbrains.kotlinx:kandy-lets-plot:0.8.3")
    testImplementation(kotlin("test"))
}

tasks.test {
    useJUnitPlatform()
}
kotlin {
    jvmToolchain(11)
}

configure<KtlintExtension> {
    version = "1.6.0"
    // rules are set up through .editorconfig
}
@@ -0,0 +1,4 @@
kotlin.code.style=official
# Disabling incremental compilation will no longer be necessary
# when https://youtrack.jetbrains.com/issue/KT-66735 is resolved.
kotlin.incremental=false
@@ -0,0 +1,7 @@
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-9.1.0-bin.zip
networkTimeout=10000
validateDistributionUrl=true
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
@@ -0,0 +1,11 @@
pluginManagement {
    repositories {
        maven("https://packages.jetbrains.team/maven/p/kt/dev/")
        mavenCentral()
        gradlePluginPortal()
    }
}
plugins {
    id("org.gradle.toolchains.foojay-resolver-convention") version "1.0.0"
}
rootProject.name = "kotlin-dataframe-plugin-gradle-example"
@@ -0,0 +1,110 @@
package org.jetbrains.kotlinx.dataframe.examples.plugin

import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.add
import org.jetbrains.kotlinx.dataframe.api.aggregate
import org.jetbrains.kotlinx.dataframe.api.convert
import org.jetbrains.kotlinx.dataframe.api.convertTo
import org.jetbrains.kotlinx.dataframe.api.filter
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.into
import org.jetbrains.kotlinx.dataframe.api.max
import org.jetbrains.kotlinx.dataframe.api.rename
import org.jetbrains.kotlinx.dataframe.api.renameToCamelCase
import org.jetbrains.kotlinx.dataframe.api.with
import org.jetbrains.kotlinx.dataframe.io.readCsv
import org.jetbrains.kotlinx.dataframe.io.writeCsv
import org.jetbrains.kotlinx.kandy.dsl.plot
import org.jetbrains.kotlinx.kandy.letsplot.export.save
import org.jetbrains.kotlinx.kandy.letsplot.feature.layout
import org.jetbrains.kotlinx.kandy.letsplot.layers.bars
import java.net.URL

// Declare the data schema for the DataFrame from jetbrains_repositories.csv.
@DataSchema
data class Repositories(
    val full_name: String,
    val html_url: URL,
    val stargazers_count: Int,
    val topics: String,
    val watchers: Int,
)

// Define kinds of repositories.
enum class RepoKind {
    Kotlin,
    IntelliJ,
    Other,
}

// A rule for determining the kind of repository based on its name and topics.
fun getKind(fullName: String, topics: List<String>): RepoKind {
    fun checkContains(name: String) = name in topics || fullName.lowercase().contains(name)

    return when {
        checkContains("kotlin") -> RepoKind.Kotlin
        checkContains("idea") || checkContains("intellij") -> RepoKind.IntelliJ
        else -> RepoKind.Other
    }
}

fun main() {
    val repos = DataFrame
        // Read the DataFrame from the CSV file.
        .readCsv("https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv")
        // And convert it to match the `Repositories` schema.
        .convertTo<Repositories>()

    // With the Compiler Plugin, the DataFrame schema changes immediately after each operation:
    // for example, if a new column is added or an old one is renamed (or its type is changed)
    // during the operation, you can use the new name immediately in the following operations:
    repos
        // Add a new "name" column...
        .add("name") { full_name.substringAfterLast("/") }
        // ...and now we can use the "name" extension in DataFrame operations, such as `filter`.
        .filter { name.lowercase().contains("kotlin") }

    // Let's update the DataFrame with some operations using these features.
    val reposUpdated = repos
        // Rename columns to camel case.
        // Note that after that, in the following operations, extension properties will have
        // new names corresponding to the column names.
        .renameToCamelCase()
        // Rename the "stargazersCount" column to "stars".
        .rename { stargazersCount }.into("stars")
        // And we can immediately use the updated name in the filtering.
        .filter { stars > 50 }
        // Convert values in the "topics" column (which were `String` initially)
        // to a list of topics.
        .convert { topics }.with {
            val inner = it.removeSurrounding("[", "]")
            if (inner.isEmpty()) emptyList() else inner.split(',').map(String::trim)
        }
        // Now "topics" is a `List<String>` column.
        // Add a new column with the number of topics.
        .add("topicCount") { topics.size }
        // Add a new column with the kind of repository.
        .add("kind") { getKind(fullName, topics) }

    // Write the updated DataFrame to a CSV file.
    reposUpdated.writeCsv("jetbrains_repositories_new.csv")

    reposUpdated
        // Group repositories by kind.
        .groupBy { kind }
        // And then compute the maximum stars in each group.
        .aggregate {
            max { stars } into "maxStars"
        }
        // Build a bar plot showing the maximum number of stars per repository kind.
        .plot {
            bars {
                x(kind)
                y(maxStars)
            }
            layout.title = "Max stars per repo kind"
        }
        // Save the plot to an SVG file.
        .save("kindToStars.svg")
}
@@ -0,0 +1,39 @@
target/
!.mvn/wrapper/maven-wrapper.jar
!**/src/main/**/target/
!**/src/test/**/target/
.kotlin

### IntelliJ IDEA ###
.idea/modules.xml
.idea/jarRepositories.xml
.idea/compiler.xml
.idea/libraries/
*.iws
*.iml
*.ipr

### Eclipse ###
.apt_generated
.classpath
.factorypath
.project
.settings
.springBeans
.sts4-cache

### NetBeans ###
/nbproject/private/
/nbbuild/
/dist/
/nbdist/
/.nb-gradle/
build/
!**/src/main/**/build/
!**/src/test/**/build/

### VS Code ###
.vscode/

### Mac OS ###
.DS_Store
@@ -0,0 +1,14 @@
# Kotlin DataFrame Compiler Maven Plugin Example

An IntelliJ IDEA Maven Kotlin project demonstrating the use of the
[Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html).

We recommend using an up-to-date IntelliJ IDEA for the best experience,
as well as the latest Kotlin plugin version.

> [!WARNING]
> Proper functionality requires IntelliJ IDEA 2025.3 or newer.

[Download Kotlin DataFrame Compiler Plugin Maven Example](https://github.com/Kotlin/dataframe/raw/example-projects-archives/kotlin-dataframe-plugin-maven-example.zip)

See also the [Kotlin DataFrame Compiler Gradle Plugin Example](../kotlin-dataframe-plugin-gradle-example)
@@ -0,0 +1,112 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>dataframe_maven</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <kotlin.code.style>official</kotlin.code.style>
        <kotlin.compiler.jvmTarget>11</kotlin.compiler.jvmTarget>
    </properties>

    <repositories>
        <repository>
            <id>mavenCentral</id>
            <url>https://repo1.maven.org/maven2/</url>
        </repository>
    </repositories>

    <build>
        <sourceDirectory>src/main/kotlin</sourceDirectory>
        <testSourceDirectory>src/test/kotlin</testSourceDirectory>
        <plugins>
            <plugin>
                <groupId>org.jetbrains.kotlin</groupId>
                <artifactId>kotlin-maven-plugin</artifactId>
                <version>2.3.0-RC3</version>
                <configuration>
                    <compilerPlugins>
                        <plugin>kotlin-dataframe</plugin>
                    </compilerPlugins>
                </configuration>

                <dependencies>
                    <dependency>
                        <groupId>org.jetbrains.kotlin</groupId>
                        <artifactId>kotlin-maven-dataframe</artifactId>
                        <version>2.3.0-RC3</version>
                    </dependency>
                </dependencies>
                <executions>
                    <execution>
                        <id>compile</id>
                        <phase>compile</phase>
                        <goals>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>test-compile</id>
                        <phase>test-compile</phase>
                        <goals>
                            <goal>test-compile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.22.2</version>
            </plugin>
            <plugin>
                <artifactId>maven-failsafe-plugin</artifactId>
                <version>2.22.2</version>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.6.0</version>
                <configuration>
                    <mainClass>MainKt</mainClass>
                </configuration>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>org.jetbrains.kotlin</groupId>
            <artifactId>kotlin-test-junit5</artifactId>
            <version>2.3.0-RC3</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter</artifactId>
            <version>5.10.0</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.jetbrains.kotlin</groupId>
            <artifactId>kotlin-stdlib</artifactId>
            <version>2.3.0-RC3</version>
        </dependency>
        <!-- DataFrame and Kandy dependencies -->
        <dependency>
            <groupId>org.jetbrains.kotlinx</groupId>
            <artifactId>dataframe</artifactId>
            <version>1.0.0-Beta4</version>
        </dependency>
        <dependency>
            <groupId>org.jetbrains.kotlinx</groupId>
            <artifactId>kandy-lets-plot</artifactId>
            <version>0.8.3</version>
        </dependency>
    </dependencies>

</project>
@@ -0,0 +1,110 @@
package org.jetbrains.kotlinx.dataframe.examples.plugin

import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.add
import org.jetbrains.kotlinx.dataframe.api.aggregate
import org.jetbrains.kotlinx.dataframe.api.convert
import org.jetbrains.kotlinx.dataframe.api.convertTo
import org.jetbrains.kotlinx.dataframe.api.filter
import org.jetbrains.kotlinx.dataframe.api.groupBy
import org.jetbrains.kotlinx.dataframe.api.into
import org.jetbrains.kotlinx.dataframe.api.max
import org.jetbrains.kotlinx.dataframe.api.rename
import org.jetbrains.kotlinx.dataframe.api.renameToCamelCase
import org.jetbrains.kotlinx.dataframe.api.with
import org.jetbrains.kotlinx.dataframe.io.readCsv
import org.jetbrains.kotlinx.dataframe.io.writeCsv
import org.jetbrains.kotlinx.kandy.dsl.plot
import org.jetbrains.kotlinx.kandy.letsplot.export.save
import org.jetbrains.kotlinx.kandy.letsplot.feature.layout
import org.jetbrains.kotlinx.kandy.letsplot.layers.bars
import java.net.URL

// Declare the data schema for the DataFrame from jetbrains_repositories.csv.
@DataSchema
data class Repositories(
    val full_name: String,
    val html_url: URL,
    val stargazers_count: Int,
    val topics: String,
    val watchers: Int,
)

// Define kinds of repositories.
enum class RepoKind {
    Kotlin,
    IntelliJ,
    Other,
}

// A rule for determining the kind of repository based on its name and topics.
fun getKind(fullName: String, topics: List<String>): RepoKind {
    fun checkContains(name: String) = name in topics || fullName.lowercase().contains(name)

    return when {
        checkContains("kotlin") -> RepoKind.Kotlin
        checkContains("idea") || checkContains("intellij") -> RepoKind.IntelliJ
        else -> RepoKind.Other
    }
}

fun main() {
    val repos = DataFrame
        // Read the DataFrame from the CSV file.
        .readCsv("https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv")
        // And convert it to match the `Repositories` schema.
        .convertTo<Repositories>()

    // With the Compiler Plugin, the DataFrame schema changes immediately after each operation:
    // for example, if a new column is added or an old one is renamed (or its type is changed)
    // during the operation, you can use the new name immediately in the following operations:
    repos
        // Add a new "name" column...
        .add("name") { full_name.substringAfterLast("/") }
        // ...and now we can use the "name" extension in DataFrame operations, such as `filter`.
        .filter { name.lowercase().contains("kotlin") }

    // Let's update the DataFrame with some operations using these features.
    val reposUpdated = repos
        // Rename columns to camel case.
        // Note that after that, in the following operations, extension properties will have
        // new names corresponding to the column names.
        .renameToCamelCase()
        // Rename the "stargazersCount" column to "stars".
        .rename { stargazersCount }.into("stars")
        // And we can immediately use the updated name in the filtering.
        .filter { stars > 50 }
        // Convert values in the "topics" column (which were `String` initially)
        // to a list of topics.
        .convert { topics }.with {
            val inner = it.removeSurrounding("[", "]")
            if (inner.isEmpty()) emptyList() else inner.split(',').map(String::trim)
        }
        // Now "topics" is a `List<String>` column.
        // Add a new column with the number of topics.
        .add("topicCount") { topics.size }
        // Add a new column with the kind of repository.
        .add("kind") { getKind(fullName, topics) }

    // Write the updated DataFrame to a CSV file.
    reposUpdated.writeCsv("jetbrains_repositories_new.csv")

    reposUpdated
        // Group repositories by kind.
        .groupBy { kind }
        // And then compute the maximum stars in each group.
||||
.aggregate {
|
||||
max { stars } into "maxStars"
|
||||
}
|
||||
// Build a bar plot showing the maximum number of stars per repository kind.
|
||||
.plot {
|
||||
bars {
|
||||
x(kind)
|
||||
y(maxStars)
|
||||
}
|
||||
layout.title = "Max stars per repo kind"
|
||||
}
|
||||
// Save the plot to an SVG file.
|
||||
.save("kindToStars.svg")
|
||||
}
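The topics-parsing rule used in the `convert { topics }` step can be exercised in isolation with plain Kotlin; a minimal sketch, where `parseTopics` is a hypothetical helper mirroring that lambda (not part of the example file above):

```kotlin
// Hypothetical helper mirroring the `convert { topics }` lambda above:
// turns a stringified list like "[kotlin, ide]" into a List<String>.
fun parseTopics(raw: String): List<String> {
    val inner = raw.removeSurrounding("[", "]")
    return if (inner.isEmpty()) emptyList() else inner.split(',').map(String::trim)
}

fun main() {
    println(parseTopics("[kotlin, intellij, ide]")) // [kotlin, intellij, ide]
    println(parseTopics("[]"))                      // []
}
```

The empty-string check matters: splitting `""` on `','` would yield `[""]` rather than an empty list, which is why the lambda special-cases it.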
@@ -0,0 +1,304 @@
# DEMO for DataFrame, this might differ from the actual API (it's updated a bit)
openapi: 3.0.0
info:
  version: 2.0.2
  title: APIs.guru
  description: >
    Wikipedia for Web APIs. Repository of API specs in OpenAPI format.


    **Warning**: If you want to be notified about changes in advance, please join our [Slack channel](https://join.slack.com/t/mermade/shared_invite/zt-g78g7xir-MLE_CTCcXCdfJfG3CJe9qA).


    Client sample: [[Demo]](https://apis.guru/simple-ui) [[Repo]](https://github.com/APIs-guru/simple-ui)
  contact:
    name: APIs.guru
    url: https://APIs.guru
    email: mike.ralphson@gmail.com
  license:
    name: CC0 1.0
    url: https://github.com/APIs-guru/openapi-directory#licenses
  x-logo:
    url: https://apis.guru/branding/logo_vertical.svg
externalDocs:
  url: https://github.com/APIs-guru/openapi-directory/blob/master/API.md
security: [ ]
tags:
  - name: APIs
    description: Actions relating to APIs in the collection
paths:
  /list.json:
    get:
      operationId: listAPIs
      tags:
        - APIs
      summary: List all APIs
      description: >
        List all APIs in the directory.

        Returns links to the OpenAPI specification for each API in the directory.

        If an API exists in multiple versions, the `preferred` one is explicitly marked.


        Some basic info from the OpenAPI spec is cached inside each object.

        This allows generating some simple views without needing to fetch the OpenAPI spec for each API.
      responses:
        "200":
          description: OK
          content:
            application/json; charset=utf-8:
              schema:
                $ref: "#/components/schemas/APIs"
            application/json:
              schema:
                $ref: "#/components/schemas/APIs"
  /metrics.json:
    get:
      operationId: getMetrics
      summary: Get basic metrics
      description: >
        Some basic metrics for the entire directory.

        Just stunning numbers to put on the front page, intended purely for wow effect :)
      tags:
        - APIs
      responses:
        "200":
          description: OK
          content:
            application/json; charset=utf-8:
              schema:
                $ref: "#/components/schemas/Metrics"
            application/json:
              schema:
                $ref: "#/components/schemas/Metrics"
components:
  schemas:
    APIs:
      description: |
        List of API details.
        It is a JSON object with API IDs (`<provider>[:<service>]`) as keys.
      type: object
      additionalProperties:
        $ref: "#/components/schemas/API"
      minProperties: 1
      example:
        googleapis.com:drive:
          added: 2015-02-22T20:00:45.000Z
          preferred: v3
          versions:
            v2:
              added: 2015-02-22T20:00:45.000Z
              info:
                title: Drive
                version: v2
                x-apiClientRegistration:
                  url: https://console.developers.google.com
                x-logo:
                  url: https://api.apis.guru/v2/cache/logo/https_www.gstatic.com_images_icons_material_product_2x_drive_32dp.png
                x-origin:
                  format: google
                  url: https://www.googleapis.com/discovery/v1/apis/drive/v2/rest
                  version: v1
                x-preferred: false
                x-providerName: googleapis.com
                x-serviceName: drive
              swaggerUrl: https://api.apis.guru/v2/specs/googleapis.com/drive/v2/swagger.json
              swaggerYamlUrl: https://api.apis.guru/v2/specs/googleapis.com/drive/v2/swagger.yaml
              updated: 2016-06-17T00:21:44.000Z
            v3:
              added: 2015-12-12T00:25:13.000Z
              info:
                title: Drive
                version: v3
                x-apiClientRegistration:
                  url: https://console.developers.google.com
                x-logo:
                  url: https://api.apis.guru/v2/cache/logo/https_www.gstatic.com_images_icons_material_product_2x_drive_32dp.png
                x-origin:
                  format: google
                  url: https://www.googleapis.com/discovery/v1/apis/drive/v3/rest
                  version: v1
                x-preferred: true
                x-providerName: googleapis.com
                x-serviceName: drive
              swaggerUrl: https://api.apis.guru/v2/specs/googleapis.com/drive/v3/swagger.json
              swaggerYamlUrl: https://api.apis.guru/v2/specs/googleapis.com/drive/v3/swagger.yaml
              updated: 2016-06-17T00:21:44.000Z
    API:
      description: Meta information about an API
      type: object
      required:
        - added
        - preferred
        - versions
      properties:
        added:
          description: Timestamp when the API was first added to the directory
          type: string
          format: date-time
        preferred:
          description: Recommended version
          type: string
        versions:
          description: List of supported versions of the API
          type: object
          additionalProperties:
            $ref: "#/components/schemas/ApiVersion"
          minProperties: 1
      additionalProperties: false
    ApiVersion:
      type: object
      required:
        - added
        # - updated apparently not required!
        - swaggerUrl
        - swaggerYamlUrl
        - info
        - openapiVer
      properties:
        added:
          description: Timestamp when the version was added
          type: string
          format: date-time
        updated: # apparently not required!
          description: Timestamp when the version was updated
          type: string
          format: date-time
        swaggerUrl:
          description: URL to the OpenAPI definition in JSON format
          type: string
          format: url
        swaggerYamlUrl:
          description: URL to the OpenAPI definition in YAML format
          type: string
          format: url
        info:
          description: Copy of the `info` section from the OpenAPI definition
          type: object
          minProperties: 1
        externalDocs:
          description: Copy of the `externalDocs` section from the OpenAPI definition
          type: object
          minProperties: 1
        openapiVer:
          description: OpenAPI version
          type: string
      additionalProperties: false

    Metrics:
      description: List of basic metrics
      type: object
      required:
        - numSpecs
        - numAPIs
        - numEndpoints
        - unreachable
        - invalid
        - unofficial
        - fixes
        - fixedPct
        - datasets
        - stars
        - issues
        - thisWeek
      properties:
        numSpecs:
          description: Number of API specifications, including different versions of the same API
          type: integer
          minimum: 1
        numAPIs:
          description: Number of APIs
          type: integer
          minimum: 1
        numEndpoints:
          description: Total number of endpoints inside all specifications
          type: integer
          minimum: 1
        unreachable:
          description: Number of unreachable specifications
          type: integer
          minimum: 0
        invalid:
          description: Number of invalid specifications
          type: integer
          minimum: 0
        unofficial:
          description: Number of unofficial specifications
          type: integer
          minimum: 0
        fixes:
          description: Number of fixes applied to specifications
          type: integer
          minimum: 0
        fixedPct:
          description: Percentage of fixed specifications
          type: number
          minimum: 0
          maximum: 100
        datasets:
          description: An overview of the datasets used to gather the APIs
          type: array
          items:
            description: A single metric per dataset
            type: object
            required:
              - title
              - data
            properties:
              title:
                description: Title of the metric
                type: string
              data:
                description: Value of the metric per dataset
                type: object
                additionalProperties:
                  type: integer
                  minimum: 0
        stars:
          description: Number of stars on GitHub
          type: integer
          minimum: 0
        issues:
          description: Number of issues on GitHub
          type: integer
          minimum: 0
        thisWeek:
          description: Number of new specifications added/updated this week
          type: object
          required:
            - added
            - updated
          properties:
            added:
              description: Number of new specifications added this week
              type: integer
              minimum: 0
            updated:
              description: Number of specifications updated this week
              type: integer
              minimum: 0
      additionalProperties: false
      example:
        numSpecs: 1000
        numAPIs: 100
        numEndpoints: 10000
        unreachable: 10
        invalid: 10
        unofficial: 10
        fixes: 10
        fixedPct: 10
        datasets:
          - title: providerCount
            data:
              "a.com": 10
              "b.com": 20
              "c.com": 30
        stars: 1000
        issues: 100
        thisWeek:
          added: 10
          updated: 10
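The `APIs` schema keys follow the `<provider>[:<service>]` pattern described in the spec, with the service part optional. A minimal stdlib sketch of splitting such an ID; the `ApiId` type and `parseApiId` helper are hypothetical illustrations, not part of the spec:

```kotlin
// Hypothetical helper illustrating the `<provider>[:<service>]` key pattern
// from the APIs schema; the service part is absent for single-service providers.
data class ApiId(val provider: String, val service: String?)

fun parseApiId(id: String): ApiId {
    val parts = id.split(':', limit = 2)
    return ApiId(provider = parts[0], service = parts.getOrNull(1))
}

fun main() {
    println(parseApiId("googleapis.com:drive")) // ApiId(provider=googleapis.com, service=drive)
    println(parseApiId("nytimes.com"))          // ApiId(provider=nytimes.com, service=null)
}
```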
@@ -0,0 +1,42 @@
{
  "numSpecs": 3809,
  "numAPIs": 2362,
  "numEndpoints": 79405,
  "unreachable": 138,
  "invalid": 634,
  "unofficial": 24,
  "fixes": 34001,
  "fixedPct": 21,
  "datasets": [
    {
      "title": "providerCount",
      "data": {
        "adyen.com": 69,
        "amazonaws.com": 295,
        "apideck.com": 14,
        "apisetu.gov.in": 181,
        "azure.com": 1832,
        "ebay.com": 20,
        "fungenerators.com": 12,
        "googleapis.com": 443,
        "hubapi.com": 11,
        "interzoid.com": 20,
        "mastercard.com": 14,
        "microsoft.com": 27,
        "nexmo.com": 20,
        "nytimes.com": 11,
        "parliament.uk": 11,
        "sportsdata.io": 35,
        "twilio.com": 41,
        "windows.net": 10,
        "Others": 743
      }
    }
  ],
  "stars": 2964,
  "issues": 206,
  "thisWeek": {
    "added": 123,
    "updated": 119
  }
}
@@ -0,0 +1,21 @@
movieId,title,genres
9b30aff7943f44579e92c261f3adc193,Women in Black (1997),Fantasy|Suspenseful|Comedy
2a1ba1fc5caf492a80188e032995843e,Bumblebee Movie (2007),Comedy|Jazz|Family|Animation
f44ceb4771504342bb856d76c112d5a6,Magical School Boy and the Rock of Wise Men (2001),Fantasy|Growing up|Magic
43d02fb064514ff3bd30d1e3a7398357,Master of the Jewlery: The Company of the Jewel (2001),Fantasy|Magic|Suspenseful
6aa0d26a483148998c250b9c80ddf550,Sun Conflicts: Part IV: A Novel Espair (1977),Fantasy
eace16e59ce24eff90bf8924eb6a926c,The Outstanding Bulk (2008),Fantasy|Superhero|Family
ae916bc4844a4bb7b42b70d9573d05cd,In Automata (2014),Horror|Existential
c1f0a868aeb44c5ea8d154ec3ca295ac,Interplanetary (2014),Sci-fi|Futuristic
9595b771f87f42a3b8dd07d91e7cb328,Woods Run (1994),Family|Drama
aa9fc400e068443488b259ea0802a975,Anthropod-Dude (2002),Superhero|Fantasy|Family|Growing up
22d20c2ba11d44cab83aceea39dc00bd,The Chamber (2003),Comedy|Drama
8cf4d0c1bd7b41fab6af9d92c892141f,That Thing About an Iceberg (1997),Drama|History|Family|Romance
c2f3e7588da84684a7d78d6bd8d8e1f4,Vehicles (2006),Animation|Family
ce06175106af4105945f245161eac3c7,Playthings Tale (1995),Animation|Family
ee28d7e69103485c83e10b8055ef15fb,Metal Man 2 (2010),Fantasy|Superhero|Family
c32bdeed466f4ec09de828bb4b6fc649,Surgeon Odd in the Omniverse of Crazy (2022),Fantasy|Superhero|Family|Horror
d4a325ab648a42c4a2d6f35dfabb387f,Bad Dream on Pine Street (1984),Horror
60ebe74947234ddcab49dea1a958faed,The Shimmering (1980),Horror
f24327f2b05147b197ca34bf13ae3524,Krubit: Societal Teachings for Do Many Good Amazing Country of Uzbekistan (2006),Comedy
2bb29b3a245e434fa80542e711fd2cee,This is No Movie (1950),(no genres listed)
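The `genres` column in this CSV packs multiple values into one `|`-separated string, with `(no genres listed)` marking movies without genres. A minimal stdlib sketch of the kind of cleaning step the movies example performs on this column; the `parseGenres` helper is a hypothetical illustration:

```kotlin
// Hypothetical helper: split the pipe-separated `genres` field from the CSV above.
// "(no genres listed)" is the dataset's marker for a movie with no genres.
fun parseGenres(raw: String): List<String> =
    if (raw == "(no genres listed)") emptyList() else raw.split('|')

fun main() {
    println(parseGenres("Fantasy|Suspenseful|Comedy")) // [Fantasy, Suspenseful, Comedy]
    println(parseGenres("(no genres listed)"))         // []
}
```

Treating the marker string as an empty list keeps downstream per-genre operations (grouping, counting) from seeing it as a genre of its own.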