init research

2026-02-08 11:20:43 -10:00
commit bdf064f54d
3041 changed files with 1592200 additions and 0 deletions
@@ -0,0 +1,53 @@
# Apache Arrow
<web-summary>
Read and write Apache Arrow files in Kotlin — efficient binary format support with Kotlin DataFrame.
</web-summary>
<card-summary>
Work with Arrow files in Kotlin for fast I/O — supports both streaming and random access formats.
</card-summary>
<link-summary>
Kotlin DataFrame provides full support for reading and writing Apache Arrow files in high-performance workflows.
</link-summary>
Kotlin DataFrame supports reading from and writing to Apache Arrow files.
Requires the [`dataframe-arrow` module](Modules.md#dataframe-arrow), which is included by
default in the general [`dataframe`](Modules.md#dataframe-general) artifact
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
> Make sure to follow the
> [Apache Arrow Java compatibility guide](https://arrow.apache.org/docs/java/install.html#java-compatibility)
> when using Java 9+.
> {style="warning"}
> Structured (nested) Arrow types such as Struct are not supported yet in Kotlin DataFrame.
> See the issue: [Add inner / Struct type support in Arrow](https://github.com/Kotlin/dataframe/issues/536)
> {style="warning"}
## Read
[`DataFrame`](DataFrame.md) supports both the
[Arrow interprocess streaming format](https://arrow.apache.org/docs/java/ipc.html#writing-and-reading-streaming-format)
and the [Arrow random access format](https://arrow.apache.org/docs/java/ipc.html#writing-and-reading-random-access-files).
You can read a `DataFrame` from Apache Arrow data sources
(via a file path, URL, or stream) using the [`readArrowFeather()`](read.md#read-apache-arrow-formats) method:
```kotlin
val df = DataFrame.readArrowFeather("example.feather")
```
```kotlin
val df = DataFrame.readArrowFeather("https://kotlin.github.io/dataframe/resources/example.feather")
```
## Write
A [`DataFrame`](DataFrame.md) can be written to Arrow format using the interprocess streaming or random access format.
Output targets include `WritableByteChannel`, `OutputStream`, `File`, or `ByteArray`.
See [](write.md#writing-to-apache-arrow-formats) for more details.
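As a rough illustration of the two formats, the sketch below writes a small frame in the random access (Feather) format to a temporary file and in the streaming (IPC) format to a `ByteArray`, then reads both back. The sample data and the temporary file are assumptions made for the example, and the writer and reader names follow the `dataframe-arrow` functions described in the linked read and write topics, so check them against the version you use.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*
import java.io.File

// Sample data invented for this sketch
val df = dataFrameOf("name", "age")(
    "Alice", 30,
    "Bob", 25,
)

// Random access (Feather) format, written to and read from a file
val file = File.createTempFile("example", ".feather")
df.writeArrowFeather(file)
val fromFile = DataFrame.readArrowFeather(file)

// Streaming (IPC) format, kept in memory as a ByteArray
val bytes: ByteArray = df.saveArrowIPCToByteArray()
val fromBytes = DataFrame.readArrowIPC(bytes)
```

The file-based variant suits data that other tools will open later, while the in-memory variant is convenient when the bytes are passed straight to another process or service.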
@@ -0,0 +1,52 @@
# CSV / TSV
<web-summary>
Work with CSV and TSV files — read, analyze, and export tabular data using Kotlin DataFrame.
</web-summary>
<card-summary>
Seamlessly load and write CSV or TSV files in Kotlin — perfect for common tabular data workflows.
</card-summary>
<link-summary>
Kotlin DataFrame support for reading and writing CSV and TSV files with simple, type-safe APIs.
</link-summary>
Kotlin DataFrame supports reading from and writing to CSV and TSV files.
Requires the [`dataframe-csv` module](Modules.md#dataframe-csv),
which is included by default in the general [`dataframe`](Modules.md#dataframe-general)
artifact and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
## Read
You can read a [`DataFrame`](DataFrame.md) from a CSV or TSV file (via a file path or URL)
using the [`readCsv()`](read.md#read-from-csv) or `readTsv()` methods:
```kotlin
val df = DataFrame.readCsv("example.csv")
```
```kotlin
val df = DataFrame.readCsv("https://kotlin.github.io/dataframe/resources/example.csv")
```
## Write
You can write a [`DataFrame`](DataFrame.md) to a CSV file using the [`writeCsv()`](write.md#writing-to-csv) method:
```kotlin
df.writeCsv("example.csv")
```
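A minimal round-trip sketch, assuming the `dataframe-csv` module is on the classpath: it writes a small frame as CSV and as TSV to temporary files and reads each back. The sample data and file locations are placeholders chosen for the example.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*
import java.io.File

// Sample data invented for this sketch
val df = dataFrameOf("name", "age")(
    "Alice", 30,
    "Bob", 25,
)

// Comma-separated round trip
val csvPath = File.createTempFile("example", ".csv").absolutePath
df.writeCsv(csvPath)
val fromCsv = DataFrame.readCsv(csvPath)

// Tab-separated round trip
val tsvPath = File.createTempFile("example", ".tsv").absolutePath
df.writeTsv(tsvPath)
val fromTsv = DataFrame.readTsv(tsvPath)
```

Column types are inferred again on reading, so a round trip through text may not preserve every type exactly.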
## Deephaven CSV
The [`dataframe-csv`](Modules.md#dataframe-csv) module uses the high-performance
[Deephaven CSV library](https://github.com/deephaven/deephaven-csv) under the hood
for fast and efficient CSV reading and writing.
If you're working with large CSV files, you can adjust the parser manually
by [configuring Deephaven-specific parameters](https://kotlin.github.io/dataframe/read.html#unlocking-deephaven-csv-features)
to get the best performance for your use case.
@@ -0,0 +1,35 @@
# Data Sources
<web-summary>
Discover all the data formats Kotlin DataFrame can work with — including JSON, CSV, Excel, SQL databases, and more.
</web-summary>
<card-summary>
Explore supported data sources in Kotlin DataFrame and how to integrate them into your data processing workflow.
</card-summary>
<link-summary>
Explore supported data sources in Kotlin DataFrame and how to integrate them into your data processing workflow.
</link-summary>
One of the key aspects of working with data is being able to read from and write to various data sources.
Kotlin DataFrame provides seamless support for a wide range of formats to integrate into your data workflows.
Below you'll find a list of supported sources along with instructions on how to read and write data using them.
- [JSON](JSON.md)
- [OpenAPI](OpenAPI.md)
- [CSV / TSV](CSV-TSV.md)
- [Excel](Excel.md)
- [Apache Arrow](ApacheArrow.md)
- [Parquet](Parquet.md)
- [SQL](SQL.md):
    - [PostgreSQL](PostgreSQL.md)
    - [MySQL](MySQL.md)
    - [Microsoft SQL Server](Microsoft-SQL-Server.md)
    - [SQLite](SQLite.md)
    - [H2](H2.md)
    - [MariaDB](MariaDB.md)
    - [DuckDB](DuckDB.md)
    - [Custom SQL Source](Custom-SQL-Source.md)
- [Custom integrations with unsupported data sources](Integrations.md)
@@ -0,0 +1,42 @@
# Excel
<web-summary>
Read from and write to Excel files in `.xls` or `.xlsx` formats with Kotlin DataFrame for seamless spreadsheet integration.
</web-summary>
<card-summary>
Kotlin DataFrame makes it easy to load and save data from Excel files — perfect for working with spreadsheet-based workflows.
</card-summary>
<link-summary>
Learn how to read and write Excel files using Kotlin DataFrame with just a single line of code.
</link-summary>
Kotlin DataFrame supports reading from and writing to Excel files in both `.xls` and `.xlsx` formats.
Requires the [`dataframe-excel` module](Modules.md#dataframe-excel),
which is included by default in the general [`dataframe`](Modules.md#dataframe-general)
artifact and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
## Read
You can read a [`DataFrame`](DataFrame.md) from an Excel file (via a file path or URL)
using the [`readExcel()`](read.md#read-from-excel) method:
```kotlin
val df = DataFrame.readExcel("example.xlsx")
```
```kotlin
val df = DataFrame.readExcel("https://kotlin.github.io/dataframe/resources/example.xlsx")
```
## Write
You can write a [`DataFrame`](DataFrame.md) to an Excel file using the
[`writeExcel()`](write.md#write-to-excel-spreadsheet) method:
```kotlin
df.writeExcel("example.xlsx")
```
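When a workbook has several sheets, it can help to name the sheet explicitly. The sketch below assumes the `sheetName` parameter accepted by `writeExcel()` and `readExcel()`; the sample data, the output file name, and the sheet name `Sales` are made up for the example.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*

// Sample data invented for this sketch
val df = dataFrameOf("product", "amount")(
    "Tea", 12,
    "Coffee", 7,
)

// Write to a named sheet, then read the same sheet back
df.writeExcel("example.xlsx", sheetName = "Sales")
val restored = DataFrame.readExcel("example.xlsx", sheetName = "Sales")
```

Excel stores numbers as floating-point values, so numeric columns may come back with a different type than they were written with.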
@@ -0,0 +1,26 @@
# Custom integrations with unsupported data sources
<web-summary>
Examples of how to integrate Kotlin DataFrame with other data frameworks like Exposed, Spark, or Multik.
</web-summary>
<card-summary>
Integrate Kotlin DataFrame with unsupported sources — see practical examples with Exposed, Spark, and more.
</card-summary>
<link-summary>
How to connect Kotlin DataFrame with data sources like Exposed, Apache Spark, or Multik.
</link-summary>
Some data sources are not officially supported in the Kotlin DataFrame API yet —
but you can still integrate them easily using custom code.
Below is a list of example integrations with other data frameworks.
These examples demonstrate how to bridge Kotlin DataFrame with external libraries or APIs.
- [Kotlin Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed)
- [Apache Spark (with/without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/spark)
- [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik)
You can use these examples as templates to create your own integrations
with any data processing library that produces structured tabular data.
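In many cases the bridge comes down to collecting rows or columns from the external library and converting them. A minimal sketch, assuming the external library hands back plain Kotlin objects or column lists; the `Customer` class and its values are hypothetical stand-ins for that output:

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical row type standing in for whatever the external library returns
data class Customer(val id: Int, val name: String, val city: String?)

val rows = listOf(
    Customer(1, "Alice", "Berlin"),
    Customer(2, "Bob", null),
)

// Row-oriented data: one column per property of the class
val df = rows.toDataFrame()

// Column-oriented data: one column per map entry
val byColumns = mapOf(
    "id" to listOf(1, 2),
    "name" to listOf("Alice", "Bob"),
).toDataFrame()

df.print()
```

The linked examples cover the harder part, which is mapping library-specific types and schemas onto DataFrame column types.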
@@ -0,0 +1,47 @@
# JSON
<web-summary>
Support for working with JSON data — load, explore, and save structured JSON using Kotlin DataFrame.
</web-summary>
<card-summary>
Easily handle JSON data in Kotlin — read from files or URLs, and export your data back to JSON format.
</card-summary>
<link-summary>
Kotlin DataFrame support for reading and writing JSON files in a structured and type-safe way.
</link-summary>
Kotlin DataFrame supports reading from and writing to JSON files.
Requires the [`dataframe-json` module](Modules.md#dataframe-json),
which is included by default in the general [`dataframe`](Modules.md#dataframe-general)
artifact and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe)
for Kotlin Notebook.
> Kotlin DataFrame is suitable only for working with table-like structured JSON —
> a list of objects where each object represents a row and all objects share the same structure.
>
> Experimental support for [OpenAPI JSON schemas](OpenAPI.md) is also available.
> {style="note"}
## Read
You can read a [`DataFrame`](DataFrame.md) or [`DataRow`](DataRow.md)
from a JSON file (via a file path or URL) using the [`readJson()`](read.md#read-from-json) method:
```kotlin
val df = DataFrame.readJson("example.json")
```
```kotlin
val df = DataFrame.readJson("https://kotlin.github.io/dataframe/resources/example.json")
```
## Write
You can write a [`DataFrame`](DataFrame.md) to a JSON file using the [`writeJson()`](write.md#writing-to-json) method:
```kotlin
df.writeJson("example.json")
```
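To make the "table-like" requirement concrete, here is a small sketch that parses an in-memory JSON array in which every object has the same keys, then writes it back out. The JSON content and the output file name are invented for the example, and `readJsonStr()` is used instead of `readJson()` only so that the snippet needs no input file.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*

// A list of objects with the same structure: each object becomes one row
val json = """
    [
      {"name": "Alice", "age": 30},
      {"name": "Bob", "age": 25}
    ]
""".trimIndent()

val df = DataFrame.readJsonStr(json)

// Write the frame back to disk as formatted JSON
df.writeJson("example.json", prettyPrint = true)
```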
@@ -0,0 +1,34 @@
# OpenAPI
<web-summary>
Work with JSON data based on OpenAPI 3.0 schemas using Kotlin DataFrame — helpful for consuming structured API responses.
</web-summary>
<card-summary>
Use Kotlin DataFrame to read and write data that conforms to OpenAPI specifications. Great for API-driven data workflows.
</card-summary>
<link-summary>
Learn how to use OpenAPI 3.0 JSON schemas with Kotlin DataFrame to load and manipulate API-defined data.
</link-summary>
> **Experimental**: Support for OpenAPI 3.0.0 schemas is currently experimental
> and may change or be removed in future releases.
> {style="warning"}
Kotlin DataFrame provides support for reading and writing JSON data
that conforms to [OpenAPI 3.0 specifications](https://www.openapis.org).
This feature is useful when working with APIs that expose structured data defined via OpenAPI schemas.
Requires the [`dataframe-openapi` module](Modules.md#dataframe-openapi),
which **is not included** in the general [`dataframe`](Modules.md#dataframe-general) artifact.
To enable it in Kotlin Notebook, use:
```kotlin
%use dataframe(enableExperimentalOpenApi=true)
```
See [the OpenAPI guide notebook](https://github.com/Kotlin/dataframe/blob/master/examples/notebooks/json/KeyValueAndOpenApi.ipynb)
for details on how to work with OpenAPI-based data.
@@ -0,0 +1,159 @@
# Parquet
<web-summary>
Read Parquet files via Apache Arrow in Kotlin DataFrame: high-performance columnar storage for analytics.
</web-summary>
<card-summary>
Use Kotlin DataFrame to read Parquet datasets using Apache Arrow for fast, typed, columnar I/O.
</card-summary>
<link-summary>
Kotlin DataFrame can read Parquet files through Apache Arrow's Dataset API. Learn how and when to use it.
</link-summary>
Kotlin DataFrame supports reading [Apache Parquet](https://parquet.apache.org/) files through the Apache Arrow integration.
Requires the [`dataframe-arrow` module](Modules.md#dataframe-arrow), which is included by default in the general [`dataframe`](Modules.md#dataframe-general) artifact and when using `%use dataframe` for Kotlin Notebook.
> Only reading Parquet via Apache Arrow is currently supported; writing Parquet is not supported in Kotlin DataFrame.
> {style="note"}
> Apache Arrow is not supported on Android, so reading Parquet files on Android is not available.
> {style="warning"}
> Structured (nested) Arrow types such as Struct are not supported yet in Kotlin DataFrame.
> See the issue: [Add inner / Struct type support in Arrow](https://github.com/Kotlin/dataframe/issues/536)
> {style="warning"}
## Reading Parquet Files
Kotlin DataFrame provides four `readParquet()` methods that can read from different source types.
All overloads accept optional `nullability` inference settings and `batchSize` for Arrow scanning.
```kotlin
// 1) URLs
public fun DataFrame.Companion.readParquet(
    vararg urls: URL,
    nullability: NullabilityOptions = NullabilityOptions.Infer,
    batchSize: Long = ARROW_PARQUET_DEFAULT_BATCH_SIZE,
): AnyFrame
// 2) Strings (interpreted as file paths or URLs, e.g., "data/file.parquet", "file://", or "http(s)://")
public fun DataFrame.Companion.readParquet(
    vararg strUrls: String,
    nullability: NullabilityOptions = NullabilityOptions.Infer,
    batchSize: Long = ARROW_PARQUET_DEFAULT_BATCH_SIZE,
): AnyFrame
// 3) Paths
public fun DataFrame.Companion.readParquet(
    vararg paths: Path,
    nullability: NullabilityOptions = NullabilityOptions.Infer,
    batchSize: Long = ARROW_PARQUET_DEFAULT_BATCH_SIZE,
): AnyFrame
// 4) Files
public fun DataFrame.Companion.readParquet(
    vararg files: File,
    nullability: NullabilityOptions = NullabilityOptions.Infer,
    batchSize: Long = ARROW_PARQUET_DEFAULT_BATCH_SIZE,
): AnyFrame
```
These overloads are defined in the `dataframe-arrow` module and internally use `FileFormat.PARQUET` from Apache Arrow's
Dataset API to scan the data and materialize it as a Kotlin `DataFrame`.
### Examples
```kotlin
// Read from file paths (as strings)
val df = DataFrame.readParquet("data/sales.parquet")
```
<!---FUN readParquetFilePath-->
```kotlin
// Read from Path objects
val path = Paths.get("data/sales.parquet")
val df = DataFrame.readParquet(path)
```
<!---END-->
<!---FUN readParquetURL-->
```kotlin
// Read from URLs
val df = DataFrame.readParquet(url)
```
<!---END-->
<!---FUN readParquetFile-->
```kotlin
// Read from File objects
val file = File("data/sales.parquet")
val df = DataFrame.readParquet(file)
```
<!---END-->
<!---FUN readParquetFileWithParameters-->
```kotlin
// Read from a File object with explicit nullability and batch size settings
val file = File("data/sales.parquet")
val df = DataFrame.readParquet(
    file,
    nullability = NullabilityOptions.Infer,
    batchSize = 64L * 1024
)
```
<!---END-->
If you want to see a complete, realistic data-engineering example using Spark and Parquet with Kotlin DataFrame,
check out the [example project](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/spark-parquet-dataframe).
### Multiple Files
It's possible to read multiple Parquet files:
<!---FUN readMultipleParquetFiles-->
```kotlin
val file = File("data/sales.parquet")
val file1 = File("data/sales1.parquet")
val file2 = File("data/sales2.parquet")
val df = DataFrame.readParquet(file, file1, file2)
```
<!---END-->
**Requirements:**
- All files must have compatible schemas
- Files are vertically concatenated (union of rows)
- Column types must match exactly
- Missing columns in some files will result in null values
### Performance tips
- **Column selection**: Because the `readParquet` method reads all columns, use DataFrame operations like `select()` immediately after reading to reduce memory usage in later operations
- **Predicate pushdown**: Currently not supported—filtering happens after data is loaded into memory
- Use Arrow-compatible JVMs as documented in
[Apache Arrow Java compatibility](https://arrow.apache.org/docs/java/install.html#java-compatibility).
- Adjust `batchSize` if you read huge files and need to tune throughput vs. memory.
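The column selection and batch size tips can be combined as in the sketch below. The column names `region` and `amount` are hypothetical, and the batch size is only an example value, not a recommendation.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*

// All columns are read first; select() then drops the ones that are not needed,
// so later operations work on a smaller frame.
// "region" and "amount" are hypothetical column names.
val sales = DataFrame
    .readParquet("data/sales.parquet", batchSize = 64L * 1024)
    .select("region", "amount")
```

Since predicate pushdown is not available, any row filtering would also go after the read, ideally right after the `select()` call.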
### See also
- [](ApacheArrow.md) — reading/writing Arrow IPC formats.
- [Parquet official site](https://parquet.apache.org/).
- Example: [Spark + Parquet + Kotlin DataFrame](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/spark-parquet-dataframe)
- [](Data-Sources.md) — Overview of all supported formats
@@ -0,0 +1,22 @@
# Custom SQL Source
<web-summary>
Connect Kotlin DataFrame to any JDBC-compatible database using a custom SQL source configuration.
</web-summary>
<card-summary>
Easily integrate unsupported SQL databases in Kotlin DataFrame using a flexible custom source setup.
</card-summary>
<link-summary>
Define a custom SQL source in Kotlin DataFrame to work with any JDBC-based database.
</link-summary>
If your SQL database is not officially supported, you can either
[create an issue](https://github.com/Kotlin/dataframe/issues)
or define a simple, configurable custom SQL source.
See the [How to Extend DataFrame Library for Custom SQL Database Support guide](readSqlFromCustomDatabase.md)
for detailed instructions and an example with HSQLDB.
@@ -0,0 +1,107 @@
# DuckDB
<web-summary>
Work with DuckDB databases in Kotlin — read tables and queries into DataFrames using JDBC.
</web-summary>
<card-summary>
Use Kotlin DataFrame to query and transform DuckDB data directly via JDBC.
</card-summary>
<link-summary>
Read DuckDB data into Kotlin DataFrame with JDBC support.
</link-summary>
<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.io.DuckDb-->
Kotlin DataFrame supports reading from [DuckDB](https://duckdb.org/) databases using JDBC.
This requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
You'll also need [the official DuckDB JDBC driver](https://duckdb.org/docs/stable/clients/java):
<tabs>
<tab title="Gradle project">
```kotlin
dependencies {
    implementation("org.duckdb:duckdb_jdbc:$version")
}
```
</tab>
<tab title="Kotlin Notebook">
```kotlin
USE {
    dependencies("org.duckdb:duckdb_jdbc:$version")
}
```
</tab>
</tabs>
The current driver version on Maven Central can be found
[here](https://mvnrepository.com/artifact/org.duckdb/duckdb_jdbc).
## Read
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
It is also possible to load all data from non-system tables, each into a separate `DataFrame`
([`readAllSqlTables`](readSqlDatabases.md)).
See [](readSqlDatabases.md) for more details.
<!---FUN readSqlTable-->
```kotlin
val url = "jdbc:duckdb:/testDatabase"
val username = "duckdb"
val password = "password"
val dbConfig = DbConnectionConfig(url, username, password)
val tableName = "Customer"
val df = DataFrame.readSqlTable(dbConfig, tableName)
```
<!---END-->
### Extensions
DuckDB has a special trick up its sleeve: it has support
for [extensions](https://duckdb.org/docs/stable/extensions/overview).
These can be installed, loaded, and used to connect to a different database via DuckDB.
See [Core Extensions](https://duckdb.org/docs/stable/core_extensions/overview) for a list of available extensions.
For example, let's load a dataframe
from [Apache Iceberg via DuckDB](https://duckdb.org/docs/stable/core_extensions/iceberg/overview.html),
as Iceberg is an unsupported data source in DataFrame at the moment:
<!---FUN readIcebergExtension-->
```kotlin
// Creating an in-memory DuckDB database
val connection = DriverManager.getConnection("jdbc:duckdb:")
val df = connection.use { connection ->
    // install and load Iceberg
    connection.createStatement().execute("INSTALL iceberg; LOAD iceberg;")
    // query a table from Iceberg using a specific SQL query
    DataFrame.readSqlQuery(
        connection = connection,
        sqlQuery = "SELECT * FROM iceberg_scan('data/iceberg/lineitem_iceberg', allow_moved_paths = true);",
    )
}
```
<!---END-->
As you can see, the process is very similar to reading from any other JDBC database,
just without needing explicit DataFrame support.
@@ -0,0 +1,98 @@
# H2
<web-summary>
Use Kotlin DataFrame to query H2 databases via JDBC — read tables, run SQL queries, or fetch result sets directly.
</web-summary>
<card-summary>
Connect to H2 databases in Kotlin DataFrame and load data using simple JDBC configurations.
</card-summary>
<link-summary>
Read from H2 databases in Kotlin DataFrame using built-in SQL reading methods.
</link-summary>
Kotlin DataFrame supports reading from an [H2](https://www.h2database.com/html/main.html) database using JDBC.
Requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
You'll also need [the official H2 JDBC driver](https://www.h2database.com/html/main.html):
<tabs>
<tab title="Gradle project">
```kotlin
dependencies {
    implementation("com.h2database:h2:$version")
}
```
</tab>
<tab title="Kotlin Notebook">
```kotlin
USE {
    dependencies("com.h2database:h2:$version")
}
```
</tab>
</tabs>
The current driver version on Maven Central can be found
[here](https://mvnrepository.com/artifact/com.h2database/h2).
## Read
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
It is also possible to load all data from non-system tables, each into a separate `DataFrame` ([`readAllSqlTables`](readSqlDatabases.md)).
See [](readSqlDatabases.md) for more details.
### H2 Compatibility Modes
When working with an H2 database, the library automatically detects the compatibility mode from the connection.
If no `MODE` is specified in the JDBC URL, the default `Regular` mode is used.
H2 supports the following compatibility modes: `MySQL`, `PostgreSQL`, `MSSQLServer`, `MariaDB`, and `Regular`.
```kotlin
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.api.*
// Basic H2 connection (uses Regular mode by default)
val url = "jdbc:h2:mem:testDatabase"
val username = "sa"
val password = ""
val dbConfig = DbConnectionConfig(url, username, password)
val tableName = "Customer"
val df = DataFrame.readSqlTable(dbConfig, tableName)
```
```kotlin
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.api.*
// H2 with PostgreSQL compatibility mode
val postgresUrl = "jdbc:h2:mem:testDatabase;MODE=PostgreSQL"
val username = "sa"
val password = ""
val postgresConfig = DbConnectionConfig(postgresUrl, username, password)
val tableName = "Customer"
val dfPostgres = DataFrame.readSqlTable(postgresConfig, tableName)
```
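To see which mode a given JDBC URL requests, the rule above can be sketched with the Kotlin standard library alone. `h2ModeFromUrl` is a hypothetical helper written for this illustration. It is not part of Kotlin DataFrame, which detects the mode from the connection itself rather than by parsing the URL.

```kotlin
// Hypothetical helper, not part of Kotlin DataFrame:
// returns the value of the MODE setting in an H2 JDBC URL,
// or "Regular" when the URL does not specify one.
fun h2ModeFromUrl(jdbcUrl: String): String {
    val match = Regex(";MODE=([^;]+)", RegexOption.IGNORE_CASE).find(jdbcUrl)
    return match?.groupValues?.get(1) ?: "Regular"
}

println(h2ModeFromUrl("jdbc:h2:mem:testDatabase"))                 // prints "Regular"
println(h2ModeFromUrl("jdbc:h2:mem:testDatabase;MODE=PostgreSQL")) // prints "PostgreSQL"
```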
@@ -0,0 +1,72 @@
# MariaDB
<web-summary>
Access MariaDB databases using Kotlin DataFrame and JDBC — fetch data from tables or custom SQL queries with ease.
</web-summary>
<card-summary>
Seamlessly integrate MariaDB with Kotlin DataFrame — load data using JDBC and analyze it in Kotlin.
</card-summary>
<link-summary>
Read data from MariaDB into Kotlin DataFrame using standard JDBC configurations.
</link-summary>
Kotlin DataFrame supports reading from a [MariaDB](https://mariadb.org) database using JDBC.
Requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
You'll also need [the official MariaDB JDBC driver](https://mariadb.com/docs/connectors/mariadb-connector-j):
<tabs>
<tab title="Gradle project">
```kotlin
dependencies {
    implementation("org.mariadb.jdbc:mariadb-java-client:$version")
}
```
</tab>
<tab title="Kotlin Notebook">
```kotlin
USE {
    dependencies("org.mariadb.jdbc:mariadb-java-client:$version")
}
```
</tab>
</tabs>
The current driver version on Maven Central can be found
[here](https://mvnrepository.com/artifact/org.mariadb.jdbc/mariadb-java-client).
## Read
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
It is also possible to load all data from non-system tables, each into a separate `DataFrame` ([`readAllSqlTables`](readSqlDatabases.md)).
See [](readSqlDatabases.md) for more details.
```kotlin
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.api.*
val url = "jdbc:mariadb://localhost:3306/testDatabase"
val username = "root"
val password = "password"
val dbConfig = DbConnectionConfig(url, username, password)
val tableName = "Customer"
val df = DataFrame.readSqlTable(dbConfig, tableName)
```
@@ -0,0 +1,74 @@
# Microsoft SQL Server (MS SQL)
<web-summary>
Connect to Microsoft SQL Server using Kotlin DataFrame and JDBC — load structured data directly into your Kotlin workflow.
</web-summary>
<card-summary>
Use Kotlin DataFrame to read from Microsoft SQL Server — run queries or load entire tables via JDBC.
</card-summary>
<link-summary>
Fetch data from Microsoft SQL Server into Kotlin DataFrame using JDBC configuration.
</link-summary>
Kotlin DataFrame supports reading from a [Microsoft SQL Server (MS SQL)](https://www.microsoft.com/en-us/sql-server)
database using JDBC.
Requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
You'll also need
[the official MS SQL JDBC driver](https://learn.microsoft.com/en-us/sql/connect/jdbc/download-microsoft-jdbc-driver-for-sql-server?view=sql-server-ver17):
<tabs>
<tab title="Gradle project">
```kotlin
dependencies {
    implementation("com.microsoft.sqlserver:mssql-jdbc:$version")
}
```
</tab>
<tab title="Kotlin Notebook">
```kotlin
USE {
    dependencies("com.microsoft.sqlserver:mssql-jdbc:$version")
}
```
</tab>
</tabs>
The current driver version on Maven Central can be found
[here](https://mvnrepository.com/artifact/com.microsoft.sqlserver/mssql-jdbc).
## Read
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
It is also possible to load all data from non-system tables, each into a separate `DataFrame` ([`readAllSqlTables`](readSqlDatabases.md)).
See [](readSqlDatabases.md) for more details.
```kotlin
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.api.*
val url = "jdbc:sqlserver://localhost:1433;databaseName=testDatabase"
val username = "sa"
val password = "password"
val dbConfig = DbConnectionConfig(url, username, password)
val tableName = "Customer"
val df = DataFrame.readSqlTable(dbConfig, tableName)
```
@@ -0,0 +1,72 @@
# MySQL
<web-summary>
Connect to MySQL databases and load data into Kotlin DataFrame using JDBC — query, analyze, and transform SQL data in Kotlin.
</web-summary>
<card-summary>
Use Kotlin DataFrame with MySQL — easily read tables and queries over JDBC into powerful data structures.
</card-summary>
<link-summary>
Read data from MySQL into Kotlin DataFrame using JDBC configuration.
</link-summary>
Kotlin DataFrame supports reading from a [MySQL](https://www.mysql.com) database using JDBC.
Requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
You'll also need [the official MySQL JDBC driver](https://dev.mysql.com/downloads/connector/j/):
<tabs>
<tab title="Gradle project">
```kotlin
dependencies {
    implementation("com.mysql:mysql-connector-j:$version")
}
```
</tab>
<tab title="Kotlin Notebook">
```kotlin
USE {
    dependencies("com.mysql:mysql-connector-j:$version")
}
```
</tab>
</tabs>
The current driver version on Maven Central can be found
[here](https://mvnrepository.com/artifact/com.mysql/mysql-connector-j).
## Read
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
It is also possible to load all data from non-system tables, each into a separate `DataFrame` ([`readAllSqlTables`](readSqlDatabases.md)).
See [](readSqlDatabases.md) for more details.
```kotlin
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.api.*
val url = "jdbc:mysql://localhost:3306/testDatabase"
val username = "root"
val password = "password"
val dbConfig = DbConnectionConfig(url, username, password)
val tableName = "Customer"
val df = DataFrame.readSqlTable(dbConfig, tableName)
```
@@ -0,0 +1,71 @@
# PostgreSQL
<web-summary>
Work with PostgreSQL databases in Kotlin — read tables and queries into DataFrames using JDBC.
</web-summary>
<card-summary>
Use Kotlin DataFrame to query and transform PostgreSQL data directly via JDBC.
</card-summary>
<link-summary>
Read PostgreSQL data into Kotlin DataFrame with JDBC support.
</link-summary>
Kotlin DataFrame supports reading from a [PostgreSQL](https://www.postgresql.org) database using JDBC.
Requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
You'll also need [the official PostgreSQL JDBC driver](https://jdbc.postgresql.org):
<tabs>
<tab title="Gradle project">
```kotlin
dependencies {
    implementation("org.postgresql:postgresql:$version")
}
```
</tab>
<tab title="Kotlin Notebook">
```kotlin
USE {
    dependencies("org.postgresql:postgresql:$version")
}
```
</tab>
</tabs>
The current driver version on Maven Central can be found
[here](https://mvnrepository.com/artifact/org.postgresql/postgresql).
## Read
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
It is also possible to load all data from non-system tables, each into a separate `DataFrame` ([`readAllSqlTables`](readSqlDatabases.md)).
See [](readSqlDatabases.md) for more details.
```kotlin
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
val url = "jdbc:postgresql://localhost:5432/testDatabase"
val username = "postgres"
val password = "password"
val dbConfig = DbConnectionConfig(url, username, password)
val tableName = "Customer"
val df = DataFrame.readSqlTable(dbConfig, tableName)
```
@@ -0,0 +1,46 @@
# SQL
<web-summary>
Work with SQL databases in Kotlin using DataFrame and JDBC — read tables and queries with ease.
</web-summary>
<card-summary>
Connect to PostgreSQL, MySQL, SQLite, and other SQL databases using Kotlin DataFrame's JDBC support.
</card-summary>
<link-summary>
Load data from SQL databases into Kotlin DataFrame using JDBC and built-in reading functions.
</link-summary>
Kotlin DataFrame supports reading from SQL databases using JDBC.
Requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
You'll also need a JDBC driver for the specific database.
## Supported databases
Kotlin DataFrame provides out-of-the-box support for the most common SQL databases:
- [PostgreSQL](PostgreSQL.md)
- [MySQL](MySQL.md)
- [Microsoft SQL Server](Microsoft-SQL-Server.md)
- [SQLite](SQLite.md)
- [H2](H2.md)
- [MariaDB](MariaDB.md)
- [DuckDB](DuckDB.md)
You can also define a [Custom SQL Source](Custom-SQL-Source.md)
to work with any other JDBC-compatible database.
## Read
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
It is also possible to load all data from non-system tables, each into a separate `DataFrame`
([`readAllSqlTables`](readSqlDatabases.md)).
See [](readSqlDatabases.md) for more details.
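The sketch below puts three of these entry points side by side. It assumes a PostgreSQL server is running locally with a `Customer` table that has an `age` column. The URL, credentials, table name, and query are placeholders to adapt to your database, and the matching JDBC driver must be on the classpath.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.io.readAllSqlTables
import org.jetbrains.kotlinx.dataframe.io.readSqlQuery
import org.jetbrains.kotlinx.dataframe.io.readSqlTable

// Placeholder connection settings for a local test database
val dbConfig = DbConnectionConfig(
    "jdbc:postgresql://localhost:5432/testDatabase",
    "postgres",
    "password",
)

// One table, by name
val customers = DataFrame.readSqlTable(dbConfig, "Customer")

// The result of a user-defined query (the "age" column is an assumption)
val adults = DataFrame.readSqlQuery(dbConfig, "SELECT * FROM Customer WHERE age >= 18")

// Every non-system table, each loaded into its own DataFrame
val allTables = DataFrame.readAllSqlTables(dbConfig)
```

The exact collection type returned by `readAllSqlTables` may differ between library versions, so the sketch leaves it to type inference.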
@@ -0,0 +1,70 @@
# SQLite
<web-summary>
Use Kotlin DataFrame to read data from SQLite databases with minimal setup via JDBC.
</web-summary>
<card-summary>
Query and transform SQLite data directly in Kotlin using DataFrame and JDBC.
</card-summary>
<link-summary>
Read SQLite tables into Kotlin DataFrame using the built-in JDBC integration.
</link-summary>
Kotlin DataFrame supports reading from a [SQLite](https://www.sqlite.org) database using JDBC.
Requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
You'll also need the [SQLite JDBC driver](https://github.com/xerial/sqlite-jdbc):
<tabs>
<tab title="Gradle project">
```kotlin
dependencies {
    implementation("org.xerial:sqlite-jdbc:$version")
}
```
</tab>
<tab title="Kotlin Notebook">
```kotlin
USE {
    dependencies("org.xerial:sqlite-jdbc:$version")
}
```
</tab>
</tabs>
The current driver version on Maven Central can be found
[here](https://mvnrepository.com/artifact/org.xerial/sqlite-jdbc).
## Read
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
It is also possible to load all data from non-system tables, each into a separate `DataFrame` ([`readAllSqlTables`](readSqlDatabases.md)).
See [](readSqlDatabases.md) for more details.
```kotlin
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.api.*
val url = "jdbc:sqlite:testDatabase.db"
val dbConfig = DbConnectionConfig(url)
val tableName = "Customer"
val df = DataFrame.readSqlTable(dbConfig, tableName)
```