Apache Arrow
Kotlin DataFrame supports reading from and writing to Apache Arrow files, an efficient binary format suited to fast I/O. Both the streaming and the random access formats are supported.
Requires the dataframe-arrow module, which is included by
default in the general dataframe artifact
and in %use dataframe for Kotlin Notebook.
Make sure to follow the Apache Arrow Java compatibility guide when using Java 9+. {style="warning"}
Structured (nested) Arrow types such as Struct are not supported yet in Kotlin DataFrame. See the issue: Add inner / Struct type support in Arrow {style="warning"}
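As an illustration of the Java 9+ note above, here is a minimal sketch of how the JVM flag described in the Apache Arrow Java documentation might be passed from a Gradle Kotlin DSL build. It assumes a Gradle project with a build.gradle.kts file, which this page does not state. Check the exact flag value against the compatibility guide for your Arrow version.

```kotlin
// build.gradle.kts (assumed build setup, not taken from this page)

// Arrow's memory module needs reflective access to java.nio on Java 9+.
// The flag value follows the Apache Arrow Java installation notes.
val arrowJvmArgs = listOf(
    "--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED",
)

// Apply the flag to tests...
tasks.withType<Test>().configureEach {
    jvmArgs(arrowJvmArgs)
}

// ...and to JavaExec tasks such as `run` from the application plugin.
tasks.withType<JavaExec>().configureEach {
    jvmArgs(arrowJvmArgs)
}
```

When Arrow is on the classpath rather than the module path, the JVM may print a warning about the unknown module name; the Arrow documentation describes this as expected.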
Read
DataFrame supports both the
Arrow interprocess streaming format
and the Arrow random access format.
You can read a DataFrame from Apache Arrow data sources
(via a file path, URL, or stream) using readArrowFeather() for the random access format or readArrowIPC() for the streaming format:
// Read from a local file path
val df = DataFrame.readArrowFeather("example.feather")
// Read from a URL
val dfFromUrl = DataFrame.readArrowFeather("https://kotlin.github.io/dataframe/resources/example.feather")
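To show the two formats side by side, the following sketch reads each one from an in-memory ByteArray. It assumes the dataframe-arrow module is on the classpath. The sample data and the helper names sampleFrame, readStreaming, and readRandomAccess are hypothetical, and the bytes are produced with the library's save functions only so the example needs no external file.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.io.readArrowFeather
import org.jetbrains.kotlinx.dataframe.io.readArrowIPC
import org.jetbrains.kotlinx.dataframe.io.saveArrowFeatherToByteArray
import org.jetbrains.kotlinx.dataframe.io.saveArrowIPCToByteArray

// Hypothetical sample data, used only to produce Arrow bytes to read back.
fun sampleFrame(): AnyFrame = dataFrameOf("name", "age")(
    "Alice", 15,
    "Bob", 20,
)

// Streaming format: serialize to bytes, then parse them with readArrowIPC.
fun readStreaming(source: AnyFrame): AnyFrame =
    DataFrame.readArrowIPC(source.saveArrowIPCToByteArray())

// Random access format: serialize to bytes, then parse them with readArrowFeather.
fun readRandomAccess(source: AnyFrame): AnyFrame =
    DataFrame.readArrowFeather(source.saveArrowFeatherToByteArray())

fun main() {
    val original = sampleFrame()
    println(readStreaming(original))
    println(readRandomAccess(original))
}
```

The same read functions also accept a File or an InputStream, so the choice of function depends on the format of the data, not on where it comes from.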
Write
A DataFrame can be written to Arrow format using the interprocess streaming or random access format.
Output targets include WritableByteChannel, OutputStream, File, or ByteArray.
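The output targets listed above can be sketched as follows. This is a minimal example assuming the dataframe-arrow module is available. The WrittenSizes type, the writeToTargets helper, and the sample data are hypothetical names introduced for illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.io.saveArrowFeatherToByteArray
import org.jetbrains.kotlinx.dataframe.io.writeArrowFeather
import org.jetbrains.kotlinx.dataframe.io.writeArrowIPC
import java.io.ByteArrayOutputStream
import java.io.File

// Sizes in bytes of what each write produced (hypothetical helper type).
data class WrittenSizes(val featherFile: Long, val ipcStream: Int, val featherBytes: Int)

fun writeToTargets(df: AnyFrame, featherFile: File): WrittenSizes {
    // Random access format written to a File.
    df.writeArrowFeather(featherFile)

    // Streaming format written to an OutputStream.
    val stream = ByteArrayOutputStream()
    df.writeArrowIPC(stream)

    // Random access format serialized directly to a ByteArray.
    val bytes = df.saveArrowFeatherToByteArray()

    return WrittenSizes(featherFile.length(), stream.size(), bytes.size)
}

fun main() {
    val df = dataFrameOf("name", "age")(
        "Alice", 15,
        "Bob", 20,
    )
    val file = File.createTempFile("people", ".feather")
    println(writeToTargets(df, file))
}
```

The streaming format suits data that is consumed sequentially, for example over a socket, while the random access format suits files that readers may seek within.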