[//]: # (title: Extension Properties API)

When working with a [`DataFrame`](DataFrame.md), the most convenient and reliable way to access its columns, whether in operations or when retrieving column values in row expressions, is through *auto-generated extension properties*. They are generated from a [dataframe schema](schemas.md), with the name and type of each property inferred from the name and type of the corresponding column. They also work for all types of hierarchical dataframes.

> The behavior of data schema generation differs between the
> [Compiler Plugin](Compiler-Plugin.md) and [Kotlin Notebook](SetupKotlinNotebook.md).
>
> * In **Kotlin Notebook**, a schema is generated **only after cell execution** for
>   `DataFrame` variables defined within that cell.
> * With the **Compiler Plugin**, a new schema is generated **after every operation**,
>   but support for all operations is still in progress.
>   Retrieving the schema for a `DataFrame` read from a file or URL is **not yet supported** either.
>
> This behavior may change in future releases. See the [example](#example) below that demonstrates these differences.
>
{style="warning"}

## Example

Consider a simple hierarchical dataframe read from a CSV file.

This table consists of two columns: `name`, which is a `String` column, and `info`, which is a [**column group**](DataColumn.md#columngroup) containing two nested [value columns](DataColumn.md#valuecolumn): `age` of type `Int`, and `height` of type `Double`.

| name  | info |        |
|-------|------|--------|
|       | age  | height |
| Alice | 23   | 175.5  |
| Bob   | 27   | 160.2  |

In Kotlin Notebook, read the [`DataFrame`](DataFrame.md) from the CSV file:

```kotlin
val df = DataFrame.readCsv("example.csv")
```

**After cell execution**, the data schema and extensions for this `DataFrame` are generated, so you can use the extensions to access columns and to refer to them in operations inside the [Column Selector DSL](ColumnSelectors.md) and [DataRow API](DataRow.md):

```kotlin
// Get nested column
df.info.age
// Sort by multiple columns
df.sortBy { name and info.height }
// Filter rows using a row condition.
// These extensions express the exact value in the row
// with the corresponding type:
df.filter { name.startsWith("A") && info.age >= 16 }
```

If you change the dataframe's schema by changing a column's [name](rename.md) or [type](convert.md), or by [adding](add.md) a new column, you need to run a cell with a new [`DataFrame`](DataFrame.md) declaration first. For example, rename the `name` column to "firstName":

```kotlin
val dfRenamed = df.rename { name }.into("firstName")
```

After running the cell with the code above, you can use the `firstName` extensions in the following cells:

```kotlin
dfRenamed.firstName
dfRenamed.rename { firstName }.into("name")
dfRenamed.filter { firstName == "Nikita" }
```

See the [](quickstart.md) in Kotlin Notebook with basic Extension Properties API examples.

With the Compiler Plugin, for now, if you read a [`DataFrame`](DataFrame.md) from a file or URL, you need to define its schema manually. You can do it quickly with [`generate..()` methods](DataSchemaGenerationMethods.md).

Define schemas:

```kotlin
@DataSchema
data class PersonInfo(
    val age: Int,
    val height: Double
)

@DataSchema
data class Person(
    val info: PersonInfo,
    val name: String
)
```

Read the [`DataFrame`](DataFrame.md) from the CSV file and specify the schema with [`.convertTo()`](convertTo.md) or [`cast()`](cast.md):

```kotlin
val df = DataFrame.readCsv("example.csv").convertTo<Person>()
```

Extensions for this `DataFrame` will be generated automatically by the plugin, so you can use them to access columns and to refer to them in operations inside the [Column Selector DSL](ColumnSelectors.md) and [DataRow API](DataRow.md).

```kotlin
// Get nested column
df.info.age
// Sort by multiple columns
df.sortBy { name and info.height }
// Filter rows using a row condition.
// These extensions express the exact value in the row
// with the corresponding type:
df.filter { name.startsWith("A") && info.age >= 16 }
```

Moreover, new extensions are generated on the fly after each schema change: changing a column's [name](rename.md) or [type](convert.md), or [adding](add.md) a new column. For example, rename the `name` column to "firstName", and then use the `firstName` extension in the following operations:

```kotlin
// Rename "name" column into "firstName"
df.rename { name }.into("firstName")
    // Can use `firstName` extension in the row condition
    // right after renaming
    .filter { firstName == "Nikita" }
```

See the [Compiler Plugin Example](https://github.com/Kotlin/dataframe/tree/plugin_example/examples/kotlin-dataframe-plugin-gradle-example) IDEA project with basic Extension Properties API examples.

## Properties name generation

By default, each extension property is generated with a name equal to the original column name.

```kotlin
val df = dataFrameOf("size_in_inches" to listOf(..))
df.size_in_inches
```

If the original column name cannot be used as a property name (for example, if it contains spaces or is a Kotlin keyword), it will be enclosed in backticks.

```kotlin
val df = dataFrameOf("size in inches" to listOf(..))
df.`size in inches`
```

However, sometimes the original column name contains special symbols and can't be used as a property name even in backticks. In such cases, the special symbols are replaced in the auto-generated property name.

```kotlin
val df = dataFrameOf("size\nin:inches" to listOf(..))
df.`size in - inches`
```

> In such cases, use [**`rename`**](rename.md) to update column names,
> or [**`renameToCamelCase`**](rename.md#renametocamelcase) to convert all column names
> in a `DataFrame` to `camelCase`, which is the idiomatic and widely preferred naming style in Kotlin.

If you don't want to change the actual column name but need a convenient accessor for the column, you can use the `@ColumnName` annotation in a manually declared [data schema](schemas.md). It lets you use a property name that differs from the original column name without changing the column's actual name:

```kotlin
@DataSchema
interface Info {
    @ColumnName("size\nin:inches")
    val sizeInInches: Double
}
```

```kotlin
val df = dataFrameOf("size\nin:inches" to listOf(..)).cast<Info>()
df.sizeInInches
```
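
To see why a property name can differ from the column name, it may help to picture the mechanism in plain Kotlin. The sketch below is a conceptual illustration only: `Table`, `Column`, `demoTable`, and the marker interface `Info` are hypothetical stand-ins written for this example, not the library's actual types or the code it generates. It declares an extension property on a schema-typed container that looks a column up by its original name.

```kotlin
// Hypothetical stand-ins for illustration; NOT the Kotlin DataFrame API.

// A column is a named list of values.
class Column<T>(val name: String, val values: List<T>)

// A table typed by a marker interface `S` that plays the role of a schema.
class Table<S>(private val columns: Map<String, Column<*>>) {
    // Looks a column up by its original name; the caller supplies the type.
    @Suppress("UNCHECKED_CAST")
    fun <T> column(name: String): Column<T> =
        columns[name] as? Column<T> ?: error("No column named '$name'")
}

// Marker interface standing in for a declared data schema.
interface Info

// Extension property: a convenient Kotlin name mapped to the original column name.
val Table<Info>.sizeInInches: Column<Double>
    get() = column("size\nin:inches")

// Builds a small table whose only column keeps its awkward original name.
fun demoTable(): Table<Info> =
    Table(mapOf("size\nin:inches" to Column("size\nin:inches", listOf(5.5, 6.5))))

fun main() {
    val table = demoTable()
    // Typed access: `values` is a List<Double>, so numeric operations compile.
    println(table.sizeInInches.values.sum())
}
```

Because the property is declared only for `Table<Info>`, it is unavailable on tables typed with a different schema, which mirrors the reason a new schema is needed after a column is renamed or added.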