Change Log
[7.042]
- Deps updated
- added fn 'map-column->columns' (#178, https://github.com/scicloj/tablecloth/issues/178)
[7.029]
Added
- reorder-columns can work on grouped dataset now
Fixed
- arrays of 2 element arrays behave as expected on dataset creation (#142)
[7.021]
Deps updated
Documentation changed to be generated by Clay instead of RMarkdown
[7.017]
Fixed
- semi and anti joins fail on table containing missing values, multi columns and duplicated rows
[7.014]
Deps updated to fix j/left-join issue.
[7.012]
Fixed
- join columns should consider nil as missing value only (discussion)
- :nil-missing? in more places needed (group-by operations) (discussion)
- changes to the group-by documentation PR115, thanks to Marshall
- reflection warning for Collections/shuffle removed
[7.007]
Added
- Extended documentation for dataset (copied from TMD), #112
Changed
- rows accepts :nil-missing? (default: true) and copying? (default: false) options.
[7.000-beta-51]
Deps updated
[7.000-beta-50.2]
Added
- :hashing is available for single column joins too
[7.000-beta-50.1]
Added
- :hashing option determines method of creating an index for multicolumn joins (was hash, is identity)
Fixed
- #108 - hashing replaced with packing data into the sequence
[7.000-beta-50]
Deps updated
[7.000-beta-38]
Fixed
- dataset from singleton creation generated from wrong structure
[7.000-beta-27]
Added
- map-rows to map each row and produce new columns
- rows can return sequence of vectors (:as-vecs)
Fixed
- balanced k-fold partitioning as proposed in #92 by @behrica
[7.000-beta-16]
Updated to TMD v7
Differences:
- the order of columns is persisted in more cases
- the order of groups in grouped dataset can be random
[6.103.1]
Added
- doc strings for every function, #87, #88
- aggregate-columns should default to all columns when called without a column selector #91
- create functions for packing / unpacking columns to arrays #82
Changed
- [breaking] when a dataset file does not exist, an exception is thrown #84, #85
[6.103]
Clojure upgraded to 1.11.1
Added
- separate-column infers column names when function is used and target-columns is nil, #78
Changed
- [breaking][minor] separate-column replaces source column with target in every case
[6.102]
Fixed
- replace clojure.core/pmap with dtype-next version (related to #325)
[6.101]
Added
get-entry introduced
[6.094.1]
Fixed
- [#77] anti-join and semi-join bugs when tables contain missing values
[6.094]
Added
- crosstab - cross tabulation
- pivot->longer :coerce-to-number option added
Changed
- [breaking] pivot->wider no longer coerces column names to strings, it's up to user
Fixed
- predicates should behave as in Clojure (discussion)
[6.090]
TMD version bump
Changed
- [breaking] replace-missing up/down strategies clarified. :down is replaced by :downup and :up is replaced by :updown. :down and :up work only in one direction now.
https://github.com/techascent/tech.ml.dataset/issues/305
[6.088.1]
Fixed
- Wrong way of selecting columns for joins (shouldn't be a set). https://clojurians.zulipchat.com/#narrow/stream/151924-data-science/topic/complete.20ala.20R/near/286277344
[6.088]
Added
- data frame term in the title of docs (discussion)
- joins can accept different names for left/right datasets
- cross-join, expand and complete introduced
Changed
- removed setting *warn-on-reflection*
- [breaking] creation of singleton dataset adds an error message as a column by default (discussion)
[6.076]
Version bump
Added
- docstring for unroll and fold-by by @holyjak (#60 and #61)
[6.051]
Fixed
- [#58] - editor friendly api file
[6.031]
Fixed
- #57 - InputStream should be dispatched first (the flow now: tries to create a dataset and, if it fails, packs an object as a singleton)
Changed
- select-rows accepts IFn for row selection.
- [breaking] #54, #56 - pipeline namespace is stripped, all functions are moved to metamorph library. This is a temporary solution before removing this namespace completely. Pipelined versions of functions will be moved to metamorph as well later.
[6.025]
Added
- [#49] added docstring to add-column
Fixed
- [#53] summary prefix ignored for aggregate (when fn[ds] is passed)
[6.023]
Added
- Documented columns / rows functions PR52
- Reference to original to lifted functions metadata for pipelines PR51
Changed
- alias for api functions in reference (was: api, is: tc)
[6.012]
Fixed
- replace-missing on grouped dataset has swapped arguments
[6.006]
Fixed
- update-columns on grouped dataset
[6.002]
Changed
- [#43] Align with TMD for dataset creation from a map of sequences.
- [breaking] creation from tensor is :as-rows now
[6.00-beta-16]
Changed
- [#42] [breaking] add-column default strategy is :strict now.
[6.00-beta-10]
Fixed
- [#41] dataset name not set on tensor path
[6.00-beta-7]
TMD upgrade, no changes in TC
[5.17]
TMD upgrade
Fixed
- [#36] reorder-columns on empty dataset returns nil
[5.11]
Fixed
- aggregate-columns didn't keep column order (#35)
[5.05.1]
Added
- pipeline functions have doc copied from original ones
[5.05]
Added
- split can turn off shuffling now (:shuffle? option)
- split :holdouts - sequence of consecutive holdouts
[5.04]
tech.ml.dataset version bump, this introduces the change of the order of the groups after group-by operation
[5.02]
Added
- split :holdout supports any number of splits (minimum 2) [#28]
- split supports split-names to provide custom names for subdatasets
- concat and concat-copying are working with grouped datasets
Fixed
- kfold split failed on small number of rows (due to partition-all behaviour)
[5.01]
Added
- split->seq to return train/test splits as a sequence of datasets or as a map of sequences for grouped datasets
Changed
- [breaking] tablecloth.pipeline returns a map with dataset under :metamorph/data key (see metamorph)
- [breaking] split returns now a dataset or grouped dataset with two new columns indicating train/test and split id. See split->seq for previous behaviour.
[5.00-beta-29.1]
Added
- without-grouping-> threading macro which allows operations on a grouped dataset treated as a regular one.
Changed
- group-by accepts any java.util.Map for a collection of indexes (use LinkedHashMap to persist an order)
- some tablecloth.api.group-by functions moved to tablecloth.api.utils, no changes to API
[5.00-beta-29]
Changed
- add-or-replace-column(s) replaced by add-column(s) (add-or-replace-column(s) is marked as deprecated) (#16)
Fixed
- mark-as-group wasn't visible in API (#18)
- map-columns didn't propagate new-type for grouped case (#20)
- broken links (#14) in readme
[5.00-beta-28]
Added
- let-dataset - to simulate tibble from R
Fixed
- Adding a column to an empty dataset returned empty dataset
[5.00-beta-27]
Changed
- re-implementation of numerical arrays path dataset creation
[5.00-beta-25]
Added
- rows and columns new result: :as-double-arrays - convert rows to 2d double array
- dataset can be created from numerical arrays (discussion)
Fixed
- column from single value should create valid datatype (#10)
[5.00-beta-21a]
Added
- tablecloth.pipeline for pipeline operations
[5.00-beta-21]
Added
- concat-copying exposed.
- split function for splitting into train-test pairs with :kfold, :bootstrap, :loo and holdout strategies + stratified versions
- replace-missing with new strategy :midpoint
[5.00-beta-5a]
Fixed
- column names should keep order for provided names (#9)
[5.00-beta-5]
t.m.d update
[5.00-beta-3]
t.m.d update
Changed
- contribution guide in readme
[5.00-beta-2]
t.m.d update
Changed
- write-nippy! and read-nippy are deprecated, replaced by write! and dataset
[5.0-SNAPSHOT]
tech.ml.dataset version 5.0-alpha*
Added
- map-columns accepts optional target datatype
- ds/column->dataset functionality introduced in separate-column
- more datatypes included for conversion (:text among others)
Changed
- write-csv! replaced by write! (write-csv! is marked as deprecated)
- info field :size is replaced by :n-elems
- [breaking] separate-column 3-arity version accepts separator instead of target-columns now
Fixed
- do not skip 1-row DS when folding
- do not attempt to fold empty dataset
[4.04]
tech.ml.dataset version 4.04
Added
- tests: dataset
Changed
- version number to match t.m.dataset version
- documentation:
- gfm renderer for markdown
Fixed
- code block language alignment fix in css
[1.0.0-pre-alpha9]
tech.ml.dataset version 4.03
Added
- some operations on grouped dataset can be parallel (parallel? option set to true). These are: aggregate, unique-by, order-by, join-columns, separate-columns, ungroup
Fixed
- #2 - docs typo
- #3 - recover datatypes after ungrouping
Changed
- aggregation uses now in-place ungrouping which is much faster
[1.0.0-pre-alpha8]
tech.ml.dataset version 3.06
Added
- fill-range-replace to inject data to make a continuous sequence in a column
- write-nippy! and read-nippy
[1.0.0-pre-alpha7]
tech.ml.dataset version 2.13
Added
- replace-missing new strategies: :mid and :lerp, working also for dates.
Changed
- [breaking] replace-missing has different contract and default strategy :mid. value argument is the last argument now.
- [breaking] replace-missing :up and :down strategies, when value is nil, fill border missing values with nearest value.
[1.0.0-pre-alpha6]
tech.ml.dataset version 2.06
Added
- asof-join added
[1.0.0-pre-alpha4]
Added
- reshape tests
- pivot->wider accepts :drop-missing? option (default: true)
Changed
- pivot->wider drops missing rows by default
- pivot->wider order of concatenated column names is reversed (first: colnames, last: value), was opposite.
- pivot->longer :splitter accepts string used for splitting column name