<!DOCTYPE html PUBLIC ""
"">
<html><head><meta charset="UTF-8" /><title>tech.v3.libs.tribuo documentation</title><script async="true" src="https://www.googletagmanager.com/gtag/js?id=G-RGTB4J7LGP"></script><script>window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());

gtag('config', 'G-95TVFC1FEB');</script><link rel="stylesheet" type="text/css" href="css/default.css" /><link rel="stylesheet" type="text/css" href="highlight/solarized-light.css" /><script type="text/javascript" src="highlight/highlight.min.js"></script><script type="text/javascript" src="js/jquery.min.js"></script><script type="text/javascript" src="js/page_effects.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body><div id="header"><h2>Generated by <a href="https://github.com/weavejester/codox">Codox</a> with <a href="https://github.com/xsc/codox-theme-rdash">RDash UI</a> theme</h2><h1><a href="index.html"><span class="project-title"><span class="project-name">TMD</span> <span class="project-version">8.003</span></span></a></h1></div><div class="sidebar primary"><h3 class="no-link"><span class="inner">Project</span></h3><ul class="index-link"><li class="depth-1 "><a href="index.html"><div class="inner">Index</div></a></li></ul><h3 class="no-link"><span class="inner">Topics</span></h3><ul><li class="depth-1 "><a href="000-getting-started.html"><div class="inner"><span>tech.ml.dataset Getting Started</span></div></a></li><li class="depth-1 "><a href="100-walkthrough.html"><div class="inner"><span>tech.ml.dataset Walkthrough</span></div></a></li><li class="depth-1 "><a href="200-quick-reference.html"><div class="inner"><span>tech.ml.dataset Quick Reference</span></div></a></li><li class="depth-1 "><a href="columns-readers-and-datatypes.html"><div class="inner"><span>tech.ml.dataset Columns, Readers, and Datatypes</span></div></a></li><li class="depth-1 "><a href="nippy-serialization-rocks.html"><div class="inner"><span>tech.ml.dataset And nippy</span></div></a></li><li class="depth-1 "><a href="supported-datatypes.html"><div class="inner"><span>tech.ml.dataset Supported Datatypes</span></div></a></li></ul><h3 class="no-link"><span class="inner">Namespaces</span></h3><ul><li class="depth-1"><div class="no-link"><div class="inner"><span 
class="tree"><span class="top"></span><span class="bottom"></span></span><span>tech</span></div></div></li><li class="depth-2"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>v3</span></div></div></li><li class="depth-3"><a href="tech.v3.dataset.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>dataset</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.categorical.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>categorical</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.clipboard.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>clipboard</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.column.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>column</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.column-filters.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>column-filters</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>io</span></div></div></li><li class="depth-5 branch"><a href="tech.v3.dataset.io.csv.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>csv</span></div></a></li><li class="depth-5 branch"><a href="tech.v3.dataset.io.datetime.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>datetime</span></div></a></li><li class="depth-5 branch"><a href="tech.v3.dataset.io.string-row-parser.html"><div class="inner"><span class="tree"><span class="top"></span><span 
class="bottom"></span></span><span>string-row-parser</span></div></a></li><li class="depth-5"><a href="tech.v3.dataset.io.univocity.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>univocity</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.join.html"><div class="inner"><span class="tree" style="top: -145px;"><span class="top" style="height: 154px;"></span><span class="bottom"></span></span><span>join</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.math.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>math</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.metamorph.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>metamorph</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.modelling.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>modelling</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.print.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>print</span></div></a></li><li class="depth-4"><a href="tech.v3.dataset.reductions.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>reductions</span></div></a></li><li class="depth-5"><a href="tech.v3.dataset.reductions.apache-data-sketch.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>apache-data-sketch</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.rolling.html"><div class="inner"><span class="tree" style="top: -52px;"><span class="top" style="height: 61px;"></span><span class="bottom"></span></span><span>rolling</span></div></a></li><li class="depth-4 branch"><a 
href="tech.v3.dataset.set.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>set</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.dataset.tensor.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>tensor</span></div></a></li><li class="depth-4"><a href="tech.v3.dataset.zip.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>zip</span></div></a></li><li class="depth-3"><div class="no-link"><div class="inner"><span class="tree" style="top: -641px;"><span class="top" style="height: 650px;"></span><span class="bottom"></span></span><span>libs</span></div></div></li><li class="depth-4 branch"><a href="tech.v3.libs.arrow.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>arrow</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.libs.clj-transit.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>clj-transit</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.libs.fastexcel.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>fastexcel</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>guava</span></div></div></li><li class="depth-5"><a href="tech.v3.libs.guava.cache.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>cache</span></div></a></li><li class="depth-4 branch"><a href="tech.v3.libs.parquet.html"><div class="inner"><span class="tree" style="top: -52px;"><span class="top" style="height: 61px;"></span><span class="bottom"></span></span><span>parquet</span></div></a></li><li class="depth-4 branch"><a 
href="tech.v3.libs.poi.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>poi</span></div></a></li><li class="depth-4 current"><a href="tech.v3.libs.tribuo.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>tribuo</span></div></a></li></ul></div><div class="sidebar secondary"><h3><a href="#top"><span class="inner">Public Vars</span></a></h3><ul><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-classification-predictions-.3Edataset"><div class="inner"><span>classification-predictions->dataset</span></div></a></li><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-evaluate-regression"><div class="inner"><span>evaluate-regression</span></div></a></li><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-make-classification-datasource"><div class="inner"><span>make-classification-datasource</span></div></a></li><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-make-regression-datasource"><div class="inner"><span>make-regression-datasource</span></div></a></li><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-predict-classification"><div class="inner"><span>predict-classification</span></div></a></li><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-predict-regression"><div class="inner"><span>predict-regression</span></div></a></li><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-train-classification"><div class="inner"><span>train-classification</span></div></a></li><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-train-regression"><div class="inner"><span>train-regression</span></div></a></li><li class="depth-1"><a href="tech.v3.libs.tribuo.html#var-trainer"><div class="inner"><span>trainer</span></div></a></li></ul></div><div class="namespace-docs" id="content"><h1 class="anchor" id="top">tech.v3.libs.tribuo</h1><div class="doc"><div class="markdown"><p>Bindings to make working with tribuo 
more straightforward when using datasets.</p>
<pre><code class="language-clojure">;; Classification

tech.v3.dataset.tribuo-test> (def ds (classification-example-ds 10000))
#'tech.v3.dataset.tribuo-test/ds
tech.v3.dataset.tribuo-test> (def model (tribuo/train-classification (org.tribuo.classification.xgboost.XGBoostClassificationTrainer. 6) ds :label))
#'tech.v3.dataset.tribuo-test/model
tech.v3.dataset.tribuo-test> (ds/head (tribuo/predict-classification model (ds/remove-columns ds [:label])))
_unnamed [5 3]:

| :prediction |        red |      green |
|-------------|-----------:|-----------:|
| red         | 0.92524981 | 0.07475022 |
| green       | 0.07464883 | 0.92535114 |
| green       | 0.07464883 | 0.92535114 |
| red         | 0.92525917 | 0.07474083 |
| green       | 0.07464883 | 0.92535114 |


;; Regression
tech.v3.dataset.tribuo-test> (def ds (ds/->dataset "test/data/winequality-red.csv" {:separator \;}))
#'tech.v3.dataset.tribuo-test/ds
tech.v3.dataset.tribuo-test> (def model (tribuo/train-regression (org.tribuo.regression.xgboost.XGBoostRegressionTrainer. 50) ds "quality"))
#'tech.v3.dataset.tribuo-test/model
tech.v3.dataset.tribuo-test> (ds/head (tribuo/predict-regression model (ds/remove-columns ds ["quality"])))
_unnamed [5 1]:

| :prediction |
|------------:|
|  5.01974726 |
|  5.02164841 |
|  5.22696543 |
|  5.79519272 |
|  5.01974726 |
</code></pre>
</div></div><div class="public anchor" id="var-classification-predictions-.3Edataset"><h3>classification-predictions->dataset</h3><div class="usage"><code>(classification-predictions->dataset predictions)</code></div><div class="doc"><div class="markdown"><p>Given the list of predictions from a classification model return a dataset
that will include probability distributions when possible. The actual prediction
will be in the <code>:prediction</code> column.</p>
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L239">view source</a></div></div><div class="public anchor" id="var-evaluate-regression"><h3>evaluate-regression</h3><div class="usage"><code>(evaluate-regression model ds inf-col-name)</code></div><div class="doc"><div class="markdown"><p>Evaluate a regression model against this dataset. Returns a map of
<code>{:rmse :mae :r2}</code></p>
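<p>A minimal sketch of how this might be used, reusing the wine-quality file from the namespace example above. It assumes the XGBoost trainer is on the classpath and that <code>tech.v3.dataset.modelling/train-test-split</code> returns a map with <code>:train-ds</code> and <code>:test-ds</code> keys; the var names are illustrative only.</p>
<pre><code class="language-clojure">(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.modelling :as ds-mod]
         '[tech.v3.libs.tribuo :as tribuo])
(import '[org.tribuo.regression.xgboost XGBoostRegressionTrainer])

;; Same file as the namespace example; the CSV headers are strings.
(def wine (ds/->dataset "test/data/winequality-red.csv" {:separator \;}))

;; Assumption: the split is a map with :train-ds and :test-ds.
(def split (ds-mod/train-test-split wine))

;; Train on the training portion only.
(def model (tribuo/train-regression (XGBoostRegressionTrainer. 50)
                                    (:train-ds split)
                                    "quality"))

;; Score on rows the model was not trained on.
(def metrics (tribuo/evaluate-regression model (:test-ds split) "quality"))

;; Pull out the three documented keys.
(select-keys metrics [:rmse :mae :r2])
</code></pre>
<p>Evaluating on a held-out split rather than on the training rows keeps the reported error from being overly optimistic.</p>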
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L302">view source</a></div></div><div class="public anchor" id="var-make-classification-datasource"><h3>make-classification-datasource</h3><div class="usage"><code>(make-classification-datasource ds)</code><code>(make-classification-datasource ds inf-col-name)</code></div><div class="doc"><div class="markdown"><p>Make a single label classification datasource.</p>
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L221">view source</a></div></div><div class="public anchor" id="var-make-regression-datasource"><h3>make-regression-datasource</h3><div class="usage"><code>(make-regression-datasource ds)</code><code>(make-regression-datasource ds inf-col-name)</code></div><div class="doc"><div class="markdown"><p>Make a regression datasource from a dataset.</p>
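<p>The datasource is mainly useful when calling Tribuo directly. The sketch below assumes the returned value implements <code>org.tribuo.DataSource</code>, so that it can be passed to Tribuo's <code>MutableDataset</code> constructor; that is an assumption about the return type, not something this page states.</p>
<pre><code class="language-clojure">(require '[tech.v3.dataset :as ds]
         '[tech.v3.libs.tribuo :as tribuo])
(import '[org.tribuo MutableDataset])

(def wine (ds/->dataset "test/data/winequality-red.csv" {:separator \;}))

;; Name the inference column explicitly ("quality" is a string header here).
(def source (tribuo/make-regression-datasource wine "quality"))

;; Assumption: `source` is an org.tribuo.DataSource.
(def tribuo-ds (MutableDataset. source))

;; Number of examples Tribuo sees, for comparison with (ds/row-count wine).
(.size tribuo-ds)
</code></pre>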
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L274">view source</a></div></div><div class="public anchor" id="var-predict-classification"><h3>predict-classification</h3><div class="usage"><code>(predict-classification model ds)</code></div><div class="doc"><div class="markdown"><p>Use this model to predict every row of this dataset returning a new dataset containing
at least a <code>:prediction</code> column. If this classifier is capable of predicting probability
distributions, those will be returned as separate per-label columns.</p>
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L263">view source</a></div></div><div class="public anchor" id="var-predict-regression"><h3>predict-regression</h3><div class="usage"><code>(predict-regression model ds)</code></div><div class="doc"><div class="markdown"><p>Use a regression model to predict each row of the dataset, returning a dataset with
at least one column named <code>:prediction</code>.</p>
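<p>Because the result is an ordinary dataset, predictions can be placed next to the known values for inspection. This is a sketch under the same assumptions as the namespace example (wine-quality CSV, XGBoost trainer on the classpath); the var names are illustrative.</p>
<pre><code class="language-clojure">(require '[tech.v3.dataset :as ds]
         '[tech.v3.libs.tribuo :as tribuo])
(import '[org.tribuo.regression.xgboost XGBoostRegressionTrainer])

(def wine (ds/->dataset "test/data/winequality-red.csv" {:separator \;}))
(def model (tribuo/train-regression (XGBoostRegressionTrainer. 50) wine "quality"))

;; Predict from the feature columns only.
(def preds (tribuo/predict-regression model (ds/remove-columns wine ["quality"])))

;; Datasets behave like maps of column name to column, so the
;; :prediction column can be associated onto the actual values.
(def compared (assoc (ds/select-columns wine ["quality"])
                     :prediction (preds :prediction)))

(ds/head compared)
</code></pre>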
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L292">view source</a></div></div><div class="public anchor" id="var-train-classification"><h3>train-classification</h3><div class="usage"><code>(train-classification trainer ds & [inf-col-name])</code></div><div class="doc"><div class="markdown"><p>Train a single label classification model. Returns the model.</p>
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L232">view source</a></div></div><div class="public anchor" id="var-train-regression"><h3>train-regression</h3><div class="usage"><code>(train-regression trainer ds & [inf-col-name])</code></div><div class="doc"><div class="markdown"><p>Train a regression model on a dataset returning the model.</p>
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L285">view source</a></div></div><div class="public anchor" id="var-trainer"><h3>trainer</h3><div class="usage"><code>(trainer config-components trainer-name)</code></div><div class="doc"><div class="markdown"><p>Creates a tribuo trainer from a list of config components
following the OLCUT convention. One of the components should be a trainer,
which is then looked up by <code>trainer-name</code> and returned.</p>
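<p>As a rough illustration, a single-component configuration might look like the sketch below. The shape of a component (a map with <code>:name</code>, <code>:type</code> and <code>:properties</code>, with property values given as strings) is an assumption not confirmed by this page, and the CART trainer requires Tribuo's decision-tree module on the classpath.</p>
<pre><code class="language-clojure">(require '[tech.v3.libs.tribuo :as tribuo])

;; Assumed component shape: :name is the OLCUT component name,
;; :type the trainer class, :properties its configurable fields.
(def cart-trainer
  (tribuo/trainer [{:name "trainer"
                    :type "org.tribuo.classification.dtree.CARTClassificationTrainer"
                    :properties {:maxDepth "6"}}]
                  ;; Must match the :name of the trainer component above.
                  "trainer"))
</code></pre>
<p>The returned trainer can then be passed to <code>train-classification</code> in place of a directly constructed trainer object.</p>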
</div></div><div class="src-link"><a href="https://github.com/techascent/tech.ml.dataset/blob/master/src/tech/v3/libs/tribuo.clj#L316">view source</a></div></div></div></body></html>