UFJF - Machine Learning Toolkit 0.51.8
Namespace for artificial dataset generation.
Classes | |
| struct | RegPair |
| struct | BlobsPair |
Typedefs | |
| using | Centers = std::vector< mltk::Point< double > > |
Functions | |
| mltk::Data< double > | make_spirals (size_t n_samples=100, int n_classes=2, bool shuffle=true, double noise=1.0, size_t n_loops=2, double margin=0.5, size_t seed=0) |
| Generates a synthetic dataset composed of interlaced Archimedean spirals. | |
| BlobsPair | make_blobs (size_t n_samples=100, int n_centers=2, int n_dims=2, double cluster_std=1.0, double center_min=-10.0, double center_max=10.0, bool shuffle=true, bool has_classes=true, size_t seed=0) |
| Generate isotropic Gaussian blobs for clustering or classification. | |
| BlobsPair | make_blobs (const std::vector< size_t > &n_samples, const std::vector< mltk::Point< double >> &centers, std::vector< double > clusters_std, int n_dims=2, bool shuffle=true, bool has_classes=true, size_t seed=0) |
| Generate isotropic Gaussian blobs for clustering or classification from given centers and per-blob sample counts. | |
| RegPair | make_regression (size_t n_samples=100, size_t n_dims=100, double bias=0.0, double noise=0.1, double stdev=0.01, size_t n_informative=10, bool shuffle=true, size_t seed=0) |
| Generate a random regression problem. | |
BlobsPair mltk::datasets::make_blobs (const std::vector< size_t > &n_samples, const std::vector< mltk::Point< double >> &centers, std::vector< double > clusters_std, int n_dims=2, bool shuffle=true, bool has_classes=true, size_t seed=0)
Generate isotropic Gaussian blobs for clustering or classification from given centers and per-blob sample counts.
| n_samples | number of samples in each Gaussian blob. |
| centers | centers of the Gaussian blobs. |
| clusters_std | vector of standard deviations, one for each blob. |
| n_dims | dimensionality of the data. |
| shuffle | whether the dataset is shuffled after generation. |
| has_classes | whether the returned data carries a class label for each blob. |
| seed | seed for random data generation. |
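To illustrate what this overload describes, the following standalone C++ sketch draws points around caller-supplied centers. It uses only the standard library and is not the mltk implementation: `LabeledPoint` and `sample_blobs` are hypothetical names, the choice of `std::mt19937` is an assumption, and the `n_dims`, `shuffle`, and `has_classes` options are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical plain-struct sample; not the mltk::Point / mltk::Data types.
struct LabeledPoint {
    std::vector<double> x;  // coordinates
    int label;              // index of the blob the point was drawn from
};

// Draws n_samples[k] points around centers[k]. Every coordinate receives
// independent Gaussian noise with standard deviation clusters_std[k].
std::vector<LabeledPoint> sample_blobs(const std::vector<std::size_t>& n_samples,
                                       const std::vector<std::vector<double>>& centers,
                                       const std::vector<double>& clusters_std,
                                       std::size_t seed = 0) {
    assert(n_samples.size() == centers.size());
    assert(clusters_std.size() == centers.size());

    std::mt19937 gen(static_cast<std::mt19937::result_type>(seed));
    std::vector<LabeledPoint> points;
    for (std::size_t k = 0; k < centers.size(); ++k) {
        // One standard deviation shared by all dimensions keeps the blob isotropic.
        std::normal_distribution<double> noise(0.0, clusters_std[k]);
        for (std::size_t i = 0; i < n_samples[k]; ++i) {
            LabeledPoint p{centers[k], static_cast<int>(k)};
            for (double& coord : p.x) coord += noise(gen);
            points.push_back(p);
        }
    }
    return points;
}
```

Calling `sample_blobs({3, 5}, {{0.0, 0.0}, {10.0, 10.0}}, {0.5, 1.5}, 42)` in this sketch yields eight points, the first three labeled 0 and the rest labeled 1. Reusing the same seed reproduces the same points.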
BlobsPair mltk::datasets::make_blobs (size_t n_samples=100, int n_centers=2, int n_dims=2, double cluster_std=1.0, double center_min=-10.0, double center_max=10.0, bool shuffle=true, bool has_classes=true, size_t seed=0)
Generate isotropic Gaussian blobs for clustering or classification.
| n_samples | number of samples in each Gaussian blob. |
| n_centers | number of classes or Gaussian blobs. |
| n_dims | dimensionality of the data. |
| cluster_std | standard deviation used to generate the blobs. |
| center_min | min value of generated data. |
| center_max | max value of generated data. |
| shuffle | whether the dataset is shuffled after generation. |
| has_classes | whether the returned data carries a class label for each blob. |
| seed | seed for random data generation. |
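When no centers are supplied, they have to be generated first. The sketch below shows one plausible way to do that: each coordinate of each center is drawn uniformly between `center_min` and `center_max`. This is an assumption about how the two bounds are used, since the parameter descriptions above do not say; `random_centers` is a hypothetical helper and not part of mltk.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Assumption: centers are drawn uniformly from the box
// [center_min, center_max) in every dimension.
std::vector<std::vector<double>> random_centers(int n_centers, int n_dims,
                                                double center_min, double center_max,
                                                std::size_t seed = 0) {
    assert(n_centers > 0 && n_dims > 0);
    assert(center_min < center_max);

    std::mt19937 gen(static_cast<std::mt19937::result_type>(seed));
    std::uniform_real_distribution<double> coord(center_min, center_max);

    std::vector<std::vector<double>> centers(
        static_cast<std::size_t>(n_centers),
        std::vector<double>(static_cast<std::size_t>(n_dims)));
    for (auto& center : centers)
        for (double& value : center) value = coord(gen);
    return centers;
}
```

The result can then be perturbed with Gaussian noise of standard deviation `cluster_std` to obtain the blobs themselves.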
RegPair mltk::datasets::make_regression (size_t n_samples=100, size_t n_dims=100, double bias=0.0, double noise=0.1, double stdev=0.01, size_t n_informative=10, bool shuffle=true, size_t seed=0)
Generate a random regression problem.
| n_samples | number of samples in the dataset. |
| n_dims | dimensionality of the data. |
| bias | The bias term in the underlying linear model. |
| noise | The standard deviation of the Gaussian noise applied to the output. |
| stdev | The standard deviation of the Gaussian noise applied to the output. |
| n_informative | The number of informative features, i.e., the number of features used to build the linear model that generates the output. |
| shuffle | Shuffle the samples and the features. |
| seed | seed for random data generation. |
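The roles of `bias`, `noise`, and `n_informative` can be sketched as a linear model in which only the first `n_informative` weights are nonzero. The code below is a standalone illustration, not the mltk implementation: `RegressionProblem` and `sample_regression` are hypothetical names, the standard-normal features and the weight range are arbitrary choices, and `stdev` and `shuffle` are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical container for this sketch; not the library's RegPair.
struct RegressionProblem {
    std::vector<std::vector<double>> X;  // n_samples x n_dims features
    std::vector<double> y;               // targets
    std::vector<double> coef;            // true weights, zero for uninformative features
};

// y_i = bias + sum_j coef[j] * X[i][j] + Gaussian noise with std. dev. `noise`.
RegressionProblem sample_regression(std::size_t n_samples, std::size_t n_dims,
                                    double bias, double noise,
                                    std::size_t n_informative, std::size_t seed = 0) {
    assert(n_informative <= n_dims);
    assert(noise > 0.0);

    std::mt19937 gen(static_cast<std::mt19937::result_type>(seed));
    std::normal_distribution<double> feature(0.0, 1.0);        // assumed feature law
    std::normal_distribution<double> output_noise(0.0, noise);
    std::uniform_real_distribution<double> weight(-1.0, 1.0);  // arbitrary range

    RegressionProblem p;
    p.coef.assign(n_dims, 0.0);
    for (std::size_t j = 0; j < n_informative; ++j) p.coef[j] = weight(gen);

    p.X.assign(n_samples, std::vector<double>(n_dims));
    p.y.assign(n_samples, 0.0);
    for (std::size_t i = 0; i < n_samples; ++i) {
        double target = bias;
        for (std::size_t j = 0; j < n_dims; ++j) {
            p.X[i][j] = feature(gen);
            target += p.coef[j] * p.X[i][j];  // uninformative features add nothing
        }
        p.y[i] = target + output_noise(gen);
    }
    return p;
}
```

With the defaults listed above (`n_dims = 100`, `n_informative = 10`), 90 of the 100 features would have no influence on the target in a model of this kind.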
mltk::Data< double > mltk::datasets::make_spirals (size_t n_samples=100, int n_classes=2, bool shuffle=true, double noise=1.0, size_t n_loops=2, double margin=0.5, size_t seed=0)
Generates a synthetic dataset composed of interlaced Archimedean spirals.
| n_samples | number of samples in the dataset. |
| n_classes | number of classes or spirals. |
| shuffle | whether the dataset is shuffled after generation. |
| noise | value of the noise added to the dataset. |
| n_loops | how many loops each spiral must have. |
| margin | margin between spirals. |
| seed | seed for random data generation. |
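An Archimedean spiral has a radius that grows linearly with the angle, and interlaced arms can be obtained by rotating the same curve once per class. The standalone sketch below illustrates that construction under stated assumptions: `SpiralPoint` and `sample_spirals` are hypothetical names, the radius is taken as r = θ, the noise is assumed Gaussian, the sample count is given per arm, and `margin` and `shuffle` are left out. It is not the mltk implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical 2-D sample for this sketch; not an mltk type.
struct SpiralPoint {
    double x, y;
    int label;  // index of the spiral arm
};

std::vector<SpiralPoint> sample_spirals(std::size_t n_per_class, int n_classes,
                                        double noise, std::size_t n_loops,
                                        std::size_t seed = 0) {
    assert(n_classes > 0 && n_per_class > 0);
    assert(noise > 0.0);

    const double two_pi = 2.0 * std::acos(-1.0);
    std::mt19937 gen(static_cast<std::mt19937::result_type>(seed));
    std::normal_distribution<double> jitter(0.0, noise);

    std::vector<SpiralPoint> points;
    for (int k = 0; k < n_classes; ++k) {
        // Arm k is the base spiral rotated by k / n_classes of a full turn.
        const double offset = two_pi * k / n_classes;
        for (std::size_t i = 0; i < n_per_class; ++i) {
            // theta sweeps n_loops full turns; r = theta gives an Archimedean spiral.
            const double theta = two_pi * n_loops * (i + 1) / n_per_class;
            const double r = theta;
            points.push_back({r * std::cos(theta + offset) + jitter(gen),
                              r * std::sin(theta + offset) + jitter(gen), k});
        }
    }
    return points;
}
```

In this sketch the outermost point of every arm lies at a distance of about 2π · `n_loops` from the origin before noise is added, so more loops produce a wider dataset.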