Title: | R Ultimate Multilabel Dataset Repository |
---|---|
Description: | Large collection of multilabel datasets along with the functions needed to export them to several formats, to make partitions, and to obtain bibliographic information. |
Authors: | David Charte [cre] |
Maintainer: | David Charte <[email protected]> |
License: | LGPL (>= 3) | file LICENSE |
Version: | 0.4.2 |
Built: | 2025-02-12 04:10:48 UTC |
Source: | https://github.com/fcharte/mldr.datasets |
available.mldrs retrieves the most up-to-date list of additional datasets. Those datasets are not
included in the package, but can be downloaded and saved locally.
available.mldrs()
A data.frame with the available multilabel datasets
## Not run:
library(mldr.datasets)
names <- available.mldrs()$Name
## End(Not run)
Multilabel dataset from the text domain.
bibtex(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 7395 instances, 1836 attributes and 159 labels
Katakis, I. and Tsoumakas, G. and Vlahavas, I., "Multilabel Text Classification for Automated Tag Suggestion", in Proc. ECML PKDD08 Discovery Challenge, Antwerp, Belgium, pp. 75-83, 2008
## Not run:
bibtex <- bibtex()  # Check and load the dataset
toBibtex(bibtex)
bibtex$measures
## End(Not run)
Multilabel dataset from the sound domain.
birds
An mldr object with 645 instances, 260 attributes and 19 labels
Briggs, F. and Lakshminarayanan, B. and Neal, L. and Fern, X. Z. and Raich, R. and Hadley, S. J. K. and Hadley, A. S. and Betts, M. G., "Acoustic classification of multiple simultaneous bird species: A multi-instance multi-label approach", The Journal of the Acoustical Society of America, (6)131, pp. 4640–4650, 2012
## Not run:
toBibtex(birds)
birds$measures
## End(Not run)
Multilabel dataset from the text domain.
bookmarks(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 87856 instances, 2150 attributes and 208 labels
Katakis, I. and Tsoumakas, G. and Vlahavas, I., "Multilabel Text Classification for Automated Tag Suggestion", in Proc. ECML PKDD08 Discovery Challenge, Antwerp, Belgium, pp. 75-83, 2008
## Not run:
bookmarks <- bookmarks()  # Check and load the dataset
toBibtex(bookmarks)
bookmarks$measures
## End(Not run)
Multilabel dataset from the music domain.
cal500
An mldr object with 502 instances, 68 attributes and 174 labels
Turnbull, Douglas and Barrington, Luke and Torres, David and Lanckriet, Gert, "Semantic annotation and retrieval of music and sound effects", Audio, Speech, and Language Processing, IEEE Transactions on, (2)16, pp. 467-476, 2008
## Not run:
toBibtex(cal500)
cal500$measures
## End(Not run)
This function checks if the mldr object whose name is given as input is locally available, loading it in memory. If necessary, the dataset will be downloaded from the GitHub repository and saved locally.
check_n_load.mldr(mldr.name)
mldr.name |
Name of the dataset to load |
## Not run:
library(mldr.datasets)
check_n_load.mldr("bibtex")
bibtex$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k001(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13766 instances, 500 attributes and 153 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k001 <- corel16k001()  # Check and load the dataset
toBibtex(corel16k001)
corel16k001$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k002(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13761 instances, 500 attributes and 164 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k002 <- corel16k002()  # Check and load the dataset
toBibtex(corel16k002)
corel16k002$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k003(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13760 instances, 500 attributes and 154 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k003 <- corel16k003()  # Check and load the dataset
toBibtex(corel16k003)
corel16k003$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k004(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13837 instances, 500 attributes and 162 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k004 <- corel16k004()  # Check and load the dataset
toBibtex(corel16k004)
corel16k004$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k005(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13847 instances, 500 attributes and 160 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k005 <- corel16k005()  # Check and load the dataset
toBibtex(corel16k005)
corel16k005$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k006(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13859 instances, 500 attributes and 162 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k006 <- corel16k006()  # Check and load the dataset
toBibtex(corel16k006)
corel16k006$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k007(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13915 instances, 500 attributes and 174 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k007 <- corel16k007()  # Check and load the dataset
toBibtex(corel16k007)
corel16k007$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k008(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13864 instances, 500 attributes and 168 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k008 <- corel16k008()  # Check and load the dataset
toBibtex(corel16k008)
corel16k008$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k009(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13884 instances, 500 attributes and 173 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k009 <- corel16k009()  # Check and load the dataset
toBibtex(corel16k009)
corel16k009$measures
## End(Not run)
Multilabel dataset from the image domain.
corel16k010(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13618 instances, 500 attributes and 144 labels
Barnard, K. and Duygulu, P. and Forsyth, D. and de Freitas, N. and Blei, D. M. and Jordan, M. I., "Matching words and pictures", Journal of Machine Learning Research, Vol. 3, pp. 1107–1135, 2003
## Not run:
corel16k010 <- corel16k010()  # Check and load the dataset
toBibtex(corel16k010)
corel16k010$measures
## End(Not run)
Multilabel dataset from the image domain.
corel5k(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 5000 instances, 499 attributes and 374 labels
Duygulu, P. and Barnard, K. and de Freitas, J.F.G. and Forsyth, D.A., "Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary", Computer Vision, ECCV 2002, pp. 97-112, 2002
## Not run:
corel5k <- corel5k()  # Check and load the dataset
toBibtex(corel5k)
corel5k$measures
## End(Not run)
Multilabel dataset from the text domain.
delicious(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 16105 instances, 500 attributes and 983 labels
Tsoumakas, G. and Katakis, I. and Vlahavas, I., "Effective and Efficient Multilabel Classification in Domains with Large Number of Labels", in Proc. ECML/PKDD Workshop on Mining Multidimensional Data, Antwerp, Belgium, MMD08, pp. 30–44, 2008
## Not run:
delicious <- delicious()  # Check and load the dataset
toBibtex(delicious)
delicious$measures
## End(Not run)
This function calculates the ratio of nonzero-valued elements to the total number of elements.
density(mld)
mld |
An |
library(mldr.datasets)
density(emotions)
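As a rough illustration of the ratio this function is described as computing, the base R sketch below counts the nonzero cells of a small matrix. The helper name `nonzero_ratio` and the toy matrix are assumptions made for this example; the sketch does not reproduce how the package's `density()` reads an mldr object.

```r
# Hypothetical helper: share of nonzero cells in a numeric matrix or data frame.
# It follows the textual description only, not the package's source code.
nonzero_ratio <- function(x) {
  m <- as.matrix(x)
  sum(m != 0) / length(m)
}

# Toy data: 3 rows x 4 columns with 3 nonzero cells
toy <- matrix(c(0, 1, 0, 0,
                2, 0, 0, 0,
                0, 0, 0, 3), nrow = 3, byrow = TRUE)

nonzero_ratio(toy)  # 3 nonzero cells out of 12
```

A low ratio is typical of sparse data, such as bag-of-words text features.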
Multilabel dataset from the music domain.
emotions
An mldr object with 593 instances, 72 attributes and 6 labels
Wieczorkowska, A. and Synak, P. and Raś, Z., "Multi-Label Classification of Emotions in Music", Intelligent Information Processing and Web Mining, Vol. 35, Chap. 30, pp. 307-315, 2006
## Not run:
toBibtex(emotions)
emotions$measures
## End(Not run)
Multilabel dataset from the text domain.
enron(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 1702 instances, 1001 attributes and 53 labels
Klimt, B. and Yang, Y., "The Enron Corpus: A New Dataset for Email Classification Research", in Proc. ECML04, Pisa, Italy, pp. 217-226, 2004
## Not run:
enron <- enron()  # Check and load the dataset
toBibtex(enron)
enron$measures
## End(Not run)
Multilabel dataset from the text domain.
eurlexdc_test(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 1935 instances, 5000 attributes and 412 labels
Mencia, E. L. and Furnkranz, J., "Efficient pairwise multilabel classification for large-scale problems in the legal domain", Machine Learning and Knowledge Discovery in Databases, pp. 50–65, 2008
## Not run:
eurlexdc_test <- eurlexdc_test()  # Check and load the dataset
toBibtex(eurlexdc_test[[1]])
eurlexdc_test[[1]]$measures
## End(Not run)
Multilabel dataset from the text domain.
eurlexdc_tra(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 17413 instances, 5000 attributes and 412 labels
Mencia, E. L. and Furnkranz, J., "Efficient pairwise multilabel classification for large-scale problems in the legal domain", Machine Learning and Knowledge Discovery in Databases, pp. 50–65, 2008
## Not run:
eurlexdc_tra <- eurlexdc_tra()  # Check and load the dataset
toBibtex(eurlexdc_tra[[1]])
eurlexdc_tra[[1]]$measures
## End(Not run)
Multilabel dataset from the text domain.
eurlexev_test(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 1935 instances, 5000 attributes and 3993 labels
Mencia, E. L. and Furnkranz, J., "Efficient pairwise multilabel classification for large-scale problems in the legal domain", Machine Learning and Knowledge Discovery in Databases, pp. 50–65, 2008
## Not run:
eurlexev_test <- eurlexev_test()  # Check and load the dataset
toBibtex(eurlexev_test[[1]])
eurlexev_test[[1]]$measures
## End(Not run)
Multilabel dataset from the text domain.
eurlexev_tra(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 17413 instances, 5000 attributes and 3993 labels
Mencia, E. L. and Furnkranz, J., "Efficient pairwise multilabel classification for large-scale problems in the legal domain", Machine Learning and Knowledge Discovery in Databases, pp. 50–65, 2008
## Not run:
eurlexev_tra <- eurlexev_tra()  # Check and load the dataset
toBibtex(eurlexev_tra[[1]])
eurlexev_tra[[1]]$measures
## End(Not run)
Multilabel dataset from the text domain.
eurlexsm_test(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 1935 instances, 5000 attributes and 201 labels
Mencia, E. L. and Furnkranz, J., "Efficient pairwise multilabel classification for large-scale problems in the legal domain", Machine Learning and Knowledge Discovery in Databases, pp. 50–65, 2008
## Not run:
eurlexsm_test <- eurlexsm_test()  # Check and load the dataset
toBibtex(eurlexsm_test[[1]])
eurlexsm_test[[1]]$measures
## End(Not run)
Multilabel dataset from the text domain.
eurlexsm_tra(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 17413 instances, 5000 attributes and 201 labels
Mencia, E. L. and Furnkranz, J., "Efficient pairwise multilabel classification for large-scale problems in the legal domain", Machine Learning and Knowledge Discovery in Databases, pp. 50–65, 2008
## Not run:
eurlexsm_tra <- eurlexsm_tra()  # Check and load the dataset
toBibtex(eurlexsm_tra[[1]])
eurlexsm_tra[[1]]$measures
## End(Not run)
Multilabel dataset from the image domain.
flags
An mldr object with 194 instances, 19 attributes and 7 labels
Goncalves, E. C. and Plastino, A. and Freitas, A. A., "A genetic algorithm for optimizing the label ordering in multi-label classifier chains", Tools with Artificial Intelligence (ICTAI), 2013 IEEE 25th International Conference on, pp. 469-476, 2013
## Not run:
toBibtex(flags)
flags$measures
## End(Not run)
Multilabel dataset from the biology domain.
genbase
An mldr object with 662 instances, 1186 attributes and 27 labels
Diplaris, S. and Tsoumakas, G. and Mitkas, P. and Vlahavas, I., "Protein Classification with Multiple Algorithms", in Proc. 10th Panhellenic Conference on Informatics, Volos, Greece, PCI05, pp. 448–456, 2005
## Not run:
toBibtex(genbase)
genbase$measures
## End(Not run)
get.mldr obtains a multilabel dataset, either by finding it inside the package data or in the download directory, or by downloading it.
get.mldr(name, download.dir = if (is.null(getOption("mldr.download.dir"))) tempdir() else getOption("mldr.download.dir"))
name |
Name of the dataset to load |
download.dir |
The path to the download directory, can be also set through |
## Not run:
library(mldr.datasets)
# customize the download directory
options(mldr.download.dir = "./datasets")
# retrieve the bibtex dataset, as an mldr object, into a variable
bibtex <- get.mldr("bibtex")
bibtex$measures
## End(Not run)
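The default value of `download.dir` in the usage line can be read as a small lookup: use the `mldr.download.dir` option when it is set, otherwise fall back to `tempdir()`. The base R sketch below mirrors that expression without calling the package. The helper name `resolve_download_dir` is hypothetical and exists only for this illustration.

```r
# Hypothetical helper that mirrors the default expression of `download.dir`.
resolve_download_dir <- function() {
  dir <- getOption("mldr.download.dir")
  if (is.null(dir)) tempdir() else dir
}

# Option unset: the session's temporary directory is used
old <- options(mldr.download.dir = NULL)
unset_dir <- resolve_download_dir()

# Option set: the configured path takes precedence
options(mldr.download.dir = "./datasets")
set_dir <- resolve_download_dir()

# Restore whatever was configured before this example
options(old)
```

Because `tempdir()` is cleared when the R session ends, setting the option is presumably the way to keep downloaded datasets between sessions.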
Multilabel dataset from the text domain.
imdb(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 120919 instances, 1001 attributes and 28 labels
Read, J. and Pfahringer, B. and Holmes, G. and Frank, E., "Classifier chains for multi-label classification", Machine Learning, (3)85, pp. 333-359, 2011
## Not run:
imdb <- imdb()  # Check and load the dataset
toBibtex(imdb)
imdb$measures
## End(Not run)
Iterative stratification
Implemented from the algorithm explained in: Konstantinos Sechidis, Grigorios Tsoumakas, and Ioannis Vlahavas. 2011. On the stratification of multi-label data. In Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III (ECML PKDD'11), Dimitrios Gunopulos, Thomas Hofmann, Donato Malerba, and Michalis Vazirgiannis (Eds.), Vol. Part III. Springer-Verlag, Berlin, Heidelberg, 145-158.
iterative.stratification.holdout(mld, p = 60, seed = 10, get.indices = FALSE)
mld |
The |
p |
The percentage of instances to be selected for the training partition |
seed |
The seed to initialize the random number generator. The default is 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
parts.emotions <- iterative.stratification.holdout(emotions, p = 70)
summary(parts.emotions$train)
summary(parts.emotions$test)
## End(Not run)
Iterative stratification
Implemented from the algorithm explained in: Konstantinos Sechidis, Grigorios Tsoumakas, and Ioannis Vlahavas. 2011. On the stratification of multi-label data. In Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III (ECML PKDD'11), Dimitrios Gunopulos, Thomas Hofmann, Donato Malerba, and Michalis Vazirgiannis (Eds.), Vol. Part III. Springer-Verlag, Berlin, Heidelberg, 145-158.
iterative.stratification.kfolds(mld, k = 5, seed = 10, get.indices = FALSE)
mld |
The |
k |
The number of folds to be generated. The default is 5 |
seed |
The seed to initialize the random number generator. The default is 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
folds.emotions <- iterative.stratification.kfolds(emotions)
summary(folds.emotions[[1]]$train)
summary(folds.emotions[[1]]$test)
## End(Not run)
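To give a feel for the idea behind iterative stratification, here is a simplified base R sketch written after the description in Sechidis et al. (2011): repeatedly take the rarest label among the unassigned instances and hand each of its instances to the fold that still needs that label most. It is not the package's implementation. The function name `iterative_stratify`, the 0/1 label-matrix input and the toy data are assumptions made for this example.

```r
# Simplified sketch of iterative stratification (after Sechidis et al., 2011).
# `labels`: 0/1 or logical matrix, one row per instance, one column per label.
# Returns the fold number assigned to every instance.
iterative_stratify <- function(labels, k = 5, seed = 10) {
  set.seed(seed)
  labels <- as.matrix(labels) != 0
  n <- nrow(labels)
  fold <- integer(n)                                     # 0 = not assigned yet
  want <- rep(n / k, k)                                  # instances still wanted per fold
  want_label <- outer(rep(1, k), colSums(labels)) / k    # folds x labels demand

  # Choose the fold with the highest score; break ties with `tiebreak`,
  # then at random.
  pick <- function(score, tiebreak) {
    best <- which(score == max(score))
    if (length(best) > 1) best <- best[tiebreak[best] == max(tiebreak[best])]
    if (length(best) > 1) best <- sample(best, 1)
    best
  }

  repeat {
    open <- fold == 0
    counts <- colSums(labels[open, , drop = FALSE])
    if (!any(counts > 0)) break
    counts[counts == 0] <- Inf
    l <- which.min(counts)                 # rarest label among unassigned instances
    for (i in which(open & labels[, l])) {
      m <- pick(want_label[, l], want)
      fold[i] <- m
      # The chosen fold now needs one fewer of every label this instance carries
      want_label[m, labels[i, ]] <- want_label[m, labels[i, ]] - 1
      want[m] <- want[m] - 1
    }
  }

  # Instances without any label go to the folds that are still short
  for (i in which(fold == 0)) {
    m <- pick(want, want)
    fold[i] <- m
    want[m] <- want[m] - 1
  }
  fold
}

# Toy data: 100 instances, a common label (every second row) and a rare one
toy <- cbind(common = as.integer(seq_len(100) %% 2 == 0),
             rare   = as.integer(seq_len(100) <= 10))
fold <- iterative_stratify(toy, k = 5)
table(fold)                        # fold sizes
tapply(toy[, "rare"], fold, sum)   # rare-label instances per fold
```

On this toy input the ten rare-label instances end up spread evenly over the five folds, which purely random assignment does not guarantee.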
Iterative stratification
Implemented from the algorithm explained in: Konstantinos Sechidis, Grigorios Tsoumakas, and Ioannis Vlahavas. 2011. On the stratification of multi-label data. In Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III (ECML PKDD'11), Dimitrios Gunopulos, Thomas Hofmann, Donato Malerba, and Michalis Vazirgiannis (Eds.), Vol. Part III. Springer-Verlag, Berlin, Heidelberg, 145-158.
iterative.stratification.partitions(mld, is.cv = FALSE, r, seed = 10, get.indices = FALSE)
mld |
The |
is.cv |
Option to enable treatment of partitions as cross-validation test folds |
r |
A vector of percentages of instances to be selected for each partition |
seed |
The seed to initialize the random number generator. The default is 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
parts.emotions <- iterative.stratification.partitions(emotions, r = c(35, 25, 40))
summary(parts.emotions[[2]])
## End(Not run)
Multilabel dataset from the text domain.
langlog
An mldr object with 1460 instances, 1004 attributes and 75 labels
Read, Jesse, "Scalable multi-label classification", University of Waikato, 2010
## Not run:
toBibtex(langlog)
langlog$measures
## End(Not run)
Multilabel dataset from the video domain.
mediamill(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 43907 instances, 120 attributes and 101 labels
Snoek, C. G. M. and Worring, M. and van Gemert, J. C. and Geusebroek, J. M. and Smeulders, A. W. M., "The challenge problem for automated detection of 101 semantic concepts in multimedia", in Proc. 14th ACM International Conference on Multimedia, MULTIMEDIA06, pp. 421-430, 2006
## Not run:
mediamill <- mediamill()  # Check and load the dataset
toBibtex(mediamill)
mediamill$measures
## End(Not run)
Multilabel dataset from the text domain.
medical
An mldr object with 978 instances, 1449 attributes and 45 labels
Crammer, K. and Dredze, M. and Ganchev, K. and Talukdar, P. P. and Carroll, S., "Automatic Code Assignment to Medical Text", in Proc. Workshop on Biological, Translational, and Clinical Language Processing, Prague, Czech Republic, BioNLP07, pp. 129-136, 2007
## Not run:
toBibtex(medical)
medical$measures
## End(Not run)
The function downloads the most up-to-date list of additional datasets from GitHub. Those datasets are not included in the package, but can be downloaded and saved locally.
mldrs()
## Not run:
library(mldr.datasets)
mldrs()
## End(Not run)
Multilabel dataset from the text domain. The original name of the dataset is 20ng.
ng20
An mldr object with 19300 instances, 1006 attributes and 20 labels
Ken Lang, "Newsweeder: Learning to filter netnews", in Proc. 12th International Conference on Machine Learning, pp. 331-339, 1995
## Not run:
toBibtex(ng20)
ng20$measures
## End(Not run)
Multilabel dataset from the image domain.
nuswide_BoW(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 269648 instances, 501 attributes and 81 labels
Chua, Tat-Seng and Tang, Jinhui and Hong, Richang and Li, Haojie and Luo, Zhiping and Zheng, Yantao, "NUS-WIDE: a real-world web image database from National University of Singapore", in Proc. of the ACM international conference on image and video retrieval, pp. 48, 2009
## Not run:
nuswide_BoW <- nuswide_BoW()  # Check and load the dataset
toBibtex(nuswide_BoW)
nuswide_BoW$measures
## End(Not run)
Multilabel dataset from the image domain.
nuswide_VLAD(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 269648 instances, 129 attributes and 81 labels
Chua, Tat-Seng and Tang, Jinhui and Hong, Richang and Li, Haojie and Luo, Zhiping and Zheng, Yantao, "NUS-WIDE: a real-world web image database from National University of Singapore", in Proc. of the ACM international conference on image and video retrieval, pp. 48, 2009
## Not run:
nuswide_VLAD <- nuswide_VLAD()  # Check and load the dataset
toBibtex(nuswide_VLAD)
nuswide_VLAD$measures
## End(Not run)
Multilabel dataset from the text domain.
ohsumed(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 13929 instances, 1002 attributes and 23 labels
Joachims, Thorsten, "Text Categorization with Support Vector Machines: Learning with Many Relevant Features", in Proc. 10th European Conference on Machine Learning, pp. 137-142, 1998
## Not run:
ohsumed <- ohsumed()  # Check and load the dataset
toBibtex(ohsumed)
ohsumed$measures
## End(Not run)
Random partitioning
random.holdout(mld, p = 60, seed = 10, get.indices = FALSE)
mld |
The |
p |
The percentage of instances to be selected for the training partition |
seed |
The seed to initialize the random number generator. The default is 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
parts.emotions <- random.holdout(emotions, p = 70)
summary(parts.emotions$train)
summary(parts.emotions$test)
## End(Not run)
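The role of `p` and `seed` in a random holdout split can be sketched on plain row indices, without the package. The helper name `holdout_indices` and the use of `round()` for the training size are assumptions made for this example; the package may size the partitions differently.

```r
# Hypothetical sketch: draw p% of the row indices for training, rest for test.
holdout_indices <- function(n, p = 60, seed = 10) {
  set.seed(seed)                                  # same seed, same split
  train <- sort(sample.int(n, size = round(n * p / 100)))
  list(train = train, test = setdiff(seq_len(n), train))
}

# 593 is the number of instances reported for the emotions dataset
parts <- holdout_indices(593, p = 70)
lengths(parts)
```

Calling the helper again with the same `seed` gives the same indices, while a different `seed` gives a different split of the same sizes.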
This method randomly partitions the given dataset into k folds, providing training and test partitions for each fold.
random.kfolds(mld, k = 5, seed = 10, get.indices = FALSE)
mld |
The |
k |
The number of folds to be generated. The default is 5 |
seed |
The seed to initialize the random number generator. The default is 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
folds.emotions <- random.kfolds(emotions)
summary(folds.emotions[[1]]$train)
summary(folds.emotions[[1]]$test)
## End(Not run)
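The train and test structure of a random k-fold split can be sketched on row indices in base R. The helper name `kfold_indices` is hypothetical, and the way fold sizes are balanced here is an assumption for illustration rather than the package's code.

```r
# Hypothetical sketch: give every row a random fold label, then build
# a train/test pair of index vectors for each fold.
kfold_indices <- function(n, k = 5, seed = 10) {
  set.seed(seed)
  # Labels 1..k repeated to length n, then shuffled; sizes differ by at most 1
  fold_of <- sample(rep_len(seq_len(k), n))
  lapply(seq_len(k), function(i) {
    list(train = which(fold_of != i), test = which(fold_of == i))
  })
}

folds <- kfold_indices(593, k = 5)
sapply(folds, function(f) length(f$test))   # test size of each fold
```

Every row index appears in exactly one test set, and the matching train set holds all remaining rows.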
Random partitioning
random.partitions(mld, is.cv = FALSE, r, seed = 10, get.indices = FALSE)
mld |
The |
is.cv |
Option to enable treatment of partitions as cross-validation test folds |
r |
A vector of percentages of instances to be selected for each partition |
seed |
The seed to initialize the random number generator. The default is 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
parts.emotions <- random.partitions(emotions, r = c(35, 25, 40))
summary(parts.emotions[[2]])
## End(Not run)
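How a vector of percentages such as `r = c(35, 25, 40)` can translate into partition sizes is sketched below on row indices. The helper name `partition_indices`, the requirement that the percentages sum to 100, and the rounding of cut points are all assumptions made for this example, not documented behavior of the package.

```r
# Hypothetical sketch: shuffle the row indices, then slice them at the
# cumulative percentages given in `r`.
partition_indices <- function(n, r, seed = 10) {
  stopifnot(sum(r) == 100)            # assumption: percentages cover all rows
  set.seed(seed)
  shuffled <- sample.int(n)
  # Rounded cumulative cut points; the last one always equals n
  cuts <- c(0, round(n * cumsum(r) / 100))
  lapply(seq_along(r), function(i) {
    size <- cuts[i + 1] - cuts[i]
    sort(shuffled[seq(cuts[i] + 1, length.out = size)])
  })
}

# 593 is the number of instances reported for the emotions dataset
parts <- partition_indices(593, r = c(35, 25, 40))
lengths(parts)   # partition sizes
```

Rounding the cumulative cut points, rather than each percentage separately, keeps the partition sizes summing to the number of rows.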
Multilabel dataset from the text domain.
rcv1sub1(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 6000 instances, 47236 attributes and 101 labels
Lewis, D. D. and Yang, Y. and Rose, T. G. and Li, F., "RCV1: A new benchmark collection for text categorization research", The Journal of Machine Learning Research, Vol. 5, pp. 361-397, 2004
## Not run:
rcv1sub1 <- rcv1sub1()  # Check and load the dataset
toBibtex(rcv1sub1)
rcv1sub1$measures
## End(Not run)
Multilabel dataset from the text domain.
rcv1sub2(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 6000 instances, 47236 attributes and 101 labels
Lewis, D. D. and Yang, Y. and Rose, T. G. and Li, F., "RCV1: A new benchmark collection for text categorization research", The Journal of Machine Learning Research, Vol. 5, pp. 361-397, 2004
## Not run:
rcv1sub2 <- rcv1sub2()  # Check and load the dataset
toBibtex(rcv1sub2)
rcv1sub2$measures
## End(Not run)
Multilabel dataset from the text domain.
rcv1sub3(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 6000 instances, 47236 attributes and 101 labels
Lewis, D. D. and Yang, Y. and Rose, T. G. and Li, F., "RCV1: A new benchmark collection for text categorization research", The Journal of Machine Learning Research, Vol. 5, pp. 361-397, 2004
## Not run:
rcv1sub3 <- rcv1sub3()  # Check and load the dataset
toBibtex(rcv1sub3)
rcv1sub3$measures
## End(Not run)
Multilabel dataset from the text domain.
rcv1sub4(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 6000 instances, 47229 attributes and 101 labels
Lewis, D. D. and Yang, Y. and Rose, T. G. and Li, F., "RCV1: A new benchmark collection for text categorization research", The Journal of Machine Learning Research, Vol. 5, pp. 361-397, 2004
## Not run:
rcv1sub4 <- rcv1sub4()  # Check and load the dataset
toBibtex(rcv1sub4)
rcv1sub4$measures
## End(Not run)
Multilabel dataset from the text domain.
rcv1sub5(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 6000 instances, 47235 attributes and 101 labels
Lewis, D. D. and Yang, Y. and Rose, T. G. and Li, F., "RCV1: A new benchmark collection for text categorization research", The Journal of Machine Learning Research, Vol. 5, pp. 361-397, 2004
## Not run:
rcv1sub5 <- rcv1sub5()  # Check and load the dataset
toBibtex(rcv1sub5)
rcv1sub5$measures
## End(Not run)
Multilabel dataset from the text domain.
reutersk500(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 6000 instances, 500 attributes and 103 labels
Read, Jesse, "Scalable multi-label classification", University of Waikato, 2010
## Not run:
reutersk500 <- reutersk500()  # Check and load the dataset
toBibtex(reutersk500)
reutersk500$measures
## End(Not run)
Multilabel dataset from the image domain.
scene(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 2407 instances, 294 attributes and 6 labels
Boutell, M. and Luo, J. and Shen, X. and Brown, C., "Learning multi-label scene classification", Pattern Recognition, (9)37, pp. 1757–1771, 2004
## Not run:
scene <- scene()
toBibtex(scene)
scene$measures
## End(Not run)
Multilabel dataset from the text domain.
slashdot
An mldr object with 3782 instances, 1079 attributes and 22 labels
Read, J. and Pfahringer, B. and Holmes, G. and Frank, E., "Classifier chains for multi-label classification", Machine Learning, (3)85, pp. 333–359, 2011
## Not run:
toBibtex(slashdot)
slashdot$measures
## End(Not run)
This function calculates the ratio of zero-valued elements to the total number of elements. It is useful for deciding whether to export in a dense or sparse format.
sparsity(mld)
mld |
An |
library(mldr.datasets)
sparsity(emotions)
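The ratio described above can be illustrated on a small numeric table in base R. This is a minimal sketch, not the package's code: `zero_ratio` and `toy` are hypothetical names, and it assumes every column is numeric.

```r
# Hypothetical helper: fraction of cells that are exactly zero.
zero_ratio <- function(x) {
  m <- as.matrix(x)          # assumes all columns are numeric
  sum(m == 0) / length(m)
}

toy <- data.frame(
  a = c(0, 1, 0, 0),
  b = c(0, 0, 2, 0),
  c = c(3, 0, 0, 0)
)
zero_ratio(toy)  # 0.75: 9 of the 12 cells are zero
```

A value close to 1 suggests that a sparse representation will produce much smaller files than a dense one.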
Multilabel dataset from the text domain.
stackex_chemistry(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 6961 instances, 540 attributes and 175 labels
Charte, Francisco and Rivera, Antonio J. and del Jesus, Maria J. and Herrera, Francisco, "QUINTA: A question tagging assistant to improve the answering ratio in electronic forums", in EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE, pp. 1-6, 2015
## Not run:
stackex_chemistry <- stackex_chemistry()  # Check and load the dataset
toBibtex(stackex_chemistry)
stackex_chemistry$measures
## End(Not run)
Multilabel dataset from the text domain.
stackex_chess
An mldr object with 1675 instances, 585 attributes and 227 labels
Charte, Francisco and Rivera, Antonio J. and del Jesus, Maria J. and Herrera, Francisco, "QUINTA: A question tagging assistant to improve the answering ratio in electronic forums", in EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE, pp. 1-6, 2015
## Not run:
toBibtex(stackex_chess)
stackex_chess$measures
## End(Not run)
Multilabel dataset from the text domain.
stackex_coffee(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 225 instances, 1763 attributes and 123 labels
Charte, Francisco and Rivera, Antonio J. and del Jesus, Maria J. and Herrera, Francisco, "QUINTA: A question tagging assistant to improve the answering ratio in electronic forums", in EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE, pp. 1-6, 2015
## Not run:
stackex_coffee <- stackex_coffee()
toBibtex(stackex_coffee)
stackex_coffee$measures
## End(Not run)
Multilabel dataset from the text domain.
stackex_cooking(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 10491 instances, 577 attributes and 400 labels
Charte, Francisco and Rivera, Antonio J. and del Jesus, Maria J. and Herrera, Francisco, "QUINTA: A question tagging assistant to improve the answering ratio in electronic forums", in EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE, pp. 1-6, 2015
## Not run:
stackex_cooking <- stackex_cooking()  # Check and load the dataset
toBibtex(stackex_cooking)
stackex_cooking$measures
## End(Not run)
Multilabel dataset from the text domain.
stackex_cs(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 9270 instances, 635 attributes and 274 labels
Charte, Francisco and Rivera, Antonio J. and del Jesus, Maria J. and Herrera, Francisco, "QUINTA: A question tagging assistant to improve the answering ratio in electronic forums", in EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE, pp. 1-6, 2015
## Not run:
stackex_cs <- stackex_cs()  # Check and load the dataset
toBibtex(stackex_cs)
stackex_cs$measures
## End(Not run)
Multilabel dataset from the text domain.
stackex_philosophy(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 3971 instances, 842 attributes and 233 labels
Charte, Francisco and Rivera, Antonio J. and del Jesus, Maria J. and Herrera, Francisco, "QUINTA: A question tagging assistant to improve the answering ratio in electronic forums", in EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE, pp. 1-6, 2015
## Not run:
stackex_philosophy <- stackex_philosophy()  # Check and load the dataset
toBibtex(stackex_philosophy)
stackex_philosophy$measures
## End(Not run)
Stratified partitioning
Implementation of the algorithm defined in: Charte, F., Rivera, A., del Jesus, M. J., & Herrera, F. (2016, April). On the impact of dataset complexity and sampling strategy in multilabel classifiers performance. In International Conference on Hybrid Artificial Intelligence Systems (pp. 500-511). Springer, Cham.
stratified.holdout(mld, p = 60, seed = 10, get.indices = FALSE)
mld |
The |
p |
The percentage of instances to be selected for the training partition |
seed |
The seed used to initialize the random number generator. Defaults to 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
parts.emotions <- stratified.holdout(emotions, p = 70)
summary(parts.emotions$train)
summary(parts.emotions$test)
## End(Not run)
This method partitions the given dataset into k folds using a stratified strategy, providing training and test partitions for each fold.
Implementation of the algorithm defined in: Charte, F., Rivera, A., del Jesus, M. J., & Herrera, F. (2016, April). On the impact of dataset complexity and sampling strategy in multilabel classifiers performance. In International Conference on Hybrid Artificial Intelligence Systems (pp. 500-511). Springer, Cham.
stratified.kfolds(mld, k = 5, seed = 10, get.indices = FALSE)
mld |
The |
k |
The number of folds to be generated. Defaults to 5 |
seed |
The seed used to initialize the random number generator. Defaults to 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
folds.emotions <- stratified.kfolds(emotions)
summary(folds.emotions[[1]]$train)
summary(folds.emotions[[1]]$test)
## End(Not run)
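The shape of the returned list (k elements, each holding train and test) and the use of the seed for a 2x5 scheme can be illustrated in base R. Note that this sketch uses plain random folds over row indices; it does not implement the stratified algorithm from the cited paper, and `make_folds` is a hypothetical helper.

```r
# Hypothetical helper: k random folds over the row indices 1..n.
# Each fold is a list with train and test index vectors.
make_folds <- function(n, k = 5, seed = 10) {
  set.seed(seed)
  fold_id <- sample(rep_len(seq_len(k), n))  # random fold label for each row
  lapply(seq_len(k), function(i) {
    list(train = which(fold_id != i), test = which(fold_id == i))
  })
}

# 2x5 scheme: two repetitions of 5 folds, each repetition with its own seed
reps <- lapply(c(10, 20), function(s) make_folds(100, k = 5, seed = s))
length(reps[[1]])            # 5 folds in the first repetition
length(reps[[1]][[1]]$test)  # 20 test rows in the first fold
```

With the package itself, the same pattern would presumably call `stratified.kfolds` with two different `seed` values instead of `make_folds`.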
Stratified partitioning
Generalization of the algorithm defined in: Charte, F., Rivera, A., del Jesus, M. J., & Herrera, F. (2016, April). On the impact of dataset complexity and sampling strategy in multilabel classifiers performance. In International Conference on Hybrid Artificial Intelligence Systems (pp. 500-511). Springer, Cham.
stratified.partitions(mld, is.cv = FALSE, r, seed = 10, get.indices = FALSE)
mld |
The |
is.cv |
Option to enable treatment of partitions as cross-validation test folds |
r |
A vector of percentages of instances to be selected for each partition |
seed |
The seed used to initialize the random number generator. Defaults to 10. Change it to obtain partitions containing different samples, for instance to use a 2x5-fold cross-validation strategy |
get.indices |
A logical value indicating whether to return lists of indices or lists of |
An mldr.folds object. This is a list containing k elements, one for each fold. Each element is made up of two mldr objects, called train and test.
## Not run:
library(mldr.datasets)
library(mldr)
parts.emotions <- stratified.partitions(emotions, r = c(35, 25, 40))
summary(parts.emotions[[2]])
## End(Not run)
Multilabel dataset from the text domain.
tmc2007(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 28596 instances, 49060 attributes and 22 labels
Srivastava, A. N. and Zane-Ulman, B., "Discovering recurring anomalies in text reports regarding complex space systems", Aerospace Conference, pp. 3853-3862, 2005
## Not run:
tmc2007 <- tmc2007()  # Check and load the dataset
toBibtex(tmc2007)
tmc2007$measures
## End(Not run)
Multilabel dataset from the text domain.
tmc2007_500(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 28596 instances, 500 attributes and 22 labels
Srivastava, A. N. and Zane-Ulman, B., "Discovering recurring anomalies in text reports regarding complex space systems", Aerospace Conference, pp. 3853-3862, 2005
## Not run:
tmc2007_500 <- tmc2007_500()  # Check and load the dataset
toBibtex(tmc2007_500)
tmc2007_500$measures
## End(Not run)
Gets the content of the bibtex member of the mldr object and returns it.
## S3 method for class 'mldr'
toBibtex(object, ...)
object |
The mldr object whose BibTeX entry is needed |
... |
Additional parameters from the generic toBibtex function not used by toBibtex.mldr |
A string with the BibTeX entry
## Not run:
library(mldr.datasets)
cat(toBibtex(emotions))
## End(Not run)
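How an S3 method for the `toBibtex` generic from `utils` can return a stored member may be sketched with a mock class. Everything here is an assumption made for illustration: the `demo_mld` class, its method and the placeholder entry are hypothetical and are not taken from the package.

```r
# Mock object with a bibtex member; the entry is a made-up placeholder.
demo <- structure(
  list(bibtex = "@article{demo_key,\n  title = {Placeholder entry}\n}"),
  class = "demo_mld"
)

# S3 method for the utils::toBibtex generic: just return the stored string.
toBibtex.demo_mld <- function(object, ...) object$bibtex

cat(utils::toBibtex(demo), "\n")
```

Since the result is a plain string, it can be passed to `cat` or `writeLines` to build a .bib file.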
Writes one or more files in the specified formats with the content of the mldr or mldr.folds given as parameter.
write.mldr(mld, format = c("MULAN", "MEKA"), sparse = FALSE,
  basename = ifelse(!is.null(mld$name) && nchar(mld$name) > 0,
    regmatches(mld$name, regexpr("(\\w)+", mld$name)), "unnamed_mldr"),
  noconfirm = FALSE, ...)
mld |
The |
format |
A vector of strings stating the desired file formats. It can contain the values |
sparse |
Boolean value indicating if sparse representation has to be used for ARFF-based file formats |
basename |
Base name for the files. |
noconfirm |
Use TRUE to skip confirmation of file writing |
... |
Additional options for the exporting functions (e.g. |
## Not run:
library(mldr.datasets)
write.mldr(emotions, format = c('CSV', 'KEEL'))
## End(Not run)
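The default value of `basename` in the usage above takes the first run of word characters from the dataset name and falls back to "unnamed_mldr" when there is no name. The following base R sketch reproduces that expression in a hypothetical helper, `default_basename`, using `if`/`else` instead of `ifelse`; it writes no files.

```r
# Hypothetical helper mirroring the default basename expression.
default_basename <- function(name) {
  if (!is.null(name) && nchar(name) > 0) {
    # First run of word characters (letters, digits, underscore)
    regmatches(name, regexpr("(\\w)+", name))
  } else {
    "unnamed_mldr"
  }
}

default_basename("emotions")    # "emotions"
default_basename("birds (v2)")  # "birds"
default_basename(NULL)          # "unnamed_mldr"
```

Passing an explicit `basename` to `write.mldr` avoids relying on this derivation when a dataset name contains spaces or punctuation.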
Multilabel dataset from the text domain.
yahoo_arts(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 7484 instances, 23146 attributes and 26 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_arts <- yahoo_arts()  # Check and load the dataset
toBibtex(yahoo_arts)
yahoo_arts$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_business(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 11214 instances, 21924 attributes and 30 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_business <- yahoo_business()  # Check and load the dataset
toBibtex(yahoo_business)
yahoo_business$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_computers(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 12444 instances, 34096 attributes and 33 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_computers <- yahoo_computers()  # Check and load the dataset
toBibtex(yahoo_computers)
yahoo_computers$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_education(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 12030 instances, 27534 attributes and 33 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_education <- yahoo_education()  # Check and load the dataset
toBibtex(yahoo_education)
yahoo_education$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_entertainment(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 12730 instances, 32001 attributes and 21 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_entertainment <- yahoo_entertainment()  # Check and load the dataset
toBibtex(yahoo_entertainment)
yahoo_entertainment$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_health(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 8205 instances, 30605 attributes and 32 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_health <- yahoo_health()  # Check and load the dataset
toBibtex(yahoo_health)
yahoo_health$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_recreation(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 12828 instances, 30324 attributes and 22 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_recreation <- yahoo_recreation()  # Check and load the dataset
toBibtex(yahoo_recreation)
yahoo_recreation$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_reference(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 8027 instances, 39679 attributes and 33 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_reference <- yahoo_reference()  # Check and load the dataset
toBibtex(yahoo_reference)
yahoo_reference$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_science(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 6428 instances, 37187 attributes and 40 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_science <- yahoo_science()  # Check and load the dataset
toBibtex(yahoo_science)
yahoo_science$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_social(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 12111 instances, 52350 attributes and 39 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_social <- yahoo_social()  # Check and load the dataset
toBibtex(yahoo_social)
yahoo_social$measures
## End(Not run)
Multilabel dataset from the text domain.
yahoo_society(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 14512 instances, 31802 attributes and 27 labels
Ueda, N. and Saito, K., "Parametric mixture models for multi-labeled text", Advances in neural information processing systems, pp. 721–728, 2002
## Not run:
yahoo_society <- yahoo_society()  # Check and load the dataset
toBibtex(yahoo_society)
yahoo_society$measures
## End(Not run)
Multilabel dataset from the biology domain.
yeast(...)
... |
Additional options for the loading function (e.g. |
An mldr object with 2417 instances, 103 attributes and 14 labels
Elisseeff, A. and Weston, J., "A Kernel Method for Multi-Labelled Classification", Advances in Neural Information Processing Systems, Vol. 14, pp. 681–687, 2001
## Not run:
yeast <- yeast()  # Check and load the dataset
toBibtex(yeast)
yeast$measures
## End(Not run)