Training data (or training set) refers to the portion of the data used to fit a model. Unsupervised learning refers to analysis in which one attempts to learn something about the data other than predicting an output value of interest (for example, whether the observations fall into clusters).
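The two definitions can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration and does not come from the text: the helper names `train_test_split`, `fit_line`, and `two_means`, the 80/20 split, the toy data, and the choice of a least-squares line and a two-cluster 1-D k-means as stand-ins for "a model" and "clusters".

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of rows and return (training set, held-out set)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_means(values, iterations=10):
    """1-D k-means with k = 2. No output variable is involved."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Training data: only the training portion is used to fit the model.
data = [(float(x), 2.0 * x + 1.0) for x in range(10)]  # hypothetical toy data
train, held_out = train_test_split(data)
slope, intercept = fit_line(train)

# Unsupervised learning: look for structure (two clusters) with no output value.
measurements = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]  # hypothetical toy data
centers, groups = two_means(measurements)
```

The first half fits the line using only `train`, leaving `held_out` untouched. The second half never refers to an output value; it only asks whether the measurements separate into groups.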