Placing documents within a hierarchical structure is a common task and can be viewed as a multi-label classification problem with hierarchical structure in the label space. Examples of such data include web pages and their placement in directories, product descriptions and their associated categories from product hierarchies, and free-text clinical records and their assigned diagnosis codes. We present a model for hierarchically and multiply labeled bag-of-words data called hierarchically supervised latent Dirichlet allocation (HSLDA). Out-of-sample label prediction is the primary goal of this work, but improved lower-dimensional representations of the bag-of-words data are also of interest. We demonstrate HSLDA on large-scale data from clinical document labeling and retail product categorization tasks. We show that leveraging the structure of the hierarchical labels substantially improves out-of-sample label prediction compared to models that ignore it.