Denes Csala added Classic_topic_models_such_as__.md  over 8 years ago

Commit id: 60f711a742c93800d92bb882b07fca4b4d0d7ed1

Classic topic models, such as LDA or PLSA (Probabilistic Latent Semantic Analysis), need thousands of documents to provide reliable topic information. In practice, however, the number of documents available for analysis is often at most 100: consider comments, reviews, news articles, etc. There are a few possible improvement pathways, but most are infeasible or impractical, such as increasing the number of input documents or providing human input as prior domain knowledge. Another pathway is to transfer information across domains, as suggested by AMC. This works because related topic domains share similar characteristics: comments on any gadget will tend to mention price or battery life, comments on books will mention length, and so on.