Tweet modeling with LSTM recurrent neural networks for hashtag recommendation

Abstract

A hashtag, a word or phrase prefixed with the hash symbol (#), marks the keyword or topic of a tweet. Hashtags were created organically by users as a way to categorize messages, and they also provide valuable information for research applications such as sentiment classification and topic analysis. However, only a small number of tweets are manually annotated, so an automatic hashtag recommendation method is needed to help users tag their new tweets. Previous methods mostly use conventional machine learning classifiers such as SVM or rely on collaborative filtering techniques. A limitation of these approaches is that they all represent tweets with the TF-IDF scheme and therefore ignore the semantic information in tweets. In this paper, we also regard hashtag recommendation as a classification task, but we propose a novel recurrent neural network model that learns vector-based tweet representations for recommending hashtags. More precisely, we use a skip-gram model to generate distributed word representations and then apply a convolutional neural network to learn semantic sentence vectors. We then use the sentence vectors to train a long short-term memory recurrent neural network (LSTM-RNN). The resulting tweet vectors are used directly as features to classify hashtags, without any feature engineering. Hashtag recommendation experiments on real-world Twitter data show that the proposed LSTM-RNN model outperforms state-of-the-art methods, and that the LSTM unit performs best compared with the standard RNN unit and the gated recurrent unit (GRU).
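The last stage of the pipeline described in the abstract (sentence vectors fed to an LSTM, then hashtag classification) can be illustrated with a minimal sketch. This is not the paper's implementation: the class name TinyLSTMClassifier, the helper recommend, the dimensions, the use of the final hidden state as the tweet vector, and the softmax output layer are all assumptions made for illustration. The weights are random and untrained, and the input sentence vectors are made-up numbers standing in for the CNN output. The code uses only the Python standard library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Return W @ x + b, where W is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    """Hypothetical, untrained LSTM followed by a softmax over hashtags."""

    def __init__(self, input_dim, hidden_dim, num_hashtags, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each applied to the concatenation [h_prev; x].
        self.W = {g: init_matrix(hidden_dim, hidden_dim + input_dim, rng)
                  for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}
        self.W_out = init_matrix(num_hashtags, hidden_dim, rng)
        self.b_out = [0.0] * num_hashtags

    def step(self, x, h, c):
        """One standard LSTM step: returns the new hidden and cell states."""
        z = h + x  # list concatenation: [h_prev; x]
        i = [sigmoid(v) for v in affine(self.W["i"], z, self.b["i"])]
        f = [sigmoid(v) for v in affine(self.W["f"], z, self.b["f"])]
        o = [sigmoid(v) for v in affine(self.W["o"], z, self.b["o"])]
        g = [math.tanh(v) for v in affine(self.W["c"], z, self.b["c"])]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def tweet_vector(self, sentence_vectors):
        """Assumption: the final hidden state serves as the tweet vector."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sentence_vectors:
            h, c = self.step(x, h, c)
        return h

    def predict_proba(self, sentence_vectors):
        """Softmax over hashtag classes, computed from the tweet vector."""
        logits = affine(self.W_out, self.tweet_vector(sentence_vectors),
                        self.b_out)
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


def recommend(model, sentence_vectors, hashtags, k=2):
    """Return the k hashtags with the highest predicted probability."""
    probs = model.predict_proba(sentence_vectors)
    ranked = sorted(zip(hashtags, probs), key=lambda p: p[1], reverse=True)
    return [tag for tag, _ in ranked[:k]]


# Made-up example: a tweet with two sentences, each a 4-dimensional vector.
hashtags = ["#nlp", "#music", "#sports"]
model = TinyLSTMClassifier(input_dim=4, hidden_dim=5,
                           num_hashtags=len(hashtags))
tweet = [[0.1, -0.2, 0.3, 0.0], [0.05, 0.4, -0.1, 0.2]]
print(recommend(model, tweet, hashtags, k=2))
```

Because the weights are untrained, the printed ranking is arbitrary. In a real system the LSTM and output layer would be trained on tweets labeled with their hashtags, and the inputs would come from the skip-gram and convolutional stages rather than hand-written numbers.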

Publication
2016 International Joint Conference on Neural Networks
