1stpaper NanlConf Textbook Rec Om Mender
Partha Sarathi Chakraborty
Department of Information Technology, University Institute of Technology, University of Burdwan, Burdwan, West Bengal
Abstract
Recommender systems have proved useful in coping with information overload on the Internet. Many web sites try to help users by incorporating a recommender system that provides a list of items and/or web pages likely to interest them. This paper describes a content-based textbook recommending system that employs a neural network as its machine learning algorithm to learn users' personal preferences and provide tailored suggestions.
1. Introduction
Content-based recommendation systems recommend items based on the content of the items and the target user's ratings. Two different content-based approaches have been proposed: feature-based and text categorization-based. Feature-based recommendation systems [6], [7] extract important features from the item descriptions and learn a user's profile (classifier) from a set of feature vectors preclassified according to the user's ratings. Text categorization [13] systems learn from thousands of features (words or phrases). Several systems using text categorization (TC) have been developed and applied to recommend web pages [8] and books [9]. In this paper, we apply learning for text categorization to the domain of book recommendation. The machine learning algorithm used is a three-layer, fully connected feed-forward neural network.
The rest of the paper is organized as follows. Related work is discussed in Section 2. The proposed system is detailed in Section 3. Experimental results are presented in Section 4. Future work and references are given in Sections 5 and 6, respectively.
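The three-layer, fully connected feed-forward network mentioned above can be illustrated with a minimal sketch of its forward pass. The paper does not state the activation function, layer sizes, or training procedure, so the sigmoid activation, the sizes `n_terms` and `n_hidden`, the random initial weights, and the made-up feature vector are all assumptions for illustration; training is left out entirely.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic activation (an assumption), mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in: int, n_out: int, rng: random.Random) -> list[list[float]]:
    """One fully connected layer: n_out rows, each with n_in weights plus a bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights: list[list[float]], inputs: list[float]) -> list[float]:
    """Every unit sees every input; the last entry of each row is the bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
        for row in weights
    ]


def predict(hidden: list[list[float]], output: list[list[float]],
            features: list[float]) -> list[float]:
    """Input layer -> hidden layer -> output layer."""
    return layer_forward(output, layer_forward(hidden, features))


rng = random.Random(0)
n_terms, n_hidden = 6, 3                     # hypothetical layer sizes
hidden = init_layer(n_terms, n_hidden, rng)  # untrained, random weights
output = init_layer(n_hidden, 1, rng)

features = [0.0, 1.2, 0.0, 0.4, 0.0, 2.1]    # made-up term weights for one book
score = predict(hidden, output, features)[0]
print(f"predicted interest score: {score:.3f}")
```

With a single sigmoid output unit the score lies between 0 and 1 and could be read as the predicted degree of interest in a book; how the system actually maps its output to a recommendation is not specified in the text.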
2. Related work
Several book recommendation systems have been developed. Online book stores such as Amazon and BarnesAndNoble offer popular recommendation services. A content-based approach was employed in one of the first book recommending systems [10, 11]. An important contribution in this domain is the LIBRA system of Mooney and Roy [9]. LIBRA uses a content-based approach to recommend books, applying automated text-categorization methods to semi-structured text extracted from the web. After the user rates a set of books, the system learns a profile of the user with a Bayesian learning algorithm and produces a ranked list of the most recommended additional titles from the system's catalog.
In our approach we use a soft computing tool, a neural network, to recommend books online.
The reduced feature set is formed by taking only the b terms with the highest TF x IDF weight. The values of the parameters a and b are determined experimentally.
3. Our Approach
3.1 Overview
In our recommender system, a user first chooses the subject for which he or she is looking for a book. The user may also narrow the search by specifying an author or a publishing house. A list of books is then presented, with a brief description of each one. The user evaluates some of the books by rating them on a scale of 1 to 5. Based on these ratings, the system recommends a small number of books.
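The search-and-rate interaction described above might look roughly like the following sketch. The `Book` record, the `search` and `rate` helpers, and the catalog entries are hypothetical names and placeholder data introduced only for illustration; the paper does not describe its data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Book:
    """Hypothetical catalog record."""
    title: str
    subject: str
    author: str
    publisher: str
    description: str


def search(catalog: list[Book], subject: str,
           author: Optional[str] = None,
           publisher: Optional[str] = None) -> list[Book]:
    """Books on `subject`, optionally narrowed by author and/or publisher."""
    def matches(book: Book) -> bool:
        return (
            book.subject.lower() == subject.lower()
            and (author is None or author.lower() in book.author.lower())
            and (publisher is None or publisher.lower() in book.publisher.lower())
        )
    return [book for book in catalog if matches(book)]


def rate(ratings: dict[str, int], book: Book, score: int) -> None:
    """Record a user's rating, enforcing the 1-to-5 scale."""
    if not 1 <= score <= 5:
        raise ValueError("rating must be between 1 and 5")
    ratings[book.title] = score


# Placeholder catalog; titles, authors and publishers are made up.
catalog = [
    Book("Book A", "Databases", "Author X", "Publisher P", "Relational models."),
    Book("Book B", "Databases", "Author Y", "Publisher Q", "Query optimization."),
    Book("Book C", "Networks", "Author X", "Publisher P", "Routing and switching."),
]

shown = search(catalog, "Databases", author="Author X")
ratings: dict[str, int] = {}
rate(ratings, shown[0], 4)
print([book.title for book in shown], ratings)
```

The collected ratings would then serve as training labels for the learner; how ratings are converted into labels is not stated in the text and is left out here.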
4. Experimental Results
Several metrics are commonly used to evaluate recommender systems. We use precision and recall to evaluate our system. Precision is the percentage of correctly recommended items out of the total number of recommended items. Recall is the percentage of items of interest to the user that are actually recommended. Accuracy is the number of correctly classified items divided by the number of classified items. For our current system, precision was 56% and recall was 57.1%. The overall percentage of successful recommendations was 71%.
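As a worked illustration of these metrics, the sketch below computes precision and accuracy as defined above, together with recall in its standard sense (correctly recommended items divided by all items the user actually likes). The function name `evaluate` and the two boolean lists are made up for the example; they are not the paper's data and do not reproduce its reported figures.

```python
def evaluate(recommended: list[bool], relevant: list[bool]) -> dict[str, float]:
    """Precision, recall and accuracy from per-item decisions and ground truth."""
    pairs = list(zip(recommended, relevant))
    tp = sum(1 for rec, rel in pairs if rec and rel)          # recommended, liked
    fp = sum(1 for rec, rel in pairs if rec and not rel)      # recommended, not liked
    fn = sum(1 for rec, rel in pairs if not rec and rel)      # missed, liked
    tn = sum(1 for rec, rel in pairs if not rec and not rel)  # correctly skipped
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Made-up decisions for eight books: what the system recommended
# versus what the user actually liked.
recommended = [True, True, True, False, False, False, True, False]
relevant = [True, False, True, True, False, False, True, True]

metrics = evaluate(recommended, relevant)
print(metrics)  # precision 0.75, recall 0.6, accuracy 0.625
```

Here three of the four recommended books are liked (precision 3/4), three of the five liked books are recommended (recall 3/5), and five of the eight decisions are correct (accuracy 5/8).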
5. Future Works
The proposed system has been tested only with a small dataset, and a rigorous evaluation on a large dataset is still required. We are currently applying a genetic algorithm to select the neural network weights and observing the change in performance. For comparative analysis, we are also interested in using a self-organizing map in place of the feed-forward network in our recommender system.
In the definition of IDF, N is the number of documents in the document set, and n is the number of documents in which the i-th term appears. By this definition, a term that appears in fewer documents has a higher IDF. The underlying assumption is that terms concentrated in a few documents are more helpful in distinguishing between documents on different topics. We also use TF x IDF to reduce the dimensionality of the feature space with the help of two parameters, a and b. First, only those terms whose TF x IDF weight is greater than the threshold value a are considered. The reduced feature set is then formed from the b terms with the highest weight among them.
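The weighting and the two-parameter reduction can be sketched as follows. Several details are assumptions, since the text does not give them: IDF is taken in the common form log(N / n), TF is the raw count of a term in a document, and a term's weight for selection purposes is its highest TF x IDF value over all documents. The function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Per-document TF x IDF weights, assuming IDF = log(N / n)."""
    n_docs = len(docs)
    # n: number of documents in which each term appears
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts (an assumption)
        weights.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return weights


def reduce_features(weights: list[dict[str, float]], a: float, b: int) -> list[str]:
    """Keep terms whose weight exceeds a, then only the b highest of those."""
    best: dict[str, float] = {}
    for doc_weights in weights:
        for term, w in doc_weights.items():
            # Assumption: a term is scored by its highest weight in any document.
            best[term] = max(best.get(term, 0.0), w)
    above = [t for t, w in best.items() if w > a]
    # Highest weight first; ties broken alphabetically for determinism.
    above.sort(key=lambda t: (-best[t], t))
    return above[:b]


# Toy document set, made up for illustration.
docs = [
    "neural network learning network".split(),
    "database query learning".split(),
    "database index query".split(),
]

weights = tf_idf(docs)
vocabulary = reduce_features(weights, a=0.5, b=2)
print(vocabulary)
```

In this toy set, terms that occur in two of the three documents (such as "learning") get an IDF of log(1.5) and fall below the threshold a = 0.5, while terms confined to a single document pass it; b then caps the size of the resulting feature set.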
6. References
[1] P. Resnick and H. R. Varian. Recommender systems. Special issue of Communications of the ACM, pages 56-58, March 1997.
[2] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., Riedl, J. 1994. GroupLens: An Open Architecture for Collaborative Filtering of Netnews. In Proceedings of the Computer Supported Collaborative Work Conference. Chapel Hill, NC.
[3] Shardanand, U. and Maes, P. 1995. Social Information Filtering: Algorithms for Automating Word of Mouth. In Proceedings of CHI, pp. 210-217. Denver, CO.
[4] Schafer B., Konstan J., Riedl J. 1999. Recommender Systems in E-Commerce. In Proceedings of ACM Conference on Electronic Commerce.
[5] Sarwar, B.M., Karypis, G., Konstan, J.A. and Riedl, J. 2000a. Application of Dimensionality Reduction in Recommender System: A Case Study. In ACM WebKDD 2000 Web Mining for E-commerce Workshop.
[6] D. Billsus and M. Pazzani, "A Personal News Agent that Talks, Learns and Explains", Third Intern. Conf. on Autonomous Agents (Agents '99), Seattle, Washington, 1999.
[7] M. Pazzani, J. Muramatsu, D. Billsus, Syskill & Webert: Identifying Interesting Web Sites, AAAI-96, pp. 54-61, 1996.
[8] M. Pazzani, J. Muramatsu, D. Billsus, Syskill & Webert: Identifying Interesting Web Sites, AAAI-96, pp. 54-61, 1996.
[9] R. J. Mooney, L. Roy, Content-Based Book Recommending Using Learning for Text Categorization, Fifth ACM Conf. on Digital Libraries, 2000.
[10] E. Rich. User modeling via stereotypes. Cognitive Science, 3:329-354, 1979.
[11] E. Rich. Users are individuals: Individualizing user models. International Journal of Man-Machine Studies, 18:199-214, 1983.
[12] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5):513-523, 1988.
[13] Fabrizio Sebastiani, Machine Learning in Automated Text Categorization, ACM Computing Surveys, 34(1):1-47, 2002.