The following information was submitted:
Transactions: WSEAS TRANSACTIONS ON COMPUTERS
Transactions ID Number: 42-228
Full Name: Zakaria Zubi
Position: Assistant Professor
Age: ON
Sex: Male
Address: Sirte, Libya
Country: LIBYA
Tel: +218913752962
Tel prefix:
Fax:
E-mail address: zszubi@yahoo.com
Other E-mails:
Title of the Paper: Using Some Web Content Mining Techniques for Arabic Text Classification
Authors as they appear in the Paper: Zakaria Suliman Zubi
Email addresses of all the authors: zszubi@yahoo.com
Number of paper pages: 12
Abstract: With the massive rise in the volume of information available on the World Wide Web, and the emerging need for better techniques to access this information, there has been a strong resurgence of interest in web mining research. Web mining applies data mining and other information processing techniques to the World Wide Web to discover useful patterns. People can take advantage of these patterns to access the World Wide Web more efficiently. Web mining can be divided into three categories: content mining, usage mining, and structure mining. In this paper we apply web content mining to extract non-English knowledge from the web. We investigate and evaluate some common methods used by web mining systems that have to deal with issues in language-specific text processing. An Arabic language-independent algorithm will be used as a machine learning system. The algorithm will use a set of features as a vector of keywords for the learning process to apply text classification. The algorithm is typically used to classify documents written in a non-English language. It categorizes and classifies the documents using two classifiers: Classifier K-Nearest Neighbor (CK-NN) and Classifier Naïve Bayes (CNB). However, the algorithms usually depend on phrase segmentation and extraction programs to generate a set of features or keywords to represent the retrieved web documents. The proposed Arabic text classification system is called Arabic Text Classifier (ATC). The main goal of ATC is to compare the results of the two classifiers (CK-NN and CNB) and select the one with the best average accuracy rate to start a retrieval process. The theory behind ATC is introduced in this paper without presenting a practical demonstration of the system.
Keywords: Web mining, Web content mining, Text mining, Multilingual web mining, Multilingual text mining, Data mining, Text classification, K-Nearest Neighbor, Naïve Bayes.
EXTENSION of the file: .pdf
Special (Invited) Session:
Organizer of the Session:
How Did you learn about congress:
IP ADDRESS: 41.254.2.229