An Improved Clustering Method For Text Documents Using Neutrosophic Logic






An Improved Clustering Method For Text Documents Using Neutrosophic Logic


Author : Nadeem Akhtar
language : en
Publisher: Infinite Study
Release Date :

Written by Nadeem Akhtar and published by Infinite Study, this book is available in PDF, TXT, EPUB, Kindle and other formats.


As a technique of Information Retrieval, clustering can be considered an unsupervised learning problem in which we impose a structure on unlabeled, unknown data.
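The idea of imposing structure on unlabeled documents can be sketched with a toy bag-of-words k-means; the corpus, similarity measure and update rule below are illustrative assumptions, not the book's neutrosophic method.

```python
# A minimal sketch of clustering unlabeled documents: bag-of-words
# vectors, cosine similarity, and a basic k-means loop. All names and
# the tiny corpus are illustrative, not taken from the book.
import math
import random
from collections import Counter

def tf_vector(doc):
    """Term-frequency vector of a document (bag of words)."""
    return Counter(doc.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def kmeans(docs, k, iters=10, seed=0):
    """Assign each document to the most similar of k centroids."""
    random.seed(seed)
    vecs = [tf_vector(d) for d in docs]
    centroids = random.sample(vecs, k)
    assign = [0] * len(vecs)
    for _ in range(iters):
        assign = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vecs]
        for c in range(k):
            members = [vecs[i] for i in range(len(vecs)) if assign[i] == c]
            if members:
                # cosine is scale-invariant, so the summed vector acts
                # as the mean direction of the cluster
                centroids[c] = sum(members, Counter())
    return assign

docs = ["neutrosophic logic clustering", "fuzzy logic clustering",
        "football match score", "football league score"]
labels = kmeans(docs, k=2)
```

With this seed the two "logic" documents and the two "football" documents land in separate clusters, which is exactly the structure the unlabeled data carried.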



Applications Of Soft Computing For The Web


Author : Rashid Ali
language : en
Publisher: Springer
Release Date : 2018-01-08

Written by Rashid Ali and published by Springer, this book is available in PDF, TXT, EPUB, Kindle and other formats; it was released on 2018-01-08 under the Computers category.


This book discusses the applications of different soft computing techniques to web-based systems and services. The respective chapters highlight recent developments in the field of soft computing applications, from web-based information retrieval to online marketing and online healthcare. In each chapter, the authors endeavor to explain the basic ideas behind the proposed applications in an accessible format for readers who may not possess a background in these fields. This carefully edited book covers a wide range of new applications of soft computing techniques in web recommender systems, online document classification, online document summarization, online document clustering, online market intelligence, web usage profiling, web data extraction, social network extraction, question answering systems, online health care, web knowledge management, multimedia information retrieval, navigation guides, user profile extraction, web-based distributed information systems, web security applications, Internet of Things applications and so on. The book is aimed at researchers and practitioners engaged in developing and applying intelligent systems principles to solve real-life problems. Further, it has been structured so that each chapter can be read independently of the others.



Senti-NSetPSO: Large-Sized Document-Level Sentiment Analysis Using Neutrosophic Set And Particle Swarm Optimization


Author : Amita Jain
language : en
Publisher: Infinite Study
Release Date :

Written by Amita Jain and published by Infinite Study, this book is available in PDF, TXT, EPUB, Kindle and other formats; it is categorized under Mathematics.


In the last decade, opinion mining has been explored using various machine learning methods. In the literature, document-level sentiment analysis has mainly dealt with short texts; large texts have not been addressed. In this paper, a hybrid framework named ‘‘Senti-NSetPSO’’ is proposed to analyse large texts. Senti-NSetPSO comprises two classifiers, binary and ternary, based on the hybridization of particle swarm optimization (PSO) with the Neutrosophic Set. The method is suitable for classifying large texts of more than 25 kB. The swarm size generated from large text provides a suitable measure for PSO convergence. The proposed approach is trained and tested on large texts collected from the Blitzer, aclIMDb, Polarity and Subjective datasets. The proposed method establishes a correlation between sentiment analysis and the Neutrosophic Set. On the Blitzer, aclIMDb and Polarity datasets, the ternary classifier achieves satisfactory accuracy, showing significant improvement over classifiers reported in the literature.
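The PSO half of the hybridization can be sketched in isolation. This is a generic global-best PSO minimizing a toy sphere function under assumed coefficients; it is not the paper's sentiment objective or its coupling with the Neutrosophic Set.

```python
# A minimal global-best PSO sketch: particles track a personal best and
# are pulled toward it and the swarm's global best. The objective here
# is a toy sphere function; coefficients are common textbook values.
import random

def pso(f, dim=2, n_particles=20, iters=100, seed=1):
    """Minimize f over [-5, 5]^dim with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # personal best positions
    gbest = min(pbest, key=f)[:]             # global best position
    w, c1, c2 = 0.7, 1.5, 1.5                # inertia, cognitive, social
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pos[i]) < f(gbest):
                    gbest = pos[i][:]
    return gbest

# Toy objective: the sphere function, minimized at the origin.
best = pso(lambda x: sum(v * v for v in x))
```

In Senti-NSetPSO the objective evaluated by each particle would come from the sentiment-classification task rather than this toy function.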



Feature Selection And Enhanced Krill Herd Algorithm For Text Document Clustering


Author : Laith Mohammad Qasim Abualigah
language : en
Publisher:
Release Date : 2019

Written by Laith Mohammad Qasim Abualigah, this book is available in PDF, TXT, EPUB, Kindle and other formats; it was released in 2019 under the Document clustering category.


This book puts forward a new method for solving the text document (TD) clustering problem, which is established in two main stages: (i) A new feature selection method based on a particle swarm optimization algorithm with a novel weighting scheme is proposed, together with a detailed dimension reduction technique, in order to obtain a new subset of more informative features in a low-dimensional space. This new subset is subsequently used to improve the performance of the text clustering (TC) algorithm and reduce its computation time. The k-means clustering algorithm is used to evaluate the effectiveness of the obtained subsets. (ii) Four krill herd algorithms (KHAs), namely the (a) basic KHA, (b) modified KHA, (c) hybrid KHA, and (d) multi-objective hybrid KHA, are proposed to solve the TC problem; each algorithm represents an incremental improvement on its predecessor. For the evaluation process, seven benchmark text datasets with different characteristics and complexities are used. Text document clustering is a new trend in text mining in which the TDs are separated into several coherent clusters, where all documents in the same cluster are similar. The findings presented here confirm that the proposed methods and algorithms delivered the best results in comparison with other, similar methods found in the literature.
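The two-stage idea above — score features, keep an informative low-dimensional subset, then hand it to a clustering algorithm — can be sketched as follows, with a simple mean TF-IDF score standing in for the book's PSO-driven weighting scheme; the corpus and names are illustrative.

```python
# A minimal sketch of stage (i): score each term, keep an informative
# subset, and re-represent documents in that reduced space. Mean TF-IDF
# stands in for the book's PSO-based weighting; corpus is illustrative.
import math
from collections import Counter

docs = [["apple", "fruit", "the"], ["banana", "fruit", "the"],
        ["car", "engine", "the"], ["bike", "engine", "the"]]

df = Counter(t for d in docs for t in set(d))   # document frequency
n = len(docs)

def score(term):
    """Mean TF-IDF of a term; a term in every document scores zero."""
    idf = math.log(n / df[term])
    return idf * sum(d.count(term) for d in docs) / n

terms = sorted(df, key=score, reverse=True)
subset = terms[:4]                               # keep the 4 best features

def project(doc):
    """Represent a document only by the selected features."""
    return [doc.count(t) for t in subset]

reduced = [project(d) for d in docs]             # low-dimensional vectors
```

The uninformative stopword "the" scores zero and is dropped, and the reduced vectors are what a k-means pass would then evaluate, mirroring the book's use of k-means to judge the obtained subsets.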



High Performance Text Document Clustering


Author : Yanjun Li
language : en
Publisher:
Release Date : 2007

Written by Yanjun Li, this book is available in PDF, TXT, EPUB, Kindle and other formats; it was released in 2007 under the Algorithms category.


Data mining, also known as knowledge discovery in databases (KDD), is the process of discovering interesting unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract interesting and nontrivial information and knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. This research focuses on improving the performance of text clustering. We investigated text clustering algorithms in four aspects: document representation, document closeness measurement, high-dimension reduction and parallelization. We propose a group of high-performance text clustering algorithms that target the unique characteristics of unstructured text databases. First, two new text clustering algorithms are proposed. Unlike the vector space model, which treats a document as a bag of words, we use a document representation that keeps the sequential relationship between words in the documents. In these two algorithms, the dimension of the database is reduced by considering frequent word (meaning) sequences, and the closeness of two documents is measured based on the sharing of frequent word (meaning) sequences. Second, a text clustering algorithm with feature selection is proposed. This algorithm gradually reduces the high dimension of the database by performing feature selection during clustering. The new feature selection method applied is based on the well-known chi-square statistic and a new statistic that can measure positive and negative term-category dependence. Third, a group of new text clustering algorithms is developed based on the k-means algorithm. Instead of using the cosine function, a new function involving global information is proposed to measure the closeness between two documents. This new function utilizes the neighbor matrix introduced in [Guha:2000]. A new method for selecting initial centroids and a new heuristic function for selecting a cluster to split are adopted in the proposed algorithms. Last, a new parallel algorithm for bisecting k-means is proposed for message-passing multiprocessor systems. This new algorithm, named PBKP, fully utilizes the data-parallelism of the bisecting k-means algorithm and adopts a prediction step to balance the workloads of multiple processors to achieve a high speedup. Comprehensive performance studies were conducted on all the proposed algorithms. To evaluate their performance, we compared them with existing text clustering algorithms, such as k-means, bisecting k-means [Steinbach:2000] and FIHC [Fung:2003]. The experimental results show that our clustering algorithms are scalable and have much better clustering accuracy than existing algorithms. For the parallel PBKP algorithm, we tested it on a 9-node Linux cluster system and analyzed its performance. The experimental results suggest that the speedup of PBKP is linear in the number of processors and data points. Moreover, PBKP scales up better than parallel k-means with respect to the desired number of clusters.
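The bisecting k-means scheme discussed above can be sketched with plain 2-D points in place of document vectors; the farthest-pair seeding and largest-cluster split rule below are simplifying assumptions, not the thesis's initial-centroid method or its split heuristic.

```python
# A minimal bisecting k-means sketch: start with one cluster and
# repeatedly split the largest cluster with a 2-means pass until k
# clusters exist. Points stand in for document vectors.
import itertools

def dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def mean(pts):
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def two_means(points, iters=10):
    """Bisect one cluster with 2-means, seeded by its farthest pair."""
    a, b = max(itertools.combinations(points, 2), key=lambda pq: dist(*pq))
    for _ in range(iters):
        left = [p for p in points if dist(p, a) <= dist(p, b)]
        right = [p for p in points if dist(p, a) > dist(p, b)]
        if left and right:
            a, b = mean(left), mean(right)
    return left, right

def bisecting_kmeans(points, k):
    """Grow from one cluster to k by repeatedly splitting the largest."""
    clusters = [list(points)]
    while len(clusters) < k:
        big = max(clusters, key=len)      # cluster chosen for splitting
        clusters.remove(big)
        clusters += list(two_means(big))
    return clusters

pts = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 1)]
clusters = bisecting_kmeans(pts, k=3)
```

PBKP parallelizes exactly this structure: each 2-means pass is data-parallel over the points, and the prediction step balances which processor handles which split.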



Successful Culturing Of Glover's Cancer Organism And Development Of Metastasizing Tumors In Animals Produced By Cultures From Human Malignancy


Author :
language : en
Publisher:
Release Date : 1953

Published in 1953, this book is available in PDF, TXT, EPUB, Kindle and other formats.




A Novel Framework Using Neutrosophy For Integrated Speech And Text Sentiment Analysis


Author : Kritika Mishra
language : en
Publisher: Infinite Study
Release Date : 2020-10-18

Written by Kritika Mishra and published by Infinite Study, this book is available in PDF, TXT, EPUB, Kindle and other formats; it was released on 2020-10-18 under the Computers category.


We propose a novel framework that performs sentiment analysis on audio files by calculating their Single-Valued Neutrosophic Sets (SVNS) and clustering them into positive, neutral and negative classes, and then combines these results with those obtained by performing sentiment analysis on the corresponding text files.
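The SVNS representation can be sketched as a (truth, indeterminacy, falsity) triple per item; the dominant-component labeling and the weighted-average fusion below are illustrative assumptions, not the framework's actual clustering or combination step.

```python
# A minimal sketch of the single-valued neutrosophic set (SVNS) idea:
# each item carries truth, indeterminacy and falsity degrees in [0, 1].
# The labeling rule and fusion scheme are illustrative assumptions.

def svns_label(t, i, f):
    """Map an SVNS triple to positive / neutral / negative by its
    dominant component (truth -> positive, falsity -> negative)."""
    dominant = max((t, "positive"), (i, "neutral"), (f, "negative"))
    return dominant[1]

def fuse(audio_triple, text_triple, w=0.5):
    """Combine audio and text SVNS triples by a weighted average."""
    return tuple(w * a + (1 - w) * b for a, b in zip(audio_triple, text_triple))

audio = (0.7, 0.2, 0.1)   # mostly positive speech signal (illustrative)
text = (0.4, 0.1, 0.5)    # slightly negative transcript (illustrative)
combined = fuse(audio, text)
label = svns_label(*combined)
```

Here the confident audio triple outweighs the mildly negative text triple, so the fused item is labeled positive; the framework's own combination of the two modalities may differ.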



Semantically Enhanced Document Clustering


Author : Ivan Stankov
language : en
Publisher:
Release Date : 2013

Written by Ivan Stankov, this book is available in PDF, TXT, EPUB, Kindle and other formats; it was released in 2013.


This thesis advocates the view that traditional document clustering can be significantly improved by representing documents at the different levels of abstraction at which the similarity between documents is considered. The improvement is with regard to the alignment of the clustering solutions with human judgement. The proposed methodology employs semantics with which the conceptual similarity between documents is measured. The goal is to design algorithms which implement the methodology, in order to solve the following research problems: (i) how to obtain multiple deterministic clustering solutions; (ii) how to produce coherent large-scale clustering solutions across domains, regardless of the number of clusters; (iii) how to obtain clustering solutions which align well with human judgement; and (iv) how to produce specific clustering solutions from the perspective of the user's understanding of the domain of interest. The developed clustering methodology enhances separation between, and improves coherence within, clusters generated across several domains by using levels of abstraction. The methodology employs a semantically enhanced text stemmer, developed for the purpose of producing coherent clustering, and a concept index that provides generic document representation and reduced dimensionality of document representation. These characteristics of the methodology make it possible to address the limitations of traditional text document clustering by employing computationally expensive similarity measures such as Earth Mover's Distance (EMD), which theoretically aligns the clustering solutions closer to human judgement. A threshold for similarity between documents that employs many-to-many similarity matching is proposed and experimentally proven to help traditional clustering algorithms produce clustering solutions aligned closer to human judgement. 
The experimental validation demonstrates the scalability of the semantically enhanced document clustering methodology and supports the contributions: (i) multiple deterministic clustering solutions and different viewpoints on a document collection are obtained; (ii) the use of concept indexing as a document representation technique in the domain of document clustering is beneficial for producing coherent clusters across domains; (iii) the SETS algorithm provides improved text normalisation by using external knowledge; (iv) a method for measuring similarity between documents on a large scale by using many-to-many matching; (v) a semantically enhanced methodology that employs levels of abstraction corresponding to a user's background, understanding and motivation. The achieved results will benefit the research community working in the areas of document management, information retrieval, data mining and knowledge management.
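The computational core of EMD is easiest to see in one dimension: for two equal-mass histograms over a shared ordered bin set, EMD reduces to the summed absolute difference of the cumulative sums. The concept-histogram framing below is an illustrative assumption, not the thesis's document representation.

```python
# A minimal 1-D Earth Mover's Distance sketch between two equal-mass
# histograms over the same ordered bins with unit bin spacing.

def emd_1d(p, q):
    """EMD between two equal-mass 1-D histograms."""
    assert abs(sum(p) - sum(q)) < 1e-9, "histograms must have equal mass"
    carry, total = 0.0, 0.0
    for pi, qi in zip(p, q):
        carry += pi - qi        # mass that must still be moved onward
        total += abs(carry)     # moving it one bin costs its magnitude
    return total

# Two documents as distributions over 4 shared concepts (illustrative).
doc_a = [0.5, 0.5, 0.0, 0.0]
doc_b = [0.0, 0.0, 0.5, 0.5]
d = emd_1d(doc_a, doc_b)        # each half-unit of mass moves 2 bins
```

Unlike a bin-by-bin distance, EMD accounts for how far mass must travel between concepts, which is why it is both more human-aligned and more expensive, motivating the thesis's many-to-many similarity threshold.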



Text Classification Aided By Clustering A Literature Review


Author : Antonia Kyriakopoulou
language : en
Publisher:
Release Date : 2008

Written by Antonia Kyriakopoulou, this book is available in PDF, TXT, EPUB, Kindle and other formats; it was released in 2008.


We presented several clustering methods for dimensionality reduction to improve text classification. Experiments show that one-way clustering is more effective than feature selection, especially at lower numbers of features. Also, when dimensionality is reduced by as much as two orders of magnitude, the resulting classification accuracy is similar to that of a full-feature classifier. In some cases of small training sets and noisy features, feature clustering can actually increase classification accuracy. In the case of IB, various heuristics can be applied in order to obtain finer clusters, such as greedy agglomerative hard clustering (Slonim & Tishby, 1999) or a sequential k-means-like algorithm (Slonim et al., 2002). Co-clustering methods are superior to one-way clustering methods, as shown through corresponding experiments (Takamura, 2003). Benefits of using one-way clustering and co-clustering as a feature compression and/or extraction method include useful semantic feature clusters, higher classification accuracy (via noise reduction), and smaller classification models. The latter two benefits are shared with feature selection, and thus clustering can be seen as an alternative or a complement to feature selection, although it does not actually remove any features. Clustering is better at reducing the number of redundant features, whereas feature selection is better at removing detrimental, noisy features. The reduced dimensionality allows the use of more complex algorithms and reduces the computational burden. However, it is necessary to experimentally evaluate the trade-off between soft and hard clustering. While soft clustering increases the classification model size, it is not clear how it affects classification accuracy. Other directions for exploration include feature weighting and the combination of feature selection and clustering strategies. Four cases of semi-supervised classification using clustering have been considered in the area. 
In the first case, in the absence of a labelled set, clustering is used to create one by selecting unlabelled data from a pool of available unlabelled data. In the second case, it is used to augment an existing labelled set with new documents from the unlabelled data. In the third case, the dataset is augmented with new features derived from clustering labelled and unlabelled data. In the last case, clustering is used under a co-training framework. The algorithms presented demonstrate effective use of unlabelled data and significant improvements in classification performance, especially when the size of the labelled set is small. In most experiments, the unlabelled data come from the same information source as the training and testing sets. Since the feature distribution of the unlabelled data is crucial to the success of the method, an area of future research is the effect of the source and nature of the information in the unlabelled dataset on clustering. Lastly, clustering reduces the training time of the SVM (i) by modifying the SVM algorithm so that it can be applied to large data sets, and (ii) by finding and using for training only the most qualified training examples of a large data set while disqualifying unimportant ones. A clustering algorithm and a classifier cooperate, acting interchangeably and complementarily. In the first case, many algorithms have been proposed (sequential minimal optimisation, projected conjugate gradient and neural networks, amongst others) to simplify the training process of the SVM, usually by breaking the problem down into smaller sub-problems that are easier to solve. In the second case, the training set is clustered in order to select the most representative examples to train a classifier instead of using the whole training set. The clustering results are used differently by the various approaches, i.e. the selection of the representative training examples follows different methods. 
Some of the proposed algorithms manage to decrease the number of training examples without compromising classification accuracy.
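Feature selection, against which the clustering methods above are compared, is often driven by the chi-square term-category statistic; it can be sketched directly from the 2x2 contingency table of term presence versus category membership. The toy corpus and labels are illustrative, not from the review.

```python
# A minimal sketch of the chi-square term-category statistic used for
# feature selection in text classification. Higher values mean the
# term's presence depends more strongly on the category.

def chi_square(docs, labels, term, category):
    """Chi-square statistic of the 2x2 term/category contingency table."""
    a = sum(1 for d, l in zip(docs, labels) if term in d and l == category)
    b = sum(1 for d, l in zip(docs, labels) if term in d and l != category)
    c = sum(1 for d, l in zip(docs, labels) if term not in d and l == category)
    d = sum(1 for d, l in zip(docs, labels) if term not in d and l != category)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

docs = [{"goal", "team"}, {"goal", "win"}, {"vote", "law"}, {"vote", "poll"}]
labels = ["sport", "sport", "politics", "politics"]

strong = chi_square(docs, labels, "goal", "sport")  # perfectly predictive
weak = chi_square(docs, labels, "win", "sport")     # appears in one doc only
```

A feature-selection pass keeps the highest-scoring terms, whereas the clustering approaches surveyed above compress the redundant low-scoring ones instead of discarding them.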



Cognitive Intelligence With Neutrosophic Statistics In Bioinformatics


Author : Florentin Smarandache
language : en
Publisher: Elsevier
Release Date : 2023-02-11

Written by Florentin Smarandache and published by Elsevier, this book is available in PDF, TXT, EPUB, Kindle and other formats; it was released on 2023-02-11 under the Computers category.


Cognitive Intelligence with Neutrosophic Statistics in Bioinformatics investigates and presents the many applications that have arisen in the last ten years using neutrosophic statistics in bioinformatics, medicine, agriculture and cognitive science. This book will be very useful to the scientific community, appealing to audiences interested in fuzzy, vague concepts from which uncertain data are collected, including academic researchers, practicing engineers and graduate students. Neutrosophic statistics is a generalization of classical statistics. In classical statistics, the data are known and formed by crisp numbers. In comparison, data in neutrosophic statistics have some indeterminacy: the data may be ambiguous, vague, imprecise, incomplete, or even unknown. Neutrosophic statistics refers to a set of data such that the data, or a part of it, are indeterminate to some degree, and to the methods used to analyze such data. The book introduces the field of neutrosophic statistics and how it can solve problems involving indeterminate (imprecise, ambiguous, vague, incomplete, unknown) data; presents various applications of neutrosophic statistics in the fields of bioinformatics, medicine, cognitive science and agriculture; and provides practical examples and definitions of neutrosophic statistics in relation to the various types of indeterminacies.
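A minimal way to see the difference from classical statistics: if an observation is allowed to carry indeterminacy, it can be recorded as an interval rather than a crisp number, and summary statistics become intervals too. The interval encoding and the sample values below are illustrative assumptions, not the book's notation.

```python
# A minimal sketch of indeterminate data: each observation is an
# interval [low, high]; a crisp value is the degenerate case low == high.
# The mean of such a sample is itself an interval.

def neutrosophic_mean(sample):
    """Mean of interval-valued data: the interval [mean_low, mean_high]."""
    lows = [lo for lo, hi in sample]
    highs = [hi for lo, hi in sample]
    return (sum(lows) / len(sample), sum(highs) / len(sample))

# Three measurements; the second is partly indeterminate (a range).
sample = [(4.0, 4.0), (5.0, 7.0), (6.0, 6.0)]
mean_interval = neutrosophic_mean(sample)
```

With crisp data every interval collapses to a point and the result reduces to the classical mean, which is the sense in which neutrosophic statistics generalizes classical statistics.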