
                          TEXT MINING

 

 

 

 

 

 

 

By


 

 

SHAH BRIJESH BHUPENDRA

16BIT128

 

 

 

 

 

 

 

 

 

 

 

 

 

DEPARTMENT OF INFORMATION TECHNOLOGY

Ahmedabad 382481

 

TEXT MINING

 

 

 

                                                     
Seminar

 

 

Submitted in fulfillment of the requirements

 

For the degree of

 

Bachelor of Technology in Information Technology

 

 

 

 

By

 

SHAH BRIJESH

16BIT128

 

 

Guided By

PREKSHA B. PAREEK

DEPARTMENT OF INFORMATION TECHNOLOGY

 

 

 

 

 

 

DEPARTMENT OF INFORMATION TECHNOLOGY

Ahmedabad 382481

 

 

 

 

 

CERTIFICATE

 

This is to certify that the project/Seminar entitled
“TEXT MINING” submitted by SHAH BRIJESH (16BIT128), towards the partial
fulfillment of the requirements for the degree of Bachelor of Technology in Information
Technology of Nirma University, is the record of work carried out by him
under my supervision and guidance. In my opinion, the submitted work has
reached the level required for being accepted for examination.

 

 

 

 

PREKSHA PAREEK                              Dr. Madhuri Bhavsar
Department of Computer Science & Engg.,     Dept. of Information Technology,
Institute of Technology,                    Institute of Technology,
Nirma University,                           Nirma University,
Ahmedabad                                   Ahmedabad

                                                                                        

                                                                                         

CONTENTS

Certificate

Acknowledgement

Abstract

Table of Contents

List of figures

List of tables

 

Chapter 1   Introduction

                   1.1     General

                   1.2

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

                   Introduction

 

Text mining is the process of converting large collections of text
documents into numerical representations that can be analyzed. It extracts
useful or meaningful information from raw text by applying algorithms that
detect patterns in the data. Text mining works on unstructured or
semi-structured data such as emails and text messages. For example, it is used
to filter spam out of email by identifying text that is common in such
messages. Once the information has been retrieved from the documents, it can be
used in data mining projects (clustering and factoring, graphics, predictive
data mining).
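
The idea of turning documents into numbers can be illustrated with a small
term-count table, where each row is a document and each column is a word. The
sketch below is only an illustration: the function name term_counts and the two
sample documents are assumptions made for this example, not part of any
particular text mining tool.

```python
from collections import Counter


def term_counts(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in documents[i]."""
    # Lowercase and split on whitespace; real systems use richer tokenizers.
    tokenized = [doc.lower().split() for doc in documents]
    # The vocabulary is every distinct word, sorted so the columns are stable.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical sample documents, invented for this illustration.
docs = ["free offer free prize", "meeting agenda for monday"]
vocabulary, matrix = term_counts(docs)
for doc, row in zip(docs, matrix):
    print(row, "<-", doc)
```

Each row of numbers can then be passed to clustering or predictive data mining
methods in place of the original text.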

 

Some common aspects of text mining include removing certain keywords
such as “THE”, punctuation marks, etc. from the important data to improve
search quality. These steps are covered under the preprocessing task.
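
A minimal sketch of this kind of cleanup is given here, assuming a very small
hand-written stop word list. The names STOP_WORDS and preprocess are chosen for
this example only, and practical systems usually rely on much larger lists.

```python
import string

# Illustrative stop word list (an assumption for this sketch, not a standard list).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def preprocess(text):
    """Strip punctuation, lowercase the text and drop stop words."""
    # str.translate with a deletion table removes every ASCII punctuation mark.
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    return [word for word in no_punct.lower().split() if word not in STOP_WORDS]


tokens = preprocess("The quality of the search, in general, is improved!")
print(tokens)
```

Only the content-bearing words remain after this step, which is what later
stages such as searching or classification work with.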

                    

 

 

 

 

 

 

PREPROCESSING TASK

 

 

 

 

 

Application of Text Mining

 

 

The main objective of text mining is to save time by filtering
unnecessary data out and keeping the main keywords or important data. It
helps provide better services to users by giving proper feedback. Businesses
use it to analyze their consumer base and to provide services accordingly by
targeting potential customers.

 

 

1)   SPAMMING IDENTIFICATION

 

Because filtering based on IP address alone is not sufficient, text mining
techniques are used to detect salting. Salting means adding certain
information to a message to make it look like original or official content.
Email service providers use text mining to separate spam and promotional
messages from the important messages, saving users time and resources. The
same approach can be extended to filter messages according to the suitable age
group. It also provides protection against phishing and spamming.
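
As a rough illustration of filtering on text that is common in spam, the sketch
below scores a message by the fraction of its words that appear in a list of
suspicious terms. The word list, the 0.3 threshold and the function names are
all assumptions for this example. Real email filters are typically trained on
labelled messages rather than a fixed list.

```python
# Hypothetical list of words that often appear in spam (assumption for this sketch).
SPAM_WORDS = {"free", "winner", "prize", "click", "offer"}


def spam_score(message):
    """Return the fraction of words in the message that are on the spam list."""
    words = [word.strip(".,!?:;").lower() for word in message.split()]
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in SPAM_WORDS)
    return hits / len(words)


def is_spam(message, threshold=0.3):
    """Flag a message when its spam score reaches the (arbitrary) threshold."""
    return spam_score(message) >= threshold


print(is_spam("Click now to claim your FREE prize!"))
print(is_spam("Please find the meeting agenda attached."))
```

A fixed list like this is easy to evade, which is one reason the text points to
more capable text mining techniques for detecting salted messages.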

 

2)   SENTIMENT ANALYSIS

 

Sentiment analysis is used to identify positive, negative or neutral reviews
about a subject. Consider deciding whether to watch a TV series based on the
reviews of viewers. The text of each review is analyzed and, according to the
keywords used, the emotion of the reviewer is identified, so that the review
can be marked as a positive or negative review of the show. Sentiment analysis
also looks at individual words and phrases to identify how negative or
positive they are.

Consider this statement: “I LOVED THE NEW MOBILE. BUT IT IS VERY EXPENSIVE AND
DOES NOT HAVE GREAT BATTERY LIFE”.

According to the first sentence the customer seems impressed, but overall the
customer has a negative impression of the product.

Sentiment analysis is also used to give an indication about products. For
example, if you come across the word ROTTEN while reading reviews of a hotel,
it creates a negative impression of that hotel.
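
The mobile phone statement above can be scored with a simple keyword approach.
The sketch below assumes small hand-made lists of positive and negative words
and flips the polarity of a word when a negation appears within the two words
before it. The word lists, the window size and the function names are
assumptions for illustration. Practical sentiment analysis uses far larger
lexicons or trained models.

```python
# Small illustrative word lists (assumptions for this sketch).
POSITIVE = {"loved", "great", "good", "excellent"}
NEGATIVE = {"expensive", "rotten", "bad", "poor"}
NEGATIONS = {"not", "no", "never"}


def sentiment_score(text, window=2):
    """Sum +1/-1 per sentiment word, flipping the sign after a nearby negation."""
    words = [word.strip(".,!?\"'").lower() for word in text.split()]
    score = 0
    for i, word in enumerate(words):
        if word in POSITIVE:
            polarity = 1
        elif word in NEGATIVE:
            polarity = -1
        else:
            continue
        # A negation within the previous `window` words reverses the polarity.
        if any(prev in NEGATIONS for prev in words[max(0, i - window):i]):
            polarity = -polarity
        score += polarity
    return score


def label(score):
    """Map a numeric score to positive, negative or neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = ("I loved the new mobile. But it is very expensive "
          "and does not have great battery life.")
print(sentiment_score(review), label(sentiment_score(review)))
```

Here "loved" counts as positive, "expensive" as negative, and "great" is
reversed by the preceding "not", so the review as a whole is labelled negative,
in line with the reading given above.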

 

3)   IN BIOMEDICAL DOMAINS

 

Year by year, the amount of research in medical fields is increasing
significantly, so the need for text mining is evident. Text mining is used
for quickly sorting the necessary data out of the medical records that are
available. In fields like cancer treatment, text mining means improving the
diagnosis, treatment and prevention of cancer by mining databases.

Another important use of text mining is mining EHRs (Electronic Health
Records) to search a patient's previous records of certain diseases and
medical history.

Text mining is also used for comparing gene markers with previous records and
identifying different patterns in genes when checking for diseases.

 

4)   SOCIAL MEDIA PLATFORMS

 

Social media is used for connecting people, i.e. for interactions and
conversations. Some well-known platforms are Twitter, Facebook and Orkut.