Uploaded on Jul 23, 2018
Read the difference between data mining and web mining.
DIFFERENCE BETWEEN DATA MINING AND WEB MINING
What is Data Mining?
Data mining, also called Knowledge Discovery in Databases (KDD),
is the process of discovering useful patterns or knowledge from
data sources such as databases, texts, images, audio, video and
the web. The patterns must be valid, potentially useful, and
understandable. Data mining is a multi-disciplinary field
involving machine learning, statistics, databases, artificial
intelligence, information retrieval, and visualization.
What is Web Mining?
Web mining is the application of data mining techniques to discover
patterns from the web. It extracts, filters and evaluates information for
knowledge discovery from web data and the related web services.
It can be divided into three major categories:
1- Web Content Mining (WCM) aims to extract useful information or
knowledge from web content such as text, images, audio and video records.
2- Web Structure Mining (WSM) tries to discover useful knowledge from the
structure of hyperlinks and tags.
3- Web Usage Mining (WUM) refers to the discovery of usage patterns from
user usage logs, HTTP logs, application server logs, etc.
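As a small illustration of Web Usage Mining, the sketch below counts successful requests per page from web server log lines. It assumes the logs follow the Common Log Format; the names `LOG_PATTERN`, `parse_line`, `page_hits` and the sample lines are made up for this example and are not from the original slides.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def page_hits(lines):
    """Count successful (2xx) requests per requested path."""
    hits = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["status"].startswith("2"):
            hits[record["path"]] += 1
    return hits


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [23/Jul/2018:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [23/Jul/2018:10:00:05 +0000] "GET /products HTTP/1.1" 200 2310',
    '10.0.0.1 - - [23/Jul/2018:10:00:09 +0000] "GET /products HTTP/1.1" 200 2310',
    '10.0.0.3 - - [23/Jul/2018:10:00:12 +0000] "GET /missing HTTP/1.1" 404 209',
]

if __name__ == "__main__":
    for path, count in page_hits(SAMPLE_LOG).most_common():
        print(path, count)
```

Real usage mining would normally go further, for example by grouping requests into user sessions, but per-page counts are a common first step.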
DIFFERENCES
| Comparison | Web Mining | Data Mining |
| --- | --- | --- |
| Definition | Process used to extract information from web documents. | Process used to extract hidden information from the database. |
| Scale | It contains 10 million jobs in the server database, and therefore search processing is not big. | It contains 1 million jobs in the database, and search processing is large. |
| Who does this? | Data scientists, data engineers | Data scientists/data analysts, data engineers |
| Structure | The information is obtained from structured, semi-structured and unstructured web forms. It gets the information from a wide database. | It obtains the information from an explicit structure. It is not able to get all the information from a wide database, as compared to web mining. |
| Comparison | Web Mining | Data Mining |
| --- | --- | --- |
| Concept | Pattern identification from web data. | Pattern identification from data available in any system. |
| Process | The same process as data mining, applied on the web using web documents. | Data extraction -> pattern discovery -> develop the feature/solve it (algorithm). |
| Access | Data is accessed publicly. The data is not hidden in the web database; only permission is required to access the data from the web log master. | Data is accessed privately, and only authorized users can access it. |
| Data | It works on online data. | It works on offline data. |
| Data Storage | Data is stored in server logs and web server databases. | Data is stored in data warehouses. |
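The "data extraction -> pattern discovery" process from the comparison can be sketched on a toy example. The code below is only an illustration, not a method taken from the slides: it assumes the extracted data is a list of shopping baskets and treats "pattern discovery" as finding item pairs that occur together often. The names `frequent_pairs`, `TRANSACTIONS` and `PATTERNS` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sort and de-duplicate so (a, b) and (b, a) are counted as one pair.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Step 1 (data extraction): hypothetical records already pulled from a store.
TRANSACTIONS = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk"],
]

# Step 2 (pattern discovery): keep pairs seen in at least two transactions.
PATTERNS = frequent_pairs(TRANSACTIONS, min_support=2)

if __name__ == "__main__":
    for pair, count in sorted(PATTERNS.items()):
        print(pair, count)
```

The same two steps apply to web mining; only the extraction step changes, since the records would come from web documents or logs instead of a database.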
| Comparison | Web Mining | Data Mining |
| --- | --- | --- |
| Techniques | Web Content Mining, Graph Based Web Mining, Utilization in Web Mining, Text Mining and many others. | Artificial Neural Network, Decision Trees, Rule Induction, Nearest Neighbor Method and many others. |
| Challenges | Complexity of web pages, the web is too huge, relevancy of information, the web is a dynamic information source, diversity of user communities, etc. | Network settings, data quality, privacy preservation, scalability, complex and heterogeneous data, etc. |
| Tools | Scrapy, PageRank, Apache logs | Machine learning algorithms |
| How significant? | Web-related data pulls would influence the existing data mining process. | Many organizations rely on data science results for decision making. |
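PageRank, listed among the web mining tools, is a typical Web Structure Mining algorithm: it scores pages using only the hyperlink graph. The following is a minimal sketch using plain power iteration with a damping factor of 0.85. The function name `pagerank`, the fixed iteration count, and the three-page link graph are assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives the same small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page site, for illustration only.
LINKS = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "about"],
}
RANKS = pagerank(LINKS)

if __name__ == "__main__":
    for page, score in sorted(RANKS.items(), key=lambda item: -item[1]):
        print(f"{page}: {score:.3f}")
```

In this toy graph "home" should rank highest because both other pages link to it. On real web graphs a library implementation with a convergence check would be the more sensible choice.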
APPLICATION AREAS
Data Mining
| Industry | Application |
| --- | --- |
| Finance | Credit card analysis |
| Insurance | Claims, fraud analysis |
| Telecommunication | Call record analysis |
| Transport | Logistics management |
| Consumer goods | Promotion analysis |
| Data service providers | Value-added data |
| Utilities | Power usage analysis |
Web Mining
The most dominant application area for web mining is Internet-based
e-commerce (business-to-consumer) and Web-based customer relationship
management (CRM), an integral part of e-business today.
The business benefits that web mining affords to digital service providers
include personalization, collaborative filtering, enhanced customer
support, product and service strategy definition, particle marketing and
fraud detection.
In health care, it is used to discover knowledge for understanding the
cause of a disease and its treatment.
It is also used to track errors made by hospital staff, so that they can
correct them and avoid repeating them in future, and to identify patterns
that help set policies for health care centers and hospitals.
THANK YOU!!!