Machine Learning
Learning and induction are two of the main branches of the AI domain. Learning is a natural way of obtaining knowledge. Machine learning is the study of computer algorithms that automatically improve their performance by analyzing data, whether organized in datasets or collected directly from the environment.
Machine learning plays an increasingly important role in computer science and computer technology. It can help us understand and improve the efficiency of human learning, as in computer-aided instruction (CAI). It can discover things or structures that are unknown to humans, as in data mining. It can also fill in skeletal or incomplete specifications about a domain. In practice, many learning algorithms have been developed and successfully applied to a growing variety of problems, such as natural language processing, handwriting and speech recognition, document classification, medical data analysis and diagnosis, knowledge discovery in databases, process control and diagnosis, and telephone fraud and network intrusion detection.
Machine learning combines many fields, including artificial intelligence, probability and statistics, and information theory. It can be considered an evolution of AI, because it blends AI heuristics with advanced statistical analysis. Machine learning attempts to let computer programs learn from the data they study, so that the programs make different decisions based on the qualities of that data, using statistics for the fundamental concepts and adding more advanced AI heuristics and algorithms to achieve their goals.
Supervised learning and unsupervised learning are the two basic tasks of machine learning.
Supervised Learning
In supervised learning, class labels and objects of each class are provided in the data set. The learning system has to discover common properties in the examples of each class to form the class description. This technique is also known as learning from examples.
The objective of supervised learning is to estimate unknown quantities based on observed samples. For example, one may have samples of educational levels from certain families of Lens. Based on those samples, one would like to estimate the educational situation of the whole city. This is a problem of regression, or numerical estimation. If the quantity to be estimated is discrete or categorical, the problem is classification. A class, together with its description, forms a classification rule that can be used to predict the class of previously unseen objects.
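As a sketch of the numerical-estimation setting described above, the following estimates a population mean and its spread from a handful of observed samples; the sample values are invented for illustration:

```python
# Minimal numerical estimation: infer a population quantity from samples.
samples = [12.0, 14.0, 9.0, 16.0, 11.0]   # hypothetical years of schooling

n = len(samples)
mean = sum(samples) / n                    # point estimate of the population mean
variance = sum((x - mean) ** 2 for x in samples) / (n - 1)  # sample variance

print(mean)      # estimated educational level
print(variance)  # spread around the estimate
```

With more samples, the estimate of the city-wide mean becomes more reliable; this is the simplest instance of estimating an unknown quantity from observations.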
Algorithms for supervised learning are useful tools in many areas of science and engineering, from estimating appropriate dosages of medicine for patients to predicting system failures. Supervised learning may be used as a final goal or as a preprocessing step for other systems. For example, classifying blocks of data as image or text might precede a document compression system that models the two categories differently.
Common supervised learning algorithms include:
- Decision trees
- Bayesian networks
- Support vector machines (SVMs)
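To make the idea of a classification rule concrete, here is a minimal sketch of the simplest decision tree, a one-level "decision stump" on a single numeric feature; the training data and the single-feature setup are invented for illustration:

```python
# Learn a classification rule from labeled examples: pick the threshold
# on one numeric feature that best separates class 1 from class 0.
def train_stump(examples):
    """Return the best threshold found on the training examples."""
    best_t, best_correct = None, -1
    for t in sorted(x for x, _ in examples):
        # candidate rule: predict class 1 when the feature value exceeds t
        correct = sum((x > t) == (y == 1) for x, y in examples)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]  # (feature value, class label)
threshold = train_stump(train)

def predict(x):
    # the learned rule, applicable to previously unseen objects
    return 1 if x > threshold else 0

print(threshold, predict(3.5))
```

A full decision tree learner such as C4.5 applies this threshold search recursively over many features; the stump shows the core step of turning labeled examples into a class description.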
Unsupervised Learning
In unsupervised learning, objects are provided without class labels, and the system has to discover the classes itself based on common properties of the objects. Hence, this technique is also known as learning from observation and discovery.
Unsupervised learning includes clustering (finding groups of similar objects) and association (finding that some values of attributes go together with certain values of others).
Frequently used unsupervised learning algorithms include:
- K-means
- Agglomerative hierarchical clustering
- Stochastic simulated annealing
- Competitive learning
- Kohonen self-organizing feature maps
- Fisher's linear discriminant
- Minimum spanning tree
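As a sketch of the first algorithm in the list, here is a minimal k-means on one-dimensional points; the data points and the choice of k = 2 are invented for illustration:

```python
import random

def kmeans(points, k, iters=20):
    """Cluster 1-D points by alternating assignment and update steps."""
    centers = random.sample(points, k)
    for _ in range(iters):
        # assignment step: attach each point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[i].append(p)
        # update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

random.seed(0)
data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.9]
centers = kmeans(data, 2)
print(centers)  # one center settles near each group of similar objects
```

No class labels are given; the two groups emerge purely from the common property (proximity) of the objects, which is exactly the learning-from-observation setting described above.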