
 
Machine Learning  
Learning and induction are two main branches of the AI field. Learning is a natural way of obtaining knowledge. Machine learning is the study of computer algorithms that automatically improve their performance by analyzing data, whether organized in datasets or collected directly from the environment.

Machine learning plays an increasingly important role in computer science and computer technology. It can help us understand and improve the efficiency of human learning, as in computer-aided instruction (CAI). It can discover things or structures that are unknown to humans, as in data mining. It can also fill in skeletal or incomplete specifications about a domain. In practice, many learning algorithms have been developed and successfully applied to a growing variety of problems, such as natural language processing, handwriting and speech recognition, document classification, medical data analysis and diagnosis, knowledge discovery in databases, process control and diagnosis, and telephone-fraud and network-intrusion detection.

Machine learning combines several fields, including artificial intelligence, probability and statistics, and information theory. It can be considered an evolution of AI, because it blends AI heuristics with advanced statistical analysis. Machine learning attempts to let computer programs learn from the data they study, so that the programs make different decisions based on the qualities of that data, using statistics for the fundamental concepts and adding more advanced AI heuristics and algorithms to achieve its goals.

Supervised learning and unsupervised learning are two basic tasks of machine learning.

 
   
Supervised Learning  

In supervised learning, the data set provides objects together with their class labels. The learning system must discover the common properties of the examples in each class and use them as the class description. This technique is also known as learning from examples.

The objective of supervised learning is to estimate unknown quantities based on observed samples. For example, one may have samples of educational levels from certain families in Lens. Based on those samples, one would like to estimate the educational situation of the whole city. This is a problem of regression, or numerical estimation. If the quantity to be estimated is discrete or categorical, the problem is classification. A class, together with its description, forms a classification rule that can be used to predict the class of previously unseen objects.
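The idea of learning a classification rule from labelled examples can be sketched with a minimal 1-nearest-neighbour classifier; the feature vectors and class labels below are invented purely for illustration:

```python
# Minimal sketch of supervised classification: a 1-nearest-neighbour
# classifier "learns" by storing labelled examples, then predicts the
# class of an unseen object from its closest training example.
# The toy data here are invented for illustration only.

def predict(examples, query):
    """Return the label of the training example closest to `query`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(examples, key=lambda e: dist(e[0], query))
    return label

# (features, class label) pairs -- the "learning from examples" step
# is simply storing them.
training = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
            ((5.0, 5.0), "B"), ((4.8, 5.2), "B")]

print(predict(training, (1.1, 0.9)))  # prints "A"
print(predict(training, (5.1, 4.9)))  # prints "B"
```

The stored examples plus the distance function play the role of the class description: they implicitly define a rule for assigning previously unseen objects to a class.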

Algorithms for supervised learning are useful tools in many areas of science and engineering, from estimating appropriate dosages of medicine for patients to predicting system failures. Supervised learning may be used as a final goal or as a preprocessing step for other systems. For example, classifying blocks of data as image or text might precede a document compression system that models the two categories differently.

Frequently used supervised learning algorithms include:

  • Decision tree
  • Bayesian network
  • Support vector machine (SVM)
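As a sketch of the decision-tree idea, the simplest possible tree is a one-level "stump" that chooses a threshold on a single numeric feature so as to minimise misclassifications; full learners such as C4.5 recurse over many features, and the data and two-class labels below are invented assumptions:

```python
# A one-level decision tree ("decision stump") for two classes "A"/"B"
# on one numeric feature: try every candidate threshold and both label
# orderings, and keep the split with the fewest training errors.

def fit_stump(xs, ys):
    """Return (threshold, left_label, right_label) with fewest errors."""
    best = None
    for t in sorted(set(xs)):
        for left, right in (("A", "B"), ("B", "A")):
            errors = sum(1 for x, y in zip(xs, ys)
                         if (left if x < t else right) != y)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    return best[1:]

xs = [1.0, 1.5, 2.0, 6.0, 6.5, 7.0]
ys = ["A", "A", "A", "B", "B", "B"]
print(fit_stump(xs, ys))  # prints (6.0, 'A', 'B')
```

A real decision-tree learner applies this kind of split selection recursively, choosing among all features at each node and pruning the result.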

 
   
Unsupervised Learning  
In unsupervised learning, objects are provided without class labels, and the system must discover the classes itself based on the common properties of the objects. Hence, this technique is also known as learning from observation and discovery.

Unsupervised learning includes clustering (finding groups of similar objects) and association (finding that some values of attributes tend to occur together with others).
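The association side can be sketched by counting which attribute values co-occur across transactions, the core idea behind association-rule mining; the shopping-basket transactions below are an invented example:

```python
# Sketch of association discovery: count co-occurring item pairs over
# a set of transactions and keep the pairs that appear often enough.
# The transactions are invented for illustration.

from itertools import combinations
from collections import Counter

transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk", "butter"}]

pair_counts = Counter()
for t in transactions:
    for pair in combinations(sorted(t), 2):
        pair_counts[pair] += 1

# Pairs occurring in at least half the transactions count as "frequent".
frequent = {p for p, c in pair_counts.items() if c >= len(transactions) / 2}
print(frequent)
```

Each frequent pair is a candidate association: observing one item's value makes the other more likely, which is exactly the "some values go with others" pattern described above.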

Frequently used unsupervised learning algorithms include:

  • K-means
  • Agglomerative hierarchical clustering algorithm
  • Stochastic simulated annealing
  • Competitive learning
  • Kohonen self-organizing feature maps
  • Fisher's linear discriminant
  • Minimum spanning tree
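The K-means algorithm from the list above can be sketched in one dimension: alternate between assigning each point to its nearest centre and recomputing each centre as the mean of its cluster. The points, starting centres, and fixed iteration count are arbitrary illustrative choices:

```python
# Minimal 1-D K-means sketch (K = 2): repeat an assignment step and an
# update step for a fixed number of iterations. Toy data only.

def kmeans_1d(points, centres, iterations=10):
    for _ in range(iterations):
        clusters = {c: [] for c in centres}
        for p in points:                               # assignment step
            nearest = min(centres, key=lambda c: abs(p - c))
            clusters[nearest].append(p)
        centres = [sum(ps) / len(ps) if ps else c      # update step
                   for c, ps in clusters.items()]
    return sorted(centres)

points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
print(kmeans_1d(points, [0.0, 10.0]))  # two centres near 1.0 and 8.0
```

Production implementations instead iterate until the centres stop moving and restart from several random initializations, since K-means only finds a local optimum.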
 
   

Copyright © 2003 Huaiguo Fu