Machine Learning
Learning and induction are two of the main branches of the AI domain. Learning is a natural way of obtaining knowledge. Machine learning is the study of computer algorithms that automatically improve their performance by analyzing data, whether organized in datasets or collected directly from the environment.
Machine learning plays an increasingly important role in computer science and computer technology. It can help us understand and improve the efficiency of human learning, as in computer-aided instruction (CAI). It can discover things or structures that are unknown to humans, as in data mining. It can also fill in skeletal or incomplete specifications about a domain. In practice, many learning algorithms have been developed and successfully applied to a growing variety of problems, such as natural language processing, handwriting and speech recognition, document classification, medical data analysis and diagnosis, knowledge discovery in databases, process control and diagnosis, and telephone fraud and network intrusion detection.
Machine learning combines many fields, including artificial intelligence, probability and statistics, and information theory. It can be considered an evolution of AI, because it blends AI heuristics with advanced statistical analysis. Machine learning attempts to let computer programs learn from the data they study, so that the programs make different decisions based on the qualities of that data, using statistics for the fundamental concepts and adding more advanced AI heuristics and algorithms to achieve their goals.
Supervised learning and unsupervised learning are the two basic tasks of machine learning.
Supervised Learning
In supervised learning, class labels and objects of each class are provided in the data set. The learning system has to discover common properties in the examples of each class to form the class description. This technique is also known as learning from examples.
The objective of supervised learning is to estimate unknown quantities based on observed samples. For example, one may have samples of educational levels from certain families of Lens. Based on those samples, one would like to estimate the educational situation of the whole city. This is a problem of regression, or numerical estimation. If the quantity to be estimated is discrete or categorical, the problem is classification. A class, together with its description, forms a classification rule that can be used to predict the class of previously unseen objects.
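As a sketch of the numerical-estimation setting described above, the following estimates a population mean and its spread from a handful of observed samples; the sample values are invented for illustration:

```python
# Minimal numerical estimation: infer a population quantity from samples.
samples = [12.0, 14.0, 9.0, 16.0, 11.0]   # hypothetical years of schooling

n = len(samples)
mean = sum(samples) / n                    # point estimate of the population mean
variance = sum((x - mean) ** 2 for x in samples) / (n - 1)  # sample variance

print(mean)      # estimated educational level
print(variance)  # spread around the estimate
```

With more samples, the estimate of the city-wide mean becomes more reliable; this is the simplest instance of estimating an unknown quantity from observations.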
Algorithms for supervised learning are useful tools in many areas of science and engineering, from estimating appropriate dosages of medicine for patients to predicting system failures. Supervised learning may be used as a final goal or as a preprocessing step for other systems. For example, classifying blocks of data as image or text might precede a document compression system that models the two categories differently.
Common supervised learning algorithms include:
- Decision trees
- Bayesian networks
- Support vector machines (SVMs)
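To make the idea of a classification rule concrete, here is a minimal sketch of the simplest decision tree, a one-level "decision stump" on a single numeric feature; the training data and the single-feature setup are invented for illustration:

```python
# Learn a classification rule from labeled examples: pick the threshold
# on one numeric feature that best separates class 1 from class 0.
def train_stump(examples):
    """Return the best threshold found on the training examples."""
    best_t, best_correct = None, -1
    for t in sorted(x for x, _ in examples):
        # candidate rule: predict class 1 when the feature value exceeds t
        correct = sum((x > t) == (y == 1) for x, y in examples)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]  # (feature value, class label)
threshold = train_stump(train)

def predict(x):
    # the learned rule, applicable to previously unseen objects
    return 1 if x > threshold else 0

print(threshold, predict(3.5))
```

A full decision tree learner such as C4.5 applies this threshold search recursively over many features; the stump shows the core step of turning labeled examples into a class description.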
Unsupervised Learning
In unsupervised learning, objects are provided without class labels, and the system has to discover the classes itself based on common properties of the objects. Hence, this technique is also known as learning from observation and discovery.
Unsupervised learning includes clustering (finding groups of similar objects) and association (finding that some values of attributes go together with certain values of others).
Frequently used unsupervised learning algorithms include:
- K-means
- Agglomerative hierarchical clustering
- Stochastic simulated annealing
- Competitive learning
- Kohonen self-organizing feature maps
- Fisher's linear discriminant
- Minimum spanning tree
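As a sketch of the first algorithm in the list, here is a minimal k-means on one-dimensional points; the data points and the choice of k = 2 are invented for illustration:

```python
import random

def kmeans(points, k, iters=20):
    """Cluster 1-D points by alternating assignment and update steps."""
    centers = random.sample(points, k)
    for _ in range(iters):
        # assignment step: attach each point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[i].append(p)
        # update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

random.seed(0)
data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.9]
centers = kmeans(data, 2)
print(centers)  # one center settles near each group of similar objects
```

No class labels are given; the two groups emerge purely from the common property (proximity) of the objects, which is exactly the learning-from-observation setting described above.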