A NEW Cluster-histo-regression ANALYSIS FOR INCREMENTAL LEARNING FROM TEMPORAL DATA CHUNKS

Nagabhushan P.1, Syed Zakir Ali2, Pradeep Kumar R.3
1Department of Studies in Computer Science, University of Mysore, Mysore, India
2Department of Studies in Computer Science, University of Mysore, Mysore, India
3Amphisoft Technologies Private Limited, Coimbatore, India

Received : -     Accepted : -     Published : 15-06-2010
Volume : 2     Issue : 1       Pages : 53 - 57
Int J Mach Intell 2.1 (2010):53-57
DOI : http://dx.doi.org/10.9735/0975-2927.2.1.53-57

Keywords : Zero instance memory learning, Partial instance memory learning, Knowledge generation, Cluster analysis, Regression analysis, Incremental learning, Incremental augmentation of knowledge
Conflict of Interest : None declared

Cite - MLA : Nagabhushan P., et al "A NEW Cluster-histo-regression ANALYSIS FOR INCREMENTAL LEARNING FROM TEMPORAL DATA CHUNKS." International Journal of Machine Intelligence 2.1 (2010):53-57. http://dx.doi.org/10.9735/0975-2927.2.1.53-57

Cite - APA : Nagabhushan P., Syed Zakir Ali, Pradeep Kumar R. (2010). A NEW Cluster-histo-regression ANALYSIS FOR INCREMENTAL LEARNING FROM TEMPORAL DATA CHUNKS. International Journal of Machine Intelligence, 2 (1), 53-57. http://dx.doi.org/10.9735/0975-2927.2.1.53-57

Cite - Chicago : Nagabhushan P., Syed Zakir Ali, and Pradeep Kumar R. "A NEW Cluster-histo-regression ANALYSIS FOR INCREMENTAL LEARNING FROM TEMPORAL DATA CHUNKS." International Journal of Machine Intelligence 2, no. 1 (2010):53-57. http://dx.doi.org/10.9735/0975-2927.2.1.53-57

Copyright : © 2010, Nagabhushan P., et al, Published by Bioinfo Publications. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

Abstract

When data chunks arrive over time, a good algorithm for exploratory analysis should generate knowledge from the current chunk and, when the next chunk arrives, simply update that knowledge online by accumulating what is derived from the new chunk. In most cases such an incremental learning process demands a lot of memory, because all earlier data must be carried along while the knowledge is successively updated. In this research work we propose a novel Cluster-Histo-Regression analysis of each chunk to extract the knowledge for that temporal instant, and we fuse this knowledge with the already accumulated knowledge through Histo-Regression-Distance analysis. We have designed a methodology which (i) discards all data samples in the chunk that have participated in the knowledge generation process, (ii) requires a minimum amount of memory to carry the accumulated knowledge, and (iii) carries forward only those limited data samples (referred to as hard samples) which could not contribute to the knowledge generated at that moment. The knowledge of each cluster is represented as a histogram for each dimension of the clustered data, and each histogram is transformed into a regression line for compact representation. The regression line parameters of the clusters obtained by incremental augmentation have shown an accuracy of up to 100% for some of the data sets considered for experimentation.
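
The compact representation described above (one histogram per dimension of a cluster, reduced to a regression line) can be illustrated with a minimal sketch. The abstract does not state how a histogram is turned into a line, so this sketch assumes an ordinary least-squares fit to the normalized cumulative histogram; the function names, the bin count, and the handling of constant dimensions are all hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def histogram(values: Sequence[float], bins: int, lo: float, hi: float) -> List[int]:
    """Count how many values fall into each of `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp v == hi into the last bin
        counts[idx] += 1
    return counts


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cluster_knowledge(cluster: Sequence[Sequence[float]], bins: int = 4) -> List[Tuple[float, float]]:
    """Return one (slope, intercept) pair per dimension of a cluster.

    Assumption (not from the paper): the line is fitted to the normalized
    cumulative histogram, sampled at the upper edge of each bin.
    """
    lines = []
    for dim in zip(*cluster):  # iterate over dimensions (columns)
        lo, hi = min(dim), max(dim)
        if hi == lo:
            hi = lo + 1.0  # assumed guard so a constant dimension still has a bin width
        counts = histogram(dim, bins, lo, hi)
        total = sum(counts)
        cumulative, running = [], 0
        for c in counts:
            running += c
            cumulative.append(running / total)
        width = (hi - lo) / bins
        upper_edges = [lo + (i + 1) * width for i in range(bins)]
        lines.append(fit_line(upper_edges, cumulative))
    return lines


if __name__ == "__main__":
    # Toy two-dimensional cluster: first dimension spread evenly, second constant.
    toy_cluster = [(float(i), 10.0) for i in range(8)]
    for d, (slope, intercept) in enumerate(cluster_knowledge(toy_cluster)):
        print(f"dimension {d}: slope={slope:.4f}, intercept={intercept:.4f}")
```

Under these assumptions each cluster is summarized by two numbers per dimension regardless of how many samples it contained, which is the kind of memory saving the abstract refers to once the contributing samples are discarded.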
