Data Mining Through Relational Database

Authors

  • Ayush Kumar, Neelesh Jain, Neeraj Gupta

Keywords:

Applications of Data Mining, Business Intelligence, Data Mining, Data Presentation, Database Systems

Abstract

With huge amounts of data widely available and a pressing need to turn raw data into useful information and knowledge, data mining has become an important research field in both the database and machine learning areas. Data and information (or knowledge) play a significant role in human activities. Data mining is the process of knowledge discovery: solving problems by analyzing large volumes of data already present in a database from various perspectives and summarizing the results into useful information. Because extracting knowledge from large data repositories is so important, data mining has become an essential component of many fields. Advances in statistics, machine learning, artificial intelligence, pattern recognition, and computational capabilities have shaped modern data mining applications, which now enrich domains such as business, education, medicine, and science. This paper therefore discusses the improvements in data mining from past to present and explores future trends. Database systems provide efficient data storage, fast access structures, and a wide variety of indexing methods to speed up data retrieval, while machine learning provides theoretical support for most popular data mining algorithms. The database approach combines the properties of both areas to improve the scalability of Weka, an open-source machine learning software package. Weka implements most machine learning algorithms using main-memory-based data structures, so it cannot handle datasets larger than the available main memory. A database-backed implementation that stores and accesses data in DB2 achieves better scalability than Weka, but it is slower because secondary storage access is more expensive than main memory access.
In this paper, we extend the database implementation with a buffer management component to improve its performance. We further increase scalability by incorporating additional data structures that access data through the buffer. We also explore another way to improve algorithm speed: exploiting the data access properties of the machine learning algorithms.
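
The buffer management idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a least-recently-used (LRU) replacement policy, uses Python's built-in sqlite3 module as a stand-in for DB2, and the names BufferedTable, instances, page_size, and capacity are hypothetical. Rows are fetched from the database one page at a time, and recently used pages stay in memory so that repeated accesses avoid the slower storage path.

```python
import sqlite3
from collections import OrderedDict


class BufferedTable:
    """Reads fixed-size pages of rows through a small LRU buffer (illustrative only)."""

    def __init__(self, conn, table, page_size=2, capacity=2):
        self.conn = conn
        self.table = table          # hypothetical table name; assumed to be trusted input
        self.page_size = page_size  # rows per page
        self.capacity = capacity    # maximum number of pages kept in memory
        self.pages = OrderedDict()  # page number -> rows, ordered from least to most recent
        self.hits = 0
        self.misses = 0

    def _load_page(self, page_no):
        # The expensive path: fetch one page of rows from the database.
        cur = self.conn.execute(
            f"SELECT id, value FROM {self.table} ORDER BY id LIMIT ? OFFSET ?",
            (self.page_size, page_no * self.page_size),
        )
        return cur.fetchall()

    def get_row(self, index):
        page_no, offset = divmod(index, self.page_size)
        if page_no in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_no)      # mark page as most recently used
        else:
            self.misses += 1
            self.pages[page_no] = self._load_page(page_no)
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)   # evict the least recently used page
        return self.pages[page_no][offset]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany(
        "INSERT INTO instances (id, value) VALUES (?, ?)",
        [(i, i * 1.5) for i in range(6)],
    )
    table = BufferedTable(conn, "instances", page_size=2, capacity=2)
    rows = [table.get_row(i) for i in (0, 1, 2, 0, 4, 1)]
    return rows, table.hits, table.misses
```

In this sketch, a learning algorithm that revisits the same instances repeatedly would see mostly buffer hits, while a single sequential scan would gain little from the buffer; the right page size and capacity depend on the algorithm's access pattern.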

How to Cite

Ayush Kumar, Neelesh Jain, Neeraj Gupta. (2023). Data Mining Through Relational Database. International Journal of Research & Technology, 11(2), 18–27. Retrieved from https://ijrt.org/j/article/view/193

Section

Original Research Articles
