Efficient Processing of Continuous Join Queries between a Data Stream and Multiple Relations for Real-Time Analysis of E-Commerce Data

Haeri Kim, Ki Yong Lee

Abstract


As e-commerce data has become available in real time, demand for real-time analysis of this data has grown significantly. Such analysis depends on efficiently processing continuous join queries between an e-commerce data stream and large disk-based relations. In this paper, we propose an efficient method for processing a continuous join query between an e-commerce data stream and multiple disk-based relations. The proposed method significantly improves the service rate while substantially reducing the amount of memory required. Through analysis and a range of experiments, we show that the proposed method is more efficient than the previous method in terms of service rate and memory usage.


