Stock prediction: an event-driven approach based on bursty keywords
Di WU, Gabriel Pui Cheong FUNG, Jeffrey Xu YU, Qi PAN
Many real applications rely on a decision-making process whose underlying model is built from information collected from different data sources. Take the stock market as an example. The decision-making process depends on a model that is influenced by factors such as stock prices, exchange volumes, market indices (e.g., the Dow Jones Index), news articles, and government announcements (e.g., an increase in stamp duty). Nevertheless, modeling the stock market is a challenging task because (1) the process governing market states (rise state/drop state) is stochastic, which is hard to capture with a deterministic approach, and (2) the market state is hidden but is influenced by visible market information such as stock prices and news articles. In this paper, we propose an approach that models the stock market process using a non-homogeneous hidden Markov model (NHMM), which takes both stock prices and news articles into consideration. A unique feature of our approach is that it is event-driven: when building the NHMM, we identify the events associated with a specific stock using a set of bursty features (keywords) that have a significant impact on stock price changes. We apply the model to predict the trend of future stock prices, and the encouraging results indicate that the proposed approach is practically sound and highly effective.
event-driven / hidden Markov model / trend prediction
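To make the idea of a non-homogeneous hidden Markov model concrete, the following minimal sketch shows forward filtering over two hidden market states (rise, drop) whose transition probabilities change at every time step with an external covariate. It is an illustration under stated assumptions, not the formulation used in the paper: the logistic link, the signed "burst score" covariate, the binary up/down price observation, and all names and parameter values (`transition_matrix`, `forward_filter`, `weights`, `emission`) are hypothetical.

```python
import math

RISE, DROP = 0, 1  # hidden market states (assumed two-state model)


def transition_matrix(x, weights):
    """Return a 2x2 row-stochastic matrix that depends on the covariate x.

    weights[i] = (bias, slope) defines the logit of moving from state i
    to the rise state. The logistic link is an assumption of this sketch.
    """
    rows = []
    for bias, slope in weights:
        p_rise = 1.0 / (1.0 + math.exp(-(bias + slope * x)))
        rows.append([p_rise, 1.0 - p_rise])
    return rows


def forward_filter(observations, covariates, initial, emission, weights):
    """Filtered probabilities P(state_t | observations and covariates up to t).

    observations: list of 0 (price up) or 1 (price down)
    covariates:   list of burst scores, one per time step (hypothetical input)
    emission[s][o]: probability of observing o while in hidden state s
    """
    # Initial step: combine the prior with the first observation.
    belief = [initial[s] * emission[s][observations[0]] for s in (RISE, DROP)]
    total = sum(belief)
    belief = [b / total for b in belief]
    history = [belief]

    for obs, x in zip(observations[1:], covariates[1:]):
        # The transition matrix is recomputed at each step: this is what
        # makes the chain non-homogeneous.
        a = transition_matrix(x, weights)
        predicted = [sum(belief[i] * a[i][j] for i in (RISE, DROP))
                     for j in (RISE, DROP)]
        belief = [predicted[j] * emission[j][obs] for j in (RISE, DROP)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Toy inputs, chosen only for illustration.
initial = [0.5, 0.5]
emission = [[0.7, 0.3], [0.4, 0.6]]
weights = [(0.0, 2.0), (0.0, 2.0)]
observations = [0, 0, 1, 0]
covariates = [0.0, 1.0, -1.0, 0.5]
beliefs = forward_filter(observations, covariates, initial, emission, weights)
```

Setting every slope in `weights` to zero makes the transition matrix constant over time, which recovers an ordinary (homogeneous) hidden Markov model; the covariate is the only channel through which news-derived information enters this sketch.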