For stock market analysis, I want to introduce two case studies I did previously. The first uses the traditional time-series analysis approach to detect seasonality, stationarity or non-stationarity, autocorrelation, and so on. The second is an approach I developed to detect intra-patterns within a single feature of a stock and inter-patterns between different features of a stock. Details of the two approaches follow.
1. Time Series Data Analysis with ARIMA and Classifiers
The problem of stock analysis is typically formulated as predicting stock movements from daily closing prices collected over a period.
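As a minimal sketch of this formulation, the Python snippet below turns a list of daily closing prices into up/down movement labels and into lagged-return feature windows that a classifier could consume. The function names, the two-lag window, and the prices are illustrative assumptions, not taken from the linked code.

```python
def movement_labels(closes):
    """Return 1 where the next close is higher than the current one, else 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def lagged_return_samples(closes, n_lags):
    """Build (features, label) pairs: n_lags past daily returns -> next direction."""
    returns = [(nxt - cur) / cur for cur, nxt in zip(closes, closes[1:])]
    samples = []
    for t in range(n_lags, len(returns)):
        features = returns[t - n_lags:t]       # the n_lags most recent returns
        label = 1 if returns[t] > 0 else 0     # direction of the following move
        samples.append((features, label))
    return samples


# Made-up prices, for illustration only.
closes = [100.0, 101.0, 100.5, 102.0, 103.0, 102.5]
labels = movement_labels(closes)
samples = lagged_return_samples(closes, n_lags=2)
```

Each sample pairs the two most recent daily returns with the direction of the following day, which is one common way to frame movement prediction as a classification task.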
To solve the forecasting problem, both time-series modelling approaches (e.g. the ARIMA model) and machine learning approaches (e.g. SVM, decision tree, and ANN predictors) can be considered. The aims of the study are to identify the model that best fits the time series of the DJA stock price from 2014 to 2018 and to forecast the stock price in 2019.
I also discuss two questions about the results of the traditional time-series analysis approach. One is whether a seasonal model with different periods would improve the forecast, and the other is whether different time intervals would improve the forecast. In addition, I apply the machine learning approaches to construct classifiers and discuss the differences between the results of time-series modelling and the results of machine learning.
Detailed code and report: Code (R); Report
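One way to judge whether a seasonal period is plausible before fitting a seasonal model is to difference the series and inspect its sample autocorrelation at the candidate lag. The sketch below does this in plain Python on a synthetic series (a linear trend plus a repeating five-step pattern). The series, the period of 5, and the function names are assumptions made for illustration; it does not fit an ARIMA model and is not the R code linked from this post.

```python
def difference(series, lag=1):
    """Lag-differenced series: x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var


# Synthetic data: linear trend plus a pattern that repeats every 5 steps.
pattern = [0, 1, 3, 1, 0]
series = [0.5 * t + pattern[t % 5] for t in range(60)]

diffs = difference(series)                # first difference removes the trend
acf_seasonal = autocorrelation(diffs, 5)  # high: the pattern repeats at lag 5
acf_other = autocorrelation(diffs, 2)     # negative at a non-seasonal lag
seasonal_diffs = difference(diffs, lag=5) # seasonal difference removes the pattern
```

A strong autocorrelation at the candidate lag after first differencing suggests that a seasonal term with that period is worth trying; a weak one suggests it is unlikely to help the forecast.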
2. Multivariate Time Series Data Analysis
A multivariate time series (MTS) is made up of data collected by monitoring the values of a set of temporally related or interrelated variables over a period of time, at successive instants spaced at uniform intervals. Given a set of MTS, the problem of classifying or clustering such data is concerned with discovering inherent groupings of the data according to how similar or dissimilar the time series are to each other.
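To make "similar or dissimilar" concrete, the sketch below computes a simple distance between two MTS: each variable is z-normalized, the Euclidean distance is taken per variable, and the results are averaged. This is a generic baseline chosen for illustration, not the model-based or fuzzy clustering algorithms from the papers listed below; the variable names and values are made up.

```python
import math


def znorm(values):
    """Scale values to zero mean and unit variance (all zeros if constant)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def mts_distance(a, b):
    """Average per-variable Euclidean distance between two z-normalized MTS.

    a and b map variable names to equal-length lists of observations.
    """
    total = 0.0
    for name in a:
        za, zb = znorm(a[name]), znorm(b[name])
        total += math.sqrt(sum((x - y) ** 2 for x, y in zip(za, zb)))
    return total / len(a)


def nearest(name, collection):
    """Name of the MTS in the collection closest to the one called `name`."""
    others = [other for other in collection if other != name]
    return min(others, key=lambda o: mts_distance(collection[name], collection[o]))


# Made-up data: B has the same shape as A at a different scale, C moves the other way.
stocks = {
    "A": {"close": [1, 2, 3, 4, 5], "volume": [10, 12, 11, 13, 12]},
    "B": {"close": [10, 20, 30, 40, 50], "volume": [100, 120, 110, 130, 120]},
    "C": {"close": [5, 4, 3, 2, 1], "volume": [12, 13, 11, 12, 10]},
}
closest_to_a = nearest("A", stocks)
```

A pairwise distance of this kind can feed any standard clustering routine. The papers go further by modelling temporal patterns within and between variables rather than comparing raw shapes.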
Related Papers:
Basic Ideas: A Model-Based Multivariate Time Series Clustering Algorithm
Clustering for Portfolios: An Algorithm for Fuzzy Clustering of Multivariate Time Series
Prediction using Temporal Patterns: Discovering Fuzzy Temporal Association in Multivariate Time Series for Stock Analysis
3. Corporate Communication Network and Stock Price Movements
An Analysis of the Enron E-mail Corpus and Enron's Stock Prices
Can we link a corporation’s communication network to its stock market price? Are there any associations between them that reveal the company’s performance?
Specifically, we would like to find out whether there is any association between the frequency of e-mail exchange among the key employees of a company and the performance of the company as reflected in its stock prices. If such associations do exist, we would also like to know whether the company's stock price could be accurately predicted from them.
To detect the association relationships, a data-mining algorithm is proposed here to mine e-mail communication records and historical stock prices, so that, based on the detected relationship, rules that can predict changes in stock prices can be constructed.
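As a rough illustration of what an association between e-mail activity and price changes can look like, the sketch below scores one candidate rule ("high e-mail volume implies price goes down") with the standard support, confidence, and lift measures. It is a generic association-rule calculation, not the algorithm proposed in the paper, and the records are invented for the example rather than drawn from the Enron data.

```python
def rule_stats(records, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent.

    records is a list of (email_level, price_move) pairs, one per period.
    """
    n = len(records)
    n_a = sum(1 for a, _ in records if a == antecedent)
    n_c = sum(1 for _, c in records if c == consequent)
    n_ac = sum(1 for a, c in records if a == antecedent and c == consequent)
    support = n_ac / n                            # how often both occur together
    confidence = n_ac / n_a if n_a else 0.0       # P(consequent | antecedent)
    lift = confidence / (n_c / n) if n_c else 0.0 # above 1: positive association
    return support, confidence, lift


# Invented records, for illustration only.
records = (
    [("high", "down")] * 3
    + [("high", "up")] * 1
    + [("low", "up")] * 4
    + [("low", "down")] * 2
)
support, confidence, lift = rule_stats(records, "high", "down")
```

A lift above 1 means the price falls more often after high e-mail volume than it does overall. A rule like this would still need a significance test before being used for prediction.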
Using the data-mining algorithm, the publicly available Enron e-mail corpus, and Enron's stock prices recorded during the same period, we discovered interesting, statistically significant associations in the data. We also found that these associations can predict stock price movements with an average accuracy of around 80%. The results support the belief that corporate communication has identifiable patterns, and that such patterns can reveal meaningful information about corporate performance as reflected by indicators such as stock market performance. Given the increasing popularity of social networks, mining interesting communication patterns could provide insights for developing useful applications in many areas.
Details in the Paper: Corporate Communication Network and Stock Price Movements: Insights From Data Mining






