Feb 2012, Volume 6 Issue 1

  • Select all
    Shaoying LIU
    Jiliang TANG, Xufei WANG, Huiji GAO, Xia HU, Huan LIU

    Social media websites allow users to exchange short texts such as tweets via microblogs and user status in friendship networks. Their limited length, pervasive abbreviations, and coined acronyms and words exacerbate the problems of synonymy and polysemy, and bring about new challenges to data mining applications such as text clustering and classification. To address these issues, we dissect some potential causes and devise an efficient approach that enriches data representation by employing machine translation to increase the number of features from different languages. Then we propose a novel framework which performs multi-language knowledge integration and feature reduction simultaneously through matrix factorization techniques. The proposed approach is evaluated extensively in terms of effectiveness on two social media datasets from Facebook and Twitter. With its significant performance improvement, we further investigate potential factors that contribute to the improved performance.

    Xiaochen LI, Wenji MAO, Daniel ZENG

    Group behavior forecasting is an emergent research and application field in social computing. Most of the existing group behavior forecasting methods have heavily relied on structured data which is usually hard to obtain. To ease the heavy reliance on structured data, in this paper, we propose a computational approach based on the recognition of multiple plans/intentions underlying group behavior.We further conduct human experiment to empirically evaluate the effectiveness of our proposed approach.

    Xiaolong LI, Gang PAN, Zhaohui WU, Guande QI, Shijian LI, Daqing ZHANG, Wangsheng ZHANG, Zonghui WANG

    This paper investigates human mobility patterns in an urban taxi transportation system. This work focuses on predicting humanmobility fromdiscovering patterns of in the number of passenger pick-ups quantity (PUQ) from urban hotspots. This paper proposes an improved ARIMA based prediction method to forecast the spatial-temporal variation of passengers in a hotspot. Evaluation with a large-scale realworld data set of 4 000 taxis’ GPS traces over one year shows a prediction error of only 5.8%. We also explore the application of the prediction approach to help drivers find their next passengers. The simulation results using historical real-world data demonstrate that, with our guidance, drivers can reduce the time taken and distance travelled, to find their next passenger, by 37.1% and 6.4%, respectively.