HJDM  >> Vol. 6 No. 3 (July 2016)

    Big Data Research Analysis Based on Bibliometrics

  • PP. 125-137   DOI: 10.12677/HJDM.2016.63015

Authors:

Yuzhu Kuang, Mei Hong, Jiayan Zeng: College of Computer Science, Sichuan University, Chengdu, Sichuan

Keywords:
Big Data, Bibliometrics, Visualization, Systematic Literature Review, Data Analysis

Abstract:

With the development of Internet technology, the era of big data has arrived, and big data research has attracted attention worldwide. This paper examines research on big data in the computing field by academics in China and abroad, and analyzes the current state and development trends of that research. Using bibliometric methods and literature visualization software, and combining automatic and manual literature retrieval, we selected research literature on big data published in China and abroad from 2000 to 2015. We analyze the growth and distribution of papers, their distribution across journals and conferences, and author collaboration, with particular attention to the hot spots and trends of big data research, to provide a useful basis and reference for researchers' further work.

Cite this article:
Kuang, Y.Z., Hong, M. and Zeng, J.Y. (2016) Big Data Research Analysis Based on Bibliometrics. Hans Journal of Data Mining, 6, 125-137. http://dx.doi.org/10.12677/HJDM.2016.63015
