| SHI Suixiang, XU Lingyu, DONG Han, WANG Lei, WU Shaochun, QIAO Baiyou, WANG Guoren. 2014. Research on data pre-deployment in information service flow of digital ocean cloud computing. Acta Oceanologica Sinica, 33(9): 82-92 |
| Research on data pre-deployment in information service flow of digital ocean cloud computing |
| Received: April 04, 2014; Revised: May 27, 2014 |
| DOI:10.1007/s13131-014-0520-8 |
| Key words: HDFS; data prefetching; cloud computing; service flow; digital ocean |
| Foundation item: The Ocean Public Welfare Scientific Research Project of State Oceanic Administration of China under contract No. 20110533. |
| Author Name | Affiliation | E-mail |
| SHI Suixiang | National Marine Data and Information Service, State Oceanic Administration, Tianjin 300171, China; Key Laboratory of Digital Ocean, State Oceanic Administration, Tianjin 300171, China | |
| XU Lingyu | College of Computer Engineering and Science, Shanghai University, Shanghai 200072, China | xly@shu.edu.cn |
| DONG Han | National Marine Data and Information Service, State Oceanic Administration, Tianjin 300171, China; Key Laboratory of Digital Ocean, State Oceanic Administration, Tianjin 300171, China | |
| WANG Lei | College of Computer Engineering and Science, Shanghai University, Shanghai 200072, China | |
| WU Shaochun | College of Computer Engineering and Science, Shanghai University, Shanghai 200072, China | |
| QIAO Baiyou | College of Information Science and Engineering, Northeastern University, Shenyang 110819, China | |
| WANG Guoren | College of Information Science and Engineering, Northeastern University, Shenyang 110819, China | |
| Abstract: |
| Data pre-deployment in HDFS (the Hadoop Distributed File System) is more complicated than in traditional file systems. Several key issues need to be addressed, such as determining the target location of the prefetched data, the amount of data to be prefetched, and the balance between data prefetching services and normal data accesses. To solve these problems, we exploit the characteristics of digital ocean information service flows and propose a deployment scheme that combines input data prefetching with output-data-oriented storage strategies. The method allows data preparation and data processing to run in parallel, thereby greatly reducing the I/O time cost of digital ocean cloud computing platforms when they process multi-source information synergistic tasks. The experimental results show that the scheme achieves a higher degree of parallelism than the traditional Hadoop mechanism, shortens the waiting time of running service nodes, and significantly reduces data access conflicts. |
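| Chinese abstract (translated): |
| Data pre-deployment in HDFS is more complicated than data deployment in the traditional sense. Several key issues must be resolved: which data to prefetch, the target location of the prefetched data, the amount of data to prefetch, and the balance between prefetching services and conflicting normal data accesses. Based on the characteristics of digital ocean information service flows, this paper proposes a deployment scheme that combines input data prefetching with directed storage of output data, so that data preparation and data processing run in parallel, thereby reducing the large I/O time overhead incurred when a digital ocean cloud computing platform processes multi-source files collaboratively. The experimental results show that, compared with the traditional Hadoop mechanism, the method achieves a higher degree of parallelism, shortens the waiting time of running service nodes, and alleviates data conflicts. |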
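The idea of overlapping data preparation with data processing in a service flow can be illustrated with a minimal sketch. This is not the authors' implementation and makes no HDFS calls. The names `run_flow`, `fetch`, and `process` are hypothetical, and the "prefetch" is simply a background thread that prepares the input of the next service while the current one runs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_flow(stages, fetch, process):
    """Run the services of a flow in order, prefetching the next input.

    stages  -- ordered list of service identifiers (hypothetical)
    fetch   -- callable(stage) -> input data; stands in for data preparation
    process -- callable(stage, data) -> result; stands in for the service itself
    """
    results = []
    if not stages:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Prepare the input of the first service before the loop starts.
        pending = pool.submit(fetch, stages[0])
        for i, stage in enumerate(stages):
            data = pending.result()  # block until this stage's input is ready
            if i + 1 < len(stages):
                # Start preparing the next input in the background.
                pending = pool.submit(fetch, stages[i + 1])
            # Processing overlaps with the background fetch started above.
            results.append(process(stage, data))
    return results
```

With I/O-bound `fetch` functions, each service after the first waits only for whatever part of its data preparation has not already completed during the previous service's run. Choosing where the prefetched data should be placed and how much to prefetch, which the abstract names as key issues, is outside the scope of this sketch.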