# Performance Configuration Optimization of Hadoop System Based on Genetic Simulated Annealing Algorithm

DOI: 10.12677/AIRR.2020.92015

Abstract: To improve the performance of the open-source cloud computing platform Hadoop, a performance configuration optimization method based on a genetic simulated annealing algorithm is proposed. Following the genetic algorithm, each configuration scheme is treated as a chromosome and undergoes selection, crossover, and mutation. The principle of simulated annealing is then used to control the survival rate of new chromosomes and the number of iterations of the whole algorithm, in order to find the optimal system configuration. The genetic simulated annealing algorithm achieves better overall performance and optimizes faster over long-running optimization. It can be used to find an approximately optimal system configuration through random search in the global space. Experimental results show that the method effectively improves the efficiency of finding the optimal configuration. The resulting configuration improves running speed, makes fuller use of resources, and increases system throughput.

1. Introduction

2. Overview

2.1. Cloud Computing Technology

3. Architecture and Methods

Table 1. Hadoop basic configuration properties

Table 2. Main configuration attributes affecting Hadoop performance

Table 3. Some properties and default values that affect Hadoop performance

Table 4. Some basic Hadoop test programs

Table 5. HiBench test cases and their resource characteristics

3.3. Experimental Method

$a' = (1-\alpha)a + \beta b$, $b' = (1-\beta)b + \alpha a$ (1)
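Equation (1) can be read as an arithmetic crossover applied gene by gene to two parent chromosomes. The following is a minimal sketch under assumptions that are not stated in the source: chromosomes are lists of floats normalized to [0, 1], and the function names `arithmetic_crossover` and `clamp` are hypothetical. Because Equation (1) can push a gene outside [0, 1] when $\alpha \neq \beta$, the optional clamping step is an added assumption, not part of the equation.

```python
def arithmetic_crossover(a, b, alpha, beta):
    """Apply Equation (1) to each pair of genes of parents a and b.

    a' = (1 - alpha) * a + beta * b
    b' = (1 - beta) * b + alpha * a
    """
    if len(a) != len(b):
        raise ValueError("parents must have the same number of genes")
    child_a = [(1 - alpha) * x + beta * y for x, y in zip(a, b)]
    child_b = [(1 - beta) * y + alpha * x for x, y in zip(a, b)]
    return child_a, child_b


def clamp(chromosome, low=0.0, high=1.0):
    """Assumption: keep every gene inside the normalized range [low, high]."""
    return [min(high, max(low, gene)) for gene in chromosome]


if __name__ == "__main__":
    parent_a = [0.0, 1.0]
    parent_b = [1.0, 0.0]
    child_a, child_b = arithmetic_crossover(parent_a, parent_b, 0.25, 0.25)
    print(clamp(child_a), clamp(child_b))
```

When $\alpha = \beta$, each child is a convex combination of the two parents, so genes that start in [0, 1] stay in [0, 1] and clamping has no effect.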

4. Experiments and Result Analysis

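The Reduce class takes the output of map as its input, so the input type of Reduce is […] and its output type is […]. The Reduce class must also implement the reduce method. In this experiment, the reduce function uses the input key as the output key and sums the multiple values it receives for that key to produce the output value.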
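The reduce behavior described above (keep the key, sum its values) can be sketched in plain Python. This is not the Hadoop Java API and not the authors' code: the names `reduce_sum` and `run_reduce` are hypothetical, and the grouping of map output by key, which Hadoop performs during the shuffle phase, is simulated here with a dictionary.

```python
from collections import defaultdict


def reduce_sum(key, values):
    """Emit the input key unchanged, paired with the sum of its values."""
    return key, sum(values)


def run_reduce(map_output):
    """Group (key, value) pairs by key, then reduce each group.

    The grouping stands in for Hadoop's shuffle step (an assumption made
    so that the sketch is runnable on its own).
    """
    grouped = defaultdict(list)
    for key, value in map_output:
        grouped[key].append(value)
    return dict(reduce_sum(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    pairs = [("a", 1), ("b", 2), ("a", 3)]
    print(run_reduce(pairs))
```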

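The three files GeneName, GeneRange, and GeneType are all used when normalizing configuration property values and denormalizing chromosome gene values. They store the configuration properties, the property value ranges, and the property value types, respectively. Through these three files, a property value of any type is normalized to a floating-point value. While the algorithm runs, population information is recorded in a log file.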
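One way to picture the normalization and denormalization step is the sketch below. It rests on several assumptions that the source does not confirm: min-max scaling to [0, 1], three value types (`int`, `float`, `bool`), and in-memory lists standing in for the GeneName, GeneRange, and GeneType files, whose real formats are not given. The property names are Hadoop 1.x-style names chosen for illustration, and the ranges are hypothetical.

```python
# Hypothetical stand-ins for the contents of the three files.
GENE_NAME = ["io.sort.mb", "io.sort.factor", "mapred.compress.map.output"]
GENE_RANGE = [(50, 500), (5, 100), (0, 1)]  # illustrative ranges only
GENE_TYPE = ["int", "int", "bool"]


def normalize(value, value_range, value_type):
    """Map a typed property value to a float gene in [0, 1]."""
    low, high = value_range
    if value_type == "bool":
        return 1.0 if value else 0.0
    return (float(value) - low) / (high - low)


def denormalize(gene, value_range, value_type):
    """Map a float gene back to a value of the property's original type."""
    low, high = value_range
    gene = min(1.0, max(0.0, gene))  # assumption: out-of-range genes are clamped
    if value_type == "bool":
        return gene >= 0.5
    raw = low + gene * (high - low)
    return int(round(raw)) if value_type == "int" else raw


def chromosome_to_config(chromosome):
    """Turn a whole chromosome into a {property name: value} mapping."""
    return {
        name: denormalize(gene, value_range, value_type)
        for name, gene, value_range, value_type
        in zip(GENE_NAME, chromosome, GENE_RANGE, GENE_TYPE)
    }


if __name__ == "__main__":
    print(chromosome_to_config([0.5, 0.0, 1.0]))
```

Keeping every gene as a float in the same range lets crossover and mutation treat all properties uniformly; the type information is only needed again when a chromosome is written back as a configuration.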

Figure 2. Algorithm optimization when the task size is 1.2 GB

Table 6. Comparison of the best configuration and the default configuration under different working conditions

5. Discussion

5.1. Comparison of Configuration Optimization Methods

Table 7. Comparison between genetic simulated annealing algorithm and chaos particle swarm optimization algorithm

5.2. Adaptive Configuration

6. Conclusion