流域水文模型结课作业

任务

采用流域水文模型进行流域降雨-径流模拟。本次模拟中采用线性扰动LPM模型。

模型概况 ¹

总径流响应（SLM）模型的系统方程

离散化的系统方程表达为：\[Y(k) = \sum_{i=1}^{m}H(i)X(k-i+1) \tag{0.1}\] 水文预报的应用之一就是由降雨X(k)推求总径流Y(k)。显然，其中的前提就是已知系统的响应函数H(i)。然而，在未完全建立SLM模型之前，响应函数是有待确定的。我们把由历史上（或实时方式）观测的输入、输出信息[X(k)、Y(K)]用来便是水文系统模型中未知或待定的部分，称之为水文系统识别。它是水文系统方法中重要的内容之一。只有把系统模型中待定的部分都确定了，才能应用到实际的水文预报或水温计算。对于总径流线性响应模型，系统识别的问题归为水文模型响应函数的推求。采用最多的是矩阵最小二乘法，简述如下：

考虑到所用资料有误差，或对系统所作的线性假定不完善，式(0.1)可做如下表述：\[Y(k)=\sum_{i=1}^{m}X(k-i+1)H(i)+e(k) \tag{0.2}\] 式中,e(k)为随机误差项。随着水文时间序列的变化，即当k=1,2,...,n时，式（0.2）可逐行写为： \[Y(1) = X(1)H(1)+e(1)\] \[Y(2) = X(1)H(2) + X(2)H(1) + e(2)\] \[\vdots\] \[Y(m) = X(1)H(m) + X(2)H(m-1) + \dots + X(m)H(1) + e(m)\] \[\vdots\] \[Y(n) = X(n-m+1)H(m) + X(n-m+2)H(m-1) + \dots +X(n)H(1) + e(n)\]

他们可以用矩阵方程形式表达，简记为\[Y_{n\times 1} = X_{n\times n}H_{m\times 1} + E_{n\times 1} \tag{0.2}\] 式中，n代表水文资料的总长度，m是响应函数的记忆长度。由于n>m，式（0.2）是一个超定方程组，其中的响应函数向量\(H_{m\times 1} = [h(1),h(2),\dots,h(m)]^L\)可采用最小二乘法识别。目标函数记为\[J(H)=\sum_{k=1}^{n}e^2(k)=E^TE=(Y-XH)^T(Y-XH) \\=Y^TY-2H^TX^TY+H^TX^TXH \tag{0.3}\] 由目标函数极小化，即min{J(H)}，不难导出响应函数的最小二乘解向量为：\[\hat H = [X^TX]^{-1}[X^TY] \tag{0.4}\]

线性扰动模型的结构和基本假定

线性扰动模型建立的思想是：依据观测的降雨_{径流（或河道输入}输出）资料记为{I(k),Q(k)}，计算季节均值及其平滑值，分别记为\(I_d\)和\(Q_d\)。关于\(I_d\)~\(Q_d\)之间的复杂关系不作任何假定。然后，分别计算系统输入、输出变量相对他们季节均值的扰动项，即\[X(k) = I(k) - I_d \tag{1}\] \[Y(k) = Q(k) - Q_d \tag{2}\]

由于一个水文输入~输出序列中季节均值占有主导部分，为简化模型假定：输入的扰动项\(X(k) = I(k) - I_d\)与输出的扰动项\(Y(k) = Q(k) - Q_d\)之间存在线性关系，即\[Y(k) = \sum_{j=1}^{m}H(j)X(k-j+1)+e(k) \tag{3}\] 式中，H(j)被称为线性扰动系统响应函数，e(k)为误差，则我们把由上述三式组合的系统模型称为线性扰动模型（LPM）。

模型建立具体步骤²

由观测的资料分别计算水文系统输入~输出序列样本的季节均值\(I_d\)和\(Q_d\)，d=1,2,3...,365。
季节均值是流域的基本水文属性之一，应当是比较平稳的过程。但实际中求得的季节均值不可避免地带有随机噪音而出现振荡，因此，需要采用一定的数学方法使季节均值光滑。本次计算使用傅里叶级数法进行光滑。数学方程为： \[Q_d = \bar{Q_d} + \sum_{j=1}^{L}[A_jcos(\frac{2\pi jd}{365}) + B_jsin(\frac{2\pi jd}{365})] \tag{4}\] 式中，\(\bar{Q_d}\)为均值，\(A_j\)和\(B_j\)为傅里叶系数，j为调和函数的序数，即 \[\bar{Q_d} = \frac{1}{365}\sum_{j=1}^{365}Q_d \tag{5}\] \[A_j = \frac{2}{365}\sum_{d=1}^{365}Q_dcos(\frac{2\pi jd}{365}) \tag{6}\] \[B_j = \frac{2}{365}\sum_{d=1}^{365}Q_dsin(\frac{2\pi jd}{365}) \tag{7}\]

当式(4)中的调和函数只取得几项时，就得到\(Q_d\)的光滑过程。实际中，一般取L=4或5个调和系数。

利用式（1）、（2）计算输入扰动项X(k)和输出扰动项Y(k)形成式（3）的线性系统方程。
采用与线性总径流模型相同的方法，由最小二乘法识别LPM模型的相应函数\(\hat H(j)\)。
一旦响应函数已经求得，便可利用式（3）由降雨（或上流入流）的扰动值X(k)求得相应的出流扰动\(\hat Y(k)\)。
由此计算或预报出系列\(Q(k)=Q_d+\hat Y(k)\)。

模型建立

计算平滑后数据

此次模型建立取L=4进行平滑。根据上述公式对Q、P、E得到平滑结果如下：

建立线性系统方程

根据式（1）（2）计算输入输出扰动项并得到类似式（3）的线性系统方程。

使用遗传算法计算\(\hat H(k)\)

首先编写一个简单的遗传算法代码：

初始化种群:该函数生成一个大小为pop_size的初始种群，每个个体的基因长度为gene_length，基因值为0或1。

1
2
3

function population = initialize_population(pop_size, gene_length)
    population = randi([0, 1], pop_size, gene_length);
end

适应度函数：该函数计算个体的适应度，适应度为个体中1的数量。

1
2
3

function fit = fitness(individual)
    fit = sum(individual);  % Example: maximize the number of 1s
end

选择操作：这个函数根据适应度选择个体，适应度越高，被选中的概率越大。randsample函数用于带权重的随机抽样

function selected_population = selection(population, fitnesses)
    total_fitness = sum(fitnesses);
    probabilities = fitnesses / total_fitness;
    selected_indices = randsample(1:length(population), length(population), true, probabilities);
    selected_population = population(selected_indices, :);
end

交叉操作：这个函数对两个父代个体进行交叉操作，生成两个子代个体。交叉点是随机选择的

function [offspring1, offspring2] = crossover(parent1, parent2)
    point = randi([1, length(parent1) - 1]);
    offspring1 = [parent1(1:point), parent2(point+1:end)];
    offspring2 = [parent2(1:point), parent1(point+1:end)];
end

变异操作：该函数对个体进行变异操作，每个基因以mutation_rate的概率发生变异，xor用于反转基因值

function mutated_individual = mutate(individual, mutation_rate)
    mutation_mask = rand(size(individual)) < mutation_rate;
    mutated_individual = xor(individual, mutation_mask);
end

生成新一代：这个函数生成新一代种群，通过选择、交叉和变异操作生成新的个体

function new_pop = new_generation(population, mutation_rate)
    new_pop = zeros(size(population));
    for i = 1:2:length(population)
        parent1 = population(i, :);
        parent2 = population(i+1, :);
        [offspring1, offspring2] = crossover(parent1, parent2);
        new_pop(i, :) = mutate(offspring1, mutation_rate);
        new_pop(i+1, :) = mutate(offspring2, mutation_rate);
    end
end

遗传算法主函数：该函数包含遗传算法的核心部分，包括上述所有函数以及终止条件，并最终输出结果

function final_population = genetic_algorithm(pop_size, gene_length, mutation_rate, generations)
    population = initialize_population(pop_size, gene_length);
    for gen = 1:generations
        fitnesses = arrayfun(@fitness, population);
        if max(fitnesses) == gene_length
            break;  % Terminate if an optimal solution is found
        end
        population = selection(population, fitnesses);
        population = new_generation(population, mutation_rate);
        fprintf('Generation %d: Max Fitness = %d\n', gen, max(fitnesses));
    end
    final_population = population;
end

fitnesses = arrayfun(@fitiness,population)语法
- arrayfun:arrayfun函数在matlab中用于对数组的每个元素应用指定的函数，它的语法是
  1
  arrayfun(function_handle,array)
  其中function_handle是要应用的函数，array是要处理的数组。
- @fitness：@fitness是一个函数句柄，指向定义的fitness函数。在上文已经提及其用于计算个体的适应度。
- population：population是一个矩阵，表示当前种群。每一行代表一个个体，每一列代表一个基因。

参数设置和运行：这部分代码设置了包括种群大小、基因长度、变异率和最大代数等参数，并运行遗传算法

% Parameters
pop_size = 10;
gene_length = 8;
mutation_rate = 0.01;
generations = 100;

% Run Genetic Algorithm
final_population = genetic_algorithm(pop_size, gene_length, mutation_rate, generations);
disp('Final Population:');
disp(final_population);

然后对其进行改写。首先，对进行交叉、进行变异、生成下一代的函数不做任何改变，用来进行交叉操作，变异操作，生成下一代。

对于适应度函数，由于本次课设任务为拟合流量过程，因此将个体适应度改为在线性系统方程中所得到的函数值与流量数据的差值。

1
2
3

function fit = fitness(individual)
    fit = abs(Y(i)-bin2dec(individual)*X(i));  %将个体由二进制转化为十进制进行计算
end

选择操作中，根据适应度选择个体。在上述情况下，设置适应度最小的个体最有可能被选中。

function selected_population = selection(population, fitnesses)
    total_fitness = sum(fitnesses);
    probabilities = 1 - fitnesses / total_fitness;
    selected_indices = randsample(1:length(population), length(population), true, probabilities);
    selected_population = population(selected_indices, :);
end

对于参数的随机生成，由于选取的参数为方程的系数，且样本数据较大会导致循环次数较多，因此需要限制生成的参数为小数且小数和整数部分的长度都不过长。

%parameters
pop_size = 100;
gene_length = 8;
mutation_rate = 0.01;
generations = 100;

还需要编制按照上述要求将生成的二进制样本转化为十进制数的函数bin_dec。

function decimal = bin_dec(population)
    int_part(j,1:5) = population(j,1:5);
    frac_part(j,1:3) = population(j,6:8);
    a = length(int_part(j,:));
    b = length(frac_part(j,:));
    % Convert the integer part
    decimal_int(1,j) = 0;
    for i = 1:a
        decimal_int(1,j) = decimal_int(1,j) + int_part(j,i) * 2^(a - i);
    end

    % Convert the fractional part
    decimal_frac(1,j)  = 0;
    for i = 1:b
        decimal_frac(1,j) = decimal_frac(1,j)  + frac_part(j,i) * 2^(-i);
    end

    % Combine the integer and fractional parts
    decimal(j) = decimal_int(j) + decimal_frac(j);
end

最后还需要修改程序的数据结构，对X(k),Y(k)中三年内的数据进行计算，得到每一日输入输出的响应函数，最终得到结果。

需要注意的是，由于整个代码都使用函数进行计算，而函数工作区和普通工作区是不互通的，因此需要进行参数的传递。主要需要解决的是由arrayfun()函数传递参数的问题。

整合函数如下：

遗传算法识别响应函数

%遗传算法识别响应函数

%%遗传算法参数设置
% Parameters
pop_size = 1000;
gene_length = 8;
mutation_rate = 0.01;
generations = 1000;
fitness_threshold = 0.01;%循环终止条件
for k = 1:365
c = Y(k);
d = X(k);
% Run Genetic Algorithm
[best_individual,final_population] = genetic_algorithm(pop_size, gene_length, mutation_rate, generations,c,d,fitness_threshold);
ans(1,k) = bin_dec(best_individual);
%disp(ans(1,k));
end

%初始种群
function population = initialize_population(pop_size, gene_length)
    population = randi([0, 1], pop_size, gene_length);
end

%%适应度计算
function fit = fitness(decimal,c,d) 
    fit = abs(c-decimal*d);
end

%%选择
function selected_population = selection(population_decimal, fitnesses,population)
    total_fitness = sum(fitnesses);
    probabilities = 1 - fitnesses / total_fitness;
    selected_indices = randsample(1:length(population_decimal), length(population_decimal), true, probabilities);
    selected_population = population(selected_indices,:);
    %disp(population);
end

%%交叉
function [offspring1, offspring2] = crossover(parent1, parent2)
    point = randi([1, length(parent1) - 1]);
    offspring1 = [parent1(1:point), parent2(point+1:end)];
    offspring2 = [parent2(1:point), parent1(point+1:end)];
end

%%变异
function mutated_individual = mutate(individual, mutation_rate)
    mutation_mask = rand(size(individual)) < mutation_rate;
    mutated_individual = xor(individual, mutation_mask);
end

%%生成下一代
function new_pop = new_generation(population, mutation_rate)
    new_pop = zeros(size(population));
    for i = 1:2:length(population)
        parent1 = population(i,:);
        parent2 = population(i+1,:);
        [offspring1, offspring2] = crossover(parent1, parent2);
        new_pop(i, :) = mutate(offspring1, mutation_rate);
        new_pop(i+1, :) = mutate(offspring2, mutation_rate);
    end
end
%二进制转化
function decimal = bin_dec(population_row)
    % Assuming population_row is a row vector
    if length(population_row) < 8
        error('Population vector must be at least 8 bits long.');
    end
    
    int_part = population_row(1:5); % First 5 bits for integer part
    frac_part = population_row(6:8); % Next 3 bits for fractional part

    % Convert the integer part
    decimal_int = 0;
    for i = 1:length(int_part)
        decimal_int = decimal_int + int_part(i) * 2^(length(int_part) - i);
    end

    % Convert the fractional part
    decimal_frac = 0;
    for i = 1:length(frac_part)
        decimal_frac = decimal_frac + frac_part(i) * 2^(-i);
    end

    % Combine integer and fractional parts
    decimal = decimal_int + decimal_frac;
end
%

%%主函数
function [best_individual,final_population] = genetic_algorithm(pop_size, gene_length, mutation_rate, generations,c,d,fitness_threshold)
    population = initialize_population(pop_size, gene_length);
    for gen = 1:generations
        for i = 1:pop_size
            population_decimal(i) = bin_dec(population(i,:));
        end
        fitnesses = arrayfun(@(individuals) fitness(individuals,c,d), population_decimal);
        max_fitness = max(fitnesses);
        %
        if max_fitness < fitness_threshold
            break;
        end
        %}
        population = selection(population_decimal, fitnesses, population);
        population = new_generation(population, mutation_rate);
        %fprintf('Generation %d: Max Fitness = %d\n', gen, min(fitnesses));
    end
    final_population = population;
    %
    for i = 1:length(final_population)
        population_decimal(i) = bin_dec(final_population(i,:));
    end
    fitnesses = arrayfun(@(individuals) fitness(individuals,c,d), population_decimal);
    [~, max_fitness_index] = min(fitnesses);

    % Get the individual with the highest fitness
    best_individual = final_population(max_fitness_index, :);
    %}
end

模型检验

选取2010-2014年间任一场洪水，检验所得数据精确性。经分析，决定选取2013-9-22——2013-9-29为检验期。由上述模型得到的参数进行检验，得到下图结果。

梁庚辰,J.E.Nash.大流域汇流演算的线性扰动模型[J].水力发电学报,1991,(02):14-28.↩︎
《工程水文学》,2005,叶守则,詹道江.↩︎

任务

模型概况 1