neural network doesn't fit boundaries
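I am new to machine learning and am trying to fit an example dataset with a neural network in Python using TensorFlow. After implementing the neural network in Dymola, I want to compare the output of the function with the output of the neural network.
The sample dataset is:
```python
import tensorflow as tf
from keras import metrics
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense, Dropout
from keras import optimizers
from keras.callbacks import ModelCheckpoint, TensorBoard, EarlyStopping
import scipy.io as sio
import mat4py as m4p

inputs = np.linspace(0, 15, num=3000)
# ** is exponentiation in Python; ^ is bitwise XOR and fails on float arrays
outputs = 1/7 * ((inputs/5)**3 - (inputs/3)**2 + 5)
```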
The inputs and outputs are then scaled to the interval [0; 0.9]:
```python
inputs_max = np.max(inputs)
inputs_min = np.min(inputs)
outputs_max = np.max(outputs)
outputs_min = np.min(outputs)

upper_bound = 0.9
lower_bound = 0

# linear map v -> m * v + c that sends [min, max] to [lower_bound, upper_bound]
m_in = (upper_bound - lower_bound) / (inputs_max - inputs_min)
c_in = upper_bound - (m_in * inputs_max)
scaled_in = m_in * inputs + c_in

m_out = (upper_bound - lower_bound) / (outputs_max - outputs_min)
c_out = upper_bound - (m_out * outputs_max)
# the outputs are scaled with their own coefficients
scaled_out = m_out * outputs + c_out
```
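As a quick sanity check on the scaling step, the sketch below recomputes the coefficients on the same grid so they can be compared with the constants that appear in the Modelica code further down (0.06 for the inputs, roughly 1.2173 and -0.3173 for the outputs). It assumes the outputs are scaled with their own coefficients `m_out` and `c_out`; the helper name `linear_scaling` is made up for this illustration.

```python
import numpy as np

def linear_scaling(values, lower=0.0, upper=0.9):
    """Return (m, c) so that m * values + c spans [lower, upper]."""
    m = (upper - lower) / (np.max(values) - np.min(values))
    c = upper - m * np.max(values)
    return m, c

inputs = np.linspace(0, 15, num=3000)
outputs = 1/7 * ((inputs/5)**3 - (inputs/3)**2 + 5)

m_in, c_in = linear_scaling(inputs)
m_out, c_out = linear_scaling(outputs)

scaled_in = m_in * inputs + c_in
scaled_out = m_out * outputs + c_out

print("m_in  =", m_in, " c_in  =", c_in)
print("m_out =", m_out, " c_out =", c_out)
print("scaled_out range:", scaled_out.min(), scaled_out.max())
```

If the printed output coefficients do not match the ones hard-coded on the Dymola side, the two plots are not being compared on the same scale.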
I then train the neural network as follows:
```python
# shuffle values
def shuffle_in_unison(a, b):
    assert len(a) == len(b)
    shuffled_a = np.empty(a.shape, dtype=a.dtype)
    shuffled_b = np.empty(b.shape, dtype=b.dtype)
    permutation = np.random.permutation(len(a))
    for old_index, new_index in enumerate(permutation):
        shuffled_a[new_index] = a[old_index]
        shuffled_b[new_index] = b[old_index]
    return shuffled_a, shuffled_b

tf_features_64 = scaled_in
tf_labels_64 = scaled_out
tf_features_32 = tf_features_64.astype(np.float32)
tf_labels_32 = tf_labels_64.astype(np.float32)
X = tf_features_32
Y = tf_labels_32

# shuffle_in_unison returns new arrays, so the result has to be assigned;
# validation_split takes the last 5% of the samples as they are passed to fit()
X, Y = shuffle_in_unison(X, Y)

# define callbacks
filepath = "weights-improvement-{epoch:02d}-{val_loss:.2f}.hdf5"
savebestCallBack = ModelCheckpoint(filepath, monitor='val_loss', verbose=1,
                                   save_best_only=True, save_weights_only=False,
                                   mode='auto', period=1)
tbCallBack = TensorBoard(log_dir='./Graph', histogram_freq=5,
                         write_graph=True, write_images=True)
esCallback = EarlyStopping(monitor='val_loss', min_delta=0, patience=500,
                           verbose=0, mode='min')

# neural network architecture
visible = Input(shape=(1,))
x = Dense(40, activation='tanh')(visible)
x = Dense(39, activation='tanh')(x)
x = Dense(38, activation='tanh')(x)
x = Dense(30, activation='tanh')(x)
output = Dense(1)(x)  # no activation given, so Keras uses a linear output

# setup optimizer (newer Keras releases name the first argument learning_rate)
Optimizer = optimizers.Adam(lr=0.0007, amsgrad=True)

model = Model(inputs=visible, outputs=output)
model.compile(optimizer=Optimizer, loss=['mse'], metrics=['mae', 'mse'])
model.fit(X, Y, epochs=1000, batch_size=1, verbose=1, shuffle=True,
          validation_split=0.05, callbacks=[tbCallBack, esCallback])

# return weights
weights1 = model.layers[1].get_weights()[0]
biases1 = model.layers[1].get_weights()[1]

print('Layer1---------------------------------------------------------------------------------------------------------')
print('weights1:')
print(repr(weights1.transpose()))
print('biases1:')
print(repr(biases1))

w1 = weights1.transpose()
b1 = biases1.transpose()

we1 = {'w1': w1.tolist()}
bi1 = {'b1': b1.tolist()}
# ......... ......
```
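The element-by-element loop in `shuffle_in_unison` can also be written with a single index permutation applied to both arrays. A minimal sketch, assuming both arrays have the same length along the first axis; the function name `shuffle_pairs` and the seed are illustrative only:

```python
import numpy as np

def shuffle_pairs(a, b, seed=None):
    """Return copies of a and b reordered by the same random permutation."""
    assert len(a) == len(b)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(a))
    # fancy indexing creates new arrays; a and b themselves are left untouched
    return a[idx], b[idx]

x = np.arange(10, dtype=np.float32)
y = 2 * x  # every label is tied to its feature

x_shuffled, y_shuffled = shuffle_pairs(x, y, seed=0)
print(x_shuffled)
print(y_shuffled)
```

Because the function returns new arrays instead of shuffling in place, the result has to be assigned, for example `X, Y = shuffle_pairs(X, Y)`.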
Later, I implemented the trained neural network in Dymola by loading the weights and biases into a preconfigured "neural network base class" (which has already been used several times and works).
```modelica
// Modelica code for Dymola:
Real inputs;
Real outputs;
Real scaled_outputs;
Real scaled_inputs(start=0);
Real scaled_outputsfunc;

der(scaled_inputs) = 0.9;

//part of the neural network implementation in Dymola
NeuralNetwork.BaseClasses.NeuralNetworkLayer neuralNetworkLayer1(
  NeuronActivationFunction=NeuralNetwork.Types.ActivationFunction.TanSig,
  numInputs=1,
  numNeurons=40,
  weightTable=[-0.367953330278397; ......])
  annotation (Placement(transformation(extent={{-76,22},{-56,42}})));

//scaled inputs
neuralNetworkLayer1.u[1] = scaled_inputs;
//scaled outputs
neuralNetworkLayer5.y[1] = scaled_outputs;

//scaled_inputs = 0.06 * inputs
inputs = 1/0.06 * (scaled_inputs);
outputs = 1/875 * inputs^3 - 1/63 * inputs^2 + 5/7;
scaled_outputsfunc = 1.2173139581825052 * outputs - 0.3173139581825052;
```
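When plotting and comparing the scaled output of the function with the (scaled) values returned by the neural network, I noticed that the approximation is good in the interval [0.5; 0.8], but the closer the input gets to the boundaries, the worse the approximation becomes.
Unfortunately, I have no idea why this happens or how to fix it. I would be very glad if someone could help me.
I would like to answer my own question: I forgot to specify an activation function for the output layer in the Python code, so Keras set it to linear by default. See also: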
https://keras.io/layers/core/
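In Dymola, where my ANN is implemented, 'tanh' was the activation function in the last layer. The network was therefore trained with a linear output but evaluated with a tanh output, which led to the divergence near the boundaries.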
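A rough way to see why the mismatch shows up mainly near the boundaries is to apply tanh to the scaled target itself. The sketch below makes an idealized assumption that is not part of the original post: the trained network's final pre-activation reproduces the scaled target exactly, so any remaining error comes only from the extra tanh.

```python
import numpy as np

inputs = np.linspace(0, 15, num=3000)
outputs = 1/7 * ((inputs/5)**3 - (inputs/3)**2 + 5)

# scale both to [0, 0.9], as in the question
scaled_in = 0.9 * (inputs - inputs.min()) / (inputs.max() - inputs.min())
scaled_out = 0.9 * (outputs - outputs.min()) / (outputs.max() - outputs.min())

# Assumption: the linear-output network fits the target perfectly,
# so the last layer's pre-activation z equals scaled_out.
z = scaled_out
gap = np.abs(z - np.tanh(z))  # error introduced by an extra tanh on the output

interior = (scaled_in >= 0.5) & (scaled_in <= 0.8)
print("max gap for scaled inputs in [0.5, 0.8]:", gap[interior].max())
print("gap at the lower boundary:", gap[0])
print("gap at the upper boundary:", gap[-1])
```

Under this assumption the gap stays small where the scaled target is close to zero, because tanh is nearly linear there, and it grows towards both ends of the input range, where the scaled target is furthest from zero.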
The correct Python code for this application must be:
```python
visible = Input(shape=(1,))
x = Dense(40, activation='tanh')(visible)
x = Dense(39, activation='tanh')(x)
x = Dense(38, activation='tanh')(x)
x = Dense(30, activation='tanh')(x)
# tanh on the output layer, matching the last layer of the Dymola implementation
output = Dense(1, activation='tanh')(x)
```
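One way to catch this kind of mismatch before moving weights into another tool is to reproduce the forward pass by hand and compare it with the framework's prediction. The sketch below is only an illustration: `forward` is a hypothetical helper, and the weights are random stand-ins with the layer sizes from the question, not trained values. With a real model, each `(W, b)` pair would come from `layer.get_weights()` as in the question.

```python
import numpy as np

def forward(x, layers, output_activation=np.tanh):
    """Dense forward pass; layers is a list of (W, b) with W of shape (n_in, n_out)."""
    a = x
    for W, b in layers[:-1]:
        a = np.tanh(a @ W + b)  # hidden layers use tanh
    W, b = layers[-1]
    z = a @ W + b               # pre-activation of the output layer
    return output_activation(z)

# Random stand-in weights for the layer sizes 1-40-39-38-30-1 (not trained values)
rng = np.random.default_rng(0)
sizes = [1, 40, 39, 38, 30, 1]
layers = [
    (rng.normal(scale=0.5, size=(n_in, n_out)), rng.normal(scale=0.1, size=n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

x = np.linspace(0, 0.9, 5).reshape(-1, 1)  # scaled inputs, one sample per row
y_linear = forward(x, layers, output_activation=lambda z: z)
y_tanh = forward(x, layers)

print("linear output:", y_linear.ravel())
print("tanh output:  ", y_tanh.ravel())
```

The same weights give different results depending on the output activation, so the hand-written pass should only agree with the trained model when both use the same one.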