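I'm working on my graduation project, and one part of it is training an RNN to decide whether the system is currently under attack. I'm using TensorLayer, a library built on top of TensorFlow, with the ADFA-LD dataset. The network is built as follows:
# Network structure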
network = tl.layers.EmbeddingInputlayer(x, vocabulary_size=max_sys_call+1, embedding_size=128, name='embedding')
network = tl.layers.RNNLayer(network, cell_fn=tf.contrib.rnn.BasicLSTMCell, cell_init_args={'forget_bias':0.0, 'state_is_tuple':True}, n_hidden=128, name='lstm')
network = tl.layers.DropoutLayer(network, keep=0.8, name='dropout')
# Flatten to a vector
network = tl.layers.FlattenLayer(network, name='flatten')
network = tl.layers.DenseLayer(network, n_units=2, act=tf.identity, name='output')
The placeholders for the data and labels are defined as follows:
x = tf.placeholder(tf.int64, shape=[None, max_sequences_len], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')
The loss function is defined as follows:
# Define the loss function
y = network.outputs
cost = tl.cost.cross_entropy(y, y_, name='entropy')
correct_prediction = tf.equal(tf.argmax(y, 1), y_)
acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
When I actually run it, however, I get the following error:
[TL] Finished! use $tensorboard --logdir=logs/ to start server
[TL] Start training the network ...
Traceback (most recent call last):
File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 123, in <module>
rnn_lstm(x, y)
File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 110, in rnn_lstm
print_freq=5, X_val=np.array(x_val), y_val=np.array(y_val), eval_train=True, tensorboard=True)
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorlayer\utils.py", line 147, in fit
loss, _ = sess.run([cost, train_op], feed_dict=feed_dict)
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 905, in run
run_metadata_ptr)
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1113, in _run
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (100, 2) for Tensor 'y_:0', which has shape '(?,)'
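For readers following along: this first error is a static shape check on the feed. The value being fed has shape (100, 2), while the placeholder `y_` was declared with shape `[None, ]`, i.e. rank 1. A minimal sketch of that kind of check is below; `shapes_compatible` is a hypothetical helper written for illustration, not a TensorFlow function, and it assumes `None` stands for an unknown dimension.

```python
def shapes_compatible(value_shape, placeholder_shape):
    """Return True if a fed value's shape fits a placeholder's static shape.

    None in placeholder_shape stands for an unknown dimension ('?').
    """
    # A different number of dimensions can never match.
    if len(value_shape) != len(placeholder_shape):
        return False
    # Every known placeholder dimension must equal the value's dimension.
    return all(p is None or p == v
               for v, p in zip(value_shape, placeholder_shape))


fits_int_labels = shapes_compatible((100,), (None,))      # integer labels
fits_one_hot = shapes_compatible((100, 2), (None,))       # one-hot labels
fits_one_hot_2d = shapes_compatible((100, 2), (None, 2))  # 2-D placeholder
```

Only the one-hot batch against the rank-1 placeholder fails this check, which matches the shapes named in the error message.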
I tried changing the labels placeholder to this:
y_ = tf.placeholder(tf.int64, shape=[100, 2], name='y_')
After that change I get a new error, this time while the network is being built:
[TL] FlattenLayer flatten: 640
[TL] DenseLayer output: 2 identity
Traceback (most recent call last):
File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 123, in <module>
rnn_lstm(x, y)
File "D:/PycharmProjects/GraduationProject/DynamicDetection.py", line 96, in rnn_lstm
cost = tl.cost.cross_entropy(y, y_, name='entropy')
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorlayer\cost.py", line 36, in cross_entropy
return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=target, logits=output, name=name))
File "C:\Users\hasee\Anaconda2\envs\tfenv\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 2038, in sparse_softmax_cross_entropy_with_logits
(labels_static_shape.ndims, logits.get_shape().ndims))
ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
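I suspect the problem is with my labels; maybe they need to be reshaped? I've only just started with TensorFlow and don't really understand this part, so I'd appreciate any pointers. Many thanks!
Full code attached: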
import os
import numpy as np
import tensorflow as tf
import tensorlayer as tl
from sklearn.cross_validation import train_test_split
max_sys_call = 0
max_sequences_len = 300
learning_rate = 0.0001
ADFA_NormalData_Path = r"./data/ADFA-LD/Normal_Data_Master"
ADFA_WebshellData_Path = r"./data/ADFA-LD/Attack_Data_Master"


def load_one_file(filename):
    # Read the system calls from a single file and track the largest system call number
    global max_sys_call
    x = []
    with open(filename) as f:
        for line in f:
            line = line.strip('\n')
            line = line.split(' ')
            for num in line:
                if len(num) > 0:
                    x.append(int(num))
                    if int(num) > max_sys_call:
                        max_sys_call = int(num)
    return x


def load_ADFA_Data(dir):
    # Load the ADFA dataset
    data = []
    label = []
    g = os.walk(dir)
    i = 0
    for path, d, filelist in g:
        for filename in filelist:
            if filename.endswith('.txt'):
                filepath = os.path.join(path, filename)
                i += 1
                print("[%d] Load %s" % (i, filepath))
                nums = load_one_file(filepath)
                data.append(nums)
                if dir == ADFA_NormalData_Path:
                    label.append(0)
                else:
                    label.append(1)
    return data, label


def rnn_lstm(x, y):
    # Build the RNN using an LSTM cell
    x_train_and_val, x_test, y_train_and_val, y_test = train_test_split(x, y, test_size=0.4, random_state=0)
    x_train, x_val, y_train, y_val = train_test_split(x_train_and_val, y_train_and_val, test_size=0.3, random_state=0)
    x_train = tl.prepro.pad_sequences(x_train, maxlen=max_sequences_len, value=0.)
    x_val = tl.prepro.pad_sequences(x_val, maxlen=max_sequences_len, value=0.)
    x_test = tl.prepro.pad_sequences(x_test, maxlen=max_sequences_len, value=0.)
    y_train = tf.keras.utils.to_categorical(y_train, num_classes=2)
    y_val = tf.keras.utils.to_categorical(y_val, num_classes=2)
    y_test = tf.keras.utils.to_categorical(y_test, num_classes=2)
    sess = tf.InteractiveSession()
    x = tf.placeholder(tf.int64, shape=[None, max_sequences_len], name='x')
    y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')
    # y_ = tf.placeholder(tf.int64, shape=[100, 2], name='y_')
    # Network structure
    network = tl.layers.EmbeddingInputlayer(x, vocabulary_size=max_sys_call+1, embedding_size=128, name='embedding')
    network = tl.layers.RNNLayer(network, cell_fn=tf.contrib.rnn.BasicLSTMCell, cell_init_args={'forget_bias':0.0, 'state_is_tuple':True}, n_hidden=128, name='lstm')
    network = tl.layers.DropoutLayer(network, keep=0.8, name='dropout')
    # Flatten to a vector
    network = tl.layers.FlattenLayer(network, name='flatten')
    network = tl.layers.DenseLayer(network, n_units=2, act=tf.identity, name='output')
    # Define the loss function
    y = network.outputs
    cost = tl.cost.cross_entropy(y, y_, name='entropy')
    correct_prediction = tf.equal(tf.argmax(y, 1), y_)
    acc = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # Define the optimizer
    train_params = network.all_params
    train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost, var_list=train_params)
    # Initialize the model parameters
    tl.layers.initialize_global_variables(sess)
    # Train the network
    tl.utils.fit(sess, network, train_op, cost, np.array(x_train), np.array(y_train), x, y_, acc=acc, n_epoch=1500,
                 print_freq=5, X_val=np.array(x_val), y_val=np.array(y_val), eval_train=True, tensorboard=True)
    # Evaluate the model
    tl.utils.test(sess, network, acc, x_test, y_test, x, y_, batch_size=None, cost=cost)
    sess.close()


x1, y1 = load_ADFA_Data(ADFA_NormalData_Path)
x2, y2 = load_ADFA_Data(ADFA_WebshellData_Path)
x = x1 + x2
y = y1 + y2
rnn_lstm(x, y)
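
For what it's worth, both tracebacks point at the label shape. The second traceback shows `tl.cost.cross_entropy` calling `tf.nn.sparse_softmax_cross_entropy_with_logits`, which takes integer class indices of shape `(batch,)`, while `to_categorical` produces one-hot rows of shape `(batch, 2)`. A minimal NumPy sketch of that relationship follows. `one_hot` and `sparse_softmax_cross_entropy` are hypothetical helper names written for illustration, not TensorFlow or TensorLayer functions, and the logits are dummy values.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Illustrative stand-in for one-hot encoding: (N,) -> (N, num_classes)."""
    labels = np.asarray(labels, dtype=np.int64)
    return np.eye(num_classes, dtype=np.float32)[labels]


def sparse_softmax_cross_entropy(logits, labels):
    """Mean cross-entropy for integer class labels.

    logits: float array of shape (N, C); labels: int array of shape (N,).
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    # Same rule as in the traceback: rank of labels == rank of logits - 1.
    if labels.ndim != logits.ndim - 1:
        raise ValueError("labels must have rank %d, got rank %d"
                         % (logits.ndim - 1, labels.ndim))
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class in every row.
    return float(-log_probs[np.arange(len(labels)), labels].mean())


y_int = np.array([0, 1, 1, 0])        # integer labels, as load_ADFA_Data builds them
y_onehot = one_hot(y_int, 2)          # what one-hot encoding produces
logits = np.zeros((4, 2))             # dummy network outputs for 2 classes

print(y_int.shape, y_onehot.shape)    # prints (4,) (4, 2)
y_back = np.argmax(y_onehot, axis=1)  # one-hot rows -> class indices again
loss = sparse_softmax_cross_entropy(logits, y_back)
```

Under that reading, the simplest change to the posted script would probably be to drop the three `to_categorical` calls and feed the integer labels to the original `[None, ]` placeholder. That is an inference from the tracebacks, not something confirmed in the thread.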
1
epleone 2018-04-08 13:32:07 +08:00
Try changing it to this:
y_ = tf.placeholder(tf.int64, shape=[None, 2], name='y_')
2
scoronepion OP @epleone I tried that; it still raises the second error...
3
epleone 2018-04-08 13:38:11 +08:00
@scoronepion
cost = tl.cost.cross_entropy(y, y_, name='entropy') also needs to be changed to cost = tl.cost.cross_entropy(y_, y, name='entropy')
4
scoronepion OP @epleone I just tried it and it still doesn't work. It keeps raising: ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2).
5
epleone 2018-04-08 13:49:11 +08:00
@scoronepion
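That doesn't make sense. The placeholder y_ = tf.placeholder(tf.int64, shape=[None, 2]) should be fine. Is the final dtype int, or tf.float32? When calling the loss function, make sure the y produced by the forward pass comes first and the ground truth comes second. Check your code carefully. Also, you're using tl, which I'm not familiar with; can't you just use tf directly?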
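
As an aside on the `[None, 2]` placeholder: a one-hot label matrix pairs with a dense cross-entropy (in TensorFlow 1.x, `tf.nn.softmax_cross_entropy_with_logits`, which expects labels with the same shape and float dtype as the logits) rather than with the sparse variant visible in the traceback. The NumPy sketch below is only an illustration, with `dense_softmax_cross_entropy` and `accuracy_from_one_hot` as hypothetical helper names and made-up logits. It also shows that accuracy would need `argmax` on both sides once the labels are one-hot.

```python
import numpy as np


def dense_softmax_cross_entropy(logits, one_hot_labels):
    """Mean cross-entropy for one-hot labels; both arrays have shape (N, C)."""
    logits = np.asarray(logits, dtype=np.float64)
    targets = np.asarray(one_hot_labels, dtype=np.float64)
    if logits.shape != targets.shape:
        raise ValueError("logits %s and labels %s must match"
                         % (logits.shape, targets.shape))
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # The one-hot row selects the log-probability of the true class.
    return float(-(targets * log_probs).sum(axis=1).mean())


def accuracy_from_one_hot(logits, one_hot_labels):
    """Compare predicted and true class indices, both obtained via argmax."""
    predicted = np.argmax(logits, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float((predicted == actual).mean())


logits = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 0.5]])  # made-up outputs
labels = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # one-hot, float

loss = dense_softmax_cross_entropy(logits, labels)
acc = accuracy_from_one_hot(logits, labels)
```

So either label format can work, as long as the placeholder shape, the loss variant, and the accuracy comparison all agree on it.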
6
scoronepion OP @epleone OK, thanks a lot, I'll take another look. I'm using tl because I happened to be learning it recently and wanted to try it out.