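I downloaded two pcap files and concatenated them, then extracted packet_timestamp and packet_data. You will need to preprocess packet_data to suit your requirements. If you have labels to add, you can add them to the training dataset (in the model example below, I created dummy labels of all zeros and added them as a column). If the labels are in a separate file, you can zip them with the pcap dataset. A dataset of (feature, label) pairs is all that Model.fit and Model.evaluate need.
Here is an example of handling packet_data. You could modify it so that, for instance, if packet_data is valid then the label is valid, otherwise malicious.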
# Colab-only magic that selects TensorFlow 2.x; omit it outside Colab
%tensorflow_version 2.x
import tensorflow as tf
import tensorflow_io as tfio
import numpy as np

# Create an IODataset from each pcap file
first_file = tfio.IODataset.from_pcap('/content/fuzz-2006-06-26-2594.pcap')
second_file = tfio.IODataset.from_pcap(['/content/fuzz-2006-08-27-19853.pcap'])

# Concatenate the two datasets
feature = first_file.concatenate(second_file)

# Lists that collect the pcap fields
packet_timestamp_list = []
packet_data_list = []

# Some dummy labels
labels = []

packets_total = 0
for v in feature:
    (packet_timestamp, packet_data) = v
    packet_timestamp_list.append(packet_timestamp.numpy())
    packet_data_list.append(packet_data.numpy())
    labels.append(0)
    if packets_total == 0:
        # Sanity checks on the first packet. The dataset is not batched here,
        # so each element is assumed to be a scalar and is not indexed with [0].
        # The expected values are those of the test pcap file; replace them
        # with the values for your own captures.
        assert np.isclose(
            packet_timestamp.numpy(), 1084443427.311224, rtol=1e-15
        )  # we know this is the correct value in the test pcap file
        assert (
            len(packet_data.numpy()) == 62
        )  # we know this is the correct packet data buffer length in the test pcap file
    packets_total += 1
assert (
    packets_total == 43
)  # we know this is the correct number of packets in the test pcap file
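The valid-versus-malicious labeling idea mentioned above could be sketched in plain Python as follows. The names `looks_valid` and `label_packets`, the integer encoding of the two classes, and the 14-byte minimum-length rule are all hypothetical placeholders for illustration; none of them come from tensorflow_io, and the rule should be replaced with whatever defines a valid packet in your data.

```python
from typing import Callable, Iterable, List

# Assumed integer encoding of the two classes
VALID, MALICIOUS = 0, 1


def looks_valid(packet: bytes) -> bool:
    """Hypothetical rule: treat packets shorter than 14 bytes as suspect."""
    return len(packet) >= 14


def label_packets(
    packets: Iterable[bytes],
    is_valid: Callable[[bytes], bool] = looks_valid,
) -> List[int]:
    """Return one integer label per packet using the supplied validity rule."""
    return [VALID if is_valid(p) else MALICIOUS for p in packets]


# Two made-up payloads: one 62 bytes long, one 2 bytes long
example_labels = label_packets([b"\x00" * 62, b"\x01\x02"])
print(example_labels)  # [0, 1]
```

In the loop above, the same rule could replace `labels.append(0)` by passing the bytes returned from `packet_data.numpy()` to `looks_valid`.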
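Below is an example of using the data in a model. The model will not work as written, because I have not handled packet_data, which is of string type. Preprocess it to suit your requirements and then use it in the model.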
# Colab-only magic that selects TensorFlow 2.x; omit it outside Colab
%tensorflow_version 2.x
import tensorflow as tf
import tensorflow_io as tfio
import numpy as np

# Create an IODataset from each pcap file
first_file = tfio.IODataset.from_pcap('/content/fuzz-2006-06-26-2594.pcap')
second_file = tfio.IODataset.from_pcap(['/content/fuzz-2006-08-27-19853.pcap'])

# Concatenate the two datasets
feature = first_file.concatenate(second_file)

# Lists that collect the pcap fields
packet_timestamp = []
packet_data = []

# Dummy labels: 0 for every packet. Use your actual labels here
labels = []
for v in feature:
    (timestamp, data) = v
    packet_timestamp.append(timestamp.numpy())
    packet_data.append(data.numpy())
    labels.append(0)

# Do the preprocessing of packet_data here: convert the byte strings to
# meaningful numeric values before building the training dataset

# Add labels to the training data
train_ds = tf.data.Dataset.from_tensor_slices(((packet_timestamp, packet_data), labels))

# Shuffle and set the batch size
train_ds = train_ds.shuffle(5000).batch(32)

##### The program runs successfully up to this point. To use the data in the
##### model, preprocess packet_data as explained.

# A simple model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(10)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(train_ds, epochs=2)
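The model cannot consume packet_data while it is still a byte string. One possible preprocessing step, offered as a sketch and not the only reasonable approach, is to truncate or zero-pad each packet to a fixed length and scale the byte values to the range [0, 1]. The function name `packet_to_vector` and the default length of 64 are assumptions chosen for illustration.

```python
from typing import List


def packet_to_vector(packet: bytes, length: int = 64) -> List[float]:
    """Truncate or zero-pad raw packet bytes to `length` values in [0, 1]."""
    clipped = packet[:length]
    # bytes(n) creates n zero bytes, used here as padding
    padded = clipped + bytes(length - len(clipped))
    return [b / 255.0 for b in padded]


# Two made-up payloads: one shorter and one longer than the target length
vectors = [packet_to_vector(p) for p in (b"\xff\x00\x80", b"\x01" * 100)]
print(len(vectors[0]), len(vectors[1]))  # 64 64
```

The resulting list of equal-length vectors can be converted with `np.asarray(vectors, dtype=np.float32)` and passed to `tf.data.Dataset.from_tensor_slices` together with the labels, in place of the raw byte strings.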
Hope this answers your question. Happy learning.