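This was working fine for me earlier today, but after I restarted my notebook it suddenly started behaving very strangely. I have a TF dataset that takes NumPy files and their corresponding labels as input, like so:
tf.data.Dataset.from_tensor_slices((specgram_files, labels))
When I take 1 item using for item in ds.take(1): print(item), I get the expected output: a tuple of tensors, where the first tensor contains the name of the NumPy file as a bytes string and the second contains the encoded label.
I then have a function, read_npy_file, that reads the file using np.load() and produces a NumPy array, which it returns. This function is passed into the map() method, which looks like this: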
ds = ds.map(
    lambda file, label: tuple([tf.numpy_function(read_npy_file, [file], [tf.float32]), label]),
    num_parallel_calls=tf.data.AUTOTUNE)
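As you can see, the mapping should create another tuple of tensors, where the first tensor is the NumPy array and the second is the label, untouched. This worked perfectly earlier, but now it shows the most bizarre behaviour. For reference, read_npy_file looks like this:
def read_npy_file(data):
    # 'data' stores the file name of the numpy binary file storing the features of a particular sound file
    # as a bytes string.
    # decode() is called on the bytes string to decode it from a bytes string to a regular string
    # so that it can be passed as a parameter into np.load()
    data = np.load(data.decode())
    return data.astype(np.float32)
I placed a print statement in read_npy_file() to check whether the correct data was being passed in. I expected it to receive a single bytes string, but when I call print(data) inside the function and take 1 item from the dataset with ds.take(1) to trigger one mapping, I get the output shown below. I have not modified any of its formatting.
I would greatly appreciate any help. Working with this has been an absolute nightmare, haha. The full code follows the output.
Output: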
b'./challengeA_data/log_spectrogram/2603ebb3-3cd3-43cc-98ef-0c128c515863.npy'b'./challengeA_data/log_spectrogram/fab6a266-e97a-4935-a0c3-444fc4426fc5.npy'b'./challengeA_data/log_spectrogram/93014682-60a2-45bd-9c9e-7f3c97b83be9.npy'b'./challengeA_data/log_spectrogram/710f2430-5da3-4822-a252-6ad3601b92d9.npy'b'./challengeA_data/log_spectrogram/e757058c-91de-4381-8184-65f001c95647.npy'
b'./challengeA_data/log_spectrogram/38b12689-04ba-422b-a972-5856b05ca868.npy'
b'./challengeA_data/log_spectrogram/7c9ccc04-a2d2-4eec-bafd-0c97b3658c26.npy'b'./challengeA_data/log_spectrogram/c7cc3520-7218-4d07-9f0a-6bd7bb90a551.npy'
b'./challengeA_data/log_spectrogram/21f6060a-9766-4810-bd7c-0437f47ccb98.npy'
Full code:
def read_npy_file(data):
    # 'data' stores the file name of the numpy binary file storing the features of a particular sound file
    # as a bytes string.
    # decode() is called on the bytes string to decode it from a bytes string to a regular string
    # so that it can be passed as a parameter into np.load()
    print(data)
    data = np.load(data.decode())
    return data.astype(np.float32)

specgram_ds = tf.data.Dataset.from_tensor_slices((specgram_files, labels))
specgram_ds = specgram_ds.map(
    lambda file, label: tuple([tf.numpy_function(read_npy_file, [file], [tf.float32]), label]),
    num_parallel_calls=tf.data.AUTOTUNE)

num_files = len(train_df)
num_train = int(0.8 * num_files)
num_val = int(0.1 * num_files)
num_test = int(0.1 * num_files)

specgram_ds = specgram_ds.shuffle(buffer_size=1000)
specgram_train_ds = specgram_ds.take(num_train)
specgram_test_ds = specgram_ds.skip(num_train)
specgram_val_ds = specgram_test_ds.take(num_val)
specgram_test_ds = specgram_test_ds.skip(num_val)

# iterating over one item to trigger the mapping function
for item in specgram_ds.take(1):
    pass
Thanks!

Your logic seems fine. You are actually just observing the effect of num_parallel_calls=tf.data.AUTOTUNE combined with print(*). According to the docs: if the value tf.data.AUTOTUNE is used, then the number of parallel calls is set dynamically based on available CPU. Also see this. You can run the following code a few times to observe the changes:
import tensorflow as tf
import numpy as np

def read_npy_file(data):
    # 'data' stores the file name of the numpy binary file storing the features of a particular sound file
    # as a bytes string.
    # decode() is called on the bytes string to decode it from a bytes string to a regular string
    # so that it can be passed as a parameter into np.load()
    print(data)
    data = np.load(data.decode())
    return data.astype(np.float32)

# Create dummy data
# np.save appends '.npy'; the paths below assume the working directory is /content (e.g. Google Colab)
for i in range(4):
    np.save('{}-array'.format(i), np.random.random((5,5)))

specgram_files = ['/content/0-array.npy', '/content/1-array.npy', '/content/2-array.npy', '/content/3-array.npy']
labels = [1, 0, 0, 1]

specgram_ds = tf.data.Dataset.from_tensor_slices((specgram_files, labels))
specgram_ds = specgram_ds.map(
    lambda file, label: tuple([tf.numpy_function(read_npy_file, [file], [tf.float32]), label]),
    num_parallel_calls=tf.data.AUTOTUNE)

num_files = len(specgram_files)
num_train = int(0.8 * num_files)
num_val = int(0.1 * num_files)
num_test = int(0.1 * num_files)

specgram_ds = specgram_ds.shuffle(buffer_size=1000)
specgram_train_ds = specgram_ds.take(num_train)
specgram_test_ds = specgram_ds.skip(num_train)
specgram_val_ds = specgram_test_ds.take(num_val)
specgram_test_ds = specgram_test_ds.skip(num_val)

for item in specgram_ds.take(1):
    pass
Finally, note that using tf.print instead of print should get rid of any side effects.
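
To see why asking for one element can still invoke the mapped function several times, here is a minimal sketch that uses only the Python standard library. It is not how tf.data is implemented; parallel_map and fake_read are hypothetical names made up for this illustration, and the fixed window of 4 stands in for whatever parallelism AUTOTUNE would pick.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
import threading

calls = []                      # every argument fake_read was invoked with
calls_lock = threading.Lock()

def fake_read(name):
    # Hypothetical stand-in for read_npy_file: records the call, loads nothing.
    with calls_lock:
        calls.append(name)
    return name.upper()

def parallel_map(fn, items, num_parallel_calls):
    # Keep up to num_parallel_calls invocations in flight; yield results in input order.
    items = iter(items)
    with ThreadPoolExecutor(max_workers=num_parallel_calls) as pool:
        in_flight = deque(pool.submit(fn, x) for x in islice(items, num_parallel_calls))
        while in_flight:
            yield in_flight.popleft().result()
            # Top the window back up with the next input, if any remain.
            for x in islice(items, 1):
                in_flight.append(pool.submit(fn, x))

files = ["file-{}.npy".format(i) for i in range(8)]
pipeline = parallel_map(fake_read, files, num_parallel_calls=4)
first = next(pipeline)   # request a single element, like ds.take(1)
pipeline.close()         # stop early; leaving the with-block waits for in-flight calls

print(first)             # the one element that was requested
print(sorted(calls))     # the four inputs the function was actually called with
```

Only one result is consumed, yet the function runs for the first four inputs, because they were all scheduled up front. When each of those calls prints from its own thread, the lines can run together as in the question. In the full code, shuffle(buffer_size=1000) likely adds to the effect, since a shuffle buffer is filled from upstream before the first element is emitted.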