TypeError: Signature mismatch. Keys must be dtype <dtype: 'string'>, got <dtype:'int64'>
While running the wide_n_deep_tutorial program from TensorFlow on my dataset, the following error is displayed.
"TypeError: Signature mismatch. Keys must be dtype <dtype: 'string'>, got <dtype:'int64'>"
The following is the code snippet:
def input_fn(df):
    """Input builder function."""
    # Creates a dictionary mapping from each continuous feature column name (k) to
    # the values of that column stored in a constant Tensor.
    continuous_cols = {k: tf.constant(df[k].values) for k in CONTINUOUS_COLUMNS}
    # Creates a dictionary mapping from each categorical feature column name (k)
    # to the values of that column stored in a tf.SparseTensor.
    categorical_cols = {k: tf.SparseTensor(
        indices=[[i, 0] for i in range(df[k].size)],
        values=df[k].values,
        shape=[df[k].size, 1])
                        for k in CATEGORICAL_COLUMNS}
    # Merges the two dictionaries into one.
    feature_cols = dict(continuous_cols)
    feature_cols.update(categorical_cols)
    # Converts the label column into a constant Tensor.
    label = tf.constant(df[LABEL_COLUMN].values)
    # Returns the feature columns and the label.
    return feature_cols, label

def train_and_eval():
    """Train and evaluate the model."""
    train_file_name, test_file_name = maybe_download()
    df_train=train_file_name
    df_test=test_file_name
    df_train[LABEL_COLUMN] = (
        df_train["impression_flag"].apply(lambda x: "generated" in x)).astype(str)
    df_test[LABEL_COLUMN] = (
        df_test["impression_flag"].apply(lambda x: "generated" in x)).astype(str)
    model_dir = tempfile.mkdtemp() if not FLAGS.model_dir else FLAGS.model_dir
    print("model directory = %s" % model_dir)
    m = build_estimator(model_dir)
    print('model succesfully build!')
    m.fit(input_fn=lambda: input_fn(df_train), steps=FLAGS.train_steps)
    print('model fitted!!')
    results = m.evaluate(input_fn=lambda: input_fn(df_test), steps=1)
    for key in sorted(results):
        print("%s: %s" % (key, results[key]))
Any help is appreciated.
It would help to see the output printed before the error message, to determine which part of the process raised it. That said, the message states quite clearly that the keys are expected to be strings, whereas integers were given instead. I am only guessing, but are the column names set out correctly in the earlier part of your script? They could be the keys that the message refers to.
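One way to narrow this down is to check which of the categorical columns hold values that are not strings before they reach input_fn. The following is only a minimal sketch using pandas: the helper name non_string_columns, the sample DataFrame, and its column names are hypothetical and not taken from the question.

```python
import pandas as pd


def non_string_columns(df, columns):
    """Return {column name: dtype name} for columns containing non-str values."""
    bad = {}
    for name in columns:
        # True only when every value in the column is a Python str.
        all_strings = df[name].map(lambda v: isinstance(v, str)).all()
        if not all_strings:
            bad[name] = str(df[name].dtype)
    return bad


# Hypothetical stand-in for the asker's data (assumption, for illustration only).
df = pd.DataFrame({
    "device": ["phone", "tablet", "phone"],
    "campaign_id": [101, 102, 101],
})
CATEGORICAL_COLUMNS = ["device", "campaign_id"]

# Columns listed here would need converting before building sparse tensors.
print(non_string_columns(df, CATEGORICAL_COLUMNS))
```

Any column reported by this check is a likely source of the int64 keys mentioned in the error.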
Judging by your traceback, the problem you're having is caused by the inputs to your feature columns, or the output of your input_fn. Your sparse tensors are most likely being fed non-string dtypes for the values parameter; sparse feature columns expect string values. Ensure that you're feeding the correct data, and if you're sure you are, you can try the following:
categorical_cols = {k: tf.SparseTensor(
    indices=[[i, 0] for i in range(df[k].size)],
    values=df[k].astype(str).values,  # Convert sparse values to string type
    shape=[df[k].size, 1])
                    for k in CATEGORICAL_COLUMNS}
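The same conversion can also be done once on the DataFrame, before input_fn is called, so that the training and test sets are treated identically. This is a rough sketch rather than part of the answer above: the helper stringify_columns and the sample data are assumptions made for illustration.

```python
import pandas as pd


def stringify_columns(df, columns):
    """Return a copy of df with the given columns converted to strings."""
    out = df.copy()  # leave the caller's DataFrame untouched
    for name in columns:
        out[name] = out[name].astype(str)
    return out


# Hypothetical data with an integer-coded categorical column (assumption).
df = pd.DataFrame({
    "device": ["phone", "tablet", "phone"],
    "campaign_id": [101, 102, 101],
})
CATEGORICAL_COLUMNS = ["device", "campaign_id"]

converted = stringify_columns(df, CATEGORICAL_COLUMNS)
print(converted["campaign_id"].tolist())
```

Converting up front keeps input_fn unchanged, at the cost of holding a second copy of the data in memory.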
This is how I solved this challenge:
import numpy as np
from sklearn.model_selection import train_test_split

# Split the data set (M holds the features, N the labels)
X_train, X_test, y_train, y_test = train_test_split(M, N, test_size=0.3)

# Convert string to int64 for the training set
X_train = X_train.astype(np.int64)
y_train = y_train.astype(np.int64)

# Convert string to int64 for the testing set
X_test = X_test.astype(np.int64)
y_test = y_test.astype(np.int64)