
How do I import this file of data in Python?

I just started with Python and I have no idea how to import a text file the way I want.

This is my code so far:

f = open("Data.txt", "r")
attributes = f.readlines()

w1 = 2 * np.random.random((1,19)) -1

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

outputs = sigmoid(np.dot(attributes, w1))

The problem is that I get this error message:

Cannot cast array data from dtype('float64') to dtype('<U32') according to the rule 'safe'.

I know the cause: the code does not read the text file as an array of numbers, which is why I get the error.

This is one line in the text file:

1,1,22,22,22,19,18,14,49.895756,17.775994,5.27092,0.771761,0.018632,0.006864,0.003923,0.003923,0.486903,0.100025,1,0

You're almost there:

import numpy as np
with open("Data.txt", "r") as f:
    attributes = f.readlines()
attributes = [row.strip().split(",") for row in attributes]
attributes = [[float(x) for x in row] for row in attributes]

w1 = 2 * np.random.random((1,20)) -1

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

outputs = sigmoid(np.dot(np.array(attributes), w1.T))

You need a list comprehension to split the values on each row and convert them from strings to floats. Also, w1 was missing one element: each row holds 20 values, so w1 needs 20 elements to match the dimension of attributes. Finally, you have to use the transpose of w1 to compute the product with np.dot, because for a one-line file np.array(attributes) has shape (1, 20), and so does w1.
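To check the shapes involved, here is a minimal sketch that parses the sample line from the question out of a string instead of reading Data.txt. The names `line` and `row` are only for illustration.

```python
import numpy as np

# The sample line from the question, held in a string instead of a file
line = ("1,1,22,22,22,19,18,14,49.895756,17.775994,5.27092,0.771761,"
        "0.018632,0.006864,0.003923,0.003923,0.486903,0.100025,1,0")

# Split on commas and convert every value to float
row = [float(x) for x in line.strip().split(",")]
attributes = np.array([row])             # one row, 20 columns

w1 = 2 * np.random.random((1, 20)) - 1   # 20 random weights in [-1, 1)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# (1, 20) dot (20, 1) gives (1, 1)
outputs = sigmoid(np.dot(attributes, w1.T))

print(attributes.shape)  # (1, 20)
print(w1.T.shape)        # (20, 1)
print(outputs.shape)     # (1, 1)
```

Without the transpose, np.dot would be asked to multiply a (1, 20) array by another (1, 20) array, and the inner dimensions would not match.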

Also: always remember to use the "with" statement to open files. It automatically closes the file when the block ends; otherwise the file can stay open longer than intended and you might get into some trouble.
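A small sketch of that behaviour, using a throwaway temporary file (the name demo.txt is hypothetical) so that it does not depend on Data.txt:

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("1,2,3\n")

with open(path, "r") as f:
    lines = f.readlines()
    print(f.closed)  # False: still inside the block

print(f.closed)      # True: closed automatically on leaving the block
print(lines)         # ['1,2,3\n']
```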


Edit

If your real dataset contains more data, you should consider using pandas, which will be much more efficient:

import pandas as pd
# header=None: the file has no header row, so the first line is data
df = pd.read_csv("Data.txt", sep=",", header=None)
numpy_array = df.values
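To try the pandas route without the real file, the sketch below feeds two made-up rows through io.StringIO in place of Data.txt. The data is invented for illustration, and `to_numpy` is used here as the newer spelling of `.values`.

```python
import io
import pandas as pd

# Two made-up rows standing in for the contents of Data.txt
text = "1,2,3.5\n4,5,6.5\n"

df = pd.read_csv(io.StringIO(text), sep=",", header=None)
numpy_array = df.to_numpy(dtype=float)   # plain 2-D float array

print(numpy_array.shape)  # (2, 3)
```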

I think what you want to do is something like the following:

import numpy as np


# Read the file (assumed to hold a single line of data), replace the
# newline by "", and split the string on commas
with open("Data.txt", "r") as f:
    attributes = f.read().replace("\n", "").split(",")

# convert every element from string to float
attributes = np.array([float(element) for element in attributes])

# 20 random values in [-1,1)
w1 = 2 * np.random.random(20) - 1

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

outputs = sigmoid(np.dot(attributes, w1))

# Printing results
print(attributes)
print(w1)
print(outputs)
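Joining everything with replace("\n", "") only works when the file holds a single line; with several lines, the last value of one line would run into the first value of the next. As an alternative that none of the answers here use, NumPy's np.loadtxt can parse comma-separated rows directly. A minimal sketch, with made-up rows in a string standing in for Data.txt:

```python
import io
import numpy as np

# Two made-up rows standing in for the contents of Data.txt
text = "1,2,3.5\n4,5,6.5\n"

# ndmin=2 keeps the result 2-D even if the file has only one line
attributes = np.loadtxt(io.StringIO(text), delimiter=",", ndmin=2)

# One random weight in [-1, 1) per column
w1 = 2 * np.random.random(attributes.shape[1]) - 1

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# (rows, cols) dot (cols,) gives one output per row
outputs = sigmoid(np.dot(attributes, w1))

print(attributes.shape)  # (2, 3)
print(outputs.shape)     # (2,)
```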
