Python Pandas CSV: converting Int64 to object and selecting the right row via input
I am new to Python and pandas, and I am struggling with converting dtype values read from my CSV file.
I wrote a simple example to understand the problem, but I cannot see what is wrong or why it is not working. Please see below.
I have a CSV table with 3 columns. For A and B the dtype is int64; for C it is object. If I pass dtype=str, the dtype changes from int64 to object.
My code is like this:
import pandas as pd

data_Cisla = pd.read_csv("Cisla.csv", sep=";", dtype=str)
print(data_Cisla.dtypes)
print(data_Cisla)

def cisla():
    vstup = input("Input value ")
    print(vstup, type(vstup))
    print(data_Cisla.loc[vstup])
When I also pass index_col="C" and call cisla(), it works. The program asks me for an input from column C, so I enter, for example, text_2, and it gives me the output (C) text_2, (A) 2, (B) 20. This is what I am looking for, but with column A as the index_col.
But if I do the same thing with index_col="A" and enter 20 when the program asks for the input value, it does not work and gives me an error.
What I don't understand is that when I print data_Cisla.dtypes at each step, it always says that all columns are object, so what is the difference? Why does it work for column C and not for column A?
The final code looks like this:
import pandas as pd

data_Cisla = pd.read_csv("Cisla.csv", sep=";", dtype=str, index_col="C")

def cisla():
    vstup = input("Input value ")
    print(data_Cisla.loc[vstup])

cisla()
Thank you for helping me.
The reason for the observed behavior is that column 'C' is your index. I do not know why, because it is not in your code. My solution:
import pandas as pd

# build test data
data_Cisla = [[1, 10, 'text_1'],
              [2, 20, 'text_2']]
data_Cisla = pd.DataFrame.from_records(data=data_Cisla, columns=['A', 'B', 'C'])
data_Cisla = data_Cisla.reset_index()

def cisla(data_Cisla: pd.DataFrame, col: str, vstup: str):
    # Do not change data_Cisla, just make sure vstup is in the right format (str or float)
    try:
        vstup = float(vstup)
    except ValueError:
        pass
    mask = data_Cisla[col] == vstup
    return data_Cisla[mask]
It will produce the following result:
cisla(data_Cisla, 'C', 'text_1')  # -> 1 | 10 | text_1
cisla(data_Cisla, 'A', '1')       # -> 1 | 10 | text_1
cisla(data_Cisla, 'A', 1)         # -> 1 | 10 | text_1
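If you would rather keep the original .loc lookup, here is a minimal sketch, assuming the error comes from the dtype of the index. data_Cisla.dtypes lists only the columns, not the index, so it cannot tell you what the index holds. As far as I know, some pandas versions do not apply dtype=str to the column named in index_col, so an index built from column A may still contain integers, while input() always returns a str. Casting the index to str makes the two match. The inline CSV_TEXT and the helper names load_cisla and lookup are made up for illustration; they stand in for Cisla.csv and the original cisla().

```python
import io

import pandas as pd

# Hypothetical stand-in for Cisla.csv (assumed layout: A;B;C with ';' separator)
CSV_TEXT = "A;B;C\n1;10;text_1\n2;20;text_2\n"


def load_cisla(index_col: str) -> pd.DataFrame:
    """Read the sample CSV and force the index labels to be strings."""
    df = pd.read_csv(io.StringIO(CSV_TEXT), sep=";", dtype=str, index_col=index_col)
    # .dtypes does not report the index, so inspect it separately
    print("index dtype before cast:", df.index.dtype)
    # Make the labels comparable with what input() returns (always str)
    df.index = df.index.astype(str)
    return df


def lookup(df: pd.DataFrame, vstup: str) -> pd.Series:
    """Return the row whose index label equals the typed value."""
    return df.loc[vstup]


by_a = load_cisla("A")
row = lookup(by_a, "2")  # in the real script: lookup(by_a, input("Input value "))
print(row)
```

With column A as the index, the value to type is one that actually appears in A (1 or 2 in this sample); 20 belongs to column B and would still raise a KeyError.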