Fetching the first few characters of a column from a csv file using pandas

I have a csv file which contains some data; a sample of it is shown here.

[image: sample of the csv data, with the columns ID, Quantity and Max value]

  • I need to fetch the first two characters of the 'ID' column as output, for the rows where Quantity = 10 and the max value is greater than 40 (the number can be taken from the first two characters of the 'Max value' column).

So, the output should be:

02
04

I have tried these solutions so far:

code:

var1 = data.loc[(data["Quantity"] == 10) & (data["Max value"].str[:2] == 40)]

var2 = (var1["ID"].str[:2])

print(var2)

output:

Empty DataFrame
Columns: [ID, Quantity, Max value]
Index: []
  • I thought this happened because the column name contains a space character, so I tried another method:

code:

var1 = data.loc[(data.Quantity == 10) & (data.Max value.str[:2] > 40)].ID.str[:2]

var2 = (var1.ID.str[:2])

print(var2)

output:

same output
  • Let's change the column name (method 3):

code:

data.rename(columns = {'Max value':'MaxValue'}, inplace = True)

var1 = data.loc[(data["Quantity"] == 10) & (data["Max value"].str[:2] > 40)]

var2 = (var1["ID"].str[:2])

print(var2)

output:

Series([], Name: ID, dtype: object)
  • The data exists but nothing shows up. By the way, I have also tried the same code without ".loc".
  • Any thoughts?

This does the job:

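import pandas as pd

df = pd.read_csv(***csv file path***)

# Numeric part of "Max value": its first two characters, converted to int
df["Max value num"] = [int(max_val[:2]) for max_val in df["Max value"]]
desired_data = df[(df["Quantity"] == 10) & (df["Max value num"] >= 40)]
# First two characters of each matching ID, collected in a list
desired_data = [id[:2] for id in desired_data["ID"]]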

This stores the first 2 characters in a list.
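The int(...) conversion above is probably the step that matters: the slice of "Max value" is still text, and text never compares equal to the number 40, which would explain the empty result of the first attempt. A small sketch of the difference, assuming made-up values with a "kg" suffix (the real data was only posted as an image):

```python
import pandas as pd

# Hypothetical values; the "kg" suffix is an assumption for illustration.
# dtype=object mirrors how older pandas versions store text read from a csv.
max_value = pd.Series(["45kg", "50kg", "30kg"], dtype=object)

# Slicing a string column yields strings such as "45", not numbers
as_text = max_value.str[:2]
# Comparing strings with an int for equality is False for every row
print((as_text == 40).any())

# After converting to int, the numeric comparison behaves as intended
as_number = as_text.astype(int)
print((as_number > 40).tolist())
```

The names max_value, as_text and as_number are illustrative only.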


If you want to print them out like 02 04, then use this:

df = pd.read_csv(***csv file path***)

df["Max value num"] = [int(max_val[:2]) for max_val in df["Max value"]]
desired_data = df[(df["Quantity"] == 10) & (df["Max value num"] >= 40)]

output = ""
for id in desired_data["ID"]:
  output += f"{id[:2]} "

# Drop the trailing space and print the result
print(output.strip(" "))

In both snippets above, I have added a Max value num column that stores the numeric part of the values in Max value.
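The same filter can also be written with pandas string methods instead of list comprehensions, without adding a helper column. This is only a sketch: the rows below are invented so that the result matches the expected 02 and 04, and the "kg" suffix and ID format are assumptions.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the csv file; the real rows
# were only shown as an image, so these values are assumptions.
csv_text = """ID,Quantity,Max value
01ABC,5,45kg
02DEF,10,50kg
03GHI,10,30kg
04JKL,10,42kg
"""

# Read ID and "Max value" as text so leading zeros and suffixes are kept
df = pd.read_csv(io.StringIO(csv_text), dtype={"ID": str, "Max value": str})

# First two characters of "Max value", converted to integers
max_num = df["Max value"].str[:2].astype(int)

# Rows with Quantity equal to 10 and a max value greater than 40
mask = (df["Quantity"] == 10) & (max_num > 40)

# First two characters of the ID for the matching rows
result = df.loc[mask, "ID"].str[:2]
print(" ".join(result))
```

This version uses > 40, following the wording of the question; swap in >= 40 to match the snippets above.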
