How can I iterate through each row of a pandas dataframe, then conditionally set a new value in that row?
How can I set the value for a specific row for a Pandas DataFrame in a for loop?
for petid in X['PetID']:
    sentiment_file = datapath + '/train_sentiment/' + petid + '.json'
    if os.path.isfile(sentiment_file):
        json_data = json.loads(open(sentiment_file).read())
        X['DescriptionLanguage'] = json_data['language']
        X['DescriptionMagnitude'] = json_data['documentSentiment']['magnitude']
        X['DescriptionScore'] = json_data['documentSentiment']['score']
        # print(petid, sentiment_file,
        #       json_data['documentSentiment']['magnitude'])
    else:
        X['DescriptionLanguage'] = 'Unknown'
        X['DescriptionMagnitude'] = 0
        X['DescriptionScore'] = 0
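This is what I have, but it doesn't work. It sets every row to the same DescriptionLanguage, DescriptionMagnitude and DescriptionScore, because each assignment overwrites the whole column rather than the current row.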
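The behaviour described in the question follows from how pandas treats a scalar assigned to a column: the scalar is broadcast to every row. A minimal sketch, assuming made-up PetID values that are not from the original data:

```python
import pandas as pd

# Hypothetical PetID values, for illustration only.
X = pd.DataFrame({'PetID': ['a1', 'b2', 'c3']})

for petid in X['PetID']:
    # Assigning a scalar to X[column] writes it into every row,
    # so each iteration overwrites the result of the previous one.
    X['DescriptionLanguage'] = 'lang-for-' + petid

# Every row now holds the value from the last iteration.
print(X['DescriptionLanguage'].tolist())
```

After the loop, all three rows contain the value computed for the last PetID, which matches the symptom in the question.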
You can use .loc to set a single value instead of an entire column. Here is a self-contained example:
import pandas as pd
import numpy as np

X = pd.DataFrame(np.arange(5), columns=['PetID'])
for ind, row in X.iterrows():
    petid = row['PetID']
    # .loc[row label, column label] writes to one cell only
    X.loc[ind, 'DescriptionLanguage'] = 'No description for {}'.format(petid)
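Applied to the loop from the question, the same idea might look like the sketch below. The helper name `add_sentiment`, the temporary directory and the single sample JSON file are assumptions made only so that the example runs on its own; the JSON keys are the ones used in the question. The sketch also fills the columns with the defaults first, so the loop only has to overwrite rows that have a file.

```python
import json
import os
import tempfile

import pandas as pd


def add_sentiment(X, datapath):
    """Hypothetical helper: fill the Description* columns one row at a time."""
    # Start every row at the defaults from the question's else-branch.
    X['DescriptionLanguage'] = 'Unknown'
    X['DescriptionMagnitude'] = 0.0
    X['DescriptionScore'] = 0.0

    for ind, row in X.iterrows():
        sentiment_file = os.path.join(datapath, 'train_sentiment',
                                      row['PetID'] + '.json')
        if os.path.isfile(sentiment_file):
            with open(sentiment_file) as f:
                json_data = json.load(f)
            # .loc[ind, column] writes to this row only.
            X.loc[ind, 'DescriptionLanguage'] = json_data['language']
            X.loc[ind, 'DescriptionMagnitude'] = json_data['documentSentiment']['magnitude']
            X.loc[ind, 'DescriptionScore'] = json_data['documentSentiment']['score']
    return X


# Demo with one made-up sentiment file; 'pet2' deliberately has none.
with tempfile.TemporaryDirectory() as datapath:
    os.makedirs(os.path.join(datapath, 'train_sentiment'))
    sample = {'language': 'en',
              'documentSentiment': {'magnitude': 2.4, 'score': 0.7}}
    with open(os.path.join(datapath, 'train_sentiment', 'pet1.json'), 'w') as f:
        json.dump(sample, f)

    X = add_sentiment(pd.DataFrame({'PetID': ['pet1', 'pet2']}), datapath)

print(X)
```

Here the row for `pet1` gets the values from its file, while `pet2` keeps the defaults.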
In addition to @Heikki Pulkkinen's excellent answer, you can also index individual columns of the DataFrame, for example:
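import pandas as pd
import numpy as np

data = np.array([np.arange(10)]*4).T
X = pd.DataFrame(data, columns=["PetID", "DescriptionLanguage", "DescriptionMagnitude", "DescriptionScore"])
for i in range(len(X['PetID'])):
    # Note: this is chained indexing. It may emit a warning in some pandas
    # versions and does not modify X when Copy-on-Write is enabled;
    # X.loc[i, 'DescriptionLanguage'] = 10*i is the safer form.
    X['DescriptionLanguage'][i] = 10*i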
...which results in X becoming:
   PetID  DescriptionLanguage  DescriptionMagnitude  DescriptionScore
0      0                    0                     0                 0
1      1                   10                     1                 1
2      2                   20                     2                 2
3      3                   30                     3                 3
4      4                   40                     4                 4
5      5                   50                     5                 5
6      6                   60                     6                 6
7      7                   70                     7                 7
8      8                   80                     8                 8
9      9                   90                     9                 9
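For a computation like the one above, where the new value depends only on the row position, the loop is not strictly needed. The sketch below is an alternative to the answer's loop, not a restatement of it. It produces the same DescriptionLanguage column with whole-column arithmetic:

```python
import numpy as np
import pandas as pd

data = np.array([np.arange(10)] * 4).T
X = pd.DataFrame(data, columns=["PetID", "DescriptionLanguage",
                                "DescriptionMagnitude", "DescriptionScore"])

# Compute all ten values at once instead of writing them row by row.
X["DescriptionLanguage"] = 10 * np.arange(len(X))

print(X["DescriptionLanguage"].tolist())
```

Whether this is applicable depends on the real task: it fits when the new values can be computed for all rows together, and not when each row needs its own file lookup as in the question.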