
How to match values from one column to a second column with multiple values

I have a dataframe:

Name  Dept
abc   Genteic|Biology|Chemical Engineering
def   Physics|Chemical Engineering|Astrophysics
xyz   Chemical Engineering|Astrophysics
klm   Biology|Astrophysics
nop   Chemical Engineering|Astrophysics

The first column contains the name and the second column lists the departments each person is associated with. I want to know the number of people working in each department, for example, how many people are associated with the Biology department. The code I have so far is:

import csv
import json

import pandas as pd
import requests
from requests.exceptions import ConnectionError, ReadTimeout


def author_name(term):
    response = get_url(term)
    return response


def get_url(term):
    print(term)
    # NOTE: the request that creates `resp` is not shown in the post.
    response = resp.content
    data = json.loads(response)
    print(data)

    try:
        subject_area = data['author-retrieval-response']['subject-areas']['subject-area']
        if subject_area != 'null':
            # Split the pipe-separated abbreviations into a list.
            return subject_area['@abbrev'].split('|')
        print(subject_area)
    except (KeyError, TypeError):
        pass
    return []


if __name__ == '__main__':
    header = ['Scopus ID', 'Title', 'Abstract', 'Affilaition', 'Authors',
              'Citation', 'Pub_Date']

    dataframe = pd.read_csv('author.csv', usecols=['auth_name'])
    with open('out.csv', 'w', encoding='utf-8', newline='') as out:
        csvwriter = csv.writer(out)
        for i, row in dataframe.iterrows():
            term = str(row['auth_name'])
            response = author_name(term)
            csvwriter.writerow(response)
Any help will be greatly appreciated. Thanks!

I wrote a very simple Python script that does what I think you want. I ignored the fact that the input file is a CSV file and that libraries exist for parsing it. The following is just a quick-and-dirty solution to point you in the right direction. I would recommend improving this snippet:

  • use a CSV library to process the file
  • use a dictionary for your counters (edit: already done)
  • get rid of the string comparison, e.g. use the subject as the dictionary key (edit: already done)
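The first suggestion could look roughly like the sketch below, which uses the standard csv module together with collections.Counter so the departments do not have to be listed in advance. It is only an illustration under assumptions: the sample data is embedded as a comma-separated string with a Name,Dept header (the file in this answer is whitespace-separated with no header), and count_departments is a made-up helper name.

```python
import csv
import io
from collections import Counter

# Assumption: a comma-separated variant of the sample data with a header row.
SAMPLE = """Name,Dept
abc,Genteic|Biology|Chemical Engineering
def,Physics|Chemical Engineering|Astrophysics
xyz,Chemical Engineering|Astrophysics
klm,Biology|Astrophysics
nop,Chemical Engineering|Astrophysics
"""


def count_departments(csv_text):
    """Count how many rows mention each pipe-separated department."""
    counts = Counter()
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Each Dept cell holds several departments joined by "|".
        counts.update(row["Dept"].split("|"))
    return counts


counts = count_departments(SAMPLE)
for dept, n in counts.items():
    print("There are %s teachers working in %s" % (n, dept))
```

With a real file, the io.StringIO wrapper would be replaced by an open file handle passed to csv.DictReader.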

input.csv

abc   Genteic|Biology|Chemical Engineering
def   Physics|Chemical Engineering|Astrophysics
xyz   Chemical Engineering|Astrophysics
klm   Biology|Astrophysics
nop   Chemical Engineering|Astrophysics

main.py

# One counter per department, keyed by the department name.
counters = {"Biology": 0, "Genteic": 0, "Chemical Engineering": 0, "Physics": 0, "Astrophysics": 0}

with open("input.csv", "r") as csv_file:
    for line in csv_file.read().splitlines():
        # The name and the department list are separated by whitespace.
        name, professions = line.split(maxsplit=1)
        for subj in professions.split("|"):
            counters[subj] += 1

# Print the counters in the order they were defined above.
for subj, count in counters.items():
    print("There are %s teachers working in %s" % (count, subj))

Running python3 main.py prints:

There are 2 teachers working in Biology
There are 1 teachers working in Genteic
There are 4 teachers working in Chemical Engineering
There are 1 teachers working in Physics
There are 4 teachers working in Astrophysics
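Since the question starts from a pandas dataframe, the same counts can probably be obtained without a manual loop. The sketch below is an alternative to the script above, not part of it: it rebuilds the sample data inline (the column names Name and Dept are taken from the question), splits each Dept cell on "|", puts every department on its own row with explode, and counts the occurrences. It assumes a pandas version that provides Series.explode (0.25 or later).

```python
import pandas as pd

# Sample data from the question, rebuilt inline for illustration.
df = pd.DataFrame({
    "Name": ["abc", "def", "xyz", "klm", "nop"],
    "Dept": [
        "Genteic|Biology|Chemical Engineering",
        "Physics|Chemical Engineering|Astrophysics",
        "Chemical Engineering|Astrophysics",
        "Biology|Astrophysics",
        "Chemical Engineering|Astrophysics",
    ],
})

# Split each cell into a list, give every department its own row, then count.
dept_counts = df["Dept"].str.split("|").explode().value_counts()

print(dept_counts)
print("Biology:", dept_counts["Biology"])
```

dept_counts is a Series indexed by department name, so a single department can be looked up directly, as in the last line.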


