How to loop through each row in a column in a pandas dataframe
I have an Excel file like the one below with a set of names and their Google Scholar links.
ID Name Link
1 A www.abc.com
2 B www.def.com
3 C www.ghi.com
I have written code that reads the Excel file, visits each link in a for loop, scrapes information from each link inside the loop, and writes it to a new file. The code is as follows.
File = []
for i in arr:
    driver.get(i)
    columns = {}
    columns['Name'] = driver.find_element_by_id()
    columns['Citations'] = driver.find_element_by_id()
    File.append(columns)
My question: I want to include an 'ID' column in my new file that matches the 'ID' column in my Excel file. Essentially, I want the first row of the 'ID' column in the first iteration of the for loop, the second row of the 'ID' column in the second iteration, and so on. Can someone please help? Thanks!
Instead of saving them as a dictionary, save each result as a DataFrame and assign a new column, called source, that identifies the row it came from:
File = []
for i in arr:
    driver.get(i)
    columns = {}
    columns['Name'] = driver.find_element_by_id()
    columns['Citations'] = driver.find_element_by_id()
    # Wrap the dict in a list so pandas builds a one-row DataFrame from scalar values
    File.append(pd.DataFrame([columns]).assign(source=i))
To get a single DataFrame out of it, you can then use:
pd.concat(File)
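If the goal is to carry the 'ID' column itself into the new file, one option is to loop over 'ID' and 'Link' together. The following is a minimal sketch, assuming the spreadsheet has been loaded into a DataFrame named df (built inline here in place of pd.read_excel). The scrape function is a hypothetical stand-in for the driver.get and driver.find_element_by_id calls, and its return values are placeholders, not real data.

```python
import pandas as pd

# Stand-in for the Excel file; the values mirror the table in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["A", "B", "C"],
    "Link": ["www.abc.com", "www.def.com", "www.ghi.com"],
})


def scrape(link):
    """Hypothetical placeholder for the Selenium scraping step."""
    return {"Name": "name from " + link, "Citations": 0}


rows = []
# zip pairs each ID with its link, so iteration n sees row n of both columns.
for row_id, link in zip(df["ID"], df["Link"]):
    record = scrape(link)
    record["ID"] = row_id  # carry the spreadsheet ID into the output row
    rows.append(record)

# Build one DataFrame from the list of dicts, with ID as the first column.
result = pd.DataFrame(rows, columns=["ID", "Name", "Citations"])
```

Because every record is a plain dict, a single pd.DataFrame call at the end is enough and no pd.concat is needed. The result can then be written out with result.to_excel or result.to_csv.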