
Indexing by Character in a Column of a Python/Pandas DataFrame

I am working on a project in which I scraped NBA data from ESPN and created a DataFrame to store it. One of the columns of my DataFrame is Team. Players who were traded during a season have a value such as LAL/LAC under Team, rather than a single team name like LAL. For these rows, I would like to make 2 entries instead of one. Both entries would have the same original data, except that the team name would be LAL in one entry and LAC in the other. Some team abbreviations are 2 letters while others are 3 letters.

I have already managed to create a separate DataFrame with just the rows that have values in the form team1/team2. I figured a good way of getting the data the way I want it would be to first copy this DataFrame, then in one copy keep everything in the Team column up to the /, and in the other copy keep everything in the Team column after the slash. I'm not quite sure what the code would be for this in the context of a DataFrame. I tried the following, but it fails:

first_team =  first_team['TEAM'].str[:first_team[first_team['TEAM'].index("/")]]

where first_team is my DataFrame with just the entries with multiple teams. Perhaps this can give you a better idea of what I'm trying to accomplish!

Thanks in advance!

You're probably better off using split first to separate the teams into columns (also see "Pandas DataFrame, how do i split a column into two"), something like this:

import pandas as pd

d = pd.DataFrame({'player': ['jordan', 'johnson'], 'team': ['LAL/LAC', 'LAC']})
pd.concat([d, pd.DataFrame(d.team.str.split('/').tolist(), columns=['team1', 'team2'])], axis=1)

    player     team team1 team2
0   jordan  LAL/LAC   LAL   LAC
1  johnson      LAC   LAC  None
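
As a possible variation on the snippet above, `str.split` accepts `expand=True`, which returns the pieces as DataFrame columns directly, so the `tolist()` round trip is not needed. This is only a sketch: the name `teams` is made up for illustration, and assigning two column names assumes at least one row contains a slash and no row contains more than one.

```python
import pandas as pd

d = pd.DataFrame({'player': ['jordan', 'johnson'], 'team': ['LAL/LAC', 'LAC']})

# expand=True returns one column per split piece; rows with fewer
# pieces are padded with missing values.
teams = d['team'].str.split('/', expand=True)

# Assumption: the split produced exactly two columns (at most one '/' per row).
teams.columns = ['team1', 'team2']

result = pd.concat([d, teams], axis=1)
print(result)
```

Because the split works on the separator rather than on a character position, it does not matter whether an abbreviation is 2 or 3 letters long.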

Then, if you want separate rows, you can stack the per-team DataFrames with pd.concat (the older DataFrame.append was removed in pandas 2.0).
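
One way the separate-rows step might look, following the approach described in the question (one copy keeps the part before the slash, the other keeps the part after it). This is a rough sketch rather than a definitive solution: the names `traded`, `single`, `parts`, `second_team` and `rows` are hypothetical, the column is called `team` as in the example above, and it assumes a traded player has exactly two teams.

```python
import pandas as pd

d = pd.DataFrame({'player': ['jordan', 'johnson'], 'team': ['LAL/LAC', 'LAC']})

# Separate the rows with a slash from the rows with a single team.
has_slash = d['team'].str.contains('/', regex=False)
traded = d[has_slash]
single = d[~has_slash]

# Split once, then take the piece before and the piece after the slash.
# Assumption: every traded row has exactly two teams.
parts = traded['team'].str.split('/')
first_team = traded.assign(team=parts.str[0])
second_team = traded.assign(team=parts.str[1])

# Stack everything back into one DataFrame with a fresh index.
rows = pd.concat([single, first_team, second_team], ignore_index=True)
print(rows)
```

If a player can appear with more than two teams, splitting into a list column and calling `DataFrame.explode` (available since pandas 0.25) may be a better fit, for example `d.assign(team=d['team'].str.split('/')).explode('team')`.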


 