Pandas: Splitting Non-Numeric Identifier Codes into Multiple Rows

Question

Suppose I have a data set that looks like this:

Unique_Identifier  Score1 Score2
112                   50     60 
113-114               50     70 
115                   40     20 
116-117               30     90 
118                   70     70

Notice how some of my unique identifiers are listed as ranges rather than exact values. I want to split each of those ranges into separate rows with the same scores, so that it would look like this:

Unique_Identifier  Score1 Score2
112                   50     60 
113                   50     70
114                   50     70
115                   40     20 
116                   30     90
117                   30     90 
118                   70     70

How would I go about doing this in Python using Pandas? I think there may be a way to test for rows that have a "-" in them, but I'm not sure how I would go about splitting those rows. I should also note that some identifier ranges have more than just 2 identifiers in them, such as 120-124.

Answer 1

df.assign(Unique_Identifier=df.Unique_Identifier.str.split('-')).explode('Unique_Identifier')

  Unique_Identifier  Score1  Score2
0               112      50      60
1               113      50      70
1               114      50      70
2               115      40      20
3               116      30      90
3               117      30      90
4               118      70      70
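
One caveat, given the question's note about wider ranges such as 120-124: splitting on "-" and exploding keeps only the two endpoints of each range, so the values in between are not generated. The sketch below illustrates this, assuming the identifiers are stored as strings; the 120-124 row (scores 80/80) is taken from the sample output in Answer 2 and is not part of the question's original table.

```python
import pandas as pd

# Question's sample data plus the wider 120-124 range (an assumed extra row).
df = pd.DataFrame({
    "Unique_Identifier": ["112", "113-114", "115", "116-117", "118", "120-124"],
    "Score1": [50, 50, 40, 30, 70, 80],
    "Score2": [60, 70, 20, 90, 70, 80],
})

# Split each identifier on "-" and give every piece its own row.
endpoints = (
    df.assign(Unique_Identifier=df["Unique_Identifier"].str.split("-"))
      .explode("Unique_Identifier")
)

# 120-124 contributes only "120" and "124"; 121-123 are missing.
print(endpoints["Unique_Identifier"].tolist())
```

For two-value ranges like 113-114 this is enough; for longer ranges the list has to be built with range() before exploding.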

Answer 2

Split on "-" and build a list covering the desired range, then explode it into individual rows:

df["Unique_Identifier"] = df["Unique_Identifier"].apply(lambda x: list(range(int(x.split("-")[0]),int(x.split("-")[1])+1)) if "-" in x else [int(x)])
df = df.explode("Unique_Identifier")

>>> df
  Unique_Identifier  Score1  Score2
0               112      50      60
1               113      50      70
1               114      50      70
2               115      40      20
3               116      30      90
3               117      30      90
4               118      70      70
5               120      80      80
5               121      80      80
5               122      80      80
5               123      80      80
5               124      80      80
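
The same idea can be written with a small named helper, which may be easier to read and test than the one-line lambda. This is only a sketch: expand_identifier is a hypothetical name, the sample frame is rebuilt here so the snippet runs on its own, and it assumes every identifier is a string holding either one non-negative integer or a "start-end" pair.

```python
import pandas as pd

def expand_identifier(value: str) -> list:
    """Turn "113-114" into [113, 114] and "112" into [112]."""
    start, _, end = value.partition("-")
    # Without a "-", partition leaves `end` empty, so reuse `start`.
    return list(range(int(start), int(end or start) + 1))

# Sample data matching the output above (assumed to be string identifiers).
df = pd.DataFrame({
    "Unique_Identifier": ["112", "113-114", "115", "116-117", "118", "120-124"],
    "Score1": [50, 50, 40, 30, 70, 80],
    "Score2": [60, 70, 20, 90, 70, 80],
})

expanded = (
    df.assign(Unique_Identifier=df["Unique_Identifier"].map(expand_identifier))
      .explode("Unique_Identifier")
      .astype({"Unique_Identifier": "int64"})  # explode leaves object dtype
      .reset_index(drop=True)                  # drop the repeated index labels
)
print(expanded)
```

Resetting the index is optional; without it, the exploded rows keep the index label of the row they came from, as in the output above.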

2 answers

Answer 1
1 2022-04-27 21:01:19

Answer 2
0 2022-04-27 19:56:55
