Get values from between two other values for each row in the dataframe

I want to extract the integer values between the From and To values (inclusive) for each Hole_ID, and save them to a new data frame with the Hole IDs as the column headers.

import pandas as pd
import numpy as np
df=pd.DataFrame(np.array([['Hole_1',110,117],['Hole_2',220,225],['Hole_3',112,114],['Hole_4',248,252],['Hole_5',116,120],['Hole_6',39,45],['Hole_7',65,72],['Hole_8',79,83]]),columns=['HOLE_ID','FROM', 'TO'])

Example starting data:

  HOLE_ID FROM    TO
0  Hole_1  110   117
1  Hole_2  220   225
2  Hole_3  112   114
3  Hole_4  248   252
4  Hole_5  116   120
5  Hole_6   39    45
6  Hole_7   65    72
7  Hole_8   79    83

This is what I would like:

Out[5]:
  Hole_1 Hole_2 Hole_3 Hole_4 Hole_5 Hole_6 Hole_7 Hole_8
0    110    220    112    248    116     39     65     79
1    111    221    113    249    117     40     66     80
2    112    222    114    250    118     41     67     81
3    113    223    NaN    251    119     42     68     82
4    114    224    NaN    252    120     43     69     83
5    115    225    NaN    NaN    NaN     44     70    NaN
6    116    NaN    NaN    NaN    NaN     45     71    NaN
7    117    NaN    NaN    NaN    NaN    NaN     72    NaN

I have tried to use the range function, which works if I define the range manually:

df2 = pd.DataFrame()
for i in df['HOLE_ID']:
    df2[i] = range(1, 10)

gives

   Hole_1  Hole_2  Hole_3  Hole_4  Hole_5  Hole_6  Hole_7  Hole_8
0       1       1       1       1       1       1       1       1
1       2       2       2       2       2       2       2       2
2       3       3       3       3       3       3       3       3
3       4       4       4       4       4       4       4       4
4       5       5       5       5       5       5       5       5
5       6       6       6       6       6       6       6       6
6       7       7       7       7       7       7       7       7
7       8       8       8       8       8       8       8       8
8       9       9       9       9       9       9       9       9

but this does not use the To and From values from df as the inputs to the range.

df2=pd.DataFrame()
for i in df['HOLE_ID']:
    df2[i]=range(df['To'],df['From'])

gives an error.

Apply a function that returns a Series covering the range from FROM to TO (inclusive) for each row, then transpose the result, e.g.:

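import numpy as np
import pandas as pd

# FROM/TO are stored as strings in the question's frame, so convert them to integers first
bounds = df.set_index('HOLE_ID')[['FROM', 'TO']].astype(int)
bounds.apply(lambda v: pd.Series(np.arange(v['FROM'], v['TO'] + 1)), axis=1).T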

Gives you:

HOLE_ID  Hole_1  Hole_2  Hole_3  Hole_4  Hole_5  Hole_6  Hole_7  Hole_8
0         110.0   220.0   112.0   248.0   116.0    39.0    65.0    79.0
1         111.0   221.0   113.0   249.0   117.0    40.0    66.0    80.0
2         112.0   222.0   114.0   250.0   118.0    41.0    67.0    81.0
3         113.0   223.0     NaN   251.0   119.0    42.0    68.0    82.0
4         114.0   224.0     NaN   252.0   120.0    43.0    69.0    83.0
5         115.0   225.0     NaN     NaN     NaN    44.0    70.0     NaN
6         116.0     NaN     NaN     NaN     NaN    45.0    71.0     NaN
7         117.0     NaN     NaN     NaN     NaN     NaN    72.0     NaN
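
The values come back as floats because NaN forces a float dtype. If integer columns are wanted, as in the question's desired output, one option is pandas' nullable `Int64` dtype. The following is only a sketch on a reduced two-hole frame; the sample values and the names `wide` and `wide_int` are made up for illustration and are not part of the answer above.

```python
import numpy as np
import pandas as pd

# Reduced, hypothetical sample with numeric bounds
df = pd.DataFrame({
    'HOLE_ID': ['Hole_1', 'Hole_3'],
    'FROM': [110, 112],
    'TO': [113, 114],
})

# Same apply-and-transpose approach as above; shorter holes are padded with NaN
wide = (df.set_index('HOLE_ID')
          .apply(lambda v: pd.Series(np.arange(v['FROM'], v['TO'] + 1)), axis=1)
          .T)

# Nullable integer dtype: whole numbers stay integers, NaN becomes <NA>
wide_int = wide.astype('Int64')
print(wide_int)
```

Note that `Int64` (capital I) is the nullable extension dtype; plain `int64` cannot hold missing values, so the cast would fail on the padded cells.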

Let's try:

df[['FROM','TO']] = df[['FROM', 'TO']].apply(pd.to_numeric)
dfe = df.set_index('HOLE_ID').apply(lambda x: np.arange(x['FROM'], x['TO']+1), axis=1).explode().to_frame()
dfe.set_index(dfe.groupby(level=0).cumcount(), append=True)[0].unstack(0)

Output:

HOLE_ID Hole_1 Hole_2 Hole_3 Hole_4 Hole_5 Hole_6 Hole_7 Hole_8
0          110    220    112    248    116     39     65     79
1          111    221    113    249    117     40     66     80
2          112    222    114    250    118     41     67     81
3          113    223    NaN    251    119     42     68     82
4          114    224    NaN    252    120     43     69     83
5          115    225    NaN    NaN    NaN     44     70    NaN
6          116    NaN    NaN    NaN    NaN     45     71    NaN
7          117    NaN    NaN    NaN    NaN    NaN     72    NaN
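
To make the chain above easier to follow, here is the same idea split into named steps. This is a sketch on a reduced two-hole frame with numeric bounds; the sample values and the variable names (`ranges`, `exploded`, `position`, `wide`) and the `depth` column label are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Reduced, hypothetical sample with numeric bounds
df = pd.DataFrame({
    'HOLE_ID': ['Hole_1', 'Hole_3'],
    'FROM': [110, 112],
    'TO': [113, 114],
})

# Step 1: one array of values per hole (a Series of arrays indexed by HOLE_ID)
ranges = df.set_index('HOLE_ID').apply(
    lambda x: np.arange(x['FROM'], x['TO'] + 1), axis=1)

# Step 2: one row per (hole, value); the HOLE_ID index label is repeated
exploded = ranges.explode()

# Step 3: running position 0, 1, 2, ... within each hole
position = exploded.groupby(level=0).cumcount()

# Step 4: add the position as a second index level, then move HOLE_ID to the columns
wide = (exploded.to_frame('depth')
        .set_index(position, append=True)['depth']
        .unstack(0))
print(wide)
```

The `cumcount` step is what gives every value a row number within its hole, so that `unstack` can line the holes up side by side and leave NaN where a hole has fewer values.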

Here is another way, which builds a range from the two columns and creates a DataFrame from the result:

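# Build one list of integers per row, then use HOLE_ID as the index and transpose
out = (pd.DataFrame(df[['FROM', 'TO']].astype(int).agg(tuple, axis=1)
       .map(lambda x: list(range(x[0], x[1] + 1))).tolist(), index=df['HOLE_ID']).T)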

HOLE_ID  Hole_1  Hole_2  Hole_3  Hole_4  Hole_5  Hole_6  Hole_7  Hole_8
0         110.0   220.0   112.0   248.0   116.0    39.0    65.0    79.0
1         111.0   221.0   113.0   249.0   117.0    40.0    66.0    80.0
2         112.0   222.0   114.0   250.0   118.0    41.0    67.0    81.0
3         113.0   223.0     NaN   251.0   119.0    42.0    68.0    82.0
4         114.0   224.0     NaN   252.0   120.0    43.0    69.0    83.0
5         115.0   225.0     NaN     NaN     NaN    44.0    70.0     NaN
6         116.0     NaN     NaN     NaN     NaN    45.0    71.0     NaN
7         117.0     NaN     NaN     NaN     NaN     NaN    72.0     NaN
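
For comparison, a version closer to the loop attempted in the question is to build one Series per hole and let the DataFrame constructor align them. This is only a sketch: the reduced sample frame (with string bounds, as in the question) and the names `columns` and `out` are assumptions for illustration.

```python
import pandas as pd

# Reduced, hypothetical sample; bounds are strings, as in the question's frame
df = pd.DataFrame({
    'HOLE_ID': ['Hole_1', 'Hole_3'],
    'FROM': ['110', '112'],
    'TO': ['113', '114'],
})

# One Series per hole; int() turns the string bounds into numbers,
# and stop + 1 makes the upper bound inclusive
columns = {
    hole: pd.Series(range(int(start), int(stop) + 1))
    for hole, start, stop in zip(df['HOLE_ID'], df['FROM'], df['TO'])
}

# Series of different lengths are aligned on their index and padded with NaN
out = pd.DataFrame(columns)
print(out)
```

The question's loop failed because `range` needs two scalar integers, while `df['To']` would be a whole column (and the columns are actually named `FROM` and `TO`); iterating with `zip` passes one pair of bounds at a time.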
