Efficient way to loop with an if statement

Question

I have sample data that looks like this (the real dataset has more columns):

import pandas as pd

data = {'stringID':['AB CD Efdadasfd','RFDS EDSfdsadf dsa','FDSADFDSADFFDSA'],'IDct':[1,3,4]}
data = pd.DataFrame(data)
data['Index1'] = [[3,6],[7,9],[5,6]]
data['Index2'] = [[4,8],[10,13],[8,9]]

What I want to achieve is to slice the stringID column using the second element of Index1 as the start and the second element of Index2 as the end (both columns hold lists), but only when the IDct value is greater than 1; otherwise the result should be NaN.

I tried the following, which produces the Output1 column as intended, but there must be a better way to do it (by better I mean faster when applied to a large dataset). Please advise, thanks!

data['pos'] = data.Index1.map(lambda x: x[1])
data['pos1'] = data.Index2.map(lambda x: x[1])

def cal(m):
    if m['IDct'] > 1:
        return m['stringID'][m['pos']:m['pos1']]
    else:
        return float('nan')  # a real NaN rather than the string 'NaN'

data['Output1'] = data.apply(cal,axis=1)

Answer 1

I love pandas, but realistically speaking it's just one of many tools that belong in your tool belt.

pandas and numpy really shine for computation and analysis. It's okay to use pandas to visualize and analyze your data, but that doesn't mean it's the right tool for every job.

This kind of problem is better suited to regular Python. Assuming we can, let's move StringID and IDct out of the dict and back into lists. If we assume the data is regular in shape (all lists are of equal length), we can walk through the lists together with zip:

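StringID = ['AB CD Efdadasfd','RFDS EDSfdsadf dsa','FDSADFDSADFFDSA']
IDct = [1,3,4]
Index1 = [[3,6],[7,9],[5,6]]
Index2 = [[4,8],[10,13],[8,9]]

# Create the result list once, outside the loop, so it is not reset on every iteration.
result = []
# Use distinct loop-variable names so the lists being iterated are not overwritten.
for s, ct, i1, i2 in zip(StringID, IDct, Index1, Index2):
    if ct > 1:
        # Slice from the second element of Index1 to the second element of Index2.
        result.append(s[i1[1]:i2[1]])
    else:
        result.append(None)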
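
The same idea can also be sketched as a list comprehension. This is only an illustration under assumptions: the helper name `slice_if` and the lowercase list names are hypothetical, chosen here so the loop variables cannot clash with the input lists, and whether this beats `apply` on a large dataset is something to measure rather than assume.

```python
def slice_if(s, ct, i1, i2):
    """Hypothetical helper: slice s between the second elements of i1 and i2
    when ct is greater than 1, otherwise return None."""
    return s[i1[1]:i2[1]] if ct > 1 else None


# Sample data from the question, held as plain lists of equal length.
strings = ['AB CD Efdadasfd', 'RFDS EDSfdsadf dsa', 'FDSADFDSADFFDSA']
counts = [1, 3, 4]
index1 = [[3, 6], [7, 9], [5, 6]]
index2 = [[4, 8], [10, 13], [8, 9]]

# zip walks the four lists in lockstep, yielding one row's values at a time.
result = [
    slice_if(s, ct, i1, i2)
    for s, ct, i1, i2 in zip(strings, counts, index1, index2)
]

print(result)  # [None, 'dsad', 'DSA']
```

The first entry is None because its IDct value is 1; the other two are the slices [9:13] and [6:9] of their strings.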

You can then blend the result data back in as you see fit.

import pandas as pd

data = {
    'StringID': StringID,
    'IDct': IDct,
    'Index1': Index1,
    'Index2': Index2,
    'Result': result
}

pd.DataFrame(data)

Solution 1
1 Accepted 2020-09-24 20:56:06