
Reading data from a data frame based on conditions in a different data frame

I have 2 data frames. I need to read values from one data frame based on the values in the other.

words:

import pandas as pd

words = pd.DataFrame()
words['no'] = [1,2,3,4,5,6,7,8,9]
words['word'] = ['cat', 'in', 'hat', 'the', 'dog', 'in', 'love', '!', '<3']
words

sentences:

sentences = pd.DataFrame()
sentences['no'] = [1, 2, 3]
sentences['start'] = [1, 4, 6]
sentences['stop'] = [3, 5, 9]
sentences

The desired output, written to a text file, is:

cat in hat
***
the dog
***
in love ! <3

But I can't get past this step. I tried running the following code:

for x in sentences:
    print(words['word'][words['no'].between(sentences['start'], sentences['stop'], inclusive = True)

But it fails with this error:

 File "<ipython-input-16-ae3f5333be66>", line 3
    print(words['word'][words['no'].between(sentences['start'], sentences['stop'], inclusive = True)
                                                                                                    ^
SyntaxError: unexpected EOF while parsing

Set no as the index of words, then iterate over sentences with a list comprehension:

v = words.set_index('no')['word']
sentences = [
    ' '.join(v.loc[i:j]) for i, j in zip(sentences['start'], sentences['stop'])
]

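Or, without relying on the index (purely positional):

v = words['word'].tolist()
# start/stop are 1-based and inclusive; list slices are 0-based and exclude the end
sentences = [
    ' '.join(v[i - 1:j]) for i, j in zip(sentences['start'], sentences['stop'])
]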

['cat in hat', 'the dog', 'in love ! <3']

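Saving to a file from here should be straightforward:

with open('file.txt', 'w') as f:
    for k, sent in enumerate(sentences):
        if k:  # the separator goes between sentences, not after the last one
            f.write('***\n')
        f.write(sent + '\n')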
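
Putting the pieces above together, here is a self-contained sketch that can be run from scratch. The name joined is an assumption made here so that the sentences data frame is not overwritten, and using str.join for the separator is just one compact way to avoid a trailing *** line; it is not taken from the answer itself.

```python
import pandas as pd

words = pd.DataFrame({
    'no': [1, 2, 3, 4, 5, 6, 7, 8, 9],
    'word': ['cat', 'in', 'hat', 'the', 'dog', 'in', 'love', '!', '<3'],
})
sentences = pd.DataFrame({
    'no': [1, 2, 3],
    'start': [1, 4, 6],
    'stop': [3, 5, 9],
})

# Index the words by their number; .loc slicing is label-based and
# includes both endpoints, so start/stop can be used directly.
v = words.set_index('no')['word']
joined = [
    ' '.join(v.loc[i:j]) for i, j in zip(sentences['start'], sentences['stop'])
]

# Write one sentence per line with '***' between sentences only.
with open('file.txt', 'w') as f:
    f.write('\n***\n'.join(joined) + '\n')

# Read the file back to check what was written.
with open('file.txt') as f:
    content = f.read()
print(content)
```

Keeping the result under a separate name also means the sentences data frame is still available afterwards, which matters for any later code that expects its start and stop columns.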

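One way to solve this:

res = pd.DataFrame()
# sentences must still be the original data frame here; start/stop are 1-based, iloc is 0-based
res['s'] = sentences.apply(lambda x: ' '.join(words.iloc[x['start'] - 1:x['stop']]['word']), axis=1)
# pandas >= 1.5 names this keyword lineterminator (older releases: line_terminator)
res.to_csv('a.txt', index=False, header=False, lineterminator='\n***\n')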
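
For comparison, the between-based loop from the question can also be made to work. The SyntaxError arises because the [ and ( opened on the print line are never closed. Separately, iterating over a data frame directly yields its column labels rather than its rows. The sketch below assumes the intent was to select, for each sentence, the words whose no lies between that sentence's start and stop; the names lines and mask are introduced here for illustration.

```python
import pandas as pd

words = pd.DataFrame({
    'no': [1, 2, 3, 4, 5, 6, 7, 8, 9],
    'word': ['cat', 'in', 'hat', 'the', 'dog', 'in', 'love', '!', '<3'],
})
sentences = pd.DataFrame({
    'no': [1, 2, 3],
    'start': [1, 4, 6],
    'stop': [3, 5, 9],
})

lines = []
# Walk the start/stop pairs row by row instead of iterating the frame itself.
for start, stop in zip(sentences['start'], sentences['stop']):
    # between() includes both bounds by default, so the inclusive
    # argument can be left out.
    mask = words['no'].between(start, stop)
    lines.append(' '.join(words.loc[mask, 'word']))

print(lines)
```

This does one full scan of words per sentence, so it is slower than slicing on large inputs, but it does not require the word numbers to be contiguous or sorted.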
