How do I pull from a list in Python?

Question

If I have a list made up of 1MM (one million) ids, how would I pull from that list in intervals of 50k?

For example:

cusid = df['customer_id'].unique().tolist()
# cusid holds 1,000,500 ids

If I want to pull in chunks, is the below correct for 50k?

cusid=cusid[:50000] - first 50k ids
cusid=cusid[50000:100001] - the next 50k of ids
cusid=cusid[100001:150001] - the next 50k

Are my interval selections correct?

Thanks!

Answer 1

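cusid2 = [cusid[a:a+50000] for a in range(0, len(cusid), 50000)]

This is a list comprehension: for each start index a, going from 0 up to the length of the list in steps of 50k, it adds the slice cusid[a:a+50000] to the new list, so a goes up by 50k on every iteration. The stop value is len(cusid) rather than a hard-coded 950000 because range excludes its stop: range(0, 950000, 50000) ends at a = 900000 and would leave out every id from index 950000 onward. The last slice is simply shorter (500 ids here), since slicing past the end of a list is not an error.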
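On the interval question itself: Python slices are half-open, so cusid[50000:100000] holds exactly 50,000 ids, while cusid[50000:100001] holds 50,001. Reassigning cusid to its own slice also discards the rest of the list, so each chunk needs its own name. A minimal sketch with small stand-in numbers; split_into_chunks is a hypothetical helper name, not something from the question:

```python
def split_into_chunks(items, chunk_size):
    """Return consecutive slices of items, each at most chunk_size long."""
    return [items[start:start + chunk_size]
            for start in range(0, len(items), chunk_size)]

ids = list(range(23))   # stand-in for the 1,000,500 customer ids
chunk_size = 5          # stand-in for 50_000

chunks = split_into_chunks(ids, chunk_size)

print([len(c) for c in chunks])   # [5, 5, 5, 5, 3]
print(chunks[1])                  # [5, 6, 7, 8, 9], i.e. ids[5:10]
```

With the real data this would be split_into_chunks(cusid, 50000), giving 20 full chunks and a final chunk of 500 ids.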

Answer 2

A couple of things to mention:

It seems that you're using the "data science" stack for your work, so there's a good chance you have numpy available; please take a look at numpy.array_split. You can calculate the number of chunks once and rely on NumPy's view machinery. Most likely this is a lot faster than converting NumPy arrays to native Python lists.
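A rough sketch of the numpy.array_split route, assuming the ids are kept in a NumPy array (for example by skipping the .tolist() call). The small sizes and the variable names are stand-ins for illustration. Passing split indices yields chunks of exactly chunk_size with a shorter final chunk; passing an integer number of sections instead would give chunks of nearly equal size, which is not quite the same thing.

```python
import numpy as np

ids = np.arange(23)   # stand-in for the array of customer ids
chunk_size = 5        # stand-in for 50_000

# Split points at 5, 10, 15, 20: every chunk has chunk_size items
# except possibly the last one.
split_points = np.arange(chunk_size, len(ids), chunk_size)
chunks = np.array_split(ids, split_points)

print([len(c) for c in chunks])           # [5, 5, 5, 5, 3]
print(np.shares_memory(chunks[0], ids))   # True: chunks are views, not copies
```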

The idiomatic Python approach (IMO) is to leverage iterators and islice:

    from itertools import islice

    # create an iterator from your array/list; this is a cheap operation
    iterator = iter(cusid)

    # if you want element-wise operations, you can use your chunk in loops
    # or functions that require iteration; this is really memory-efficient,
    # as you don't put the whole chunk in memory
    chunk = islice(iterator, 50000)
    s = sum(chunk)

    # in case you really need the whole chunk in memory, just turn the islice into a list
    chunk = list(islice(iterator, 50000))
    last_in_chunk = chunk[-1]

    # and you always use the same code to consume the next chunk from your source,
    # without maintaining any counters
    next_chunk = list(islice(iterator, 50000))

When your iterator is exhausted (there are no values left), you will get empty chunks. When there are not enough elements to fill a whole chunk, you will get whatever is left.
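Because an exhausted iterator produces an empty chunk, that empty chunk can serve as the stop signal for a loop. A minimal sketch of this idea; iter_chunks is a hypothetical helper name, not part of itertools, and the small numbers are stand-ins:

```python
from itertools import islice

def iter_chunks(iterable, size):
    """Yield lists of up to `size` items until the source is exhausted."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:      # empty chunk: nothing left in the source
            return
        yield chunk

for chunk in iter_chunks(range(7), 3):
    print(chunk)
# [0, 1, 2]
# [3, 4, 5]
# [6]
```

With the real data the loop would read for chunk in iter_chunks(cusid, 50000). On Python 3.12 and later, itertools.batched covers the same use case, yielding tuples rather than lists.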

Answer 1: 2022-01-27 22:11:02
Answer 2: 2022-01-27 22:35:10
