
Removing similar items from list based on 'version number' in Python

I've got a list, like this (but larger):

[item_101.1.txt, item_101.2.txt, item_134.1.txt, item_134.2.txt, item_134.3.txt, item_134.4.txt]

So, when there is an "item_101.2.txt", "item_101.1.txt" becomes redundant, and I want to remove it from the list. Similarly, "item_134.4.txt" should remain, but item_134.3.txt, item_134.2.txt, and item_134.1.txt should be removed.

But I can't do this within a for loop, because that works on a per-item basis.

Any ideas? Any concepts I should be looking into?

Thanks guys!

Since this sounds like it might be homework, I'm just going to provide the structure of an algorithm:

  • Define a function that can parse the string, returning the root of the file name and the version number. You should probably have it return the version number as an integer instead of a string, so that version 10 compares as greater than version 9. Usage would look something like this, assuming the files always have a .txt extension:

     >>> extract_version('item_101.2.txt')
     ('item_101', 2)
  • Use this function on all of your inputs, returning something like this:

     [('item_101', 1), ('item_101', 2), ('item_134', 1), ... ]
  • Loop through that list, keeping track of the highest version number for each file name in a dictionary:

     highest_version = {}
     for fname, version in version_list:
         if fname not in highest_version:
             highest_version[fname] = version
         else:
             highest_version[fname] = max(highest_version[fname], version)
  • After running this loop, highest_version will contain the maximum version number for each file name. You can loop through the dictionary and rebuild the file names. Note that they may be in a different order than before, so you may need to sort them based on your criteria.

     highest_version_list = []
     for fname, version in highest_version.items():
         highest_version_list.append(fname + '.' + str(version) + '.txt')
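
Putting the steps above together, a minimal sketch might look like the following. It assumes every name has the form root.version.txt with an integer version; the regular expression and the helper name keep_latest are illustrative choices, not part of the original answer.

```python
import re

# Assumed naming pattern: <root>.<integer version>.txt
FILENAME_PATTERN = re.compile(r"^(?P<root>.+)\.(?P<version>\d+)\.txt$")


def extract_version(filename):
    """Split 'item_101.2.txt' into ('item_101', 2)."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return match.group("root"), int(match.group("version"))


def keep_latest(filenames):
    """Return one file name per root, keeping only the highest version."""
    highest_version = {}
    for name in filenames:
        root, version = extract_version(name)
        if root not in highest_version or version > highest_version[root]:
            highest_version[root] = version
    # Rebuild the names; dicts keep insertion order (Python 3.7+),
    # so roots come out in the order they were first seen.
    return [f"{root}.{version}.txt" for root, version in highest_version.items()]


files = [
    "item_101.1.txt", "item_101.2.txt",
    "item_134.1.txt", "item_134.2.txt", "item_134.3.txt", "item_134.4.txt",
]
print(keep_latest(files))  # ['item_101.2.txt', 'item_134.4.txt']
```

Comparing the versions as integers matters once they reach two digits: as strings, '10' would sort before '9'.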
