Inserting values in a Python list

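I am working on a script that parses a text file and tries to normalize it so that it can be inserted into a database. The data represents articles written by one or more authors. The problem is that, because the number of authors is not fixed, the output text file has a variable number of columns. For example: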

author1, author2, author3, this is the title of the article
author1, author2, this is the title of the article
author1, author2, author3, author4, this is the title of the article

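These results give me a maximum of 5 columns. So for the first two articles I need to add blank columns so that every row of the output has the same number of columns. What is the best way to do this? My input text is tab-delimited, and I can iterate over the fields quite easily by splitting on tabs.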

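Assuming you already have the maximum number of columns, and already have the rows split into lists (which I am going to assume you have put into a list of their own), you should be able to just use list.insert(-1, item) to add the empty columns: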

def columnize(mylists, maxcolumns):
    # Pad each row in place. insert(-1, ...) puts None just before the
    # last element, so the title stays in the final column.
    for i in mylists:
        while len(i) < maxcolumns:
            i.insert(-1, None)

mylists = [["author1","author2","author3","this is the title of the article"],
           ["author1","author2","this is the title of the article"],
           ["author1","author2","author3","author4","this is the title of the article"]]

columnize(mylists, 5)
print(mylists)

[['author1', 'author2', 'author3', None, 'this is the title of the article'], ['author1', 'author2', None, None, 'this is the title of the article'], ['author1', 'author2', 'author3', 'author4', 'this is the title of the article']]

An alternative version, using a list comprehension, that does not modify the original lists:

def columnize(mylists, maxcolumns):
    # Build new rows; the None padding goes between the authors and the title.
    return [j[:-1] + [None] * (maxcolumns - len(j)) + j[-1:] for j in mylists]

print(columnize(mylists, 5))

[['author1', 'author2', 'author3', None, 'this is the title of the article'], ['author1', 'author2', None, None, 'this is the title of the article'], ['author1', 'author2', 'author3', 'author4', 'this is the title of the article']]
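To make the difference between the two versions concrete, here is a small sketch that runs both on fresh data. The names `pad_in_place` and `padded_copy` are hypothetical, chosen only so the two variants can sit side by side; the logic is the same as in the two `columnize` functions above, assuming Python 3.

```python
def pad_in_place(rows, maxcolumns):
    """Mutating variant: insert None before the last item of each short row."""
    for row in rows:
        while len(row) < maxcolumns:
            row.insert(-1, None)


def padded_copy(rows, maxcolumns):
    """Non-mutating variant: build new rows and leave the inputs untouched."""
    return [row[:-1] + [None] * (maxcolumns - len(row)) + row[-1:] for row in rows]


original = [["author1", "author2", "title"]]

result = padded_copy(original, 5)
print(original)  # unchanged, the row still has 3 items
print(result)    # a new list whose row has 5 items

pad_in_place(original, 5)
print(original)  # the same list object, now padded to 5 items
```

Which one fits better probably depends on whether the unpadded rows are still needed afterwards.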

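Forgive me if I have misunderstood, but it sounds like you are approaching the problem the hard way. It is very easy to convert your text file into a dictionary that maps each title to its list of authors: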

>>> lines = ["auth1, auth2, auth3, article1", "auth1, auth2, article2", "auth1, article3"]
>>> d = dict((x[-1], x[:-1]) for x in [line.split(', ') for line in lines])
>>> d
{'article1': ['auth1', 'auth2', 'auth3'], 'article2': ['auth1', 'auth2'], 'article3': ['auth1']}
>>> total_articles = len(d)
>>> total_articles
3
>>> max_authors = max(len(val) for val in d.values())
>>> max_authors
3
>>> for k, v in d.items():
...     print(k)
...     print(v + [None] * (max_authors - len(v)))
... 
article1
['auth1', 'auth2', 'auth3']
article2
['auth1', 'auth2', None]
article3
['auth1', None, None]

Then, if you really want to, you can output this data using Python's built-in csv module. Or, you could directly output the SQL you are going to need.
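As a rough sketch of the csv route, assuming Python 3 and a dictionary shaped like `d` from the session above: the helper name `write_padded_rows` and the column layout (author columns first, title last) are illustrative assumptions, not something the question specifies. An `io.StringIO` buffer stands in for a real output file.

```python
import csv
import io


def write_padded_rows(articles, max_authors, out):
    """Write one CSV row per article: padded author columns, then the title."""
    writer = csv.writer(out)
    for title, authors in articles.items():
        # Pad with empty strings so every row has max_authors + 1 columns.
        padding = [""] * (max_authors - len(authors))
        writer.writerow(authors + padding + [title])


# Sample data mirroring the dictionary built in the session above.
d = {
    "article1": ["auth1", "auth2", "auth3"],
    "article2": ["auth1", "auth2"],
    "article3": ["auth1"],
}
max_authors = max(len(authors) for authors in d.values())

buffer = io.StringIO()  # stand-in for a file opened with newline=""
write_padded_rows(d, max_authors, buffer)
print(buffer.getvalue())
```

Letting `csv.writer` produce the rows means any title that happens to contain a comma is quoted automatically, which manual string joining would not handle.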

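You are opening the same file many times, and reading it many times, just to get counts that you can derive from the data already in memory. Please do not read the file multiple times for this.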
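A minimal sketch of the single-pass idea, assuming Python 3 and tab-delimited input as described in the question. The function name `load_rows` and the sample text are hypothetical, and `io.StringIO` stands in for the real file, whose name the question does not give.

```python
import io


def load_rows(handle):
    """Read the input once and return a list of rows split on tabs."""
    return [line.rstrip("\n").split("\t") for line in handle if line.strip()]


# Stand-in for the opened input file (hypothetical sample data).
sample = io.StringIO(
    "author1\tauthor2\tauthor3\ttitle one\n"
    "author1\tauthor2\ttitle two\n"
    "author1\tauthor2\tauthor3\tauthor4\ttitle three\n"
)

rows = load_rows(sample)  # the only pass over the file

# Every count comes from the in-memory list, not from re-reading the file.
total_articles = len(rows)
max_columns = max(len(row) for row in rows)

# Pad each row just before its last field (the title) without mutating rows.
padded = [row[:-1] + [None] * (max_columns - len(row)) + row[-1:] for row in rows]

print(total_articles, max_columns)
for row in padded:
    print(row)
```

For files too large to hold in memory this approach would need rethinking, but for a list of articles and authors it is likely sufficient.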
