How do I use the Unix "uniq -c" command in Python code?

Question

I have to check how many times each word occurs in a paragraph, and print each word along with its number of occurrences.

For example, if the paragraph is

how are you now? Are you better now?

then the output should be:

how-1
are-2
you-2
now-2
better-1

I tried using subprocess:

from subprocess import call
sen = raw_input("enter:")
call(["uniq", "-c", sen])

But the function expects a file as input, and I don't want to supply a file. How do I make it work?

Answer 1

Just for completeness, here is how you could solve it in Python:

import re, collections

paragraph = "how are you now? Are you better now?"

# Raw string so \W is not read as a string escape; + treats a run of separators as one
splitter = re.compile(r'\W+')
counts = collections.Counter(word.lower()
                             for word in splitter.split(paragraph)
                             if word)
for word, count in counts.most_common():
    print(count, word)
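
If you want the word-count format from the question rather than count-then-word, a minimal sketch along the same lines might look like the following. The helper name format_counts is made up for illustration, and it assumes Python 3.7+, where a Counter keeps keys in first-seen order.

```python
import collections
import re


def format_counts(paragraph):
    """Return 'word-count' strings in order of first appearance (hypothetical helper)."""
    # Lowercase first so "Are" and "are" are counted as the same word
    words = re.findall(r"\w+", paragraph.lower())
    counts = collections.Counter(words)
    return ["{}-{}".format(word, count) for word, count in counts.items()]


print("\n".join(format_counts("how are you now? Are you better now?")))
```

For the example paragraph this gives how-1, are-2, you-2, now-2, better-1, one per line.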

Answer 2

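As a comment on Dimitris Jim's answer (I would post this as a comment, but I don't have enough rep): you also need to sort the input. You can do that in Python by replacing the regex statement with this: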

sen_list = sen.split(" ")
sen_list.sort()
sen = '\n'.join(sen_list)

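I'm sure there is also a way to do the sorting with the Linux sort command. Likewise, you could use tr ' ' '\\n' to replace the spaces with newlines instead of doing that in Python.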
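
For what it's worth, here is a rough sketch of how the tr, sort and uniq -c steps could be chained from Python. It assumes Python 3.5+ and a Unix-like system with all three commands on the PATH; run_filter is a made-up helper name, not something from the standard library.

```python
import subprocess


def run_filter(argv, text):
    """Feed text to a command on stdin and return what it writes to stdout."""
    result = subprocess.run(argv, input=text, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout


sen = "how are you now are you better now"
# "\\n" reaches tr as backslash-n, which tr itself interprets as a newline
one_per_line = run_filter(["tr", " ", "\\n"], sen + "\n")
sorted_words = run_filter(["sort"], one_per_line)
counted = run_filter(["uniq", "-c"], sorted_words)
print(counted)
```

Each stage only passes a string along, so any of the three commands can be swapped for the equivalent Python call without touching the others.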

Answer 3

If you really want to know how to do the counting with uniq, then:

from subprocess import Popen, PIPE

sen = raw_input("Enter: ")
sen = sen.lower().split() # Remove capitals and split into list of words
# Sort list to provide correct count ("-c" option counts only consecutive repeats)
# If you want to get consecutives, just don't sort.
sen.sort()
sen = "\n".join(sen) # Put each word into its own line (for input)
# uniq accepts input from stdin
p = Popen(["uniq", "-c"], stdin=PIPE, stdout=PIPE)
out = p.communicate(sen)[0].split("\n") # Pass the input and get the output (make it a list by splitting on newlines)
counts = [] # Parse output and put it into a list
for x in out:
    if not x: continue # Skip empty lines (usually appears at the end of output string)
    counts.append(tuple(x.split())) # Split the line into tuple(number, word) and add it to counts list

# And if you want a nice output like you presented in Q:
for x in counts:
    print x[1]+"-"+x[0]
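
The code above is Python 2 (raw_input and the print statement). On Python 3, a sketch of the same idea might look like this, assuming Python 3.5+ and that uniq is on the PATH. The function name uniq_counts is made up for illustration, and the sentence is hard-coded instead of read from the user.

```python
import subprocess


def uniq_counts(sentence):
    """Count words by piping sorted, one-per-line input to `uniq -c`."""
    # Sort so that repeated words end up on adjacent lines
    words = sorted(sentence.lower().split())
    result = subprocess.run(["uniq", "-c"], input="\n".join(words) + "\n",
                            stdout=subprocess.PIPE, universal_newlines=True,
                            check=True)
    counts = []
    for line in result.stdout.splitlines():
        if not line.strip():
            continue  # skip blank lines in the output
        number, word = line.split()
        counts.append((word, int(number)))
    return counts


for word, number in uniq_counts("how are you now are you better now"):
    print("{}-{}".format(word, number))
```

Passing universal_newlines=True makes subprocess exchange str rather than bytes, which is the main thing that changes compared with the Python 2 version.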

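Note 1: This is definitely not the way to do it. You really should write this in Python.

Note 2: This was tested on cygwin and Ubuntu 12.04, with the same results.

Note 3: uniq is not a function. It is a command, i.e. a program stored at /bin/uniq and /usr/bin/uniq.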
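
To illustrate Note 1 and the point about "-c" counting only consecutive repeats, here is a small pure-Python sketch that mimics that behaviour with itertools.groupby. The name uniq_c is made up for illustration, and this is only an approximation of what the real command does (no options, no output formatting).

```python
from itertools import groupby


def uniq_c(lines):
    """Emulate `uniq -c`: collapse runs of identical adjacent items into (count, item)."""
    return [(len(list(run)), item) for item, run in groupby(lines)]


words = "how are you now are you better now".split()
print(uniq_c(words))          # unsorted: no two equal words are adjacent
print(uniq_c(sorted(words)))  # sorted: repeats are adjacent, so they are counted together
```

Without the sort, every word in this sentence is reported with a count of 1, which is exactly the problem the sort step in the answer works around.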
