
Count how many times a string occurs in a specific column

I am trying to see how many times a string occurs in column 4. More specifically, how many times a port number occurs in some Netflow data. There are thousands of ports, so I'm not looking for any specific one, only for how often each one recurs. I have already parsed the column down to the number after the colon, and I want the code to count how many times each number occurs, so the final output should print each number with its count, like so:

[OUTPUT]

Port: 80 found: 3 times.
Port: 53 found: 2 times.
Port: 21 found: 1 times.

[CODE]

import re


frequency = {}

file = open('/Users/rojeliomaestas/Desktop/nettest2.txt', 'r')

with open('/Users/rojeliomaestas/Desktop/nettest2.txt', 'r') as infile:    
    next(infile)
    for line in infile:
        data = line.split()[4].split(":")[1]
        text_string = file.read().lower()
        match_pattern = re.findall(data, text_string)


for word in match_pattern:
    count = frequency.get(word,0)
    frequency[word] = count + 1

frequency_list = frequency.keys()

for words in frequency_list:
    print ("port:", words,"found:", frequency[words], "times.")

[FILE]

Date first seen          Duration Proto      Src IP Addr:Port          Dst IP Addr:Port   Packets    Bytes Flows
2017-04-02 12:07:32.079     9.298 UDP            8.8.8.8:80 ->     205.166.231.250:8080     1      345     1
2017-04-02 12:08:32.079     9.298 TCP            8.8.8.8:53 ->     205.166.231.250:80       1       75     1
2017-04-02 12:08:32.079     9.298 TCP            8.8.8.8:80 ->     205.166.231.250:69       1      875     1
2017-04-02 12:08:32.079     9.298 TCP            8.8.8.8:53 ->     205.166.231.250:443      1      275     1
2017-04-02 12:08:32.079     9.298 UDP            8.8.8.8:80 ->     205.166.231.250:23       1      842     1
2017-04-02 12:08:32.079     9.298 TCP            8.8.8.8:21 ->     205.166.231.250:25       1      146     1

Use Counter from the Python standard library. It returns a dictionary (a dict subclass) with exactly what you are looking for.

from collections import Counter
counts = Counter(column)  # column: any iterable of the values to count, e.g. a list of port strings
counts.most_common(n)  # returns the n most common values as (value, count) pairs
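As a rough sketch of how this could apply to the data in the question, the snippet below builds the column from the source-port field and hands it to Counter. The helper name count_ports and the inline sample lines are assumptions made for illustration; in practice the lines would come from the open file.

```python
from collections import Counter


def count_ports(lines):
    """Count source ports in Netflow-style lines (hypothetical helper)."""
    column = []
    for line in lines:
        fields = line.split()
        # Skip lines whose field at index 4 is not "ip:port"; for this
        # sample that is only the header row, where the field is "Proto".
        if len(fields) < 5 or ":" not in fields[4]:
            continue
        # Keep only the part after the last colon, i.e. the port number.
        column.append(fields[4].rsplit(":", 1)[1])
    return Counter(column)


# Sample rows copied from the question's file (assumed representative).
sample = [
    "Date first seen          Duration Proto      Src IP Addr:Port          Dst IP Addr:Port   Packets    Bytes Flows",
    "2017-04-02 12:07:32.079     9.298 UDP            8.8.8.8:80 ->     205.166.231.250:8080     1      345     1",
    "2017-04-02 12:08:32.079     9.298 TCP            8.8.8.8:53 ->     205.166.231.250:80       1       75     1",
    "2017-04-02 12:08:32.079     9.298 TCP            8.8.8.8:80 ->     205.166.231.250:69       1      875     1",
    "2017-04-02 12:08:32.079     9.298 TCP            8.8.8.8:53 ->     205.166.231.250:443      1      275     1",
    "2017-04-02 12:08:32.079     9.298 UDP            8.8.8.8:80 ->     205.166.231.250:23       1      842     1",
    "2017-04-02 12:08:32.079     9.298 TCP            8.8.8.8:21 ->     205.166.231.250:25       1      146     1",
]

counts = count_ports(sample)

# most_common() with no argument lists every port, highest count first.
for port, count in counts.most_common():
    print("Port:", port, "found:", count, "times.")
```

Passing the open file object instead of the sample list should work the same way, since both yield one line per iteration.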

You need something like:

frequency = {}
with open('/Users/rojeliomaestas/Desktop/nettest2.txt', 'r') as infile:    
    next(infile)
    for line in infile:
        port = line.split()[4].split(":")[1]
        frequency[port] = frequency.get(port,0) + 1

for port, count in frequency.items(): 
    print("port:", port, "found:", count, "times.")

The heart of this is that you keep a dict mapping each port to its count, and increment that count for every line. dict.get returns the current value, or a default (in this case 0) if the port has not been seen yet.
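To make the role of dict.get concrete, here is a small sketch on a hard-coded list of ports. The list and the variable names are made up for illustration, and the sorting step is an addition that the answer above does not include; it simply puts the most frequent port first.

```python
# Assumed sample values, standing in for the ports parsed from each line.
ports = ["80", "53", "80", "53", "80", "21"]

frequency = {}
for port in ports:
    # First sighting: get() returns the default 0, so the port is stored as 1.
    # Later sightings: get() returns the running count, which is incremented.
    frequency[port] = frequency.get(port, 0) + 1

# Order the (port, count) pairs by count, highest first.
ranked = sorted(frequency.items(), key=lambda item: item[1], reverse=True)

for port, count in ranked:
    print("Port:", port, "found:", count, "times.")
```

Without the default, frequency[port] would raise a KeyError the first time a port appears, which is why get(port, 0) is used here.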

