Counts by user and IP address

Question

I have a file like this:

USER_ID,IP_ADDRESS
XXXXXX24,10.12.6.54
XXXXXX24,10.12.6.54
XXXXXX24,10.12.6.54
XXXXXX24,10.12.6.54
XXXXXX24,10.12.6.54
XXXXXX25,10.12.6.55
XXXXXX25,10.12.6.55
XXXXXX25,10.12.6.55
XXXXXX25,10.12.6.55
XXXXXX25,10.12.6.55
XXXXXX21,10.12.6.51
XXXXXX21,10.12.6.51
XXXXXX21,10.12.6.51
XXXXXX21,10.12.6.51

I need output that shows the count per user, broken down by IP address:

          10.12.6.51  10.12.6.55  10.12.6.54
XXXXXX21  4
XXXXXX25              4
XXXXXX24                          4

This is my code. It runs fine and produces the output shown further down, but I need more detail in the output.

#!/bin/python3.6

from collections import Counter

# Read every line of the file, header and trailing newlines included.
with open('conn.csv', 'r') as datafile:
    usefuldata = list(datafile)

# Count how many times each identical line occurs.
outfile1 = Counter(usefuldata)
print(outfile1)

In the end, with Barmer's help, I got the following output:

Counter({'XXXXXX24,10.12.6.54\n': 5, 'XXXXXX25,10.12.6.55\n': 5, 'XXXXXX21,10.12.6.51\n': 4, 'XXXXXX24,10.12.6.56\n': 3, 'USER_ID,IP_ADDRESS\n': 1})

Answer 1

You can also use pandas together with collections.Counter.

For example:

import collections

import pandas as pd
from tabulate import tabulate

with open("data_file.csv") as file:
    next(file, None)  # skip the header
    counter = collections.Counter([line.strip() for line in file])

output = collections.defaultdict(dict)
for user_and_ip, ip_to_user_count in counter.items():
    user, ip = user_and_ip.split(",")
    output[ip].update({user: ip_to_user_count})

df = pd.DataFrame(output).fillna("")
print(tabulate(df, headers="keys"))
df.to_csv("user_to_ip.csv")

Output:

          10.12.6.54    10.12.6.55    10.12.6.51
--------  ------------  ------------  ------------
XXXXXX24  5.0
XXXXXX25                5.0
XXXXXX21                              4.0

The script also writes the same table to a .csv file (user_to_ip.csv).
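
If pandas and tabulate are not available, roughly the same user-by-IP table can be built with the standard library alone. The following is only a sketch under stated assumptions: the SAMPLE text, the function names pivot_counts and format_table, and the column width are illustrative and do not come from the original answer.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the CSV file; same layout as in the question.
SAMPLE = """USER_ID,IP_ADDRESS
XXXXXX24,10.12.6.54
XXXXXX24,10.12.6.54
XXXXXX25,10.12.6.55
XXXXXX21,10.12.6.51
XXXXXX21,10.12.6.51
XXXXXX21,10.12.6.51
"""


def pivot_counts(lines):
    """Count (user, ip) pairs and return (users, ips, counts)."""
    reader = csv.DictReader(lines)  # DictReader consumes the header row itself
    counts = Counter((row["USER_ID"], row["IP_ADDRESS"]) for row in reader)
    users = sorted({user for user, _ in counts})
    ips = sorted({ip for _, ip in counts})  # plain string sort, not numeric
    return users, ips, counts


def format_table(users, ips, counts, width=12):
    """Render one row per user and one column per IP; missing pairs stay blank."""
    header = " " * width + "".join(ip.ljust(width) for ip in ips)
    rows = [header.rstrip()]
    for user in users:
        cells = "".join(str(counts.get((user, ip), "")).ljust(width) for ip in ips)
        rows.append((user.ljust(width) + cells).rstrip())
    return "\n".join(rows)


users, ips, counts = pivot_counts(io.StringIO(SAMPLE))
table = format_table(users, ips, counts)
print(table)
```

To run it against a real file, pass an open file object instead of the io.StringIO wrapper, for example pivot_counts(open("data_file.csv", newline="")).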

Answer 2

#!/bin/python3.6

from collections import Counter

# Read every line of the file, header and trailing newlines included.
with open('conn.csv', 'r') as datafile:
    usefuldata = list(datafile)

outfile1 = Counter(usefuldata)
# print(outfile1.most_common())
# Print each line with its count, most frequent first.
for value, count in outfile1.most_common():
    print(value, count)

I was able to get what I wanted with the code above:

[root@lhqsb1db2db01 Scripts]# ./conn.py
XXXXXX24,10.12.6.54
 5
XXXXXX25,10.12.6.55
 5
XXXXXX21,10.12.6.51
 4
XXXXXX24,10.12.6.56
 3
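
In the output above each count lands on its own line because every counted line still ends with its newline character. A possible variation, shown here only as a sketch, strips the lines and skips the header so that each pair and its count print on a single line. The SAMPLE text and the helper name count_pairs are assumptions made for illustration.

```python
import io
from collections import Counter

# Hypothetical in-memory stand-in for conn.csv.
SAMPLE = (
    "USER_ID,IP_ADDRESS\n"
    "XXXXXX24,10.12.6.54\n"
    "XXXXXX24,10.12.6.54\n"
    "XXXXXX21,10.12.6.51\n"
)


def count_pairs(lines):
    """Count each 'user,ip' line, ignoring the header and trailing newlines."""
    lines = iter(lines)
    next(lines, None)  # skip the header row
    return Counter(line.strip() for line in lines if line.strip())


pair_counts = count_pairs(io.StringIO(SAMPLE))
# One "user,ip count" string per pair, most frequent first.
report = [f"{pair} {count}" for pair, count in pair_counts.most_common()]
print("\n".join(report))
```

With a real file, the same function can be called as count_pairs(datafile) inside a with open('conn.csv') block.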
