Looping through data from a CSV file to output "1" and "0" to a text file (Python)

Question

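I recently started learning Python and have run into a problem while trying to format some data for a project I'm working on. I have managed to read a CSV file as input, and now I'm trying to loop through that data and write "1" and "0" to a text file depending on the data.

So far I have the following code: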

data = {} 
productIds = [] 

for row in reader:
    productIds.append(row['productCode']) 
    if row['basketID'] not in data:
        data[row['basketID']] = [row['productCode']]
    else:
        data[row['basketID']].append(row['productCode'])

productIds = sorted(set(productIds))

for item in productIds:
    txtFile.write("%s " % item)
txtFile.write('\n')

for key in data: # Will loop through each basket
    for value in data[key]: #Loop through each product in basket
        for i in productIds: # Go through list of available products
            if value == i: 
                txtFile.write('1 ')
            else:
                txtFile.write('0 ')
    txtFile.write('\n')

Actual result:

23 24 25 #Products 
1  0  0  0 1 0 0 0 1 #Basket 1
1  0  0              #Basket 2
1  0  0              #Basket 3
0  0  1              #Basket 4
0  1  0  0 0 1       #Basket 5

Expected result:

23 24 25 #Products
1  1  1  #Basket 1  
1  0  0  #Basket 2  
1  0  0  #Basket 3  
0  0  1  #Basket 4
0  1  1  #Basket 5

CSV file:

basketID productCode 
1        23  
1        24  
1        25  
2        23  
3        23  
4        25  
5        24  
5        25

I think I am going wrong where I check each product against the product list, but I'm not sure how to get the expected result.

Answer 1

I think you should try this. First, read the file into a DataFrame:

>>> import pandas as pd
>>> df = pd.read_csv("lia.csv")
>>> df
   basketID  productCode
0         1           23
1         1           24
2         1           25
3         2           23
4         3           23
5         4           25
6         5           24
7         5           25

Then group by product code and basket ID:

>>> g1 = df.groupby(["productCode", "basketID"]).count()
>>> g1
Empty DataFrame
Columns: []
Index: [(23, 1), (23, 2), (23, 3), (24, 1), (24, 5), (25, 1), (25, 4), (25, 5)]

Answer 2

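The problem is in the last for loop. You iterate over each basket and then over every product in the current basket. For each of those products, you check whether it equals each productId in turn. Since there are 3 productIds, you write 3 entries for every item in the basket instead of 3 entries for the basket as a whole.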

Example: for basket 1, you first loop over item 23, and for that single item you write 3 entries to the output file, one for each i in productIds:
1. 23 == 23 => 1
2. 23 == 24 => 0
3. 23 == 25 => 0
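
The over-counting can be reproduced in isolation. The sketch below hard-codes basket 1 and the three product IDs from the question (the names `basket` and `product_ids` are illustrative, not from the original code) and counts how many flags the nested loops emit.

```python
# Basket 1 and the product list from the question, hard-coded for illustration.
basket = ['23', '24', '25']
product_ids = ['23', '24', '25']

# Same logic as the question: one flag per (item, productId) pair.
flags = []
for value in basket:
    for i in product_ids:
        flags.append('1' if value == i else '0')

row = ' '.join(flags)
print(len(flags))  # 9 flags instead of the 3 that were wanted
print(row)         # 1 0 0 0 1 0 0 0 1
```

This matches the first row of the actual result in the question.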

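You also have a second problem. Because your dictionary is not sorted by key, there is no guarantee that the basket loop runs in increasing order from basket 1 to basket 5.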
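
One caveat worth checking when sorting, assuming the rows come from `csv.DictReader`: the basket IDs are then strings, so a plain sort orders them lexicographically. A minimal sketch with made-up basket IDs, comparing the default sort with a numeric sort via `key=int`:

```python
# Hypothetical basket IDs as strings, as csv.DictReader would yield them.
data = {'10': ['23'], '2': ['24'], '1': ['25']}

lexicographic = sorted(data)      # compares the keys as strings
numeric = sorted(data, key=int)   # compares the keys as integers

print(lexicographic)  # ['1', '10', '2']
print(numeric)        # ['1', '2', '10']
```

With only baskets 1 to 5, as in the question, both orders happen to agree.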

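Replace the last for loop with the following (sort the dictionary, then iterate correctly):

import collections

# Sort the baskets by key so the rows are written in a predictable order
data = collections.OrderedDict(sorted(data.items()))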
for key in data: # Will loop through each basket
    for productId in productIds: #Loop through each productId
        if productId in data[key]: # check if productId in the basket products 
            txtFile.write('1 ')
        else:
            txtFile.write('0 ')
    txtFile.write('\n')
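
For a version that can be run end to end, here is a minimal sketch of the same idea. It assumes the CSV is comma-delimited and is read with `csv.DictReader`; the helper name `build_matrix` and the in-memory `CSV_TEXT` are illustrative and not part of the original code.

```python
import csv
import io
from collections import defaultdict

# Sample data from the question; the comma delimiter is an assumption.
CSV_TEXT = """basketID,productCode
1,23
1,24
1,25
2,23
3,23
4,25
5,24
5,25
"""

def build_matrix(csv_text):
    """Return the header line followed by one 0/1 line per basket."""
    baskets = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        baskets[row['basketID']].add(row['productCode'])

    # Sort numerically so that, e.g., basket 10 would come after basket 2.
    all_products = {p for products in baskets.values() for p in products}
    product_ids = sorted(all_products, key=int)

    lines = [' '.join(product_ids)]
    for basket_id in sorted(baskets, key=int):
        # One flag per product ID, not one per (item, product ID) pair.
        flags = ['1' if p in baskets[basket_id] else '0' for p in product_ids]
        lines.append(' '.join(flags))
    return lines

print('\n'.join(build_matrix(CSV_TEXT)))
```

To write to a file instead of printing, the same lines could be passed to `write` on a file opened in text mode.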

Output: the same as the expected result shown in the question.

Answer 3

Try this:

data = {}
productIds = []

for row in reader:
    productIds.append(row['productCode'])
    if row['basketID'] not in data:
        # Set literal with one element; set(str) would split the string into characters
        data[row['basketID']] = {row['productCode']}
    else:
        data[row['basketID']].add(row['productCode'])

productIds = sorted(set(productIds))

for item in productIds:
    txtFile.write("%s " % item)
txtFile.write('\n')

for key in data: # Will loop through each basket
    for i in productIds: # Go through list of available products
        if i in data[key]: # Is this product in the basket?
            txtFile.write('1 ')
        else:
            txtFile.write('0 ')
    txtFile.write('\n')
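
One detail to watch when storing a basket as a set: calling `set()` on a string iterates over its characters, so a single product code has to be wrapped in a set literal (or a list) instead. A small check, using a product code from the question:

```python
code = '23'

chars = set(code)   # iterates over the string, giving the characters '2' and '3'
single = {code}     # set literal with exactly one element, '23'

print(sorted(chars))   # ['2', '3']
print(sorted(single))  # ['23']
print('23' in chars)   # False
print('23' in single)  # True
```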

Answer 1: 0, 2018-01-31 17:35:48

Answer 2: 0, accepted, 2018-01-31 19:25:48

Answer 3: -1, 2018-01-31 12:20:59
