Adding a new column for each letter of the alphabet with a loop

Question

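I have this seemingly simple problem, but for the love of the Almighty I can't figure out how to solve it.

I have a dataframe containing a column of strings. I can apply the code below to add a new column that shows how often the letter "a" occurs in the string next to it. Using a loop, I can do this for every word in the column.

What I would like to do now is the same thing for every letter of the alphabet, by nesting my loop inside another loop. The end result should be a dataframe with one column per letter, showing how many times that letter occurs in the string.

A loop within a loop really confuses me: I can imagine how it should work in theory, but I am struggling to make it work in practice. :(

Thanks a lot!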

import pandas as pd

df = pd.DataFrame(["fancy","nice","aweseome","wow"])
# Count the letter "a" in each word of the first column.
A = []
for i in range(len(df)):
    A.append(df.iloc[:, 0][i].count("a"))
df["A"] = A
df

Edit:

Hey folks, now I would like to apply the solution to as many string columns as there are in the df:

df = pd.DataFrame([["fancy", "cold"], ["nice","sunny"],["awesome", "warm"], ["wow", "rainy"]])

print(df)

for i in range(1,len(df.columns)):
    for letter in ascii_lowercase:
        df[letter] = df.iloc[:, i].apply(lambda x: x.lower().count(letter))
        df.append(df)
        
print(df)

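My code overwrites the a-z columns of the first string column. What I would like instead, if that makes sense, is a separate set of a-z columns for each string column.

If that messes up the order of the variables in the dataframe, storing the a-z output for each column in a new dataframe (either all appended to one, or one per column) would also be a solution. I could not really get that to work either. :/

Any help? :) Thanks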

Answer 1

This is how I would do it:

from string import ascii_lowercase

for letter in ascii_lowercase:
    df[letter] = df[0].apply(lambda x: x.lower().count(letter))
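
For the follow-up in the question's edit, where the frame has several string columns, one possible sketch is to repeat the same idea once per column and put the source column label into each new column name, so that nothing gets overwritten. The `<column>_<letter>` naming scheme and the separate `counts` frame are assumptions made for illustration, not part of the original answer.

```python
from string import ascii_lowercase

import pandas as pd

df = pd.DataFrame([["fancy", "cold"], ["nice", "sunny"],
                   ["awesome", "warm"], ["wow", "rainy"]])

# Build one block of a-z counts per string column. The "<column>_<letter>"
# names are a hypothetical naming scheme chosen for this sketch.
blocks = []
for col in df.columns:
    lowered = df[col].str.lower()
    block = pd.DataFrame(
        {f"{col}_{letter}": lowered.str.count(letter)
         for letter in ascii_lowercase},
        index=df.index,
    )
    blocks.append(block)

# Keep all counts in their own frame, then attach them to the original.
counts = pd.concat(blocks, axis=1)
result = df.join(counts)
print(result.to_string())
```

Keeping `counts` as its own frame leaves the original column order untouched; `df.join(counts)` is only needed if everything should end up in one frame.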

Answer 2

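You can use the power of pandas to process all rows at once and put the output into a new column. You can use chr(x) to get every letter from a to z, and count it in the lowercased word. For example: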

import pandas as pd

df = pd.DataFrame({'word': ["fancy", "nice", "awesome", "wow"]})
df['word_lowercase'] = df['word'].str.lower()

# chr(97) is 'a' and chr(122) is 'z'.
for i in range(97, 123):
    letter = chr(i)
    df[letter] = df['word_lowercase'].str.count(letter)
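
One caveat worth knowing with this approach: `Series.str.count` treats its argument as a regular expression. Plain letters are fine, but if the set of characters is ever extended to punctuation, escaping each one with `re.escape` is safer. A small sketch, where the sample words and the `symbols` list are made up for illustration:

```python
import re

import pandas as pd

words = pd.Series(["a.b.c", "wow", "1+1"])

# As a regex, "." matches any character, so it has to be escaped
# in order to count literal dots. symbols is an illustrative list.
symbols = [".", "+"]
counts = pd.DataFrame(
    {sym: words.str.count(re.escape(sym)) for sym in symbols}
)
print(counts)
```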

Answer 3

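This may go a bit beyond this particular question, but perhaps it is useful to others, since I often see this kind of frequency-count question.

We can use Series.explode and str.upper to break the strings into one row per character, then use pd.crosstab to get the value counts and join them back to df: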

import pandas as pd

df = pd.DataFrame(["fancy", "nice", "aweseome", "wow"])
s = df[0].agg(list).explode().str.upper()
df = df.join(pd.crosstab(s.index, s))

print(df)

df:

          0  A  C  E  F  I  M  N  O  S  W  Y
0     fancy  1  1  0  1  0  0  1  0  0  0  1
1      nice  0  1  1  0  1  0  1  0  0  0  0
2  aweseome  1  0  3  0  0  1  0  1  1  1  0
3       wow  0  0  0  0  0  0  0  1  0  2  0

After crosstab, reindex can be used to ensure that all letters are present:

import string

import pandas as pd

df = pd.DataFrame(["fancy", "nice", "aweseome", "wow"])
s = df[0].agg(list).explode().str.upper()
df = df.join(pd.crosstab(s.index, s)
             .reindex(columns=list(string.ascii_uppercase),
                      fill_value=0))

print(df.to_string())

df:

          0  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
0     fancy  1  0  1  0  0  1  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  1  0
1      nice  0  0  1  0  1  0  0  0  1  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0
2  aweseome  1  0  0  0  3  0  0  0  0  0  0  0  1  0  1  0  0  0  1  0  0  0  1  0  0  0
3       wow  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  2  0  0  0
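
To make the intermediate steps easier to follow, here is a small sketch on a two-word frame. It uses `Series.map(list)` to split each word into characters, which is a substitution for the `agg(list)` call above and is assumed here to give the same per-row lists; the variable names `chars` and `counts` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(["fancy", "wow"])

# Step 1: split each word into a list of characters, then explode so that
# every character gets its own row. The index repeats the original row label.
chars = df[0].map(list).explode().str.upper()
print(chars.index.tolist())

# Step 2: cross-tabulate row label against character to get the counts.
counts = pd.crosstab(chars.index, chars)
print(counts)
```

Because the exploded index still carries the original row labels, `df.join(counts)` lines the counts up with the right words.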

Answer 4

import string
import pandas as pd

# alph is a list of all letters from a-z and A-Z
# (lowercase and uppercase)
alph = list(string.ascii_letters)

'''
ascii_lowercase  ==> for lowercase letters only
ascii_uppercase  ==> for uppercase letters only
'''

'''
num_letters is the number of letters from a-z and A-Z, which is 52:
26 lowercase letters and 26 uppercase letters,
so uppercase and lowercase are each counted in their own column
'''
num_letters = len(alph)

# this is the dataframe containing a list of the words.
df = pd.DataFrame(['fancy', 'nice', 'awesome', 'wow'])

# letter iterates over every entry of alph
for letter in alph:
    # counts collects the count of this letter for every row
    counts = []

    # x loops over the row positions of the dataframe,
    # using len() to establish how many rows there are
    for x in range(len(df)):
        counts.append(df.iloc[:, 0][x].count(letter))
    df[letter] = counts

print(df)

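Try this; all of the explanation is embedded in the comments and docstrings.

This gives you control over letter case. Once you remove the comments and docstrings, it is only a few lines.

You can also play around with it. I will also post another approach with less code.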
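
To illustrate what control over letter case means here, a small sketch with made-up sample words: counting is case-sensitive, so a column for "w" and a column for "W" can hold different values unless the words are lowercased first.

```python
import pandas as pd

words = pd.Series(["Wow", "fancy"])

# Counting is case-sensitive: "Wow" has one uppercase "W"
# and one lowercase "w".
lower_w = words.str.count("w")
upper_w = words.str.count("W")

# Lowercasing first merges both cases into a single count.
any_w = words.str.lower().str.count("w")

print(pd.DataFrame({"w": lower_w, "W": upper_w, "w (any case)": any_w}))
```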

Answer 5

import string

import pandas as pd

df = pd.DataFrame(['fancy', 'nice', 'awesome', 'wow'])

# Count each uppercase letter in the uppercased word.
alphabet = string.ascii_uppercase
for letter in alphabet:
    df[letter] = df[0].apply(lambda x: x.upper().count(letter))
print(df)

This is the other approach I mentioned.
