
AttributeError running lower_case.translate & string.punctuation on Pandas df

I get an AttributeError when calling .translate() with string.punctuation on a Pandas DataFrame containing reviews. The imported data is messy. The error is AttributeError: 'DataFrame' object has no attribute 'translate'; the full traceback is below.

I tried the different versions shown in the comments:

# cleaned_text = lower_case.translate(str.maketrans(string.punctuation, ' '*len(string.punctuation)))
# cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)

cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

I also tried the suggestion from this SO post and added a fillna, shown below, hoping it would fix the problem.

#checking for nulls if present any
print("Number of rows with null values:")
print(lower_case.isnull().sum().sum())

lower_case.fillna("")

A small sample Excel file for the DataFrame: https://github.com/taylorjohn/Simple_RecSys/blob/master/sample-data.xlsx

Code:

import string
from collections import Counter
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from nltk.corpus import stopwords
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

# data is in excel formatted ugly and unclean  columns are Artist Names rows are reviews for said Artist
df = pd.read_excel('sample-data.xlsx',encoding='utf8', errors='ignore')

lower_case = df.apply(lambda x: x.astype(str).str.lower())

#checking for nulls if present any
print("Number of rows with null values:")
print(lower_case.isnull().sum().sum())

lower_case.fillna("")


#cleaned_text = lower_case.translate(str.maketrans(string.punctuation, ' '*len(string.punctuation)))
# cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)

cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

The error received is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-78-9f23b8a5e8e0> in <module>
      2 # cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)
      3 
----> 4 cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

~\anaconda3\envs\nlp_course\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5272             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5273                 return self[name]
-> 5274             return object.__getattribute__(self, name)
   5275 
   5276     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'translate'

Pandas DataFrames don't have a .translate() method, but Python strings do. For example:

import string

my_str = "hello world!"
my_str.translate(str.maketrans('', '', string.punctuation))
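The question tries two different translation tables, so it may help to see how they differ on a plain string. This is a small sketch; the sample string and the variable names are made up for illustration.

```python
import string

sample = "rock,pop!"  # hypothetical review text

# Three-argument form: every punctuation character is deleted.
delete_table = str.maketrans('', '', string.punctuation)
deleted = sample.translate(delete_table)

# Two-argument form: every punctuation character is replaced by a space,
# which keeps words separated that were only joined by punctuation.
space_table = str.maketrans(string.punctuation, ' ' * len(string.punctuation))
spaced = sample.translate(space_table)

# Optional: collapse the runs of spaces the second form can leave behind.
collapsed = " ".join(spaced.split())

print(repr(deleted))    # 'rockpop'
print(repr(spaced))     # 'rock pop '
print(repr(collapsed))  # 'rock pop'
```

Which form is right depends on the data: deleting is fine for trailing punctuation, while replacing with spaces is probably safer if reviews contain text like "rock,pop" with no space after the comma.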

If you want to apply that translation to every value in a DataFrame column, you can use .map() on the column. The .map() method takes a function that receives each value in the column as its argument and returns the transformed value:

def remove_punctuation(value):
    return value.translate(str.maketrans('', '', string.punctuation))

df["my_cleaned_column"] = df["my_dirty_column"].map(remove_punctuation)

You can also use a lambda function, rather than defining a new function:

df["my_cleaned_column"] = df["my_dirty_column"].map(
    lambda x: x.translate(str.maketrans('', '', string.punctuation))
)
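One caveat: .map() calls the function on every value, so a missing value (a float NaN) would raise its own AttributeError because floats have no .translate() either. A minimal sketch of a guard, assuming you want non-string values passed through unchanged; the function name and the sample Series are hypothetical:

```python
import string
import pandas as pd

# Build the translation table once instead of on every call.
PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def remove_punctuation_safe(value):
    """Strip punctuation from strings; return anything else untouched."""
    if isinstance(value, str):
        return value.translate(PUNCT_TABLE)
    return value

# Hypothetical column with one missing review.
reviews = pd.Series(["Loved it!", float("nan"), "So-so..."])
cleaned = reviews.map(remove_punctuation_safe)
print(cleaned)
```

The missing entry stays missing, so you can still count or fill nulls afterwards.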

If you have many columns you need to apply this to, you can do this:

for column_name in df.columns:
    df[column_name] = df[column_name].map(
        lambda x: x.translate(str.maketrans('', '', string.punctuation))
    )
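As an alternative to mapping a Python function, pandas exposes the same operation through the .str accessor as Series.str.translate(). The sketch below combines that with the lowercasing step from the question. The column names and data are invented, and filling nulls before converting to str is my assumption about what the asker wants: calling astype(str) first turns missing values into the text "nan", and fillna returns a new object rather than changing the frame in place, which would explain why the fillna call in the question appears to do nothing.

```python
import string
import pandas as pd

PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def clean_frame(frame):
    """Lowercase every column and strip punctuation, column by column."""
    # Fill nulls first and keep the returned frame (fillna is not in place).
    filled = frame.fillna("")
    return filled.apply(
        lambda col: col.astype(str).str.lower().str.translate(PUNCT_TABLE)
    )

# Hypothetical data shaped like the question: columns are artists,
# rows are reviews, and some cells are empty.
sample = pd.DataFrame({
    "Artist A": ["Great show!!", "Meh..."],
    "Artist B": ["Loved it, really.", None],
})
cleaned_text = clean_frame(sample)
print(cleaned_text)
```

The translation table is built once and reused for every column, which avoids rebuilding it for each cell as the lambda versions do.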

