SyntaxError: EOL while scanning string literal

Question

This is what I was asked to do:

Remove the dollar sign and comma from the columns. If necessary, convert these two columns to the appropriate data type.

As my dataset does not contain values with a $ sign, I am replacing the "." in the reviews-per-month numbers with "," for the sake of the exercise.

def remove_commas(value):
    if pd.isna(value):
        return np.NaN
    else:
        return float(value.replace (".", ","))

df["reviews per month"]=df["reviews_per_month"].apply(lambda x: remove_commas(x))"

Error message number 1:

File "/var/folders/vr/bbf8y6555gs306xzf_x7zxf80000gn/T/ipykernel_22769/1957524384.py", line 1
df["reviews per month"]=df["reviews_per_month"].apply(lambda x: remove_commas(x))"
^
SyntaxError: EOL while scanning string literal

Error message number 2:

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)
/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3628             try:
-> 3629                 return self._engine.get_loc(casted_key)
3630             except KeyError as err:

/opt/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

/opt/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'reviews per month'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/var/folders/vr/bbf8y6555gs306xzf_x7zxf80000gn/T/ipykernel_22769/969712826.py in <module>
----> 1 df["reviews per month"]

/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py in __getitem__(self, key)
3503             if self.columns.nlevels > 1:
3504                 return self._getitem_multilevel(key)
-> 3505             indexer = self.columns.get_loc(key)
3506             if is_integer(indexer):
3507                 indexer = [indexer]

/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3629                 return self._engine.get_loc(casted_key)
3630             except KeyError as err:
-> 3631                 raise KeyError(key) from err
3632             except TypeError:
3633                 # If we have a listlike key, _check_indexing_error will raise

KeyError: 'reviews per month'

Question: what is the issue? Could it be related to the data type?

For this column, the dtype is displayed as:

reviews_per_month                               float64

def remove_commas(value):
    if pd.isna(value):
       return np.NaN
    else:
        return float(value.replace (".", ","))

df["reviews per month"]=df["reviews_per_month"].apply(lambda x: remove_commas(x))"

I was expecting to get this change in this column of the dataset:

from "reviews_per_month: 0.20" to "reviews_per_month: 0,20"

Answer 1

There is no example dataframe provided, so I have created one for the purpose of the question.

Points to note:

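the df.apply() call was written incorrectly: the line ends with a stray " after the closing parenthesis, which is what the SyntaxError points at. Because that assignment never ran, the "reviews per month" column was never created, hence the KeyError.
calling float() on values that contain a comma (which are strings) would fail.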
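
To make the second point concrete, here is a minimal sketch of the two failures the original function would hit once the stray quote is removed. The variable names float_error and comma_error are only illustrative.

```python
# A float64 value is a plain float, and floats have no .replace method.
try:
    (0.20).replace(".", ",")
except AttributeError as err:
    float_error = type(err).__name__

# float() only accepts "." as the decimal separator, so "0,20" is rejected.
try:
    float("0,20")
except ValueError as err:
    comma_error = type(err).__name__

print(float_error, comma_error)
```

So, in a float64 column, the function fails at value.replace before float() is ever reached.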

Side comment: it is not clear why you replace . with , as this would change the type from number to string, which appears to be suboptimal.
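
One possible way around that, sketched here under the assumption that the decimal comma is only needed for display, is to leave the float64 column untouched and put the formatted strings in a separate column. The helper with_decimal_comma, the column reviews_per_month_display, and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with the same dtype as the asker's column (float64)
df = pd.DataFrame({"reviews_per_month": [0.20, 1.5, np.nan]})

def with_decimal_comma(value):
    # Keep missing values as NaN; format the rest with two decimals and a comma
    if pd.isna(value):
        return np.nan
    return f"{value:.2f}".replace(".", ",")

# The numeric column stays float64; only the new display column holds strings
df["reviews_per_month_display"] = df["reviews_per_month"].map(with_decimal_comma)

print(df.dtypes)
print(df)
```

The original column can still be used for sums, means, and sorting, while the display column is purely cosmetic.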

So I made those changes.

This works:

import numpy as np
import pandas as pd

# Create a sample dataframe; column B holds numbers stored as strings
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['5.5', '6.1', '7.14', '8.2']})

def remove_commas(value: str):
    # Leave missing values as NaN; otherwise swap the decimal point for a comma
    if pd.isna(value):
        return np.nan
    else:
        return value.replace(".", ",")

# Apply the function to each value of column B using Series.apply()
df['C'] = df['B'].apply(remove_commas)

# Print the resulting dataframe
print(df)

The output is:

   A     B     C
0  1   5.5   5,5
1  2   6.1   6,1
2  3  7.14  7,14
3  4   8.2   8,2
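
For the exercise as originally stated (strip the dollar sign and comma, then convert to a numeric type), a rough sketch could look like the following. The price column and its values are made up for illustration, since the question's dataset has no such values.

```python
import numpy as np
import pandas as pd

# Hypothetical string column with dollar signs and thousands separators
prices = pd.DataFrame({"price": ["$1,200.00", "$85.50", np.nan]})

# Strip "$" and "," as literal characters (regex=False), leaving NaN untouched
cleaned = (
    prices["price"]
    .str.replace("$", "", regex=False)
    .str.replace(",", "", regex=False)
)

# Convert the cleaned strings to a numeric dtype
prices["price"] = pd.to_numeric(cleaned)

print(prices.dtypes)
print(prices)
```

The vectorised .str methods skip missing values on their own, so no explicit pd.isna check is needed here.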
