SyntaxError: EOL while scanning string literal

This is what I was asked to do:

Remove the dollar sign and comma from the columns. If necessary, convert these two columns to the appropriate data type.

As my dataset does not contain values with a $ sign, I am replacing the "." in the review numbers with "," for the sake of the exercise.

def remove_commas(value):
    if pd.isna(value):
        return np.NaN
    else:
        return float(value.replace (".", ","))

df["reviews per month"]=df["reviews_per_month"].apply(lambda x: remove_commas(x))"

Error message 1:

File "/var/folders/vr/bbf8y6555gs306xzf_x7zxf80000gn/T/ipykernel_22769/1957524384.py", line 1
df["reviews per month"]=df["reviews_per_month"].apply(lambda x: remove_commas(x))"
^
SyntaxError: EOL while scanning string literal

Error message 2:

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)
/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3628             try:
-> 3629                 return self._engine.get_loc(casted_key)
3630             except KeyError as err:

/opt/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

/opt/anaconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'reviews per month'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/var/folders/vr/bbf8y6555gs306xzf_x7zxf80000gn/T/ipykernel_22769/969712826.py in <module>
----> 1 df["reviews per month"]

/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py in __getitem__(self, key)
3503             if self.columns.nlevels > 1:
3504                 return self._getitem_multilevel(key)
-> 3505             indexer = self.columns.get_loc(key)
3506             if is_integer(indexer):
3507                 indexer = [indexer]

/opt/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3629                 return self._engine.get_loc(casted_key)
3630             except KeyError as err:
-> 3631                 raise KeyError(key) from err
3632             except TypeError:
3633                 # If we have a listlike key, _check_indexing_error will raise

KeyError: 'reviews per month'

Question: what is the issue? Could it be related to the data type?

For this column, the following dtype is displayed:

reviews_per_month                               float64

I was expecting to get this change in this column of the dataset:

from "reviews_per_month: 0.20" to "reviews_per_month: 0,20"

There is no example dataframe provided, so I have created one for the purpose of the question.

Points to note:

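  • the df.apply() line was incorrect: it ends with a stray double quote, which raises the SyntaxError. Because that line never ran, the "reviews per month" column was never created, hence the KeyError. The lambda wrapper is also unnecessary, since the function can be passed to apply() directly.
  • calling float() on values that contain a comma (which are strings) would fail.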
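
The second point can be checked with a small sketch. The sample values below are made up for illustration, since the real dataset is not shown. It also shows a related problem: the column is reported as float64, so its elements are floats and have no replace() method.

```python
import numpy as np
import pandas as pd

# float() accepts a dot as the decimal separator, but not a comma.
try:
    float("0,20")
    comma_parse_failed = False
except ValueError:
    comma_parse_failed = True

# A float64 column holds floats, which have no .replace() method,
# so calling value.replace(...) on its elements raises AttributeError.
s = pd.Series([0.20, np.nan, 1.5], name="reviews_per_month")  # made-up sample values
try:
    s.apply(lambda v: v.replace(".", ","))
    replace_failed = False
except AttributeError:
    replace_failed = True

print(comma_parse_failed, replace_failed)
```

Both flags come out as True, so the original remove_commas() would still fail on a float64 column even after the stray quote is removed.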

Side comment: it is not clear why you replace . with , as this changes the type from number to string, which appears to be suboptimal.
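
If the comma is only needed for display, one possible approach is to leave the float64 column untouched and add a separate string column. This is a rough sketch under assumptions: the sample values, the helper name to_comma_decimal, and the column name reviews_per_month_display are all hypothetical, and two decimal places are assumed because the question shows "0,20".

```python
import numpy as np
import pandas as pd

# Made-up sample data; the real dataset is not shown in the question.
df = pd.DataFrame({"reviews_per_month": [0.20, np.nan, 1.5]})

def to_comma_decimal(value):
    """Format a float with two decimals and a comma separator; keep NaN as NaN."""
    if pd.isna(value):
        return np.nan
    return f"{value:.2f}".replace(".", ",")

# Keep the float64 column for calculations and add a string column for display.
df["reviews_per_month_display"] = df["reviews_per_month"].apply(to_comma_decimal)

print(df.dtypes)
print(df)
```

The numeric column can still be used for sums or averages, while the display column carries the comma-formatted text.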

So I made those changes.

This works:

import numpy as np
import pandas as pd

# Create a sample dataframe
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['5.5', '6.1', '7.14', '8.2']})

# Replace the decimal point with a comma; missing values stay NaN
def remove_commas(value: str):
    if pd.isna(value):
        return np.nan
    else:
        return value.replace(".", ",")

# Apply the function to the dataframe using the apply() method
df['C'] = df['B'].apply(remove_commas)

# Print the resulting dataframe
print(df)

The output is:

   A     B     C
0  1   5.5   5,5
1  2   6.1   6,1
2  3  7.14  7,14
3  4   8.2   8,2
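
As an alternative to apply(), the same result can probably be obtained with the vectorized string accessor. This is a sketch on the same sample dataframe, not something taken from the original answer. Passing regex=False makes "." a literal dot instead of a regular-expression wildcard, and missing values are left as NaN without an explicit check.

```python
import pandas as pd

# Same sample dataframe as above
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['5.5', '6.1', '7.14', '8.2']})

# regex=False treats "." as a literal dot rather than a regex wildcard.
df['C'] = df['B'].str.replace(".", ",", regex=False)

print(df)
```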
