How to apply a function to a certain column by name in a dataframe
I have a dataframe with columns that contain GPS coordinates. I want to convert the columns that are in degree seconds to decimal degrees. For example, I have two columns named "lat_sec" and "long_sec" that are formatted with values like 186780.8954N. I tried to write a function that saves the last character of the string as the direction, divides the numeric part to get the decimal degrees, and then concatenates the two to produce the new format. I then tried to find the column by its name in the dataframe and apply the function to it.
I am new to Python and can't find other resources on this. I don't think I created my function properly. It uses the word 'coordinate' because I did not know what to call the value that I am breaking down. My data looks like this:
long_sec
635912.9277W
555057.2000W
581375.9850W
581166.2780W
df = pd.DataFrame(my_array)
def convertDec(coordinate):
    decimal = float(coordinate[:-1]/3600)
    direction = coordinate[-1:]
    return str(decimal) + str(direction)
df['lat_sec'] = df['lat_sec'].apply(lambda x: x.convertDec())
My error looks like this:
Traceback (most recent call last):
File "code.py", line 44, in <module>
df['lat_sec'] = df['lat_sec'].apply(lambda x: x.convertDec())
File "C:\Python\Python37\lib\site-packages\pandas\core\frame.py", line 2917, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Python\Python37\lib\site-packages\pandas\core\indexes\base.py", line 2604, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 129, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index_class_helper.pxi", line 91, in pandas._libs.index.Int64Engine._check_type
KeyError: 'lat_sec'
By writing float(coordinate[:-1]/3600) you are dividing a str by an int, which is not possible. Instead, convert the str to a float first and then divide it by the integer 3600, which gives you a float output.
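As a small illustration of the point above, the sketch below reproduces the failure on a single value from the question and then applies the corrected order of operations. The variable names are only for this example.

```python
coordinate = "635912.9277W"

# The original expression divides the string slice by 3600 before
# float() is ever called, so Python raises a TypeError.
try:
    float(coordinate[:-1] / 3600)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Converting to float first, then dividing, works as intended.
decimal = float(coordinate[:-1]) / 3600
print(decimal)
```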
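Second, you are not using apply properly: x.convertDec() tries to call convertDec as a method of each string value, whereas it is a standalone function and should be called as convertDec(x). Also, there is no lat_sec column in your dataframe to apply the function to, which is what the KeyError reports. The Int64Engine in the traceback suggests the columns have default integer labels, since pd.DataFrame(my_array) was created without column names.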
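The two problems with the apply call can be checked separately. The following is a minimal sketch using two rows of the sample data; the names has_lat, attr_error and converted are hypothetical and exist only for this illustration.

```python
import pandas as pd

def convertDec(coordinate):
    # Numeric part in seconds -> decimal degrees, direction letter kept.
    return str(float(coordinate[:-1]) / 3600) + coordinate[-1]

df = pd.DataFrame({"long_sec": ["635912.9277W", "555057.2000W"]})

# Check that a column exists before indexing it by name;
# df["lat_sec"] would raise KeyError here.
has_lat = "lat_sec" in df.columns
print(has_lat)

# Each x is a plain Python string, and strings have no convertDec method.
try:
    df["long_sec"].apply(lambda x: x.convertDec())
except AttributeError as exc:
    attr_error = str(exc)
    print("AttributeError:", attr_error)

# Pass the function itself; pandas calls it once per value.
converted = df["long_sec"].apply(convertDec)
print(converted)
```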
import pandas as pd
df = pd.DataFrame(['635912.9277W','555057.2000W','581375.9850W','581166.2780W'],columns=['long_sec'])
# Define the conversion function
def convertDec(coordinate):
    decimal = float(coordinate[:-1]) / 3600  # numeric part: seconds to decimal degrees
    direction = coordinate[-1:]              # trailing direction letter
    return str(decimal) + str(direction)

# To update the existing column in place
df['long_sec'] = df.apply(lambda row: convertDec(row['long_sec']), axis=1)
# Alternatively, to create a new column instead, assign to whatever name you want
# (use this in place of the line above, not after it):
# df['long_dec'] = df.apply(lambda row: convertDec(row['long_sec']), axis=1)
print(df)
#OUTPUT
long_sec
0 176.64247991666667W
1 154.18255555555555W
2 161.49332916666665W
3 161.43507722222225W
If you want the output as a whole number rather than a float, change float(coordinate[:-1])/3600 to int(float(coordinate[:-1])/3600). Note that int() truncates the fractional part rather than rounding.
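The difference between truncating and rounding matters for coordinates, so here is a short sketch comparing the options on the first sample value. Using round() is an alternative suggested here for comparison and was not part of the original advice.

```python
seconds = 635912.9277
degrees = seconds / 3600          # about 176.6425

truncated = int(degrees)          # drops the fractional part
rounded = round(degrees)          # rounds to the nearest whole degree
four_places = round(degrees, 4)   # keeps four decimal places

print(truncated, rounded, four_places)
```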
In your code above, inside the convertDec function, there is also an error in this line:
decimal = float(coordinate[:-1]/3600)
You need to convert the coordinate to float first, before dividing it by 3600.
So, your code above should look like this:
import pandas as pd
# Your example dataset
dictCoordinates = {
    "long_sec": ["111111.1111W", "222222.2222W", "333333.3333W", "444444.4444W"],
    "lat_sec": ["555555.5555N", "666666.6666N", "777777.7777N", "888888.8888N"]
}
# Insert your dataset into a pandas DataFrame
df = pd.DataFrame(data=dictCoordinates)
# Your conversion function
def convertDec(coordinate):
    decimal = float(coordinate[:-1]) / 3600  # Drop the last character, convert to float, then divide by 3600
    decimal = format(decimal, ".4f")         # Format with 4 digits after the decimal point
    direction = coordinate[-1]               # Extract the direction letter (N or W)
    return str(decimal) + direction          # Return the combined string
# Do the conversion for your "long_sec"
df["long_sec"] = df.apply(lambda x : convertDec(x["long_sec"]), axis = 1)
# Do the conversion for your "lat_sec"
df["lat_sec"] = df.apply(lambda x : convertDec(x["lat_sec"]), axis = 1)
print(df)
That's it. Hope this helps.
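Since the goal is to apply one function to specific columns by name, a slightly shorter variant is to call apply on each column (a Series) directly instead of going row by row with axis=1. This is a sketch under the assumption that every value is a string ending in a single direction letter; the list name columns_to_convert is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "long_sec": ["111111.1111W", "222222.2222W"],
    "lat_sec": ["555555.5555N", "666666.6666N"],
})

def convertDec(coordinate):
    # Seconds -> decimal degrees with 4 decimal places, direction kept.
    decimal = float(coordinate[:-1]) / 3600
    return format(decimal, ".4f") + coordinate[-1]

# Select each column by name and apply the function to its values.
columns_to_convert = ["long_sec", "lat_sec"]
for name in columns_to_convert:
    df[name] = df[name].apply(convertDec)

print(df)
```

Series.apply passes each cell value straight to the function, so no lambda is needed.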