I am new to Pandas and I am trying to get the biggest string for every row in a DataFrame.
import pandas as pd
import sqlite3
authors = pd.read_sql('select * from authors', conn)  # conn is an open sqlite3 connection
authors['name']
...
12 KRISHNAN RAJALAKSHMI
13 J O
14 TSIPE
15 NURRIZA
16 HATICE OZEL
17 D ROMERO
18 LLIBERTAT
19 E F
20 JASMEET KAUR
...
What I expect is to get back the biggest string in each authors['name'] row:
...
12 RAJALAKSHMI
13 J
14 TSIPE
15 NURRIZA
16 HATICE
17 ROMERO
18 LLIBERTAT
19 E
20 JASMEET
...
I tried to split the string by spaces and apply(max) but it's not working. It seems that pandas is not applying max to each row.
authors['name'].str.split().apply(max)
# or
authors['name'].str.split().apply(lambda x: max(x))
# or
def get_max(x):
    y = max(x)
    print(y)  # y is the biggest string in each row
    return y
authors['name'].str.split().apply(get_max)
# Still results in:
...
12 KRISHNAN RAJALAKSHMI
13 J O
14 TSIPE
15 NURRIZA
16 HATICE OZEL
17 D ROMERO
18 LLIBERTAT
19 E F
20 JASMEET KAUR
...
When you tell pandas to apply max to the split series, Python compares the words in each list alphabetically, so you get the word that sorts last rather than the longest one. You might instead try something like
authors['name'].apply(lambda x: max(x.split(), key=len))
For each row, this splits the name into a list of substrings and returns the longest one, using the string length as the key.
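As a minimal sketch of the key=len approach, the snippet below runs it on a few names copied from the question. The Series name `names` and the helper `longest_word` are illustrative and not part of the original code. It also shows how ties are handled: Python's max returns the first maximal item, so "J O" gives "J".

```python
import pandas as pd

# A few sample names taken from the question; `names` is an illustrative variable.
names = pd.Series(["KRISHNAN RAJALAKSHMI", "J O", "HATICE OZEL", "D ROMERO"])


def longest_word(name):
    """Return the longest whitespace-separated word; ties go to the first one."""
    return max(name.split(), key=len)


longest = names.apply(longest_word)
print(longest.tolist())
# ['RAJALAKSHMI', 'J', 'HATICE', 'ROMERO']
```

Note that max raises a ValueError on an empty list, so an empty name string would need separate handling.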
Also note that authors['name'].apply(lambda x: max(x.split())) runs without key=len, but it then returns the alphabetically last word of each row rather than the longest one. authors['name'].str.split().max() does not do what you want either, since max() there is a pandas Series method that reduces the whole column to a single maximum value instead of returning one result per row.
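To make the difference between these calls concrete, here is a small sketch using values from the question. The variable names are illustrative. It contrasts the default alphabetical comparison with key=len, and shows that Series.max() collapses a column to one value.

```python
import pandas as pd

words = "HATICE OZEL".split()

# Default comparison is alphabetical: "OZEL" sorts after "HATICE".
alphabetical_max = max(words)
# With key=len the words are compared by length instead.
longest = max(words, key=len)

# Series.max() is a reduction: one value for the whole column, not one per row.
column = pd.Series(["TSIPE", "NURRIZA", "LLIBERTAT"])
column_max = column.max()

print(alphabetical_max, longest, column_max)
# OZEL HATICE TSIPE
```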
You are not assigning the result back to the column, so its values never change.
Try this function:
def getName(df):
    df[0] = df[0].apply(lambda x: max(x.split(), key=len))
And then you just have to call:
getName(authors)
Note that this code assigns the result back to df[0], so the DataFrame you pass in is modified.
Output:
names
0 RAJALAKSHMI
1 J
2 TSIPE
3 NURRIZA
4 HATICE
5 ROMERO
6 LLIBERTAT
7 E
8 JASMEET
The main problem in your code is that you weren't assigning the result back: apply returns a new Series and leaves authors['name'] unchanged.
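A short sketch of that point, assuming the column is called 'name' as in the question and using a small hand-built DataFrame in place of the SQL query: calling apply leaves the original column untouched until the result is assigned back.

```python
import pandas as pd

# Stand-in for the table read from SQL; the rows are copied from the question.
authors = pd.DataFrame({"name": ["KRISHNAN RAJALAKSHMI", "D ROMERO", "TSIPE"]})

# apply returns a new Series; authors["name"] is not modified here.
result = authors["name"].apply(lambda x: max(x.split(), key=len))
unchanged = authors["name"].tolist()

# Assigning the result back is what actually replaces the column values.
authors["name"] = result
updated = authors["name"].tolist()

print(unchanged)
print(updated)
```

If the original names should be kept, the result could instead be stored in a new column.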