What is the meaning of the regex pattern ',\\s*([^\\.]*)\\s*\\.'?
I have a dataframe identical to the one in "Extract sub-string between 2 special characters from one column of Pandas DataFrame" and want to extract the substring located between "," and ".". Thanks to the answer to that post, one way is as follows:
In [157]: df['Title'] = df.Name.str.extract(r',\s*([^\.]*)\s*\.', expand=False)

In [158]: df
Out[158]:
                   Name   Title
0        Jim, Mr. Jones      Mr
1     Sara, Miss. Baker    Miss
2     Leila, Mrs. Jacob     Mrs
3  Ramu, Master. Kuttan  Master
Although I can see that the outcome is correct, what is the meaning of ',\\s*([^\\.]*)\\s*\\.'? In particular, what is the meaning of '*' and '\\'?
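It means the following. Match:
,  a comma
\\s*  zero or more whitespace characters (tabs, spaces, etc.)
([^\\.]*)  zero or more characters that are not a . (dot), captured as a group
\\s*  zero or more whitespace characters
\\.  a literal . (dot)
The * is a quantifier meaning "zero or more of the preceding element". In a regular (non-raw) Python string, \\ is an escaped backslash, so '\\s' reaches the regex engine as \s, exactly like the raw string r'\s'. The parentheses form the capture group that str.extract returns.
You can find more about regex here.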
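The breakdown can be checked with a short sketch that uses only the standard re module instead of pandas. The names VERBOSE_PATTERN and extract_title are made up for this illustration and do not come from the original post; the sample names are the four rows of the dataframe.

```python
import re

# In a non-raw string each "\\" is one literal backslash,
# so this is the same pattern as the raw string used in the post.
PATTERN = ',\\s*([^\\.]*)\\s*\\.'
assert PATTERN == r',\s*([^\.]*)\s*\.'

# The same pattern written piece by piece; re.VERBOSE ignores
# the layout whitespace and the trailing comments.
VERBOSE_PATTERN = re.compile(r"""
    ,          # a literal comma
    \s*        # zero or more whitespace characters
    ([^\.]*)   # group 1: zero or more characters that are not a dot
    \s*        # zero or more whitespace characters
    \.         # a literal dot
""", re.VERBOSE)

names = [
    "Jim, Mr. Jones",
    "Sara, Miss. Baker",
    "Leila, Mrs. Jacob",
    "Ramu, Master. Kuttan",
]


def extract_title(name):
    """Return the text captured between ',' and '.', or None if no match."""
    match = VERBOSE_PATTERN.search(name)
    return match.group(1) if match else None


titles = [extract_title(n) for n in names]
print(titles)  # ['Mr', 'Miss', 'Mrs', 'Master']
```

A string without a comma followed by a dot gives None here, which corresponds to the NaN that str.extract puts in the dataframe column.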
UPDATE
As @UnbearableLightness mentioned, the \\ is redundant inside a character set when escaping the . (dot), so [^.] means the same as [^\\.]. A character set is anything defined between [].
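A quick way to confirm that the escape inside the character set makes no difference is to run both spellings on the same inputs. This is only a sketch: the helper first_group and the extra sample strings are invented for the comparison.

```python
import re

# Dot escaped inside the character set (as in the original answer)
escaped = re.compile(r',\s*([^\.]*)\s*\.')
# Dot left unescaped inside the character set
unescaped = re.compile(r',\s*([^.]*)\s*\.')

samples = [
    "Jim, Mr. Jones",
    "Ramu, Master. Kuttan",
    "no comma or dot",      # no match for either pattern
    "a,   padded title .",  # extra whitespace around the title
]


def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if there is none."""
    match = pattern.search(text)
    return match.group(1) if match else None


results = [(first_group(escaped, s), first_group(unescaped, s)) for s in samples]
for with_escape, without_escape in results:
    # Both spellings must capture exactly the same text
    assert with_escape == without_escape
```

The final \. outside the brackets still needs its backslash, because an unescaped . outside a character set matches any character.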