
How to split strings and save as data frame in R?

I am trying to split strings on newlines, and what I keep depends on how many newlines each string contains. If a string contains two newlines, I want only the last two pieces (the two rightmost ones). If it contains only one, I just want to split it. In both cases the result should be saved in a data frame.

I have a sample data below:

data<-data.frame(Info=NA,Variable=NA)

strings<-c(" Fulton Allem \n Full Name"," 5 ft, 11 in\n 180 cm\n Height","215 lbs\n 97 kg\n Weight")

I want the following results:

Info               Variable
Fulton Allem       Full Name
180 cm             Height
97 kg              Weight

Here is my attempt:

library(stringi)
splitted<-stri_split_regex(strings,"\n")

But this does not give what I want for strings with two newlines. Height and weight are each given in two units for the same measurement, and I only want kg for weight and cm for height.

Please note that the strings are dynamic: the info varies from person to person, and some people do not have this information at all. So I can't use a regex that just extracts those specific strings.

You can try the following with str_match from stringr:

stringr::str_match(strings, '(?:.*\n)?\\s(.*)\n\\s(.*)')[, -1]

#        [,1]            [,2]       
#[1,] "Fulton Allem " "Full Name"
#[2,] "180 cm"        "Height"   
#[3,] "97 kg"         "Weight"  

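Here we capture the last and second-to-last values separated by '\n' in each string. The optional group (?:.*\n)? consumes the leading piece when a string has two newlines. Note that the first capture can keep trailing whitespace, as in "Fulton Allem ".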
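The question also asks for the result as a data frame with Info and Variable columns. The same "keep the last two pieces" idea can be sketched in base R with strsplit and tail, as an alternative to the regex above rather than a replacement for it. It assumes every string contains at least one newline (so there are always two pieces to keep), and the names pieces and result are only illustrative:

```r
strings <- c(" Fulton Allem \n Full Name",
             " 5 ft, 11 in\n 180 cm\n Height",
             "215 lbs\n 97 kg\n Weight")

# Split each string on newlines, keep only the last two pieces,
# and strip the surrounding whitespace from each piece.
# Assumption: every string has at least one "\n".
pieces <- lapply(strsplit(strings, "\n", fixed = TRUE),
                 function(x) trimws(tail(x, 2)))

# Stack the pairs row-wise and name the columns as in the question.
result <- as.data.frame(do.call(rbind, pieces), stringsAsFactors = FALSE)
names(result) <- c("Info", "Variable")
result
```

Strings with no newline at all would need to be filtered out or padded before the rbind step, since they yield a single piece instead of a pair.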
