Split string into multiple variables

I have a dataset that looks like this:

Var1
PASSED=50; NOT PASSED=10; GPA=1;

How can I produce the dataset below?

Pass     Not_pass      GPA  
  50           10        1

I used the following code but it did not work:

generate pass = subinstr(subinstr(word(Var1, 1), "PASSED=", "", .) if regexm(Var1, "PASSED=") == 1
replace pass = pass[_n+1] if pass[_n]=="" & pass[_n+1]!=""

The following works for me:

clear
input strL Var1
"PASSED=50; NOT PASSED=10; GPA=1;"
end

split Var1, parse(";") generate(x)

forvalues i = 1 / 3 {
    generate v`i' = real(regexs(1)) if regexm(x`i',"([0-9]+)")
}

drop x*
rename (v1 v2 v3) (Pass Not_Pass GPA)

list 

     +----------------------------------------------------------+
     |                             Var1   Pass   Not_Pass   GPA |
     |----------------------------------------------------------|
  1. | PASSED=50; NOT PASSED=10; GPA=1;     50         10     1 |
     +----------------------------------------------------------+

You can learn how to split strings in Python from the str documentation. For example:

var1 = "PASSED=50; NOT PASSED=10; GPA=1;"
p, np, gpa, _ = var1.split(";")

Splitting on ";" leaves leading whitespace on some of the pieces:

print(repr(np))
>>> ' NOT PASSED=10'

This can be fixed with strip:

print(repr(np.strip()))
>>> 'NOT PASSED=10'

Then you can set up a dictionary to store all of your data:

d = {x.strip().split("=")[0]:x.split("=")[1] for x in [p, np, gpa]}
print(d)
>>> {'PASSED': '50', 'NOT PASSED': '10', 'GPA': '1'}
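
The dictionary above keeps the values as strings and the keys as the original labels. As a minimal sketch, assuming every field has the form KEY=VALUE with an integer value, the steps can be wrapped in a helper that converts the values to numbers and renames the keys to the column names asked for in the question. The names parse_record and COLUMN_NAMES are hypothetical, introduced only for this illustration.

```python
def parse_record(text):
    """Parse 'KEY=VALUE; KEY=VALUE;' text into a dict of integer values."""
    record = {}
    for field in text.split(";"):
        field = field.strip()
        if not field:  # skip the empty piece left by the trailing ';'
            continue
        key, value = field.split("=", 1)
        record[key.strip()] = int(value)  # assumes integer values
    return record


# Hypothetical mapping from the source labels to the requested column names.
COLUMN_NAMES = {"PASSED": "Pass", "NOT PASSED": "Not_pass", "GPA": "GPA"}

var1 = "PASSED=50; NOT PASSED=10; GPA=1;"
row = {COLUMN_NAMES[key]: value for key, value in parse_record(var1).items()}
print(row)
# {'Pass': 50, 'Not_pass': 10, 'GPA': 1}
```

If a value may contain decimals (a GPA such as 3.5), float(value) would be needed instead of int(value).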

Using moss from SSC is another way to do it in Stata.

clear
input strL Var1
"PASSED=50; NOT PASSED=10; GPA=1;"
end

ssc install moss 
moss Var1, match("([0-9]+)") regex 

rename (_match?) (Pass Not_Pass GPA)
drop _* 

list 

     +----------------------------------------------------------+
     |                             Var1   Pass   Not_Pass   GPA |
     |----------------------------------------------------------|
  1. | PASSED=50; NOT PASSED=10; GPA=1;     50         10     1 |
     +----------------------------------------------------------+
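
For comparison, the same match-every-number idea can be sketched in Python with re.findall from the standard library. This is only an illustration, not part of the moss answer, and it assumes each record holds exactly three integers in the order Pass, Not_Pass, GPA.

```python
import re

var1 = "PASSED=50; NOT PASSED=10; GPA=1;"

# Find every run of digits, in order of appearance.
numbers = [int(m) for m in re.findall(r"[0-9]+", var1)]

# Assumes exactly three numbers per record, in this fixed order.
passed, not_passed, gpa = numbers
print(passed, not_passed, gpa)
# 50 10 1
```

As with the pattern "([0-9]+)" in the Stata code, a decimal value such as 3.5 would be matched as two separate numbers, so the pattern would need adjusting for non-integer data.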
