
Invalid syntax in OLS using statsmodels in Python

I want to try regression on some data.

country_scores=country_scores.rename(columns={"Median Math Score (TIMSS Scale, 4th Grade)": "Median Math Score"})
country_scores_log2 = country_scores.copy()
country_scores_log2['GDP Per Capita'] = np.log2(country_scores_log2['GDP Per Capita'].astype(float))
mod = smf.ols(formula="GDP Per Capita ~ Median Math Score", data=country_scores_log2)
res = mod.fit()
print(res.summary())

When I try this, I always get an error saying:

  File "C:\ProgramData\Anaconda3\lib\ast.py", line 50, in parse
    return compile(source, filename, mode, flags,

  File "<unknown>", line 1
    Median Math Score
           ^
SyntaxError: invalid syntax

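The formula string is parsed as Python-like code, so a column name that contains spaces, such as Median Math Score, is not a valid identifier. That is what the SyntaxError in the traceback points at. Quote such names with Q() inside the formula, or rename the columns so they contain no spaces. Try this:

mod = smf.ols(formula='Q("GDP Per Capita") ~ Q("Median Math Score")', data=country_scores_log2)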
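As a runnable sketch of fitting a formula when the column names contain spaces, the snippet below wraps each name in Q(), the quoting helper the formula interface provides for names that are not valid Python identifiers. The small DataFrame is a made-up stand-in for country_scores, so the numbers mean nothing; only the call pattern matters.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stand-in for country_scores; the values are invented for illustration.
country_scores = pd.DataFrame({
    "GDP Per Capita": [1000.0, 4000.0, 16000.0, 32000.0, 64000.0],
    "Median Math Score": [400.0, 450.0, 500.0, 530.0, 560.0],
})

country_scores_log2 = country_scores.copy()
country_scores_log2["GDP Per Capita"] = np.log2(
    country_scores_log2["GDP Per Capita"].astype(float)
)

# Q("...") lets the formula refer to a column whose name contains spaces.
mod = smf.ols(
    formula='Q("GDP Per Capita") ~ Q("Median Math Score")',
    data=country_scores_log2,
)
res = mod.fit()
print(res.params)
```

The formula interface adds an intercept on its own, so the fitted result has two parameters: the intercept and the slope for the math score.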

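Alternatively, you can skip the formula and pass the data directly. That interface is sm.OLS from statsmodels.api (smf.ols only accepts a formula), and it has the form:

mod = sm.OLS(y, X)

y being your target variable and X being a matrix/DataFrame of one or more input variables. Unlike the formula interface, sm.OLS does not add an intercept automatically, so wrap X in sm.add_constant if you want one.

The way I would personally write it would be:

mod = sm.OLS(country_scores_log2['GDP Per Capita'], sm.add_constant(country_scores_log2[['Median Math Score']]))

Where you could replace ['Median Math Score'] with a longer list of column names if you wanted to experiment with different inputs using one or more of your other variables.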
