I am trying to read a .txt file into R and run a regression on it.
This is the data set: http://www.stat.nthu.edu.tw/~swcheng/Teaching/stat5410/data/E2.9.txt
e29 <- read.table("http://www.stat.nthu.edu.tw/~swcheng/Teaching/stat5410/data/E2.9.txt", head = F, fill = T)
regr_e29 <- lm(log(e29[ ,7]) ~ log(e29[ ,1]) + log(e29[ ,4]), data = e29)
summary(regr_e29)
Why do I get this error message?
Error in log(e29[, 1]) : non-numeric argument to mathematical function
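You need to do some data cleanup first. The file starts with two header lines, and with head = F both are read as data, so columns such as the first one end up as text rather than numbers. That is why log() complains. The raw file looks like this: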
Capital Labor Real Value Added
YEAR 20 36 37 20 36 37 20 36 37
72 243462 291610 1209188 708014 881231 1259142 6496.96 6713.75 11150.0
73 252402 314728 1330372 699470 960917 1371795 5587.34 7551.68 12853.6
74 246243 278746 1157371 697628 899144 1263084 5521.32 6776.40 10450.8
75 263639 264050 1070860 674830 739485 1118226 5890.64 5554.89 9318.3
76 276938 286152 1233475 685836 791485 1274345 6548.57 6589.67 12097.7
77 290910 286584 1355769 678440 832818 1369877 6744.80 7232.56 12844.8
78 295616 280025 1351667 667951 851178 1451595 6694.19 7417.01 13309.9
...
86 294777 261943 1281262 571454 670927 1171664 8506.37 6651.02 10836.5
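The failure can be reproduced without downloading anything. The sketch below pastes the first few lines of the file into a string and reads them with the same options as in the question. The names `raw`, `bad`, and `res` are made up for this illustration, and it assumes R 4.0 or later, where text columns stay character instead of becoming factors.

```r
# First four lines of the file, pasted inline so the example is self-contained
raw <- "Capital Labor Real Value Added
YEAR 20 36 37 20 36 37 20 36 37
72 243462 291610 1209188 708014 881231 1259142 6496.96 6713.75 11150.0
73 252402 314728 1330372 699470 960917 1371795 5587.34 7551.68 12853.6"

# Same options as in the question: both header lines are read as data rows
bad <- read.table(text = raw, header = FALSE, fill = TRUE)

# The first column now mixes labels and numbers, so it is not numeric
print(bad[[1]])
print(class(bad[[1]]))

# log() on that column fails; capture the message instead of stopping
res <- tryCatch(log(bad[[1]]), error = function(e) conditionMessage(e))
print(res)
```

Checking `sapply(e29, class)` on the real data is a quick way to spot which columns were read as text.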
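To read it into R, you could use
e29 <- read.table("http://www.stat.nthu.edu.tw/~swcheng/Teaching/stat5410/data/E2.9.txt",
                  header = TRUE,  # use the YEAR 20 36 37 ... line as column names
                  fill = TRUE,    # pad short rows with NA
                  skip = 1)       # skip the first line (Capital / Labor / Real Value Added)
which creates a data.frame: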
YEAR X20 X36 X37 X20.1 X36.1 X37.1 X20.2 X36.2 X37.2
1 72 243462 291610 1209188 708014 881231 1259142 6496.96 6713.75 11150.0
2 73 252402 314728 1330372 699470 960917 1371795 5587.34 7551.68 12853.6
3 74 246243 278746 1157371 697628 899144 1263084 5521.32 6776.40 10450.8
4 75 263639 264050 1070860 674830 739485 1118226 5890.64 5554.89 9318.3
5 76 276938 286152 1233475 685836 791485 1274345 6548.57 6589.67 12097.7
6 77 290910 286584 1355769 678440 832818 1369877 6744.80 7232.56 12844.8
7 78 295616 280025 1351667 667951 851178 1451595 6694.19 7417.01 13309.9
8 79 301929 279806 1326248 675147 848950 1328683 6541.68 7425.69 13402.3
9 80 307346 258823 1089545 658027 779393 1077207 6587.33 6410.91 8571.0
10 81 302224 264913 1111942 627551 757462 1056231 6746.77 6263.26 8739.7
11 82 288805 247491 988165 609204 664834 947502 7278.30 5718.46 8140.0
12 83 291094 246028 1069651 604601 664249 1057159 7514.78 5936.93 10958.4
13 84 285601 256971 1191677 601688 717273 1169442 7539.93 6659.30 10838.9
14 85 292026 248237 1246536 584288 678155 1195255 8332.65 6632.67 10030.5
15 86 294777 261943 1281262 571454 670927 1171664 8506.37 6651.02 10836.5
16 \032 NA NA NA NA NA NA NA NA NA
Next, clean the data (using dplyr and tidyr):
library(dplyr)
library(tidyr)
clean_e29 <- e29 %>%
rename_with(~gsub("X", "Capital_", .x), 2:4) %>%
rename_with(~gsub("X", "Labor_", .x), 5:7) %>%
rename_with(~gsub("X", "RVA_", .x), 8:10) %>%
rename_with(~gsub("\\.\\d+", "", .x), everything()) %>%
drop_na()
We renamed the columns and removed the last row, which held only a stray control character (\032, probably an end-of-file marker) and NAs. Now we have a clean data.frame:
YEAR Capital_20 Capital_36 Capital_37 Labor_20 Labor_36 Labor_37 RVA_20 RVA_36 RVA_37
1 72 243462 291610 1209188 708014 881231 1259142 6496.96 6713.75 11150.0
2 73 252402 314728 1330372 699470 960917 1371795 5587.34 7551.68 12853.6
3 74 246243 278746 1157371 697628 899144 1263084 5521.32 6776.40 10450.8
4 75 263639 264050 1070860 674830 739485 1118226 5890.64 5554.89 9318.3
5 76 276938 286152 1233475 685836 791485 1274345 6548.57 6589.67 12097.7
6 77 290910 286584 1355769 678440 832818 1369877 6744.80 7232.56 12844.8
7 78 295616 280025 1351667 667951 851178 1451595 6694.19 7417.01 13309.9
8 79 301929 279806 1326248 675147 848950 1328683 6541.68 7425.69 13402.3
9 80 307346 258823 1089545 658027 779393 1077207 6587.33 6410.91 8571.0
10 81 302224 264913 1111942 627551 757462 1056231 6746.77 6263.26 8739.7
11 82 288805 247491 988165 609204 664834 947502 7278.30 5718.46 8140.0
12 83 291094 246028 1069651 604601 664249 1057159 7514.78 5936.93 10958.4
13 84 285601 256971 1191677 601688 717273 1169442 7539.93 6659.30 10838.9
14 85 292026 248237 1246536 584288 678155 1195255 8332.65 6632.67 10030.5
15 86 294777 261943 1281262 571454 670927 1171664 8506.37 6651.02 10836.5
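One detail the printed output does not show: because the last row held a non-numeric character, YEAR was most likely read as text, and drop_na() does not change column types. The sketch below illustrates this on a small hand-built stand-in for the imported data (`e29_like` and `clean` are hypothetical names, and only three of the ten columns are included). It uses base R's complete.cases() in place of drop_na() and then converts YEAR to integer.

```r
# Hand-built stand-in for the imported data: two real rows plus the junk row
e29_like <- data.frame(
  YEAR  = c("72", "73", "\032"),      # text, because of the control character
  X20   = c(243462, 252402, NA),
  X20.1 = c(708014, 699470, NA),
  X20.2 = c(6496.96, 5587.34, NA)
)

# Keep only rows without any NA (base R counterpart of tidyr::drop_na)
clean <- e29_like[complete.cases(e29_like), ]

# YEAR is still character at this point; convert it if it is needed as a number
print(class(clean$YEAR))
clean$YEAR <- as.integer(clean$YEAR)
print(str(clean))
```

This does not matter for the regression here, which does not use YEAR, but it would matter if YEAR were used as a numeric predictor or for plotting.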
With the cleaned data, the regression runs:
lm(log(RVA_20) ~ log(Labor_20) + log(Capital_20), data = clean_e29)
This returns
Call:
lm(formula = log(RVA_20) ~ log(Labor_20) + log(Capital_20), data = clean_e29)
Coefficients:
(Intercept) log(Labor_20) log(Capital_20)
25.4929 -1.4585 0.2269
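Because both sides of the model are in logs, the slope coefficients read as elasticities: the estimate of -1.4585 means a 1% increase in Labor_20 is associated with roughly a 1.46% decrease in RVA_20, holding Capital_20 fixed. To get standard errors and p-values, as the question's summary() call intended, the fit can be stored first. The sketch below is self-contained, with the three sector-20 columns typed in from the table above (`sector20` and `fit` are names chosen for this example). If the values were transcribed faithfully, the coefficients should match the output above.

```r
# Sector 20 columns copied from the cleaned table (15 yearly observations)
sector20 <- data.frame(
  Capital_20 = c(243462, 252402, 246243, 263639, 276938, 290910, 295616, 301929,
                 307346, 302224, 288805, 291094, 285601, 292026, 294777),
  Labor_20   = c(708014, 699470, 697628, 674830, 685836, 678440, 667951, 675147,
                 658027, 627551, 609204, 604601, 601688, 584288, 571454),
  RVA_20     = c(6496.96, 5587.34, 5521.32, 5890.64, 6548.57, 6744.80, 6694.19, 6541.68,
                 6587.33, 6746.77, 7278.30, 7514.78, 7539.93, 8332.65, 8506.37)
)

# Same log-log model as above, stored so it can be inspected
fit <- lm(log(RVA_20) ~ log(Labor_20) + log(Capital_20), data = sector20)

print(coef(fit))      # point estimates
print(summary(fit))   # standard errors, t values, R-squared
print(confint(fit))   # 95% confidence intervals for the coefficients
```

With only 15 observations the intervals may be wide, so the signs and sizes of the estimates are worth reading alongside confint() rather than on their own.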