R (Synth package/dataprep function): Unit variable not being read as numeric even though it has been converted to numeric before running dataprep

I have a table called "traffic" that looks like this (a portion is shown below; the full table covers 9 states from 1994 to 2015):

state_no  state         year  traffic_fatalities_per_thousand  historical_gas_prices  unemployment_rate  lane_miles_per_thousand
1         Arizona       1994  0.2179594224                     20.64                  6.358333333        13.11132012
2         California    1994  0.1351334997                     28.09                  4.958333333        5.397899983
3         Florida       1994  0.1924537227                     20.92                  4.383333333        8.127749735
4         Missouri      1994  0.2062029014                     20.47                  4.725              23.15967224
5         Montana       1994  0.2362785888                     24.09                  4.966666667        81.11373773
6         Nebraska      1994  0.1671239449                     21.76                  5.391666667        57.18599045
7         Ohio          1994  0.1232962284                     21.5                   5.75               10.30144488
8         South Dakota  1994  0.2129901886                     22.08                  6.116666667        115.2318412
9         Texas         1994  0.1737891025                     20.44                  6.566666667        16.05877834
1         Arizona       1995  0.240311611                      18.72                  8.016666667        12.66825296
2         California    1995  0.1331067259                     24.37                  8.191666667        5.410286718
3         Florida       1995  0.1977384781                     19.23                  7.541666667        8.020780234
4         Missouri      1995  0.208278165                      18.56                  4.866666667        23.02816544
5         Montana       1995  0.2475469821                     21                     4.366666667        80.0636023
6         Nebraska      1995  0.1553381908                     20                     5                  56.72596019
7         Ohio          1995  0.1219130342                     19.6                   5.458333333        10.26964922
8         South Dakota  1995  0.2169581641                     19.92                  6.016666667        114.4660289
9         Texas         1995  0.1703988275                     18.56                  6.733333333        15.85603114

When I run the dataprep function, I keep getting this error:

Error in dataprep(foo = cell_phone_policy, predictors = c("historical_gas_prices",  : 
 unit.variable not found as numeric variable in foo.

The message seems straightforward (the "state_no" variable isn't being read as numeric), but I still get it even after converting that variable to numeric. What am I doing wrong, and how can I fix it? My code for the dataprep function is below (the treated unit is state_no == 2 (CA), and the treatment occurs in 2008). Any help is greatly appreciated.

EDIT (2021-04-25): See the code below for an abridged reproducible example.

library(tidyverse)
library(Synth)

### Columns
state_no <- c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 
              1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 
              2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 
              3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 
              4, 1, 2, 3, 4)
state <- c("Arizona", "California", "Florida", "Missouri", "Arizona", 
                  "California", "Florida", "Missouri", "Arizona", "California", 
                  "Florida", "Missouri", "Arizona", "California", "Florida", "Missouri", 
                  "Arizona", "California", "Florida", "Missouri", "Arizona", "California", 
                  "Florida", "Missouri", "Arizona", "California", "Florida", "Missouri", 
                  "Arizona", "California", "Florida", "Missouri", "Arizona", "California", 
                  "Florida", "Missouri", "Arizona", "California", "Florida", "Missouri", 
                  "Arizona", "California", "Florida", "Missouri", "Arizona", "California", 
                  "Florida", "Missouri", "Arizona", "California", "Florida", "Missouri", 
                  "Arizona", "California", "Florida", "Missouri", "Arizona", "California", 
                  "Florida", "Missouri", "Arizona", "California", "Florida", "Missouri", 
                  "Arizona", "California", "Florida", "Missouri", "Arizona", "California", 
                  "Florida", "Missouri", "Arizona", "California", "Florida", "Missouri", 
                  "Arizona", "California", "Florida", "Missouri", "Arizona", "California", 
                  "Florida", "Missouri", "Arizona", "California", "Florida", "Missouri")
year <- c(1994, 1994, 1994, 1994, 1995, 1995, 1995, 1995, 1996, 1996, 
                 1996, 1996, 1997, 1997, 1997, 1997, 1998, 1998, 1998, 1998, 1999, 
                 1999, 1999, 1999, 2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 
                 2002, 2002, 2002, 2002, 2003, 2003, 2003, 2003, 2004, 2004, 2004, 
                 2004, 2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006, 2007, 2007, 
                 2007, 2007, 2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009, 2010, 
                 2010, 2010, 2010, 2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012, 
                 2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2015, 2015, 2015, 
                 2015)
traffic_fatalities_per_thousand <- c(0.2179594224, 0.1351334997, 0.1924537227, 0.2062029014, 0.240311611, 
                                            0.1331067259, 0.1977384781, 0.208278165, 0.2242623933, 0.1255159203, 
                                            0.1908239401, 0.2138643727, 0.2089096563, 0.1144712094, 0.1896706133, 
                                            0.2204503586, 0.2099725386, 0.1069064046, 0.1894926494, 0.2149860544, 
                                            0.2143007225, 0.1073762862, 0.1932335948, 0.2000607863, 0.2005105665, 
                                            0.1103863561, 0.18688565, 0.2063908747, 0.1981660869, 0.114643306, 
                                            0.1843246454, 0.1946120467, 0.2077371061, 0.1170793346, 0.1883180478, 
                                            0.2128393956, 0.2001607015, 0.1196349841, 0.1871014316, 0.215964586, 
                                            0.2001573783, 0.1156339776, 0.187364873, 0.1967732667, 0.1977776768, 
                                            0.1207454338, 0.1987292625, 0.217281202, 0.20928253, 0.1173822778, 
                                            0.1863023849, 0.1878971921, 0.1685706016, 0.1098205282, 0.1765430594, 
                                            0.168753431, 0.1443036962, 0.09342523068, 0.1625897381, 0.1623924467, 
                                            0.1270661251, 0.08360111619, 0.1372459583, 0.1472885487, 0.1184609996, 
                                            0.07288414513, 0.1296858774, 0.1369252101, 0.127614021, 0.07481726958, 
                                            0.1259628482, 0.130776046, 0.1252812534, 0.07796762635, 0.1258439986, 
                                            0.1371640063, 0.1281195372, 0.08105389155, 0.1228976221, 0.1252446365, 
                                            0.1148335196, 0.07994330262, 0.1253688617, 0.1263278233, 0.1313695754, 
                                            0.08652486263, 0.1449341709, 0.1430057373)
historical_gas_prices <- c(20.64, 28.09, 20.92, 20.47, 18.72, 24.37, 19.23, 18.56, 16.5, 
                       22.01, 17.14, 16.36, 18.38, 25.47, 18.85, 18.23, 26.2, 29.99, 
                       26.44, 25.98, 27.48, 31.08, 27.6, 27.25, 28.32, 32.24, 28.03, 
                       28.09, 27.86, 30.51, 27.42, 27.63, 21.97, 24.76, 21.8, 21.79, 
                       17.84, 20.56, 18.22, 18.22, 25.56, 26.96, 25.56, 25.46, 22.12, 
                       23.34, 21.53, 21.62, 19.73, 21.47, 19.77, 19.97, 17.49, 18.96, 
                       17.39, 17.66, 14.17, 16.3, 13.98, 14.26, 11.86, 13.78, 11.53, 
                       11.94, 10.53, 11.19, 10.12, 10.56, 10.93, 12.27, 10.43, 10.97, 
                       11.38, 12.55, 11.04, 11.73, 8.52, 10.52, 8.52, 8.61, 8, 9.01, 
                       7.69, 7.91, 9.33, 10.27, 9.15, 9.35)
unemployment_rate <- c(6.358333333, 4.958333333, 4.383333333, 4.725, 8.016666667, 
                          8.191666667, 7.541666667, 4.866666667, 2.841666667, 2.941666667, 
                          3.291666667, 3.183333333, 2.825, 3, 3.55, 3.608333333, 5.666666667, 
                          3.05, 3.333333333, 3.7, 6.216666667, 5.791666667, 4.375, 4.116666667, 
                          7.408333333, 8.816666667, 10.26666667, 10.14166667, 2.716666667, 
                          2.508333333, 2.408333333, 2.666666667, 3.258333333, 2.925, 3.041666667, 
                          3.758333333, 5.708333333, 5.491666667, 2.983333333, 3.25, 4.508333333, 
                          4.741666667, 4.55, 4.608333333, 5.341666667, 5.8, 6.625, 7.183333333, 
                          3.416666667, 3.233333333, 4.033333333, 4.35, 9.025, 6.216666667, 
                          5.083333333, 4.875, 5.283333333, 5.6, 6.475, 5.108333333, 2.875, 
                          3.75, 4.466666667, 4.958333333, 6.433333333, 7.533333333, 8.666666667, 
                          10, 5.466666667, 4.925, 5.266666667, 5.933333333, 12.45833333, 
                          11.54166667, 7.3, 5.308333333, 4.625, 5.491666667, 5.3, 6.191666667, 
                          4.308333333, 4.666666667, 4.966666667, 5.783333333, 6.075, 6.791666667, 
                          7.816666667, 8.35)
lane_miles_per_thousand <- c(13.11132012, 5.397899983, 8.127749735, 23.15967224, 12.66825296, 
                             5.410286718, 8.020780234, 23.02816544, 12.38519525, 5.365058287, 
                             7.931150334, 22.8670941, 12.23867895, 5.29525564, 7.802851529, 
                             22.70472246, 11.5632734, 5.07762586, 7.741764113, 22.59229412, 
                             11.39623618, 5.037604177, 7.673557518, 22.46185221, 10.68221978, 
                             4.943620455, 7.269222395, 21.9478946, 10.41663524, 4.890887451, 
                             7.178380114, 22.03547186, 10.49054035, 4.808558247, 7.193136912, 
                             21.96861993, 10.29968246, 4.802053099, 7.107138507, 21.85694835, 
                             10.10577387, 4.765495135, 6.903448351, 21.92768147, 10.02979414, 
                             4.734681207, 6.810177288, 21.74920875, 9.77202124, 4.714393415, 
                             6.770374069, 21.80773214, 9.537066724, 4.704936844, 6.677426654, 
                             21.96550455, 9.29820405, 4.693325559, 6.622912932, 21.94294105, 
                             9.528382883, 4.650115936, 6.510980427, 21.86832337, 10.03687742, 
                             4.612574948, 6.457868513, 21.74125505, 10.05647925, 4.575171682, 
                             6.390462681, 21.90698429, 9.958715173, 4.613364955, 6.306642741, 
                             21.91601843, 10.0263724, 4.565027174, 6.243997042, 21.82267841, 
                             9.744210935, 5.812280137, 6.152373837, 21.69738087, 9.683856261, 
                             5.002807779, 6.050878307, 21.62328936)

### Data Frame
df_test <- data.frame(state_no, state, year, traffic_fatalities_per_thousand,
                         historical_gas_prices, unemployment_rate, lane_miles_per_thousand)

### Synth
dataprep.out <- dataprep(foo = df_test,
                         predictors = c("historical_gas_prices", "unemployment_rate", "lane_miles_per_thousand"),
                         predictors.op = "mean", 
                         special.predictors = list(
                           list("historical_gas_prices", 1994:2008, "mean"),
                           list("unemployment_rate", 1994:2008,"mean"),
                           list("lane_miles_per_thosuand", 1994:2008,"mean"),
                           dependent = "traffic_fatalities_per_thousand",
                           unit.variable = "state_no",
                           time.variable = "year",
                           treatment.identifier = 2, 
                           controls.identifier = c(1, 3:4),
                           time.predictors.prior = 1994:2008,
                           time.optimize.ssr = 1994:2008,
                           y.plot = 1994:2015,
                           unit.names.variable = "state")) 
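
One possible cause, offered as a sketch rather than a confirmed diagnosis: in the call above, the parenthesis that should close `special.predictors = list(` does not appear until the last line, so `dependent`, `unit.variable`, and the remaining arguments become elements of that list instead of arguments to `dataprep`. Assuming `dataprep` treats an unsupplied `unit.variable` as `NULL`, that alone would trigger this error even though `state_no` is numeric. The base-R snippet below reproduces the effect with a hypothetical stand-in function, `toy_prep`, which is not part of Synth:

```r
# toy_prep is a hypothetical stand-in, NOT the real Synth::dataprep.
# It only mimics the assumption that unit.variable defaults to NULL.
toy_prep <- function(foo, special.predictors = NULL, unit.variable = NULL) {
  if (is.null(unit.variable) || mode(foo[, unit.variable]) != "numeric") {
    return("unit.variable not found as numeric variable in foo.")
  }
  "ok"
}

df <- data.frame(state_no = c(1, 2, 3, 4), x = c(20.64, 28.09, 20.92, 20.47))

# Misplaced parenthesis: unit.variable ends up INSIDE the list,
# so the function itself never receives it.
misplaced <- toy_prep(foo = df,
                      special.predictors = list(
                        list("x", 1994:2008, "mean"),
                        unit.variable = "state_no"))

# List closed after the last predictor: unit.variable is a real argument.
fixed <- toy_prep(foo = df,
                  special.predictors = list(
                    list("x", 1994:2008, "mean")),
                  unit.variable = "state_no")

print(misplaced)
print(fixed)
```

If this is the cause, closing the list right after the third `list(...)` entry should let `dataprep` see the remaining arguments. Two other details in the call are worth checking at the same time: `"lane_miles_per_thosuand"` is misspelled relative to the column name, and `y.plot` does not appear to be a `dataprep` argument (the documented name is `time.plot`). Separately, if the full data set is a tibble rather than a plain data frame, wrapping it in `as.data.frame()` before the call is a commonly suggested remedy for the same message.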
 
