Change continuous variable into categorical
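I have a year variable (YearBuilt) that I want to turn into a categorical variable with 3 levels. I am using the levels function here, and it is really painful.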
traintest$YearBuilt <- as.factor(traintest$YearBuilt)
levels(traintest$YearBuilt)[levels(traintest$YearBuilt)%in%c(1872,1875,1879,1880,1882,
1885,1890,1892,1893,1895,
1896,1898,1900,1901,1902,
1904,1905,1906,1907,1908,
1910,1911,1912,1913,1914,
1915,1916,1917,1918,1919,
1920,1921,1922,1923,1924,
1925,1926,1927,1928,1929,
1930,1931,1932,1934,1935,
1936,1937,1938,1939,1940,
1941,1942,1945,1946,1947,
1948,1949)] <- "Before1950"
levels(traintest$YearBuilt)[levels(traintest$YearBuilt)%in%c(1950,1951,1952,1953,1954,
1955,1956,1957,1958,1959,
1960,1961,1962,1963,1964,
1965,1966,1967,1968,1969,
1970,1971,1972,1973,1974,
1975,1976,1977,1978,1979,
1980,1981,1982,1983,1984,
1985,1986,1987,1988,1989,
1990,1991,1992,1993,1994,
1995,1996,1997,1998,1999)] <- "Between1950-2000"
levels(traintest$YearBuilt)[levels(traintest$YearBuilt)%in%c(2000,2001,2002,2003,2004,
2005,2006,2007,2008,2009,
2010)] <- "After2000"
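I tried the cut function, but it did not work well for me: it put essentially all of the values into the first category, and the other two categories ended up with zero counts.
Is there an easier way to do this?
One option is to create logical vectors: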
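v1 <- as.numeric(levels(traintest$YearBuilt))
i1 <- v1 < 1950
i2 <- !i1 & v1 < 2000
i3 <- v1 >= 2000
# Build all labels first and assign once: duplicate labels merge levels,
# so i2 and i3 would no longer line up after a first partial assignment
new_levels <- character(length(v1))
new_levels[i1] <- "Before1950"
new_levels[i2] <- "Between1950-2000"
new_levels[i3] <- "After2000"
levels(traintest$YearBuilt) <- new_levels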
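The same grouping can also be written as a single assignment using the named-list form of `levels<-`, where each list name is a new level and each element holds the old levels to merge into it. This is only a sketch on made-up data: the factor `f` and its years are hypothetical stand-ins for `traintest$YearBuilt`.

```r
# Made-up years standing in for traintest$YearBuilt (hypothetical data)
f <- factor(c(1872, 1949, 1950, 1999, 2000, 2010))

lv <- levels(f)        # old levels, as character strings
yr <- as.numeric(lv)   # the same levels as numbers, for comparison

# Each list name becomes a new level; its element lists the old levels it absorbs
levels(f) <- list("Before1950"       = lv[yr < 1950],
                  "Between1950-2000" = lv[yr >= 1950 & yr < 2000],
                  "After2000"        = lv[yr >= 2000])

table(f)
```

Because everything happens in one assignment, there is no intermediate state in which the factor has fewer levels than the index vectors expect.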
Or use cut:
levels(traintest$YearBuilt) <- cut(v1, breaks = c(-Inf, 1949, 1999, Inf),
    labels = c("Before1950", "Between1950-2000", "After2000"))
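If the column is still numeric (that is, before any `as.factor` call), `cut` can be applied to it directly. The sketch below uses a made-up data frame `houses` and a hypothetical new column `Era`. It also shows one possible cause of the symptom described in the question, though this is only a guess: calling `as.numeric()` on a factor returns the internal codes 1, 2, 3, ..., which would all fall into the first bin.

```r
# Toy data; `houses` and its values are made up for illustration
houses <- data.frame(YearBuilt = c(1872, 1949, 1950, 1975, 1999, 2000, 2010))

# right = FALSE makes each interval closed on the left, e.g. [1950, 2000)
houses$Era <- cut(houses$YearBuilt,
                  breaks = c(-Inf, 1950, 2000, Inf),
                  labels = c("Before1950", "Between1950-2000", "After2000"),
                  right = FALSE)

table(houses$Era)

# If the column is already a factor, go through character first;
# as.numeric() on the factor itself would return the level codes, not the years
f <- factor(houses$YearBuilt)
years <- as.numeric(as.character(f))
```

With `right = FALSE` the break points can be the round years 1950 and 2000 themselves, rather than 1949 and 1999.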