I have a data frame with one column that looks like this
df = read.table(file="sprint.m.df.txt", sep="\t", quote="", header=TRUE)
X.Rank...Time...Wind...Name...Country...Birthdate...City...Date.
1 1 9.58 0.9 "Usain Bolt" "JAM" "21.08.86" "Berlin" "16.08.2009"
2 2 9.63 1.5 "Usain Bolt" "JAM" "21.08.86" "London" "05.08.2012"
3 3 9.69 0 "Usain Bolt" "JAM" "21.08.86" "Beijing" "16.08.2008"
4 3 9.69 2 "Tyson Gay" "USA" "09.08.82" "Shanghai" "20.09.2009"
5 3 9.69 -0.1 "Yohan Blake" "JAM" "26.12.89" "Lausanne" "23.08.2012"
6 6 9.71 0.9 "Tyson Gay" "USA" "09.08.82" "Berlin" "16.08.2009"
i have been trying to split the column into multiple columns using string splits and other methods but nothing is working.
How can i split the data frame so i can end up with a data frame as
X.rank | Time | wind | name | country | birthdate| city | date
1 9.58 0.9 Usian Bolt jam 21.08.86 Berlin 16.08.2009
You can create a tribble with the tibble package
library(tibble)
df <- tribble(
~X, ~RankTime, ~Wind, ~Name, ~Country, ~Birthdate, ~City, ~Date,
1, 9.58, 0.9, "Usain Bolt", "JAM", "21.08.86", "Berlin", "16.08.2009",
2, 9.63, 1.5, "Usain Bolt", "JAM", "21.08.86", "London", "05.08.2012",
3, 9.69, 0, "Usain Bolt", "JAM", "21.08.86", "Beijing", "16.08.2008",
3, 9.69, 2, "Tyson Gay", "USA", "09.08.82", "Shanghai", "20.09.2009",
3, 9.69, -0.1, "Yohan Blake", "JAM", "26.12.89", "Lausanne", "23.08.2012",
6, 9.71, 0.9, "Tyson Gay", "USA", "09.08.82", "Berlin", "16.08.2009")
df
# output
# A tibble: 6 x 8
X RankTime Wind Name Country Birthdate City Date
<dbl> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr>
1 1 9.58 0.9 Usain Bolt JAM 21.08.86 Berlin 16.08.2009
2 2 9.63 1.5 Usain Bolt JAM 21.08.86 London 05.08.2012
3 3 9.69 0 Usain Bolt JAM 21.08.86 Beijing 16.08.2008
4 3 9.69 2 Tyson Gay USA 09.08.82 Shanghai 20.09.2009
5 3 9.69 -0.1 Yohan Blake JAM 26.12.89 Lausanne 23.08.2012
6 6 9.71 0.9 Tyson Gay USA 09.08.82 Berlin 16.08.2009
The spaces within quotes makes the column difficult to parse but it's easy to read. See my comments above and use read.table(file="sprint.m.df.txt", sep=" ")
or if you really have to work with your df
, then try read_delim
or scan
.
df8 <- readr::read_delim(df[,1], delim=" ", col_names =FALSE)
# OR
df8 <- data.frame(matrix(scan(text=df[,1], what=" "), ncol=8, byrow=TRUE))
colnames(df8) <- c("rank", "Time", "wind", "name", "country", "birthdate", "city", "date")
df8
rank Time wind name country birthdate city date
1 1 9.58 0.9 Usain Bolt JAM 21.08.86 Berlin 16.08.2009
2 2 9.63 1.5 Usain Bolt JAM 21.08.86 London 05.08.2012
3 3 9.69 0 Usain Bolt JAM 21.08.86 Beijing 16.08.2008
4 3 9.69 2 Tyson Gay USA 09.08.82 Shanghai 20.09.2009
5 3 9.69 -0.1 Yohan Blake JAM 26.12.89 Lausanne 23.08.2012
6 6 9.71 0.9 Tyson Gay USA 09.08.82 Berlin 16.08.2009
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.