Manipulate a data set using R

I have a data set with, for example, the following column entries:

Data: 05.12.2017   |   Acceleration: 0,0042414... 

Data: 05.12.2017   |   Acceleration: 0,004235235... 

Data: 05.12.2017   |   Acceleration: 0,04235235...

Data: 05.12.2017   |   Acceleration: 0,0023414... 

I want to reshape the data so that the name before the ":" becomes the name of the column.

In other words I want this:

Data         |  Acceleration         

05.12.2017   |  0,0042414... 

05.12.2017   |  0,004235235... 

05.12.2017   |  0,04235235...

05.12.2017   |  0,0023414...

Is there a way to do that?

You can set the new column names of your dataset by hand and then strip those prefixes from the entries. This is not a general approach, though, because the names are hard-coded.

library(stringr)

# set the column names by hand
names(your_data_set) <- c("Data", "Acceleration")

# strip the "name: " prefix from every entry
your_data_set$Data <- str_replace_all(your_data_set$Data, "Data: ", "")
your_data_set$Acceleration <- str_replace_all(your_data_set$Acceleration, "Acceleration: ", "")

A solution using some reshaping from the tidyr package:

# example dataset
df <- data.frame(x = c("Data: 05.12.2017", "Data: 05.12.2017"),
                 y = c("Acceleration: 0.0042414", "Acceleration: 0.0042243"),
                 stringsAsFactors = FALSE)

df

#                  x                       y
# 1 Data: 05.12.2017 Acceleration: 0.0042414
# 2 Data: 05.12.2017 Acceleration: 0.0042243


library(dplyr)
library(tidyr)

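df %>%
  gather() %>%                                       # stack all columns into key/value pairs
  select(value) %>%                                  # keep only the "name: value" strings
  separate(value, c("v1", "v2"), sep = ":\\s*") %>%  # split into name (v1) and value (v2)
  group_by(v1) %>%
  mutate(row_num = row_number()) %>%                 # row id within each name, so the values line up
  spread(v1, v2) %>%                                 # one column per name
  select(-row_num)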

# # A tibble: 2 x 2
#   Acceleration        Data
# *        <chr>       <chr>
# 1    0.0042414  05.12.2017
# 2    0.0042243  05.12.2017

Hope this helps!
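
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same idea could be sketched with the newer functions roughly as follows. This is only an illustration on the example dataset above; the helper column names row_num, old_col, v1 and v2 are arbitrary choices, not anything the data requires.

```r
library(dplyr)
library(tidyr)

# same example dataset as above
df <- data.frame(x = c("Data: 05.12.2017", "Data: 05.12.2017"),
                 y = c("Acceleration: 0.0042414", "Acceleration: 0.0042243"),
                 stringsAsFactors = FALSE)

result <- df %>%
  mutate(row_num = row_number()) %>%                 # remember which row each entry came from
  pivot_longer(-row_num,
               names_to = "old_col",
               values_to = "value") %>%              # one "name: value" string per row
  select(-old_col) %>%                               # the old column names (x, y) are not needed
  separate(value, c("v1", "v2"), sep = ":\\s*") %>%  # split into name (v1) and value (v2)
  pivot_wider(names_from = v1, values_from = v2) %>% # one column per name
  select(-row_num)

result
```

Both columns are still character at this point, so they need to be converted before any date or numeric work.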

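# column headers: the text before the ":" in the first row
names(df) <- sapply(df[1, ], function(x) trimws(sub(":.*", "", x)))
# column values: the text after the ":" (df[] keeps df a data frame; sapply() alone would return a matrix)
df[] <- lapply(df, function(x) trimws(sub(".*:", "", x)))
# now you can easily format the columns as date and numeric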


> #sample data
> dput(df)
structure(list(V1 = structure(c(2L, 1L, 1L, 1L), .Label = c("                 Data: 05.12.2017", 
"Data: 05.12.2017"), class = "factor"), V2 = structure(c(3L, 
2L, 4L, 1L), .Label = c(" Acceleration: 0,0023414", " Acceleration: 0,004235235", 
" Acceleration: 0,0042414", " Acceleration: 0,04235235"), class = "factor")), .Names = c("V1", 
"V2"), class = "data.frame", row.names = c(NA, -4L))
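
The last step, formatting the cleaned columns as date and numeric, might look like the sketch below. It starts from the already cleaned values of the sample data and assumes the dates are written as day.month.year; if your data uses another order, the format string has to change accordingly.

```r
# cleaned values, as they look once the "name: " prefixes are stripped
# (taken from the sample data above)
df <- data.frame(Data = c("05.12.2017", "05.12.2017", "05.12.2017", "05.12.2017"),
                 Acceleration = c("0,0042414", "0,004235235", "0,04235235", "0,0023414"),
                 stringsAsFactors = FALSE)

# assumption: dates are day.month.year
df$Data <- as.Date(df$Data, format = "%d.%m.%Y")

# replace the decimal comma with a dot before converting to numeric
df$Acceleration <- as.numeric(sub(",", ".", df$Acceleration, fixed = TRUE))

str(df)
```

Without the comma replacement, as.numeric() would return NA for every value, because R expects a dot as the decimal separator.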
