
Removing NA's from the table using R

I have a table that looks like this (note that each ID row is wrapped across three blocks of columns because there isn't enough space to print it on one line):

  ID    INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN         NA            NA    NA    NA
KORGUS TAGAVARA OSAKAAL RINDE_KOOD PUULIIGI_KOOD VANUS AASTA KORGUS TAGAVARA
    15       17      30          1            KS    35  1967     11       39
    20       76      40          1            LV    45  1957     18      115
    NA       NA      NA         NA            NA    NA    NA     NA       NA
OSAKAAL RINDE_KOOD PUULIIGI_KOOD VANUS AASTA KORGUS TAGAVARA OSAKAAL
     70         NA            NA    NA    NA     NA       NA      NA
     60         NA            NA    NA    NA     NA       NA      NA
     NA          J            KU    25  1977      3        0     100

And I want it to be like this:

ID      INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA 
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN          J            KU    25  1977
KORGUS TAGAVARA OSAKAAL RINDE_KOOD PUULIIGI_KOOD VANUS AASTA KORGUS TAGAVARA
   15       17      30          1            KS    35  1967     11       39
   20       76      40          1            LV    45  1957     18      115
    3        0     100         
OSAKAAL RINDE_KOOD PUULIIGI_KOOD VANUS AASTA KORGUS TAGAVARA OSAKAAL
    70         
    60         

So the NAs are gone and some rows (e.g. ID=8300249) are shorter than others.

1) If you mix character strings, including empty strings, with numbers in one column, the entire column becomes character or factor, which makes the result useless for further computation. If you only need this for printing, though, that is fine, and it can be done like this:

m <- as.matrix(DF)
as.data.frame(replace(m, is.na(m), ""))

giving:

       ID INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
1 7900249 2002.12.01             MD          1            KS    60  1942
2 8200249 2002.12.01             AN          1            KS    50  1952
3 8300249 2002.12.01             AN          
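
To make the caveat in (1) concrete, here is a minimal sketch that compares the column classes before and after blanking. It assumes DF is the input given in the Note at the end of this answer; the name `blanked` is only illustrative.

```r
# Reproducible input, copied from the Note at the end of this answer
Lines <- " ID    INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN         NA            NA    NA    NA"
DF <- read.table(text = Lines, header = TRUE)

m <- as.matrix(DF)                                  # character matrix, NAs kept as NA
blanked <- as.data.frame(replace(m, is.na(m), ""))  # `blanked` is an illustrative name

sapply(DF, class)       # ID, RINDE_KOOD, VANUS and AASTA are integer here
sapply(blanked, class)  # every column is character (factor before R 4.0)

mean(DF$VANUS, na.rm = TRUE)  # 55: arithmetic still works on the original
is.numeric(blanked$VANUS)     # FALSE: the blanked copy is for display only
```

So it is probably safest to keep DF itself untouched and build the blanked copy only at the point where the table is printed.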

2) Another approach, if you really want shorter rows, is to abandon the rectangular representation and use a list of rows instead, like this:

lapply(split(DF, seq_len(nrow(DF))), function(x) x[, !is.na(x), drop = FALSE])

giving:

$`1`
       ID INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
1 7900249 2002.12.01             MD          1            KS    60  1942

$`2`
       ID INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
2 8200249 2002.12.01             AN          1            KS    50  1952

$`3`
       ID INVENT_KPV KASVUKOHA_KOOD
3 8300249 2002.12.01             AN
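
If the list is only needed for printing, each ragged row can be collapsed to a single line of text. The following is a rough sketch, again assuming DF is the input from the Note at the end of this answer. The names `rows`, `n_kept` and `txt` are illustrative, and the columns are selected here with a per-column logical vector, which is one possible way to write the same idea.

```r
Lines <- " ID    INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN         NA            NA    NA    NA"
DF <- read.table(text = Lines, header = TRUE)

# One list element per row, keeping only the columns that are not NA
rows <- lapply(split(DF, seq_len(nrow(DF))), function(x) {
  keep <- !vapply(x, is.na, logical(1))  # one TRUE/FALSE per column
  x[keep]                                # list-style indexing returns a data frame
})

# Number of values that survive in each row
n_kept <- vapply(rows, ncol, integer(1))

# Collapse every row to one space-separated line for display
txt <- vapply(rows, function(x) {
  paste(vapply(x, as.character, character(1)), collapse = " ")
}, character(1))

cat(txt, sep = "\n")
```

With this input the third line printed is just `8300249 2002.12.01 AN`, while the first two rows keep all seven values.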

Note: The input DF in reproducible form is:

Lines <- " ID    INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN         NA            NA    NA    NA"
DF <- read.table(text = Lines, header = TRUE)
