Removing NA's from the table using R

Question

I have a table that looks like this (note that each ID row is wrapped across three lines because there isn't enough space):

  ID    INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN         NA            NA    NA    NA
KORGUS TAGAVARA OSAKAAL RINDE_KOOD PUULIIGI_KOOD VANUS AASTA KORGUS TAGAVARA
    15       17      30          1            KS    35  1967     11       39
    20       76      40          1            LV    45  1957     18      115
    NA       NA      NA         NA            NA    NA    NA     NA       NA
OSAKAAL RINDE_KOOD PUULIIGI_KOOD VANUS AASTA KORGUS TAGAVARA OSAKAAL
     70         NA            NA    NA    NA     NA       NA      NA
     60         NA            NA    NA    NA     NA       NA      NA
     NA          J            KU    25  1977      3        0     100

And I want it to be like this:

ID      INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA 
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN          J            KU    25  1977
KORGUS TAGAVARA OSAKAAL RINDE_KOOD PUULIIGI_KOOD VANUS AASTA KORGUS TAGAVARA
   15       17      30          1            KS    35  1967     11       39
   20       76      40          1            LV    45  1957     18      115
    3        0     100         
OSAKAAL RINDE_KOOD PUULIIGI_KOOD VANUS AASTA KORGUS TAGAVARA OSAKAAL
    70         
    60

So the NAs are gone and some rows (e.g. ID=8300249) are shorter than others.

Answer 1

1) If you try to mix character strings (including empty strings) with numbers, the entire column becomes character or factor, which makes the result useless for further computation. If you only need this for printing, though, that is fine, and it can be done like this:

m <- as.matrix(DF)
as.data.frame(replace(m, is.na(m), ""))

giving:

       ID INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
1 7900249 2002.12.01             MD          1            KS    60  1942
2 8200249 2002.12.01             AN          1            KS    50  1952
3 8300249 2002.12.01             AN
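
To make the caveat in (1) concrete, here is a small sketch that compares a column before and after the replacement. It rebuilds the same sample DF given in the Note at the end of this answer; the name DF_print is only an illustrative choice, not part of the original code.

```r
# Rebuild the sample input (same data as the DF in the Note).
Lines <- " ID    INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN         NA            NA    NA    NA"
DF <- read.table(text = Lines, header = TRUE)

# Blank out the NAs by going through a character matrix, as in (1).
m <- as.matrix(DF)
DF_print <- as.data.frame(replace(m, is.na(m), ""))  # DF_print: hypothetical name

# Column classes before and after the replacement.
print(sapply(DF, class))
print(sapply(DF_print, class))

# VANUS is numeric in the original but not in the blanked copy.
print(is.numeric(DF$VANUS))
print(is.numeric(DF_print$VANUS))
```

Because every column in DF_print is text (or factor in older R versions), it is best kept as a separate object for display while DF is used for any calculation.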

2) Another approach, if you really want shorter rows, is to abandon the rectangular representation and use a list of rows instead, like this:

lapply(split(DF, seq_len(nrow(DF))), function(x) x[, !is.na(x)])

giving:

$`1`
       ID INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
1 7900249 2002.12.01             MD          1            KS    60  1942

$`2`
       ID INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
2 8200249 2002.12.01             AN          1            KS    50  1952

$`3`
       ID INVENT_KPV KASVUKOHA_KOOD
3 8300249 2002.12.01             AN
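
A slightly more defensive variant of (2), offered as a sketch rather than the only way to do it: it turns the one-row result of is.na() into a plain logical vector and passes drop = FALSE, so each list element stays a data frame even if only one column survives. The names rows and keep are illustrative assumptions.

```r
# Rebuild the sample input (same data as the DF in the Note).
Lines <- " ID    INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN         NA            NA    NA    NA"
DF <- read.table(text = Lines, header = TRUE)

# One list element per row, keeping only the non-NA columns of that row.
rows <- lapply(split(DF, seq_len(nrow(DF))), function(x) {
  keep <- !as.vector(is.na(x))   # logical vector, one entry per column
  x[, keep, drop = FALSE]        # drop = FALSE: always return a data frame
})

# Number of columns kept for each row.
print(sapply(rows, ncol))
```

Each element can then be printed or processed on its own, for example rows[["3"]] for the row with ID 8300249.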

Note: The input DF in reproducible form is:

Lines <- " ID    INVENT_KPV KASVUKOHA_KOOD RINDE_KOOD PUULIIGI_KOOD VANUS AASTA
7900249 2002.12.01             MD          1            KS    60  1942
8200249 2002.12.01             AN          1            KS    50  1952
8300249 2002.12.01             AN         NA            NA    NA    NA"
DF <- read.table(text = Lines, header = TRUE)
