R converting continuous variable to categorical

Question

I have a column of continuous numeric values (NO2) which I need to convert into categorical values. Can someone explain how the following code accomplishes that:

cutpoints <- quantile(dataframe%NO2, seq(0,1,length=4),na.rm=TRUE)  
dataframe%newcol <- cut(dataframe%NO2, cutpoints)  
levels(dataframe%newcols) returns (0.3781,1.2] (1.2,1.42] (1.42,2.55]

Answer 1

I think you meant to use $ instead of % to refer to column names. Also note that the last line refers to newcols, while the column you created is newcol.
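With $ in place of %, the code from the question might look like the sketch below. The data frame and its NO2 values are made up for illustration, since the real data is not shown in the question.

```r
# Hypothetical data; the real NO2 column is not shown in the question.
dataframe <- data.frame(NO2 = c(0.38, 0.9, 1.2, 1.3, 1.42, 2.0, 2.55, NA))

# Minimum, 1/3 quantile, 2/3 quantile and maximum of NO2, ignoring NAs.
cutpoints <- quantile(dataframe$NO2, seq(0, 1, length.out = 4), na.rm = TRUE)

# Assign each NO2 value to the interval it falls into.
dataframe$newcol <- cut(dataframe$NO2, cutpoints)

# The interval labels that became the categories.
levels(dataframe$newcol)
```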

Running the code step by step will help you understand it.

seq creates a sequence from 0 to 1 with a length of 4.

seq(0,1,length=4)
#[1] 0.000 0.333 0.667 1.000

quantile returns the values of the vector at the given probabilities (here seq(0,1,length=4)), that is, the minimum, the 1/3 and 2/3 quantiles, and the maximum.

set.seed(123)
x <- runif(10)
cutpoints <- quantile(x, seq(0,1,length=4),na.rm=TRUE) 
#    0%  33.3%  66.7%   100% 
#0.0456 0.4566 0.7883 0.9405

These values are then used as the breaks to cut the data.

cut(x, cutpoints)

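This divides x into groups: values between cutpoints[1] and cutpoints[2] form one group, values between cutpoints[2] and cutpoints[3] another, and so on. Each interval is open on the left and closed on the right, which is why the levels print as (a,b].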
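One detail worth checking: because cut by default uses intervals that are open on the left, the smallest value of x equals the first cutpoint and is not assigned to any group. A minimal sketch, rebuilding the same x, shows the effect and one way around it with the include.lowest argument:

```r
set.seed(123)
x <- runif(10)
cutpoints <- quantile(x, seq(0, 1, length.out = 4), na.rm = TRUE)

# Default: intervals are (a, b], so min(x) falls outside the first interval.
groups <- cut(x, cutpoints)
sum(is.na(groups))   # number of values that got no group

# include.lowest = TRUE closes the first interval on the left: [a, b].
groups_all <- cut(x, cutpoints, include.lowest = TRUE)
table(groups_all, useNA = "ifany")
```

If every observation needs a category, include.lowest = TRUE is probably what you want when the breaks come from quantile.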

You can also use findInterval instead of cut.
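As a rough sketch of that alternative: findInterval returns the index of the interval each value falls into as a plain integer rather than a factor. Its intervals are closed on the left by default, [a, b), the opposite of cut; the rightmost.closed = TRUE argument used here keeps the maximum in the last group. The labels "low", "medium" and "high" are arbitrary names chosen for illustration.

```r
set.seed(123)
x <- runif(10)
cutpoints <- quantile(x, seq(0, 1, length.out = 4), na.rm = TRUE)

# Integer group index (1, 2 or 3) for each element of x.
idx <- findInterval(x, cutpoints, rightmost.closed = TRUE)
idx

# Optionally turn the indices into a labelled factor (labels are arbitrary).
groups <- factor(idx, levels = 1:3, labels = c("low", "medium", "high"))
table(groups)
```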
