I am new to R and to programming in general, and I have two pieces of code that have run into a similar problem. Here we go...
(A)
I currently have a function that returns record(s) of a patient, trial number, and other information. It looks like this:
    ID  trial start    finish   mark  mean    number
903 A34 19    90910    18775077 8236  -0.0197 1.972876
904 A34 19    18782377 23089165 2343  0.0374  2.052525
905 A34 19    23093018 43203507 10267 -0.0162 1.977668
906 A34 19    43203990 43447468 93    0.2138  2.319478
907 A34 19    43447802 43663369 112   -0.0355 1.951387
908 A34 19    43663624 43834506 80    -0.5385 1.376973
909 A34 19    43834848 59097854 8655  -0.0095 1.986873
Below is the code I have written for it.
getRS <- function(CNA, samples = NULL, trial = NULL) {
  race <- racing.summary(subset(CNA, samplelist = samples, triallist = trial))
  race$number <- (2^race$mean) * 2
  return(race)
}
I am wondering if it is possible to use this output in a new function to do some simple arithmetic. I am looking to subtract 'start' from 'finish' for each row and keep the largest result as 'max.length', average all the means above into a new 'mean', and extract the largest 'number' as 'max.number', whilst not displaying 'mark' at all.
An output similar to this:
ID trial max.length mean max.number
A34 19 20110489 -0.05260000 2.3194777
AND/OR
(B)
I have an alternative function that creates a data frame of ALL the patients with the already calculated data. I used this code:
getSum <- function() {
  # Uses dplyr (%>%, group_by, summarise) and a data frame df from the calling environment
  race_mean <- as.data.frame(df %>% group_by(ID, trial) %>% summarise(mean = mean(mean)))
  race_length <- as.data.frame(df %>% group_by(ID, trial) %>% summarise(max.length = max(end - start)))
  # Named race_number so that it matches the merge below
  race_number <- as.data.frame(df %>% group_by(ID, trial) %>% summarise(max.number = max(number)))
  race_m_l_merge <- as.data.frame(merge(x = race_length, y = race_mean))
  race_m_l_n_merge <- as.data.frame(merge(x = race_m_l_merge, y = race_number))
  ordered_summary <- as.data.frame(race_m_l_n_merge[order(race_m_l_n_merge$trial), ])
  View(ordered_summary)
}
Which gives an output like this:
ID trial max.length mean max.number
1 A22 1 96637812 -1.648909e-01 2.6989533
25 A23 1 101363101 -6.275455e-02 2.2468441
49 A24 1 72598875 -5.878000e-02 2.8204004
73 A25 1 112628591 -3.346917e-01 2.0675182
97 A26 1 55490417 7.621429e-02 2.4512200
121 A28 1 130879821 -4.218571e-02 2.0679481
145 A29 1 72590096 -3.093417e-01 2.3450196
169 A30 1 32642030 4.242500e-02 2.6375528
193 A32 1 34350731 -8.188372e-02 2.1149155
217 A33 1 77537981 -1.305833e-01 2.1125713
With this, I would like to create a function that lets me specify which ID and which trial to look up, like so: Function("A22", 1).
I'm hoping that my R script will be general enough to reuse in future work, so any help would be much appreciated, whether on question A, B or perhaps both! Or even suggestions for links to helpful websites. :)
If you have already defined your functions getRS and getSum, then you can call them inside a new function. You just have to change the line in getSum that contains View(ordered_summary) to return(ordered_summary), or simply ordered_summary, so that it returns an object you can manipulate further. getSum also needs to take the data frame as an argument, i.e. function(df), because the code below calls getSum(df = data_df).
lookup_function <- function(data_lookup, id_lookup, trial_lookup) {
  data_df <- getRS(CNA = data_lookup)
  summary_df <- getSum(df = data_df)
  # Keep only the rows for the requested patient and trial
  subset(x = summary_df, subset = (ID == id_lookup & trial == trial_lookup))
}
You can write this function more concisely, if you feel inclined to do so.
lookup_function <- function(data_lookup, id_lookup, trial_lookup) {
  subset(x = getSum(getRS(data_lookup)),
         subset = (ID == id_lookup & trial == trial_lookup))
}
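The filtering step on its own can be tried without getRS or getSum. The sketch below is only an illustration: it rebuilds the first three rows of the example summary from the question by hand, and lookup_summary is a hypothetical helper name, not part of the original code. It uses bracket indexing instead of subset(), which selects the same rows here and avoids relying on subset()'s non-standard evaluation inside a function.

```r
# First three rows of the question's example summary table, typed in by hand
ordered_summary <- data.frame(
  ID         = c("A22", "A23", "A24"),
  trial      = c(1, 1, 1),
  max.length = c(96637812, 101363101, 72598875),
  mean       = c(-0.1648909, -0.06275455, -0.05878),
  max.number = c(2.6989533, 2.2468441, 2.8204004)
)

# Hypothetical helper: keep only the rows matching one ID and one trial
lookup_summary <- function(summary_df, id_lookup, trial_lookup) {
  keep <- summary_df$ID == id_lookup & summary_df$trial == trial_lookup
  summary_df[keep, , drop = FALSE]
}

hit  <- lookup_summary(ordered_summary, "A22", 1)   # one matching row
miss <- lookup_summary(ordered_summary, "Z99", 1)   # no match: zero rows
print(hit)
```

Naming the arguments id_lookup and trial_lookup, rather than ID and trial, keeps them from being confused with the columns of the same name, which matters most when subset() is used.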
Or, if you don't want to have three different functions, you can create a function that has getRS and getSum defined inside itself.
lookup_function <- function(data_lookup, id_lookup, trial_lookup) {
  # Requires dplyr for %>%, group_by() and summarise(): library(dplyr)
  getRS <- function(CNA, samples = NULL, trial = NULL) {
    race <- racing.summary(subset(CNA, samplelist = samples, triallist = trial))
    # Derive 'number' from the segment mean
    race$number <- (2 ^ race$mean) * 2
    race
  }
  getSum <- function(df) {
    unordered_summary <-
      df %>%
      group_by(ID, trial) %>%
      summarise(mean = mean(mean),
                # The getRS() output shown in the question names this column 'finish';
                # use 'end' instead if that is what your data calls it
                max.length = max(finish - start),
                max.number = max(number)) %>%
      data.frame()
    # Sort by trial before returning
    ordered_summary <-
      data.frame(unordered_summary[order(unordered_summary$trial), ])
    ordered_summary
  }
  data_df <- getRS(CNA = data_lookup)
  summary_df <- getSum(df = data_df)
  # Keep only the rows for the requested patient and trial
  subset(x = summary_df, subset = (ID == id_lookup & trial == trial_lookup))
}
I have edited the code for getSum, as I didn't see a reason to call summarise three times instead of once. You can use your own function, of course, as I don't know the particulars of your task at hand.
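To check the arithmetic by hand, here is a rough sketch that applies the same three summaries to the seven sample rows from part (A) of the question. It is an assumption-laden illustration rather than the code above: the data frame is typed in manually, summarise_race is a hypothetical name, and it uses base R (split, lapply, rbind) in place of dplyr so that it runs without any packages or the racing.summary function.

```r
# Sample rows copied from the getRS() output shown in the question
race <- data.frame(
  ID     = "A34",
  trial  = 19,
  start  = c(90910, 18782377, 23093018, 43203990, 43447802, 43663624, 43834848),
  finish = c(18775077, 23089165, 43203507, 43447468, 43663369, 43834506, 59097854),
  mark   = c(8236, 2343, 10267, 93, 112, 80, 8655),
  mean   = c(-0.0197, 0.0374, -0.0162, 0.2138, -0.0355, -0.5385, -0.0095),
  number = c(1.972876, 2.052525, 1.977668, 2.319478, 1.951387, 1.376973, 1.986873)
)

# Hypothetical base R stand-in for getSum(): one summary row per ID/trial pair
summarise_race <- function(df) {
  pieces <- split(df, list(df$ID, df$trial), drop = TRUE)
  rows <- lapply(pieces, function(p) {
    data.frame(
      ID         = p$ID[1],
      trial      = p$trial[1],
      max.length = max(p$finish - p$start),  # longest segment
      mean       = mean(p$mean),             # average of the segment means
      max.number = max(p$number)             # largest 'number'; 'mark' is left out
    )
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out[order(out$trial), ]
}

result <- summarise_race(race)
print(result)
```

For these rows the result is a single line for A34, trial 19, with max.length 20110489, mean -0.0526 and max.number 2.319478, which matches the output the question asks for.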