Sparse partial least squares regression

I have two data-sets as follows:

     http://www.filedropper.com/dataa_1 ## DataA
     http://www.filedropper.com/datab   ## DataB

DataA has 42 rows and 8 columns, and DataB has 42 rows and 6 columns. We want to run CCA and sPLS on both data sets in R. My question is about DataB: each group of eleven rows has the same values. Will this affect the results or cause a discrepancy in either the CCA or the sPLS?

After looking at block B, it looks like the variables are discrete.

Using such variables in PLS or CCA is not a (technical) problem, but it poses statistical "challenges": the bootstrap or the jackknife may be required to go further into the statistical interpretation of the results.
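To illustrate the resampling idea, here is a minimal sketch of a percentile bootstrap, written in plain Python rather than R so that it has no package dependencies. It does not run PLS or CCA; it bootstraps a simple Pearson correlation between one discrete variable and one continuous variable, resampling whole rows. The data are synthetic (42 rows, a discrete variable that repeats in runs of eleven), and the names `pearson` and `bootstrap_ci` are made up for this example.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation, or None if either variable is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5


def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the correlation, resampling rows."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        # Draw n row indices with replacement; keep each (x, y) pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        # A resample of a discrete variable can be constant; skip those draws.
        if r is not None:
            stats.append(r)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper


# Synthetic data (an assumption, not the asker's files): 42 rows,
# with a discrete variable taking the same value over runs of eleven rows.
data_rng = random.Random(1)
x = [level for level in (1.0, 2.0, 3.0, 4.0) for _ in range(11)][:42]
y = [0.5 * xi + data_rng.gauss(0, 1) for xi in x]

lower, upper = bootstrap_ci(x, y)
print("observed r:", pearson(x, y))
print("95% percentile interval:", (lower, upper))
```

The same pattern would apply to a PLS or CCA statistic: refit the model on each resample of rows and look at the spread of the quantity of interest. Whether plain row resampling is appropriate when rows come in repeated groups depends on how the data were collected, so treat this as a starting point only.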

You should also ask yourself whether this "discrete" representation is accurate for your data. It may be wrong if the original variables are categorical, in which case you should use dummy variables.
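As a rough illustration of dummy coding, the sketch below turns one categorical column into 0/1 indicator columns. It is plain Python with made-up values, and `dummy_encode` is a hypothetical helper written for this example. In R, `model.matrix` serves a similar purpose for factors.

```python
def dummy_encode(values, drop_first=False):
    """Return (levels, rows): one 0/1 indicator column per kept level."""
    levels = sorted(set(values))
    # Dropping the first level avoids perfectly collinear columns.
    kept = levels[1:] if drop_first else levels
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


# Made-up categorical codes, not taken from the asker's data.
codes = [10, 20, 10, 30]

levels, rows = dummy_encode(codes)
print(levels)
print(rows)

levels_ref, rows_ref = dummy_encode(codes, drop_first=True)
print(levels_ref)
print(rows_ref)
```

With `drop_first=True` the first level becomes the reference category, coded as all zeros, which is the usual choice when the indicator columns enter a regression-type model.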
