R to Python subsetting via vector

I'm a Python newbie but have some R experience. In R, if I'd like to subset a data.frame, I can use a variable to do something like this:

# Columns

# Assign column names to variable
colsToUse <- c('col1','col2','col3')

# Use variable to subset
df2 <- df1[,colsToUse]

# Rows

# Assign row indices to variable
rowsToUse <- sample(1:nrow(df1), 500)

# Use variable to subset
df3 <- df1[rowsToUse,]

How would I do this in Python?

Based on your stated use of pandas:

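import numpy as np

colsToUse = ['col1', 'col2', 'col3']
# 500 distinct row positions; replace=False matches R's sample()
rowsToUse = np.random.choice(len(df1), 500, replace=False)

df2 = df1.loc[:, colsToUse]    # select columns by label
df3 = df1.iloc[rowsToUse, :]   # select rows by integer position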
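
As an alternative to building the row indices by hand, pandas can also draw the random rows itself with DataFrame.sample. The sketch below assumes a made-up df1 with 1000 rows and the three column names from the question; random_state is set only so the result is repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df1: 1000 rows, three named columns.
df1 = pd.DataFrame(np.arange(3000).reshape(1000, 3),
                   columns=["col1", "col2", "col3"])

# Draw 500 distinct rows (sampling without replacement is the default),
# similar to df1[sample(1:nrow(df1), 500), ] in R.
df3 = df1.sample(n=500, random_state=0)

# Row sampling and column selection can be chained.
df4 = df1.sample(n=500, random_state=0)[["col1", "col3"]]
```

The sampled frame keeps the original index labels, so the rows can still be traced back to df1.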

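The DataFrame indexers differ in what they match on: df1.loc selects by label, df1.iloc selects by integer position, and df1.xs returns a cross-section. The older df1.ix indexer was deprecated in pandas 0.20 and removed in pandas 1.0.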
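
To show how these indexers differ, here is a minimal sketch on a small made-up frame whose index labels (10, 20, 30) are deliberately different from the row positions (0, 1, 2). The frame and variable names are illustrative only.

```python
import pandas as pd

# Hypothetical frame whose index labels differ from row positions.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]},
                  index=[10, 20, 30])

by_label = df.loc[[10, 30], ["col1"]]   # rows labelled 10 and 30
by_position = df.iloc[[0, 2], [0]]      # first and third row, first column
cross_section = df.xs(20)               # row labelled 20, returned as a Series
```

Here by_label and by_position pick out the same cells, one by name and the other by position.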

It's also helpful to look at the guide "NumPy for MATLAB Users", which often answers questions for R users too, at least when you are dealing only with a numpy.ndarray.
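
For the plain numpy.ndarray case, the same kind of subsetting might look like the sketch below. The array and the index lists are invented for illustration; note that positions are 0-based, unlike in R.

```python
import numpy as np

# Hypothetical 4x3 array standing in for a numeric matrix in R.
a = np.arange(12).reshape(4, 3)

rows_to_use = [0, 2]
cols_to_use = [0, 1]

sub_rows = a[rows_to_use, :]   # like a[rowsToUse, ] in R
sub_cols = a[:, cols_to_use]   # like a[, colsToUse] in R

# Passing both lists at once pairs them elementwise: a[0, 0] and a[2, 1].
paired = a[rows_to_use, cols_to_use]

# np.ix_ gives the R-style rectangular block instead.
block = a[np.ix_(rows_to_use, cols_to_use)]
```

The last two lines are the main surprise for R users: indexing with two lists does not return a sub-matrix unless the lists are wrapped in np.ix_.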

