I'm new to pandas and need to build a table with pandas that does exactly what the following code snippet does:
with open(r'D:/DataScience/ml-100k/u.item') as f:
    temp=''
    for line in f:
        fields = line.rstrip('\n').split('|')
        movieId = int(fields[0])
        name = fields[1]
        geners = fields[5:25]
        geners = map(int, geners)
My question is how to add a geners column in pandas that holds the same values as: geners = fields[5:25]
It's not clear to me what you intend to accomplish -- a single genres column containing fields 5-25 concatenated? Or separate genre columns for fields 5-25?
For the latter, you can use [pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html):
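import pandas as pd
# No header row, and fields 2-4 sit between the name and the genre flags, so select by position; encoding='latin-1' is an assumption about the file.
raw = pd.read_csv(r'D:/DataScience/ml-100k/u.item', delimiter='|', header=None, encoding='latin-1')
df = pd.concat([raw.iloc[:, :2], raw.iloc[:, 5:25]], axis=1)  # fields[0], fields[1], fields[5:25]
cols = ['movieId', 'name'] + ['genre_' + str(i) for i in df.columns[2:]]
df.columns = cols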
For the former, you could concatenate the genres into, say, a space-separated string, using:
df['genres'] = df[cols[2:]].astype(str).apply(' '.join, axis=1)  # cast to str first; join() rejects ints
df.drop(cols[2:], axis=1, inplace=True)  # drop the separate genre_N columns
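If what you want is a single column holding the same list of ints that the loop builds in `geners`, one possible sketch is below. The two sample rows and the `movies` name are made up for illustration. With the real file you would pass its path to `read_csv` instead of the in-memory buffer.

```python
import io
import pandas as pd

# Two made-up rows in the pipe-delimited u.item layout (illustrative data only).
sample = io.StringIO(
    "1|Movie A (1995)|01-Jan-1995||http://example.com/a|0|0|0|1|1\n"
    "2|Movie B (1995)|01-Jan-1995||http://example.com/b|0|1|1|0|0\n"
)

# The file has no header row, so columns are labelled 0, 1, 2, ...
raw = pd.read_csv(sample, sep="|", header=None)

# Mirror the loop: field 0 -> movieId, field 1 -> name, fields[5:25] -> list of ints.
# iloc slices are lenient, so 5:25 works even when a row has fewer than 25 fields.
movies = pd.DataFrame({
    "movieId": raw[0].astype(int),
    "name": raw[1],
    "geners": raw.iloc[:, 5:25].astype(int).values.tolist(),
})
print(movies)
```

Each cell of `geners` is then a plain Python list, which is convenient for row-wise work but not for vectorised filtering; the separate `genre_N` columns are usually easier for that.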