multiple columns from a file into a single column of lists in pandas

Question

I'm new to pandas and need to build a table with pandas that reproduces exactly what the following code snippet does:

with open(r'D:/DataScience/ml-100k/u.item') as f:
    temp=''
    for line in f:
        fields = line.rstrip('\n').split('|')
        movieId = int(fields[0])
        name = fields[1]
        geners = fields[5:25]
        geners = map(int, geners)

My question: how do I add a genres column in pandas that holds the same values as geners = fields[5:25] above?

Answer 1

It's not clear to me what you intend to accomplish -- a single genres column containing fields 5-25 concatenated? Or separate genre columns for fields 5-25?

For the latter, you can use [pandas.read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html):

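import pandas as pd

# u.item has 24 pipe-separated fields and no header row; keep fields 0 and 1
# plus the genre flags in fields 5-23, skipping the date and URL fields 2-4
cols = ['movieId', 'name'] + ['genre_' + str(i) for i in range(5, 24)]
df = pd.read_csv(r'D:/DataScience/ml-100k/u.item', delimiter='|', header=None,
                 names=cols, usecols=[0, 1] + list(range(5, 24)),
                 encoding='latin-1')  # the file is not UTF-8 encoded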
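To try the call without the real file, here is a minimal sketch on an in-memory sample. The two rows and the make_row helper are made up for illustration; they only assume the u.item layout of id, title, three metadata fields and 19 genre flags.

```python
import io
import pandas as pd

def make_row(movie_id, title, flags):
    # Hypothetical helper: build one pipe-separated line shaped like u.item
    meta = ["01-Jan-1995", "", "http://example.com"]
    return "|".join([str(movie_id), title] + meta + [str(f) for f in flags])

# Two made-up movies, each with 19 genre flags
sample = "\n".join([
    make_row(1, "Movie A (1995)", [0, 1, 1] + [0] * 16),
    make_row(2, "Movie B (1995)", [0, 1, 0] + [0] * 15 + [1]),
]) + "\n"

genre_idx = list(range(5, 24))  # positions of the genre flags
cols = ['movieId', 'name'] + ['genre_' + str(i) for i in genre_idx]

# names lines up with usecols, so fields 2-4 are skipped entirely
df = pd.read_csv(io.StringIO(sample), sep='|', header=None,
                 names=cols, usecols=[0, 1] + genre_idx)
print(df.shape)
```

Each genre flag ends up in its own integer column, named after its field position in the file.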

For the former, you could concatenate the genres into, say, a space-separated string, using:

# the flags are read as integers, so convert them to str before joining
df['genres'] = df[cols[2:]].apply(lambda x: ' '.join(map(str, x)), axis=1)
df.drop(cols[2:], axis=1, inplace=True) # drop the separate genre_N columns
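If what you want is one Python list per row, as the loop in the question builds, something like the following sketch may be closer. The small frame here is a made-up stand-in for the result of read_csv, with only three genre columns to keep it short.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by read_csv
df = pd.DataFrame({
    'movieId': [1, 2],
    'name': ['Movie A', 'Movie B'],
    'genre_5': [0, 0],
    'genre_6': [1, 0],
    'genre_7': [1, 1],
})
genre_cols = [c for c in df.columns if c.startswith('genre_')]

# One list of ints per row, comparable to the genre list built in the loop
df['genres'] = df[genre_cols].values.tolist()
df = df.drop(columns=genre_cols)  # keep only movieId, name and genres
print(df)
```

The genres column then has dtype object and holds ordinary Python lists, so vectorised numeric operations no longer apply to it; keep the separate columns if you need those.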

1 answers

solution1
0 2016-10-19 08:21:12