
Convert a .csv file into a specified format with Python in Jupyter

The file looks like this (it was shown as an image in the original post):

The first column is user_id, the second is the rating for joke 1, and the remaining columns follow the same pattern. I want to convert the file shown above into the following format:

user_id | joke_id | rating
--------------------------
   1    |   1     | -7.82
   1    |   2     | 8.79

In addition, normal ratings are between -10 and +10, so a value of 99 means the user did not rate the corresponding joke. Those rows should be removed after the conversion.

Your question involves several steps; please avoid combining multiple questions into one. Based on your question, the following steps should help:

  • Read the csv file using pandas
import pandas as pd
raw = pd.read_csv('PATH-TO-FILE')
  • Use melt to unpivot the DataFrame

Because only an image was provided, I will use a sample DataFrame instead.

raw = pd.DataFrame([[1, -7, 8, 99], [2, 4, 0, 6]], columns = ['user_id', 'joke_1', 'joke_2', 'joke_3'])
   user_id  joke_1  joke_2  joke_3
0        1      -7       8      99
1        2       4       0       6

Unpivot the DataFrame using melt:

df = pd.melt(raw, id_vars=['user_id'], value_vars=['joke_1', 'joke_2', 'joke_3'], var_name='joke', value_name='rating')
   user_id    joke  rating
0        1  joke_1      -7
1        2  joke_1       4
2        1  joke_2       8
3        2  joke_2       0
4        1  joke_3      99
5        2  joke_3       6
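
The melted frame still carries labels such as joke_1, while the requested format has a numeric joke_id and groups each user's rows together. A minimal sketch of that extra step follows. It assumes every rating column is named joke_<number>, as in the sample data above; the real file's column names were only shown in an image, so adjust the prefix if they differ. The names long_df and joke_id are illustrative.

```python
import pandas as pd

# Sample data mirroring the example above (not the asker's real file).
raw = pd.DataFrame(
    [[1, -7, 8, 99], [2, 4, 0, 6]],
    columns=["user_id", "joke_1", "joke_2", "joke_3"],
)

# Without value_vars, melt unpivots every column that is not in id_vars.
df = pd.melt(raw, id_vars=["user_id"], var_name="joke", value_name="rating")

# Assumption: every rating column is named "joke_<number>".
df["joke_id"] = df["joke"].str.replace("joke_", "", regex=False).astype(int)

# Reorder the columns and sort so each user's ratings sit together.
long_df = (
    df[["user_id", "joke_id", "rating"]]
    .sort_values(["user_id", "joke_id"])
    .reset_index(drop=True)
)
print(long_df)
```

Sorting by user_id and then joke_id is optional; it only matches the row order shown in the question.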
  • Filter the DataFrame with a condition

With pandas, you can easily filter a DataFrame with a condition:

df_processed = df[df.rating != 99].reset_index(drop=True)

Note that reset_index() is only used to tidy up the index after filtering; it is not required for what you asked.
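
As a small illustration of the filtering step, the sketch below builds the boolean mask explicitly and also shows an alternative range check based on the stated -10 to +10 rating scale. The range check is a different technique from the != 99 comparison above, offered here only as an option; it would also drop any other out-of-range values, which may or may not be what you want. The variable names are illustrative.

```python
import pandas as pd

# Melted sample data, matching the output shown earlier.
df = pd.DataFrame(
    {
        "user_id": [1, 2, 1, 2, 1, 2],
        "joke": ["joke_1", "joke_1", "joke_2", "joke_2", "joke_3", "joke_3"],
        "rating": [-7, 4, 8, 0, 99, 6],
    }
)

# Mask of rows where the user did not rate the joke (99 is the marker).
not_rated = df["rating"] == 99
df_processed = df[~not_rated].reset_index(drop=True)

# Alternative: keep only ratings inside the valid scale.
# Series.between is inclusive on both ends by default.
in_range = df["rating"].between(-10, 10)
df_in_range = df[in_range].reset_index(drop=True)

print(df_processed)
```

On this sample both approaches keep the same five rows, because 99 is the only value outside the valid scale.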

Hope this helps.
