Python beginner - coding with dtale (command order problem)

Question

For a university project in which I want to work with and learn Python, I stumbled upon the add-on dtale, which helps me analyze party manifesto mass data.

Long story short: I added some filters (e.g., I only want to show rows with an edate >= 20140914). When I run the code, the filters don't seem to be applied. Could you please help me with that?

import dtale
import pandas as pd
df = pd.read_csv('https://manifestoproject.wzb.eu/down/data/2020b/datasets/MPDataset_MPDS2020b.csv')
d = dtale.show(df)

# DISCLAIMER: 'df' refers to the data you passed in when calling 'dtale.show'

import pandas as pd

if isinstance(df, (pd.DatetimeIndex, pd.MultiIndex)):
    df = df.to_frame(index=False)

# remove any pre-existing indices for ease of use in the D-Tale code, but this is not required
df = df.reset_index().drop('index', axis=1, errors='ignore')
df.columns = [str(c) for c in df.columns]  # update columns to strings in case they are numbers

df.loc[:, 'edate'] = pd.Series(pd.to_datetime(df['edate'], infer_datetime_format=True), name='edate', index=df['edate'].index)
d.open_browser()

So basically, my goal is not to have to redo the date filtering etc. every time, but to have all my progress saved and applied when I run the code.

Thanks for your help!

Answer 1

There are some other arguments you can pass to pd.read_csv() that will probably help you out here:

parse_dates : Pass this argument as a list of columns that pandas should convert to dates. This might replace your second-to-last line.
index_col : This allows you to explicitly set an index, which should save you from having to convert with .to_frame().
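As a rough illustration of parse_dates, here is a minimal sketch. It reads a tiny inline CSV instead of the real manifesto file; the two columns and the ISO date format are assumptions made only for the example, so the real edate column may need a different format.

```python
import io

import pandas as pd

# Hypothetical stand-in for the manifesto CSV; the real file has many more
# columns and its date format may differ.
csv_text = """party,edate
A,2013-09-22
B,2014-09-14
C,2017-09-24
"""

# parse_dates converts the listed columns to datetime64 while loading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["edate"])

# Apply the date filter in code so it is re-applied on every run.
cutoff = pd.Timestamp("2014-09-14")
filtered = df[df["edate"] >= cutoff].reset_index(drop=True)
print(filtered)
```

Passing filtered (rather than the unfiltered df) to dtale.show() should then open the grid with only the rows you want.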

If these don't get you all the way there, I have two ideas:

You can put all this logic inside its own function, called something like clean_df, and call that on newly loaded data.
You can save your cleaned data in a format other than .csv. One of many options is to save the DataFrame to something called a pickle, which is one way Python objects can be serialized to disk. Loading a DataFrame from a pickle brings it back pretty much exactly as you saved it, with no need to redo all the cleaning.
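Both ideas can be sketched together. This is only an illustration under assumptions: the body of clean_df, the cutoff parameter, the sample rows, and the file name manifesto_clean.pkl are all made up for the example.

```python
import os
import tempfile

import pandas as pd


def clean_df(raw, cutoff="2014-09-14"):
    """Hypothetical cleaning step: parse edate and keep rows on or after cutoff."""
    out = raw.copy()
    out["edate"] = pd.to_datetime(out["edate"])
    out = out[out["edate"] >= pd.Timestamp(cutoff)]
    return out.reset_index(drop=True)


# Small made-up frame standing in for the freshly loaded CSV.
raw = pd.DataFrame(
    {
        "party": ["A", "B", "C"],
        "edate": ["2013-09-22", "2014-09-14", "2017-09-24"],
    }
)
cleaned = clean_df(raw)

# Save the cleaned frame once, then reload it later without re-cleaning.
path = os.path.join(tempfile.mkdtemp(), "manifesto_clean.pkl")
cleaned.to_pickle(path)
restored = pd.read_pickle(path)
print(restored.dtypes)
```

The pickle keeps the datetime dtype of edate, which a round trip through .csv would not. Only load pickle files you created yourself, since unpickling untrusted files can execute arbitrary code.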

Also, a small note: I don't think you need to import pandas twice.

Solution 1
0 2021-10-07 18:36:23