Extract elements from a list of lists - Python Pandas

I have the following pandas dataframe with only one column:

          column_name
0   cc_call_center_sk
1   cc_call_center_id
2   cc_rec_start_date
3     cc_rec_end_date

What I want to do is extract each element of that pandas column and put it into a string like this:

my_string = ['cc_call_center_sk', 'cc_call_center_id', 'cc_rec_start_date', 
'cc_rec_end_date']

I tried to do this with the following code:

my_list = column_names.values.tolist()

However, the output is a list of lists, which is not what I want:

[['cc_call_center_sk'], ['cc_call_center_id'], ['cc_rec_start_date'], ['cc_rec_end_date']]

Calling df.names.tolist() generates the expected result:

>>> df.names.tolist()
['cc_call_center_sk', 'cc_call_center_id', 'cc_rec_start_date', 'cc_rec_end_date']

For example:

>>> df=pd.DataFrame([['cc_call_center_sk'], ['cc_call_center_id'], ['cc_rec_start_date'], ['cc_rec_end_date']], columns=['names'])
>>> df
               names
0  cc_call_center_sk
1  cc_call_center_id
2  cc_rec_start_date
3    cc_rec_end_date
>>> df.names.tolist()
['cc_call_center_sk', 'cc_call_center_id', 'cc_rec_start_date', 'cc_rec_end_date']

Are you sure you do not group values, or perform some other preprocessing, before obtaining df.names?
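One preprocessing step that commonly produces the nested output is selecting the column with double brackets, which keeps a one-column DataFrame instead of a Series. The sketch below is only an illustration under that assumption; the names as_frame and as_series are hypothetical and do not come from the question.

```python
import pandas as pd

# Small stand-in for the frame in the question (two rows only).
df = pd.DataFrame({"names": ["cc_call_center_sk", "cc_call_center_id"]})

as_frame = df[["names"]]   # double brackets: still a DataFrame
as_series = df["names"]    # single brackets: a Series

# A DataFrame converts row by row, so each value ends up in its own inner list.
print(type(as_frame).__name__, as_frame.values.tolist())

# A Series converts to a flat list of values.
print(type(as_series).__name__, as_series.tolist())
```

Checking type(...) on the object before calling tolist() is a quick way to tell which of the two cases applies.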

You can use the tolist method on the 'column_name' series. Note that my_string is a list of strings, not a string, so the name you have assigned is misleading.

>>> import pandas as pd
>>> df = pd.DataFrame(['cc_call_center_sk', 'cc_call_center_id', 'cc_rec_start_date', 'cc_rec_end_date'],
...                   columns=['column_name'])
>>> df
         column_name
0  cc_call_center_sk
1  cc_call_center_id
2  cc_rec_start_date
3    cc_rec_end_date
>>>
>>> df['column_name'].tolist()
['cc_call_center_sk', 'cc_call_center_id', 'cc_rec_start_date', 'cc_rec_end_date']

If you prefer the dot notation, the following code does the same.

>>> df.column_name.tolist()
['cc_call_center_sk', 'cc_call_center_id', 'cc_rec_start_date', 'cc_rec_end_date']
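One caveat with dot notation: it only reaches a column when the column name is a valid Python identifier and does not collide with an existing DataFrame attribute or method. The sketch below uses a hypothetical extra column called count to show the collision; that column is an assumption for illustration and is not part of the question's data.

```python
import pandas as pd

# Hypothetical frame: the "count" column name collides with DataFrame.count.
df = pd.DataFrame({
    "column_name": ["cc_call_center_sk", "cc_call_center_id"],
    "count": [1, 2],
})

# For an ordinary column name, both notations return the same values.
same = df.column_name.tolist() == df["column_name"].tolist()

# Dot notation resolves to the DataFrame.count method, not the column.
dot_is_method = callable(df.count)

# Bracket notation always returns the column itself.
count_values = df["count"].tolist()

print(same, dot_is_method, count_values)
```

Bracket notation is therefore the safer default when column names are not under your control.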

Let's say you have a data frame named df which looks like this:

df
    column_name
0   cc_call_center_sk
1   cc_call_center_id
2   cc_rec_start_date
3   cc_rec_end_date

then:

my_string = df.column_name.values.tolist()

or:

my_string = df['column_name'].values.tolist()

would give you the result that you want. Here is the result when you print my_string:

['cc_call_center_sk',
'cc_call_center_id',
'cc_rec_start_date',
'cc_rec_end_date']
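For a single column, going through .values is optional. As a minimal sketch, assuming a reasonably recent pandas version, the three spellings below should produce the same list; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    ["cc_call_center_sk", "cc_call_center_id", "cc_rec_start_date", "cc_rec_end_date"],
    columns=["column_name"],
)

via_series = df["column_name"].tolist()            # directly on the Series
via_values = df["column_name"].values.tolist()     # through the underlying array
via_numpy = df["column_name"].to_numpy().tolist()  # through an explicit NumPy array

# All three are flat lists holding the same strings.
print(via_series == via_values == via_numpy)
```

The pandas documentation recommends to_numpy() over .values when an array is actually needed, but for this task calling tolist() on the Series is the shortest route.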

What your code currently does is equivalent to this:

my_strings = df.values.tolist()

This gives you a list of lists: the outer list contains one inner list per observation (row) in your data frame, and each inner list contains all the feature (column) values for that observation.
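If you already hold such a list of lists, it can also be flattened after the fact. The following is a rough sketch using itertools.chain from the standard library; the names nested and flat are illustrative and not from the original post.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame(
    ["cc_call_center_sk", "cc_call_center_id", "cc_rec_start_date", "cc_rec_end_date"],
    columns=["column_name"],
)

# Converting the whole frame yields one inner list per row.
nested = df.values.tolist()

# Flatten exactly one level of nesting into a single list.
flat = list(chain.from_iterable(nested))

print(nested)
print(flat)
```

Note that with more than one column this flattening interleaves values from every column row by row, so selecting the single column first is usually the clearer choice.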

I hope that explanation was clear. Thank you.
