
Create a list of lists using two Pandas dataframes

I have these two dataframes: 1) The data here are grouped by station_id (from 1 to 98) and time (one reading every hour from 27-01-2020 to 26-05-2020).

[image of the first dataframe]

2) In the second dataframe I have a latitude and longitude value for every station_id.

[image of the second dataframe]

My aim is to create a list of lists in this format:

     latitude           longitude      flow   hour  month  day
[[53.37947845458979, -1.46990168094635, 278.0, 0.0, 1.0, 27.0], 
 [53.379791259765604, -1.46999669075012, 122.0, 0.0, 1.0, 27.0], 
 [53.380035400390604, -1.47001004219055, 58.0, 0.0, 1.0, 27.0], ...]

so that there is one list [latitude, longitude, flow, hour, month, day] for every row in the first dataframe. I tried the following code:

import pandas as pd
import datetime as dt

df = pd.read_csv("readings_by_hour.csv")
df['time'] = pd.to_datetime(df['time'])
df1 = pd.read_csv("stations_info.csv")

i = 0
a = []
b = []
count = df1['station_id'].count()

while i < count:
    if df['station_id'][i] == df1['station_id'][i]:
        a = print(df1['latitude'][i] + ", " + df1['longitude'][i] + ", " + df['flow'][i] + ", " + df['time'].dt.hour + ", " + df['time'].dt.month + ", " + df['time'].dt.day)
        b += [a]
        i += 1

print(b)

but it doesn't seem to work: it produces no output, although it doesn't raise any error either.

You could merge the two dataframes on the station_id column, then create your list of lists like so:

merged_df = pd.merge(df, df1, on='station_id')

list_of_lists = []

# Iterate over each row
for index, row in merged_df.iterrows():

    # Create the list for the current row; hour, month, and day are read from the 'time' timestamp
    rowlist = [row['latitude'], row['longitude'], row['flow'], row['time'].hour, row['time'].month, row['time'].day]

    # Append the list to the final list
    list_of_lists.append(rowlist)

You can extract the hour, month, and day from the time column (already converted with pd.to_datetime) using the .dt accessor, for example df['time'].dt.hour.

See the pandas docs on pd.merge for more info: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
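
The merge-then-collect idea can also be written without an explicit loop. This is only a minimal sketch, assuming the column names from the question (station_id, time, flow, latitude, longitude); the small in-memory frames are made-up stand-ins for the two CSV files.

```python
import pandas as pd

# Hypothetical stand-ins for readings_by_hour.csv and stations_info.csv
df = pd.DataFrame({
    "station_id": [1, 2, 1],
    "time": pd.to_datetime(
        ["2020-01-27 00:00", "2020-01-27 00:00", "2020-01-27 01:00"]
    ),
    "flow": [278.0, 122.0, 300.0],
})
df1 = pd.DataFrame({
    "station_id": [1, 2],
    "latitude": [53.3795, 53.3798],
    "longitude": [-1.4699, -1.4700],
})

# Attach each station's coordinates to every reading
merged_df = df.merge(df1, on="station_id", how="left")

# Derive the calendar fields from the datetime column with the .dt accessor
merged_df["hour"] = merged_df["time"].dt.hour
merged_df["month"] = merged_df["time"].dt.month
merged_df["day"] = merged_df["time"].dt.day

# Select the columns in the desired order and convert the rows to plain lists
columns = ["latitude", "longitude", "flow", "hour", "month", "day"]
list_of_lists = merged_df[columns].values.tolist()
print(list_of_lists)
```

Because the selected columns mix floats and integers, the conversion yields floats throughout, which matches the 0.0, 1.0, 27.0 style of the desired output in the question.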

In the given code, you assign the return value of the print function to a and then add it to b. print always returns None, so a is None, and b would collect None values rather than the rows you want.
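
A tiny illustration of that point, using a made-up string in place of the real row values:

```python
# print writes its arguments to standard output and returns None
a = print("53.38, -1.47, 278.0")

b = []
b += [a]

print(b)  # prints [None]
```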

I have made corrections so that it works. Hope it helps.

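while i < count:
    # Rows are compared by position, so both dataframes must list stations in the same order
    if df['station_id'][i] == df1['station_id'][i]:
        # Build the row as a list; hour, month, and day come from the timestamp at row i
        a = [df1['latitude'][i], df1['longitude'][i], df['flow'][i],
             df['time'][i].hour, df['time'][i].month, df['time'][i].day]
        b.append(a)
    # Advance on every pass; incrementing only inside the if would loop forever on a mismatch
    i += 1

print(b)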
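
The loop above pairs row i of df with row i of df1, so it relies on both frames listing stations in the same order and covers only as many rows as df1 has. A possible variant looks the coordinates up by station_id instead. This is a sketch, assuming each station_id appears exactly once in df1; the sample frames are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; the readings are deliberately not in station order
df = pd.DataFrame({
    "station_id": [2, 1, 2],
    "time": pd.to_datetime(
        ["2020-01-27 00:00", "2020-01-27 00:00", "2020-01-27 01:00"]
    ),
    "flow": [122.0, 278.0, 130.0],
})
df1 = pd.DataFrame({
    "station_id": [1, 2],
    "latitude": [53.3795, 53.3798],
    "longitude": [-1.4699, -1.4700],
})

# Index the station table by station_id so coordinates can be fetched by key
stations = df1.set_index("station_id")

b = []
for station_id, time, flow in zip(df["station_id"], df["time"], df["flow"]):
    latitude = stations.at[station_id, "latitude"]
    longitude = stations.at[station_id, "longitude"]
    b.append([latitude, longitude, flow, time.hour, time.month, time.day])

print(b)
```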


 