
How do I extract the date from a column in a csv file using pandas?

This is the 'aired' column in the csv file: [screenshot of the column, not preserved]

Link to the csv file: https://drive.google.com/file/d/1w7kIJ5O6XIStiimowC5TLsOCUEJxuy6x/view?usp=sharing

I want to extract the date, and the month (in words), from the date that follows the word 'from', and store them in separate columns in another csv file. The 'from' gets in the way: had the cell contained just the date, it could easily have been parsed as a timestamp.

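import re

import pandas as pd

file = pd.read_csv('file.csv')
result = []

for cell in file['aired']:
    # pull out the YYYY-MM-DD date that follows 'from' instead of relying on
    # fixed slice positions (assumes cells look like "{'from': '2012-07-04', ...}")
    match = re.search(r"'from': '(\d{4}-\d{2}-\d{2})", str(cell))
    # cells without a 'from' date become NaT rather than raising
    date_ts = pd.to_datetime(match.group(1), format='%Y-%m-%d') if match else pd.NaT
    result.append((date_ts.month_name(), date_ts))

# one row per input cell: month name in words, plus the parsed date
df = pd.DataFrame(result, columns=['month', 'date'])
df.to_csv('result_file.csv')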

You are starting from a string and want to break out the data inside it. The single quotes are a clue that this is a dict in string form. The Python standard library includes the ast (Abstract Syntax Trees) module, whose literal_eval function can parse such a string into a dict, as described in this SO answer: Convert a String representation of a Dictionary to a dictionary?
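As a quick illustration of that first step, here is a minimal sketch on a single made-up cell value. The sample string is an assumption about what the 'aired' cells look like, not a value copied from the actual file.

```python
import ast

# hypothetical cell value, assumed to mirror the format of the 'aired' column
sample = "{'from': '2012-07-04', 'to': '2012-09-19'}"

# literal_eval only parses Python literals (dicts, strings, numbers, None, ...)
# and, unlike eval, does not execute arbitrary code
aired = ast.literal_eval(sample)

# the date after 'from' is now an ordinary dict lookup
from_date = aired['from']
```

Once the cell is a real dict, the 'from' prefix is no longer an obstacle: the date string can be passed straight to pd.to_datetime.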

Apply that to your column to get a Series of dicts, then expand it into separate columns using .apply(pd.Series), based on this SO answer: Splitting dictionary/list inside a Pandas Column into Separate Columns

Try the following:

import pandas as pd
import ast

df = pd.read_csv('AnimeList.csv')
# turn the pd.Series of strings into a pd.Series of dicts
aired_dict = df['aired'].apply(ast.literal_eval)
# expand each dict into a pd.Series; pandas assembles the results
# into a pd.DataFrame with one column per dict key ('from' and 'to')
aired_df = aired_dict.apply(pd.Series)
# concatenate the remainder of the dataframe with the new data
df_aired = pd.concat([df.drop(['aired'], axis=1), aired_df], axis=1)
# convert the date strings to datetime values
df_aired['aired_from'] = pd.to_datetime(df_aired['from'])
df_aired['aired_to'] = pd.to_datetime(df_aired['to'])
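The snippet above stops at the datetime conversion. To get the month in words and a separate output file, as the question asks, something along these lines might work. It is only a sketch: the two sample rows and the 'title' column are invented stand-ins for the real file, and the output goes to an in-memory buffer so the example runs without touching disk.

```python
import ast
import io

import pandas as pd

# hypothetical rows standing in for the 'aired' column of the real file
df = pd.DataFrame({
    'title': ['A', 'B'],
    'aired': [
        "{'from': '2012-07-04', 'to': '2012-09-19'}",
        "{'from': None, 'to': None}",
    ],
})

# parse each string into a dict and pull out the 'from' value
aired_from = df['aired'].apply(lambda s: ast.literal_eval(s)['from'])

# missing or unparseable values become NaT rather than raising
date = pd.to_datetime(aired_from, errors='coerce')

result = pd.DataFrame({
    'month': date.dt.month_name(),  # month in words; missing where the date is missing
    'date': date,
})

# written to a buffer here; pass a path such as 'result_file.csv' to write a file
buffer = io.StringIO()
result.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Using errors='coerce' is a design choice for rows with no start date; drop it if you would rather have the conversion fail loudly on bad input.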

