How to remove time from date values pulled from a JSON file?

I am working in Python, pulling information from a JSON file and writing it to a CSV file. The code I am using is as follows:

import csv
import json

csv_kwargs = {
    'dialect': 'excel',
    'doublequote': True,
    'quoting': csv.QUOTE_MINIMAL
}

inpfile = open('checkin.json', 'r', encoding='utf-8')
outfile = open('checkin.csv', 'w', encoding='utf-8')

writer = csv.writer(outfile, **csv_kwargs, lineterminator="\n")

for line in inpfile:
    d = json.loads(line)
    writer.writerow([d['business_id'],d['date']])

inpfile.close()
outfile.close()

checkin.json has the keys business_id and date. The date values are in the form 'MM:DD:YYYY HH:MM:SS', showing the date and then the time. Each business_id has multiple dates associated with it. A line from the JSON file is shown below to illustrate how each business_id and its dates appear:

{"business_id":"--1UhMGODdWsrMastO9DZw","date":"2016-04-26 19:49:16, 2016-08-30 18:36:57, 2016-10-15 02:45:18, 2016-11-18 01:54:50, 2017-04-20 18:39:06, 2017-05-03 17:58:02"}

My question is: how do I code this to keep the date but drop the time, given that both are in the same key value?

You can parse each date in your JSON as a timestamp and then truncate it to a date using Python's built-in datetime module.

Import the module:

from datetime import datetime

Parse the dates while writing:

for line in inpfile:
    d = json.loads(line)

    # The value holds several comma-separated timestamps; parse each one and keep only the date.
    dates = map(lambda dt: datetime.strptime(dt.strip(), '%Y-%m-%d %H:%M:%S').strftime('%Y-%m-%d'), d['date'].split(','))
    for date in dates:
        writer.writerow([d['business_id'], date])
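
As a minimal sketch of the parse-and-truncate approach above, the conversion can be pulled into a small helper that is easy to check on its own. The name strip_times is hypothetical, and the format string assumes every timestamp looks like the ones in the sample line (YYYY-MM-DD HH:MM:SS, separated by commas).

```python
from datetime import datetime

TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S'  # assumed from the sample line in the question


def strip_times(value):
    """Return the date part of each comma-separated timestamp in value."""
    dates = []
    for part in value.split(','):
        # strptime validates the timestamp; strftime keeps only the date.
        parsed = datetime.strptime(part.strip(), TIMESTAMP_FORMAT)
        dates.append(parsed.strftime('%Y-%m-%d'))
    return dates


print(strip_times("2016-04-26 19:49:16, 2016-08-30 18:36:57"))
```

Because strptime raises ValueError on input that does not match the format, this version also flags malformed timestamps instead of silently passing them through.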

If you're strictly using this program to convert the JSON file to CSV, you can simply use string slices:

# Each timestamp is 'YYYY-MM-DD HH:MM:SS'; this slices the first one in the value.
date, time = d['date'][:10], d['date'][11:19]

If you want to store it as a datetime object to do something else:

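from datetime import datetime

# Parse every comma-separated timestamp in the value into a datetime object.
dts = [datetime.strptime(s.strip(), '%Y-%m-%d %H:%M:%S') for s in d['date'].split(',')]
# Other stuff
dt_strings = [dt.strftime('%Y-%m-%d') for dt in dts]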
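
To illustrate what keeping real datetime objects makes possible, here is a hedged sketch that counts check-ins per calendar day and finds the earliest one. The input string is a made-up example in the same format as the sample line, and the names value, stamps, per_day and earliest are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical value in the same format as the question's sample line.
value = "2016-04-26 19:49:16, 2016-04-26 21:03:00, 2016-08-30 18:36:57"

# Parse each comma-separated timestamp into a datetime object.
stamps = [datetime.strptime(s.strip(), '%Y-%m-%d %H:%M:%S') for s in value.split(',')]

# .date() drops the time, so timestamps on the same day compare equal.
per_day = Counter(dt.date() for dt in stamps)
earliest = min(stamps).date()

for day, count in sorted(per_day.items()):
    print(day.isoformat(), count)
```

If the only goal is the CSV export, plain string handling is enough; parsing pays off when the dates need to be compared, sorted or grouped.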

The formatting for date values described in your question isn't consistent: first you say it's MM:DD:YYYY, but in the sample line from the JSON input file it appears to be YYYY-MM-DD. While such details may matter, that particular one doesn't affect the revised code below. What did matter was the fact that there can be more than one date per value, which is why I'm updating my answer.

import csv
import json

csv_kwargs = {
    'dialect': 'excel',
    'doublequote': True,
    'quoting': csv.QUOTE_MINIMAL,
}

with open('checkin.json', 'r', encoding='utf-8') as inpfile, \
     open('checkin.csv', 'w', encoding='utf-8', newline='') as outfile:

    writer = csv.writer(outfile, **csv_kwargs)

    for line in inpfile:
        d = json.loads(line)
        # Convert date value string into list of dates with the times removed.
        dates = [date.strip().split(' ')[0] for date in d['date'].split(',')]
        writer.writerow([d['business_id']] + dates)
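
The code above writes one row per business, with a variable number of date columns. As a sketch of the alternative layout, one row per (business_id, date) pair, the same splitting logic can be wrapped in generators and tried in memory before touching any files. The function names rows_wide and rows_long are hypothetical, and the sample is a shortened copy of the line from the question.

```python
import csv
import io
import json


def rows_wide(lines):
    """Yield one row per business: the id followed by all of its dates."""
    for line in lines:
        d = json.loads(line)
        dates = [part.strip().split(' ')[0] for part in d['date'].split(',')]
        yield [d['business_id']] + dates


def rows_long(lines):
    """Yield one row per (business_id, date) pair."""
    for row in rows_wide(lines):
        for date in row[1:]:
            yield [row[0], date]


# Shortened copy of the sample line from the question.
sample = ['{"business_id":"--1UhMGODdWsrMastO9DZw",'
          '"date":"2016-04-26 19:49:16, 2016-08-30 18:36:57"}']

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows_long(sample))
print(buffer.getvalue())
```

The long layout gives every row the same two columns, which tends to be easier to load into a spreadsheet or database; the wide layout keeps each business on a single line.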
