Select all rows with a date in an unstructured CSV

Question

So I have the output of a Google Trends query. It contains several tables on one sheet. The first part of the sheet looks like:

Web Search interest: nespresso  
United States; date_range:(today 90-d)  

Interest over time  
Day nespresso
8/7/2015    70
8/8/2015    82
8/9/2015    91
8/10/2015   84

So here's what I'd like to do: disregard the first few rows and select any rows with a date. (Weekly data has dates in the form 8/7/2015-8/14/2015.) Sure, there's nrows and skip in read.csv, but I was wondering if there was a systematic way to do this.

Also, bear in mind that the Google Trends export includes more data after the dates:

11/3/2015    
11/4/2015    


Top subregions for nes  
Subregion   nes
New York    100
Massachusetts   83

Looking for a Python or R solution.

Answer 1

Consider this Python solution, which reads in the raw csv and converts the first column to a date. Try/Except is used to skip rows whose first column does not convert properly to the date format.

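import csv
from datetime import datetime

with open('Unstructured.csv', 'rt', newline='') as csvfile:
    csvReader = csv.reader(csvfile)
    data = []

    for row in csvReader:
        try:
            # Keep the row only if its first cell parses as m/d/Y; store it as yyyy-mm-dd
            data.append([datetime.strptime(row[0], "%m/%d/%Y").strftime("%Y-%m-%d"), row[1]])
        except (ValueError, IndexError):
            # ValueError: first cell is not a date; IndexError: blank or one-column row
            continue

for i in data:
    print(i)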

Output (data list)

['2015-08-07', '70']
['2015-08-08', '82']
['2015-08-09', '91']
['2015-08-10', '84']
['2015-11-03', '']
['2015-11-05', '']
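
The question also mentions weekly exports, where the first column holds a range such as 8/7/2015-8/14/2015 instead of a single day. The same try/except idea can probably be stretched to cover that case. The following is only a sketch under assumptions: the helper names `parse_start_date` and `select_date_rows` are made up for illustration, the file is assumed to be comma-separated, and a range is assumed to be two m/d/Y dates joined by a hyphen. It keeps the start date of each range and skips anything else.

```python
import csv
import io
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # assumed format, taken from the sample in the question


def parse_start_date(cell):
    """Return the date in `cell`, or the start date of a 'start-end' range.

    Raises ValueError if any part of the cell is not a valid date.
    """
    parts = cell.strip().split("-")
    dates = [datetime.strptime(p.strip(), DATE_FORMAT).date() for p in parts]
    return dates[0]


def select_date_rows(lines):
    """Yield (date, value) for every CSV row whose first cell is a date or range."""
    for row in csv.reader(lines):
        if not row:
            continue  # blank line between tables
        try:
            day = parse_start_date(row[0])
        except ValueError:
            continue  # title, header, or a row from another table
        value = row[1] if len(row) > 1 else ""
        yield day, value


# Small in-memory stand-in for the export, so the sketch runs without a file.
sample = io.StringIO(
    "Web Search interest: nespresso\n"
    "\n"
    "Week,nespresso\n"
    "8/7/2015-8/14/2015,70\n"
    "8/15/2015,82\n"
    "11/3/2015,\n"
    "\n"
    "Subregion,nes\n"
    "New York,100\n"
)
rows = list(select_date_rows(sample))
for day, value in rows:
    print(day.isoformat(), value)
```

With a real file, `select_date_rows(open('Unstructured.csv', newline=''))` should work the same way. Splitting on the hyphen is safe here only because m/d/Y dates contain no hyphens themselves; an export that uses yyyy-mm-dd dates would need a different separator rule.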

Answer 1 accepted 2015-11-05 04:51:25
