Python and Sqlite3, transforming data text file to sql database

I've been given an assignment/challenge to complete, but I just don't know where to start. I have experience with Python, but not with databases or with the kind of data transformation the task describes.

Here is a snippet of the text file I've been given:

   Grid-ref=   1, 148
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630

Grid-ref=   1, 311
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
  490  290  280  230  200  250  440  530  460  420  530  450
Grid-ref=   1, 312
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410
  460  280  260  220  190  240  430  520  450  400  520  410

From this I must create a database containing 4 columns, like so:

Xref    Yref    Date        Value
1       148     1,1,2000    3020
1       148     1,2,2000    2820

I hope you can see the pattern: in Grid-ref= 1, 148, the 1 and 148 are my X and Y co-ordinates, and each number below that line is a value. I need to iterate through the values and give each one a date, which is simply the 1st of each month for 10 years.
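For illustration, the date pattern described above (the 1st of each month across 10 years) can be generated with the standard library alone. This is only a minimal sketch: `monthly_dates` and its parameters are hypothetical names, and the start year of 2000 is taken from the example table.

```python
from datetime import date

def monthly_dates(start_year=2000, years=10):
    """Return the 1st of every month for `years` consecutive years."""
    # Assumption: the sequence starts in January of start_year.
    return [date(start_year + y, m, 1)
            for y in range(years)
            for m in range(1, 13)]

dates = monthly_dates()
print(len(dates))           # 120
print(dates[0], dates[-1])  # 2000-01-01 2009-12-01
```

Ten years of twelve months gives 120 dates, which matches the 10 rows of 12 values under each Grid-ref line in the snippet.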

So far I have this code. I know it isn't much, but it's a start.

    import os
    import csv
    import sqlite3

    f_path = os.path.dirname(os.path.abspath(__file__)) + "/data/"

    db = sqlite3.connect('output.db')
    cursor = db.cursor()
    cursor.execute('CREATE TABLE Data (Xref, Yref, Date, Value)')

    date = 2000 - 2010
    grid = 'Xref, Yref'

    with open(f_path + "data.to.use.txt") as file_read:
        for row in csv.DictReader(file_read):
            cursor.execute('''INSERT INTO Data
                                  VALUES (:Xref, :Yref, :Date, :Value)''', row)

    db.commit()
    db.close()

Thank you for all feedback and guidance. I'm in unfamiliar territory with this type of task and hope you can help.

You could try the code below. I'm not quite clear on the date requirement, so I just added a month for each entry.

    from datetime import datetime
    from dateutil.relativedelta import relativedelta

    xref = ''
    yref = ''
    # Running date; one month is added for every value read
    current = datetime.strptime('2000-01-01', '%Y-%m-%d')
    # Raw string so the backslashes in the Windows path are not read as escapes
    with open(r'C:\Users\shmathew\Desktop\Sample\sample.txt') as file_read:
        for row in file_read:
            print(row)
            if 'Grid-ref' in row:
                # Header line such as "Grid-ref=   1, 148"
                xref, yref = row.split('=')[1].split(',')
            else:
                # split() with no argument copes with any run of spaces
                for value in row.split():
                    current = current + relativedelta(months=+1)
                    print(xref.strip(), yref.strip(), current, value)

Sample output

490  290  280  230  200  250  440  530  460  420  530  450

1 311 2009-08-01 00:00:00 490
1 311 2009-09-01 00:00:00 290
1 311 2009-10-01 00:00:00 280
1 311 2009-11-01 00:00:00 230
1 311 2009-12-01 00:00:00 200
1 311 2010-01-01 00:00:00 250
1 311 2010-02-01 00:00:00 440
1 311 2010-03-01 00:00:00 530
1 311 2010-04-01 00:00:00 460
1 311 2010-05-01 00:00:00 420
1 311 2010-06-01 00:00:00 530
1 311 2010-07-01 00:00:00 450
490  290  280  230  200  250  440  530  460  420  530  450

1 311 2010-08-01 00:00:00 490
1 311 2010-09-01 00:00:00 290
1 311 2010-10-01 00:00:00 280
1 311 2010-11-01 00:00:00 230
1 311 2010-12-01 00:00:00 200
1 311 2011-01-01 00:00:00 250
1 311 2011-02-01 00:00:00 440
1 311 2011-03-01 00:00:00 530
1 311 2011-04-01 00:00:00 460
1 311 2011-05-01 00:00:00 420
1 311 2011-06-01 00:00:00 530
1 311 2011-07-01 00:00:00 450
490  290  280  230  200  250  440  530  460  420  530  450
1 311 2011-08-01 00:00:00 490
1 311 2011-09-01 00:00:00 290
1 311 2011-10-01 00:00:00 280
1 311 2011-11-01 00:00:00 230
1 311 2011-12-01 00:00:00 200
1 311 2012-01-01 00:00:00 250
1 311 2012-02-01 00:00:00 440
1 311 2012-03-01 00:00:00 530
1 311 2012-04-01 00:00:00 460
1 311 2012-05-01 00:00:00 420
1 311 2012-06-01 00:00:00 530
1 311 2012-07-01 00:00:00 450
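
If the dates should restart for every Grid-ref block, as the question's example table suggests, the parsing and the database insert could be combined roughly as follows. This is a hedged sketch rather than a definitive solution: `parse_grid_text` is a hypothetical helper, it assumes each data row holds one year of 12 monthly values starting in 2000, and it stores dates as ISO strings (2000-01-01) instead of the 1,1,2000 form shown in the question. An in-memory database and an inline sample stand in for the real files.

```python
import sqlite3
from datetime import date

def parse_grid_text(lines, start_year=2000):
    """Yield (xref, yref, iso_date, value) tuples from Grid-ref blocks."""
    xref = yref = None
    year = start_year
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank separator lines
        if line.startswith('Grid-ref'):
            # "Grid-ref=   1, 148" -> xref=1, yref=148
            x_text, y_text = line.split('=')[1].split(',')
            xref, yref = int(x_text), int(y_text)
            year = start_year  # every block starts again at the first year
            continue
        # Assumption: one row per year, one value per month
        for month, value in enumerate(line.split(), start=1):
            yield xref, yref, date(year, month, 1).isoformat(), int(value)
        year += 1

sample = """\
Grid-ref=   1, 148
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
 3020 2820 3040 2880 1740 1360  980  990 1410 1770 2580 2630
Grid-ref=   1, 311
  490  290  280  230  200  250  440  530  460  420  530  450
"""
rows = list(parse_grid_text(sample.splitlines()))

db = sqlite3.connect(':memory:')  # pass a file name such as 'output.db' to persist
db.execute('CREATE TABLE Data (Xref INTEGER, Yref INTEGER, Date TEXT, Value INTEGER)')
# Positional placeholders let sqlite3 bind each tuple safely
db.executemany('INSERT INTO Data VALUES (?, ?, ?, ?)', rows)
db.commit()

total = db.execute('SELECT COUNT(*) FROM Data').fetchone()[0]
first_two = db.execute('SELECT * FROM Data ORDER BY rowid LIMIT 2').fetchall()
print(total)      # 36
print(first_two)  # [(1, 148, '2000-01-01', 3020), (1, 148, '2000-02-01', 2820)]
db.close()
```

For the real file, the lines from an open file object can be passed to `parse_grid_text` in place of `sample.splitlines()`.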
