
Compare data in a text file vs. an Avro file using Python

I am new to Python, so please bear with me. I am using Python 3.6.4 and I want to compare the data in a text file against the data in my Avro dataset. The text file is pipe-delimited and comes from a table in a relational database. Please help. Thanks in advance.
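For reference, a pipe-delimited export like the one described can be read with the standard csv module. This is only a minimal sketch: the column layout (id|name|age, no header row, an empty column meaning NULL) and the names FIELDS and read_pipe_file are assumptions for illustration, not something taken from the real table.

```python
import csv

# Hypothetical column layout: id|name|age, no header row.
# Each entry is (column name, function that converts the raw string).
FIELDS = [("id", int), ("name", str), ("age", int)]


def read_pipe_file(path, fields=FIELDS):
    """Return one dict per line, with every column converted to its Python type."""
    records = []
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter="|"):
            if not row:  # skip completely blank lines
                continue
            # Pad short rows so that missing trailing columns become None
            row = row + [""] * (len(fields) - len(row))
            record = {}
            for (name, convert), raw in zip(fields, row):
                raw = raw.strip()
                # Assumption: an empty column stands for a database NULL
                record[name] = convert(raw) if raw != "" else None
            records.append(record)
    return records
```

Converting the columns matters because everything in a text file is a string, while Avro records come back with typed values (ints, None and so on), so "34" and 34 would otherwise never compare equal.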

Thanks, Vikas. Here's the code below. In it I append hardcoded data to the Avro file, which is easy to compare. My actual Avro file, however, will be the output of one program and the text file the output of another, and those are the files I have to compare. Thanks.

import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

def writeAvro(fileName):
    # Load the Avro schema that describes the records
    with open("testSchema.avsc", "rb") as schemaFile:
        schema = avro.schema.Parse(schemaFile.read())

    # Write three hardcoded test records
    writer = DataFileWriter(open(fileName, "wb"), DatumWriter(), schema)
    writer.append({"id": 1, "name": "John", "age": 34})
    writer.append({"id": 2, "name": "Jane", "age": 134})
    # No "age" here: this assumes the schema declares age as nullable
    writer.append({"id": 3, "name": "Davis"})
    writer.close()

def readAvro(fileName):
    reader = DataFileReader(open(fileName, "rb"), DatumReader())
    for record in reader:
        #print(record.get('name'))
        dict_name = record.get('name')
        # Print the expected entry whose name matches this Avro record
        for p in expected:
            if p['name'] == dict_name:
                print(p)
    reader.close()

# Records that the Avro file is expected to contain
expected = [{'name': 'John', 'id': 1, 'age': 34},
            {'id': 2, 'name': 'Jane', 'age': 134},
            {'id': 3, 'name': 'Davis', 'age': None}]

#print(expected)

writeAvro("test.avro")
readAvro("test.avro")
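Once both outputs come from separate programs, matching on name and printing is not enough. One possible approach is to load each side into a list of dicts and compare them row by row on a key column. The sketch below assumes that id uniquely identifies a row and that a field missing from a record should count as None. The helper names load_avro_records and compare_records are made up for illustration.

```python
def load_avro_records(path):
    """Read every record of an Avro data file into a list of dicts."""
    # Imported here so that compare_records can be used without avro installed
    from avro.datafile import DataFileReader
    from avro.io import DatumReader

    reader = DataFileReader(open(path, "rb"), DatumReader())
    try:
        return list(reader)
    finally:
        reader.close()  # also closes the underlying file


def compare_records(expected, actual, key="id"):
    """Compare two lists of dicts, matching rows on `key`.

    Returns a tuple (missing, unexpected, mismatched):
      missing    - keys found only in `expected`
      unexpected - keys found only in `actual`
      mismatched - {key: {field: (expected value, actual value)}}
    """
    expected_by_key = {row[key]: row for row in expected}
    actual_by_key = {row[key]: row for row in actual}

    missing = sorted(expected_by_key.keys() - actual_by_key.keys())
    unexpected = sorted(actual_by_key.keys() - expected_by_key.keys())

    mismatched = {}
    for k in expected_by_key.keys() & actual_by_key.keys():
        exp, act = expected_by_key[k], actual_by_key[k]
        # Assumption: a field that is absent is treated the same as None
        diffs = {
            field: (exp.get(field), act.get(field))
            for field in set(exp) | set(act)
            if exp.get(field) != act.get(field)
        }
        if diffs:
            mismatched[k] = diffs
    return missing, unexpected, mismatched


# Usage, assuming text_rows is a list of dicts parsed from the text file:
# missing, unexpected, mismatched = compare_records(
#     text_rows, load_avro_records("test.avro"))
```

Keying the rows by id means the two files do not have to be in the same order, and the three return values separate rows that are absent from rows whose field values differ.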
