I would like to read a specific text file, extract some of the data from it, and write that data to a tuple. The problem is that I don't need all the data from the file, just specific values. The text file looks like this:
HHSDMSDN1-pool 1.02T 141G 39 22 2.62M 940K
**c5t600507680C800000001CBd0 834G 118G** 32 16 2.19M 734K
**c5t600507680C00352d0 216G 22.3G** 7 5 434K 206K
HHSDMSDN2-pool 1.09T 308G 12 6 744K 83.8K
**c5t600507680C800001CDd0 790G 162G** 10 1 617K 12.5K
**c5t600507680C8000000037Dd0 203G 34.8G** 1 0 123K 10.2K
**c5t600507680C800000387d0 126G 112G** 0 5 5.36K 80.5K
HHSDMSDN3-pool 1.13T 33.4G 24 19 1.39M 623K
**c5t600507680C80002E6000001CFd0 921G 30.8G** 18 11 1.10M 465K
**c5t600507680C80002E600000203d0 235G 2.63G** 5 8 293K 158K
The bold text needs to go into a tuple. Ideally the first value would be a string and the next two would be floats, so the output would be:
((c5t600507680C800000001CBd0, 834, 118), (c5t600507680C00352d0, 216, 22.3), ...)
Any ideas?
You simply have to iterate over the file line by line and keep track of what you already have seen.
Edit: new solution as requested
import pprint
data = """HHSDMSDN1-pool 1.02T 141G 39 22 2.62M 940K
c5t600507680C800000001CBd0 834G 118G 32 16 2.19M 734K
c5t600507680C00352d0 216G 22.3G 7 5 434K 206K
HHSDMSDN2-pool 1.09T 308G 12 6 744K 83.8K
c5t600507680C800001CDd0 790G 162G 10 1 617K 12.5K
c5t600507680C8000000037Dd0 203G 34.8G 1 0 123K 10.2K
c5t600507680C800000387d0 126G 112G 0 5 5.36K 80.5K
HHSDMSDN3-pool 1.13T 33.4G 24 19 1.39M 623K
c5t600507680C80002E6000001CFd0 921G 30.8G 18 11 1.10M 465K
c5t600507680C80002E600000203d0 235G 2.63G 5 8 293K 158K"""
# collect all records by key
d = {}
# current key "HHSDM..."
k = None
# current records
r = []
for line in data.splitlines():
    if line.strip().startswith("c"):
        # this is a record, append it to the current collection of records
        fields = line.split()
        r.append((fields[0], fields[1], fields[2]))
    elif line.startswith("H"):
        # this is a new key: store the records of the previous key, if any,
        # and reset the current records
        if k:
            d[k] = r
            r = []
        # remember the new key, we will need it later
        k = line.split("-")[0]
# store the records of the last key at the end of the input
if k:
    d[k] = r
pprint.pprint(d)
Output:
{'HHSDMSDN1': [('c5t600507680C800000001CBd0', '834G', '118G'),
               ('c5t600507680C00352d0', '216G', '22.3G')],
 'HHSDMSDN2': [('c5t600507680C800001CDd0', '790G', '162G'),
               ('c5t600507680C8000000037Dd0', '203G', '34.8G'),
               ('c5t600507680C800000387d0', '126G', '112G')],
 'HHSDMSDN3': [('c5t600507680C80002E6000001CFd0', '921G', '30.8G'),
               ('c5t600507680C80002E600000203d0', '235G', '2.63G')]}
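The question asked for a flat tuple with a string followed by two floats, rather than a dictionary of strings. A minimal sketch of that variant follows. The function name parse_devices is made up for illustration, and the code assumes that every device line starts with "c" and that both sizes end in a single unit letter (always "G" in the sample), which is simply dropped without any unit conversion.

```python
import io


def parse_devices(lines):
    """Return ((name, size, free), ...) for every device line."""
    result = []
    for line in lines:
        fields = line.split()
        # skip blank lines and pool lines such as "HHSDMSDN1-pool ..."
        if len(fields) < 3 or not fields[0].startswith("c"):
            continue
        # assumption: each value ends in one unit letter ("G"), which is dropped
        result.append((fields[0], float(fields[1][:-1]), float(fields[2][:-1])))
    return tuple(result)


# io.StringIO stands in for an open text file here
sample = io.StringIO(
    "HHSDMSDN1-pool 1.02T 141G 39 22 2.62M 940K\n"
    "c5t600507680C800000001CBd0 834G 118G 32 16 2.19M 734K\n"
    "c5t600507680C00352d0 216G 22.3G 7 5 434K 206K\n"
)
devices = parse_devices(sample)
print(devices)
```

Because the function only iterates over lines, an open file object can be passed in directly, for example inside a with open(...) block. If a file mixes units (say "T" and "G"), the values would need to be scaled before they are compared.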