python parse text and add to list

I have the output below, which a Python function hands to me one line at a time in a for loop (as info).

    y.y.y.y:/mount/name mounted on /var/log/da:
    op/s rpc bklog
    2579.20 2.00
    read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
    1.000 2.000 3.000 4 (4.0%) 5.000 6.000
    write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
    2578.200 165768.087 64.296 0 (0.0%) 21.394 13980.817
    x.x.x.x:/mount/othername mounted on /data:
    op/s rpc bklog
    5.00 10.00
    read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
    0.000 0.000 0.000 0 (0.0%) 0.000 0.000
    write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
    0.000 0.000 0.000 0 (0.0%) 0.000 0.000

Ideally, I would like a dictionary with the mount name as the key and a list of all of that mount's metrics (op/s, rpc, read ops, etc.) as the value.

Here is what I have so far:

    for line in info:
        if ":/" in line[0]:
            section = "mountpoint"
            mountname = line[0]
            continue
        elif "op/s" in line[0]:
            section = "globals"
            continue
        elif "read:" in line[0]:
            section = "reads"
            continue
        elif "write:" in line[0]:
            section = "writes"
            continue
        #if section == "mountpoint":
        #    pass
        if section == "globals":
            mountglobals = line
            for i in mountglobals:
                infos.append(i)
        if section == "reads":
            reads = line
            for i in reads:
                infos.append(i)
        if section == "writes":
            writes = line
            for i in writes:
                infos.append(i)
        parsed[mountname] = {"infos": infos}

I am missing something in the iteration: the keys are added correctly, but each key's list contains the metrics of every mount. I am not sure how to specify which list of metrics belongs to which mountpoint/key.

Here is what info looks like:

    [[u'y.y.y.y:/mount/name', u'mounted', u'on', u'/var/log/da:'],
     [u'op/s', u'rpc', u'bklog'],
     [u'2579.20', u'2.00'],
     [u'read:', u'ops/s', u'kB/s', u'kB/op', u'retrans', u'avg', u'RTT', u'(ms)', u'avg', u'exe', u'(ms)'],
     [u'1.000', u'2.000', u'3.000', u'4', u'(4.0%)', u'5.000', u'6.000'],
     [u'write:', u'ops/s', u'kB/s', u'kB/op', u'retrans', u'avg', u'RTT', u'(ms)', u'avg', u'exe', u'(ms)'],
     [u'2578.200', u'165768.087', u'64.296', u'0', u'(0.0%)', u'21.394', u'13980.817'],
     [u'x.x.x.x:/mount/othername', u'mounted', u'on', u'/data:'],
     [u'op/s', u'rpc', u'bklog'],
     [u'5.00', u'10.00'],
     [u'read:', u'ops/s', u'kB/s', u'kB/op', u'retrans', u'avg', u'RTT', u'(ms)', u'avg', u'exe', u'(ms)'],
     [u'0.000', u'0.000', u'0.000', u'0', u'(0.0%)', u'0.000', u'0.000'],
     [u'write:', u'ops/s', u'kB/s', u'kB/op', u'retrans', u'avg', u'RTT', u'(ms)', u'avg', u'exe', u'(ms)'],
     [u'0.000', u'0.000', u'0.000', u'0', u'(0.0%)', u'0.000', u'0.000']]

Here is what my current output looks like:

    {u'y.y.y.y:/mount/name': {'infos': [u'2579.20', u'2.00', u'1.000', u'2.000', u'3.000', u'4', u'(4.0%)', u'5.000', u'6.000', u'2578.200', u'165768.087', u'64.296', u'0', u'(0.0%)', u'21.394', u'13980.817', u'5.00', u'10.00', u'0.000', u'0.000', u'0.000', u'0', u'(0.0%)', u'0.000', u'0.000', u'0.000', u'0.000', u'0.000', u'0', u'(0.0%)', u'0.000', u'0.000']},
     u'x.x.x.x:/mount/othername': {'infos': [u'2579.20', u'2.00', u'1.000', u'2.000', u'3.000', u'4', u'(4.0%)', u'5.000', u'6.000', u'2578.200', u'165768.087', u'64.296', u'0', u'(0.0%)', u'21.394', u'13980.817', u'5.00', u'10.00', u'0.000', u'0.000', u'0.000', u'0', u'(0.0%)', u'0.000', u'0.000', u'0.000', u'0.000', u'0.000', u'0', u'(0.0%)', u'0.000', u'0.000']}}

Appreciate some hints.
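
One possible reading of the symptom described above is that infos is created once, before the loop, so every key ends up pointing at the same ever-growing list. The question does not show where infos is created, so that is an assumption. A minimal sketch of a loop that starts a fresh list for each mount line follows; the function name parse_by_mount is hypothetical, and the sample info is shortened to a few rows.

```python
def parse_by_mount(info):
    """Group metric values under the mount line that precedes them."""
    parsed = {}
    current = None  # metric list of the mount currently being read
    for line in info:
        first = line[0]
        if ":/" in first:
            # New mount: start a fresh list instead of reusing a shared one.
            current = []
            parsed[first] = {"infos": current}
        elif first in ("op/s", "read:", "write:"):
            continue  # header rows carry no values
        elif current is not None:
            current.extend(line)
    return parsed


# Shortened sample in the same shape as the info list in the question.
info = [
    ["y.y.y.y:/mount/name", "mounted", "on", "/var/log/da:"],
    ["op/s", "rpc", "bklog"],
    ["2579.20", "2.00"],
    ["read:", "ops/s", "kB/s", "kB/op", "retrans", "avg", "RTT", "(ms)", "avg", "exe", "(ms)"],
    ["1.000", "2.000", "3.000", "4", "(4.0%)", "5.000", "6.000"],
    ["x.x.x.x:/mount/othername", "mounted", "on", "/data:"],
    ["op/s", "rpc", "bklog"],
    ["5.00", "10.00"],
]

parsed = parse_by_mount(info)
for mount in sorted(parsed):
    print(mount, parsed[mount]["infos"])
```

Because each mount line creates its own list, values read afterwards are appended only to that mount's entry.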

You should format your input/output text as code; otherwise it is hard to read. Put four spaces at the beginning of each line.
– miracle173
Aug 29 at 5:12

Thank you, it seems blhsing has already formatted the code.
– Marius Pana
Aug 29 at 5:17

1 Answer

Unless your input is extremely large, which I don't think is the case, I would recommend that you read the entire input into one string (the following example assumes that you have all your input in the variable s), and use re.findall to more easily parse the mount names and the associated 16 numbers for each mount name:

    import re
    parsed = {m[0]: m[1:] for m in re.findall(r'(\S+:/\S+)%s' % (r'.*?([\d.]+)' * 16), s, flags=re.DOTALL)}

parsed would become:

    {'y.y.y.y:/mount/name': ('2579.20', '2.00', '1.000', '2.000', '3.000', '4', '4.0', '5.000', '6.000', '2578.200', '165768.087', '64.296', '0', '0.0', '21.394', '13980.817'),
     'x.x.x.x:/mount/othername': ('5.00', '10.00', '0.000', '0.000', '0.000', '0', '0.0', '0.000', '0.000', '0.000', '0.000', '0.000', '0', '0.0', '0.000', '0.000')}
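
For anyone who wants to try this end to end, here is a self-contained sketch that builds s from the sample text in the question and applies the same pattern. The conversion to floats at the end is an optional extra that the answer does not include, and the name metrics is hypothetical.

```python
import re

# Sample text from the question, joined into one string.
s = (
    "y.y.y.y:/mount/name mounted on /var/log/da: "
    "op/s rpc bklog 2579.20 2.00 "
    "read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms) "
    "1.000 2.000 3.000 4 (4.0%) 5.000 6.000 "
    "write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms) "
    "2578.200 165768.087 64.296 0 (0.0%) 21.394 13980.817 "
    "x.x.x.x:/mount/othername mounted on /data: "
    "op/s rpc bklog 5.00 10.00 "
    "read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms) "
    "0.000 0.000 0.000 0 (0.0%) 0.000 0.000 "
    "write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms) "
    "0.000 0.000 0.000 0 (0.0%) 0.000 0.000"
)

# One group for the mount name, then 16 groups for the numbers after it.
pattern = r'(\S+:/\S+)' + r'.*?([\d.]+)' * 16
parsed = {m[0]: m[1:] for m in re.findall(pattern, s, flags=re.DOTALL)}

# Optional: turn the captured strings into floats for later arithmetic.
metrics = {mount: [float(v) for v in values] for mount, values in parsed.items()}

for mount in sorted(metrics):
    print(mount, metrics[mount])
```

Note that the percentage in parentheses, such as (4.0%), is captured as one of the 16 numbers without the parentheses or the percent sign.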

Thank you. It does not work for me, probably because I did not provide you with the format of the info list which does not contain empty lines.
– Marius Pana
Aug 29 at 6:10

My solution should ignore blank lines and newlines since it simply looks for mount names (that contain ':/') and floating numbers, regardless of what's in between. Can you show me how you read the entire input into one string, in order to apply my solution?
– blhsing
Aug 29 at 6:22

I'm using s = ','.join(str(v) for v in info). I may have made a typo when I tested earlier; I will try again shortly, as I think it should work.
– Marius Pana
Aug 29 at 10:11

The dictionary is there but is in a strange format. I am not sure how to clean it up. For example: for k, v in parsed.iteritems(): print (k) gives me [u'y.y.y.y:/mount/name', '],[u'x.x.x.x:/mount/othername',
– Marius Pana
Aug 29 at 10:42

It does indeed work. My conversion to a single string was not correct.
– Marius Pana
Aug 31 at 8:27
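
The thread does not show the corrected conversion, so the following is only a sketch of one way it could be done. Calling str() on each inner list keeps the brackets and quotes of the list's repr in the text, which would explain keys that start with a bracket as in the earlier comment. Joining the tokens themselves gives plain text. The sample info holds only the first mount, and the names bad, bad_keys and good_keys are hypothetical.

```python
import re

# First mount only, in the same shape as the info list in the question.
info = [
    ["y.y.y.y:/mount/name", "mounted", "on", "/var/log/da:"],
    ["op/s", "rpc", "bklog"],
    ["2579.20", "2.00"],
    ["read:", "ops/s", "kB/s", "kB/op", "retrans", "avg", "RTT", "(ms)", "avg", "exe", "(ms)"],
    ["1.000", "2.000", "3.000", "4", "(4.0%)", "5.000", "6.000"],
    ["write:", "ops/s", "kB/s", "kB/op", "retrans", "avg", "RTT", "(ms)", "avg", "exe", "(ms)"],
    ["2578.200", "165768.087", "64.296", "0", "(0.0%)", "21.394", "13980.817"],
]

# str() on an inner list yields its repr, brackets and quotes included.
bad = ','.join(str(v) for v in info)

# Joining the individual tokens yields plain text without list syntax.
s = ' '.join(token for line in info for token in line)

pattern = r'(\S+:/\S+)' + r'.*?([\d.]+)' * 16
bad_keys = [m[0] for m in re.findall(pattern, bad, flags=re.DOTALL)]
good_keys = [m[0] for m in re.findall(pattern, s, flags=re.DOTALL)]

print(bad_keys)   # mount name wrapped in leftover list syntax
print(good_keys)  # clean mount name
```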
