
Extracting a fixed number of elements based on specific text and transforming them into a data frame in Python

How can I extract specific elements from a list according to the criteria below?

mylist = ["aabc", "$2322", "$354", "lkgh", "rbus","hjguy", "$33","$77","kjlh","ghfd", "ytrwsd","$876", "$987", "abc", "efg" ]

I want to extract elements from the above list in groups: each group starts at an element with a '$' sign and contains 4 elements in total, counting from that '$' element. The desired final output is shown further down.

I first tried extracting the elements containing the '$' sign with the code below:

Key = '$'
matches = []
for text in mylist:
    # keep only the elements that contain the '$' sign
    if Key in text:
        matches.append(text)
print(matches)

I got the actual output below:

["$2322", "$354", "$33","$77","$876", "$987"]

I also tried extracting the indices of the elements with the '$' sign and taking the elements between indices that differ by more than one, but this does not give the desired output either.

indices = [i for i, s in enumerate(mylist) if '$' in s]
print(indices)

This gives the indices, but not the desired output, which is:

mylist = ["$2322", "$354", "lkgh", "rbus", "$33","$77","kjlh","ghfd","$876", "$987", "abc", "efg" ]

Finally, this list should be transformed into a data frame like the one below.

[image: desired data frame]

You can use a while loop to iterate an index through mylist, and a nested while loop to keep incrementing the index until it points to an item that starts with $, at which point the 4 items starting at that index are added to the output:

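output = []
i = 0
while i < len(mylist):
    # skip items until one starts with '$' (or the list runs out)
    while i < len(mylist) and not mylist[i].startswith('$'):
        i += 1
    # take that item plus the next three; the slice is empty past the end
    output.extend(mylist[i:i + 4])
    i += 4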

output becomes:

['$2322', '$354', 'lkgh', 'rbus', '$33', '$77', 'kjlh', 'ghfd', '$876', '$987', 'abc', 'efg']
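
The question also asks for a data frame, and the text does not show the target layout, so the following is only a sketch under an assumption: each group of 4 becomes one row. The helper name extract_groups and its marker and size parameters are hypothetical, introduced here for illustration. It follows the same idea as the loop above but keeps the groups separate instead of flattening them:

```python
def extract_groups(items, marker="$", size=4):
    """Return groups of up to `size` items, each starting at an item that begins with `marker`."""
    rows = []
    i = 0
    while i < len(items):
        if items[i].startswith(marker):
            # Slicing past the end of the list is safe; the slice is just shorter.
            rows.append(items[i:i + size])
            i += size
        else:
            i += 1
    return rows


mylist = ["aabc", "$2322", "$354", "lkgh", "rbus", "hjguy", "$33", "$77",
          "kjlh", "ghfd", "ytrwsd", "$876", "$987", "abc", "efg"]

rows = extract_groups(mylist)
# Flatten the groups again to get the same list as `output` above.
flat = [item for row in rows for item in row]
print(rows)
```

If pandas is installed, passing rows to pd.DataFrame(rows) should give one row per group with 4 unnamed columns; whether that matches the intended layout is an assumption, and column names would have to be supplied separately.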
