
Efficient way to split R, G and B values from a file containing RGB values (Without NumPy)

I have a file that contains RGB values, like this:

Sample Image Data.txt file

Each row contains triplets (like 255,255,255) separated by spaces.
Each triplet has three comma-separated integers. These integers correspond to the R ('RED'), G ('GREEN') and B ('BLUE') values. All integers are less than 256.

255,255,255 250,250,250 254,254,254 250,250,250 
255,255,255 253,253,253 255,255,255 255,255,255 
251,251,251 247,247,247 251,251,251 250,250,250
195,195,195 191,191,191 195,195,195 195,195,195
255,255,255 253,253,253 254,254,254 255,255,255 
255,255,255 254,254,254 239,239,239 240,240,240
238,238,238 254,254,254 255,255,255 255,255,255

The processed output should look like:
RED = ['255','250','254','250','255','253','255',............,'254','255','255']
GREEN = ['255','250','254','250','255','253','255',............,'254','255','255']
BLUE = ['255','250','254','250','255','253','255',............,'254','255','255']
RGB_Nx3_MATRIX = [['255','255','255'],['250','250','250'],['254','254','254'].....['255','255','255']]

My code works fine.

import re

file_object = open('Image Data.txt','r') 

RED_VECTOR = []         #SEQUENTIALLY STORES ALL 'R' VALUES
GREEN_VECTOR = []       #SEQUENTIALLY STORES ALL 'G' VALUES
BLUE_VECTOR = []        #SEQUENTIALLY STORES ALL 'B' VALUES

RGB_Nx3_MATRIX = []     #Nx3 MATRIX i.e. ['R','G','B'] N times

for line in file_object:
    SPACE_split_LIST = line.split()

    for pixel in SPACE_split_LIST:
        RGB = re.findall(r'\,?(\d+)\,?',pixel)
        RED_VECTOR += [RGB[0]]
        GREEN_VECTOR += [RGB[1]]
        BLUE_VECTOR += [RGB[2]]

        RGB_Nx3_MATRIX += [RGB]




#RESULTS

#print RED_VECTOR
#print GREEN_VECTOR
#print BLUE_VECTOR

#print "------------------"

#print RGB_Nx3_MATRIX

What am I looking for?

I need a better, more efficient way to do this. I want to avoid using two for-loops.

You can avoid using a regex:

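f = open('Image Data.txt', 'r')

R = []
G = []
B = []
for line in f:
    # split() with no argument splits on whitespace and drops the newline
    for color_set in line.split():
        r, g, b = color_set.split(',')
        R.append(r)
        G.append(g)
        B.append(b)

f.close()
print(B)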

Output:

['255', '250', '254', '250', '255', '253', '255', '255', '251', '247', '251', '250', '195', '191', '195', '195', '255', '253', '254', '255', '255', '254', '239', '240', '238', '254', '255', '255']

If you're mainly interested in the matrix, you can almost do this in one line:

with open('Image Data.txt','r') as file_h:
    rgb_matrix = [triple.split(',') for line in file_h for triple in line.strip().split()]

which should be fairly efficient. You could also extend this by another loop to convert them to integers.

with open('Image Data.txt','r') as file_h:
    rgb_matrix = [[int(num) for num in triple.split(',')] for line in file_h for triple in line.strip().split()]

If you really need individual colors, you can easily get them as:

red = [row[0] for row in rgb_matrix]
green = [row[1] for row in rgb_matrix]
blue = [row[2] for row in rgb_matrix]
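As an alternative to the three comprehensions above, the matrix can be transposed in a single step with zip. This is only a sketch: the small rgb_matrix below is made-up sample data standing in for the one built from the file, and the unpacking assumes the matrix is non-empty and every row has exactly three items.

```python
# Made-up rows in the same shape as rgb_matrix above (string values).
rgb_matrix = [['255', '255', '255'], ['250', '251', '252'], ['195', '191', '190']]

# zip(*rows) groups the first, second and third item of every row,
# producing one tuple per channel; list() converts each tuple to a list.
red, green, blue = (list(channel) for channel in zip(*rgb_matrix))

print(red)    # ['255', '250', '195']
print(green)  # ['255', '251', '191']
print(blue)   # ['255', '252', '190']
```

With an empty matrix the three-way unpacking raises ValueError, so the comprehension version is the safer choice if the file may be empty.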

Why would you want to avoid using two for loops? For loops are not inherently inefficient. However, making a comparatively expensive call such as re.findall for every pixel can become very inefficient.
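One way to check this claim on your own machine is to time both approaches on a single triplet with the standard timeit module. This is a rough sketch, not a benchmark: the sample pixel string and the iteration count are arbitrary choices, and the absolute numbers will vary between machines and Python versions.

```python
import re
import timeit

pixel = '255,250,254'  # arbitrary sample triplet

def with_regex():
    # Same pattern as in the question.
    return re.findall(r'\,?(\d+)\,?', pixel)

def with_split():
    return pixel.split(',')

# Both approaches must agree before timing them makes sense.
assert with_regex() == with_split()

regex_time = timeit.timeit(with_regex, number=100000)
split_time = timeit.timeit(with_split, number=100000)
print('re.findall: %.3f s' % regex_time)
print('str.split:  %.3f s' % split_time)
```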

When dealing with large files, and especially when processing pixels, it is usually better to stick to simple string methods and arithmetic rather than costlier calls such as regex matching. What you might want to do instead is the following:

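r_vector, g_vector, b_vector = [], [], []

with open('Image Data.txt', 'r') as file_object:
    for line in file_object:
        # split() with no argument splits on any whitespace and drops the
        # trailing '\n', so the newline needs no separate handling
        for s in line.split():
            r, g, b = s.split(',')
            r_vector.append(r)
            g_vector.append(g)
            b_vector.append(b)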

EDIT: Thanks to @Ashoka Lella for pointing out that each line has multiple rgb sets.
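If the goal is to have no explicit for loop in your own code at all, one possible approach is to flatten the whole text into a single list and take every third item with slicing. This is a sketch under some assumptions: split_channels is a hypothetical helper name, the sample string stands in for the file contents, the whole file is assumed to fit in memory, and every triplet is assumed to be well formed. The iteration still happens, just inside the built-in string and list operations.

```python
def split_channels(text):
    # Replace commas with spaces so a single split() yields a flat list:
    # r, g, b, r, g, b, ...
    values = text.replace(',', ' ').split()
    # Take every third item, starting at offsets 0, 1 and 2.
    return values[0::3], values[1::3], values[2::3]

# Made-up sample in the same layout as the file (note the trailing space).
sample = "255,254,253 250,249,248 \n195,191,190 240,241,242\n"
red, green, blue = split_channels(sample)

print(red)    # ['255', '250', '195', '240']
print(green)  # ['254', '249', '191', '241']
print(blue)   # ['253', '248', '190', '242']
```

For the real file this would be called as split_channels(f.read()) inside a with open('Image Data.txt') block. If the Nx3 matrix is also needed, list(zip(red, green, blue)) rebuilds it as a list of tuples.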
