Efficient way to split R, G and B values from a file containing RGB values (Without NumPy)

I have a file that contains RGB values, like this:

Sample Image Data.txt file

Each row contains triplets (like 255,255,255) separated by spaces.
Each triplet has three comma-separated integers. These integers correspond to the R ('RED'), G ('GREEN') and B ('BLUE') values. All integers are less than 256.

255,255,255 250,250,250 254,254,254 250,250,250 
255,255,255 253,253,253 255,255,255 255,255,255 
251,251,251 247,247,247 251,251,251 250,250,250
195,195,195 191,191,191 195,195,195 195,195,195
255,255,255 253,253,253 254,254,254 255,255,255 
255,255,255 254,254,254 239,239,239 240,240,240
238,238,238 254,254,254 255,255,255 255,255,255

The processed output should look like:
RED = ['255','250','254','250','255','253','255',............,'254','255','255']
GREEN = ['255','250','254','250','255','253','255',............,'254','255','255']
BLUE = ['255','250','254','250','255','253','255',............,'254','255','255']
RGB_Nx3_MATRIX = [['255','255','255'],['250','250','250'],['254','254','254'].....['255','255','255']]

My code works fine.

import re

file_object = open('Image Data.txt','r') 

RED_VECTOR = []         #SEQUENTIALLY STORES ALL 'R' VALUES
GREEN_VECTOR = []       #SEQUENTIALLY STORES ALL 'G' VALUES
BLUE_VECTOR = []        #SEQUENTIALLY STORES ALL 'B' VALUES

RGB_Nx3_MATRIX = []     #Nx3 MATRIX i.e. ['R','G','B'] N times

for line in file_object:
    SPACE_split_LIST = line.split()

    for pixel in SPACE_split_LIST:
        RGB = re.findall(r'\,?(\d+)\,?',pixel)
        RED_VECTOR += [RGB[0]]
        GREEN_VECTOR += [RGB[1]]
        BLUE_VECTOR += [RGB[2]]

        RGB_Nx3_MATRIX += [RGB]




#RESULTS

#print RED_VECTOR
#print GREEN_VECTOR
#print BLUE_VECTOR

#print "------------------"

#print RGB_Nx3_MATRIX

What am I looking for?

I need a better and more efficient way to do this. I want to avoid the use of two for-loops.

You can avoid using a regex:

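R = []
G = []
B = []

# 'with' closes the file automatically once the loop is done
with open('Image Data.txt', 'r') as f:
    for line in f:
        # split() with no argument splits on whitespace and drops the newline
        for color_set in line.split():
            r, g, b = color_set.split(',')
            R.append(r)
            G.append(g)
            B.append(b)

print(B)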

Output:

['255', '250', '254', '250', '255', '253', '255', '255', '251', '247', '251', '250', '195', '191', '195', '195', '255', '253', '254', '255', '255', '254', '239', '240', '238', '254', '255', '255']
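
Since str.split() with no arguments splits on any run of whitespace, newlines included, the approach above can be collapsed into a single loop by reading the whole file at once. The following is only a minimal sketch, assuming the file fits comfortably in memory; the helper name split_channels and the inline sample text are illustrative and not part of the original code:

```python
def split_channels(text):
    """Split whitespace-separated 'r,g,b' triplets into three lists of strings."""
    reds, greens, blues = [], [], []
    # split() without arguments treats spaces and newlines alike,
    # so one loop visits every triplet in the text
    for triplet in text.split():
        r, g, b = triplet.split(',')
        reds.append(r)
        greens.append(g)
        blues.append(b)
    return reds, greens, blues


# Stand-in for the result of open('Image Data.txt').read()
sample = "255,255,255 250,250,250 \n251,251,251 247,247,247\n"
R, G, B = split_channels(sample)
print(R)  # ['255', '250', '251', '247']
```

For very large files the line-by-line version is probably the safer choice, because it never holds the whole text in memory at once.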

If you're mainly interested in the matrix, you can almost do this in one line:

with open('Image Data.txt','r') as file_h:
    rgb_matrix = [triple.split(',') for line in file_h for triple in line.strip().split()]

This should be fairly efficient. You could also extend it with another loop to convert the values to integers:

with open('Image Data.txt','r') as file_h:
    rgb_matrix = [[int(num) for num in triple.split(',')] for line in file_h for triple in line.strip().split()]

If you really need individual colors, you can easily get them as:

red = [row[0] for row in rgb_matrix]
green = [row[1] for row in rgb_matrix]
blue = [row[2] for row in rgb_matrix]
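
As a possible alternative to three separate comprehensions, the built-in zip can transpose the matrix in one step. This is a sketch rather than part of the original answer; the small rgb_matrix below is made-up sample data, and the unpacking assumes the matrix is not empty:

```python
# Made-up sample data standing in for the matrix built from the file
rgb_matrix = [[255, 254, 253], [250, 249, 248], [195, 191, 190]]

# zip(*rows) groups the first items, then the second, then the third,
# which transposes the N x 3 matrix into three length-N tuples
red, green, blue = (list(channel) for channel in zip(*rgb_matrix))

print(red)    # [255, 250, 195]
print(green)  # [254, 249, 191]
print(blue)   # [253, 248, 190]
```

With an empty rgb_matrix, zip yields nothing and the three-way unpacking raises ValueError, so that case would need its own check.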

Why would you want to avoid using two for loops? For loops are not inherently inefficient. However, making a function call such as re.findall for every pixel can become very inefficient.

Especially when dealing with large files or processing pixels, it is better to stick to simple functions and arithmetic than to rely on costly function calls. What you might want to do instead is the following:

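r_vector, g_vector, b_vector = [], [], []

with open('Image Data.txt', 'r') as file:
    for line in file:
        # Every line ends with a '\n' newline char; split() with no
        # argument splits on any whitespace, so it never reaches b
        for s in line.split():
            r, g, b = s.split(',')
            r_vector.append(r)
            g_vector.append(g)
            b_vector.append(b)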

EDIT: Thanks to @Ashoka Lella for pointing out that each line has multiple rgb sets.
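
One way to check the claim about re.findall versus plain string methods on your own machine is a small timeit comparison. This is only a sketch: the synthetic 1000-pixel line and the function names with_regex and with_split are assumptions made for illustration, and the measured times will vary between systems and Python versions.

```python
import re
import timeit

# Synthetic row of 1000 pixels in the same 'r,g,b r,g,b ...' layout
line = ' '.join('%d,%d,%d' % (i % 256, i % 256, i % 256) for i in range(1000))


def with_regex():
    # The regex from the question, applied once per pixel
    return [re.findall(r'\,?(\d+)\,?', pixel) for pixel in line.split()]


def with_split():
    # Plain str.split on the comma, applied once per pixel
    return [pixel.split(',') for pixel in line.split()]


# Both approaches should build the same N x 3 matrix of strings
assert with_regex() == with_split()

regex_seconds = timeit.timeit(with_regex, number=100)
split_seconds = timeit.timeit(with_split, number=100)
print('re.findall: %.4f s' % regex_seconds)
print('str.split:  %.4f s' % split_seconds)
```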
