[英]Condensing repeat code with a “for” statement using strings - Python
I am very new with "for" statements in Python, and I can't get something that I think should be simple to work. 我对Python中的“ for”语句非常陌生,我无法获得一些我认为应该很简单的东西。 My code that I have is: 我的代码是:
import pandas as pd
df1 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
df2 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
df3 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
DF1 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
DF2 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
DF3 = pd.DataFrame({'Column1' : pd.Series([1,2,3,4,5,6])})
Then: 然后:
A1 = len(df1.loc[df1['Column1'] <= DF1['Column1'].iloc[2]])
Z1 = len(df1.loc[df1['Column1'] >= DF1['Column1'].iloc[3]])
A2 = len(df2.loc[df2['Column1'] <= DF2['Column1'].iloc[2]])
Z2 = len(df2.loc[df2['Column1'] >= DF2['Column1'].iloc[3]])
A3 = len(df3.loc[df3['Column1'] <= DF3['Column1'].iloc[2]])
Z3 = len(df3.loc[df3['Column1'] >= DF3['Column1'].iloc[3]])
As you can see, it is a lot of repeat code with just the identifying numbers being different. 如您所见,这是很多重复的代码,只是标识号不同。 So my first attempt at a "for" statement was: 因此,我第一次尝试“ for”语句是:
Numbers = [1,2,3]
for i in Numbers:
"A" + str(i) = len("df" + str(i).loc["df" + str(i)['Column1'] <= "DF" + str(i)['Column1'].iloc[2]])
"Z" + str(i) = len("df" + str(i).loc["df" + str(i)['Column1'] >= "DF" + str(i)['Column1'].iloc[3]])
This yielded the SyntaxError: "can't assign to operator". 这产生了SyntaxError:“无法分配给运算符”。 So I tried: 所以我尝试了:
Numbers = [1,2,3]
for i in Numbers:
A = "A" + str(i)
Z = "Z" + str(i)
A = len("df" + str(i).loc["df" + str(i)['Column1'] <= "DF" + str(i)['Column1'].iloc[2]])
Z = len("df" + str(i).loc["df" + str(i)['Column1'] >= "DF" + str(i)['Column1'].iloc[3]])
This yielded the AttributeError: 'str' object has no attribute 'loc'. 这产生了AttributeError:'str'对象没有属性'loc'。 I tried a few other things like: 我尝试了其他一些事情,例如:
Numbers = [1,2,3]
for i in Numbers:
A = "A" + str(i)
Z = "Z" + str(i)
df = "df" + str(i)
DF = "DF" + str(i)
A = len(df.loc[df['Column1'] <= DF['Column1'].iloc[2]])
Z = len(df.loc[df['Column1'] <= DF['Column1'].iloc[3]])
But that just gives me the same errors. 但这给了我同样的错误。 Ultimately what I would want is something like: 最终我想要的是这样的:
Numbers = [1,2,3]
for i in Numbers:
Ai = len(dfi.loc[dfi['Column1'] <= DFi['Column1'].iloc[2]])
Zi = len(dfi.loc[dfi['Column1'] <= DFi['Column1'].iloc[3]])
Where the output would be equivalent if I typed: 如果输入,输出将是等效的:
A1 = len(df1.loc[df1['Column1'] <= DF1['Column1'].iloc[2]])
Z1 = len(df1.loc[df1['Column1'] >= DF1['Column1'].iloc[3]])
A2 = len(df2.loc[df1['Column1'] <= DF2['Column1'].iloc[2]])
Z2 = len(df2.loc[df1['Column1'] >= DF2['Column1'].iloc[3]])
A3 = len(df3.loc[df3['Column1'] <= DF3['Column1'].iloc[2]])
Z3 = len(df3.loc[df3['Column1'] >= DF3['Column1'].iloc[3]])
It is "restricted" to generate variables in for loop (you can do that, but it's better to avoid. See other posts: post_1 , post_2 ). 在for循环中生成变量是“受限制的”(您可以这样做,但是最好避免。请参见其他文章: post_1 , post_2 )。
Instead use this code to achieve your goal without generating as many variables as your needs (actually generate only the values in the for loop): 而是使用此代码来实现您的目标,而无需生成所需数量的变量(实际上仅生成for循环中的值):
# Lists of your dataframes
Hanimals = [H26, H45, H46, H47, H51, H58, H64, H65]
Ianimals = [I26, I45, I46, I47, I51, I58, I64, I65]
# Generate your series using for loops iterating through your lists above
BPM = pd.DataFrame({'BPM_Base':pd.Series([i_a for i_a in [len(i_h.loc[i_h['EKG-evt'] <=\
i_i[0].iloc[0]]) / 10 for i_h, i_i in zip(Hanimals, Ianimals)]]),
'BPM_Test':pd.Series([i_z for i_z in [len(i_h.loc[i_h['EKG-evt'] >=\
i_i[0].iloc[-1]]) / 30 for i_h, i_i in zip(Hanimals, Ianimals)]])})
UPDATE UPDATE
A more efficient way (iterate over "animals" lists only once): 一种更有效的方法(仅对“动物”列表重复一次):
# Lists of your dataframes
Hanimals = [H26, H45, H46, H47, H51, H58, H64, H65]
Ianimals = [I26, I45, I46, I47, I51, I58, I64, I65]
# You don't need using pd.Series(),
# just create a list of tuples: [(A26, Z26), (A45, Z45)...] and iterate over it
BPM = pd.DataFrame({'BPM_Base':i[0], 'BPM_Test':i[1]} for i in \
[(len(i_h.loc[i_h['EKG-evt'] <= i_i[0].iloc[0]]) / 10,
len(i_h.loc[i_h['EKG-evt'] >= i_i[0].iloc[-1]]) / 30) \
for i_h, i_i in zip(Hanimals, Ianimals)])
Figured out a better way to do this that fits my needs. 想出一种更好的方式来满足我的需求。 This is mainly so that I will be able to find my method. 这主要是为了使我能够找到我的方法。
# Change/Add animals and conditions here, make sure they match up directly
Animal = ['26','45','46','47','51','58','64','65', '69','72','84']
Cond = ['Stomach','Intestine','Stomach','Stomach','Intestine','Intestine','Intestine','Stomach','Cut','Cut','Cut']
d = []
def CuSO4():
for i in Animal:
# load in Spike data
A = pd.read_csv('TXT/INJ/' + i + '.txt',delimiter=r"\s+", skiprows = 15, header = None, usecols = range(1))
B = pd.read_csv('TXT/EKG/' + i + '.txt', skiprows = 3)
C = pd.read_csv('TXT/ESO/' + i + '.txt', skiprows = 3)
D = pd.read_csv('TXT/TRACH/' + i + '.txt', skiprows = 3)
E = pd.read_csv('TXT/BP/' + i + '.txt', delimiter=r"\s+").rename(columns={"4 BP": "BP"})
# Count number of beats before/after injection, divide by 10/30 minutes for average BPM.
F = len(B.loc[B['EKG-evt'] <= A[0].iloc[0]])/10
G = len(B.loc[B['EKG-evt'] >= A[0].iloc[-1]])/30
# Count number of esophogeal events before/after injection
H = len(C.loc[C['Eso-evt'] <= A[0].iloc[0]])
I = len(C.loc[C['Eso-evt'] >= A[0].iloc[-1]])
# Find Trach events after injection
J = D.loc[D['Trach-evt'] >= A[0].iloc[-1]]
# Count number of breaths before/after injection, divide by 10/30 min for average breaths/min
K = len(D.loc[D['Trach-evt'] <= A[0].iloc[0]])/10
L = len(J)/30
# Use Trach events from J to find the number of EE
M = pd.DataFrame(pybursts.kleinberg(J['Trach-evt'], s=4, gamma=0.1))
N = M.last_valid_index()
# Use N and M to determine the latency, set value to MaxTime (1800s)if EE = 0
O = 1800 if N == 0 else M.iloc[1][1] - A[0].iloc[-1]
# Find BP value before/after injection, then determine the mean value
P = E.loc[E['Time'] <= A[0].iloc[0]]
Q = E.loc[E['Time'] >= A[0].iloc[-1]]
R = P["BP"].mean()
S = Q["BP"].mean()
# Combine all factors into one DF
d.append({'EE' : N, 'EE-lat' : O,
'BPM_Base' : F, 'BPM_Test' : G,
'Eso_Base' : H, 'Eso_Test' : I,
'Trach_Base' : K, 'Trach_Test' : L,
'BP_Base' : R, 'BP_Test' : S})
CuSO4()
# Create shell DF with animal numbers and their conditions.
DF = pd.DataFrame({'Animal' : pd.Series(Animal), 'Cond' : pd.Series(Cond)})
# Pull appended DF from CuSO4 and make it a pd.DF
Df = pd.DataFrame(d)
# Combine the two DF's
df = pd.concat([DF, Df], axis=1)
df
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.