I have a bunch of c and e Python lists that I have to compare, in which case the e list length is always greater than or equal to the c list. What I want to do is compare these two lists and, if their lengths are not equal, I want to fill in the c list "gaps" with an "NA".
For instance, if we look at these two lists:
e = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14']
c = ['2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13']
I'd want the c list to fill in an "NA" for the values it's missing (and preserve the order), like so:
c = ['NA', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', 'NA']
I'd use pandas and pd.Series.where for the mask. First convert both lists to Series (after import pandas as pd) with e = pd.Series(e) and c = pd.Series(c), then:
s = pd.concat([e,c],sort=True).drop_duplicates() #remove sort=True for versions < 0.23
s.where(s.isin(c))
outputs
0 NaN
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 11
11 12
12 13
13 NaN
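To get back a plain Python list with 'NA' in the gaps, the result can be filled and converted. The following is only a minimal sketch under two assumptions: the values in e are unique, and every value in c also appears in e. The names e_series, masked and filled are illustrative, and the concat step is skipped here because under those assumptions e already contains every value.

```python
import pandas as pd

e = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14']
c = ['2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13']

# Assumption: e has no duplicates and c only contains values found in e.
e_series = pd.Series(e)

# Keep each value of e where it also occurs in c; other positions become NaN.
masked = e_series.where(e_series.isin(c))

# Replace NaN with the 'NA' marker and convert back to a regular list.
filled = masked.fillna('NA').tolist()
print(filled)
```

Because the mask is built from membership rather than position, this keeps the order of e and does not depend on where the gaps fall.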
You can also follow a classic, pure-Python approach: use two pointers, one per list, advance them through both lists simultaneously, and compare the values. This relies on both lists being sorted in the same order.
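p1 = 0  # index into e
p2 = 0  # index into c
f = []  # result: c aligned to e, with 'NA' in the gaps
while (p1 < len(e)) and (p2 < len(c)):
    vale = e[p1]
    valc = c[p2]
    # Compare numerically: as strings, '10' < '9' would be True.
    if int(vale) < int(valc):
        f.append('NA')
        p1 += 1
    elif vale == valc:
        f.append(valc)
        p1 += 1
        p2 += 1
    else:
        p2 += 1
if p1 == len(e): f.extend(c[p2:])
if p2 == len(c): f.extend(['NA']*(len(e)-p1))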
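If c is always an in-order subsequence of e, as in the example from the question, the same idea can be written without any ordering comparison, so it does not matter whether the items are numeric strings. This is only a sketch under that assumption; fill_gaps and placeholder are hypothetical names, not part of the original answers.

```python
def fill_gaps(e, c, placeholder='NA'):
    """Return a list as long as e, with placeholder where c has no matching item.

    Assumption: c is an in-order subsequence of e.
    """
    result = []
    i = 0  # position of the next unmatched item in c
    for value in e:
        if i < len(c) and c[i] == value:
            # Same item at the current position of c: keep it and move on in c.
            result.append(value)
            i += 1
        else:
            # c has nothing for this position of e: record a gap.
            result.append(placeholder)
    return result


e = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14']
c = ['2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13']
print(fill_gaps(e, c))
```

Since only equality is tested, this version makes a single pass over e and needs no conversion to int.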