I want to remove races that make up less than 1% of the population in a county. I am using pandas. As you can see in the data below, some races have values below 1% in a county. I want to ignore those races and display only the races with larger shares.
CensusTract  State    County   TotalPop  Men   Women  Hispanic  White  Black  Native  Asian  Pacific
1001020100   Alabama  Autauga  1948      940   1008   0.9       87.4   7.7    0.3     0.6    0
1001020400   Alabama  Autauga  4423      2172  2251   10.5      82.8   3.7    1.6     0      0
I tried this
dataset = tract_data.query("Income >= 50000 & Poverty > 50")
dataset.loc[:,'Races'] = dataset.apply(lambda row: list(zip(list(row.index)[6:12], list(row)[6:12])), axis =1)
dataset.loc[:,'Races'] = dataset.Races.apply(lambda x: '; '.join(['{}: {}'.format(t[0], t[1]) for t in list(filter(lambda x: x[1]> 1, x))]))
income = dataset[['CensusTract', 'State', 'County','Races']]
print(dataset['Races'])
But I still get an error.
This is what I expect to have
CensusTract  State    County   races
1001020100   Alabama  Autauga  White: 87.4 Black: 7.7
1001020400   Alabama  Autauga  Hispanic: 10.5 White: 82.8 Black: 3.7 Native: 1.6
This is one way to achieve your goal:
df['Races'] = df.apply(lambda row: list(zip(list(row.index)[6:], list(row)[6:])), axis =1)
df['Races'] = df.Races.apply(lambda x: '; '.join(['{}: {}'.format(t[0], t[1]) for t in list(filter(lambda x: x[1]> 1, x))]))
Finally, if we print df, here is what we get:
CensusTract State County TotalPop Men Women Hispanic White Black Native Asian Pacific Races
0 1001020100 Alabama Autauga 1948 940 1008 0.9 87.4 7.7 0.3 0.6 0.0 White: 87.4; Black: 7.7
1 1001020400 Alabama Autauga 4423 2172 2251 10.5 82.8 3.7 1.6 0.0 0.0 Hispanic: 10.5; White: 82.8; Black: 3.7; Nativ...
Here is the idea. The values we want to compare are in the columns from position 6 (counting from zero) through the last one. For each row, we want to show the column name together with its value whenever the value is greater than 1. list(row.index) gives the column names for that row, and list(row) gives the values in that row as a list. We can zip these two lists to get a list of tuples [(column_name, value)].
Then we filter the list of tuples so that it contains only the tuples whose value is greater than 1. After filtering, the rest of the work is just formatting the remaining tuples into the display string we want. To see how the filtering works, just try:
x = [('col1', 8), ('col2', 10), ('col3', 0.9), ('col4', 30)]
'; '.join(['{}: {}'.format(t[0], t[1]) for t in list(filter(lambda x: x[1]> 1, x))])
The result should be:
>>> 'col1: 8; col2: 10; col4: 30'
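For reference, here is a self-contained sketch of the same approach that can be run as-is. It rebuilds the two sample rows from the question and selects the race columns by name instead of by position, so it does not depend on column order. The names RACE_COLS, format_races, and threshold are made up for this illustration and do not come from the original code.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "CensusTract": [1001020100, 1001020400],
    "State": ["Alabama", "Alabama"],
    "County": ["Autauga", "Autauga"],
    "TotalPop": [1948, 4423],
    "Men": [940, 2172],
    "Women": [1008, 2251],
    "Hispanic": [0.9, 10.5],
    "White": [87.4, 82.8],
    "Black": [7.7, 3.7],
    "Native": [0.3, 1.6],
    "Asian": [0.6, 0.0],
    "Pacific": [0.0, 0.0],
})

# Hypothetical helper names; the race columns are listed explicitly
# so the code does not rely on positions such as [6:].
RACE_COLS = ["Hispanic", "White", "Black", "Native", "Asian", "Pacific"]


def format_races(row, threshold=1.0):
    """Join 'name: value' pairs for the values strictly above threshold."""
    return "; ".join(
        f"{name}: {value}" for name, value in row.items() if value > threshold
    )


# Apply row by row over the race columns only.
df["Races"] = df[RACE_COLS].apply(format_races, axis=1)

result = df[["CensusTract", "State", "County", "Races"]]
print(result.to_string(index=False))
```

The comparison is strictly greater than the threshold, matching the x[1] > 1 test above, so a value of exactly 1 is dropped as well.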