Removing integer values from an alphanumeric column in Python

I am new to Python and struggling with one trivial task. I have an alphanumeric column called region. It contains both entries beginning with /, such as /health/blood pressure, and integer values. A few typical observations look like this:

/health/blood pressure
/health/diabetes
7867
/fitness
9087
/health/type1 diabetes

Now I want to remove all the rows with integer values. After importing the data set into the Python shell, region is shown as object. I intended to solve this with some sort of regular expression, so I did the following:

from pandas import Series

pattern = '/'
data.region = Series(data.region)
matches = data.region.str.match(pattern)  # True where the entry starts with '/'
matches

This returns a boolean object telling me, for each entry, whether it matches the pattern or not. So I get something like this:

0     True
1    False
2     True
3     True
...
and so on.

Now I am stuck on how to remove the rows for which the matches boolean object is False. An if statement is not working. If anyone can offer some assistance, that would be great!

Thanks!!

It seems like you are using pandas, which I don't know well, so I am not completely sure this will work:

You can try:

import re
# each i is a plain value with no .str accessor, so use re.match on its string form
matches = [i for i in data.region if re.match(pattern, str(i))]

In Python this is called a list comprehension: it goes through every entry in data.region, checks it against your pattern, and puts the entry in the list if the pattern matches (that is, if the expression after 'if' is true).
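
As a minimal, self-contained sketch of that list comprehension: the sample values below are copied from the question, and the name region_values is only an illustrative stand-in for data.region. It assumes the integers may be stored as real integers rather than strings, hence the str(i) conversion before matching.

```python
import re

# Sample values copied from the question; region_values is a hypothetical
# stand-in for data.region.
region_values = [
    "/health/blood pressure",
    "/health/diabetes",
    7867,
    "/fitness",
    9087,
    "/health/type1 diabetes",
]

pattern = "/"

# re.match anchors at the start of the string, so only entries beginning
# with "/" are kept; str(i) lets the check work on real integers too.
matches = [i for i in region_values if re.match(pattern, str(i))]
print(matches)
```

The integer entries are dropped because their string form starts with a digit, not with "/".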

See: https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions

If you want to map those matches for every region, you can try to create a dictionary that maps the regions to the lists with the following dict comprehension:

import re
matches = {region: [i for i in data.region if re.match(pattern, str(i))] for region in data}

See: https://docs.python.org/2/tutorial/datastructures.html#dictionaries

However, you are definitely leaving the realm of pandas here. This could fail if region is not an integer/string but a list itself (as I said, I don't know pandas well enough to judge).

In that case you could try:

import re
matches = {}
for region in list_of_regions:
    matches[region] = [i for i in data.region if re.match(pattern, str(i))]

which is basically the same, just with a given list of regions and the dict comprehension written out explicitly as a for loop.
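
If you would rather stay inside pandas, the boolean result from the question can be used directly to select rows. The following is only a sketch under assumptions: the DataFrame is rebuilt here from the sample values in the question, and the names mask and filtered are made up for illustration. It uses astype(str) so that real integers and numeric strings are treated alike.

```python
import pandas as pd

# Hypothetical reconstruction of the data from the question.
data = pd.DataFrame({
    "region": [
        "/health/blood pressure",
        "/health/diabetes",
        7867,
        "/fitness",
        9087,
        "/health/type1 diabetes",
    ]
})

# Boolean Series: True where the entry, viewed as text, starts with "/".
mask = data["region"].astype(str).str.startswith("/")

# Indexing with the mask keeps only the rows marked True;
# reset_index(drop=True) renumbers the remaining rows from 0.
filtered = data[mask].reset_index(drop=True)
print(filtered)
```

Indexing a DataFrame with a boolean Series is what removes the rows tagged False, so no if statement or explicit loop is needed.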
