Removing spaces from a string within a CSV in Python

Question

I have a CSV file that is output by a program. The delimiter is a space. One "cell" in each row is typed in manually by a user; the rest is generated automatically. The problem is that the user's text may itself contain spaces, so when I import the file into Excel the columns no longer line up. I'm trying to write a Python program that replaces the spaces inside the user input with underscores.

So I want to go from this

 600 2 light rain event 2015-01-12 17:48:07

to this

 600 2 gmk_light_rain_event 2015-01-12 17:48:07

Is there any way to code this in Python?

Answer 1

Use the replace method of the str class:

"light rain event".replace(' ', '_')
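As a small illustration (the variable names here are made up for the example), note that replace returns a new string and leaves the original alone. It also shows why the call should only be applied to the user-entered field: run on the whole line, it would glue every column together.

```python
# Hypothetical names; only str.replace itself comes from the answer above.
user_text = "light rain event"
cleaned = user_text.replace(" ", "_")

print(cleaned)    # light_rain_event
print(user_text)  # light rain event (strings are immutable, so this is unchanged)

# Applied to the whole line, the delimiters are replaced as well,
# which is not what the question wants.
line = "600 2 light rain event 2015-01-12 17:48:07"
whole_line = line.replace(" ", "_")
print(whole_line)  # 600_2_light_rain_event_2015-01-12_17:48:07
```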

Answer 2

It would be better to replace the spaces closer to when the data is entered. But if the data has already been collected, you need a rule to identify that field among the others.

>>> s = "600 2 light rain event 2015-01-12 17:48:07"
>>> parts = s.split(" ")

Rule: leave the first two and the last two fields alone, and replace the " " with "_" in the remainder.

>>> parts[:2] + ["_".join(parts[2:-2])] + parts[-2:]
['600', '2', 'light_rain_event', '2015-01-12', '17:48:07']

Join the parts of the resulting list:

>>> " ".join(parts[:2] + ["_".join(parts[2:-2])] + parts[-2:])
'600 2 light_rain_event 2015-01-12 17:48:07'

And you can add the "gmk" tag like this

>>> " ".join(parts[:2] + ["gmk_"+"_".join(parts[2:-2])] + parts[-2:])
'600 2 gmk_light_rain_event 2015-01-12 17:48:07'
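To run the same rule over every row of a file, it could be wrapped up roughly like this. This is only a sketch: the function names, the head/tail parameters and the file paths are assumptions for the example, and it assumes every row has exactly two generated fields before the user text and two after it, as in the sample line.

```python
def fix_line(line, prefix="gmk_", head=2, tail=2):
    """Join the fields between the first `head` and last `tail` with underscores."""
    parts = line.split()
    # Nothing sits between the fixed fields, so leave the line as it is.
    if len(parts) <= head + tail:
        return line
    end = len(parts) - tail
    middle = prefix + "_".join(parts[head:end])
    return " ".join(parts[:head] + [middle] + parts[end:])


def fix_file(src_path, dst_path):
    """Rewrite src_path into dst_path one line at a time (paths are hypothetical)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for raw in src:
            dst.write(fix_line(raw.rstrip("\n")) + "\n")


print(fix_line("600 2 light rain event 2015-01-12 17:48:07"))
# 600 2 gmk_light_rain_event 2015-01-12 17:48:07
```

Because split() is called without an argument here, runs of several spaces and any leading space on the line are collapsed, which may or may not be what you want for your data.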

Answer 3

You can use a regex:

>>> import re
>>> s="light rain event"
>>> re.sub(r'\s+', '_', s)
'light_rain_event'
>>> 'gmk_'+re.sub(r'\s+', '_', s)
'gmk_light_rain_event'
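The snippet above works on the user text once it has been isolated. If you only have the full line, one possible way to isolate it with a regex is sketched below. The pattern and the names LINE_RE and fix_line_re are my own assumptions, and the pattern assumes two space-free fields at the start of the line and two at the end.

```python
import re

# Two space-free fields, then the free text, then two space-free fields.
LINE_RE = re.compile(r"^\s*(\S+)\s+(\S+)\s+(.+?)\s+(\S+)\s+(\S+)\s*$")


def fix_line_re(line, prefix="gmk_"):
    m = LINE_RE.match(line)
    if m is None:
        # Fewer than five fields: nothing to fix, return the line untouched.
        return line
    first, second, text, date, clock = m.groups()
    # Collapse each run of whitespace inside the user text into one underscore.
    text = prefix + re.sub(r"\s+", "_", text)
    return " ".join([first, second, text, date, clock])


print(fix_line_re("600 2 light rain event 2015-01-12 17:48:07"))
# 600 2 gmk_light_rain_event 2015-01-12 17:48:07
```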

Answer 4

You need to split the line based on the number of fields before and after the user input, since I'm guessing the input itself can contain any number of spaces.

#Line read from CSV
line = "600 2 light rain event 2015-01-12 17:48:07"

#Just in case any parts need changing
spaceBetweenWords = "_"
prefix = "gmk"

#Split by spaces
separatedLine = line.split( " " )

#Get the middle part that needs underscores
startBit = " ".join( separatedLine[:2] )
middleBit = spaceBetweenWords.join( [prefix] + separatedLine[2:-2] )
endBit = " ".join( separatedLine[-2:] )


print("{0} {1} {2}".format( startBit, middleBit, endBit ))
# Result: 600 2 gmk_light_rain_event 2015-01-12 17:48:07

I added a bit where you can easily change the underscore and 'gmk' if needed, although looking up I can see John pretty much did it the same way :)
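A variant of the same idea, in case it is useful: rather than slicing the split list, you can cap the number of splits from each side with the maxsplit argument of str.split and str.rsplit. This is just a sketch; the function name is made up, and it assumes a single space between the generated fields.

```python
def fix_line_maxsplit(line, prefix="gmk", joiner="_"):
    # Peel the first two fields off the front: [field1, field2, rest]
    head = line.strip().split(" ", 2)
    if len(head) < 3:
        return line
    # Peel the last two fields off the back: [user text, date, time]
    tail = head[2].rsplit(" ", 2)
    if len(tail) < 3:
        return line
    # Whatever is left in the middle is the user text.
    text = joiner.join([prefix] + tail[0].split())
    return " ".join(head[:2] + [text] + tail[1:])


print(fix_line_maxsplit("600 2 light rain event 2015-01-12 17:48:07"))
# 600 2 gmk_light_rain_event 2015-01-12 17:48:07
```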


solution1
3 2015-02-13 20:34:28

solution2
2 ACCEPTED 2015-02-13 21:59:16

solution3
0 2015-02-13 22:07:55

solution4
0 2015-02-13 22:44:04
