I have a script that reads input from a large log file containing encoded URLs. I read these URLs from standard input and want to process each URL separately.
The problem is that when I get a single URL, it is split into its individual characters. I call ''.join(something), but after processing I still get individual characters.
For example:
import sys
for line in sys.stdin:
    line = line.strip()
    line1 = ''.join(line)
I also tried collecting all the characters in the URL and then joining them, with the same result.
Sample output:
Input from file: " www.cnn.com"; output after sys.stdin and processing: ['w','w','w','.','c','n','n','.','c','o','m']
The list appears because I build it myself; otherwise I get www.cnn.com from sys.stdin. But the underlying structure is the same as in the output above.
What I want is: input from file " www.cnn.com", output "www.cnn.com" (a single string, not a list of individual characters).
Thanks
I think your stdin input might be garbled. Consider this script:
# stdin.py
import sys
for line in sys.stdin:
    # Each line is already a single string; strip() only removes surrounding whitespace.
    print(line.strip())
Then piping input into it works as expected:
$ echo -e "www.cnn.com\nwww.test.com" | python stdin.py
www.cnn.com
www.test.com
If you call list() on a string, it splits it up by character:
>>> list("test")
['t', 'e', 's', 't']
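To make the point concrete, here is a small sketch (the sample value is taken from the question; the variable names are only illustrative). It shows that a stripped line is already one string, that ''.join() on a string simply rebuilds the same string, and that the per-character list only appears once list() is applied:

```python
# A line as it might arrive from sys.stdin (sample value from the question).
line = " www.cnn.com\n".strip()

# ''.join() iterates over the characters of the string and glues them
# back together, so the result is equal to the original string.
joined = "".join(line)

# list() is what produces the per-character view.
chars = list(line)

print(repr(line))      # the stripped line is a single str
print(joined == line)  # joining a string changes nothing
print(len(chars))      # one list element per character
```

So if the output looks like a list of characters, the likely cause is a list() call (or a loop over the string) somewhere in the processing step, not sys.stdin itself.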
I'm guessing what you probably want is to read the entire input and split it into separate entries. Note that split() with no arguments splits on any whitespace, which includes newlines:
import sys
lines = sys.stdin.read().split()
print(lines)
Running it, I get:
$ echo -e "www.cnn.com\nwww.test.com" | python stdin.py
['www.cnn.com', 'www.test.com']
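Since the question mentions encoded URLs that should be processed one at a time, the per-line approach might be sketched as follows. This assumes Python 3 and that "encoded" means percent-encoded; decode_urls is a hypothetical helper name, and io.StringIO stands in for sys.stdin so the example is self-contained:

```python
import io
from urllib.parse import unquote  # Python 3 location of unquote


def decode_urls(stream):
    """Yield one decoded URL per non-empty line of `stream` (hypothetical helper)."""
    for line in stream:
        url = line.strip()  # still a single string, not a list of characters
        if url:
            # Assumption: the URLs are percent-encoded.
            yield unquote(url)


# Stand-in for sys.stdin; the second URL is made-up sample data.
sample = io.StringIO(" www.cnn.com\nwww.test.com/a%20b\n\n")
urls = list(decode_urls(sample))
print(urls)  # ['www.cnn.com', 'www.test.com/a b']
```

In a real script you would pass sys.stdin instead of the StringIO object. Each item yielded is one whole URL string, so any further processing works on the full URL rather than on individual characters.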