Suppose a question is asked in Stackoverflow and example data is given, what is the best way to import the example data
In order to explain this question which I am asking, let me take to a real StackOverflow post.
This post has given the following data to solve the query.
Access Type host/IP address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10
Now what will be the best way to import the data into Python
The methodology should be in such a way that the import is done into the python environment in a single step.
I am aware that, I can copy-paste the data into a spreadsheet -> then save it as CSV file and then import it into Python. But I need a more elegant way.
Other solution as suggested by one developer,
import pandas as pd ;
df = pd.read_clipboard()
pd.read_clipboard(), is not a universal solution across all operating system. My system (Ubuntu 19.10), I did a clipboard copying and then the result came like this. The column names were wrongly imported.
This post deals with R solution, but I am interested in a solution for Python
Edited answer: this is slightly more work, but it's another viable option in my opinion.
from io import StringIO
import pandas as pd
s = StringIO(""" Access Type host/IP address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10""")
df = pd.read_fwf(s,index_col=0,infer_nrows=10)
Result:
In [175]: df
Out[175]:
Access Type host/IP address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10
As others have pointed out, The best solution seems to be pd.read_clipboard()
In order not to get messed up columns, just replace the spaces in the column names with _
and then to get the actual columns simply do:
df.columns = [name.replace('_',' ') for name in df.columns.values]
So copy the input when you replaced spaces with _
: input:
Access_Type host/IP_address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10
run the code:
import pandas as pd
df = pd.read_clipboard()
df.columns = [name.replace('_',' ') for name in df.columns.values]
and you'll get:
Output:
Access Type host/IP address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10
Try to use pandas function read_clipboard()
This method is as simple as you want: it reads copy-pasted tabular data and parses it into a Data Frame.
By the documentation this method read text from clipboard and pass to read_csv.
You can find more information in this blog post
EDIT: I think that your problem didn't cause by your OS, it could be because of the white spaces in the first line of the data.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.