I am trying to use the following code to read the data from a txt file:
import pandas as pd
headerLines=12
data = pd.read_csv('test.txt',skiprows=headerLines,sep='\t',names=['a','b','c','d','e','f','g','h','i'])
print(data.head())
However, the output below is not what I want. The column names are shifted to the right, so one additional column of NaNs is generated. What I want is for column name 'a' to correspond to the column starting with 2000000, with an index column to the left of the first column. Could anyone explain the reason and how to fix this? Thanks a lot.
                    a            b          c          d          e          f  \
2000000    -65.949737   167.359438  -9.773884  -0.102801  -9.768339  -0.102985
31990000   -44.882304   149.629367  -9.776339  -1.058768  -9.772569  -1.056513
61980000   -43.898586  -155.579474  -9.777945  -1.976854  -9.775798  -1.969913
91970000   -55.187924  -100.870064  -9.781525  -2.895683  -9.778132  -2.877063
121960000  -46.330680   126.798745  -9.783116  -3.803569  -9.779577  -3.782513

                    g            h    i
2000000    -68.031965   -40.420658  NaN
31990000   -58.193022    93.591063  NaN
61980000   -53.468840   132.634058  NaN
91970000   -53.542601   171.131622  NaN
121960000  -53.124162  -142.028566  NaN
I was able to reproduce the behaviour you described by separating the first column with spaces instead of tabs. You may want to check whether your input has a similar issue. This can be done easily with
print(data["a"])
If this prints two columns (which are in reality a single column of type "string"), then the problem is very likely caused by a wrong delimiter. Pandas reads an input such as "1234 1234" as one text string if the numbers are not separated by the given delimiter (tab in your case).
You can resolve such problems by using the argument delim_whitespace=True instead of sep='\t'. This makes pandas treat any run of whitespace as the delimiter. (See also the pandas docs.)
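As a minimal sketch of the delimiter problem described above, the snippet below reads a small in-memory sample instead of test.txt. The sample data and the three column names are made up for illustration. It uses sep=r'\s+', which splits on any run of whitespace like delim_whitespace=True does; recent pandas releases deprecate delim_whitespace in favour of this spelling.

```python
import io
import pandas as pd

# Hypothetical sample: the first column is followed by spaces, the rest by tabs.
sample = "2000000   -65.9\t167.3\n31990000  -44.8\t149.6\n"
names = ['a', 'b', 'c']

# With sep='\t' the first two values stay together in one text field,
# so the last named column has no data and is filled with NaN.
tab_only = pd.read_csv(io.StringIO(sample), sep='\t', names=names)

# Splitting on any run of whitespace separates all three values.
any_ws = pd.read_csv(io.StringIO(sample), sep=r'\s+', names=names)

print(tab_only)
print(any_ws)
```

If the first frame shows a text value such as "2000000   -65.9" in column 'a', the file most likely mixes spaces and tabs.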
I now realize that, in your example, the rows after the line break start again with the values of the first column. This indicates that the first column is somehow being interpreted as the index. Therefore, I do not believe that my answer will help you. I am keeping it here in case someone has the issue I described and reads your question.
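One possible cause of the first column becoming the index, offered here as an assumption since the input file is not shown, is a trailing tab at the end of every data line. Each row then has one more field than there are names, and pandas uses the leading column as the index. The sketch below reproduces this with made-up in-memory data and shows index_col=False, a documented read_csv option for files with a delimiter at the end of each line.

```python
import io
import pandas as pd

# Hypothetical sample: every line ends with a tab, so there are
# three values but four tab-separated fields per row.
sample = "2000000\t-65.9\t167.3\t\n31990000\t-44.8\t149.6\t\n"
names = ['a', 'b', 'c']

# Default behaviour: one more field than names, so the first field
# becomes the index and the names shift one column to the right.
shifted = pd.read_csv(io.StringIO(sample), sep='\t', names=names)

# index_col=False stops pandas from using the first column as the index;
# the empty trailing field is dropped instead.
fixed = pd.read_csv(io.StringIO(sample), sep='\t', names=names,
                    index_col=False)

print(shifted)
print(fixed)
```

If test.txt does have trailing tabs, adding index_col=False to the original read_csv call should give a default integer index with 'a' over the column starting with 2000000.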