The raw.txt file looks like this:
e1 47 3 Self-emp-inc Married-civ-spouse Transport-moving White Male Cuba
e2 52 16 Self-emp-not-inc Married-civ-spouse Prof-specialty White Male United-States
e3 26 9 Private Divorced Craft-repair White Male United-States
e4 60 9 Private Married-civ-spouse Craft-repair White Male United-States
I have tried:
adult = pd.read_csv("Adult/dataset_full.txt", header=None)
This gives only ONE column. If I use sep=' ', I get:
<Error tokenizing data. C error: Expected 187 fields in line 3, saw 197>
I have also tried skiprows=, read_fwf(), and read_table(); they all give similar results.
Does anyone have any insights on how to separate this file into columns?
If your file.txt is this:
e1 47 3 Self-emp-inc Married-civ-spouse Transport-moving White Male Cuba
e2 52 16 Self-emp-not-inc Married-civ-spouse Prof-specialty White Male United-States
e3 26 9 Private Divorced Craft-repair White Male United-States
e4 60 9 Private Married-civ-spouse Craft-repair White Male United-States
Then you have four rows with 9 values each, separated by spaces. So you can read the lines into a pandas DataFrame, define headers for the columns, and save the result as a .csv file.
For example:
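import pandas as pd

# Split each line on whitespace to get one list of values per row
with open("file.txt") as f:
    df = pd.DataFrame([line.strip().split() for line in f])

# Name the nine columns Col1..Col9 and write the table out as CSV
headers = [f"Col{i}" for i in range(1, 10)]
df.to_csv("your_table.csv", index=False, header=headers)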
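Output (contents of your_table.csv):
Col1,Col2,Col3,Col4,Col5,Col6,Col7,Col8,Col9
e1,47,3,Self-emp-inc,Married-civ-spouse,Transport-moving,White,Male,Cuba
e2,52,16,Self-emp-not-inc,Married-civ-spouse,Prof-specialty,White,Male,United-States
e3,26,9,Private,Divorced,Craft-repair,White,Male,United-States
e4,60,9,Private,Married-civ-spouse,Craft-repair,White,Male,United-States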
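As an alternative to splitting the lines by hand, pandas can probably parse this layout directly if you give it a whitespace regex as the separator. The sketch below is not from the original answer. It assumes every line has the same number of fields and that no value contains a space. The four sample rows are read from an in-memory string so the snippet runs on its own, and the Col1..Col9 names are placeholders.

```python
import io

import pandas as pd

# Sample rows copied from the question; with a real file, pass its path
# to read_csv instead of the StringIO object.
sample = """\
e1 47 3 Self-emp-inc Married-civ-spouse Transport-moving White Male Cuba
e2 52 16 Self-emp-not-inc Married-civ-spouse Prof-specialty White Male United-States
e3 26 9 Private Divorced Craft-repair White Male United-States
e4 60 9 Private Married-civ-spouse Craft-repair White Male United-States
"""

# Placeholder column names (an assumption, not part of the data).
headers = [f"Col{i}" for i in range(1, 10)]

# sep=r"\s+" treats any run of spaces or tabs as a single delimiter,
# so repeated spaces do not create empty columns.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, names=headers)

print(df.shape)
```

The tokenizing error quoted in the question suggests that some lines in the full file may have different numbers of fields. If so, this call may still fail or misalign columns, and it is worth counting the fields per line first.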