
python, using columns without header and index

I have a table like this: [image]

I want only the date column and the units column (columns 1 and 5), but with the date in a different format. I used this code:

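`import pandas as pd

# Read only the first and fifth columns (the date and units columns)
customer_calls = pd.read_excel("sales.xlsx", usecols=[0, 4])

# Reformat the dates as YYYYMMDD followed by a literal "00"
customer_calls["OrderDate"] = pd.to_datetime(customer_calls["OrderDate"]).dt.strftime("%Y%m%d" + "00")
customer_calls.to_excel("sales_YYYYMMDD.xlsx")

print(customer_calls)`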

It gives me what I wanted: [image]

I need it without the header and index. But when I use header=0 or header=None, this line fails:

`customer_calls["OrderDate"] = pd.to_datetime(customer_calls["OrderDate"]).dt.strftime("%Y%m%d" + "00")`

because there is no longer a column named "OrderDate". I tried using 0 instead of the name, and all kinds of other things, but it always raises an error. How can I remove the header and index but still select the date column afterwards?

I've read dozens of examples here, and none of them solved this. Or I just cannot see it.
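
To illustrate why the column lookup breaks with header=None, here is a minimal sketch. It uses pd.read_csv on in-memory text as a stand-in for sales.xlsx so it runs without an Excel file; as far as I know, pd.read_excel treats header= the same way. The column names Region, Rep and Item and all data values are assumptions made up for the example.

```python
import io
import pandas as pd

# Hypothetical stand-in for sales.xlsx (names and values are made up)
raw = (
    "OrderDate,Region,Rep,Item,Units\n"
    "2020-01-06,East,Jones,Pencil,95\n"
    "2020-02-09,Central,Jardine,Pencil,36\n"
)

# header=None: the columns get integer labels and the header text becomes row 0
no_header = pd.read_csv(io.StringIO(raw), header=None, usecols=[0, 4])
print(list(no_header.columns))  # integer positions, not names
print(no_header.iloc[0, 0])     # the text "OrderDate" is now a data value

# Default header handling: the first row supplies the column names
with_header = pd.read_csv(io.StringIO(raw), usecols=[0, 4])
print(list(with_header.columns))
```

With header=None there is no "OrderDate" label to look up, and selecting the column by its integer label would pass the header text itself to pd.to_datetime.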

If you want to remove the headers and index, then essentially you want only the values. If so, extract the values and call the tolist() method.

Here is an example of this:

import pandas as pd

# example dataframe
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['A', 'B', 'C'])

# extract values only
data = df.values.tolist()

print(data)

Here is the result of the above:

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

The values are now just a list of lists.
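
The same idea applied to a frame shaped like the one in the question might look like the sketch below. The Units values are made up for illustration, and to_numpy() is used here instead of .values only because the pandas documentation recommends it; both return the underlying array.

```python
import pandas as pd

# Hypothetical frame resembling the question's data after the date conversion
df = pd.DataFrame({"OrderDate": ["2020010600", "2020020900"], "Units": [95, 36]})

# Drop the labels: one plain list per row, no header and no index
rows = df.to_numpy().tolist()

for order_date, units in rows:
    print(order_date, units)
```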

I've done it. Posting it for future similar questions. It can be done really easily in pandas, with just two more lines.

import pandas as pd

# Read the file, keeping only the first two columns
customer_calls = pd.read_excel("sales.xlsx", usecols=[0, 1])

# Convert the dates to YYYYMMDD followed by a literal "00", then save
customer_calls["OrderDate"] = pd.to_datetime(customer_calls["OrderDate"]).dt.strftime("%Y%m%d" + "00")
customer_calls.to_excel("sales_YYYYMMDD.xlsx")

# Use the first data row as the column labels
customer_calls.columns = customer_calls.iloc[0]
# Remove that first row from the dataframe rows
customer_calls = customer_calls[1:]
# Display
print(customer_calls)

It gives output like this:

0   2020010600     East
1   2020020900  Central
2   2020031500     West
3   2020040100     East
4   2020050500  Central
5   2020060800     East
6   2020071200     East
7   2020081500     East
8   2020090100  Central
9   2020100500  Central
10  2020110800     East
11  2020121200  Central

The date format is changed and there is no header.
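
An alternative that leaves the frame itself untouched is to keep the header while converting the dates and to drop the labels only at output time. A minimal sketch, assuming a small made-up frame in place of the one read from sales.xlsx; save_without_labels is a hypothetical helper name, while header= and index= are regular parameters of to_excel, to_csv and to_string.

```python
import pandas as pd

# Hypothetical stand-in for the two columns read from sales.xlsx
customer_calls = pd.DataFrame({
    "OrderDate": ["2020-01-06", "2020-02-09", "2020-03-15"],
    "Region": ["East", "Central", "West"],
})

# Convert while the "OrderDate" label still exists
customer_calls["OrderDate"] = pd.to_datetime(customer_calls["OrderDate"]).dt.strftime("%Y%m%d00")

# Print the values with neither header nor index
text = customer_calls.to_string(header=False, index=False)
print(text)

# The same two switches work when exporting as CSV text
csv_text = customer_calls.to_csv(header=False, index=False)


def save_without_labels(df, path):
    """Write df to an Excel file without the header row and index column."""
    df.to_excel(path, header=False, index=False)
```

Calling save_without_labels(customer_calls, "sales_YYYYMMDD.xlsx") should then write only the values, provided an Excel writer engine such as openpyxl is installed.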

