
All NaN values when filtering pandas DataFrame to one column

I'm importing data from a .csv file, which is stored in one DataFrame. It looks fine there: [enter image description here]

After which I try to store only one column of the dataframe elsewhere. However, it returns all NaN values:

[enter image description here]

The exact same code works fine for an .xls file earlier in the same Python script, so I'm unsure what is happening here. Any clarification would be appreciated. Here is the source code:

# ------------------------------------------------------------------------------
import os
import time
import tkinter
from tkinter import filedialog

import pandas as pd

print("\nSELECT Q MEASUREMENT FILE TO FIX: ")
time.sleep(1)
# Allow the user to pick the file whose X-Y data needs to be fixed
tkinter.Tk().withdraw()  # Close the root window
input2 = filedialog.askopenfilename()
print("\nYou selected file:")
print(input2)
print("\n")
input2 = str(input2)

# Check to see if directory/file exists
assert os.path.exists(input2), "File does not exist at, "+str(input2)

# Import data below and store in df
print("\nImporting Excel Workbook...")
time.sleep(1)
# You can check encoding of file with notepad++
dfQ = pd.read_csv(input2, encoding="ansi")
dfQ.values
print(dfQ)  # This DataFrame (dfQ) contains the entire excel workbook
print("\n\nWorkbook Successfully Imported")
time.sleep(.5)
print("...")

# Search Q measurements CSV for "Chip ID" and matches it to corresponding
# "PartID" in the master table created from manually fixed file.
print("Matching PartID's to update proper X-Y values")
time.sleep(.5)
print("...")
IDs = pd.DataFrame(dfQ, columns=['Chip ID'])
time.sleep(.5)
print(IDs)
s = IDs.size
print("\nSuccessfully extracted", s, "Chip ID's!")
print(dfQ.columns)

All you have to do is:

IDs = dfQ["Chip ID"]

and you will get the corresponding pandas.Series. If you want the result as a pandas.DataFrame, just do:

IDs = dfQ["Chip ID"].to_frame()

EDIT:

Your column names start with a space:

Index(['Date', ' Time', ' Device ID', ' Chip ID', ' Lot', ' Wafer', ' X', ' Y',
       ' Q half-width', ' Q fit', ' dQ¸ %', ' Internal Resonant F',
       ' Internal Resonant A', ' Ajusted FG Ampl', ' FG Amplitude (0.10)',
       ' Forced A', ' Forced F', ' Drive Gain', ' Frequency sweep¸start',
       ' Prelim Q  half-width', ' Prelim Q fit', ' Prelim Q Error¸ %',
       ' Execution time', ' Preliminary F', ' residue', ' '],
      dtype='object')

so all you have to do is:

IDs = dfQ[" Chip ID"]
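If you would rather not carry the leading space around in every lookup, you could normalize the headers instead. The following is only a sketch under assumptions: the inline CSV text and its values are made up for illustration and are not the asker's file. It shows two options that exist in pandas, stripping the column labels after loading, or passing skipinitialspace=True to read_csv so that spaces after each delimiter are skipped.

```python
import io

import pandas as pd

# Hypothetical sample data that mimics the ", " separated header of the real file.
csv_text = "Date, Chip ID, X\n2020-01-01, A1, 3\n"

# Option 1: load as-is, then strip whitespace from the column labels only.
raw = pd.read_csv(io.StringIO(csv_text))
raw.columns = raw.columns.str.strip()
ids_stripped = raw["Chip ID"]  # the values themselves are not stripped

# Option 2: let the parser skip spaces that follow each delimiter,
# which cleans both the header and the values.
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
ids_clean = clean["Chip ID"]

print(list(raw.columns))
print(list(clean.columns))
```

Option 2 also removes the leading space from string values, which may matter if the Chip ID values are later matched against PartID values from another table.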

The problem is that your column is actually named " Chip ID" (with a leading space) and not "Chip ID".

So either IDs = dfQ[" Chip ID"] for a Series or IDs = dfQ[[" Chip ID"]] for a DataFrame should work.
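It may also help to see why the original call produced NaN instead of an error. As far as I can tell, passing an existing DataFrame together with columns=[...] to the pd.DataFrame constructor reindexes to the requested labels, so a label that does not exist comes back as a column of NaN, whereas bracket indexing raises a KeyError for the same label. The small reproduction below uses made-up data, not the asker's file:

```python
import io

import pandas as pd

# Hypothetical sample data; the header has a space after each comma.
csv_text = "Date, Chip ID, X\n2020-01-01, A1, 3\n2020-01-02, B2, 4\n"
df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))

# The constructor reindexes to the requested label. "Chip ID" (no leading
# space) does not exist, so the result is a column filled with NaN.
via_constructor = pd.DataFrame(df, columns=["Chip ID"])
all_nan = via_constructor["Chip ID"].isna().all()

# Bracket indexing with the same label fails loudly instead.
try:
    df["Chip ID"]
    raised = False
except KeyError:
    raised = True

print(all_nan, raised)
```

Because bracket indexing fails loudly, it tends to surface a mismatched column name much earlier than the constructor form does.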
