I'm extremely new to Python and have been searching Google and Stack Overflow to solve this issue, which I am sure is simply a syntax problem.
I have a data frame with several columns.
import pandas as pd
df = pd.read_csv("C:/path/file.csv")
My CSV has 5 columns and ~100k rows. I simply want a substring of the first 2 digits of column 5.
I've tried:
df.assign(new = lambda x: x.column5[0:2],)
This creates the new field and populates the first two rows with the complete value in column 5 and gives me NaN for the remainder.
These attempts give me syntax errors:
df['new'] = df['column5'].str[0:2]
df.map(lambda df['column5']: [:2])
I am simply at a loss as to how to create a new column using the first two digits of an existing column from a table read in via pandas.
If this were SAS I'd have been done hours ago, but I am trying to make a go of Python, so your help is appreciated.
I guess your column5 column is of int*/float* dtype, so try converting it to string first:
df['new'] = df['column5'].astype(str).str[:2]
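As a minimal sketch, assuming column5 holds integers (the sample values below are made up for illustration), the following shows the conversion at work. It also suggests why the earlier attempt with x.column5[0:2] filled only two rows: slicing the Series directly selects the first two rows, not the first two characters of each value, and the remaining rows become NaN when the result is aligned back to the frame.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV
df = pd.DataFrame({"column5": [12345, 67890, 24680, 13579]})

# Slicing the Series itself picks rows 0 and 1, not characters
row_slice = df["column5"][0:2]

# Convert each value to a string, then take its first two characters
df["new"] = df["column5"].astype(str).str[:2]

print(row_slice)
print(df)
```

Note that the new column holds strings. If numbers are needed afterwards, it can be converted back with astype(int), provided every value is a valid integer.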
Alternatively, you can explicitly specify the types of columns when reading the CSV file:
df = pd.read_csv('file_name.csv', ..., dtype={'column5': object})
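Here is a small, self-contained sketch of the dtype approach. It reads from an in-memory string instead of a real file, and the column names and values are invented for the example. Reading the column as text has the side benefit of preserving leading zeros, which would be lost if pandas parsed the values as integers.

```python
import io
import pandas as pd

# Hypothetical CSV content; a real file path would be used in practice
csv_text = "column1,column5\na,01234\nb,56789\n"

# Read column5 as text so no numeric conversion takes place
df = pd.read_csv(io.StringIO(csv_text), dtype={"column5": str})

# The .str accessor now works directly, without astype(str)
df["new"] = df["column5"].str[:2]

print(df)
```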