How to reduce digits of a number in Python?
I have a "CSV" file with four columns:
rep par comm value
USA GER 60705 100
USA GER 607034 200
GER US 607094 300
US UK 60709 340
I want to shorten the values in the comm column to four-digit numbers, as follows:
rep par comm value
USA GER 6070 100
USA GER 6070 200
GER US 6070 300
US UK 6070 340
To do this, I have written the following code:
import csv

infile=csv.reader(open("filepath"))
wfile=open("newfilpath", "wb")
writer=csv.writer(wfile, delimiter=",")
writer.writerow(["rep","par","comm","value"])
infile.next()
for row in infile:
    comm=row[2]
    hs4=comm[0:4]
    writer.writerow([row[0],row[1],hs4,row[3]])
wfile.close()
But for five-digit numbers like 60705 and 60709, I get 607 instead of 6070.
Update: I realized that Python adds a zero to the five-digit numbers; for example, 60705 becomes 060705. I do not know how to fix this problem.
Here is my output for the real data:
'ALB,DNK,880390,11678\n'
'ALB,FIN,961420,10377\n'
'ALB,FRA,030741,10857\n'
'ALB,FRA,030749,4300\n'
'ALB,FRA,091050,14861\n'
'ALB,FRA,121190,1049561\n'
'ALB,FRA,130219,7291\n'
All the values that start with zero are in fact five digits long, and Python adds the zero to them automatically.
It may be that you have a space in front of the 6. You can try the .strip() method to get rid of it. I've also modified your code slightly here:
EDIT: now removes leading zeros
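import csv
# Python 2: the csv module expects the output file opened in binary mode
with open("filepath") as ifile, open("newfilpath", "wb") as wfile:
    infile = csv.reader(ifile)
    writer = csv.writer(wfile)
    writer.writerow(next(infile))  # copy the header row unchanged
    for row in infile:
        # drop surrounding whitespace and leading zeros, then keep four characters
        row[2] = row[2].strip().lstrip('0')[:4]
        writer.writerow(row)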
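For readers on Python 3, the same idea might be sketched as follows. This is only an illustration under stated assumptions: the helper name shorten_comm and its width parameter are hypothetical, in-memory io.StringIO buffers stand in for the real files, and the leading space in the first data row is invented so that .strip() has something to remove.

```python
import csv
import io


def shorten_comm(reader, writer, width=4):
    """Copy rows from reader to writer, trimming column 2 to `width` characters."""
    writer.writerow(next(reader))  # copy the header row unchanged
    for row in reader:
        # strip surrounding whitespace, drop leading zeros, keep the first `width` characters
        row[2] = row[2].strip().lstrip("0")[:width]
        writer.writerow(row)


# In-memory stand-ins for the input and output files
# (the space before 60705 is invented for illustration)
source = io.StringIO("rep,par,comm,value\nUSA,GER, 60705,100\nALB,FRA,030741,10857\n")
target = io.StringIO()
shorten_comm(csv.reader(source), csv.writer(target, lineterminator="\n"))
result = target.getvalue()
print(result)
```

With real files in Python 3, the output would be opened in text mode with newline="" rather than "wb".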
I suggest trying the following method using pandas.
import pandas as pd
df=pd.read_csv("test.csv")
print df
t=(df['comm']).astype(str)
for i in t:
    print i[:4]
Output:
rep par comm value
0 USA GER 60705 100
1 USA GER 607034 200
2 GER US 607094 300
3 US UK 60709 340
6070
6070
6070
6070
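If the comm codes in the file really do carry leading zeros, reading the column as text should keep them, and the slicing can be done on the whole column at once. A rough Python 3 sketch, assuming comma-separated input held in memory rather than test.csv, and mixing two sample rows from the question with one row from the real data:

```python
import io

import pandas as pd

# Sample rows from the question, comma-separated, held in memory instead of test.csv
data = io.StringIO(
    "rep,par,comm,value\n"
    "USA,GER,60705,100\n"
    "USA,GER,607034,200\n"
    "ALB,FRA,030741,10857\n"
)

# dtype=str keeps comm as text, so a code such as 030741 keeps its leading zero
df = pd.read_csv(data, dtype={"comm": str})

# .str[:4] slices every value in the column at once
df["comm"] = df["comm"].str[:4]
print(df)
```

Whether the leading zero should be kept or dropped depends on what the codes mean; if it should be dropped, .str.lstrip("0") can be applied before slicing.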
I used slightly modified code to read the CSV file:
import csv
infile=csv.reader(open("filepath"), delimiter=" ", skipinitialspace=True)
wfile=open("newfilpath", "wb")
writer=csv.writer(wfile, delimiter=",")
writer.writerow(["rep","par","comm","value"])
infile.next()
for row in infile:
    print row
    comm=row[2]
    hs4=comm[0:4]
    writer.writerow([row[0],row[1],hs4,row[3]])
wfile.close()
With the input:
rep par comm value
USA GER 60705 100
USA GER 607034 200
GER US 607094 300
US UK 60709 340
Using your code, I get an output of:
rep,par,comm,value
USA,GER,6070,100
USA,GER,6070,200
GER,US,6070,300
US,UK,6070,340
The only thing I can think of is changing the delimiter or skipinitialspace setting when you read in your CSV file.
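To illustrate why skipinitialspace could matter, here is a small Python 3 sketch. The input line and its spacing are invented; it only shows one way a stray space could turn 6070 into 607, not a confirmed diagnosis of the original file.

```python
import csv
import io

# A comma-separated line with a space after each comma (spacing invented for illustration)
line = "USA, GER, 60705, 100\n"

plain = next(csv.reader(io.StringIO(line)))
skipping = next(csv.reader(io.StringIO(line), skipinitialspace=True))

print(repr(plain[2][0:4]))     # the leading space takes up one of the four characters
print(repr(skipping[2][0:4]))  # the space is dropped before slicing
```

With the default settings the field is read as " 60705", so the first four characters are a space followed by 607; with skipinitialspace=True the slice is 6070.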