
string '-1' can't be converted to float

I am trying to read large sets of numbers from a text file opened with UTF-8 encoding. The text file was copied and pasted from a PDF. The problem lies in the negative numbers (-1, -2, etc.): I stripped everything, so the individual strings look like this: -1, -2, etc.

Then I want to do calculations with them, so I convert them with float(), but I get an error:

can't convert string to float: '-1'

I concluded that the '-' might actually be a long dash of some kind (whatever it is called), so I manually replaced it with a '-' in the text file. That worked for this single string: float() converted it. I then wrote a small script that finds and replaces all '-' with '-' in the text file, but that didn't work.

with open('text.txt', encoding='utf8') as fobj:
    all = []
    for line in fobj:
        line = line.strip()
        if '-' in line:
            line.replace('-','-')
            print('replaced')
        all.append(line)
with open('text2.txt','w',encoding='utf8') as f:
    for i in all:
        print(i)
        f.write(i)
        f.write('\n')

Why can I replace '-' with '-' manually but not with this script? Thanks for the help.

Example snippet from the text file:

/ 11/3 / 2 / 0 / 0/–1 /
/ 11/5 / 0 / 2 / 0/0 / N
/ 12/3 / 1 / 0 / 0/0 /
/ 12/4 / 1 / 1 / 0/0 / NS

/ 12/4 / 4 / –1 / 0/–1 / H

/ 12/5 / 1 / 0 / 0/–1 / H

/ 12/5 / 2 / 0 / 0/-1 / H

/ 11/4 / 0 / 0 / 0/0 / H

You can actually see the difference between the -1 in the second-to-last and third-to-last lines, at least in this copy. I replaced the last - manually.

You forgot to assign the result back to line. str.replace returns a new string and does not modify line in place:

if '-' in line:
    line = line.replace('-','-')
    print('replaced') 
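
A minimal sketch of why the assignment matters: Python strings are immutable, so str.replace returns a new string and leaves the original untouched. The sample value and the names unchanged and fixed are made up for illustration, and the example assumes the long dash is an en dash (U+2013).

```python
# Hypothetical sample containing an en dash (U+2013) instead of a hyphen.
line = '0/\u20131'

# The return value is thrown away here, so line keeps its en dash.
line.replace('\u2013', '-')
unchanged = line

# Keeping the return value is what actually produces the fixed string.
fixed = line.replace('\u2013', '-')

print(repr(unchanged))
print(repr(fixed))
```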

I just looked at your code: it does replace('-','-'), which replaces a character with itself.

You should either do replace('–','-') or, to make it clearer what you are doing, replace(u'\u2013', '-').

Besides, your re-assignment to line is missing.
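
If it is unclear which dash a file actually contains, one way to check is to print the code point and Unicode name of each character. This is only a sketch: describe_chars is a hypothetical helper name, and the sample string assumes the character is an en dash.

```python
import unicodedata

def describe_chars(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, 'U+%04X' % ord(ch), unicodedata.name(ch, 'UNKNOWN'))
        for ch in text
    ]

# Compare a suspicious string with one typed on the keyboard.
for sample in ('\u20131', '-1'):
    for ch, code, name in describe_chars(sample):
        print(repr(ch), code, name)
```

Two strings that look identical on screen will show different code points here if one of them uses a different dash.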

Combining both answers, your code should be:

with open('text.txt', encoding='utf8') as fobj:
    all_ = []
    for line in fobj:
        line = line.strip()
        # U+2013 is the en dash that came along from the PDF
        if u'\u2013' in line:
            # str.replace returns a new string, so assign it back
            line = line.replace(u'\u2013', '-')
            print('replaced', line)
        all_.append(line)
with open('text2.txt', 'w', encoding='utf8') as f:
    for i in all_:
        print(i)
        f.write(i)
        f.write('\n')

The result is:

replaced / 11/3 / 2 / 0 / 0/-1 /
replaced / 12/4 / 4 / -1 / 0/-1 / H
replaced / 12/5 / 1 / 0 / 0/-1 / H
/ 11/3 / 2 / 0 / 0/-1 /
/ 11/5 / 0 / 2 / 0/0 / N
/ 12/3 / 1 / 0 / 0/0 /
/ 12/4 / 1 / 1 / 0/0 / NS

/ 12/4 / 4 / -1 / 0/-1 / H

/ 12/5 / 1 / 0 / 0/-1 / H

/ 12/5 / 2 / 0 / 0/-1 / H

/ 11/4 / 0 / 0 / 0/0 / H
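
Since the goal was to call float() on the numbers, the replacement could also be done in memory while parsing instead of rewriting the file. The following is a rough sketch under a few assumptions: parse_numbers and DASHES are hypothetical names, the fields are assumed to be separated by '/', and the list of dash-like characters is a guess at what a PDF copy might contain.

```python
# Map a few dash-like characters to the ASCII hyphen-minus.
DASHES = str.maketrans({
    '\u2013': '-',  # en dash
    '\u2014': '-',  # em dash
    '\u2212': '-',  # minus sign
})

def parse_numbers(line):
    """Split a '/'-separated line and return the fields that parse as floats."""
    numbers = []
    for field in line.translate(DASHES).split('/'):
        field = field.strip()
        try:
            numbers.append(float(field))
        except ValueError:
            pass  # skip empty fields and letters such as 'H' or 'NS'
    return numbers

print(parse_numbers('/ 12/4 / 4 / \u20131 / 0/\u20131 / H'))
# prints [12.0, 4.0, 4.0, -1.0, 0.0, -1.0]
```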
