
Python numpy ndarray skipping lines from text

Based on this answer, I am using the changethis method:

import numpy as np
import os

def changethis(pos):
    # Relies on the module-level array `file` defined below
    appex = file[pos[1]-1][:pos[2]] + '*' + file[pos[1]-1][pos[2]+len(pos[0]):]
    file[pos[1]-1] = appex

pos = ('stack', 3, 16)
file = np.genfromtxt('in.cpp',dtype='str',delimiter=os.linesep)
changethis(pos)
print(file)

where in.cpp is a source file that contains the following:

/* Multi-line 
comment
*/

#include <iostream>
#include <fstream>

using namespace std;

int main (int argc, char *argv[]) {
  int linecount = 0;
  double array[1000], sum=0, median=0, add=0;
  string filename;
  if (argc <= 1)
      {
          cout << "Error" << endl;
          return 0;
      }

I get the output:

['using namespace std;' 'int main (int argc, char *argv[]) {'
 'int linecount = *' 'double array[1000], sum=0, median=0, add=0;'
 'string filename;' 'if (argc <= 1)' '{' 'cout << "Error" << endl;'
 'return 0;' '}']

Notice that the lines of the multi-line comment, the include statements and the empty lines are missing from the ndarray.

I do not understand why this happens, since the delimiter is set to split on every newline character.

Any suggestions on how to get the following output instead?

['/* Multi-line' 'comment' '*/' '' '#include <iostream>'
 '#include <fstream>' '' 'using namespace std;'
 '' 'int main (int argc, char *argv[]) {'
 'int linecount = *' 'double array[1000], sum=0, median=0, add=0;'
 'string filename;' 'if (argc <= 1)' '{' 'cout << "Error" << endl;'
 'return 0;' '}']

Again, sorry for suggesting genfromtxt; I didn't understand your intentions and was just trying to offer a possible solution to the problem. As a follow-up to that particular solution (others have been provided), you can simply do:

import numpy as np
import os

def changethis(pos):
    # Notice file is in global scope
    appex = file[pos[1]-1][:pos[2]] + '*' + file[pos[1]-1][pos[2]+len(pos[0]):]
    file[pos[1]-1] = appex

pos = ('stack', 3, 16)
file = np.array([i for i in open('in.txt','r')]) # instead of genfromtxt
changethis(pos)
print(file)

which resulted in:

['/* Multi-line \n' 'comment\n' '*/\n*' '\n' '#include <iostream>\n'
 '#include <fstream>\n' '\n' 'using namespace std;\n' '\n'
 'int main (int argc, char *argv[]) {\n' '  int linecount = 0;\n'
 '  double array[1000], sum=0, median=0, add=0;\n' '  string filename;\n'
 '  if (argc <= 1)\n' '      {\n' '          cout << "Error" << endl;\n'
 '          return 0;\n' '      }']

EDIT: Another user raised a relevant point about the scope I was using for file. I did not mean to suggest doing things in global scope; I meant to explain that the function worked because file was in global scope. In any case, you can wrap everything in a function to contain the scope:

import numpy as np
import os

def changeallthese(poslist,path):
    def changethis(pos):
        appex = file[pos[1]-1][:pos[2]-1] + '*' + file[pos[1]-1][pos[2]-1+len(pos[0]):]
        file[pos[1]-1] = appex
    file = np.array([str(i) for i in open(path,'r')])
    for i in poslist:
        changethis(i)
    return file

poslist = [('stack', 3, 16),('stack', 18, 1),('/* Multi-line', 1, 1)]
file = changeallthese(poslist,'in.txt')
print(file)

which results in:

['* \n' 'comment\n' '*/\n*' '\n' '#include <iostream>\n'
 '#include <fstream>\n' '\n' 'using namespace std;\n' '\n'
 'int main (int argc, char *argv[]) {\n' '  int linecount = 0;\n'
 '  double array[1000], sum=0, median=0, add=0;\n' '  string filename;\n'
 '  if (argc <= 1)\n' '      {\n' '          cout << "Error" << endl;\n'
 '          return 0;\n' '* }']

To write the array to a file, you can either use Python's normal file writing:

fid = open('out.txt','w')
fid.writelines(file)
fid.close()

or use a function from numpy (be careful: savetxt writes a newline after every element, so lines that already end in '\n' will each be followed by an extra blank line):

np.savetxt('out.txt',file,fmt='%s')
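
A minimal sketch of the plain-Python round trip, assuming the lines still carry their trailing '\n' as in the arrays above. The sample lines and the temporary output path are made up for illustration. writelines adds no separators of its own, so the file should come back unchanged, and the with block closes the file even if an error occurs:

```python
import os
import tempfile

# Sample lines standing in for the edited array; each keeps its '\n'.
lines = ['/* Multi-line \n', 'comment\n', '*/\n', '\n', '#include <iostream>\n']

# Hypothetical output location inside a throwaway directory.
out_path = os.path.join(tempfile.mkdtemp(), 'out.txt')

# writelines() writes each string as-is and adds no line separators.
# newline='' turns off newline translation so the bytes match on any OS.
with open(out_path, 'w', newline='') as fid:
    fid.writelines(lines)

# Read the file back to check that nothing was added or lost.
with open(out_path, 'r', newline='') as fid:
    round_trip = fid.readlines()
```

If the trailing newlines had been stripped while reading, they would have to be added back before writing, for example with '\n'.join(lines).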

If the file is not too big:

import numpy as np
import os

def changethis(linelist,pos):
    appex = linelist[pos[2]-1][:pos[3]] + pos[1] + linelist[pos[2]-1][pos[3]+len(pos[0]):]
    linelist[pos[2]-1] = appex

pos = ('Multi','Three', 1, 3)

with open('in.cpp','r')  as f:
    lines=f.readlines()
    changethis(lines,pos)
print(''.join(lines))

readlines turns your file into a list of lines, which is memory-inefficient and slow but does the job. If the file has fewer than about 1k lines, it should be fine.

The function takes a list of lines as input, in addition to pos. I also modified the function to replace pos[0] with pos[1] (instead of a *) on line pos[2], starting after character pos[3].

I get this as output:

/* Three-line 
comment
*/

#include <iostream>
#include <fstream>

using namespace std;

int main (int argc, char *argv[]) {
  int linecount = 0;
  double array[1000], sum=0, median=0, add=0;
  string filename;
  if (argc <= 1)
      {
          cout << "Error" << endl;
          return 0;
      }
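
One weakness of slicing by position alone is that nothing checks whether the target text is really there (the earlier '*/\n*' output is an example: the star was appended to a line that never contained 'stack'). A possible variant, sketched here with a hypothetical helper name replace_at and separate arguments instead of the pos tuple, refuses to edit when the text does not match:

```python
def replace_at(lines, target, replacement, line_no, col):
    """Replace `target` with `replacement` on 1-based line `line_no`,
    where `target` must start at 0-based column `col`.

    Raises ValueError if `target` is not found at that position.
    """
    line = lines[line_no - 1]
    # str.startswith accepts a start offset, so this checks the exact spot.
    if not line.startswith(target, col):
        raise ValueError(
            "%r not found at line %d, column %d" % (target, line_no, col)
        )
    lines[line_no - 1] = line[:col] + replacement + line[col + len(target):]
    return lines


sample = ['/* Multi-line \n', 'comment\n', '*/\n']
replace_at(sample, 'Multi', 'Three', 1, 3)
# sample[0] is now '/* Three-line \n'
```

Calling replace_at(sample, 'stack', '*', 3, 16) on the same list raises ValueError instead of silently appending to the line.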

If you want a list of strings representing the lines of a file, open the file and use readlines():

with open('in.cpp') as f:
    lines = f.readlines()

# Have changethis take the list of lines as an argument
changethis(lines, pos)

Don't use np.genfromtxt; that's a tabular data parser with all sorts of behavior you don't want, such as treating # as a line comment marker and skipping empty lines.

Depending on what you intend to do with this list, you can probably even avoid needing an explicit list of lines. Also, file is a bad choice of variable name (in Python 2 it hides the built-in file), and changethis should really take the list as an argument instead of relying on a global variable. In general, the earlier answer you got was pretty terrible.
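
As a rough illustration of working without an explicit list, the sketch below streams lines from one file-like object to another and edits only the requested line. The function name replace_in_stream and its argument order are assumptions made for this example, and io.StringIO stands in for real files so the snippet is self-contained:

```python
import io


def replace_in_stream(src, dst, target, replacement, line_no, col):
    """Copy `src` to `dst` line by line, replacing `target` with
    `replacement` on 1-based line `line_no` at 0-based column `col`.
    Lines where `target` is not at that position are copied unchanged.
    """
    for number, line in enumerate(src, start=1):
        if number == line_no and line.startswith(target, col):
            line = line[:col] + replacement + line[col + len(target):]
        dst.write(line)


# In-memory stand-ins for open('in.cpp') and open('out.cpp', 'w').
src = io.StringIO('/* Multi-line \ncomment\n*/\n\n#include <iostream>\n')
dst = io.StringIO()
replace_in_stream(src, dst, 'Multi', 'Three', 1, 3)
result = dst.getvalue()
```

Only one line is held in memory at a time, and the comment lines, blank lines and #include lines pass through untouched because no parsing is applied to them.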

