
Sed operations only work with smaller files

OS: Ubuntu 14.04

I have 12 large JSON files (2-4 GB each) that I want to perform different operations on: remove the first line, replace every "}," with "}", and remove all "]".

I am using sed to do the operations and my command is:

sed -i.bak -e '1d' -e 's/},/}/g' -e '/]/d' file.json
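For illustration, here is the same script run against a small hypothetical sample (GNU sed; sample.json is a made-up stand-in for one of the files):

```shell
# Build a tiny file shaped like the real data: a header line,
# a "}," line, a plain object line, and a closing "]".
printf 'header\n{"a":1},\n{"b":2}\n]\n' > sample.json

# Same script as above: delete line 1, turn "}," into "}", drop "]" lines.
sed -i.bak -e '1d' -e 's/},/}/g' -e '/]/d' sample.json

cat sample.json       # edited copy: {"a":1} and {"b":2}
cat sample.json.bak   # untouched original
```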

When I run the command on a small file (12.7 kB) it works fine: file.json contains the content with the changes and file.json.bak contains the original content.

But when I run the command on my larger files, the original file is emptied, e.g. file.json is empty and file.json.bak contains the original content. The run time is also what I consider to be "too fast", about 2-3 seconds.

What am I doing wrong here?

Are you sure your input file contains newlines as recognized by the platform you are running your commands on? If it doesn't then deleting one line would delete the whole file. What does wc -l < file tell you?
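To see why this check matters: wc -l counts newline characters, so a file with no platform-recognized newlines reports 0 lines, and sed's '1d' would then delete everything as one giant "line 1". A quick demonstration with hypothetical files:

```shell
# A file with no newline characters at all:
printf 'a,b,c' > nonl.txt
wc -l < nonl.txt        # prints 0: to sed, the whole file is line 1

# A normally terminated two-line file for comparison:
printf 'a\nb\n' > twolines.txt
wc -l < twolines.txt    # prints 2
```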

If it's not that, then you probably don't have enough disk space to duplicate the file, so sed is doing something internally like

mv file backup && sed '...' backup > file

but doesn't have space to create the new file after moving the original to backup. Check your available disk space, and if you don't have enough and can't get more, then you'll need to do something like:

while [ -s oldfile ]
do
    copy first N bytes of oldfile into tmpfile &&
    remove first N bytes from oldfile using real inplace editing &&
    sed 'script' tmpfile >> newfile &&
    rm -f tmpfile
done
mv newfile oldfile
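To confirm that lack of space really is the problem, compare the file's size against the free space on its filesystem. A sketch using GNU coreutils (df --output needs coreutils 8.21+, which Ubuntu 14.04 ships); big.json here is a hypothetical stand-in for one of the 12 files:

```shell
# Is there room for sed -i's temporary second copy of the file?
f=big.json
printf '{"demo":1}\n' > "$f"    # stand-in file for the demonstration

need=$(stat -c %s "$f")                                       # file size in bytes
have=$(df -B1 --output=avail "$(dirname "$f")" | tail -n 1 | tr -d ' ')  # free bytes
echo "need $need bytes free, have $have"
```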

See https://stackoverflow.com/a/17331179/1745001 for how to remove the first N bytes in place from a file. Pick the largest value of N that fits in your available space.
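The loop above can be fleshed out into a runnable sketch. This assumes GNU coreutils (head -c, stat -c, truncate) and uses the dd trick from the linked answer to drop the first N bytes in place; chunked_sed, tmpfile, and newfile are hypothetical names, and the sed script is the one from the question, with '1d' applied only to the first chunk. One caveat: cutting at byte boundaries can split a line or a "}," pair across two chunks, so in practice choose chunks that end on newlines (e.g. via split -l) for line-oriented scripts.

```shell
# Usage: chunked_sed oldfile newfile chunk_bytes
# Edits oldfile in chunk_bytes-sized pieces so no full-size copy is needed.
chunked_sed() {
    old=$1 new=$2 N=$3 first=1
    while [ -s "$old" ]; do
        head -c "$N" "$old" > tmpfile || return 1

        # Remove the first N bytes of oldfile in place (per the linked
        # answer): shift the tail to the front with dd, then truncate.
        size=$(stat -c %s "$old")
        if [ "$size" -le "$N" ]; then
            : > "$old"                       # last chunk consumed; empty it
        else
            dd if="$old" of="$old" bs="$N" skip=1 conv=notrunc 2>/dev/null
            truncate -s "$((size - N))" "$old"
        fi

        # Apply the edits; delete line 1 only on the very first chunk.
        if [ "$first" -eq 1 ]; then
            sed -e '1d' -e 's/},/}/g' -e '/]/d' tmpfile >> "$new"
            first=0
        else
            sed -e 's/},/}/g' -e '/]/d' tmpfile >> "$new"
        fi
        rm -f tmpfile
    done
    mv "$new" "$old"
}
```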
