Hey all. I'm copying data from one SQL-format file to another, and one byte in the middle is being corrupted. I assume I skipped some preparation or safeguard.
Example of corrupted data:
//From the file that is read from. added ** to emphasize the corrupted byte
insert into viruses (virusSig,virusHash) values (
X'579fdc569b170419e15750f0feb360aa9c58d8**90**eede50def97ee7cb03b9e905',
X'ee002fe5');
//From the file that is written to. added ** to emphasize the corrupted byte
insert into changes (filepath,loc,dat,vir,hash) values (
'E:\MyDocs\intel\antivirus\RandomFiles\0\2\5\11\24\49\EG1AxxeJSr.data',
243540,
X'9f4246ff8c73c5a5b470cab8c38416929c4eacc1e0021d5ac1fdbb88145d3e6f',
X'579fdc569b170419e15750f0feb360aa9c58d8**3f**eede50def97ee7cb03b9e905',
X'6546dd27');
Code that reads from/writes to:
public static void insertViruses(FileLocation[] locations, byte[][] viruses, String logpath)
{
    int numViruses = viruses.length;
    int virusLength = GenerateRandomCorpus.virusSignatureLengthInBytes;
    try{
        for (int i = 0; i < numViruses; i++)
        {
            FileOutputStream logwriter = new FileOutputStream(logpath, true);
            // Prep to copy section
            int locationOfChange = locations[i].index;
            String filepathToChange = locations[i].filepath;
            File checkIfBackupExists = new File(filepathToChange + ".bak");
            if (!checkIfBackupExists.exists())
                copyFile(filepathToChange, filepathToChange + ".bak");
            copyFile(filepathToChange, filepathToChange + ".tmp");
            RandomAccessFile x = new RandomAccessFile(filepathToChange, "rw");
            x.seek(locationOfChange);
            // Copy section into byte array to write in log
            byte[] removedSection = new byte[virusLength];
            x.read(removedSection, 0, virusLength);
            if (GenerateRandomCorpus.dbg)
                System.out.println(filepathToChange + ":" + locationOfChange);
            x.close();
            // Write changes to log
            byte[] removedSectionConvertedToHexString = StringUtils.getHexString(removedSection).getBytes();
            byte[] virusConvertedToHexString = StringUtils.getHexString(viruses[i]).getBytes();
            byte[] hashConvertedToHexString = StringUtils.getHexString(GenerateRandomViruses.intToByteArray(new String(viruses[i]).hashCode())).getBytes();
            System.out.println(StringUtils.getHexString(removedSection));
            System.out.println(StringUtils.getHexString(viruses[i]));
            logwriter.write(String.format("insert into changes (filepath,loc,dat,vir,hash) values " +
                    "('%s',%d,X'", filepathToChange, locationOfChange).getBytes());
            logwriter.write(removedSectionConvertedToHexString);
            logwriter.write("',X'".getBytes());
            logwriter.write(virusConvertedToHexString);
            logwriter.write("',X'".getBytes());
            logwriter.write(hashConvertedToHexString);
            logwriter.write("');\n".getBytes());
            // Insert virus into file
            File original = new File(filepathToChange);
            original.delete();
            RandomAccessFile fileToInsertIn = new RandomAccessFile(filepathToChange + ".tmp", "rw");
            fileToInsertIn.seek(locationOfChange);
            fileToInsertIn.write(viruses[i]);
            fileToInsertIn.close();
            File a = new File(filepathToChange + ".tmp");
            original = new File(filepathToChange);
            a.renameTo(original);
            a.delete();
            logwriter.close();
        }
    } catch (Exception e)
    {
        System.err.println(e.toString());
        System.err.println("Error: InsertVirusesIntoCorpus, line 100");
    }
}
Any ideas?
I'm a bit perplexed by your code and why there are so many conversions going on, but here I go...
My gut tells me you've got either some unintentional character set conversion going on, or that the corruption is due to moving between raw bytes, Java byte primitives and Java int primitives. Remember that the Java byte value range is -128 to 127 and that String's .getBytes() is character-encoding aware: called without arguments, it uses the platform default charset.
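To make the signed-byte point concrete, here is a minimal sketch. The toHex helper is a hypothetical stand-in for your StringUtils.getHexString, whose source I can't see, so treat it as an assumption about what that method should do rather than what it actually does.

```java
class SignedByteHex {
    // Hypothetical stand-in for StringUtils.getHexString (its real source is not shown).
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            // Mask with 0xFF so bytes 0x80..0xFF are not sign-extended when widened to int.
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte b = (byte) 0x90;
        System.out.println(b);                             // -112: Java bytes are signed
        System.out.println(b & 0xFF);                      // 144: the unsigned value
        System.out.println(Integer.toHexString(b));        // ffffff90: sign-extended
        System.out.println(toHex(new byte[] {b, 0x3f}));   // 903f
    }
}
```

If your getHexString skips the mask, I would expect stray "ff" runs in the output rather than a clean 90 to 3f swap, so this alone probably does not explain your symptom, but it is cheap to rule out.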
Specifically, this just looks really odd to me:
byte[] virusConvertedToHexString = StringUtils.getHexString(viruses[i]).getBytes();
This is what is happening:
1. viruses[i] is giving you a byte array.
2. StringUtils.getHexString() takes that byte array and gives you a hexadecimal representation of that byte array as a String. (Assumed: what is this StringUtils? It does not seem to be from org.apache.commons.lang.)
3. Finally, .getBytes() turns that String into a byte array, which is stored in virusConvertedToHexString.
Step 2 is where I would suspect trouble.
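One way to test the character-set hypothesis is to round-trip the suspicious bytes through a String yourself. Note that 0x3f is ASCII '?', which is what Java's encoders write for a character they cannot encode. The sketch below is only an illustration: the class and method names are made up, and the three bytes are taken from around the corrupted position in your sample. On a Windows machine the default charset is often windows-1252, which as far as I know has no character assigned to 0x90; that is an assumption you can check by looking at the last line of output on your own machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Decode bytes into a String and encode them back with the same charset.
    static byte[] roundTrip(byte[] data, Charset cs) {
        return new String(data, cs).getBytes(cs);
    }

    // Small hex formatter for printing; masks each byte to avoid sign extension.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] original = {(byte) 0xd8, (byte) 0x90, (byte) 0xee};

        // US-ASCII has no characters for bytes >= 0x80, so each one comes back as '?'.
        System.out.println(toHex(roundTrip(original, StandardCharsets.US_ASCII)));   // 3f3f3f

        // ISO-8859-1 assigns a character to every byte value, so nothing is lost.
        System.out.println(toHex(roundTrip(original, StandardCharsets.ISO_8859_1))); // d890ee

        // What the no-argument new String(bytes) / getBytes() pair does on this machine.
        System.out.println(Charset.defaultCharset() + ": "
                + toHex(roundTrip(original, Charset.defaultCharset())));
    }
}
```

If the last line shows 3f in place of 90, then any place where the virus bytes pass through new String(...) or getBytes() without an explicit charset, including code not shown in the question, is a candidate for the corruption. Keeping the data as byte[] end to end avoids the issue entirely.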
Also, the code block above does not include the code that produced:
//From the file that is read from. added ** to emphasize the corrupted byte
insert into viruses (virusSig,virusHash) values (
X'579fdc569b170419e15750f0feb360aa9c58d8**90**eede50def97ee7cb03b9e905',
X'ee002fe5');
It would help.