
Validate data when uploading to MySQL with LOAD DATA INFILE via PHP

So I am quite happily uploading data to several tables in my database using LOAD DATA INFILE. My problem is when the uploaded file contains incorrectly formatted data, such as a date in d/m/Y rather than Ymd.

This does not prevent the data from being inserted; the date is simply stored as 0000-00-00. What I want is for the load to fail so I can tell the user to fix the data before proceeding.

I am currently checking that the uploaded file contains the correct columns by comparing it against a sample file using the following little function:

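function check_csv($f_a, $f_b)
{
    // Only the first (header) row of each file is compared.
    // FILE_IGNORE_NEW_LINES is needed for FILE_SKIP_EMPTY_LINES to take effect.
    $flags = FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES;
    $csv_upload = array_map("str_getcsv", file($f_a, $flags))[0];
    $csv_sample = array_map("str_getcsv", file($f_b, $flags))[0];
    $match = true; // a real boolean: the string 'false' is truthy in PHP
    foreach ($csv_sample as $key => $value) {
        // A missing column in the upload counts as a mismatch
        if (!array_key_exists($key, $csv_upload) || $value != $csv_upload[$key]) {
            $match = false;
            break;
        }
    }
    return $match;
}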

... I realise now there is the array_diff() function that may have been useful here, I shall explore that later.
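For reference, a header check along these lines might look like the sketch below. It reads only the first row of each file instead of loading both files in full, and compares the two header arrays directly with ===, which also checks column order (array_diff() ignores order). The helper names read_csv_header() and headers_match() are hypothetical, and the comma delimiter and double-quote enclosure are assumptions about the file format.

```php
<?php
// Hypothetical helper (not from the original post): returns the first CSV row
// of a file as an array of strings, or null if the file cannot be read.
function read_csv_header(string $path): ?array
{
    if (!is_readable($path)) {
        return null;
    }
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    // Delimiter, enclosure and escape are assumptions; adjust to the real files.
    $header = fgetcsv($handle, 0, ',', '"', '\\');
    fclose($handle);
    return is_array($header) ? $header : null;
}

// True only when both files have the same column names in the same order.
function headers_match(string $uploadPath, string $samplePath): bool
{
    $upload = read_csv_header($uploadPath);
    $sample = read_csv_header($samplePath);
    return $upload !== null && $upload === $sample;
}
```

This only confirms the column names; it says nothing about the values in the rows that follow.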

Back to the matter in hand: would I need to do something within this function to check each of the values, or is there an option for LOAD DATA INFILE that will force the behaviour I desire?

I would say that trying to do validation in MySQL while using LOAD DATA INFILE is pretty much an exercise in futility. For one, you use LOAD DATA INFILE as a faster alternative to going through the parser. If you want to start slowing things down and conducting all manner of parsing there, you might as well not use LOAD DATA INFILE.

I would suggest that you just do your validation in PHP on the CSV and bail (if necessary) before even attempting to run it through MySQL. That would actually be more efficient, since you won't have to hit MySQL at all if the data isn't valid.

Also, the code you're using to validate the CSV file above only compares the values of the first row of each file. That doesn't actually validate that any of the subsequent rows have the correct number of columns. You also don't need array_diff() for this. Simply compare the column count of each row in the CSV to the expected column count.

For example, let's say you expect exactly 4 columns in every row of the CSV, and you expect column 2 to hold a date formatted as Y-m-d (change the format string to Ymd if your dates have no dashes):

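$row = 1;
$expectedColumnNum = 4; // we expect exactly 4 columns
$valid = true;          // stays true only if every row passes
if (($handle = fopen("uploaded.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle)) !== FALSE) {
        // Verify every row contains the exact number of expected columns
        if (count($data) != $expectedColumnNum) {
            echo "CSV does not contain the expected number of columns on row $row!\n";
            $valid = false;
            break;
        }
        // Verify the second column is a date formatted as Y-m-d.
        // createFromFormat() rolls over impossible dates such as 2021-02-30,
        // so format the result back and compare it with the input.
        $date = DateTime::createFromFormat('Y-m-d', $data[1]);
        if (!$date || $date->format('Y-m-d') !== $data[1]) {
            echo "CSV does not contain valid formatted date on row $row!\n";
            $valid = false;
            break;
        }
        $row++;
    }
    fclose($handle);
} else {
    $valid = false; // the file could not be opened
}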

If the above validation checks out then you're good to run it through MySQL using LOAD DATA INFILE.
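If you would rather report every problem at once, so the user can fix the whole file in one pass, the same checks can be wrapped in a function that collects messages instead of stopping at the first failure. This is only a sketch: validate_csv() and its parameters are hypothetical names, the optional header skip assumes the first row holds column names, and the delimiter and enclosure are assumptions about the file.

```php
<?php
// Hypothetical helper (not from the original answer): validates every row and
// returns a list of error messages; an empty list means the file passed.
function validate_csv(string $path, int $expectedColumns, int $dateColumn, bool $hasHeader = false): array
{
    $handle = is_readable($path) ? fopen($path, 'r') : false;
    if ($handle === false) {
        return ["Could not open $path"];
    }
    $errors = [];
    $row = 0;
    // Delimiter, enclosure and escape are assumptions; adjust to the real file.
    while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $row++;
        if ($hasHeader && $row === 1) {
            continue; // skip the column-name row
        }
        if (count($data) !== $expectedColumns) {
            $errors[] = "Row $row: expected $expectedColumns columns, found " . count($data);
            continue; // the date column may be missing, so skip the date check
        }
        $value = $data[$dateColumn];
        $date = DateTime::createFromFormat('Y-m-d', $value);
        // The round-trip comparison rejects rolled-over dates such as 2021-02-30.
        if ($date === false || $date->format('Y-m-d') !== $value) {
            $errors[] = "Row $row: '$value' is not a valid Y-m-d date";
        }
    }
    fclose($handle);
    return $errors;
}

// Example usage (file name and column layout are assumptions):
// $errors = validate_csv('uploaded.csv', 4, 1, true);
// if ($errors === []) { /* safe to run LOAD DATA INFILE */ }
```

Collecting the messages costs nothing extra, since the file is read once either way, and it lets you show the user a complete list before anything touches the database.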
