
1GB of Data From MySQL to MS Access

The Situation: I am creating an automated task that queries MySQL (through ODBC) and inserts the result set into an MS Access database (.mdb) using OLEDB.

The Code:

OleDbConnection accCon = new OleDbConnection();
OdbcCommand mySQLCon = new OdbcCommand();
try
{
    //connect to mysql
    Connect();                
    mySQLCon.Connection = connection;              

    //connect to access
    accCon.ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
        @"Data source= " + pathToAccess;
    accCon.Open();
    var cnt = 0;

    while (cnt < 5)
    {
        if (accCon.State == ConnectionState.Open)
            break;
        cnt++;
        System.Threading.Thread.Sleep(50);
    }

    if (cnt == 5)
    {
        ToolBox.logThis("Connection to Access DB did not open. Exit Process");
        return;
    }
} catch (Exception e)
{
    ToolBox.logThis("Faild to Open connections. msg -> " + e.Message + "\\n" + e.StackTrace);
}
OleDbCommand accCmn = new OleDbCommand();
accCmn.Connection = accCon;
//access insert query structure
var insertAccessQuery = "INSERT INTO {0} values({1});";
// key = > tbl name in access, value = > mysql query to b executed
foreach (var table in tblNQuery)
{
    try
    {
        mySQLCon.CommandText = table.Value;
        //executed mysql query                        
        using (var dataReader = mySQLCon.ExecuteReader())
        {
            //variable to hold row data
            var rowData = new object[dataReader.FieldCount];
            var parameters = "";
            //read the result set from mysql query
            while (dataReader.Read())
            {
                //fill rowData with the row values
                dataReader.GetValues(rowData);
                //build the parameters for insert query
                for (var i = 0; i < dataReader.FieldCount; i++)
                    parameters += "'" + rowData[i] + "',";

                parameters = parameters.TrimEnd(',');
                //insert to access
                accCmn.CommandText = string.Format(insertAccessQuery, table.Key, parameters);
                try
                {
                    accCmn.ExecuteNonQuery();
                }
                catch (Exception exc)
                {
                    ToolBox.logThis("Faild to insert to access db. msg -> " + exc.Message + "\\n\\tInsert query -> " + accCmn.CommandText );
                }                              
                parameters = "";
            }
        }
    }
    catch (Exception e)
    {
        ToolBox.logThis("Faild to populate access db. msg -> " + e.Message + "\\n" + e.StackTrace);
    }
}
Disconnect();
accCmn.Dispose();
accCon.Close();

The Issues:

  1. Memory usage climbs very high (300MB+), while the MS Access file size does not grow steadily. It seems the insert caches the data rather than saving it to disk.

  2. It is very slow! I know my query executes within a few seconds, but the insertion process takes a long time.

I have tried using a prepared statement in MS Access and inserting the values as parameters, instead of concatenating strings to build the insert query. However, I get this exception message:

Data type mismatch in criteria expression.

Does anyone know how to fix this or have a better approach?
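For reference, a minimal sketch of what the parameterized version could look like. It is written against the `System.Data` interfaces so it works with an `OleDbCommand` but does not depend on it. The class and method names (`AccessInsert`, `BuildInsertSql`, `PrepareInsert`, `Normalize`, `InsertRow`) are hypothetical. The `Normalize` step is an assumption, not a confirmed diagnosis: a commonly reported cause of "Data type mismatch in criteria expression" with OLE DB and Access is a `DateTime` value carrying fractional seconds, so the sketch truncates those and maps `null` to `DBNull.Value`.

```csharp
using System;
using System.Data;
using System.Linq;

public static class AccessInsert
{
    // Builds "INSERT INTO [table] VALUES (?, ?, ...)" with one positional
    // placeholder per column. OLE DB binds parameters by position, not by name.
    public static string BuildInsertSql(string tableName, int fieldCount)
    {
        var placeholders = string.Join(", ", Enumerable.Repeat("?", fieldCount));
        return "INSERT INTO [" + tableName + "] VALUES (" + placeholders + ");";
    }

    // Assumption: null becomes DBNull, and DateTime values are truncated to
    // whole seconds because fractional seconds are a commonly reported
    // trigger for the type mismatch error in Access.
    public static object Normalize(object value)
    {
        if (value == null) return DBNull.Value;
        if (value is DateTime)
        {
            var dt = (DateTime)value;
            return new DateTime(dt.Ticks - dt.Ticks % TimeSpan.TicksPerSecond, dt.Kind);
        }
        return value;
    }

    // Sets the command text and creates the parameters once per table,
    // before the read loop starts.
    public static void PrepareInsert(IDbCommand cmd, string tableName, int fieldCount)
    {
        cmd.CommandText = BuildInsertSql(tableName, fieldCount);
        cmd.Parameters.Clear();
        for (var i = 0; i < fieldCount; i++)
            cmd.Parameters.Add(cmd.CreateParameter());
    }

    // Fills the existing parameters with the current row and executes the insert.
    public static void InsertRow(IDbCommand cmd, object[] rowData)
    {
        for (var i = 0; i < rowData.Length; i++)
            ((IDataParameter)cmd.Parameters[i]).Value = Normalize(rowData[i]);
        cmd.ExecuteNonQuery();
    }
}
```

In the loop from the question, `PrepareInsert(accCmn, table.Key, dataReader.FieldCount)` would be called once after `ExecuteReader()`, and `InsertRow(accCmn, rowData)` once per row after `GetValues(rowData)`. The values still have to be convertible to the column types of the Access table, in the same column order.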

You could create a VBA macro that uses the DoCmd.TransferDatabase method to pull data through ODBC into your Access database. It would probably be much faster and simpler as well.

To run the VBA code from an external program or scheduled task, simply start Access to open your file with the /x command line switch and it will run the import macro on startup. A GB of data is still going to take a while, though. I found an article by David Catriel that implemented this approach.
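As a rough sketch of how the existing C# task could trigger such a macro: the class `AccessLauncher`, the path to `MSACCESS.EXE`, and the macro name are all assumptions that depend on the installation and on how the import macro was saved. The sketch also assumes the macro name contains no spaces and that the macro closes Access when it finishes, otherwise `WaitForExit` blocks until someone closes the window.

```csharp
using System.Diagnostics;

public static class AccessLauncher
{
    // Builds the command line arguments: the quoted database path
    // followed by the /x switch and the macro to run on startup.
    public static string BuildArguments(string databasePath, string macroName)
    {
        return "\"" + databasePath + "\" /x " + macroName;
    }

    // Starts Access, waits for it to exit, and returns the process exit code.
    public static int RunImportMacro(string accessExePath, string databasePath, string macroName)
    {
        var startInfo = new ProcessStartInfo(accessExePath, BuildArguments(databasePath, macroName))
        {
            UseShellExecute = false
        };
        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A call might look like `AccessLauncher.RunImportMacro(pathToAccessExe, pathToAccess, "ImportFromMySql")`, where `pathToAccessExe` and `ImportFromMySql` are placeholders.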

An even better option is to use a different database engine back-end, like the free SQL Server Express. Then you have a lot more options and it is much more robust. If you need MS Access forms and reports, you can create an ADP project file if you use SQL Server, or you can use linked tables to get at your data. You could even use Access as a front-end to your MySQL database instead of copying all the data, if that would satisfy your requirements.

Instead of writing code, you could turn to SQL Server Integration Services (SSIS), and be done before lunch. It is available as an extension to Visual Studio, in case you do not already have it on your computer with SQL Server.

With SSIS you can create a reusable SSIS package that can be triggered from the command line or a scheduled task. This guide shows how to pull data from MySQL into SQL Server, but the SQL Server part should be easy to replace with Access.

Some changes, marked with comments, add a transaction around the command execution. If transactions are not controlled manually, one is created and committed automatically for every command, which is time-consuming.

            OleDbConnection accCon = new OleDbConnection();
            OdbcCommand mySQLCon = new OdbcCommand();
            try
            {
                //connect to mysql
                Connect();
                mySQLCon.Connection = connection;

                //connect to access
                accCon.ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
                                          @"Data source= " + pathToAccess;
                accCon.Open();
                var cnt = 0;

                while (cnt < 5)
                {
                    if (accCon.State == ConnectionState.Open)
                        break;
                    cnt++;
                    System.Threading.Thread.Sleep(50);
                }

                if (cnt == 5)
                {
                    ToolBox.logThis("Connection to Access DB did not open. Exit Process");
                    return;
                }
            }
            catch (Exception e)
            {
                ToolBox.logThis("Faild to Open connections. msg -> " + e.Message + "\\n" + e.StackTrace);
            }
            OleDbCommand accCmn = new OleDbCommand();

            accCmn.Connection = accCon;
//access insert query structure
            var insertAccessQuery = "INSERT INTO {0} values({1});";
// key = > tbl name in access, value = > mysql query to b executed
            foreach (var table in tblNQuery)
            {
//AMK: transaction starts here, one per table, because a committed or
//     rolled-back transaction cannot be reused for the next table
                var transaction = accCon.BeginTransaction();
                accCmn.Transaction = transaction;
                try
                {
                    mySQLCon.CommandText = table.Value;
                    //executed mysql query                        
                    using (var dataReader = mySQLCon.ExecuteReader())
                    {
                        //variable to hold row data
                        var rowData = new object[dataReader.FieldCount];
                        var parameters = "";
                        //read the result set from mysql query
                        while (dataReader.Read())
                        {
                            //fill rowData with the row values
                            dataReader.GetValues(rowData);
                            //build the parameters for insert query
                            for (var i = 0; i < dataReader.FieldCount; i++)
                                parameters += "'" + rowData[i] + "',";

                            parameters = parameters.TrimEnd(',');
                            //insert to access
                            accCmn.CommandText = string.Format(insertAccessQuery, table.Key, parameters);
                            try
                            {
                                accCmn.ExecuteNonQuery();
                            }
                            catch (Exception exc)
                            {
                                ToolBox.logThis("Faild to insert to access db. msg -> " + exc.Message +
                                                "\\n\\tInsert query -> " + accCmn.CommandText);
                            }
                            parameters = "";
                        }
                    }
//AMK: transaction commits here if every thing is going well
                    transaction.Commit();
                }
                catch (Exception e)
                {
                    ToolBox.logThis("Faild to populate access db. msg -> " + e.Message + "\\n" + e.StackTrace);
//AMK: transaction rollback here if there is a problem
                    transaction.Rollback();
                }
            }
            Disconnect();
            accCmn.Dispose();
            accCon.Close();

Create a DSN (data source name) for the SQL Server database. Then select that DSN by opening the Microsoft Access database and choosing to import from that DSN. You should be able to import that exact 1GB table (schema, data, everything).

More information on using a DSN: https://support.office.com/en-us/article/Link-to-SQL-Server-data-0474c16d-a473-4458-9cf7-f369b78d3db8

Alternatively, you can just link to the SQL Server database (not import to an Access table) using that DSN and skip the import altogether.

Should the INSERT be part of a TRANSACTION? Being within a TRANSACTION usually speeds up bulk inserts.

Thanks everyone for the answers. I just found the main problem in my code. The reason for the heavy memory usage (issue #1) was that ODBC was caching the data from MySQL regardless of the C# approach (DataReader). That issue is resolved by checking the "Don't cache results of forward-only cursors" checkbox in the DSN settings. This also made the process somewhat faster (30%). However, a more concrete approach is still what Brian Pressler and Egil Hansen suggested. But since those require software installation and/or a migration plan, the easiest way is to stick with this piece of code.
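For setups that connect without a DSN, the same setting can, as far as I know, be passed in the connection string: MySQL Connector/ODBC documents that checkbox as the `NO_CACHE` option. The sketch below only assembles such a string. The class `MySqlOdbc`, the driver name, and the server details are placeholders that must match the installed driver version, so this is worth checking against the Connector/ODBC documentation before relying on it.

```csharp
using System;

public static class MySqlOdbc
{
    // Assembles a DSN-less Connector/ODBC connection string.
    // NO_CACHE=1 is assumed to correspond to the DSN checkbox
    // "Don't cache results of forward-only cursors".
    public static string BuildConnectionString(
        string driver, string server, string database, string user, string password)
    {
        return "Driver={" + driver + "};" +
               "Server=" + server + ";" +
               "Database=" + database + ";" +
               "Uid=" + user + ";" +
               "Pwd=" + password + ";" +
               "NO_CACHE=1;";
    }
}
```

The resulting string would be passed to the `OdbcConnection` that the `Connect()` helper in the question creates, for example with a driver name such as `MySQL ODBC 5.3 Unicode Driver` if that is the version installed.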

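Notice: The technical posts on this site follow the CC BY-SA 4.0 license. If you reproduce them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.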

 