
Adding Linux ODBC drivers for SQL Server to a Kaggle/Python docker image

I am trying to build a data science machine capable of running machine learning Python programs in a production environment.

The data for the current business case needs to be pulled from SQL Server, scored with machine learning using Python, and pushed back to SQL Server.

I want to install the ODBC drivers for Linux.

https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server

The problem I have is: which drivers should I install?

When I connect to the container using the following command, it works fine and maps my Python programs in the poc directory into the container.

"docker run -it --volume=c:\\docker\\poc:/poc kaggle/python /bin/bash"

When I try to figure out the version of Linux using the following command,

"uname -a"

I see it is Moby Linux from Docker?

Linux 69982a00af21 4.9.49-moby #1 SMP Wed Sep 27 00:36:29 UTC 2017 x86_64 GNU/Linux

However, what base image is that built from? (Ubuntu, Debian, Fedora, Red Hat, etc.)
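`uname -a` reports the kernel, which a container shares with the Docker host (here the Moby VM), so it does not reveal which distribution the image was built from. Most current distributions describe themselves in /etc/os-release instead. Below is a minimal Python sketch, assuming that file exists in the image; the helper names `parse_os_release` and `detect_distro` are made up for illustration.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes
        info[key.strip()] = value.strip().strip('"').strip("'")
    return info


def detect_distro(path="/etc/os-release"):
    """Return (ID, VERSION_ID) for the running image, or None if unreadable."""
    try:
        with open(path) as handle:
            info = parse_os_release(handle.read())
    except IOError:  # file missing or unreadable
        return None
    return info.get("ID"), info.get("VERSION_ID")


if __name__ == "__main__":
    print(detect_distro())
```

The ID value (for example debian or ubuntu) together with VERSION_ID indicates which section of the Microsoft installation instructions applies to the image.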

I need a crackerjack Unix admin to help me out. How do I get ODBC drivers installed?

Any takers?

Thanks in advance for your help.

John

I'm not sure I have exactly the answer you're looking for, because you refer to the Microsoft ODBC driver. However, I will share what I've done to get unixODBC installed (which I then use with the pyodbc module to talk to an MSSQL database).

Here is a summary of the commands I've used (caveat: untested in this format, as they're cut and pasted from a production Dockerfile that I can't share in its entirety):

Assuming you are inside a Debian/Ubuntu-based Linux Docker container:

$ apt-get update
$ apt-get install python2.7 python-pip unixodbc unixodbc-dev freetds-bin freetds-dev tdsodbc nano

# nano is a text editor; Ctrl + O to write out, Ctrl + X to exit

$ nano /etc/odbcinst.ini

[FreeTDS]
Description = FreeTDS Driver for MSSQL
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so

$ nano /etc/freetds/freetds.conf

[global]
#   $Id: freetds.conf,v 1.12 2007-12-25 06:02:36 jklowden Exp $
#
# This file is installed by FreeTDS if no file by the same
# name is found in the installation directory.
#
# For information about the layout of this file and its settings,
# see the freetds.conf manpage "man freetds.conf".

# Global settings are overridden by those in a database
# server specific section
[global]
# TDS protocol version
tds version = 8.0

# Whether to write a TDSDUMP file for diagnostic purposes
# (setting this to /tmp is insecure on a multi-user system)
;dump file = /tmp/freetds.log
;debug flags = 0xffff

# Command and connection timeouts
;timeout = 10
;connect timeout = 10

# If you get out-of-memory errors, it may mean that your client
# is trying to allocate a huge buffer for a TEXT field.
# Try setting 'text size' to a more reasonable limit
text size = 64512

# If you experience TLS handshake errors and are using openssl,
# try adjusting the cipher list (don't surround in double or single quotes)
# openssl ciphers = HIGH:!SSLv2:!aNULL:-DH

$ pip install pyodbc

$ nano test_database.py

import pyodbc

host = 'some.host.org'
port = 1433
database = 'some_database'  # placeholder; undefined in the original snippet
username = 'some_username'
password = 'some_password'
timeout = 10  # login timeout in seconds; placeholder for the original self._timeout

conn_str = """
            DRIVER={FreeTDS};
            TDS_VERSION=8.0;
            SERVER=%s;
            Port=%i;
            DATABASE=%s;
            UID=%s;
            PWD=%s;
        """ % (
    host, port, database, username, password
)

# Strip the newlines and indentation that were only there for readability
conn_str = conn_str.replace('\n', '').replace(' ', '')

connection = pyodbc.connect(conn_str, timeout=timeout)
connection.autocommit = True

cursor = connection.cursor()
cursor.execute('SELECT * FROM some_table;')
rows = cursor.fetchall()

for row in rows:
    print(row)  # works on both Python 2.7 and Python 3

cursor.close()
connection.close()


I must caveat again that this is all cut and pasted from our codebase, and I haven't tested it as it appears here.
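Since the steps are untested as written, one quick sanity check is to confirm that the [FreeTDS] entry really ended up in /etc/odbcinst.ini and that the library it points to exists. The sketch below uses Python 3's standard configparser module for that; the helper names `registered_drivers` and `check_drivers` are hypothetical, and it assumes the file is plain INI text like the example above.

```python
import configparser
import os


def registered_drivers(ini_text):
    """Map each section name in odbcinst.ini text to its Driver path."""
    # interpolation=None keeps any literal '%' characters in values intact
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # Option names are case-insensitive here, so "driver" matches "Driver"
    return {
        name: parser[name].get("driver")
        for name in parser.sections()
    }


def check_drivers(path="/etc/odbcinst.ini"):
    """Print each registered driver and whether its library file exists."""
    with open(path) as handle:
        drivers = registered_drivers(handle.read())
    for name, library in drivers.items():
        found = library is not None and os.path.exists(library)
        status = "found" if found else "MISSING"
        print("%s -> %s (%s)" % (name, library, status))
```

Calling `check_drivers()` inside the container prints one line per registered driver. A MISSING status for FreeTDS would suggest that the Driver path in odbcinst.ini does not match where the tdsodbc package installed libtdsodbc.so.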

If you don't have access to an MSSQL database for testing, you can spin one up as a Docker container (SQL Server Express Edition) with the following command:

docker run -d --name mssql-server -p 1433:1433 -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=some_password' -e 'MSSQL_PID=Express' microsoft/mssql-server-linux:latest
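A container started this way may need some time before it accepts connections (an assumption; the container logs are the authoritative place to check). A test script can poll the published port before calling pyodbc. The following is a minimal sketch using only the Python 3 standard library; `wait_for_port` is a hypothetical helper name, and the default timeout values are arbitrary.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows the port is open,
            # not that SQL Server is ready to accept logins.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the container above, `wait_for_port('localhost', 1433)` would be called before the first connection attempt. An open port is only a rough readiness signal, so the first query may still need a retry.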

Also note that the packages installed above include the "tsql" command-line tool, which can be used to double-check your database setup outside of Python.

Best of luck!

I ran into the same problem and assembled a Docker image based on this template:

FROM python:3.7.7-slim-stretch  

# Install pyodbc https://github.com/mkleehammer/pyodbc/wiki/Install
RUN apt-get update && apt-get install -y --no-install-recommends g++ unixodbc-dev curl gnupg apt-transport-https apt-utils
RUN pip install pyodbc

# Install Microsoft ODBC 17 
# https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver15
ENV ACCEPT_EULA=Y
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/9/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt-get update && apt-get install -y --no-install-recommends msodbcsql17 mssql-tools

The best sources for assembling this solution were the pyodbc wiki and the Microsoft instructions for ODBC Driver 17. They led me to use FROM python:3.7.7-slim-stretch, so that both pieces would be reasonably up-to-date versions that are still clearly documented.
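With an image built from this template, pyodbc can use the Microsoft driver instead of FreeTDS. The msodbcsql17 package registers the driver under the name "ODBC Driver 17 for SQL Server", and this driver takes the port as part of SERVER, separated by a comma. Below is a minimal sketch; the function names are hypothetical, and the host, database, and credentials in the usage note are placeholders.

```python
def mssql_conn_str(host, port, database, username, password):
    """Build a pyodbc connection string for the msodbcsql17 driver."""
    parts = [
        ("DRIVER", "{ODBC Driver 17 for SQL Server}"),
        ("SERVER", "%s,%d" % (host, port)),  # host,port form
        ("DATABASE", database),
        ("UID", username),
        ("PWD", password),
    ]
    # Join key=value pairs without touching whitespace inside the values
    return ";".join("%s=%s" % pair for pair in parts)


def fetch_rows(conn_str, query):
    """Run a query and return all rows; needs pyodbc and a reachable server."""
    import pyodbc  # imported here so the string builder works without it

    connection = pyodbc.connect(conn_str, timeout=10)
    try:
        cursor = connection.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        connection.close()
```

A call such as `fetch_rows(mssql_conn_str('some.host.org', 1433, 'some_database', 'some_username', 'some_password'), 'SELECT 1')` would exercise the whole chain. Values containing a semicolon or braces would need ODBC-style escaping, which this sketch does not attempt.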
