Machine Learning with SQL Server 2017 and Python

In this article, we will cover the steps for installing SQL Server 2017 in Windows system with Machine Learning Server (In-Database) with Python & R language. Then we call the Python scripts from SQL server and get SQL data from Python also. Later we will create a machine learning model for SQL server and train the model with Python scripts. Finally, we will predict a result from sample data.

SQL Server 2017 Machine Learning Services is an add-on to a database engine instance, used for executing R and Python code on SQL Server. The code runs in an extensibility framework, isolated from core engine processes, but fully available to relational data as stored procedures, as T-SQL script containing R or Python statements, or as R or Python code containing T-SQL.

If you previously used SQL Server 2016 R Services, Machine Learning Services in SQL Server 2017 is the next generation of R support, with updated versions of base R, RevoScaleR, MicrosoftML, and other libraries introduced in 2016. The key value proposition of Machine Learning Services is the power of its enterprise R and Python packages to deliver advanced analytics at scale, and the ability to bring calculations and processing to where the data resides, eliminating the need to pull data across the network. SQL Server 2017 supports R and Python.

revoscalepy is the primary library for scalable Python with functions for data manipulation, transformation, visualization, and analysis.

Step 1 – Installing SQL Server 2017 with Machine Learning Services (R & Python)

Download the free Developer edition (free) of SQL Server 2017 from here – SQL 2017

It is an MSI setup downloader file. Please open it and click Download Media.

This will show you the screen to download the media file. Please choose ISO option and click download.

Run the downloaded setup file and choose the first option. (New SQL Server stand-alone installation)

You can choose the Machine learning services. Here I chose both R & Python services.

There are four services selected now. Please click next.

Microsoft is using Anaconda distribution for installing Python libraries. So please click consent to Install Python. (Please click R consent also).

After some time, our installation will be completed. It will show the below screen.

Please check if all the services have succeeded properly.

Please note that Python Services are installed in below folder.

C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017ML\PYTHON_SERVICES

SQL2017ML is the instance name of the SQL server. We must add this path in Windows system environment variables list to access Python services.

As I mentioned earlier, there are four services installed with SQL server 2017. You can check the windows services and find the below services are running properly.

Only SQL Server service and SQL Server Launchpad service automatically run. We can see SQL Service CEIP service is also running automatically. This service is used for the Customer Experience Improvement Program. Our SQL server is ready now. We can install SSMS for connecting SQL server.

Please download the SSMS from here SSMS and install.

After installing SSMS (Version 17) we can connect SQL server from SSMS.

Step 2 – Enabling External Scripts
EXEC sp_configure  'external scripts enabled', 1  
RECONFIGURE WITH OVERRIDE

The above scripts will enable the external scripts and we are now ready to execute Python scripts from the SQL server. Please restart the SQL server now. Otherwise, this change will not take effect.

Please execute the below Python script in SSMS to test our Python service.
EXEC sp_execute_external_script @language = N'PYTHON',   
@script = N'year=2000   
name="Mahesh Chand "  
print("C# Corner was launched by " + name + " in " + str(year))'

The first time, it will take some time to get the result.

We have successfully executed our Python script. In the above script, we initialized the year and name of two variables and printed these variables in a statement. If you get any error message, please check your Python code. SQL server will correctly show the error message.

In the above statement, I made a syntax error by adding two variables in the same line. Python will not support this and shows the error as above.

Step 3 – Connect SQL data from Python Code

We have already seen executing Python from SQL. Now, we are going to connect SQL data from Python code. You can connect any SQL version using this method.

We are going to create a new database and new table with some sample records in that table.
CREATE DATABASE PythonSQLTest  
GO  
USE PythonSQLTest  
  
DROP TABLE IF EXISTS PythonConnect;  
CREATE TABLE [dbo].[PythonConnect](  
    [id] [int] NULL,  
    [name] [nvarchar](50) NULL,  
    [age] [int] NULL)   
  
INSERT INTO PythonConnect(id,name,age) values (1,'Sarath Lal',37)  
INSERT INTO PythonConnect(id,name,age) values (2,'Aradhya',4)  
INSERT INTO PythonConnect(id,name,age) values (3,'Suresh Kumar',25)  
INSERT INTO PythonConnect(id,name,age) values (4,'Anil Soman',42)

Step 4 – Create a Python file using any code editor.

I am using Visual Studio Code for creating Python file.

PythonSQLConnector.py

import pandas   
  
from revoscalepy import RxComputeContext, RxInSqlServer, RxSqlServerData  
from revoscalepy import rx_import  
  
#Connection string to connect to SQL Server named instance  
conn_str = 'Driver=SQL Server;Server={YOUR SERVER NAME};Database=PythonSQLTest;Trusted_Connection=True;'  
  
#Define the columns we wish to import  
column_info = {  
         "id"   : { "type" : "integer" },  
         "name" : { "type" : "string" },  
         "age"  : { "type" : "integer" },  
               }  
  
#Get the data from SQL Server Table  
data_source = RxSqlServerData(table="dbo.PythonConnect",  
                               connection_string=conn_str, column_info=column_info)  
computeContext = RxInSqlServer(  
     connection_string = conn_str,  
     num_tasks = 1,  
     auto_cleanup = False  
)  
  
RxInSqlServer(connection_string=conn_str, num_tasks=1, auto_cleanup=False)  
  
 # import data source and convert to pandas dataframe  
df = pandas.DataFrame(rx_import(input_data = data_source))  
print("Data frame:\n", df)

In the above code we have used RxComputeContext, RxInSqlServer, RxSqlServerData methods from revoscalepy library which are used for connecting SQL from Python. We also used Panda library to get the data frame.

After completing the code please run the Python file with the Python command. You will get a result as shown below.

Please note we got all five records from our SQL table.

Step 5 – Create a model and predict a result from a model.

We are going to see how to create a model in SQL server and predict data from sample data using python scripts.

Create database and tables,
CREATE DATABASE SQLPythonPrediction  
GO  
USE SQLPythonPrediction  
GO

Add some empty tables: one to store the data, and one to store the models you train. Later you will populate the tables using Python.

DROP TABLE IF EXISTS iris_data;  
GO  
CREATE TABLE iris_data (  
  id INT NOT NULL IDENTITY PRIMARY KEY  
  , "Sepal.Length" FLOAT NOT NULL, "Sepal.Width" FLOAT NOT NULL  
  , "Petal.Length" FLOAT NOT NULL, "Petal.Width" FLOAT NOT NULL  
  , "Species" VARCHAR(100) NOT NULL, "SpeciesId" INT NOT NULL  
);

Run the following code to create the table used for storing the trained model. To save Python models in SQL Server, they must be serialized and stored in a column of type varbinary (max).

DROP TABLE IF EXISTS iris_models;  
GO  
  
CREATE TABLE iris_models (  
  model_name VARCHAR(50) NOT NULL DEFAULT('default model') PRIMARY KEY,  
  model VARBINARY(MAX) NOT NULL  
);  
GO
Populate the table

To move the training data from Python into a SQL Server table is a multi step process:

  • You design a stored procedure that gets the data you want.
  • You execute the stored procedure to get the data.
  • You use an INSERT statement to specify where the retrieved data should be saved.

Create the following stored procedure that includes Python code.

CREATE PROCEDURE get_iris_dataset  
AS  
BEGIN  
EXEC sp_execute_external_script @language = N'Python',   
@script = N'  
from sklearn import datasets  
iris = datasets.load_iris()  
iris_data = pandas.DataFrame(iris.data)  
iris_data["Species"] = pandas.Categorical.from_codes(iris.target, iris.target_names)  
iris_data["SpeciesId"] = iris.target  
',   
@input_data_1 = N'',   
@output_data_1_name = N'iris_data'  
WITH RESULT SETS (("Sepal.Length" float not null, "Sepal.Width" float not null, "Petal.Length" float not null, "Petal.Width" float not null, "Species" varchar(100) not null, "SpeciesId" int not null));  
END;

To populate the table, run the stored procedure and specify the table where the data should be written. When run, the stored procedure executes the Python code, which loads the Iris dataset from the built-in Python sample data.

INSERT INTO iris_data ("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species", "SpeciesId")  
EXEC dbo.get_iris_dataset;

After successful execution you may get a message that 150 rows have been affected. That means we have created 150 records from Python dataset in SQL server.

Please check the data using the below statement.

SELECT TOP(10) * FROM iris_data;  
SELECT COUNT(*) FROM iris_data;

Step 6 – Create the stored procedure and train a Python model

Run the following code in SQL Server Management Studio to create the stored procedure that builds a model.
CREATE PROCEDURE generate_iris_model (@trained_model varbinary(max) OUTPUT)  
AS  
BEGIN  
EXEC sp_execute_external_script @language = N'Python',  
@script = N'  
import pickle  
from sklearn.naive_bayes import GaussianNB  
GNB = GaussianNB()  
trained_model = pickle.dumps(GNB.fit(iris_data[[0,1,2,3]], iris_data[[4]]))  
'  
, @input_data_1 = N'select "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "SpeciesId" from iris_data'  
, @input_data_1_name = N'iris_data'  
, @params = N'@trained_model varbinary(max) OUTPUT'  
, @trained_model = @trained_model OUTPUT;  
END;  
GO

To generate the model with the required inputs and save it to a table requires some additional statements.

DECLARE @model varbinary(max);  
DECLARE @new_model_name varchar(50)  
SET @new_model_name = 'Naive Bayes'   
SELECT @new_model_name   
EXEC generate_iris_model @model OUTPUT;  
INSERT INTO iris_models (model_name, model) values(@new_model_name, @model);

To view the models, run a simple SELECT statement.

SELECT * FROM iris_models;

Step 7 – Generate scores from the model

Run the following code to create the stored procedure that performs scoring.
CREATE PROCEDURE predict_species (@model varchar(100))  
AS  
BEGIN  
DECLARE @nb_model varbinary(max) = (SELECT model FROM iris_models WHERE model_name = @model);  
EXEC sp_execute_external_script @language = N'Python',   
@script = N'  
import pickle  
irismodel = pickle.loads(nb_model)  
species_pred = irismodel.predict(iris_data[[1,2,3,4]])  
iris_data["PredictedSpecies"] = species_pred  
OutputDataSet = iris_data.query( ''PredictedSpecies != SpeciesId'' )[[0, 5, 6]]  
print(OutputDataSet)  
'  
, @input_data_1 = N'select id, "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "SpeciesId" from iris_data'  
, @input_data_1_name = N'iris_data'  
, @params = N'@nb_model varbinary(max)'  
, @nb_model = @nb_model  
WITH RESULT SETS ( ("id" int, "SpeciesId" int, "SpeciesId.Predicted" int));  
END;  
GO

The stored procedure gets the “Naive Bayes” model from the table and uses the functions associated with the model to generate scores. In this example, the stored procedure gets the model from the table using the model name. However, depending on what kind of metadata you are saving with the model, you could also get the most recent model, or the model with the highest accuracy.

Run the following lines to pass the model name “Naive Bayes” to the stored procedure that executes the scoring code.
EXEC predict_species 'Naive Bayes';

We have successfully executed our model and got the prediction. This example has been made simple by using the data from the Python iris dataset for scoring. (See the line iris_data[[1,2,3,4]]).)

In this article, we covered all the steps for installing SQL Server 2017 with Machine Learning Services (R & Python) and we executed Python scripts from SQL and connected SQL data from Python code. Finally, we created a model in SQL server using Python script and trained the model. We successfully got the prediction result.

Please follow and like me:

2 thoughts on “Machine Learning with SQL Server 2017 and Python

  1. folorentorium Reply

    Great write-up, I¦m normal visitor of one¦s blog, maintain up the excellent operate, and It’s going to be a regular visitor for a long time.

Leave a Reply

Your email address will not be published. Required fields are marked *