Introduction To Master Data In SAP

This article, Introduction To Master Data In SAP, was updated in December 2023.

What is Master Data?

Data stored in SAP R/3 is categorized as

Master Data and

Transactional Data.

If you are producing, transferring stock, selling, purchasing, doing physical inventory, whatever your activity may be, it requires certain master data to be maintained.

Examples of Master Data

Material master data

Customer master data

Vendor master data

Pricing/conditions master data

Warehouse management master data (storage bin master data)

The ones we will focus on in the MM module are the material master and the purchasing info record.

Material Master: What you should know about material master?

Material in SAP is a logical representation of certain goods or service that is an object of production, sales, purchasing, inventory management etc. It can be a car, a car part, gasoline, transportation service or consulting service, for example.

All the information in SAP about a material, including its potential uses and characteristics, is called the material master. It is considered the most important master data in SAP (alongside customer master data, vendor master data, conditions/pricing master data etc), and all processing of materials is influenced by it. That is why it’s crucial to have a precise and well-maintained material master.

To be confident in your actions, you need to understand the material master views and their implications for processes in other modules and for business transactions, plus some other helpful information such as the tables that store material master data and the transactions for mass material maintenance (for changing certain characteristics for a large number of materials at once).

Material types

In SAP ERP, every material has a characteristic called “material type” which is used throughout the system for various purposes.

Why is it essential to differentiate between material types and what does that characteristic represent?

It can represent a type of origin and usage – like a finished product (produced goods ready for sale), semifinished product (used as a part of a finished product), trading goods (for resale), raw materials (used for production of semifinished and finished products) etc. These are some of the predefined SAP material types among others like food, beverages, service and many others.

We can define custom material types if none of the standard ones meets our needs.

Most used material types in standard SAP installation

What can be configured on material type level (possible differences between types)?

Material master views: the material type defines which views are available for a material. For example, if material type “FERT” is assigned to our material Product 1000, we don’t want Purchasing views for that material because we don’t need to purchase our own product. This is configured at the material type level.

Default price control: we can set this control to standard or moving average price (covered later in detail), but this can be changed in material master to override the default settings.

Default Item category group: used to determine item category in sales documents. It can be changed in material master to override the default settings.

Other settings: whether internal/external purchase orders are allowed, special material type indicators, and a few more.

Offered material types in MM01 transaction

So a material type is assigned to materials that share the same basic settings for material master views, price control, item category group, and a few others. The material type can be assigned during the creation of the material in t-code MM01 (covered in detail later).

Where can we find a complete list of materials with their respective material type?

There are numerous transactions for this. The raw data itself is stored in the MARA table (you can view table contents with t-code SE16 or SE16N, the newer version of the transaction), but in some systems these t-codes aren’t allowed for a standard user. In such cases, we can easily get the list with t-code MM60 (Material list). MM60 is used particularly often as it displays a lot of basic material characteristics.

Selection screen – you can enter only the material number:

Selection screen for MM60 transaction

We can see that material 10410446 in plant AR01 is of type FERT (finished product).

MM60 report results with the export button highlighted

Using the toolbar button highlighted on screen, we can export the list of materials we have selected on screen.

Material group

Another characteristic assigned to an SAP material during its creation is the “material group”, which can represent a group or subgroup of materials based on certain criteria.

Which criteria can be used to create material groups?

Any criteria that suit your reporting needs are right for your system. You may group materials by the type of raw material used to produce them (different kinds of plastics used in the production process), divide all services into consulting services (with different materials for SAP consulting, IT consulting, financial consulting etc) and transportation services (internal transport, international transport), or group by production technique (materials created by welding, by extrusion, by injection etc). Grouping depends mainly on the approach your management chooses as appropriate. It is mostly done during implementation and rarely changes in a productive environment.

Assigned material group in material master

In addition, there is a material hierarchy (used mostly in sales & distribution) that can also be used for grouping, but it’s defined almost always according to sales needs as it is used for defining sales conditions (standard discounts for customers, additional discounts, special offers).

On the other hand, the material group is mainly used in the PP and MM modules.

If you need to display material groups for multiple materials, you can use the already mentioned t-code MM60. You just need to select more materials in the selection criteria.

Material group in report MM60

Material group is easily subject to mass maintenance via transaction MM17. More on that in the material master editing section.


Estimators – An Introduction To Beginners In Data Science

This article was published as a part of the Data Science Blogathon.

Not having much information about the distribution of a random variable can become a major problem for data scientists and statisticians. Consider a researcher trying to understand the distribution of choco-chips in a cookie (a very popular example of the Poisson distribution). The researcher is well aware that the number of choco-chips follows a Poisson distribution, but does not know how to estimate the parameter λ of the distribution.

A parameter is essentially a numerical characteristic of a distribution (or any statistical model in general). Normal distributions have µ & σ as parameters, uniform distributions have a & b as parameters, and binomial distributions have n & p as parameters. These numerical characteristics are vital for understanding the size, shape, spread, and other properties of a distribution. In the absence of the true value of the parameter, it seems that the researcher may not be able to continue her investigation. But that’s when estimators step in.

Estimators are functions of random variables that can help us find approximate values for these parameters. Think of these estimators like any other function, that takes an input, processes it, and renders an output. So, the process of estimation goes as follows:

1) From the distribution, we take a series of random samples.

2) We input these random samples into the estimator function.

3) The estimator function processes it and gives a set of outputs.

4) The expected value of that set is the approximate value of the parameter.


Let’s take an example. Consider a random variable X that follows a uniform distribution. The distribution of X can be represented as U[0, θ]. This has been plotted below:

(Figure A)

We have the random variable X and its distribution. But we don’t know how to determine the value of θ. Let’s use estimators. There are many ways to approach this problem. I’ll discuss two of them:

1) Using Sample Mean

We know that for a U[a, b] distribution, the mean µ is given by the following equation:

$$\mu = \frac{a + b}{2}$$

For the U[0, θ] distribution, a = 0 & b = θ, so we get:

$$\mu = \frac{\theta}{2}$$

Thus, if we estimate µ, we can estimate θ. To estimate µ, we use a very popular estimator called the sample mean. The sample mean is the sum of the values in the random sample divided by the size of the sample. For instance, if we have a random sample S = {4, 7, 3, 2}, then the sample mean is (4+7+3+2)/4 = 4 (the average value). In general, the sample mean is defined using the following notation:

$$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

Here, µ-hat is the sample mean estimator & n is the size of the random sample that we take from the distribution. A variable with a hat on top of it is the general notation for an estimator. Since our unknown parameter θ is twice µ, we arrive at the following estimator for θ:

$$\hat{\theta} = 2\hat{\mu} = \frac{2}{n} \sum_{i=1}^{n} X_i$$

We take a random sample, plug it into the above estimator, and get a number. We repeat this process and get a set of numbers. The following figure illustrates the process:

(Figure B)

The lines on the x-axes correspond to the values present in the sample taken from the distribution. The red lines in the middle indicate the average value of the sample, and the red lines at the end are twice that average value, i.e., the estimated value of θ for one sample. Many such samples are taken, and the estimated value of θ for each sample is noted. The expected value/mean of that set of numbers gives the final estimate for θ. It can be proved mathematically (using properties of expectation) that:

$$E[\hat{\theta}] = E\left[\frac{2}{n} \sum_{i=1}^{n} X_i\right] = \frac{2}{n} \sum_{i=1}^{n} E[X_i] = \frac{2}{n} \cdot n \cdot \frac{\theta}{2} = \theta$$

It is seen that the expectation of the estimator is equal to the true value of the parameter. This amazing property that certain estimators have is called unbiasedness, which is a very useful criterion for assessing estimators.
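The unbiasedness of the sample mean estimator can also be checked empirically. The following is a minimal simulation sketch, not part of the original article: the true value θ = 10, the sample size, the number of trials, and the function names are all assumptions chosen for illustration.

```python
import random


def theta_hat_mean(sample):
    """Estimate theta for U[0, theta] as twice the sample mean."""
    return 2 * sum(sample) / len(sample)


def average_estimate(theta, n, trials, seed=0):
    """Average the estimator over many independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        sample = [rng.uniform(0, theta) for _ in range(n)]
        total += theta_hat_mean(sample)
    return total / trials


TRUE_THETA = 10.0  # assumed "unknown" parameter, used only to generate data
mean_of_estimates = average_estimate(TRUE_THETA, n=5, trials=20000)
print(mean_of_estimates)  # should land close to TRUE_THETA
```

Even with a small sample size of 5, the average of the estimates should sit near the true θ, which is what unbiasedness predicts.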

2) Maximum Value Method

This time, instead of using the mean, we’ll use order statistics, particularly the nth order statistic. The nth order statistic is defined as the nth smallest value of a random sample of size n. In other words, it’s the maximum value of a random sample. For instance, if we have a random sample S = {4, 7, 3, 2}, then the nth order statistic is 7 (the largest value). The estimator is now defined as follows:

$$\hat{\theta} = X_{(n)} = \max(X_1, X_2, \ldots, X_n)$$

We follow the same procedure- take random samples, input them, collect the output and find the expectation. The following figure illustrates the process:

(Figure C)

As noted previously, the lines on the x-axes are the values present in one sample. The red lines at the end are the maximum value for that sample, i.e., the nth order statistic. Two random samples are shown for reference. However, we need to take much larger samples. Why? To see this, we’ll use the general expression for the PDF (probability density function) of the nth order statistic for a U[a, b] distribution:

$$f_{X_{(n)}}(x) = \frac{n (x - a)^{n-1}}{(b - a)^n}, \quad a \le x \le b$$

For the U[0, θ] distribution, a = 0 & b = θ, so we get:

$$f_{X_{(n)}}(x) = \frac{n x^{n-1}}{\theta^n}, \quad 0 \le x \le \theta$$

Using the integral form of the expectation of a continuous variable,

$$E[\hat{\theta}] = \int_0^{\theta} x \cdot \frac{n x^{n-1}}{\theta^n} \, dx = \frac{n}{n+1} \theta$$

The expectation is not equal to θ, so this estimator is biased. Does that mean that we cannot use this estimator? Certainly not. As noted above, the bias can be lowered significantly by taking a large n. For large values of n, n/(n+1) ≈ 1. Thus, we get:

$$E[\hat{\theta}] \approx \theta$$
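To see how the bias of the maximum-value estimator shrinks as n grows, here is a small simulation sketch. It is an illustration under assumed values (θ = 10, 5000 trials, and the three sample sizes are arbitrary choices), and it compares the simulated average with n/(n+1)·θ.

```python
import random


def theta_hat_max(sample):
    """Estimate theta for U[0, theta] as the largest value in the sample."""
    return max(sample)


def average_max_estimate(theta, n, trials, seed=0):
    """Average the maximum-value estimator over many samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        total += theta_hat_max([rng.uniform(0, theta) for _ in range(n)])
    return total / trials


TRUE_THETA = 10.0  # assumed "unknown" parameter, used only to generate data
averages = {n: average_max_estimate(TRUE_THETA, n, trials=5000) for n in (2, 10, 100)}
for n, avg in averages.items():
    # simulated average next to the theoretical expectation n/(n+1) * theta
    print(n, round(avg, 3), round(n / (n + 1) * TRUE_THETA, 3))
```

The averages stay below θ for every n but move towards it as n increases, in line with the expectation derived above.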

The Bottom Line

Hence, we have successfully solved our problem through estimators. We also learned a very important property of estimators- unbiasedness. While this may have been an extensive read, it’s imperative to acknowledge that the study of estimators is not restricted to just the above-explained concepts. Various other properties of estimators such as their efficiency, robustness, mean squared error, and consistency are also vital to deepen our understanding of them.

The media shown in this article are not owned by Analytics Vidhya and is used at the Author’s discretion. 



SAP ABAP BDC (Batch Data Communication) Tutorial

Introduction to Batch input

Batch input is typically used to transfer data from non-R/3 systems to R/3 systems or to transfer data between R/3 systems.

It is a data transfer technique that allows you to transfer datasets automatically to screens belonging to transactions, and thus to an SAP system. Batch input is controlled by a batch input session.


Batch input session

A batch input session groups a series of transaction calls together with input data and user actions. It can be used to execute a dialog transaction in batch input, where some or all of the screens are processed by the session. Batch input sessions are stored in the database as database tables and can be used within a program as internal tables when accessing transactions.

Points to note

BDI works by carrying out normal SAP transactions just as a user would, but it executes the transactions automatically. All the screen validations and business logic validations are performed while using Batch Data Input.

It is suitable for entering large amounts of data.

No manual interaction is required.

Methods of Batch Input

SAP provides two basic methods for transferring legacy data into the R/3 System.

Classical Batch Input method.

Call Transaction Method.

Classical Batch Input method

After creating the session, you can run the session to execute the SAP transactions in it.

This method uses the function modules BDC_OPEN_GROUP, BDC_INSERT and BDC_CLOSE_GROUP.

A batch input session can be processed in 3 ways:

In the foreground

In the background

During processing, with error display

You should process batch input sessions in the foreground or using the error display if you want to test the data transfer.

If you want to execute the data transfer or test its performance, you should process the sessions in the background.

Points to note about Classical Batch Input method

Asynchronous processing

Transfer data for multiple transactions.

Synchronous database update.

A batch input process log is generated for each session.

Session cannot be generated in parallel.

Call Transaction Method.

In this method, the ABAP/4 program uses the CALL TRANSACTION USING statement to run an SAP transaction.

The entire batch input process takes place online in the program.

Points to Note:

Faster processing of data

Synchronous processing

Transfer data for a single transaction.

No batch input processing log is generated.

Batch Input Procedures

You will typically observe the following sequence of steps to develop Batch Input for your organization

Analyze the legacy data. Determine how the data to be transferred is to be mapped into the SAP structure. Also take note of necessary data type or data length conversions.

Generate SAP data structures for use in export programs.

Export the data into a sequential file. Note that character format is required by predefined SAP batch input programs.

If the SAP supplied BDC programs are not used, code your own batch input program. Choose an appropriate batch input method according to the situation.

Process the data and add it to the SAP System.

Analyze the process log. For the CALL TRANSACTION method, where no proper log is created, use the messages collected by your program.

From the results of the process analysis, correct and reprocess the erroneous data.

Writing BDC program

You may observe the following process to write your BDC program

Analyze the transaction(s) to process batch input data.

Decide on the batch input method to use.

Read data from a sequential file

Perform data conversion or error checking.

Store the data in the batch input structure, BDCDATA.

Generate a batch input session for classical batch input, or process the data directly with the CALL TRANSACTION USING statement.

Batch Input Data Structure

Declaration of batch input data structure


Field name   Type   Description
PROGRAM      CHAR   Module pool
DYNPRO       NUMC   Dynpro number
DYNBEGIN     CHAR   Starting a dynpro
FNAM         CHAR   Field name
FVAL         CHAR   Field value

The order of fields within the data for a particular screen is not of any significance
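The layout of the batch input data can be pictured as a list of rows, where each screen starts with one row naming the module pool and dynpro, followed by one row per field. The sketch below models this in Python purely as a language-neutral illustration. It is not ABAP and not an SAP API. The `BdcRow` class and the helper functions are hypothetical, and the program, screen, and field values are example entries assumed for a material creation screen.

```python
from dataclasses import dataclass


@dataclass
class BdcRow:
    """One row of the batch input table; every value is held as text."""
    program: str = ""   # module pool, filled only on a screen-start row
    dynpro: str = ""    # dynpro (screen) number, filled only on a screen-start row
    dynbegin: str = ""  # "X" marks the start of a new screen
    fnam: str = ""      # field name
    fval: str = ""      # field value


def screen(program, dynpro):
    """Row that starts a new screen."""
    return BdcRow(program=program, dynpro=dynpro, dynbegin="X")


def field(fnam, fval):
    """Row that fills one field on the current screen."""
    return BdcRow(fnam=fnam, fval=str(fval))


# Example values only: assumed module pool, screen number and field names.
bdcdata = [
    screen("SAPLMGMM", "0060"),
    field("RMMG1-MATNR", "DEMO-MATERIAL"),
    field("RMMG1-MTART", "FERT"),
    field("BDC_OKCODE", "/00"),
]

for row in bdcdata:
    print(row)
```

Because the order of fields within one screen does not matter, the field rows after the screen-start row could be listed in any order.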

Points to Note

While populating the BDC data, make sure that you take the user settings into consideration. This is especially relevant for fields that involve numbers (such as quantity or amount). The user setting decides which grouping character is used for numbers. E.g., the number fifty thousand can be written as 50,000.00 or 50.000,00 depending on the user setting.

Condense the FVAL field for amount and quantity fields so that they are left-aligned.

Note that all the fields that you populate through BDC should be treated as character type fields while populating the BDC data table.

In some screens, when you populate values in a table control using BDC, you have to note how many rows are present at the default screen size and code for that many rows. If you have to populate more rows, you have to code for “Page down” functionality, as you would do when populating the table control manually.

The number of lines that appear in the above scenario differs based on the screen size that the user uses. So always code for the standard screen size and make your BDC always run in the standard screen size, irrespective of the screen size the user has set.
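The point about number formats can be illustrated with a small sketch. It is written in Python as a language-neutral illustration rather than ABAP. The function name and the two notation labels ("point" and "comma") are hypothetical and only stand in for the user's decimal notation setting.

```python
def format_amount(value, decimal_notation):
    """Return an amount as a left-aligned character string.

    decimal_notation is a hypothetical label for the user setting:
    "point" gives 50,000.00 and "comma" gives 50.000,00.
    """
    text = f"{value:,.2f}"  # e.g. 50,000.00
    if decimal_notation == "comma":
        # swap the grouping and decimal characters
        text = text.translate(str.maketrans(",.", ".,"))
    elif decimal_notation != "point":
        raise ValueError("unknown decimal notation: " + decimal_notation)
    return text.strip()  # no leading blanks, so the value is left-aligned


print(format_amount(50000, "point"))
print(format_amount(50000, "comma"))
```

The same amount produces two different character strings, which is why a program that hard-codes one notation can fail for users with the other setting.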

Creating Batch Input Session

Open the batch input session using function module BDC_OPEN_GROUP.

For each transaction in the session:

Fill the BDCDATA with values for all screens and fields processed in the transaction.

Transfer the transaction to the session with BDC_INSERT.

Close the batch input session with BDC_CLOSE_GROUP

Batch Input Recorder

Begin the batch input recorder by selecting the Recording pushbutton from the batch input initial screen.

The recording name is a user defined name and can match the batch input session name which can be created from the recording.

Enter a SAP transaction and begin posting the transaction.

After you have completed posting a SAP transaction you either choose Get Transaction and Save to end the recording or Next Transaction and post another transaction.

Once you have saved the recording you can create a batch input session from the recording and/or generate a batch input program from the recording.

The batch input session you created can now be analyzed just like any other batch input session.

The program which is generated by the function of the batch input recorder is a powerful tool for the data interface programmer. It provides a solid base which can then be altered according to customer requirements.

Introduction To Deep Learning In Julia

This article was published as a part of the Data Science Blogathon


At present, the data science field is dominated by Python and R, but a competitor joined them not so long ago: Julia, which we will explore in this guide. The famous quote (motto) of Julia is:

Looks like Python, runs like C

We know that Python is used for a wide range of tasks. Julia, on the other hand, was primarily developed to perform scientific computation, machine learning, and statistical tasks.

Since Julia was explicitly made for high-level statistical work and scientific computation, it has several benefits over Python. In linear algebra, for example, “vanilla” (raw) Julia performs better than “vanilla” (raw) Python. This is primarily because base Python has no built-in matrix type or linear algebra routines, whereas Julia ships with multidimensional arrays and the matrix operations used in machine learning.

While Python is a great language with its library NumPy, Julia outperforms it when it comes to the experience without extra packages, as Julia caters more directly to machine learning tasks and computations.

Table of contents




Cuda Arrays

Automatic Differentiation

Training a Classifier


So, let’s get started!


This guide is meant to get you started with the mechanics of Flux so that you can start building models right away. While it is loosely based on a tutorial by PyTorch, it covers all the necessary areas. It introduces basic Julia programming, as well as Zygote, a source-to-source automatic differentiation (AD) framework in Julia. Using these tools, we will build a simple neural network and, in the end, a CNN that we will train to classify between 10 classes.

What is Flux in Julia?

Flux is an open-source machine learning library written entirely in Julia. The stable release we will be using is v0.12.4. As you might expect, it offers a layer-stacking interface for simple models and, instead of a monolithic design, relies on strong interoperability with other Julia packages. For example, if we need GPU support, we can get it directly through CuArrays. This is in contrast to frameworks that are implemented in other languages and only bound to Julia, such as TensorFlow (Julia package), which are more or less limited by the functionality present in the underlying implementation.

Installation of Julia

Before we move further, if you don’t have Julia installed on your system, it can be downloaded from its official site.

To use Julia in Jupyter notebook like Python, we only need to add the IJulia package as follows and we can run Julia right from the jupyter notebook.

using Pkg
Pkg.add("IJulia")

We can use Julia as we used Python in Jupyter notebook for exploratory data analysis.

Arrays in Julia

Before moving on to the framework, we need to understand the basic building blocks of a deep learning framework: arrays, CUDA arrays, and so on. In this section, I’ll explain the basics of arrays in Julia.

Here’s a vector, a one-dimensional array with just three elements.

x = [10,12,16]

Here’s a matrix – a square array with four elements.

x = [10 12; 13 14]

We can also create larger arrays. Here’s a 12×2 matrix with 24 elements, each a random number ranging from 0 to 1.

x = rand(12, 2)

rand is not the only function that can create a matrix (array); we can also use functions like ones, zeros, or randn. Try them out in the Jupyter notebook to see what they do.

By default, Julia stores floating-point numbers in a high-precision format called Float64. In machine learning we often don’t need that many digits, so we can ask Julia for Float32 instead, or, if we need higher precision than 64 bits, we can use BigFloat. Below are examples of random matrices of 6×3 = 18 elements, first of BigFloat and then of Float32.

x = rand(BigFloat, 6, 3)
x = rand(Float32, 6, 3)

To count the number of elements in a matrix we can use the length function.

length(x)

Or, if we need the size, we can check it more explicitly.

size(x)


We can do many sorts of algebraic operations on matrices. For example, we can add two matrices

x + x

Or subtract them.

x - x

Julia supports a feature called broadcasting, using the “.” syntax. broadcast() is a built-in Julia function that applies a function f over collections, arrays, or tuples, and the dot syntax is a concise way to write the same thing. For example, f.(a, b) means “apply f elementwise to a and b”. We can use broadcasting to add 1 to every element of x.

x .+ 1

Finally, we use matrix multiplication more or less every time we do machine learning. It is super easy in Julia.

W = randn(4, 10)
x = rand(10)
W * x

CUDA Arrays in Julia

CUDA functionality is provided separately by the CUDA package for Julia. If you have a GPU and CUDA available, you can run ] add CUDA in IJulia (Jupyter notebook) to get it. Once CUDA is installed (compatible versions are below Julia 1.6), we can transfer our arrays into CUDA arrays (that is, onto the GPU) using the cu function. A CUDA array supports all the basic functionality of an array but works on the GPU.

The CuArray type is the main interface for working with arrays on GPU hardware. In this section, I will briefly demonstrate the use of the CuArray type. Since CUDA’s functionality is exposed by implementing existing Julia interfaces on the CuArray type, we should refer to the upstream Julia documentation for more in-depth information on these operations.

import Pkg
Pkg.add("CUDA")

Import the library and convert the matrix into a CuArray. These CuArrays now run on the GPU, which for large array operations is typically much faster than the CPU, and we barely had to do anything.

using CUDA
x = cu(rand(6, 3))

Flux.jl in Julia

Flux is a library or package in Julia specifically for machine learning. It comes with a vast range of functionality that helps us harness the full potential of Julia without getting our hands messy (for example, automatic differentiation). Flux.jl follows a few key principles:

Do the obvious thing – writing down the mathematical form directly will work and will be fast.

You could have written Flux from scratch – from LSTMs to GPU kernels, it is all straightforward Julia code. Whenever in doubt, you can always look at the documentation. If you need something different, you can easily write your own code.

Integrates nicely with others – Flux works well with Julia libraries from Data Frames to Images (Images package) and even differential equations solver (another package in Julia for computation), so you can easily build complex data processing pipelines that integrate Flux models.


You can add Flux using Julia’s package manager, by typing ] add Flux in the Julia prompt, or use

import Pkg
Pkg.add("Flux")

Automatic Differentiation

Automatic differentiation (AD), also called algorithmic differentiation or simply “auto diff”, is used to compute derivatives of functions. It is a family of techniques, similar to but more general than backpropagation, for efficiently evaluating derivatives of numeric functions expressed as computer programs.

One probably has learned to differentiate functions in calculus classes but let’s recap it in Julia code.

f(x) = 4x^2 + 12x + 3
f(4)

In simple cases like this one, we can easily find the gradient by hand; here it is 8x + 12. But it’s much faster and more efficient to have Flux do it for us!

using Flux: gradient
df(x) = gradient(f, x)[1]
df(4)

We can cross-check with a few more inputs to see that the gradient calculated by Flux is correct and is indeed 8x + 12. We can also differentiate more than once; since f is a quadratic, its second derivative is just the constant 8.

ddf(x) = gradient(df, x)[1]
ddf(4)

As long as the mathematical functions we create in Julia are differentiable, Flux’s automatic differentiation can handle the code we throw at it, including recursion, loops, and even custom layers. For example, we can differentiate the Taylor series approximation of the sin function.

mysin(x) = sum((-1)^k*x^(1+2k)/factorial(1+2k) for k in 0:6)
x = 0.6
mysin(x), gradient(mysin, x)
sin(x), cos(x)

As expected, the derivative is numerically very close to cos(x) (which is the derivative of sin(x)).
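The same check can be reproduced without any deep learning library. The sketch below is a Python illustration, not Flux and not automatic differentiation: it approximates the derivative with a central finite difference, which is a different and less exact technique, only to confirm that the slope of the truncated series matches cos(x). The step size h is an assumed value.

```python
import math


def mysin(x, terms=7):
    """Taylor series approximation of sin(x), using terms k = 0..6 by default."""
    return sum((-1) ** k * x ** (1 + 2 * k) / math.factorial(1 + 2 * k)
               for k in range(terms))


def central_difference(f, x, h=1e-5):
    """Numerical derivative of f at x (finite difference, not autodiff)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 0.6
approx_derivative = central_difference(mysin, x)
print(mysin(x), approx_derivative)
print(math.sin(x), math.cos(x))
```

Unlike the finite difference used here, automatic differentiation gives the derivative to working precision without having to choose a step size.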

What if instead of just taking a single number as input, we take arrays as inputs? This gets more interesting as we proceed further. Let’s take an example where we have a function that takes a matrix and two vectors.

myloss(W, b, x) = sum(W * x .+ b) # calculating loss
W = randn(3, 5)
b = zeros(3)
x = rand(5)
gradient(myloss, W, b, x)

Now we get gradients for each of the inputs W, b, and x, which will come in very handy when we have to train our model. Since machine learning models can contain hundreds or thousands of parameters, Flux provides a slightly different way of writing gradient. Just like in other deep learning frameworks, we mark our arrays with params to indicate that we want their gradients. W and b represent the weight and bias respectively.

using Flux: params
W = randn(3, 5)
b = zeros(3)
x = rand(5)
y(x) = sum(W * x .+ b)
grads = gradient(() -> y(x), params([W, b]))
grads[W], grads[b]

Using those parameters we can now get the gradients of W and b directly. This is especially useful when we are working with layers. Think of a layer as a container for parameters. For example, the Dense layer from Flux performs the familiar linear transform.

using Flux
m = Dense(10, 5)
x = rand(Float32, 10)

To get the parameters of any layer or model, we can always simply use params from Flux.

params(m)


So even if our network has many parameters, we can easily calculate the gradient for all of them.

x = rand(Float32, 10) # random input array
m = Chain(Dense(10, 5, relu), Dense(5, 2), softmax) # build the model from layers
l(x) = sum(Flux.crossentropy(m(x), [0.5, 0.5])) # loss function
l(x)

We don’t explicitly have to use layers but sometimes they can be very convenient for many simple kinds of models and faster iterations.

The next step is to update the weights of the network and perform optimization using one of several algorithms. The first optimization algorithm that comes to mind is gradient descent, because of its simplicity. We take each weight and step it against its gradient, scaled by a learning rate (a hyperparameter): weights = weights – learning_rate × gradient.

using Flux.Optimise: update!, Descent
η = 0.1 # learning rate
for p in params(m)
    # update each parameter p in place using its gradient
end
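The update rule itself can be tried on a function whose gradient we already know. The following is a minimal sketch in plain Python, not Flux code: it minimises the earlier example f(x) = 4x^2 + 12x + 3 using its hand-derived gradient 8x + 12. The starting point and the number of steps are assumptions, and the learning rate 0.1 mirrors the value of η used above.

```python
def f(x):
    """Example function from the automatic differentiation section."""
    return 4 * x ** 2 + 12 * x + 3


def df(x):
    """Gradient of f, derived by hand."""
    return 8 * x + 12


def gradient_descent(start, learning_rate, steps):
    """Repeatedly apply x = x - learning_rate * gradient."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * df(x)
    return x


x_min = gradient_descent(start=4.0, learning_rate=0.1, steps=50)
print(x_min, f(x_min))
```

The iterate settles where the gradient is zero, at x = -1.5. A neural network applies the same rule to every weight array at once instead of to a single number.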

While the method above for updating the parameters in place using gradients is valid, it gets more complicated as the algorithms become more involved. Here, Flux comes to the rescue with its prebuilt set of optimizers, which makes our work much easier. All we need to do is give the algorithm a learning rate.

opt = Descent(0.01)

So training a network finally reduces to iterating over the given dataset multiple times (epochs) and performing all the steps in order (given below in code). For the sake of simplicity and clarity, let’s do a quick implementation in Julia and train a network that learns to predict 0.5 for every input of 10 floats. Flux has a function called train! to do all this for us.

data, labels = rand(10, 100), fill(0.5, 2, 100) # dataset
loss(x, y) = sum(Flux.crossentropy(m(x), y)) # creating loss function
Flux.train!(loss, params(m), [(data, labels)], opt) # training the model

You don’t have to use train!. In cases where arbitrary logic might be better suited, you could open up the training loop like so:

for d in training_set # assuming d looks like (data, labels)
    # our logic here
    gs = gradient(params(m)) do # m is our model
        l = loss(d...)
    end
    update!(opt, params(m), gs)
end

This concludes the basics of Flux usage. In the next section, we will put them to use to train a classifier for the CIFAR10 dataset.

Training a Classifier for the Deep Learning Model

Getting a real classifier to work might help cement the workflow in Julia a bit more. CIFAR10 is a dataset of 50k tiny training images split into 10 classes such as dogs, birds, and deer. See the image below for more details.

We will do the following steps in order to get a classifier trained –

Load the dataset of CIFAR10 (both training and test dataset)

Create a Convolution Neural Network (CNN)

Define a loss function to calculate losses

Use training data to train our network

Evaluate our model on the test dataset

Here are some useful libraries to install before we proceed. Installation is simple but might take a few minutes to complete.

] add Metalhead # to get the data
] add Images # image processing package
] add ImageIO # to output images

Loading the Dataset

Metalhead.jl is an excellent package that has tons of classic predefined and pre-trained CV (computer vision) models. It also comes with a variety of data loaders that come in handy when loading datasets.

using Statistics
using Flux, Flux.Optimise # deep learning framework
using Metalhead, Images # to load dataset
using Metalhead: trainimgs
using Images.ImageCore # to work on image processing
using Flux: onehotbatch, onecold # to encode
using Base.Iterators: partition
using CUDA # for GPU functionality

This image will give us an idea of the different types of labels we are dealing with.

# download the dataset CIFAR10
X = trainimgs(CIFAR10) # take the training dataset as X
labels = onehotbatch([X[i].ground_truth.class for i in 1:50000], 1:10) # encode the dataset

To get more information about what we are dealing with let’s take a look at a random image from the dataset.

image(x) = x.img  # handy for use later
ground_truth(x) = x.ground_truth
image.(X[rand(1:end, 10)])  # to show the images in IJulia itself

Each image is a 32x32x3 array: three 32x32 RGB layers that together make up the image we see above. Since the dataset is large, we pass it to the model in batches of 1000 and keep a set aside for validation to evaluate our model. Passing data in batches like this is called mini-batch learning and is very popular in machine learning. In layman’s terms, rather than sending the entire dataset, which is big and might not fit in RAM, we break it into small packets (mini-batches), usually chosen randomly, and train our model on them one at a time. Mini-batches are also observed to help with escaping saddle points (minimax points on the loss surface).

First, we define a ‘getarray’ function that would help in converting the matrices to Float type.

getarray(X) = float.(permutedims(channelview(X), (2, 3, 1)))  # get the matrix to float type
imgs = [getarray(X[i].img) for i in 1:50000]  # get all the matrices into float

With a batch size of 1000, the first 49,000 images make up our training set and the remaining 1,000 are saved for validation. To achieve this we can use the ‘partition’ function, which handily breaks the set we give it into consecutive parts of 1000, and to concatenate we use the ‘cat’ function along any dimension.

valset = 49001:50000
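To make the index arithmetic of the split concrete, here is a minimal sketch written in Python rather than Julia, purely as an illustration. The partition helper below is a hypothetical stand-in that mimics what Base.Iterators.partition does for consecutive chunks; it is not part of Flux or Metalhead.

```python
from itertools import islice


def partition(indices, size):
    """Yield consecutive chunks of `size` items (the last chunk may be shorter)."""
    it = iter(indices)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Julia indices are 1-based: images 1..49000 train, 49001..50000 validate.
train_batches = list(partition(range(1, 49001), 1000))
val_indices = list(range(49001, 50001))

print(len(train_batches), len(train_batches[0]), len(val_indices))
```

Each of the 49 chunks corresponds to one mini-batch of image indices, and the last 1,000 indices are held out for validation.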

Defining the Classifier

Now comes the part where we can define our Convolutional Neural Network (CNN).

A convolutional neural network is one that defines a kernel and slides it across a matrix to create an intermediate representation from which features are extracted. As it goes into deeper layers it creates higher-order features, which makes it suitable for images (although it can be used in plenty of other situations), where the structure of the subject is what helps us determine which class it belongs to.

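m = Chain(  # creating a CNN
    Conv((5, 5), 3 => 16, relu),  # first layer of CNN
    MaxPool((2, 2)),
    Conv((5, 5), 16 => 8, relu),  # second layer of CNN
    MaxPool((2, 2)),
    x -> reshape(x, :, size(x, 4)),  # flatten the 5x5x8 feature maps to 200 values per image
    Dense(200, 120),  # first layer
    Dense(120, 84),  # second layer
    Dense(84, 10),  # third and final layer with 10 classification labels
    softmax) |> gpu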

Whenever we have to work with data that has multiple independent classes, cross-entropy comes in handy. Momentum, as the name suggests, accumulates a running average of past gradients so that updates keep moving in a consistent direction. This damps the oscillations that occur when a step overshoots the desired destination and helps the optimizer roll through shallow local minima, while the learning rate itself stays fixed.

using Flux: crossentropy, Momentum  # import the loss function and the optimizer
loss(x, y) = sum(crossentropy(m(x), y))  # loss function
opt = Momentum(0.01)  # learning rate of 0.01; the momentum coefficient keeps its default

Before starting our training loop, we need a basic accuracy measure to keep track of our progress. We can write a custom function for this.

accuracy(x, y) = mean( onecold(m (x), 1:10) .== onecold(y, 1:10))
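The accuracy function above compares the position of the largest score in each prediction with the position of the 1 in the corresponding one-hot label. As an illustration only, the same idea can be sketched in plain Python; the onecold helper here is a hypothetical re-implementation for lists, and samples are stored as rows rather than as columns as in Flux.

```python
def onecold(scores):
    """Return the 1-based index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i]) + 1


def accuracy(predictions, targets):
    """Fraction of samples whose predicted class equals the true class."""
    hits = sum(onecold(p) == onecold(t) for p, t in zip(predictions, targets))
    return hits / len(predictions)


# Four samples, three classes; made-up scores for illustration.
preds = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6], [0.5, 0.4, 0.1]]
truth = [[0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(accuracy(preds, truth))  # two of the four predictions match: 0.5
```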

Training the Classifier

This is the part where we finally stitch everything together and run the operations we defined previously to see what our model is capable of. For this tutorial we will make only 10 passes over the dataset (epochs); for greater accuracy you can increase the number of epochs and play with the hyperparameters a bit.

epochs = 10  # number of passes over the dataset
for epoch = 1:epochs
    for d in train
        gs = gradient(params(m)) do
            l = loss(d...)  # calculate losses
        end
        update!(opt, params(m), gs)  # update the weights
    end
    @show accuracy(valX, valY)  # show the accuracy of the model after each epoch
end

The accuracy printed after each epoch gives us a brief idea of how the network was learning the function. The result is not bad at all for a small model with no hyperparameter tuning and only a few epochs.

Training on a GPU

Testing the Network

We have trained our neural network for 10 passes over the training dataset, but we still need to check whether the model has learned anything at all. To do so, we predict the class label from the network’s output for each image and check it against the true class label. If the prediction is correct, we count that sample as a correct prediction. This is done on a part of the data the model has not yet seen.

First, we apply the same image processing as we did for the training data set so that the two can be compared side by side.

valset = valimgs(CIFAR10)  # validation set
valimg = [getarray(valset[i].img) for i in 1:10000]  # get them to array
labels = onehotbatch([valset[i].ground_truth.class for i in 1:10000], 1:10)  # encode them
test = gpu.([(cat(valimg[i]..., dims = 4), labels[:, i]) for i in partition(1:10000, 1000)])

Next, we display some of the images from our validation dataset.

ids = rand(1:10000, 10)  # random image ids
image.(valset[ids])  # show images in vector form

We have 10 values as the output for all 10 classes. If the particular value is higher for a class, our network thinks that image is from that particular class. The below image shows the values (energies) in 10 floats and every column corresponds to the output of one image.

Let’s see how our model fared on the dataset.

rand_test = getarray.(image.(valset[ids]))  # get the test images
rand_truth = ground_truth.(valset[ids])  # check the values against true values
m(rand_test)

This looks very similar to what we would expect. Even after such a short training period, let’s see how our model actually performs on new data (prepared by us above).

accuracy(test[1]...)  # testing accuracy

49% is clearly much better than the 10% chance of guessing correctly at random (since we have 10 classes), which is not bad at all for a small hand-coded model without hyperparameter tuning like ours.

Let’s take a look at how the net performed on each class individually.

class_correct = zeros(10)  # creating an array of zeros
class_total = zeros(10)
for i in 1:10
    preds = m(test[i][1])  # prediction after feeding the batch to our model
    lab = test[i][2]
    for j = 1:1000
        pred_class = findmax(preds[:, j])[2]  # find the argmax for each sample
        actual_class = findmax(lab[:, j])[2]  # true value of class
        if pred_class == actual_class  # if both are equal, increment by 1
            class_correct[pred_class] += 1
        end
        class_total[actual_class] += 1
    end
end
class_correct ./ class_total  # fraction of correct predictions per class

The spread seems pretty good, but some classes are performing significantly better than others. It is left for the reader to explore the reason.


In this article, we learned how powerful Julia is when it comes to computation. We learned about the Flux package and how to use it to train our hand-written model to classify images into 10 different classes in just a few lines of code, and on a GPU at that. We also learned about CuArrays and their significance in decreasing computation time. I hope this article has been helpful in starting your journey with Flux (Julia).

Thanks to Mike Innes, Andrew Dinhobl, Ygor Canalli et al. for valuable documentation. Reach out to me via LinkedIn (Nihal Singh).


Correspondence in SAP – Configuration & Types

There are various standard correspondence types available like invoice print, account statement etc. Custom correspondence types can also be created.

Correspondences can be created at the time of particular business transaction processing or at a later stage for already created transaction postings.

Correspondence consists of letters and similar documents sent from SAP to a customer or vendor. It can be sent in various formats, such as email and fax.

Correspondence can be created individually or collectively, ad-hoc or via automated batch job.

Types of correspondence

The following is an example list of standard correspondence types, which can be copied to create a specific custom form, program, etc.

Correspondence Type | Correspondence Description | Print Program | Required Data | Sample standard SAP Script Form
SAP01 | Payment notices | RFKORD00 | Document number | F140_PAY_CONF_01
SAP06 | Account statements | RFKORD10 | Account number and date | F140_ACC_STAT_01
SAP07 | Bill of exchange charges statements | RFKORD20 | Document number | F140_BILL_CHA_01
SAP09 | Internal documents | RFKORD30 | Document number | F140_INT_DOCU_01
SAP10 | Individual letters | RFKORD40 | Account number | F140_IND_TEXT_01
SAP11 | Document extracts | RFKORD50 | Document number | F140_DOCU_EXC_01
SAP13 | Customer statements | RFKORD11 | Customer number and date | F140_CUS_STAT_01

How to do Correspondence configuration

Correspondence in SAP can be configured in the following steps.

Step 1) Define Correspondence Type

Transaction Code: OB77

Here various SAP standard correspondence types are available. You can also create your own custom correspondence types. You can specify what data is required for generating a correspondence; e.g. for an account statement you can specify that the customer/vendor master is necessary. You can also specify the date parameters and the text to appear for date selection.

Step 2) Assign Program to Correspondence Type

Transaction Code: OB78

Here you need to link the correspondence generator program to the correspondence type. You can also specify different programs for different company codes. (You can also specify here the default variant with which the program executes. You can create such a variant for the program from transaction SE38/SA38.)

You can also create your own custom program as a copy of the standard program and make suitable changes to meet any client-specific need.

Step 3) Determine Call-Up Functions for Correspondence Type

Transaction Code: OB79

Here you need to specify at what point in time the particular correspondence type can be generated. You can also specify a different setting for different company codes. The various options available are:

At the time of posting payments (e.g. F-28, F-26, etc.)

At the time of document display or change (e.g. FB02, FB03, etc.)

At the time of account display (e.g. FBL1N, FBL5N, etc.)

Step 4) Assign Correspondence Form to Correspondence Print Program

Transaction Code: OB96

Here you need to specify which form definition will be used by the correspondence print program. You can also specify a different setting for different company codes. (The SAP Script form is defined using transaction SE71, where the various data is arranged in the output format. This SAP Script form defines the layout of the output.)

You can also use two-digit form IDs, with which you can call different forms in the same company code.

This form ID can be given in the selection screen of the print program generating the correspondence. You can select only one form ID at a time for a correspondence type. You can create multiple correspondence types, triggering different form IDs.

Step 5) Define Sender Details for Correspondence

Transaction Code: OBB1

Here you can link the details for header, footer, signature, and sender. This text is defined using transaction SO10 with the text ID linked above (e.g. ADRS). You can also specify a different setting for different company codes. (A two-digit sender variant can also be defined, which you can enter in the selection parameters of the print program. This enables different sender details within the same company code.)

Step 6) Define Sort Variants for Correspondence

Transaction Code: O7S4

Here you can specify the order in which the correspondence letters are generated. E.g. if you are generating account statements for multiple vendors, the vendors are sorted in this order and then the letters are generated. This sort variant can be given in the selection screen of the print program generating the correspondence.

Step 7) Define Sort Variants for Line Items in Correspondence

Transaction Code: O7S6

Here you can specify the order in which the line items appear in a correspondence letter. E.g. if a vendor account statement has multiple invoices, the invoices are sorted in this order and then the letter is generated.

This sort variant can be given in the selection screen of the print program generating the correspondence.

Correspondence Generation

As shown earlier while configuring the call-up point, the correspondence can be generated at the following points in time:

At the time of posting payments (e.g. F-28, F-26, etc.)

At the time of document display or change (e.g. FB02, FB03, etc.)

At the time of account display (e.g. FBL1N, FBL5N, etc.)

Correspondence can be generated for a particular document or for vendor/customer accounts. The following sections explain the different ways of generating correspondence and how to print it.

Correspondence Generation (Method A):-

The correspondence can be generated while you create, change or display the document.

Similarly, you can create the correspondence from the document and account display/change transactions, such as FB02, FB03, FBL1N, FBL5N, etc.

Correspondence Generation (Method B)

For existing accounting documents you can use transaction code FB12.

Here, after you enter the company code, it will ask for the correspondence type. Select the correspondence type and it will ask you to enter the document number, account number, etc., based on the correspondence type setting. After this, the correspondence is requested.

Correspondence Generation (Method C)

From transaction F.27, you can generate the correspondence (Account Statement) for vendor(s) / customer(s).

If you select the “Individual Request” check box and the same vendor/customer has line items in multiple company codes, a separate statement is generated for each company code.

Correspondence Printing

Correspondence Printing (Method A):-

Use transaction code F.61 to print the relevant correspondence type already generated. On execution, it simply prints the correspondence. (If email, fax, etc. is configured, the output is generated in that format.)

Correspondence Printing (Method B):-

From transaction F.64, you can see the generated correspondence letter (spool) and print it. (The difference from F.61 is that in F.64 you can also perform other operations, such as delete and print preview, on correspondence requests already generated.)

Correspondence Via Email

If email is selected as the standard communication method for a customer/vendor and an email address is maintained, the correspondence for this customer/vendor is generated in email format instead of print output (subject to the user exit setting that determines the method of communication).

(Note: Similarly, you can set up fax output by selecting FAX as the standard communication method and maintaining a fax number.)

Python and MySQL: A Practical Introduction for Data Analysis

This article was published as a part of the Data Science Blogathon


Let’s look at a practical example of how to run SQL queries against a MySQL server from Python code: CREATE, SELECT, UPDATE, JOIN, etc.

Most applications interact with data in some form. Therefore, programming languages (Python is no exception) provide tools for storing data and accessing it. MySQL is one of the most popular and feature-rich database management systems (DBMS). Last year it was ranked second after Oracle in the database rankings.

Using the techniques described in this tutorial, you can effectively integrate a MySQL database into your Python application. In this tutorial, we will develop a small MySQL database for a movie rating system and learn how to grab data from it using Python code.

By the end of this tutorial, you will know how to:

Connect your application to a MySQL database

Query the database to retrieve the data you need

Handle exceptions thrown when accessing the database

Comparing MySQL to Other SQL Databases

SQL stands for Structured Query Language, a widely used programming language for managing relational databases. You may have heard of various SQL-based DBMSs: MySQL, PostgreSQL, SQLite, and SQL Server. All of these databases comply with SQL standards but differ in detail.

Because it is open source, MySQL quickly became a market leader among SQL solutions. MySQL is currently used by many well-known tech firms such as Google, LinkedIn, Uber, Netflix, Twitter, and more.

Besides the support from the open-source community, there are other reasons for MySQL’s success:

Easy to install- MySQL is designed to be user-friendly. The database is easy to create and customize. MySQL is available for major operating systems including Windows, macOS, Linux, and Solaris.

Speed- MySQL has a reputation for being a fast database solution. This DBMS also scales well.

User rights and security- MySQL allows you to set password security levels, add and remove privileges to user accounts. User rights management looks much simpler than in many other DBMS such as PostgreSQL, where managing configuration files requires some skill.

Installing MySQL Server and MySQL Connector

MySQL Server and MySQL Connector are the only two pieces of software you need to get started with this tutorial. MySQL Server provides the resources needed to work with the database. After starting the server, you should be able to connect your Python application to it using MySQL Connector/Python.

Installing MySQL Server

The official documentation describes the recommended ways to download and install MySQL Server. There are instructions for all popular operating systems, including Windows, macOS, Solaris, Linux, and many more.

For Windows, your best bet is to download the MySQL installer and let it take care of the process. The Installation Manager will also help you configure the security settings for your MySQL server. On the accounts page, you will need to enter a password for the root account and, if desired, add other users with different privileges.

Setting up a MySQL account

Other useful tools such as MySQL Workbench can be customized using the installers. A convenient alternative to installing on an operating system is to deploy MySQL using Docker.

Installing MySQL Connector / Python

A database driver is software that allows an application to connect to and interact with a DBMS. These drivers are usually supplied as separate modules. The standard interface that all Python database drivers must conform to is described in PEP 249. To install the driver (connector), we will use the pip package manager:

pip install mysql-connector-python

pip will install the connector into the currently active environment. To work with a project in isolation, we recommend setting up a virtual environment.

Let’s check the installation by running the following statement in the Python interpreter:

import mysql.connector

If the import statement runs without errors, mysql.connector is successfully installed and ready to use.

Establishing a connection to the MySQL server

MySQL is a server-based database management system. One server can contain multiple databases. To interact with a database, we must first establish a connection to the server. Step by step, the interaction of a Python program with a MySQL database looks like this:

We connect to the MySQL server.

We create a new database (if necessary).

We connect to the database.

We execute the SQL query, collect the results.

We commit the transaction if changes have been made to a table.

Finally, we close the connection to the MySQL server.

Whatever the application, the first step is to link the application and database together.
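The last three steps of the list above (execute, commit, close) can be tried without a MySQL server. The following is a minimal sketch that uses the standard library's sqlite3 module as a stand-in, on the assumption that both drivers follow the PEP 249 interface closely enough for the pattern to carry over. The table and titles are illustrative only.

```python
import sqlite3
from contextlib import closing

# sqlite3 stands in for a MySQL connection here; this is not MySQL itself.
with closing(sqlite3.connect(":memory:")) as connection:
    cursor = connection.cursor()
    cursor.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")

    cursor.execute("INSERT INTO movies (title) VALUES ('Gladiator')")
    connection.commit()    # make the change permanent

    cursor.execute("INSERT INTO movies (title) VALUES ('Skyfall')")
    connection.rollback()  # discard the uncommitted insert

    cursor.execute("SELECT title FROM movies")
    titles = [row[0] for row in cursor.fetchall()]
# leaving the with block closes the connection

print(titles)
```

Only the committed row survives, which is why a missing commit is a common reason for inserted data seeming to disappear.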

Connecting to MySQL Server from Python

To establish a connection, use connect() from the mysql.connector module. This function takes the parameters host, user, and password and returns a MySQLConnection object. Credentials can be obtained from user input:

from getpass import getpass
from mysql.connector import connect, Error

try:
    with connect(
        host="localhost",
        user=input("Username:"),
        password=getpass("Password:"),
    ) as connection:
        print(connection)
except Error as e:
    print(e)

The MySQLConnection object is stored in the connection variable, which we will use to access the MySQL server. A few important points:

Wrap all database connections in try … except blocks. This makes it easier to catch and examine any exceptions.

Remember to close the connection after you finish accessing the database. Unused open connections lead to unexpected errors and performance problems. The code uses a context manager (with … as …) for this.

You should never embed credentials (username and password) in string form in a Python script. This is bad deployment practice and poses a serious security risk. The code above asks for your login credentials. For this, the built-in getpass module is used to hide the entered password.
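For scripts that run unattended, one common alternative to prompting is to read the credentials from environment variables. The sketch below assumes hypothetical variable names (MYSQL_HOST, MYSQL_USER, MYSQL_PASSWORD) chosen for illustration; the connector does not read them on its own.

```python
import os
from getpass import getpass


def load_credentials(env=os.environ, prompt=getpass):
    """Build connect() keyword arguments without hard-coding secrets.

    The variable names are assumptions made for this sketch.
    """
    password = env.get("MYSQL_PASSWORD")
    if password is None:
        password = prompt("Password: ")  # fall back to a hidden prompt
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "user": env.get("MYSQL_USER", "root"),
        "password": password,
    }


# With a running server the result would be passed on like this:
#   from mysql.connector import connect
#   with connect(**load_credentials()) as connection: ...
demo = load_credentials(env={"MYSQL_USER": "alice", "MYSQL_PASSWORD": "s3cret"})
print(demo["host"], demo["user"])
```

Passing env and prompt as parameters keeps the helper easy to test without touching the real environment.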

So, we have established a connection between our program and the MySQL server. Now you need to either create a new database or connect to an existing one.

Create a new database

To create a new database, for example one named online_movie_rating, you need to execute the SQL statement:

CREATE DATABASE online_movie_rating;

MySQL requires you to put a semicolon (;) at the end of a statement. However, MySQL Connector/Python automatically adds a semicolon at the end of each query.

To execute an SQL query, we need a cursor, which abstracts the process of accessing database records. MySQL Connector/Python provides the corresponding MySQLCursor class, an instance of which is also called a cursor.

Let’s pass our request to create the online_movie_rating database:

try:
    with connect(
        host="localhost",
        user=input("Username:"),
        password=getpass("Password:"),
    ) as connection:
        create_db_query = "CREATE DATABASE online_movie_rating"
        with connection.cursor() as cursor:
            cursor.execute(create_db_query)
except Error as e:
    print(e)

The CREATE DATABASE request is stored as a string in the create_db_query variable and then passed to cursor.execute() for execution.

If a database with the same name already exists on the server, we will receive an error message. Using the same MySQLConnection object as before, let’s run the SHOW DATABASES query to see all the databases stored on the server:

try:
    with connect(
        host="localhost",
        user=input("Username:"),
        password=getpass("Password:"),
    ) as connection:
        show_db_query = "SHOW DATABASES"
        with connection.cursor() as cursor:
            cursor.execute(show_db_query)
            for db in cursor:
                print(db)
except Error as e:
    print(e)


Username: root

Password: ········
The above code prints the names of all databases located on our MySQL server. The SHOW DATABASES command in our example also lists databases that are automatically created by the MySQL server and provide access to database metadata and server settings.
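Since running CREATE DATABASE a second time for the same name raises an error, a script that may be executed repeatedly can use MySQL's IF NOT EXISTS clause instead. The helper below is a hypothetical sketch that only builds the statement; because identifiers cannot be passed as query parameters, it validates the name before interpolating it.

```python
import re


def create_database_statement(name):
    """Return an idempotent CREATE DATABASE statement for a validated name."""
    # Allow only letters, digits and underscores to keep the identifier safe.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"unsafe database name: {name!r}")
    return f"CREATE DATABASE IF NOT EXISTS `{name}`"


print(create_database_statement("online_movie_rating"))
```

The returned string can be passed to cursor.execute() in place of create_db_query.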

Connecting to an existing database

So, we have created a database called online_movie_rating. To connect to it, we simply supplement the connect() call with the database parameter:

try:
    with connect(
        host="localhost",
        user=input("Username:"),
        password=getpass("Password:"),
        database="online_movie_rating",
    ) as connection:
        print(connection)
except Error as e:
    print(e)

Creating, modifying, and dropping tables

In this section, we discuss how to use Python to perform some basic queries: CREATE TABLE, DROP, and ALTER.

Defining the database schema

Let’s start by creating a database schema for the movie rating system. The database consists of three tables:

1. movies- general information about films:

id

title

release year

genre

collection_in_mil

2. reviewers- information about the people who published the ratings of the films:

id

first_name

last_name

3. ratings- information about the ratings of films by reviewers:

movie_id (foreign key)

reviewer_id (foreign key)

rating

These three tables are sufficient for the purposes of this guide.

Film rating system diagram

The tables in the database are related to each other: movies and reviewers have a many-to-many relationship, since one movie can be reviewed by multiple reviewers and one reviewer can review multiple movies. The ratings table connects the movies table to the reviewers table.

Creating tables using the CREATE TABLE statement

To create a new table in MySQL, we use the CREATE TABLE statement. The following MySQL query creates the movies table in our online_movie_rating database:

CREATE TABLE movies(
    id INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(100),
    release_year YEAR(4),
    genre VARCHAR(100),
    collection_in_mil INT
);

If you have encountered SQL before, you will understand the meaning of the above query. The MySQL dialect has some distinctive features. For example, MySQL offers a wide range of data types, including YEAR, INT, BIGINT, and so on. In addition, MySQL uses the AUTO_INCREMENT keyword when the column value should be automatically incremented as new records are inserted.

To create the table, pass the query to cursor.execute():

create_movies_table_query = """
CREATE TABLE movies(
    id INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(100),
    release_year YEAR(4),
    genre VARCHAR(100),
    collection_in_mil INT
)
"""
with connection.cursor() as cursor:
    cursor.execute(create_movies_table_query)

Let’s repeat the procedure for the reviewers table:

create_reviewers_table_query = """
CREATE TABLE reviewers (
    id INT AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(100),
    last_name VARCHAR(100)
)
"""
with connection.cursor() as cursor:
    cursor.execute(create_reviewers_table_query)

Finally, let’s create the ratings table:

create_ratings_table_query = """
CREATE TABLE ratings (
    movie_id INT,
    reviewer_id INT,
    rating DECIMAL(2,1),
    FOREIGN KEY(movie_id) REFERENCES movies(id),
    FOREIGN KEY(reviewer_id) REFERENCES reviewers(id),
    PRIMARY KEY(movie_id, reviewer_id)
)
"""
with connection.cursor() as cursor:
    cursor.execute(create_ratings_table_query)

The implementation of foreign key relationships in MySQL differs slightly from standard SQL and has some limitations. In MySQL, both the parent and the child of a foreign key must use the same storage engine, that is, the underlying software component that the database management system uses to perform SQL operations. MySQL offers two kinds of such engines:

Transactional storage engines are transaction-safe and allow you to roll back transactions using simple commands such as rollback. Many popular MySQL engines fall into this category, including InnoDB and NDB.

Non-transactional storage engines rely on manual code to undo statements committed to the database. These are, for example, MyISAM and MEMORY.

InnoDB is the default and most popular storage engine. By enforcing foreign key constraints, it helps maintain data integrity. This means that any CRUD operation involving a foreign key is pre-checked to ensure that it does not lead to inconsistency between different tables.

Note that the ratings table uses the columns movie_id and reviewer_id, both foreign keys, together as its primary key. This ensures that a reviewer cannot rate the same film twice.
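The effect of the composite primary key can be observed without a MySQL server. The sketch below uses the standard library's sqlite3 module purely as a stand-in (an assumption made for this example; the column types are SQLite's, not MySQL's) and shows that a second rating for the same movie and reviewer pair is rejected.

```python
import sqlite3

# sqlite3 is only a stand-in so the sketch runs without a server.
connection = sqlite3.connect(":memory:")
connection.execute(
    """
    CREATE TABLE ratings (
        movie_id INTEGER,
        reviewer_id INTEGER,
        rating REAL,
        PRIMARY KEY (movie_id, reviewer_id)
    )
    """
)
connection.execute("INSERT INTO ratings VALUES (17, 5, 6.4)")

try:
    # Same (movie_id, reviewer_id) pair again: rejected by the primary key.
    connection.execute("INSERT INTO ratings VALUES (17, 5, 9.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A different reviewer rating the same movie is still allowed.
connection.execute("INSERT INTO ratings VALUES (17, 18, 8.1)")
row_count = connection.execute("SELECT COUNT(*) FROM ratings").fetchone()[0]
connection.close()

print(duplicate_rejected, row_count)
```

With MySQL Connector/Python the duplicate insert would surface as an exception derived from mysql.connector.Error instead.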

The same cursor can be used for multiple queries. For example, you can execute all the CREATE TABLE statements with one cursor. Note that MySQL commits each CREATE TABLE statement implicitly, so these calls do not form a single atomic transaction; data-modifying statements such as INSERT, by contrast, can be grouped and committed at once:

with connection.cursor() as cursor:
    cursor.execute(create_movies_table_query)
    cursor.execute(create_reviewers_table_query)
    cursor.execute(create_ratings_table_query)

Displaying Table Schema Using the DESCRIBE Statement

We have created three tables and can view their schema using the DESCRIBE statement.

Assuming you already have a MySQLConnection object in the connection variable, we can print the results obtained with cursor.fetchall(). This method retrieves all rows from the last executed statement:

show_table_query = "DESCRIBE movies"
with connection.cursor() as cursor:
    cursor.execute(show_table_query)
    # Fetch rows from last executed query
    result = cursor.fetchall()
    for row in result:
        print(row)

('id', 'int(11)', 'NO', 'PRI', None, 'auto_increment')
('title', 'varchar(100)', 'YES', '', None, '')
('release_year', 'year(4)', 'YES', '', None, '')
('genre', 'varchar(100)', 'YES', '', None, '')
('collection_in_mil', 'int(11)', 'YES', '', None, '')

After executing the above code, we get a table containing information about the columns in the movies table. For each column, information is displayed about the data type, whether the column is a primary key, and so on.

Changing the schema of a table using the ALTER statement

The collection_in_mil column in the movies table contains the movie’s box office in millions of dollars. We can write the following MySQL statement to change the data type of the collection_in_mil attribute from INT to DECIMAL:

ALTER TABLE movies MODIFY COLUMN collection_in_mil DECIMAL(4,1);

DECIMAL(4,1) indicates a decimal number with a maximum of four digits, one of which is after the decimal point, for example 120.1, 3.4, 38.0, and so on.
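To check in advance whether a value fits such a column, the rule can be approximated with Python's decimal module. This is a rough sketch of the range check only: fits_decimal is a hypothetical helper, and how the server reacts to an out-of-range value depends on its SQL mode.

```python
from decimal import Decimal, ROUND_HALF_UP


def fits_decimal(value, precision, scale):
    """Check whether value fits a DECIMAL(precision, scale) column.

    The value is first rounded to `scale` fractional digits, then its
    magnitude is compared with the smallest number that no longer fits.
    """
    step = Decimal(1).scaleb(-scale)            # 0.1 for scale=1
    rounded = Decimal(str(value)).quantize(step, rounding=ROUND_HALF_UP)
    limit = Decimal(10) ** (precision - scale)  # 1000 for DECIMAL(4,1)
    return abs(rounded) < limit


for amount in (120.1, 3.4, 999.9, 1000.0):
    print(amount, fits_decimal(amount, 4, 1))
```

The largest value that fits DECIMAL(4,1) is therefore 999.9, which is enough for every box office figure used in this tutorial.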

alter_table_query = """
ALTER TABLE movies
MODIFY COLUMN collection_in_mil DECIMAL(4,1)
"""
show_table_query = "DESCRIBE movies"
with connection.cursor() as cursor:
    cursor.execute(alter_table_query)
    cursor.execute(show_table_query)
    # Get rows from the last executed query
    result = cursor.fetchall()
    print("Movie table schema after modification:")
    for row in result:
        print(row)

The movie table schema after making changes:

('id', 'int(11)', 'NO', 'PRI', None, 'auto_increment')
('title', 'varchar(100)', 'YES', '', None, '')
('release_year', 'year(4)', 'YES', '', None, '')
('genre', 'varchar(100)', 'YES', '', None, '')
('collection_in_mil', 'decimal(4,1)', 'YES', '', None, '')

As shown in the output, the collection_in_mil attribute changed its type to DECIMAL(4,1). Note that in the above code we call cursor.execute() twice, but cursor.fetchall() only fetches rows from the most recently executed query, which is show_table_query.

Dropping tables using the DROP statement

To delete tables, use the DROP TABLE statement. Dropping a table is an irreversible process. If you run the code below, you will need to run the CREATE TABLE query for the ratings table again:

drop_table_query = "DROP TABLE ratings"
with connection.cursor() as cursor:
    cursor.execute(drop_table_query)

Inserting records into tables

Let’s fill the tables with data. In this section, we will look at two ways to insert records using the MySQL Connector in Python code.

The first method, .execute(), works well when the number of records is small. The second, .executemany(), is better suited for real-life scenarios.

Inserting records with .execute()

The first approach uses the same cursor.execute() method that we have been using so far. We write an INSERT INTO query and pass it to cursor.execute():

insert_movies_query = """
INSERT INTO movies (title, release_year, genre, collection_in_mil)
VALUES
    ("Forrest Gump", 1994, "Drama", 330.2),
    ("3 Idiots", 2009, "Drama", 2.4),
    ("Eternal Sunshine of the Spotless Mind", 2004, "Drama", 34.5),
    ("Good Will Hunting", 1997, "Drama", 138.1),
    ("Skyfall", 2012, "Action", 304.6),
    ("Gladiator", 2000, "Action", 188.7),
    ("Black", 2005, "Drama", 3.0),
    ("Titanic", 1997, "Romance", 659.2),
    ("The Shawshank Redemption", 1994, "Drama", 28.4),
    ("Udaan", 2010, "Drama", 1.5),
    ("Home Alone", 1990, "Comedy", 286.9),
    ("Casablanca", 1942, "Romance", 1.0),
    ("Avengers: Endgame", 2019, "Action", 858.8),
    ("Night of the Living Dead", 1968, "Horror", 2.5),
    ("The Godfather", 1972, "Crime", 135.6),
    ("Haider", 2014, "Action", 4.2),
    ("Inception", 2010, "Adventure", 293.7),
    ("Evil", 2003, "Horror", 1.3),
    ("Toy Story 4", 2019, "Animation", 434.9),
    ("Air Force One", 1997, "Drama", 138.1),
    ("The Dark Knight", 2008, "Action", 535.4),
    ("Bhaag Milkha Bhaag", 2013, "Sport", 4.1),
    ("The Lion King", 1994, "Animation", 423.6),
    ("Pulp Fiction", 1994, "Crime", 108.8),
    ("Kai Po Che", 2013, "Sport", 6.0),
    ("Beasts of No Nation", 2015, "War", 1.4),
    ("Andadhun", 2018, "Thriller", 2.9),
    ("The Silence of the Lambs", 1991, "Crime", 68.2),
    ("Deadpool", 2016, "Action", 363.6),
    ("Drishyam", 2015, "Mystery", 3.0)
"""
with connection.cursor() as cursor:
    cursor.execute(insert_movies_query)
    connection.commit()  # make the inserted rows permanent

Inserting records with .executemany()

The previous approach works well when the number of records is small and they can easily be written out in code. Usually, though, the data is stored in a file or generated by another script. This is where .executemany() comes in handy. The method takes two parameters:

A query containing placeholders for the records to be inserted.

List of records to insert.

Let's use this approach to fill the reviewers table:

    insert_reviewers_query = """
    INSERT INTO reviewers (first_name, last_name)
    VALUES ( %s, %s )
    """
    reviewers_records = [
        ("Chaitanya", "Baweja"), ("Mary", "Cooper"),
        ("John", "Wayne"), ("Thomas", "Stoneman"),
        ("Penny", "Hofstadter"), ("Mitchell", "Marsh"),
        ("Wyatt", "Skaggs"), ("Andre", "Veiga"),
        ("Sheldon", "Cooper"), ("Kimbra", "Masters"),
        ("Kat", "Dennings"), ("Bruce", "Wayne"),
        ("Domingo", "Cortes"), ("Rajesh", "Koothrappali"),
        ("Ben", "Glocker"), ("Mahinder", "Dhoni"),
        ("Akbar", "Khan"), ("Howard", "Wolowitz"),
        ("Pinkie", "Petit"), ("Gurkaran", "Singh"),
        ("Amy", "Farah Fowler"), ("Marlon", "Crafford"),
    ]
    with connection.cursor() as cursor:
        # One parameterized query, executed once per record in the list.
        cursor.executemany(insert_reviewers_query, reviewers_records)
        # Autocommit is off by default, so commit to persist the new rows.
        connection.commit()

This code uses %s as a placeholder for each of the two strings that are inserted into insert_reviewers_query. Placeholders act as format specifiers and reserve space for a variable within a string.
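Since the text notes that records usually come from a file or another script, here is a minimal sketch of that case. It is an illustration under stated assumptions, not part of the original tutorial: the CSV content is hypothetical, and Python's built-in sqlite3 module stands in for MySQL so the snippet runs without a server. Note that sqlite3 uses ? as its placeholder, whereas MySQL Connector/Python uses %s.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content standing in for a file on disk.
csv_text = "first_name,last_name\nChaitanya,Baweja\nMary,Cooper\nJohn,Wayne\n"

# Read every data row into a tuple, skipping the header line.
reader = csv.reader(io.StringIO(csv_text))
next(reader)
csv_records = [tuple(row) for row in reader]

# sqlite3 is only a stand-in here; with MySQL Connector/Python the
# placeholders would be written as %s instead of ?.
demo_connection = sqlite3.connect(":memory:")
demo_cursor = demo_connection.cursor()
demo_cursor.execute(
    "CREATE TABLE reviewers ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)

# executemany() runs the same parameterized query once per record.
demo_cursor.executemany(
    "INSERT INTO reviewers (first_name, last_name) VALUES (?, ?)",
    csv_records,
)
demo_connection.commit()

inserted = demo_cursor.execute(
    "SELECT first_name, last_name FROM reviewers ORDER BY id"
).fetchall()
print(inserted)
```

The same pattern should carry over to the MySQL connection used in this tutorial: build a list of tuples from the file, then pass it as the second argument to .executemany().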

Let's fill the ratings table in the same way:

    insert_ratings_query = """
    INSERT INTO ratings (rating, movie_id, reviewer_id)
    VALUES ( %s, %s, %s)
    """
    ratings_records = [
        (6.4, 17, 5), (5.6, 19, 1), (6.3, 22, 14), (5.1, 21, 17), (5.0, 5, 5),
        (6.5, 21, 5), (8.5, 30, 13), (9.7, 6, 4), (8.5, 24, 12), (9.9, 14, 9),
        (8.7, 26, 14), (9.9, 6, 10), (5.1, 30, 6), (5.4, 18, 16), (6.2, 6, 20),
        (7.3, 21, 19), (8.1, 17, 18), (5.0, 7, 2), (9.8, 23, 3), (8.0, 22, 9),
        (8.5, 11, 13), (5.0, 5, 11), (5.7, 8, 2), (7.6, 25, 19), (5.2, 18, 15),
        (9.7, 13, 3), (5.8, 18, 8), (5.8, 30, 15), (8.4, 21, 18), (6.2, 23, 16),
        (7.0, 10, 18), (9.5, 30, 20), (8.9, 3, 19), (6.4, 12, 2), (7.8, 12, 22),
        (9.9, 15, 13), (7.5, 20, 17), (9.0, 25, 6), (8.5, 23, 2), (5.3, 30, 17),
        (6.4, 5, 10), (8.1, 5, 21), (5.7, 22, 1), (6.3, 28, 4), (9.8, 13, 1),
    ]
    with connection.cursor() as cursor:
        cursor.executemany(insert_ratings_query, ratings_records)
        # Autocommit is off by default, so commit to persist the new rows.
        connection.commit()

All three tables are now filled with data. The next step is to figure out how to interact with this database.

Reading records from the database

So far, we have only created database items. It's time to run a few queries and find the properties we are interested in. In this section, we will learn how to read records from database tables using the SELECT statement.

Reading records with a SELECT statement

To get records, send a SELECT query to cursor.execute() and retrieve the result using cursor.fetchall():

    select_movies_query = "SELECT * FROM movies LIMIT 5"
    with connection.cursor() as cursor:
        cursor.execute(select_movies_query)
        result = cursor.fetchall()
        for row in result:
            print(row)


    (1, 'Forrest Gump', 1994, 'Drama', Decimal('330.2'))
    (2, '3 Idiots', 2009, 'Drama', Decimal('2.4'))
    (3, 'Eternal Sunshine of the Spotless Mind', 2004, 'Drama', Decimal('34.5'))
    (4, 'Good Will Hunting', 1997, 'Drama', Decimal('138.1'))
    (5, 'Skyfall', 2012, 'Action', Decimal('304.6'))

The result variable contains the records returned by .fetchall(). It is a list of tuples, each representing an individual record in the table.

In the above query, we use the LIMIT keyword to limit the number of rows returned by the SELECT statement. Developers often use LIMIT to paginate output when processing large amounts of data.

In MySQL, two non-negative numeric arguments can be passed to the LIMIT clause:

SELECT * FROM movies LIMIT 2,5;

When using two numeric arguments, the first specifies an offset, which in this example is 2, and the second limits the number of rows returned to 5. That is, the query from the example will return rows 3 through 7.

    select_movies_query = "SELECT title, release_year FROM movies LIMIT 2, 5"
    with connection.cursor() as cursor:
        cursor.execute(select_movies_query)
        for row in cursor.fetchall():
            print(row)


    ('Eternal Sunshine of the Spotless Mind', 2004)
    ('Good Will Hunting', 1997)
    ('Skyfall', 2012)
    ('Gladiator', 2000)
    ('Black', 2005)
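The offset and row count in LIMIT map naturally onto page numbers. The sketch below shows one way to compute them. The helper page_to_limit and the ten-row demo table are hypothetical, and Python's built-in sqlite3 module stands in for MySQL here because it accepts the same two-argument LIMIT form; with MySQL Connector/Python the placeholders would be %s instead of ?.

```python
import sqlite3


def page_to_limit(page, page_size):
    """Return (offset, row_count) for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return (page - 1) * page_size, page_size


# Small hypothetical table so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO movies (title) VALUES (?)",
    [(f"Movie {n}",) for n in range(1, 11)],
)

# Page 2 with three rows per page skips the first three rows.
offset, row_count = page_to_limit(page=2, page_size=3)
page_rows = conn.execute(
    "SELECT id, title FROM movies ORDER BY id LIMIT ?, ?",
    (offset, row_count),
).fetchall()
print(page_rows)

# The literal form from the text: skip 2 rows, then return 5.
rows_3_to_7 = [
    row[0]
    for row in conn.execute("SELECT id FROM movies ORDER BY id LIMIT 2, 5")
]
print(rows_3_to_7)
```

An explicit ORDER BY is included because, without one, the database does not guarantee which rows fall on which page.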

Filtering Results with WHERE

Table entries can also be filtered using WHERE. To get all films with a box office of over $300 million, run the following query:

    select_movies_query = """
    SELECT title, collection_in_mil
    FROM movies
    WHERE collection_in_mil > 300
    ORDER BY collection_in_mil DESC
    """
    with connection.cursor() as cursor:
        cursor.execute(select_movies_query)
        for movie in cursor.fetchall():
            print(movie)

    ('Avengers: Endgame', Decimal('858.8'))
    ('Titanic', Decimal('659.2'))
    ('The Dark Knight', Decimal('535.4'))
    ('Toy Story 4', Decimal('434.9'))
    ('The Lion King', Decimal('423.6'))
    ('Deadpool', Decimal('363.6'))
    ('Forrest Gump', Decimal('330.2'))
    ('Skyfall', Decimal('304.6'))

The ORDER BY clause in the query sorts the box-office collections from highest to lowest.

MySQL provides many string formatting operations, such as CONCAT for string concatenation. For example, movie titles are usually displayed along with the release year to avoid confusion. Let's get the titles of the five most profitable films along with their release years:

    select_movies_query = """
    SELECT CONCAT(title, " (", release_year, ")"), collection_in_mil
    FROM movies
    ORDER BY collection_in_mil DESC
    LIMIT 5
    """
    with connection.cursor() as cursor:
        cursor.execute(select_movies_query)
        for movie in cursor.fetchall():
            print(movie)


    ('Avengers: Endgame (2023)', Decimal('858.8'))
    ('Titanic (1997)', Decimal('659.2'))
    ('The Dark Knight (2008)', Decimal('535.4'))
    ('Toy Story 4 (2023)', Decimal('434.9'))
    ('The Lion King (1994)', Decimal('423.6'))

If you do not want to use LIMIT and do not need to get all records, you can use the cursor methods .fetchone() and .fetchmany():

.fetchone() retrieves the next row of the result as a tuple, or None if there are no more rows available.

.fetchmany() retrieves the next set of rows as a list of tuples. It takes a size argument, which defaults to 1. If there are no more rows available, the method returns an empty list.

Again, let's extract the titles of the five highest-grossing films along with their release years, but this time using .fetchmany():

    select_movies_query = """
    SELECT CONCAT(title, " (", release_year, ")"), collection_in_mil
    FROM movies
    ORDER BY collection_in_mil DESC
    """
    with connection.cursor() as cursor:
        cursor.execute(select_movies_query)
        for movie in cursor.fetchmany(size=5):
            print(movie)
        cursor.fetchall()


    ('Avengers: Endgame (2023)', Decimal('858.8'))
    ('Titanic (1997)', Decimal('659.2'))
    ('The Dark Knight (2008)', Decimal('535.4'))
    ('Toy Story 4 (2023)', Decimal('434.9'))
    ('The Lion King (1994)', Decimal('423.6'))

You may have noticed the additional cursor.fetchall() call. We do this to clean up any unread results left over after .fetchmany().

Before executing any other statements on the same connection, you must clear any unread results. Otherwise, an InternalError exception is raised.
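A common use of .fetchmany() is to walk through a large result set in fixed-size batches until an empty list signals the end. The sketch below illustrates that loop along with the None returned by .fetchone() once the rows run out. It is only an illustration: the seven-row table is hypothetical, and Python's built-in sqlite3 module stands in for MySQL because its cursor offers the same three fetch methods.

```python
import sqlite3

# Hypothetical table with seven rows so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (rating REAL)")
conn.executemany(
    "INSERT INTO ratings (rating) VALUES (?)",
    [(float(n),) for n in range(1, 8)],
)

cursor = conn.cursor()
cursor.execute("SELECT rating FROM ratings ORDER BY rating")

# fetchone() returns the next row as a tuple.
first_row = cursor.fetchone()

# fetchmany() returns up to `size` rows; an empty list means no rows are left.
batch_sizes = []
while True:
    batch = cursor.fetchmany(3)
    if not batch:
        break
    batch_sizes.append(len(batch))

# Once the result set is exhausted, fetchone() returns None.
after_end = cursor.fetchone()

print(first_row, batch_sizes, after_end)
```

Because the loop reads until the result set is empty, no unread rows remain on the cursor afterwards, so no extra .fetchall() call is needed in this pattern.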

JOIN Multiple Tables

To find out the names of the five highest-rated movies, run the following query:

    select_movies_query = """
    SELECT title, AVG(rating) as average_rating
    FROM ratings
    INNER JOIN movies
        ON movies.id = ratings.movie_id
    GROUP BY movie_id
    ORDER BY average_rating DESC
    LIMIT 5
    """
    with connection.cursor() as cursor:
        cursor.execute(select_movies_query)
        for movie in cursor.fetchall():
            print(movie)


    ('Night of the Living Dead', Decimal('9.90000'))
    ('The Godfather', Decimal('9.90000'))
    ('Avengers: Endgame', Decimal('9.75000'))
    ('Eternal Sunshine of the Spotless Mind', Decimal('8.90000'))
    ('Beasts of No Nation', Decimal('8.70000'))

You can find the name of the reviewer with the most ratings like this:

    select_movies_query = """
    SELECT CONCAT(first_name, " ", last_name), COUNT(*) as num
    FROM reviewers
    INNER JOIN ratings
        ON reviewers.id = ratings.reviewer_id
    GROUP BY reviewer_id
    ORDER BY num DESC
    LIMIT 1
    """
    with connection.cursor() as cursor:
        cursor.execute(select_movies_query)
        for movie in cursor.fetchall():
            print(movie)

    ('Mary Cooper', 4)

As you can see, most of the reviews were written by Mary Cooper.
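To see what INNER JOIN and GROUP BY do on a scale small enough to check by hand, here is a sketch that uses four ratings from this tutorial's data set for two movies (ids 6 and 14). Python's built-in sqlite3 module is used as a stand-in for MySQL so that the snippet runs without a server. The rounding step is an addition for readable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE ratings (rating REAL, movie_id INTEGER, reviewer_id INTEGER)"
)

# Two movies and their four ratings, taken from the records inserted earlier.
conn.executemany(
    "INSERT INTO movies (id, title) VALUES (?, ?)",
    [(6, "Gladiator"), (14, "Night of the Living Dead")],
)
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(9.7, 6, 4), (9.9, 14, 9), (9.9, 6, 10), (6.2, 6, 20)],
)

# The join pairs each rating with its movie; GROUP BY then collapses the
# pairs into one row per movie, and AVG is computed within each group.
join_query = """
SELECT title, AVG(rating) AS average_rating
FROM ratings
INNER JOIN movies ON movies.id = ratings.movie_id
GROUP BY movie_id
ORDER BY average_rating DESC
"""
averages = [(title, round(avg, 2)) for title, avg in conn.execute(join_query)]
print(averages)
```

Gladiator has three ratings (9.7, 9.9, and 6.2), so its average is 8.6, while Night of the Living Dead has a single rating of 9.9 and therefore sorts first.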

The process of executing a query always remains the same: we pass the query to cursor.execute() and get the results using .fetchall().

Updating and deleting records from the database

In this section, we will update and remove some of the entries. We will select the required rows using the WHERE keyword.

UPDATE command

Imagine that the reviewer Amy Farah Fowler has married Sheldon Cooper. She changed her last name to Cooper, and we need to update the database. To update records in MySQL, use the UPDATE statement:

    update_query = """
    UPDATE reviewers
    SET last_name = "Cooper"
    WHERE first_name = "Amy"
    """
    with connection.cursor() as cursor:
        cursor.execute(update_query)
        # Autocommit is off by default, so commit to persist the change.
        connection.commit()

Let's say we want to allow reviewers to change their ratings. The program needs to know the movie_id, the reviewer_id, and the new rating. SQL example:

    UPDATE ratings
    SET rating = 5.0
    WHERE movie_id = 18 AND reviewer_id = 15;

    SELECT * FROM ratings
    WHERE movie_id = 18 AND reviewer_id = 15;

These queries first update the rating and then display the updated record. Let's write a Python script that lets us modify ratings:

    from getpass import getpass
    from mysql.connector import connect, Error

    movie_id = input("Enter movie id: ")
    reviewer_id = input("Enter reviewer id: ")
    new_rating = input("Enter new rating: ")

    # The user input is interpolated directly into the query string.
    update_query = """
    UPDATE ratings
    SET rating = "%s"
    WHERE movie_id = "%s" AND reviewer_id = "%s";

    SELECT * FROM ratings
    WHERE movie_id = "%s" AND reviewer_id = "%s"
    """ % (
        new_rating,
        movie_id,
        reviewer_id,
        movie_id,
        reviewer_id,
    )

    try:
        with connect(
            host="localhost",
            user=input("Enter username: "),
            password=getpass("Enter password: "),
            database="online_movie_rating",
        ) as connection:
            with connection.cursor() as cursor:
                # multi=True lets one call run both statements in sequence.
                for result in cursor.execute(update_query, multi=True):
                    if result.with_rows:
                        print(result.fetchall())
                # Autocommit is off by default, so commit to persist the update.
                connection.commit()
    except Error as e:
        print(e)


    Enter movie id: 18
    Enter reviewer id: 15
    Enter new rating: 5
    Enter username: root
    Enter password: ········
    [(18, 15, Decimal('5.0'))]

To pass multiple queries to the same cursor, we set the multi argument to True. In this case, cursor.execute() returns an iterator. Each item in the iterator corresponds to a cursor object that executes one of the statements passed in the query. The above code runs a for loop over this iterator, calling .fetchall() on each cursor object.

If an operation produced no result set, then .fetchall() raises an exception. To avoid this error, the code above checks the cursor.with_rows property, which indicates whether the most recently executed operation produced rows.

While this code does the job, the WHERE clause, as it stands, is a tempting target for hackers. It is vulnerable to a SQL injection attack that could allow attackers to corrupt or misuse the database.

For example, if the user submits movie_id = 18, reviewer_id = 15, and a new rating of 5.0 as input, then the result looks like this:

    $ python
    Enter movie id: 18
    Enter reviewer id: 15
    Enter new rating: 5.0
    Enter username:
    Enter password:
    [(18, 15, Decimal('5.0'))]

The rating for movie_id = 18 and reviewer_id = 15 changed to 5.0. But if you were a hacker, you could send a hidden command in the input:

    $ python
    Enter movie id: 18
    Enter reviewer id: 15"; UPDATE reviewers SET last_name = "A
    Enter new rating: 5.0
    Enter username:
    Enter password:
    [(18, 15, Decimal('5.0'))]

Again, the output shows that the reported rating has been changed to 5.0. What changed?

The hacker hijacked the update query. The injected UPDATE statement changes last_name to "A" for every record in the reviewers table:

    select_query = """
    SELECT first_name, last_name
    FROM reviewers
    """
    with connection.cursor() as cursor:
        cursor.execute(select_query)
        for reviewer in cursor.fetchall():
            print(reviewer)

    ('Chaitanya', 'A')
    ('Mary', 'A')
    ('John', 'A')
    ('Thomas', 'A')
    ('Penny', 'A')
    ('Mitchell', 'A')
    ('Wyatt', 'A')
    ('Andre', 'A')
    ('Sheldon', 'A')
    ('Kimbra', 'A')
    ('Kat', 'A')
    ('Bruce', 'A')
    ('Domingo', 'A')
    ('Rajesh', 'A')
    ('Ben', 'A')
    ('Mahinder', 'A')
    ('Akbar', 'A')
    ('Howard', 'A')
    ('Pinkie', 'A')
    ('Gurkaran', 'A')
    ('Amy', 'A')
    ('Marlon', 'A')

The above code displays first_name and last_name for all records in the reviewers table. The SQL injection attack corrupted this table, changing the last_name of every record to "A".

There is a quick solution to prevent such attacks. Do not add user-supplied values directly to the query string. Instead, update the script to send the values as arguments to .execute():

    from getpass import getpass
    from mysql.connector import connect, Error

    movie_id = input("Enter movie id: ")
    reviewer_id = input("Enter reviewer id: ")
    new_rating = input("Enter new rating: ")

    # Unquoted %s placeholders; the values are passed separately below.
    update_query = """
    UPDATE ratings
    SET rating = %s
    WHERE movie_id = %s AND reviewer_id = %s;

    SELECT * FROM ratings
    WHERE movie_id = %s AND reviewer_id = %s
    """
    val_tuple = (
        new_rating,
        movie_id,
        reviewer_id,
        movie_id,
        reviewer_id,
    )

    try:
        with connect(
            host="localhost",
            user=input("Enter username: "),
            password=getpass("Enter password: "),
            database="online_movie_rating",
        ) as connection:
            with connection.cursor() as cursor:
                for result in cursor.execute(update_query, val_tuple, multi=True):
                    if result.with_rows:
                        print(result.fetchall())
                # Autocommit is off by default, so commit to persist the update.
                connection.commit()
    except Error as e:
        print(e)

Note that the %s placeholders are no longer enclosed in string quotes. cursor.execute() converts and escapes the values in the tuple before placing them in the query, so the input is treated as data rather than as SQL. If the user tries to enter problematic characters, the code fails with an error:


    $ python
    Enter movie id: 18
    Enter reviewer id: 15"; UPDATE reviewers SET last_name = "A
    Enter new rating: 5.0
    Enter username:
    Enter password:
    1292 (22007): Truncated incorrect DOUBLE value: '15";

This approach should always be used when you include user input in a request. Take the time to learn about other ways to prevent SQL injection attacks.
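The difference between string formatting and query parameters can be reproduced without a MySQL server. The sketch below is an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in, a hypothetical three-row reviewers table, and a different injection payload ("1 OR 1=1") from the one shown above, because sqlite3 runs only one statement per execute() call. With MySQL Connector/Python the placeholder would be %s instead of ?.

```python
import sqlite3


def make_db():
    """Create a fresh in-memory reviewers table with three rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reviewers (id INTEGER PRIMARY KEY, last_name TEXT)")
    conn.executemany(
        "INSERT INTO reviewers (last_name) VALUES (?)",
        [("Baweja",), ("Cooper",), ("Wayne",)],
    )
    return conn


# Input that a hostile user might type instead of a plain id.
malicious_id = "1 OR 1=1"

# Unsafe: the input becomes part of the SQL text, so the WHERE clause
# turns into "id = 1 OR 1=1" and matches every row.
unsafe = make_db()
unsafe.execute("UPDATE reviewers SET last_name = 'A' WHERE id = %s" % malicious_id)
unsafe_names = [
    row[0] for row in unsafe.execute("SELECT last_name FROM reviewers ORDER BY id")
]

# Safe: the input is bound as a single value and compared with id as data,
# so it matches no row and nothing is changed.
safe = make_db()
safe.execute("UPDATE reviewers SET last_name = 'A' WHERE id = ?", (malicious_id,))
safe_names = [
    row[0] for row in safe.execute("SELECT last_name FROM reviewers ORDER BY id")
]

print(unsafe_names)
print(safe_names)
```

In the unsafe version every last name is overwritten, while the parameterized version leaves the table untouched.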

Deleting records: the DELETE command

The procedure for deleting records is very similar to updating them; the difference is that it uses the DELETE statement. Since this is an irreversible operation, we recommend that you first run a SELECT query with the same filter to make sure that you are deleting the records you want. For example, to remove all movie ratings given by reviewer_id = 2, we can first run the corresponding SELECT query:

    select_movies_query = """
    SELECT reviewer_id, movie_id
    FROM ratings
    WHERE reviewer_id = 2
    """
    with connection.cursor() as cursor:
        cursor.execute(select_movies_query)
        for movie in cursor.fetchall():
            print(movie)


(2, 7)

(2, 8)

(2, 12)

(2, 23)

The above code snippet displays the reviewer_id and movie_id pairs for the entries in the ratings table for which reviewer_id = 2. After making sure that these are the records to be deleted, let's execute a DELETE query with the same filter:

    delete_query = "DELETE FROM ratings WHERE reviewer_id = 2"
    with connection.cursor() as cursor:
        cursor.execute(delete_query)
        # Autocommit is off by default, so commit to make the deletion permanent.
        connection.commit()
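The preview-then-delete routine can be wrapped in one helper so that both queries always share the same filter. The sketch below is one possible shape for it, not the tutorial's own code: the function name delete_with_preview is hypothetical, the five demo rows are taken from the ratings inserted earlier, and Python's built-in sqlite3 module stands in for MySQL (so the placeholder is ? rather than %s).

```python
import sqlite3


def delete_with_preview(conn, reviewer_id):
    """Preview the rows matched by the filter, delete them, return the preview."""
    cursor = conn.cursor()
    cursor.execute(
        "SELECT reviewer_id, movie_id FROM ratings "
        "WHERE reviewer_id = ? ORDER BY movie_id",
        (reviewer_id,),
    )
    preview = cursor.fetchall()

    cursor.execute("DELETE FROM ratings WHERE reviewer_id = ?", (reviewer_id,))
    # If the delete touched a different number of rows than the preview
    # showed, undo it instead of committing.
    if cursor.rowcount != len(preview):
        conn.rollback()
        raise RuntimeError("row count changed between preview and delete")
    conn.commit()
    return preview


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (rating REAL, movie_id INTEGER, reviewer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)",
    [(5.0, 7, 2), (5.7, 8, 2), (6.4, 12, 2), (8.5, 23, 2), (9.8, 13, 1)],
)
conn.commit()

deleted = delete_with_preview(conn, 2)
remaining = conn.execute("SELECT COUNT(*) FROM ratings").fetchone()[0]
print(deleted, remaining)
```

Until commit() is called, the deletion can still be undone with rollback(), which is what makes the row-count check useful.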

Other ways to connect Python and MySQL

In this tutorial, we introduced MySQL Connector/Python, which is the officially recommended means of interacting with a MySQL database from a Python application. Here are a couple of other popular connectors:

mysqlclient is a library that is a competitor to the official connector and is actively being supplemented with new functions. Since the core of the library is written in C, it has better performance than the official pure Python connector. The big drawback is that mysqlclient is quite difficult to set up and install, especially on Windows.

MySQLdb is legacy software that is still used in commercial applications today. It is written in C and is faster than MySQL Connector/Python, but it is only available for Python 2.

These drivers act as interfaces between your program and the MySQL database. In fact, you just send your SQL queries through them. However, many developers prefer to use the object-oriented paradigm for data management, not SQL queries.

Object-relational mapping (ORM) is a technique that lets you both query and manipulate data from a database directly in an object-oriented language. An ORM library encapsulates the code needed to manipulate data, freeing developers from the need to write SQL queries. Here are the most popular ORM libraries for combining Python and SQL:

SQLAlchemy is an ORM that simplifies communication between Python and SQL databases. You can create different engines for different databases such as MySQL, PostgreSQL, and SQLite.

peewee is a lightweight and fast ORM library with a simple configuration, which is very useful when your interaction with the database is limited to fetching a few records. If you need to copy individual records from a MySQL database to a CSV file, then peewee is the best choice.

The Django ORM is one of the most powerful parts of the Django web framework, allowing you to easily interact with a variety of databases such as SQLite, PostgreSQL, and MySQL. Many Django-based applications use the Django ORM for data modelling and basic queries; for more complex tasks, however, developers often switch to SQLAlchemy.


In this tutorial, we took a look at how to integrate a MySQL database into your Python application. We also developed a test sample of the MySQL database and interacted with it directly from Python code. Python has connectors for other DBMSs such as MongoDB and PostgreSQL. We will be glad to know what other materials on Python and databases you would be interested in.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

