PyTorch 101
A brief introduction to PyTorch
- Packages
- Matrices
- Tensor operations
- Tensor arithmetic
  - 4.1 Addition
  - 4.2 Subtraction
  - 4.3 Multiplication
  - 4.4 Division
  - 4.5 Mean
  - 4.6 Standard Deviation
- Gradients
- Linear Regression
  - 6.1 Building a simple dataset
  - 6.2 Building the model
  - 6.3 Training the model
  - 6.4 Save the model
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# pytorch
import torch
import torch.nn as nn
# disable warnings
import warnings
warnings.filterwarnings("ignore")
arr = np.array([1,2,3,4,5,])
print(arr)
print(arr.dtype)
type(arr)
There are multiple ways to convert a NumPy array to a tensor:
- `torch.from_numpy()`: converts a NumPy array to a tensor.
- `torch.as_tensor()`: a general way to convert an object to a tensor.
- `torch.tensor()`: creates a copy of the object as a tensor.
x = torch.from_numpy(arr)
print(x)
print(x.dtype)
print(type(x))
x = torch.as_tensor(arr)
print(x)
print(x.dtype)
print(type(x))
Both of the above methods come with a small caveat: `torch.from_numpy()` and `torch.as_tensor()` create a direct link between the NumPy array and the tensor, so any change in the array will also affect the tensor. Following is an example.
arr = np.array([1,2,3,4,5,])
print("Numpy array : ",arr)
x = torch.from_numpy(arr)
print("PyTorch tensor : ",x)
# changing the 0th index of arr
arr[0] = 999
print("Numpy array : ",arr)
print("PyTorch tensor : ",x)
This shows that although `torch.from_numpy()` converts the array to a tensor, it keeps a reference to the same underlying data. To create an independent copy of the array, `torch.tensor()` can be used.
arr = np.array([1,2,3,4,5,])
print("Numpy array : ",arr)
x = torch.tensor(arr)
print("PyTorch tensor : ",x)
# changing the 0th index of arr
arr[0] = 999
print("Numpy array : ",arr)
print("PyTorch tensor : ",x)
Similar to NumPy functions such as `ones`, `eye`, and `arange`, `torch` provides some handy factory functions to work with.
torch.empty(4,2)
torch.zeros(4,3)
torch.ones(3,4)
torch.arange(0,50,10)
torch.linspace(0,50,10)
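`eye` was mentioned above but not shown; like `np.eye`, it creates an identity matrix:

```python
torch.eye(3)  # 3x3 identity matrix
```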
my_tensor = torch.tensor([1,2,3])
print(my_tensor)
print(my_tensor.dtype)
type(my_tensor)
torch.rand(4,3)
torch.randn(4,3)
torch.randint(low=0,high=10,size=(5,5))
x = torch.zeros(2,5)
print("Shape of incoming tensor")
print(x.shape)
print("-"*50)
print("Random tensor with uniform distribution")
print(torch.rand_like(x))
print("-"*50)
print("Random tensor with standard normal uniform distribution")
print(torch.randn_like(x))
print("-"*50)
print("Random tensor with integers")
print(torch.randint_like(x,low=0,high=10))
print("The tensor")
print(x)
arr = x.numpy()
print("-"*50)
print("The numpy array")
print(arr)
print(arr.dtype)
print(type(arr))
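The reverse conversion shares memory as well: calling `.numpy()` on a CPU tensor returns an array backed by the same data, so in-place changes to the tensor show up in the array. A minimal sketch:

```python
x = torch.zeros(2, 5)
arr = x.numpy()
x[0, 0] = 7.0      # in-place change to the tensor
print(arr[0, 0])   # 7.0 -- the array sees the change
```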
x = torch.arange(10)
print("Reshaped tensor")
print(x.reshape(2,5))
print("-"*50)
print("Original tensor")
print(x)
x = torch.arange(10)
print("Reshaped tensor")
print(x.view(2,5))
print("-"*50)
print("Original tensor")
print(x)
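Although `view` and `reshape` look identical here, `view` never copies data and requires a compatible memory layout, whereas `reshape` silently falls back to a copy when it has to. A minimal sketch of the difference:

```python
t = torch.arange(10).reshape(2, 5).t()  # transpose -> non-contiguous memory
print(t.reshape(10))                    # works: reshape copies if needed
try:
    t.view(10)                          # fails: view cannot handle these strides
except RuntimeError as e:
    print("view failed:", e)
```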
x = torch.arange(10)
print(x.shape)
print("-"*50)
z = x.view(2,-1)
print(z)
print(z.shape)
As in the case above, if we pass -1 as the second dimension, PyTorch infers it automatically, provided the total number of elements works out. Try changing the code above to request a shape of (3,-1); it raises an error, since a 10-element tensor cannot be reshaped with a first dimension of 3 (see the sketch below).
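A quick sketch of that failure:

```python
x = torch.arange(10)
try:
    x.view(3, -1)  # 10 elements cannot be split into 3 equal rows
except RuntimeError as e:
    print(e)       # shape '[3, -1]' is invalid for input of size 10
```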
x = torch.arange(6).reshape(3,2)
print(x)
print(x[1,1])
type(x[1,1])
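Note that indexing returns a zero-dimensional tensor, not a plain Python number; `.item()` extracts the underlying scalar:

```python
val = x[1, 1].item()
print(val, type(val))  # 3 <class 'int'>
```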
x = torch.arange(6).reshape(3,2)
print(x)
print("_"*50)
print(x[:,1])
print(x[:,1].shape)
print("_"*50)
print(x[:,1:])
print(x[:,1:].shape)
There are multiple ways to do arithmetic. Direct operators as well as torch functions can be used.
a = torch.tensor([1.,2.,3.])
b = torch.tensor([4.,5.,6.])
print(a+b)
print(torch.add(a,b))
a = torch.tensor([1.,2.,3.])
b = torch.tensor([4.,5.,6.])
a.add_(b)
print(a)
In-place arithmetic is available for most operations: `add_`, `sub_`, `mul_`, etc. The trailing underscore signals that the operation mutates the tensor in place.
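For example, `mul_` scales a tensor in place:

```python
a = torch.tensor([1., 2., 3.])
a.mul_(2.0)  # doubles every element in place
print(a)     # tensor([2., 4., 6.])
```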
a = torch.tensor([1,2,3,4])
b = torch.tensor([1,1,1,1])
print(a-b)
print(torch.sub(a,b))
a = torch.ones(2,2)
b = torch.zeros(2,2)
print(a*b)
print(torch.mul(a,b))
a = torch.ones(2,2)
b = torch.zeros(2,2)
print(a/b)
print(torch.div(a,b))
The tensor contains `inf`, since a nonzero number divided by 0 is infinity in floating-point arithmetic.
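A related edge case: 0 divided by 0 yields `nan` rather than `inf`:

```python
print(torch.tensor([0.]) / torch.tensor([0.]))  # tensor([nan])
```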
a = torch.arange(1,11,1,dtype=torch.float64)
print(a.mean(dim=0))
`a = torch.arange(1,11,1).mean(dim=0)` won't work, since `arange` with integer arguments creates an integer tensor, and `mean` is only defined for floating-point (and complex) dtypes in PyTorch.
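A common fix is to cast to float first:

```python
a = torch.arange(1, 11).float()  # int64 -> float32
print(a.mean())                  # tensor(5.5000)
```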
a = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(a.std(dim=0))
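By default PyTorch computes the unbiased (sample) standard deviation, dividing by $n-1$:
$$
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}
$$
Passing `unbiased=False` switches to the population estimator, which divides by $n$.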
A gradient, also called the slope, is the rate of change of a function at a particular point. PyTorch can calculate gradients automatically and accumulate them; setting a single parameter, `requires_grad`, when defining a tensor is enough to track its gradient.
x = torch.tensor([2],dtype=torch.float64, requires_grad = True)
x
y = 5 * ((x + 1)**2)
y
y.backward()
x.grad
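This matches the hand calculation: for $y = 5(x+1)^2$, the derivative is $\frac{dy}{dx} = 10(x+1)$, which at $x = 2$ gives $30$.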
x = torch.ones(2,requires_grad = True)
x
y = 5 * ((x + 1)**2)
y
We can observe that `y` is also a tensor, now with more than one element. But `backward()` can only be called on a scalar (a one-element tensor). We can therefore create another function, say `o`, that depends on `y` and reduces it to a scalar.
o = 0.5 * torch.sum(y)
o
o.backward()
x.grad
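Again the result agrees with the hand calculation: $o = \frac{1}{2}\sum_i 5(x_i+1)^2$, so $\frac{\partial o}{\partial x_i} = 5(x_i+1) = 10$ at $x_i = 1$.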
Linear regression models the relationship between two continuous variables. For example, if `x` is an independent variable and `y` depends on `x`, the linear equation is as follows, with `a` as the slope and `b` as the intercept.
$$
y = ax+b
$$
In this part of the notebook, I'm creating a linear regression model in PyTorch.
x = np.arange(0,11,1,dtype=np.float32) # creating an independent variable x
y = 2 * x + 1 # creating a dependent variable y
print(x.shape)
print(y.shape)
print(x)
print(y)
x=x.reshape(-1,1)
y=y.reshape(-1,1)
print(x.shape, y.shape)
This is the plot of the equation y = 2x + 1.
plt.scatter(x,y)
plt.plot(x,y,color="Orange",linewidth=2)
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Y = 2X + 1")
plt.show()
class LinearRegressionModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearRegressionModel, self).__init__()
        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        out = self.linear(x)
        return out
input_dim=1
output_dim=1
model = LinearRegressionModel(input_dim,output_dim)
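Printing the model and its parameters is a quick sanity check that the linear layer was registered with the expected shapes:

```python
print(model)
for name, p in model.named_parameters():
    print(name, p.shape)  # linear.weight: (1, 1), linear.bias: (1,)
```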
Many loss functions can be used depending on the problem statement. Linear regression mostly uses MSE or RMSE; I'll be using MSE (mean squared error) here, where $\hat{y}_i$ is the prediction and $y_i$ the true value:
$$
MSE = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2
$$
criterion=nn.MSELoss() # loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
epochs = 100
for epoch in range(epochs):
    # convert numpy arrays to tensors
    inputs = torch.tensor(x, dtype=torch.float32, requires_grad=True)
    labels = torch.tensor(y, dtype=torch.float32)
    # clear the gradients
    optimizer.zero_grad()
    # forward pass to get outputs
    outputs = model(inputs)
    # calculate the loss
    loss = criterion(outputs, labels)
    # backpropagation
    loss.backward()
    # update the parameters
    optimizer.step()
print("Training complete...")
The true values are:
y
The predicted values are:
predicted = model(torch.tensor(x)).detach().numpy()
predicted
plt.plot(x, y, 'o', color="blue", label='True data')
# Plot predictions
plt.plot(x, predicted, '--', color="red", label='Predictions')
# Legend and plot
plt.legend(loc='best')
plt.show()
After this step, you can check the ./linear_regression_model.pkl file path, where the model's learned parameters (its state_dict) are saved.
torch.save(model.state_dict(),'linear_regression_model.pkl')
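To restore the model later, load the state dict into a fresh instance; a minimal sketch:

```python
loaded = LinearRegressionModel(input_dim, output_dim)
loaded.load_state_dict(torch.load('linear_regression_model.pkl'))
loaded.eval()  # switch to inference mode
```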
I'll be updating it regularly. If it helped you, consider giving an upvote. ✌🏼
Check my other notebooks
- https://www.kaggle.com/namanmanchanda/asl-detection-99-accuracy
- https://www.kaggle.com/namanmanchanda/rnn-in-pytorch
- https://www.kaggle.com/namanmanchanda/heart-attack-eda-prediction-90-accuracy
- https://www.kaggle.com/namanmanchanda/stroke-eda-and-ann-prediction
The code in this notebook has been referenced from various sources available online.