Sparse to Dense Matrix Scipy

In this class, we discuss how to convert a sparse matrix to a dense matrix in SciPy.

Sparse Matrix

The reader should have prior knowledge of the pandas data frame and the NumPy array.

A matrix is said to be sparse if most of its elements are zero.

Much of the data we work with in machine learning is sparse.

So we often use a sparse matrix representation in data science.

A sparse matrix representation stores only the non-zero elements and their positions.

Let us take an example to understand the sparse matrix representation in SciPy.

import pandas as pd
# build a data frame in which most of the entries are zero
data = [[1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 2, 0], [0, 0, 3, 0, 0, 0]]
df = pd.DataFrame(data, columns=['A', 'B', 'C', 'D', 'E', 'F'])
print(df)

Output:
   A  B  C  D  E  F
0  1  0  0  0  0  0
1  0  0  0  1  2  0
2  0  0  3  0  0  0

# convert the data frame to a NumPy array
x = df.to_numpy(dtype=int)
print(x)

Output:
[[1 0 0 0 0 0]
 [0 0 0 1 2 0]
 [0 0 3 0 0 0]]

# defining a sparse matrix
from scipy.sparse import csr_matrix
sparsematrix = csr_matrix(x)
print(sparsematrix)

Output:
  (0, 0)	1
  (1, 3)	1
  (1, 4)	2
  (2, 2)	3

In the above example, most of the data consists of zeros.

We took a data frame and converted it to a NumPy array.

We use the csr_matrix class in SciPy to generate a sparse matrix.

The csr_matrix class takes the array as a parameter.

Each line of the printed sparse matrix shows the row index, the column index, and the value stored at that location.
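The (row, column) value triples in the printed output can also be read programmatically. The following is a minimal sketch, assuming the same 3 x 6 data as above, rebuilt directly with NumPy so that the snippet runs on its own. The tocoo method and the data, indices, and indptr attributes are not covered in this class; they are used here only to show what is stored internally.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same values as the example above, rebuilt without pandas (an assumption
# made only to keep this snippet self-contained).
x = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 0, 1, 2, 0],
              [0, 0, 3, 0, 0, 0]])
sparsematrix = csr_matrix(x)

# Coordinate (COO) view: one row index, column index, and value per
# stored element.
coo = sparsematrix.tocoo()
triples = [(int(r), int(c), int(v))
           for r, c, v in zip(coo.row, coo.col, coo.data)]
for r, c, v in triples:
    print(r, c, v)

# The three arrays that the CSR format actually stores.
print(sparsematrix.data)     # the non-zero values
print(sparsematrix.indices)  # the column index of each stored value
print(sparsematrix.indptr)   # where each row starts inside data/indices
```

Only the four non-zero values and their positions are kept; the fourteen zeros are not stored.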

Attributes

shape attribute: The shape attribute gives the shape of the matrix, that is, the number of rows and the number of columns.

An example is shown below.

# shape attribute
print(sparsematrix.shape)

Output:
(3, 6)

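dtype attribute: The dtype attribute gives the type of the data stored in the matrix.

An example is given below. The exact integer type depends on the platform and the NumPy version (for example, int32 or int64).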

# dtype attribute
print(sparsematrix.dtype)

Output:
int32
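Because the default integer type varies between platforms, it can help to set the dtype explicitly. The sketch below is one way to do that, assuming the same data as above rebuilt with NumPy. The variable names sparse_int, sparse_float, and sparse_small are hypothetical and used only for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same values as the example above, with the integer width fixed explicitly.
x = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 0, 1, 2, 0],
              [0, 0, 3, 0, 0, 0]], dtype=np.int64)

# The sparse matrix inherits the dtype of the input array.
sparse_int = csr_matrix(x)
print(sparse_int.dtype)

# The dtype parameter of csr_matrix overrides the dtype of the input.
sparse_float = csr_matrix(x, dtype=np.float64)
print(sparse_float.dtype)

# astype returns a converted copy of an existing sparse matrix.
sparse_small = sparse_int.astype(np.int8)
print(sparse_small.dtype)
```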

ndim attribute: The ndim attribute gives the number of dimensions of the matrix.

# ndim attribute
print(sparsematrix.ndim)

Output:
2
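The attributes can be combined to measure how sparse a matrix is. This is a small sketch, assuming the same data as above. It also uses the nnz attribute (the number of stored values), which is not covered in this class.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same values as the example above, rebuilt so the snippet runs on its own.
x = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 0, 1, 2, 0],
              [0, 0, 3, 0, 0, 0]])
sparsematrix = csr_matrix(x)

rows, cols = sparsematrix.shape   # number of rows and columns
stored = sparsematrix.nnz         # number of stored (non-zero) values
density = stored / (rows * cols)  # fraction of entries that are stored

print(rows, cols, stored)
print(round(density, 3))
```

A low density means that the sparse representation stores far fewer values than the dense array would.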

Methods

toarray method: The toarray method converts the sparse matrix to a NumPy array.

Many of the methods used in machine learning accept either sparse matrix arguments or dense array arguments.

By converting to an array, we can use the methods available in the array class.

# toarray method
densearray = sparsematrix.toarray()
print(densearray)

Output:
[[1 0 0 0 0 0]
 [0 0 0 1 2 0]
 [0 0 3 0 0 0]]
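Once converted, the result behaves like any other NumPy array. The sketch below, assuming the same data as above, checks the type of the result and applies a few ordinary array methods; the choice of sum and max is only illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same values as the example above, rebuilt so the snippet runs on its own.
x = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 0, 1, 2, 0],
              [0, 0, 3, 0, 0, 0]])
sparsematrix = csr_matrix(x)
densearray = sparsematrix.toarray()

# toarray returns a plain numpy.ndarray.
print(type(densearray))

# Ordinary array methods are now available.
column_sums = densearray.sum(axis=0)  # one total per column
largest = densearray.max()            # largest single value
print(column_sums)
print(largest)

# The conversion is lossless: the dense array equals the original input.
print(np.array_equal(densearray, x))
```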

todense method: The todense method converts the sparse matrix to a dense matrix (a numpy.matrix object).

We can then use the dense matrix methods.

Depending on our requirement, we convert the sparse matrix to either an array or a dense matrix.

If we need matrix methods such as transpose or inverse (the latter for square matrices only), we convert to a dense matrix using the todense method.

The example program is given below.

# todense method
densematrix = sparsematrix.todense()
print(densematrix)
# transpose of the dense matrix
print(densematrix.getT())

Output:
[[1 0 0 0 0 0]
 [0 0 0 1 2 0]
 [0 0 3 0 0 0]]
[[1 0 0]
 [0 0 0]
 [0 0 3]
 [0 1 0]
 [0 2 0]
 [0 0 0]]
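The practical difference between the two conversions is the type of the result. The following sketch, assuming the same data as above, compares todense and toarray and shows that the * operator means matrix multiplication on a numpy.matrix but element-wise multiplication on an array. The names gram and squared are hypothetical and used only for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same values as the example above, rebuilt so the snippet runs on its own.
x = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 0, 1, 2, 0],
              [0, 0, 3, 0, 0, 0]])
sparsematrix = csr_matrix(x)

densematrix = sparsematrix.todense()  # numpy.matrix
densearray = sparsematrix.toarray()   # numpy.ndarray
print(type(densematrix))
print(type(densearray))

# On a numpy.matrix, * is matrix multiplication: (3 x 6) * (6 x 3) -> (3 x 3).
gram = densematrix * densematrix.T
print(gram)

# On a numpy.ndarray, * multiplies element by element and keeps the shape.
squared = densearray * densearray
print(squared.shape)
```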