Data Frame Attributes
In this class, we discuss Data Frame attributes.
The reader should have prior knowledge of Data Frame creation.
Let us take an example and understand the attributes present in the data frame.
The example data frame is given below.
import pandas as pd
students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'], 'age': [25, 35, 40, 1], 'marks': [85.0, 45, 65, 20]}
df = pd.DataFrame(data=students)
print(df)
Output:
     name  age  marks
0  rajesh   25   85.0
1  suresh   35   45.0
2  mahesh   40   65.0
3       j    1   20.0
List of Attributes
columns Attribute
The columns attribute gives the column names of our data frame.
We can also index the columns attribute with a list of integer positions to get the column names at those positions.
Both uses are shown below.
# columns Attribute
print(df.columns)
print(df.columns[[1,2]])
Output:
Index(['name', 'age', 'marks'], dtype='object')
Index(['age', 'marks'], dtype='object')
Every column has an integer position, starting from 0 (0, 1, 2, ...), in addition to its name.
Using these positions, we can look up the column names.
In the above example, we display the names of the columns at positions 1 and 2.
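As a small sketch of the same idea, the example below converts the column names to a plain Python list and finds the position of a column from its name. It reuses the students data from above. The variable names column_names and age_position are only illustrative, and tolist() and get_loc() are pandas Index methods that this class has not covered.

```python
import pandas as pd

students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'],
            'age': [25, 35, 40, 1],
            'marks': [85.0, 45, 65, 20]}
df = pd.DataFrame(data=students)

# Convert the column Index to a plain Python list
column_names = df.columns.tolist()
print(column_names)

# Find the integer position of a column from its name
age_position = df.columns.get_loc('age')
print(age_position)

# A single integer position gives back one column name
print(df.columns[age_position])
```

Going from a name to a position and back this way can help when we mix label-based and position-based access.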
dtypes Attribute
The dtypes attribute gives the data type of each column.
An example using our data frame is given below.
# dtypes Attribute
print(df.dtypes)
Output:
name object
age int64
marks float64
dtype: object
In our example, the name column is shown as type object, which is how pandas stores Python strings here.
We will discuss the pandas data types in upcoming classes.
The age column is of type integer, and the marks column is of type float.
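The result of dtypes can itself be indexed by column name, which the short sketch below illustrates. It also uses select_dtypes(), a pandas method not discussed in this class, to pick only the numeric columns. The variable names are assumptions made for the example.

```python
import pandas as pd

students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'],
            'age': [25, 35, 40, 1],
            'marks': [85.0, 45, 65, 20]}
df = pd.DataFrame(data=students)

# dtypes is indexed by column name, so we can ask for one column's type
age_type = df.dtypes['age']
print(age_type)

# The same information is available from the column itself
marks_type = df['marks'].dtype
print(marks_type)

# Keep only the names of the numeric columns (integer and float)
numeric_columns = df.select_dtypes(include='number').columns.tolist()
print(numeric_columns)
```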
iloc Attribute
In our previous class, we discussed the loc attribute.
The loc attribute is used to access the elements of the data frame by row and column labels.
In the same way, iloc is used to access the elements of the data frame by integer position.
The examples are given below.
# iloc Attribute
print(df.iloc[0, 1])  # row position 0, column position 1
print("-------------------")
print(df.iloc[[0, 1]])  # list of row positions
print("-------------------")
print(df.iloc[0:1, 0:2])  # slicing: the end position is excluded
Output:
25
-------------------
     name  age  marks
0  rajesh   25   85.0
1  suresh   35   45.0
-------------------
     name  age
0  rajesh   25
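The difference between loc and iloc is easier to see when the row labels are not the default 0, 1, 2. The sketch below assumes hypothetical row labels 'a' to 'd' for the students data. A slice with iloc excludes the end position, while a slice with loc includes the end label.

```python
import pandas as pd

students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'],
            'age': [25, 35, 40, 1],
            'marks': [85.0, 45, 65, 20]}
# The row labels 'a'..'d' are made up for this illustration
df = pd.DataFrame(data=students, index=['a', 'b', 'c', 'd'])

# iloc slices by position: positions 0 and 1, position 2 is excluded
by_position = df.iloc[0:2]
print(by_position)

# loc slices by label: labels 'a' through 'b', the end label is included
by_label = df.loc['a':'b']
print(by_label)

# Both selections contain the same two rows
print(by_position.equals(by_label))
```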
index Attribute
We discussed the columns attribute above, which is used to get the column names.
In the same way, the index attribute is used to get the row labels.
The examples are shown below.
# index Attribute
import pandas as pd
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]], index=['a', 'c', 'b'], columns=['C1', 'C2'])
print(df)
print("-----------------")
print(df.index)
print(df.index[[0,1]])
Output:
   C1  C2
a   1   2
c   4   5
b   7   8
-----------------
Index(['a', 'c', 'b'], dtype='object')
Index(['a', 'c'], dtype='object')
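A label taken from the index can be passed to loc to fetch that row, as the sketch below shows. It also shows that pandas creates a RangeIndex when no index is given. The names second_label and default_df are only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['a', 'c', 'b'], columns=['C1', 'C2'])

# Take the label at position 1 and use it to select that row
second_label = df.index[1]
print(second_label)
print(df.loc[second_label])

# Without an index argument, pandas creates a RangeIndex 0, 1, 2
default_df = pd.DataFrame([[1, 2], [4, 5], [7, 8]], columns=['C1', 'C2'])
print(default_df.index)
```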
ndim Attribute
The ndim attribute gives the number of dimensions of the data frame.
A data frame has 2 dimensions because it has rows and columns.
# ndim Attribute
print(df.ndim)  # gives the number of dimensions
Output:
2
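For comparison, a single column taken from the data frame is a Series, which has only one dimension. A minimal sketch, with the variable name column chosen just for this example:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['a', 'c', 'b'], columns=['C1', 'C2'])

# A data frame has rows and columns
print(df.ndim)

# A single column is a Series, which has only rows
column = df['C1']
print(column.ndim)
```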
shape Attribute
The shape attribute gives the shape of the data frame as a tuple, i.e., (number of rows, number of columns).
The example is shown below.
# shape Attribute
print(df.shape)  # gives the shape, i.e., (rows, columns)
Output:
(3, 2)
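Because shape is an ordinary Python tuple, it can be unpacked into two variables. The sketch below assumes the names n_rows and n_cols, which are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['a', 'c', 'b'], columns=['C1', 'C2'])

# Unpack the tuple into the number of rows and the number of columns
n_rows, n_cols = df.shape
print(n_rows)
print(n_cols)

# len() on a data frame gives the number of rows
print(len(df))
```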
size Attribute
The size attribute gives the number of elements present in the data frame, i.e., the number of rows multiplied by the number of columns.
# size Attribute
print(df.size)  # gives the number of elements in the data frame
Output:
6
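The relation between size and shape can be checked directly. This is only a small sketch, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['a', 'c', 'b'], columns=['C1', 'C2'])

n_rows, n_cols = df.shape

# size equals rows multiplied by columns
total_elements = df.size
print(total_elements == n_rows * n_cols)

# For a single column, size is just the number of rows
column_size = df['C1'].size
print(column_size)
```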
values Attribute
The values attribute is used to get the elements of the data frame.
df.values returns all the elements of the data frame as a NumPy array.
We can also take the values of a single column.
In the same way, the elements of a single row can be taken as a NumPy array.
Examples are given below.
# values Attribute
import pandas as pd
students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'], 'age': [25, 35, 40, 1], 'marks': [85.0, 45, 65, 20]}
df = pd.DataFrame(data=students)
print(df)
print("----------------------")
x = df.values  # all elements of the data frame
print(x)
print(type(x))
print("----------------------")
y = df["name"].values  # elements of one column
print(y)
print(type(y))
print("------------------------")
z = df.loc[0].values  # elements of the row with label 0
print(z)
print(type(z))
Output:
     name  age  marks
0  rajesh   25   85.0
1  suresh   35   45.0
2  mahesh   40   65.0
3       j    1   20.0
----------------------
[['rajesh' 25 85.0]
 ['suresh' 35 45.0]
 ['mahesh' 40 65.0]
 ['j' 1 20.0]]
<class 'numpy.ndarray'>
----------------------
['rajesh' 'suresh' 'mahesh' 'j']
<class 'numpy.ndarray'>
------------------------
['rajesh' 25 85.0]
<class 'numpy.ndarray'>
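The pandas documentation recommends the to_numpy() method over the values attribute for new code. The sketch below is one way to use it, not part of the original class. It also shows that the array type depends on the columns selected: mixed columns give an object array, while numeric columns give a numeric array. The variable names are assumptions.

```python
import numpy as np
import pandas as pd

students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'],
            'age': [25, 35, 40, 1],
            'marks': [85.0, 45, 65, 20]}
df = pd.DataFrame(data=students)

# All columns together: strings and numbers share one object array
all_values = df.to_numpy()
print(type(all_values))
print(all_values.dtype)

# Only the numeric columns: integers are converted so the array is float
numeric_values = df[['age', 'marks']].to_numpy()
print(numeric_values.dtype)
print(numeric_values.shape)
```

Selecting the numeric columns first is usually preferable when the array will be used for calculations.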