Python for Data Science Data Frame Attributes

Data Frame Attributes

In this class, we discuss data frame attributes.


The reader should have prior knowledge of data frame creation.

Let us take an example to understand the attributes of a data frame.

The example data frame is given below.

import pandas as pd 
students={'name':['rajesh','suresh','mahesh','j'],'age':[25,35,40,1],'marks':[85.0,45,65,20]}
df=pd.DataFrame(data=students)
print(df)

Output:
     name  age  marks
0  rajesh   25   85.0
1  suresh   35   45.0
2  mahesh   40   65.0
3       j    1   20.0

List of Attributes

Columns Attribute

The columns attribute returns the column names of our data frame as an Index object.

We can also index into df.columns with integer positions to get the column names at those positions.

Both examples are shown below.

# columns Attribute
print(df.columns)
print(df.columns[[1,2]])

Output:
Index(['name', 'age', 'marks'], dtype='object')
Index(['age', 'marks'], dtype='object')

Each column has an integer position 0, 1, 2, ... counted from left to right.

Based on these positions, we can get the column names.

In the above example, we display the column names at positions 1 and 2.
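As a small sketch of the same idea, the Index returned by columns can be converted to a plain Python list, and the names picked by position can be used to select those columns. The variable names column_names and selected are only illustrative.

```python
import pandas as pd

students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'],
            'age': [25, 35, 40, 1],
            'marks': [85.0, 45, 65, 20]}
df = pd.DataFrame(data=students)

# Convert the Index object to a plain Python list
column_names = df.columns.tolist()
print(column_names)

# Pick column names by position, then use them to select those columns
selected = df[df.columns[[1, 2]]]
print(selected)
```

Here selected is a new data frame that holds only the age and marks columns.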

dtypes Attribute

The dtypes attribute gives the data type of each column, returned as a Series indexed by column name.

An example using our data frame is given below.

# dtypes Attribute
print(df.dtypes)

Output:
name      object
age        int64
marks    float64
dtype: object

In our example, the name column is shown with the type object, which pandas uses here to hold Python strings.

We discuss pandas data types in upcoming classes.

The age column is of type integer, and the marks column is of type float.
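Because dtypes is a Series, a single column's type can be looked up by column name. The sketch below also uses the helper functions in pd.api.types to test the kind of type. The variable names age_is_integer and marks_is_float are only illustrative.

```python
import pandas as pd

students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'],
            'age': [25, 35, 40, 1],
            'marks': [85.0, 45, 65, 20]}
df = pd.DataFrame(data=students)

# Look up the type of one column by its name
print(df.dtypes['marks'])

# The same information is available on a single column through .dtype
print(df['age'].dtype)

# Helper functions that test the kind of dtype
age_is_integer = pd.api.types.is_integer_dtype(df['age'])
marks_is_float = pd.api.types.is_float_dtype(df['marks'])
print(age_is_integer, marks_is_float)
```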

iloc Attribute

In our previous class, we discussed the loc attribute.

The loc attribute is used to access the elements of the data frame by row and column labels.

In the same way, iloc is used to access the elements of the data frame by integer position.

The examples are given below.

# iloc Attribute
print(df.iloc[0, 1])  # row position 0, column position 1
print("-------------------")
print(df.iloc[[0, 1]])  # list of row positions
print("-------------------")
print(df.iloc[0:1, 0:2])  # slices: row 0 only, columns 0 and 1 (end is excluded)

Output:
25
-------------------
     name  age  marks
0  rajesh   25   85.0
1  suresh   35   45.0
-------------------
     name  age
0  rajesh   25

index Attribute

We discussed the columns attribute above, which is used to get the column names.

In the same way, the index attribute is used to get the row labels.

The examples are shown below.

# index Attribute
import pandas as pd
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],index=['a', 'c', 'b'],columns=['C1', 'C2'])
print(df)
print("-----------------")

print(df.index)
print(df.index[[0,1]])

Output:
   C1  C2
a   1   2
c   4   5
b   7   8
-----------------
Index(['a', 'c', 'b'], dtype='object')
Index(['a', 'c'], dtype='object')
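When no index is passed at creation, pandas builds a default one. The sketch below shows this with a hypothetical data frame, df_default, that reuses the same numbers without the index argument.

```python
import pandas as pd

# No index argument is passed, so pandas creates a default index
df_default = pd.DataFrame([[1, 2], [4, 5], [7, 8]], columns=['C1', 'C2'])
print(df_default.index)

# Convert the index to a plain Python list of row labels
row_labels = df_default.index.tolist()
print(row_labels)
```

The default index is a RangeIndex, so the row labels are the integers 0, 1 and 2.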

ndim Attribute

The ndim attribute gives the number of dimensions of the data frame.

A data frame has 2 dimensions because it has rows and columns.

# ndim Attribute
print(df.ndim)  # gives the number of dimensions

Output:
2
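For comparison, a single column selected from the data frame is a Series, which has only one dimension. A minimal sketch, where the variable name column is only illustrative:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['a', 'c', 'b'], columns=['C1', 'C2'])

# Selecting one column gives a Series
column = df['C1']

print(df.ndim)      # dimensions of the data frame
print(column.ndim)  # dimensions of the Series
```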

shape Attribute

The shape attribute gives the shape of the data frame, i.e., the number of rows and columns, as a tuple (rows, columns).

The example is shown below.

# shape Attribute
print(df.shape)  # gives the shape, i.e., (number of rows, number of columns)

Output:
(3, 2)

size Attribute

The size attribute gives the total number of elements in the data frame, i.e., the number of rows multiplied by the number of columns.

# size Attribute
print(df.size)#gives number of elements in dataframe

Output:
6
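The shape and size attributes are related: size equals rows times columns. A short sketch that unpacks the shape tuple to check this, where the names rows and cols are only illustrative:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['a', 'c', 'b'], columns=['C1', 'C2'])

# shape is a tuple, so it can be unpacked into two variables
rows, cols = df.shape
print(rows, cols)

# size is the product of the two
print(df.size == rows * cols)
```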

values Attribute

The values attribute is used to get the elements of the data frame.

We can get all the elements of the data frame using df.values.

The values attribute returns the data as a NumPy array.

We can also get the values of a single column.

Similarly, the elements of a row can be taken and returned as a NumPy array.

Examples are given below.

# values Attribute
import pandas as pd 
students={'name':['rajesh','suresh','mahesh','j'],'age':[25,35,40,1],'marks':[85.0,45,65,20]}
df=pd.DataFrame(data=students)
print(df)
print("----------------------")

x=df.values
print(x)
print(type(x))
print("----------------------")
y=df["name"].values
print(y)
print(type(y))
print("------------------------")
z=df.loc[0].values
print(z)
print(type(z))

Output:
     name  age  marks
0  rajesh   25   85.0
1  suresh   35   45.0
2  mahesh   40   65.0
3       j    1   20.0
----------------------
[['rajesh' 25 85.0]
 ['suresh' 35 45.0]
 ['mahesh' 40 65.0]
 ['j' 1 20.0]]
<class 'numpy.ndarray'>
----------------------
['rajesh' 'suresh' 'mahesh' 'j']
<class 'numpy.ndarray'>
------------------------
['rajesh' 25 85.0]
<class 'numpy.ndarray'>
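The pandas documentation recommends the to_numpy() method over the values attribute for new code. The sketch below uses to_numpy() on the same data and shows how the array's dtype depends on the columns included. The variable names mixed and numeric are only illustrative.

```python
import pandas as pd

students = {'name': ['rajesh', 'suresh', 'mahesh', 'j'],
            'age': [25, 35, 40, 1],
            'marks': [85.0, 45, 65, 20]}
df = pd.DataFrame(data=students)

# All columns: strings and numbers together need a general object array
mixed = df.to_numpy()
print(mixed.dtype)

# Only the numeric columns: integers and floats fit in one float array
numeric = df[['age', 'marks']].to_numpy()
print(numeric.dtype)
print(numeric)
```

Selecting only the numeric columns before converting gives an array that is ready for numerical work.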
