Introduction
In this section, we discuss accessing the rows of a DataFrame by label (the loc attribute) and by position (the iloc attribute).
We first import the required libraries and modules.
import numpy as np
import pandas as pd
import random as rd
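The next step is to create a DataFrame from a dictionary of lists, using the random module and NumPy to generate the values. Because the values are random, your output will differ from the table shown here.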
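# Row labels
a = ['Tom', 'Gary', 'Lois', 'Wendy', 'Betty']
# Five random integer weights between 50 and 84
b = rd.choices(range(50, 85), k=5)
# Five random heights between 1.6 and 1.9, rounded to two decimals
c = np.random.uniform(1.6, 1.9, 5).round(2)

df = pd.DataFrame({'Weight': b,
                   'Height': c}, index=a)
df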
| | Weight | Height |
|---|---|---|
| Tom | 58 | 1.65 |
| Gary | 54 | 1.84 |
| Lois | 57 | 1.70 |
| Wendy | 79 | 1.73 |
| Betty | 65 | 1.68 |
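The outputs in the rest of this section include a third column, Ratio, that the code above never creates. Its values are consistent with Weight divided by Height, rounded to one decimal place. The following is a minimal sketch under that assumption; it also hard-codes the values from the table above (rather than drawing them at random) so that the results are reproducible.

```python
import pandas as pd

# Fixed values copied from the table above, instead of random draws.
names = ['Tom', 'Gary', 'Lois', 'Wendy', 'Betty']
df = pd.DataFrame({'Weight': [58, 54, 57, 79, 65],
                   'Height': [1.65, 1.84, 1.70, 1.73, 1.68]},
                  index=names)

# Assumption: Ratio is Weight / Height, rounded to one decimal place.
df['Ratio'] = (df['Weight'] / df['Height']).round(1)
print(df)
```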
Selection by the loc Attribute
Syntax
The loc attribute.
dataframe.loc[row_selection, column_selection]
where row_selection and column_selection are row labels and column labels, respectively: a single label, a list of labels such as ['label1', 'label2', 'label3'], or a slice of labels. loc supports slice notation and therefore accepts a bare colon (:) to select all rows or columns.
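To make the syntax concrete, here is a small sketch that combines lists of labels with the bare colon. The DataFrame is rebuilt with values hard-coded from the table above (an assumption made for reproducibility), and the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Weight': [58, 54, 57, 79, 65],
                   'Height': [1.65, 1.84, 1.70, 1.73, 1.68]},
                  index=['Tom', 'Gary', 'Lois', 'Wendy', 'Betty'])

# A list of row labels and a list of column labels.
subset = df.loc[['Tom', 'Lois'], ['Height']]

# A bare colon selects everything along that axis.
all_rows_one_col = df.loc[:, ['Weight']]
all_cols_some_rows = df.loc[['Gary', 'Wendy'], :]

print(subset.shape)              # (2, 1)
print(all_rows_one_col.shape)    # (5, 1)
print(all_cols_some_rows.shape)  # (2, 2)
```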
Selecting a Cell
We can select a cell value by specifying both row and column labels.
Example
Selecting a cell using the loc attribute.
df.loc['Tom','Height']
1.65
Selecting a Row
Select an entire row using its label. The colon (:) for column_selection indicates all columns are selected.
Example
Selecting a row using the loc attribute.
df.loc['Tom',:]
Weight 58.00
Height 1.65
Ratio 35.20
Name: Tom, dtype: float64
The above returns a Series object.
type(df.loc['Tom',:])
pandas.core.series.Series
If the row label is in a list, then a DataFrame will be created instead.
Example
Selecting a row using the loc attribute - returns a DataFrame.
df.loc[['Tom'],:]
| | Weight | Height | Ratio |
|---|---|---|---|
| Tom | 58 | 1.65 | 35.2 |
type(df.loc[['Tom'],:])
pandas.core.frame.DataFrame
It is also possible to select a row using the following shorter version by omitting the comma and colon.
Example
Selecting a row using the loc attribute - short version.
df.loc['Gary']
Weight 54.00
Height 1.84
Ratio 29.30
Name: Gary, dtype: float64
Selecting Multiple Rows
To select multiple rows, pass their labels in a list. This always returns a DataFrame.
Example
Selecting multiple rows using the loc attribute.
df.loc[['Tom','Betty'],:]
| | Weight | Height | Ratio |
|---|---|---|---|
| Tom | 58 | 1.65 | 35.2 |
| Betty | 65 | 1.68 | 38.7 |
We can also select multiple rows by omitting the comma and colon. The rows are returned in the order in which the labels are listed.
Example
Selecting multiple rows using the loc attribute - short version.
df.loc[['Gary','Tom']]
| | Weight | Height | Ratio |
|---|---|---|---|
| Gary | 54 | 1.84 | 29.3 |
| Tom | 58 | 1.65 | 35.2 |
We can also select multiple contiguous rows (adjacent to each other with no gap) using the colon : notation.
Example
Contiguous row selection.
df.loc['Gary':'Wendy',:]
| | Weight | Height | Ratio |
|---|---|---|---|
| Gary | 54 | 1.84 | 29.3 |
| Lois | 57 | 1.70 | 33.5 |
| Wendy | 79 | 1.73 | 45.7 |
Example
Contiguous row selection - short version.
df.loc['Gary':'Betty']
| | Weight | Height | Ratio |
|---|---|---|---|
| Gary | 54 | 1.84 | 29.3 |
| Lois | 57 | 1.70 | 33.5 |
| Wendy | 79 | 1.73 | 45.7 |
| Betty | 65 | 1.68 | 38.7 |
When slicing by label with the colon (:) notation, loc includes the end label in the result. This differs from standard Python slicing (and from iloc), where the end position is excluded.
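The difference can be seen by slicing a plain Python list and a DataFrame side by side. This is a minimal sketch that rebuilds the DataFrame with values hard-coded from the table at the start of the section (an assumption made for reproducibility).

```python
import pandas as pd

df = pd.DataFrame({'Weight': [58, 54, 57, 79, 65],
                   'Height': [1.65, 1.84, 1.70, 1.73, 1.68]},
                  index=['Tom', 'Gary', 'Lois', 'Wendy', 'Betty'])
names = list(df.index)

# Plain Python slicing by position excludes the end: positions 1 and 2 only.
python_slice = names[1:3]
print(python_slice)  # ['Gary', 'Lois']

# Label slicing with loc includes the end label 'Wendy'.
label_slice = list(df.loc['Gary':'Wendy'].index)
print(label_slice)   # ['Gary', 'Lois', 'Wendy']
```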
Summary
Row selection by the loc attribute.
| Selection | Return Data Type | Example |
|---|---|---|
| Single value | Scalar | df.loc['Tom','Height'] |
| Single row | Series | df.loc['Gary',:] |
| Single row | DataFrame | df.loc[['Gary'],:] |
| Multiple rows | DataFrame | df.loc[['Tom','Betty'],:] |
| Contiguous rows | DataFrame | df.loc['Gary':'Wendy',:] |
When selecting rows using the loc attribute, it is possible to omit the comma and colon. For example, df.loc['Gary',:] becomes df.loc['Gary'].
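The return types in the summary and the equivalence of the long and short forms can be checked directly. The sketch below assumes the DataFrame values from the table at the start of the section, hard-coded for reproducibility.

```python
import pandas as pd

df = pd.DataFrame({'Weight': [58, 54, 57, 79, 65],
                   'Height': [1.65, 1.84, 1.70, 1.73, 1.68]},
                  index=['Tom', 'Gary', 'Lois', 'Wendy', 'Betty'])

# The long and short forms select the same row.
row_full = df.loc['Gary', :]
row_short = df.loc['Gary']
print(row_full.equals(row_short))  # True

# A single label gives a Series; a list of labels gives a DataFrame.
single = df.loc['Gary']
as_frame = df.loc[['Gary']]
print(type(single).__name__)    # Series
print(type(as_frame).__name__)  # DataFrame
```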
Selection by the iloc Attribute
Syntax
The iloc attribute.
dataframe.iloc[row_selection, column_selection]
where row_selection and column_selection are integer positions (counted from 0) of rows and columns, respectively: a single integer, a list of integers, or a slice. iloc supports slice notation and therefore accepts a bare colon (:) to select all rows or columns.
When selecting data with the iloc attribute, slices follow the standard Python half-open interval: the start position is included and the end position is excluded.
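As a quick illustration of the half-open interval, the sketch below slices positions 1 up to (but not including) 3. The DataFrame values are hard-coded from the table at the start of the section, which is an assumption made for reproducibility.

```python
import pandas as pd

df = pd.DataFrame({'Weight': [58, 54, 57, 79, 65],
                   'Height': [1.65, 1.84, 1.70, 1.73, 1.68]},
                  index=['Tom', 'Gary', 'Lois', 'Wendy', 'Betty'])

# Positions 1 and 2 are returned; position 3 ('Wendy') is excluded.
rows = df.iloc[1:3]
print(list(rows.index))  # ['Gary', 'Lois']

# The number of rows returned equals stop - start.
print(len(rows))         # 2
```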
Selecting a Cell
We can select a cell value by specifying both row and column indices.
Example
Selecting a cell using the iloc attribute.
df.iloc[0,0]
58
Selecting a Row
Select an entire row using its integer position. The colon (:) in column_selection indicates all columns are selected.
Example
Selecting a row using the iloc attribute.
df.iloc[1,:]
Weight 54.00
Height 1.84
Ratio 29.30
Name: Gary, dtype: float64
The above returns a Series object.
type(df.iloc[1,:])
pandas.core.series.Series
If the row index is in a list, then a DataFrame will be created instead.
Example
Selecting a row using the iloc attribute - returns a DataFrame.
df.iloc[[1],:]
| | Weight | Height | Ratio |
|---|---|---|---|
| Gary | 54 | 1.84 | 29.3 |
type(df.iloc[[1],:])
pandas.core.frame.DataFrame
Selecting Multiple Rows
Selecting multiple rows (whose indices are inside a list) will always return a DataFrame.
Example
Selecting multiple rows using the iloc attribute.
df.iloc[[1,3],:]
| | Weight | Height | Ratio |
|---|---|---|---|
| Gary | 54 | 1.84 | 29.3 |
| Wendy | 79 | 1.73 | 45.7 |
We can also select multiple contiguous rows (adjacent to each other with no gap) using the slice (colon :) notation.
Example
Contiguous row selection.
df.iloc[1:3,:]
| | Weight | Height | Ratio |
|---|---|---|---|
| Gary | 54 | 1.84 | 29.3 |
| Lois | 57 | 1.70 | 33.5 |
Summary
Row selection by the iloc attribute.
| Selection | Return Data Type | Example |
|---|---|---|
| Single value | Scalar | df.iloc[1,2] |
| Single row | Series | df.iloc[2,:] |
| Single row | DataFrame | df.iloc[[2],:] |
| Multiple rows | DataFrame | df.iloc[[3,1],:] |
| Contiguous rows | DataFrame | df.iloc[1:3,:] |
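Since loc and iloc address the same rows in two different ways, it can help to translate between them. The sketch below uses Index.get_loc to find the position of a label, then compares the two selections. It assumes unique row labels and the DataFrame values from the table at the start of the section, hard-coded for reproducibility.

```python
import pandas as pd

df = pd.DataFrame({'Weight': [58, 54, 57, 79, 65],
                   'Height': [1.65, 1.84, 1.70, 1.73, 1.68]},
                  index=['Tom', 'Gary', 'Lois', 'Wendy', 'Betty'])

# Position of the label 'Gary' in the row index.
pos = df.index.get_loc('Gary')
print(pos)  # 1

# Selecting by position and by label returns the same row.
same_row = df.iloc[pos].equals(df.loc['Gary'])
print(same_row)  # True

# The label slice includes its end label; the position slice excludes its end.
same_slice = df.loc['Gary':'Lois'].equals(df.iloc[1:3])
print(same_slice)  # True
```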