Sorting and Rearranging Rows

Table of Contents

Introduction

In this article, we will discuss the following operations on the rows of a DataFrame:

sorting rows
rearranging rows

We first import the required libraries and modules.

1import numpy as np
2import pandas as pd

We then create a dataframe with the help of some random functions in NumPy.

 1R = np.random.RandomState(10) # set a random seed
 2a='Tom Gary Lois Wendy Betty Jacobs Donald Yasmine Nick Andrew'.split()
 3b=R.choice(range(50,85), size=10, replace=False)
 4c=R.uniform(1.6, 1.9, 10).round(2)
 5d=R.choice(['Germany', 'UK', 'USA', 'France', 'Canada'], size=10)
 6
 7df = pd.DataFrame({'Weight':b,
 8                   'Height':c,
 9                   'Country':d}, index=a)
10df

	Weight	Height	Country
Tom	73	1.78	USA
Gary	74	1.65	Germany
Lois	52	1.85	UK
Wendy	57	1.63	USA
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Yasmine	64	1.83	UK
Nick	76	1.60	France
Andrew	63	1.78	Canada

Sorting Rows

We usually sort the rows of a DataFrame in three typical scenarios:

sorting rows based on the index.
sorting rows base on values of a single column.
sorting rows base on values of multiple columns.

Sorting Rows Based on Index

To sort the rows according to the first alphabet.

Example

Sorting rows according to the first alphabet of index.

1df.sort_index()

	Weight	Height	Country
Andrew	63	1.78	Canada
Betty	55	1.69	Germany
Donald	62	1.62	France
Gary	74	1.65	Germany
Jacobs	53	1.62	Germany
Lois	52	1.85	UK
Nick	76	1.60	France
Tom	73	1.78	USA
Wendy	57	1.63	USA
Yasmine	64	1.83	UK

To rearrange the rows according to the first alphabet, in reverse order.

Example

Sorting rows according to the first alphabet of index (reverse order).

1df.sort_index(ascending=False)

	Weight	Height	Country
Yasmine	64	1.83	UK
Wendy	57	1.63	USA
Tom	73	1.78	USA
Nick	76	1.60	France
Lois	52	1.85	UK
Jacobs	53	1.62	Germany
Gary	74	1.65	Germany
Donald	62	1.62	France
Betty	55	1.69	Germany
Andrew	63	1.78	Canada

A key function can be specified which is applied to the index before sorting.

For example, to sort the rows according to the length of the row labels.

Example

Sorting rows according to the length of the index.

1df.sort_index(key=lambda x: x.str.len())

	Weight	Height	Country
Tom	73	1.78	USA
Gary	74	1.65	Germany
Lois	52	1.85	UK
Nick	76	1.60	France
Wendy	57	1.63	USA
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Andrew	63	1.78	Canada
Yasmine	64	1.83	UK

Same as above but in descending order.

Example

Sorting rows according to the length of the index (descending).

1df.sort_index(key=lambda x: x.str.len(), ascending=False)

	Weight	Height	Country
Yasmine	64	1.83	UK
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Andrew	63	1.78	Canada
Wendy	57	1.63	USA
Betty	55	1.69	Germany
Gary	74	1.65	Germany
Lois	52	1.85	UK
Nick	76	1.60	France
Tom	73	1.78	USA

Note

The original DataFrame is not modified. However, we can sort the rows in-place by including the argument inplace=True. This will do away with the need to reassign the DataFrame.

Example

Sorting rows according to the length of the index (in-place).

1df.sort_index(key=lambda x: x.str.len(), inplace=True)
2df

	Weight	Height	Country
Tom	73	1.78	USA
Gary	74	1.65	Germany
Lois	52	1.85	UK
Nick	76	1.60	France
Wendy	57	1.63	USA
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Andrew	63	1.78	Canada
Yasmine	64	1.83	UK

Sorting Rows Based on Values of a Single Column

In this example, we will sort the rows in ascending order of ‘Weight’ using the sort_values() method.

Example

Sorting rows by the ‘Weight’ column (ascending).

1df.sort_values(by ='Weight', axis=0)

	Weight	Height	Country
Lois	52	1.85	UK
Jacobs	53	1.62	Germany
Betty	55	1.69	Germany
Wendy	57	1.63	USA
Donald	62	1.62	France
Andrew	63	1.78	Canada
Yasmine	64	1.83	UK
Tom	73	1.78	USA
Gary	74	1.65	Germany
Nick	76	1.60	France

We can also sort the rows in descending order of ‘Weight’ by using the optional argument ascending=False.

Example

Sorting rows by the ‘Weight’ column (descending).

1df.sort_values(by ='Weight', axis=0, ascending=False)

	Weight	Height	Country
Nick	76	1.60	France
Gary	74	1.65	Germany
Tom	73	1.78	USA
Yasmine	64	1.83	UK
Andrew	63	1.78	Canada
Donald	62	1.62	France
Wendy	57	1.63	USA
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Lois	52	1.85	UK

Sorting Rows Based on Values of Multiple Columns

In this example, we will sort the rows by ‘Country’ and then by ‘Height’, both in ascending (default) order.

Example

Sorting rows by 2 columns.

1df.sort_values(by=['Country', 'Height'], axis=0)

	Weight	Height	Country
Andrew	63	1.78	Canada
Nick	76	1.60	France
Donald	62	1.62	France
Jacobs	53	1.62	Germany
Gary	74	1.65	Germany
Betty	55	1.69	Germany
Yasmine	64	1.83	UK
Lois	52	1.85	UK
Wendy	57	1.63	USA
Tom	73	1.78	USA

We can also employ mixed sorting where different columns can be assigned either ascending or descending order.

Example

Sorting rows by ‘Country’ (alphabetical order) and then by ‘Height’ (descending order)

1df.sort_values(by=['Country', 'Height'], axis=0, ascending=[True,False])

	Weight	Height	Country
Andrew	63	1.78	Canada
Donald	62	1.62	France
Nick	76	1.60	France
Betty	55	1.69	Germany
Gary	74	1.65	Germany
Jacobs	53	1.62	Germany
Lois	52	1.85	UK
Yasmine	64	1.83	UK
Tom	73	1.78	USA
Wendy	57	1.63	USA

Rearranging Rows

We discuss two methods of rearranging rows:

by manipulating row labels
by manipulating row integer indices

But first, let’s print out our current DataFrame.

1df

	Weight	Height	Country
Tom	73	1.78	USA
Gary	74	1.65	Germany
Lois	52	1.85	UK
Nick	76	1.60	France
Wendy	57	1.63	USA
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Andrew	63	1.78	Canada
Yasmine	64	1.83	UK

Manipulating Row Labels

To move any row to any position, we can manipulate the list of row labels.

Example

Moving a row to a specified position: insert ‘Wendy’ between ‘Donald’ and ‘Andrew’.

Step 1 : Get list of row labels.

1ind = df.index.tolist()
2ind

['Tom',
 'Gary',
 'Lois',
 'Nick',
 'Wendy',
 'Betty',
 'Jacobs',
 'Donald',
 'Andrew',
 'Yasmine']

We can use the insert() method to insert the desired row to its destination position.

Step 2 : Using insert() method to manipulate label positions.

1rem=ind.index("Wendy") # index of row to be moved
2ins=ind.index("Donald") # index of destination row
3
4ind.pop(rem) # remove "Wendy"
5ind.insert(ins,'Wendy') # insert "Wendy"
6
7ind

['Tom',
 'Gary',
 'Lois',
 'Nick',
 'Betty',
 'Jacobs',
 'Donald',
 'Wendy',
 'Andrew',
 'Yasmine']

We now feed the above modified list of row labels into our DataFrame using the loc attribute.

Step 3 : Feeding new label to DataFrame using loc attribute.

1df.loc[ind]

	Weight	Height	Country
Tom	73	1.78	USA
Gary	74	1.65	Germany
Lois	52	1.85	UK
Nick	76	1.60	France
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Wendy	57	1.63	USA
Andrew	63	1.78	Canada
Yasmine	64	1.83	UK

Manipulating Row Indices

We can also rearrange rows of a DataFrame by manipulating the integer index of the rows. Let’s first print out the current DataFrame.

1df

	Weight	Height	Country
Tom	73	1.78	USA
Gary	74	1.65	Germany
Lois	52	1.85	UK
Nick	76	1.60	France
Wendy	57	1.63	USA
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Andrew	63	1.78	Canada
Yasmine	64	1.83	UK

Example

Moving a row to a specified position: insert ‘Wendy’ between ‘Donald’ and ‘Andrew’.

The first step is to reset the current index to an integer index.

Step 1 : Reset to integer index using reset_index() method.

1df1 = df.reset_index()
2df1

	index	Weight	Height	Country
0	Tom	73	1.78	USA
1	Gary	74	1.65	Germany
2	Lois	52	1.85	UK
3	Nick	76	1.60	France
4	Wendy	57	1.63	USA
5	Betty	55	1.69	Germany
6	Jacobs	53	1.62	Germany
7	Donald	62	1.62	France
8	Andrew	63	1.78	Canada
9	Yasmine	64	1.83	UK

We then generate the list of integer row indices.

Step 2 : Get list of row indices using index.tolist().

1ind = df1.index.tolist()
2ind

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

We can then pick the relevant row indices (corresponding to both ‘Wendy’ and ‘Donald’) to be manipulated.

Step 3 : Using insert() and pop() methods to manipulate row indices.

1rem=4 # index of row to be moved
2ins=7 # index of destination row
3
4ind.pop(rem) # remove "Wendy"
5ind.insert(ins,rem) # insert "Wendy"
6ind

[0, 1, 2, 3, 5, 6, 7, 4, 8, 9]

We now feed the above modified list of row indices into our DataFrame using the loc attribute.

Step 4 : Feeding new index to DataFrame using loc attribute.

1df1 = df1.loc[ind]
2df1

	index	Weight	Height	Country
0	Tom	73	1.78	USA
1	Gary	74	1.65	Germany
2	Lois	52	1.85	UK
3	Nick	76	1.60	France
5	Betty	55	1.69	Germany
6	Jacobs	53	1.62	Germany
7	Donald	62	1.62	France
4	Wendy	57	1.63	USA
8	Andrew	63	1.78	Canada
9	Yasmine	64	1.83	UK

If desired, we may remove the numerical index and revert back to the row labels by setting the ‘index’ column as the new index.

Step 5 : Revert back to original row labels using set_index() method.

1df1.set_index('index').rename_axis(None, axis=0)

	Weight	Height	Country
Tom	73	1.78	USA
Gary	74	1.65	Germany
Lois	52	1.85	UK
Nick	76	1.60	France
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Wendy	57	1.63	USA
Andrew	63	1.78	Canada
Yasmine	64	1.83	UK

The rename_axis(None, axis=0) method ensures that the index column remains unnamed.

Sorting and Rearranging Rows

Introduction

Sorting Rows

Sorting Rows Based on Index

Sorting Rows Based on Values of a Single Column

Sorting Rows Based on Values of Multiple Columns

Rearranging Rows

Manipulating Row Labels

Manipulating Row Indices

Share this Article