Sorting and Rearranging Columns

Table of Contents

Introduction

In this article, we will discuss the following operations on the columns of a DataFrame:

sorting columns
rearranging columns
re-indexing columns

We first import the required libraries and modules.

1import numpy as np
2import pandas as pd

We then create a dataframe with the help of some random functions in NumPy.

 1R = np.random.RandomState(10) # set a random seed
 2a='Tom Gary Lois Wendy Betty Jacobs Donald Yasmine Nick Andrew'.split()
 3b=R.choice(range(50,85), size=10, replace=False)
 4c=R.uniform(1.6, 1.9, 10).round(2)
 5d=R.choice(['Germany', 'UK', 'USA', 'France', 'Canada'], size=10)
 6
 7df = pd.DataFrame({'Weight':b,
 8                   'Height':c,
 9                   'Country':d}, index=a)
10df = df.T
11df

	Tom	Gary	Lois	Wendy	Betty	Jacobs	Donald	Yasmine	Nick	Andrew
Weight	73	74	52	57	55	53	62	64	76	63
Height	1.78	1.65	1.85	1.63	1.69	1.62	1.62	1.83	1.6	1.78
Country	USA	Germany	UK	USA	Germany	Germany	France	UK	France	Canada

Sorting Columns

Sorting Columns by First Letter

To sort the columns according to the first alphabet.

Example

Sorting columns by first alphabet of the labels.

1df[sorted(df.columns)]

	Andrew	Betty	Donald	Gary	Jacobs	Lois	Nick	Tom	Wendy	Yasmine
Weight	63	55	62	74	53	52	76	73	57	64
Height	1.78	1.69	1.62	1.65	1.62	1.85	1.6	1.78	1.63	1.83
Country	Canada	Germany	France	Germany	Germany	UK	France	USA	USA	UK

To sort the columns according to the first alphabet, in reverse order.

1df[sorted(df.columns, reverse=True)]

	Yasmine	Wendy	Tom	Nick	Lois	Jacobs	Gary	Donald	Betty	Andrew
Weight	64	57	73	76	52	53	74	62	55	63
Height	1.83	1.63	1.78	1.6	1.85	1.62	1.65	1.62	1.69	1.78
Country	UK	USA	USA	France	UK	Germany	Germany	France	Germany	Canada

Example

Sorting columns by length of the labels.

Recall that we can sort a Python list according to the length of the items using the key argument which is a function used to decide the order.

👀 Review

The sorted() Function.

In this case, we employ the len() function as the key.

1sorted(df.columns, key=lambda x: len(x))

['Tom',
 'Gary',
 'Lois',
 'Nick',
 'Wendy',
 'Betty',
 'Jacobs',
 'Donald',
 'Andrew',
 'Yasmine']

We can then sort the columns according to the length of the column labels.

1df[sorted(df.columns, key=lambda x: len(x))]

	Tom	Gary	Lois	Nick	Wendy	Betty	Jacobs	Donald	Andrew	Yasmine
Weight	73	74	52	76	57	55	53	62	63	64
Height	1.78	1.65	1.85	1.6	1.63	1.69	1.62	1.62	1.78	1.83
Country	USA	Germany	UK	France	USA	Germany	Germany	France	Canada	UK

Example

Sorting columns by length of the labels (reverse order).

We first sort the list of column labels in reverse order according to the length.

1sorted(df.columns, key=lambda x: len(x), reverse=True)

['Yasmine',
 'Jacobs',
 'Donald',
 'Andrew',
 'Wendy',
 'Betty',
 'Gary',
 'Lois',
 'Nick',
 'Tom']

We then feed the sorted list into the DataFrame.

1df[sorted(df.columns, key=lambda x: len(x), reverse=True)]

	Yasmine	Jacobs	Donald	Andrew	Wendy	Betty	Gary	Lois	Nick	Tom
Weight	64	53	62	63	57	55	74	52	76	73
Height	1.83	1.62	1.62	1.78	1.63	1.69	1.65	1.85	1.6	1.78
Country	UK	Germany	France	Canada	USA	Germany	Germany	UK	France	USA

Sort columns based on values of a single row.

Example

Sorting columns in ascending order of weight.

1df.sort_values(by ='Weight', axis=1)

	Lois	Jacobs	Betty	Wendy	Donald	Andrew	Yasmine	Tom	Gary	Nick
Weight	52	53	55	57	62	63	64	73	74	76
Height	1.85	1.62	1.69	1.63	1.62	1.78	1.83	1.78	1.65	1.6
Country	UK	Germany	Germany	USA	France	Canada	UK	USA	Germany	France

Example

Sorting columns in descending order of weight.

We can include the optional argument ascending=False to sort the columns in descending order of ‘Weight’.

1df.sort_values(by ='Weight', axis=1, ascending=False)

	Nick	Gary	Tom	Yasmine	Andrew	Donald	Wendy	Betty	Jacobs	Lois
Weight	76	74	73	64	63	62	57	55	53	52
Height	1.6	1.65	1.78	1.83	1.78	1.62	1.63	1.69	1.62	1.85
Country	France	Germany	USA	UK	Canada	France	USA	Germany	Germany	UK

Sort columns based on values of multiple rows.

Example

Sorting columns by ‘Country’ (alphabetical order) and then by ‘Height’.

1df.sort_values(by =['Country', 'Height'], axis=1)

	Andrew	Nick	Donald	Jacobs	Gary	Betty	Yasmine	Lois	Wendy	Tom
Weight	63	76	62	53	74	55	64	52	57	73
Height	1.78	1.6	1.62	1.62	1.65	1.69	1.83	1.85	1.63	1.78
Country	Canada	France	France	Germany	Germany	Germany	UK	UK	USA	USA

Similar to above but in descending order.

Example

Sorting columns by ‘Country’ (reverse alphabetical order) and then by ‘Height’ (descending order).

1df.sort_values(by =['Country', 'Height'], axis=1, ascending=False)

	Tom	Wendy	Lois	Yasmine	Betty	Gary	Jacobs	Donald	Nick	Andrew
Weight	73	57	52	64	55	74	53	62	76	63
Height	1.78	1.63	1.85	1.83	1.69	1.65	1.62	1.62	1.6	1.78
Country	USA	USA	UK	UK	Germany	Germany	Germany	France	France	Canada

We can also employ mixed sorting where different rows can be assigned either ascending or descending order.

Example

Sorting columns by ‘Country’ (alphabetical order) and then by ‘Height’ (descending order).

1df.sort_values(by =['Country', 'Height'], axis=1, ascending=[True,False])

	Andrew	Donald	Nick	Betty	Gary	Jacobs	Lois	Yasmine	Tom	Wendy
Weight	63	62	76	55	74	53	52	64	73	57
Height	1.78	1.62	1.6	1.69	1.65	1.62	1.85	1.83	1.78	1.63
Country	Canada	France	France	Germany	Germany	Germany	UK	UK	USA	USA

Rearranging Columns

Our first example is to move the last column to become the first column (after index column).

Example

Rearranging columns by manipulating the list of column labels.

The first step is to create a list of the column labels using the tolist() method.

1cols = df.columns.tolist()
2cols

['Tom',
 'Gary',
 'Lois',
 'Wendy',
 'Betty',
 'Jacobs',
 'Donald',
 'Yasmine',
 'Nick',
 'Andrew']

The following code will extract the last element as a list.

1cols[-1:]

['Andrew']

And the following code will slice the original list by excluding the last element.

1cols[:-1]

['Tom',
 'Gary',
 'Lois',
 'Wendy',
 'Betty',
 'Jacobs',
 'Donald',
 'Yasmine',
 'Nick']

We now concatenate (+) the two lists to create the desired list of (rearranged) column labels.

1cols1 = cols[-1:] + cols[:-1]
2cols1

['Andrew',
 'Tom',
 'Gary',
 'Lois',
 'Wendy',
 'Betty',
 'Jacobs',
 'Donald',
 'Yasmine',
 'Nick']

Finally, we feed the list of column labels into the DataFrame.

1df[cols1]

	Andrew	Tom	Gary	Lois	Wendy	Betty	Jacobs	Donald	Yasmine	Nick
Weight	63	73	74	52	57	55	53	62	64	76
Height	1.78	1.78	1.65	1.85	1.63	1.69	1.62	1.62	1.83	1.6
Country	Canada	USA	Germany	UK	USA	Germany	Germany	France	UK	France

To move any column to any position, we can proceed in a likewise manner.

Example

Rearranging columns.

For example, to move the column ‘Wendy’ to column ‘Yasmine’ position.

1rem=cols.index("Wendy") # index of column to be moved
2ins=cols.index("Yasmine") # index of destination column
3
4cols.pop(rem) # remove "Wendy"
5cols.insert(ins,'Wendy') # insert "Wendy"
6
7cols

['Tom',
 'Gary',
 'Lois',
 'Betty',
 'Jacobs',
 'Donald',
 'Yasmine',
 'Wendy',
 'Nick',
 'Andrew']

We then feed the above list of rearranged columns to the DataFrame.

1df[cols]

	Tom	Gary	Lois	Betty	Jacobs	Donald	Yasmine	Wendy	Nick	Andrew
Weight	73	74	52	55	53	62	64	57	76	63
Height	1.78	1.65	1.85	1.69	1.62	1.62	1.83	1.63	1.6	1.78
Country	USA	Germany	UK	Germany	Germany	France	UK	USA	France	Canada

Re-indexing the Columns

Re-indexing a DataFrame allows us to completely redefine the columns of a DataFrame by specifying the columns to keep or create.

We will work with the transposed version of the DataFrame and name it df1.

1df1=df.T
2df1

	Weight	Height	Country
Tom	73	1.78	USA
Gary	74	1.65	Germany
Lois	52	1.85	UK
Wendy	57	1.63	USA
Betty	55	1.69	Germany
Jacobs	53	1.62	Germany
Donald	62	1.62	France
Yasmine	64	1.83	UK
Nick	76	1.6	France
Andrew	63	1.78	Canada

We now create a new column with label ‘Ratio’ which is the ratio of the ‘Weight’ and ‘Height’ columns.

Example

Creating a new column from two existing columns.

1df1['Ratio'] = df1['Weight']/df1['Height']
2df1

	Weight	Height	Country	Ratio
Tom	73	1.78	USA	41.011236
Gary	74	1.65	Germany	44.848485
Lois	52	1.85	UK	28.108108
Wendy	57	1.63	USA	34.969325
Betty	55	1.69	Germany	32.544379
Jacobs	53	1.62	Germany	32.716049
Donald	62	1.62	France	38.271605
Yasmine	64	1.83	UK	34.972678
Nick	76	1.6	France	47.5
Andrew	63	1.78	Canada	35.393258

We now use the reindex method to keep only the ‘Height’ and ‘Ratio’ columns while creating a new column with label ‘Placeholder’. The argument axis=1 tells pandas we are reindexing the column labels.

Example

Reindexing columns.

1df1.reindex(['Height', 'Ratio', 'Placeholder'], axis=1)

	Height	Ratio	Placeholder
Tom	1.78	41.011236	NaN
Gary	1.65	44.848485	NaN
Lois	1.85	28.108108	NaN
Wendy	1.63	34.969325	NaN
Betty	1.69	32.544379	NaN
Jacobs	1.62	32.716049	NaN
Donald	1.62	38.271605	NaN
Yasmine	1.83	34.972678	NaN
Nick	1.6	47.5	NaN
Andrew	1.78	35.393258	NaN

By default, values in the new column that do not have corresponding records in the DataFrame are assigned NaN.

We can fill in the missing values by passing a value to the parameter fill_value.

Example

Reindexing columns with fill_value=0.

1df1.reindex(['Height', 'Ratio', 'Placeholder'], axis=1, fill_value=0)

	Height	Ratio
Tom	1.78	41.011236
Gary	1.65	44.848485
Lois	1.85	28.108108
Wendy	1.63	34.969325
Betty	1.69	32.544379
Jacobs	1.62	32.716049
Donald	1.62	38.271605
Yasmine	1.83	34.972678
Nick	1.6	47.5
Andrew	1.78	35.393258

Sorting and Rearranging Columns

Introduction

Sorting Columns

Sorting Columns by First Letter

Sort columns based on values of a single row.

Sort columns based on values of multiple rows.

Rearranging Columns

Re-indexing the Columns

Share this Article