Table of Contents
- Introduction
- Syntax
- Constructing Series Objects
- pandas Series vs NumPy Array
- pandas Series vs Python Dictionary
Introduction
A pandas Series is a one-dimensional array of indexed data. It is capable of holding data of any type (integer, string, float, python objects, etc.).
Syntax
A pandas Series can be created using the following constructor.
Syntax
The pandas.Series function.
pandas.Series(data=None, index=None, dtype=None, copy=False)
Parameter | Required? | Default Value | Description |
---|---|---|---|
data | ✔️ Yes | NA | Array-like, Iterable, dict, or scalar value. |
index | ❌ No | np.arange(n) | Array-like or Index (1d), where n is the length of data. See Index Object. |
dtype | ❌ No | Inferred from data | Data type: str, numpy.dtype, or ExtensionDtype. |
copy | ❌ No | False | bool. Copy input data. Only affects Series or 1d ndarray input. |
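As a minimal sketch of how these parameters fit together, the call below passes data, index, and dtype explicitly. The name `prices` and the values are illustrative assumptions, not part of the original text.

```python
import pandas as pd

# `prices` is a hypothetical name used only for illustration.
prices = pd.Series(
    data=[1, 2, 3],         # integer input ...
    index=["a", "b", "c"],  # ... with explicit string labels
    dtype="float64",        # ... stored as floats because dtype is given
)

print(prices.dtype)        # float64
print(list(prices.index))  # ['a', 'b', 'c']
```

Without the dtype argument, the same call would infer an integer dtype from the data.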
Constructing Series Objects
We first import the required libraries and modules.
import numpy as np
import pandas as pd
import random as rd
A pandas Series can be created from a list as follows:
Example
Creating a pandas Series from a Python list.
x = pd.Series([0.5, 1.0, 1.5, 2.0])
x
0 0.5
1 1.0
2 1.5
3 2.0
dtype: float64
The pandas Series includes both a sequence of values and a sequence of indices, which we can access with the values and index attributes, respectively.
The values are simply a NumPy array.
Example
Accessing the values of a pandas Series.
x.values
array([0.5, 1. , 1.5, 2. ])
The index is an array-like object of type pd.Index.
Example
Accessing the index of a pandas Series.
x.index
RangeIndex(start=0, stop=4, step=1)
As with a NumPy array, data can be accessed by the associated index using the familiar Python square-bracket notation.
Example
Accessing values of a pandas Series by index.
x[0]
0.5
Example
Slicing a pandas Series.
x[1:3]
1 1.0
2 1.5
dtype: float64
The data can be just a single scalar, which is repeated to fill the (longer) specified index.
Example
A pandas Series where scalar data input is replicated.
pd.Series(88, range(5))
0 88
1 88
2 88
3 88
4 88
dtype: int64
pandas Series vs NumPy Array
While a NumPy array has an implicitly defined integer index used to access the values, a pandas Series has an explicitly defined index associated with the values.
As a consequence, the index need not be an integer, but can consist of values of any desired type. For example, we can use strings as an index:
Example
Explicitly defined index of a pandas Series.
x = pd.Series([0.5, 1.0, 1.5, 2.0], index=['a', 'b', 'c', 'd'])
x
a 0.5
b 1.0
c 1.5
d 2.0
dtype: float64
We can access any value using its index, just like a NumPy array.
Example
Accessing a value of a pandas Series using its index.
x['c']
1.5
Example
The index of a pandas Series can be unordered.
y = pd.Series([0.5, 1.0, 1.5, 2.0], index=rd.sample(range(1,5),4))
y
3 0.5
2 1.0
1 1.5
4 2.0
dtype: float64
y[2]
1.0
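With an integer index that is not in positional order, it matters whether a lookup goes by label or by position. The sketch below uses a fixed index (an assumption made so that the result is reproducible, instead of the random sample above) and the explicit .loc and .iloc accessors to make the difference visible.

```python
import pandas as pd

# Fixed, non-random index so the output is reproducible.
y = pd.Series([0.5, 1.0, 1.5, 2.0], index=[3, 2, 1, 4])

by_label = y.loc[2]      # the value whose index label is 2
by_position = y.iloc[2]  # the third value, regardless of its label

print(by_label)     # 1.0
print(by_position)  # 1.5
```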
Example
Duplicated indices are allowed.
y1 = pd.Series([0.5, 1.0, 1.5, 2.0], index=rd.choices(range(1,4),k=4))
y1
2 0.5
1 1.0
2 1.5
3 2.0
dtype: float64
y1[2]
2 0.5
2 1.5
dtype: float64
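One consequence of duplicated labels is that a lookup may return either a scalar or a Series, depending on how often the label occurs. The following sketch fixes the index by hand (an assumption, to avoid the randomness of the example above) and checks for duplicates with the is_unique attribute of the index.

```python
import pandas as pd

# Hand-picked index in which the label 2 occurs twice.
y1 = pd.Series([0.5, 1.0, 1.5, 2.0], index=[2, 1, 2, 3])

print(y1.index.is_unique)  # False

unique_hit = y1.loc[1]    # label occurs once: a single scalar
repeated_hit = y1.loc[2]  # label occurs twice: a Series with two rows

print(type(repeated_hit).__name__)  # Series
```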
The index can also be heterogeneous. In the following case, the index includes both characters and integers.
Example
The index of a pandas Series can be heterogeneous.
z = pd.Series([0.5, 1.0, 1.5, 2.0], index=['a', 10, 'c', 'd'])
z
a 0.5
10 1.0
c 1.5
d 2.0
dtype: float64
z[10]
1.0
The values of a pandas Series can also be heterogeneous.
Example
The values of a pandas Series can be heterogeneous.
z1 = pd.Series([0.5, 'apple', 1.5, 2.0], index=['a', 10, 'c', 'd'])
z1
a 0.5
10 apple
c 1.5
d 2.0
dtype: object
z1[10]
'apple'
In the above example, the Series now has dtype object instead of float64. Note, however, that the main benefit of pandas, namely fast vectorized operations, is lost as soon as we mix data types in a Series. Mixed data types in a Series should therefore be avoided whenever possible.
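If a mixed Series cannot be avoided at load time, one possible way to recover a numeric dtype is pd.to_numeric with errors="coerce". This is a sketch of one option, not the only approach; the name `mixed` is illustrative, and non-numeric entries are replaced by NaN, which may or may not be acceptable for a given dataset.

```python
import pandas as pd

# A Series mixing floats and a string falls back to dtype object.
mixed = pd.Series([0.5, "apple", 1.5, 2.0])
print(mixed.dtype)  # object

# errors="coerce" turns values that cannot be parsed into NaN.
numeric = pd.to_numeric(mixed, errors="coerce")
print(numeric.dtype)         # float64
print(numeric.isna().sum())  # 1
```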
pandas Series vs Python Dictionary
You may recall that a Python dictionary is a data structure that maps arbitrary keys to a set of arbitrary values. On the other hand, a pandas Series is a structure that maps typed keys to a set of typed values.
Just as the type-specific compiled code behind a NumPy array makes it more efficient than a Python list for certain operations, the type information of a pandas Series makes it much more efficient than Python dictionaries for certain operations.
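To make the comparison concrete, the sketch below applies the same update to a dictionary and to a Series. The dictionary `stock` and its values are hypothetical, and the example only illustrates the difference in style; it does not measure performance.

```python
import pandas as pd

# Hypothetical data used only for illustration.
stock = {"apples": 12, "pears": 7, "plums": 3}

# Dictionary: an explicit Python-level loop over the items.
restocked_dict = {item: count + 10 for item, count in stock.items()}

# Series: a single vectorized expression over the typed values.
restocked_series = pd.Series(stock) + 10

print(restocked_series.to_dict() == restocked_dict)  # True
```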
In the following example, we construct a Series object directly from a Python dictionary.
Example
Constructing a pandas Series from a Python dictionary.
age_dict = {'Tom': 32,
            'Gary': 26,
            'Lois': 22,
            'Wendy': 31,
            'Betty': 35}
age_dict
{'Tom': 32, 'Gary': 26, 'Lois': 22, 'Wendy': 31, 'Betty': 35}
The associated pandas Series can then be constructed directly from the above dictionary using pd.Series.
age = pd.Series(age_dict)
age
Tom 32
Gary 26
Lois 22
Wendy 31
Betty 35
dtype: int64
We can also sort the resulting Series by its index, that is, by the dictionary keys, using the sort_index() method.
Example
Constructing a Series from a dictionary with sorted index.
age = pd.Series(age_dict).sort_index()
age
Betty 35
Gary 26
Lois 22
Tom 32
Wendy 31
dtype: int64
We can also sort the indices in descending order by using the argument ascending=False.
Example
Constructing a Series from a dictionary with sorted (reversed) index.
age = pd.Series(age_dict).sort_index(ascending=False)
age
Wendy 31
Tom 32
Lois 22
Gary 26
Betty 35
dtype: int64
We can now easily access the elements using indexing and slicing.
Example
Access elements of a pandas Series using indexing.
age['Wendy']
31
Example
Access elements of a pandas Series using slicing.
age['Tom':'Gary']
Tom 32
Lois 22
Gary 26
dtype: int64
In contrast to positional slicing, as used for a NumPy array or in x[1:3] earlier, slicing a pandas Series by label includes the element at the end label.
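The two slicing behaviours can be compared side by side with the explicit .loc (label-based) and .iloc (position-based) accessors. The sketch below rebuilds the age Series in descending index order so that it runs on its own.

```python
import pandas as pd

# Same data as above, already in descending index order.
age = pd.Series(
    {"Wendy": 31, "Tom": 32, "Lois": 22, "Gary": 26, "Betty": 35}
)

by_label = age.loc["Tom":"Gary"]  # end label "Gary" is included
by_position = age.iloc[1:3]       # end position 3 is excluded

print(list(by_label.index))     # ['Tom', 'Lois', 'Gary']
print(list(by_position.index))  # ['Tom', 'Lois']
```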
We can also create a Series from a dictionary while explicitly specifying the index, which may include only specific keys, in any desired order.
Example
Creating a Series from a dictionary with specific keys only.
pd.Series({1:'a', 2:'b', 3:'c', 4:'d'}, index=[3, 1, 2])
3 c
1 a
2 b
dtype: object
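A related case worth knowing: if the explicit index contains a label that is not a key of the dictionary, pandas fills that position with a missing value instead of raising an error. The sketch below adds the label 5, which is an assumption chosen for illustration and is absent from the dictionary.

```python
import pandas as pd

# The label 5 is not a key of the dictionary, so its value is missing.
letters = pd.Series({1: "a", 2: "b", 3: "c", 4: "d"}, index=[3, 1, 5])

print(letters.isna().tolist())  # [False, False, True]
```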