Underrated Pandas functions


Mar 10 edited
tags: python code pandas 


The aim of this page is to highlight some underrated python function from the one of the most used libraries: Pandas.
The idea is to save you some time, preventing having to reinvent the wheel, and maybe help make cleaner, shorter and more concise code. The different functions are presented by broad category, and accompanied by application examples and eventually combinations that you could add to your python function toolbox.



Underrated Pandas functions



crosstab

The crosstab function is powerful tool from the Pandas library, for creating cross-tabulation tables from a Dataframe.It allows you to easily summarize and visualize the relationships between categorical variables revealing patterns and frequencies at a glance. Just like your Excel Pivot table, specify what you want to see in columns,rows and which measure should be displayed with a given aggregation function (ex:sum).


# Python code example: given a Sales data sheet stating the number of sales per quarter 
import panda as pd
data={'Product':['Ref1','Ref2','Ref3','Ref4','Ref3','Ref2','Ref2','Ref3'],
'Quarter':['Q1','Q1','Q2','Q1','Q2','Q3','Q2','Q1'],
'Level':[100,200,300,200,500,200,150,225]}
df = pd.DataFrame(data)
# display original data in df
print('Display DataFrame df data')
print(df)

corsstab=pd.crosstab(index=df['Product'],columns=df['Quarter'], values=df['Level'],aggfunc='sum')
# display crosstab result
print('Display DataFrame crosstab data')
print(crosstab)


Output:



Display DataFrame df data
Product Quarter Level
0 Ref1 Q1 100
1 Ref2 Q1 200
2 Ref3 Q2 300
3 Ref4 Q1 200
4 Ref3 Q2 500
5 Ref2 Q3 200
6 Ref2 Q2 150
7 Ref3 Q1 225
Display DataFrame crosstab data Quarter Q1 Q2 Q3
Product
Ref1 100.0 NaN NaN
Ref2 200.0 150.0 200.0
Ref3 225.0 800.0 NaN
Ref4 200.0 NaN NaN

To learn more about this function, please check the official documentation Here.

melt

The melt in Pandas, is used to transform a structured dataframe by melting it into a long-form format. In short it transposes a set of columns into rows, with a new column holding the variable names and another the column. It helps reducing the number of columns and this data reshaping can simplify the analysis.



# Python code example: given a Sales data sheet stating the number of sales per quarter and region (N,S,W,E)
import panda as pd
data={'Product':['Ref1','Ref2','Ref3','Ref4','Ref3','Ref2','Ref2','Ref3'],
'Q1':[100,200,0,200,0,0,0,225],
'Q2':[0,0,300,0,500,0,150,0],
'Q3':[0,0,0,0,0,200,0,0],
'Region':['S','N','W','S','E','W','N','S']}

df = pd.DataFrame(data)
# display original data in df
print('Display DataFrame df data')
print(df)

melt=df.melt(id_vars=['Product','Region'],value_vars=['Q1','Q2','Q3'])
# display original data in df
print('Display melt data, with quarters displayed in one column')
print(melt)


Output:


Display DataFrame df data
  Product   Q1   Q2   Q3 Region
0    Ref1  100    0    0      S
1    Ref2  200    0    0      N
2    Ref3    0  300    0      W
3    Ref4  200    0    0      S
4    Ref3    0  500    0      E
5    Ref2    0    0  200      W
6    Ref2    0  150    0      S
7    Ref3  225    0    0      S

Display melt data, with quarters displayed in one column
   Product Region variable  value
0     Ref1      S       Q1    100
1     Ref2      N       Q1    200
2     Ref3      W       Q1      0
3     Ref4      S       Q1    200
4     Ref3      E       Q1      0
5     Ref2      W       Q1      0
6     Ref2      S       Q1      0
7     Ref3      S       Q1    225
8     Ref1      S       Q2      0
9     Ref2      N       Q2      0
10    Ref3      W       Q2    300
11    Ref4      S       Q2      0
12    Ref3      E       Q2    500
13    Ref2      W       Q2      0
14    Ref2      S       Q2    150
15    Ref3      S       Q2      0
16    Ref1      S       Q3      0
17    Ref2      N       Q3      0
18    Ref3      W       Q3      0
19    Ref4      S       Q3      0
20    Ref3      E       Q3      0
21    Ref2      W       Q3    200
22    Ref2      S       Q3      0
23    Ref3      S       Q3      0

To learn more about this function, please check the official documentation Here.

The fuel of coding inspiration:music, travel, photography and whatever drives creativity to code.

You might also like