Unleashing the Power of Flow: A Journey Through Sankey Charts
Sankey charts are a type of data visualization that has gained popularity in recent years. They are especially useful for representing processes or flows between categories or groups, with the width of each link showing the volume that moves between nodes. A well-designed Sankey chart helps readers interpret complex connections more efficiently. In this article, we will explore how to create Sankey charts and where they are applied, including data-driven decision making, process mapping, and project management.
When to Use Sankey Charts
Sankey charts are particularly suitable when you want to visualize flows, or the transfer of something, between different categories or stages. Common use cases include environmental flows, material conservation, energy consumption, supply and value chain flows, migration patterns, and web traffic. They work best when there are several flows and the width of each flow is drawn proportional to its value, which gives an intuitive sense of magnitude.
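The proportional sizing described above can be sketched as a simple linear scaling. The function name link_widths, the max_width parameter, and the sample values are illustrative assumptions, not part of any library:

```python
def link_widths(values, max_width=40.0):
    """Scale flow values linearly so the largest flow gets max_width (e.g. pixels)."""
    if not values:
        return []
    largest = max(values)
    if largest <= 0:
        raise ValueError("at least one flow value must be positive")
    return [max_width * v / largest for v in values]


# Hypothetical flow values in arbitrary units
widths = link_widths([50, 25, 10])
print(widths)
```

Because the scaling is linear, a flow with half the value is drawn at half the width, which is what lets readers compare magnitudes by eye.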
Key Elements of Sankey Charts
- Nodes: These represent the categories at the beginning or end of each flow.
- Links (Arrows): These represent the flows between nodes, with the width of each link usually proportional to the magnitude of the flow.
- Labels: These provide additional information, such as the name of a node or the value of a flow.
- Color Coding: This helps distinguish between different types of flows.
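To make these elements concrete, here is a minimal sketch of how they might be held in code. The Link class, the node_totals helper, and the energy figures are all hypothetical and chosen only for illustration:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str          # node where the flow starts
    target: str          # node where the flow ends
    value: float         # magnitude, drawn as the link width
    label: str = ""      # optional annotation for the flow
    color: str = "gray"  # used to tell flow types apart


def node_totals(links):
    """Return {node: (inflow, outflow)} summed over all links."""
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for link in links:
        outflow[link.source] += link.value
        inflow[link.target] += link.value
    nodes = sorted(set(inflow) | set(outflow))
    return {n: (inflow[n], outflow[n]) for n in nodes}


# Made-up example data
links = [
    Link("Coal", "Electricity", 30, color="black"),
    Link("Solar", "Electricity", 10, color="gold"),
    Link("Electricity", "Homes", 25),
    Link("Electricity", "Industry", 15),
]
totals = node_totals(links)
print(totals["Electricity"])
```

Comparing inflow and outflow per node is a quick sanity check: for an intermediate node the two totals should normally match, otherwise the chart will show flow appearing or vanishing.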
Creating Sankey Charts
Sankey charts can be created using various tools and software. For this example, we’ll discuss two popular tools:
1. Plotly
Plotly is a powerful and flexible open-source charting library that allows you to create interactive Sankey charts. Here’s a basic framework to create a Sankey chart with Plotly:
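```python
import pandas as pd
import plotly.graph_objects as go

# Sample data: each row is one flow from a source category to a target category
data = {
    'source': ['Cat1', 'Cat1', 'Cat2', 'Cat2', 'Cat3',
               'Cat3', 'Cat1', 'Cat1', 'Cat3', 'Cat1'],
    'target': ['Cat2', 'Cat3', 'Cat1', 'Cat3', 'Cat2',
               'Cat1', 'Cat3', 'Cat2', 'Cat2', 'Cat2'],
    'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'color_var': ['Red', 'Green', 'Blue', 'Yellow', 'Red',
                  'Green', 'Blue', 'Yellow', 'Red', 'Green']
}
df = pd.DataFrame(data)

# go.Sankey refers to nodes by integer index, so map each label to a position
labels = sorted(set(df['source']) | set(df['target']))
index = {label: i for i, label in enumerate(labels)}

# Creating the Sankey plot (Plotly Express has no Sankey function, so use graph_objects)
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(source=df['source'].map(index).tolist(),
              target=df['target'].map(index).tolist(),
              value=df['value'].tolist(),
              color=df['color_var'].str.lower().tolist())
))

# Showing the chart
fig.show()
```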
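Plotly's Sankey trace identifies nodes by integer position in a label list rather than by name, so flows recorded with string names need converting first. The following is a small sketch of that conversion in plain Python; the helper name to_sankey_indices and the sample rows are assumptions made for illustration:

```python
def to_sankey_indices(rows):
    """Convert (source, target, value) rows with string names into
    a label list plus parallel index and value lists."""
    labels = []
    index = {}
    sources, targets, values = [], [], []
    for source, target, value in rows:
        for name in (source, target):
            if name not in index:
                # First time this name appears: give it the next position
                index[name] = len(labels)
                labels.append(name)
        sources.append(index[source])
        targets.append(index[target])
        values.append(value)
    return labels, {"source": sources, "target": targets, "value": values}


# Hypothetical rows
rows = [("Cat1", "Cat2", 1), ("Cat1", "Cat3", 2), ("Cat2", "Cat3", 4)]
labels, link = to_sankey_indices(rows)
print(labels)
print(link)
```

The two return values are shaped so that they could be handed to Plotly as go.Sankey(node=dict(label=labels), link=link).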
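2. NetworkX and Matplotlib
Python libraries such as NetworkX and Matplotlib can also be combined to draw Sankey-style flow diagrams with more control over layout and styling. Matplotlib additionally ships a basic matplotlib.sankey module. Seaborn, by contrast, has no Sankey chart type of its own.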
```python
import networkx as nx
import matplotlib.pyplot as plt

# Data for links and their flow values
links = [('A', 'Z'), ('A', 'X'), ('X', 'Y'), ('Y', 'Z'), ('B', 'Z')]
values = [5, 7, 6, 8, 4]

# Build a directed graph; each edge stores its flow value as "width"
G = nx.DiGraph()
for (source, target), value in zip(links, values):
    G.add_edge(source, target, width=value)

# Put each node in a column (layer) so that flows run from left to right
for layer, layer_nodes in enumerate(nx.topological_generations(G)):
    for n in layer_nodes:
        G.nodes[n]['layer'] = layer
pos = nx.multipartite_layout(G, subset_key='layer')

# Create the Sankey-style diagram: arrow width encodes the flow value
fig, ax = plt.subplots()
ax.set_title('Sankey Model Plot')
nx.draw_networkx_nodes(G, pos, ax=ax, node_size=900, node_color='lightsteelblue')
nx.draw_networkx_labels(G, pos, ax=ax)
widths = [G[u][v]['width'] for u, v in G.edges()]
nx.draw_networkx_edges(G, pos, ax=ax, width=widths, arrowstyle='-|>',
                       edge_color='gray', node_size=900)
nx.draw_networkx_edge_labels(G, pos, ax=ax,
                             edge_labels=nx.get_edge_attributes(G, 'width'))

# Show or save the chart
ax.axis('off')
plt.tight_layout()
plt.show()
```
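A hand-built flow diagram needs every node assigned to a column before anything is drawn. One common way to do this, offered here as an assumption rather than something the code above prescribes, is longest-path layering on an acyclic flow graph. The helper name assign_layers is hypothetical:

```python
def assign_layers(links):
    """Place each node one column past its deepest predecessor.
    Assumes the links form a directed acyclic graph."""
    preds = {}
    for source, target in links:
        preds.setdefault(source, set())
        preds.setdefault(target, set()).add(source)

    layers = {}

    def depth(node, visiting=()):
        if node in visiting:
            raise ValueError("cycle detected; layering needs an acyclic graph")
        if node not in layers:
            # Nodes without predecessors land in column 0
            layers[node] = 1 + max(
                (depth(p, visiting + (node,)) for p in preds[node]),
                default=-1,
            )
        return layers[node]

    for node in preds:
        depth(node)
    return layers


links = [('A', 'Z'), ('A', 'X'), ('X', 'Y'), ('Y', 'Z'), ('B', 'Z')]
layers = assign_layers(links)
print(layers)
```

With these links, A and B start in column 0 and Z ends up in column 3, because its longest incoming chain is A, X, Y, Z.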
Conclusion
Sankey charts offer a unique and impactful way to visualize the flow between different categories, making complex data easily digestible. By using tools such as Plotly or Python libraries like NetworkX, even sophisticated Sankey diagrams can be created with ease, enhancing data analysis and decision-making processes. Remember, like any visualization tool, the key to effective use lies in understanding your data and knowing when and how to apply a Sankey chart for maximum effect.