Index | Media_Channel | Start | Finish | Duration (weeks) |
---|---|---|---|---|
0 | Branded Search | 2017-01-15 | 2017-03-15 | 8 |
1 | Non-branded Search | 2017-05-01 | 2017-08-20 | 15 |
2 | | 2017-02-01 | 2017-11-15 | 41 |
3 | | 2017-09-01 | 2017-09-30 | 4 |
4 | Out-of-Home (OOH) | 2017-04-10 | 2017-06-25 | 10 |
5 | TV | 2017-07-01 | 2017-09-01 | 8 |
6 | Radio | 2017-10-01 | 2017-12-15 | 10 |
Gantt Chart in Python
What is a Gantt chart?
A Gantt chart is a type of bar chart that illustrates the duration and timing of various tasks or events. Each task is represented as a horizontal bar, with the length and position of the bar reflecting the start date, end date, and duration. In data science, it is particularly useful for visualizing and managing the scheduling of activities, analyzing overlaps, and assessing the progress of different tasks within a given timeframe.
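To make the geometry concrete, here is a minimal sketch of how one bar's position and length follow from its dates. It uses the Branded Search row of the campaign table; the helper name `bar_geometry` is hypothetical, and counting weeks by integer division of days by 7 is an assumption that happens to match the table's Duration column, not a rule stated in this article.

```python
from datetime import date

def bar_geometry(start: date, finish: date):
    """Return (left edge, length in days) for one Gantt bar."""
    return start, (finish - start).days

# Branded Search: 2017-01-15 to 2017-03-15
left, length_days = bar_geometry(date(2017, 1, 15), date(2017, 3, 15))

# Assumed rule: whole weeks = elapsed days // 7
whole_weeks = length_days // 7

print(left, length_days, whole_weeks)  # 2017-01-15 59 8
```

The bar starts at `left` on the date axis and extends `length_days` to the right, which is the same pair of values a plotting library needs for a horizontal bar.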
Importing libraries
Libraries | Description |
---|---|
pandas | Data manipulation and analysis |
matplotlib.pyplot | Plotting graphs and charts |
matplotlib.dates | Date handling and formatting for plots |
plotly.figure_factory | Creating interactive charts and visualizations |
Importing Data
The dataset provides a one-year timeline of various marketing campaigns, detailing the start and end dates along with the duration of each campaign in weeks.
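The loading code is not shown, so the sketch below builds the DataFrame by hand instead of reading a file. It includes only the five rows whose channel names appear in the campaign table; the two rows without a name are left out rather than guessed. Renaming `Media_Channel` to `Task` is an assumption, made because the plotting code later refers to a `Task` column, and the week calculation is likewise an assumed rule that reproduces the table's values.

```python
import pandas as pd

rows = [
    ("Branded Search", "2017-01-15", "2017-03-15"),
    ("Non-branded Search", "2017-05-01", "2017-08-20"),
    ("Out-of-Home (OOH)", "2017-04-10", "2017-06-25"),
    ("TV", "2017-07-01", "2017-09-01"),
    ("Radio", "2017-10-01", "2017-12-15"),
]
df = pd.DataFrame(rows, columns=["Media_Channel", "Start", "Finish"])

# Parse the date strings so that Finish - Start yields a Timedelta.
df["Start"] = pd.to_datetime(df["Start"])
df["Finish"] = pd.to_datetime(df["Finish"])

# Assumed rule: whole weeks via integer division of elapsed days.
df["Duration (weeks)"] = (df["Finish"] - df["Start"]).dt.days // 7

# Assumption: the later plotting code expects the name column to be "Task".
df = df.rename(columns={"Media_Channel": "Task"})

print(df)
```

Keeping `Start` and `Finish` as datetime values matters for the charts: both the bar length and the date axis formatting depend on it.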
Gantt chart using Matplotlib
Using Matplotlib, the Gantt chart represents each marketing campaign as a horizontal bar spanning its start and end dates along the x-axis. It highlights the duration and overlap of campaigns, offering a clear overview of their distribution across the year.
# Plotting the Gantt Chart with reduced figure size
fig, ax = plt.subplots(figsize=(8, 5))  # Reduced figure size

# Define colors for each task (optional)
colors = ['tab:blue', 'tab:orange', 'tab:green', 'tab:red', 'tab:purple', 'tab:brown', 'tab:pink']

# Iterate over tasks to plot them
for i, task in enumerate(df.itertuples()):
    start = task.Start
    finish = task.Finish
    ax.barh(task.Task, (finish - start).days, left=start, color=colors[i % len(colors)])

# Format the x-axis for dates
ax.xaxis_date()
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))

# Add labels and title
ax.set_xlabel('Date')
ax.set_ylabel('Campaigns')
ax.set_title('Marketing Campaign Gantt Chart using Matplotlib')

# Rotate date labels
plt.xticks(rotation=90)

# Show plot
plt.tight_layout()
plt.show()
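The overlaps that the chart shows visually can also be computed directly. The sketch below is one simple way to do it, using only the standard library and the five named campaigns from the table; the helper `overlap_days` is hypothetical, and treating campaigns that merely touch on a single date as non-overlapping is a design choice of this sketch.

```python
from datetime import date
from itertools import combinations

campaigns = {
    "Branded Search": (date(2017, 1, 15), date(2017, 3, 15)),
    "Non-branded Search": (date(2017, 5, 1), date(2017, 8, 20)),
    "Out-of-Home (OOH)": (date(2017, 4, 10), date(2017, 6, 25)),
    "TV": (date(2017, 7, 1), date(2017, 9, 1)),
    "Radio": (date(2017, 10, 1), date(2017, 12, 15)),
}

def overlap_days(a, b):
    """Days shared by two (start, finish) intervals, or 0 if they are disjoint."""
    latest_start = max(a[0], b[0])
    earliest_finish = min(a[1], b[1])
    return max((earliest_finish - latest_start).days, 0)

# Check every unordered pair of campaigns and keep those that share time.
overlaps = {}
for x, y in combinations(campaigns, 2):
    days = overlap_days(campaigns[x], campaigns[y])
    if days > 0:
        overlaps[(x, y)] = days

for pair, days in overlaps.items():
    print(pair, days)
# ('Non-branded Search', 'Out-of-Home (OOH)') 55
# ('Non-branded Search', 'TV') 50
```

For these five rows, Non-branded Search is the only campaign that runs concurrently with others, which agrees with what the bars show.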
Gantt chart using Plotly
- The create_gantt function from Plotly’s figure_factory module is used to create a Gantt chart. It takes a list of dictionaries (or a DataFrame) in which each entry represents a task with the attributes Task (name of the task), Start (start date), and Finish (end date).
- Using Plotly, we plot each marketing campaign as a horizontal bar spanning its start and end dates. Hovering over a bar shows detailed information, including the campaign's start and end dates.
df_dict = df.to_dict(orient='records')

# Create the Gantt chart using Plotly's figure_factory
fig = ff.create_gantt(df_dict, show_colorbar=True, index_col='Task', title="Gantt Chart of Marketing Campaigns using Plotly", showgrid_x=True, showgrid_y=True)

# Update layout to center the title and adjust other settings
fig.update_layout(
    xaxis_title='Date',
    yaxis_title='Marketing Channels',
    xaxis=dict(
        tickformat="%m-%d-%Y",
        tickangle=90  # Rotate the x-axis labels to 90 degrees
    ),
    yaxis=dict(
        autorange="reversed"  # Reverse the y-axis to have the first channel at the top
    ),
    title=dict(
        text="Gantt Chart of Marketing Campaigns using Plotly",  # Set the title text
        x=0.5,  # Center the title horizontally
        xanchor='center',  # Anchor the title in the center
        pad=dict(t=20)  # Add padding to the top of the title to create space
    ),
    margin=dict(t=100)  # Adjust top margin to ensure title and graph do not overlap
)

# Show the chart
fig.show()
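The list-of-dictionaries input described at the start of this section can be sanity-checked before it is handed to create_gantt. The sketch below is only an illustration: `check_gantt_records` is a hypothetical helper, not part of Plotly, and it assumes dates are given as `YYYY-MM-DD` strings.

```python
from datetime import date

REQUIRED_KEYS = ("Task", "Start", "Finish")

def check_gantt_records(records):
    """Return the records unchanged, or raise ValueError on a malformed one."""
    for i, rec in enumerate(records):
        missing = [k for k in REQUIRED_KEYS if k not in rec]
        if missing:
            raise ValueError(f"record {i} is missing {missing}")
        # ISO date strings are parsed so the comparison is on real dates.
        if date.fromisoformat(rec["Finish"]) < date.fromisoformat(rec["Start"]):
            raise ValueError(f"record {i} finishes before it starts")
    return records

tasks = check_gantt_records([
    {"Task": "TV", "Start": "2017-07-01", "Finish": "2017-09-01"},
    {"Task": "Radio", "Start": "2017-10-01", "Finish": "2017-12-15"},
])
print(len(tasks), "records have Task, Start, and Finish")
```

A check like this catches a missing key or swapped dates early, before the problem shows up as an error or a misleading bar in the chart.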
Conclusion
A Gantt chart provides a clear visual representation of the timing and duration of tasks or events. It helps in analyzing and managing the scheduling of activities, identifying overlaps, and assessing the progress of different components over a defined period. This makes it a valuable tool for understanding the temporal aspects of data-driven processes and optimizing workflows.