Automated data storytelling: From EDA reports to NLG narratives

Introduction:

Sometimes, we just want quick insights and summaries from our data without writing too much code. Luckily, Python has some great libraries that can automatically create reports, tell data stories, and show visualizations straight from a DataFrame.

We’ve previously explored libraries like ydata-profiling, Sweetviz, and Autoviz, which excel at generating automated EDA reports and visual summaries with minimal code.

While these tools are great for quick insights and visuals, a new wave of tools goes beyond visualization by introducing Natural Language Generation (NLG). These platforms aim to craft human-like narratives from data, turning numbers into clear, contextual stories, much as a data analyst would explain findings to a stakeholder.

In this article, we’ll briefly summarize the earlier EDA tools and then shift focus to this evolving landscape of data storytelling using NLG.

Summary of Automated EDA Libraries

The table below highlights five popular Python libraries for automated exploratory data analysis (EDA).

	Library	Strengths	Best For	Output Type
0	ydata-profiling	Comprehensive EDA with stats, warnings, correlations, and missing value analysis	One-click, in-depth data profiling	HTML report
1	Sweetviz	Visual comparison of datasets or target classes with key insights	Comparing train/test or target group differences	HTML report
2	Dataprep	Interactive charts for quick data quality and distribution understanding	Fast, interactive EDA workflows	Browser / Jupyter
3	Lux	Auto-suggests visualizations based on data structure and trends	Exploratory data analysis in notebooks	Jupyter widget
4	Autoviz	Hands-free discovery of important features and plots	Lazy or large-scale automated EDA	Jupyter plots
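To make the "stats and missing value analysis" in the table concrete, here is a minimal sketch of the kind of per-column summary such reports are built on. It is not how any of the libraries above are implemented; the helper names `profile_column` and `profile_table` and the toy data are assumptions made up for illustration, and only the Python standard library is used.

```python
import math
import statistics


def profile_column(values):
    """Summarize one column: row count, missing count, and basic stats."""
    # Treat None and NaN as missing values.
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    summary = {"count": len(values), "missing": len(values) - len(present)}

    # Numeric statistics only make sense when every present value is a number.
    is_numeric = present and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        summary["mean"] = statistics.fmean(present)
        summary["min"] = min(present)
        summary["max"] = max(present)
    else:
        # For non-numeric columns, report the number of distinct values instead.
        summary["distinct"] = len(set(present))
    return summary


def profile_table(columns):
    """Profile every column of a dict that maps column name -> list of values."""
    return {name: profile_column(vals) for name, vals in columns.items()}


# Hypothetical toy data, invented for illustration only.
data = {
    "age": [34, 45, None, 29],
    "city": ["Pune", "Delhi", "Pune", None],
}
report = profile_table(data)
for name, summary in report.items():
    print(name, summary)
```

The real libraries layer correlations, warnings, and charts on top of summaries like these, which is why a single call can replace a lot of hand-written code.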

Comparison of NLG Tools for Data Narratives

The following table summarizes key tools and libraries that bring Natural Language Generation (NLG) capabilities into data analysis workflows — helping convert insights into clear, human-like narratives.

	Tool / Library	Type	Strengths	Status	Output Type
0	Arria NLG	Tool (Closed-source)	BI-integrated narratives for enterprise reporting	Fully functional	Embedded in BI tools, Word, Excel
1	Narrative Science Quill	Tool (Closed-source)	Auto-generated stories for dashboards (Tableau)	Acquired by Salesforce (2021)	Dashboard, email, PDF
2	SimpleNLG	Java Library with Python ports (Open-source)	Custom rule-based sentence generation	Mature	Plain text
3	pyNLG / nlglib / narrative-text	Python Libraries (Open-source)	Experimental NLG modules for research	In testing	Plain text
4	GPT / LLM-based APIs	APIs / Tools (Proprietary or open-weight)	Contextual, prompt-driven text generation	Production-ready	Text, Markdown, HTML, JSON
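The rule-based approach in the table can be illustrated with a small template function. This is only a sketch of the general idea in plain Python, not the SimpleNLG API; the function name `describe_change`, the 1% "steady" threshold, and the sample figures are all assumptions chosen for illustration.

```python
def describe_change(metric, previous, current):
    """Return one sentence describing how a metric moved between two periods."""
    if previous == 0:
        # A percentage change is undefined when the baseline is zero.
        return (
            f"{metric} is now {current:,.0f}; no percentage change is "
            "available because the previous value was zero."
        )

    change = (current - previous) / previous * 100

    # Pick the wording from simple hand-written rules.
    if abs(change) < 1:
        trend = "held steady"
    elif change > 0:
        trend = f"rose {change:.1f}%"
    else:
        trend = f"fell {abs(change):.1f}%"

    return f"{metric} {trend}, from {previous:,.0f} to {current:,.0f}."


# Hypothetical figures, invented for illustration only.
sentence = describe_change("Monthly revenue", 120000, 138000)
print(sentence)
```

Every phrase the function can produce has to be written by hand, which keeps the output predictable but means each new kind of insight needs a new rule.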

Conclusion: Why NLG is a step beyond traditional EDA

While traditional EDA tools like ydata-profiling and Sweetviz are excellent for visualizing and summarizing data patterns, NLG tools go a step further by expressing those insights as human-like narratives. This is especially valuable in business environments where not all stakeholders can interpret complex charts or statistical summaries.

Advantages of NLG over Traditional EDA:

Converts insights into stakeholder-friendly language
Reduces analyst time spent on manual summary writing
Enables real-time contextual reporting in BI tools and dashboards

Limitations / Drawbacks:

May require fine-tuning or prompt engineering to stay accurate, especially with LLM-based tools
Closed-source tools can be costly and inflexible
Rule-based libraries need manual templates and lack adaptability
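
As a rough illustration of the prompt-engineering point above, the sketch below assembles a prompt from precomputed statistics so that a model is asked to narrate exact figures rather than guess them. It deliberately stops before calling any API; the function name `build_narrative_prompt`, the prompt wording, and the sample statistics are assumptions for illustration, not a recommended or vendor-specific format.

```python
import json


def build_narrative_prompt(dataset_name, stats, audience="business stakeholders"):
    """Assemble a prompt asking an LLM to narrate precomputed statistics."""
    # Serialize the statistics so the model sees exact figures, not raw data.
    stats_block = json.dumps(stats, indent=2, sort_keys=True)
    return (
        f"You are a data analyst writing for {audience}.\n"
        f"Summarize the dataset '{dataset_name}' in three short sentences.\n"
        "Use only the figures in the JSON below and do not invent numbers.\n\n"
        f"{stats_block}\n"
    )


# Hypothetical statistics, invented for illustration only.
stats = {"rows": 1200, "missing_pct": {"age": 2.5}, "mean_order_value": 48.9}
prompt = build_narrative_prompt("orders", stats)
print(prompt)
```

The resulting string could be sent to whichever LLM API is in use; the generated narrative should still be checked against the input figures before it is shared.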