Automated data storytelling: From EDA reports to NLG narratives

Introduction:

Sometimes, we just want quick insights and summaries from our data without writing too much code. Luckily, Python has some great libraries that can automatically create reports, tell data stories, and show visualizations straight from a DataFrame.

We’ve previously explored libraries like ydata-profiling, Sweetviz, and Autoviz, which excel at generating automated EDA reports and visual summaries with minimal code.

While these tools are great for quick insights and visuals, a new wave of tools is emerging that goes beyond visualization by introducing Natural Language Generation (NLG). These platforms aim to craft human-like narratives from data, turning numbers into clear, contextual stories, much as a data analyst would explain findings to a stakeholder.

In this article, we’ll briefly summarize the earlier EDA tools and then shift focus to this evolving landscape of data storytelling using NLG.

Summary of Automated EDA Libraries

The table below highlights five popular Python libraries for automated exploratory data analysis (EDA):

Library | Strengths | Best For | Output Type
--- | --- | --- | ---
ydata-profiling | Comprehensive EDA with stats, warnings, correlations, and missing value analysis | One-click, in-depth data profiling | HTML report
Sweetviz | Visual comparison of datasets or target classes with key insights | Comparing train/test or target group differences | HTML report
Dataprep | Interactive charts for quick data quality and distribution understanding | Fast, interactive EDA workflows | Browser / Jupyter
Lux | Auto-suggests visualizations based on data structure and trends | Exploratory data analysis in notebooks | Jupyter widget
Autoviz | Hands-free discovery of important features and plots | Lazy or large-scale automated EDA | Jupyter plots
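
To make concrete what these reports summarize, here is a minimal sketch of a per-column profile (row count, missing values, basic numeric stats) using only the Python standard library. It is an illustration of the idea, not how any of the libraries above is implemented; the function names and the sample rows are hypothetical.

```python
from statistics import mean


def profile_column(values):
    """Summarize one column: row count, missing count, and basic stats."""
    present = [v for v in values if v is not None]
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    summary = {"count": len(values), "missing": len(values) - len(present)}
    # Report numeric stats only when every non-missing value is numeric;
    # otherwise fall back to the number of distinct values.
    if present and len(numeric) == len(present):
        summary.update(mean=mean(numeric), min=min(numeric), max=max(numeric))
    else:
        summary["distinct"] = len(set(present))
    return summary


def profile_table(rows):
    """Profile a list of dict rows column by column."""
    columns = {key for row in rows for key in row}
    return {
        col: profile_column([row.get(col) for row in rows])
        for col in sorted(columns)
    }


# Hypothetical sample data, for illustration only.
rows = [
    {"region": "North", "sales": 120.0},
    {"region": "South", "sales": None},
    {"region": "North", "sales": 80.0},
]
report = profile_table(rows)
print(report)
```

The dedicated libraries add far more on top of this (correlations, warnings, charts), which is why they are worth using instead of hand-rolled summaries.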

Comparison of NLG Tools for Data Narratives

The following table summarizes key tools and libraries that bring Natural Language Generation (NLG) capabilities into data analysis workflows — helping convert insights into clear, human-like narratives.

Tool / Library | Type | Strengths | Status | Output Type
--- | --- | --- | --- | ---
Arria NLG | Tool (Closed-source) | BI-integrated narratives for enterprise reporting | Fully functional | Embedded in BI tools, Word, Excel
Narrative Science Quill | Tool (Closed-source) | Auto-generated stories for dashboards (Tableau) | Fully functional | Dashboard, email, PDF
SimpleNLG | Java Library (Open-source; community Python ports exist) | Custom rule-based sentence generation | Mature | Plain text
pyNLG / nlglib / narrative-text | Python Libraries (Open-source) | Experimental NLG modules for research | In testing | Plain text
GPT / LLM-based APIs | APIs / Tools (Proprietary or open-weight, depending on the model) | Contextual, prompt-driven text generation | Production-ready | Text, Markdown, HTML, JSON
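
As a rough illustration of the rule-based approach in the table, the sketch below turns two numbers into a sentence using a hand-written template. It uses plain Python string formatting rather than the API of any library listed above; the function name, the 1% "flat" threshold, and the example figures are assumptions made for this example.

```python
def describe_change(metric, previous, current):
    """Render one sentence describing how a metric moved between periods."""
    if previous == 0:
        # A percentage change is undefined when the baseline is zero.
        return f"{metric} is now {current:,.0f}, up from a baseline of zero."
    change = (current - previous) / previous * 100
    if abs(change) < 1:  # Assumed threshold for calling a change "flat".
        return f"{metric} was essentially flat at {current:,.0f}."
    direction = "rose" if change > 0 else "fell"
    return (
        f"{metric} {direction} {abs(change):.1f}% "
        f"from {previous:,.0f} to {current:,.0f}."
    )


# Hypothetical figures, for illustration only.
sentence = describe_change("Revenue", 1000, 1250)
print(sentence)
```

Every wording choice here is fixed in advance, so the output is predictable, but each new kind of insight needs its own template.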

Conclusion: Why NLG is a step beyond traditional EDA

While traditional EDA tools like ydata-profiling and Sweetviz are excellent for visualizing and summarizing data patterns, NLG tools offer a transformative shift by making insights more accessible through human-like narratives. This is especially valuable in business environments where not all stakeholders can interpret complex charts or statistical summaries.

Advantages of NLG over Traditional EDA:

  • Converts insights into stakeholder-friendly language

  • Reduces analyst time spent on manual summary writing

  • Enables real-time contextual reporting in BI tools and dashboards

Limitations / Drawbacks:

  • LLM-based tools can misstate figures, so they may require fine-tuning or prompt engineering to keep the narrative accurate

  • Closed-source tools can be costly and inflexible

  • Rule-based libraries need manual templates and lack adaptability
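
One way to approach the accuracy concern with LLMs is to constrain the prompt to precomputed figures and then check the generated text against them. The sketch below shows that idea without calling any model or vendor API. The function names, the prompt wording, and the sample statistics are all hypothetical, and the number check is deliberately simple.

```python
import json
import re


def build_narrative_prompt(stats, audience="business stakeholders"):
    """Assemble a prompt that restricts the model to the supplied figures."""
    facts = json.dumps(stats, indent=2, sort_keys=True)
    return (
        f"You are writing a short data summary for {audience}.\n"
        "Use only the figures in the JSON below; do not invent numbers.\n"
        "Write at most three sentences in plain language.\n\n"
        f"Facts:\n{facts}\n"
    )


def numbers_not_in_facts(narrative, stats):
    """Return numbers mentioned in the narrative that are absent from stats."""
    allowed = {
        float(v) for v in stats.values()
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    }
    # Match integers and decimals, with optional thousands separators.
    tokens = re.findall(r"\d+(?:,\d{3})*(?:\.\d+)?", narrative)
    found = [float(token.replace(",", "")) for token in tokens]
    return [number for number in found if number not in allowed]


# Hypothetical statistics, for illustration only.
stats = {"revenue_current": 1250, "revenue_previous": 1000, "change_pct": 25.0}
prompt = build_narrative_prompt(stats)

# Stand-ins for text a model might return.
print(numbers_not_in_facts("Revenue rose 25% to 1,250.", stats))
print(numbers_not_in_facts("Revenue rose 30% to 1,250.", stats))
```

A check like this only catches numbers that are missing from the facts; it does not verify that a correct number is attached to the right metric, so human review still matters.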