
In our latest post, Chris examines how AI is transforming enterprise data analytics and visualisation. AI analytics platforms like Plotly can reveal crucial insights and generate exploration pathways missed by traditional methods. As Chris explains, these platforms are more accessible than ever before:
My co-founders and I are traditional engineers and scientists – I was an electrical engineer who dealt with R&D data in every job. Around 2013, we witnessed two major technological shifts that created a significant opportunity.
First, the entire industry was adopting Python as the primary language for data analytics across all sectors, particularly in science, engineering, and data science. Second, web browsers were becoming remarkably powerful application platforms, largely thanks to Chrome and Google’s V8 engine.
These changes revealed a gap in the technology landscape. We saw the opportunity to build a web-based visualisation layer featuring sophisticated interactive data visualisations rendered entirely in browsers – something completely new at the time. Since then, we’ve expanded from data visualisations into comprehensive data applications and now AI-powered data analytics.
Natural language serves as a universal equaliser. Every analytics tool – Salesforce, Google Analytics, Tableau – has unique chart builders with different paradigms: X versus Y, rows versus columns, dimensions versus measures. Each requires mastering complex UI-based interfaces.
Natural language eliminates this fragmentation. When pulling data from multiple systems, you can request visualisations in plain language without learning tool-specific interfaces. This could democratise data visualisation across organisations, removing the bottleneck of specialist experts who currently handle complex reporting requests.
A picture still tells a thousand words – visualisation remains the primary interface for understanding data and scenarios. AI’s breakthrough is its ability to generate code that creates new visualisations, allowing users to examine data from multiple angles rapidly and efficiently.
Currently, humans still generate insights by interpreting visualisations, though this may evolve as vision models begin interpreting graphs and analytics directly. Today, these tools primarily accelerate the different ways users can examine data to develop their own insights.
I view data science broadly as any computational work beyond basic data visualisation. While often associated with machine learning, many industries require heavy computation that isn’t necessarily ML – bioinformatics in life sciences, quantitative portfolio optimisation in finance, or complex scenario modelling in business.
Traditional drag-and-drop visualisation tools can’t handle these computational requirements, which is why Python-based tools are essential across these industries. We’ve built first-class interfaces through our Dash framework that let stakeholders interact with the scenarios and models that data scientists create.
AI is dramatically lowering barriers to quantitative work. In our latest product, data scientists can define models using natural language, while end users can still interact with published applications through GUIs. More importantly, if stakeholders want to create their own analysis, they can now do so using natural language prompts to generate their own scenarios.
While there are currently some complexity limitations for AI-generated data science models, that gap narrows daily. The exciting part is that everyone can work with the same Python backend – whether directly through code or via natural language interfaces. This reduces technology switching and creates a more unified workflow across different user types.
My biggest ‘aha moments’ occur when AI creates visualisations I wouldn’t have conceived myself. Our AI systems generate diverse chart sets, presenting users with multiple visualisation options they might not have considered.
LLMs leverage their world knowledge to produce charts highly relevant to specific industries or data domains. I might approach a dataset with preconceived ideas about analysis approaches, but then receive numerous alternative perspectives on the same data. This automatically generates new exploration pathways – a remarkable workflow that expands analytical possibilities beyond my initial assumptions.
I’ve been working extensively with San Francisco’s public 311 call data – city complaints about issues like sidewalk trash or blocked driveways. Simply inputting this dataset immediately generated analyses I hadn’t considered: neighbourhood-by-neighbourhood response rate comparisons, year-over-year trend analysis, and performance assessments tied to our new mayor and administration.
The system automatically explored questions like whether response times had improved in different neighbourhoods, or if cleanup efforts had focused particularly around the Civic Center near the mayor’s offices. I approached the data with broad curiosity, but was immediately presented with eight different charts showing exploration directions – including several I hadn’t initially considered. This demonstrates how AI can expand analytical thinking beyond our initial assumptions.
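The neighbourhood-level comparison described above can be sketched in a few lines of plain Python; the records and field names (`neighbourhood`, `closed`) are invented for illustration and the real public 311 dataset uses a different schema:

```python
# Hypothetical 311 records; the real public dataset has a different schema.
calls = [
    {"neighbourhood": "Mission", "closed": True},
    {"neighbourhood": "Mission", "closed": False},
    {"neighbourhood": "SoMa", "closed": True},
]

# Tally closed vs total complaints per neighbourhood.
by_hood = {}
for call in calls:
    closed, total = by_hood.get(call["neighbourhood"], (0, 0))
    by_hood[call["neighbourhood"]] = (closed + call["closed"], total + 1)

# Response rate = share of complaints that were closed.
rates = {hood: closed / total for hood, (closed, total) in by_hood.items()}
print(rates)  # → {'Mission': 0.5, 'SoMa': 1.0}
```

The point is not the arithmetic but that an AI system can propose dozens of such cuts of the data unprompted.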
311 CALL DATA VISUALISED USING PLOTLY STUDIO
Absolutely. This typically stems from data quality issues – the classic challenge where data preparation and cleaning represent 80% of a data scientist’s work. We’ve seen visualisations with missing bars that initially appear to be AI errors. Users assume the AI ‘messed up’ the graph due to missing data points. However, investigation often reveals that certain periods – like March in one example – simply aren’t present in the source data.
AI generates code to visualise data as provided, so the visualisation quality directly reflects the underlying data quality. The AI isn’t creating the data problem; it’s accurately representing flawed or incomplete datasets.
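A minimal sketch of why a ‘missing bar’ is usually a data problem rather than an AI error; the monthly counts are invented for illustration:

```python
# Hypothetical monthly complaint counts; March is absent from the source data.
counts = {"Jan": 120, "Feb": 95, "Apr": 110, "May": 130}

# A chart built straight from counts silently skips March. Checking the keys
# against the expected calendar shows the gap is in the data, not the chart.
expected = ["Jan", "Feb", "Mar", "Apr", "May"]
missing = [m for m in expected if m not in counts]
print(missing)  # → ['Mar']
```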
Data analysis involves countless assumptions and choices: binning strategies, moving average windows, filtering criteria, shared versus independent y-axes when comparing visualisations. Often, there’s no definitively ‘right’ answer – AI makes decisions just as human analysts would.
Our product design philosophy focuses on surfacing these ambiguities through user controls. Rather than AI making hidden assumptions about binning intervals or moving average windows, we encode these choices into dropdown menus within visualisations. Users immediately see whether binning is set to one month or two weeks, whether the moving average window is seven days, or whether y-axes are shared.
This approach makes underlying assumptions transparent and adjustable, unlike chat-based AI systems where these decisions remain buried in the model’s reasoning or generated code. Some of these hidden assumptions could prove dangerous if users can’t examine or modify them.
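The adjustable-assumption idea can be sketched with a trailing moving average whose window is exactly the parameter a dropdown would expose (plain Python, invented data):

```python
def moving_average(series, window):
    """Trailing moving average; `window` is the knob a dropdown in the
    visualisation would surface (7 days, 30 days, ...)."""
    return [
        sum(series[max(0, i - window + 1): i + 1])
        / len(series[max(0, i - window + 1): i + 1])
        for i in range(len(series))
    ]

daily = [10, 12, 9, 14]
print(moving_average(daily, 2))  # → [10.0, 11.0, 10.5, 11.5]
```

Surfacing `window` as a control makes the assumption visible and changeable, rather than a constant buried in generated code.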
Transparency is absolutely essential to our product. We embed transparency at multiple levels: users can adjust parameters in the final application interface, and we auto-generate specification files in natural language that describe exactly what the code does. Crucially, this specification is created by a separate AI agent to avoid bias from the code-generating system.
We’re also building transparent logging interfaces showing step-by-step data transformations in generated code. This addresses a fundamental misunderstanding about how modern AI data analysis works. Early ChatGPT enthusiasm led people to believe you could simply feed raw datasets to LLMs and get answers, but LLMs can’t actually process numerical data – they generate tokens.
Today’s approach is fundamentally different: LLMs generate Python code, which then processes and analyses data. The LLM understands dataset structure – column names and types – enabling domain-specific code generation based on user requirements. But actual data processing happens through the generated code, making the process more rigorous.
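A minimal sketch of the schema-only handoff described here: the model receives column names and types, never the numerical values. The rows are invented for illustration:

```python
# Invented rows standing in for a loaded dataset.
rows = [
    {"neighbourhood": "Mission", "opened": "2024-01-03", "response_hours": 18.5},
    {"neighbourhood": "SoMa", "opened": "2024-01-04", "response_hours": 7.0},
]

# Only this structural summary reaches the model - column names and types.
# The generated Python code, not the LLM, touches the actual values.
schema = {col: type(val).__name__ for col, val in rows[0].items()}
print(schema)  # → {'neighbourhood': 'str', 'opened': 'str', 'response_hours': 'float'}
```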
Our tools focus on making code transparency accessible to broader audiences by implementing debuggers, logs, and other software engineering practices that were previously limited to technical users.
We approach this by trusting users while implementing strong defaults. Storytelling bias has always existed – if someone wants to mislead, they’ve had tools to do so before AI and will continue to have them.
Rather than moderating output or preventing user intent, we focus on providing excellent defaults that enable honest storytelling. For example, we don’t offer 3D pie charts like the one Steve Jobs famously used in 2007 to make the iPhone market share appear larger. Our pie charts automatically order sectors from largest to smallest and include clear labels.
We invest heavily in thoughtful visualisation defaults that make misleading presentations difficult while trusting users to act with good intent.
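The largest-to-smallest default can be sketched in a few lines; the market-share numbers below are invented, not the figures from the keynote mentioned above:

```python
# Hypothetical market-share numbers, presented the way an honest pie chart
# defaults to: largest sector first, every slice explicitly labelled.
shares = {"Other": 12.1, "Apple": 19.5, "Samsung": 31.3, "Nokia": 37.1}
ordered = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
for name, pct in ordered:
    print(f"{name}: {pct}%")
```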
The recent OpenAI GPT-5 launch provides a contemporary example. Their bar chart showed confusing results where a lower numerical score appeared higher visually than a higher score from another model. Whether this was model-generated, human error, or deliberate misleading remains unclear, but it demonstrates that visualisation accuracy challenges persist.
It’s easier than ever to create visualisations, but the responsibility for accuracy and verification still lies with creators and communicators.
That example highlights why our technical approach matters. If that chart came from image generation, it represents a completely different technology from ours. We don’t use LLMs for vision or image generation – we generate Python code that processes data into JSON structures, which JavaScript engines convert to SVG and render in browsers. This architecture eliminates the possibility of visual hallucinations that could occur with image-generation approaches.
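A minimal sketch of the figure-as-data idea, in the spirit of Plotly’s JSON figure schema (simplified here): the chart is a data structure all the way down, so there is no image model in the loop that could hallucinate bars or labels:

```python
import json

# A simplified figure specification: traces in "data", styling in "layout".
# This JSON is what a JavaScript engine deterministically renders to SVG.
figure = {
    "data": [{"type": "bar", "x": ["Jan", "Feb"], "y": [120, 95]}],
    "layout": {"title": {"text": "Complaints per month"}},
}
spec = json.dumps(figure)
print(json.loads(spec)["data"][0]["type"])  # → bar
```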
We address this through multiple accuracy layers. First, basic functionality: does the code actually run without syntax errors? Raw LLM output today only succeeds about one-third of the time – two-thirds contain syntax errors that prevent execution. However, with our autocorrection loops and surrounding tooling, we achieve 90%+ accuracy rates.
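A toy version of such an autocorrection loop, with a stand-in `fix` function playing the role of the LLM repair step (the real pipeline is far richer):

```python
def run_with_autocorrection(source, fix, max_attempts=3):
    """Run generated code; on any failure, ask `fix` (a stand-in for the
    LLM repair step) for a corrected version and try again."""
    for _ in range(max_attempts):
        try:
            namespace = {}
            exec(source, namespace)
            return namespace
        except Exception as err:
            source = fix(source, err)
    raise RuntimeError("could not repair generated code")

broken = "total = sum([1, 2, 3)"                      # typical raw-LLM syntax error
repaired = lambda src, err: "total = sum([1, 2, 3])"  # hypothetical repair step
print(run_with_autocorrection(broken, repaired)["total"])  # → 6
```

Feeding the error back with context is what lifts the one-third raw success rate towards the 90%+ figure cited above.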
The deeper challenge is ensuring analytical correctness. We’re building transparent verification tools into our product, including English-language descriptions of code functionality. This works well for analytics because analytical code follows sequential steps – ‘bin data by hour, then calculate average’ – creating straightforward one-to-one correspondence between descriptions and code with minimal hallucination risk.
Most importantly, we provide step-by-step data transformation verification. Users can examine raw data, see intermediate transformations, and inspect final results. This enables easy spot-checking at each stage – the same verification process needed for any analytics project, whether human or AI-generated.
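The step-by-step verification idea can be sketched as a pipeline that records every intermediate result; the data and step names are invented:

```python
def logged_pipeline(data, steps):
    """Apply each named step in order, keeping every intermediate result
    so each stage can be spot-checked."""
    trace = [("raw", data)]
    for name, fn in steps:
        data = fn(data)
        trace.append((name, data))
    return data, trace

# Invented response-time data (hours); None marks a missing record.
hours = [18.5, 7.0, None, 30.0]
steps = [
    ("drop missing", lambda xs: [x for x in xs if x is not None]),
    ("average", lambda xs: sum(xs) / len(xs)),
]
result, trace = logged_pipeline(hours, steps)
for name, value in trace:
    print(name, value)  # raw data, then each intermediate, then the answer
print(result)  # → 18.5
```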
We’re actually building superior verification interfaces compared to current practices. Try auditing a complex Excel spreadsheet with hundreds of formulas – our transparent, step-by-step approach will be far more accessible. This represents standard verification practice regardless of how code is generated.
We’re deliberately not having our systems generate insights for users. Instead, we provide visualisation and scenario exploration tools with humans making interpretations.
Consider my 311 complaints example: if someone from the mayor’s office requested visualisations showing city improvement, a chatbot might hallucinate confirmatory answers and fabricate supporting numbers. Our system would generate code comparing this year versus last year and visualise the results without interpretation – letting users draw their own conclusions from actual data.
While the system could potentially manipulate data to support narratives, we discourage this behaviour and ensure end viewers can examine underlying data sources. Ultimately, people have always been able to fabricate data to support their stories – AI doesn’t fundamentally change this reality.
Our primary guardrail is visualising data directly without interpretation layers. We’ve embedded a decade’s worth of data visualisation best practices into the product itself – appropriate chart types, our established house style, and proper aggregation methods and controls that enable visualisations to tell complete stories.
We also rely on the frontier models themselves, which are building moderation techniques directly into their systems. We’re largely deferring to these advanced models to handle much of the content moderation.
Natural language interfaces can obscure what’s technically possible with underlying code. When coding directly, you develop intuition about feasibility because you’re crafting the strategy yourself. We’ve seen users request impossible functionality, making it difficult to provide clear feedback about technical limitations – leading them down unproductive rabbit holes.
This reflects AI’s ‘jagged frontier’ – remarkable capabilities in some areas, surprising limitations in others, with boundaries that aren’t apparent until experienced. I tell users that their existing expertise in Python, data science, or Plotly libraries remains valuable. The better you understand underlying fundamentals, the more effectively you can guide AI toward achievable solutions.
We’ve maintained a code-first approach to data visualisation and application development since our founding. Our Python library, launched in 2014, now sees tens of millions of downloads monthly. This positions us perfectly for the AI era because LLMs excel at code generation.
Our open-source libraries – Dash and Plotly graphing library – include tens of thousands of examples that LLMs have been trained on. This enables them to generate sophisticated Python code for applications and visualisations.
Our latest product, Plotly Studio, is an AI-native application for creating visualisations, dashboards, and data apps. However, we’ve learned that LLMs represent only 30% of the solution. The remaining 70% is the tooling ecosystem – running code, verification, testing, and iterative improvement. This creates what many call ‘agentic AI’ – code generation within an execution environment that can test and refine its output.
Plotly Studio bundles everything: Python runtime, code generation, automatic rendering, error correction, and an intuitive interface. This comprehensive approach makes agentic analytics accessible to everyone.
Exactly. We enable data scientists and analysts to visually explore datasets through various lenses – raw data, simulations, scenario modelling. These analytical capabilities leverage custom code execution, which traditional BI tools struggle with. While most BI platforms excel at data visualisation, they’re limited in running analytics on top of that data.
Code-based analytics unlock these advanced capabilities, and AI now allows users to instruct this process in natural language rather than requiring programming expertise.
Plotly Studio’s simplicity masks significant technological complexity. We’ve embedded hundreds, potentially over a thousand suggestions that guide code generation toward consistent structures. Unlike other vibe coding tools that generate monolithic files with thousands of lines, we enforce clean architecture: separate files, structured projects, and consistent templates that improve both accuracy and maintainability.
We’ve encoded years of hard-learned lessons from building applications for customers. This includes optimisation defaults like automatically enabling WebGL-based visualisations over slower SVG alternatives. LLMs are powerful when guided by experienced operators – we’ve baked that expertise directly into the product so users benefit from best practices without needing deep technical knowledge.
Python installation remains notoriously difficult for newcomers. We’ve packaged Python directly into our application runtime, handling cross-platform compatibility, certificate issues, corporate network constraints, and permissions automatically. Users can download and run immediately without technical setup.
Our auto-correction system provides rich context to LLMs when syntax errors occur – variable scope, debugging traces, and detailed error information enable superior self-correction. We’ve architected code generation for parallel execution across multiple agents. While generating 2,000-5,000 lines of Python code would typically require 10 minutes sequentially, our parallel approach delivers complete applications in 90 seconds to two minutes.
Error compounding presents a fundamental challenge: if each step in a 10-step agentic process achieves 99% accuracy, the overall success rate drops to approximately 90% (0.99^10 ≈ 0.90). We’ve specifically designed our architecture to minimise steps and enhance self-correction capabilities, maintaining high accuracy in complex multi-agent workflows.
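The arithmetic behind that figure:

```python
# Ten independent steps, each succeeding 99% of the time,
# compound to roughly a 90% overall success rate.
per_step_accuracy = 0.99
steps = 10
overall = per_step_accuracy ** steps
print(round(overall, 3))  # → 0.904
```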
This engineering focus on error prevention and parallel processing enables the reliable, fast experience that makes advanced analytics accessible to non-technical users.
We’ve built a custom engine that runs code in our controlled environment with strict structural requirements. Rather than allowing freewheeling agents, we provide specific instructions, expected inputs/outputs, and constraints to ensure high accuracy and consistent testing.
AI systems work best in feedback loops – generating code, running it, testing it, and self-correcting. LLMs won’t do this by default, so we’ve structured our code generation for easy testing. Our proprietary testing engine evaluates generated code and provides targeted error correction loops for automatic fixes.
This controlled approach maintains the accuracy needed for reliable analytics applications while preserving the flexibility that makes AI-powered analysis powerful.
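A toy sketch of contract-style evaluation of generated code against expected inputs and outputs, standing in for the proprietary testing engine described above:

```python
def passes_contract(fn, cases):
    """Check a generated function against expected input/output pairs."""
    return all(fn(*args) == expected for args, expected in cases)

generated = lambda x, y: x + y          # stand-in for model-generated code
cases = [((1, 2), 3), ((0, 0), 0)]      # the contract the code must satisfy
print(passes_contract(generated, cases))  # → True
```

Constraining generated code to known inputs and outputs is what makes this kind of automatic testing tractable.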
The biggest risk is rushing to fully automated insights generation. It remains unclear how effectively AI can generate insights independently versus requiring human interpretation. I believe we should focus on building excellent tools for human interpretation rather than having AI make decisions without understanding implicit analytical ambiguities.
Many systems may attempt to ‘skip to the finish’ – having AI interpret results and make decisions autonomously. This could lead to incorrect insights and assumptions, representing a potentially dangerous leap too far ahead of current capabilities.