Lin Wang is a Data Science Leader at Bayer, currently overseeing Analytics and Design. Lin has 10+ years’ experience in data, specialising in machine learning and in disruptive innovation and implementation. With a background in genomics, bioinformatics and people management, Lin has led organizational transformations and developed talent and teams for the fast-evolving landscape in the era of AGI.
In this post, Lin considers the rapid transformation of AI – and the impact this has on the role of the data scientist. How can the data scientist adapt to meet the new challenges of fast-evolving tech, and utilise the opportunities it offers? What new skills and personal qualities should the Data Scientist bring to the table?
In recent months, we’ve witnessed a seismic shift in artificial intelligence.
This transformation, resembling a grand renaissance, has been sparked by large language models (LLMs) like OpenAI’s GPT series. What was once considered simple pattern prediction has now unveiled emergent capabilities that have taken centre stage, revolutionising our conception of AI’s potential. The prospect of achieving Artificial General Intelligence (AGI) has rocketed skyward, setting us on an accelerated path of adaptation and adoption that has left many astounded, eager, and even fearful.
As a people leader interested in developing the talent of Data Scientists, I’ve observed a wave of change sweeping across the tech sector. Companies are in full sprint, vying to stay ahead of the technological curve. In the scramble to adapt, however, there’s a blind spot emerging: the crucial element of human potential seems to be getting sidelined.
This brings us to a critical juncture. With the rapid pace of AI evolution, what does the future hold for our Data Scientists? Through the lens of this article, I aim to offer my perspectives on how the roles and responsibilities of Data Scientists may evolve in the coming years. I invite you to join me as we explore this exciting future landscape, teeming with promises and opportunities.
Let’s take a moment to peel back the layers of Data Science in industry, a world which, at its heart, is dedicated to solving problems and driving tangible outcomes. If you’re peering into this world from the outside, you might imagine a Data Scientist’s day is filled with intellectual battles over complex problems, meditating over the merits of Data Science techniques, and crafting the perfect implementation tactics.
However, the reality can often be far less glamorous and somewhat surprising to those not entrenched in the field. The truth is that Data Scientists often find themselves more like explorers in a vast wilderness, dedicating substantial time to the arduous task of hunting, gathering, and refining the raw materials of their craft: the data itself. They then spend hours coding and troubleshooting to extract insights before finally weaving those insights into narratives that non-data-savvy stakeholders can understand and act upon.
The advancements in AI might just be the game-changer we need to tackle these less-visible inefficiencies. They equip Data Scientists with powerful tools to harness their core competencies fully. Data Scientists will have more time to focus on employing cutting-edge analytical methods to derive actionable insights and address real-world issues. This is the “future” many Data Scientists envisioned when starting their journeys, and we’re journeying back to that future now.
AI advancements are triggering significant productivity boosts and impact acceleration in Data Science. Let’s delve into a few recent AI-enabled innovations that illustrate this point.
Take, for instance, GitHub’s Copilot. This AI-powered coding aide serves as a steadfast companion for every Data Scientist, offering instant code suggestions and considerably reducing their workload. Imagine the convenience of telling Copilot your coding objective in layman’s terms, and it responds with the necessary subroutines or functions. Of course, sanity checks remain crucial even when using AI. This isn’t science fiction – it’s the reality we are experiencing today. Several similar coding assistants are emerging, including DeepMind’s AlphaDev, which impressively identified sorting algorithms boasting a speed and scalability improvement of up to 20% compared to leading human-designed benchmarks. Such AI-enabled coding assistants empower our Data Scientists to dedicate more time to discovery and problem-solving, thus boosting their efficiency.
Let’s also consider the potential of a tool capable of swiftly skimming through lengthy reports or intricate technical documents, identifying key points to form hypotheses or spotlighting opportunities for system improvement. This is now feasible thanks to AI’s phenomenal prowess in summarising vast bodies of text. This area is rapidly growing, with paid and open-source options becoming available. Notable newcomers include Jasper (formerly Jarvis), a GPT-3 model-based tool adept at tackling generalised summarisation tasks. There’s also Scholarcy, tailored for academic use, including direct PDF ingestion capabilities. Scholarcy appears to operate based on a proprietary algorithm, albeit drawing inspiration from Google’s PageRank algorithm and ‘bottom-up attention’ research. While these tools may overlook nuances requiring deep domain knowledge, their abilities are continually improving. It’s just a matter of time before we have access to embedded summarisation tools for industrial settings capable of meeting requirements for IP capture and incorporating profound industrial knowledge. Such tools will assist Data Scientists in navigating information more promptly and efficiently.
AI’s transformative potential also impacts how insights are communicated and implemented. AI-generated presentations and visuals enable Data Scientists to distill intricate insights into digestible narratives. For instance, Beautiful.ai provides a user-friendly platform for creating vibrant presentations, eliminating the need for meticulous crafting in PowerPoint. Another example is SlidesAI.io, integrated into the Google Docs ecosystem, making visually appealing slides easy to create. Granted, these tools focus more on the aesthetic aspect than the content, but just think about the potential when you pair these capabilities with AI’s text summarisation prowess, as previously mentioned.
Imagine a scenario where Data Scientists can articulate their findings to business stakeholders using the specific lingo or style that encourages understanding, support, and rapid implementation. This approach will undoubtedly expedite the journey from insight discovery to solution implementation.
These AI-enabled tools are becoming both more sophisticated and more widely accessible, which is an exciting development. We’re seeing an array of AI-powered tools integrating seamlessly into familiar software like Microsoft’s Office Suite, which now includes built-in AI features. The open-source world is also teeming with groundbreaking innovations, drawing inspiration mainly from Meta’s recently “leaked” LLM, known as LLaMA.
Rumours are starting to circulate that Meta may be looking into offering commercial licenses, which could open the door for companies to integrate AI into their operations natively. This development is exhilarating and signals a future where cutting-edge AI technology is not solely within reach of tech giants but is a shared resource available to all.
Indeed, we are on the brink of a new era. AI is helping Data Scientists not only to return to their original mission but also to unlock new opportunities. Rather than being confined to the analytical sidelines, Data Scientists are now stepping into strategic roles, spearheading business decision-making processes. AI acts as their navigation system, guiding them through uncharted territories toward a future teeming with promise and potential.