
In the Age of AI: How to Balance Flexibility, Control, and Business Value by Nikhil Srinidhi

 

Nikhil Srinidhi helps large organisations tackle complex business challenges by building high-performing teams focused on data, AI and technology. Before joining Rewire as a partner in 2024, Nikhil spent over six years at McKinsey and QuantumBlack, where he led holistic data and AI initiatives.
In our latest post, Nikhil explains the importance of data architecture for organisations. With a solid architecture in place, organisations can scale data and AI correctly, ensuring everyone is working toward the same end goal. Nikhil identifies the practices leaders can apply to build a strong architecture that delivers real business value:

How would you define data architecture and why should business leaders care?

In a small company, your daily conversations serve as the data architecture. The problem arises at scale. Once you have multiple teams, you need something that contains those agreements and design patterns, and you can’t have 100 meetings a week explaining this to a whole organisation.

It’s critically important because without it, everyone is building with different blueprints. Imagine constructing a building with each person using a different schematic. Connecting it to the origin of the word ‘architecture,’ it’s really a way of ensuring everyone is working toward the same end goal. For data, it’s about how you work with technology, different types of data, how you process it, and how you deal with structural and quality issues in a way that moves the needle forward.

I’d argue that architecture is key to scaling data and AI correctly. That’s why organisations have been investing heavily in it. However, many haven’t got everything they’d like out of it, so there’s still an ROI question worth discussing.

Data architecture is really a way of ensuring everyone is working toward the same end goal.

What made data architecture a C-suite topic, and what role does AI play?

Twenty years ago, companies like IBM provided the full value chain from databases to ETL to visualisation. Over time, many companies began specialising in niche parts of the data value chain. Enterprises suddenly faced new questions as the importance of data grew: Which combination of technologies should I use? Where should I do what? Often you had data in one place and certain capabilities in another. Do you move the data? Do you move the capability?

Now with generative AI requiring vast amounts of unstructured information, you’re thinking about knowledge architecture and information architecture. How do you ensure the right information feeds these models? The problem is growing fast.

Could you clarify the key layers of a modern data stack?

I’d break it into two aspects. First is the static aspect, the data technology architecture: what tools, components, and vendors you use from ingestion through to consumption.

The second aspect is the dynamic part: data flows. How data moves from creation to consumption, with clarity about where processes should be standardised and where they can vary.

What’s important is providing guidelines on how these patterns and technologies can be applied at scale. Successful data architecture becomes easily applied by the teams actually building things.

What principles should good modern data architecture have?

Ironically, while architecture suggests permanence, data architecture needs modularity and flexibility. If there’s a disruption in one component or an entirely new processing method emerges, you should be able to switch that component without breaking the entire system.

The human angle is often ignored. How do you encapsulate architecture as code and reusable modules that development teams can easily pull from a repository? The more practical and tangible you make it, the better.
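To make the ‘architecture as code’ idea concrete, here is a minimal sketch of a reusable ingestion template that product teams could pull from a shared repository. The module structure, names, and fields are illustrative assumptions, not an existing internal library.

```python
# A minimal sketch of "architecture as code": a reusable ingestion template that
# product teams could pull from a shared internal repository. Names are hypothetical.
from dataclasses import dataclass

@dataclass
class IngestionConfig:
    source_uri: str               # where the raw data lives
    target_table: str             # where the standardised output should land
    file_format: str = "parquet"  # the committed storage pattern

def build_ingestion_job(config: IngestionConfig):
    """Return a runnable job that follows the organisation's standard pattern."""
    def run():
        print(f"Ingesting {config.source_uri} -> {config.target_table} "
              f"as {config.file_format}")
        # ...extract, validate against shared schemas, write to target...
    return run

# A product team reuses the pattern instead of re-implementing it:
job = build_ingestion_job(IngestionConfig("s3://raw/sales/", "analytics.sales"))
job()
```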

Another quality is observability. You should be able to tell which parts of your data architecture are incurring the highest costs, which are growing fastest, and where leverage is reducing when it should be increasing.
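As an illustration of that observability point, a minimal sketch of tracking cost and growth per architecture component might look like this; the component names and figures are purely illustrative.

```python
# A hedged sketch of basic architecture observability: tracking cost and growth
# per component so the most expensive or fastest-growing parts stand out.
# Component names and numbers are illustrative only.
monthly_cost = {
    "ingestion": [1200, 1350, 1600],  # cost per month, oldest -> newest
    "storage":   [800, 950, 1400],
    "serving":   [400, 420, 430],
}

for component, costs in monthly_cost.items():
    growth = (costs[-1] - costs[0]) / costs[0]
    print(f"{component:10s} latest={costs[-1]:>6} growth={growth:.0%}")
```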

How should data architecture align with business strategy, and where do you see disconnects?

The business or data strategy describes what to do with data: why we need it, what business objective it helps achieve, and what data represents our competitive advantage.

Architecture focuses on how: building things effectively and efficiently with optimal resources. It provides perspective on trade-offs. You can’t have lower cost, higher quality, and speed simultaneously. Architecture should make these decisions crystal clear so organisations walk into them consciously rather than falling into them. Whoever’s building the architecture needs to be well-versed in the business strategy. When architecture becomes so generic that you can switch the company name and it works for any industry, it probably won’t work.

What are the biggest misconceptions about modern data architecture?

It depends on the industry, of course, but the biggest misconception I’ve encountered is that extreme abstraction will always make your architecture better. There’s a tendency for architecture to become overly theoretical, but we need to ensure we make pragmatic trade-offs.

How do you make it pragmatic? There might be a specific part of the architecture, like your storage solution, where it’s okay not to have all the flexibility through abstractions or modularity. You can double down on specific technologies and storage patterns. It’s okay to commit to something. For example, if you want to store all your data as Iceberg tables or Parquet files, and that’s a decision you’ve made for now, you can go with it. You don’t have to build it in a way where you’re always noncommittal about your decisions.
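For instance, committing to Parquet as the storage pattern could be centralised behind one small helper, as in this hedged sketch (assumes the pyarrow package; paths and columns are illustrative):

```python
# A minimal sketch of committing to a single storage pattern (Parquet here),
# assuming pyarrow is installed. Paths and column names are illustrative.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "customer_id": [1, 2, 3],
    "touchpoint":  ["web", "store", "app"],
})

# Every team writes through the same helper, so the storage decision lives in one place.
def write_dataset(table: pa.Table, path: str) -> None:
    pq.write_table(table, path)

write_dataset(table, "customers_touchpoints.parquet")
```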

What’s important is recognising where commitment benefits you and where it could become a cost.

For example, in life sciences R&D, you’d want to give consumers freedom to explore datasets in different ways. Diversity is fine there. But there’s no point building the most perfect storage layer that tries to remain neutral. The misconception that architecture must be perfectly modular at every angle leads to unnecessary work.

How has GenAI influenced data architecture decisions in traditional business sectors?

GenAI has achieved visibility from the board to developers. The realisation is that without leveraging proprietary information, the benefit GenAI provides will be the same for any company.

The biggest challenge is providing the right endpoints for data to be accessed and injected into LLM prompts and workflows. How do you build the right context? How do you use existing data models with metadata to help GenAI understand your business better?
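One way to picture this is a sketch that assembles existing data-model metadata into prompt context for an LLM workflow. The metadata content is hypothetical and no particular LLM SDK is assumed.

```python
# A hedged sketch: turning existing data-model metadata into prompt context
# so a GenAI workflow "understands the business" behind a table.
# The metadata content is hypothetical and no particular LLM SDK is assumed.
table_metadata = {
    "name": "orders",
    "description": "One row per confirmed customer order.",
    "columns": {
        "order_id": "Unique order identifier",
        "net_value_eur": "Order value after discounts, in euros",
    },
    "caveats": "Excludes cancelled orders; refreshed nightly.",
}

def build_context(meta: dict) -> str:
    cols = "\n".join(f"- {c}: {d}" for c, d in meta["columns"].items())
    return (f"Table {meta['name']}: {meta['description']}\n"
            f"Columns:\n{cols}\nCaveats: {meta['caveats']}")

prompt = build_context(table_metadata) + "\n\nQuestion: What was net revenue last month?"
print(prompt)
```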

The broader question is, how do you handle unstructured data? Information in documents, PDFs, PowerPoint slides. How do you make this part of the knowledge architecture going forward? There’s no clear approach yet.

How should organisations approach centralisation versus decentralisation?

I’ll be controversial. While data mesh was an elegant concept, the term created more confusion than good. It became about decentralisation versus centralisation, but the answer always depends.

For high-value data like customer touchpoints, you’d want standardisation. Centralisation may be fine. But ‘centralised’ triggers reactions because it means ‘bottleneck.’

Much advantage comes when data practitioners are deep in business context. If someone is working in the R&D space or clinical space, the closer they are to domain knowledge, the better, even if they have a background as a data engineer or data scientist. In these situations, when something is centralised, requirements get thrown back and forth.

Focus on how you want data, knowledge, and expertise to flow. There’s benefit to having expertise at the edge, but also to controlling variability. Both approaches should be examined without emotion.

What’s your approach to separating signal from noise in the current data and AI landscape?

First, understand what types of data you have. A data map that’s 70-80% correct is enough to start. ‘All models are wrong, but some are useful.’

Second, understand technologies and innovations in flux within each capability. Know the trends so you can identify leapfrog opportunities rather than doing a PoC for every capability.

Third, determine what is good enough. ‘Perfect is the enemy of good.’ Half the time, organisations pick solutions with a silver-bullet mentality. Be honest: this works in 80% of cases, but here’s the 20% where it won’t. Being aware of that de-hypes the signal.

To recap: know capabilities’ connection to business value, understand market trends, and identify the extent capabilities need to be implemented, recognising you have limited resources.

What mindset and capability shifts do organisations need around data initiatives?

Working backwards, successful organisations have product teams that rapidly reuse design patterns and components to focus on problems requiring their expertise. Moving away from what we call pre-work to actual work.

Data scientists spend 70-80% of their time on data cleaning and prep. We want everyone to easily pull integration pattern codes, templates, and snippets without reinventing the wheel.

Individuals need to build with reusability in mind. If it takes 10 hours to build a module, it may take three more hours to build it in a more generalised fashion. Knowing when to invest that time is critical.
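A small illustration of that trade-off: the first function below solves one team’s problem, while the generalised version takes slightly more effort but can be reused by other teams. The function names and data are hypothetical.

```python
# An illustrative sketch of "building with reusability in mind": the first function
# solves one team's problem; the generalised version costs a little more effort
# but can be pulled off the shelf by other teams. Names are hypothetical.

def clean_sales_dates(rows):
    # Original, single-purpose version: only knows about the 'sale_date' column.
    return [{**r, "sale_date": r["sale_date"].strip()} for r in rows]

def clean_date_column(rows, column: str):
    # Generalised version: the column name is a parameter, so any team can reuse it.
    return [{**r, column: r[column].strip()} for r in rows]

rows = [{"sale_date": " 2024-01-31 ", "amount": 120}]
print(clean_date_column(rows, "sale_date"))
```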

People building architecture need a customer-facing mindset. Think of other product teams as internal customers. This drives adoption and creates a flywheel effect.

How should organisations structure the data architecture capability?

The most successful architects have grown from engineering implementation roles. They’ve built things, been involved in products, then broadened their focus from one product to multiple products. That’s the most successful way of scaling the architectural mindset.

Even if architecture is a separate chapter, intentionally bring them together in product teams. Make the product team, where developers, architects, and business owners collaborate, the first level of identity an employee has.

If you ask an employee ‘What do you do?’, they should say ‘I’m part of product team X,’ not ‘I’m in the architecture chapter.’ This mindset shift requires investment. It’s a people issue. It’s about ensuring there is trust between groups and recognising what architecture is at that product team level.

How should we measure the impact of data architecture? What’s a smarter way to think about the value?

There’s no clear-cut answer because data architecture is fundamentally enabling and it’s difficult to directly attribute value. It’s like a highway. Can you figure out what part of GDP comes from that highway? You can use proxies, but it’s abstract.

The most important thing is almost forgetting ROI. Nobody questions whether a highway is important. Nobody questions the ROI of their IT laptop. We need to dream about a future where data architecture is similarly valued.

Ensure whatever you build connects to an initiative with a budget and a specific business objective. You’re not just building something and hoping it will be used. Recognise that there are some capabilities individual product teams will never be incentivised to build, and you need centralised teams with allocated budget for those.

Benchmark against alternatives: what would it cost teams to build this on their own using AWS or Azure accounts? Is there an economies of scale argument?

Measure proxy KPIs where possible because, ultimately, it’s about the feeling of value. But also tell the story of what would happen without a central provider. What would it cost individual teams to do that on their own? That helps justify and track ROI.

Can you give some examples from regulated industries that illustrate the principles you have shared?

In life sciences R&D, data architecture is about bringing together different data types, including unstructured information, and making it usable quickly. There’s a big push towards interoperability using standards like FHIR and HL7. If you’re designing something internally, why not use these from the start rather than building adapters later?
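As a hedged sketch of designing for interoperability from the start, reading a Patient record over FHIR’s standard REST interface might look like this; the base URL is hypothetical and the requests package is assumed.

```python
# A minimal sketch of designing for interoperability from the start: reading a
# Patient resource over FHIR's standard REST interface instead of a bespoke format.
# The base URL is hypothetical; assumes the 'requests' package is installed.
import requests

FHIR_BASE = "https://fhir.example-hospital.org/r4"   # hypothetical endpoint

def get_patient(patient_id: str) -> dict:
    resp = requests.get(
        f"{FHIR_BASE}/Patient/{patient_id}",
        headers={"Accept": "application/fhir+json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()   # a standard FHIR Patient resource, not a custom schema

patient = get_patient("12345")
print(patient.get("birthDate"))
```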

Beyond the commercial space, there’s also the opportunity to become more effective at filing for regulatory approval and generating evidence. There’s tremendous value in ensuring you have the right audit trails for how data moves through the enterprise, especially as companies enter the software-as-a-medical-device space. Knowing how information and data travel through various layers of processing is made possible through data architecture.

One of the biggest competitive advantages is becoming better at R&D. How do you take ideas to market? How do you balance a very academia-driven approach with a data-driven and technology-driven approach? This is where data architecture can be quite impactful.

Think about developing different types of solutions that require medical data to support patient-facing systems or clinical decision support systems. In all of these, it’s critical to get the data flows right, but also to ensure the data that’s seen and used has a level of authenticity and trust.

The kinds of data we’re working with vary from real-world data you can purchase – from healthcare providers, hospitals, especially EMRs and EHRs – to very structured types of information. How do you take that information, combine it, build the right models around it, and provide it in a way that different teams can use to drive innovation in drastically different spaces? Data architecture there is less about giving you an offensive advantage and more about reducing the resistance and friction so the entire research and development process can flow through.

For example, how do you build the right integration patterns to interface with external data APIs? The datasets you’re buying are probably made accessible via APIs you need to call, and you’re often bound by contracts that require you to report how often these datasets are used. If you’re using a specific dataset 30 times, it corresponds to a certain cost. However, if you’re not able to report on that, the entire commercial model you can negotiate with data providers will change. They’ll naturally have to charge more because they don’t have a sense of how it’s being used and will be more conservative in their estimates.

Being able to acquire data in different forms with the right types of APIs and record usage is a huge step forward.
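A minimal sketch of such an integration pattern is shown below: usage is recorded at the point of access so contractual reporting to data providers stays accurate. The dataset IDs and logging approach are illustrative assumptions.

```python
# A hedged sketch of an integration pattern that records dataset usage at the point
# of access, so contractual reporting to data providers stays accurate.
# Dataset IDs and the logging approach are illustrative assumptions.
import datetime
import json

USAGE_LOG = "dataset_usage.log"

def fetch_external_dataset(dataset_id: str, fetch_fn):
    """Call the provider's API via fetch_fn and log one usage record per call."""
    record = {
        "dataset_id": dataset_id,
        "accessed_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    with open(USAGE_LOG, "a") as f:
        f.write(json.dumps(record) + "\n")
    return fetch_fn(dataset_id)

# Usage: any provider-specific client can be wrapped the same way.
data = fetch_external_dataset("emr_claims_2024", lambda d: {"rows": 0, "source": d})
```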

Good data architecture is needed because across that architecture, you apply data observability principles. How is my data coming in? When is it coming in? How fresh is the information? How is it stored? How big is it? Who is consuming this information? What kind of projects do they belong to? How are they using it? Are they integrating these datasets directly into our products or tools?

Successful organisations go for leaner solutions with four to five integration patterns. They say: ‘This is how we get external data. If there’s a way not covered by these, talk to the team.’ This level of control is required, because without it, tracing data and maintaining lineage becomes very difficult.

A lot of the value comes from the acquisition end of the pipeline. The second source of value comes from how data is consumed. What kind of tools can you provide an organisation to actually look at patient data? For example, with multimodal information, genomic information, medical history, diagnostic tests. How do you bring them all together to provide that realistic view? This is also an area where data architecture is very important because it goes much more into the actual data itself.

Also, what are the links between the information? How do you ensure you can link different points of data to one object? What kind of tools can you provide to the end user to explore this information? The classic example is combining a dataset and providing a table with filters, letting users filter on the master dataset. But recognising the kinds of questions your users would have also allows you to support those journeys. In these situations, successful companies have always taken a more tailored approach. Identifying personas and then building up that link between all these different types of data, especially in the R&D space.

Can you elaborate on the stakeholder challenges in life sciences?

Life sciences organisations need diverse technologies and integration patterns, but technology and IT are still seen as a cost bucket. The more technologies you have, the more quickly data gets siloed.

Where to draw the line on variability in data architecture – especially in storage and data acquisition – is critical. Too much variability quickly balloons into a large IT bill. When you can’t directly link value to it, organisations cut technology costs without realising the impact. That can dramatically affect capabilities in commercial excellence or drug discovery pipelines. We need to bring these two worlds closer together.

How does the diversity of life sciences data – such as omics, clinical trials and experimental data – affect architecture?

When you have such diverse multimodal data and dramatically different sizes, it’s important to ensure you have good abstracted data APIs, even for consumption within the company. If I’m consuming imaging information or clinical trial information, how do I also have the appropriate metadata around it that describes exactly what the data contains, what its limitations are, under what context it was collected, and under what context it can be used, depending on the agreement?
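To illustrate, the metadata travelling with every dataset served through an internal data API could look something like the sketch below; the field names are illustrative, not a standard.

```python
# A hedged sketch of the metadata that could travel with every dataset served
# through an internal data API; the field names are illustrative, not a standard.
from dataclasses import dataclass

@dataclass
class DatasetMetadata:
    name: str
    modality: str            # e.g. "imaging", "clinical_trial", "omics"
    contents: str            # what the data actually contains
    limitations: str         # known gaps or biases
    collection_context: str  # under what context it was collected
    permitted_use: str       # under what context it may be used, per agreement

meta = DatasetMetadata(
    name="trial_ab12_labs",
    modality="clinical_trial",
    contents="Lab values for trial AB-12 participants",
    limitations="Single region, adults only",
    collection_context="Collected under trial AB-12 consent",
    permitted_use="Research on condition X only, until 2027",
)
```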

This kind of metadata is key if you want to automate data pipelines or bring about computational governance. This is a key capability when you’re dealing with very sensitive healthcare information, and data is often collected with a very predefined purpose. For example, to research a specific disease or condition. Initially it might not be clear whether you can use that information to look at something else in the future.

These agreements, whether made in the past or still to be made, need to reach a level of granularity where the legal contracts you sign with institutions, individuals, and organisations about data use can be translated and depicted as code, in a way that automatically influences the downstream pipelines where you actually have to implement and enforce that governance.

For example, if a dataset is only allowed to be used for a specific kind of R&D, it needs to show up at the data architecture level that only someone from a specific part of the organisation (because they’re working on this project) can access this information during this period. The day the project ends, that access is revoked, and all this is done automatically. This isn’t the case yet. It’s still quite hybrid. This computational governance, because of the multimodality of the information combined with sensitivity, is the biggest problem many of these companies are trying to solve today.
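A minimal sketch of that kind of computational governance, with a data-use agreement expressed as code so access is scoped to one project and expires automatically, might look like this; project names and dates are illustrative.

```python
# A hedged sketch of computational governance: a data-use agreement expressed as
# code, so access is granted only to the named project and revoked automatically
# when the project ends. Project names and dates are illustrative.
import datetime

policy = {
    "dataset": "trial_ab12_labs",
    "allowed_project": "oncology-biomarker-study",
    "allowed_until": datetime.date(2026, 6, 30),
}

def access_allowed(user_project: str, on_date: datetime.date) -> bool:
    return (user_project == policy["allowed_project"]
            and on_date <= policy["allowed_until"])

# Access during the project window is permitted; after it ends, it is not.
print(access_allowed("oncology-biomarker-study", datetime.date(2026, 6, 1)))  # True
print(access_allowed("oncology-biomarker-study", datetime.date(2026, 7, 1)))  # False
```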

Could GenAI help researchers navigate complex data catalogues with regulatory and compliance requirements?

I think GenAI has immense capability here because many of the issues are around how you process the right types of information in a very complex setting, recognising there are legal guidelines, ethical guidelines, and contractual guidelines you want to ensure work properly. It also interfaces from the legal space to the system space, where the information actually becomes bits and bytes.

Through a set of questions and a conversation, you can at least determine what kind of use this person is thinking about, what kind of modalities are involved, where those datasets actually sit, and which ones are bound by certain rules. This is where the ability to deploy agents can make sense because when you want to really provide this kind of guidance, it means you need clarity that’s fed into the model as context that it can then base its analysis upon. Or if it’s a RAG-like retrieval approach, you need to know exactly where to retrieve the guidance from.

The evaluation logic sometimes needs to be deterministically encoded somewhere before it can be used. That requires individuals to identify or create what I call labelled data for this kind of application: if this was the scenario, this was the data, this was the user, this is what they wanted, and here’s the kind of guidance the AI should provide. With that level of labelled information, you have a bit more certainty.
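A hedged sketch of what such labelled data could look like, with all values illustrative:

```python
# A hedged sketch of the "labelled data" idea: worked examples of scenario,
# dataset, user, and the guidance the AI should give, which can later ground
# retrieval (RAG) or serve as an evaluation set. All values are illustrative.
labelled_examples = [
    {
        "scenario": "Re-use consented trial data for a new indication",
        "dataset": "trial_ab12_labs",
        "user_role": "R&D scientist",
        "request": "Can I use this data to study condition Y?",
        "expected_guidance": "Not permitted: consent covers condition X only; "
                             "route the request to the data governance board.",
    },
]

for ex in labelled_examples:
    print(ex["request"], "->", ex["expected_guidance"])
```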

Organisations have vast amounts of unstructured data that could be vectorised and embedded to navigate it better and increase utilisation. How do you see this evolving in the future?

Vector databases, chunking, indexing, and creating embeddings in multidimensional spaces are the first step. But architecture is still limited by how you ensure data sources can be accessed via APIs and programmatic calls and protocols. You still need that so all the different islands of information have a consistent, standardised way of interfacing with them.
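A minimal sketch of that first step, with a toy stand-in for a real embedding model and an in-memory list standing in for a vector database:

```python
# A minimal sketch of chunking a document and creating embeddings. The embed()
# function below is a toy stand-in for a real embedding model, and the in-memory
# list is a stand-in for a vector database.
import math

def chunk(text: str, size: int = 200) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> list[float]:
    # Toy embedding: character-frequency vector, normalised. A real system would
    # call an embedding model here.
    vec = [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

index = [(c, embed(c)) for c in chunk("Unstructured content from documents and slides...")]
print(len(index), "chunks embedded")
```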

This is the upgrade data architectures are currently going through, driven by use cases.

What’s your one piece of advice for leaders responsible for data architecture?

Simplify and make data architecture accessible. Use simple English. Don’t use jargon. Make it a topic that even business users want to understand. Just like Microsoft made everyone comfortable with typing or Excel, architecture needs to adopt that principle. It doesn’t mean everyone needs to spend cognitive capacity on it, but it’s helpful if everyone understands its place.

Just like Microsoft made everyone comfortable with typing or Excel, architecture needs to adopt that principle.
