Dr Ana Moya is a data scientist and analytics expert with a doctorate from the Technical University of Dortmund. During her 15+ year tenure at the FUNKE and Handelsblatt Media Group, Ana’s projects have included data integrations, the development of data and text mining algorithms, and the application of statistical and advanced AI models.
Ana is a sought-after academic lecturer and teaches at the International School of Management in Germany. An established author, she addresses the application of statistical methods in data journalism articles.
In this post, Dr Ana Moya examines the evolving landscape of digital content. The way we consume media is changing, and editorial teams must adjust their techniques to stay relevant. Analytics data, Ana explains, is now key for optimising reader engagement. By utilising their data in the right way, teams can achieve effective editorial intelligence:
Beyond data journalism, statistics derived from news offerings drive the core work of editorial teams: selecting topics, shaping coverage, and publishing content. This is particularly evident in online journalism, where search engines and social media exert increasing influence. The growing use of mobile devices introduces diverse usage scenarios and reader needs. News media and publishers aim to reduce dependence on the advertising market, increasing the need to better understand readers and subscribers.
With changing media consumption patterns, editorial teams must adjust their techniques to stay relevant and achieve desired outcomes. This involves navigating a dynamic environment while maintaining content excellence and honesty, using contemporary techniques to meet audience expectations and maintain competitiveness.
Data touches every part of a publishing business, from content creation to customer engagement, helping to understand and optimise editorial processes. Historically, the tangible nature of printed newspapers made the business more concrete. Now, with digital content, new forms of presenting information, such as apps, websites, and newsletters, are essential.
In subscription-based businesses, ERP (enterprise resource planning) systems track subscription details, such as billing cycles, renewal dates, and account statuses. CRM (customer relationship management) systems focus on managing customer communications and tracking interactions.
Beyond these core systems, additional data sources provide valuable insights into reader behaviour and engagement. Analytics tools monitor how users interact with content on websites, apps and so on, capturing metrics such as page views or time spent on articles.
Additionally, content data from CMS (content management systems) play a pivotal role by storing and organising article metadata, such as publication dates, author information, categories, and tags.
Integrating and analysing diverse data sources gives publishers a comprehensive understanding of audience preferences and behaviours. A data-driven approach in journalism involves making decisions based on quantitative data and analysis. This method, encompassing algorithms, models, and historical performance metrics, places the success or failure of published content at the forefront of decision-making. While statistical evaluations identify trend topics and audience preferences, they are considered alongside journalistic criteria such as objectivity, balance, and timeliness, making the strategy data-informed rather than purely data-driven.
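As a rough illustration of such an integration, the sketch below joins analytics figures with CMS metadata through a shared article ID and sums page views per category. The field names (`article_id`, `page_views`, `category`) and all values are hypothetical; real systems will expose their own schemas.

```python
from collections import defaultdict

# Hypothetical CMS metadata, keyed by article ID.
cms_articles = {
    "a1": {"category": "politics", "published": "2024-03-01"},
    "a2": {"category": "sport", "published": "2024-03-01"},
    "a3": {"category": "politics", "published": "2024-03-02"},
}

# Hypothetical export from an analytics tool.
analytics_rows = [
    {"article_id": "a1", "page_views": 1200},
    {"article_id": "a2", "page_views": 500},
    {"article_id": "a3", "page_views": 800},
    {"article_id": "a9", "page_views": 50},  # no matching CMS entry
]

def views_by_category(cms, rows):
    """Sum page views per CMS category, skipping rows without metadata."""
    totals = defaultdict(int)
    for row in rows:
        meta = cms.get(row["article_id"])
        if meta is None:
            continue  # unmatched rows are a data quality issue worth tracking
        totals[meta["category"]] += row["page_views"]
    return dict(totals)

print(views_by_category(cms_articles, analytics_rows))
```

Skipping unmatched rows silently is a simplification; in practice it may be worth counting them to detect gaps between systems.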
The process of transforming raw data into valuable knowledge involves stages illustrated by the DIKW (data, information, knowledge, wisdom) pyramid. This progression helps understand past events, predict future trends, and derive actionable insights for intelligent decisions.
Data develops into wisdom through progressively more advanced analyses, predominantly predictive analytics. These serve not only to understand the past but also to forecast the future and to generate meaningful, action-oriented insights that support ‘intelligent’ decisions.
Measuring success in data-informed strategies requires establishing key performance indicators (KPIs). These KPIs define essential metrics for evaluating performance and ensuring alignment with strategic goals.
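To make this concrete, the following minimal sketch derives three simple KPIs from a single session log: one for reach and two for engagement. The log format and the KPI definitions are assumptions made for illustration; each organisation will define its own.

```python
# Hypothetical session log: (user_id, articles_read_in_session).
sessions = [("u1", 3), ("u1", 1), ("u2", 5), ("u3", 1), ("u3", 2), ("u3", 1)]

def reach(log):
    """Reach: number of distinct users with at least one session."""
    return len({user for user, _ in log})

def sessions_per_user(log):
    """Engagement: how often the average user returns."""
    return len(log) / reach(log)

def articles_per_session(log):
    """Engagement: how much is read during an average visit."""
    return sum(articles for _, articles in log) / len(log)

print(reach(sessions), sessions_per_user(sessions), articles_per_session(sessions))
```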
Different metrics can thus serve different purposes, but some things are currently harder to measure than others. Some will likely always resist quantification. Some sources of data, like sessions, are used in attempts to understand quite different things, like reach versus engagement. These aspects are clearly visualised in the following figure:
The effectiveness of data analysis hinges on how results (based on KPIs) are processed and communicated. Dashboards offer a graphical, interactive user interface that displays live data, while reports summarise the current status at specific times and are actively distributed. Both forms of communication should have clear goals and be tailored to recipients’ needs. Consistent dashboards and comprehensive reporting ensure relevant teams and stakeholders are informed and can act on the insights provided.
Editorial intelligence refers to a media organisation’s ability to analyse, interpret, and generate predictive insights from collected data. This process supports data-informed editorial strategies, enhancing content quality and relevance. By integrating editorial intelligence, media professionals can make informed decisions that align with audience needs, continuously improving the editorial process. The ability to understand and use data competently is crucial for effective editorial intelligence.
Advanced analytics, including predictive and prescriptive methods, play a crucial role in informed decision-making. Predictive analytics uses historical data to forecast future events, such as user conversion rates or cancellation likelihoods, involving techniques like statistical modelling, machine learning, and text mining.
For instance, a project aimed at understanding subscriber reading habits might use clustering algorithms to segment users into groups based on their content preferences and engagement levels. Classification algorithms could then be employed to categorise new content into these segments, while regression analysis might predict how different types of content will perform. Text mining techniques can analyse and extract characteristics from articles like sentiment and identify trending topics.
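The segmentation step could look roughly like the sketch below, which uses a plain k-means implementation as one possible clustering algorithm. The two features (share of sport articles read, share of days active) and the reader values are invented for illustration, and the starting centroids are fixed by hand to keep the result reproducible.

```python
from math import dist
from statistics import mean

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then update centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(mean(axis) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [min(range(len(centroids)), key=lambda i: dist(p, centroids[i])) for p in points]
    return labels, centroids

# Hypothetical readers: (share of sport articles read, share of days active), both in 0..1.
readers = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.7), (0.1, 0.2), (0.2, 0.1), (0.15, 0.3)]

labels, centroids = k_means(readers, centroids=[readers[0], readers[3]])
print(labels)
```

With real data, features on different scales would need to be normalised first, and the number of segments is itself a modelling decision.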
A powerful text mining technique is latent semantic analysis (LSA). It evaluates documents and looks for their underlying meaning or concepts. LSA would have an easy job if each word had only one meaning, but words are often ambiguous: different words can be synonyms, and a single word can have several meanings. One example is the word ‘may’, which can be a verb, the name of a month, or a personal name. To overcome this problem, LSA compares how frequently words appear together within one document and then compares these patterns across all other documents. By grouping words that occur in similar contexts, it tries to identify words that are semantically related to each other and, eventually, to infer the intended meaning of ambiguous words. Near neighbours, comparison matrix, one-to-many comparison, and pairwise comparison are LSA methods, as stated by Rozeva and Zerkova (2017).
Figure 3: Latent semantic analysis applications. Source: Rozeva/Zerkova 2017.
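The core idea of LSA can be sketched in a few lines, assuming NumPy is installed. The example builds a term-document matrix from a tiny invented corpus, reduces it with a truncated singular value decomposition, and compares terms in the resulting concept space. The corpus, the choice of two concepts, and the helper names are assumptions for illustration, not the method used by Rozeva and Zerkova.

```python
import numpy as np

# Hypothetical mini-corpus: two football snippets and two finance snippets.
docs = ["match goal", "goal striker", "bank interest", "interest loan"]

vocab = sorted({word for doc in docs for word in doc.split()})
# Term-document count matrix: one row per term, one column per document.
counts = np.array(
    [[doc.split().count(term) for doc in docs] for term in vocab], dtype=float
)

# Truncated SVD: keep only the k strongest latent concepts.
k = 2
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
term_vectors = U[:, :k] * s[:k]  # each row places one term in concept space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def raw_similarity(w1, w2):
    """Similarity based on direct co-occurrence in the same documents."""
    return cosine(counts[vocab.index(w1)], counts[vocab.index(w2)])

def lsa_similarity(w1, w2):
    """Similarity in the reduced concept space."""
    return cosine(term_vectors[vocab.index(w1)], term_vectors[vocab.index(w2)])

print(raw_similarity("match", "striker"), lsa_similarity("match", "striker"))
print(lsa_similarity("match", "bank"))
```

Here ‘match’ and ‘striker’ never appear in the same document, so their raw similarity is zero, yet both co-occur with ‘goal’ and end up close together in the concept space, while ‘match’ and ‘bank’ stay apart.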
Prescriptive analytics goes beyond prediction by providing actionable recommendations based on data insights. This involves using optimisation, AI, and simulation techniques to suggest specific actions that can enhance business processes and content strategies.
While generative AI is not yet a major component of editorial intelligence, traditional statistical models and algorithms remain essential tools for data analytics. Established methods such as text mining, clustering, and classification continue to deliver reliable and efficient results.
Text mining, for instance, remains a highly reliable technique for extracting insights from unstructured data. Its robustness comes from its ability to handle large volumes of text efficiently, uncovering valuable patterns, trends, and sentiments that inform content strategies and audience engagement. This long-established method has proven its effectiveness over time and continues to deliver reliable results.
Additionally, traditional methods are resource-efficient: they require less infrastructure and energy than newer technologies, which makes them the more environmentally sustainable option.
Ensuring high data quality is critical for implementing data-informed strategies. Poor data quality can lead to incorrect decisions. Practical steps to address this issue include implementing alerts for data anomalies and fostering responsibility among departments to maintain data quality.
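An alert for data anomalies can start out very simply. The sketch below flags a daily figure that deviates strongly from recent history using a z-score. The threshold of three standard deviations and the sample values are assumptions; production alerts would usually also account for seasonality and trends.

```python
from statistics import mean, stdev

def is_anomaly(history, value, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean.

    `history` needs at least two observations for the standard deviation.
    """
    centre = mean(history)
    spread = stdev(history)
    if spread == 0:
        return value != centre  # any change from a constant series is suspicious
    return abs(value - centre) / spread > threshold

# Hypothetical daily page views for one section over the past week.
daily_page_views = [1000, 1020, 980, 1010, 990, 1005, 995]

print(is_anomaly(daily_page_views, 400))   # a sudden drop, e.g. broken tracking
print(is_anomaly(daily_page_views, 1012))  # ordinary day-to-day variation
```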
Promoting data culture and literacy within departments is also essential, enabling individuals to understand and utilise data effectively, transforming insights into actionable strategies. A collaborative environment enhances internal processes. It’s important to involve stakeholders in understanding the complexity and value of different analytical approaches. Transparency about the time and resources required for various types of analyses helps set realistic expectations and achieve better outcomes.
REFERENCES AND RECOMMENDED LITERATURE
Cao, Longbing: Data Science Thinking. Cham [Springer] 2018.
Cherubini, F.: Editorial Analytics: How News Media Are Developing and Using Audience Data and Metrics – Reuters Institute Digital News Report. 2020.
Rozeva, A.; Zerkova, S.: Assessing semantic similarity of texts – Methods and algorithms. In: Pasheva, V.; Popivanov, N.; Venkov, G. (eds.), 2017.
Shi-Nash, Amy; Hardoon, David R.: Data Analytics and Predictive Analytics in the Era of Big Data. In: Geng, Hwaiyu (ed.): Internet of Things and Data Analytics Handbook. Hoboken [John Wiley & Sons] 2017.