Gubbi Labs: Using AI to improve story production workflows
Project: Babbler
Newsroom size: 10 - 20
Solution: An AI-powered platform that uses generative AI and language models to simplify research, create science stories, and produce regional-language podcasts, making scientific knowledge more accessible to the public.
Gubbi Labs is an Indian research-based news organisation that has been communicating science since 2014. Their typical day-to-day workflow for writing stories includes perusing research papers from all domains of science, engineering, and humanities, understanding them, and writing news stories based on leads that they generate. The entire routine, including writing a draft, used to take the small team of six anywhere between a week and two weeks to complete.
The problem: Time-consuming manual workflows
The organisation’s prior manual workflow involved significant effort: skimming hundreds of sources to identify newsworthy pieces, followed by a multi-step editing and publishing process.
With the arrival of GPT-2 and, later, ChatGPT on the AI scene, they realised that they could tap into AI to help with initial drafts and improve turnaround times. “Clearly, that is where I guess using LLMs became sort of the go-to solution for us. In the sense, to really have greater and quicker turnaround times that will help also in getting more throughput out,” said HS Sudhira, Director at Gubbi Labs.
This led to creating the AI-based news workflow improvement tool, Babbler.
Building the solution: Roadmap to prototyping
After being selected as a grantee for the Innovation Challenge, the team zeroed in on certain aspects of the workflow to automate using AI. One of the first was to define and create a “newsworthiness index”.
“We have come up with our own matrix to kind of see through them and assign an index to say what could be newsworthy. So now instead of skimming through thousands of papers every month, which from India alone is anywhere around two-and-a-half to three-and-a-half thousand papers, we now know we just have to pick up the top 50 from the newsworthiness index across different topics,” said Sudhira.
Using AI tools in this instance significantly reduced the time and effort involved in picking research papers to read. Where the team used to spend hours, they now spend about an hour a week overall.
The opportunities: An answer that has many applications
Using LLMs to solve the newsworthiness issue also lent itself to other applications within their workflow. For example, they built a pipeline using engineered prompts to generate story drafts and social media collateral.
“Something that we have been very cognisant of is that we have made that at different touchpoints we are bringing in a sort of human-in-the-loop. So our editors will still play a very critical role in basically making sure the outputs are good, reasonable or any changes to be quickly edited,” added Sudhira.
The user interface for the tool is designed to be intuitive, especially for editors.
The dashboard allows editors to view and sort papers by newsworthiness index, institution, or journal, then select a paper and generate a story and social media collateral from pre-derived prompts that can be modified. To suit different kinds of editors, it has two modes: a what-you-see-is-what-you-get mode for immediate revisions, and advanced settings that let experienced users tweak models.
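The dashboard behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (index, institution, journal, title), the function names, and the default prompt wording are all hypothetical, since the article does not disclose Babbler's actual data model or prompts.

```python
from string import Template

# Hypothetical default prompt; the article does not disclose Babbler's prompts.
DEFAULT_STORY_PROMPT = Template(
    "Write a $length-word news story for a general audience based on the paper "
    "titled '$title', published in $journal by researchers at $institution."
)


def sort_papers(papers: list[dict], key: str = "index") -> list[dict]:
    """Sort by newsworthiness index (highest first) or A-Z by institution/journal."""
    if key == "index":
        return sorted(papers, key=lambda p: p["index"], reverse=True)
    if key in ("institution", "journal"):
        return sorted(papers, key=lambda p: p[key].lower())
    raise ValueError(f"unsupported sort key: {key}")


def build_prompt(paper: dict, template: Template = DEFAULT_STORY_PROMPT,
                 length: int = 400) -> str:
    """Fill the editor-modifiable template with the selected paper's details."""
    return template.substitute(
        length=length,
        title=paper["title"],
        journal=paper["journal"],
        institution=paper["institution"],
    )
```

An editor who wants a different tone or length would pass in their own Template rather than change the code, which is one simple way to keep the default prompts modifiable.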
Technology stack
The newsworthiness index was developed by analysing the criteria Google News uses to rank stories, including freshness, authoritativeness, relevance, and context. The team identified eight broad topics of social relevance (including health, ecology, environment, climate change, and technology) to categorise research articles. The index also incorporates a ranking system based on the institution and the impact factor of the journals where papers are published.
“So higher impact factor journals would have a greater weightage for instance or an institution with a higher ranking would have a greater weightage. We arrived at sort of a combination of these weightages and topics and then linked it with Google News’ API to see which are trending. Based on that we derived the newsworthiness index,” said Sudhira.
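A weighted combination of this kind can be sketched as follows. The weights, the normalisation caps, and every field name here are assumptions made for illustration; the article does not publish the actual formula, and the trend score is taken as a ready-made input rather than fetched from any news API.

```python
from dataclasses import dataclass

# Illustrative weights only; Babbler's real weightages are not given in the article.
WEIGHTS = {"impact": 0.3, "institution": 0.3, "topic": 0.2, "trend": 0.2}


@dataclass
class Paper:
    title: str
    impact_factor: float     # journal impact factor
    institution_rank: int    # 1 = highest-ranked institution
    topic_relevance: float   # 0..1, match to a topic of social relevance
    trend_score: float       # 0..1, how strongly the topic is trending in news


def newsworthiness(paper: Paper, max_impact: float = 50.0,
                   max_rank: int = 100) -> float:
    """Combine the four signals into a single score between 0 and 1."""
    impact = min(paper.impact_factor, max_impact) / max_impact
    # Invert the rank so that a better (lower) rank gives a higher score.
    rank = min(max(paper.institution_rank, 1), max_rank)
    institution = 1.0 - (rank - 1) / (max_rank - 1)
    return (
        WEIGHTS["impact"] * impact
        + WEIGHTS["institution"] * institution
        + WEIGHTS["topic"] * paper.topic_relevance
        + WEIGHTS["trend"] * paper.trend_score
    )


def top_papers(papers: list[Paper], n: int = 50) -> list[Paper]:
    """Return the n highest-scoring papers, best first."""
    return sorted(papers, key=newsworthiness, reverse=True)[:n]
```

Calling top_papers on a month's papers with n=50 mirrors the shortlist of 50 that Sudhira describes, although the real index may combine its signals quite differently.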
The backend is written in Python, which handles API requests, newsworthiness calculations, and story generation, with a SQL database storing the journal papers. The frontend is built using React and Material UI.
Team and challenges faced
The team assembled to work on the project included consulting researchers to test the newsworthiness index, and a UI/UX consultant for design feedback. Apart from this, their internal team of editors also took part in reviewing and evaluating the generated content.
During the initial phase, the team faced several challenges.
“Some of the things that we had realised was that when you parse a PDF (to LLMs) it would kind of simply take all the text in and not just the main text that is required because at times in a research paper you have a lot of other information like journal name, page numbers, keywords, institutional affiliation, author names and all of it,” explained Sudhira.
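One common way to reduce this kind of noise is to filter the extracted text before it reaches the model. The sketch below is not the team's method, which the article does not describe. It assumes the PDF has already been converted to one string per page by some extraction tool, and it uses simple heuristics (the function name and thresholds are hypothetical) to drop page numbers, keyword lines, and lines that repeat across pages, such as a running journal name. Author names and affiliations would need further rules.

```python
import re
from collections import Counter


def clean_pages(pages: list[str], repeat_threshold: float = 0.6) -> str:
    """Drop likely boilerplate lines from per-page text and join the rest."""
    page_lines = [
        [ln.strip() for ln in page.splitlines() if ln.strip()] for page in pages
    ]

    # Count on how many pages each distinct line appears.
    seen_on = Counter()
    for lines in page_lines:
        seen_on.update(set(lines))
    # A line counts as a running header/footer if it shows up on enough pages.
    min_pages = max(2, int(repeat_threshold * len(pages)))
    repeated = {ln for ln, count in seen_on.items() if count >= min_pages}

    page_number = re.compile(r"^(page\s+)?\d+(\s+of\s+\d+)?$", re.IGNORECASE)
    keywords = re.compile(r"^key\s?words\s*[:\-]", re.IGNORECASE)

    kept = []
    for lines in page_lines:
        for ln in lines:
            if ln in repeated or page_number.match(ln) or keywords.match(ln):
                continue
            kept.append(ln)
    return "\n".join(kept)
```

Heuristics like these will occasionally remove a legitimate line or let some clutter through, so the cleaned text still benefits from the editorial review described elsewhere in this piece.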
They also faced initial resistance from some members of the newsroom to applying AI, owing to concerns about quality and bias. They addressed this by keeping human-in-the-loop checks as part of the tool.
Lessons for newsrooms
Benefits of time-boxed "sprints": Embrace the use of short product development cycles (like four-week sprints). This approach helps in clearly conceiving the end goal and achieving it through smaller, focused milestones.
Sticking to focus and goals: Using sprints helps the team maintain focus on the immediate task while ensuring that each milestone fits the larger goal of arriving at the final tool.
The team aims to build its revenue and subscription strategy by making Babbler commercially available to academic and other research institutions.
Explore Previous Grantees' Journeys
Find our 2024 Innovation Challenge grantees, their journeys and the outcomes here. This grantmaking programme enabled 35 news organisations around the world to experiment and implement solutions to enhance and improve journalistic systems and processes using AI technologies.
The JournalismAI Innovation Challenge, supported by the Google News Initiative, is organised by the JournalismAI team at Polis, the journalism think-tank at the London School of Economics and Political Science.
