This year, AdTech Holding attended ML Conference Serbia as a contributor. Two experts took the stage to share their in-house AI systems that are already delivering measurable business impact.
No futuristic roadmaps or theoretical concepts: just real systems, real numbers, and honest lessons learned.
Sergey Kozlov, Head of RNA (Rotation & Ad Ranking) at AdTech Holding, presented ‘Automating Ad Operations: Where to Use LLMs, and Where to Keep the Code’. This is a story about removing a critical bottleneck between the speed of algorithms and the speed of human operations.
AdTech Holding’s PropellerAds, a multichannel advertising network, serves more than 2 billion ad impressions daily, manages 40,000 active campaigns, and operates across 195+ geographies.
Many clients use PropellerAds SSP to launch campaigns on their own. But for managed-service clients, the process is different: they send their campaign requirements to an account manager, who then coordinates the launch with other teams. At that scale, getting such campaigns live was taking too much time and effort.
Turning a client brief into a live campaign could take up to five hours and six handoffs between teams, with four specialists spending much of that time on repetitive manual work.
The biggest challenge was the input. Campaign briefs arrived in Jira as unstructured text, often mixing Russian and English, using internal industry slang, and containing implicit requirements.
For example, briefs contained abbreviations such as ‘SCPM’ (SmartCPM) and ‘push’ (push notifications), as well as vague instructions like ‘Android is mandatory, iOS by exception’.
Regular expressions couldn’t handle these cases well, so the system needed to understand the meaning of the text.
The solution was a six-stage pipeline orchestrated by an Airflow DAG. The key architectural decision was identifying exactly where an LLM should and should not be used. In this design, the LLM is responsible for a single task: extracting structured information from unstructured text.
So, the pipeline works like this:
The LLM is responsible for:
The code is responsible for:
The team initially tried including large reference lists directly in prompts, but this quickly became expensive and reduced accuracy. Instead, the LLM only identifies likely matches, while the final selection is handled by code.
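The split described above, where the model proposes a likely value and code makes the final selection, can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: the `AD_FORMATS` list, the `resolve` helper, and the 0.6 cutoff are all hypothetical, and Python's standard `difflib` stands in for whatever matcher is used in production.

```python
from difflib import get_close_matches
from typing import List, Optional

# Hypothetical reference list; a real one would come from an internal catalogue.
AD_FORMATS = ["Push Notifications", "Onclick Popunder", "Interstitial", "In-Page Push"]


def resolve(candidate: str, reference: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Map an LLM-suggested value to a canonical entry, or None if nothing is close."""
    # Compare case-insensitively, but return the canonical spelling.
    lowered = {item.lower(): item for item in reference}
    matches = get_close_matches(candidate.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


if __name__ == "__main__":
    # The LLM output is simulated here; only the deterministic lookup is shown.
    print(resolve("push notification", AD_FORMATS))
    print(resolve("zzzz", AD_FORMATS))
```

Because the reference list lives in code rather than in the prompt, it can grow without increasing token cost, and a `None` result gives the pipeline an explicit signal that a value could not be resolved.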
Another important principle was keeping humans in control. The system never launches campaigns blindly. If the model is confident and the campaign meets predefined business rules, it can be launched automatically. However, campaigns from new advertisers, those with large budgets, or those in highly regulated verticals are always sent to Draft status for manual review by the AdOps team.
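One way to picture this routing rule is as a small deterministic function that runs after extraction. The sketch below is an assumption-laden illustration: the field names, the 0.9 confidence threshold, the budget limit, and the list of regulated verticals are invented for the example and are not taken from the talk.

```python
from dataclasses import dataclass

# All three values below are hypothetical placeholders.
CONFIDENCE_THRESHOLD = 0.9
LARGE_BUDGET_USD = 10_000
REGULATED_VERTICALS = {"finance", "gambling", "pharma"}


@dataclass
class ParsedCampaign:
    advertiser_is_new: bool
    budget_usd: float
    vertical: str
    model_confidence: float


def route(campaign: ParsedCampaign) -> str:
    """Return 'draft' when any review condition holds, otherwise 'auto_launch'."""
    needs_review = (
        campaign.advertiser_is_new
        or campaign.budget_usd >= LARGE_BUDGET_USD
        or campaign.vertical.lower() in REGULATED_VERTICALS
        or campaign.model_confidence < CONFIDENCE_THRESHOLD
    )
    return "draft" if needs_review else "auto_launch"


if __name__ == "__main__":
    print(route(ParsedCampaign(False, 500.0, "ecommerce", 0.97)))
    print(route(ParsedCampaign(True, 500.0, "ecommerce", 0.97)))
```

Keeping these rules in plain code means the review policy can be audited and changed without touching the prompt or the model.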
Today, roughly 70% of campaigns follow the fully automated path, while the remaining 30% are intentionally routed for review.
‘What stood out most during the Q&A was how many teams are facing the same challenge: deciding where AI should stop and traditional software engineering should begin. It’s tempting to push more responsibilities into the model, but our experience shows that the best results come from clearly separating interpretation from execution. If I had ten more minutes, I would have gone deeper into how we evaluate confidence thresholds and design safety mechanisms for production systems,’ Sergey said.
Ekaterina Orlova, Middle Data Scientist at AdTech Holding, presented ‘Smart BI Platform: Building an AI Detective for Performance Anomalies’, a practical look at how the team reduced investigation time from 30 minutes to just a few minutes through AI-powered diagnostics.
Imagine opening a dashboard and seeing an alert:
Advertiser revenue dropped 40% compared to yesterday.
What happens next? An analyst starts investigating:
One investigation can easily consume 30 minutes, and there may be 1,000 such signals every day across the network. Manual investigations covered only about 5% of them, while the remaining 95% often went unexplored.
The Smart BI platform is built around two specialized agents that share a common tool layer, including Stats API, CRM Audit, SQL, charting services, Jira, and Slack integrations.
The ReAct agent handles open-ended analytical questions such as:
‘Show the top five advertisers by revenue growth in the US yesterday.’
It works iteratively, selecting tools, reviewing results, and continuing its reasoning process. This makes it effective for exploratory analysis.
The Detective agent operates autonomously.
Triggered by an anomaly signal, it receives structured input such as:
{ advertiser_id, day, metric }
and executes a predefined investigation workflow.
The system performs nine checks in parallel (seasonality, campaign pauses, targeting changes, bid modifications, budget limits, and several additional diagnostics), followed by sequential analysis of zones, auctions, and competitive activity.
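The shape of that workflow, independent checks fanned out in parallel and then a fixed sequence of deeper steps, might look something like the sketch below. Everything here is a placeholder: `make_stub`, `collect_evidence`, and the result fields are hypothetical, only the five checks named in the talk are listed, and a thread pool is just one plausible way to run them concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Signal = Dict[str, object]
Finding = Dict[str, object]


def make_stub(name: str) -> Callable[[Signal], Finding]:
    """Build a placeholder step; a real one would query stats or audit data."""
    def step(signal: Signal) -> Finding:
        return {"step": name, "advertiser_id": signal["advertiser_id"], "suspicious": False}
    return step


# Only the checks named in the talk; the remaining diagnostics are omitted.
PARALLEL_CHECKS = [make_stub(n) for n in (
    "seasonality", "campaign_pauses", "targeting_changes",
    "bid_modifications", "budget_limits",
)]
SEQUENTIAL_STEPS = [make_stub(n) for n in ("zones", "auctions", "competitive_activity")]


def collect_evidence(signal: Signal) -> List[Finding]:
    """Run independent checks concurrently, then the ordered steps one by one."""
    with ThreadPoolExecutor(max_workers=len(PARALLEL_CHECKS)) as pool:
        # pool.map returns results in the same order as PARALLEL_CHECKS.
        evidence = list(pool.map(lambda check: check(signal), PARALLEL_CHECKS))
    for step in SEQUENTIAL_STEPS:
        evidence.append(step(signal))
    return evidence


if __name__ == "__main__":
    signal = {"advertiser_id": 42, "day": "2024-01-01", "metric": "revenue"}
    for finding in collect_evidence(signal):
        print(finding)
```

The function returns plain structured findings, so no model call is needed while evidence is being gathered.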
Only after all evidence has been collected does the LLM get involved to synthesize the final report.
Example output:
Ekaterina openly shared the challenges the team encountered while building the platform:
Solving technical challenges was only part of the job. The team also had to answer a more important question: how do you get managers to trust an AI system that investigates revenue anomalies and recommends actions?
According to Ekaterina, trust depends on two factors.
First, every investigation is fully logged. Teams can see which tools were used, which datasets were created, how many tokens were consumed, and how much each analysis cost.
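A log entry of this kind could be modelled as a small record that accumulates tool calls, datasets, and token usage. The sketch below is illustrative only: the `InvestigationLog` class, its field names, and the per-1,000-token price passed to `cost_usd` are assumptions, not details from the platform.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InvestigationLog:
    """Hypothetical audit record for one anomaly investigation."""
    advertiser_id: int
    tools_used: List[str] = field(default_factory=list)
    datasets_created: List[str] = field(default_factory=list)
    tokens_used: int = 0

    def record_tool(self, tool: str, dataset: Optional[str] = None) -> None:
        """Note a tool call and, if it produced one, the resulting dataset."""
        self.tools_used.append(tool)
        if dataset is not None:
            self.datasets_created.append(dataset)

    def record_tokens(self, tokens: int) -> None:
        """Add tokens consumed by an LLM call to the running total."""
        self.tokens_used += tokens

    def cost_usd(self, price_per_1k_tokens: float) -> float:
        """Estimate cost from total tokens and an assumed flat price."""
        return self.tokens_used / 1000 * price_per_1k_tokens


if __name__ == "__main__":
    log = InvestigationLog(advertiser_id=42)
    log.record_tool("stats_api", dataset="revenue_by_zone")
    log.record_tool("crm_audit")
    log.record_tokens(2000)
    print(log.tools_used, log.datasets_created, log.cost_usd(0.5))
```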
Second, managers provide direct feedback on investigation quality. Their ratings feed evaluation datasets, while comments become part of the knowledge base used to improve future investigations.
| Metric | Before | After |
| --- | --- | --- |
| Investigation time | 30 minutes (manual) | 1–3 minutes |
| Signals processed per day | ~15–20 | Up to 1,000 |
| Coverage | ~5% | Nearly 100% |
| Manager’s role | Collecting data across dashboards | Reviewing ranked causes and making decisions |
With investigation time reduced from 30 minutes to just a few minutes, managers can now focus on making decisions instead of spending their mornings collecting data from different dashboards.
For Ekaterina, Smart BI is not just a conference topic, but one of her main projects this year. That made ML Conference Serbia a good place to share what the team has already learned while building an AI agent for analytics and anomaly investigation.
The topic turned out to be very relevant for the audience. Ekaterina says she received more questions than expected, mostly about the challenges the team faced and how they solved them. She also noted that the AI landscape is changing very quickly. Looking back, she would have liked to spend more time on plans and new ideas, because even during the Smart BI project, many AI tools and approaches had already evolved.
One idea from another conference talk especially resonated with her: companies should use AI to create more value, not just to cut costs.
The goal shouldn’t be to replace people, but to help them achieve more with new tools and new skills.
Ekaterina also highlighted one important lesson from working with AI inside an AdTech company. Modern models are good at general tasks such as coding, content generation, search, and summarization. But they do not know the company’s internal context, terminology, or past cases.
That gap is domain-specific company knowledge, which is why internal knowledge bases and good documentation are becoming so important.
Beyond the talks, the conference was also a great opportunity to meet people from different industries, discuss common problems, and exchange practical ideas.
Although the two presentations covered different systems, they shared a common challenge and a common answer.
Both teams found that modern models perform well at general reasoning. However, they have no access to the company’s internal context: its terminology, its history, its business rules. Sergey’s pipeline ran into this when campaign briefs arrived full of internal shorthand that no base model could parse. Ekaterina’s Detective hit the same wall with 15 years of AdTech jargon that foundation models had simply never seen.
Both teams solved it the same way: not by stuffing more context into prompts, but by building external structures the model can reach when it needs domain knowledge: knowledge bases, glossaries, fuzzy matchers, and semantic search over past investigations.
This is probably the most important lesson from both talks: the model is the reasoning engine, but the company’s knowledge has to live somewhere it can actually reach.
At AdTech Holding, we continue to share what works in production – including the successes, the failures, and the lessons learned along the way – because practical experience is ultimately more valuable than hype.