Cybersecurity Needs a New Data Architecture

By Alex Thaman, CTO at Andesite

Enterprise organizations are dealing with an unprecedented volume of increasingly dense and complex data. SecOps teams must determine the best way to collect, organize, and use that data so they can identify, prioritize, and respond to threats efficiently and effectively. 

The lack of data management solutions that are both scalable and cost-effective often leads to a trade-off between visibility, latency, and costs. To optimize data architecture for SecOps, organizations need to re-think their approach to data storage, management, and access, and consider moving to a modern, modular stack.

In my conversations with security leaders, many are frustrated with how much their SIEM costs but continue to pay because they don’t see another easily manageable path to reduce risk. However, modern data architectures and AI technology make it possible to break out of this cycle.

The Problem with Legacy Solutions

For cybersecurity, data architecture refers to the underlying framework for how data is collected, managed, and used. Building a robust architecture requires a solid understanding of how data from a multitude of sources will be used for analytics and decision-making; it cannot be optimized in isolation from those needs.

A primary SecOps challenge is the proliferation of products that were developed when data was much less complex. Attacks were also less sophisticated — for example, living-off-the-land techniques involving persistent threats over time only became prevalent a decade ago. Our industry has favored and incentivized point-products with targeted solutions, leading to further data sprawl. But attacks that span long periods of time or involve lateral movement are incredibly difficult to track with simple point solutions that don’t connect the data.

Twenty years ago, the standard way to collect and analyze large volumes of data was through a relational database architecture, often managed by a database administrator (DBA). Lacking the resources to tailor these complex solutions precisely, many organizations opted for SIEMs that could both store data and act as an analyst interface.

SIEMs were initially created to review application data logs. Over time, we started filtering additional types of data through them. But traditional SIEMs are not highly scalable, certainly not in a cost-effective way.

While the SIEM is still widely used, the data realities it was architected for are outdated. Today's SOC needs vastly exceed basic log storage. Continuing to rely on a single, simplified architecture leads to prohibitively high costs, forcing an inevitable trade-off between access to broader insights and the expense of managing that data.

Fear of cost and change management delays migration to better systems, forcing further trade-offs among storage methods, query latency, and volume. The architecture must therefore be tailored to specific use cases, such as low-latency BI dashboards or high-latency bulk data science analysis. Many CISOs choose what data to stream into systems based on cost, which may increase risk. Without the ability to quantify that risk, however, they are essentially flying blind.

The bottom line: while the SIEM may still be the best solution for collecting, organizing, and using some data, especially medium-scale event logs, it falls short of what SecOps needs today.

Building a Layered Architecture

Because distinct bodies of security analysis work are hard to separate, organizations default to an indiscriminate single-data-store model. To overcome scalability and cost issues, we have to separate the data architecture from the tools and analytics using the data, which today tend to be closely coupled.

Modern tech stacks separate the data warehouse from the analytical layer, accounting for who or what analyzes the data: machines, algorithms, or humans. Large organizations are adopting a layered approach, with architecture that federates data across the organization.

Another way to think about this is as a “satellite” model, combining small and large systems with many “satellites” of data orbiting around them. Different data can be filtered into different solutions, depending on what type of analysis you’re performing. 

For example, if triaging a single alert in near real time, you need immediate access to a small amount of data across multiple tools all at once. When looking for attacks spanning a long timeframe, you could sacrifice some analysis speed for data completeness. Perhaps you want to correlate logs from today with emails, asset inventory, or other data points to answer complex “who” questions. This can’t be easily done in one system and may require additional ways to relate all of the data. Yet another challenge might be evaluating alert and resolution patterns to understand how to optimize detection.
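As a rough sketch, the routing decision behind this satellite model can be expressed in a few lines of Python. The store names, event fields, and thresholds below are illustrative assumptions, not any particular product's schema:

```python
# Sketch of "satellite" routing: each event fans out to the stores that
# serve the analyses it needs to support. All names here are hypothetical.

HOT, WARM, COLD = "hot", "warm", "cold"  # low-latency / relational / bulk

def route(event: dict) -> list[str]:
    """Pick destination stores for one event."""
    stores = [COLD]                      # everything lands in cheap bulk storage
    if event.get("severity", 0) >= 7:    # alerts needing near-real-time triage
        stores.append(HOT)
    if event.get("entity_id"):           # context we may correlate later
        stores.append(WARM)              # (emails, assets, identity data)
    return stores

# A high-severity alert tied to a host goes everywhere; routine
# telemetry goes only to bulk storage.
assert route({"severity": 9, "entity_id": "host-42"}) == [COLD, HOT, WARM]
assert route({"severity": 2}) == [COLD]
```

The point is not the specific rules but the separation: the routing policy lives outside any one store, so latency, completeness, and cost can be tuned per use case.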

Supporting all of these functions well requires a layered architecture that lends itself to each kind of analysis. You’ll also want to optimize schemas, aggregations, and other variables as you scale. A modern approach to data architecture includes all of these systems as well as whatever solution connects them, allowing more granular management of data movement and access. This is how organizations can solve for storage costs without trading off query latency or data volumes. 


A Modular Approach

The shift to a more modular framework is a natural progression. With data coming from so many different places, it makes sense to use multiple specialized systems. However, it’s not an easy transition. Even at the largest, most sophisticated companies, designing such complex, multi-layered data systems is demanding — creating a significant security challenge. 

Companies that package a single data platform and sell it as a product lack the flexibility to meet different needs. Solutions that excel at simple data analysis at massive scale are less adept at advanced analysis at reasonable scale and cost. Some products are optimized to scale horizontally by adding machines, while others store and process data more efficiently but offer poor accessibility.

Compounding this complexity is a growing need to analyze data at, or closer to, the edge, in real time, without waiting for log ingestion. Some solutions address this by determining which data is and isn’t worth capturing. By doing more work at the point of data creation, you can bring less data into the central systems for analysis. 
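A minimal sketch of that edge-side decision, with rules and field names that are purely illustrative (not drawn from any specific product):

```python
# Edge filtering sketch: decide at the point of data creation what to
# forward in full, what to summarize, and what never to ship centrally.

def at_edge(record: dict) -> str:
    if record.get("alert"):           # security-relevant events go through whole
        return "forward"
    if record.get("kind") == "netflow":
        return "summarize"            # keep aggregates, not every flow
    return "drop"                     # routine noise never leaves the edge

assert at_edge({"alert": True}) == "forward"
assert at_edge({"kind": "netflow"}) == "summarize"
assert at_edge({"kind": "heartbeat"}) == "drop"
```

Doing this triage at creation time shrinks what the central systems must ingest, store, and query.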

The Path Forward: Modern Data Architecture, A Proactive Approach 

Cybersecurity has so far been poor at asking better questions of its data, falling back on primitive, use-case-dependent analytics like simple rule matching, largely because advanced analytics are difficult to make repeatable at the scale of SOC operations. To overcome these critical challenges, we must focus on using data for better protection and response while shifting from a reactive stance to a proactive, protective one. Both start with better data architecture.

Adopting a modern, modular approach to data architecture with a single security-centric decision layer on top empowers analysts to manage and access data more efficiently and effectively, without prohibitive costs or scalability issues. 

About Alex Thaman

Over a 20+ year career, Alex has been an engineering leader at Microsoft, Unity Software, and Scale AI. At Microsoft, Alex worked on compiler technologies before transitioning to AI. He helped develop Xbox Kinect, HoloLens, and Microsoft’s Speech platform. As Chief Architect and Manager for Computer Vision at Unity Software, he developed and led an engineering and product team that worked to simplify the creation of synthetic data to train and test computer vision models. Alex holds a BS with a double major in Computer Science and Math from Purdue University.

What’s Next for AI-Powered Cybersecurity – Insights From Andesite Leaders and Advisors

As AI-powered cybersecurity redefines our field and geopolitical conflicts and world events reshape the broader landscape, the industry needs to revisit its strategies and rules of engagement.


At Andesite, we are dedicated to arming cybersecurity teams with actionable insights that put humans at the helm, enabling them to make critical decisions and build a sustainable advantage based on prevention rather than reaction. To help you stay one step ahead, we gathered Andesite’s leaders and advisors for their insights on where enterprise security technology is going.


“Investigation timelines for SOC teams that embrace AI SOC tech will accelerate dramatically, shifting the focus from investigation speed to investigation quality.”

— William MacMillan, Chief Product Officer, Andesite


To prepare for what’s next and empower your team to assess risk and make critical decisions, tap into strategic insights from seasoned security experts who’ve served global organizations including the CIA, Microsoft, JP Morgan Chase, CrowdStrike, and AWS. 


Expert insights from security leaders:

  • William MacMillan, Chief Product Officer, Andesite
  • Greg Rattray, Chief Strategy and Risk Officer, Andesite
  • Alex Thaman, Chief Technology Officer, Andesite
  • Merritt Baer, Andesite Advisor; Chief Security Officer, Enkrypt.AI
  • Kris Merritt, Andesite Advisor; Founder & President, Vector8, Inc.



Why AI Won’t Replace Us: The Critical Role of Human Oversight in AI-Driven Workflows

The inevitable follow-up question I receive after telling someone I work with artificial intelligence (AI) is some version of, “So, will AI take my job?” This reaction isn’t surprising. Microsoft’s 2024 Workplace Learning Report shows nearly half of workers worry AI might replace them. But this framing misses a crucial nuance about our relationship with technology: the question isn’t whether AI will replace us but how we can most effectively wield this powerful tool in our work.

By addressing the misconceptions about AI replacing jobs and emphasizing the criticality of human input and oversight in AI-driven workflows, we can shift the conversation from fear to a more productive vision of human-AI collaboration.

Misconception #1: AI Will Outperform Humans in All Tasks

AI systems excel at processing large amounts of data and can help humans perform specific tasks with remarkable speed and accuracy. However, the belief that AI outperforms humans in every domain overlooks its key limitations. While AI is highly effective at pattern recognition, it’s limited by the quality and scope of the data it’s been trained on. Like a fraud detection model that performs well when new cases closely align with the legitimate and fraudulent purchases it’s been trained on but struggles with cases it hasn’t encountered, AI’s capabilities are constrained by the quality, diversity, and completeness of its training data, which is curated by humans.

AI’s disconnect from reality reveals itself when it confronts situations outside its training data. Humans can adapt to new contexts with limited information, drawing on intuition, prior experience, and flexible reasoning. In contrast, AI systems often falter under uncertainty, constrained by statistical patterns rather than conceptual understanding. Many also suffer from temporal rigidity. Trained on fixed snapshots of knowledge, they require human updates to remain current. Take Google’s Bard; it once confidently claimed that the James Webb Space Telescope took the first images of an exoplanet when such images were captured years before the telescope’s launch. This error demonstrates that AI doesn’t know things the way humans do – it predicts them, sometimes incorrectly, based on outdated or misaligned information.

Even powerful tools, like LLMs, lack a true understanding of real-world concepts and relationships. While they can generate coherent text or summarize data, they can’t understand some of the concepts and relationships that humans intuitively grasp. For instance, in cybersecurity, AI can analyze attack patterns based on historical data, but when facing novel threats, it lacks the reflex and intuition that come from years of hands-on experience.

Misconception #2: AI Will Remove the Need for Humans in Decision-Making

AI systems lack any innate moral compass or judgment. Ideas like dignity, justice, and human rights aren’t embedded in their architecture – they’re the product of centuries of philosophical debate, social struggle, and lived experience. That absence makes human oversight non-negotiable. Humans ensure that AI-powered work reflects the values we choose to uphold, not just the patterns we’ve recorded.

Executive decision-making is another area where human judgment remains superior. Business leaders understand what measures make sense at certain junctures based on organizational context, stakeholder needs, and subtle factors like team readiness or financial runway. This requires understanding unwritten rules, past experiences with similar situations, and internal dynamics that AI cannot access. The most effective decisions often integrate quantitative data with qualitative judgment in ways that AI cannot replicate.

Humans also possess creative problem-solving abilities that AI can’t match. While AI primarily recombines patterns from existing data, humans routinely make conceptual leaps that challenge established conventions. Consider Edward Jenner’s development of the smallpox vaccine: his insight didn’t come from structured data but from observing that milkmaids exposed to cowpox didn’t contract smallpox. This lateral thinking – drawing a novel connection from lived, physical experience – sparked a medical revolution. AI might eventually infer such relationships from large datasets, but it lacks the embodied experience and intuitive spark that led Jenner to his discovery. 

Misconception #3: AI Systems Won’t Require Human Oversight

Humans inherently trust other humans more than they trust machines. This comes from our innate understanding of emotional contexts that AI cannot authentically replicate. Humans recognize nuance, respond to emotional cues, and can communicate with genuine empathy. Those capabilities foster trust in ways AI cannot match. 

Accountability is another critical factor. When AI systems make mistakes or cause harm, responsibility ultimately falls to humans. Organizations require clear accountability chains with designated oversight roles and channels for appeals or remediation. People expect that, for decisions impacting their lives, a qualified human will be reviewing the process, ensuring that context, empathy, and moral reasoning are considered. This “human in the loop” approach serves as a critical safeguard against errors and unwittingly unjust outcomes.
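One way to picture such a safeguard is as an escalation gate: automation proceeds only when the model is confident and the stakes are low. This is a minimal sketch; the confidence threshold, impact labels, and decision names are all hypothetical:

```python
# "Human in the loop" gate sketch: low-confidence or high-impact AI
# decisions are routed to a person instead of executing automatically.

def gate(decision: str, confidence: float, impact: str) -> str:
    """Return 'auto' only when the model is confident AND stakes are low."""
    if impact == "high" or confidence < 0.9:
        return "human_review"        # accountability stays with a person
    return "auto"

assert gate("close_alert", 0.97, "low") == "auto"       # routine, confident
assert gate("close_alert", 0.97, "high") == "human_review"  # stakes too high
assert gate("block_host", 0.60, "low") == "human_review"    # not confident
```

The design choice worth noting: high impact overrides confidence entirely, so no level of model certainty bypasses human review for consequential decisions.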

Communities also want their values represented in decision-making processes. Human oversight ensures AI systems respect diverse stakeholder perspectives and operate within accepted ethical frameworks. As AI adoption grows, maintaining human involvement enhances legitimacy and upholds public confidence in AI-assisted decisions. 

The Future of Human-AI Collaboration

While AI won’t replace us anytime soon, it will undoubtedly transform how we work. The most successful organizations will be those that leverage AI as a powerful tool for augmentation rather than replacement. This represents an opportunity for humans to focus on what we do best – creative thinking, relationship building, and meaningful work.

As AI handles more routine tasks, humans can dedicate their energy to higher-order thinking. This productivity multiplier effect is already emerging across industries. Radiologists use AI to pre-screen images and focus on difficult cases, cybersecurity teams deploy AI for data analysis and triage while concentrating on higher-impact activities like proactive prevention and remediation, and content creators use AI for research while applying their perspective and creativity to the final product.

For organizations implementing AI, recommended best practices include:

  • Design AI systems with humans at the center – both as end-users and oversight providers. Ensure clear accountability chains with designated human review roles and appeal processes for AI-generated decisions.
  • Implement robust ethical guardrails, including thorough data privacy protections, a transparent explanation of how AI is used, ongoing bias monitoring, and proportional deployment that matches the level of AI autonomy to the risk involved.
  • Focus on skill transformation rather than replacement. As AI adoption grows, new roles like AI ethics specialists and human-AI collaboration managers will emerge.
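The proportional-deployment idea in the second point can be sketched as a simple mapping from risk tier to allowed autonomy. The tiers, labels, and examples below are illustrative assumptions, not a prescribed policy:

```python
# Proportional deployment sketch: the riskier the use case, the less
# autonomy the AI system is granted there.

AUTONOMY_BY_RISK = {
    "low":    "full_automation",   # e.g. log enrichment
    "medium": "suggest_only",      # e.g. draft incident summaries
    "high":   "human_decides",     # e.g. blocking production systems
}

def allowed_autonomy(risk: str) -> str:
    # Unknown risk tiers default to the most restrictive mode.
    return AUTONOMY_BY_RISK.get(risk, "human_decides")

assert allowed_autonomy("low") == "full_automation"
assert allowed_autonomy("unknown") == "human_decides"
```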

By embracing AI as a tool, we can build a future where technology advances human potential rather than diminishing it. The most powerful outcomes will come not from AI alone, but from the combination of humans and AI working in concert.

About Stephanie Klaskin

Stephanie Klaskin is a data scientist at Andesite, where she evaluates the AI behind the product and works with security teams to translate customer data into better detection and faster response. Before Andesite, she partnered with clients in healthcare, marketing, and finance to solve repeatable problems with practical data science. She holds a M.A. in Quantitative Methods from the University of Texas at Austin and a B.A. in Cognitive Science from Johns Hopkins University.

Human-AI Collaboration Is Key to Secure Government Systems, Andesite CPO William MacMillan Tells GovCast

GovCast interviewed Andesite Chief Product Officer William MacMillan to talk about the role of Human-AI collaboration in national security.

Artificial intelligence powers many cybersecurity applications, and government agencies are increasingly using AI to augment systems in national security and intelligence capacities. The complexities of AI implementation require careful architectural considerations and robust governance frameworks to ensure safe execution.

William MacMillan, former CISO at CIA and current chief product officer at Andesite AI, noted how AI holds tremendous potential to enhance efficiency and accuracy, particularly through “human in the loop” systems that manage vast amounts of data.

MacMillan also talks about the critical role of leadership in establishing international AI standards and the necessity of user training and human-AI collaboration for effective implementation.


AI can help the industry finally get SOC automation right

Andesite’s Chief Product Officer William MacMillan writes about how “despite massive investment in tools and technologies, many SOCs still find themselves overwhelmed by the very chaos they aim to control.”

“Analysts are drowning in data, jumping between disconnected tools, and trying to make sense of endless alerts. The result? An epidemic of burnout among the talented security professionals who are critical to keeping organizations safe.

“This has become particularly acute for state and local government security teams that must protect critical infrastructure and sensitive citizen data with typically smaller budgets and staff than their federal or private-sector counterparts.

“Despite this challenge, today we’re seeing states significantly increase cybersecurity investments, with initiatives like the proposed $88 million Cyber Command in Texas and New York’s enhanced cybersecurity funding for its Joint Security Operations Center.

“The root cause lies in a fundamental misconception about security operations. For decades, we’ve tried to impose rigid structure on inherently unstructured problems. Various products promised to bring order through centralization and automation. Instead, they often added layers of complexity, transforming threat hunting from finding a needle in a haystack to finding the right needle in a stack of needles.”

The Current AI Revolution Will (Finally) Transform Your SOC

Alex Thaman, our Chief Technology Officer, writes about the effects of AI on the cybersecurity stack.

Artificial intelligence (AI) is profoundly transforming cybersecurity, reimagining everything from detection through remediation. While AI’s value across cybersecurity workflows has been inconsistent, recent breakthroughs in machine learning will significantly decrease organizational risk and become necessary for defense operations to keep pace with constantly evolving threats. Modern AI technology requires less specialized data to build capabilities, making it accessible to enterprises of every size and creating a more competitive technology ecosystem.

We have seen AI technology go through four major transitions over the past few decades, all of which have made their way into the cybersecurity ecosystem.


WSJ PRO Venture Capital Newsletter: Too Much of a Good Thing

Good day. The hype around artificial intelligence is creating challenges for businesses selling AI products, says Jack Altman, managing partner of Alt Capital, which just launched an accelerator for business software startups. 

On the one hand, demand is strong. “You are seeing companies ramp revenue very quickly,” he said.

Budgets are opening up for AI from even more conservative industries, such as education, government and healthcare. “They are really ready to buy this stuff,” Altman said.

However, customer commitment may be weak, and it is an open question whether initial contracts will be renewed. “You are dealing with a very strong uptake in experimental budgets and you have to be careful on the other side,” Altman said.