Tuesday, January 30, 2024

Is the Path to AI/ML Commercial Success Curated, Walled-Garden Data Sources?

A lot has been written about hallucinations and the growing number of erroneous results coming from the major AI/LLM systems. Part of the cause is a feedback loop: the more bad data is generated, the more bad data exists on the internet for the major AI systems to "learn" from. A good article on the subject is "Is ChatGPT Getting Worse Over Time?"

This phase of technology development is very different from the origins of the web and search engines, when the search companies dumped everything into their indexing engines and then presented many results for users to sort through. With LLMs the user asks a specific question and expects a correct answer, so the relationship between questions and answers is now 1:1 rather than 1:n. I would argue that this puts a higher burden on the providers of answers.

So how can the benefits of AI/ML be safely realized by businesses and governments? I would posit that, at least until the LLMs that use public data can guarantee higher quality, the answer is to leverage well-curated data sources. A good example of this is Apple working on licensing data from well-known publishers; see "Apple Explores A.I. Deals With News Publishers".

So are the major beneficiaries of this wave of innovation the aggregators of clean, curated data? Perhaps the recent Juniper/HPE deal is an indication of this. One of the rationales for this deal was discussed on a couple of podcasts on Silicon Angle: "Research Analysis: HPE Acquires Juniper" and "The AI evolution in tech: Pioneering smarter decisions, from surgery to security". I would argue that if HPE can integrate its data from compute, storage and corporate wireless into MIST AI, the combined company will be able to offer customers something unique - the ability to manage their whole IT infrastructure through AI/ML that is safe and dependable.

This then raises the question for customers of data aggregators: why should I allow you to collect and use my data? There has to be a strong value proposition for the customer to share their data, and a strong guarantee of anonymization. In the case of MIST AI the benefit is improved IT management through unified management of compute, storage and networking. I imagine HPE hopes that enterprises who buy Juniper will want to add HPE gear, and vice versa, to leverage the single view of the enterprise provided by MIST.

There are many other curated information sources (health care, security, manufacturing, etc.), but I believe that in all these verticals the key to successful deployment of AI/ML solutions is aggregating the data (and getting permission to do it, with the appropriate anonymization). The data also has to be aggregated such that "truth" is maintained and reconciled across all the different sources. The ability of the human brain to do such "voting" on multiple reference frames is discussed by Jeff Hawkins in "A Thousand Brains". Automating this knowledge collection and creating a clean knowledge base in multiple domains is, I think, one of the big challenges we face in making AI/ML successful.
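As a rough illustration of the "voting" idea, the sketch below takes one claimed value per source for a single fact and keeps the value only when a clear majority of sources agree. The source names, the example fact and the `min_agreement` threshold are all hypothetical, and simple majority voting is just one possible way to reconcile sources; it is not a description of how Hawkins says the brain does it.

```python
from collections import Counter
from typing import Optional


def resolve_fact(claims: dict[str, str], min_agreement: float = 0.6) -> Optional[str]:
    """Return the value most sources agree on, or None if agreement is too weak.

    claims maps a (hypothetical) source name to the value it reports.
    min_agreement is an assumed threshold, not a recommended setting.
    """
    if not claims:
        return None
    # Normalize lightly so trivial formatting differences do not split the vote.
    counts = Counter(value.strip().lower() for value in claims.values())
    value, votes = counts.most_common(1)[0]
    if votes / len(claims) >= min_agreement:
        return value
    return None  # ambiguous: better to give no answer than a wrong one


# Hypothetical example: three sources report the firmware version of a device.
claims = {
    "vendor_docs": "4.2.1",
    "support_kb": "4.2.1",
    "forum_post": "4.1.9",
}
answer = resolve_fact(claims)
```

Returning `None` when the sources disagree reflects the 1:1 expectation discussed above: a system that must give a single answer needs a way to say it does not have one.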

So not only is there a need for a lot of data, there is also a need for well-organized data sets with unambiguous facts collected from multiple sources that can be leveraged to give accurate answers to questions that are now 1:1 rather than 1:n. Customers will need to see a benefit before they allow the aggregator to collect the data, and as the value of the data increases there could be some interesting discussions on licensing. When a customer goes to a corporate support site or management tool and asks a question, the expectation is now a single correct answer, not a list of search results from which the user has to decide which is relevant to their problem.


Saturday, January 06, 2024

Cisco, Isovalent Acquisition

Well, it has happened. I have followed Isovalent/Cilium for a while now and have always been impressed by the company and what they have been doing with eBPF. Over the past several months, in conversations with associates, we have discussed Cilium and its future/exit strategy, and the one company I always came back to was Cisco. I thought it also might make sense for one of the pure-play security companies to acquire them, especially Fortinet, as that would help plug their major public cloud gap, particularly since Cilium's Tetragon product is such a good security play. The combination of networking and security that Isovalent has pioneered with eBPF is a good fit for Cisco - they are both plumbers :-).

I see this as a big plus for Cisco if they manage it correctly: they get an instant footprint in most if not all public cloud vendors, where they are not as strong as they could be. They will also have additional opportunities in service providers and mobile, where they are strong and Kubernetes is gathering steam. The other big play for Cisco is the amount of observability data that Cilium/Tetragon makes available. Data is the lifeblood of AI/ML, and the data generated by Cilium/Tetragon is structured, close to real time, and common across cloud vendors. The opportunity for Cisco/Isovalent is to provide near-real-time AI/ML-powered security across clouds and enterprises.

Moving forward I see an opportunity for Cisco/Cilium to expand Tetragon beyond Kubernetes, especially in public cloud, where deploying eBPF infrastructure as part of a standard image would be possible, enabling deeper visibility and security and broadening the security boundaries. Once again, data is the lifeblood of AI/ML, and getting a large footprint in the infrastructure layer gives Cisco and the public cloud vendors the opportunity to provide better and more real-time security.

The only downside of the acquisition I see is that the pace of eBPF/Linux development may slow. Isovalent as a nimble startup could push changes into the kernel with minimal corporate overhead; now that it is part of Cisco, I worry that the focus and speed of eBPF/Linux innovation will suffer. However, time will tell.

So congratulations to the Isovalent team and it will be interesting to see where this goes.

Saturday, February 18, 2006


It has been a long time since I last posted. A combination of things has caused this major case of blogger's block: a lack of interesting things to say or comment on, and working through some career decisions. I did, though, have the chance to build a fairly extensive REST application interface that illustrated (to me at least) the power and simplicity of REST. Designing a well-thought-out REST application is hard, and is very dependent on defining the core resources and their granularity. Once they are defined the interfaces follow naturally.

However, the changes mentioned in the title are not about a life-changing conversion from upper case Web Services to lower case web services. I still believe they both have their place and both could do with some improvements.

The changes are a little more extensive and professional: after several years working in start-ups, I have made the change to work at a somewhat larger company. A few weeks ago I started working at Cisco Systems in the Application-Oriented Networking (AON) group. It is challenging and exciting, as we are working to make the network smarter and hence simplify how applications are developed.

Now that I am back working directly in the web services (both upper and lower case) and distributed computing space - I hope my blogger's block will be cured....

Wednesday, June 22, 2005

Search: Depth, breadth and trust

This post, "Yahoo Search vs. Google and Technorati: Link Counts and Analysis" (by Jeremy Zawodny), prompted some thoughts about how search engines are going from a nice-to-have to a critical part of business and personal lives. With that comes a host of issues, and it makes choosing a search engine based on the number of pages it indexes and how fast it re-indexes them a secondary concern.

The most important metrics are going to become whom I can trust most and who is less prone to manipulation, either by inside advertisers or through external manipulation by sophisticated (and some not so sophisticated) linking schemes. How trust is propagated through a search engine is going to become the key differentiator (IMHO) over the next several years, not the index size or refresh rate.
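To make "propagating trust" a little more concrete, here is a minimal sketch of one possible approach: start with a small set of hand-picked trusted pages and let trust flow outward along links, decaying at each hop. This is loosely in the spirit of seed-based schemes such as TrustRank and is not a description of how any particular search engine works. The page names, the link graph, and the `damping` and `iterations` values are all made up for illustration.

```python
def propagate_trust(links, seeds, damping=0.85, iterations=20):
    """Spread trust from seed pages along outgoing links.

    links maps each page to the list of pages it links to (hypothetical data).
    seeds is the set of pages assumed to be trustworthy up front.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Seed pages share one unit of trust; everything else starts at zero.
    seed_score = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_score)
    for _ in range(iterations):
        # Each round, seeds keep a baseline and pages pass on a damped share.
        nxt = {p: (1 - damping) * seed_score[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Hypothetical graph: a trusted site links to a blog, which links to a shop.
# Two link-farm pages point at a spam page, but nothing trusted points at them.
links = {
    "trusted_news": ["blog"],
    "blog": ["shop"],
    "link_farm_a": ["spam"],
    "link_farm_b": ["spam"],
    "spam": [],
}
trust = propagate_trust(links, seeds={"trusted_news"})
```

In this toy graph the spam page ends up with no trust however many link-farm pages point at it, because none of them is reachable from a seed. That is the property that makes trust, rather than raw link counts or index size, interesting as a metric.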

Tuesday, June 21, 2005

Tools and Artisans

This from Doug Tidwell: "They're afraid that if they open up their business processes, their customers will realize just how little value they add" via Mark Baker.

My feeling is that many organizations confuse having cool tools/software with having people who know how to use the output effectively. The goal of business processes is to deliver information to decision makers. The difference between a great artisan and the rest of us is not the tools they use; it is their native and acquired skills. The same is true in any organization - great people succeed no matter what - good tools merely help make their life easier.

And to Doug's other point - no, The Specials were the coolest ;-)