Blog – edparsons.com

There is a joke, or useful analogy, that explains in very simple terms how modern AI systems work at a fundamental level, despite all their complexities and technicalities.

An AI walks into a pub and goes up to the bar. The bartender greets the newcomer and asks what they would like to drink.

“What’s everyone else drinking…”

Good is it not…

An AI, or more specifically an LLM, is a reflection of its training data, and it is looking for the most statistically likely or, in simple terms, the “most common” response to any question you give it: what most people are drinking, in the bar analogy.
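To make the analogy concrete, here is a minimal sketch in Python of the “most common response” idea. The list of orders is made up purely for illustration, and the function name is my own. A real LLM works with probabilities over tokens conditioned on the whole conversation rather than a simple tally, so treat this as a deliberate oversimplification.

```python
from collections import Counter

# Hypothetical orders overheard in one bar (made-up data, for illustration only).
orders = ["Guinness", "Guinness", "lager", "Guinness", "espresso martini", "lager"]


def what_is_everyone_drinking(observed):
    """Return the most frequent order and its share of all observed orders."""
    counts = Counter(observed)
    drink, n = counts.most_common(1)[0]
    return drink, n / len(observed)


drink, share = what_is_everyone_drinking(orders)
print(f"{drink} ({share:.0%} of orders)")
```

Change the list of orders and the answer changes with it, which is really the whole point of the joke.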

“What’s everyone else drinking…”

The reason I bring this up is a trip I made to Dublin last week, and a visit to one or two bars in that fine city.

What is everyone else drinking? Well, in the Temple Bar area of Dublin it is going to be a pint of Guinness, and perhaps that is going to be the case in most of the city.

But how representative is this? The bartender in the joke/analogy is of course the training data used to train our model, so while Guinness has statistical significance in a Temple Bar pub, is that the case for Dublin, or indeed the rest of Ireland?

If we expanded our sample of bartenders to include all of Ireland, Guinness may have less significance. On the other hand, if we focused on some of Dublin’s more upmarket bars, we might find a lot of espresso martinis consumed.

Bars in Dallas, Sydney, or Bangkok are all likely to produce different responses from our imaginary bartender.

The moral of this is clearly that models are very sensitive to their training data and to how representative that data is of the subject of interest. In almost all cases it may not be as representative as we might like, and an important question for the industry is what to do in those circumstances.

How we alter the response (weight) of a system based on a foundation model to take into account the limitations of its data is the real “Question for our Times”. It is also important to remember that sometimes the data is actually an accurate reflection of reality, even if we might not like it.
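One simple way to picture “weighting” is the survey trick of reweighting a skewed sample so that each group counts in proportion to its real size. The sketch below assumes made-up orders and made-up population shares, and the function names are hypothetical. It is not how foundation models are actually adjusted, just an illustration of how much the answer can move once the sample is corrected.

```python
from collections import Counter, defaultdict

# Hypothetical (location, drink) observations: Temple Bar is heavily over-sampled.
sample = (
    [("Temple Bar", "Guinness")] * 8
    + [("Temple Bar", "lager")] * 2
    + [("Elsewhere", "Guinness")] * 1
    + [("Elsewhere", "lager")] * 3
    + [("Elsewhere", "cider")] * 1
)

# Assumed share of all drinkers found in each location (illustrative numbers).
population_share = {"Temple Bar": 0.1, "Elsewhere": 0.9}


def raw_shares(observations):
    """Share of each drink in the sample exactly as collected."""
    counts = Counter(drink for _, drink in observations)
    return {drink: n / len(observations) for drink, n in counts.items()}


def reweighted_shares(observations, target_share):
    """Weight each observation by (target share / sample share) of its location."""
    n = len(observations)
    per_location = Counter(location for location, _ in observations)
    totals = defaultdict(float)
    for location, drink in observations:
        sample_share = per_location[location] / n
        totals[drink] += target_share[location] / sample_share
    total = sum(totals.values())
    return {drink: weight / total for drink, weight in totals.items()}


raw = raw_shares(sample)
adjusted = reweighted_shares(sample, population_share)
print("raw favourite:", max(raw, key=raw.get))
print("reweighted favourite:", max(adjusted, key=adjusted.get))
```

With these invented numbers the raw sample says Guinness and the reweighted one says lager. Of course the correction is only as good as the assumed population shares, which are themselves data we need to understand.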

In AI data is the code

In AI, data is the code, so we need to really understand all aspects of it: not just how representative it is but its provenance, who created it and for what original purpose.

More thinking along these lines to follow…

It was 20 years ago today…

BA002 on finals, returning from New York JFK on the last commercial operation by Concorde on the 24th October 2003, marking the end of the era of supersonic air travel. Filled with celebrities, G-BOAF touched down at 4:05pm, and I was lucky enough to be there amongst the many thousands of Concorde fans to see it and the two previous consecutive Concorde landings.

Today G-BOAF, also the last Concorde ever to fly, is on display at the excellent Aerospace Museum in Bristol, while at Heathrow her sister aircraft G-BOAB resides outside at the British Airways maintenance facility, seemingly unloved and forgotten.

However, all is not lost: if you look at Heathrow Airport on Google Maps you will find that Concorde is the only airliner visible!

Heathrow, the airport without airliners?

This is not the result of a satellite or aerial photograph captured during a very quiet period, or during the COVID lockdown (when of course there were many aircraft parked at the airport). Instead it is the result of image processing and the use of AI techniques to remove moving objects.

As Concorde G-BOAB has not moved in many years, it is the only airliner at Heathrow.
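I do not know the details of the processing used for this imagery, but one classic way to remove moving objects is to take several images of the same spot at different times and keep the median value of each pixel. Anything that only appears occasionally is voted out, and anything that never moves survives. Here is a minimal sketch of that idea, using a tiny made-up “apron” of four pixels and a hypothetical function name.

```python
from statistics import median


def temporal_median(frames):
    """Per-pixel median across a stack of equally sized grayscale frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[row][col] for frame in frames) for col in range(width)]
        for row in range(height)
    ]


# Toy 1x4 strip of apron captured on three passes: 0 = tarmac, 9 = aircraft.
# Pixel 0 holds a permanently parked aircraft; the other aircraft come and go.
frames = [
    [[9, 9, 0, 0]],
    [[9, 0, 9, 0]],
    [[9, 0, 0, 9]],
]

composite = temporal_median(frames)
print(composite)  # [[9, 0, 0, 0]]
```

Only the aircraft that is present in every frame makes it into the composite, which is exactly what happens to a Concorde that has not moved in years.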

A last laugh for Alpha Bravo !