I wasn’t expecting that

AI. Or rather, LLMs and the current state of machine learning and GPTs.

In 2009 I started a collaboration with the Sri Lanka Institute of Information Technology to fund research into an AI chatbot capable of answering initial customer technical support questions. Exetel funded the Chair of AI Studies, to which Dr. Mahima Weerasinghe was appointed. We used the years of collected helpdesk resolutions as our primary training data.

Prior to that, we had been running our own in-house development using the AIML framework developed by Richard Wallace. As a three-time Loebner Prize winner, it was pretty much state of the art for the day. I started the project, but it quickly outpaced my limited coding skills, and so I recruited a brilliant young intern, Vinna, to take over. She got it to a stage where, while we could see the potential, it was clear it needed a whole lot more dedicated resources to make it commercially viable.

I had already worked with the SLIIT to get their best and brightest students on paid internship and scholarship programs, so they were the natural choice to sound out the idea of a commercial collaboration. I ran it past Prof. Kapurubandara, our primary contact at the SLIIT, then on her recommendation to the Vice Chancellor Prof. Gamage, and then to the Chancellor Prof. Ratnayake, and all four of us worked out the practical details in what I recall to be a pleasant afternoon at a nearby country club.

That program ran from 2009 to the end of 2012. Dr. Weerasinghe was (still is, no doubt!) a brilliant researcher and made great strides in adapting machine learning and language processing for customer-facing commercial use. Yet it never quite got there. We were missing something.

Then, in 2022, ChatGPT burst onto the scene, stunning the world. We might not have been at the leading edge in 2012, but we knew what research was being done, and our model was at least as good as, say, Siri at the time. Better, in fact, for our specific purpose. But ChatGPT was just so far beyond anything I had envisaged as possible. I felt I had been caught flat-footed and blindsided at the same time.

After some frantic “catch-up” research, it dawned on me what that “something” we had missed was: backpropagation. There was nothing new about it; the method had been in use since at least 1982. But with no real progress in the meantime, its application had remained mostly academic and esoteric. And even if we had considered it, the processing power was not realistically available, and would have been way, way out of our budget.
