Some thoughts on Artificial Intelligence – Part 1: It works.

In a recent issue (10/2018, Vol. 61, No. 10) of Communications of the ACM there is a very interesting article by Adnan Darwiche titled “Human-Level Intelligence or Animal-Like Abilities?”.

Adnan takes a balanced view of current developments in AI and its related fields, and in many respects challenges much of the AI hype. A lot of the discussion, especially the parts that compare function-based and model-based approaches, echoes my experience with data-driven research over the past ten years.

One of the main research challenges I faced many years ago was modelling the perceptual quality of video content, so that any quality assessment (e.g., Netflix wanting to know what level of QoS it delivers at the user side) can be done by a “golden-eye objective model” without human intervention. The workflow is simple: build lots of test videos “corrupted” by network impairments -> run subjective experiments to collect user opinions on the videos -> build an objective model by correlating user opinions and impairments -> optimise the model.

Function-based machine learning such as neural networks was an option for data modelling, but it was not considered a good choice back then. Why? People didn’t feel comfortable championing something that is very difficult to reason about. While function-based learning is extremely powerful in achieving good performance, you end up with tens of parameters going in different directions, and there is usually no logical connection between the parameters and the theory that underpins your research. Nor do you really need a ton of domain knowledge to get a decent model, as long as sufficient data are available. So using this type of learning felt like cheating and was not considered “proper research”. What makes things worse is that you won’t be able to generalise your findings beyond the scope of the experimental dataset. I ended up following the “convention”: build a theoretical framework (based on the psychology, anatomy and physiology of the human visual system, and on video compression standards), then use logistic regression (and some other model-based learning tools) to complete a model. The model performance was more than acceptable, and the design was backed by discussions of the logic, necessity and data range of every parameter.
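To make the “correlate user opinions and impairments” step concrete, here is a minimal sketch of a one-variable logistic regression fitted by plain gradient descent. It is not the model from my research: the single impairment (packet loss), the binary “acceptable” rating and all the numbers are made up purely for illustration, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. (w * x + b)
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Made-up illustrative data (NOT real experimental results):
# packet loss in percent, and whether a viewer rated the clip acceptable (1) or not (0).
packet_loss = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0]
acceptable = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

w, b = fit_logistic(packet_loss, acceptable)

def predict_acceptable(loss_percent):
    """Estimated probability that a viewer finds the clip acceptable."""
    return sigmoid(w * loss_percent + b)

print(f"w = {w:.3f}, b = {b:.3f}")
print(f"P(acceptable | 1% loss) = {predict_acceptable(1.0):.2f}")
print(f"P(acceptable | 6% loss) = {predict_acceptable(6.0):.2f}")
```

The appeal of this kind of model is that each parameter can be discussed: a negative w says that more packet loss lowers the chance of an acceptable rating, and b sets where the tipping point sits.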

Soon after that, the world of AI changed drastically. With the increasing availability of data and computing resources, and very smart tweaks such as stochastic gradient descent for fitting functions, AI researchers have shown that if the tasks are sufficiently specific (or localised), we can use (networks of) complex functions to fit the data and complete low-level cognitive tasks with little need to model the human reasoning process. The seemingly primitive learning approach (that we looked down on) is winning by brute force. At the end of the day, it’s the result that matters, especially in the eyes of the general public. If the training dataset is large and broad enough, it is very likely that a prediction can be made with the support of ground truth close by.

On top of that, applications, especially interactive ones, can be programmed to continuously learn and improve from user inputs. So the more we use them, the more data they get from us and the better they’ll get. We just need to accept that the logic employed by these applications to conduct a task may not be the same logic used by humans. This is particularly obvious when human perception and emotion are involved. My five-year-old boy wants a red bike because he thinks the red ones go faster. His decision might have been influenced by the data he used for model training: Lightning McQueen is red and fast, Blaze the monster truck is red and fast, Ferraris are red and fast, etc. A function-based model would make the same “mistake”, and some more data on fast blue vehicles or slow red vehicles would “fix” it. But that won’t fix it for my boy. He is well aware that colour is not 100% correlated with speed (all sorts of new-gens are faster than Lightning in Disney’s Cars 3). For him (from the human’s perspective) red is a positive expression/statement associated with a vehicle. In this particular context, the absolute speed/acceleration doesn’t matter; it’s the sensation that counts. The ability to reason abstractly (also known as fluid intelligence) is often what separates high-level intelligence from a static knowledge base.
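The red-bike “mistake” can be sketched with the simplest possible data-driven estimator: a relative frequency of “fast” per colour. This is only a toy stand-in for a function-based model, and the observation lists below are invented for illustration.

```python
from collections import Counter

def p_fast_given_colour(observations, colour):
    """Estimate P(fast | colour) as a relative frequency over (colour, is_fast) pairs."""
    counts = Counter(is_fast for c, is_fast in observations if c == colour)
    total = counts[True] + counts[False]
    if total == 0:
        return None  # no evidence for this colour at all
    return counts[True] / total

# Invented "training data": every red vehicle seen so far happens to be fast.
training = [
    ("red", True),    # Lightning McQueen
    ("red", True),    # Blaze the monster truck
    ("red", True),    # a Ferrari
    ("blue", False),  # hypothetical slow blue vehicle
]
before = p_fast_given_colour(training, "red")

# Add a few counterexamples: slow red vehicles and fast blue ones.
more_data = training + [
    ("red", False),
    ("red", False),
    ("blue", True),
    ("blue", True),
]
after_red = p_fast_given_colour(more_data, "red")
after_blue = p_fast_given_colour(more_data, "blue")

print(before, after_red, after_blue)
```

With the skewed data the estimate for red is 1.0; after the counterexamples it drops to 0.6, slightly below the estimate for blue. The “belief” moves with the data and nothing else, which is exactly what does not happen with my boy.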

This leads to the question: is it appropriate to call function-based learning Artificial Intelligence when there is little intelligence in it? As is pointed out in the article, “The vision systems of the eagle and the snake outperform everything that we can make in the laboratory, but snakes and eagles cannot build an eyeglass or a telescope or a microscope”. Just because a machine can deliver the same or better results in a task compared to a human, shall we call it intelligence? Compared with other types of AI, function-based learning is in principle less intelligent, but it certainly dominates the discussion in AI-related work due to its reasonable performance in classification tasks (which underpin many useful applications). Does the level of intelligence really matter when the job is done right? Or should the definition of intelligence be dictated by how animals learn? One way for AI and humans to coexist is intelligence amplification (IA), where each side complements the other in decision making. However, the idea is based on the hypothesis that machines are smart enough to do things very well but at the same time stupid enough not to understand what they are doing. If machines become capable of both model-based and function-based learning, why would they still need us?

(to be continued)

Interview on VR cognitive load in Education

Recently Yoana Slavova and I were interviewed by 3Rs IMMERSIVE, a research and consultancy company, on the use of VR in education. We shared our experiences from previous experiments and an outlook for future research.

The interview can be found at: https://3rsimmersive.com/designing-vr-for-cognitive-load/

Some excerpts from our discussion:

1) What were you trying to achieve through your research?

Over the years we have worked with numerous primary and secondary schools on VR trials that aimed to improve student engagement. The novelty factor of VR can undoubtedly contribute to better student attention in classrooms. As university educators, we wanted to know whether VR-assisted learning can reach beyond the initial “WOW effect” and improve knowledge acquisition in comparison with conventional learning using lecture notes.

2) Your paper showed that ‘students are less likely to recall or recognise specific details due to the additional cognitive load in VR’ – why do you think this is? Can you elaborate a little bit on this?

We think the cognitive load can be attributed to the use of new technology and also to how the media content is developed. Several students who claimed in our research interview that they “learned a lot” from VR content struggled to recall details such as the year and location of key historical events in comparison with students who studied using just lecture notes. This indicates that students might have been overwhelmed by the VR environment and the dynamics of the content. Cognitive load is not necessarily a negative factor in education. The more attention we pay to something, the more likely it is to be remembered. The challenge is to allow learners to focus on the details that are essential to learning.

3) Did you put in place any measures to lower cognitive load beforehand, i.e. allowing students time to become familiar with the device or making adjustments to the design of the content within VR?

Participants were given general guidance on the controls and on how to navigate through the content. We expected university students to pick up the technologies quickly. We plan to carry out more studies on how to better measure the cognitive load in VR and its impact on learning.

4) Do you have any advice for VR designers as a result of your research?

There are useful design principles we can borrow from computer game design to build better VR content for education. We really need to think about different aspects including storyline, cameras, level setup, visual overlay, user controls, and multiplayer, while trying to avoid overwhelming the audience during their learning tasks. VR in education also deserves a new set of design rules. For instance, our research shows that text content still has its unique role in learning, so we can work on how to augment rich VR content with simple and legible text as a pathway to improving learning outcomes.

5) What sort of content is VR best used to teach in your opinion?

The teaching of subjects like medical sciences, geography and arts can certainly benefit from the use of VR. However, I wouldn’t be surprised to see creative use of VR in any subject area once VR development tools and resources become more accessible to educators.

6) Are you doing any further work in this field?

Yes. We have a few VR-related projects in our team. We have been working with a secondary school on coaching sixth form students (between 16 and 18 years of age) to develop VR courseware for their younger counterparts. Understanding users’ attention is also a key area in VR research. One of our postgraduate students is experimenting with VR eye-tracking solutions in an attempt to develop content that can react to viewers’ attention and emotion. In education, this could mean a tailored experience for each student’s needs and capabilities.