Data science and network management at IEEE IM 2019, Washington, D.C.

IEEE IM 2019 – Washington DC, USA (link to papers)
Following IM 2017 in picturesque Lisbon, one of the most beautiful cities in Europe, this year's event was held in the US capital during its peak cherry blossom season.

The conference adopted the theme of "Intelligent Management for the Next Wave of Cyber and Social Networks". Besides the regular tracks, the five-day conference featured some great tutorials, keynotes and panels. I have pages of notes and many contacts to follow up.

A few highlights are: zero-touch network and service management (and how it's actually "touch less" rather than touchless!); Huawei's Big Packet Protocol (network management via packet header programming); DARPA's off-planet network management (fractionated architectures for satellites); the social, political and regulatory challenges of blockchain (is it even compatible with GDPR?) from UZH; and data science/ML for network management from Google and Orange Labs (with some Python notebooks and a comprehensive survey paper of 500+ references), among many others. I am hoping to write more about some of them in the future when I have a chance to study them further. There are certainly some good topics for student projects.

Since I am linked to both the multimedia/HCI and communication network communities, I have the opportunity to observe the different approaches and challenges these communities face with AI and ML. In the multimedia community, it's relatively easy to acquire large and clean datasets, and there is a high level of tolerance when it comes to "trial and error": 1) no one will get upset if a few out of a hundred image search results are inaccurate, and 2) you can piggyback a training module or reinforcement learning on your services to improve the model. Furthermore, applications are often part of a closed proprietary environment (end-to-end control) and users are not that bothered about giving up their data. In networking, things are not far from "mission impossible". 95% accuracy in packet forwarding will not get you very far, and there is not much infrastructure available to track any data, let alone make any data open for research. Even when there are tools to do so, you are likely to encounter encryption or information buried too deep to extract in practice. Also, tracking network data seems to attract more controversy. We have a long and interesting way to go.

Washington, D.C. is surrounded by some amazing places to visit. George Washington's riverside Mount Vernon is surely worth a trip. Not far from Dulles airport is Great Falls Park, with spectacular waterfalls on the Potomac River, which separates Maryland and Virginia. Further west are the 100-mile scenic Skyline Drive and the Appalachian Trail in Shenandoah National Park.

We are taking VR and art research to ACM TVX 2019

I have been a regular visitor to ACM TVX since it first became an ACM-sponsored event in 2014 (it was previously known as EuroITV). This year, the conference will be held at MediaCityUK, Salford in early June. We'll bring two pieces of early-stage research to Salford: understanding user attention in VR using gaze-controlled games, by Murtada Dohan (a newly started PhD candidate), and a demo of abstract painting in VR by fine art artist Dr Alison Goodyear. You might have guessed that we have plans to bring these two together and experiment with new ways of content creation and audience engagement for both the arts and HCI communities.


Dohan, M. and Mu, M., Understanding User Attention In VR Using Gaze Controlled Games.
Abstract: Understanding user’s intent has a pivotal role in developing immersive and personalised media applications. This paper introduces our recent research and user experiments towards interpreting user attention in virtual reality (VR). We designed a gaze-controlled Unity VR game for this study and implemented additional libraries to bridge raw eye-tracking data with game elements and mechanics. The experimental data show distinctive patterns of fixation spans which are paired with user interviews to help us explore characteristics of user attention.
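For readers curious about the mechanics: fixation spans are commonly derived from raw gaze samples with a dispersion-threshold method along the lines sketched below. This is a minimal illustration with made-up thresholds and data format, not the implementation used in the study.

```python
# Minimal dispersion-threshold (I-DT style) fixation detection sketch.
# Gaze samples are (timestamp_ms, x, y) tuples; thresholds are illustrative.

MAX_DISPERSION = 0.05    # max spread (normalised units) within one fixation
MIN_DURATION_MS = 100    # minimum span length to count as a fixation

def dispersion(window):
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def fixation_spans(samples):
    """Return (start_ms, end_ms) spans where gaze stays within a small area."""
    spans, i = [], 0
    while i < len(samples):
        j = i + 1
        # grow the window while the gaze points stay tightly clustered
        while j < len(samples) and dispersion(samples[i:j + 1]) <= MAX_DISPERSION:
            j += 1
        start, end = samples[i][0], samples[j - 1][0]
        if end - start >= MIN_DURATION_MS:
            spans.append((start, end))
            i = j
        else:
            i += 1
    return spans
```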

Goodyear, A. and Mu, M., Abstract Painting Practice: Expanding in a Virtual World
Abstract: This paper sets out to describe, through a demo for the TVX Conference, how virtual reality (VR) painting software is beginning to open up as a new medium for visual artists working in the field of abstract painting. The demo achieves this by describing how an artist who usually makes abstract paintings with paint and canvas in a studio, that is those existing as physical objects in the world, encounters and negotiates the process of making abstract paintings in VR using Tilt Brush software and Head-Mounted Displays (HMD). This paper also indicates potential future avenues for content creation in this emerging field and what this might mean not only for the artist and the viewer, but for art institutions trying to provide effective methods of delivery for innovative content in order to develop and grow new audiences.

Copyright belongs to Dr Alison Goodyear

Speaking at Westminster HE Forum – Technologies in higher education

I had the great pleasure of joining a Westminster Higher Education Forum event today as a speaker. My session was chaired by the Labour MP Alex Sobel, and the main theme of the session was the opportunities and challenges in adopting new technologies in colleges and universities in the UK. The venue was packed with 100+ delegates from over 60 institutions and businesses across England. I spoke about our research findings on the use of VR in education and shared my views on how technologies can empower human educators in Education 4.0. The following are my notes. The official transcripts from all speakers will be available on the Westminster Forum website.


Virtual reality in its early days was mainly used for industrial simulation and military training, in controlled environments using specialised equipment. As the technology became more accessible, we started to see more use of VR in gaming and education. In education, VR is mostly used as a stimulus to enhance student engagement and the learning experience. It helps visual learners, breaks down barriers, and can visualise things that are hard to imagine. So we are mostly capitalising on the indirect benefits.

My research group is interested in whether such stimuli can truly improve learning outcomes and how, so that we know how to improve the technology or use it more appropriately. We conducted an experiment with two groups of university students to compare how well they learn hard sciences using VR materials versus PowerPoint slides. Their performance was measured using a short exam paired with interviews. The results suggest that the majority of students prefer learning in VR, but there is no significant difference between the two in average scores. Recent research from Cornell University shows a similar finding. However, when we looked at the breakdown of scores on individual questions, we discovered that students who studied via VR did very well on questions involving visual information recognition but struggled to recall numerical and textual details such as the year and location of an event. We think this is due to how information is presented and the extra cognitive load in VR. So VR makes some things better but others worse; it's a double-edged sword.
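As a side note, the "no significant difference on average scores" claim rests on the kind of between-group comparison sketched below. The scores here are invented for illustration; they are not our experimental data.

```python
# Sketch of a between-group comparison of exam scores; numbers are made up.
from scipy import stats

vr_scores = [62, 71, 58, 66, 74, 69, 60, 73]    # exam scores, VR group
ppt_scores = [64, 68, 61, 70, 72, 65, 63, 69]   # exam scores, slides group

t_stat, p_value = stats.ttest_ind(vr_scores, ppt_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# p > 0.05 would indicate no statistically significant difference in means,
# which is what we observed on overall scores.
```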

This does not mean that VR is a waste of money. We need more work to learn how to use the tool better. This means two things. One, we need VR to be more accessible: not only in cost but, more importantly, through easy-to-use design tools and open libraries that help the average lecturer embrace the technology. Two, we need appropriate metrics and measurement tools to assess the actual impact of new technologies, and we need to share that experience with the community.

Furthermore, we need to keep an eye on what roles VR should take in education. One thing we can learn from the past is PowerPoint in education. (PowerPoint was invented in the 1980s, acquired by Microsoft, and went on to become one of the most commonly used tools in business and education. It has drastically changed how teaching is done in classrooms.) PowerPoint was meant to augment a human presenter, but it has become the main delivery vehicle in the classroom, while lecturers act as the operators or narrators of slides. Many conclude that PowerPoint has not empowered academia. (Some institutions have banned teachers from using PowerPoint. According to the New York Times, similar decisions were made in the US Armed Forces, where it was regarded as a poor tool for decision-making.) Many institutions, including the University of Northampton, are moving away from pure slideshows towards active and blended learning, using data science and smart-campus technologies to support hands-on, experimental and interactive learning. So we can certainly learn from the past when we approach VR and other new technologies.

Another important aspect is the human factor. At the end of the day, only human educators are accountable for the teaching process. We listen to what learners say, observe their emotions, sympathise with their personal issues, and reason with them over every decision we make while trying to be as fair as possible. My team is working on many computer science research topics related to human factors, such as interpretable machine learning and understanding human intent. However, new technologies such as VR and AI should be designed and integrated to empower human educators rather than replace us.


Top 1% of reviewers in Computer Science?

https://publons.com/awards/2018/esi/?name=Mu%20Mu&esi=13

Like many academics, I regularly engage with the reviewing process of renowned conferences and journals. I have not ventured into any substantial editorial role yet, but I try to help out as much as possible. All of my journal review activities are registered on Publons to keep a record (mainly for myself). It came as a surprise that I received the Publons Peer Review Award 2018 as one of the "top 1% of reviewers in Computer Science". I know many people will see this as a "gimmick", but hey, our research communities rely heavily on quality peer reviews from volunteers. Also, an award is an award! 🙂

Some thoughts on Artificial Intelligence – Part 2: Yes but why?

While Part 1 "rants" about how function-based machine learning overwhelms data research, this part looks at future opportunities.

A Hundred Thousand Whys – Children’s science book series in Chinese

Many of my childhood friends and I counted ourselves lucky to own a copy of A Hundred Thousand Whys (Chinese: 十万个为什么), a children's science book series. It was the only child-friendly encyclopedia you could find in 1970s/80s China. The series originated from a book of the same name by the Ukrainian writer Mikhail Il'in (Russian: М. Ильи́н). The fact that our parents invested half a month's salary in these books in those difficult years showed how much they struggled to answer our never-ending whys.

Reasoning is a very important part of the learning process. We humans and other animals learn very quickly from direct experiences for survival. At the same time, abstract knowledge is at the core of human intelligence. We discover rules and develop theories from our observations and experiments so as to generalise our findings and use them in situations beyond our initial observations. It is believed that "the power of generalization is a property that comes with neural networks" and that "abstraction is often attributed to humans and the development of the language". Conventional function-based machine learning models tend to give us very good answers under normal circumstances, but there is no reasoning about how the conclusions or predictions were made. Such a model is therefore often referred to as a black box. We don't know what's in the black box, and even if we do, by illustrating the ML neural networks, chances are we won't be able to explain it rationally. What makes things more challenging (I am ranting again...) is the opacity of the input data to the black box. Data science researchers need to be very careful when handling and documenting training data, because the data determine the sensitivity and operational range of the model. However, that's just me training my own model for research. Do we know exactly what data Google uses to show stuff to us? So we have no clue how the machine works, nor where the machine gets its knowledge from.

https://medium.com/bbc-news-labs/what-we-talk-about-when-we-talk-about-fair-ai-8c72204f0798

Some may argue that they don't really care about how things are done as long as we have a good result. This kind of attitude has helped function-based ML thrive in the past few years. We are now getting used to the convenience enabled by ML-based software and services without bothering about the whys.

So is reasoning still relevant?

Even if we don't expect a machine to ever be as smart as humans, we still need results to be interpretable to ensure fairness, transparency, and accountability. Fairness concerns whether resources are shared "equally" between parties, or whether a rule is applied consistently. Many researchers, including myself, argue that fairness is subjective and therefore needs to be measured at the user level. We spent a lot of time looking into human factors (sensory, perceptual, societal, etc.) and studying how the machine can take into account individual preferences and experiences. However, I often feel that what we do is an exercise in incorporating biases. The algorithm would allocate more resources to picky users and hungry user devices so that there is minimal QoE (quality of user experience) discrepancy across a group of users. You might disagree with my understanding of fairness, but at least we are able to discuss what's right and wrong based on my reasoning. With a high degree of complexity, an ML model can produce excellent outcomes for the wrong reasons. The image below from Dr Kasia Kulma's talk is a perfect showcase: a model which achieved high precision in classifying wolves and huskies apparently recognised wolves by detecting snow in the picture. It capitalised on the high probability of snow and wolves co-appearing in an image: a smart move to deliver results, but not necessarily what we would consider intelligent. For fairness, the reasoning often outweighs the outcome.

https://www.slideshare.net/0xdata/interpretable-machine-learning-using-lime-framework-kasia-kulma-phd-data-scientist
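For those who want to experiment, the LIME library can produce this kind of per-image explanation. Below is a minimal sketch, assuming you bring your own wolf/husky classifier; `predict_fn` is a self-contained placeholder, not a real model.

```python
# Sketch of a LIME image explanation, as in the wolf/husky showcase.
# `predict_fn` is a placeholder for your own classifier; swap in e.g.
# model.predict(images), returning an (n_samples, n_classes) array.
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def predict_fn(images):
    # Placeholder probabilities only, so the sketch is self-contained.
    return np.tile([0.3, 0.7], (len(images), 1))

image = np.random.rand(224, 224, 3)  # stand-in for a wolf/husky photo

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, predict_fn, top_labels=2, hide_color=0, num_samples=200)

# Highlight the superpixels that drove the top prediction. In the
# wolf/husky case these turned out to be the snowy background.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5)
overlay = mark_boundaries(img, mask)
```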

Transparency is a measure of general interpretability, or opacity. There are different opinions on the measurement target. Some believe it's about the model itself (i.e., the interpretability of how the model works). However, this might be too much to expect of intricate deep learning models. Furthermore, forcing higher transparency affects how models are built and potentially limits the use of some useful features: there is generally a trade-off between transparency and model performance. As a compromise, interpretability can target the results coming out of a model rather than the model itself. This means that we attempt to interpret how the input features impact the results (on a much smaller scale). It would also enable some hypothetical reasoning to support humans in optimising a solution.
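Permutation importance is one widely used technique in this results-focused spirit: shuffle one input feature at a time and see how much the output quality degrades. A minimal sketch, with a placeholder model and synthetic data standing in for anything real:

```python
# Interpreting feature impact on a model's outputs without opening the
# model itself: permutation importance. Model and data are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle each feature in turn and measure how much the score degrades:
# features whose shuffling hurts most have the largest impact on results.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature {i}: {importance:.3f}")
```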

When it comes to using ML in critical missions such as performance reviews, credit rating, and autonomous vehicles, the main concern is that people can get hurt when things go wrong, whether due to the wrong use of training data or to the algorithm itself. Ultimately, humans are accountable for what machines do. Even if it's something the machines taught themselves to do, it's humans who instructed the learning process. Prof Ben Shneiderman's talk covers this topic quite extensively from an HCI perspective. However, not many people consider the human-computer relationship part of ML/AI research. Zero-control automation has a pivotal role in future network and system designs: it's impractical to have humans make every single decision, and we want to reduce human errors when humans are involved in decision making. Therefore, algorithm accountability is a technical challenge but perhaps even more of a regulatory challenge.

[to be continued]

Some thoughts on Artificial Intelligence – Part 1: It works.

In a recent issue (Vol. 61, No. 10, October 2018) of Communications of the ACM, there is a very interesting article by Adnan Darwiche titled "Human-Level Intelligence or Animal-Like Abilities?".

Adnan takes a balanced view of the current developments in AI and its related fields, and in many respects challenges much of the AI hype. A lot of the discussion, especially the parts comparing function-based and model-based approaches, echoes my experience with data-driven research over the past ten years.

One of the main research challenges I faced many years ago was modelling the perceptual quality of video content, so that quality assessment (e.g., Netflix wanting to know the level of quality experienced at the user side) can be done by a "golden-eye" objective model without human intervention. The workflow is simple: build lots of test videos "corrupted" by network impairments -> run subjective experiments to collect user opinions on the videos -> build an objective model by correlating user opinions with impairments -> optimise the model. Function-based machine learning such as neural networks was an option for the data modelling, but it was not considered a good choice back then. Why? People didn't feel comfortable championing something that is very difficult to reason about. While function-based learning is aggressively powerful in achieving good performance, you end up with tens of parameters going in different directions and usually no logical connection between the parameters and the theory that underpins your research. You don't really need a ton of domain knowledge to get a decent model as long as there is sufficient data available, so using this type of learning felt like cheating and was not considered "proper research". What makes things worse is that you won't be able to generalise your findings beyond the scope of the experimental dataset. I ended up following the "convention": build a theoretical framework (based on psychology, the anatomy and physiology of the human visual system, and video compression standards), then use logistic regression (and some other model-based learning tools) to complete the model. The model performance was more than acceptable, and the design was backed by discussions of the logic, necessity and data range of every parameter.
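To make the final modelling step concrete, here is a minimal sketch of fitting a logistic curve that maps an impairment measure to mean opinion scores (MOS). The packet-loss levels and MOS values are invented for illustration and are not from the original study.

```python
# Fit a logistic curve mapping an impairment measure to MOS.
# All numbers below are made up for illustration.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, a, b, c, d):
    """Four-parameter logistic, a common shape for quality models."""
    return a + (b - a) / (1.0 + np.exp(-c * (x - d)))

packet_loss = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0])   # percent
mos = np.array([4.6, 4.2, 3.8, 3.1, 2.3, 1.6])           # subjective scores

params, _ = curve_fit(logistic, packet_loss, mos, p0=[5.0, 1.0, 1.0, 2.0])
print("fitted parameters:", params)
print("predicted MOS at 3% loss:", logistic(3.0, *params))
```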

Soon after that, the world of AI drastically changed. With the increasing availability of data and computing resources, and very smart tweaks such as stochastic gradient descent for fitting functions, AI researchers have shown that if the tasks are sufficiently specific (or localised), we can use (networks of) complex functions to fit the data and complete low-level cognitive tasks with little need to model the human reasoning process. The seemingly primitive learning approach (that we looked down on) is winning by brute force. At the end of the day, it's the result that matters, especially in the eyes of the general public. If the training dataset is large and broad enough, it is very likely that a prediction can be made with the support of nearby ground truth.
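As a toy illustration of the "fit a function with stochastic gradient descent" recipe (and nothing like a real deep learning system), a few lines of plain numpy are enough:

```python
# Toy SGD: learn y = 3x + 2 from noisy samples. Real systems fit networks
# of millions of parameters the same basic way.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, 200)   # noisy ground truth

w, b, lr = 0.0, 0.0, 0.1
for epoch in range(100):
    for i in rng.permutation(len(x)):      # one sample at a time: "stochastic"
        err = (w * x[i] + b) - y[i]        # prediction error
        w -= lr * err * x[i]               # gradient step on the weight
        b -= lr * err                      # gradient step on the bias

print(f"learned w={w:.2f}, b={b:.2f}")     # roughly 3 and 2
```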

On top of that, applications, especially interactive ones, can be programmed to continuously learn and improve from user inputs. So the more we use them, the more data they get from us and the better they get. We just need to accept that the logic employed by these applications to conduct a task may not be the same logic used by humans. This is particularly obvious when human perception and emotion are involved. My five-year-old boy wants a red bike because he thinks the red ones go faster. His decision might have been influenced by the data he used for model training: Lightning McQueen is red and fast, Blaze the monster truck is red and fast, Ferraris are red and fast, etc. A function-based model would make the same "mistake", and some more data on fast blue vehicles or slow red vehicles would "fix" it. But the same fix won't work on my boy. He is well aware that colour is not 100% correlated with speed (all sorts of new-gens are faster than Lightning in Disney's Cars 3). For him (from the human's perspective), red is a positive expression/statement associated with a vehicle. In this particular context, the absolute speed/acceleration doesn't matter; it's the sensation that counts. The ability to reason abstractly (also known as fluid intelligence) is often what separates high-level intelligence from a static knowledge base.
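Here is a toy sketch of the red-means-fast shortcut, with entirely made-up vehicle data: a classifier trained on a biased sample leans on colour, and a few counterexamples break the shortcut.

```python
# Toy demonstration of the red-means-fast shortcut. Each vehicle is
# (is_red, engine_power); the label says whether it is fast.
from sklearn.linear_model import LogisticRegression

# Biased sample: colour separates the classes perfectly, power barely does.
X_biased = [[1, 6], [1, 5], [1, 7], [0, 5], [0, 4], [0, 6]]
y_biased = [1, 1, 1, 0, 0, 0]

model = LogisticRegression().fit(X_biased, y_biased)
print(model.predict([[1, 4]]))  # a slow red car is likely judged "fast"

# Add fast blue and slow red vehicles to break the colour shortcut.
X_fixed = X_biased + [[1, 4], [1, 5], [0, 6], [0, 7]]
y_fixed = y_biased + [0, 0, 1, 1]

model = LogisticRegression().fit(X_fixed, y_fixed)
print(model.predict([[1, 4]]))  # colour no longer dominates the decision
```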

This leads to the question: is it appropriate to call function-based learning Artificial Intelligence when there is little intelligence in it? As the article points out, "The vision systems of the eagle and the snake outperform everything that we can make in the laboratory, but snakes and eagles cannot build an eyeglass or a telescope or a microscope". Just because a machine can deliver the same or better results than a human in a task, should we call it intelligence? Compared with other types of AI, function-based learning is principally less intelligent, but it certainly dominates the discussion of AI-related work due to its reasonable performance in classification tasks (which underpin many useful applications). Does the level of intelligence really matter when the job is done right? Or should the definition of intelligence be dictated by how animals learn? One way for AI and humans to coexist is intelligence amplification (IA), where each side complements the other in decision making. However, the idea rests on the hypothesis that machines are smart enough to do things very well but at the same time stupid enough not to understand what they are doing. If machines become capable of both model-based and function-based learning, why would they still need us?

(to be continued)

Interview on VR cognitive load in Education

Recently, Yoana Slavova and I were interviewed by the research and consultancy company 3Rs IMMERSIVE on the use of VR in education. We shared our experiences from previous experiments and an outlook on future research.

The interview can be found at: https://3rsimmersive.com/designing-vr-for-cognitive-load/

Some excerpts from our discussion:

1) What were you trying to achieve through your research?

Over the years, we have worked with numerous primary and secondary schools on VR trials aimed at improving student engagement. The novelty factor of VR can undoubtedly contribute to better student attention in classrooms. As university educators, we wanted to know whether VR-assisted learning can reach beyond the initial "wow" effect and improve knowledge acquisition in comparison with conventional learning using lecture notes.

2) Your paper showed that 'students are less likely to recall or recognise specific details due to the additional cognitive load in VR' – why do you think this is? Can you elaborate a little on this?

We think the cognitive load can be attributed to the use of new technology and also to how the media content is developed. Several students who claimed in our research interviews that they "learned a lot" from the VR content struggled to recall details such as the year and location of key historical events, in comparison with students who studied using just lecture notes. This indicates that students might have been overwhelmed by the VR environment and the dynamics of the content. Cognitive load is not necessarily a negative factor in education: the more attention we pay to something, the more likely it is to be remembered. The challenge is to allow learners to focus on the details that are essential to learning.

3) Did you put in place any measures to lower cognitive load beforehand? I.e., allowing students time to become familiar with the device or making adjustments to the design of the content within VR?

Participants were given general guidance on the controls and how to navigate through the content. We expected university students to pick up the technologies quickly. We plan to carry out more studies on how to better measure cognitive load in VR and its impact on learning.

4) Do you have any advice for VR designers as a result of your research?

There are useful design principles we can borrow from computer game design to build better VR content for education. We really need to think about different aspects, including storyline, cameras, level setup, visual overlays, user controls, and multiplayer, while trying to avoid overwhelming audiences in their learning tasks. VR in education also deserves a new set of design rules. For instance, our research shows that text content still has a unique role in learning, so we can work on how to augment rich VR content with simple and legible text as a pathway to improving learning outcomes.

5) What sort of content is VR best used to teach in your opinion?

The teaching of subjects like medical sciences, geography and the arts can certainly benefit from the use of VR. However, I wouldn't be surprised to see creative uses of VR in any subject area once VR development tools and resources become more accessible to educators.

6) Are you doing any further work in this field?

Yes. We have a few VR-related projects in our team. We have been working with a secondary school, coaching sixth-form students (between 16 and 18 years of age) to develop VR courseware for their younger counterparts. Understanding user attention is also a key area in VR research. One of our postgraduate students is experimenting with VR eye-tracking solutions in an attempt to develop content that can react to viewers' attention and emotion. In education, this could mean tailored experiences for each student's needs and capabilities.