EXCLUSIVE – Brain-inspired deep learning algorithms for computer vision systems

OpenGov spoke to Dr. Gang Wang on the sidelines of EmTech Asia 2017, where he was honoured as one of MIT Technology Review's Innovators Under 35.

Dr. Wang’s research interests include developing effective and efficient machine learning techniques that can advance general artificial intelligence research, and developing working computer vision systems and techniques.

He is a former Associate Professor (until March 2017) with the School of Electrical and Electronic Engineering at Nanyang Technological University (NTU), Singapore, and a former associate director of the Rapid-Rich Object Search (ROSE) Lab at NTU. The ROSE Lab is a joint collaboration between Nanyang Technological University, Singapore, and Peking University, China. He is currently Chief Scientist at Alibaba AI Labs.

A team led by Dr. Wang achieved a top 5 ranking in the ImageNet challenge on scene classification in 2015 and 2016. The technologies invented by his group have been successfully licensed to 6 international and local companies.

Can you tell us about your work?

I am working on deep learning for the visual understanding problem. We want to make computers understand visuals like humans.

It is a hard problem. Scientists have been working on it since the 1960s. When I started thinking about this problem, I thought of learning from our brains. Our brains are very compact, consume little power and are also very intelligent. It is almost as if there is a magic mechanism inside that has been developed during the long process of human evolution over a period of thousands of years.

What if we can leverage this mechanism to help computers learn? This has been tried a little, but not very deeply. I worked with my students to go deep and find out what the magic mechanism within the human brain is. Then we tried to model such a mechanism using deep neural networks.

The connections between the neurons in our brain are flexible and adaptive. When people try to recognise a cat versus, say, recognising cars, the connections between the neurons might be slightly different. In classic neural network methods, the connections between the artificial neurons are fixed. I tried to make the connections between the neurons adaptive to the specific visual recognition task. We finally did it, and we found that it resulted in significantly improved performance.
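
To make the contrast between fixed and adaptive connections concrete, here is a minimal sketch in Python. It is not Dr. Wang's method, which the interview does not describe in detail. It assumes one simple way of making connections adaptive: each weight is scaled by a gate between 0 and 1 computed from a task code. All names (fixed_layer, adaptive_layer, gate_params, the "cat" and "car" codes) are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fixed_layer(weights, inputs):
    """Classic layer: the same connection weights are used for every task."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def adaptive_layer(weights, gate_params, task_code, inputs):
    """Illustrative adaptive layer (an assumption, not the published method).

    Each connection weight is multiplied by a gate in (0, 1) that depends
    on the task code, so the effective wiring changes from task to task.
    gate_params[i][j] is a vector with one entry per task-code dimension.
    """
    outputs = []
    for row, gate_row in zip(weights, gate_params):
        total = 0.0
        for w, g, x in zip(row, gate_row, inputs):
            gate = sigmoid(sum(gi * t for gi, t in zip(g, task_code)))
            total += gate * w * x
        outputs.append(total)
    return outputs


# Two inputs, two outputs. The first output neuron has task-dependent gates;
# the second has neutral gates (always 0.5), so it behaves the same for both tasks.
weights = [[1.0, -2.0], [0.5, 0.5]]
gate_params = [
    [[4.0, -4.0], [-4.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
cat_task = [1.0, 0.0]  # hypothetical one-hot code for "recognise cats"
car_task = [0.0, 1.0]  # hypothetical one-hot code for "recognise cars"
inputs = [1.0, 1.0]

fixed_out = fixed_layer(weights, inputs)
cat_out = adaptive_layer(weights, gate_params, cat_task, inputs)
car_out = adaptive_layer(weights, gate_params, car_task, inputs)
```

With the same inputs and the same underlying weights, the first output is positive for the cat task and negative for the car task, because different connections are emphasised. The fixed layer gives one answer regardless of task.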

Where do you see your research going during the next 2-3 years?

I divide my research into two categories. One is more academic, more like fundamental research. In that, I will continue to push the boundary of brain-inspired deep learning algorithms. We also need to work closely with neuroscientists to try to understand our brains better, so that we can develop better algorithms.

The other side is applications. I am very interested in robotics. A general purpose robot needs to be able to sense the environment. Visual understanding is important for that. I want to transfer my visual understanding technology to robotics applications.

Where do computers stand now in visual recognition compared to humans?

We have made significant progress. Computers can now achieve very high accuracy in recognising a thousand or more different categories.

So, right now, computers can do better than humans on this problem. They have the advantage that they can draw on more computing resources than humans. Humans cannot actually remember so many categories.

Can you give a basic explanation about how the understanding works?

There are two categories of understanding. The simpler one is for the purpose of navigation. In this case, you have to understand the 3D geometry and the layout of the environment.

Suppose a robot is moving in this room. It needs to find where the floor is in order to navigate. To get to that chair, it needs to understand that there is something in the way. That is what we call a low-level problem.

The second one is about more advanced applications. For example, once we have a humanoid robot, we might ask the robot to bring a cup of water. It must understand what a cup is and what a cup of water is. This is related to semantics. It has to understand object names and their meanings, and associate them with real-world objects. This is human-like understanding.
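
The low-level navigation case, reaching the chair when something is in the way, can be illustrated with a small Python sketch. It assumes the room has already been reduced to a 2D occupancy grid and uses breadth-first search, a standard textbook technique that the interview itself does not name. The function name, the grid and the robot and chair positions are all hypothetical.

```python
from collections import deque


def shortest_path_length(grid, start, goal):
    """Breadth-first search on a 2D occupancy grid.

    grid[r][c] == 1 marks an obstacle and 0 marks free space.
    Returns the number of moves from start to goal, or None if
    the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == goal:
            return steps
        # Try the four neighbouring cells: down, up, right, left.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in visited:
                visited.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return None


# Hypothetical room: an obstacle occupies the middle column of the lower two rows.
room = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]
robot, chair = (2, 0), (2, 2)

detour = shortest_path_length(room, robot, chair)

# The same room with nothing in the way.
empty_room = [[0, 0, 0] for _ in range(3)]
direct = shortest_path_length(empty_room, robot, chair)
```

In the empty room the robot needs two moves, while the obstacle forces a six-move detour around it. The semantic case, knowing what a cup is, is not captured by this kind of geometric reasoning.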

Are we at the stage yet where the system will be able to understand that an object is a chair though it does not look like any chair it has been exposed to before?

Currently it is unlikely. We have to feed the computers with huge volumes of data with similar patterns to teach them what the object is. For the case you mentioned it requires the computer to have generalisation capability, which is very easy for humans.

What trends do you see in the areas of computer vision and deep learning?

I believe that the future is quite promising. We have achieved huge progress in many tasks, such as vehicle detection for self-driving cars and image classification for online applications such as advertising.

A lot of applications have been built on deep learning technology. They are creating avenues for commercialisation. In China, for example, banks can use computers to verify faces. They no longer require human workers for this, which saves them manpower costs.

Also, we are collecting more and more data, because deep learning requires a lot of data. The more data we have, the better the technology will perform.
