We are a highly interdisciplinary laboratory merging neuroscience, psychology, computer science and engineering to develop a new framework for the study of brain science: a unified testing environment of perception and action, based on neurobiologically inspired machine-learning techniques (‘deep learning’ networks), embedded in robots that sense, act and learn in a real-world environment.
The ‘heart’ of our research efforts is our Remotely Operated Vehicles for Education and Research (‘R.O.V.E.R.’). These are low-cost, off-the-shelf wireless robotic devices, each consisting of a color video camera, a microphone, and of course the ability to drive around. Each rover is independently controlled wirelessly by a dedicated ‘brain’, an artificial neural network housed on a separate computer. These computers both control the rover and receive and record the resulting perceptual feedback. The rovers are subjected to reinforcement protocols (i.e. positive and negative feedback in response to a behavioral outcome) similar to those employed in behavioral neuroscience. Learning of both perceptual features and action selection is based on a class of recently developed, neurally inspired machine learning architectures applied over this perception/action/feedback data stream.
It would take an individual over 5 million years to watch the amount of video that will cross the internet each month in 2018. By then, nearly a million minutes of video content will cross the network every second. Closed-circuit television camera networks have become ubiquitous in many major cities, yet most of the data goes unexamined. Video will account for 80 to 90 percent of global internet traffic by 2018. Without automated means to sort and analyze this video data, we will be unable to take advantage of humanity’s largest data stream. Compressed sensing and sparse modeling now offer a means to tackle this global deluge.
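The two figures quoted above can be checked against each other with a little arithmetic. The sketch below assumes a 30-day month and a 365-day year and rounds "nearly a million minutes" up to exactly one million; the function and constant names are illustrative only.

```python
# Back-of-the-envelope check of the video figures quoted in the text.
# Assumptions (not from the source): 30-day month, 365-day year.
MINUTES_OF_VIDEO_PER_SECOND = 1_000_000  # "nearly a million minutes" every second
SECONDS_PER_MONTH = 30 * 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60


def years_to_watch(minutes_per_second: float, seconds: int = SECONDS_PER_MONTH) -> float:
    """Years of non-stop viewing needed to watch the video sent during `seconds`."""
    total_minutes = minutes_per_second * seconds
    return total_minutes / MINUTES_PER_YEAR


if __name__ == "__main__":
    years = years_to_watch(MINUTES_OF_VIDEO_PER_SECOND)
    print(f"{years:,.0f} years of viewing per month of traffic")
```

Under these assumptions the result is a little under five million years, which is consistent in order of magnitude with the monthly figure quoted above.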
Natural images can often be decomposed into cartoon and texture components. While much work has been done on edge detection, of equal importance is the ability to recognize and classify different texture patterns. From terrain estimation for autonomous vehicles to medical image analysis, the ability to identify and segment complex textures represents a major challenge for existing image processing techniques. Inspired by the human visual system, sparse dictionary modeling has been shown to yield state-of-the-art results in visual discrimination tasks.
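To make the idea of a sparse dictionary model concrete, here is a minimal sketch that encodes a signal as a combination of a few dictionary atoms using matching pursuit, a standard greedy sparse-coding algorithm. It is an illustration only, not the laboratory's own method: the four-element "patches", the hand-picked dictionary (`ATOMS`, `ATOM_NAMES`) and the helper `dominant_atom` are all hypothetical, and a real system would learn its dictionary from image data.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


# Hypothetical hand-picked dictionary of unit-norm 1-D "patches".
ATOM_NAMES = ["flat", "stripes", "edge", "checker"]
ATOMS = [unit(a) for a in ([1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1])]


def matching_pursuit(signal, atoms, n_atoms):
    """Greedy sparse code: repeatedly pick the atom most correlated with
    the residual, record its coefficient, and subtract its contribution."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_atoms):
        corrs = [dot(residual, atom) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corrs[i]))
        coeffs[k] += corrs[k]
        residual = [r - corrs[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


def dominant_atom(signal, atoms=ATOMS, names=ATOM_NAMES, n_atoms=2):
    """Label a patch by the atom with the largest sparse coefficient."""
    coeffs, _ = matching_pursuit(signal, atoms, n_atoms)
    k = max(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    return names[k]


if __name__ == "__main__":
    patch = [1.75, -1.25, 1.25, -1.75]  # mostly "stripes" plus a little "edge"
    print(matching_pursuit(patch, ATOMS, 2))
    print(dominant_atom(patch))
```

Because only a handful of coefficients are non-zero, the code vector itself can serve as a compact feature for discriminating textures; here the largest coefficient is used as a crude texture label.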
AUDITORY SCENE ANALYSIS
The cocktail party effect refers to the human brain’s ability to selectively tune into a particular conversation amidst many competing conversations and significant background noise. The separation and grouping of acoustic data streams is called auditory scene analysis. Given the current trends in big data, it will become increasingly important to selectively and dynamically attend to salient subsets of the data. Perceptual binding of music and speech amidst competing background noise is something that the brain can do almost effortlessly, but for only a single stream at a time. Machine techniques should allow for the segmentation and binding of multiple streams concurrently.
Lip reading of human speech suggests that there is a lot of redundancy in the audiovisual stream. Devices like smartphones now contain dozens of sensors: multiple RGB megapixel cameras, microphones, touchscreen, accelerometers/tilt sensors, GPS, temperature, battery, fingerprint, etc. Combining multiple sensor streams into a coherent situation report is taxing for existing models and techniques. Using models from neuroscience, we are exploring multi-sensory encoding techniques that will enable compressed sensing for multi-sensor acquisition, e.g. multispectral imaging data, as well as assist in the disambiguation of corrupted signals.
Microscopes and telescopes have played a foundational role in the development of the natural sciences; the macroscope will be no different. Traditionally referred to as wireless sensor networks (WSNs, or sensor networks), macroscopes now give science a view of nature that was previously inaccessible. The fundamental element of a WSN is a low-cost mobile sensor. The mobile sensor has three key components: (1) the sensor (or multiple sensors, including camera, microphone, thermometer, barometer, etc.), (2) a transmitter or transceiver that relays data to collecting points, and (3) a power supply or energy harvester. WSNs have applications in climate and geological modeling, large-scale real-time situational awareness, medical sensors, and traffic control.
Autonomous agents are software and robotic entities that can carry out complicated tasks without direct human control. These agents include self-driving and self-parking cars, mobile security drones, and software entities such as advanced email filters and recommendation engines. Many tasks are too dangerous, and many more simply too boring, for humans to perform well. Low-cost robotics platforms and ubiquitous computing networks will enable a rich ecosystem of autonomous agents that transform economies and cultures.
Motor cognition is the notion that understanding is often grounded in our action capabilities. The roots of the word perception translate to “grasping with mind”. Our ability to move and interact with our environment shapes our mental representations. Complex behaviors can often be derived from simple mechanical linkages, suggesting that seemingly cognitive tasks can be accomplished with simple action-perception loops, e.g. Braitenberg vehicles. Going beyond sensory input, motor cognition considers the structures necessary for the planning and production of actions, as well as recognizing, predicting, and mimicking the actions of others.
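As an illustration of how little machinery such an action-perception loop needs, the following sketch simulates a two-sensor, two-wheel Braitenberg vehicle with crossed excitatory wiring: each light sensor drives the wheel on the opposite side, so the vehicle turns toward the brighter side. All numerical values (sensor placement, gains, light position, the inverse-square-style intensity model) are assumptions chosen for the example, not parameters of our rovers.

```python
import math


def light_intensity(px, py, light):
    """Illustrative intensity model: falls off with squared distance to the light."""
    d2 = (px - light[0]) ** 2 + (py - light[1]) ** 2
    return 1.0 / (1.0 + d2)


def simulate(light=(3.0, 3.0), steps=50, dt=0.1, base_speed=0.2, gain=4.0,
             axle=0.5, sensor_reach=0.5, sensor_angle=0.6):
    """Differential-drive vehicle starting at the origin, facing along +x.
    Returns the list of (x, y, heading) states, one per time step."""
    x, y, heading = 0.0, 0.0, 0.0
    trajectory = [(x, y, heading)]
    for _ in range(steps):
        # Sensors sit ahead of the body, angled to the left and right.
        left = light_intensity(x + sensor_reach * math.cos(heading + sensor_angle),
                               y + sensor_reach * math.sin(heading + sensor_angle), light)
        right = light_intensity(x + sensor_reach * math.cos(heading - sensor_angle),
                                y + sensor_reach * math.sin(heading - sensor_angle), light)
        # Crossed wiring: left sensor excites right wheel, and vice versa.
        v_right = base_speed + gain * left
        v_left = base_speed + gain * right
        speed = (v_left + v_right) / 2.0
        turn_rate = (v_right - v_left) / axle  # positive = counterclockwise
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        heading += turn_rate * dt
        trajectory.append((x, y, heading))
    return trajectory


if __name__ == "__main__":
    for x, y, heading in simulate()[::10]:
        print(f"x={x:.2f} y={y:.2f} heading={math.degrees(heading):.1f} deg")
```

There is no planner or internal map here: the apparent "goal seeking" comes entirely from the wiring between sensors and wheels, which is the point the Braitenberg example is meant to make.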
Copyright © 2018 | Machine Perception & Cognitive Robotics Laboratory - Center for Complex Systems and Brain Sciences - Florida Atlantic University