Acoustic sensing, computing and processing

The Audio-Sensing network for ECoB is a flexible, cheap and easily adaptable acoustic and multi-sensing prototype system for buildings. It is composed of a configurable network of wired or wireless microphones and connection hubs that register and transfer the ambient sounds in large spaces.

It can also be combined with other environmental sensors providing variables to be managed and transmitted to a central Acoustic Processing Server (APS) in order to decipher the meaning of unknown sounds and determine the level of occupancy. The audio system is scalable, easily installed and configured in a wide range of indoor spaces and outside areas within the building's immediate surroundings.

The advantage of using cheap microphones is their ability to detect and monitor, in near real time, the occupancy of large spaces such as terminal lobbies or shopping malls [Fraunhofer].

These occupancy parameters, integrated with the measurement of environmental variables and with the conditioning controls, help to optimise the BEM strategies and systems. This cheap ICT solution for ECoB offers a positive cost/benefit ratio and a highly profitable investment.

Furthermore, the system allows the development of basic alert/alarm messaging for the support, maintenance, control and security of buildings [D'Appolonia]. A semi-automatic learning system is periodically retrained from a repository via the Internet, which allows applications to be extended to other domains and information services in the future [Fraunhofer].

Acoustic sensing and processing in large spaces and open environments [Fraunhofer] provides near real-time information about the estimated occupancy level or about moving active objects in typical areas, e.g. lobby, hall, office room, cinema, auditorium, car park, corridor or shop. As in human hearing, microphones make it possible to detect sounds and to reduce noise and interference as a pre-processing step for the subsequent Acoustic Event Detection (AED). The AED expresses the meaning of the sound or noise and attaches semantic tags together with position and time stamps, such as “nobody is here now” or “a vehicle is coming here now”. The Acoustic Processing Unit (APU) has to be sensitive in large environments with high background noise and reverberation; estimating the direction of arrival (DOA) is not relevant for monitoring the occupancy level.
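
The tagged output of the AED described above can be pictured as a small record. The following Python sketch is purely illustrative: the class name AcousticEvent, its fields, the describe helper and the example values are assumptions, not part of the S4ECoB software.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AcousticEvent:
    """One AED result: semantic tag plus position and time stamps (hypothetical layout)."""
    tag: str             # semantic meaning, e.g. "nobody present"
    location: str        # position stamp, e.g. the monitored area
    timestamp: datetime  # time stamp of the detection

    def describe(self) -> str:
        """Render the event as a short human-readable message."""
        when = self.timestamp.strftime("%Y-%m-%d %H:%M:%S")
        return f"[{when}] {self.location}: {self.tag}"


# Example values are made up for illustration only.
event = AcousticEvent(
    tag="vehicle approaching",
    location="car park",
    timestamp=datetime(2014, 3, 1, 8, 30, 0, tzinfo=timezone.utc),
)
print(event.describe())
```

Keeping the tag, position and time in one immutable record makes it straightforward to forward the same event to occupancy monitoring and to alert messaging.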

Acoustic Processing Server

Audio-system architecture and main functionalities of the Acoustic Processing Server (APS). The APU is part of the occupancy sensor.

The figure above shows the audio-system architecture and the three main functionalities of the Acoustic Processing Server prototype, a deliverable of WP3:


  1. A functional, adaptable and duplicable prototype that detected at least six degrees of occupancy in five types of controlled indoor and outdoor spaces [Fraunhofer] and transferred these parameters to the BEM system for optimisation [The Institute for Microelectronic and Mechatronic Systems].
  2. A semi-automatic process that learns from previous experience: abnormal sounds are collected, condensed and sent over the Internet so that an expert can process them manually and retrain the system. WP3 has built an Internet-based semantic repository of sounds that provides processing packets to be downloaded to upgrade the APS and, subsequently, the APU database. As a consequence, the APS and the APU learn from the experience of others and progressively improve their auditory capacities [Fraunhofer].
  3. The audio system for ECoB is based on a platform of HW/SW modules and devices that is easily adaptable and scalable [The Institute for Microelectronic and Mechatronic Systems] to a wide variety of buildings and spaces and, moreover, connected to and compatible with the installed BEM system. IMMS has provided a connection to the specific BEM system from D'Appolonia. This system provides the interface to commercial, already installed building automation systems.
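
The semi-automatic learning loop in item 2 can be sketched as a simple triage step. This is a minimal illustration under stated assumptions: the confidence threshold, the RetrainingQueue class and the clip identifiers are hypothetical, and the source does not say how abnormal sounds are actually selected.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed value: below this classifier confidence a sound counts as "abnormal".
CONFIDENCE_THRESHOLD = 0.6


@dataclass
class RetrainingQueue:
    """Collects abnormal sounds so that an expert can label them later (hypothetical)."""
    pending: List[str] = field(default_factory=list)

    def triage(self, clip_id: str, label: str, confidence: float) -> Tuple[str, bool]:
        """Return (label, accepted); low-confidence clips are queued for expert review."""
        if confidence >= CONFIDENCE_THRESHOLD:
            return label, True
        self.pending.append(clip_id)
        return "unknown", False


queue = RetrainingQueue()
accepted = queue.triage("clip-001", "speech", 0.92)
rejected = queue.triage("clip-002", "speech", 0.31)
print(accepted, rejected, queue.pending)
```

In such a scheme the queued clips would be the ones uploaded to the sound repository for manual processing, and the retrained model would later be downloaded again.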

Previous attempts in this area

Acoustic event detection and activity recognition for different environments have been applied in prototypes of sound recognition systems based on an ultra-low-power hardware implementation in a button-like miniature form. These systems were designed to recognise and classify, from sound alone, various activities of daily living associated with an environment. They use a hidden Markov model (HMM) classifier and mel-frequency cepstral coefficient (MFCC) features. Preliminary results showed a high average accuracy in identifying tasks from their sound profile, over the limited range of activities that occur within the environment, while dealing with background and foreground sounds and adaptive background modelling. Other projects have detected human activity in public places, producing a system capable of detecting coffee-shop activities in real time or near real time using both single-speaker acoustic events and analysis of the full sound spectrum as an auditory scene analysis.

A related and inspiring project is MEMORIES (FP6-IST 035300), in which ontologies for sounds were developed. Related application-oriented projects in FP6 and FP7 are SAMURAI (Suspicious and abnormal behavior monitoring using a network of cameras & sensors for situation awareness enhancement, FP7 grant 217899), WOMBAT (Worldwide observatory of malicious behaviors and attack threats, FP7-ICT-216026-WOMBAT), SCANDLE (Acoustic scene analysis for detecting living entities, FP7 grant 231168) and S2S2 (Sound to sense, sense to sound, FP6-IST-2004-03773).

Other EU-funded projects of technological relevance are SCENIC (Self-configuring environment-aware intelligent acoustic sensing, FP7 ICT grant 226007), AUDIS (Digital signal processing in audiology, FP7 People grant 214699) and POP (Perception On Purpose, FP6-IST-2004-027268 STREP project). Further work on Acoustic Event Detection (AED) has been carried out in the framework of the CHIL EU project ("Computers in the Human Interaction Loop", 2004-2007, FP6 IP 506909). The DIRAC project, one of whose partners is FHG, starts from the observation that current information-extraction techniques perform well when event types are well represented in the training data but often fail when they encounter information-rich, unexpected rare events. DIRAC addresses this crucial machine weakness in the area of audio information systems: it aims at designing and developing an environment-adaptive, autonomous artificial cognitive system that detects, identifies and classifies possibly threatening rare events from the information derived by multiple active, information-seeking audio-visual sensors.

S4ECoB innovation

It has to be stressed that most of the discussed S&T contributions to the state of the art (SoA) assume a close-talk situation, in which the microphone is very close to the sound source. This is not the case in S4ECoB, which therefore calls for major innovation in:

  • coping with the ambient noise levels and the high amount of reverberation in large rooms,
  • the integration of available technologies,
  • the configuration and installation of the audio-sensors network,
  • the methods of acoustic processing,
  • the algorithms and semantics of acoustic event identification,
  • the logical models for defining context situations and recognising activities.

Proposed technical (audio) solution

The solution consists of:

  • 1 occupancy sensor per room / area (embedded integrated HW, integrated APU, cost-effective, distributed solution; advantages: no need for building-wide high-bandwidth network, possibly reduced energy consumption of audio-system)
  • 4 to 16 microphones per room / area (number depends on room size; necessary for required audio quality and SNR, option for sound localization) – with wired or wireless connection to occupancy sensor or implemented as mic array inside a satellite unit
  • 1 audio processing server (PC-like hardware) per building (retraining, sound database) and 1 central sound server (connected via the Internet)

The architecture has distributed APUs (inside the rooms, near the microphones or mic arrays) and one central Acoustic Processing Server (APS). No audio-sensor-network gateway is needed, because the APUs and the APS communicate via IP-based network connections.
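
Because the APUs and the APS talk over plain IP, an occupancy report can be any small serialised message. The sketch below assumes a JSON payload; the field names, the 0 to 5 level scale (chosen to match six degrees of occupancy) and both function names are hypothetical, and the actual S4ECoB message format is not given in the source.

```python
import json


def encode_occupancy_report(sensor_id: str, area: str, level: int) -> bytes:
    """Serialise one occupancy report as UTF-8 JSON, ready for any IP transport."""
    if not 0 <= level <= 5:
        # Assumed scale: six degrees of occupancy numbered 0..5.
        raise ValueError("level must be in 0..5")
    report = {"sensor_id": sensor_id, "area": area, "occupancy_level": level}
    return json.dumps(report, sort_keys=True).encode("utf-8")


def decode_occupancy_report(payload: bytes) -> dict:
    """Parse a report on the receiving (APS) side."""
    return json.loads(payload.decode("utf-8"))


payload = encode_occupancy_report("apu-lobby-1", "lobby", 3)
print(decode_occupancy_report(payload))
```

The payload could be carried over TCP, UDP or HTTP alike, which is why no dedicated sensor-network gateway is required in this kind of design.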

Scheme of the acoustic acquisition, based on an audio-sensor network, and of the processing in a central unit, which introduces occupancy monitoring into the Optimizer BEM system and/or DCC for ECoB

The figure shows the signal pick-up by means of microphones and the pre-processing stage for signal clean-up, such as the cancellation of ambient noise, echoes or reverberation. IDMT is responsible for the microphones and IMMS for acquiring and processing the raw audio streams from several microphones. This signal enhancement is needed for the subsequent detection and classification of acoustic sounds. The sounds and noises that emanate from a particular environment depend to some extent on that environment and its circumstances. Managing abnormal sounds and retraining with new acoustic events improves the auditory capacities of the system.

The system provides data for interoperable applications and additional services. It allows the detection of sounds and noises caused by system malfunctions, risk situations or accidents, so as to provide additional alerts and emergency messages that support the maintenance of heating, ventilation, air-conditioning and lighting (HVACL) systems and, moreover, information services for the security and control of buildings.

Concerning security integration, sounds can provide a clear added value with respect to:

  • Infrared technology: Although infrared solutions cost less, acoustic sensing integrates better with security systems, since sound processing can be used not only for energy efficiency but also for the automatic detection of dangerous situations.
  • Image processing: Image processing can provide capabilities similar to those of sound processing for the automatic detection of dangerous situations, but at higher economic and ecological cost, because a larger installation is needed for visual coverage of all the different areas and because real-time video streaming requires more bandwidth than sound streaming.

The variety of sound sources for each sound class that occur in natural environments was restricted to those relevant for defining occupancy; it is captured in the various instances of each sound class included in the database. Moreover, the simplified method is initially based on the intensity of the ambient noise rather than on the detection of specific acoustic events (AED). This simplification facilitates and accelerates the implementation of the APU test prototype. The larger AED databases were needed for tuning and calibrating the systems and for evaluating them in real conditions.
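
An intensity-based estimate of this kind can be sketched as an RMS level compared against a few thresholds. The code below is an illustration only: the threshold values, the dBFS scale and the function names are assumptions, and in practice the boundaries would have to come from calibration in each room.

```python
import math
from typing import Sequence

# Assumed boundaries in dB relative to full scale (dBFS); five boundaries
# give six occupancy degrees, numbered 0 (empty) to 5 (very busy).
THRESHOLDS_DBFS = (-60.0, -50.0, -40.0, -30.0, -20.0)


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level in dBFS of samples normalised to the range [-1, 1]."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)


def occupancy_level(samples: Sequence[float]) -> int:
    """Map ambient-noise intensity to an occupancy degree by counting passed thresholds."""
    level_db = rms_dbfs(samples)
    return sum(level_db >= threshold for threshold in THRESHOLDS_DBFS)


quiet = [0.005, -0.005] * 50   # about -46 dBFS
busy = [0.5, -0.5] * 50        # about -6 dBFS
print(occupancy_level(quiet), occupancy_level(busy))
```

Such a mapping needs no trained event classifier, which is consistent with using it to get a test prototype running before larger AED databases are available.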

Additionally, an acoustic flow-metering prototype was based on an audio-based sensor system that measures the water and air flow in the inlet and outlet pipes of the heat exchanger and of the pre-conditioning ventilation system. The acoustic flow-metering system provided the flows in the HVAC pipes as well as the temperatures. The installation and calibration of the acoustic sensors were intended to require no pipe modification, which also implies unobtrusive sensors with low maintenance costs that do not interfere with the functioning of the HVAC system. The information from the acoustic flow-metering system was connected to the global BEMO system to provide the information needed to adjust the HVAC system.

Project partners Fraunhofer and IMMS provided extensive knowledge in the field of acoustic sound pick-up and processing, e.g. concerning noise reduction, acoustic source localization, general signal enhancement and signal de-reverberation / Listening Room Compensation (LRC). Within the project, the partners contributed to advancing the SoA by developing and publishing new algorithms adapted to the specific problems of S4ECoB and of ECoB in general.

Project partner IMMS is specialised in the design and implementation of optimized embedded signal processing systems, with a special focus on open source software platforms. Optimized, energy-efficient and networked sensing systems were realized in several projects, including audio and building automation applications. There is therefore a solid S&T base of knowledge and experience in Acoustic Computing for AI/AmI Applications, which is briefly summarized in the next paragraphs.