The trend in video surveillance is toward an ever-increasing number of (digital) cameras monitoring complex scenarios (e.g. crowds). Currently available video surveillance systems cannot cope with this increased complexity: their detection rates are too low and the systems are not reliable enough. This hinders the broad use of automatic surveillance systems. AUTOVISTA proposes to use modern visual computing technologies to advance the state of the art in video surveillance considerably. To cope with the increasing number of cameras, AUTOVISTA will (1) use novel on-line learning techniques to increase the detection rate and decrease the false alarm rate, while the camera adapts in an unsupervised manner to the surveyed scene, which, besides improving performance, substantially reduces the installation and maintenance effort; and (2) exploit novel visualization and interaction techniques to support the human operator. Furthermore, two complementary visualization modes are proposed; blending smoothly between them allows the operator to maintain coherence. Together, these techniques will enable a single operator to handle a large number of cameras simultaneously. AUTOVISTA will also tackle the problem of increased people densities and highly cluttered scenes in a novel manner. Instead of relying on single-person detection and tracking (which is not feasible in high-density scenarios), methods will be investigated that handle the crowd as a whole. AUTOVISTA will derive spatio-temporal crowd statistics, use them to describe normal crowd behavior, and detect unusual events as deviations from that description.
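To make the last idea concrete, the following minimal sketch shows one possible way to learn "normal" crowd statistics on-line and flag deviations. It is an illustration under stated assumptions, not AUTOVISTA's actual method: the scene is assumed to be divided into a grid, each frame is assumed to yield one motion magnitude per grid cell (for example from optical flow, which is not computed here), and the names `CellStats` and `CrowdModel` as well as the `threshold` and `warmup` values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CellStats:
    """Running mean/variance for one grid cell (Welford's on-line algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def zscore(self, value: float, min_std: float = 1e-3) -> float:
        """Distance of `value` from the learned mean, in standard deviations."""
        if self.count < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.count - 1))
        return abs(value - self.mean) / max(std, min_std)


class CrowdModel:
    """Hypothetical per-cell model of normal crowd motion (assumed design)."""

    def __init__(self, rows: int, cols: int,
                 threshold: float = 4.0, warmup: int = 20) -> None:
        self.cells = [[CellStats() for _ in range(cols)] for _ in range(rows)]
        self.threshold = threshold  # z-score above which a cell is unusual
        self.warmup = warmup        # observations needed before flagging

    def observe(self, motion):
        """Take a rows x cols grid of motion magnitudes for one frame.

        Returns the (row, col) cells that deviate from the learned statistics.
        Cells judged normal update the model; flagged cells do not, so that
        unusual events are not absorbed into the description of normality.
        """
        unusual = []
        for r, row in enumerate(motion):
            for c, value in enumerate(row):
                cell = self.cells[r][c]
                if cell.count >= self.warmup and cell.zscore(value) > self.threshold:
                    unusual.append((r, c))
                else:
                    cell.update(value)
        return unusual


if __name__ == "__main__":
    model = CrowdModel(rows=2, cols=2)
    for t in range(30):  # unsupervised adaptation to the scene
        v = 1.0 if t % 2 == 0 else 1.2
        model.observe([[v, v], [v, v]])
    print(model.observe([[1.1, 1.0], [9.0, 1.2]]))
```

The model needs no labelled data and no per-person tracking: it adapts to whatever motion level each region of the scene usually shows, which is the sense in which such a scheme treats the crowd as a whole. A real system would likely use richer statistics (direction, density, temporal patterns) than the single scalar assumed here.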