8.3 Integrated Analysis of Multi-Source Data

 

In daily life, we use our sense organs and brain to recognize things, make decisions, and take actions. Our sense organs include the eyes, ears, nose, tongue, and skin. The first three are our remote sensors. These sensors pass sights, sounds, smells, tastes, and tactile sensations to the brain. The brain processes and analyzes the evidence collected by the different sensors and compares it with things in memory that have been recognized before, to see whether the newly detected thing can be recognized (labeled) as one of them. If the recognized thing is a tree in our way, the brain may decide to go around it. In an increasingly competitive society, making optimal decisions requires the best use of all available evidence to arrive at an accurate recognition. We go through thousands of such processes every day: evidence collection, evidence analysis, decision making, action taking. Our senses and brains have limits, however, and tools help us overcome them. For example, our eyes cannot resolve objects that are too far away or too small; telescopes and microscopes make this possible.

 

We cannot see outside the visible region of the spectrum, but detectors sensitive to various non-visible regions can record images that let us see as if our eyes were sensitive to those regions. In spatial data handling, our brains cannot memorize the exact location and spatial extent that a phenomenon occupies, but electromagnetic media can store them. The volume of evidence is so large that our brains can process only a very small part of it, so we need computers to assist us. In this chapter, we examine some of the techniques for computer-assisted handling of spatial evidence, especially the integrated analysis of spatial evidence from multiple sources, such as field surveys, remote sensing, and/or existing maps.

Data integration: integrating spatial data from different sources for a single application. What types of applications are we referring to?

One problem in data integration is incompatibility between spatial data sets in the following aspects:

• data structures

• data types

• spatial resolutions

• levels of generalization

- Data structures: raster vs. vector

Discrepancies in concepts of spatial representation:

                                      Raster                Vector

Basic unit                            cell        ←→        object

Location                              (i, j)                {(x_i, y_i)}

Entity/attribute being represented    Incomplete/broken     Complete

Ease of representing                  Continuous            Discrete
  (more flexible)                     phenomena             phenomena

Level of generalization               Low                   High

Communication                         Hard                  Easy

Storage                               Large amount          Less

_________

Is overlay of digital files a data integration method?

Yes, but a very preliminary one. Given two data sets

A = {(x, y) : z}

B = {(x, y) : u}

their overlay is

A ∪ B = {(x, y) : z, u}

It is more or less a data accumulation.
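The accumulation performed by an overlay can be sketched in a few lines of Python. This is only an illustration: the function name `overlay`, the sample values, and the choice to record `None` where a location appears in only one data set are assumptions, not part of the notes.

```python
from typing import Any, Dict, Tuple

Coord = Tuple[float, float]


def overlay(a: Dict[Coord, Any], b: Dict[Coord, Any]) -> Dict[Coord, Tuple[Any, Any]]:
    """Combine A = {(x, y): z} and B = {(x, y): u} into {(x, y): (z, u)}.

    A location present in only one data set gets None for the missing
    attribute (an assumption made for this sketch).
    """
    combined = {}
    for xy in sorted(set(a) | set(b)):
        combined[xy] = (a.get(xy), b.get(xy))
    return combined


# Hypothetical sample data: elevation (z) and land-cover class (u).
elevation = {(0, 0): 120.0, (0, 1): 125.5, (1, 0): 118.2}
land_cover = {(0, 0): "forest", (0, 1): "water", (1, 1): "urban"}

merged = overlay(elevation, land_cover)
for xy, attrs in merged.items():
    print(xy, attrs)
```

Nothing is analyzed or reconciled here; the attributes are simply stacked at each location, which is why overlay counts only as a preliminary form of integration.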

Five types of models:

PM Point Model: (x, y, z, ...)

GM Grid Model: (i, j, z, ...), Δi, Δj

LM Line Model: ({x, y}, z, ...)

AM Area Model: ({x, y}, z, ...)

CM Cartographic Model: traditional meaning {PM, LM, AM}; now {PM, GM, LM, AM}
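As a rough illustration, the five model types listed above could be held in simple Python data classes. All class and field names are hypothetical, and reading the two grid increments as cell sizes is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

XY = Tuple[float, float]


@dataclass
class PointModel:
    """PM: (x, y, z, ...)."""
    x: float
    y: float
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class GridModel:
    """GM: (i, j, z, ...) with increments di, dj (read here as cell sizes)."""
    di: float
    dj: float
    cells: Dict[Tuple[int, int], Dict[str, Any]] = field(default_factory=dict)


@dataclass
class LineModel:
    """LM: ({x, y}, z, ...), an ordered vertex list plus attributes."""
    vertices: List[XY]
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class AreaModel:
    """AM: ({x, y}, z, ...), a boundary vertex list plus attributes."""
    boundary: List[XY]
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class CartographicModel:
    """CM in the current sense: a collection {PM, GM, LM, AM}."""
    points: List[PointModel] = field(default_factory=list)
    grids: List[GridModel] = field(default_factory=list)
    lines: List[LineModel] = field(default_factory=list)
    areas: List[AreaModel] = field(default_factory=list)


# Hypothetical sample content.
well = PointModel(3.0, 4.0, {"z": 152.0})
dem = GridModel(di=30.0, dj=30.0, cells={(0, 0): {"z": 150.0}})
road = LineModel([(0.0, 0.0), (5.0, 5.0)], {"z": "road"})
lake = AreaModel([(1.0, 1.0), (2.0, 1.0), (2.0, 2.0), (1.0, 1.0)], {"z": "lake"})

cm = CartographicModel([well], [dem], [road], [lake])
print(len(cm.points), len(cm.grids), len(cm.lines), len(cm.areas))
```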

_________

* An important extension is the addition of the third spatial dimension and the temporal dimension.

* Discussion:

Do PM, LM, AM involve scale as their individual components?

No; what they involve is data acquisition error and processing error.

Only CM involves scale.

* Scale, generalization, error and uncertainty are so closely interrelated that they deserve some conceptual clarification.

Models for converting between different data models

• Aggregation

(1) Point → Surface: interpolation

(2) Grid → Larger grid cells: majority rule; composite rule based on statistics

From low generalization level to higher levels

(3) Line → Simplified line

(4) Area → Simplified area, or point
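The majority rule in (2) can be sketched as follows: a fine categorical grid is divided into square blocks, and each block becomes one coarse cell labeled with its most frequent class. The function name, the sample grid, and the tie-breaking rule (the alphabetically smallest label wins) are assumptions made for this illustration.

```python
from collections import Counter
from typing import List, Sequence


def aggregate_majority(grid: Sequence[Sequence[str]], factor: int) -> List[List[str]]:
    """Aggregate a categorical grid by `factor` in each direction using the majority rule."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for i0 in range(0, rows, factor):
        coarse_row = []
        for j0 in range(0, cols, factor):
            # Collect the labels of all fine cells inside this coarse cell.
            block = [grid[i][j]
                     for i in range(i0, i0 + factor)
                     for j in range(j0, j0 + factor)]
            counts = Counter(block)
            top = max(counts.values())
            # Tie-break (assumption): the alphabetically smallest label wins.
            coarse_row.append(min(label for label, n in counts.items() if n == top))
        coarse.append(coarse_row)
    return coarse


# Hypothetical 4 x 4 land-cover grid: F = forest, W = water, U = urban.
fine = [
    ["F", "F", "W", "W"],
    ["F", "U", "W", "W"],
    ["U", "U", "F", "W"],
    ["U", "F", "F", "F"],
]
coarse = aggregate_majority(fine, 2)
print(coarse)  # [['F', 'W'], ['U', 'F']]
```

Minority classes inside each block disappear from the result, which is the sense in which aggregation moves the data to a higher level of generalization.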

_________

 

Comment: why do we need (3)?
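For item (3), the notes do not name a particular simplification method. As one possible illustration, the sketch below uses the widely known Douglas-Peucker algorithm, which drops vertices lying within a tolerance of the chord joining the end points of a line. The function names, the tolerance value, and the sample line are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the straight line through a and b (or to a if a == b)."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def simplify(line: List[Point], tolerance: float) -> List[Point]:
    """Douglas-Peucker simplification of an ordered vertex list."""
    if len(line) < 3:
        return list(line)
    # Find the interior vertex farthest from the chord joining the end points.
    index, dmax = 0, 0.0
    for k in range(1, len(line) - 1):
        d = perpendicular_distance(line[k], line[0], line[-1])
        if d > dmax:
            index, dmax = k, d
    if dmax <= tolerance:
        # Every interior vertex is close to the chord: keep the end points only.
        return [line[0], line[-1]]
    # Otherwise keep the farthest vertex and simplify both halves recursively.
    left = simplify(line[: index + 1], tolerance)
    right = simplify(line[index:], tolerance)
    return left[:-1] + right


# Hypothetical digitized line with one small wiggle and one real bend.
line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
simplified = simplify(line, 0.5)
print(simplified)  # [(0, 0), (2, 0), (3, 2), (4, 0)]
```

The small wiggle at (1, 0.1) is removed while the bend at (3, 2) survives, so the tolerance plays the role of the target generalization level.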

• Disaggregation

Boundaries → Probability surfaces

_________

(1) Mark and Csillag's model (1989)

Homogeneity is broken only at the boundaries.

(2) Goodchild et al. (1992) spatial autoregressive model

$X' = \rho W X + e$

where

$X = \{x_i\}$, $x_i \in \{0, 1\}$ or $\{A, B, \ldots\}$

$X' = \{x'_i\}$, $x'_i \in \mathbb{R}$

$e = \{e_i\}$, each $e_i$ a random number with mean 0 and variance $S^2$

$\rho$ is a spatial dependence factor,

$W$ is an N x N weight matrix of interactions between pixels
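A literal reading of the autoregressive equation above can be sketched in plain Python. Several choices here are assumptions rather than part of the notes: the weight matrix is built from edge-sharing (rook) neighbours and row-normalized, the random errors are drawn from a Gaussian with zero mean, and the function names and sample map are hypothetical.

```python
import random
from typing import List, Sequence


def rook_weights(rows: int, cols: int) -> List[List[float]]:
    """N x N row-normalized weights for a rows x cols grid (N = rows * cols).

    Each pixel interacts equally with its edge-sharing neighbours; this form
    of the weight matrix is an assumption of the sketch.
    """
    n = rows * cols
    w = [[0.0] * n for _ in range(n)]
    for i in range(rows):
        for j in range(cols):
            nbrs = [(i + di, j + dj)
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= i + di < rows and 0 <= j + dj < cols]
            for a, b in nbrs:
                w[i * cols + j][a * cols + b] = 1.0 / len(nbrs)
    return w


def autoregressive_surface(x: Sequence[float], w: List[List[float]], rho: float,
                           sigma: float, rng: random.Random) -> List[float]:
    """Return X' = rho * W X + e, with e_i drawn from a Gaussian (0, sigma^2)."""
    return [rho * sum(w_kl * x_l for w_kl, x_l in zip(row, x)) + rng.gauss(0.0, sigma)
            for row in w]


# Hypothetical 3 x 3 binary map in row-major order: class 1 in the upper-left corner.
x = [1, 1, 0,
     1, 1, 0,
     0, 0, 0]
w = rook_weights(3, 3)
x_prime = autoregressive_surface(x, w, rho=0.8, sigma=0.1, rng=random.Random(42))
for r in range(3):
    print([round(v, 3) for v in x_prime[3 * r: 3 * r + 3]])
```

The real-valued output is high inside the class-1 patch and falls off across its boundary, with the noise term controlling how ragged the transition is.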

_________

The questions are: do we need to disaggregate our data, and what uncertainties are involved in the disaggregation process?