Bug Fix + Add Kriging to doc
sunnyqywang committed Jul 6, 2017
1 parent 93227b5 commit 981ddff
Showing 11 changed files with 67 additions and 116 deletions.
12 changes: 8 additions & 4 deletions volume_project/run.py
@@ -91,7 +91,7 @@ def __exit__(self):
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)

'''
pfd = prepare_flow_data()
tStart = datetime.now()
@@ -124,9 +124,13 @@ def __exit__(self):
vol, non = tex.testing_entire_TO()
del tex
logger.info('Finished calculating AADT for Toronto in %s', str(datetime.now()-tStart))

'''
tStart = datetime.now()
spa = spatial_extrapolation()
spa.fill_all()
spa.plot_semivariogram(201200)
spa.plot_semivariogram(201300)
spa.plot_semivariogram(201400)

#spa.fill_all()
del spa
logger.info('Finished filling in AADT for Toronto in %s', str(datetime.now()-tStart))
# logger.info('Finished filling in AADT for Toronto in %s', str(datetime.now()-tStart))
21 changes: 14 additions & 7 deletions volume_project/spatial_extrapolation/README.md
@@ -22,8 +22,8 @@ Output: volume
Covariance matrix is constructed based on the coordinate information of the segments in order to find the spatial correlation of volumes.

## Methodology Evaluation

### Major Arterials
### Regression
#### Major Arterials
-|Linear Regression (proximity only) | Direction Linear Regression | Average of Nearest Neighbours|
-|:-----------------------------------:|:----------------------------:|:------------------------------:|
Scatter plot| ![major_arterials_proximity_regr](img/major_arterials_proximity_regr.png)|![major_arterials_directional_regr](img/major_arterials_directional_regr.png)|![major_arterials_neighbour_avg](img/major_arterials_neighbour_avg.png)|
@@ -32,7 +32,7 @@ Coef. of Det.|0.480|0.542|0.492

![major_arterials_proximity_regr_scores](img/major_arterials_proximity_regr_scores.png)

### Minor Arterials
#### Minor Arterials
-|Linear Regression (proximity only) | Direction Linear Regression| Average of Nearest Neighbours|
-|:-----------------------------------:|:----------------------------:|:------------------------------:|
Scatter plot| ![minor_arterials_proximity_regr](img/minor_arterials_proximity_regr.png)|![minor_arterial_directional_regr](img/minor_arterials_directional_regr.png)|![minor_arterial_neighbour_avg](img/minor_arterials_neighbour_avg.png)|
@@ -41,7 +41,7 @@ Coef. of Det.|0.345|0.461|0.341|

![minor_arterials_proximity_regr_scores](img/minor_arterials_proximity_regr_scores.png)

### Collectors
#### Collectors
-|Linear Regression (proximity only) | Direction Linear Regression| Average of Nearest Neighbours|
-|:-----------------------------------:|:----------------------------:|:------------------------------:|
Scatter plot| ![collectors_proximity_regr](img/collectors_proximity_regr.png)|![collectors_directional_regr](img/collectors_directional_regr.png)|![collectors_neighbour_avg](img/collectors_neighbour_avg.png)|
@@ -50,7 +50,7 @@ Coef. of Det.|0.312|0.268|0.364|

![collectors_proximity_regr_scores](img/collectors_proximity_regr_scores.png)

### Locals
#### Locals
-|Linear Regression (proximity only) | Direction Linear Regression| Average of Nearest Neighbours|
-|:-----------------------------------:|:----------------------------:|:------------------------------:|
Scatter Plot|![locals_proximity_regr](img/locals_proximity_regr.png)|![locals_directional_regr](img/locals_directional_regr.png)|![locals_neighbour_avg](img/locals_neighbour_avg.png)|
@@ -59,9 +59,16 @@ Coef. of Det.|0.230|0.046|0.213|

![locals_proximity_regr_scores](img/locals_proximity_regr_scores.png)

## Implementation
### Kriging
|Road Class|Semivariogram|
|:----------:|:-------------:|
|Major Arterial|![major_arterials_semivariogram](img/major_arterials_semivariogram.png)|
|Minor Arterial|![minor_arterials_semivariogram](img/minor_arterials_semivariogram.png)|
|Collector|![collectors_semivariogram](img/collectors_semivariogram.png)|

Directional linear regression is noticeably superior to the other two methods.
The relationship between distance and volume is weak. The variance does not fit any model well. A Gaussian Process Kriging model was fitted to each road class anyway, and its results are inferior to those of regression. Therefore, kriging is not used in the actual implementation.

## Implementation

|Road Class|Method|
|----------|------|
@@ -3,6 +3,7 @@
WITH segments AS (
SELECT group_number, AVG(volume) AS volume, shape, feature_code
FROM prj_volume.aadt JOIN prj_volume.centreline_groups_geom USING (group_number)
WHERE confidence = 1
GROUP BY group_number, feature_code, shape)

SELECT g1, AVG(neighbourvolume)::int, volume::int
@@ -4,5 +4,5 @@ SELECT ST_X(ST_StartPoint(shape)), ST_Y(ST_StartPoint(shape)), ST_X(ST_EndPoint(
FROM (SELECT group_number, dir_bin, volume, (CASE WHEN dir_binary(ST_Azimuth(ST_StartPoint(shape), ST_EndPoint(shape))) = dir_bin THEN shape ELSE ST_REVERSE(shape) END) AS shape
FROM (SELECT shape, group_number, dir_bin, AVG(volume)::int AS volume
FROM prj_volume.aadt JOIN prj_volume.centreline_groups_geom USING (group_number)
WHERE feature_code = $1
WHERE feature_code = $1 AND confidence = 1
GROUP BY group_number, shape, dir_bin) A) B
@@ -20,6 +20,6 @@ FROM(
JOIN (SELECT group_number, AVG(volume) AS volume FROM prj_volume.aadt GROUP BY group_number) E ON (E.group_number = g2)
WHERE not parallel OR C.dir_bin = D.dir_bin) G

JOIN (SELECT group_number, AVG(volume) AS volume FROM prj_volume.aadt GROUP BY group_number) F ON (F.group_number = g1)
JOIN (SELECT group_number, AVG(volume) AS volume FROM prj_volume.aadt WHERE confidence = 1 GROUP BY group_number) F ON (F.group_number = g1)
WHERE row_number < 3
GROUP BY g1, volume
5 changes: 5 additions & 0 deletions volume_project/spatial_extrapolation/query_semi_variogram.sql
@@ -0,0 +1,5 @@
SELECT (ST_Distance(a1.shape, a2.shape)/50)::int AS dist, sum((a1.volume-a2.volume)^2)/2/count(*) AS semivariance, corr(a1.volume, a2.volume) as correlation, COUNT(*) as num_observations
FROM (prj_volume.aadt JOIN prj_volume.centreline_groups_geom USING (group_number)) a1 JOIN (prj_volume.aadt JOIN prj_volume.centreline_groups_geom USING (group_number)) a2 ON ST_DWithin(a1.shape, a2.shape, 5000)
WHERE a1.confidence = 1 AND a2.confidence = 1 AND a1.feature_code = $1 AND a2.feature_code = $1 AND a1.group_number > a2.group_number
GROUP BY (ST_Distance(a1.shape, a2.shape)/50)::int
ORDER BY (ST_Distance(a1.shape, a2.shape)/50)::int
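The empirical semivariance per distance bin, γ(h) = Σ (zᵢ − zⱼ)² / (2·N(h)) over the N(h) pairs that fall in bin h, can be sketched in plain Python as a cross-check. This is a minimal illustration, not the project's code: the function name, the use of point coordinates in place of segment geometries, and floor-based binning are all assumptions, while the 50 m bin width and 5000 m cutoff mirror the query.

```python
import math
from collections import defaultdict
from itertools import combinations


def empirical_semivariogram(points, bin_width=50.0, max_dist=5000.0):
    """Hypothetical helper: points is a list of (x, y, volume) tuples.

    Returns {bin_index: (semivariance, n_pairs)}, ordered by bin index.
    """
    sq_diff_sum = defaultdict(float)
    n_pairs = defaultdict(int)
    # combinations() visits each unordered pair once, like the
    # a1.group_number > a2.group_number condition in the query.
    for (x1, y1, v1), (x2, y2, v2) in combinations(points, 2):
        dist = math.hypot(x1 - x2, y1 - y2)
        if dist > max_dist:
            continue
        b = int(dist // bin_width)  # assumption: floor to the bin index
        # Square each pairwise difference before summing.
        sq_diff_sum[b] += (v1 - v2) ** 2
        n_pairs[b] += 1
    return {b: (sq_diff_sum[b] / (2 * n_pairs[b]), n_pairs[b])
            for b in sorted(n_pairs)}
```

Each pairwise difference is squared before accumulation; squaring the accumulated sum instead would yield a different quantity that is not a semivariance.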
54 changes: 37 additions & 17 deletions volume_project/spatial_extrapolation/spatial_extrapolation.py
@@ -66,6 +66,10 @@ def average_neighbours_eval(self, road_class, sample_size):
self.scatterplot(y_predict, y_test, road_class, r2_score(y_test, y_predict), 'neighbour_avg', ' Average of 5 Nearest Neighbours')
self.logger.info('Average of Neighbour Volumes Evaluation for road class ' + self.rc_lookup[road_class] + ' done.')

def color_y_axis(self, ax, color):
for t in ax.get_yticklabels():
t.set_color(color)

def get_coord_data(self, road_class):
return self.get_sql_results("query_coord_volume.sql",['from_x','from_y','to_x','to_y','volume'], parameters=[road_class])

@@ -159,7 +163,7 @@ def linear_regression_prox_eval(self, road_class, sample_size=0.3):
fig.savefig('spatial_extrapolation/img/'+self.rc_lookup[road_class].lower().replace(' ', '_') +'_proximity_regr_scores.png')


def scatterplot(self, y_predict, y_test, road_class, coef_det, estimation_method, title_notes):
def scatterplot(self, y_predict, y_test, road_class, coef_det, estimation_method, title_notes=''):

fig, ax = plt.subplots(figsize=[8,6])

@@ -176,24 +180,40 @@ def scatterplot(self, y_predict, y_test, road_class, coef_det, estimation_method
ax.annotate('Root Mean Squared Error: ' + "{:.0f}".format(np.sqrt(mean_squared_error(y_test,y_predict))), xy=((x[1]-x[0])*0.06+x[0], x[1]*0.92), fontsize = 11)
ax.annotate('Coef of Det: ' + "{:.3f}".format(coef_det), xy=((x[1]-x[0])*0.06+x[0], x[1]*0.86), fontsize = 11)
fig.savefig('spatial_extrapolation/img/'+self.rc_lookup[road_class].lower().replace(' ','_') + '_' + estimation_method + '.png')

'''
# Backup functions
def GP_Kriging(self, data):
volume = np.array(data['volume'])
coord = np.array(data[['from_x','from_y','to_x','to_y']])

def plot_semivariogram(self, road_class):
data = self.get_sql_results("query_semi_variogram.sql", columns = ['dist','semivariance','correlation','numobs'], parameters=[road_class])
data['dist'] = data['dist']*50/1000
fig, ax = plt.subplots(figsize=[8,6])
ax1 = ax.twinx()
ax2 = ax.twinx()
ax.plot(data['dist'], data['semivariance'], color='b', label='semivariance')
ax1.plot(data['dist'], data['correlation'], 'r', label='correlation')
ax2.plot(data['dist'], data['numobs'], 'c', label='Num Observations')
ax.set_xlabel('Distance (km)')
h0, l0 = ax.get_legend_handles_labels()
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
ax.legend(h0+h1+h2, l0+l1+l2)
self.color_y_axis(ax,'b')
self.color_y_axis(ax1,'r')
self.color_y_axis(ax2,'c')
ax.set_title(self.rc_lookup[road_class]+' Semivariogram')
fig.savefig('spatial_extrapolation/img/'+self.rc_lookup[road_class].lower().replace(' ', '_') +'_semivariogram.png')

# Back up function
def Kriging(self, road_class):
group = self.get_sql_results("query_coord_volume.sql", columns = ['from_x','from_y','to_x','to_y','volume'], parameters=[road_class])

volume = np.array(group['volume'])
coord = np.array(group[['from_x','from_y','to_x','to_y']])

coord = preprocessing.normalize(coord, axis=0)
x_train, x_test, y_train, y_test = train_test_split(coord, volume, test_size=sample_size/100, random_state=0)
kernel = RationalQuadratic(length_scale=1.0, length_scale_bounds=(1e-1, 10.0)) * RationalQuadratic(length_scale=1.0, length_scale_bounds=(1e-1, 10.0)) * ExpSineSquared(length_scale=1.0, length_scale_bounds=(1e-1, 10.0)) * ExpSineSquared(length_scale=1.0, length_scale_bounds=(1e-1, 10.0))
x_train, x_test, y_train, y_test = train_test_split(coord, volume, test_size=0.3, random_state=0)

kernel = RationalQuadratic()
gp = GaussianProcessRegressor(kernel=kernel)
gp.fit(x_train, y_train)

y_mean = gp.predict(x_test, return_std=False)
plt.scatter(y_mean, y_test)
#lims = [np.min([plt.xlim(), plt.ylim()]), np.max([plt.xlim(), plt.ylim()])]
#plt.plot(lims, lims,'k-')
plt.show()
'''
y_predict = gp.predict(x_test, return_std=False)
self.scatterplot(y_predict, y_test, road_class, 0, 'Kriging', '')
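The call above passes 0 as the coefficient of determination, so the saved plot is annotated with "Coef of Det: 0.000" regardless of how well the model fits. A minimal sketch of the statistic in plain Python follows, in case a real score is wanted. The function name is hypothetical; within the project, the `r2_score` function already used by `average_neighbours_eval` would presumably serve the same purpose.

```python
def coefficient_of_determination(y_true, y_pred):
    """Hypothetical helper: R^2 = 1 - SS_res / SS_tot for two equal-length lists."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

A score of 1 means perfect prediction, 0 means no better than predicting the mean, and negative values mean worse than the mean.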
86 changes: 0 additions & 86 deletions volume_project/spatial_extrapolation/spatial_gp.py

This file was deleted.
