Examples¶
The examples in this page will guide you through the functionality of the xgeo.
Firstly, let’s import the necessary libraries and open a data
>>> import xgeo # Needs to be imported to use geo extension
>>> import xarray as xr
>>> ds = xr.open_dataset("data.nc")
The code-blocks in the rest of the examples will start after the code-block presented above unless and otherwise mentioned.
Geotransform¶
The geotransform of the dataset is given by the transform attribute. It can be accessed as
>>> ds.geo.transform
(0.022222222222183063, 0, -179.99999999999997, 0, -0.022222222222239907, 90.00000000000001)
User can also assign different geotransform. In such a case, the coordinates of the dataset will be recalculated to comply with the changed transform. The transform can be set as:
>>> ds.geo.transform = (0.0111, 0, -180, 0, -0.0111, 90)
Projection / Coordinate Reference System (CRS)¶
The projection/CRS of the Dataset is given by the projection attribute. XGeo converts and stores the crs system of the dataset into the proj4 string. The CRS can be accessed as
>>> ds.geo.projection
'+init=epsg:4326'
User can also assign different crs system. The assignment can be done in multiple format. User can provide CRS in WKT, EPSG or PROJ4 system.
Note
The assignment of new CRS system doesn’t reproject to it. Main purpose of this assignment is to provide CRS to dataset, in case of missing CRS system in dataset.
The CRS can be assigned as:
>>> ds.geo.projection = 4326
Origin of Dataset¶
The origin of the Dataset is given in human readable format by origin attribute. The origin can be any one of top_left, top_right, bottom_left, bottom_right. The origin can be accessed as:
>>> ds.geo.origin
'top_left'
User can also assign different origin to the Dataset. In such a case, the data and attributes are adjusted accordingly to match with the new orign. The origin can be changed as:
>>> ds.geo.origin = "bottom_right"
Reproject data¶
All the raster data (DataArrays) in the dataset can be reprojected to the new projection system by simply calling the reproject function.
>>> dsout = ds.geo.reproject(target_crs=3857)
The result of the reprojection can be seen in two images below.
Subset Data¶
Xgeo provides two method to subset data. One method provides a mechanism to subset data with vector file while other method allow user to slice the dataset using indices or bounds. The method providing vector file based subsetting is called subset while the other is called slice_dataset.
>>> dsout = ds.geo.subset(vector_file='vector.shp')
In the example above, the size of both input and output dataset is same. However, if user want the output dataset to fit the total bound of the vectors, it can be achieved through:
>>> dsout = ds.geo.subset(vector_file='vector.shp',crop=True)
Generate Statistics¶
The general statistics min, max, mean and standard deviations for each band and each dataset can be calculated as follow:
>>> ds.geo.stats()
data_mean data_std data_min data_max
band time
1 0 508.532965 573.045988 1 17841
2 0 826.767885 529.762916 10 16856
3 0 776.372960 622.791312 23 16241
4 0 1233.895797 472.069397 129 12374
5 0 2107.471764 492.178186 140 11863
6 0 2343.641019 553.738875 148 12101
7 0 2287.690683 620.665450 125 15630
8 0 2534.175579 596.514672 87 12540
9 0 2040.396011 737.076977 148 14817
10 0 1480.038654 1183.614634 100 15092
The function returns a pandas dataframe with the statics to provide user with more flexibility to manipulate the output of the statistics.
Generate Zonal Statistics¶
The zonal statistics min, max, mean and standard deviations for each band and each dataset can be calculated as follows:
>>> ds.geo.zonal_stats(vector_file='vector.shp', value_name="class")
data
class time band stat
1 0 1 mean 394.727040
std 536.226651
min 1.000000
max 11437.000000
2 0 1 mean 845.517894
std 874.189620
min 1.000000
max 10162.000000
3 0 1 mean 250.684041
std 114.707457
min 140.000000
max 1166.000000
1 0 2 mean 735.645520
std 512.267703
min 10.000000
max 12409.000000
2 0 2 mean 1148.695677
std 799.273444
min 121.000000
max 8882.000000
3 0 2 mean 642.283655
std 111.673970
min 474.000000
max 1488.000000
1 0 3 mean 668.089339
std 725.145967
min 23.000000
max 12289.000000
2 0 3 mean 1166.711904
std 927.510453
...
8 min 387.000000
max 9246.000000
3 0 8 mean 3075.893308
std 259.402703
min 1622.000000
max 3950.000000
1 0 9 mean 1903.334876
std 903.854786
min 180.000000
max 12004.000000
2 0 9 mean 2457.078426
std 1509.694257
min 247.000000
max 14817.000000
3 0 9 mean 1946.978378
std 156.187383
min 1067.000000
max 2661.000000
1 0 10 mean 1197.950185
std 1093.367547
min 145.000000
max 13230.000000
2 0 10 mean 2227.742274
std 2436.064617
min 182.000000
max 15088.000000
3 0 10 mean 997.758945
std 126.103658
min 529.000000
max 1552.000000
[120 rows x 1 columns]
The column names are generated in convention <vector_value>_<dataset>_<variable>. If value_name isn’t provided, the method takes the id of each polygon as the value_name. In such a case, the statistics will be calculated for each polygon.
Sample Pixels¶
>>> ds.geo.sample(vector_file='vector.shp', value_name='class')
data
class x y time band
1.0 261009.452737 9.850486e+06 0.0 1.0 183.0
9.850476e+06 0.0 1.0 195.0
261019.451371 9.850496e+06 0.0 1.0 214.0
9.850486e+06 0.0 1.0 211.0
9.850476e+06 0.0 1.0 177.0
9.850466e+06 0.0 1.0 195.0
9.850456e+06 0.0 1.0 185.0
9.850446e+06 0.0 1.0 193.0
261029.450005 9.850506e+06 0.0 1.0 197.0
9.850496e+06 0.0 1.0 199.0
9.850486e+06 0.0 1.0 231.0
9.850476e+06 0.0 1.0 195.0
9.850466e+06 0.0 1.0 205.0
9.850456e+06 0.0 1.0 205.0
9.850446e+06 0.0 1.0 217.0
9.850436e+06 0.0 1.0 226.0
9.850426e+06 0.0 1.0 238.0
261039.448639 9.850526e+06 0.0 1.0 222.0
9.850516e+06 0.0 1.0 213.0
9.850506e+06 0.0 1.0 202.0
9.850496e+06 0.0 1.0 189.0
9.850486e+06 0.0 1.0 198.0
9.850476e+06 0.0 1.0 192.0
9.850466e+06 0.0 1.0 164.0
9.850456e+06 0.0 1.0 179.0
9.850446e+06 0.0 1.0 211.0
9.850436e+06 0.0 1.0 220.0
9.850426e+06 0.0 1.0 229.0
9.850416e+06 0.0 1.0 217.0
9.850406e+06 0.0 1.0 201.0
...
3.0 264908.920002 9.847826e+06 0.0 10.0 840.0
9.847816e+06 0.0 10.0 845.0
9.847806e+06 0.0 10.0 850.0
9.847796e+06 0.0 10.0 854.0
9.847786e+06 0.0 10.0 855.0
9.847776e+06 0.0 10.0 850.0
9.847766e+06 0.0 10.0 844.0
9.847756e+06 0.0 10.0 836.0
9.847746e+06 0.0 10.0 836.0
9.847736e+06 0.0 10.0 846.0
9.847726e+06 0.0 10.0 850.0
9.847716e+06 0.0 10.0 850.0
9.847706e+06 0.0 10.0 854.0
9.847696e+06 0.0 10.0 860.0
9.847686e+06 0.0 10.0 879.0
9.847676e+06 0.0 10.0 911.0
9.847666e+06 0.0 10.0 953.0
264918.918636 9.847786e+06 0.0 10.0 858.0
9.847776e+06 0.0 10.0 853.0
9.847766e+06 0.0 10.0 845.0
9.847756e+06 0.0 10.0 833.0
9.847746e+06 0.0 10.0 831.0
9.847736e+06 0.0 10.0 840.0
9.847726e+06 0.0 10.0 846.0
9.847716e+06 0.0 10.0 850.0
9.847706e+06 0.0 10.0 858.0
9.847696e+06 0.0 10.0 871.0
9.847686e+06 0.0 10.0 888.0
9.847676e+06 0.0 10.0 907.0
9.847666e+06 0.0 10.0 921.0
[761450 rows x 1 columns]