######### Examples ######### The examples in this page will guide you through the functionality of the xgeo. Firstly, let's import the necessary libraries and open a data >>> import xgeo # Needs to be imported to use geo extension >>> import xarray as xr >>> ds = xr.open_dataset("data.nc") The code-blocks in the rest of the examples will start after the code-block presented above unless and otherwise mentioned. Geotransform ============ The geotransform of the dataset is given by the `transform` attribute. It can be accessed as >>> ds.geo.transform (0.022222222222183063, 0, -179.99999999999997, 0, -0.022222222222239907, 90.00000000000001) User can also assign different geotransform. In such a case, the coordinates of the dataset will be recalculated to comply with the changed transform. The transform can be set as: >>> ds.geo.transform = (0.0111, 0, -180, 0, -0.0111, 90) Projection / Coordinate Reference System (CRS) ============================================== The projection/CRS of the Dataset is given by the `projection` attribute. XGeo converts and stores the crs system of the dataset into the proj4 string. The CRS can be accessed as >>> ds.geo.projection '+init=epsg:4326' User can also assign different crs system. The assignment can be done in multiple format. User can provide CRS in WKT, EPSG or PROJ4 system. .. note:: The assignment of new CRS system doesn't reproject to it. Main purpose of this assignment is to provide CRS to dataset, in case of missing CRS system in dataset. The CRS can be assigned as: >>> ds.geo.projection = 4326 Origin of Dataset ================= The origin of the Dataset is given in human readable format by `origin` attribute. The origin can be any one of `top_left`, `top_right`, `bottom_left`, `bottom_right`. The origin can be accessed as: >>> ds.geo.origin 'top_left' User can also assign different origin to the Dataset. In such a case, the data and attributes are adjusted accordingly to match with the new orign. The origin can be changed as: >>> ds.geo.origin = "bottom_right" Reproject data ================= All the raster data (DataArrays) in the dataset can be reprojected to the new projection system by simply calling the reproject function. >>> dsout = ds.geo.reproject(target_crs=3857) The result of the reprojection can be seen in two images below. |data_4326| >> |data_3857| .. |data_4326| image:: _static/data_4326.png :width: 45% .. |data_3857| image:: _static/data_3857.png :width: 45% Subset Data =========== Xgeo provides two method to subset data. One method provides a mechanism to subset data with vector file while other method allow user to slice the dataset using indices or bounds. The method providing vector file based subsetting is called `subset` while the other is called `slice_dataset`. >>> dsout = ds.geo.subset(vector_file='vector.shp') |full_data| >> |clipped_data| .. |full_data| image:: _static/data_togo.png :width: 45% .. |clipped_data| image:: _static/data_togo_clipped.png :width: 45% In the example above, the size of both input and output dataset is same. However, if user want the output dataset to fit the total bound of the vectors, it can be achieved through: >>> dsout = ds.geo.subset(vector_file='vector.shp',crop=True) |clipped_crop_data| .. |clipped_crop_data| image:: _static/data_togo_clipped_crop.png :width: 45% Generate Statistics =================== The general statistics min, max, mean and standard deviations for each band and each dataset can be calculated as follow: >>> ds.geo.stats() data_mean data_std data_min data_max band time 1 0 508.532965 573.045988 1 17841 2 0 826.767885 529.762916 10 16856 3 0 776.372960 622.791312 23 16241 4 0 1233.895797 472.069397 129 12374 5 0 2107.471764 492.178186 140 11863 6 0 2343.641019 553.738875 148 12101 7 0 2287.690683 620.665450 125 15630 8 0 2534.175579 596.514672 87 12540 9 0 2040.396011 737.076977 148 14817 10 0 1480.038654 1183.614634 100 15092 The function returns a pandas dataframe with the statics to provide user with more flexibility to manipulate the output of the statistics. Generate Zonal Statistics ========================= The zonal statistics min, max, mean and standard deviations for each band and each dataset can be calculated as follows: >>> ds.geo.zonal_stats(vector_file='vector.shp', value_name="class") data class time band stat 1 0 1 mean 394.727040 std 536.226651 min 1.000000 max 11437.000000 2 0 1 mean 845.517894 std 874.189620 min 1.000000 max 10162.000000 3 0 1 mean 250.684041 std 114.707457 min 140.000000 max 1166.000000 1 0 2 mean 735.645520 std 512.267703 min 10.000000 max 12409.000000 2 0 2 mean 1148.695677 std 799.273444 min 121.000000 max 8882.000000 3 0 2 mean 642.283655 std 111.673970 min 474.000000 max 1488.000000 1 0 3 mean 668.089339 std 725.145967 min 23.000000 max 12289.000000 2 0 3 mean 1166.711904 std 927.510453 ... 8 min 387.000000 max 9246.000000 3 0 8 mean 3075.893308 std 259.402703 min 1622.000000 max 3950.000000 1 0 9 mean 1903.334876 std 903.854786 min 180.000000 max 12004.000000 2 0 9 mean 2457.078426 std 1509.694257 min 247.000000 max 14817.000000 3 0 9 mean 1946.978378 std 156.187383 min 1067.000000 max 2661.000000 1 0 10 mean 1197.950185 std 1093.367547 min 145.000000 max 13230.000000 2 0 10 mean 2227.742274 std 2436.064617 min 182.000000 max 15088.000000 3 0 10 mean 997.758945 std 126.103658 min 529.000000 max 1552.000000 [120 rows x 1 columns] The column names are generated in convention `__`. If `value_name` isn't provided, the method takes the id of each polygon as the value_name. In such a case, the statistics will be calculated for each polygon. Sample Pixels ============= >>> ds.geo.sample(vector_file='vector.shp', value_name='class') data class x y time band 1.0 261009.452737 9.850486e+06 0.0 1.0 183.0 9.850476e+06 0.0 1.0 195.0 261019.451371 9.850496e+06 0.0 1.0 214.0 9.850486e+06 0.0 1.0 211.0 9.850476e+06 0.0 1.0 177.0 9.850466e+06 0.0 1.0 195.0 9.850456e+06 0.0 1.0 185.0 9.850446e+06 0.0 1.0 193.0 261029.450005 9.850506e+06 0.0 1.0 197.0 9.850496e+06 0.0 1.0 199.0 9.850486e+06 0.0 1.0 231.0 9.850476e+06 0.0 1.0 195.0 9.850466e+06 0.0 1.0 205.0 9.850456e+06 0.0 1.0 205.0 9.850446e+06 0.0 1.0 217.0 9.850436e+06 0.0 1.0 226.0 9.850426e+06 0.0 1.0 238.0 261039.448639 9.850526e+06 0.0 1.0 222.0 9.850516e+06 0.0 1.0 213.0 9.850506e+06 0.0 1.0 202.0 9.850496e+06 0.0 1.0 189.0 9.850486e+06 0.0 1.0 198.0 9.850476e+06 0.0 1.0 192.0 9.850466e+06 0.0 1.0 164.0 9.850456e+06 0.0 1.0 179.0 9.850446e+06 0.0 1.0 211.0 9.850436e+06 0.0 1.0 220.0 9.850426e+06 0.0 1.0 229.0 9.850416e+06 0.0 1.0 217.0 9.850406e+06 0.0 1.0 201.0 ... 3.0 264908.920002 9.847826e+06 0.0 10.0 840.0 9.847816e+06 0.0 10.0 845.0 9.847806e+06 0.0 10.0 850.0 9.847796e+06 0.0 10.0 854.0 9.847786e+06 0.0 10.0 855.0 9.847776e+06 0.0 10.0 850.0 9.847766e+06 0.0 10.0 844.0 9.847756e+06 0.0 10.0 836.0 9.847746e+06 0.0 10.0 836.0 9.847736e+06 0.0 10.0 846.0 9.847726e+06 0.0 10.0 850.0 9.847716e+06 0.0 10.0 850.0 9.847706e+06 0.0 10.0 854.0 9.847696e+06 0.0 10.0 860.0 9.847686e+06 0.0 10.0 879.0 9.847676e+06 0.0 10.0 911.0 9.847666e+06 0.0 10.0 953.0 264918.918636 9.847786e+06 0.0 10.0 858.0 9.847776e+06 0.0 10.0 853.0 9.847766e+06 0.0 10.0 845.0 9.847756e+06 0.0 10.0 833.0 9.847746e+06 0.0 10.0 831.0 9.847736e+06 0.0 10.0 840.0 9.847726e+06 0.0 10.0 846.0 9.847716e+06 0.0 10.0 850.0 9.847706e+06 0.0 10.0 858.0 9.847696e+06 0.0 10.0 871.0 9.847686e+06 0.0 10.0 888.0 9.847676e+06 0.0 10.0 907.0 9.847666e+06 0.0 10.0 921.0 [761450 rows x 1 columns]