Basic example
We’ll search for two months of Sentinel-2 data overlapping our area of interest—in this case, the Santa Fe ski area in New Mexico, USA (Google maps ).
We use stackstac
to create an xarray of all the data. From there, it’s easy to filter out cloudy scenes from the array based on their metadata, then create a median composite for each month.
We use pystac-client to find the relevant STAC (Spatio-Temporal Asset Catalog) items. These basically provide metadata about the relevant scenes, and links to their data.
We’ll use element84’s search endpoint to look for items from the sentinel-s2-l2a-cogs
collection on AWS.
CPU times: user 12.6 ms, sys: 3.74 ms, total: 16.4 ms
Wall time: 462 ms
Use stackstac
to turn those STAC items into a lazy xarray. Using all the defaults, our data will be in its native coordinate reference system, at the finest resolution of all the assets.
CPU times: user 16.8 ms, sys: 1.9 ms, total: 18.7 ms
Wall time: 17.8 ms
<xarray.DataArray 'stackstac-0f471b3bb92fa5774b7eddff84c2ca4d' (time: 13,
band: 17,
y: 10980,
x: 10980)>
dask.array<fetch_raster_window, shape=(13, 17, 10980, 10980), dtype=float64, chunksize=(1, 1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/24)
* time (time) datetime64[ns] 2020-04-01T18:04:04 ......
id (time) <U24 'S2B_13SDV_20200401_0_L2A' ... 'S...
* band (band) <U8 'overview' 'B11' ... 'visual' 'SCL'
* x (x) float64 4e+05 4e+05 ... 5.097e+05 5.098e+05
* y (y) float64 4e+06 4e+06 ... 3.89e+06 3.89e+06
gsd int64 10
... ...
updated (time) <U24 '2020-09-05T06:23:47.836Z' ... '2...
sentinel:utm_zone int64 13
proj:epsg int64 32613
sentinel:valid_cloud_cover bool True
title (band) <U31 'True color image' ... 'Scene Cla...
epsg int64 32613
Attributes:
spec: RasterSpec(epsg=32613, bounds=(399960.0, 3890220.0, 509760.0...
crs: epsg:32613
transform: | 10.00, 0.00, 399960.00|\n| 0.00,-10.00, 4000020.00|\n| 0.0...
resolution: 10.0 dask.array<chunksize=(1, 1, 1024, 1024), meta=np.ndarray>
Array
Chunk
Bytes
198.51 GiB
8.00 MiB
Shape
(13, 17, 10980, 10980)
(1, 1, 1024, 1024)
Count
27183 Tasks
26741 Chunks
Type
float64
numpy.ndarray
13
1
10980
10980
17
Coordinates: (24)
time
(time)
datetime64[ns]
2020-04-01T18:04:04 ... 2020-05-...
array(['2020-04-01T18:04:04.000000000', '2020-04-03T17:54:07.000000000',
'2020-04-06T18:04:04.000000000', '2020-04-08T17:54:08.000000000',
'2020-04-11T18:04:03.000000000', '2020-04-13T17:54:10.000000000',
'2020-04-16T18:04:06.000000000', '2020-04-18T17:54:06.000000000',
'2020-04-21T18:04:01.000000000', '2020-04-23T17:54:12.000000000',
'2020-04-26T18:04:09.000000000', '2020-04-28T17:54:06.000000000',
'2020-05-01T18:04:03.000000000'], dtype='datetime64[ns]') id
(time)
<U24
'S2B_13SDV_20200401_0_L2A' ... '...
array(['S2B_13SDV_20200401_0_L2A', 'S2A_13SDV_20200403_0_L2A',
'S2A_13SDV_20200406_0_L2A', 'S2B_13SDV_20200408_0_L2A',
'S2B_13SDV_20200411_0_L2A', 'S2A_13SDV_20200413_0_L2A',
'S2A_13SDV_20200416_0_L2A', 'S2B_13SDV_20200418_0_L2A',
'S2B_13SDV_20200421_0_L2A', 'S2A_13SDV_20200423_0_L2A',
'S2A_13SDV_20200426_0_L2A', 'S2B_13SDV_20200428_0_L2A',
'S2B_13SDV_20200501_0_L2A'], dtype='<U24') band
(band)
<U8
'overview' 'B11' ... 'visual' 'SCL'
array(['overview', 'B11', 'B01', 'B12', 'B02', 'B03', 'B04', 'AOT', 'B05',
'B06', 'B07', 'B08', 'B8A', 'B09', 'WVP', 'visual', 'SCL'], dtype='<U8') x
(x)
float64
4e+05 4e+05 ... 5.097e+05 5.098e+05
array([399960., 399970., 399980., ..., 509730., 509740., 509750.]) y
(y)
float64
4e+06 4e+06 ... 3.89e+06 3.89e+06
array([4000020., 4000010., 4000000., ..., 3890250., 3890240., 3890230.]) gsd
()
int64
10
sentinel:product_id
(time)
<U60
'S2B_MSIL2A_20200401T174909_N021...
array(['S2B_MSIL2A_20200401T174909_N0214_R141_T13SDV_20200401T220155',
'S2A_MSIL2A_20200403T173901_N0214_R098_T13SDV_20200403T220105',
'S2A_MSIL2A_20200406T174901_N0214_R141_T13SDV_20200406T221027',
'S2B_MSIL2A_20200408T173859_N0214_R098_T13SDV_20200408T215856',
'S2B_MSIL2A_20200411T174909_N0214_R141_T13SDV_20200411T220443',
'S2A_MSIL2A_20200413T173901_N0214_R098_T13SDV_20200413T235616',
'S2A_MSIL2A_20200416T174911_N0214_R141_T13SDV_20200416T221101',
'S2B_MSIL2A_20200418T173859_N0214_R098_T13SDV_20200418T214953',
'S2B_MSIL2A_20200421T174859_N0214_R141_T13SDV_20200421T214716',
'S2A_MSIL2A_20200423T173911_N0214_R098_T13SDV_20200423T234141',
'S2A_MSIL2A_20200426T174911_N0214_R141_T13SDV_20200426T221124',
'S2B_MSIL2A_20200428T173859_N0214_R098_T13SDV_20200428T214309',
'S2B_MSIL2A_20200501T174909_N0214_R141_T13SDV_20200501T220131'],
dtype='<U60') instruments
()
<U3
'msi'
array('msi', dtype='<U3') created
(time)
<U24
'2020-09-05T06:23:47.836Z' ... '...
array(['2020-09-05T06:23:47.836Z', '2020-09-23T15:43:10.502Z',
'2020-09-05T11:18:48.722Z', '2020-08-31T15:23:56.855Z',
'2020-09-23T16:16:43.693Z', '2020-09-24T06:31:45.567Z',
'2020-09-24T05:31:29.634Z', '2020-09-19T07:13:25.290Z',
'2020-09-05T11:20:20.745Z', '2020-08-23T09:50:31.532Z',
'2020-09-05T13:56:08.881Z', '2020-08-31T13:13:14.956Z',
'2020-09-24T07:38:40.841Z'], dtype='<U24') sentinel:data_coverage
(time)
float64
33.85 100.0 33.9 ... 100.0 34.29
array([ 33.85, 100. , 33.9 , 100. , 33.87, 100. , 33.37, 100. ,
34.09, 100. , 32.84, 100. , 34.29]) data_coverage
(time)
object
33.85 100 33.9 ... 32.84 100 34.29
array([33.85, 100, 33.9, 100, 33.87, 100, 33.37, 100, 34.09, None, 32.84,
100, 34.29], dtype=object) sentinel:sequence
()
<U1
'0'
sentinel:grid_square
()
<U2
'DV'
eo:cloud_cover
(time)
float64
29.24 1.16 27.26 ... 87.33 5.41
array([ 29.24, 1.16, 27.26, 1.15, 9.37, 100. , 4.53, 35. ,
14.97, 31.78, 99.42, 87.33, 5.41]) platform
(time)
<U11
'sentinel-2b' ... 'sentinel-2b'
array(['sentinel-2b', 'sentinel-2a', 'sentinel-2a', 'sentinel-2b',
'sentinel-2b', 'sentinel-2a', 'sentinel-2a', 'sentinel-2b',
'sentinel-2b', 'sentinel-2a', 'sentinel-2a', 'sentinel-2b',
'sentinel-2b'], dtype='<U11') constellation
()
<U10
'sentinel-2'
array('sentinel-2', dtype='<U10') sentinel:latitude_band
()
<U1
'S'
view:off_nadir
()
int64
0
updated
(time)
<U24
'2020-09-05T06:23:47.836Z' ... '...
array(['2020-09-05T06:23:47.836Z', '2020-09-23T15:43:10.502Z',
'2020-09-05T11:18:48.722Z', '2020-08-31T15:23:56.855Z',
'2020-09-23T16:16:43.693Z', '2020-09-24T06:31:45.567Z',
'2020-09-24T05:31:29.634Z', '2020-09-19T07:13:25.290Z',
'2020-09-05T11:20:20.745Z', '2020-08-23T09:50:31.532Z',
'2020-09-05T13:56:08.881Z', '2020-08-31T13:13:14.956Z',
'2020-09-24T07:38:40.841Z'], dtype='<U24') sentinel:utm_zone
()
int64
13
proj:epsg
()
int64
32613
sentinel:valid_cloud_cover
()
bool
True
title
(band)
<U31
'True color image' ... 'Scene Cl...
array(['True color image', 'Band 11 (swir16)', 'Band 1 (coastal)',
'Band 12 (swir22)', 'Band 2 (blue)', 'Band 3 (green)',
'Band 4 (red)', 'Aerosol Optical Thickness (AOT)', 'Band 5',
'Band 6', 'Band 7', 'Band 8 (nir)', 'Band 8A', 'Band 9',
'Water Vapour (WVP)', 'True color image',
'Scene Classification Map (SCL)'], dtype='<U31') epsg
()
int64
32613
Attributes: (4)
spec : RasterSpec(epsg=32613, bounds=(399960.0, 3890220.0, 509760.0, 4000020.0), resolutions_xy=(10.0, 10.0)) crs : epsg:32613 transform : | 10.00, 0.00, 399960.00|
| 0.00,-10.00, 4000020.00|
| 0.00, 0.00, 1.00| resolution : 10.0
Well, that’s really all there is to it. Now you have an xarray DataArray, and you can do whatever you like to it!
Here, we’ll filter out scenes with >20% cloud coverage (according to the eo:cloud_cover
field set by the data provider). Then, pick the bands corresponding to red, green, and blue, and use xarray’s resample
to create 1-month median composites.
<xarray.DataArray 'stackstac-0f471b3bb92fa5774b7eddff84c2ca4d' (time: 2,
band: 3,
y: 10980,
x: 10980)>
dask.array<stack, shape=(2, 3, 10980, 10980), dtype=float64, chunksize=(1, 1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/16)
* time (time) datetime64[ns] 2020-04-01 2020-05-01
* band (band) <U8 'B04' 'B03' 'B02'
* x (x) float64 4e+05 4e+05 ... 5.097e+05 5.098e+05
* y (y) float64 4e+06 4e+06 ... 3.89e+06 3.89e+06
gsd int64 10
instruments <U3 'msi'
... ...
view:off_nadir int64 0
sentinel:utm_zone int64 13
proj:epsg int64 32613
sentinel:valid_cloud_cover bool True
title (band) <U31 'Band 4 (red)' ... 'Band 2 (blue)'
epsg int64 32613
Attributes:
spec: RasterSpec(epsg=32613, bounds=(399960.0, 3890220.0, 509760.0...
crs: epsg:32613
transform: | 10.00, 0.00, 399960.00|\n| 0.00,-10.00, 4000020.00|\n| 0.0...
resolution: 10.0 dask.array<chunksize=(1, 1, 1024, 1024), meta=np.ndarray>
Array
Chunk
Bytes
5.39 GiB
8.00 MiB
Shape
(2, 3, 10980, 10980)
(1, 1, 1024, 1024)
Count
45696 Tasks
726 Chunks
Type
float64
numpy.ndarray
2
1
10980
10980
3
Coordinates: (16)
time
(time)
datetime64[ns]
2020-04-01 2020-05-01
array(['2020-04-01T00:00:00.000000000', '2020-05-01T00:00:00.000000000'],
dtype='datetime64[ns]') band
(band)
<U8
'B04' 'B03' 'B02'
array(['B04', 'B03', 'B02'], dtype='<U8') x
(x)
float64
4e+05 4e+05 ... 5.097e+05 5.098e+05
array([399960., 399970., 399980., ..., 509730., 509740., 509750.]) y
(y)
float64
4e+06 4e+06 ... 3.89e+06 3.89e+06
array([4000020., 4000010., 4000000., ..., 3890250., 3890240., 3890230.]) gsd
()
int64
10
instruments
()
<U3
'msi'
array('msi', dtype='<U3') sentinel:sequence
()
<U1
'0'
sentinel:grid_square
()
<U2
'DV'
constellation
()
<U10
'sentinel-2'
array('sentinel-2', dtype='<U10') sentinel:latitude_band
()
<U1
'S'
view:off_nadir
()
int64
0
sentinel:utm_zone
()
int64
13
proj:epsg
()
int64
32613
sentinel:valid_cloud_cover
()
bool
True
title
(band)
<U31
'Band 4 (red)' ... 'Band 2 (blue)'
array(['Band 4 (red)', 'Band 3 (green)', 'Band 2 (blue)'], dtype='<U31') epsg
()
int64
32613
Attributes: (4)
spec : RasterSpec(epsg=32613, bounds=(399960.0, 3890220.0, 509760.0, 4000020.0), resolutions_xy=(10.0, 10.0)) crs : epsg:32613 transform : | 10.00, 0.00, 399960.00|
| 0.00,-10.00, 4000020.00|
| 0.00, 0.00, 1.00| resolution : 10.0
So we don’t pull all ~200 GB of data down to our local machine, let’s slice out a little region around our area of interest.
We convert our lat-lon point to the data’s UTM coordinate reference system, then use that to slice the x
and y
dimensions, which are indexed by their UTM coordinates.
<xarray.DataArray 'stackstac-0f471b3bb92fa5774b7eddff84c2ca4d' (time: 2,
band: 3,
y: 400, x: 400)>
dask.array<getitem, shape=(2, 3, 400, 400), dtype=float64, chunksize=(1, 1, 387, 316), chunktype=numpy.ndarray>
Coordinates: (12/16)
* time (time) datetime64[ns] 2020-04-01 2020-05-01
* band (band) <U8 'B04' 'B03' 'B02'
* x (x) float64 4.275e+05 4.275e+05 ... 4.315e+05
* y (y) float64 3.963e+06 3.963e+06 ... 3.959e+06
gsd int64 10
instruments <U3 'msi'
... ...
view:off_nadir int64 0
sentinel:utm_zone int64 13
proj:epsg int64 32613
sentinel:valid_cloud_cover bool True
title (band) <U31 'Band 4 (red)' ... 'Band 2 (blue)'
epsg int64 32613
Attributes:
spec: RasterSpec(epsg=32613, bounds=(399960.0, 3890220.0, 509760.0...
crs: epsg:32613
transform: | 10.00, 0.00, 399960.00|\n| 0.00,-10.00, 4000020.00|\n| 0.0...
resolution: 10.0 dask.array<chunksize=(1, 1, 387, 316), meta=np.ndarray>
Array
Chunk
Bytes
7.32 MiB
0.93 MiB
Shape
(2, 3, 400, 400)
(1, 1, 387, 316)
Count
45720 Tasks
24 Chunks
Type
float64
numpy.ndarray
2
1
400
400
3
Coordinates: (16)
time
(time)
datetime64[ns]
2020-04-01 2020-05-01
array(['2020-04-01T00:00:00.000000000', '2020-05-01T00:00:00.000000000'],
dtype='datetime64[ns]') band
(band)
<U8
'B04' 'B03' 'B02'
array(['B04', 'B03', 'B02'], dtype='<U8') x
(x)
float64
4.275e+05 4.275e+05 ... 4.315e+05
array([427520., 427530., 427540., ..., 431490., 431500., 431510.]) y
(y)
float64
3.963e+06 3.963e+06 ... 3.959e+06
array([3962930., 3962920., 3962910., ..., 3958960., 3958950., 3958940.]) gsd
()
int64
10
instruments
()
<U3
'msi'
array('msi', dtype='<U3') sentinel:sequence
()
<U1
'0'
sentinel:grid_square
()
<U2
'DV'
constellation
()
<U10
'sentinel-2'
array('sentinel-2', dtype='<U10') sentinel:latitude_band
()
<U1
'S'
view:off_nadir
()
int64
0
sentinel:utm_zone
()
int64
13
proj:epsg
()
int64
32613
sentinel:valid_cloud_cover
()
bool
True
title
(band)
<U31
'Band 4 (red)' ... 'Band 2 (blue)'
array(['Band 4 (red)', 'Band 3 (green)', 'Band 2 (blue)'], dtype='<U31') epsg
()
int64
32613
Attributes: (4)
spec : RasterSpec(epsg=32613, bounds=(399960.0, 3890220.0, 509760.0, 4000020.0), resolutions_xy=(10.0, 10.0)) crs : epsg:32613 transform : | 10.00, 0.00, 399960.00|
| 0.00,-10.00, 4000020.00|
| 0.00, 0.00, 1.00| resolution : 10.0
[########################################] | 100% Completed | 7.13 s