Python: Slice that represents concatenated slices

The indices of a slice slice(start, stop[, step]) can generally be represented by range(start, stop, step) (or by range(*slice(start, stop, step).indices(length)) when the underlying dimension is taken into account).
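As a small illustration of that equivalence, the following sketch expands a slice into the explicit indices it selects. The length of 10 and the step of -3 are arbitrary values chosen for the example.

```python
# Expand a slice into the explicit indices it selects for a given length.
length = 10
s = slice(None, None, -3)

# indices() fills in the None values and clips to the given length.
start, stop, step = s.indices(length)
selected = list(range(start, stop, step))

print((start, stop, step))  # (9, -1, -3)
print(selected)             # [9, 6, 3, 0]

# The range selects exactly what the slice selects.
print(selected == list(range(length))[s])  # True
```

Note that the normalized stop of -1 is only meaningful inside range(). Passed back into a slice, -1 would be read as "the last element", which is why a backwards slice that runs to the beginning needs stop=None.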

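Let's say I have two multidimensional slices, and the second slice can be used as a slice into the result of applying the first slice.

Example: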

import numpy as np
data = np.random.rand(*(100, 100, 100))
a = data[::2, 7, :] # slice 1, a.shape = (50,100)
b = a[1, ::-1] # slice 2, b.shape = (100,)

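I would like to find a general expression for computing the single slice that does the same job. I know the dimensions of the underlying data structure.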

c = data[2, 7, ::-1] # same as b
np.array_equal(b, c) # True

So, to get from [::2, 7, :] and [1, ::-1] to [2, 7, ::-1] in this example, I would need a function like:

def concatenate_slices(shape, outer_slice, inner_slice):
    ...
    return combined_slice

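where outer_slice and inner_slice are both tuples of slices. In the example, shape=(100, 100, 100), outer_slice=(slice(None, None, 2), 7, slice(None, None, None)), and inner_slice=(1, slice(None, None, -1)).

I'm not sure how to do this efficiently.

My objects do something when __getitem__(slice) is called (there are no intermediate views), and I want to do that only once while still being able to take slices of slices.

As an extension (optional), I would like to know what happens if there are Ellipsis objects in the slices. How would I combine them then?

Related discussion

I suspect that you just have to go through the tedium of analyzing each dimension to build up either a new slice or an array of indices. I doubt there is a shortcut.

To illustrate, take your example: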

In [77]: shape=(100,100,100)
In [78]: outer_slice=(slice(None, None, 2), 7, slice(None, None, None))
In [79]: inner_slice=(1, slice(None, None, -1))

The target is (right?):

(2, 7, slice(None,None,-1))

First dimension: make an array over the whole range of indices, and slice it in sequence:

In [80]: idx=np.arange(shape[0])
In [81]: idx[outer_slice[0]][inner_slice[0]]
Out[81]: 2

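Could I deduce that from [::2] and [1]? I would have to reason that it starts at 0, that the shape is large enough to yield a second value, and so on.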
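That deduction can be written down directly. The sketch below maps an integer index, applied after a slice, back to the original axis. The function name int_through_slice is hypothetical, and the code assumes plain Python ints and slice objects.

```python
def int_through_slice(outer, i, length):
    """Map integer index i, applied after slice `outer`, back to the original axis."""
    start, stop, step = outer.indices(length)
    n = len(range(start, stop, step))  # number of elements the slice yields
    if i < 0:
        i += n                         # support negative indices
    if not 0 <= i < n:
        raise IndexError("index out of range for the sliced axis")
    return start + i * step

print(int_through_slice(slice(None, None, 2), 1, 100))  # 2
```

The bounds check covers the "shape is large enough" condition: for a length of 1, [::2] yields only one element and index 1 raises IndexError, just as the two-step indexing would.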

Now the second dimension. That's a scalar, so there's no corresponding inner slice.

In [82]: outer_slice[1]
Out[82]: 7

For the third, let's do the same as with the first, but taking into account the offset between the outer and inner lists:

In [83]: idx=np.arange(shape[2])
In [84]: idx[outer_slice[2]][inner_slice[1]]
Out[84]:
array([99, 98, 97, 96, 95, 94, 93, 92, 91, ....7, 6, 5, 4, 3, 2, 1, 0])

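Or I could deduce that outer_slice[2] does nothing, so I can use inner_slice[1] directly.

Of course, it would be just as easy and efficient to apply the two slice tuples to the actual array.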

X[outer_slice][inner_slice]

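As long as outer_slice produces a view, combining them into a composite slice isn't much of an improvement.

With the shape and the slice tuples there is enough information to build a new tuple. But the required logic appears to be quite involved, requiring in-depth knowledge of slicing and lots of testing.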
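Since testing matters so much here, a brute-force check is a useful companion to any combining logic. The sketch below compares a candidate combined index against the two-step indexing on a test array. The helper name same_selection is hypothetical.

```python
import numpy as np

def same_selection(shape, outer, inner, combined):
    """Return True if data[combined] equals data[outer][inner] on a test array."""
    # Unique values per cell make any mismatch in the selected elements visible.
    data = np.arange(np.prod(shape)).reshape(shape)
    return np.array_equal(data[outer][inner], data[combined])

shape = (100, 100, 100)
outer_slice = (slice(None, None, 2), 7, slice(None, None, None))
inner_slice = (1, slice(None, None, -1))
combined = (2, 7, slice(None, None, -1))

print(same_selection(shape, outer_slice, inner_slice, combined))  # True
```

np.array_equal also returns False when the shapes differ, so a combined index that drops or keeps the wrong axis is caught as well.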

Let's start with the simple case: 1-d arrays. We need to keep track of the start, stop, and step values for the final slice, which we can update like so:

def update_1d(a, b, length):
    a_start, a_stop, a_step = a.indices(length)
    a_length = len(range(a_start, a_stop, a_step))
    if a_length == 0:
        # doesn't matter what b is if data[a] is []
        return a
    b_start, b_stop, b_step = b.indices(a_length)
    b_length = len(range(b_start, b_stop, b_step))
    if b_length == 0:
        # result will be empty, so we can exit early
        return slice(0, 0, 1)
    # convert b's start into a's coordinates, without wrapping around
    start = max(0, a_start + b_start * a_step)
    # steps are multiplicative, which makes things easy
    step = a_step * b_step
    # the stop index is the hard part because it depends on the sign of both steps
    x = a_start + b_stop * a_step
    if step < 0:
        # indexing backwards, so truncate if b's converted stop goes below zero
        stop = x if x >= 0 else None
    elif a_step > 0:
        # both steps are positive, so take the smallest stop index
        stop = min(a_stop, x)
    else:
        # both steps are negative, so take the largest stop index
        stop = max(a_stop, x)
    return slice(start, stop, step)

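Note that this expects a and b to be slices. You can usually convert other forms into slice objects, though. This even includes Ellipsis objects, assuming you know how many dimensions you have.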
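For the Ellipsis case, one possible normalization step is sketched below: it replaces an Ellipsis with as many full slices as needed and pads missing trailing axes. The helper name expand_index is hypothetical, and the sketch assumes the index contains only ints, slices, and at most one Ellipsis (no None and no index arrays).

```python
def expand_index(index, ndim):
    """Expand Ellipsis and pad with full slices so the tuple has ndim entries."""
    if not isinstance(index, tuple):
        index = (index,)
    positions = [k for k, item in enumerate(index) if item is Ellipsis]
    if len(positions) > 1:
        raise IndexError("an index can only have a single Ellipsis")
    if not positions:
        # Missing trailing axes behave like a trailing Ellipsis.
        index = index + (Ellipsis,)
        positions = [len(index) - 1]
    pos = positions[0]
    n_fill = ndim - (len(index) - 1)   # axes the Ellipsis has to cover
    if n_fill < 0:
        raise IndexError("too many indices")
    return index[:pos] + (slice(None),) * n_fill + index[pos + 1:]

print(expand_index((Ellipsis, 3), 3))
# (slice(None, None, None), slice(None, None, None), 3)
```

Integers are left untouched here. Turning an integer i into slice(i, i + 1) keeps the axis instead of dropping it, so the caller would still have to remember which axes an integer removed.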

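To extend this to the multidimensional case, we need to do some bookkeeping to keep track of which original dimension is being sliced. For example, if you have data[::2, 7, :][:, 2:-2], the second dimension of the second slice has to be mapped to the third dimension of the first slice.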
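A minimal sketch of that bookkeeping follows. It is not the update_1d approach above: for the per-axis composition it relies on the fact that Python 3 range objects can themselves be sliced, which sidesteps the start/stop arithmetic. The names range_to_slice and combine_indices are hypothetical, and the sketch assumes both index tuples contain only plain ints and slices (no Ellipsis, None, or index arrays).

```python
import numpy as np

def range_to_slice(r):
    """Convert a range over valid, non-negative indices into an equivalent slice."""
    if len(r) == 0:
        return slice(0, 0, 1)
    # A negative stop would be read as "from the end", so use None instead.
    stop = r.stop if r.stop >= 0 else None
    return slice(r.start, stop, r.step)

def combine_indices(shape, outer, inner):
    """Fold data[outer][inner] into a single index tuple."""
    # Pad outer so that every axis of the data has an entry.
    outer = tuple(outer) + (slice(None),) * (len(shape) - len(outer))
    inner = list(inner)
    combined = []
    for length, o in zip(shape, outer):
        axis = range(length)
        if isinstance(o, int):
            # An outer int removes this axis, so inner never sees it.
            combined.append(axis[o])
            continue
        kept = axis[o]  # indices that survive the outer slice
        i = inner.pop(0) if inner else slice(None)
        if isinstance(i, int):
            combined.append(kept[i])
        else:
            combined.append(range_to_slice(kept[i]))
    if inner:
        raise IndexError("too many indices in inner")
    return tuple(combined)

shape = (100, 100, 100)
outer_slice = (slice(None, None, 2), 7, slice(None, None, None))
inner_slice = (1, slice(None, None, -1))
combined = combine_indices(shape, outer_slice, inner_slice)
print(combined)  # (2, 7, slice(99, None, -1))

data = np.arange(np.prod(shape)).reshape(shape)
print(np.array_equal(data[outer_slice][inner_slice], data[combined]))  # True
```

The result uses slice(99, None, -1) rather than slice(None, None, -1). Both select the same elements for an axis of length 100, but the combined tuple is tied to the shape it was computed for.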