python find index of the first duplicate within a list

I want to iterate through a list to find the index at which the list's first item is matched again. My result should print mylist[0:first_match].

Here is what I mean:

.APT 5B APT 5B .
.BUSINESS JOEY BUSINESS.
. 1ST FL .
. NATE JR SAM .
. JOE 7 .
. .
.2ND FLR TOM 2ND FLR .
.A1 2FL APT 71E .
.APT E205 APT 1R .
. CONSTRUCTION .
.APT 640 APT 545.
.PART1 SYNC PART2 .
. NATE JR SAM .

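The problem I'm having is that even after the first match is found, the program keeps adding items to the dictionary, so it appends data I would like to ignore.

Here is what I have: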

dictt = {}
with open(path + 'sample33.txt', 'rb') as txtin:
    for line in txtin:
        part2 = line[1:29].split()
        uniq = []
        print '%r' % part2

        for key in part2:
            if key not in dictt:
                dictt[key] = key
                uniq.append(key)
        dictt = {}
        print ' '.join(uniq)

Result:

['APT', '5B', 'APT', '5B']
APT 5B
['BUSINESS', 'JOEY', 'BUSINESS']
BUSINESS JOEY
['1ST', 'FL']
1ST FL
['NATE', 'JR', 'SAM']
NATE JR SAM
['JOE', '7']
JOE 7
[]

['2ND', 'FLR', 'TOM', '2ND', 'FLR']
2ND FLR TOM
['A1', '2FL', 'APT', '71E']
A1 2FL APT 71E
['APT', 'E205', 'APT', '1R']
APT E205 1R # Would like to stop adding items after first 'APT' match
['CONSTRUCTION']
CONSTRUCTION
['APT', '640', 'APT', '545']
APT 640 545 # same here...
['PART1', 'SYNC', 'PART2']
PART1 SYNC PART2
['NATE', 'JR', 'SAM']
NATE JR SAM
[Finished in 0.1s]

I hope I have explained the problem clearly and that someone can help me fine-tune this.

Thank you

Edit 1: Here is an example of what I want to print:

listt: ['APT', '640', 'APT', '1', '2', '3']

A match on 'APT' is found, so:

print: APT 640

ignoring ...'APT', '1', '2', '3'].

Here you go:

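>>> import re
>>> f = open('your_file.txt')
>>> for x in f:
...     line = re.findall(r'\w+', x.strip())
...     print(line)
...     try:
...         # Keep everything before the first repeat of the first word.
...         print(" ".join(line[:line[1:].index(line[0]) + 1]))
...     except (ValueError, IndexError):
...         # No repeat (ValueError) or empty line (IndexError): print it unchanged.
...         print(" ".join(line))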

Output:

['APT', '5B', 'APT', '5B']
APT 5B
['BUSINESS', 'JOEY', 'BUSINESS']
BUSINESS JOEY
['1ST', 'FL']
1ST FL
['NATE', 'JR', 'SAM']
NATE JR SAM
['JOE', '7']
JOE 7
[]

['2ND', 'FLR', 'TOM', '2ND', 'FLR']
2ND FLR TOM
['A1', '2FL', 'APT', '71E']
A1 2FL APT 71E
['APT', 'E205', 'APT', '1R']
APT E205 # not printing after match
['CONSTRUCTION']
CONSTRUCTION
['APT', '640', 'APT', '545']
APT 640 # not printing after match
['PART1', 'SYNC', 'PART2']
PART1 SYNC PART2
['NATE', 'JR', 'SAM']
NATE JR SAM
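
The slice-and-index idiom in the loop above can also be written as a small helper. This is only a sketch: the name truncate_at_first_repeat is hypothetical, and it assumes, as in the sample data, that the "match" is always a repeat of the first word on the line. It passes a start position to list.index so the first word does not match itself.

```python
def truncate_at_first_repeat(words):
    """Return the words before the first repeat of words[0]."""
    if not words:
        return []
    try:
        # Start searching at position 1 so words[0] does not match itself.
        cut = words.index(words[0], 1)
    except ValueError:
        # The first word never repeats, so keep the whole list.
        return list(words)
    return words[:cut]


samples = [
    ['APT', '5B', 'APT', '5B'],
    ['2ND', 'FLR', 'TOM', '2ND', 'FLR'],
    ['APT', 'E205', 'APT', '1R'],
    ['NATE', 'JR', 'SAM'],
    [],
]
for words in samples:
    print(' '.join(truncate_at_first_repeat(words)))
```

For the five sample lists this prints APT 5B, 2ND FLR TOM, APT E205, NATE JR SAM, and an empty line, in line with the output shown above.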

Related discussion

I'm not sure I fully understand what you need, but this might be useful.

import os

def read_text(name_file, string):
    index_found = [0, 0]
    result = [0, 0]

    # Flatten the file into one list of whitespace-separated words.
    with open(name_file) as f:
        read_temp = [word for line in f for word in line.split()]

    # Record the position of a word containing `string` and of the word after it.
    for s in read_temp:
        if string in str(s):
            index_str = read_temp.index(s)
            index_found[0] = index_str
            index_found[1] = index_str + 1

    result[0] = read_temp[index_found[0]]
    result[1] = read_temp[index_found[1]]

    return result

os.chdir('Path to your .txt')

result_list = read_text("your_file.txt", "APT")  # "APT" or whatever string you need to find.

print(result_list)

Output:

['APT', '5B']
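
Note that read_text keeps scanning after a hit, and it can raise IndexError when the word it settles on is the last one in the file. A possible variant is sketched below, assuming the goal is the first word containing the search string plus the word that follows it. The helper name first_match_with_next is hypothetical, and it takes a list of words rather than a file name to keep the example self-contained.

```python
def first_match_with_next(words, needle):
    """Return the first word containing `needle` and the word after it."""
    for i, word in enumerate(words):
        if needle in word:
            # Slicing never raises, so a match on the last word
            # simply yields a one-element list.
            return words[i:i + 2]
    # No word contains `needle`.
    return []


print(first_match_with_next(['APT', '5B', 'APT', '5B'], 'APT'))
print(first_match_with_next(['1ST', 'FL'], 'APT'))
```

Returning from inside the loop stops the scan at the first hit, and using enumerate avoids a second pass through the list with index.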

If what you are after is removing duplicate entries from your list, then "set" comes to the rescue.

uniqlist = list(set(dupelist))
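
One caveat: a set does not keep the original order of the words, which matters when the result is printed as an address line. A minimal sketch of an order-preserving alternative, assuming Python 3.7 or later (where dict keeps insertion order), is shown below. The sample list is taken from the question.

```python
dupelist = ['APT', 'E205', 'APT', '1R']

# dict keys are unique and, from Python 3.7 on, keep insertion order,
# so this drops later repeats while leaving the first occurrences in place.
uniqlist = list(dict.fromkeys(dupelist))

print(' '.join(uniqlist))  # APT E205 1R
```

Like the set version, this only removes repeated words. It does not stop at the first repeat, so the trailing 1R is still kept.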

I should also mention another post that covers removing duplicates from a list:

Python unique list using set