"True" list deduplication: removing all duplicate elements

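I recently ran into a problem: removing duplicates from a list while iterating over it did not remove all of them. Now that it is solved, here is how I worked through it.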

The actual scenario

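The records of a certain table (considering only these fields) looked, at a glance, mostly duplicated; after full deduplication only three should remain. But after my script ran the method I had written, many records were still left, and I was confused and had no idea why.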

The code I actually used:

        for i in abc:
            if abc.count(i) != 1:
                abc.remove(i)

To avoid leaking company data, I will use the list abc for illustration:

    def test_012(self):
        abc = [('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               66666, 66666, 66666, 66666,
               7878, 7878]
        print(len(abc), 'initial length')

        for i in abc:
            if abc.count(i) != 1:
                abc.remove(i)
        print(len(abc), 'length after processing')

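This was my first version, and it looked fine to me. Yet when I ran it, the list went from 15 elements to 7.
That is plainly wrong: with full deduplication there should be exactly 4 elements, so why are there 7?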
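Rather than judging by eye, the expected number of distinct elements can be checked directly. A minimal sketch, using a hypothetical list named sample that has the same multiplicities as abc (5, 4, 4 and 2 copies) but shorter stand-in values:

```python
from collections import Counter

# Hypothetical stand-in for abc: same multiplicities, shorter values.
sample = ['a'] * 5 + ['b'] * 4 + [66666] * 4 + [7878] * 2

counts = Counter(sample)      # maps each distinct element to its count
print(len(sample))            # 15 elements in total
print(len(counts))            # 4 distinct elements
print(counts.most_common())   # each element with how often it occurs
```

If the deduplicated list is longer than len(counts), some duplicates were missed.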

The fix

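The idea above was: iterate over list abc, and if the current element occurs in the list a number of times other than 1, remove it. But each removal changes the list itself (both its length and its elements), and the loop keeps running on the changed list. The for loop walks the list by position, so after a removal every later element shifts one position to the left, and the next step of the loop skips the element that has just moved into the current position. The loop also ends early, because the list has become shorter than it was when the loop started.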
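The skipping can be made visible by recording which elements the loop actually visits. This is only an illustrative sketch; the function name remove_while_iterating and the list data are made up for the example:

```python
def remove_while_iterating(items):
    """Repeat the first-version loop and record every element it visits."""
    visited = []
    for x in items:
        visited.append(x)
        if items.count(x) != 1:
            items.remove(x)  # shrinks the list the loop is still walking
    return visited


data = ['a', 'a', 'a', 'a']
visited = remove_while_iterating(data)
print(len(visited), 'elements visited')  # 2 elements visited
print(data)                              # ['a', 'a']
```

Only two of the four elements are ever examined, so two copies survive.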

So how can this be done properly?

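Approach 1a: iterate over a list that has the same elements but will not change, in place of list abc. A plain assignment such as b = abc only gives the same list a second name, so a real (shallow) copy is needed; slicing with abc[:] is one way to make it: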

    def test_234(self):
        abc = [('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               66666, 66666, 66666, 66666,
               7878, 7878]
        print(len(abc), 'initial length')
        for i in abc[:]:
            if abc.count(i) != 1:
                abc.remove(i)
        print(len(abc), 'length after processing')
        print(abc)
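The difference between a second name for the same list and a real copy can be checked in a few lines. A small sketch with made-up values; the names alias, sliced, copied and built are only for illustration:

```python
abc = [1, 1, 2]

alias = abc          # no copy: a second name for the same list object
sliced = abc[:]      # shallow copy made by slicing
copied = abc.copy()  # shallow copy made by list.copy()
built = list(abc)    # shallow copy made by the list() constructor

abc.remove(1)        # mutate the original

print(alias)         # [1, 2]     follows the original
print(sliced)        # [1, 1, 2]  unaffected
print(alias is abc)  # True
print(sliced is abc) # False
```

Any of the three copies is safe to iterate over while removing from abc; slicing is simply the one used in this post.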

A reversed copy of the list can also stand in for list abc:

    def test_345(self):
        abc = [('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               66666, 66666, 66666, 66666,
               7878, 7878]
        print(len(abc), 'initial length')
        for i in abc[::-1]:
            if abc.count(i) != 1:
                abc.remove(i)
        print(len(abc), 'length after processing')
        print(abc)
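Note that abc[::-1] works because it is a copy, not because it is reversed. A related technique, different from the code above and shown only as a sketch, walks the original list backwards by index and deletes in place, so no copy is needed. The function name dedupe_backwards is made up for the example:

```python
def dedupe_backwards(items):
    """Delete duplicates in place, walking indices from the end.

    Deleting at index i only shifts elements after i, which have
    already been visited, so no element is skipped.
    """
    for i in range(len(items) - 1, -1, -1):
        if items.count(items[i]) > 1:
            del items[i]
    return items


print(dedupe_backwards(['a', 'a', 'a', 'b', 'b', 7]))  # ['a', 'b', 7]
```

This variant keeps the first occurrence of each element.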

Approach 1b: iterate over a tuple with the same elements in place of list abc

    def test_45678(self):
        abc = [('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               66666, 66666, 66666, 66666,
               7878, 7878]
        print(len(abc), 'initial length')
        for i in tuple(abc):
            if abc.count(i) != 1:
                abc.remove(i)
        print(len(abc), 'length after processing')
        print(abc)

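Approach 2: what if I insist on iterating over the mutable list abc itself? My idea is to keep repeating the loop until a full pass leaves the length of abc unchanged, and then break out: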

    def test_123(self):
        abc = [('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               66666, 66666, 66666, 66666,
               7878, 7878]
        print(len(abc), 'initial length')
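        while True:
            length1 = len(abc)  # length before this pass
            for i in abc:
                if abc.count(i) != 1:
                    abc.remove(i)
            # Measure after the pass so that length2 is always defined,
            # even when a pass removes nothing.
            length2 = len(abc)
            if length1 == length2:
                break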
        print(len(abc), 'length after processing')
        print(abc)
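To see how much repeated work this approach does, the loop can be wrapped in a function that counts its passes. A sketch under assumptions: the function name dedupe_until_stable and the list sample (same multiplicities as abc, shorter values) are made up for the example:

```python
def dedupe_until_stable(items):
    """Repeat the remove-while-iterating pass until nothing changes.

    Returns the number of passes, including the final pass that
    removed nothing.
    """
    passes = 0
    while True:
        before = len(items)
        for x in items:
            if items.count(x) != 1:
                items.remove(x)
        passes += 1
        if len(items) == before:
            break
    return passes


sample = ['a'] * 5 + ['b'] * 4 + [66666] * 4 + [7878] * 2
passes = dedupe_until_stable(sample)
print(passes)   # 3
print(sample)   # ['a', 'b', 66666, 7878]
```

The list shrinks from 15 to 7 to 4 elements, and a third pass confirms that nothing is left to remove.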

Approach 3: put the elements of list abc whose count is 1 into a new list

    def test_012(self):
        abc = [('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               66666, 66666, 66666, 66666,
               7878, 7878]
        print(len(abc), 'initial length')

        ABC = list()
        for i in abc[:]:
            if abc.count(i) != 1:
                abc.remove(i)
            else:
                ABC.append(i)

        print(len(ABC), 'length after processing')
        print(ABC)
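
A variant of the same idea: copy abc into temp, clear abc, then append each element back only if it is not already there:

    def test_7892(self):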
        abc = [('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               66666, 66666, 66666, 66666,
               7878, 7878]
        print(len(abc), 'initial length')
        temp = abc[:]
        abc.clear()
        for e in temp:
            if e not in abc:
                abc.append(e)

        print(len(abc), 'length after processing')
        print(abc, 'new')
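Both count() and the "not in" test scan the list each time, so the versions above do quadratic work on long lists. When every element is hashable (as the tuples and ints here are), a set of already-seen elements avoids the rescans while still keeping first-seen order. This is an alternative sketch, not the code from this post; dedupe_keep_order and sample are made-up names:

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates dropped, keeping first-seen order."""
    seen = set()
    result = []
    for x in items:
        if x not in seen:   # set membership test, no list scan
            seen.add(x)
            result.append(x)
    return result


sample = [('115', 'C1'), ('115', 'C1'), 66666, ('115', 'C1'), 66666, 7878]
print(dedupe_keep_order(sample))     # [('115', 'C1'), 66666, 7878]

# On Python 3.7+, dicts keep insertion order, so this gives the same result:
print(list(dict.fromkeys(sample)))   # [('115', 'C1'), 66666, 7878]
```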

Running the above, both methods start from 15 elements and end with 4.

Approach 4: if the order of the elements does not particularly matter, use the set() function

    def test_789(self):
        abc = [('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('115', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               ('66666', 'C2019093015002', 1569826868000, 1569826868000),
               66666, 66666, 66666, 66666,
               7878, 7878]
        print(len(abc), 'initial length')
        set_abc = set(abc)
        print(len(set_abc), 'length after processing')
        print(list(set_abc), 'new')

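This version also ends with 4 elements, but because a set is unordered, the order of the elements in the printed list may differ from the original order.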
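One more limitation of set() is worth knowing: every element must be hashable. The tuples and ints in abc are, but rows stored as lists are not. A small sketch with made-up data (the name rows is hypothetical) showing the failure and one possible workaround:

```python
# Hypothetical rows stored as lists, which are not hashable.
rows = [['115', 'C1'], ['115', 'C1'], ['66666', 'C1']]

try:
    set(rows)
except TypeError as exc:
    print('set() failed:', exc)

# Converting each row to a tuple first makes it hashable.
unique = set(tuple(r) for r in rows)
print(len(unique))      # 2
print(sorted(unique))   # [('115', 'C1'), ('66666', 'C1')]
```

Sorting afterwards is one way to get a stable order back when the elements are comparable with each other.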

For technical discussion, feel free to add me on QQ: 153132336 (zy)
Personal blog: https://blog.csdn.net/zyooooxie
