over 3 years ago

Be Defensive When Iterating Over Arguments

Suppose we have a function that computes what percentage of the total each input value accounts for.

# from __future__ import division (for Python 2)

def normalize(numbers):
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result

When the input is a list, the function works as expected:

percentages = normalize([15, 35, 80])
# [11.538461538461538, 26.923076923076923, 61.53846153846154]

But when the input is an iterator, it breaks and silently returns an empty list:

percentages = normalize(iter([15, 35, 80])) # []

percentages = normalize(n for n in [15, 35, 80]) # []

This happens because the sum() call inside normalize() has already used up numbers: once an iterator has raised StopIteration, it cannot be restarted.

it = iter([15, 35, 80])
list(it) # [15, 35, 80]

list(it) # [] -> Already exhausted

The book lists several solutions; the best of them is to forbid passing in an iterator at all!

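A small trick is used here: when iter() is passed an iterator, it returns that same iterator, whereas a container returns a new iterator object on every call.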
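To make the trick concrete, here is a minimal sketch comparing the two cases. The variable names are only for illustration and do not come from the book.

```python
numbers = [15, 35, 80]   # a container
it = iter(numbers)       # an iterator over that container

# A container hands out a fresh iterator object on every iter() call,
# so the two results are different objects.
container_is_reused = iter(numbers) is iter(numbers)

# An iterator returns itself from iter(), so both calls yield the same object.
iterator_is_reused = iter(it) is iter(it)

print(container_is_reused)  # False
print(iterator_is_reused)   # True
```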

def normalize_defensive(numbers):
    if iter(numbers) is iter(numbers):  # An iterator -- bad!
        raise TypeError('Must supply a container')
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result
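A short usage sketch, assuming Python 3 division: a list is accepted, while a bare iterator is rejected up front instead of quietly producing an empty result. The names visits and error_message are illustrative only.

```python
def normalize_defensive(numbers):
    if iter(numbers) is iter(numbers):  # An iterator -- bad!
        raise TypeError('Must supply a container')
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result

visits = [15, 35, 80]

# A list is a container, so it can be iterated twice without trouble.
percentages = normalize_defensive(visits)

# An iterator fails the identity check and raises TypeError.
try:
    normalize_defensive(iter(visits))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(percentages)
print(error_message)  # Must supply a container
```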

Then, instead of passing an iterator directly, wrap the data source in a class that implements the iterator protocol through __iter__():

class ReadVisits(object):
    def __init__(self, data_path):
        self.data_path = data_path

    def __iter__(self):
        with open(self.data_path) as f:
            for line in f:
                yield int(line)

This way, every time normalize_defensive() iterates over numbers, a new iterator object is created.
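The following sketch puts the two pieces together. It assumes Python 3 and a data file with one integer per line; the temporary file and its contents are made up for the example.

```python
import os
import tempfile

class ReadVisits(object):
    def __init__(self, data_path):
        self.data_path = data_path

    def __iter__(self):
        # Each call returns a new generator that reopens the file.
        with open(self.data_path) as f:
            for line in f:
                yield int(line)

def normalize_defensive(numbers):
    if iter(numbers) is iter(numbers):  # An iterator -- bad!
        raise TypeError('Must supply a container')
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result

# Hypothetical sample data, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write('15\n35\n80\n')
    path = f.name

try:
    visits = ReadVisits(path)
    # sum() and the for loop each get their own pass over the file.
    percentages = normalize_defensive(visits)
finally:
    os.remove(path)

print(percentages)
```

The file is read twice here, once for sum() and once for the loop, which is the price paid for not holding all values in memory.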
