almost 2 years ago

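Introduction

There is plenty of material on this topic online, but it is all written in English and none of it is very complete. I have put together a more complete list here, hoping it will be of some help to Python learners in Taiwan.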

Intended audience

  • Python programmers who are just getting started
  • Readers who can at least read Python syntax

Table of contents

No. Topic
1 Preface
2 If Statements
3 Prefer in
4 Clever uses of tuples
5 Conditional Expressions
6 Make good use of enumerate
7 Negative indices
8 What to do when a line is too long
9 Loops can have an else
10 Chained Comparisons
11 Avoid mutable default arguments
12 Build strings with join
13 Prefer dict.get()
14 Use property instead of getters and setters
15 Context Managers
16 Use _ for unused variables
17 List Comprehensions
18 Generator Expressions
19 Prefer BIFs (built-in functions)
20 Avoid shadowing BIFs
21 dict.setdefault()
22 defaultdict
23 PEP 8
24 Follow the PEP 8 naming conventions
25 Import order
26 Avoid from module import * where possible
27 Use from module import obj sparingly
28 Do not use implicit relative imports
29 Convenience Imports
30 The Zen of Python
 
11 months ago

Make pickle Reliable with copyreg

This item covers copyreg, a built-in module that is used together with pickle.

pickle is simple to use. Suppose we have a class:

class GameState(object):
    def __init__(self):
        self.level = 0
        self.lives = 4

state = GameState()
state.level += 1  # Player beat a level
state.lives -= 1  # Player had to try again

We can save the object with pickle:

import pickle
state_path = '/tmp/game_state.bin'
with open(state_path, 'wb') as f:
    pickle.dump(state, f)

with open(state_path, 'rb') as f:
    state_after = pickle.load(f)
print(state_after.__dict__)
# {'lives': 3, 'level': 1}

However, if a new field (points) is added to the class, an object loaded back from game_state.bin will naturally not have it, yet it is still an instance of GameState, which leads to confusion.

class GameState(object):
    def __init__(self):
        self.level = 0
        self.lives = 4
        self.points = 0

with open(state_path, 'rb') as f:
    state_after = pickle.load(f)

print(state_after.__dict__)
# {'lives': 3, 'level': 1}
assert isinstance(state_after, GameState)

copyreg solves this problem: it lets you register the functions used to serialize Python objects.

Default Attribute Values

pickle_game_state() returns a tuple containing the function used to unpickle the object and the arguments to pass to that function.

import copyreg

class GameState(object):
    def __init__(self, level=0, lives=4, points=0):
        self.level = level
        self.lives = lives
        self.points = points

def pickle_game_state(game_state):
    kwargs = game_state.__dict__
    return unpickle_game_state, (kwargs,)

def unpickle_game_state(kwargs):
    return GameState(**kwargs)

copyreg.pickle(GameState, pickle_game_state)
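
To see the default values take effect, here is a minimal sketch that assumes everything lives in a single script. An object is pickled with an older two-field version of the class, the class is then redefined with points, and the old bytes are loaded again. Redefining the class in the same script is only a stand-in for a code change between two runs of a program.

```python
import copyreg
import pickle


# Older version of the class: no points field yet.
class GameState(object):
    def __init__(self, level=0, lives=4):
        self.level = level
        self.lives = lives


def pickle_game_state(game_state):
    kwargs = game_state.__dict__
    return unpickle_game_state, (kwargs,)


def unpickle_game_state(kwargs):
    # Looks up GameState at call time, so it uses the newest definition.
    return GameState(**kwargs)


copyreg.pickle(GameState, pickle_game_state)

state = GameState()
state.level += 1
serialized = pickle.dumps(state)


# Newer version of the class: points is added with a default value.
class GameState(object):
    def __init__(self, level=0, lives=4, points=0):
        self.level = level
        self.lives = lives
        self.points = points


state_after = pickle.loads(serialized)
print(state_after.__dict__)
# {'level': 1, 'lives': 4, 'points': 0}
```

Because the old data is passed through the constructor rather than copied straight into __dict__, the missing attribute is filled in by its default.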

Versioning Classes

copyreg can also be used to record a version number, which makes backward compatibility possible.

Suppose the original class looks like this:

class GameState(object):
    def __init__(self, level=0, lives=4, points=0, magic=5):
        self.level = level
        self.lives = lives
        self.points = points
        self.magic = magic

state = GameState()
state.points += 1000
serialized = pickle.dumps(state)

Later the class is changed and lives is removed. At this point the default-argument approach no longer works.

class GameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic

pickle.loads(serialized)
# TypeError: __init__() got an unexpected keyword argument 'lives'

Add a version number when serializing, and check it when deserializing:

def pickle_game_state(game_state):
    # Copy the attributes so that adding 'version' does not
    # modify the live object's __dict__.
    kwargs = dict(game_state.__dict__)
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)

def unpickle_game_state(kwargs):
    version = kwargs.pop('version', 1)
    if version == 1:
        kwargs.pop('lives')
    return GameState(**kwargs)

copyreg.pickle(GameState, pickle_game_state)
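
The whole round trip can be sketched in one script, under the assumption that redefining the class and the two helper functions stands in for deploying a new version of the code. The version 1 data carries no version key, so unpickle_game_state() falls back to 1 and drops lives.

```python
import copyreg
import pickle


# --- Version 1 of the code: the class still has lives. ---
class GameState(object):
    def __init__(self, level=0, lives=4, points=0, magic=5):
        self.level = level
        self.lives = lives
        self.points = points
        self.magic = magic


def pickle_game_state(game_state):
    return unpickle_game_state, (dict(game_state.__dict__),)


def unpickle_game_state(kwargs):
    return GameState(**kwargs)


copyreg.pickle(GameState, pickle_game_state)

old_state = GameState()
old_state.points += 1000
serialized_v1 = pickle.dumps(old_state)


# --- Version 2 of the code: lives is gone, data is tagged. ---
class GameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


def pickle_game_state(game_state):
    kwargs = dict(game_state.__dict__)
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)


def unpickle_game_state(kwargs):
    version = kwargs.pop('version', 1)  # untagged data is version 1
    if version == 1:
        kwargs.pop('lives', None)
    return GameState(**kwargs)


copyreg.pickle(GameState, pickle_game_state)

restored_old = pickle.loads(serialized_v1)
restored_new = pickle.loads(pickle.dumps(GameState(points=7)))
print(restored_old.__dict__)
# {'level': 0, 'points': 1000, 'magic': 5}
print(restored_new.__dict__)
# {'level': 0, 'points': 7, 'magic': 5}
```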

Stable Import Paths

When refactoring, if a class is renamed, loading old serialized objects would normally fail, but copyreg can solve this as well.

class BetterGameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic

copyreg.pickle(BetterGameState, pickle_game_state)

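Notice that the import path of unpickle_game_state() is what gets written into the dumped data. The downside, of course, is that the module containing unpickle_game_state() can no longer be moved to a different path.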

state = BetterGameState()
serialized = pickle.dumps(state)
print(serialized[:35])
>>>
b'\x80\x03c__main__\nunpickle_game_state\nq\x00}'
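
As a rough illustration of why the rename is harmless, the sketch below pickles an object under the old class name, deletes that class, and loads the bytes with only BetterGameState defined. It assumes a single script where the redefinitions stand in for a refactoring. The helpers are simplified versions of the ones above.

```python
import copyreg
import pickle


class GameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


def pickle_game_state(game_state):
    kwargs = dict(game_state.__dict__)
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)


def unpickle_game_state(kwargs):
    kwargs.pop('version', None)
    return GameState(**kwargs)


copyreg.pickle(GameState, pickle_game_state)
serialized = pickle.dumps(GameState(points=1000))

# --- After the refactoring: the old class name no longer exists. ---
del GameState


class BetterGameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


def unpickle_game_state(kwargs):
    kwargs.pop('version', None)
    return BetterGameState(**kwargs)


copyreg.pickle(BetterGameState, pickle_game_state)

# The data names only the unpickle function, never the class.
restored = pickle.loads(serialized)
print(type(restored).__name__)
# BetterGameState
```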
 
about 1 year ago

Consider contextlib and with Statements for Reusable try/finally Behavior

This item covers contextlib.contextmanager, which makes it easy to implement context managers.

import logging
from contextlib import contextmanager

@contextmanager
def log_level(level, name):
    logger = logging.getLogger(name)
    old_level = logger.getEffectiveLevel()
    logger.setLevel(level)
    try:
        yield logger
    finally:
        logger.setLevel(old_level)

with log_level(logging.DEBUG, 'my-log') as logger:
    logger.debug('This is my message!')
    logging.debug('This will not print')

logger = logging.getLogger('my-log')
logger.debug('Debug will not print')
logger.error('Error will print')
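
For comparison, here is a sketch of roughly the same behavior written by hand as a class with __enter__ and __exit__. The class name LogLevel is made up for this illustration. The code before yield in the generator version corresponds to __enter__, and the finally block corresponds to __exit__.

```python
import logging


class LogLevel(object):
    """Hand-written counterpart of the log_level() generator (hypothetical name)."""

    def __init__(self, level, name):
        self.level = level
        self.logger = logging.getLogger(name)

    def __enter__(self):
        # Same as the code before `yield`.
        self.old_level = self.logger.getEffectiveLevel()
        self.logger.setLevel(self.level)
        return self.logger  # bound by `as logger`

    def __exit__(self, exc_type, exc_value, traceback):
        # Same as the `finally` block: always restore the level.
        self.logger.setLevel(self.old_level)
        return False  # do not swallow exceptions


with LogLevel(logging.DEBUG, 'my-log') as logger:
    logger.debug('This is my message!')
```

The decorator version is shorter because the generator's local variables take the place of the instance attributes.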

Related

Python Idioms - 15 Context Managers

 
about 1 year ago

Define Function Decorators with functools.wraps

This item covers functools.wraps, whose purpose is to avoid overwriting the original function's __module__, __name__, and __doc__.

from functools import wraps

def trace(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print('%s(%r, %r) -> %r' %
              (func.__name__, args, kwargs, result))
        return result
    return wrapper

@trace
def fibonacci(n):
    """Return the n-th Fibonacci number"""
    if n in (0, 1):
        return n
    return (fibonacci(n - 2) +
            fibonacci(n - 1))
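
A small side-by-side sketch shows what goes missing without wraps. The decorator and function names here are made up for the comparison.

```python
from functools import wraps


def trace_plain(func):
    # No @wraps: the returned function keeps its own metadata.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def trace_wrapped(func):
    @wraps(func)  # copies func's metadata onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@trace_plain
def double_plain(n):
    """Return n doubled."""
    return n * 2


@trace_wrapped
def double_wrapped(n):
    """Return n doubled."""
    return n * 2


print(double_plain.__name__, double_plain.__doc__)
# wrapper None
print(double_wrapped.__name__, double_wrapped.__doc__)
# double_wrapped Return n doubled.
```

Tools that rely on introspection, such as help() and debuggers, see the second form as the original function.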

Source code of functools.wraps:

# update_wrapper() and wraps() are tools to help write
# wrapper functions that can handle naive introspection

WRAPPER_ASSIGNMENTS = ('__module__', '__name__', '__doc__')
WRAPPER_UPDATES = ('__dict__',)
def update_wrapper(wrapper,
                   wrapped,
                   assigned = WRAPPER_ASSIGNMENTS,
                   updated = WRAPPER_UPDATES):
    for attr in assigned:
        setattr(wrapper, attr, getattr(wrapped, attr))
    for attr in updated:
        getattr(wrapper, attr).update(getattr(wrapped, attr, {}))
    # Return the wrapper so this can be used as a decorator via partial()
    return wrapper

def wraps(wrapped,
          assigned = WRAPPER_ASSIGNMENTS,
          updated = WRAPPER_UPDATES):
    return partial(update_wrapper, wrapped=wrapped,
                   assigned=assigned, updated=updated)

This decorator is used all the time. How did Python Idioms manage to leave it out....

 
about 1 year ago

Consider concurrent.futures for True Parallelism

With ProcessPoolExecutor from concurrent.futures, it is easy to run CPU-bound code in parallel, which saves you from rolling your own solution with multiprocessing.

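from concurrent.futures import ProcessPoolExecutor
from time import time

# gcd() and numbers are assumed to be defined earlier: a CPU-bound
# function and the list of inputs to map it over.

start = time()
pool = ProcessPoolExecutor(max_workers=2)  # The one change
results = list(pool.map(gcd, numbers))
end = time()
print('Took %.3f seconds' % (end - start))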
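
A self-contained version might look like the sketch below. The gcd() function and the input pairs are illustrative assumptions, since the snippet above does not define them. The work is placed under an `if __name__ == '__main__':` guard because worker processes may re-import the main module on some platforms.

```python
from concurrent.futures import ProcessPoolExecutor
from time import time


def gcd(pair):
    """Deliberately naive, CPU-bound greatest common divisor."""
    a, b = pair
    low = min(a, b)
    for i in range(low, 0, -1):
        if a % i == 0 and b % i == 0:
            return i


# Illustrative inputs, not taken from the text above.
numbers = [(1963309, 2265973), (2030677, 3814172),
           (1551645, 2229620), (2039045, 2020802)]


def main():
    start = time()
    # The with block shuts the pool down once the work is done.
    with ProcessPoolExecutor(max_workers=2) as pool:
        results = list(pool.map(gcd, numbers))
    end = time()
    print(results)
    print('Took %.3f seconds' % (end - start))


if __name__ == '__main__':
    main()
```

The function passed to pool.map() must be defined at module level so that it can be pickled and sent to the worker processes.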