about 2 years ago

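Introduction

There is actually plenty of documentation on this topic online, but it is all written in English and none of it is very complete. I've put together a more complete list here, hoping it will be of some use to Python learners in Taiwan.

Intended audience

  • Python programmers who are just getting started
  • Readers who can at least read Python syntax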

Table of contents

No. Topic
1 Preface
2 If Statements
3 Prefer in
4 Clever uses of tuples
5 Conditional Expressions
6 Make good use of enumerate
7 Negative indices
8 What to do when a line is too long
9 Loops can have an else clause
10 Chained Comparisons
11 Avoid mutable default arguments
12 Build strings with join
13 Prefer dict.get()
14 Use property instead of getters and setters
15 Context Managers
16 Use _ for unused variables
17 List Comprehensions
18 Generator Expressions
19 Prefer BIFs (built-in functions)
20 Avoid shadowing BIFs
21 dict.setdefault()
22 defaultdict
23 PEP 8
24 Follow PEP 8 naming conventions
25 Import order
26 Avoid from module import *
27 Use from module import obj sparingly
28 Don't use implicit relative imports
29 Convenience Imports
30 The Zen of Python
 
about 1 year ago

Make pickle Reliable with copyreg

This item covers copyreg, a built-in module that is used together with pickle.

pickle is simple to use. Suppose we have a class:

class GameState(object):
    def __init__(self):
        self.level = 0
        self.lives = 4

state = GameState()
state.level += 1  # Player beat a level
state.lives -= 1  # Player had to try again

We can save the object with pickle:

import pickle
state_path = '/tmp/game_state.bin'
with open(state_path, 'wb') as f:
    pickle.dump(state, f)

with open(state_path, 'rb') as f:
    state_after = pickle.load(f)
print(state_after.__dict__)
# {'lives': 3, 'level': 1}

However, if a new field (points) is added to the class, the object loaded back from game_state.bin naturally will not have it, yet it is still an instance of GameState. That is confusing.

class GameState(object):
    def __init__(self):
        self.level = 0
        self.lives = 4
        self.points = 0

with open(state_path, 'rb') as f:
    state_after = pickle.load(f)

print(state_after.__dict__)
# {'lives': 3, 'level': 1}
assert isinstance(state_after, GameState)
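The root cause, as far as I understand it, is that unpickling an ordinary object restores its saved __dict__ without calling __init__. Here is a minimal in-memory sketch of the same situation (it uses pickle.dumps/pickle.loads instead of a file, which is my own simplification):

```python
import pickle


class GameState:
    def __init__(self):
        self.level = 0
        self.lives = 4


# Pickle an instance while the class still has only two attributes.
serialized = pickle.dumps(GameState())


# Redefine the class with an extra attribute, as in the text.
class GameState:
    def __init__(self):
        self.level = 0
        self.lives = 4
        self.points = 0


state_after = pickle.loads(serialized)

# The class is looked up by name, so we get an instance of the new class ...
print(isinstance(state_after, GameState))  # True
# ... but __init__ never ran, so the new attribute is missing.
print(hasattr(state_after, 'points'))      # False
```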

copyreg solves this problem: it lets you register the functions used to serialize Python objects.

Default Attribute Values

pickle_game_state() returns a tuple containing the function to use for unpickling and the arguments to pass to that function.

import copyreg

class GameState(object):
    def __init__(self, level=0, lives=4, points=0):
        self.level = level
        self.lives = lives
        self.points = points

def pickle_game_state(game_state):
    kwargs = game_state.__dict__
    return unpickle_game_state, (kwargs,)

def unpickle_game_state(kwargs):
    return GameState(**kwargs)

copyreg.pickle(GameState, pickle_game_state)
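To see the default values kick in, here is a small round-trip sketch. It assumes the old data was already pickled through copyreg before points was added; redefining the class in the same script stands in for editing the source between runs, and copying __dict__ is my own precaution.

```python
import copyreg
import pickle


class GameState:
    def __init__(self, level=0, lives=4):
        self.level = level
        self.lives = lives


def unpickle_game_state(kwargs):
    # GameState is looked up when this runs, so it picks up the newest class.
    return GameState(**kwargs)


def pickle_game_state(game_state):
    return unpickle_game_state, (dict(game_state.__dict__),)


copyreg.pickle(GameState, pickle_game_state)

state = GameState()
state.level += 1
state.lives -= 1
serialized = pickle.dumps(state)


# Later version of the class: a new attribute with a default value.
class GameState:
    def __init__(self, level=0, lives=4, points=0):
        self.level = level
        self.lives = lives
        self.points = points


copyreg.pickle(GameState, pickle_game_state)

state_after = pickle.loads(serialized)
print(state_after.__dict__)  # {'level': 1, 'lives': 3, 'points': 0}
```

Because unpickling now goes through the constructor, the missing attribute is filled in by its default.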

Versioning Classes

copyreg can also be used to record a version number, which gives you backward compatibility.

Suppose the original class looks like this:

class GameState(object):
    def __init__(self, level=0, lives=4, points=0, magic=5):
        self.level = level
        self.lives = lives
        self.points = points
        self.magic = magic

state = GameState()
state.points += 1000
serialized = pickle.dumps(state)

Later the class is changed and lives is removed. At that point the default-argument approach above no longer works.

class GameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic

pickle.loads(serialized)
# TypeError: __init__() got an unexpected keyword argument 'lives'

Add a version number when serializing, and check it when deserializing:

def pickle_game_state(game_state):
    kwargs = dict(game_state.__dict__)  # copy, so 'version' is not written onto the instance
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)

def unpickle_game_state(kwargs):
    version = kwargs.pop('version', 1)
    if version == 1:
        kwargs.pop('lives')
    return GameState(**kwargs)

copyreg.pickle(GameState, pickle_game_state)
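The following sketch puts both versions in one script to check that version 1 data still loads. Redefining the class and the two functions in place is just a stand-in for editing the source between releases, and the dict() copy is my own precaution.

```python
import copyreg
import pickle


# --- Version 1: still has `lives`, no version number is written ---
class GameState:
    def __init__(self, level=0, lives=4, points=0, magic=5):
        self.level = level
        self.lives = lives
        self.points = points
        self.magic = magic


def unpickle_game_state(kwargs):
    return GameState(**kwargs)


def pickle_game_state(game_state):
    return unpickle_game_state, (dict(game_state.__dict__),)


copyreg.pickle(GameState, pickle_game_state)
old_state = GameState()
old_state.points += 1000
serialized_v1 = pickle.dumps(old_state)


# --- Version 2: `lives` removed, a version number is written ---
class GameState:
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


def unpickle_game_state(kwargs):
    version = kwargs.pop('version', 1)  # data without a version is version 1
    if version == 1:
        kwargs.pop('lives', None)       # drop the field that no longer exists
    return GameState(**kwargs)


def pickle_game_state(game_state):
    kwargs = dict(game_state.__dict__)
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)


copyreg.pickle(GameState, pickle_game_state)

restored_v1 = pickle.loads(serialized_v1)
restored_v2 = pickle.loads(pickle.dumps(GameState(level=3)))
print(restored_v1.__dict__)  # {'level': 0, 'points': 1000, 'magic': 5}
print(restored_v2.__dict__)  # {'level': 3, 'points': 0, 'magic': 5}
```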

Stable Import Paths

When refactoring, if a class is renamed, old serialized objects can no longer be loaded as they are, but copyreg can still solve this.

class BetterGameState(object):
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic

copyreg.pickle(BetterGameState, pickle_game_state)

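Notice that the import path of unpickle_game_state() is what gets written into the dumped data. The drawback, of course, is that the module containing unpickle_game_state() can no longer be moved to a different path.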

state = BetterGameState()
serialized = pickle.dumps(state)
print(serialized[:35])
>>>
b'\x80\x03c__main__\nunpickle_game_state\nq\x00}'
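A quick sketch of the rename scenario: data pickled under the old class name still loads after the rename, because the payload refers to unpickle_game_state() rather than to the class. The del statement and the in-script redefinition only simulate a refactoring and are not part of the original notes.

```python
import copyreg
import pickle


class GameState:
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


def unpickle_game_state(kwargs):
    kwargs.pop('version', None)
    return GameState(**kwargs)


def pickle_game_state(game_state):
    kwargs = dict(game_state.__dict__)
    kwargs['version'] = 2
    return unpickle_game_state, (kwargs,)


copyreg.pickle(GameState, pickle_game_state)
serialized = pickle.dumps(GameState(level=7))

# --- Refactoring: the class is renamed, the function keeps its name ---
del GameState


class BetterGameState:
    def __init__(self, level=0, points=0, magic=5):
        self.level = level
        self.points = points
        self.magic = magic


def unpickle_game_state(kwargs):
    kwargs.pop('version', None)
    return BetterGameState(**kwargs)


copyreg.pickle(BetterGameState, pickle_game_state)

restored = pickle.loads(serialized)
print(type(restored).__name__)               # BetterGameState
print(b'unpickle_game_state' in serialized)  # True: the function name is stored
print(b'GameState' in serialized)            # False: the class name is not
```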
 
over 1 year ago

Consider contextlib and with Statements for Reusable try/finally Behavior

This item covers contextlib.contextmanager, which makes it easy to implement context managers.

import logging
from contextlib import contextmanager

@contextmanager
def log_level(level, name):
    logger = logging.getLogger(name)
    old_level = logger.getEffectiveLevel()
    logger.setLevel(level)
    try:
        yield logger
    finally:
        logger.setLevel(old_level)

with log_level(logging.DEBUG, 'my-log') as logger:
    logger.debug('This is my message!')
    logging.debug('This will not print')

logger = logging.getLogger('my-log')
logger.debug('Debug will not print')
logger.error('Error will print')
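For comparison, here is roughly what the same context manager looks like when written by hand as a class with __enter__ and __exit__. The class name LogLevel is my own; this is only a sketch of the boilerplate that contextmanager saves you from writing.

```python
import logging


class LogLevel:
    """Hand-written counterpart of the log_level() generator above."""

    def __init__(self, level, name):
        self.level = level
        self.logger = logging.getLogger(name)

    def __enter__(self):
        # Corresponds to the code before `yield` in the generator version.
        self.old_level = self.logger.getEffectiveLevel()
        self.logger.setLevel(self.level)
        return self.logger

    def __exit__(self, exc_type, exc_value, traceback):
        # Corresponds to the `finally` block: always restore the old level.
        self.logger.setLevel(self.old_level)
        return False  # do not swallow exceptions


with LogLevel(logging.DEBUG, 'my-log') as logger:
    print(logger.getEffectiveLevel() == logging.DEBUG)  # True
```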

Related

Python Idioms - 15 Context Managers

 
over 1 year ago

Define Function Decorators with functools.wraps

This item covers functools.wraps, which keeps a decorator from overwriting the original function's __module__, __name__, and __doc__.

from functools import wraps

def trace(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print('%s(%r, %r) -> %r' %
              (func.__name__, args, kwargs, result))
        return result
    return wrapper

@trace
def fibonacci(n):
    """Return the n-th Fibonacci number"""
    if n in (0, 1):
        return n
    return (fibonacci(n - 2) +
            fibonacci(n - 1))
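To make the difference visible, here is a small side-by-side sketch of a decorator with and without wraps. The decorator and function names are made up for this illustration.

```python
from functools import wraps


def plain_decorator(func):
    # No wraps(): the returned function keeps its own metadata.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def wrapping_decorator(func):
    @wraps(func)  # copy __name__, __doc__, etc. from func onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain_decorator
def add_plain(a, b):
    """Add two numbers."""
    return a + b


@wrapping_decorator
def add_wrapped(a, b):
    """Add two numbers."""
    return a + b


print(add_plain.__name__, add_plain.__doc__)      # wrapper None
print(add_wrapped.__name__, add_wrapped.__doc__)  # add_wrapped Add two numbers.
```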

functools.wraps source code

# update_wrapper() and wraps() are tools to help write
# wrapper functions that can handle naive introspection

WRAPPER_ASSIGNMENTS = ('__module__', '__name__', '__doc__')
WRAPPER_UPDATES = ('__dict__',)
def update_wrapper(wrapper,
                   wrapped,
                   assigned = WRAPPER_ASSIGNMENTS,
                   updated = WRAPPER_UPDATES):
    for attr in assigned:
        setattr(wrapper, attr, getattr(wrapped, attr))
    for attr in updated:
        getattr(wrapper, attr).update(getattr(wrapped, attr, {}))
    # Return the wrapper so this can be used as a decorator via partial()
    return wrapper

def wraps(wrapped,
          assigned = WRAPPER_ASSIGNMENTS,
          updated = WRAPPER_UPDATES):
    return partial(update_wrapper, wrapped=wrapped,
                   assigned=assigned, updated=updated)

This decorator is used all the time. How did Python Idioms manage to leave it out....

 
over 1 year ago

Consider concurrent.futures for True Parallelism

With ProcessPoolExecutor from concurrent.futures, it is easy to run CPU-bound code in parallel without hand-rolling it with multiprocessing.

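from concurrent.futures import ProcessPoolExecutor
from time import time

# gcd() and numbers are assumed to be defined earlier (not shown in this excerpt)
start = time()
pool = ProcessPoolExecutor(max_workers=2)  # The one change
results = list(pool.map(gcd, numbers))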
end = time()
print('Took %.3f seconds' % (end - start))
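Since the excerpt above leaves out gcd() and numbers, here is a self-contained sketch that can be run as a script. The brute-force gcd() and the input pairs are my own illustrative stand-ins, chosen only to keep the CPU busy. The __main__ guard matters on platforms where worker processes re-import the main module.

```python
from concurrent.futures import ProcessPoolExecutor
from time import time


def gcd(pair):
    """Deliberately slow greatest common divisor, to simulate CPU-bound work."""
    a, b = pair
    low = min(a, b)
    for i in range(low, 0, -1):
        if a % i == 0 and b % i == 0:
            return i


# Illustrative inputs (assumed for this sketch, not taken from the notes above).
NUMBERS = [(1963309, 2265973), (2030677, 3814172),
           (1551645, 2229620), (2039045, 2020802)]


def main():
    start = time()
    # The with block shuts the worker processes down when the work is done.
    with ProcessPoolExecutor(max_workers=2) as pool:
        results = list(pool.map(gcd, NUMBERS))
    end = time()
    print(results)
    print('Took %.3f seconds' % (end - start))


if __name__ == '__main__':
    main()
```

The function passed to pool.map() has to be defined at module level so it can be pickled and sent to the worker processes.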