Joblib and multiprocessing

Joblib

from joblib import Parallel, cpu_count, delayed
import numpy as np

def square(x):
    """Return the square of a number."""
    return x ** 2

if __name__ == '__main__':
    # Create an array of numbers
    numbers = np.arange(10)
    # n_jobs=-1 tells joblib to use all available cores
    n_jobs = -1
    # Resolve the -1 sentinel to a real core count for the arithmetic below
    n_workers = cpu_count()
    jobs = len(numbers)

    # Use joblib to parallelize the square function
    tasks = (delayed(square)(i) for i in numbers)
    # Ceiling division: tasks per worker, never less than 1
    batch_size = max(1, (jobs + n_workers - 1) // n_workers)
    results = Parallel(n_jobs=n_jobs, batch_size=batch_size)(tasks)

    print(results)

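Here batch_size is the number of tasks dispatched to a worker at once. If there are fewer tasks than cores, batch_size is at least 1; otherwise it is the number of tasks divided by the number of cores, rounded up. Note that the divisor has to be the actual core count: the -1 sentinel accepted by n_jobs cannot be used in this arithmetic.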
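The rounding rule above can be checked in isolation. The following is a minimal sketch in plain Python; the helper name ceil_batch_size is hypothetical and not part of joblib. It also shows why plugging n_jobs = -1 directly into the formula silently collapses the batch size to 1.

```python
def ceil_batch_size(n_tasks: int, n_workers: int) -> int:
    """Tasks per worker, rounded up, never less than 1 (hypothetical helper)."""
    return max(1, (n_tasks + n_workers - 1) // n_workers)


print(ceil_batch_size(10, 4))   # 3: 10 tasks over 4 workers, rounded up
print(ceil_batch_size(3, 8))    # 1: fewer tasks than workers
print(ceil_batch_size(10, -1))  # 1: the -1 sentinel gives a negative quotient
```

With 10 tasks and 4 workers, each worker receives batches of 3 tasks, so the work is covered in at most 4 batches.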

Multiprocessing

import multiprocessing as mp
import numpy as np

def square(x):
    """Return the square of a number."""
    return x ** 2

if __name__ == '__main__':
    # Create an array of numbers
    numbers = np.arange(10)
    # use all available cores (unlike joblib, Pool rejects -1)
    n_jobs = mp.cpu_count()

    # Use multiprocessing to parallelize the square function;
    # the with-block shuts the pool down once the work is done
    with mp.Pool(processes=n_jobs) as pool:
        results = pool.map(square, numbers)

    print(results)

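Use pool.map to parallelize the tasks: it splits the input into chunks, hands them to the worker processes, and returns the results in the same order as the input.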
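pool.map also accepts an optional chunksize argument, which plays a role similar to batch_size in joblib. The following is a minimal sketch using only the standard library; the wrapper name parallel_squares and the chosen values (2 processes, chunks of 5) are assumptions made for illustration.

```python
import multiprocessing as mp


def square(x):
    """Return the square of a number."""
    return x ** 2


def parallel_squares(values, processes=None, chunksize=None):
    """Square every value in a process pool, preserving input order."""
    # processes=None lets Pool decide the worker count from the CPU count
    with mp.Pool(processes=processes) as pool:
        # chunksize=None lets Pool pick its own chunking
        return pool.map(square, values, chunksize)


if __name__ == '__main__':
    values = list(range(10))
    # Two workers, each receiving one chunk of five values
    print(parallel_squares(values, processes=2, chunksize=5))
    # prints [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Larger chunks reduce inter-process communication overhead for cheap functions like square, at the cost of coarser load balancing.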