How to use threading in Python?

I am trying to understand threading in Python. I've looked at the documentation and examples, but frankly, many examples are overly complicated and I have trouble understanding them.

How can I clearly show tasks being divided for multithreading?


Since this question was asked in 2010, there has been real simplification in how to do simple multithreading with Python with map and pool.

The code below comes from an article/blog post that you should definitely check out (no affiliation) - Parallelism in one line: A Better Model for Day to Day Threading Tasks. I'll summarize below - it ends up being just a few lines of code:

from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(my_function, my_array)

Which is the multithreaded version of:

results = []
for item in my_array:
    results.append(my_function(item))

Description

Map is a cool little function, and the key to easily injecting parallelism into your Python code. For those unfamiliar, map is something lifted from functional languages like Lisp. It is a function which maps another function over a sequence.

Map handles the iteration over the sequence for us, applies the function, and stores all of the results in a handy list at the end.
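To make that concrete, here is a minimal sketch (square is a stand-in function, not from the article) showing that pool.map is a drop-in replacement for the built-in map:

from multiprocessing.dummy import Pool as ThreadPool

def square(x):
    return x * x

print(list(map(square, [1, 2, 3, 4])))  # sequential: [1, 4, 9, 16]

pool = ThreadPool(2)
print(pool.map(square, [1, 2, 3, 4]))   # threaded: [1, 4, 9, 16]
pool.close()
pool.join()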


Implementation

Parallel versions of the map function are provided by two libraries: multiprocessing, and also its little known, but equally fantastic stepchild: multiprocessing.dummy.

multiprocessing.dummy is identical to the multiprocessing module, but uses threads instead (an important distinction - use multiple processes for CPU-intensive tasks; threads for (and during) I/O):

multiprocessing.dummy replicates the API of multiprocessing but is no more than a wrapper around the threading module.
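Because the two APIs match, switching between threads and processes is, in the sketch below, a one-line change of the import (len is just a stand-in workload, not from the article):

# Threads (suited to I/O-bound work):
from multiprocessing.dummy import Pool
# Processes (suited to CPU-bound work) - the rest of the code stays the same:
# from multiprocessing import Pool

pool = Pool(4)
print(pool.map(len, ['a', 'bb', 'ccc']))  # [1, 2, 3]
pool.close()
pool.join()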

import urllib2
from multiprocessing.dummy import Pool as ThreadPool

urls = [
  'http://www.python.org',
  'http://www.python.org/about/',
  'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
  'http://www.python.org/doc/',
  'http://www.python.org/download/',
  'http://www.python.org/getit/',
  'http://www.python.org/community/',
  'https://wiki.python.org/moin/',
]

# make the Pool of workers
pool = ThreadPool(4)

# open the urls in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)

# close the pool and wait for the work to finish
pool.close()
pool.join()

And the timing results:

Single thread:   14.4 seconds
       4 Pool:   3.1 seconds
       8 Pool:   1.4 seconds
      13 Pool:   1.3 seconds

Passing multiple arguments (works like this only in Python 3.3 and later):

To pass multiple arrays:

results = pool.starmap(function, zip(list_a, list_b))

Or to pass a constant and an array:

results = pool.starmap(function, zip(itertools.repeat(constant), list_a))
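For completeness, here is a runnable sketch of both starmap calls above (add is a hypothetical two-argument function; note that itertools must be imported):

import itertools
from multiprocessing.dummy import Pool as ThreadPool

def add(a, b):  # hypothetical example function
    return a + b

pool = ThreadPool(4)
list_a = [1, 2, 3]
list_b = [10, 20, 30]
print(pool.starmap(add, zip(list_a, list_b)))                 # [11, 22, 33]
print(pool.starmap(add, zip(itertools.repeat(100), list_a)))  # [101, 102, 103]
pool.close()
pool.join()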

If you are using an earlier version of Python, you can pass multiple arguments via this workaround.

(Thanks to user136036 for the helpful comment.)


Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.

import Queue
import threading
import urllib2

# called by each thread
def get_url(q, url):
    q.put(urllib2.urlopen(url).read())

theurls = ["http://google.com","http://yahoo.com"]

q = Queue.Queue()

for u in theurls:
    t = threading.Thread(target=get_url, args = (q,u))
    t.daemon = True
    t.start()

s = q.get()
print s

This is a case where threading is used as a simple optimization: each subthread is waiting for a URL to resolve and respond, in order to put its contents on the queue; each thread is a daemon (won't keep the process up if the main thread ends - that's more common than not); the main thread starts all the subthreads, does a get on the queue to wait until one of them has done a put, then emits the results and terminates (which takes down any subthreads that might still be running, since they're daemon threads).

Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically thread-safe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.
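Here is a minimal Python 3 sketch of that queue-in/queue-out pattern (the doubling "work" is a placeholder for real I/O-bound work): one queue farms tasks out, another collects results, and no explicit locks are needed:

import threading
import queue

def worker(in_q, out_q):
    while True:
        item = in_q.get()
        out_q.put(item * 2)  # stand-in for real I/O-bound work
        in_q.task_done()

in_q, out_q = queue.Queue(), queue.Queue()
for _ in range(4):
    t = threading.Thread(target=worker, args=(in_q, out_q))
    t.daemon = True
    t.start()

for n in range(10):
    in_q.put(n)
in_q.join()  # wait until every item has been processed
print(sorted(out_q.get() for _ in range(10)))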


NOTE: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel (due to the global interpreter lock, Python threads provide interleaving, but they are in fact executed serially, not in parallel, and are only useful when interleaving I/O operations).

However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let's consider the problem of summing a big range by summing subranges in parallel:

import threading

class SummingThread(threading.Thread):
     def __init__(self,low,high):
         super(SummingThread, self).__init__()
         self.low=low
         self.high=high
         self.total=0

     def run(self):
         for i in range(self.low,self.high):
             self.total+=i


thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()  
# At this point, both threads have completed
result = thread1.total + thread2.total
print result

Note that the above is a very silly example, as it does absolutely no I/O and will be executed serially, albeit interleaved (with the added overhead of context switching), in CPython due to the global interpreter lock.
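For comparison, here is a sketch of the same summing task written with multiprocessing.Pool, which can actually use two cores (same numbers as the threaded version above):

from multiprocessing import Pool

def sum_range(bounds):
    low, high = bounds
    return sum(range(low, high))

if __name__ == '__main__':
    pool = Pool(2)
    partial_sums = pool.map(sum_range, [(0, 500000), (500000, 1000000)])
    pool.close()
    pool.join()
    print(sum(partial_sums))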


As mentioned before, CPython can use threads only for I/O waits, due to the GIL. If you want to benefit from multiple cores for CPU-bound tasks, use multiprocessing:

from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()


Just a note: a queue is not required for threading.

This is the simplest example I could imagine that shows 10 threads running concurrently.

import threading
from random import randint
from time import sleep


def print_number(number):
    # Sleeps a random 1 to 10 seconds
    rand_int_var = randint(1, 10)
    sleep(rand_int_var)
    print"Thread" + str(number) +" slept for" + str(rand_int_var) +" seconds"

thread_list = []

for i in range(1, 11):  # ten threads, numbered 1 to 10
    # Instantiates the thread
    # (i) does not make a sequence, so (i,)
    t = threading.Thread(target=print_number, args=(i,))
    # Sticks the thread in a list so that it remains accessible
    thread_list.append(t)

# Starts threads
for thread in thread_list:
    thread.start()

# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
    thread.join()

# Demonstrates that the main process waited for threads to complete
print"Done"


Alex Martelli's answer helped me. However, here is a modified version that I found more useful (at least to me).

Updated: works in both Python 2 and Python 3

try:
    # for python3
    import queue
    from urllib.request import urlopen
except ImportError:
    # for python2
    import Queue as queue
    from urllib2 import urlopen

import threading

worker_data = ['http://google.com', 'http://yahoo.com', 'http://bing.com']

#load up a queue with your data, this will handle locking
q = queue.Queue()
for url in worker_data:
    q.put(url)

#define a worker function
def worker(url_queue):
    queue_full = True
    while queue_full:
        try:
            #get your data off the queue, and do some work
            url = url_queue.get(False)
            data = urlopen(url).read()
            print(len(data))

        except queue.Empty:
            queue_full = False

#create as many threads as you want
thread_count = 5
for i in range(thread_count):
    t = threading.Thread(target=worker, args = (q,))
    t.start()


I found this very useful: create as many threads as there are cores and let them execute a (large) number of tasks (in this case, calling a shell program):

import Queue
import threading
import multiprocessing
import subprocess

q = Queue.Queue()
for i in range(30): #put 30 tasks in the queue
    q.put(i)

def worker():
    while True:
        item = q.get()
        #execute a task: call a shell program and wait until it completes
        subprocess.call("echo"+str(item), shell=True)
        q.task_done()

cpus=multiprocessing.cpu_count() #detect number of cores
print("Creating %d threads" % cpus)
for i in range(cpus):
     t = threading.Thread(target=worker)
     t.daemon = True
     t.start()

q.join() #block until all tasks are done


Given a function, f, thread it like this:

import threading
threading.Thread(target=f).start()

To pass arguments to f:

threading.Thread(target=f, args=(a,b,c)).start()
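As a small follow-up sketch (using a concrete stand-in for f), keep a reference to the thread if you need to wait for it to finish:

import threading

def f(a, b, c):  # stand-in for your function
    print(a + b + c)

t = threading.Thread(target=f, args=(1, 2, 3))
t.start()
t.join()  # block until f has returned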


Python 3 has the facility of launching parallel tasks. This makes our work easier.

It has thread pools and process pools.

The following gives an insight:

ThreadPoolExecutor example

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

ProcessPoolExecutor example

import concurrent.futures
import math

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    if n % 2 == 0:
        return False

    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))

if __name__ == '__main__':
    main()

For me, the perfect example of threading is monitoring asynchronous events. Look at this code.

# thread_test.py
import threading
import time

class Monitor(threading.Thread):
    def __init__(self, mon):
        threading.Thread.__init__(self)
        self.mon = mon

    def run(self):
        while True:
            if self.mon[0] == 2:
                print"Mon = 2"
                self.mon[0] = 3;

You can play with this code by opening an IPython session and doing something like:

>>>from thread_test import Monitor
>>>a = [0]
>>>mon = Monitor(a)
>>>mon.start()
>>>a[0] = 2
Mon = 2
>>>a[0] = 2
Mon = 2

Wait a few minutes

>>>a[0] = 2
Mon = 2


Most documentation and tutorials use Python's Threading and Queue modules, and they can seem overwhelming for beginners.

Perhaps consider the concurrent.futures.ThreadPoolExecutor module of Python 3. Combined with the with clause and a list comprehension, it can be a real charm.

from concurrent.futures import ThreadPoolExecutor, as_completed

def get_url(url):
    # Your actual program here. Using threading.Lock() if necessary
    return""

# List of urls to fetch
urls = ["url1","url2"]

with ThreadPoolExecutor(max_workers = 5) as executor:

    # Create threads
    futures = {executor.submit(get_url, url) for url in urls}

    # as_completed() gives you the threads once finished
    for f in as_completed(futures):
        # Get the results
        rs = f.result()
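Picking up the list-comprehension point from above, the results can also be collected in a single expression (a sketch reusing the get_url and urls names from the snippet above):

with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(get_url, url) for url in urls]
    results = [f.result() for f in as_completed(futures)]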


Using the shiny new concurrent.futures module:

def sqr(val):
    import time
    time.sleep(0.1)
    return val * val

def process_result(result):
    print(result)

def process_these_asap(tasks):
    import concurrent.futures

    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = []
        for task in tasks:
            futures.append(executor.submit(sqr, task))

        for future in concurrent.futures.as_completed(futures):
            process_result(future.result())
        # Or instead of all this just do:
        # results = executor.map(sqr, tasks)
        # list(map(process_result, results))

def main():
    tasks = list(range(10))
    print('Processing {} tasks'.format(len(tasks)))
    process_these_asap(tasks)
    print('Done')
    return 0

if __name__ == '__main__':
    import sys
    sys.exit(main())

The executor approach may seem familiar to all those who have gotten their hands dirty with Java before.

Also a side note: to keep the universe sane, don't forget to close your pools/executors if you don't use a with context (which is so awesome that it does it for you).
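For illustration, here is a minimal sketch of that manual variant (explicit shutdown instead of a with block; the built-in pow is just a stand-in task):

from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=5)
future = executor.submit(pow, 2, 10)  # any callable works here
print(future.result())                # 1024
executor.shutdown(wait=True)          # easy to forget without a with block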


Here is a very simple example of CSV import using threading. (Library inclusion may differ for different purposes.)

Helper function:

from threading import Thread
from project import app
import csv


def import_handler(csv_file_name):
    thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
    thr.start()

def dump_async_csv_data(csv_file_name):
    with app.app_context():
        with open(csv_file_name) as File:
            reader = csv.DictReader(File)
            for row in reader:
                pass  # DB operation/query goes here

Driver function:

import_handler(csv_file_name)

Here is multithreading with a simple example which will be helpful. You can run it and easily understand how multithreading works in Python. I used a lock to prevent access by other threads until the previous threads finished their work. By using the line of code

tLock = threading.BoundedSemaphore(value=4)

you can allow a number of threads at a time; the rest of the threads are held back and will run later, after the previous ones have finished.

import threading
import time

#tLock = threading.Lock()
tLock = threading.BoundedSemaphore(value=4)
def timer(name, delay, repeat):
    print "

Timer:"
, name," Started"
    tLock.acquire()
    print"

"
, name," has the acquired the lock"
    while repeat > 0:
        time.sleep(delay)
        print"

"
, name,":", str(time.ctime(time.time()))
        repeat -= 1

    print"

"
, name," is releaseing the lock"
    tLock.release()
    print"

Timer:"
, name," Completed"

def Main():
    t1 = threading.Thread(target=timer, args=("Timer1", 2, 5))
    t2 = threading.Thread(target=timer, args=("Timer2", 3, 5))
    t3 = threading.Thread(target=timer, args=("Timer3", 4, 5))
    t4 = threading.Thread(target=timer, args=("Timer4", 5, 5))
    t5 = threading.Thread(target=timer, args=("Timer5", 0.1, 5))

    t1.start()
    t2.start()
    t3.start()
    t4.start()
    t5.start()

    print"

Main Complete"


if __name__ =="__main__":
    Main()

I saw a lot of examples here where no real work was being performed, and they were mostly CPU-bound. Here is an example of a CPU-bound task that computes all prime numbers between 10 million and 10.05 million. I have used all four methods here:

import math
import timeit
import threading
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor


def time_stuff(fn):
   """
    Measure time of execution of a function
   """

    def wrapper(*args, **kwargs):
        t0 = timeit.default_timer()
        fn(*args, **kwargs)
        t1 = timeit.default_timer()
        print("{} seconds".format(t1 - t0))
    return wrapper

def find_primes_in(nmin, nmax):
   """
    Compute a list of prime numbers between the given minimum and maximum arguments
   """

    primes = []

    #Loop from minimum to maximum
    for current in range(nmin, nmax + 1):

        #Take the square root of the current number
        sqrt_n = int(math.sqrt(current))
        found = False

        #Check if any number from 2 to the square root + 1 divides the current number under consideration
        for number in range(2, sqrt_n + 1):

            #If divisible we have found a factor, hence this is not a prime number, lets move to the next one
            if current % number == 0:
                found = True
                break

        #If not divisible, add this number to the list of primes that we have found so far
        if not found:
            primes.append(current)

    #I am merely printing the length of the array containing all the primes but feel free to do what you want
    print(len(primes))

@time_stuff
def sequential_prime_finder(nmin, nmax):
   """
    Use the main process and main thread to compute everything in this case
   """

    find_primes_in(nmin, nmax)

@time_stuff
def threading_prime_finder(nmin, nmax):
   """
    If the minimum is 1000 and the maximum is 2000 and we have 4 workers
    1000 - 1250 to worker 1
    1250 - 1500 to worker 2
    1500 - 1750 to worker 3
    1750 - 2000 to worker 4
    so lets split the min and max values according to the number of workers
   """

    nrange = nmax - nmin
    threads = []
    for i in range(8):
        start = int(nmin + i * nrange/8)
        end = int(nmin + (i + 1) * nrange/8)

        #Start the thread with the min and max split up to compute
        #Parallel computation will not work here due to GIL since this is a CPU bound task
        t = threading.Thread(target = find_primes_in, args = (start, end))
        threads.append(t)
        t.start()

    #Don't forget to wait for the threads to finish
    for t in threads:
        t.join()

@time_stuff
def processing_prime_finder(nmin, nmax):
   """
    Split the min, max interval similar to the threading method above but use processes this time
   """

    nrange = nmax - nmin
    processes = []
    for i in range(8):
        start = int(nmin + i * nrange/8)
        end = int(nmin + (i + 1) * nrange/8)
        p = multiprocessing.Process(target = find_primes_in, args = (start, end))
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

@time_stuff
def thread_executor_prime_finder(nmin, nmax):
   """
    Split the min max interval similar to the threading method but use thread pool executor this time
    This method is slightly faster than using pure threading as the pools manage threads more efficiently
    This method is still slow due to the GIL limitations since we are doing a CPU bound task
   """

    nrange = nmax - nmin
    with ThreadPoolExecutor(max_workers = 8) as e:
        for i in range(8):
            start = int(nmin + i * nrange/8)
            end = int(nmin + (i + 1) * nrange/8)
            e.submit(find_primes_in, start, end)

@time_stuff
def process_executor_prime_finder(nmin, nmax):
   """
    Split the min max interval similar to the threading method but use the process pool executor
    This is the fastest method recorded so far as it manages process efficiently + overcomes GIL limitations
    RECOMMENDED METHOD FOR CPU BOUND TASKS
   """

    nrange = nmax - nmin
    with ProcessPoolExecutor(max_workers = 8) as e:
        for i in range(8):
            start = int(nmin + i * nrange/8)
            end = int(nmin + (i + 1) * nrange/8)
            e.submit(find_primes_in, start, end)

def main():
    nmin = int(1e7)
    nmax = int(1.05e7)
    print("Sequential Prime Finder Starting")
    sequential_prime_finder(nmin, nmax)
    print("Threading Prime Finder Starting")
    threading_prime_finder(nmin, nmax)
    print("Processing Prime Finder Starting")
    processing_prime_finder(nmin, nmax)
    print("Thread Executor Prime Finder Starting")
    thread_executor_prime_finder(nmin, nmax)
    print("Process Executor Finder Starting")
    process_executor_prime_finder(nmin, nmax)

main()

Here are the results on my Mac OS X four-core machine:

Sequential Prime Finder Starting
9.708213827005238 seconds
Threading Prime Finder Starting
9.81836523200036 seconds
Processing Prime Finder Starting
3.2467174359990167 seconds
Thread Executor Prime Finder Starting
10.228896902000997 seconds
Process Executor Finder Starting
2.656402041000547 seconds


None of the solutions above actually used multiple cores on my GNU/Linux server (where I don't have administrator rights). They just ran on a single core. I used the lower-level os.fork interface to spawn multiple processes. This is the code that worked for me:

from os import fork

values = ['different', 'values', 'for', 'threads']

for i in range(len(values)):
    p = fork()
    if p == 0:
        my_function(values[i])
        break
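A slightly fuller sketch of the same approach (my_function here is a hypothetical stand-in): each child exits explicitly and the parent reaps every child it forked:

import os

def my_function(value):  # hypothetical work function
    print('%d handling %r' % (os.getpid(), value))

values = ['different', 'values', 'for', 'threads']
children = []
for value in values:
    pid = os.fork()
    if pid == 0:      # child process
        my_function(value)
        os._exit(0)   # leave without running the parent's cleanup
    else:
        children.append(pid)

for pid in children:  # parent waits for all children to finish
    os.waitpid(pid, 0)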


import threading
import requests

def send():

  r = requests.get('https://www.stackoverlow.com')

thread = []
t = threading.Thread(target=send)  # pass the function itself; send() would call it immediately
thread.append(t)
t.start()