SQLAlchemy - work with connections to DB and Sessions (not clear behavior and part in documentation)

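I use SQLAlchemy (a really good ORM, but the documentation is not clear enough) to communicate with PostgreSQL.
Everything was fine until one case where postgres "crashed" because the maximum connection limit was reached: no more connections allowed (max_client_conn).
That case made me think I was doing something wrong. After a few experiments I figured out how to avoid the issue, but some questions remain.
Below you'll see code examples (Python 3+, default PostgreSQL settings) without and with the issue. What I'd most like to hear are answers to the following questions: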

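1. What exactly does the context manager do with connections and sessions? Does it close the session and dispose of the connection, or something else?

2. Why does the first, working example behave like the failing example when NullPool is not passed as the poolclass in the "connect" method?

3. Why did I get only 1 connection to the db for all queries in the first example, but a separate connection for each query in the second? (Please correct me if I understood it wrong; I was checking with "pgbouncer".)

4. What are the best practices for opening and closing connections (and/or working with Session) when you use SQLAlchemy and a PostgreSQL DB from multiple instances of a script (or separate threads in a script) that listen for requests and must have a separate session for each of them? (I mean raw SQLAlchemy, not Flask-SQLAlchemy or something like it.)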

Working example of code without the issue:

Connection to the DB:

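    import sqlalchemy
    from sqlalchemy.pool import NullPool  # does not work without NullPool, why?

    def connect(user, password, db, host='localhost', port=5432):
        """Return an Engine (called a "connection" here) and a reflected MetaData object."""
        url = 'postgresql://{}:{}@{}:{}/{}'.format(user, password, host, port, db)

        # Note: every call builds a brand-new engine.
        temp_con = sqlalchemy.create_engine(url, client_encoding='utf8', poolclass=NullPool)
        # MetaData(bind=..., reflect=True) is legacy API that SQLAlchemy 2.0 no longer
        # accepts; the newer spelling is MetaData() followed by meta.reflect(bind=engine).
        temp_meta = sqlalchemy.MetaData(bind=temp_con, reflect=True)

        return temp_con, temp_meta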

Function to get a session to work with the DB:

    from contextlib import contextmanager
    from sqlalchemy.orm import sessionmaker

    @contextmanager
    def session_scope():
        """Provide a transactional scope around a series of operations."""
        # connect() builds a new engine every time a scope is opened
        con_loc, meta_loc = connect(db_user, db_pass, db_instance, 'localhost')
        Session = sessionmaker(bind=con_loc)

        session = Session()
        try:
            yield session
            session.commit()
        except:
            session.rollback()
            raise

Example of a query:

    with session_scope() as session:
        entity = session.query(SomeEntity).first()

Example of code with the issue:

Function to get a session to work with the DB:

    def create_session():
        # connect method the same as in first example
        con, meta = connect(db_user, db_pass, db_instance, 'localhost')
        Session = sessionmaker(bind=con)

        session = Session()
        return session

Example of a query:

    session = create_session()
    entity = session.query(SomeEntity).first()

Hope you got the main idea.

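First of all, you should not create engines repeatedly in your connect() function. The usual practice is to have a single global Engine instance per database URL in your application. The same goes for the Session class created by sessionmaker().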

What exactly does context manager do with connections and sessions? Closing session and disposing connection or what?

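It does what you've programmed it to do, and if that seems unclear, read about context managers in general. In this case it commits the session, or rolls it back if an exception was raised within the block governed by the with-statement. Both actions return the connection used by the session to the pool, which in your case is a NullPool, so the connection is simply closed.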
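To watch a connection go back to the pool, here is a minimal sketch. It assumes an in-memory SQLite database and an explicit QueuePool as stand-ins for the PostgreSQL setup in the question (neither appears in the original code), and reads the pool's checkedout() counter before and after the transaction ends.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import QueuePool

# Assumption: in-memory SQLite stands in for the PostgreSQL URL.
engine = create_engine("sqlite://", poolclass=QueuePool)
Session = sessionmaker(bind=engine)

session = Session()
session.execute(text("SELECT 1")).scalar()   # the session checks a connection out
held_during_transaction = engine.pool.checkedout()

session.commit()                             # ends the transaction
held_after_commit = engine.pool.checkedout()

session.execute(text("SELECT 1")).scalar()   # a new transaction begins
session.rollback()                           # ends it without committing
held_after_rollback = engine.pool.checkedout()

session.close()
print(held_during_transaction, held_after_commit, held_after_rollback)
```

With a NullPool in place of the QueuePool there is nothing to return the connection to, so the same commit or rollback closes it instead.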

Why does first working example of code behave as example with issue without NullPool as poolclass in "connect" method?

and

    from sqlalchemy.pool import NullPool # does not work without NullPool, why?

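Without NullPool, the engines you repeatedly create also pool connections, so if they for some reason do not go out of scope, or their reference counts never drop to zero, they will hold on to the connections even after the sessions return them. It is unclear whether the sessions in the second example go out of scope in a timely manner, so they might also be holding on to connections.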

Why in the first example I got only 1 connection to db for all queries but in second example I got separate connection for each query? (please correct me if I understood it wrong, was checking it with "pgbouncer")

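The first example ends up closing the connection, thanks to the context manager that handles transactions properly and to the NullPool, so the connection is returned to the bouncer, which is another pooling layer.

The second example might never close its connections because it lacks the transaction handling, but that is unclear from the example as given. It might also be holding on to connections in the separate engines that you create.

The 4th point of your question set is pretty much covered by the official documentation in "Session Basics", especially "When do I construct a Session, when do I commit it, and when do I close it?" and "Is the session thread-safe?".

There is one exception: multiple instances of the script. You should not share an engine between processes, so in order to pool connections between them you need an external pool such as PgBouncer.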

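@Ilja Everilä's answer was mostly helpful.
I'll leave the edited code here; maybe it will help someone.

The new code, which works as I expected, is the following:

Connection to the DB: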

    import sqlalchemy
    from sqlalchemy.pool import NullPool  # will work even without NullPool in code

    def connect(user, password, db, host='localhost', port=5432):
        """Return an Engine (called a "connection" here) and a reflected MetaData object."""
        url = 'postgresql://{}:{}@{}:{}/{}'.format(user, password, host, port, db)

        temp_con = sqlalchemy.create_engine(url, client_encoding='utf8', poolclass=NullPool)
        # MetaData(bind=..., reflect=True) is legacy API that SQLAlchemy 2.0 no longer
        # accepts; the newer spelling is MetaData() followed by meta.reflect(bind=engine).
        temp_meta = sqlalchemy.MetaData(bind=temp_con, reflect=True)

        return temp_con, temp_meta

One connection instance and one sessionmaker per app, created for example where your main function lives:

    from sqlalchemy.orm import sessionmaker

    # create one connection and Sessionmaker to each instance of app (to avoid creating it repeatedly)
    con, meta = connect(db_user, db_pass, db_instance, db_host)
    session_maker = sessionmaker(bind=con)

Function to get a session with the with-statement:

    from contextlib import contextmanager
    from typing import Iterator

    from sqlalchemy.orm import Session
    from some_place import session_maker

    @contextmanager
    def session_scope() -> Iterator[Session]:
        """Provide a transactional scope around a series of operations."""
        session = session_maker()  # create session from SQLAlchemy sessionmaker
        try:
            yield session
            session.commit()
        except:
            session.rollback()
            raise

Wrap the transaction and use the session:

    with session_scope() as session:
        entity = session.query(SomeEntity).first()

What exactly does context manager do with connections and sessions?
Closing session and disposing connection or what?

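Context managers in Python are used to create a runtime context for use with the with statement. Simply put, when you run the code:

    with session_scope() as session:
        entity = session.query(SomeEntity).first()

session is the yielded session. So, to answer what the context manager does with connections and sessions, all you have to do is look at what happens after the yield. In this case it's just: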

    try:
        yield session
        session.commit()
    except:
        session.rollback()
        raise

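If no exception is raised, session.commit() runs, which according to the SQLAlchemy docs will "Flush pending changes and commit the current transaction." If the block raises, session.rollback() runs instead and the exception is re-raised.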
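The control flow can be checked without a database. The sketch below uses a hypothetical FakeSession that only records which methods were called; it is not a SQLAlchemy class. The finally/close() step is an addition that the code above does not have, included to show where an explicit close would go.

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stand-in for a SQLAlchemy Session; it only records calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


@contextmanager
def session_scope(session):
    try:
        yield session        # the body of the with-block runs here
        session.commit()     # reached only if the body raised nothing
    except:
        session.rollback()   # the body (or the commit) raised
        raise                # re-raise so the caller still sees the error
    finally:
        session.close()      # addition: runs on both paths


ok = FakeSession()
with session_scope(ok):
    pass

failed = FakeSession()
try:
    with session_scope(failed):
        raise ValueError("boom")
except ValueError:
    pass

print(ok.calls)      # ['commit', 'close']
print(failed.calls)  # ['rollback', 'close']
```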

Why does first working example of code behave as example with issue
without NullPool as poolclass in "connect" method?

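The poolclass argument just tells SQLAlchemy which subclass of Pool to use. When you pass NullPool here, however, you are telling SQLAlchemy not to use a pool at all; you are effectively disabling connection pooling. From the docs: "to disable pooling, set poolclass to NullPool instead." I can't say for sure, but using NullPool is probably contributing to your max_connection issues.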

Why in the first example I got only 1 connection to db for all queries
but in second example I got separate connection for each query?
(please correct me if I understood it wrong, was checking it with
"pgbouncer")

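I'm not exactly sure. I think this has to do with how, in the first example, you are using a context manager, so everything within the with block uses a session generator. In your second example, you created a function that initializes a new Session and returns it, so you are not getting back a generator. I also think this has to do with your use of NullPool, which prevents connection pooling. With NullPool, each query execution acquires a connection on its own.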
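The pooling half of that explanation can be measured. This is a rough sketch, assuming an in-memory SQLite database in place of PostgreSQL and a hypothetical helper count_new_connections(); it listens for the pool's "connect" event, which fires each time a brand-new DBAPI connection is created, and compares NullPool with QueuePool over three sequential queries.

```python
from sqlalchemy import create_engine, event, text
from sqlalchemy.pool import NullPool, QueuePool


def count_new_connections(poolclass, queries=3):
    """Run `queries` sequential statements and count DBAPI connections opened."""
    # Assumption: in-memory SQLite stands in for PostgreSQL.
    engine = create_engine("sqlite://", poolclass=poolclass)
    opened = []

    @event.listens_for(engine, "connect")
    def on_connect(dbapi_connection, connection_record):
        # Fired once for every new DBAPI connection the pool creates.
        opened.append(dbapi_connection)

    for _ in range(queries):
        with engine.connect() as conn:
            conn.execute(text("SELECT 1")).scalar()

    engine.dispose()
    return len(opened)


with_null_pool = count_new_connections(NullPool)
with_queue_pool = count_new_connections(QueuePool)
print(with_null_pool, with_queue_pool)
```

NullPool opens one connection per checkout, while QueuePool keeps reusing the first one, so the counts differ even though the queries are identical.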

What is the best practices to open and close connections(and/or work
with Session) when you use SQLAlchemy and PostgreSQL DB for multiple
instances of script (or separate threads in script) that listens
requests and has to have separate session to each of them? (I mean raw
SQLAlchemy not Flask-SQLAlchemy or smth like this)

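See the section "Is the session thread-safe?" for this; you need to take a "share nothing" approach to your concurrency. In your case, that means each instance of the script shares nothing with the others.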
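For the threads-in-one-script case, one option SQLAlchemy provides is scoped_session, which hands each thread its own Session. The answer does not mention it, so treat this as a sketch of one possible approach, again assuming an in-memory SQLite URL in place of PostgreSQL. No query is executed; the example only compares session identities.

```python
import threading

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

# Assumption: in-memory SQLite stands in for PostgreSQL; no query is run here.
engine = create_engine("sqlite://")
Session = scoped_session(sessionmaker(bind=engine))

sessions = {}


def worker(name):
    first = Session()    # creates this thread's session
    second = Session()   # returns the same object within the thread
    sessions[name] = (first, second)
    Session.remove()     # close and discard the thread's session when done


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

same_within_thread = all(first is second for first, second in sessions.values())
distinct_across_threads = sessions["a"][0] is not sessions["b"][0]
print(same_within_thread, distinct_across_threads)
```

The engine and the session factory are shared across threads; only the Session objects are kept per thread.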

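You probably want to check out "Working with Engines and Connections". I don't think messing with sessions is where you want to be if concurrency is what you're working on. There is more information about NullPool and concurrency there: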

For a multiple-process application that uses the os.fork system call,
or for example the Python multiprocessing module, it’s usually
required that a separate Engine be used for each child process. This
is because the Engine maintains a reference to a connection pool that
ultimately references DBAPI connections - these tend to not be
portable across process boundaries. An Engine that is configured not
to use pooling (which is achieved via the usage of NullPool) does not
have this requirement.
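One way to follow that advice is to create the Engine lazily and key it by process id, so a forked child never reuses the parent's pool. This is only a sketch: get_engine() and the _engines cache are hypothetical names, and the SQLite URL is a placeholder for the real PostgreSQL one.

```python
import os

from sqlalchemy import create_engine

_engines = {}  # process id -> Engine


def get_engine(url="sqlite://"):
    """Return this process's Engine, creating it on first use (hypothetical helper)."""
    pid = os.getpid()
    if pid not in _engines:
        # A forked child sees the parent's entry under a different pid,
        # so it builds its own Engine and pool instead of reusing that one.
        _engines[pid] = create_engine(url)
    return _engines[pid]


engine = get_engine()
same_engine_reused = get_engine() is engine
print(same_engine_reused)
```

Within a single process every call returns the same Engine, which also matches the earlier advice of one Engine per database URL.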